[00:00:01] Speaker 02:
Our next case is number 24-2205, Zentian Limited versus Apple Inc.

[00:00:13] Speaker 02:
Okay, Ms. Rhodes.

[00:00:16] Speaker 03:
May it please the court. Catherine Rhodes for Zentian. I'd like to reserve three minutes for rebuttal.

[00:00:22] Speaker 03:
The Patent Trial and Appeal Board legally erred in its claim construction of feature vector. The board deleted requirements in the claim itself and added concepts like code words that are found nowhere in the patent. Instead, the board based its construction on Apple's prior art reference and expressly construed the claim to cover Apple's prior art embodiment. The result here is a prior-art-driven construction that is inconsistent with the specification and the claims. But under a proper construction of feature vector, Apple's obviousness theory based on code words fails.

[00:00:57] Speaker 03:
So turning to the board's construction here, the board deleted the requirement in limitation 1B, the wherein clause, that says derived quantities are calculated from said digital audio stream. And it broadened it to cover derived quantities that are representative of the digital audio stream, but not calculated from that audio stream, such as a code word. And it did so after agreeing with Zentian's expert that a person of skill in the art would view the techniques of calculating feature vectors and code words as distinct, which is on appendix page 12, and went on to define the term feature vector broader than its plain and ordinary meaning.

[00:01:43] Speaker 03:
The real problem here, though, is that the inventors in the 377 patent did not act as their own lexicographers.

[00:01:51] Speaker 03:
There's no clear intent or express definition in the patent to redefine feature vector beyond its plain and ordinary meaning. And critically, the board does not point to anything in the specification where its construction comes from, and Apple doesn't either. It doesn't really defend the board's construction in its briefs.

[00:02:12] Speaker 03:
Now, the patent here does not mention the concept of a code word anywhere or vector quantization. That comes from Apple's prior art reference, Jing, and here the board conflated the claim construction analysis with the obviousness analysis in importing a structure from Jing into the definition of feature vector itself.

[00:02:34] Speaker 02:
But aren't the code words derived from feature vectors?

[00:02:38] Speaker 03:
They are not derived from feature vectors in the meaning of the claim, because code words are preset values that Jing says are derived or come from training data. So you have training data, the code words are generated, there are vector quantities there that are computed, and then those multi-dimensional vector quantities are reduced down to single-dimensional values that are stored in a code book.

[00:03:11] Speaker 03:
It could be a series of numbers, 0 through 10. And that codebook, with those single-dimensional values, is then stored in memory, similar to a model or a dictionary. What happens in Jing is that a feature vector is calculated, and then the technique of vector quantization is used, which is essentially a lookup operation that asks which code word entry in this pre-existing codebook the feature vector is most similar to. It selects that code word from the pre-existing codebook and uses that for word identification, not the feature vector that was calculated from the digital audio stream. And that is the key aspect of the claim here that the board deleted from its construction.
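The lookup operation described above can be sketched roughly as follows. This is only a minimal illustration, not the method of Jing or of the patent: the codebook contents, the Euclidean closeness measure, and the names CODEBOOK and quantize are all assumptions made for the example. The sketch returns the index of the nearest entry and takes no position on whether "code word" means that index or the stored prototype vector.

```python
import math

# Hypothetical codebook, fixed in advance (for example, from training data).
# Maps an entry index to a prototype vector. All values are made up.
CODEBOOK = {
    0: (0.0, 0.0, 0.0),
    1: (1.0, 1.0, 1.0),
    2: (4.0, 0.0, 2.0),
}


def quantize(feature_vector, codebook=CODEBOOK):
    """Return the index of the codebook entry closest to feature_vector.

    Closeness is assumed to be Euclidean distance; a real system may
    use a different measure.
    """
    return min(codebook, key=lambda idx: math.dist(feature_vector, codebook[idx]))


# A feature vector computed from one frame of input audio (made-up values).
frame_vector = (0.9, 1.2, 0.8)
code_word = quantize(frame_vector)
```

In this sketch the codebook exists before any input audio arrives, while choosing an entry requires a distance calculation against the vector computed from the input frame.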

[00:03:59] Speaker 03:
And Apple doesn't dispute that.

[00:04:01] Speaker 02:
What is the claim referring to if it's not referring to code words when it's talking about derived quantities?

[00:04:09] Speaker 03:
Sure. So this is where I think the board got tripped up: the meaning of derived quantities. But that meaning is clear from the specification. At column 12, lines 56 to 63, the specification provides context for what extracted and/or derived quantities mean. There the patent explains that in calculating a feature vector, you first take a slice of audio, a 10 millisecond slice of audio, and you extract or compute spectral components of that audio signal. Those are extracted quantities. Then you can take the derivatives of those quantities, and those are derived quantities.

[00:04:48] Speaker 03:
A feature vector, which is a point in an n-dimensional space, is the collection of those values. So it could be a collection of 39 extracted values, 39 derived values, or a combination of them both, but the feature vector itself is the combination of those components, similar to a point on a 3D graph. Where you have a point on a graph, that would be the feature vector, and then you have an x, y, and z coordinate. Your x, y, and z coordinates would be your extracted or derived quantities that together form the feature vector.

[00:05:25] Speaker 03:
And these things represent aspects of the audio like pitch, frequency, or signal that help the system understand what sound it's hearing, but in a numerical representation so these complex models can process it and figure out what was said.
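As a rough illustration of the extracted and derived quantities described above, the sketch below builds one feature vector from made-up per-frame spectral values. It is not taken from the patent: the frame values, the central-difference approximation of the derivative, and the names delta and feature_vector are assumptions for the example only.

```python
def delta(prev_frame, next_frame):
    """Approximate a time derivative as half the difference between
    the neighbouring frames (an assumed, simple scheme)."""
    return [(n - p) / 2.0 for p, n in zip(prev_frame, next_frame)]


def feature_vector(frames, t):
    """Concatenate the extracted quantities of frame t with their
    derived (delta) quantities. t must be an interior frame index,
    so that frames t - 1 and t + 1 both exist."""
    extracted = frames[t]
    derived = delta(frames[t - 1], frames[t + 1])
    return extracted + derived


# Three consecutive frames, each with three made-up spectral values.
frames = [
    [1.0, 2.0, 3.0],
    [2.0, 2.0, 5.0],
    [5.0, 4.0, 3.0],
]
vec = feature_vector(frames, 1)  # a point in a 6-dimensional space
```

Here the first three components of vec are the extracted quantities for the middle frame and the last three are the derived quantities, all computed from the input frames.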

[00:05:43] Speaker 03:
So I think the correct construction of feature vector here would be a point in the n-dimensional space, which comes from the patent at column 13, lines 19 to 23. But that construction would go in limitation 1A, where feature vector is introduced, because that's where the claim first says that a first programmable device is programmed to calculate a feature vector. And then in limitation 1B, the wherein clause, the meaning there is clear as it is.

[00:06:15] Speaker 03:
You don't need the board's additions and deletions from the claim. Can you remind me?

[00:06:21] Speaker 01:
I'm sorry to interrupt, but can you remind me, where did this construction come from? Did it differ from the construction? Did each side propose a construction? And did it differ from both of those?

[00:06:31] Speaker 03:
Neither side proposed a construction, Your Honor. The core dispute between the parties was whether Jing's code words meet limitation 1G in the patent. Limitation 1G refers to feature vectors, so the parties were disputing whether a code word is a feature vector.

[00:06:48] Speaker 01:
So I take your point that you started with, which is it kind of feels odd at some level to be imposing in the claim construction a word that's not in the claims. However, we've got lots of cases like O2 Micro, which in the context at least of infringement, expect that the construction will inform, if not answer, the question of infringement. So...

[00:07:13] Speaker 01:
I kind of think the board was trying to be helpful here in terms of coming up with a construction that really spoke to the actual dispute between the parties.

[00:07:22] Speaker 03:
Right. I think that's right, Your Honor. I think that's exactly what the board was trying to do. And we're not suggesting the board was wrong in thinking it needed to construe the claim. But when it did so, it deleted critical language from the claim that says that the derived quantities are from said digital audio stream. And it treated extracted and derived quantities differently in terms of how they're created. The board's construction is consistent with the claim for extracted quantities, in saying that those are from the digital audio stream, but it treated derived quantities as different.

[00:07:54] Speaker 03:
And I think what is really important is Apple and its own expert, at Appendix 775, this is Apple's expert's declaration, he agrees that the meaning of that limitation, referring to extracted and derived quantities, that a person of skill would have understood that that includes spectral components of the feature vector and their component derivatives.

[00:08:16] Speaker 01:
But what is the difference between the two? I thought what the board was pointing out was it says extracted and/or, so assume that they meant two different things.

[00:08:29] Speaker 03:
Yes, they are two different things. Extracted quantities would be if you're computing the spectral components from the audio signal, like pitch, frequency, sound. Derivative components, or derived quantities, would be you taking the derivative, the first derivative or the second derivative or the third derivative, and you're further doing computations on the extracted quantities themselves. That is the difference between those two, and so they are both values computed from the audio stream, but they're different types of calculations and they represent the audio in slightly different ways, such that either or both of those can make up a feature vector.

[00:09:09] Speaker 03:
And I think the board misunderstood Zentian's arguments at the hearing, where Zentian agrees that extracted and derived have different meanings in the claim. We do not dispute that, but they are things that are calculated from the audio. These numbers do not exist in the audio stream as it is. They are numerical representations that require...

[00:09:37] Speaker 03:
Well, it doesn't have to be. It's a calculation in the sense that a feature vector with the extracted and derived quantities, it's a numerical value with each of the different quantities. That doesn't just exist in the audio that you can just pluck out and see the value. There's different types of computations. We're not limiting the claims and don't suggest that the claims are limited to the type of computation you can do to get a feature vector. There are a variety of different ways to do that in the art depending on how you want to configure your system.

[00:10:13] Speaker 03:
But the claim does require that those values are calculated from the digital audio stream, and it is undisputed in this case that the code words in Jing, and this is at Appendix 910, Jing says the code words are derived from training data, not the data that is input into the system that Apple is pointing to in order to map onto the claims. And so code words are not feature vectors in the meaning of the claim here, even if there might be a vector component of a code word in a general sense based on how they were computed originally from the training data.

[00:10:53] Speaker 03:
The reason this matters here is that the board's finding that Jing teaches limitation 1G, which requires distances calculated from a first feature vector, was expressly based on its claim construction. Apple suggests that the board made an independent finding here, but the board was very clear that its finding for this limitation was based on its claim construction that included code words in that definition.

[00:11:24] Speaker 03:
And so if the panel reverses the board's claim construction, it must necessarily reverse the board's finding on that limitation as well.

[00:11:37] Speaker 03:
And that is because when you're calculating these distances from these preset code words, you're not calculating them from the first feature vector that is mentioned in the claim that comes from that audio signal. It's a wholly separate value that is created before audio ever enters the system. So under cases like Bell Communications that dealt with similar language, derived from, something can't derive from audio if it is calculated before the audio ever exists.

[00:12:10] Speaker 03:
And so if the panel has no further questions, I'll reserve the remainder of my time. Okay.

[00:12:23] Speaker 00:
May it please the court, Seth Lloyd for Apple. The court correctly construed feature vector based on the claims' express recitation of extracted and/or derived quantities.

[00:12:34] Speaker 00:
I think a lot of the discussion we heard this morning says that the board deleted what the quantities must be derived from, that the board excluded that it has to be derived from the audio signal. But the board's construction, in explaining it, expressly required quantities that are derived from the audio. So at the bottom of Appendix 11, the board says the claims can include extracted quantities, but also include quantities derived from the stream. At the bottom of Appendix 12, again explaining its construction: a code word or representative feature vector comprises a plurality of derived or vector quantized quantities from said digital audio stream.

[00:13:12] Speaker 02:
The board consistently... What they're arguing is that the derivation here is from, I guess, feature vectors in training data rather than the actual stream that's being analyzed here.

[00:13:24] Speaker 00:
Yes, and what the board... Is that accurate? No, let me say it does not give the complete picture, Judge Dyk, and the board made a fact finding on that issue. And what the board found is it's derived from the training data and also from the audio frame. So that's the board's explanation at Appendix 27. And I think counsel's explanation just now largely explained what supports that finding. Counsel said derived from simply means calculated from the audio stream.

[00:13:56] Speaker 02:
And the way that the board understood, and the parties, I think, largely agree about... Before you go on, so where in 27 do they say it's derived in part from the training data and in part from the stream?

[00:14:08] Speaker 00:
Well, they say the second part, Judge Dyk. They say it's the... Where on page 27? The top paragraph, the very last sentence. The board says, additionally, although code words may be determined in advance, they are still used in Jing to derive representative vector quantized versions of the computed feature vectors. And the board cites our expert at Appendix 788 to 790 and then also cites the Jing reference. So where does your expert say this?

[00:14:38] Speaker 00:
And so that's at 788. I think the better... Yes, it's at 788.

[00:14:47] Speaker 00:
But he's building on his explanation that came earlier, Judge Dyk. And I think the best part of his declaration on this point is that... Let me make sure I give the right cite.

[00:15:01] Speaker 00:
It's at Appendix 717. Let me see if that's the one I want. Okay.

[00:15:19] Speaker 00:
So it might take two steps, but the first step is at Appendix 717, paragraph 36, so the top half of that page. They're discussing the Nadas reference, which the other side also relies on. Our expert explained that Nadas teaches that the components of the feature vectors and prototype vectors (prototype vectors in Nadas are code words, that's what the other side has said), that both the feature vectors and the prototype vectors correspond to well-known cepstral coefficients, linear predictive coding coefficients, or frequency band-related characteristics.

[00:15:54] Speaker 00:
So what that is saying is that both feature vectors and code words have the same components. They're both multi-component quantities, and the components are audio frequency components.

[00:16:05] Speaker 02:
Not exactly clear as to whether it comes from the training data or not.

[00:16:09] Speaker 00:
I agree, Judge Dyk. So then the next part is to go to Nadas itself, which is what he's citing there, and that's at Appendix 2000.

[00:16:18] Speaker 00:
Okay.

[00:16:20] Speaker 00:
It's in column one of Appendix 2000, the paragraph that starts about line 44. So over on the left. And again, when Nadas says prototype vectors, the other side has characterized that as code words. And what Nadas says is that in associating prototype vectors with feature vectors, the feature vector for each time interval is typically compared to each prototype vector. Based on a predetermined closeness measure, the distance between the feature vector and each prototype vector is determined and the closest prototype vector is selected.

[00:16:53] Speaker 00:
So what that's explaining is I think largely what you just heard from my friend on the other side, which is the way, yes, the codebook of all the possible code words is determined in advance. But now when you have each audio frame, you have to make a determination. You have to make a calculation about which of those code words is the best approximation for that audio frame. And so you do a closeness measure. You make calculations using the input audio frame and the code book, and you find what is the best approximation for that specific audio frame.

[00:17:27] Speaker 00:
That's what the board found at Appendix 27. That's what our expert explained. And that's, in fact, as you heard from opposing counsel, the way it works in the art. And so the other side's argument is really based on this false notion that the code words can only be derived based on training data, but in fact the code words are derived from training data and then from the given audio frame.

[00:17:52] Speaker 02:
You're saying the latter is true because they're selected based on the stream.

[00:17:57] Speaker 00:
They're calculated based on the audio stream. So there's a calculation, a closeness determination is what Nadas explains. It's a calculation that's made using the audio stream and the prototype vector, going through it and basically calculating for each one which of these is the best approximation for this specific audio frame. That's what the board understood as kind of the ordinary meaning of derived, which we heard. It's not any specific type of calculation. It's just a calculation made with the input audio frame.

[00:18:27] Speaker 00:
And I think, just to take a step back, what we're talking about is the board's fact-finding. The board made fact-findings about this at Appendix 12 in the context of claim construction and then at Appendix 27 in the context of analyzing the prior art. The other side's opening brief does not challenge the board's fact-finding at all. And so this is an unchallenged finding from the board. And at the least, for the reasons we've talked about, the record amply supports that finding. And this also, I think, goes to some of the other questions the court raised in the discussion. Was this claim construction?

[00:18:58] Speaker 00:
Was this fact finding? I think the board understood. The board articulated early in its decision that there was a dispute about whether code words are feature vectors. And as is often the case, you can sort of conceptualize that as a dispute about the scope of the claim or a dispute about the finding of fact. The board said, we'll treat that as an implicit claim construction issue and analyze claim construction. But then even in the context of claim construction, it made findings. At Appendix 12, the board expressly found as fact, citing our expert and crediting our expert, that a person of skill in the art would have understood that code words have multiple quantities.

[00:19:38] Speaker 00:
So this notion that it's a single quantity, the board disagreed with that on the facts, and then found, at the bottom of Appendix 12, that a person of skill in the art would have understood those multiple quantities are derived from the audio stream. That's the discussion we've been having. The record supports that finding. Even in the context of claim construction, the board can make findings of fact as it did here, and the other side would need to show a lack of substantial evidence, which they cannot show.

[00:20:11] Speaker 00:
I think that alone is enough to dispose of this appeal. We did raise alternative grounds. I think the other side touched on those briefly. We're, of course, happy to address those if the court has any questions.

[00:20:22] Speaker 00:
But if not, we ask that you affirm.

[00:20:26] Speaker 02:
Okay, thank you. Ms. Rhodes, you have three minutes.

[00:20:41] Speaker 03:
Starting with the derived from language that counsel and the board kept referring to, that code words are derived from feature vectors because they may be selected from the code book based on feature vectors in Jing that are calculated. The claim language, though, says calculated from the audio stream. Not derived from, calculated from. And so that is a mathematical computation from the audio stream. And what the board did, focusing on the construction first, is broaden that to say representative of derived quantities, which has no... I didn't hear counsel point to anything in the specification defending the board's actual construction here.

[00:21:22] Speaker 02:
What they're suggesting, if I understand correctly, is that there's the calculation step because the selection of the particular code word involves calculations.

[00:21:32] Speaker 03:
That's the vector quantization piece, but Jing makes clear that the audio streams that are used to calculate feature vectors for speech recognition and code words for the code book are two different audio streams, which is the key issue here with the board's construction. And this is in Jing at Appendix 907. It starts at column 6, around line 46.

[00:21:59] Speaker 03:
Where Jing starts talking about speech is input into the system in the form of an audio voice signal. And it talks about how the system then is configured to have a feature extraction module that then can extract feature vectors. We do not dispute that Jing does teach feature vectors. The problem here is Apple does not point to Jing's actual feature vectors for its mapping of the limitations requiring distances to be calculated because it relies on the code word embodiment of Jing instead.

[00:22:34] Speaker 03:
And on the next page, appendix page 908 at the top of column 7, refers to code words using vector quantization techniques in a code book. What line is this? This is line like 4 to 5.

[00:22:49] Speaker 03:
Code words using vector quantization techniques in a code book derived from training data.

[00:22:57] Speaker 03:
Jing talks about how to train the system in column 11, around line 25, and that is a totally different audio stream than what Jing is actually using, and what Apple is pointing to, for the capabilities in the claim to calculate feature vectors and distances. It's just a wholly different technique. And the board agreed with Zentian's expert at Appendix 12 that calculating distances from feature vectors and calculating probabilities from code words are distinct, and a person of skill would understand that.

[00:23:29] Speaker 03:
And the only reason it got to where it did is because it broadened that definition. So for those reasons, we ask that the court reverse the board's claim construction.

[00:23:37] Speaker 02:
Okay. Thank you. Thank both counsel. The case is submitted.