-
Notifications
You must be signed in to change notification settings - Fork 36
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add speech recognition context to the Web Speech API #145
base: main
Are you sure you want to change the base?
Conversation
Hi @padenot, can you review this one too and also take a look at the explainer again when you get a chance? Feel free to assign another reviewer too if needed |
index.bs
Outdated
[Exposed=Window] | ||
interface SpeechRecognitionPhraseList { | ||
readonly attribute unsigned long length; | ||
getter SpeechRecognitionPhrase item(unsigned long index); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is invalid WebIDL.
It would be either:
SpeechRecognition item(unsigned long index);
or
getter SpeechRecognition(unsigned long index)
What is the intent here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've changed this to SpeechRecognitionPhrase item(unsigned long index)
, but using getter is actually how SpeechRecognitionResultList
is doing it, as well as some list objects I've seen in other specs, e.g. https://html.spec.whatwg.org/multipage/common-dom-interfaces.html#the-domstringlist-interface. I thought it's a standard thing to always define a getter for a list, but I don't see that getter is required in our use case, so I can either keep it or remove it.
5be471c
to
f1cd16f
Compare
Hi @padenot, I've updated the specs as well as the explainer according to your comments. Please take a look again when you get a chance. Thanks! |
Can we please focus on either the explainer or the spec patch? If we have a spec patch, an explainer shouldn't be necessary. If we aren't comfortable writing the spec patch right now because we want to iterate, it doesn't seem useful to update the spec patch. Let me know which one I should look at first please? |
Hi @padenot, you can focus on the spec patch right now. I'm keeping the spec patch and the explainer in sync and the spec patch has many more details than the explainer, so if we can reach consensus on the spec, we can also reach consensus on the explainer easily. I think the explainer is still necessary when we want to launch this feature since we will be asked for a link to an explainer in many places. The explainer is also a good place to show why we want to add contextual biasing, and provides brief introduction on the changes for people who don't want to learn about every detail in specs. |
It's not necessary. Details can and should be in informative notes or MDN. |
I've closed the PR for explainer. Can you now review the spec changes? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Getting there! lmk if I have been unclear in certain parts of the review.
Hi @padenot, I decided to remove |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking a lot better, some questions still, but it feels we're almost there.
Hi @padenot, a gentle ping for reviewing this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Almost there!
index.bs
Outdated
The setter steps are: | ||
1. If the {{SpeechRecognitionPhraseList/length}} of {{SpeechRecognition/phrases}} is greater than 0 | ||
and the system using the given value for {{SpeechRecognition/[[mode]]}} does not support contextual biasing, | ||
throw a {{SpeechRecognitionErrorEvent}} with the {{SpeechRecognitionErrorCode/not-allowed}} error code and abort these steps. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not-supported
for consistency? Or we want to clearly distinguish the two? What do we lose by using "normal" errors here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking that when user is trying to set the mode here but get a phrases-not-supported
error, it seems weird that the error is not related to mode directly, but we can also add error message to explain this better, so I'll switch to phrases-not-supported
for consistency.
<dt><dfn method for=SpeechRecognitionPhraseList>addItem(|item|)</dfn> method</dt> | ||
<dd> | ||
This method adds the {{SpeechRecognitionPhrase}} object |item| to the list. | ||
When invoked, add |item| to the end of {{SpeechRecognitionPhraseList/[[phrases]]}}. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if we add an element twice with different scores? Do we want to use a set instead from the infra
link above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we want to avoid the same phrase with different boosts, we will need to use a map like map<string, float>
for the phrase-boost pair, rather than a set. I feel like this adds more limitations to the design which is not as desired. For example, if in the future we want to add a pronunciation attribute for each SpeechRecognitionPhrase
, using a list like now is easier than a map.
For now if user adds the phrase twice with different boosts, I think the system can use the boost that's updated at the second phrase, and this case should be rare.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, my point is that we can do whatever, but it needs to be specified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added descriptions for this case.
Introduce a new speech recognition context feature for contextual biasing
Remove SpeechRecognitionContext and add SpeechRecognitionPhraseList to SpeechRecognition directly Remove updateContext and always update phrases instead Rename context-not-supported error code to phrases-not-supported Add removeItem to SpeechRecognitionPhraseList
Introduce a new speech recognition context feature for contextual biasing
Preview | Diff