Add evaluation metric for accuracy except device IDs #753

Open
gcampax opened this issue Aug 26, 2021 · 0 comments
Labels: enhancement (New feature or request), training (Issues with dataset generation, augmentation, training)


gcampax commented Aug 26, 2021

As discussed in the meeting with Kevin: for most skills (practically all but IoT skills), device IDs don't need to be correct at parse time, because they can be added automatically in postprocessing.

We still want to count them in "exact match accuracy" because they are part of the target program. Pre-normalization, token-by-token exact match accuracy should remain the target metric because of how seq2seq works.

To account for this, and to remove some less-relevant errors from the error analysis of devices, we should introduce a new partial accuracy metric, "ok_without_device_id". This would be implemented in SentenceEvaluator and in the associated command-line code.
We can implement it either with some token manipulation (recognizing the token sequence "id = GENERIC_ENTITY_*" and removing it from both the target and the prediction before comparing), or with a proper NodeVisitor that visits all DeviceSelectors and sets the id to null.
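A minimal sketch of the token-manipulation option, to make the idea concrete. The helper names `stripDeviceIds` and `okWithoutDeviceId` are hypothetical and not part of the existing SentenceEvaluator code. The sketch assumes programs are space-separated token strings and that device attributes are comma-separated, so a comma left dangling next to a removed id is dropped as well.

```typescript
// Hypothetical helpers for illustration; not the existing SentenceEvaluator API.
const ENTITY_PREFIX = 'GENERIC_ENTITY_';

// Remove every "id = GENERIC_ENTITY_*" token triple from a token sequence.
function stripDeviceIds(tokens: readonly string[]): string[] {
    const out: string[] = [];
    let i = 0;
    while (i < tokens.length) {
        const isDeviceId = tokens[i] === 'id' && tokens[i + 1] === '=' &&
            tokens[i + 2] !== undefined && tokens[i + 2].startsWith(ENTITY_PREFIX);
        if (!isDeviceId) {
            out.push(tokens[i]);
            i += 1;
            continue;
        }
        // Skip the three tokens "id", "=", "GENERIC_ENTITY_*".
        i += 3;
        // Assumption: attributes are comma-separated, so drop one adjacent
        // comma (the following one if present, otherwise the preceding one).
        if (tokens[i] === ',')
            i += 1;
        else if (out[out.length - 1] === ',')
            out.pop();
    }
    return out;
}

// Token-by-token exact match after device IDs are removed from both sides.
function okWithoutDeviceId(target: string, prediction: string): boolean {
    const a = stripDeviceIds(target.split(' '));
    const b = stripDeviceIds(prediction.split(' '));
    return a.length === b.length && a.every((tok, idx) => tok === b[idx]);
}
```

With made-up token strings, a target of `@light-bulb ( id = GENERIC_ENTITY_tt:device_id_0 ) . set_power ( power = enum on ) ;` and a prediction that differs only in the entity index (`..._1`) would fail exact match but pass this check. The sketch leaves an empty `( )` behind when the id was the only attribute, so a prediction that omits the parentheses entirely would still count as wrong. The NodeVisitor option would avoid that kind of syntactic edge case.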

gcampax added the enhancement and training labels on Aug 26, 2021
2 participants