Add evaluation metric for accuracy except device IDs #753

gcampax · 2021-08-26T09:38:50Z

As discussed in the meeting with Kevin: for most skills (in practice, all but IoT skills), device IDs don't need to be correct at parse time because they can be added automatically in postprocessing.

We still want to count them in "exact match accuracy" because they are part of the target program. We want pre-normalization, token-by-token exact match accuracy to be the target metric because of how seq2seq works.

To account for this, and to remove some less-relevant errors from the error analysis of devices, we should introduce a new partial accuracy metric, "ok_without_device_id". It would be implemented in SentenceEvaluator and in the associated command-line code.
We can implement this either with some token manipulation (recognizing the token sequence "id = GENERIC_ENTITY_*" and removing it) or with a proper NodeVisitor that visits all DeviceSelectors and sets the id to null.
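To make the token-manipulation option concrete, here is a minimal sketch in Python. It is not the SentenceEvaluator code: the function names `strip_device_ids` and `ok_without_device_id`, the example program syntax, and the handling of a neighbouring comma are all assumptions made for illustration. Only the `id = GENERIC_ENTITY_*` pattern comes from the description above.

```python
from typing import List

# Prefix of the entity placeholder token, taken from the pattern
# "id = GENERIC_ENTITY_*" described in the issue.
ENTITY_PREFIX = "GENERIC_ENTITY_"


def strip_device_ids(tokens: List[str]) -> List[str]:
    """Return a copy of `tokens` without any `id = GENERIC_ENTITY_*` triple."""
    out: List[str] = []
    i = 0
    while i < len(tokens):
        is_device_id = (
            tokens[i] == "id"
            and i + 2 < len(tokens)
            and tokens[i + 1] == "="
            and tokens[i + 2].startswith(ENTITY_PREFIX)
        )
        if is_device_id:
            i += 3  # skip "id", "=", and the entity token
            # Assumption: parameters are comma-separated, so drop one
            # adjacent comma to keep the remaining parameter list tidy.
            if i < len(tokens) and tokens[i] == ",":
                i += 1
            elif out and out[-1] == ",":
                out.pop()
            continue
        out.append(tokens[i])
        i += 1
    return out


def ok_without_device_id(prediction: List[str], target: List[str]) -> bool:
    """Token-by-token exact match after removing device IDs from both sides."""
    return strip_device_ids(prediction) == strip_device_ids(target)


# Hypothetical tokenized programs, for illustration only.
target = "@com.example ( id = GENERIC_ENTITY_tt:device_id_0 ) . get ( )".split()
prediction = "@com.example ( ) . get ( )".split()
print(ok_without_device_id(prediction, target))
```

Stripping both the prediction and the target keeps the comparison symmetric, so a prediction that includes the ID and a target that omits it (or the reverse) are treated the same way. The NodeVisitor option would avoid the comma handling altogether, since it works on the parsed program rather than on tokens.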

gcampax added the enhancement and training labels Aug 26, 2021

gcampax assigned kevintangzero Aug 26, 2021
