Problem
At the beginning of the spoken-to-signed translation pipeline, we perform multiple tasks, an important one of which is text normalization.
Unlike the other tasks, which run fully offline, text normalization relies on an online-only service: results degrade when the device is offline, and the network round trip adds small delays when online.
Ideally, for privacy reasons, we would also like to replace this endpoint with a local model.
Furthermore, this API endpoint costs money to run, since each call invokes GPT-3 to normalize the text.
Description
It seems every large company is now pushing small local LLMs, with limited world knowledge but strong text-processing abilities.
For example, Google is shipping Gemini Nano in Chrome (experimental API): https://x.com/rauchg/status/1806385778064564622
https://developer.chrome.com/docs/ai/built-in
If this built-in API ever reaches production, we should prompt it for normalization instead of prompting ChatGPT.
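As a rough illustration, swapping the online GPT-3 call for Chrome's built-in model could look like the sketch below. This is not our production code: the Prompt API is experimental and its surface has changed across Chrome versions, so the `LanguageModel` global, `create()`, `prompt()`, and `destroy()` are assumptions based on the origin-trial shape, and the prompt wording is purely illustrative.

```javascript
// Pure helper: builds the normalization instruction we would send to the
// local model. The exact wording here is illustrative, not a real prompt
// from our pipeline.
function buildNormalizationPrompt(text) {
  return [
    'Normalize the following text for a spoken-to-signed translation pipeline.',
    'Expand abbreviations, spell out numbers, and fix casing and punctuation.',
    'Return only the normalized text.',
    '',
    text,
  ].join('\n');
}

// Hypothetical drop-in replacement for the online normalization endpoint.
// Assumes the experimental Chrome Prompt API shape (LanguageModel global).
async function normalizeText(text) {
  // Feature-detect: the built-in model only exists behind a flag in
  // recent Chrome builds, so fall back to the online endpoint elsewhere.
  if (typeof LanguageModel === 'undefined') {
    throw new Error('Built-in language model unavailable; use the online endpoint.');
  }
  const session = await LanguageModel.create();
  try {
    return await session.prompt(buildNormalizationPrompt(text));
  } finally {
    session.destroy(); // free the model session when done
  }
}
```

Because the call is feature-detected, the pipeline can keep the existing GPT-3 endpoint as a fallback while the built-in API remains experimental.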
Alternatives
Train our own normalization model on existing text normalization data, or collect data using ChatGPT.
Training our own model would divert resources from our main objective, and would require users to host yet another model on their device (which is undesirable).