Sango to English translation #130
Comments
Hi @micuentadecasa! Yes, you're right, it's because of the language direction: the default in the starter notebook is to translate from English. However, you may use the same English-to-Sango data for testing the reverse model as well. Just set the paths accordingly so that your source paths point to the Sango portion and the target paths to the English portion.
Hi Julia, of course, I changed the code as you suggested, but the files that it tries to download aren't at your URL. Can I suggest some ideas for helping other people:
Regards.
Dear @micuentadecasa, thanks for your ideas! I agree that the starter notebook is not adequately equipped for handling other directions. We should at least make that explicit in the notebook. The links to my GitHub are also not needed anymore, since the test sets are available in the Masakhane GitHub repo as well. If you have concrete suggestions, please pack them in a PR and we can review and merge them. Please submit a pull request for adding your new test set to https://github.com/masakhane-io/masakhane-mt/tree/master/jw300_utils/test.
Thanks Julia, I need a bit more of your help to continue. It seems there is no corpus for Sango to English: "JW300_latest_xml_sg-en.xml.gz not found". In addition, how can I create the test.sg-en.en file, or the test.sg-en.sg file? Are these the files to be created for translating from Sango to English? I'll be happy to create a PR, but as you can see I will need some help from you. Regards.
You can use the same data (English to Sango) and just swap the sides (for example, rename en-sg.en to sg-en.en) 🙂
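For anyone landing here later, the swap could be sketched roughly as below. This is only a minimal illustration, not code from the starter notebook: the helper name `swap_direction`, the `data` directory, and the `test.<src>-<tgt>.<lang>` naming pattern are assumptions inferred from the file names mentioned in this thread. It copies instead of renaming so that the original English-to-Sango files stay usable.

```python
import shutil
from pathlib import Path


def swap_direction(data_dir, src="en", tgt="sg", prefix="test"):
    """Copy '<prefix>.<src>-<tgt>.<lang>' to '<prefix>.<tgt>-<src>.<lang>'.

    Hypothetical helper: the naming pattern is assumed from this thread,
    not taken from the notebook. Returns the list of created paths.
    """
    data_dir = Path(data_dir)
    created = []
    for lang in (src, tgt):
        old = data_dir / f"{prefix}.{src}-{tgt}.{lang}"
        new = data_dir / f"{prefix}.{tgt}-{src}.{lang}"
        if not old.exists():
            raise FileNotFoundError(f"expected parallel file not found: {old}")
        # Copy rather than rename, so the original direction keeps working.
        shutil.copyfile(old, new)
        created.append(new)
    return created
```

Calling `swap_direction("data", src="en", tgt="sg")` would then produce `test.sg-en.en` and `test.sg-en.sg` next to the existing files. The line contents are untouched; only the direction tag in the file name changes, so the notebook's source and target language settings still need to be set to `sg` and `en`.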
Hi, I'm trying to use your script in https://github.com/masakhane-io/masakhane-mt/blob/master/starter_notebook.ipynb to create a translator for the Sango language, but it fails when trying to download the global test set: "test.en-any.en" doesn't exist. I think this happens because my src is "sg" and my target is "en"; I tested the inverse (English to Sango) and it worked.
Regards.