Author: YUTONG LI (Harry)
GitHub: https://github.com/JSLEE-0703/Seq2seq-Translator
English-to-Chinese neural machine translation with an attention-based seq2seq model, implemented in PyTorch. Chinese text is segmented into words with jieba.
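As a rough illustration of the segmentation step, the sketch below splits a Chinese sentence into word tokens with `jieba.lcut`. The helper name `segment_chinese` and the per-character fallback are assumptions made for this example, not code taken from the repository; the fallback only exists so the snippet still runs when jieba is not installed.

```python
from typing import List

try:
    import jieba  # third-party: pip install jieba
except ImportError:  # keep the sketch runnable without jieba
    jieba = None


def segment_chinese(sentence: str) -> List[str]:
    """Split a Chinese sentence into tokens (hypothetical helper)."""
    if jieba is None:
        # Fallback (assumption, not from the project): one token per character.
        return [ch for ch in sentence if not ch.isspace()]
    # jieba.lcut returns a list of words using jieba's default (precise) mode.
    # Whitespace-only tokens are dropped so they do not enter the vocabulary.
    return [tok for tok in jieba.lcut(sentence) if tok.strip()]


tokens = segment_chinese("我喜欢机器翻译。")
print(tokens)
```

The exact word boundaries depend on jieba's dictionary, so the printed tokens may differ between versions.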
cleaner.py removes irregular characters from datasets. For example, some datasets append contributor information to every line; cleaner.py can strip it so that only the sentence pairs remain.
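The kind of cleaning described above might look like the following minimal sketch. It is not the actual contents of cleaner.py: it assumes a tab-separated line with the English sentence, the Chinese sentence, and an optional trailing contributor column, and the function name `clean_line` and the set of characters treated as "irregular" are choices made for this example.

```python
import re

# Assumption: "irregular" means control characters, zero-width marks, and BOMs.
_IRREGULAR = re.compile(r"[\x00-\x08\x0b-\x1f\x7f\u200b-\u200f\ufeff]")
_WHITESPACE = re.compile(r"\s+")


def clean_line(line: str) -> str:
    """Return 'source<TAB>target', dropping any extra columns (hypothetical)."""
    fields = line.rstrip("\n").split("\t")
    cleaned = []
    for text in fields[:2]:  # keep only the sentence pair
        text = _IRREGULAR.sub("", text)        # strip irregular characters
        text = _WHITESPACE.sub(" ", text).strip()  # collapse stray whitespace
        cleaned.append(text)
    return "\t".join(cleaned)


# Made-up sample line with a contributor column at the end.
sample = "Hi.\t嗨。\tCC-BY 2.0 Attribution: example contributor\n"
print(clean_line(sample))
```

If a dataset uses a different delimiter or column order, the split and the slice would need to change accordingly.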
pip install torch torchvision jieba matplotlib
The framework for this project is largely complete. Because the current dataset is simple, the model still struggles with long or complex sentences and with unfamiliar word combinations. To improve translation quality further, train on a higher-quality dataset. If you have any suggestions for improvement, please contact me.