cmuser@22852e086ad2:~/CM/repos/gateoverflow@mlperf-automations/script$ /home/cmuser/venv/cm/bin/python3 '/home/cmuser/CM/repos/local/cache/39a6f9adec6e40c0/inference/language/mixtral-8x7b/evaluate-accuracy.py' --checkpoint-path '/home/cmuser/CM/repos/local/cache/cab0c6b503c9423a/repo' --mlperf-accuracy-file '/home/cmuser/gh_action_results/test_results/gh_action-reference-cpu-pytorch-v2.5.1-default_config/mixtral-8x7b/offline/accuracy/mlperf_log_accuracy.json' --dataset-file '/home/cmuser/CM/repos/local/cache/8935e6cb4e364efd/mixtral-test-dataset.pkl' --dtype int32 > '/home/cmuser/gh_action_results/test_results/gh_action-reference-cpu-pytorch-v2.5.1-default_config/mixtral-8x7b/offline/accuracy/accuracy.txt'
[nltk_data] Downloading package punkt to /home/cmuser/nltk_data...
[nltk_data] Package punkt is already up-to-date!
Traceback (most recent call last):
  File "/home/cmuser/CM/repos/local/cache/39a6f9adec6e40c0/inference/language/mixtral-8x7b/evaluate-accuracy.py", line 248, in <module>
    main()
  File "/home/cmuser/CM/repos/local/cache/39a6f9adec6e40c0/inference/language/mixtral-8x7b/evaluate-accuracy.py", line 190, in main
    preds, targets = postprocess_text(
  File "/home/cmuser/CM/repos/local/cache/39a6f9adec6e40c0/inference/language/mixtral-8x7b/evaluate-accuracy.py", line 97, in postprocess_text
    preds = ["\n".join(nltk.sent_tokenize(pred)) for pred in preds]
  File "/home/cmuser/CM/repos/local/cache/39a6f9adec6e40c0/inference/language/mixtral-8x7b/evaluate-accuracy.py", line 97, in <listcomp>
    preds = ["\n".join(nltk.sent_tokenize(pred)) for pred in preds]
  File "/home/cmuser/venv/cm/lib/python3.10/site-packages/nltk/tokenize/__init__.py", line 119, in sent_tokenize
    tokenizer = _get_punkt_tokenizer(language)
  File "/home/cmuser/venv/cm/lib/python3.10/site-packages/nltk/tokenize/__init__.py", line 105, in _get_punkt_tokenizer
    return PunktTokenizer(language)
  File "/home/cmuser/venv/cm/lib/python3.10/site-packages/nltk/tokenize/punkt.py", line 1744, in __init__
    self.load_lang(lang)
  File "/home/cmuser/venv/cm/lib/python3.10/site-packages/nltk/tokenize/punkt.py", line 1749, in load_lang
    lang_dir = find(f"tokenizers/punkt_tab/{lang}/")
  File "/home/cmuser/venv/cm/lib/python3.10/site-packages/nltk/data.py", line 579, in find
    raise LookupError(resource_not_found)
LookupError:
**********************************************************************
Resource punkt_tab not found.
Please use the NLTK Downloader to obtain the resource:
>>> import nltk
>>> nltk.download('punkt_tab')
For more information see: https://www.nltk.org/data.html
Attempted to load tokenizers/punkt_tab/english/
Searched in:
- '/home/cmuser/nltk_data'
- '/home/cmuser/venv/cm/nltk_data'
- '/home/cmuser/venv/cm/share/nltk_data'
- '/home/cmuser/venv/cm/lib/nltk_data'
- '/usr/share/nltk_data'
- '/usr/local/share/nltk_data'
- '/usr/lib/nltk_data'
- '/usr/local/lib/nltk_data'
**********************************************************************
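The script downloads the punkt package, but the installed NLTK looks for punkt_tab, hence the LookupError. The same command works fine with nltk==3.8.1.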
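A possible workaround, following the hint in the error message, is to make sure punkt_tab is present before sent_tokenize is called. The sketch below is not taken from evaluate-accuracy.py: ensure_resource and ensure_punkt_tab are hypothetical helper names, and only nltk.data.find and nltk.download are real NLTK calls. The lookup and download functions are passed in as arguments so the logic can be exercised without NLTK installed.

```python
from typing import Callable


def ensure_resource(
    resource_path: str,
    package: str,
    find: Callable[[str], object],
    download: Callable[[str], object],
) -> bool:
    """Return True if the resource is available, downloading it if missing.

    `find` must raise LookupError when the resource is absent (this is how
    nltk.data.find behaves); `download` must return a truthy value on success
    (as nltk.download does).
    """
    try:
        find(resource_path)
        return True
    except LookupError:
        # Resource not installed yet: try to fetch the named package.
        return bool(download(package))


def ensure_punkt_tab() -> bool:
    """Hypothetical helper: make the punkt_tab English tokenizer data available."""
    import nltk  # imported lazily so the module loads without NLTK present

    return ensure_resource(
        "tokenizers/punkt_tab/english/",  # path shown in the traceback above
        "punkt_tab",                      # package named in the error message
        nltk.data.find,
        nltk.download,
    )
```

Calling ensure_punkt_tab() once before postprocess_text runs should avoid the LookupError on NLTK versions that look for punkt_tab, assuming the machine has network access to the NLTK data server.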