GPT-J Huggingface validation error #2010

Open

Xi0131 opened this issue Jan 2, 2025 · 1 comment

Xi0131 commented Jan 2, 2025

Hi, in the final step of this benchmark, the following error occurs.

Finished downloading all the datasets!
[2025-01-02 11:19:46,917 preprocess_data.py:73 INFO] Creating GPT tokenizer...
[2025-01-02 11:19:46,917 preprocess_data.py:39 INFO] Initializing tokenizer from build/models/GPTJ-6B/checkpoint-final
Traceback (most recent call last):
  File "/home/cmuser/.local/lib/python3.8/site-packages/transformers/utils/hub.py", line 389, in cached_file
    resolved_file = hf_hub_download(
  File "/home/cmuser/.local/lib/python3.8/site-packages/huggingface_hub/utils/_validators.py", line 106, in _inner_fn
    validate_repo_id(arg_value)
  File "/home/cmuser/.local/lib/python3.8/site-packages/huggingface_hub/utils/_validators.py", line 154, in validate_repo_id
    raise HFValidationError(
huggingface_hub.errors.HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': 'build/models/GPTJ-6B/checkpoint-final'. Use `repo_type` argument if needed.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/cmuser/CM/repos/local/cache/b1932adfb3014ecd/repo/closed/NVIDIA/code/gptj/tensorrt/preprocess_data.py", line 138, in <module>
    main()
  File "/home/cmuser/CM/repos/local/cache/b1932adfb3014ecd/repo/closed/NVIDIA/code/gptj/tensorrt/preprocess_data.py", line 132, in main
    preprocess_cnndailymail_gptj6b(data_dir, model_dir, preprocessed_data_dir)
  File "/home/cmuser/CM/repos/local/cache/b1932adfb3014ecd/repo/closed/NVIDIA/code/gptj/tensorrt/preprocess_data.py", line 74, in preprocess_cnndailymail_gptj6b
    tokenizer = prepare_tokenizer(ckpt_path, padding_side="right")
  File "/home/cmuser/CM/repos/local/cache/b1932adfb3014ecd/repo/closed/NVIDIA/code/gptj/tensorrt/preprocess_data.py", line 40, in prepare_tokenizer
    tokenizer = AutoTokenizer.from_pretrained(
  File "/home/cmuser/.local/lib/python3.8/site-packages/transformers/models/auto/tokenization_auto.py", line 737, in from_pretrained
    tokenizer_config = get_tokenizer_config(pretrained_model_name_or_path, **kwargs)
  File "/home/cmuser/.local/lib/python3.8/site-packages/transformers/models/auto/tokenization_auto.py", line 569, in get_tokenizer_config
    resolved_config_file = cached_file(
  File "/home/cmuser/.local/lib/python3.8/site-packages/transformers/utils/hub.py", line 454, in cached_file
    raise EnvironmentError(
OSError: Incorrect path_or_model_id: 'build/models/GPTJ-6B/checkpoint-final'. Please provide either the path to a local folder or the repo_id of a model on the Hub.
make: *** [/home/cmuser/CM/repos/local/cache/b1932adfb3014ecd/repo/closed/NVIDIA/Makefile.data:36: preprocess_data] Error 1

CM error: Portable CM script failed (name = app-mlperf-inference-nvidia, return code = 256)
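
For context on what the traceback shows: transformers resolves the argument of AutoTokenizer.from_pretrained() as a local folder only when that directory actually exists relative to the current working directory; anything else is validated as a Hub repo id, and a relative path containing more than one '/' fails that validation. A minimal sketch of the check (the path and the padding_side argument are taken from the traceback above; the rest is illustrative):

import os
from transformers import AutoTokenizer

ckpt_path = "build/models/GPTJ-6B/checkpoint-final"  # same path as in the log

# from_pretrained() treats the argument as a local checkpoint only when
# the directory exists; otherwise it is validated as a Hub repo id,
# which may contain at most one '/', so this relative path raises
# HFValidationError whenever the folder is missing.
if os.path.isdir(ckpt_path):
    tokenizer = AutoTokenizer.from_pretrained(ckpt_path, padding_side="right")
else:
    print(f"'{ckpt_path}' not found from cwd '{os.getcwd()}'; "
          "transformers would treat it as a repo id and fail validation")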

Here is the command used for the benchmark:

cm run script --tags=run-mlperf,inference,_find-performance,_full,_r4.1-dev \
   --model=gptj-99 \
   --implementation=nvidia \
   --framework=tensorrt \
   --category=datacenter \
   --scenario=Offline \
   --execution_mode=test \
   --device=cuda  \
   --docker --quiet \
   --test_query_count=50

After the error, I stayed inside the Docker container.
How do I solve this issue? If I rerun the same command, will the whole compilation process start over, or can I resume from where it was interrupted after fixing the error?
Thank you.

@arjunsuresh
Contributor

Hi @Xi0131 Does the folder build/models/GPTJ-6B/checkpoint-final contain the GPT-J checkpoint inside the container?
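
A quick way to verify this from inside the container, run from the same working directory the preprocessing script uses (the file names below are what a typical Hugging Face GPT-J checkpoint contains; treat the list as an assumption, since the benchmark's converted checkpoint may differ):

import os

ckpt = "build/models/GPTJ-6B/checkpoint-final"
# Files usually present in a Hugging Face GPT-J checkpoint; this list is
# an assumption, not the benchmark's exact layout.
expected = ["config.json", "tokenizer.json", "tokenizer_config.json",
            "vocab.json", "merges.txt"]

print("directory exists:", os.path.isdir(ckpt))
for name in expected:
    path = os.path.join(ckpt, name)
    print(f"{name}: {'present' if os.path.exists(path) else 'MISSING'}")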
