Discussed in #2479

Originally posted by matrix1233 October 28, 2024

Hello,

I followed the exact solution provided in the OpenVINO documentation here: https://docs.openvino.ai/2024/notebooks/pixtral-with-output.html, but I am encountering a persistent error during the model conversion to ONNX.

Error:
```
RuntimeError: The serialized model is larger than the 2GiB limit imposed by the protobuf library. Therefore the output file must be a file path, so that the ONNX external data can be written to the same directory. Please specify the output file name.
```
For reference, I tested this both on Hugging Face Spaces and on my own server, with the same result.
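For context, the 2GiB figure is protobuf's hard cap on a single serialized message: an ONNX file that embeds all its weights inline must fit in one message, so a 12B-parameter model can only be exported if the weights are spilled to ONNX external-data files next to the `.onnx` file, which in turn requires a real output path. A minimal sketch of the distinction the error is pointing at (the toy module and file name are illustrative, not part of the failing export):

```python
import io

import torch

# Toy stand-in; Pixtral-12B's weights are far beyond protobuf's 2GiB cap.
model = torch.nn.Linear(4, 4)
dummy = torch.randn(1, 4)

# An in-memory destination forces the whole graph, weights included, into a
# single protobuf message; past 2GiB this raises the RuntimeError shown above.
torch.onnx.export(model, (dummy,), io.BytesIO())

# A filesystem path lets the exporter write oversized weights as ONNX
# external-data files in the same directory, which is what the error asks for.
torch.onnx.export(model, (dummy,), "model.onnx")
```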
Log:

```
optimum-cli export openvino -m "mistral-community/pixtral-12b" --weight-format int8 pixtral-12b/INT8
2024-10-27 20:34:11.520991: I tensorflow/core/util/port.cc:111] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
2024-10-27 20:34:11.552790: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations. To enable the following instructions: AVX2 AVX512F AVX512_VNNI AVX512_BF16 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
No ROCm runtime is found, using ROCM_HOME='/opt/rocm-6.2.2'
config.json: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 997/997 [00:00<00:00, 13.9MB/s]
model.safetensors.index.json: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████| 57.9k/57.9k [00:00<00:00, 717kB/s]
model-00001-of-00006.safetensors: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████| 4.99G/4.99G [01:58<00:00, 42.0MB/s]
model-00002-of-00006.safetensors: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████| 4.96G/4.96G [01:57<00:00, 42.1MB/s]
model-00003-of-00006.safetensors: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████| 4.91G/4.91G [01:56<00:00, 42.2MB/s]
model-00004-of-00006.safetensors: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████| 4.91G/4.91G [01:56<00:00, 42.0MB/s]
model-00005-of-00006.safetensors: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████| 4.26G/4.26G [01:41<00:00, 42.1MB/s]
model-00006-of-00006.safetensors: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████| 1.34G/1.34G [00:31<00:00, 42.4MB/s]
Downloading shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [10:04<00:00, 100.77s/it]
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:00<00:00, 8.05it/s]
generation_config.json: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 116/116 [00:00<00:00, 1.74MB/s]
tokenizer_config.json: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████| 177k/177k [00:00<00:00, 1.05MB/s]
tokenizer.json: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 9.26M/9.26M [00:00<00:00, 13.9MB/s]
special_tokens_map.json: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████| 414/414 [00:00<00:00, 5.97MB/s]
processor_config.json: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 162/162 [00:00<00:00, 2.54MB/s]
chat_template.json: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1.63k/1.63k [00:00<00:00, 24.9MB/s]
preprocessor_config.json: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 483/483 [00:00<00:00, 7.16MB/s]
We detected that you are passing past_key_values as a tuple of tuples. This is deprecated and will be removed in v4.47. Please convert your cache or use an appropriate Cache class (https://huggingface.co/docs/transformers/kv_cache#legacy-cache-format)
/home/local/miniconda3/envs/onnx/lib/python3.11/site-packages/transformers/cache_utils.py:447: TracerWarning: Using len to get tensor shape might cause the trace to be incorrect. Recommended usage would be tensor.shape[0]. Passing a tensor of different shape might lead to errors or silently give incorrect results.
or len(self.key_cache[layer_idx]) == 0 # the layer has no cache
/home/local/miniconda3/envs/onnx/lib/python3.11/site-packages/transformers/cache_utils.py:432: TracerWarning: Using len to get tensor shape might cause the trace to be incorrect. Recommended usage would be tensor.shape[0]. Passing a tensor of different shape might lead to errors or silently give incorrect results.
elif len(self.key_cache[layer_idx]) == 0: # fills previously skipped layers; checking for tensor causes errors
Starting from v4.46, the logits model output will have the same type as the model (except at train time, where it will always be FP32)
[ WARNING ] Unexpectedly found already patched module language_model.model.embed_tokens while applying ModuleExtension during PyTorch model conversion. Result of the conversion maybe broken. Depending on the exact issue it may lead to broken original model.
[ WARNING ] Unexpectedly found already patched module language_model.model.layers.0.self_attn.q_proj while applying ModuleExtension during PyTorch model conversion. Result of the conversion maybe broken. Depending on the exact issue it may lead to broken original model.
[ WARNING ] Unexpectedly found already patched module language_model.model.layers.0.self_attn.k_proj while applying ModuleExtension during PyTorch model conversion. Result of the conversion maybe broken. Depending on the exact issue it may lead to broken original model.
[ WARNING ] Unexpectedly found already patched module language_model.model.layers.0.self_attn.v_proj while applying ModuleExtension during PyTorch model conversion. Result of the conversion maybe broken. Depending on the exact issue it may lead to broken original model.
[ WARNING ] Unexpectedly found already patched module language_model.model.layers.0.self_attn.o_proj while applying ModuleExtension during PyTorch model conversion. Result of the conversion maybe broken. Depending on the exact issue it may lead to broken original model.
.
.
.
[ WARNING ] Unexpectedly found already patched module language_model.model.layers.38.mlp.down_proj while applying ModuleExtension during PyTorch model conversion. Result of the conversion maybe broken. Depending on the exact issue it may lead to broken original model.
[ WARNING ] Unexpectedly found already patched module language_model.model.layers.39.self_attn.q_proj while applying ModuleExtension during PyTorch model conversion. Result of the conversion maybe broken. Depending on the exact issue it may lead to broken original model.
[ WARNING ] Unexpectedly found already patched module language_model.model.layers.39.self_attn.k_proj while applying ModuleExtension during PyTorch model conversion. Result of the conversion maybe broken. Depending on the exact issue it may lead to broken original model.
[ WARNING ] Unexpectedly found already patched module language_model.model.layers.39.self_attn.v_proj while applying ModuleExtension during PyTorch model conversion. Result of the conversion maybe broken. Depending on the exact issue it may lead to broken original model.
[ WARNING ] Unexpectedly found already patched module language_model.model.layers.39.self_attn.o_proj while applying ModuleExtension during PyTorch model conversion. Result of the conversion maybe broken. Depending on the exact issue it may lead to broken original model.
[ WARNING ] Unexpectedly found already patched module language_model.model.layers.39.mlp.gate_proj while applying ModuleExtension during PyTorch model conversion. Result of the conversion maybe broken. Depending on the exact issue it may lead to broken original model.
[ WARNING ] Unexpectedly found already patched module language_model.model.layers.39.mlp.up_proj while applying ModuleExtension during PyTorch model conversion. Result of the conversion maybe broken. Depending on the exact issue it may lead to broken original model.
[ WARNING ] Unexpectedly found already patched module language_model.model.layers.39.mlp.down_proj while applying ModuleExtension during PyTorch model conversion. Result of the conversion maybe broken. Depending on the exact issue it may lead to broken original model.
[ WARNING ] Unexpectedly found already patched module language_model.lm_head while applying ModuleExtension during PyTorch model conversion. Result of the conversion maybe broken. Depending on the exact issue it may lead to broken original model.
/home/local/miniconda3/envs/onnx/lib/python3.11/site-packages/transformers/models/pixtral/modeling_pixtral.py:492: TracerWarning: Iterating over a tensor might cause the trace to be incorrect. Passing a tensor of different shape won't change the number of iterations executed (and might lead to errors or silently give incorrect results).
patch_embeds_list = [self.patch_conv(img.unsqueeze(0).to(self.dtype)) for img in pixel_values]
/home/local/miniconda3/envs/onnx/lib/python3.11/site-packages/nncf/torch/dynamic_graph/wrappers.py:86: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
op1 = operator(*args, **kwargs)
/home/local/miniconda3/envs/onnx/lib/python3.11/site-packages/transformers/models/pixtral/modeling_pixtral.py:448: TracerWarning: Iterating over a tensor might cause the trace to be incorrect. Passing a tensor of different shape won't change the number of iterations executed (and might lead to errors or silently give incorrect results).
for start, end in zip(block_start_idx, block_end_idx):
[ WARNING ] Unexpectedly found already patched module while applying ModuleExtension during PyTorch model conversion. Result of the conversion maybe broken. Depending on the exact issue it may lead to broken original model.
Export model to OpenVINO directly failed with:
Config dummy inputs are not a subset of the model inputs: {'input'} vs {'kwargs', 'args'}.
Model will be exported to ONNX
Traceback (most recent call last):
File "/home/local/miniconda3/envs/onnx/lib/python3.11/site-packages/optimum/exporters/openvino/convert.py", line 382, in export_pytorch
check_dummy_inputs_are_allowed(model, dummy_inputs)
File "/home/local/miniconda3/envs/onnx/lib/python3.11/site-packages/optimum/exporters/onnx/convert.py", line 97, in check_dummy_inputs_are_allowed
raise ValueError(
ValueError: Config dummy inputs are not a subset of the model inputs: {'input'} vs {'kwargs', 'args'}
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/local/miniconda3/envs/onnx/bin/optimum-cli", line 8, in <module>
sys.exit(main())
^^^^^^
File "/home/local/miniconda3/envs/onnx/lib/python3.11/site-packages/optimum/commands/optimum_cli.py", line 208, in main
service.run()
File "/home/local/miniconda3/envs/onnx/lib/python3.11/site-packages/optimum/commands/export/openvino.py", line 349, in run
main_export(
File "/home/local/miniconda3/envs/onnx/lib/python3.11/site-packages/optimum/exporters/openvino/__main__.py", line 393, in main_export
submodel_paths = export_from_model(
^^^^^^^^^^^^^^^^^^
File "/home/local/miniconda3/envs/onnx/lib/python3.11/site-packages/optimum/exporters/openvino/convert.py", line 701, in export_from_model
export_models(
File "/home/local/miniconda3/envs/onnx/lib/python3.11/site-packages/optimum/exporters/openvino/convert.py", line 504, in export_models
export(
File "/home/local/miniconda3/envs/onnx/lib/python3.11/site-packages/optimum/exporters/openvino/convert.py", line 144, in export
return export_pytorch(
^^^^^^^^^^^^^^^
File "/home/local/miniconda3/envs/onnx/lib/python3.11/site-packages/optimum/exporters/openvino/convert.py", line 408, in export_pytorch
return export_pytorch_via_onnx(
^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/local/miniconda3/envs/onnx/lib/python3.11/site-packages/optimum/exporters/openvino/convert.py", line 256, in export_pytorch_via_onnx
input_names, output_names = export_pytorch_to_onnx(
^^^^^^^^^^^^^^^^^^^^^^^
File "/home/local/miniconda3/envs/onnx/lib/python3.11/site-packages/optimum/exporters/onnx/convert.py", line 584, in export_pytorch
onnx_export(
File "/home/local/miniconda3/envs/onnx/lib/python3.11/site-packages/torch/onnx/__init__.py", line 375, in export
export(
File "/home/local/miniconda3/envs/onnx/lib/python3.11/site-packages/torch/onnx/utils.py", line 502, in export
_export(
File "/home/local/miniconda3/envs/onnx/lib/python3.11/site-packages/torch/onnx/utils.py", line 1564, in _export
graph, params_dict, torch_out = _model_to_graph(
^^^^^^^^^^^^^^^^
File "/home/local/miniconda3/envs/onnx/lib/python3.11/site-packages/torch/onnx/utils.py", line 1117, in _model_to_graph
graph = _optimize_graph(
^^^^^^^^^^^^^^^^
File "/home/local/miniconda3/envs/onnx/lib/python3.11/site-packages/torch/onnx/utils.py", line 663, in _optimize_graph
_C._jit_pass_onnx_graph_shape_type_inference(
RuntimeError: The serialized model is larger than the 2GiB limit imposed by the protobuf library. Therefore the output file must be a file path, so that the ONNX external data can be written to the same directory. Please specify the output file name.
```
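The traceback shows the actual failure sequence: the direct PyTorch-to-OpenVINO conversion aborts first (`Config dummy inputs are not a subset of the model inputs: {'input'} vs {'kwargs', 'args'}`), optimum then falls back to an ONNX export, and that fallback serializes the model in memory and trips protobuf's 2GiB cap. As a cross-check that this is not CLI-specific, the same export can be driven through the `main_export` entry point visible in the traceback. A hedged sketch, assuming the keyword names mirror the CLI flags (they vary across optimum-intel versions, and the int8 weight-compression option is omitted for that reason):

```python
# Hedged sketch: invoke the export through the Python entry point that appears
# in the traceback (optimum/exporters/openvino/__main__.py::main_export) rather
# than through optimum-cli. Keyword names are an assumption; verify them
# against the installed optimum-intel version.
from optimum.exporters.openvino import main_export

main_export(
    model_name_or_path="mistral-community/pixtral-12b",
    output="pixtral-12b/INT8",
)
```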