Description
Reminder
- I have read the above rules and searched the existing issues.
System Info
- torch 2.9.0+cu130
- transformers 4.57.3
- autoawq 0.2.9
- llmcompressor 0.9.0
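These versions conflict: AutoAWQ's own deprecation notice (quoted in the log below) says it was last tested against transformers 4.51.3, well before 4.57.3. The failing import can be checked in isolation; a minimal sketch, assuming nothing beyond the packages listed above:

```python
# On transformers 4.57.x this raises the same ImportError seen in the
# traceback below: autoawq 0.2.9 still expects this symbol, which newer
# transformers releases removed.
from transformers.activations import PytorchGELUTanh
```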
Reproduction
[INFO|2026-01-05 09:34:30] llamafactory.model.model_utils.kv_cache:143 >> KV cache is disabled during training.
[WARNING|logging.py:328] 2026-01-05 09:34:30,985 >> `torch_dtype` is deprecated! Use `dtype` instead!
[INFO|auto.py:242] 2026-01-05 09:34:30,986 >>
[WARNING|quantizer_awq.py:102] 2026-01-05 09:34:30,986 >> `torch.bfloat16` is not supported for AWQ CUDA/XPU kernels yet. Casting to `torch.float16`.
[INFO|modeling_utils.py:1169] 2026-01-05 09:34:30,986 >> loading weights file /app/06-model/qwen3-8b-int4-awq/model.safetensors.index.json
[INFO|modeling_utils.py:2341] 2026-01-05 09:34:30,986 >> Instantiating Qwen3ForCausalLM model under default dtype torch.float16.
[INFO|configuration_utils.py:986] 2026-01-05 09:34:30,988 >> Generate config GenerationConfig {
"bos_token_id": 151643,
"eos_token_id": 151645,
"use_cache": false
}
/opt/conda/lib/python3.11/site-packages/awq/__init__.py:21: DeprecationWarning:
I have left this message as the final dev message to help you transition.
Important Notice:
- AutoAWQ is officially deprecated and will no longer be maintained.
- The last tested configuration used Torch 2.6.0 and Transformers 4.51.3.
- If future versions of Transformers break AutoAWQ compatibility, please report the issue to the Transformers project.
Alternative:
- AutoAWQ has been adopted by the vLLM Project: https://github.com/vllm-project/llm-compressor
For further inquiries, feel free to reach out:
- X: https://x.com/casper_hansen_
- LinkedIn: https://www.linkedin.com/in/casper-hansen-804005170/
warnings.warn(_FINAL_DEV_MESSAGE, category=DeprecationWarning, stacklevel=1)
[rank0]: Traceback (most recent call last):
[rank0]: File "/app/src/llamafactory/launcher.py", line 185, in <module>
[rank0]: run_exp()
[rank0]: File "/app/src/llamafactory/train/tuner.py", line 132, in run_exp
[rank0]: _training_function(config={"args": args, "callbacks": callbacks})
[rank0]: File "/app/src/llamafactory/train/tuner.py", line 93, in _training_function
[rank0]: run_sft(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
[rank0]: File "/app/src/llamafactory/train/sft/workflow.py", line 53, in run_sft
[rank0]: model = load_model(tokenizer, model_args, finetuning_args, training_args.do_train)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/app/src/llamafactory/model/loader.py", line 179, in load_model
[rank0]: model = load_class.from_pretrained(**init_kwargs)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/opt/conda/lib/python3.11/site-packages/transformers/models/auto/auto_factory.py", line 604, in from_pretrained
[rank0]: return model_class.from_pretrained(
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/opt/conda/lib/python3.11/site-packages/transformers/modeling_utils.py", line 277, in _wrapper
[rank0]: return func(*args, **kwargs)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/opt/conda/lib/python3.11/site-packages/transformers/modeling_utils.py", line 4998, in from_pretrained
[rank0]: hf_quantizer.preprocess_model(
[rank0]: File "/opt/conda/lib/python3.11/site-packages/transformers/quantizers/base.py", line 225, in preprocess_model
[rank0]: return self._process_model_before_weight_loading(model, **kwargs)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/opt/conda/lib/python3.11/site-packages/transformers/quantizers/quantizer_awq.py", line 119, in _process_model_before_weight_loading
[rank0]: model, has_been_replaced = replace_with_awq_linear(
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/opt/conda/lib/python3.11/site-packages/transformers/integrations/awq.py", line 134, in replace_with_awq_linear
[rank0]: from awq.modules.linear.gemm import WQLinear_GEMM
[rank0]: File "/opt/conda/lib/python3.11/site-packages/awq/__init__.py", line 24, in <module>
[rank0]: from awq.models.auto import AutoAWQForCausalLM
[rank0]: File "/opt/conda/lib/python3.11/site-packages/awq/models/__init__.py", line 1, in <module>
[rank0]: from .mpt import MptAWQForCausalLM
[rank0]: File "/opt/conda/lib/python3.11/site-packages/awq/models/mpt.py", line 1, in <module>
[rank0]: from .base import BaseAWQForCausalLM
[rank0]: File "/opt/conda/lib/python3.11/site-packages/awq/models/base.py", line 49, in <module>
[rank0]: from awq.quantize.quantizer import AwqQuantizer
[rank0]: File "/opt/conda/lib/python3.11/site-packages/awq/quantize/quantizer.py", line 11, in <module>
[rank0]: from awq.quantize.scale import apply_scale, apply_clip
[rank0]: File "/opt/conda/lib/python3.11/site-packages/awq/quantize/scale.py", line 12, in <module>
[rank0]: from transformers.activations import NewGELUActivation, PytorchGELUTanh, GELUActivation
[rank0]: ImportError: cannot import name 'PytorchGELUTanh' from 'transformers.activations' (/opt/conda/lib/python3.11/site-packages/transformers/activations.py)
[rank1]: (same deprecation warnings and traceback as rank0, ending in the identical ImportError: cannot import name 'PytorchGELUTanh' from 'transformers.activations')
[rank0]:[W105 09:34:31.743115696 ProcessGroupNCCL.cpp:1524] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
[rank0]:[W105 09:34:32.336205576 AllocatorConfig.cpp:28] Warning: PYTORCH_CUDA_ALLOC_CONF is deprecated, use PYTORCH_ALLOC_CONF instead (function operator())
W0105 09:34:32.395000 366471 site-packages/torch/distributed/elastic/multiprocessing/api.py:908] Sending process 366491 closing signal SIGTERM
E0105 09:34:32.559000 366471 site-packages/torch/distributed/elastic/multiprocessing/api.py:882] failed (exitcode: 1) local_rank: 0 (pid: 366490) of binary: /opt/conda/bin/python
Traceback (most recent call last):
File "/opt/conda/bin/torchrun", line 7, in <module>
sys.exit(main())
^^^^^^
File "/opt/conda/lib/python3.11/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 357, in wrapper
return f(*args, **kwargs)
^^^^^^^^^^^^^^^^^^
File "/opt/conda/lib/python3.11/site-packages/torch/distributed/run.py", line 936, in main
run(args)
File "/opt/conda/lib/python3.11/site-packages/torch/distributed/run.py", line 927, in run
elastic_launch(
File "/opt/conda/lib/python3.11/site-packages/torch/distributed/launcher/api.py", line 156, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/lib/python3.11/site-packages/torch/distributed/launcher/api.py", line 293, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
/app/src/llamafactory/launcher.py FAILED
------------------------------------------------------------
Failures:
<NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
time : 2026-01-05_09:34:32
host : 68d0f41c5a5f
rank : 0 (local_rank: 0)
exitcode : 1 (pid: 366490)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================
[W105 09:34:32.911273327 AllocatorConfig.cpp:28] Warning: PYTORCH_CUDA_ALLOC_CONF is deprecated, use PYTORCH_ALLOC_CONF instead (function operator())
Traceback (most recent call last):
File "/opt/conda/bin/llamafactory-cli", line 7, in <module>
sys.exit(main())
^^^^^^
File "/app/src/llamafactory/cli.py", line 24, in main
launcher.launch()
File "/app/src/llamafactory/launcher.py", line 115, in launch
process = subprocess.run(
^^^^^^^^^^^^^^^
File "/opt/conda/lib/python3.11/subprocess.py", line 571, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['torchrun', '--nnodes', '1', '--node_rank', '0', '--nproc_per_node', '2', '--master_addr', '127.0.0.1', '--master_port', '60195', '/app/src/llamafactory/launcher.py', '/app/self_made_train_yaml/gpu_dir/qwen3_8b_int4_awq_sft_train_v1_20260105.yaml']' returned non-zero exit status 1.
[W105 09:34:33.485959237 AllocatorConfig.cpp:28] Warning: PYTORCH_CUDA_ALLOC_CONF is deprecated, use PYTORCH_ALLOC_CONF instead (function operator())
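Root cause: transformers 4.57.3 no longer exports `PytorchGELUTanh` from `transformers.activations`, but `awq/quantize/scale.py` in autoawq 0.2.9 still imports it, so any code path that imports `awq` (here, `replace_with_awq_linear` inside the Transformers AWQ integration) fails at model load. Two possible workarounds, neither verified on this exact setup: pin `transformers` back to 4.51.3 (the last combination AutoAWQ's deprecation notice says it was tested with), or monkey-patch the missing class back in before anything imports `awq`. A minimal sketch of the latter, assuming the removed class was the thin tanh-approximated GELU wrapper it was in older Transformers releases:

```python
# Hedged workaround sketch: restore the symbol autoawq 0.2.9 imports but
# transformers 4.57.x no longer provides. Must run before `awq` is imported
# (e.g. at the top of the training entrypoint).
import torch
import torch.nn as nn
import transformers.activations as activations


class PytorchGELUTanh(nn.Module):
    """Stand-in for the removed transformers.activations.PytorchGELUTanh.

    Assumption: the original class simply applied GELU with the tanh
    approximation, as it did in transformers <= 4.51.x.
    """

    def forward(self, input: torch.Tensor) -> torch.Tensor:
        return nn.functional.gelu(input, approximate="tanh")


# Only patch if the attribute is actually missing.
if not hasattr(activations, "PytorchGELUTanh"):
    activations.PytorchGELUTanh = PytorchGELUTanh
```

Longer term, since AutoAWQ states it is unmaintained and has been adopted by the vLLM project as llm-compressor (already installed here as version 0.9.0), producing and loading the AWQ checkpoint via llm-compressor is likely a more durable fix than patching around the removed symbol.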
Others
No response