Merged
2 changes: 1 addition & 1 deletion src/llamafactory/hparams/parser.py
@@ -340,7 +340,7 @@ def get_train_args(args: dict[str, Any] | list[str] | None = None) -> _TRAIN_CLS
if training_args.deepspeed is not None and (finetuning_args.use_galore or finetuning_args.use_apollo):
raise ValueError("GaLore and APOLLO are incompatible with DeepSpeed yet.")

-if training_args.fp8 and training_args.quantization_bit is not None:
+if training_args.fp8 and model_args.quantization_bit is not None:
Contributor

high

This fix is correct! While reviewing, I noticed another related issue with fp8 argument handling in this file. On line 364, model_args.fp8 is assigned a value. However, the fp8 attribute is defined in TrainingArguments, not ModelArguments. This will cause an AttributeError if that code path is executed. To make this fp8 fix complete, could you please also change line 364 to training_args.fp8 = True?

raise ValueError("FP8 training is not compatible with quantization. Please disable one of them.")

if model_args.infer_backend != EngineName.HF:
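To make the attribute split described in the comment above concrete, here is a minimal sketch. The dataclasses below only mimic the relevant fields of LLaMA-Factory's `TrainingArguments` and `ModelArguments` (the real classes have many more fields, and the defaults shown are assumptions); it illustrates why the compatibility check reads `fp8` from `training_args` and `quantization_bit` from `model_args`, and why the suggested line-364 fix must assign to `training_args.fp8`.

```python
# Illustrative sketch only: these stand-ins mirror where the real
# LLaMA-Factory dataclasses define fp8 and quantization_bit.
from dataclasses import dataclass
from typing import Optional


@dataclass
class TrainingArguments:
    fp8: bool = False  # FP8 mixed-precision flag is defined here


@dataclass
class ModelArguments:
    quantization_bit: Optional[int] = None  # quantization setting is defined here


def check_fp8_compat(training_args: TrainingArguments, model_args: ModelArguments) -> None:
    # The corrected check from this PR: fp8 comes from training_args,
    # quantization_bit comes from model_args.
    if training_args.fp8 and model_args.quantization_bit is not None:
        raise ValueError("FP8 training is not compatible with quantization. Please disable one of them.")


# Companion fix suggested in the review comment (line 364): enable FP8 on
# training_args, not model_args, since ModelArguments has no fp8 attribute.
# training_args.fp8 = True
```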