After running the reproduction code below, I get this error:
Traceback (most recent call last):
  File "/root/full_parameter_ft_use_lomo.py", line 9, in <module>
    args = TrainingArguments(
           ^^^^^^^^^^^^^^^^^^
  File "<string>", line 128, in __init__
  File "/root/miniconda3/lib/python3.12/site-packages/transformers/training_args.py", line 1623, in __post_init__
    self.optim = OptimizerNames(self.optim)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/lib/python3.12/enum.py", line 757, in __call__
    return cls.__new__(cls, value)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/lib/python3.12/enum.py", line 1179, in __new__
    raise exc
  File "/root/miniconda3/lib/python3.12/enum.py", line 1156, in __new__
    result = cls._missing_(value)
             ^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/lib/python3.12/site-packages/transformers/utils/generic.py", line 495, in _missing_
    raise ValueError(
ValueError: adalomo is not a valid OptimizerNames, please select one of ['adamw_hf', 'adamw_torch', 'adamw_torch_fused', 'adamw_torch_xla', 'adamw_torch_npu_fused', 'adamw_apex_fused', 'adafactor', 'adamw_anyprecision', 'sgd', 'adagrad', 'adamw_bnb_8bit', 'adamw_8bit', 'lion_8bit', 'lion_32bit', 'paged_adamw_32bit', 'paged_adamw_8bit', 'paged_lion_32bit', 'paged_lion_8bit', 'rmsprop', 'rmsprop_bnb', 'rmsprop_bnb_8bit', 'rmsprop_bnb_32bit', 'galore_adamw', 'galore_adamw_8bit', 'galore_adafactor', 'galore_adamw_layerwise', 'galore_adamw_8bit_layerwise', 'galore_adafactor_layerwise']
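For context, the traceback shows where the failure comes from: TrainingArguments passes the optim string through the OptimizerNames enum, and a value with no matching member raises ValueError. A minimal sketch of that validation pattern (the enum members here are a small illustrative subset, not the real transformers class):

```python
from enum import Enum

class OptimizerNames(str, Enum):
    # Illustrative subset of the optimizer names listed in the error above
    ADAMW_TORCH = "adamw_torch"
    SGD = "sgd"
    ADAFACTOR = "adafactor"

# A known value resolves to an enum member...
assert OptimizerNames("sgd") is OptimizerNames.SGD

# ...while an unknown one fails the same way TrainingArguments does.
try:
    OptimizerNames("adalomo")
except ValueError as err:
    print(err)  # 'adalomo' is not a valid OptimizerNames
```

So the error is purely a string-validation step: on this transformers version, "adalomo" is simply not among the enum's values.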
Expected behavior
I expected this code to run normally. Thank you.
Hi @luoruijie, this is expected: adalomo is not in the latest release of transformers. You need to install transformers from the main branch (e.g. `pip install git+https://github.com/huggingface/transformers.git`) or wait until the new release!
System Info
torch==2.3.0+cu121
transformers==4.41.2
trl==0.9.4
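Given the pinned versions above, a small guard can flag the problem before TrainingArguments is even constructed. This is a sketch under an assumption: based on the maintainer reply that adalomo is only on main as of 4.41.2, the version gate below takes 4.42.0 as the first release to include it.

```python
from packaging import version

# Assumption: adalomo support first ships in transformers 4.42.0; on 4.41.2
# (the version pinned above) it is only available on the main branch.
ADALOMO_MIN_VERSION = "4.42.0"

def supports_adalomo(transformers_version: str) -> bool:
    """Return True if this transformers version accepts optim='adalomo'."""
    return version.parse(transformers_version) >= version.parse(ADALOMO_MIN_VERSION)

print(supports_adalomo("4.41.2"))  # False
```

Checking up front like this turns the opaque enum ValueError into an explicit, actionable version mismatch.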
Who can help?
@ArthurZucker
@muellerzr
@SunMarc
Information
Tasks
examples folder (such as GLUE/SQuAD, ...)
Reproduction
import torch
import datasets
from transformers import TrainingArguments, AutoTokenizer, AutoModelForCausalLM
import trl

train_dataset = datasets.load_dataset('imdb', split='train')

args = TrainingArguments(
    output_dir="./test-lomo",
    max_steps=1000,
    per_device_train_batch_size=2,
    optim="adalomo",
    gradient_checkpointing=False,
    logging_strategy="steps",
    logging_steps=1,
    learning_rate=5e-4,
    save_strategy="no",
    run_name="lomo-imdb",
)

model_id = "mistralai/Mistral-7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, low_cpu_mem_usage=True).to(0)

trainer = trl.SFTTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    dataset_text_field='text',
    max_seq_length=512,
)
trainer.train()
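Until a release includes adalomo, one way to keep this script runnable on the pinned version is to fall back to an optimizer the installed transformers does accept. A minimal sketch, using a hypothetical pick_optim helper and a hand-copied subset of the valid names from the error message (not queried from the library):

```python
# Hand-copied subset of the names listed in the ValueError above.
VALID_OPTIM = {"adamw_torch", "adamw_torch_fused", "adafactor", "sgd", "adagrad"}

def pick_optim(preferred: str = "adalomo", fallback: str = "adamw_torch") -> str:
    """Return `preferred` if this install knows it, else `fallback`."""
    return preferred if preferred in VALID_OPTIM else fallback

print(pick_optim())  # adamw_torch
```

The returned string would then be passed as optim= to TrainingArguments. Note that falling back to adamw_torch only unblocks the run; it does not give LOMO/AdaLomo's memory savings.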