CPO reproduction: model produces repetitive output #65

Open
XDeepAzure opened this issue Oct 28, 2024 · 1 comment
Comments

@XDeepAzure

Hi, I ran CPO on ALMA-7B-LoRA using the default hyperparameters from the script (learning rate), the parameter settings mentioned in the paper, and the preference data. However, the trained model produces outputs that heavily repeat the preceding text, and sometimes it does not translate at all, as shown in the screenshot below (zh->en; raw_res is the output before applying the clean function from utils). Is there a hyperparameter I have set incorrectly? Thank you.
[screenshot: zh->en outputs showing heavy repetition]

Below is my training script:

accelerate launch --main_process_port ${port} --config_file configs/deepspeed_train_config_bf16.yaml \
    run_cpo_llmmt.py \
    --model_name_or_path xxxx/ALMA-7B-Pretrain \
    --tokenizer_name xxxx/ALMA-7B-Pretrain \
    --peft_model_id  xxxx/ALMA-7B-Pretrain-LoRA \
    --cpo_scorer kiwi_xcomet \
    --beta 0.1 \
    --use_flash_attention_2 True \
    --use_peft \
    --use_fast_tokenizer False \
    --cpo_data_path  xxxx/ALMA-R-Preference \
    --do_train \
    --language_pairs ${pairs} \
    --low_cpu_mem_usage \
    --bf16 \
    --learning_rate 1e-4 \
    --weight_decay 0.01 \
    --gradient_accumulation_steps 4 \
    --gradient_checkpointing True \
    --lr_scheduler_type inverse_sqrt \
    --warmup_ratio 0.01 \
    --ignore_pad_token_for_loss \
    --ignore_prompt_token_for_loss \
    --per_device_train_batch_size 16 \
    --evaluation_strategy no \
    --save_strategy steps \
    --save_total_limit 2 \
    --logging_strategy steps \
    --logging_steps 0.05 \
    --output_dir ${OUTPUT_DIR} \
    --num_train_epochs 1 \
    --prediction_loss_only \
    --max_new_tokens 256 \
    --max_source_length 256 \
    --max_prompt_length 256 \
    --max_length 512 \
    --seed 42 \
    --overwrite_output_dir \
    --report_to tensorboard \
    --overwrite_cache 
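
(For reference, not part of the original report: a minimal sketch of how one might spot-check the trained adapter for the repetition / missing-EOS behavior. The base-model path and adapter directory are placeholders, and the prompt is only assumed to follow ALMA's translation template.)

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_path = "xxxx/ALMA-7B-Pretrain"        # placeholder, same as in the script above
adapter_path = "path/to/cpo_output_dir"    # hypothetical: the --output_dir of the CPO run

tokenizer = AutoTokenizer.from_pretrained(base_path, use_fast=False)
model = AutoModelForCausalLM.from_pretrained(base_path, torch_dtype=torch.bfloat16, device_map="auto")
model = PeftModel.from_pretrained(model, adapter_path)
model.eval()

# Assumed ALMA-style translation prompt; adjust to the exact template used in training.
prompt = "Translate this from Chinese to English:\nChinese: 你好，世界。\nEnglish:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=256, num_beams=5, do_sample=False)
# Decode only the newly generated tokens; repeated source text or a never-ending
# continuation here points at the EOS problem discussed below.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))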
@fe1ixxu
Owner

fe1ixxu commented Nov 7, 2024

Could you share your environment, as well as the commit version of the ALMA repo you are using? This should not happen. The behavior looks like either a model that was never fine-tuned, or a model that did not learn where to stop during training (no EOS).
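
(A hedged sketch, not from the original reply: one way to collect the requested environment details and to check the tokenizer's EOS/pad configuration, since a missing EOS during training is the suspected cause. The model path is a placeholder copied from the script above.)

from importlib.metadata import version, PackageNotFoundError
from transformers import AutoTokenizer

# Report the package versions most relevant to the CPO training stack.
for pkg in ("torch", "transformers", "peft", "trl", "accelerate", "deepspeed"):
    try:
        print(pkg, version(pkg))
    except PackageNotFoundError:
        print(pkg, "not installed")

# Check that the tokenizer actually defines an EOS token; if the chosen/rejected
# targets were tokenized without EOS, the model never learns where to stop.
tokenizer = AutoTokenizer.from_pretrained("xxxx/ALMA-7B-Pretrain", use_fast=False)  # placeholder path
print("eos_token:", tokenizer.eos_token, "id:", tokenizer.eos_token_id)
print("pad_token:", tokenizer.pad_token, "id:", tokenizer.pad_token_id)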
