What does `pp_size` mean?
Here:
parallel_build: for this parameter, see 猛猿's illustrated guide to large-model training: Pipeline Parallelism (流水线并行), using GPipe as an example.
There is too much background to dig into deeply; in short, it is pipelined parallelism used to speed things up. Here `pp_size=1`, which means no pipeline parallelism is actually performed.
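To make the relationship concrete, here is a minimal sketch (not TensorRT-LLM's actual API; the helper name is hypothetical) of how `world_size`, `tp_size`, and `pp_size` fit together: `world_size` must equal `tp_size * pp_size`, and each GPU rank belongs to one pipeline stage and one tensor-parallel group.

```python
def rank_to_parallel_coords(rank: int, tp_size: int, pp_size: int):
    """Map a global GPU rank to (pipeline_stage, tensor_parallel_rank).

    Assumes ranks are laid out tensor-parallel-first, which is a common
    convention; the real framework may differ.
    """
    world_size = tp_size * pp_size
    assert 0 <= rank < world_size, "rank must be within world_size"
    return rank // tp_size, rank % tp_size

# With --world_size 8 --tp_size 4 --pp_size 2, the 8 ranks split into
# 2 pipeline stages of 4 tensor-parallel ranks each:
for rank in range(8):
    stage, tp_rank = rank_to_parallel_coords(rank, tp_size=4, pp_size=2)
    print(f"rank {rank}: pipeline stage {stage}, tp rank {tp_rank}")
```

With `pp_size=1`, every rank maps to pipeline stage 0, i.e. the model is not split into pipeline stages at all.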
When I use this script, I get an error:
```shell
python build.py --hf_model_dir qwenv1.5-7b-GPTQ-int4 \
    --quant_ckpt_path qwenv1.5-7b-GPTQ-int4 \
    --dtype float16 \
    --remove_input_padding \
    --gpt_attention_plugin float16 \
    --gemm_plugin float16 \
    --enable_context_fmha \
    --use_weight_only \
    --weight_only_precision int4_gptq \
    --per_group \
    --world_size 8 \
    --tp_size 4 \
    --pp_size 2 \
    --output_dir qwnv1.5-trengine
```
The error:
```
File "examples/qwen2/weight.py", line 875, in load_from_gptq_qwen
    tensorrt_llm_qwen.lm_head.weight.value = np.ascontiguousarray(
AttributeError: 'Qwen2ForCausalLM' object has no attribute 'lm_head'. Did you mean: 'num_heads'?
```