* minimum dependency sft
* fix dispatch
* add timer
* add tgs
* internlm2 tp
* rms support tp
* gradient checkpointing
* lazy load pretrain
* temp
* fix bugs
* add data pipeline example
* fix lints
* remove useless code
* fix hard pack bug
* add comments
* clean code
* add shard strategy
* support cpu offload
* support cpu offload
* trust remote code
* fix soft packer bug
* fix soft packer bug
* fix soft packer bug
* refactor data pipeline
* fixup
* fix pad tokens bug
* check input_ids and labels
* check input_ids and labels in collator
* fix load local datasets bug
* fix load cached datasets
* restore dset order
* save cached infos
* accelerate startup
* avoid all gather cached datasets
* fixup
* fix cache bug
* Support group length (#4)
* replace rmsnorm kernel
* support ftdp ds
* support load_bin
* support group by maxlen
* add fsdp_ftdp_sft and fix fsdp_sft
* support ftdp ds
* add lr min
* fix bugs
* fix bugs
* delete
* support llava
* support packer cache
* refactor dist load
* Add sp tp (#5)
* support sp and tp
* add fsdp_tp_sft and modify fsdp_sft
* move chat_template
* fix load_ds
* delete useless codes
* delete useless codes
* fix jsonl load
* refactor
* fix bug
* fix lr scheduler
* refactor setup parallel
* update data load
* fix bugs
* move fsdp
* adapt new parallel load
* fix setup_parallel (#7)
* fix some bugs
* add remote codes
* add convert script
* support load image from ceph
* support load image from ceph
* fix cache dataset bugs
* support multi images
* support llava interleave
* fix load timeout
* refactor datasets: optimize the cache mechanism and clean up code
* distinguish dataset components based on algorithms
* support fsdp2 + 3d parallel
* fix lints
* support contiguous batching
* refactor parallel
* zero wasting ppo
* support Ascend NPU
* fix openai convert
* fix npu bugs
* fix npu bug
* dispatch npu flash attn
* adapt Ascend NPU
* fix ppo losses
* steady increase in reward
* faster ppo
* fix top-p generate
* support internlm3
* baseline 2.5
* fix internlm3
* (ing) support hard pack
* support qwen2
* fix dataset bugs
* baseline
* del ppo.py
* fixup
* support hybrid sp
* fix hybrid sp
* qwen2 + hybrid sp
* fix requirements
* avoid re-initializing dist
* support group pack
* pretrain (#13)
* first commit: support internlm3 moe streaming dataset
* move codes
* Moe pretrain (#14)
* first commit: support internlm3 moe streaming dataset
* move codes
* rmsnorm kernel supports low-version flash_attn
* add barrier
* support prompt length control (#15)
* support VLM Base (#16)
* add internvl
* fix bug
* remove dup code
* support liger of internvl
* fix bug
* add get_repo_git_info
* fix
* add minicpmv
* add minicpmv dispatch
* accelerate tokenize
* Update InternVL (#17)
* fix dpo error
* fix sp error
* update dataset
* fix
* fix rand sampler (#18)
* llama support transformers >= 4.45 (#19)
* convert fsdp1 to fsdp2 in sft.py
* [Feature] Support Liger Kernel (#20)
* filter data by max length (#21)
* fix causal forward, prefetch, and remote code (#22)
* [Enhancement] Accelerating Data Pipeline (#23)
* sample ratio greater than 1.0 and truncate max len
* accelerate the counting of tokens
* log reduced loss
* fix micro bs greater than 1
* [Enhancement] Ensure data integrity when the sampling ratio is more than 1 (#24)
* repeat dataset
* fixup
* fix typos
* fix typos
* [Fix] Pass in temperature during generation (#25)
* Support Janus and fix some errors (#27)
* add prefetch
* update prefetch
* add janus
* add janus
* fix
* fix
* fix llama position id error
* fix ProcessPoolExecutor
* update
* fix llama
* delete cache
* remove useless code

---------

Co-authored-by: whcao <[email protected]>
Co-authored-by: Happy <[email protected]>
Co-authored-by: Haian Huang(深度眸) <[email protected]>