* minimum dependency sft
* fix dispatch
* add timer
* add tgs
* internlm2 tp
* rms support tp
* gradient checkpointing
* lazy load pretrain
* temp
* fix bugs
* add data pipeline example
* fix lints
* remove useless code
* fix hard pack bug
* add comments
* clean code
* add shard strategy
* support cpu offload
* support cpu offload
* trust remote code
* fix soft packer bug
* fix soft packer bug
* fix soft packer bug
* refactor data pipeline
* fixup
* fix pad tokens bug
* check input_ids and labels
* check input_ids and labels in collator
* fix load local datasets bug
* fix load cached datasets
* restore dset order
* save cached infos
* accelerate startup
* avoid all gather cached datasets
* fixup
* fix cache bug
* Support group length (#4)
* replace rmsnorm kernel
* support ftdp ds
* support load_bin
* support group by maxlen
* add fsdp_ftdp_sft and fix fsdp_sft
* support ftdp ds
* add lr min
* fix bugs
* fix bugs
* delete
* support llava
* support packer cache
* refactor dist load
* Add sp tp (#5)
* support sp and tp
* add fsdp_tp_sft and modify fsdp_sft
* move chat_template
* fix load_ds
* delete useless codes
* delete useless codes
* fix jsonl load
* refactor
* fix bug
* fix lr scheduler
* refactor setup parallel
* update data load
* fix bugs
* move fsdp
* adapt new parallel load
* fix setup_parallel (#7)
* fix some bugs
* add remote codes
* add convert script
* support load image from ceph
* support load image from ceph
* fix cache dataset bugs
* support multi images
* support llava interleave
* fix load timeout
* refactor datasets: optimize the cache mechanism and clean up code
* distinguish dataset components based on algorithms
* support fsdp2 + 3d parallel
* fix lints
* support contiguous batching
* refactor parallel
* zero wasting ppo
* support Ascend NPU
* fix openai convert
* fix npu bugs
* fix npu bug
* dispatch npu flash attn
* adapt Ascend NPU
* fix ppo losses
* steady increase in reward
* faster ppo
* fix top-p generate
* support internlm3
* baseline 2.5
* fix internlm3
* (ing) support hard pack
* support qwen2
* fix dataset bugs
* baseline
* del ppo.py
* fixup
* support hybrid sp
* fix hybrid sp
* qwen2 + hybrid sp
* fix requirements
* avoid re-initializing dist
* support group pack
* pretrain (#13)
* first commit: support internlm3 moe streaming dataset
* move codes
* Moe pretrain (#14)
* first commit: support internlm3 moe streaming dataset
* move codes
* rmsnorm kernel supports low-version flash_attn
* add barrier
* support prompt length control (#15)
* support VLM Base (#16)
* add internvl
* fix bug
* remove dup code
* support liger of internvl
* fix bug
* add get_repo_git_info
* fix
* add minicpmv
* add minicpmv dispatch
* accelerate tokenize
* Update InternVL (#17)
* fix dpo error
* fix sp error
* update dataset
* fix
* fix rand sampler (#18)
* llama support transformers >= 4.45 (#19)
* convert fsdp1 to fsdp2 in sft.py
* [Feature] Support Liger Kernel (#20)
* filter data by max length (#21)
* fix causal forward, prefetch, and remote code (#22)
* [Enhancement] Accelerating Data Pipeline (#23)
* sample ratio greater than 1.0 and truncate max len
* accelerate the counting of tokens
* log reduced loss
* fix micro bs greater than 1
* [Enhancement] Ensure data integrity when the sampling ratio is more than 1 (#24)
* repeat dataset
* fixup
* fix typos
* fix typos
* [Fix] Pass in temperature during generation (#25)
* Support Janus and fix some errors (#27)
* add prefetch
* update prefetch
* add janus
* add janus
* fix
* fix
* fix llama position id error
* fix ProcessPoolExecutor
* update
* fix llama
* delete cache
* remove useless code

---------

Co-authored-by: whcao <[email protected]>
Co-authored-by: Happy <[email protected]>
Co-authored-by: Haian Huang(深度眸) <[email protected]>