[Feature] XTuner Lite (#974)
* minimum dependency sft

* fix dispatch

* add timer

* add tgs

* internlm2 tp

* rms support tp

* gradient checkpointing

* lazy load pretrain

* temp

* fix bugs

* add data pipeline example

* fix lints

* remove useless code

* fix hard pack bug

* add comments

* clean code

* add shard strategy

* support cpu offload

* support cpu offload

* trust remote code

* fix soft packer bug

* fix soft packer bug

* fix soft packer bug

* refactor data pipeline

* fixup

* fix pad tokens bug

* check input_ids and labels

* check input_ids and labels in collator

* fix load local datasets bug

* fix loading cached datasets

* restore dset order

* save cached infos

* accelerate start up

* avoid all gather cached datasets

* fixup

* fix cache bug

* Support group length (#4)

* replace rmsnorm kernel

* support ftdp ds

* support load_bin

* support group by maxlen

* add fsdp_ftdp_sft and fix fsdp_sft

* support ftdp ds

* add lr min

* fix bugs

* fix bugs

* delete

* support llava

* support packer cache

* refactor dist load

* Add sp tp (#5)

* support sp and tp

* add fsdp_tp_sft and modify fsdp_sft

* move chat_template

* fix load_ds

* delete useless codes

* delete useless codes

* fix jsonl load

* refactor

* fix bug

* fix lr scheduler

* refactor setup parallel

* update data load

* fix bugs

* move fsdp

* adapt new parallel load

* fix setup_parallel (#7)

* fix some bugs

* add remote codes

* add convert script

* support load image from ceph

* support load image from ceph

* fix cache dataset bugs

* support multiple images

* support llava interleave

* fix load timeout

* refactor datasets: optimize the cache mechanism and clean up code

* distinguish dataset components based on algorithms

* support fsdp2+3d parallel

* fix lints

* support contiguous batching

* refactor parallel

* zero wasting ppo

* support ascend npu

* fix openai convert

* fix npu bugs

* fix npu bug

* dispatch npu flash attn

* adapt ascend npu

* fix ppo losses

* steady increase in reward

* faster ppo

* fix top-p generate

* support internlm3

* baseline 2.5

* fix internlm3

* (WIP) support hard pack

* support qwen2

* fix dataset bugs

* baseline

* del ppo.py

* fixup

* support hybrid sp

* fix hybrid sp

* qwen2 + hybrid sp

* fix requirements

* avoid re-initialize dist

* support group pack

* pretrain (#13)

* first commit: support internlm3 moe streaming dataset

* move codes

* Moe pretrain (#14)

* first commit: support internlm3 moe streaming dataset

* move codes

* rmsnorm kernel support low version flash_attn

* add barrier

* support prompt length control (#15)

* support VLM Base (#16)

* add internvl

* fix bug

* remove dup code

* support liger of internvl

* fix bug

* add get_repo_git_info

* fix

* add minicpmv

* add minicpmv dispatch

* accelerate tokenize

* Update InternVL (#17)

* fix dpo error

* fix sp error

* update dataset

* fix

* fix rand sampler (#18)

* llama support transformers >= 4.45 (#19)

* convert fsdp1 to fsdp2 in sft.py

* [Feature] Support Liger Kernel (#20)

* filter data by max length (#21)

* fix causal forward, prefetch, and remote code (#22)

* [Enhancement] Accelerating Data Pipeline (#23)

* sample ratio greater than 1.0 and trunc max len

* accelerating the counting of tokens

* log reduced loss

* fix micro bs greater than 1

* [Enhancement] Ensure data integrity when the sampling ratio is more than 1 (#24)

* repeat dataset

* fixup

* fix typos

* fix typos

* [Fix] Pass in temperature during generation (#25)

* Support Janus and fix some error (#27)

* add prefetch

* update prefetch

* add janus

* add janus

* fix

* fix

* fix llama position id error

* fix ProcessPoolExecutor

* update

* fix llama

* delete cache

* remove useless code

---------

Co-authored-by: whcao <[email protected]>
Co-authored-by: Happy <[email protected]>
Co-authored-by: Haian Huang(深度眸) <[email protected]>
4 people authored Dec 27, 2024
1 parent 90192ff commit e443aa9
Showing 104 changed files with 19,470 additions and 21 deletions.
2 changes: 1 addition & 1 deletion .pre-commit-config-zh-cn.yaml
@@ -1,4 +1,4 @@
exclude: ^tests/data/|^xtuner/tools/model_converters/modeling_internlm2_reward/
exclude: ^tests/data/|^xtuner/tools/model_converters/modeling_internlm2_reward/|^xtuner/_lite/modelings/|^xtuner/_lite/accelerate/dispatches/huggingface/
repos:
- repo: https://gitee.com/openmmlab/mirrors-flake8
rev: 5.0.4
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -1,4 +1,4 @@
exclude: ^tests/data/|^xtuner/tools/model_converters/modeling_internlm2_reward/
exclude: ^tests/data/|^xtuner/tools/model_converters/modeling_internlm2_reward/|^xtuner/_lite/modelings/|^xtuner/_lite/accelerate/dispatches/huggingface/
repos:
- repo: https://github.com/PyCQA/flake8
rev: 5.0.4
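
Both pre-commit configs receive the same change: pre-commit treats the exclude value as a single Python regular expression searched against each candidate file path, so the two new alternatives make every hook skip files under xtuner/_lite/modelings/ and xtuner/_lite/accelerate/dispatches/huggingface/ (presumably vendored modeling code that should not be reformatted). A standalone sketch of how the pattern matches, not part of this commit:

import re

# Exclude pattern from the updated pre-commit configs above.
EXCLUDE = re.compile(
    r'^tests/data/'
    r'|^xtuner/tools/model_converters/modeling_internlm2_reward/'
    r'|^xtuner/_lite/modelings/'
    r'|^xtuner/_lite/accelerate/dispatches/huggingface/')

# pre-commit skips a file when the pattern matches its path.
# The paths below are hypothetical, for illustration only.
print(bool(EXCLUDE.search('xtuner/_lite/modelings/some_model.py')))         # True
print(bool(EXCLUDE.search('xtuner/_lite/accelerate/dispatches/utils.py')))  # False
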
1 change: 1 addition & 0 deletions requirements/lmdeploy.txt
@@ -0,0 +1 @@
lmdeploy>=0.6.2 --no-deps
15 changes: 6 additions & 9 deletions requirements/runtime.txt
@@ -1,16 +1,11 @@
# Minimum 0.40.0.post4 to fix some 4-bit precision bugs
bitsandbytes>=0.40.0.post4
# Minimum 2.16.0 to fix some bugs, see https://github.com/huggingface/datasets/pull/6444
datasets>=2.16.0
einops
# Minimum 0.1.2 to fix some bugs, see https://github.com/InternLM/lagent/pull/44
lagent>=0.1.2
einop
# Avoid `import cv2` failed
opencv-python==4.7.0.72
# Minimum 0.10.3 to support distributed evaluation for MMBench
# see https://github.com/open-mmlab/mmengine/pull/1469
mmengine>=0.10.3
openpyxl
# Minimum 0.4.0 to support QLoRA, see https://github.com/huggingface/peft/pull/476
peft>=0.4.0
scikit-image
scipy
SentencePiece
@@ -23,5 +18,7 @@ torchvision
# https://github.com/huggingface/transformers/blob/v4.38.0/src/transformers/models/llama/modeling_llama.py#L921-L923
# transformers >= 4.43.0 use _flash_attention_forward but not self._flash_attention_forward
# to calculate attn output which lead to bc breaking
transformers>=4.36.0,!=4.38.0,!=4.38.1,!=4.38.2,<=4.42.4
transformers>=4.45
transformers_stream_generator
loguru
pydantic
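
The transformers changes here relate to the comment retained in the diff above: from transformers 4.43 onward, attention output is computed through the module-level helper _flash_attention_forward rather than a self._flash_attention_forward method, which is the backward-compatibility break that forced the old upper bound. A minimal version-guard sketch, illustrative only and not code from this commit:

# Guard dispatch code against the transformers >= 4.43 flash-attention change
# described in the comment above; the constant name is illustrative.
import transformers
from packaging import version

TF_443_OR_NEWER = version.parse(transformers.__version__) >= version.parse('4.43.0')

if TF_443_OR_NEWER:
    # Newer releases expose a module-level helper that attention classes call.
    from transformers.modeling_flash_attention_utils import _flash_attention_forward
else:
    # Older releases keep the logic as a private method on each attention
    # module (e.g. LlamaFlashAttention2._flash_attention_forward), so a
    # monkey patch must target the bound method instead.
    _flash_attention_forward = None

With the new floor of transformers>=4.45 only the module-level code path is exercised, which is presumably why the previous upper bound and version skip list could be dropped.
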
4 changes: 3 additions & 1 deletion setup.py
@@ -117,10 +117,12 @@ def gen_packages_items():
'Programming Language :: Python :: 3.8',
'Programming Language :: Python :: 3.9',
'Programming Language :: Python :: 3.10',
'Programming Language :: Python :: 3.11',
'Programming Language :: Python :: 3.12',
'Topic :: Utilities',
],
# Python maximum version <3.11, to support mpi4py-mpich
python_requires='>=3.8, <3.11',
python_requires='>=3.8, <3.13',
license='Apache License 2.0',
install_requires=parse_requirements('requirements/runtime.txt'),
extras_require={