
add Sequence Parallelism#6506

Closed
HaoshengZou wants to merge 2437 commits into hiyouga:main from Qihoo360:sp-pr

Conversation

@HaoshengZou

@HaoshengZou HaoshengZou commented Jan 2, 2025

What does this PR do?

add Sequence Parallelism (#4733 #5024 #5207 #5815 #5841 etc.)
direct plug&play use at https://github.com/Qihoo360/360-LLaMA-Factory

We have a separate README and chat group at https://github.com/Qihoo360/360-LLaMA-Factory, covering only the Sequence Parallelism part; they are not to be merged.
We developed this based on LLaMA-Factory's latest release v0.9.1, and also built on https://github.com/zhuzilin/ring-flash-attention. The original repos are fully acknowledged.
We developed this at 360. I am a PhD from Prof. Jun Zhu's group at Tsinghua CS.
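For readers unfamiliar with ring-flash-attention's approach, the core idea of sequence parallelism is to shard each long sequence across GPUs. A minimal sketch of the "zigzag" load-balancing split used in that line of work is below; the function name and shapes are illustrative, not the library's actual API:

```python
# Sketch of a zigzag sequence-parallel split: each rank gets one "early"
# and one "late" chunk, so causal-attention work is balanced across ranks.
# Illustrative only -- not the ring-flash-attention API.

def zigzag_split(seq_len: int, sp_size: int) -> list[list[int]]:
    """Assign token indices [0, seq_len) to sp_size sequence-parallel ranks.

    The sequence is cut into 2 * sp_size equal chunks; rank i receives
    chunk i and chunk (2 * sp_size - 1 - i).
    """
    assert seq_len % (2 * sp_size) == 0, "seq_len must divide evenly"
    chunk = seq_len // (2 * sp_size)
    tokens = list(range(seq_len))
    chunks = [tokens[j * chunk:(j + 1) * chunk] for j in range(2 * sp_size)]
    return [chunks[i] + chunks[2 * sp_size - 1 - i] for i in range(sp_size)]

if __name__ == "__main__":
    for rank, toks in enumerate(zigzag_split(16, 2)):
        print(rank, toks)
```

With causal attention, early tokens attend to few keys and late tokens to many, so pairing an early chunk with a late chunk on each rank evens out the compute.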

Feel free to review and comment on changes as you see fit. We'll make it better.
Thank you!

Before submitting

@hiyouga
Owner

hiyouga commented Jan 17, 2025

Hi Haosheng, sorry for the delay in processing this. We've been busy with work recently, so merging it has been difficult. We expect to finish before Feb 10th.

@mi-iro

mi-iro commented Jan 23, 2025

Hi Haosheng, Sequence Parallelism for LoRA is also important. Have you implemented this, or do you have plans to?

@HaoshengZou
Author

@mi-iro SP with LoRA is already supported for SFT and DPO.
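As a sketch, an SP + LoRA SFT run in 360-LLaMA-Factory might be configured like this. The `sequence_parallel_size` key is taken from the 360-LLaMA-Factory README; the other keys follow standard LLaMA-Factory YAML, and the model/dataset names are placeholders, so verify everything against the repo's examples:

```yaml
### model (placeholder)
model_name_or_path: Qwen/Qwen2.5-7B-Instruct

### method
stage: sft
do_train: true
finetuning_type: lora
lora_target: all
sequence_parallel_size: 4   # assumed 360-LLaMA-Factory option: shard each sequence across 4 GPUs

### dataset (placeholder)
dataset: my_long_context_data
template: qwen
cutoff_len: 32768

### output
output_dir: saves/qwen2.5-7b-sp-lora
```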

* Update loader.py to pass in `preprocessing_num_workers`
@shiningliang

Hi @hiyouga Are there any merge blockers on this PR? I'm running SFT on Qwen2.5 for a long-context task, and I think sequence parallelism would help a lot in accelerating it.
If I use this PR directly before it's merged, will some newer models fail, since I notice this PR is behind some of the PRs adding new model support?

@HaoshengZou
Author

@shiningliang This PR diverges from LLaMA-Factory's last release v0.9.1.
For now, the known errors with SP involve multi-modal data and models; pure-text models should work well.

@shiningliang

@shiningliang This PR diverges from LLaMA-Factory's last release v0.9.1 For now, known errors with SP are with multi-modal data & models. Pure text models should work well.

Hi @HaoshengZou Thanks for your reply. Do you have plans to support ORPO, KTO, etc.? In our work, we found scenarios where ORPO performs better than DPO and saves a lot of GPU memory.

@HaoshengZou
Author

@shiningliang In (360-)LLaMA-Factory, ORPO uses the same trainer as DPO, so ORPO should be supported directly; you only need to configure it.
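As a sketch of what "only need to configure it" means: in LLaMA-Factory's YAML, ORPO is selected through the DPO stage's preference-loss option. Key names here are assumed from LLaMA-Factory's example configs, so verify them against the repo:

```yaml
### ORPO reuses the DPO/pairwise trainer; only the loss selection changes
stage: dpo
pref_loss: orpo        # assumed key: switches the pairwise loss to ORPO
pref_beta: 0.1         # assumed key: the loss weighting coefficient
finetuning_type: lora
```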

@hiyouga hiyouga mentioned this pull request Feb 12, 2025
@liuqianchao

liuqianchao commented Feb 22, 2025

Do we have an ETA for finishing this PR? Sequence parallelism is quite important for many long-context LLM training tasks.

@hiyouga
Owner

hiyouga commented Feb 22, 2025

@liuqianchao Sorry, we are struggling with refactoring the trainers in LLaMA-Factory to support RL training. You can use 360-LLaMA-Factory for long-sequence training for now.

@githisw

githisw commented Mar 5, 2025

+1

@Kaimar666

Is the HUAWEI Ascend 910B supported? Or is there another way?
I want to fine-tune the Qwen2.5-14B-Instruct model in SP mode and set up the environment following the 360-LLaMA-Factory guide, but `pip install flash-attn` reports an error.

@hiyouga
Owner

hiyouga commented Mar 11, 2025

Hello, we just used BFG repo cleaner to remove large files in this repo. Unfortunately, this operation accidentally made all PRs invalid. Could you please recreate the same PRs using the latest main branch at your convenience? Thank you so much for your understanding, and we sincerely apologize for any inconvenience this has brought to you.

P.S. You can set https://github.com/hiyouga/LLaMA-Factory-backup as the upstream to find the changes back.

@Eisenhower

@shiningliang This PR diverges from LLaMA-Factory's last release v0.9.1 For now, known errors with SP are with multi-modal data & models. Pure text models should work well.
Hi @shiningliang, thanks for the update! I see that SP is stable for pure-text models but still has known issues with multi-modal data and models. Could you let me know whether VLM training is supported now? Thanks!


Labels

invalid (This doesn't seem right)
