trt-llm integration #194

Merged
merged 1 commit into from
Aug 30, 2024

@gshennvm gshennvm commented Jun 8, 2024

What does this PR do?

Added TRT-LLM integration. The NeMo path is mostly unaffected.

@gshennvm gshennvm force-pushed the geshen/trt_llm_to_main branch from 57d7743 to 993e358 Compare June 8, 2024 00:39
@gshennvm gshennvm changed the base branch from main to geshen/critic_speedup June 27, 2024 19:49
@github-actions github-actions bot added Servers and removed Servers labels Jun 27, 2024
@gshennvm gshennvm force-pushed the geshen/critic_speedup branch from 6c8d698 to 606f690 Compare July 11, 2024 07:15
Base automatically changed from geshen/critic_speedup to main July 11, 2024 22:30
Dockerfile (outdated, resolved)
@gshennvm gshennvm changed the title Geshen/trt llm to main trt-llm integration Jul 15, 2024
@gshennvm gshennvm requested a review from odelalleau July 15, 2024 08:00
@gshennvm gshennvm marked this pull request as ready for review July 15, 2024 08:00
@github-actions github-actions bot added the documentation label Jul 16, 2024
@odelalleau odelalleau left a comment

I'm afraid I'm still far from being finished with this review, but submitting my partial review so you can start addressing comments.

CHANGELOG.md (outdated, resolved)
CHANGELOG.md (outdated, resolved)
Dockerfile (outdated, resolved)
Dockerfile (resolved)
docs/user-guide/rlhf.rst (outdated, resolved)
nemo_aligner/algorithms/ppo.py (outdated, resolved)
nemo_aligner/algorithms/ppo.py (outdated, resolved)
nemo_aligner/algorithms/ppo.py (outdated, resolved)
nemo_aligner/algorithms/ppo.py (outdated, resolved)
nemo_aligner/algorithms/ppo.py (resolved)
@odelalleau odelalleau left a comment

Second half of the review!

nemo_aligner/utils/trt_llm.py (outdated, resolved)
nemo_aligner/algorithms/spin.py (outdated, resolved)
nemo_aligner/data/nlp/builders.py (resolved)
nemo_aligner/models/nlp/gpt/megatron_gpt_ppo_actor.py (outdated, resolved)
nemo_aligner/models/nlp/gpt/megatron_gpt_ppo_actor.py (outdated, resolved)
nemo_aligner/utils/trt_llm.py (outdated, resolved)
nemo_aligner/utils/trt_llm.py (outdated, resolved)
nemo_aligner/utils/trt_llm.py (resolved)
nemo_aligner/utils/utils.py (resolved)
trtllm.patch (outdated, resolved)
CHANGELOG.md (outdated, resolved)
@gshennvm gshennvm force-pushed the geshen/trt_llm_to_main branch 2 times, most recently from 0493ee8 to 9c165a0 Compare August 26, 2024 00:05
@gshennvm gshennvm commented

Fixed a bug I caught while running end-to-end (e2e):

f96cb88

It turns out torch's masked tensor doesn't support dim=None, which is probably a bug on their side.

@odelalleau odelalleau left a comment

Just a couple of final comments to help me understand what's going on.

nemo_aligner/utils/trt_llm.py (resolved)
nemo_aligner/utils/trt_llm.py (resolved)
odelalleau previously approved these changes Aug 30, 2024
@odelalleau odelalleau left a comment

Oh yeah let's gooooooooo

terrykong previously approved these changes Aug 30, 2024
@gshennvm gshennvm force-pushed the geshen/trt_llm_to_main branch 3 times, most recently from 273b392 to 86f9b11 Compare August 30, 2024 21:42
Signed-off-by: Gerald Shen <[email protected]>
@gshennvm gshennvm force-pushed the geshen/trt_llm_to_main branch from 86f9b11 to 2b5032d Compare August 30, 2024 21:54
@gshennvm gshennvm commented

Tests look good, so I'm merging now!

@gshennvm gshennvm merged commit 3efbd77 into main Aug 30, 2024
5 checks passed
@gshennvm gshennvm deleted the geshen/trt_llm_to_main branch August 30, 2024 23:59
abukharin3 pushed a commit to abukharin3/NeMo-Aligner that referenced this pull request Nov 7, 2024
Labels: Algorithms, documentation, Servers, Utils
4 participants