🧺 [2/N] Refactor `_generate` in GRPO/RLOO: Use `prompt_ids` from generation #4152
Merged
92 commits
552e899 qgallouedec: Refactor image handling: replace `image_split_sizes` with `image_grid…
449ef07 qgallouedec: simpler
c8933aa qgallouedec: gfpo
229c554 qgallouedec: multi-image grpo
3ca6ad5 qgallouedec: log with wandb
dcf4b92 qgallouedec: no vlm reward models
30ad7ca qgallouedec: rloo
86cc30b qgallouedec: gfpo
088897b qgallouedec: fix
d2adc63 qgallouedec: test peft
f4c82bf qgallouedec: fix gfpo
1257796 qgallouedec: rloo test
099a39b qgallouedec: peft rloo
529add6 qgallouedec: oops
fc6b11f qgallouedec: update test
ae1f497 qgallouedec: generate method
f998432 qgallouedec: debug
fa73876 qgallouedec: skip failing test
52d8bd9 qgallouedec: Merge branch 'main' into drop-image_split_sizes
dfc0d38 qgallouedec: Merge branch 'drop-image_split_sizes' into multi-image-support
fc52e68 qgallouedec: test fixed!
4d12aeb qgallouedec: Merge branch 'multi-image-support' into generate-method
4fc2b5b qgallouedec: gfpo
b628744 qgallouedec: rm vllm
d3a769f qgallouedec: fix doc
e17ec42 qgallouedec: Merge branch 'main' into drop-image_split_sizes
efbb03a qgallouedec: Merge branch 'drop-image_split_sizes' into multi-image-support
562c662 qgallouedec: Merge branch 'main' into multi-image-support
485781c qgallouedec: Merge branch 'main' into multi-image-support
05270f8 qgallouedec: update layers to ignore
1c53094 qgallouedec: clarify image column desc
9b6652e qgallouedec: rm VLM x RM warning
c500440 qgallouedec: Merge branch 'multi-image-support' into generate-method
a6a8c44 qgallouedec: Merge branch 'main' into generate-method
d8665e1 qgallouedec: Merge branch 'main' into generate-method
365d501 qgallouedec: Merge branch 'main' into generate-method
cdb4c76 qgallouedec: Merge branch 'main' into generate-method
c83e710 qgallouedec: same for rloo
ec6ad25 qgallouedec: nits style and align
b4cadde qgallouedec: Merge branch 'main' into generate-method
b0dceb9 qgallouedec: restart
ebe32c2 qgallouedec: progress
0213662 qgallouedec: progress continues
8b3a724 qgallouedec: progress again again
c1ae6aa qgallouedec: back to working point
1a66b43 qgallouedec: revert chage data utils
2dc69a6 qgallouedec: Merge branch 'main' into generate-method
9435a94 qgallouedec: refactor in grpo
d3f1d3c qgallouedec: Merge branch 'main' into refactor_generate
3d8ea27 qgallouedec: wrong merge commit
27dc958 qgallouedec: fix num_input_tokens_seen
53772ef qgallouedec: getting closer
8766fa5 qgallouedec: consistent naming
236b78b qgallouedec: better
9da4830 qgallouedec: simplify a bit + comment
b3bd0b0 qgallouedec: another one
d79b9e1 qgallouedec: get prompt ids from generation
8d34d54 qgallouedec: remove pad token removal
e770efe qgallouedec: Merge branch 'refactor_generate' into refactor_generate_2
55a2480 qgallouedec: rloo + doc
c8041e1 qgallouedec: Merge branch 'refactor_generate' into refactor_generate_2
7b7a11d qgallouedec: test and doc
c5064d6 qgallouedec: gfpo
effb41b qgallouedec: Merge branch 'main' into refactor_generate
e82bfb4 qgallouedec: Merge branch 'main' into refactor_generate
4b9c126 qgallouedec: Merge branch 'refactor_generate' into refactor_generate_2
f11759e qgallouedec: Merge branch 'main' into refactor_generate_2
e7aa945 qgallouedec: fix vllm client server
e164ec5 qgallouedec: repicate all_prompt_ids
49577ad qgallouedec: Same for RLOO
5fca5b8 qgallouedec: fix normal generation path
d599c20 qgallouedec: Merge branch 'main' into refactor_generate_2
e82db74 qgallouedec: 🔣 Fix test: replace `trainer.tokenizer` by `trainer.processing_class`…
192deb3 albertvillanova: Fix CI ImportError: FlashAttention2 and decorator order for all param…
cf9d8e7 albertvillanova: Hotfix wrong formatting of docstrings with blockquote tips (#4187)
f9c3c3c YonatanGideoni: 🌡️ Have vLLM return processed (temperature scaled) log probs (#4163)
6489479 albertvillanova: Replace remaining trainer.tokenizer with trainer.processing_class in …
21a67fc burtenshaw: [DOCS] Lora without regret (#4181)
c1e7ad2 burtenshaw: [DOCS/FIX] lora without regrets - fix lr (#4207)
5d34144 albertvillanova: Remove custome_container for building the docs (#4198)
ae2a0e7 sergiopaniego: Remove tokenizer creation from `sft` example script (#4197)
6543f51 albertvillanova: Hotfix: Exclude transformers 4.57.0 for Python 3.9 (#4209)
8319ce0 albertvillanova: Replace unittest with pytest (#4188)
4fdaa4c sergiopaniego: Updated vLLM integration guide (#4162)
d258e36 sergiopaniego: Remove `Optional` from `processing_class` in `PPOTrainer` (#4212)
7f5b499 albertvillanova: Replace setup with pyproject and fix packaging unintended modules (#4…
df386f9 qgallouedec: Merge branch 'main' into refactor_generate_2
5b9a6ab qgallouedec: Merge branch 'main' into refactor_generate_2
4a274d5 qgallouedec: Merge branch 'main' into refactor_generate_2
6324eda qgallouedec: Merge branch 'main' into refactor_generate_2
3955643 qgallouedec: fix prompt mask
ee6638c qgallouedec: remove no-op