Skip to content

Conversation

zhassan223
Copy link

new data class for GRPO train_kwargs instead of dictionary. Makes it cleaner and has in-data-class argument checks.

@zhassan223
Copy link
Author

@Ziems

Copy link
Collaborator

@Ziems Ziems left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you update the tutorials as well and test that they still work?

Those are in docs/docs/tutorials/rl_multihop and docs/docs/tutorials/rl_papillon

@@ -0,0 +1,138 @@
from dataclasses import dataclass, field
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible for us to make a telepromp/grpo/ directory where we can put grpo.py and grpo_config.py? The GEPA optimizer does this so I think there is already a precedent allowing us to do this.

@Ziems
Copy link
Collaborator

Ziems commented Oct 1, 2025

@zhassan223 Great work! Requested some small changes but wonderful progress!

" multitask=True,\n",
" num_dspy_examples_per_grpo_step=4,\n",
" num_samples_per_input=8,\n",
" num_rollouts_per_grpo_step=8,#changed from num_generations since that parameter doesn't exist anymore\n",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!

Copy link
Collaborator

@Ziems Ziems left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few more things!

@Ziems
Copy link
Collaborator

Ziems commented Oct 6, 2025

@zhassan223 it looks like a few tests failed too, could you take a look?

@Ziems Ziems changed the title made new GRPO data class for train_kwargs, integrated to lm_local_arb… Replace GRPO kwargs with dataclass Oct 6, 2025
@zhassan223 zhassan223 changed the title Replace GRPO kwargs with dataclass Add GRPOConfig for Arbor Oct 12, 2025
@zhassan223 zhassan223 closed this Oct 12, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants