@jacobthebanana (Collaborator) commented on Sep 17, 2025

This pull request includes:

  • GRPO example on openai/gsm8k that uses an LLM judge to compare each proposed answer against the ground truth.
  • Agent SDK integration: define the environment using the familiar OpenAI Agents SDK, then run RL on the LLM powering the agent. Not yet tested on multi-agent setups (agents as tools or handoffs).
  • Extensive type annotations for simpler function signatures and better IDE support: static type checking, pyright lints, and working autocompletion even inside the training loop.
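
For readers new to GRPO, here is a minimal sketch of the group-relative advantage step that the first bullet relies on. It is not the code in this pull request. `judge_reward` is a hypothetical exact-match stand-in for the LLM judge, and normalizing by the population standard deviation with a small `eps` is an assumption (implementations differ on this detail).

```python
from statistics import mean, pstdev
from typing import Sequence


def _normalize(answer: str) -> str:
    # Strip whitespace, thousands separators, and a trailing period.
    return answer.strip().replace(",", "").rstrip(".")


def judge_reward(proposed: str, ground_truth: str) -> float:
    """Hypothetical stand-in for the LLM judge: 1.0 on a normalized match, else 0.0."""
    return 1.0 if _normalize(proposed) == _normalize(ground_truth) else 0.0


def group_relative_advantages(
    rewards: Sequence[float], eps: float = 1e-6
) -> list[float]:
    """Center and scale rewards within one group of rollouts for the same prompt.

    Assumption: population standard deviation, with eps to avoid division by zero.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# One prompt, four sampled completions, scored against the same ground truth.
completions = ["72", "72.", "70", "1,000"]
rewards = [judge_reward(c, "72") for c in completions]
advantages = group_relative_advantages(rewards)
```

Completions that the judge accepts get a positive advantage and the rest get a negative one. When every completion in a group receives the same reward, all advantages are zero and the group contributes no policy gradient.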

@jwilles requested a review from @kohankhaki on October 15, 2025 15:18
@jacobthebanana marked this pull request as ready for review on October 29, 2025 05:28
2 participants