feat(tuner): Enhance Agent Tune Interface #1079
Conversation
yanxi-chen left a comment
Some minor comments
Pull request overview
Copilot reviewed 29 out of 30 changed files in this pull request and generated 19 comments.
DavdGao left a comment
Please see my inline comments. I have some concerns about whether abstractions like Algorithm actually simplify RL training configuration. In my view, consolidating all tuning parameters in a single JSON file would be more straightforward and easier to manage.
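The consolidated single-file approach the reviewer suggests might look like the following sketch. All keys and values here are illustrative assumptions, not an actual AgentScope-Tuner schema:

```python
import json

# Hypothetical consolidated tuning configuration: model, algorithm, and
# dataset settings live in one JSON file instead of separate abstractions.
# Every key name and value below is illustrative only.
tuner_config = {
    "model": {"name": "example-model", "temperature": 0.7},
    "algorithm": {"type": "grpo", "learning_rate": 1e-6, "batch_size": 32},
    "dataset": {"path": "tasks.jsonl", "eval_path": "eval_tasks.jsonl"},
}

# Write the config so a trainer could load everything from one place.
with open("tuner_config.json", "w") as f:
    json.dump(tuner_config, f, indent=2)
```

The trade-off under discussion is between this kind of flat, explicit config and typed abstractions such as `Algorithm` that validate parameters in code.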
lingzhq left a comment
To support the data-augmentation example in AgentScope-Tuner, we need to pass the Task Selector parameters to Trinity and fix an issue with loading eval tasksets via YAML. The corresponding implementation can be checked in the samples PR. Thanks for looking into this!
DavdGao left a comment
Please see inline comments.
DavdGao left a comment
lgtm
AgentScope Version
1.0.11
Description
Enhance the current tune-related modules.
```mermaid
flowchart TD
    Model[Model] --> WorkflowFunction[Workflow Function]
    WorkflowFunction --> JudgeFunction[Judge Function]
    Task[Task] --> WorkflowFunction
    Task[Task] --> JudgeFunction
    JudgeFunction --> Reward[Reward]
    classDef wfcolor fill:#e67e22,stroke:#333,color:#111;
    classDef judgecolor fill:#1abc9c,stroke:#333,color:#111,stroke-dasharray: 5 5;
    classDef taskcolor fill:#3498db,stroke:#333,color:#111;
    class WorkflowFunction wfcolor;
    class JudgeFunction judgecolor;
    class Task taskcolor;
```

Enhancements include:
- Add `auxiliary_models` (`Dict[str, BaseChatModel]`) to `workflow_function` to support the use of different models in multi-agent applications.
- `workflow_function` can return a raw response of any type.
- Add `judge_function` to calculate rewards based on the raw response returned by `workflow_function`; the `judge_function` can use `auxiliary_models` (`Dict[str, BaseChatModel]`) to implement LLM-as-a-judge.
- Add `Dataset`, `TunerChatModel` and `Algorithm` to construct the tuning configuration.

Checklist

Please check the following items before the code is ready to be reviewed.

- `pre-commit run --all-files` command
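For illustration, the enhanced workflow/judge interface described in this PR might be sketched as below. The parameter names mirror the description above, but the concrete signatures and the `BaseChatModel` stub are assumptions, not the actual AgentScope API:

```python
from typing import Any, Dict


class BaseChatModel:
    """Stand-in stub for a chat model; the real interface may differ."""

    def __call__(self, prompt: str) -> str:
        return f"response to: {prompt}"


def workflow_function(
    model: BaseChatModel,
    task: Dict[str, Any],
    auxiliary_models: Dict[str, BaseChatModel],
) -> Any:
    # The workflow may use the main model plus named auxiliary models for
    # multi-agent setups, and may return a raw response of any type.
    return {"answer": model(task["question"])}


def judge_function(
    task: Dict[str, Any],
    response: Any,
    auxiliary_models: Dict[str, BaseChatModel],
) -> float:
    # Compute a scalar reward from the raw response; an auxiliary model
    # could be called here to implement LLM-as-a-judge.
    return 1.0 if "response" in response["answer"] else 0.0


task = {"question": "2+2?"}
raw = workflow_function(BaseChatModel(), task, {})
reward = judge_function(task, raw, {})
print(reward)  # prints 1.0
```

The key point of the design is that `workflow_function` produces an opaque response and `judge_function` alone is responsible for turning it into a reward, with `auxiliary_models` available to both.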