feat(tuner): Enhance Agent Tune Interface #1079
Merged
60 commits
All commits authored by pan-x-c:

- bebbd42 add function type check
- ed5a85a add agentscope tuner v1 interface
- 81d8d62 update tuner interface
- f451bcc add missing file
- fea6b96 fix pre-commit
- bebf343 finish agentscope tune v1 interface
- 59b8deb fix readme
- 657f7ec rename to tuner
- a8c6198 fix example
- 435cac4 update readme
- 8a1e03b fix readme
- 3326838 fix eval tasksets
- 1a4526b use tuner model
- ceca3b3 fix pre-commit
- 789de86 update readme
- 68417bc fix model type
- f22eef0 refactor structure
- 8321195 fix doc
- ea0f7c5 update comments
- 65b404f fix comments
- a338b38 add function type check
- e2396a1 add unittests
- 55e7e4f fix pre-commit
- f2e7322 fix missing eval workflow args
- 11231b9 fix workflow args
- 7837814 fix reponse signature
- 423f9dc fix comments
- d20b08d auto setup cluster on dlc
- baaf0a7 fix dlc setup
- b36805c add tuner tutorial
- 68bec87 add tuner in tutorial
- a631d53 add chinese doc
- cf426cd fix docs
- 907924a clean code
- 4e8c373 add reward curve
- 6d0963c fix comments
- f9ceb44 fix missing packages
- ed40269 move doc from training to tuner
- 631ec36 add tips for metrics
- db37537 add tips
- 58643e6 fix chinese doc
- 9ebc8ed fix en doc
- 4251087 add links to samples
- ec83c38 fix comments
- 64fb8b5 fix comments
- 1f0429f fix algorithm doc
- 3163b94 fix comments
- 99fe36d fix type doc
- 5970355 fix comments
- b274cd8 fix eval workflow_args
- f2e27ed Merge branch 'main' into feature/tuner_enhance
- 03e7733 fix dependencies
- ef1461a fix comments
- 6ff7f85 rename modules
- 4b2d9e9 fix pre-commit
- 3652692 remove template yaml
- 1a5d036 fix conflict
- a873ace fix config
- ad23a5d fix pre-commit
- 0dc0f01 fix pre-commit
# -*- coding: utf-8 -*-
"""
.. _tuner:

Tuner
=================

AgentScope provides the ``tuner`` module for training agent applications using reinforcement learning (RL).
This tutorial shows how to use the ``tuner`` module to improve agent performance on specific tasks, including:

- Introducing the core components of the ``tuner`` module
- Demonstrating the key code required for the tuning workflow
- Showing how to configure and run the tuning process

Main Components
~~~~~~~~~~~~~~~~~~~
The ``tuner`` module introduces three core components essential for RL-based agent training:

- **Task Dataset**: A collection of tasks for training and evaluating the agent.
- **Workflow Function**: Encapsulates the agent's logic to be tuned.
- **Judge Function**: Evaluates the agent's performance on tasks and provides reward signals for tuning.
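Before diving into the API, it may help to see how these three components fit together. The sketch below is purely illustrative and is not the tuner's actual internals: the dummy workflow, judge, and "model" are stand-ins defined only for this snippet. On each iteration the workflow is rolled out on every task, the judge scores each rollout, and the resulting rewards would drive the RL update.

```python
import asyncio
from dataclasses import dataclass


# Illustrative stand-ins for the tuner's WorkflowOutput / JudgeOutput types.
@dataclass
class WorkflowOutput:
    response: str


@dataclass
class JudgeOutput:
    reward: float


async def dummy_workflow(task, model):
    # A real workflow would run an agent; here the "model" is a plain callable.
    return WorkflowOutput(response=model(task["question"]))


async def dummy_judge(task, response):
    # Reward 1.0 when the ground-truth answer appears in the response.
    return JudgeOutput(reward=1.0 if task["answer"] in response else 0.0)


async def rollout(tasks, workflow_func, judge_func, model):
    # One conceptual tuning iteration: roll out the workflow on each task,
    # score it with the judge, and collect the rewards for the RL update.
    rewards = []
    for task in tasks:
        out = await workflow_func(task, model)
        judged = await judge_func(task, out.response)
        rewards.append(judged.reward)
    return rewards


tasks = [
    {"question": "2+2?", "answer": "4"},
    {"question": "4+4?", "answer": "8"},
]
model = lambda q: "4"  # toy "model" that always answers 4
rewards = asyncio.run(rollout(tasks, dummy_workflow, dummy_judge, model))
# rewards -> [1.0, 0.0]
```

The actual RL update is handled by the backend (see the algorithm configuration later in this tutorial); only the workflow and judge functions are written by you.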
In addition, ``tuner`` provides several configuration classes for customizing the tuning process, including:

- **TunerModelConfig**: Model configurations for tuning purposes.
- **AlgorithmConfig**: Specifies the RL algorithm (e.g., GRPO, PPO) and its parameters.

Implementation
~~~~~~~~~~~~~~~~~~~
This section demonstrates how to use ``tuner`` to train a simple math agent.

Task Dataset
--------------------
The task dataset contains the tasks used to train and evaluate your agent.

Your dataset should follow the Hugging Face `datasets <https://huggingface.co/docs/datasets/quickstart>`_ format, so that it can be loaded with ``datasets.load_dataset``. For example:

.. code-block:: text

    my_dataset/
    ├── train.jsonl  # training samples
    └── test.jsonl   # evaluation samples

Suppose your ``train.jsonl`` contains:

.. code-block:: json

    {"question": "What is 2 + 2?", "answer": "4"}
    {"question": "What is 4 + 4?", "answer": "8"}

Before starting tuning, you can verify that your dataset loads correctly with:

.. code-block:: python

    from agentscope.tuner import DatasetConfig

    dataset = DatasetConfig(path="my_dataset", split="train")
    dataset.preview(n=2)
    # Prints the first two samples to verify correct loading
    # [
    #     {
    #         "question": "What is 2 + 2?",
    #         "answer": "4"
    #     },
    #     {
    #         "question": "What is 4 + 4?",
    #         "answer": "8"
    #     }
    # ]
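If you are building such a dataset from scratch, a ``.jsonl`` file is simply one JSON object per line. The stdlib sketch below writes the two samples above into a ``train.jsonl`` and reads them back; the temporary directory is only there to keep the snippet self-contained and leave no files behind.

```python
import json
import os
import tempfile

samples = [
    {"question": "What is 2 + 2?", "answer": "4"},
    {"question": "What is 4 + 4?", "answer": "8"},
]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "train.jsonl")

    # Write one JSON object per line (the JSONL format).
    with open(path, "w", encoding="utf-8") as f:
        for sample in samples:
            f.write(json.dumps(sample) + "\n")

    # Read it back line by line to confirm the round trip.
    with open(path, encoding="utf-8") as f:
        loaded = [json.loads(line) for line in f]
```

A round trip like this is a quick way to catch malformed lines before handing the files to ``DatasetConfig``.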
Workflow Function
--------------------
The workflow function defines how the agent interacts with the environment and makes decisions. All workflow functions should follow the input/output signature defined in ``agentscope.tuner.WorkflowType``.

Below is an example workflow function that uses a ReAct agent to answer math questions:
"""
from typing import Dict, Optional

from agentscope.agent import ReActAgent
from agentscope.formatter import OpenAIChatFormatter
from agentscope.message import Msg
from agentscope.model import ChatModelBase
from agentscope.tuner import WorkflowOutput


async def example_workflow_function(
    task: Dict,
    model: ChatModelBase,
    auxiliary_models: Optional[Dict[str, ChatModelBase]] = None,
) -> WorkflowOutput:
    """An example workflow function for tuning.

    Args:
        task (`Dict`): The task information.
        model (`ChatModelBase`): The chat model used by the agent.
        auxiliary_models (`Optional[Dict[str, ChatModelBase]]`): Additional
            chat models, generally used to simulate the behavior of other
            non-training agents in multi-agent scenarios.

    Returns:
        `WorkflowOutput`: The output generated by the workflow.
    """
    agent = ReActAgent(
        name="react_agent",
        sys_prompt="You are a helpful math problem solving agent.",
        model=model,
        formatter=OpenAIChatFormatter(),
    )

    # Extract the question from the task and send it to the agent
    response = await agent.reply(
        msg=Msg(
            "user",
            task["question"],
            role="user",
        ),
    )

    # Return the agent's response for judging
    return WorkflowOutput(
        response=response,
    )
# %%
# You can run this workflow function directly with a task dictionary and a
# ``DashScopeChatModel`` / ``OpenAIChatModel`` to verify its correctness before
# formal training. For example:

import asyncio
import os

from agentscope.model import DashScopeChatModel

task = {"question": "What is 123 plus 456?", "answer": "579"}
model = DashScopeChatModel(
    model_name="qwen-max",
    api_key=os.environ["DASHSCOPE_API_KEY"],
)
workflow_output = asyncio.run(example_workflow_function(task, model))
assert isinstance(
    workflow_output.response,
    Msg,
), "In this example, the response should be a Msg instance."
print("\nWorkflow response:", workflow_output.response.get_text_content())
# %%
# Judge Function
# --------------------
# The judge function evaluates the agent's performance on a given task and
# provides a reward signal for tuning.
# All judge functions should follow the input/output signature defined in
# ``agentscope.tuner.JudgeType``.
# Below is a simple judge function that compares the agent's response with the
# ground-truth answer:
from typing import Any

from agentscope.tuner import JudgeOutput


async def example_judge_function(
    task: Dict,
    response: Any,
    auxiliary_models: Optional[Dict[str, ChatModelBase]] = None,
) -> JudgeOutput:
    """A very simple judge function, for demonstration only.

    Args:
        task (`Dict`): The task information.
        response (`Any`): The response field from the WorkflowOutput.
        auxiliary_models (`Optional[Dict[str, ChatModelBase]]`): Additional
            chat models for LLM-as-a-judge purposes.

    Returns:
        `JudgeOutput`: The reward assigned by the judge.
    """
    ground_truth = task["answer"]
    reward = 1.0 if ground_truth in response.get_text_content() else 0.0
    return JudgeOutput(reward=reward)
judge_output = asyncio.run(
    example_judge_function(
        task,
        workflow_output.response,
    ),
)
print(f"Judge reward: {judge_output.reward}")
# %%
# The judge function can also be tested locally in the same way as shown above
# before formal training, to make sure its logic is correct.
#
# .. tip::
#     You can leverage existing `MetricBase <https://github.com/agentscope-ai/agentscope/blob/main/src/agentscope/evaluate/_metric_base.py>`_ implementations in your judge function to compute more sophisticated metrics and combine them into a composite reward.
#
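As a concrete illustration of a composite reward, the helper below blends an exact-match check with a brevity bonus. Note that the 0.8/0.2 weights and the 200-character cutoff are arbitrary assumptions for demonstration, not part of the ``agentscope.tuner`` API; in a real judge function you would return ``JudgeOutput(reward=composite_reward(...))``.

```python
def composite_reward(ground_truth: str, response_text: str) -> float:
    """Blend correctness with a brevity bonus into one scalar reward.

    The weights and the length cutoff below are illustrative choices,
    not part of the agentscope.tuner API.
    """
    correctness = 1.0 if ground_truth in response_text else 0.0
    brevity = 1.0 if len(response_text) <= 200 else 0.0  # discourage rambling
    return 0.8 * correctness + 0.2 * brevity
```

Shaping the reward this way can give the RL algorithm a denser signal than a bare 0/1 correctness score, at the cost of having to tune the weights.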
# Configuration and Running
# ~~~~~~~~~~~~~~~~~~~~~~~~~
# Finally, you can configure and run the tuning process with the ``tuner`` module.
# Before starting, make sure `Trinity-RFT <https://github.com/modelscope/Trinity-RFT>`_ is installed in your environment, as it is required for tuning.
#
# Below is an example of configuring and starting the tuning process:
#
# .. note::
#     This example is for demonstration only. For a complete runnable example, see `Tune ReActAgent <https://github.com/agentscope-ai/agentscope/tree/main/examples/tuner/react_agent>`_.
#
# .. code-block:: python
#
#     from agentscope.tuner import tune, AlgorithmConfig, DatasetConfig, TunerModelConfig
#
#     # your workflow / judge functions here...
#
#     if __name__ == "__main__":
#         dataset = DatasetConfig(path="my_dataset", split="train")
#         model = TunerModelConfig(model_path="Qwen/Qwen3-0.6B", max_model_len=16384)
#         algorithm = AlgorithmConfig(
#             algorithm_type="multi_step_grpo",
#             group_size=8,
#             batch_size=32,
#             learning_rate=1e-6,
#         )
#         tune(
#             workflow_func=example_workflow_function,
#             judge_func=example_judge_function,
#             model=model,
#             train_dataset=dataset,
#             algorithm=algorithm,
#         )
#
# Here, ``DatasetConfig`` configures the training dataset, ``TunerModelConfig`` sets the parameters of the trainable model, and ``AlgorithmConfig`` specifies the reinforcement learning algorithm and its hyperparameters.
#
# .. tip::
#     The ``tune`` function is built on `Trinity-RFT <https://github.com/modelscope/Trinity-RFT>`_ and internally converts its input parameters into a YAML configuration.
#     Advanced users can skip the ``model``, ``train_dataset``, and ``algorithm`` arguments and instead provide a YAML config file path via the ``config_path`` argument.
#     Using a configuration file is recommended for fine-grained control and for leveraging advanced Trinity-RFT features. See the Trinity-RFT `Configuration Guide <https://modelscope.github.io/Trinity-RFT/en/main/tutorial/trinity_configs.html>`_ for more options.
#
# Save the above code as ``main.py`` and run it with:
#
# .. code-block:: bash
#
#     ray start --head
#     python main.py
#
# Checkpoints and logs are automatically saved to the ``checkpoints/AgentScope`` directory under your workspace, with each run in a timestamped sub-directory. TensorBoard logs can be found in ``monitor/tensorboard`` within the checkpoint directory.
#
# .. code-block:: text
#
#     your_workspace/
#     └── checkpoints/
#         └── AgentScope/
#             └── Experiment-20260104185355/  # each run saved in a timestamped sub-directory
#                 ├── monitor/
#                 │   └── tensorboard/        # TensorBoard logs
#                 └── global_step_x/          # model checkpoint saved at step x
#
# .. tip::
#     For more tuning examples, refer to the `tuner directory <https://github.com/agentscope-ai/agentscope-samples/tree/main/tuner>`_ of the AgentScope-Samples repository.