feat(middleware): document new retry_model_request hook #747
Closed
5 commits
c92a1f1 feat(middleware): document new retry_model_request hook (christian-bromann)
d5616b3 Update src/oss/langchain/middleware.mdx (christian-bromann)
9bb4659 Update src/oss/langchain/middleware.mdx (christian-bromann)
bb571c5 cr (christian-bromann)
6f09d32 fix python imports (christian-bromann)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -44,11 +44,12 @@ graph TD | |
| </Card> | ||
|
|
||
| Middleware provides control over what happens before and after those steps. | ||
| Each middleware can add in three different types of modifiers: | ||
| Each middleware can add in four different types of modifiers: | ||
|
|
||
| :::python | ||
| - `before_model`: Runs before model execution. Can update state or jump to a different node (`model`, `tools`, `end`) | ||
| - `modify_model_request`: Runs before model execution, to prepare the model request object. Can only modify the current model request object (no permanent state updates) and cannot jump to a different node. | ||
| - `retry_model_request`: Runs when model execution fails. Can modify the model request to retry with different parameters or return None to propagate the error. | ||
| - `after_model`: Runs after model execution, before tools are executed. Can update state or jump to a different node (`model`, `tools`, `END`) | ||
|
|
||
| In addition to that, each middleware can define the following static properties: | ||
|
|
@@ -59,6 +60,7 @@ In addition to that, each middleware can define the following static properties: | |
| :::js | ||
| - `beforeModel`: Runs before model execution. Can update state or jump to a different node (`model`, `tools`, `end`) | ||
| - `modifyModelRequest`: Runs before model execution, to prepare the model request object. Can only modify the current model request object (no permanent state updates) and cannot jump to a different node. | ||
| - `retryModelRequest`: Runs when model execution fails. Can modify the model request to retry with different parameters or return undefined to propagate the error. | ||
| - `afterModel`: Runs after model execution, before tools are executed. Can update state or jump to a different node (`model`, `tools`, `__end__`) | ||
|
|
||
| In addition to that, each middleware can define the following static properties: | ||
|
|
@@ -69,10 +71,10 @@ In addition to that, each middleware can define the following static properties: | |
| ::: | ||
|
|
||
| :::python | ||
| An agent can contain `before_model`, `modify_model_request`, or `after_model` middleware. All three do not need to be implemented. | ||
| An agent can contain `before_model`, `modify_model_request`, `retry_model_request`, or `after_model` middleware. Not all four need to be implemented. | ||
| ::: | ||
| :::js | ||
| An agent can contain multiple middleware. Each middleware does not need to implement all three of `beforeModel`, `modifyModelRequest`, `afterModel`. | ||
| An agent can contain multiple middleware. Each middleware does not need to implement all four of `beforeModel`, `modifyModelRequest`, `retryModelRequest`, `afterModel`. | ||
| ::: | ||
|
|
||
| <Card> | ||
|
|
@@ -166,6 +168,7 @@ LangChain provides several built in middleware to use off-the-shelf | |
| - [Human-in-the-loop](#human-in-the-loop) | ||
| - [Anthropic prompt caching](#anthropic-prompt-caching) | ||
| - [Dynamic system prompt](#dynamic-system-prompt) | ||
| - [Model fallback](#model-fallback) | ||
|
|
||
| ### Summarization | ||
|
|
||
|
|
@@ -453,20 +456,98 @@ const agent = createAgent({ | |
| ``` | ||
| ::: | ||
|
|
||
| ## Custom Middleware | ||
| ### Model fallback | ||
|
|
||
| The `ModelFallbackMiddleware` retries failed model calls with alternative models. When a model call fails, it tries the next model in the fallback list, in order, until a call succeeds or all models have been exhausted. If every fallback model also fails, the error from the last attempt is raised. | ||
|
|
||
| **Use cases:** | ||
|
|
||
| - Handling model outages or rate limits | ||
| - Cost optimization by trying cheaper models first | ||
| - Ensuring high availability for critical applications | ||
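
The fallback sequence described above can be sketched in plain Python. This is only an illustration of the control flow, not the `ModelFallbackMiddleware` implementation: `call_with_fallbacks` and the stand-in model functions are hypothetical names, and re-raising the error from the last attempt is an assumption based on the description.

```python
from typing import Callable, Optional, Sequence

ModelFn = Callable[[str], str]  # hypothetical stand-in for a chat model


def call_with_fallbacks(primary: ModelFn, fallbacks: Sequence[ModelFn], prompt: str) -> str:
    """Try the primary model, then each fallback in order (illustrative helper)."""
    last_error: Optional[Exception] = None
    for model in (primary, *fallbacks):
        try:
            return model(prompt)
        except Exception as exc:  # a real implementation may filter error types
            last_error = exc
    # Every model failed: surface the error from the last attempt.
    assert last_error is not None
    raise last_error


def failing_model(prompt: str) -> str:
    raise RuntimeError("rate limited")


def working_model(prompt: str) -> str:
    return f"echo: {prompt}"


# The primary and the first fallback fail, so the second fallback answers.
answer = call_with_fallbacks(failing_model, [failing_model, working_model], "Hello")
```

Ordering the fallback list from most to least preferred keeps the common case on the primary model and only pays for alternatives when a call actually fails.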
|
|
||
| :::python | ||
| ```python | ||
| from langchain.agents import create_agent | ||
| from langchain.agents.middleware import ModelFallbackMiddleware | ||
| from langchain_core.messages import HumanMessage | ||
|
|
||
| agent = create_agent( | ||
| model="openai:gpt-4o", # Primary model | ||
| tools=[weather_tool, calculator_tool], | ||
| middleware=[ | ||
| ModelFallbackMiddleware( | ||
| "openai:gpt-4o-mini", # First fallback | ||
| "anthropic:claude-3-5-sonnet-20241022", # Second fallback | ||
| ), | ||
| ], | ||
| ) | ||
|
|
||
| # If gpt-4o fails, automatically tries gpt-4o-mini, then claude | ||
| result = agent.invoke({"messages": [HumanMessage("Hello")]}) | ||
| ``` | ||
| ::: | ||
|
|
||
| :::js | ||
| ```typescript | ||
| import { createAgent, modelFallbackMiddleware, HumanMessage } from "langchain"; | ||
|
|
||
| const agent = createAgent({ | ||
| model: "openai:gpt-4o", // Primary model | ||
| tools: [weatherTool, calculatorTool], | ||
| middleware: [ | ||
| modelFallbackMiddleware( | ||
| "openai:gpt-4o-mini", // First fallback | ||
| "anthropic:claude-3-5-sonnet-20241022" // Second fallback | ||
| ), | ||
| ], | ||
| }); | ||
|
|
||
| // If gpt-4o fails, automatically tries gpt-4o-mini, then claude | ||
| const result = await agent.invoke({ | ||
| messages: [new HumanMessage("Hello")] | ||
| }); | ||
| ``` | ||
| ::: | ||
|
|
||
| **Configuration:** | ||
|
|
||
| :::python | ||
| The `ModelFallbackMiddleware` constructor accepts fallback models in order of preference: | ||
|
|
||
| - `first_model`: The first fallback model (required) | ||
| - `*additional_models`: Additional fallback models in order | ||
|
|
||
| Models can be specified as: | ||
| - Model name strings (e.g., `"openai:gpt-4o-mini"`) | ||
| - `BaseChatModel` instances for pre-configured models | ||
| ::: | ||
|
|
||
| :::js | ||
| The `modelFallbackMiddleware` function accepts fallback models in order of preference: | ||
|
|
||
| - `...fallbackModels`: Fallback models in order of preference | ||
|
|
||
| Models can be specified as: | ||
| - Model name strings (e.g., `"openai:gpt-4o-mini"`) | ||
| - `LanguageModelLike` instances for pre-configured models | ||
| ::: | ||
|
|
||
| ## Custom middleware | ||
|
|
||
| Middleware for agents are subclasses of `AgentMiddleware`, which implement one or more of its hooks. | ||
|
|
||
| `AgentMiddleware` currently provides three different ways to modify the core agent loop: | ||
| `AgentMiddleware` currently provides four different ways to modify the core agent loop: | ||
|
|
||
| :::python | ||
| - `before_model`: runs before the model is run. Can update state or exit early with a jump. | ||
| - `modify_model_request`: runs before the model is run. Cannot update state or exit early with a jump. | ||
| - `retry_model_request`: runs when the model call fails. Can modify the request to retry or return None to propagate the error. | ||
| - `after_model`: runs after the model is run. Can update state or exit early with a jump. | ||
| ::: | ||
| :::js | ||
| - `beforeModel`: runs before the model is run. Can update state or exit early with a jump. | ||
| - `modifyModelRequest`: runs before the model is run. Cannot update state or exit early with a jump. | ||
| - `retryModelRequest`: runs when the model call fails. Can modify the request to retry or return undefined to propagate the error. | ||
| - `afterModel`: runs after the model is run. Can update state or exit early with a jump. | ||
| ::: | ||
|
|
||
|
|
@@ -607,6 +688,96 @@ const myMiddleware = createMiddleware({ | |
| ``` | ||
| ::: | ||
|
|
||
| :::python | ||
| ### `retry_model_request` | ||
| ::: | ||
| :::js | ||
| ### `retryModelRequest` | ||
| ::: | ||
|
|
||
| Runs when a model call fails with an exception. This hook allows middleware to handle errors and optionally retry the model call with modified parameters. | ||
|
|
||
| :::python | ||
| The `retry_model_request` hook is called with the following parameters: | ||
| - `error` (`Exception`): The exception that occurred during model invocation | ||
| - `request` (`ModelRequest`): The original model request that failed | ||
| - `state` (`AgentState`): The current agent state | ||
| - `runtime` (`Runtime`): The runtime context | ||
| - `attempt` (`int`): The current attempt number (1-indexed) | ||
|
|
||
| The hook can return: | ||
| - `ModelRequest`: A modified request to retry with | ||
| - `None`: Propagate the error (re-raise the exception) | ||
| ::: | ||
|
|
||
| :::js | ||
| The `retryModelRequest` hook is called with the following parameters: | ||
| - `error` (`Error`): The exception that occurred during model invocation | ||
| - `request` (`ModelRequest`): The original model request that failed | ||
| - `state` (agent state): The current agent state | ||
| - `runtime` (`Runtime`): The runtime context | ||
| - `attempt` (`number`): The current attempt number (1-indexed) | ||
|
|
||
| The hook can return: | ||
| - `ModelRequest`: A modified request to retry with | ||
| - `undefined`: Propagate the error (re-raise the exception) | ||
| ::: | ||
|
|
||
| **Key behaviors:** | ||
|
|
||
| - Multiple middleware that define this hook are processed in order | ||
| - The first middleware that returns a modified request will trigger a retry | ||
| - Subsequent middleware in the chain are not called for that attempt | ||
| - If no middleware returns a modified request, the original error is propagated | ||
| - There's a hard limit of 100 attempts to prevent infinite loops | ||
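
The key behaviors above can be sketched as a small dispatch loop. This is a simplified model under stated assumptions, not the library's code: `Request`, `run_with_retries`, and the hook functions are hypothetical, the hooks omit the `state` and `runtime` parameters, and the way the 100-attempt cap is counted is a guess.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Optional, Sequence

MAX_ATTEMPTS = 100  # hard limit from the list above; exact counting is assumed


@dataclass(frozen=True)
class Request:  # stand-in for ModelRequest, illustration only
    model: str


RetryHook = Callable[[Exception, Request, int], Optional[Request]]


def run_with_retries(call_model: Callable[[Request], str], request: Request,
                     hooks: Sequence[RetryHook]) -> str:
    attempt = 1  # 1-indexed, as in the hook parameters
    while True:
        try:
            return call_model(request)
        except Exception as error:
            if attempt >= MAX_ATTEMPTS:
                raise
            new_request = None
            for hook in hooks:  # hooks run in the order they were passed in
                new_request = hook(error, request, attempt)
                if new_request is not None:
                    break  # first modified request wins; later hooks are skipped
            if new_request is None:
                raise  # nobody wants to retry: propagate the original error
            request = new_request
            attempt += 1


hook_log: List[str] = []


def flaky_model(request: Request) -> str:
    if request.model == "primary":
        raise RuntimeError("primary unavailable")
    return f"ok from {request.model}"


def skip_hook(error, request, attempt):
    hook_log.append("skip_hook")
    return None


def switch_hook(error, request, attempt):
    hook_log.append("switch_hook")
    return replace(request, model="backup") if attempt == 1 else None


def unreachable_hook(error, request, attempt):
    hook_log.append("unreachable_hook")
    return None


result = run_with_retries(flaky_model, Request("primary"),
                          [skip_hook, switch_hook, unreachable_hook])
```

In this run `skip_hook` declines, `switch_hook` returns a modified request, and `unreachable_hook` is never consulted for that attempt.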
|
|
||
| Signature: | ||
| :::python | ||
| ```python | ||
| from langchain.agents import AgentState | ||
| from langchain.agents.middleware import ModelRequest, AgentMiddleware | ||
| from langgraph.runtime import Runtime | ||
|
|
||
| class RetryMiddleware(AgentMiddleware): | ||
| def retry_model_request( | ||
| self, | ||
| error: Exception, | ||
| request: ModelRequest, | ||
| state: AgentState, | ||
| runtime: Runtime, | ||
| attempt: int | ||
| ) -> ModelRequest | None: | ||
| # Example: Switch to a fallback model on the first retry | ||
| if attempt == 1: | ||
| # Modify the request to use a different model | ||
| request.model = "openai:gpt-4o-mini" | ||
| return request | ||
| # Don't retry after first attempt | ||
| return None | ||
| ``` | ||
| ::: | ||
| :::js | ||
| ```typescript | ||
| import { createMiddleware } from "langchain"; | ||
|
|
||
| const retryMiddleware = createMiddleware({ | ||
| name: "RetryMiddleware", | ||
| retryModelRequest: (error, request, state, runtime, attempt) => { | ||
| // Example: Switch to a fallback model on the first retry | ||
| if (attempt === 1) { | ||
| // Modify the request to use a different model | ||
| return { | ||
| ...request, | ||
| model: "openai:gpt-4o-mini", | ||
| }; | ||
| } | ||
| // Don't retry after first attempt | ||
| return undefined; | ||
| }, | ||
| }); | ||
| ``` | ||
| ::: | ||
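
A hook can also use the `error` and `attempt` arguments together to decide whether a retry is worthwhile. The following is a hedged sketch of such a policy as a standalone function: `ModelRequestStub`, `RateLimitError`, and `retry_policy` are hypothetical names, and the `state` and `runtime` parameters of the real hook are left out for brevity.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class ModelRequestStub:  # stand-in for ModelRequest, illustration only
    model: str


class RateLimitError(Exception):
    """Hypothetical transient error type used for this example."""


FALLBACK_MODELS = ["openai:gpt-4o-mini", "anthropic:claude-3-5-sonnet-20241022"]


def retry_policy(error: Exception, request: ModelRequestStub,
                 attempt: int) -> Optional[ModelRequestStub]:
    # Only transient failures are retried; any other error propagates.
    if not isinstance(error, RateLimitError):
        return None
    # attempt is 1-indexed, so attempt 1 maps to the first fallback model.
    if attempt > len(FALLBACK_MODELS):
        return None
    return replace(request, model=FALLBACK_MODELS[attempt - 1])


first = retry_policy(RateLimitError(), ModelRequestStub("openai:gpt-4o"), 1)
second = retry_policy(RateLimitError(), ModelRequestStub("openai:gpt-4o"), 2)
exhausted = retry_policy(RateLimitError(), ModelRequestStub("openai:gpt-4o"), 3)
not_transient = retry_policy(ValueError("bad input"), ModelRequestStub("openai:gpt-4o"), 1)
```

Returning a copy of the request rather than mutating it keeps the failed request available for logging; whether the real `ModelRequest` should be copied or mutated is not something this sketch settles.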
|
|
||
| :::python | ||
| ### `after_model` | ||
| ::: | ||
|
|
@@ -897,11 +1068,13 @@ You can provide multiple middlewares. They are executed in the following logic: | |
| :::python | ||
| **`before_model`**: Are run in the order they are passed in. If an earlier middleware exits early, then following middleware are not run | ||
| **`modify_model_request`**: Are run in the order they are passed in. | ||
| **`retry_model_request`**: Are run in the order they are passed in when a model call fails. The first middleware that returns a modified request triggers a retry, and subsequent middleware are not called for that attempt. | ||
| **`after_model`**: Are run in the _reverse_ order that they are passed in. If an earlier middleware exits early, then following middleware are not run | ||
| ::: | ||
| :::js | ||
| **`beforeModel`**: Are run in the order they are passed in. If an earlier middleware exits early, then following middleware are not run | ||
| **`modifyModelRequest`**: Are run in the order they are passed in. | ||
| **`retryModelRequest`**: Are run in the order they are passed in when a model call fails. The first middleware that returns a modified request triggers a retry, and subsequent middleware are not called for that attempt. | ||
| **`afterModel`**: Are run in the _reverse_ order that they are passed in. If an earlier middleware exits early, then following middleware are not run | ||
| ::: | ||
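
The ordering rules above can be made concrete with a tiny simulation. It is a sketch only: `run_model_step` is a hypothetical function that records hook names instead of calling real middleware, and it leaves out early exits and the retry path.

```python
from typing import List, Sequence


def run_model_step(middleware_names: Sequence[str]) -> List[str]:
    """Record the order in which hooks would fire for one model step."""
    log: List[str] = []
    for name in middleware_names:            # passed-in order
        log.append(f"{name}.before_model")
    for name in middleware_names:            # passed-in order
        log.append(f"{name}.modify_model_request")
    log.append("model call")
    for name in reversed(middleware_names):  # reverse order
        log.append(f"{name}.after_model")
    return log


order = run_model_step(["A", "B"])
```

With two middleware `A` and `B`, `A` runs first before the model call and last after it, so each middleware's hooks wrap the ones passed in after it.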
|
|
||
|
|
@@ -977,7 +1150,7 @@ Use middleware to dynamically select which tools are available at runtime based | |
|
|
||
| ```python | ||
| from langchain.agents import create_agent | ||
| from langchain.agents.middleware import AgentState, ModelRequest, modify_model_request | ||
| from langchain.agents.middleware.types import AgentState, ModelRequest, modify_model_request | ||
|
||
|
|
||
| @modify_model_request | ||
| def tool_selector(state: AgentState, request: ModelRequest) -> ModelRequest: | ||
|
|
||