The system is composed of a UI Extension for user interaction, a ReAct Agent for intelligent decision-making, and an MCP Server that acts as the intermediary between the agent and the Rancher API.
The ReAct Agent is the orchestrator of the system’s intelligence. It leverages a Large Language Model (LLM) as its reasoning engine, while the agent itself provides the structure, coordination, and secure execution of actions.
- Interprets user inputs expressed in natural language.
- Performs the reasoning: breaks down complex requests into smaller steps.
- Decides what should happen next (keep reasoning vs. take action).
- Synthesizes outputs from tools into clear, human-readable responses.
- Wraps the LLM with the ReAct (Reason + Act)
- Decides when to let the LLM keep reasoning or when to act on its instructions.
- Calls external tools (via the MCP Server) as directed by the LLM.
- Ensures secure interactions by passing the user’s Rancher token to the MCP server for authentication and authorization.
The MCP Server acts as a secure, controlled gateway between the ReAct Agent and the Rancher and Kubernetes APIs. Its functions include:
- Exposing Tools: It provides a set of well-defined, safe tools (API endpoints) that the ReAct Agent can call. These tools abstract away the complexity of direct Rancher/Kubernetes API interactions.
- Interaction with Rancher: It translates tool calls from the agent into the appropriate API requests to the Rancher management server, retrieving or modifying resources as needed.
The UI Extension provides the user-facing chat interface within the Rancher dashboard. It is designed to be a seamless part of the Rancher experience and is responsible for:
- User Input: It captures user queries and sends them to the ReAct Agent.
- Displaying Responses: It receives responses from the ReAct Agent and presents them in a chat-like format.
The UI Tools system allows the AI agent to intelligently select and dispatch specialized UI components (viewers, editors, comparators, etc.) to enhance responses. This is a generic mechanism that can be integrated into any user interface.
For comprehensive documentation on defining, configuring, and using UI tools, see UI Tools Configuration Guide. This covers:
- How to define custom UI tools
- Publishing tools via ConfigMaps
- Filtering tools via WebSocket requests
- Tool validation and schema properties
- Prompt keywords and best practices
For detailed documentation on the supervisor/child-agents architecture, middleware system, and human validation flow, see ARCHITECTURE.md.
- User Request → Agent → LLM
The user submits a natural language query through the UI Extension. - LLM Reasoning
The LLM interprets the request, reasons about the problem, and proposes a plan. - Agent Acting
If the plan requires external operations, the agent calls the appropriate MCP Server tools. - LLM Response Formulation
The LLM takes the tool outputs and crafts a coherent, human-readable response. - Response → User
The answer is returned to the user through the UI Extension.
graph TD
subgraph "A. User Interaction"
User[User / Analyst / DevOps]
end
subgraph "B. Rancher AI Assistant (Processing)"
Agent(Agent)
LLM(LLM local or online)
MCPServer(MCP Server)
end
subgraph "C. Infrastructure"
Kubernetes(Clusters)
end
%% Step 1: User sends intent
User -- "1. Query " --> Agent
%% Step 2 & 3: Initial Reasoning Loop
Agent -- "2. Tool Planning Request" --> LLM
LLM -- "3. Action Plan (e.g., list_pods tool)" --> Agent
%% Step 4: Action Execution
Agent -- "4. Execute Tool via MCP" --> MCPServer
%% Step 5 & 6: Cluster Interaction
MCPServer -- "5. Secure Action " --> Kubernetes
Kubernetes -- "6. Raw Results / Status" --> MCPServer
%% Step 7: Raw results back to Agent
MCPServer -- "7. Raw Tool Output" --> Agent
%% Step 8 & 9: Final Interpretation Loop (Crucial step)
Agent -- "8. Interpret Raw Output" --> LLM
LLM -- "9. Formatted Final Answer" --> Agent
%% Step 10: Final Response to User
Agent -- "10. Final Conversational Response" --> User
%% Styles
style User fill:#e0f2f7,stroke:#1e88e5,stroke-width:2px
style Agent fill:#d4edda,stroke:#28a745,stroke-width:2px
style LLM fill:#fff3cd,stroke:#ffc107,stroke-width:2px
style MCPServer fill:#cce5ff,stroke:#007bff,stroke-width:2px
style Kubernetes fill:#f8d7da,stroke:#dc3545,stroke-width:2px
- Add the Helm Repository First, add the Rancher AI Agent Helm repository to your cluster.
helm repo add rancher-ai https://rancher.github.io/rancher-ai-agent- Update Your Helm Repositories Fetch the latest list of charts from all your configured repositories.
helm repo update- Install the Chart Install the agent chart into a new namespace using the values.yaml file you created.
helm install rancher-ai-agent rancher-ai/rancher-ai-agent \
--namespace cattle-ai-agent-system \
--create-namespace \
--devel \
-f values.yaml
--create-namespace--devel: Includes pre-release versions of the chart, which is currently required.
-f values.yaml: Applies your custom LLM configuration.
Once the agent is installed, you need to install the UI extension
The AI Assistant maintains persistent chat history with REST API endpoints for accessing and managing conversations:
GET /api/chats- List all chat sessionsGET /api/chats/{id}/messages- Retrieve all messages for a specific chat sessionDELETE /api/chats/{id}- Delete a specific chat session
See the OpenAPI spec for full API details.
To use the AI agent a user needs get permission to the 'llm-config' secret and the 'http:rancher-ai-agent:80' services/proxy.
NOTE: This is a temporary solution. The following GlobalRole can be used to grant permissions to use the AI Agent:
apiVersion: management.cattle.io/v3
displayName: ai
kind: GlobalRole
metadata:
name: ai-agent
namespacedRules:
cattle-ai-agent-system:
- apiGroups:
- ''
resourceNames:
- http:rancher-ai-agent:80
resources:
- services/proxy
verbs:
- get
- apiGroups:
- ''
resourceNames:
- llm-config
resources:
- secrets
verbs:
- get
The AI Agent supports multiple authentication methods for connecting to MCP servers. The authentication type is configured per agent via the AIAgentConfig CRD's authenticationType field.
No authentication is applied. The agent connects to the MCP server without any credentials. Suitable for MCP servers running locally or behind a network-level security boundary.
authenticationType: NONEUses the current user's Rancher session token to authenticate with the MCP server. The token is passed in the request headers, allowing the MCP server to enforce Rancher RBAC on behalf of the user.
authenticationType: RANCHERHTTP Basic authentication using credentials stored in a Kubernetes secret. The secret must contain username and password keys.
authenticationType: BASIC
authenticationSecret: my-basic-auth-secretCustom header-based authentication. The agent sends a static token (e.g., API key) stored in a Kubernetes secret as a request header to the MCP server.
authenticationType: HEADER
authenticationSecret: my-header-auth-secretOAuth 2.0 with PKCE (Proof Key for Code Exchange) for MCP servers that implement the OAuth 2.0 authorization framework. It supports:
- OAuth Discovery: Automatically discovers authorization and token endpoints from the MCP server's
/.well-known/oauth-authorization-servermetadata. - Dynamic Client Registration (RFC 7591): If no client credentials are provided and the server supports it, the agent registers itself dynamically.
- Token Refresh: Before initiating a new authorization flow, the agent attempts to use a stored refresh token to silently obtain a new access token.
- PKCE Flow: Uses
S256code challenge for secure authorization code exchange.
authenticationType: OAUTH2
authenticationSecret: my-oauth-secret
OAuth Secret Format (when using static credentials):
The Kubernetes secret referenced by authenticationSecret should contain:
| Key | Required | Description |
|--------------------|----------|--------------------------------------------------------|
| clientID | Yes | The OAuth2 client identifier |
| clientSecret | No | The OAuth2 client secret (for confidential clients) |
| scope | No | Space-separated list of OAuth scopes |
| metadata_endpoint| Yes | Authorization server metadata endpoint (RFC 8414/OIDC) |
Flow:
- The agent detects a
401 Unauthorizedresponse from the MCP server. - It checks for an existing refresh token and attempts a silent token refresh.
- If no refresh token is available (or the refresh fails), it initiates the full OAuth flow by sending an authorization URL to the client.
- The user authenticates in a browser popup.
- The callback exchanges the authorization code for access and refresh tokens.
- The agent resumes the original request with the new access token.