diff --git a/modules/ROOT/nav.adoc b/modules/ROOT/nav.adoc index d4f2736..87859c8 100644 --- a/modules/ROOT/nav.adoc +++ b/modules/ROOT/nav.adoc @@ -27,14 +27,19 @@ ** xref:mcp:create-server.adoc[Create a server] ** xref:mcp:register-remote.adoc[Register a self-managed server] ** xref:mcp:user-delegated-oauth.adoc[User-delegated OAuth] +** xref:mcp:oauth-providers.adoc[Configure an OAuth Provider] ** xref:mcp:test-tools.adoc[Test a server's tools] ** xref:mcp:managed/index.adoc[Managed catalog] *** xref:mcp:managed/managed-catalog.adoc[Catalog reference] *** xref:mcp:managed/sql.adoc[SQL] *** xref:mcp:managed/kafka.adoc[Kafka] *** xref:mcp:managed/slack.adoc[Slack] +*** xref:mcp:managed/ironclad.adoc[Ironclad] *** xref:mcp:managed/jira.adoc[Jira] *** xref:mcp:managed/openapi.adoc[OpenAPI] +*** xref:mcp:managed/ramp.adoc[Ramp] +*** xref:mcp:managed/workday.adoc[Workday] +*** xref:mcp:managed/zendesk.adoc[Zendesk] * xref:ai-gateway:index.adoc[AI Gateway] ** xref:ai-gateway:overview.adoc[Overview] ** xref:ai-gateway:gateway-quickstart.adoc[Quickstart] @@ -47,7 +52,7 @@ **** xref:ai-gateway:admin/setup-guide.adoc[Setup guide] *** xref:ai-gateway:builders/index.adoc[For Builders] **** xref:ai-gateway:builders/discover-gateways.adoc[Discover gateways] -* xref:governance:index.adoc[Trust & Governance] +* xref:governance:index.adoc[Governance] ** xref:governance:dashboard/index.adoc[Governance dashboard] *** xref:governance:dashboard/overview.adoc[Read the governance overview] *** xref:governance:dashboard/agent-network.adoc[Agent network] @@ -73,6 +78,7 @@ ** xref:integrations:continue.adoc[Continue] ** xref:integrations:cline.adoc[Cline] ** xref:integrations:copilot.adoc[GitHub Copilot] +** xref:integrations:remote-mcp-clients.adoc[Remote MCP clients (Claude Desktop, ChatGPT, Gemini)] * xref:reference:index.adoc[Reference] ** xref:reference:glossary.adoc[Glossary] ** xref:reference:api.adoc[API reference] diff --git a/modules/ROOT/partials/ai-hub/configure-ai-hub.adoc 
b/modules/ROOT/partials/ai-hub/configure-ai-hub.adoc index 40a77f2..d7b8d0f 100644 --- a/modules/ROOT/partials/ai-hub/configure-ai-hub.adoc +++ b/modules/ROOT/partials/ai-hub/configure-ai-hub.adoc @@ -427,8 +427,6 @@ For ejection instructions, see xref:ai-gateway/admin/eject-to-custom-mode.adoc[] == Next steps -Now that you've configured your AI Hub gateway: - -* xref:ai-gateway/builders/use-ai-hub-gateway.adoc[Share this guide with builders] - Help your teams connect to the gateway -* xref:ai-gateway/admin/eject-to-custom-mode.adoc[Learn about ejecting to Custom mode] - Understand the transition path if you need more control -* xref:ai-gateway/gateway-architecture.adoc[Deep dive into architecture] - Understand how AI Hub routing works +* xref:ai-gateway/builders/use-ai-hub-gateway.adoc[Share this guide with builders] +* xref:ai-gateway/admin/eject-to-custom-mode.adoc[Learn about ejecting to Custom mode] +* xref:ai-gateway/gateway-architecture.adoc[Deep dive into architecture] diff --git a/modules/ROOT/partials/ai-hub/eject-to-custom-mode.adoc b/modules/ROOT/partials/ai-hub/eject-to-custom-mode.adoc index 6b6db36..c27a4cc 100644 --- a/modules/ROOT/partials/ai-hub/eject-to-custom-mode.adoc +++ b/modules/ROOT/partials/ai-hub/eject-to-custom-mode.adoc @@ -391,8 +391,6 @@ To get back to AI Hub mode: == Next steps -Now that you've ejected to Custom mode: - -* xref:ai-gateway/admin/setup-guide.adoc[Complete Custom mode configuration] - Configure routing rules and backend pools -* xref:ai-gateway:routing-cel.adoc[Learn CEL routing patterns] - Write powerful routing expressions -* xref:ai-gateway/gateway-architecture.adoc[Understand architecture] - Deep dive into Custom mode architecture +* xref:ai-gateway/admin/setup-guide.adoc[Complete Custom mode configuration] +* xref:ai-gateway:routing-cel.adoc[Learn CEL routing patterns] +* xref:ai-gateway/gateway-architecture.adoc[Understand architecture] diff --git a/modules/ROOT/partials/ai-hub/gateway-modes.adoc 
b/modules/ROOT/partials/ai-hub/gateway-modes.adoc index 55bd4a7..32075cf 100644 --- a/modules/ROOT/partials/ai-hub/gateway-modes.adoc +++ b/modules/ROOT/partials/ai-hub/gateway-modes.adoc @@ -257,16 +257,9 @@ For detailed instructions on ejecting to Custom mode, see xref:ai-gateway/admin/ == Next steps -Now that you understand gateway modes: - -*For Administrators:* - -* xref:ai-gateway/admin/configure-ai-hub.adoc[Configure AI Hub Gateway] - Set up AI Hub mode -* xref:ai-gateway/admin/setup-guide.adoc[Setup Guide] - Configure Custom mode -* xref:ai-gateway/admin/eject-to-custom-mode.adoc[Eject to Custom Mode] - Transition from AI Hub to Custom - -*For Builders:* - -* xref:ai-gateway/builders/use-ai-hub-gateway.adoc[Use AI Hub Gateway] - Connect to AI Hub gateways -* xref:ai-gateway/builders/discover-gateways.adoc[Discover Gateways] - Find available gateways -* xref:ai-gateway/builders/connect-your-agent.adoc[Connect Your Agent] - Integrate your application +* xref:ai-gateway/admin/configure-ai-hub.adoc[Configure AI Hub Gateway] +* xref:ai-gateway/admin/setup-guide.adoc[Setup Guide] +* xref:ai-gateway/admin/eject-to-custom-mode.adoc[Eject to Custom Mode] +* xref:ai-gateway/builders/use-ai-hub-gateway.adoc[Use AI Hub Gateway] +* xref:ai-gateway/builders/discover-gateways.adoc[Discover Gateways] +* xref:ai-gateway/builders/connect-your-agent.adoc[Connect Your Agent] diff --git a/modules/ROOT/partials/ai-hub/use-ai-hub-gateway.adoc b/modules/ROOT/partials/ai-hub/use-ai-hub-gateway.adoc index ab1b3e7..4e8d6e7 100644 --- a/modules/ROOT/partials/ai-hub/use-ai-hub-gateway.adoc +++ b/modules/ROOT/partials/ai-hub/use-ai-hub-gateway.adoc @@ -420,8 +420,6 @@ Discuss your requirements with your administrator. 
They can either: == Next steps -Now that you're using an AI Hub gateway: - -* xref:ai-gateway/builders/connect-your-agent.adoc[Connect Your Agent] - Integrate AI agents with advanced patterns -* xref:ai-gateway/builders/discover-gateways.adoc[Discover Gateways] - Find other available gateways -* xref:ai-gateway:aggregation.adoc[MCP Aggregation] - Use tool aggregation with AI agents +* xref:ai-gateway/builders/connect-your-agent.adoc[Connect Your Agent] +* xref:ai-gateway/builders/discover-gateways.adoc[Discover Gateways] +* xref:ai-gateway:aggregation.adoc[MCP Aggregation] diff --git a/modules/ROOT/partials/integrations/claude-code-admin.adoc b/modules/ROOT/partials/integrations/claude-code-admin.adoc index 799e767..af82e65 100644 --- a/modules/ROOT/partials/integrations/claude-code-admin.adoc +++ b/modules/ROOT/partials/integrations/claude-code-admin.adoc @@ -492,5 +492,5 @@ Causes and solutions: == Next steps -* xref:ai-gateway:routing-cel.adoc[]: Implement advanced routing rules -* xref:mcp/remote/overview.adoc[]: Deploy Remote MCP servers for custom tools +* xref:ai-gateway:routing-cel.adoc[] +* xref:mcp/remote/overview.adoc[] diff --git a/modules/ROOT/partials/integrations/claude-code-user.adoc b/modules/ROOT/partials/integrations/claude-code-user.adoc index 804794c..819e64c 100644 --- a/modules/ROOT/partials/integrations/claude-code-user.adoc +++ b/modules/ROOT/partials/integrations/claude-code-user.adoc @@ -395,8 +395,8 @@ chmod 600 ~/.claude.json == Next steps -* xref:ai-gateway:aggregation.adoc[]: Configure deferred tool loading to reduce token costs -* xref:ai-gateway:routing-cel.adoc[]: Use CEL expressions to route Claude Code requests based on context +* xref:ai-gateway:aggregation.adoc[] +* xref:ai-gateway:routing-cel.adoc[] == Related pages diff --git a/modules/ROOT/partials/integrations/cline-admin.adoc b/modules/ROOT/partials/integrations/cline-admin.adoc index 469286f..991f97b 100644 --- a/modules/ROOT/partials/integrations/cline-admin.adoc +++ 
b/modules/ROOT/partials/integrations/cline-admin.adoc @@ -573,5 +573,5 @@ Causes and solutions: == Next steps -* xref:ai-gateway:routing-cel.adoc[]: Implement advanced routing rules -* xref:mcp/remote/overview.adoc[]: Deploy Remote MCP servers for custom tools +* xref:ai-gateway:routing-cel.adoc[] +* xref:mcp/remote/overview.adoc[] diff --git a/modules/ROOT/partials/integrations/cline-user.adoc b/modules/ROOT/partials/integrations/cline-user.adoc index a337b7b..fc7a92d 100644 --- a/modules/ROOT/partials/integrations/cline-user.adoc +++ b/modules/ROOT/partials/integrations/cline-user.adoc @@ -723,8 +723,8 @@ The gateway automatically blocks requests that would exceed the limit. == Next steps -* xref:ai-gateway:aggregation.adoc[]: Configure deferred tool loading to reduce token costs -* xref:ai-gateway:routing-cel.adoc[]: Use CEL expressions to route Cline requests based on task complexity +* xref:ai-gateway:aggregation.adoc[] +* xref:ai-gateway:routing-cel.adoc[] == Related pages diff --git a/modules/ROOT/partials/integrations/continue-admin.adoc b/modules/ROOT/partials/integrations/continue-admin.adoc index 441ff31..ec2f631 100644 --- a/modules/ROOT/partials/integrations/continue-admin.adoc +++ b/modules/ROOT/partials/integrations/continue-admin.adoc @@ -735,5 +735,5 @@ This is expected behavior, not a configuration issue: == Next steps -* xref:ai-gateway:routing-cel.adoc[]: Implement advanced routing rules -* xref:mcp/remote/overview.adoc[]: Deploy Remote MCP servers for custom tools +* xref:ai-gateway:routing-cel.adoc[] +* xref:mcp/remote/overview.adoc[] diff --git a/modules/ROOT/partials/integrations/continue-user.adoc b/modules/ROOT/partials/integrations/continue-user.adoc index 476d9ff..11856c3 100644 --- a/modules/ROOT/partials/integrations/continue-user.adoc +++ b/modules/ROOT/partials/integrations/continue-user.adoc @@ -838,8 +838,8 @@ Autocomplete rarely needs more than 256 tokens, while chat responses can vary. 
== Next steps -* xref:ai-gateway:aggregation.adoc[]: Configure deferred tool loading to reduce token costs -* xref:ai-gateway:routing-cel.adoc[]: Use CEL expressions to route Continue.dev requests based on context +* xref:ai-gateway:aggregation.adoc[] +* xref:ai-gateway:routing-cel.adoc[] == Related pages diff --git a/modules/ROOT/partials/integrations/cursor-admin.adoc b/modules/ROOT/partials/integrations/cursor-admin.adoc index ecbf4b4..006df4d 100644 --- a/modules/ROOT/partials/integrations/cursor-admin.adoc +++ b/modules/ROOT/partials/integrations/cursor-admin.adoc @@ -808,5 +808,5 @@ Causes and solutions: == Next steps -* xref:ai-gateway:routing-cel.adoc[]: Implement advanced routing rules for model prefix routing -* xref:mcp/remote/overview.adoc[]: Deploy Remote MCP servers for custom tools +* xref:ai-gateway:routing-cel.adoc[] +* xref:mcp/remote/overview.adoc[] diff --git a/modules/ROOT/partials/integrations/cursor-user.adoc b/modules/ROOT/partials/integrations/cursor-user.adoc index f78513a..c51dbc9 100644 --- a/modules/ROOT/partials/integrations/cursor-user.adoc +++ b/modules/ROOT/partials/integrations/cursor-user.adoc @@ -805,8 +805,8 @@ This sends only search + orchestrator tools initially, reducing token usage sign == Next steps -* xref:ai-gateway:aggregation.adoc[]: Configure deferred tool loading to work within Cursor's 40-tool limit -* xref:ai-gateway:routing-cel.adoc[]: Use CEL expressions to route Cursor requests based on context +* xref:ai-gateway:aggregation.adoc[] +* xref:ai-gateway:routing-cel.adoc[] == Related pages diff --git a/modules/ROOT/partials/integrations/github-copilot-admin.adoc b/modules/ROOT/partials/integrations/github-copilot-admin.adoc index a486cb3..d55a4f1 100644 --- a/modules/ROOT/partials/integrations/github-copilot-admin.adoc +++ b/modules/ROOT/partials/integrations/github-copilot-admin.adoc @@ -818,5 +818,4 @@ Causes and solutions: == Next steps -* xref:ai-gateway:routing-cel.adoc[]: Implement advanced routing rules for 
model aliasing - +* xref:ai-gateway:routing-cel.adoc[] diff --git a/modules/ROOT/partials/integrations/github-copilot-user.adoc b/modules/ROOT/partials/integrations/github-copilot-user.adoc index ec665db..16fc444 100644 --- a/modules/ROOT/partials/integrations/github-copilot-user.adoc +++ b/modules/ROOT/partials/integrations/github-copilot-user.adoc @@ -902,8 +902,8 @@ Generate project-specific cost reports from the gateway dashboard. == Next steps -* xref:ai-gateway:routing-cel.adoc[]: Use CEL expressions to route Copilot requests based on context -* xref:ai-gateway:aggregation.adoc[]: Learn about MCP tool integration (if using Copilot Workspace) +* xref:ai-gateway:routing-cel.adoc[] +* xref:ai-gateway:aggregation.adoc[] == Related pages diff --git a/modules/ROOT/partials/migration-guide.adoc b/modules/ROOT/partials/migration-guide.adoc index fac2184..eb880a0 100644 --- a/modules/ROOT/partials/migration-guide.adoc +++ b/modules/ROOT/partials/migration-guide.adoc @@ -873,5 +873,5 @@ A/B testing == Next steps -* xref:ai-gateway:routing-cel.adoc[]: Configure advanced routing policies. -* xref:ai-gateway:aggregation.adoc[]: Explore MCP aggregation. +* xref:ai-gateway:routing-cel.adoc[] +* xref:ai-gateway:aggregation.adoc[] diff --git a/modules/ROOT/partials/observability-logs.adoc b/modules/ROOT/partials/observability-logs.adoc index c9c4a44..8c10e8f 100644 --- a/modules/ROOT/partials/observability-logs.adoc +++ b/modules/ROOT/partials/observability-logs.adoc @@ -767,4 +767,4 @@ Note: Cost estimates are approximate. Use provider invoices for billing. == Next steps -* xref:ai-gateway/observability-metrics.adoc[]: Aggregate analytics and cost tracking. 
\ No newline at end of file +* xref:ai-gateway/observability-metrics.adoc[] diff --git a/modules/ROOT/partials/observability-metrics.adoc b/modules/ROOT/partials/observability-metrics.adoc index 15b8ca5..371b3b2 100644 --- a/modules/ROOT/partials/observability-metrics.adoc +++ b/modules/ROOT/partials/observability-metrics.adoc @@ -855,4 +855,4 @@ Solution: == Next steps -* xref:ai-gateway/observability-logs.adoc[]: View individual requests and debug issues. +* xref:ai-gateway/observability-logs.adoc[] diff --git a/modules/agents/pages/a2a-concepts.adoc b/modules/agents/pages/a2a-concepts.adoc index f42321a..5c96f76 100644 --- a/modules/agents/pages/a2a-concepts.adoc +++ b/modules/agents/pages/a2a-concepts.adoc @@ -118,4 +118,3 @@ The A2A protocol uses semantic versioning (major.minor.patch). Agents declare th * xref:integration-overview.adoc[] * xref:create-agent.adoc[] -* link:https://a2a.ag/spec[A2A Protocol Specification^] diff --git a/modules/agents/pages/quickstart.adoc b/modules/agents/pages/quickstart.adoc index dbaff5a..9298069 100644 --- a/modules/agents/pages/quickstart.adoc +++ b/modules/agents/pages/quickstart.adoc @@ -177,8 +177,6 @@ Common quickstart issue: == Next steps -You've created an agent that orchestrates MCP tools through natural language. 
Explore more: - * xref:overview.adoc[] * xref:create-agent.adoc[] * xref:system-prompts.adoc[] diff --git a/modules/ai-gateway/pages/admin/setup-guide.adoc b/modules/ai-gateway/pages/admin/setup-guide.adoc index 7cd5b6a..86eaa80 100644 --- a/modules/ai-gateway/pages/admin/setup-guide.adoc +++ b/modules/ai-gateway/pages/admin/setup-guide.adoc @@ -378,17 +378,5 @@ Users can then discover and connect to the gateway using the information provide == Next steps -*Configure and optimize:* - -// * xref:admin/manage-gateways.adoc[Manage Gateways] - List, edit, and delete gateways -* xref:routing-cel.adoc[CEL Routing Cookbook] - Advanced routing patterns -// * xref:admin/networking-configuration.adoc[Networking Configuration] - Configure private endpoints and connectivity - -//*Monitor and observe:* -// - -ifdef::integrations-available[] -*Integrate tools:* - -* xref:integrations/index.adoc[Integrations] - Admin guides for Claude Code, Cursor, and other tools -endif::[] +* xref:routing-cel.adoc[CEL Routing Cookbook] +* xref:integrations/index.adoc[Integrations] diff --git a/modules/ai-gateway/pages/builders/discover-gateways.adoc b/modules/ai-gateway/pages/builders/discover-gateways.adoc index a529b8b..205f8db 100644 --- a/modules/ai-gateway/pages/builders/discover-gateways.adoc +++ b/modules/ai-gateway/pages/builders/discover-gateways.adoc @@ -302,7 +302,4 @@ echo -e "\n=== Gateway validated successfully ===" == Next steps -* xref:connect-agent.adoc[Connect Your Agent] - Integrate your application -// * xref:builders/available-models.adoc[Available Models] - Learn about model selection and routing -// * xref:builders/use-mcp-tools.adoc[Use MCP Tools] - Access tools from MCP servers -// * xref:builders/monitor-your-usage.adoc[Monitor Your Usage] - Track requests and costs +* xref:connect-agent.adoc[Connect Your Agent] diff --git a/modules/ai-gateway/pages/configure-provider.adoc b/modules/ai-gateway/pages/configure-provider.adoc index 75516e1..1083639 100644 --- 
a/modules/ai-gateway/pages/configure-provider.adoc +++ b/modules/ai-gateway/pages/configure-provider.adoc @@ -23,6 +23,7 @@ After completing this guide, you will be able to: + // TODO: xref the secrets-management page for ADP once confirmed. +[[open-the-create-llm-provider-page]] == Open the Create LLM provider page . Sign in to ADP. @@ -63,10 +64,10 @@ The *Provider type* card shows five cards. Pick the one that matches your upstre |Call Claude Opus, Sonnet, and Haiku directly. Strong at coding, long-context reasoning, and tool use. Supports forwarding client `Authorization` headers to Anthropic for enterprise and Max-plan subscription passthrough (see <>). |*Google AI* -|Reach Gemini Pro, Flash, and multimodal models via Google AI Studio. Ideal for long-context workloads and image/video inputs. +|Reach Gemini Pro, Flash, and multimodal models through Google AI Studio. Ideal for long-context workloads and image/video inputs. |*AWS Bedrock* -|Invoke foundation models (Claude, Llama, Titan, Nova) hosted inside your AWS account. Requires an AWS region and credentials (static, STS-assumed role, or the default credential chain). +|Invoke foundation models (Claude, Llama, Titan, Nova, Mistral, AI21 Jamba) hosted inside your AWS account. Requires an AWS region and credentials (static, STS-assumed role, or the default credential chain). Supports the native Bedrock APIs (`InvokeModel`, `Converse`) and an OpenAI-compatible Chat Completions endpoint for `gpt-oss` models. See <> for picking the right model identifier. |*OpenAI-compatible* |Point at any OpenAI-compatible endpoint that ships `/v1/chat/completions` (vLLM, Ollama, LM Studio, LocalAI, Together, Groq, OpenRouter). Useful for self-hosted models and aggregator gateways. Requires a *Base URL*; authentication is optional. @@ -166,26 +167,114 @@ TIP: OpenAI-compatible endpoints can serve any model. Enter the exact model iden Models you select on this form become the catalog the provider exposes. 
Leave the list empty to allow every model the upstream catalog returns. -For *OpenAI*, *Anthropic*, *Google AI*, and *AWS Bedrock*, the form shows a picker backed by the provider's catalog. Pick from the list, or type a model identifier the catalog doesn't show. For *OpenAI-compatible*, the form takes a freeform list — type the exact identifiers your upstream serves. +For *OpenAI*, *Anthropic*, *Google AI*, and *AWS Bedrock*, the form shows a picker backed by the provider's catalog. Pick from the list, or type a model identifier the catalog doesn't show. For *OpenAI-compatible*, the form takes a freeform list: type the exact identifiers your upstream serves. + +For Bedrock, the picker exposes inference profiles, not raw foundation-model IDs. See <>. [NOTE] ==== Models are stored as structured `ProviderModel` entries (one entry per model, with the model name as the only required field). A future Phase 2 release will add per-model metadata such as custom pricing overrides. The legacy flat `models` field still works on writes for backward compatibility. ==== -After you create the provider, the detail page renders each model as a card with capability badges (for example, *Vision*, *Reasoning*, *Streaming*) lifted from the catalog. +After you create the provider, the detail page renders each model as a row with capability badges (*Vision*, *Reasoning*, *Streaming*, and others lifted from the catalog), the model's 7-day spend, and a link to the per-model detail page. The model list supports search and filtering. + +The detail page also carries a *Last 7 days* KPI strip (*TOTAL SPEND*, *REQUESTS*, *TOKENS*) with sparklines and _vs previous period_ deltas. *View all* on each card opens the relevant Governance drill-down (Spending, Requests, or Tokens) with this provider pre-filtered. See xref:governance:dashboard/overview.adoc[the governance overview] for the drill-down details. == Save and verify . Click *Create provider*. 
The button activates once *Name* and *Type* are both set; the right-hand *Summary* panel checks them off as you fill them in. -. On the provider's detail page, the *Connection* card shows your *Proxy URL*, *Discovery* URL, *Base URL*, and *API key ref*. Copy the *Proxy URL* — this is where your applications point. -. Scroll to the *Verify connection* section. Pick a model from the dropdown and click *Test Connection*. The status updates from "Not tested yet" to a pass/fail indicator. Use the *Show commands* disclosure if you want to see the equivalent curl or SDK call. +. On the provider's detail page, the *Connection* card shows your *Proxy URL*, *Discovery* URL, *Base URL*, and *API key ref*. Copy the *Proxy URL*: this is where your applications point. +. Scroll to the *Verify connection* section. Pick a model from the dropdown and click *Test Connection*. The status updates from _Not tested yet_ to a pass/fail indicator. Use the *Show commands* disclosure if you want to see the equivalent curl or SDK call. . To wire up an application, open *Connect your app* further down the page or follow xref:connect-agent.adoc[Connect your agent]. A successful Test Connection result confirms that the provider's credentials, region (Bedrock), and network path are all correct. If the call fails, see <>. +[[bedrock-inference-profiles]] +== AWS Bedrock: Inference profiles and IAM + +Bedrock has three concepts that affect how you configure a provider: foundation models, cross-region inference profiles, and IAM. Get these right and the *Test connection* check passes; get them wrong and you see `AccessDenied` or `ValidationException` errors. + +=== Foundation models versus inference profiles + +A *foundation model* is the base model AWS exposes (for example, `anthropic.claude-sonnet-4-6`). It runs in the AWS region you call. 
+ +A *cross-region inference profile* wraps a foundation model with a geography prefix that routes requests across multiple regions for higher availability and throughput. The prefix tells AWS which geography the request should run in: + +[cols="1,2"] +|=== +|Prefix |Geography + +|`us.` +|US regions + +|`eu.` +|EU regions + +|`apac.` +|Asia-Pacific regions + +|`au.` +|Australia regions + +|`jp.` +|Japan regions + +|`global.` +|Any region; routes for lowest cost +|=== + +Examples: `us.anthropic.claude-sonnet-4-6` (Claude Sonnet 4.6 routed across US regions), `eu.anthropic.claude-haiku-4-5` (Haiku 4.5 routed across EU regions). + +[IMPORTANT] +==== +Anthropic Claude 4.6+ models (Sonnet 4.6, Opus 4.6, Opus 4.7) cannot be invoked with the bare foundation-model ID; they require an inference profile. If you try the bare ID, Bedrock returns: + +> "Invocation of model ID … with on-demand throughput isn't supported. Retry your request with the ID or ARN of an inference profile that contains this model." + +Older 4.5 and earlier Claude models still accept bare IDs. +==== + +Pricing varies by profile. The bare foundation-model ID and the `global.` profile share AWS's headline rate; geo profiles (`us.`, `eu.`, `apac.`, `au.`, `jp.`) carry approximately a 10% cross-region inference premium. Use `global.` when you want the headline rate and don't need a specific geography; use `us.` / `eu.` / `apac.` when data residency matters. + +=== IAM policy patterns + +Bedrock IAM resources have different ARN structures depending on whether you reference a foundation model, a system-defined inference profile, or an account-scoped application inference profile. The provider's IAM principal needs `bedrock:InvokeModel` and `bedrock:InvokeModelWithResponseStream` on every resource it calls. 
+ +[cols="1,3"] +|=== +|Resource type |ARN shape + +|Foundation model +|`arn:aws:bedrock:\{region}::foundation-model/\{model-id}` (no account ID; AWS-owned) + +|System-defined inference profile +|`arn:aws:bedrock:\{region}:*:inference-profile/\{profile-id}` (wildcard account; system-defined) + +|Application inference profile (account-scoped) +|`arn:aws:bedrock:\{region}:\{account-id}:application-inference-profile/\{profile-id}` +|=== + +A minimal policy granting access to all foundation models plus all cross-region profiles: + +[source,json] +---- +{ + "Version": "2012-10-17", + "Statement": [{ + "Effect": "Allow", + "Action": ["bedrock:InvokeModel", "bedrock:InvokeModelWithResponseStream"], + "Resource": [ + "arn:aws:bedrock:*::foundation-model/*", + "arn:aws:bedrock:*:*:inference-profile/*" + ] + }] +} +---- + +For production, scope to specific models and regions instead of using wildcards. + [[anthropic-authorization-passthrough]] -== Anthropic: authorization passthrough +== Anthropic: Authorization passthrough If you want each client to authenticate against Anthropic with its own subscription (Claude Pro, Max, Team, or enterprise), enable *Auth passthrough* instead of configuring a server-side API key. In this mode: @@ -195,11 +284,37 @@ If you want each client to authenticate against Anthropic with its own subscript The provider detail page shows whether Auth passthrough is enabled in the *Connection* card. +== Browse providers in the list view + +The *LLM Providers* list page is the at-a-glance home for every provider in your dataplane. Open it from the sidebar's *LLM Providers* entry. + +[cols="1,3"] +|=== +|Column |What it shows + +|*Provider* +|User-given name plus the provider-type icon (OpenAI, Anthropic, Google, AWS Bedrock, OpenAI-compatible) and a copyable preview of the proxy base URL. + +|*Status* +|*Enabled* or *Disabled*. + +|*Models* +|First two model identifiers exposed by the provider, plus a `+N` overflow chip when more are configured. 
+ +|*Spend (7d)* +|Spend over the last 7 days with a small sparkline and a "vs previous period" delta. The window is fixed at 7 days on this view; longer-range analysis runs through the xref:governance:dashboard/overview.adoc[governance dashboard]. + +|*Updated* +|Relative timestamp of the last edit. +|=== + +A list/grid view toggle in the top-right switches between table and card layouts. The *Filter* button narrows the list by provider type, status, or name. The *Create provider* button opens the create flow described in <>. + == Edit, disable, or delete a provider -* *Edit*: click *Edit* on the detail page. You can change any field *except* `Name` and `Type`, which are immutable. Model lists, credential references, and the enabled state can all change. -* *Disable*: click *Disable* on the detail page. The provider remains in the list, but requests to its proxy URL are rejected until you enable it again. Use this when you want to pause traffic without losing configuration. -* *Delete*: scroll to the *Delete this provider* section at the bottom of the detail page and click *Delete*. The action is permanent; in-flight requests fail and downstream clients receive errors until reconfigured. +* *Edit*: Click *Edit* on the detail page. You can change any field *except* `Name` and `Type`, which are immutable. Model lists, credential references, and the enabled state can all change. +* *Disable*: Click *Disable* on the detail page. The provider remains in the list, but requests to its proxy URL are rejected until you enable it again. Use this when you want to pause traffic without losing configuration. +* *Delete*: Scroll to the *Delete this provider* section at the bottom of the detail page and click *Delete*. The action is permanent; in-flight requests fail and downstream clients receive errors until reconfigured. 
[[troubleshooting]] == Troubleshooting @@ -212,7 +327,10 @@ The provider detail page shows whether Auth passthrough is enabled in the *Conne |Confirm the secret exists in your dataplane's secret store and the reference in the provider configuration is spelled identically (`UPPER_SNAKE_CASE`, no typos). |Bedrock returns `AccessDenied` or region errors -|Verify the AWS region field matches the region where your Bedrock models are enabled. Bedrock model availability varies by region. +|Verify the AWS region field matches the region where your Bedrock models are enabled. Bedrock model availability varies by region. Confirm the IAM principal has `bedrock:InvokeModel` on the foundation-model and inference-profile ARNs you use. See <>. + +|Bedrock returns "Invocation of model ID … with on-demand throughput isn't supported" +|You called a Claude 4.6+ model with a bare foundation-model ID. Switch to an inference profile (for example, `us.anthropic.claude-sonnet-4-6` instead of `anthropic.claude-sonnet-4-6`). See <>. |Anthropic returns 401 when passthrough is enabled |Confirm the client is sending its own `Authorization` header and the *API key* field on the provider is empty. @@ -239,4 +357,4 @@ AI Gateway does not provide these capabilities. For current status, consult the == Next steps -* xref:connect-agent.adoc[Connect your agent]. Point your application's SDK at the proxy URL and make requests. +* xref:connect-agent.adoc[Connect your agent] diff --git a/modules/ai-gateway/pages/connect-agent.adoc b/modules/ai-gateway/pages/connect-agent.adoc index c6d127d..55c34d0 100644 --- a/modules/ai-gateway/pages/connect-agent.adoc +++ b/modules/ai-gateway/pages/connect-agent.adoc @@ -1,13 +1,13 @@ = Connect Your Agent -:description: Point your application or AI agent at an AI Gateway provider's proxy URL. 
Covers the URL shape, the rpai-based local auth flow, the OIDC client-credentials flow for CI, and SDK examples for OpenAI, Anthropic, Google AI, AWS Bedrock, and OpenAI-compatible endpoints. +:description: Point your application or AI agent at an AI Gateway provider's proxy URL. Covers the URL shape, the local auth flow with the `rpk ai` plugin, the OIDC client-credentials flow for CI, and SDK examples for OpenAI, Anthropic, Google AI, AWS Bedrock, and OpenAI-compatible endpoints. :page-topic-type: how-to :personas: app_developer :page-aliases: redpanda-cloud:ai-agents:ai-gateway/builders/connect-your-agent.adoc :learning-objective-1: Construct the proxy URL for an LLM provider you have configured -:learning-objective-2: Authenticate to AI Gateway using rpai for local development or OIDC client credentials for CI and programmatic clients +:learning-objective-2: Authenticate to AI Gateway using the `rpk ai` plugin for local development or OIDC client credentials for CI and programmatic clients :learning-objective-3: Send requests through the proxy URL with the SDK of your choice -This guide shows how to connect your glossterm:AI agent[] or application to AI Gateway. You'll construct the proxy URL for a provider you have already created, authenticate (with `rpai` for local development or with OIDC client credentials for CI), and send your first request with the SDK of your choice. +This guide shows how to connect your glossterm:AI agent[] or application to AI Gateway. You'll construct the proxy URL for a provider you have already created, authenticate (with the `rpk ai` plugin for local development or with OIDC client credentials for CI), and send your first request with the SDK of your choice. After completing this guide, you will be able to: @@ -18,7 +18,7 @@ After completing this guide, you will be able to: == Prerequisites * A configured LLM provider. If you haven't created one yet, see xref:configure-provider.adoc[Configure an LLM provider]. 
-* For local development: nothing else — you'll install the `rpai` CLI in the next section. +* For local development: nothing else; you'll install the `rpk ai` plugin in the next section. * For CI or programmatic clients: a Redpanda Cloud service account with OIDC client credentials. See xref:redpanda-cloud:security:cloud-authentication.adoc[Authenticate to Redpanda Cloud]. + // TODO: confirm whether ADP hosts its own service-account IAM post-standalone, or continues to share Redpanda Cloud Organization IAM. @@ -39,42 +39,43 @@ Every provider you create in AI Gateway gets its own proxy URL: AI Gateway forwards the request to the upstream provider, attaches the configured credentials, and records the request for observability. Your application never sees the upstream API key. -TIP: The provider detail page generates ready-to-run `rpai`-based snippets pre-filled with the correct proxy URL and paths. When in doubt, copy from the *Connect your app* section there. +TIP: The provider detail page generates ready-to-run snippets pre-filled with the correct proxy URL and paths. When in doubt, copy from the *Connect your app* section there. +[[authenticate-with-rpk-ai]] [[authenticate-with-rpai]] -== Authenticate with `rpai` (recommended for local development) +== Authenticate with `rpk ai` (recommended for local development) -The provider detail page surfaces an *Install rpai CLI* card with copy-pasteable steps. The flow is the same for every provider type: +The `rpk ai` plugin is distributed through `rpk`'s plugin manager. The provider detail page surfaces an *Install* card with copy-pasteable steps. The flow is the same for every provider type: -. Install the CLI. Pick the install method that matches your OS — for example, on macOS: +. Install the plugin: + [source,bash] ---- -brew install redpanda-data/tap/rpai +rpk plugin install ai ---- -+ -// TODO: confirm the canonical install methods for Linux and Windows once the standalone ADP UI ships. . 
Log in with the gateway URL from the provider's *Connection* card: + [source,bash] ---- -rpai auth login --server https://aigw..clusters.rdpa.co +rpk ai auth login --server https://aigw..clusters.rdpa.co ---- -. Point your SDK at the proxy URL and let `rpai auth token` mint a fresh token on each call. Set environment variables: +. Point your SDK at the proxy URL and let `rpk ai auth token` mint a fresh token on each call. Set environment variables: + [source,bash] ---- export PROXY_URL="/llm/v1/providers/" -export OPENAI_API_KEY="$(rpai auth token)" # or ANTHROPIC_API_KEY, etc. +export OPENAI_API_KEY="$(rpk ai auth token)" # or ANTHROPIC_API_KEY, etc. ---- -`rpai auth token` returns a short-lived OIDC access token. Refresh by running it again — most users wire it into a wrapper script or shell function. +`rpk ai auth token` returns a short-lived OIDC access token. Refresh by running it again: most users wire it into a wrapper script or shell function. + +TIP: The plugin supports named profiles for pointing at multiple gateways. Run `rpk ai profile create --dataplane-url --auth-mode device` to create one, then `rpk ai profile use ` to switch. See `rpk ai profile --help` for the full set of subcommands. == Authenticate with OIDC client credentials (CI and programmatic) -When `rpai` isn't available (CI runners, server-side processes, headless agents), use the OIDC `client_credentials` grant directly. Values are surfaced on the provider's *Connection* card; defaults at the time of writing are below. +When the `rpk ai` plugin isn't available (CI runners, server-side processes, headless agents), use the OIDC `client_credentials` grant directly. Values are surfaced on the provider's *Connection* card; defaults at the time of writing are below. [cols="1,2", options="header"] |=== @@ -174,7 +175,7 @@ IMPORTANT: Your client is responsible for refreshing tokens before they expire. * Proactively refresh at ~80% of the token's TTL to avoid failed requests. 
* `authlib` (Python) handles renewal automatically when you pass `token_endpoint` to `OAuth2Session`. * For other languages, cache the token and its expiry, then request a new token before the current one expires. -* If you're using `rpai`, just rerun `rpai auth token` — it handles refresh against the same OIDC endpoint. +* If you're using `rpk ai`, just rerun `rpk ai auth token`: it handles refresh against the same OIDC endpoint. == Send requests with your SDK @@ -183,7 +184,7 @@ The examples in this section assume you've set: [source,bash] ---- export PROXY_URL="/llm/v1/providers/" -export AUTH_TOKEN="$(rpai auth token)" # or an OIDC access token from above +export AUTH_TOKEN="$(rpk ai auth token)" # or an OIDC access token from above ---- [tabs] @@ -197,7 +198,7 @@ from openai import OpenAI client = OpenAI( base_url=os.environ["PROXY_URL"], # .../llm/v1/providers/my-openai - api_key=os.environ["AUTH_TOKEN"], # rpai or OIDC access token + api_key=os.environ["AUTH_TOKEN"], # rpk ai or OIDC access token ) response = client.chat.completions.create( @@ -218,7 +219,7 @@ from anthropic import Anthropic client = Anthropic( base_url=os.environ["PROXY_URL"], # .../llm/v1/providers/my-anthropic - auth_token=os.environ["AUTH_TOKEN"], # rpai or OIDC access token + auth_token=os.environ["AUTH_TOKEN"], # rpk ai or OIDC access token ) message = client.messages.create( @@ -257,14 +258,16 @@ Gemini authenticates with the `x-goog-api-key` header, not `Authorization: Beare AWS Bedrock:: + -Bedrock is different: SigV4 signing is performed *server-side* by AI Gateway using the credentials on the provider. Your client only needs to call the proxy URL with an `rpai` or OIDC token. +Bedrock is different: SigV4 signing is performed *server-side* by AI Gateway using the credentials on the provider. Your client only needs to call the proxy URL with an `rpk ai` or OIDC token. 
+ [source,python] ---- import os, httpx +# Bedrock 4.6+ Anthropic models require an inference profile (us./eu./apac./global.). +# Replace with the inference profile your provider exposes. response = httpx.post( - f"{os.environ['PROXY_URL']}/model/anthropic.claude-3-5-sonnet-20241022-v2:0/invoke", + f"{os.environ['PROXY_URL']}/model/us.anthropic.claude-sonnet-4-6/invoke", headers={"Authorization": f"Bearer {os.environ['AUTH_TOKEN']}"}, json={ "anthropic_version": "bedrock-2023-05-31", @@ -274,8 +277,10 @@ response = httpx.post( ) print(response.json()) ---- + +See xref:configure-provider.adoc#bedrock-inference-profiles[the Bedrock provider reference] for inference-profile selection guidance. + -// TODO: verify Bedrock request shape end-to-end on adp-production once credentials are available; replace placeholder model ID with the inference profile your provider exposes. +TIP: Bedrock's `Converse` API works the same way: send to `/model/\{MODEL_ID}/converse` with a Converse-shaped body. Or use the AWS SDK's `bedrockruntime` client and set its `BaseEndpoint` to the proxy URL; the SDK signs the request, AI Gateway re-signs server-side with the provider's credentials, and your client never sees AWS keys. OpenAI-compatible:: + @@ -350,7 +355,7 @@ AI Gateway returns standard HTTP status codes. The upstream provider's error bod == Best practices * *Use environment variables* for the proxy URL and token; never hard-code them. -* *Wrap `rpai auth token`* in a script or shell function so refresh is invisible to your SDK code. +* *Wrap `rpk ai auth token`* in a script or shell function so refresh is invisible to your SDK code. * *Implement retry with exponential backoff* for 5xx and timeout conditions. * *Respect `Retry-After`* on 429 responses. * *Rotate service account credentials* on a schedule your organization accepts. @@ -360,7 +365,7 @@ AI Gateway returns standard HTTP status codes. 
The upstream provider's error bod === 401 Unauthorized -* If you're using `rpai`: rerun `rpai auth login` to refresh the session, then `rpai auth token` to mint a new token. +* If you're using `rpk ai`: rerun `rpk ai auth login` to refresh the session, then `rpk ai auth token` to mint a new token. * If you're using OIDC client credentials: check the token hasn't expired and refresh it. Verify the audience is `cloudv2-production.redpanda.cloud` and the `Authorization` header is formatted `Bearer `. * For Gemini: ensure the token is sent as `x-goog-api-key`, not `Authorization`. * For Anthropic with passthrough: ensure the client is sending a valid Anthropic `Authorization` header. @@ -383,4 +388,4 @@ AI Gateway returns standard HTTP status codes. The upstream provider's error bod == Next steps -* xref:configure-provider.adoc[Configure an LLM provider]. Add another provider to your dataplane. +* xref:configure-provider.adoc[Configure an LLM provider] diff --git a/modules/ai-gateway/pages/gateway-architecture.adoc b/modules/ai-gateway/pages/gateway-architecture.adoc index d832dd5..388931e 100644 --- a/modules/ai-gateway/pages/gateway-architecture.adoc +++ b/modules/ai-gateway/pages/gateway-architecture.adoc @@ -215,5 +215,5 @@ endif::[] == Next steps -* xref:gateway-quickstart.adoc[]: Route your first request through AI Gateway -* xref:aggregation.adoc[]: Configure MCP server aggregation for AI agents +* xref:gateway-quickstart.adoc[] +* xref:aggregation.adoc[] diff --git a/modules/ai-gateway/pages/gateway-quickstart.adoc b/modules/ai-gateway/pages/gateway-quickstart.adoc index 108a544..2b11f83 100644 --- a/modules/ai-gateway/pages/gateway-quickstart.adoc +++ b/modules/ai-gateway/pages/gateway-quickstart.adoc @@ -527,15 +527,8 @@ const openai = new OpenAI({ == Next steps -Explore advanced AI Gateway features: - -* xref:routing-cel.adoc[]: Advanced CEL routing patterns for traffic distribution and cost optimization -* xref:aggregation.adoc[]: Configure MCP server 
aggregation and deferred tool loading -ifdef::integrations-available[] -* xref:integrations/index.adoc[]: Connect more AI development tools -endif::[] - -Learn about the architecture: - -* xref:gateway-architecture.adoc[]: Technical architecture, request lifecycle, and deployment models -* xref:overview.adoc[]: Problems AI Gateway solves and common use cases +* xref:routing-cel.adoc[] +* xref:aggregation.adoc[] +* xref:integrations/index.adoc[] +* xref:gateway-architecture.adoc[] +* xref:overview.adoc[] diff --git a/modules/ai-gateway/pages/overview.adoc b/modules/ai-gateway/pages/overview.adoc index 838f614..5f38ffb 100644 --- a/modules/ai-gateway/pages/overview.adoc +++ b/modules/ai-gateway/pages/overview.adoc @@ -46,7 +46,7 @@ Use the provider's own SDK: OpenAI, Anthropic, Google AI, AWS Bedrock, or any Op === Managed authentication -Applications authenticate to ADP with OIDC service accounts instead of long-lived provider API keys. Service accounts use the same role and audit model as every other ADP resource, and mint short-lived tokens that are easy to revoke. The recommended local flow uses the `rpai` CLI for token refresh; CI and programmatic clients use the OIDC client-credentials grant directly. See xref:connect-agent.adoc[Connect your agent]. +Applications authenticate to ADP with OIDC service accounts instead of long-lived provider API keys. Service accounts use the same role and audit model as every other ADP resource, and mint short-lived tokens that are easy to revoke. The recommended local flow uses the `rpk ai` plugin for token refresh; CI and programmatic clients use the OIDC client-credentials grant directly. See xref:connect-agent.adoc[Connect your agent]. === Per-provider observability @@ -116,5 +116,5 @@ AI Gateway does not provide these capabilities. For current status, consult the == Next steps -. xref:configure-provider.adoc[Configure an LLM provider]. Create your first provider and copy its proxy URL. -. 
xref:connect-agent.adoc[Connect your agent]. Point your application's SDK at the proxy URL. +. xref:configure-provider.adoc[Configure an LLM provider] +. xref:connect-agent.adoc[Connect your agent] diff --git a/modules/ai-gateway/pages/routing-cel.adoc b/modules/ai-gateway/pages/routing-cel.adoc index 2aad059..a4aa6a5 100644 --- a/modules/ai-gateway/pages/routing-cel.adoc +++ b/modules/ai-gateway/pages/routing-cel.adoc @@ -946,7 +946,3 @@ Each request evaluates CEL expression once. Total latency impact: | Field exists | `has(request.body.max_tokens)` |=== - -== Next steps - -* *Apply CEL routing*: See the gateway configuration options available in ADP. diff --git a/modules/ai-gateway/partials/ai-hub/configure-ai-hub.adoc b/modules/ai-gateway/partials/ai-hub/configure-ai-hub.adoc index 40a77f2..d7b8d0f 100644 --- a/modules/ai-gateway/partials/ai-hub/configure-ai-hub.adoc +++ b/modules/ai-gateway/partials/ai-hub/configure-ai-hub.adoc @@ -427,8 +427,6 @@ For ejection instructions, see xref:ai-gateway/admin/eject-to-custom-mode.adoc[] == Next steps -Now that you've configured your AI Hub gateway: - -* xref:ai-gateway/builders/use-ai-hub-gateway.adoc[Share this guide with builders] - Help your teams connect to the gateway -* xref:ai-gateway/admin/eject-to-custom-mode.adoc[Learn about ejecting to Custom mode] - Understand the transition path if you need more control -* xref:ai-gateway/gateway-architecture.adoc[Deep dive into architecture] - Understand how AI Hub routing works +* xref:ai-gateway/builders/use-ai-hub-gateway.adoc[Share this guide with builders] +* xref:ai-gateway/admin/eject-to-custom-mode.adoc[Learn about ejecting to Custom mode] +* xref:ai-gateway/gateway-architecture.adoc[Deep dive into architecture] diff --git a/modules/ai-gateway/partials/ai-hub/eject-to-custom-mode.adoc b/modules/ai-gateway/partials/ai-hub/eject-to-custom-mode.adoc index 6b6db36..c27a4cc 100644 --- a/modules/ai-gateway/partials/ai-hub/eject-to-custom-mode.adoc +++ 
b/modules/ai-gateway/partials/ai-hub/eject-to-custom-mode.adoc @@ -391,8 +391,6 @@ To get back to AI Hub mode: == Next steps -Now that you've ejected to Custom mode: - -* xref:ai-gateway/admin/setup-guide.adoc[Complete Custom mode configuration] - Configure routing rules and backend pools -* xref:ai-gateway:routing-cel.adoc[Learn CEL routing patterns] - Write powerful routing expressions -* xref:ai-gateway/gateway-architecture.adoc[Understand architecture] - Deep dive into Custom mode architecture +* xref:ai-gateway/admin/setup-guide.adoc[Complete Custom mode configuration] +* xref:ai-gateway:routing-cel.adoc[Learn CEL routing patterns] +* xref:ai-gateway/gateway-architecture.adoc[Understand architecture] diff --git a/modules/ai-gateway/partials/ai-hub/gateway-modes.adoc b/modules/ai-gateway/partials/ai-hub/gateway-modes.adoc index 55bd4a7..32075cf 100644 --- a/modules/ai-gateway/partials/ai-hub/gateway-modes.adoc +++ b/modules/ai-gateway/partials/ai-hub/gateway-modes.adoc @@ -257,16 +257,9 @@ For detailed instructions on ejecting to Custom mode, see xref:ai-gateway/admin/ == Next steps -Now that you understand gateway modes: - -*For Administrators:* - -* xref:ai-gateway/admin/configure-ai-hub.adoc[Configure AI Hub Gateway] - Set up AI Hub mode -* xref:ai-gateway/admin/setup-guide.adoc[Setup Guide] - Configure Custom mode -* xref:ai-gateway/admin/eject-to-custom-mode.adoc[Eject to Custom Mode] - Transition from AI Hub to Custom - -*For Builders:* - -* xref:ai-gateway/builders/use-ai-hub-gateway.adoc[Use AI Hub Gateway] - Connect to AI Hub gateways -* xref:ai-gateway/builders/discover-gateways.adoc[Discover Gateways] - Find available gateways -* xref:ai-gateway/builders/connect-your-agent.adoc[Connect Your Agent] - Integrate your application +* xref:ai-gateway/admin/configure-ai-hub.adoc[Configure AI Hub Gateway] +* xref:ai-gateway/admin/setup-guide.adoc[Setup Guide] +* xref:ai-gateway/admin/eject-to-custom-mode.adoc[Eject to Custom Mode] +* 
xref:ai-gateway/builders/use-ai-hub-gateway.adoc[Use AI Hub Gateway] +* xref:ai-gateway/builders/discover-gateways.adoc[Discover Gateways] +* xref:ai-gateway/builders/connect-your-agent.adoc[Connect Your Agent] diff --git a/modules/ai-gateway/partials/ai-hub/use-ai-hub-gateway.adoc b/modules/ai-gateway/partials/ai-hub/use-ai-hub-gateway.adoc index ab1b3e7..4e8d6e7 100644 --- a/modules/ai-gateway/partials/ai-hub/use-ai-hub-gateway.adoc +++ b/modules/ai-gateway/partials/ai-hub/use-ai-hub-gateway.adoc @@ -420,8 +420,6 @@ Discuss your requirements with your administrator. They can either: == Next steps -Now that you're using an AI Hub gateway: - -* xref:ai-gateway/builders/connect-your-agent.adoc[Connect Your Agent] - Integrate AI agents with advanced patterns -* xref:ai-gateway/builders/discover-gateways.adoc[Discover Gateways] - Find other available gateways -* xref:ai-gateway:aggregation.adoc[MCP Aggregation] - Use tool aggregation with AI agents +* xref:ai-gateway/builders/connect-your-agent.adoc[Connect Your Agent] +* xref:ai-gateway/builders/discover-gateways.adoc[Discover Gateways] +* xref:ai-gateway:aggregation.adoc[MCP Aggregation] diff --git a/modules/governance/pages/budgets.adoc b/modules/governance/pages/budgets.adoc index 37e1a2d..5cb8ffa 100644 --- a/modules/governance/pages/budgets.adoc +++ b/modules/governance/pages/budgets.adoc @@ -93,6 +93,5 @@ Until those features ship, treat the dashboard and breakdown queries as your vis == Next steps -* Open the dashboard to see your current spend: xref:governance:dashboard/overview.adoc[Read the governance overview]. -* Investigate a specific agent's cost: xref:observability:transcripts.adoc[Read a transcript]. -* Configure platform-level safety filtering: xref:governance:guardrails.adoc[Configure guardrails]. 
+* xref:governance:dashboard/overview.adoc[Read the governance overview] +* xref:observability:transcripts.adoc[Read a transcript] diff --git a/modules/governance/pages/dashboard/overview.adoc b/modules/governance/pages/dashboard/overview.adoc index 5327bb7..e36fb5a 100644 --- a/modules/governance/pages/dashboard/overview.adoc +++ b/modules/governance/pages/dashboard/overview.adoc @@ -1,13 +1,17 @@ = Read the Governance Overview -:description: See your AI deployment's spending, fleet, and activity in one place; drill into the transcript behind any number to investigate further. +:description: See your AI deployment's spending, request volume, token usage, and agent fleet on a single page; filter by provider and model, compare to a previous period, and drill into Spending, Requests, or Tokens views. :page-topic-type: how-to :personas: evaluator, platform_admin, app_developer -// TODO: confirm persona vocabulary. The Governance V0 PRD names HoT (Head of Trust), CIO/CFO, CISO, and FDE; this page uses canonical docs-team-standards personas (`evaluator`, `platform_admin`, `app_developer`) and surfaces the PRD names only in the "Reading the dashboard for your role" section headings. Confirm with docs-team-standards owner whether to add `executive` and `security_admin` (or equivalents) so the persona metadata matches the PRD audience exactly. 
-:learning-objective-1: Identify the widgets on the governance overview and what each one shows -:learning-objective-2: Choose where to focus first based on your role -:learning-objective-3: Investigate a metric, agent, or event by opening its underlying transcript +:learning-objective-1: Identify the four KPI cards, the Spending chart, and the Agents list, and describe what each one surfaces +:learning-objective-2: Filter the Spending chart by provider, model, and cost type, and compare to a previous period +:learning-objective-3: Drill into the Spending, Requests, or Tokens views to investigate further -The governance overview shows your AI deployment's spending, fleet, and activity on a single page. Every number, agent, and event links to the transcript that produced it, so you can investigate without leaving the dashboard. +[IMPORTANT] +==== +The Governance dashboard described here is an early prototype. The design is in active flux, and details (including filter affordances, drill-down URL contracts, and chart visuals) may change before general availability. Verify against the live UI before relying on specifics. +==== + +The Governance overview shows your AI deployment's spend, requests, tokens, and agent fleet on a single page. Filter by provider and model, compare current activity against a previous period, and drill into dedicated Spending, Requests, or Tokens views to investigate further. After reading this page, you will be able to: @@ -18,198 +22,196 @@ After reading this page, you will be able to: == Prerequisites * Access to the Agentic Data Plane. -+ -// TODO: confirm sign-in URL and IAM/role requirement once the standalone ADP UI ships. -* At least one agent or MCP server with recent activity, or an empty deployment if you want to see the empty-state guidance below. -* Read access to the Spending, Agent registry, and MCP server APIs (`dataplane_adp_spending_get`, `dataplane_adp_agent_get`, `dataplane_adp_mcpserver_get`). 
-+ -// TODO: confirm role-to-permission mapping once the standalone ADP UI ships. +* At least one LLM provider with recent activity, or an empty deployment if you want to see the empty-state behavior. +* The `dataplane_adp_spending_get` permission. The dashboard composes data from `aigw.SpendingService` and the agent registry; without this permission the spend cards and chart are empty. -== Open the governance overview +== Open the dashboard -Sign in to the ADP UI and select *Governance* from the sidebar. The overview is the default view. +Sign in to ADP and select *Governance* from the sidebar. The overview is the default view. -// TODO: annotated screenshot of the V0 governance overview, with callouts for the five widgets (summary cards, token spend by provider, events timeline, agents table, MCP servers table). V0 prototype lands on `adp-production` 2026-05-08 per the PRD milestone. +// TODO: annotated screenshot of the Governance overview with callouts for the four cards, the Spending chart, and the Agents list. -The overview shows five widgets, top to bottom: +The page is laid out top to bottom in three sections: -. *Summary cards* — total spend, agent count, request count, trend versus the previous period. -. *Token spend by LLM provider* — provider-by-provider cost and request breakdown. -. *Events over time* — time series of activity, filterable by event type. -. *Agents* — every agent in your deployment, with status, error count, and tool invocation count. -. *MCP servers* — every MCP server, with type and connection status. +. Four summary cards. +. The *Spending* chart with provider, model, and cost-type filters. +. The *Agents* list. -== Reading each widget +A *date-range picker* in the top-right scopes everything on the page. -=== Summary cards +== Set the date range -The summary cards answer "how much are we using right now?" at a glance. 
+The date-range picker offers seven preset windows plus a custom calendar: -[cols="1,3"] -|=== -|Card |What it shows +* *Last 7 days* (default) +* *Last 14 days* +* *Last 30 days* +* *Last 90 days* +* *Month to date* +* *Quarter to date* +* *Year to date* -|*Total spend* -|Sum of all LLM provider costs in the current period. Reported in microcents (1 cent = 100 microcents, $1 = 10,000 microcents). Divide by 10,000 to get dollars. +For a custom range, pick start and end dates from the two-month calendar. -|*Agent count* -|Number of agents (managed plus BYOA) registered in your deployment. +A *Compare to previous period* toggle lives in the same popover. When on, every chart on the page renders the previous comparable window as a dashed-line overlay so you can spot week-over-week or month-over-month trends. Click *Apply* to commit the selection. -|*Request count* -|Total LLM requests routed through AI Gateway in the current period. +// TODO: screenshot of the date-range picker open with presets visible. -|*Trend* -|Each card carries a "vs last period" delta. The delta uses the same window length as the current period: a 30-day current view compares against the prior 30 days. Calculation: `(current - previous) / previous`. +== Read the four summary cards + +The cards across the top answer the _how much are you using right now_ question at a glance. + +[cols="1,3"] |=== +|Card |What it shows -The current period defaults to the last 30 days. Change the time-range selector at the top of the page to inspect a shorter or longer window. +|*TOTAL SPEND* +|Sum of all LLM provider costs in the active window. Displayed in dollars. Sparkline shows daily spend across the window. The _vs previous period_ delta compares to the prior window of equal length. -// TODO: confirm time-range selector defaults and allowable windows once the V0 prototype lands. +|*REQUESTS* +|Total LLM requests routed through AI Gateway in the active window. 
*View all →* opens the Requests drill-down (see <>). -=== Token spend by LLM provider +|*TOKENS* +|Total tokens consumed in the active window: input plus output plus cached, summed across all providers and models. *View all →* opens the Tokens drill-down. -Each row shows a single LLM provider (`openai`, `anthropic`, `bedrock`, …) with cost, request count, and token totals for the current period. Click a provider row to filter the events timeline and agents table to that provider. +|*AGENTS* +|Number of agents registered in your deployment. The sub-line breaks the total into running and other states (`9 running · 1 other`). The visual on this card is a bar chart, not a sparkline. +|=== -The breakdown is also available by model, by user, and by provider type. Use the dimension toggle above the chart to switch. +The first three cards (Spend, Requests, Tokens) carry sparklines and a _vs previous period_ delta when *Compare to previous period* is on. The Agents card carries a state breakdown instead. -// TODO: confirm dimension-toggle UX once the V0 prototype lands. +== Read the Spending chart -=== Events over time +The *Spending* card shows spend over time, broken out by provider or model. The chart subtitle explains what its lines represent for the active filter combination: -A time-series chart of activity across your deployment. +[quote] +Filter by provider, then drill into models. Cost type narrows what each line sums; hover for the per-day breakdown. -* For ranges shorter than 7 days, the chart uses *hourly* buckets. -* For ranges of 7 days or more, the chart uses *daily* buckets. +=== Filter the chart -Use the event-type filter to focus on a specific kind of activity (request, error, violation, agent state change). Each bar links to a filtered transcript view for the time bucket. +The *Filter* button opens a hierarchical popover with three sibling categories: *Provider*, *Model*, *Cost type*. Each category opens a search-filterable, multi-select list. 
Active filters display as chips above the chart; click a chip's *×* to remove it, or click *Clear* to reset all filters. -// TODO: confirm event-type filter values once the V0 prototype lands. +[cols="1,3"] +|=== +|Filter |What it does -=== Agents +|*Provider* +|Multi-select of every LLM provider configured in your dataplane. When 2+ providers are picked, the chart renders a colored line per provider. Chip operator: `is any of`. -Every agent in your deployment, managed or BYOA. The table lists: +|*Model* +|Multi-select of every model exposed by your providers. When models are picked, the chart drills into per-model lines instead of per-provider. Chip operator: `is any of`. -* *Name* — the agent's resource name. -* *Type* — managed or BYOA. -* *Status* — current runtime state. The state badge maps to the agent's runtime state: -+ -[cols="1,2"] +|*Cost type* +|Multi-select of *Input*, *Output*, *Cached*. Narrows what each line in the chart sums; the totals strip below the chart still reports all four buckets. Chip operator: `is` for one selection, `is any of` for two or more. |=== -|State |Display - -|`AGENT_STATE_RUNNING` -|*Running* -|`AGENT_STATE_STOPPED` -|*Stopped* +NOTE: *Cost type* narrows the chart's lines, not the totals strip. If you select `Cost type | is | Cached`, each line in the chart sums only cached spend; the *INPUT*, *OUTPUT*, and *TOTAL* values in the strip below still reflect the full active Provider/Model selection. -|`AGENT_STATE_STARTING` / `AGENT_STATE_CREATING` -|*Starting* (transient; refresh for the latest state) +=== Compare to a previous period -|`AGENT_STATE_STOPPING` / `AGENT_STATE_DELETING` -|*Stopping* (transient; refresh for the latest state) +When *Compare to previous period* is on (in the date-range picker), each line gets a dashed-line counterpart drawn from the same data for the previous comparable window. Hover anywhere on the chart to see a per-day, per-line breakdown with each value's *Previous* sub-row. 
-|`AGENT_STATE_DEGRADED` -|*Degraded* (the row shows `state_reason`) +// TODO: screenshot of the spend chart with previous-period overlay and a hover tooltip visible. -|`AGENT_STATE_FAILED` -|*Failed* (the row shows `state_reason`) -|=== +=== Read the cost buckets -* *Error count* — number of errored executions in the current period. -+ -// TODO: confirm error-count source (transcripts vs separate metric) once the V0 prototype lands. -* *Tool invocations* — number of tool calls the agent has made in the current period. -+ -// TODO: confirm tool-invocation source (transcripts vs separate metric). -* *Last active* — most recent execution timestamp. +Below the chart, four totals report the active Provider/Model selection: -Click any agent row to drill into its transcript history. +[cols="1,3"] +|=== +|Bucket |Meaning -=== MCP servers +|*TOTAL* +|Sum of input, output, and cached spend. -Every MCP server registered in your deployment. The table lists: +|*INPUT* +|Cost attributable to prompt tokens, including tool-use inputs. -* *Name* — the server's resource name. -* *Type* — managed or self-managed. -* *Status* — connection state. +|*OUTPUT* +|Cost attributable to completion tokens, including any reasoning tokens for models that emit them (Anthropic 4.x, OpenAI o-series). -For deeper MCP server reading, see xref:mcp:overview.adoc[MCP Servers]. +|*CACHED* +|Cost attributable to cache-read tokens (5-minute or 1-hour prompt caches). Billed at a steep discount versus fresh INPUT. +|=== -== Reading the dashboard for your role +The dashboard surfaces these as the cost-tracking pipeline records them. Specific multipliers (fast-mode premium, cross-region inference premium, tiered Gemini pricing) are baked into the per-bucket totals automatically; no manual conversion is required. -The dashboard is designed for non-technical leadership. Different roles read it differently. 
+== Read the Agents list -=== If you're a CIO or CFO +The Agents card lists every agent in your deployment, managed and BYOA. Columns: -Start with the *summary cards*. The total spend and trend tell you at a glance whether AI cost is moving in the expected direction. +[cols="1,3"] +|=== +|Column |What it shows -Next, scan *Token spend by LLM provider* to see which provider is the largest line item. If one provider dominates your spend, click into it to see which agents drive that cost. +|*Name* +|Agent resource name. -You should be able to answer "how much are we spending?" within 30 seconds of opening the page. +|*Type* +|*Redpanda Managed* or *BYOA*. -=== If you're a CISO +|*LLM provider* +|The provider the agent calls, with its provider-type icon. -Start with the *Agents* and *MCP servers* tables. They give you full visibility into what's running in the deployment, including BYOA agents that operate outside your direct control. +|*Model* +|The model identifier the agent uses. +|=== -Scan for unfamiliar agents, agents in `Failed` or `Degraded` states, or MCP servers in disconnected states. Click any anomaly to drill into the transcript. +A *Filter* button above the table narrows the list by name, type, provider, or model. -=== If you're a Head of Trust +NOTE: Per-agent state, error count, tool-invocation count, and last-active timestamp are not in the V0 dashboard; they ship at GA. Today, click into the agent's resource page for those details. -Start with the *Agents* table sorted by error count or tool-invocation count. Outlier rows are usually where investigation begins. +[[drill-down]] +== Drill into a specific view -Use the *Events over time* chart to spot bursts of activity that don't match your expectations. Drill into a bar to see exactly what happened in that time bucket. +The *View all →* link on each KPI card opens a dedicated drill-down page. 
Filters, model selection, and date range carry over as URL search parameters, so deep-links and back-navigation preserve the current state. -=== If you're demoing to a prospect +[cols="1,3"] +|=== +|View |What it shows -Walk top to bottom: summary cards, then provider breakdown, then events timeline, then agent fleet, then MCP servers. The page is designed to tell the full story of an AI deployment in under two minutes, with no setup or pre-staged data. +|*Spending* +|The same chart as the dashboard's Spending card, in a full-page layout. Bottom strip: TOTAL / INPUT / OUTPUT / CACHED. -== Investigate any number, agent, or activity +|*Requests* +|Per-provider and per-model request count over time. Cost-type filter does not apply. Bottom strip: TOTAL REQUESTS. -Every value on this page is a link. +|*Tokens* +|Input + output + cached token volume by provider and model. Each line sums all three token buckets per provider or model. Bottom strip: TOTAL TOKENS. +|=== -* Click a *summary card* to filter the page to its underlying period. -* Click a *provider row* to scope every other widget to that provider. -* Click an *agent row* to open the agent's transcript history. -* Click an *event bucket* to open transcripts for that time range. -* Click an *MCP server row* to see invocations against that server. +A `<` back arrow returns to the dashboard with filter state preserved. -For the data model behind transcripts, see xref:observability:transcripts.adoc[Read a transcript]. +You can also reach the Spending, Requests, and Tokens drill-downs from a specific provider's detail page. Open *LLM Providers > \{provider}* and click *View all →* on the provider-scoped KPI strip. The drill-down opens with that provider already filtered in. == Empty states -The dashboard handles three common empty states: - [cols="1,3"] |=== |State |What you see -|*No telemetry yet* -|For a BYOA agent that has not yet streamed any transcripts. 
The agent row shows a *Connect telemetry* call to action that links to xref:observability:byoa-telemetry.adoc[BYOA telemetry]. +|*No providers yet* +|The KPI cards show zeroes and a hint that data populates after the first LLM request. The Spending chart is hidden. The Agents list is empty with a link to xref:agents:create-agent.adoc[Create a declarative agent]. -|*No spend recorded* -|For fresh deployments before the first agent run. The summary cards show zeroes and a hint that data will populate after the first request. +|*Providers but no traffic* +|The KPI cards show zeroes; the chart is empty with axis labels. Useful for confirming a fresh provider is wired up before traffic flows. -|*No agents* -|For deployments before the first agent is created. The page links to xref:agents:create-agent.adoc[Create a declarative agent]. +|*Provider missing in filter* +|If a provider was deleted, its name disappears from the filter; historical lines for the deleted provider continue to render until they fall outside the active window. |=== -// TODO: confirm exact CTA text and behavior once the V0 prototype lands. - == What's coming at GA -The current beta release ships the V0 overview described above. Later in the GA window, the dashboard adds: - -* *Cost over time* and *cost drill-down by agent within a provider* on the same page. -* *Token-to-dollar conversion* on every cost figure. -* *Agent Network* — an interactive topology graph that traces every agent through its LLM providers and MCP servers. See xref:dashboard/agent-network.adoc[Agent Network] (shipping at GA). -* *Authorization denials and violations* — an aggregated feed of policy events. See xref:dashboard/violations.adoc[Authorization denials and violations] (shipping at GA). -* *Kill switch* — pause an agent without removing it. +The current beta release ships the V0 overview described above. At GA: -// TODO: refresh this list against the V1 design when those design docs land. 
Per the Governance V0 PRD, V1 also adds a top-spenders table and agent-owner metadata; confirm the final V1 surface before GA. +* *Per-agent enrichment*: The Agents list adds error counts, tool-invocation counts, last-active timestamps, and a per-agent spend column. +* *Agent Network*: An interactive topology graph that traces every agent through its LLM providers and MCP servers. See xref:governance:dashboard/agent-network.adoc[Agent Network]. +* *Authorization denials and violations*: An aggregated feed of policy events. See xref:governance:dashboard/violations.adoc[Authorization denials and violations]. +* *Kill switch*: Pause an agent without removing it. -== Next steps +== Investigate further -* Investigate an anomaly: xref:observability:transcripts.adoc[Read a transcript]. -* Wire up missing telemetry: xref:observability:byoa-telemetry.adoc[BYOA telemetry]. -* Add agents to your deployment: xref:agents:create-agent.adoc[Create a declarative agent]. +* xref:observability:transcripts.adoc[Read a transcript]: Full conversation history for an agent run. +* xref:governance:budgets.adoc[Token budgets and limits]: Set spend caps when a provider's cost trends above expectations. +* xref:observability:byoa-telemetry.adoc[BYOA telemetry]: Wire up missing telemetry for BYOA agents that haven't streamed transcripts yet. +* xref:agents:create-agent.adoc[Create a declarative agent]: Add agents to your deployment. diff --git a/modules/governance/pages/guardrails/cost-tracking.adoc b/modules/governance/pages/guardrails/cost-tracking.adoc index e7eae3a..7c87aa4 100644 --- a/modules/governance/pages/guardrails/cost-tracking.adoc +++ b/modules/governance/pages/guardrails/cost-tracking.adoc @@ -65,6 +65,6 @@ A typical optimization: disable Toxicity on `INPUT` and run it only on `OUTPUT`. == Next steps -* xref:governance:guardrails/types-reference.adoc[Evaluator types reference] — config schemas per evaluator type. 
-* xref:governance:budgets.adoc[Token budgets and limits] — the spending-event pipeline that aggregates guardrail and user-facing LLM cost. -* xref:governance:dashboard/overview.adoc[Read the governance overview] — provider-breakdown view that shows guardrail-attributed spend. +* xref:governance:guardrails/types-reference.adoc[Evaluator types reference] +* xref:governance:budgets.adoc[Token budgets and limits] +* xref:governance:dashboard/overview.adoc[Read the governance overview] diff --git a/modules/governance/pages/guardrails/create-guardrail.adoc b/modules/governance/pages/guardrails/create-guardrail.adoc index 1182a4c..a2aba8f 100644 --- a/modules/governance/pages/guardrails/create-guardrail.adoc +++ b/modules/governance/pages/guardrails/create-guardrail.adoc @@ -29,15 +29,15 @@ After reading this page, you will be able to: // TODO: finalize this section once the ADP UI ships a Guardrails surface. As of 2026-04-28, `apps/adp-ui/src/routes/` has no `guardrails/` route. The walkthrough may need to lead with `aigwctl` instead of the UI and add the UI flow in a later refresh. Open Qs C4 and C5 in the companion plan. -In the ADP UI, open *Trust & Governance* → *Guardrails* → *Create guardrail*. +In the ADP UI, open *Governance* → *Guardrails* → *Create guardrail*. == Pick an evaluator type Choose one of the supported evaluator types: -* *PII* — detects personally identifiable information using regex and entity-recognition rules. No per-call LLM cost. -* *Toxicity* — runs content through a toxicity classifier. Per-call LLM cost. -* *Custom webhook* — delegates the decision to your HTTPS endpoint. Gateway charges nothing per call. +* *PII*: detects personally identifiable information using regex and entity-recognition rules. No per-call LLM cost. +* *Toxicity*: runs content through a toxicity classifier. Per-call LLM cost. +* *Custom webhook*: delegates the decision to your HTTPS endpoint. Gateway charges nothing per call. 
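As a rough illustration of what regex-based PII detection looks like, the sketch below scans text against two hand-written patterns. It is a minimal example under stated assumptions, not the gateway's evaluator: the pattern set, the `PII_PATTERNS` name, and the `find_pii` helper are all hypothetical, and a real evaluator would combine many more rules with entity recognition.

```python
import re

# Hypothetical, deliberately small pattern set for illustration only.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def find_pii(text: str) -> list[tuple[str, str]]:
    """Return (kind, matched_text) pairs for every pattern hit in text."""
    hits = []
    for kind, pattern in PII_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((kind, match.group()))
    return hits
```

Because a check like this is plain pattern matching, it needs no model call, which is consistent with the PII type carrying no per-call LLM cost.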
For each type's full config schema and behavior, see xref:governance:guardrails/types-reference.adoc[Evaluator types reference]. @@ -47,15 +47,15 @@ For each type's full config schema and behavior, see xref:governance:guardrails/ Pick the phase or phases at which the evaluator runs: -* `INPUT` — runs against the user's prompt before the gateway forwards it upstream. -* `OUTPUT` — runs against the model's response before the gateway returns it to the caller. -* `BOTH` — runs at both phases. +* `INPUT`: runs against the user's prompt before the gateway forwards it upstream. +* `OUTPUT`: runs against the model's response before the gateway returns it to the caller. +* `BOTH`: runs at both phases. Decision rule: * PII guardrails typically run at `BOTH` (defend data exfiltration in both directions). * Toxicity guardrails typically run at `OUTPUT` only (filter what the model generates; INPUT-side toxicity filtering rarely improves outcomes). -* Custom webhook depends on what your webhook does — start with `INPUT` for prompt-injection heuristics, `OUTPUT` for brand-safety lists, `BOTH` for either-direction checks. +* Custom webhook depends on what your webhook does: start with `INPUT` for prompt-injection heuristics, `OUTPUT` for brand-safety lists, `BOTH` for either-direction checks. == Configure the evaluator @@ -65,13 +65,13 @@ Fill in the per-type config block. The form fields differ per evaluator type; se == Attach to LLM providers -Select one or more LLM providers to attach the guardrail to. Multi-attach is supported — one guardrail can apply to many providers. +Select one or more LLM providers to attach the guardrail to. Multi-attach is supported: one guardrail can apply to many providers. // TODO: confirm whether guardrails also attach at other scopes (agents, MCP servers, organizations). The pre-pivot proto attached via `provider_ids[]` and `route_ids[]`; routes were removed in cloudv2 commit `7eff2ecbbf`. Open Qs A3, A4 in the companion plan. 
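The phase and attachment rules above can be sketched as a small Python model. This is only a mental model under stated assumptions: the `Guardrail` class, the `runs_on` method, and the provider names are hypothetical and do not reflect the gateway's actual schema or API.

```python
from dataclasses import dataclass, field
from enum import Enum


class Phase(Enum):
    INPUT = "INPUT"
    OUTPUT = "OUTPUT"
    BOTH = "BOTH"


@dataclass
class Guardrail:
    # Hypothetical in-memory model; not the gateway's real resource shape.
    name: str
    phase: Phase
    provider_ids: set[str] = field(default_factory=set)

    def runs_on(self, provider_id: str, request_phase: Phase) -> bool:
        """True when this guardrail evaluates the provider's traffic at request_phase."""
        if provider_id not in self.provider_ids:
            return False
        return self.phase is Phase.BOTH or self.phase is request_phase


# One guardrail attached to many providers (multi-attach), one attached to a single provider.
pii = Guardrail("pii-both", Phase.BOTH, {"openai-prod", "anthropic-prod"})
toxicity = Guardrail("toxicity-output", Phase.OUTPUT, {"openai-prod"})
```

In this model, `pii` evaluates both prompts and responses for either provider, while `toxicity` evaluates only responses from `openai-prod`.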
== Enable the guardrail -Toggle the guardrail to *Enabled*. Disabled guardrails skip evaluation entirely — useful when staging a new policy before turning it on, or when troubleshooting whether a guardrail is responsible for unexpected blocks. +Toggle the guardrail to *Enabled*. Disabled guardrails skip evaluation entirely: useful when staging a new policy before turning it on, or when troubleshooting whether a guardrail is responsible for unexpected blocks. == Verify the guardrail fires @@ -90,20 +90,20 @@ The request should return an error. Open the request's transcript and confirm a == Edit, disable, or delete -* *Edit* — change the per-type config or the attached providers. Changes apply on the next request. -* *Disable* — short-circuit the middleware without losing the config. Useful when staging or troubleshooting. -* *Delete* — permanently remove the guardrail. If the guardrail is currently firing on production traffic, the UI requires confirmation. +* *Edit*: change the per-type config or the attached providers. Changes apply on the next request. +* *Disable*: short-circuit the middleware without losing the config. Useful when staging or troubleshooting. +* *Delete*: permanently remove the guardrail. If the guardrail is currently firing on production traffic, the UI requires confirmation. // TODO: confirm exact UI labels and the delete-confirmation copy once the UI ships. Open Q C4 in the companion plan. == Troubleshooting -* *Evaluator returns false positives* — see xref:governance:guardrails/violations.adoc[Read violations] for tuning patterns per evaluator type. -* *Evaluator times out or is unavailable* — see xref:governance:guardrails/violations.adoc[Read violations] for the evaluator-down section. -* *Attached provider doesn't fire the guardrail* — confirm attachment (right provider, right phase), enabled state, and that requests are actually reaching the gateway (not bypassing via a direct provider URL). 
+* *Evaluator returns false positives*: see xref:governance:guardrails/violations.adoc[Read violations] for tuning patterns per evaluator type. +* *Evaluator times out or is unavailable*: see xref:governance:guardrails/violations.adoc[Read violations] for the evaluator-down section. +* *Attached provider doesn't fire the guardrail*: confirm attachment (right provider, right phase), enabled state, and that requests are actually reaching the gateway (not bypassing via a direct provider URL). == Next steps -* xref:governance:guardrails/types-reference.adoc[Evaluator types reference] — config schemas and gotchas per evaluator type. -* xref:governance:guardrails/violations.adoc[Read violations] — investigate fired guardrails and tune false-positive rates. -* xref:governance:guardrails/cost-tracking.adoc[Cost tracking] — see what each evaluator costs and where it shows up. +* xref:governance:guardrails/types-reference.adoc[Evaluator types reference] +* xref:governance:guardrails/violations.adoc[Read violations] +* xref:governance:guardrails/cost-tracking.adoc[Cost tracking] diff --git a/modules/governance/pages/guardrails/violations.adoc b/modules/governance/pages/guardrails/violations.adoc index 8ac2e53..52ea5f8 100644 --- a/modules/governance/pages/guardrails/violations.adoc +++ b/modules/governance/pages/guardrails/violations.adoc @@ -92,6 +92,6 @@ Per the AI Gateway design, evaluators run async where possible — specifically, == Next steps -* xref:governance:guardrails/types-reference.adoc[Evaluator types reference] — config schemas and per-type tuning surface. -* xref:governance:guardrails/cost-tracking.adoc[Cost tracking] — what each evaluator costs and where the cost surfaces. -* xref:observability:transcripts.adoc[Read a transcript] — the full transcript walkthrough for finding a specific violation. 
+* xref:governance:guardrails/types-reference.adoc[Evaluator types reference] +* xref:governance:guardrails/cost-tracking.adoc[Cost tracking] +* xref:observability:transcripts.adoc[Read a transcript] diff --git a/modules/governance/pages/index.adoc b/modules/governance/pages/index.adoc index f0b0181..88395c0 100644 --- a/modules/governance/pages/index.adoc +++ b/modules/governance/pages/index.adoc @@ -1,3 +1,3 @@ -= Trust & Governance += Governance :description: Govern agent behavior with guardrails, token budgets, the kill switch, and the governance dashboard. :page-layout: index diff --git a/modules/integrations/pages/remote-mcp-clients.adoc b/modules/integrations/pages/remote-mcp-clients.adoc new file mode 100644 index 0000000..6e60bfd --- /dev/null +++ b/modules/integrations/pages/remote-mcp-clients.adoc @@ -0,0 +1,228 @@ += Connect Remote MCP Clients to AI Gateway +:description: Connect external MCP clients (Claude Desktop, ChatGPT desktop, Gemini Apps) to MCP servers hosted in AI Gateway. Covers the three-piece architecture (MCP server, OAuth Provider, OAuth Client) and the two-step OAuth flow that runs end-to-end. +:page-topic-type: how-to +:personas: app_developer, platform_admin +:learning-objective-1: Register an OAuth Client in AI Gateway so an external chat app can authenticate users +:learning-objective-2: Wire a custom connector in Claude Desktop (or another chat client) to your MCP server +:learning-objective-3: Walk a user through the two-step OAuth flow that runs end-to-end + +External MCP clients (Claude Desktop, ChatGPT desktop, Gemini Apps, Cursor) connect to MCP servers hosted in AI Gateway by registering as OAuth clients. End-users get the MCP tools inside their preferred chat app, with Redpanda mediating both client-app authentication and upstream-system authentication. 
+ +After completing this guide, you will be able to: + +* [ ] {learning-objective-1} +* [ ] {learning-objective-2} +* [ ] {learning-objective-3} + +== When to use this + +Use a remote MCP client connection when: + +* You want users to invoke MCP tools from inside Claude Desktop, ChatGPT desktop, Gemini Apps, or Cursor without writing custom integration code. +* You already have, or are about to create, an MCP server (managed or self-managed) in AI Gateway. +* End-users have accounts with the chat client and the upstream system you're integrating with. + +Use a different approach when: + +* You need programmatic, server-side tool invocation. See xref:ai-gateway:connect-agent.adoc[Connect your agent] for SDK-based access. +* You need an in-house chat UI. Build against the AI Gateway's MCP endpoints directly with the SDK of your choice. + +== Architecture: Three resources work together + +Wiring a remote chat client to an MCP server uses three resources in AI Gateway: + +[cols="1,3"] +|=== +|Resource |Role + +|*MCP server* +|The tool surface itself. Managed (Redpanda hosts it) or self-managed (you host it). See xref:mcp:create-server.adoc[Create an MCP Server]. + +|*OAuth Provider* +|Defines how AI Gateway authenticates against the upstream system on behalf of users (for example, GitHub) when the MCP server uses user-delegated OAuth. See xref:mcp:oauth-providers.adoc[Configure an OAuth Provider]. Optional: only needed if the MCP server requires per-user upstream identity. + +|*OAuth Client* +|Defines how an external chat app (Claude Desktop, ChatGPT, Gemini, Cursor) authenticates *against AI Gateway* on behalf of users. The chat client gets a `client_id` + `client_secret` it uses to negotiate access tokens. *This is what makes the chat-client integration possible.* +|=== + +Putting it together with a GitHub example: + +* The *MCP server* is a managed GitHub MCP, configured to use user-delegated OAuth. 
+* The *OAuth Provider* points at GitHub's OAuth endpoints; AI Gateway uses it to act as each user against GitHub. +* The *OAuth Client* is registered for Claude Desktop; Claude Desktop uses it to act as each user against AI Gateway. + +When a user invokes a tool, AI Gateway runs both auth handshakes: Claude to AI Gateway through the OAuth Client, then AI Gateway to GitHub through the OAuth Provider. + +== Prerequisites + +Before you wire up the chat-client connector, make sure you have: + +* An MCP server already created in AI Gateway. See xref:mcp:create-server.adoc[Create an MCP Server]. +* The MCP server's *API URL*. Copy it from the server's *Overview* tab. +* For user-delegated MCP servers: an OAuth Provider configured for the upstream system. See xref:mcp:oauth-providers.adoc[Configure an OAuth Provider]. +* End-users have accounts with the chat client (Claude, ChatGPT, Gemini, Cursor) and the upstream system the MCP server connects to. + +== Register an OAuth Client in AI Gateway + +Create an OAuth Client to give the chat app the credentials it needs to authenticate against AI Gateway: + +. Sign in to ADP and open *OAuth Clients* in the sidebar. The page lists every external tool registered to request access tokens from this gateway. +. Click *Register Client*. +. Fill in the form: ++ +[cols="1,3"] +|=== +|Field |Notes + +|*Name* +|Human-readable identifier. Pick something descriptive: `Claude Desktop (GitHub MCP)` or `Cursor (production)`. + +|*Grant types* +|*Authorization Code* and *Refresh Token*. This is the standard combination for browser-based chat clients with long-lived tokens. Most external chat clients require both. + +|*Redirect URIs* +|The URIs the gateway redirects to after a user approves. Each chat client publishes its own callback URLs; copy them from the chat client's connector documentation. Multiple URIs are allowed. 
Claude Desktop, for example, uses two: `\https://claude.ai/api/mcp/auth_callback` and `\https://claude.ai/api/organizations/custom-connectors/oauth/callback`. +|=== ++ +. Click *Register*. + +After registration, the detail page shows the *Client ID* and *Client Secret*. Copy both. The Client Secret is shown only once; rotate it through the secret store if you need to recover it later. + +// TODO: capture screenshots of the Register Client form and the post-create detail page. + +NOTE: OAuth Clients and OAuth Providers are different resources with different sidebar entries. *Provider* defines how AI Gateway authenticates against an upstream system; *Client* defines how an external app authenticates against AI Gateway. They are configured independently and the same upstream (GitHub, Slack, Atlassian) often has both: one per direction. + +== Wire up Claude + +Anthropic supports custom MCP connectors in Claude.ai (web), Claude Desktop, and the Claude organization-settings UI. The setup flow is the same in each: + +. Open *Settings > Connectors* (or *Customize > Connectors* in newer builds; Anthropic surfaces a _Connectors have moved to Customize_ notice during the migration). +. Click *Add custom connector*. The *Add custom connector BETA* modal opens. +. Fill in the connector details: ++ +[cols="1,3"] +|=== +|Field |Value + +|*Name* +|Anything that helps the user identify the connector (for example, `Redpanda GitHub`). Surfaces in Claude's tool list. + +|*Remote MCP server URL* +|The MCP server's *API URL* from AI Gateway. Format: `\https://aigw..clusters.rdpa.co/mcp/v1/`. + +|*OAuth client ID* (optional, under *Advanced settings*) +|The Client ID from the AI Gateway OAuth Client. Required for any MCP server that requires authentication. Leave blank only for public, no-auth MCP servers. + +|*OAuth client secret* (under *Advanced settings*) +|The Client Secret from the AI Gateway OAuth Client. Required whenever Client ID is set. +|=== ++ +. Click *Add*. 
The connector appears in the Connectors list with a `CUSTOM` badge. +. Click *Connect* on the new connector row. Claude opens a browser tab pointed at AI Gateway's authorization endpoint. Sign in with your AI Gateway identity (Auth0 today, Zitadel in a future release). Once approved, the connector becomes invokable in any conversation. + +// TODO: capture screenshots of the Add custom connector modal and the post-connect Connectors list against `adp-production`. + +NOTE: Anthropic's modal warns that connectors are user-trust-based; Anthropic doesn't control which tools developers expose. If you're publishing a connector for end-users, document the upstream system and scopes clearly so users know what they're authorizing. + +== Wire up other chat clients + +The flow mirrors Claude Desktop. The exact menu paths and field labels differ by client: + +* *ChatGPT desktop*: Recent builds support remote MCP custom connectors. Confirm the latest menu path; OpenAI iterates on this surface. +* *Gemini apps*: Recent builds support remote MCP custom connectors. +* *Cursor*: Supports remote MCP servers in recent builds. + +The required inputs are the same as Claude Desktop: connector name, MCP URL, Client ID, Client Secret. The chat client's redirect URIs must be registered on the AI Gateway OAuth Client. + +// TODO: confirm and document the ChatGPT, Gemini, and Cursor menu paths once each integration ships. + +== The two-step OAuth flow + +When a user calls a tool that needs upstream access, two OAuth handshakes run end-to-end. Most users only see the second one (and only on the very first tool call). + +=== Chat client connects to AI Gateway + +This handshake runs *once per user* when the connector is first added. + +. The user clicks *Connect* in the chat client. +. The chat client opens a browser tab at the AI Gateway authorization endpoint, parameterized with the OAuth Client's `client_id` and one of the registered redirect URIs. +. 
AI Gateway authenticates the user against the configured IdP (Auth0 today, Zitadel later) and presents an *Authorize access* consent screen. The screen shows: ++ +* The OAuth Client's name (for example, _Claude (GitHub Read demo) wants to access your data_). +* The *Resource* being authorized: The MCP server name and URL. +* The *Requested permissions*: The gateway's internal scopes for this handshake (`mcp` and `offline_access`). These are *not* the upstream system's scopes; the upstream's scopes appear during the next handshake. +* A footer reminding the user that they can revoke access from their *Connections* page in the ADP UI. ++ +. The user clicks *Allow*. AI Gateway redirects the chat client back to the redirect URI with an authorization code. +. The chat client exchanges the code for an access token and a refresh token. Tokens are stored locally in the chat client's credential store. +. Subsequent calls to AI Gateway send the access token in `Authorization: Bearer ...`. The chat client refreshes the token automatically when it expires. + +=== AI Gateway connects to the upstream system + +(Only for user-delegated MCP servers.) + +This handshake runs *once per user, per upstream*. For an MCP server using user-delegated OAuth (GitHub, Slack, Atlassian, Workday, etc.): + +. The user invokes a tool that requires upstream auth. +. AI Gateway has no stored upstream token for this user yet. The MCP protocol returns a `FAILED_PRECONDITION` response with an `OAuthConnectionRequired` error detail. The detail carries an `authorize_url` pointing at AI Gateway's OAuth bridge for the configured upstream provider, for example: `\https://aigw..clusters.rdpa.co/oauth/v1/authorize?provider_name=github&scopes=read:user,repo`. +. The chat client renders the link in its response to the user. Inside Claude this appears as a hyperlinked URL with prose telling the user to authorize the upstream connection (for example, _Authorize the GitHub connection first_) before retrying. +. 
The user clicks the link. AI Gateway redirects them to the *upstream system's own OAuth consent page* (for example, GitHub's standard authorization UI) listing the requested repositories and scopes. +. The user clicks *Authorize* on the upstream's consent page. The upstream redirects back to AI Gateway with an authorization code. AI Gateway exchanges the code for a token and stores it in its token vault under the user's identity. +. The user tells the chat client they've connected. The chat client retries the original tool call, which now succeeds. Subsequent calls reuse the stored upstream token automatically. + +After both steps complete, the user can invoke any tool on the MCP server transparently. They re-consent only if scopes change or the refresh tokens expire. + +NOTE: Claude and other chat clients layer their own *per-tool consent prompts* on top of the OAuth flow described here. The first time a connector tries to invoke a specific tool, Claude shows a prompt of the form _Claude wants to use \{tool_name} from \{connector_name}_ with *Always allow* / *Deny* buttons. This is the chat client's own user-trust UX, not an additional AI Gateway auth step. Once a user picks *Always allow* for a tool, Claude won't prompt again for that tool from that connector. + +NOTE: If the MCP server uses a service-account auth mode instead of user-delegated OAuth, only the first handshake (chat client to AI Gateway) runs. AI Gateway calls the upstream with one shared identity and the user never sees the upstream consent flow. + +== Manage and rotate + +Maintain registered OAuth Clients without re-creating them: + +* *List registered clients*: Open *OAuth Clients*. Each row shows the Name, Grant Types, and Redirect URIs. +* *Edit a client*: Change the redirect URIs or grant types. The Client ID is immutable. +* *Rotate the secret*: Generate a new Client Secret on the detail page. Update the value in every chat client that uses this OAuth Client; old tokens continue to work until they expire.
+* *Delete a client*: Invalidates every active token issued under it. Every chat-client connector that depends on this OAuth Client breaks until reconfigured against a replacement. + +== Troubleshooting + +Common symptoms and fixes: + +[cols="1,2"] +|=== +|Symptom |What to check + +|_Couldn't connect to MCP server_ or connector setup fails immediately +|The MCP URL is wrong, or the Client ID + Client Secret don't match an OAuth Client. Confirm the API URL on the MCP server's *Overview* tab and the credentials on the OAuth Client's detail page. + +|`redirect_uri_mismatch` during the connect flow +|The chat client's callback URL isn't registered on the OAuth Client. Add the URL the chat client publishes (Claude Desktop has two; check Claude's docs for the current set). + +|Connector authorized but no tools appear +|The MCP server has zero tools, or `tools/list` failed at connection time. Open the server in the Inspector to confirm tools are discovered. See xref:mcp:test-tools.adoc[Test a server's tools]. + +|Tool call returns an _authorize_ link to the user +|First call from a user with no stored upstream token. The user follows the link, completes upstream consent, and then asks the chat client to retry the call (the second handshake in the flow above). + +|`scope_upgrade_required` from a tool call +|The MCP server's `required_scopes` was extended after the user consented at the upstream. The user re-consents at the upstream with the higher scope. + +|`401 Unauthorized` from every call after working previously +|The chat client's access token expired and the refresh token also expired (or the OAuth Client secret was rotated). Disconnect the connector and re-add it to mint fresh tokens. +|=== + +== Out of scope + +This page does not cover: + +* *Custom desktop or mobile UIs*: Build against the AI Gateway MCP endpoints directly using your platform's HTTP client; you don't need an OAuth Client unless you want the same external-app flow.
+* *Agent-to-agent calls (A2A)*: See the Agents docs; remote MCP clients are end-user-facing. +* *MCP server authoring*: See xref:mcp:create-server.adoc[Create an MCP Server] for the server side. + +== Related topics + +* xref:mcp:create-server.adoc[Create an MCP Server] +* xref:mcp:oauth-providers.adoc[Configure an OAuth Provider] +* xref:mcp:user-delegated-oauth.adoc[User-delegated OAuth] +* xref:ai-gateway:connect-agent.adoc[Connect your agent] diff --git a/modules/integrations/partials/integrations/claude-code-admin.adoc b/modules/integrations/partials/integrations/claude-code-admin.adoc index 799e767..af82e65 100644 --- a/modules/integrations/partials/integrations/claude-code-admin.adoc +++ b/modules/integrations/partials/integrations/claude-code-admin.adoc @@ -492,5 +492,5 @@ Causes and solutions: == Next steps -* xref:ai-gateway:routing-cel.adoc[]: Implement advanced routing rules -* xref:mcp/remote/overview.adoc[]: Deploy Remote MCP servers for custom tools +* xref:ai-gateway:routing-cel.adoc[] +* xref:mcp/remote/overview.adoc[] diff --git a/modules/integrations/partials/integrations/claude-code-user.adoc b/modules/integrations/partials/integrations/claude-code-user.adoc index 804794c..819e64c 100644 --- a/modules/integrations/partials/integrations/claude-code-user.adoc +++ b/modules/integrations/partials/integrations/claude-code-user.adoc @@ -395,8 +395,8 @@ chmod 600 ~/.claude.json == Next steps -* xref:ai-gateway:aggregation.adoc[]: Configure deferred tool loading to reduce token costs -* xref:ai-gateway:routing-cel.adoc[]: Use CEL expressions to route Claude Code requests based on context +* xref:ai-gateway:aggregation.adoc[] +* xref:ai-gateway:routing-cel.adoc[] == Related pages diff --git a/modules/integrations/partials/integrations/cline-admin.adoc b/modules/integrations/partials/integrations/cline-admin.adoc index 469286f..991f97b 100644 --- a/modules/integrations/partials/integrations/cline-admin.adoc +++ 
b/modules/integrations/partials/integrations/cline-admin.adoc @@ -573,5 +573,5 @@ Causes and solutions: == Next steps -* xref:ai-gateway:routing-cel.adoc[]: Implement advanced routing rules -* xref:mcp/remote/overview.adoc[]: Deploy Remote MCP servers for custom tools +* xref:ai-gateway:routing-cel.adoc[] +* xref:mcp/remote/overview.adoc[] diff --git a/modules/integrations/partials/integrations/cline-user.adoc b/modules/integrations/partials/integrations/cline-user.adoc index a337b7b..fc7a92d 100644 --- a/modules/integrations/partials/integrations/cline-user.adoc +++ b/modules/integrations/partials/integrations/cline-user.adoc @@ -723,8 +723,8 @@ The gateway automatically blocks requests that would exceed the limit. == Next steps -* xref:ai-gateway:aggregation.adoc[]: Configure deferred tool loading to reduce token costs -* xref:ai-gateway:routing-cel.adoc[]: Use CEL expressions to route Cline requests based on task complexity +* xref:ai-gateway:aggregation.adoc[] +* xref:ai-gateway:routing-cel.adoc[] == Related pages diff --git a/modules/integrations/partials/integrations/continue-admin.adoc b/modules/integrations/partials/integrations/continue-admin.adoc index 441ff31..ec2f631 100644 --- a/modules/integrations/partials/integrations/continue-admin.adoc +++ b/modules/integrations/partials/integrations/continue-admin.adoc @@ -735,5 +735,5 @@ This is expected behavior, not a configuration issue: == Next steps -* xref:ai-gateway:routing-cel.adoc[]: Implement advanced routing rules -* xref:mcp/remote/overview.adoc[]: Deploy Remote MCP servers for custom tools +* xref:ai-gateway:routing-cel.adoc[] +* xref:mcp/remote/overview.adoc[] diff --git a/modules/integrations/partials/integrations/continue-user.adoc b/modules/integrations/partials/integrations/continue-user.adoc index 476d9ff..11856c3 100644 --- a/modules/integrations/partials/integrations/continue-user.adoc +++ b/modules/integrations/partials/integrations/continue-user.adoc @@ -838,8 +838,8 @@ Autocomplete rarely 
needs more than 256 tokens, while chat responses can vary. == Next steps -* xref:ai-gateway:aggregation.adoc[]: Configure deferred tool loading to reduce token costs -* xref:ai-gateway:routing-cel.adoc[]: Use CEL expressions to route Continue.dev requests based on context +* xref:ai-gateway:aggregation.adoc[] +* xref:ai-gateway:routing-cel.adoc[] == Related pages diff --git a/modules/integrations/partials/integrations/cursor-admin.adoc b/modules/integrations/partials/integrations/cursor-admin.adoc index ecbf4b4..006df4d 100644 --- a/modules/integrations/partials/integrations/cursor-admin.adoc +++ b/modules/integrations/partials/integrations/cursor-admin.adoc @@ -808,5 +808,5 @@ Causes and solutions: == Next steps -* xref:ai-gateway:routing-cel.adoc[]: Implement advanced routing rules for model prefix routing -* xref:mcp/remote/overview.adoc[]: Deploy Remote MCP servers for custom tools +* xref:ai-gateway:routing-cel.adoc[] +* xref:mcp/remote/overview.adoc[] diff --git a/modules/integrations/partials/integrations/cursor-user.adoc b/modules/integrations/partials/integrations/cursor-user.adoc index f78513a..c51dbc9 100644 --- a/modules/integrations/partials/integrations/cursor-user.adoc +++ b/modules/integrations/partials/integrations/cursor-user.adoc @@ -805,8 +805,8 @@ This sends only search + orchestrator tools initially, reducing token usage sign == Next steps -* xref:ai-gateway:aggregation.adoc[]: Configure deferred tool loading to work within Cursor's 40-tool limit -* xref:ai-gateway:routing-cel.adoc[]: Use CEL expressions to route Cursor requests based on context +* xref:ai-gateway:aggregation.adoc[] +* xref:ai-gateway:routing-cel.adoc[] == Related pages diff --git a/modules/integrations/partials/integrations/github-copilot-admin.adoc b/modules/integrations/partials/integrations/github-copilot-admin.adoc index a486cb3..d55a4f1 100644 --- a/modules/integrations/partials/integrations/github-copilot-admin.adoc +++ 
b/modules/integrations/partials/integrations/github-copilot-admin.adoc @@ -818,5 +818,4 @@ Causes and solutions: == Next steps -* xref:ai-gateway:routing-cel.adoc[]: Implement advanced routing rules for model aliasing - +* xref:ai-gateway:routing-cel.adoc[] diff --git a/modules/integrations/partials/integrations/github-copilot-user.adoc b/modules/integrations/partials/integrations/github-copilot-user.adoc index ec665db..16fc444 100644 --- a/modules/integrations/partials/integrations/github-copilot-user.adoc +++ b/modules/integrations/partials/integrations/github-copilot-user.adoc @@ -902,8 +902,8 @@ Generate project-specific cost reports from the gateway dashboard. == Next steps -* xref:ai-gateway:routing-cel.adoc[]: Use CEL expressions to route Copilot requests based on context -* xref:ai-gateway:aggregation.adoc[]: Learn about MCP tool integration (if using Copilot Workspace) +* xref:ai-gateway:routing-cel.adoc[] +* xref:ai-gateway:aggregation.adoc[] == Related pages diff --git a/modules/mcp/pages/create-server.adoc b/modules/mcp/pages/create-server.adoc index 29bd110..0620c8f 100644 --- a/modules/mcp/pages/create-server.adoc +++ b/modules/mcp/pages/create-server.adoc @@ -35,8 +35,8 @@ After completing this guide, you will be able to: The marketplace picker lists every managed type as a card and includes a *Remote (Proxied)* option for self-managed servers. -* *Managed* — pick a card. Redpanda hosts the server in-process. The configuration form is rendered from the type's protobuf schema; field labels and help text come straight from the proto. -* *Self-managed* — pick *Remote (Proxied)*. You provide a URL and a transport, and Redpanda proxies requests to your server. +* *Managed*: pick a card. Redpanda hosts the server in-process. The configuration form is rendered from the type's protobuf schema; field labels and help text come straight from the proto. +* *Self-managed*: pick *Remote (Proxied)*. 
You provide a URL and a transport, and Redpanda proxies requests to your server. // TODO: screenshot of the marketplace picker, with both a managed card and the Remote (Proxied) option visible. @@ -98,7 +98,7 @@ Two fields on top of the identity fields: == Configure authentication -Both managed and self-managed servers offer the same five authentication modes (managed types only show the modes that make sense for that type — for example, SQL never offers user-delegated OAuth): +Both managed and self-managed servers offer the same five authentication modes. Managed types only show the modes that make sense for that type; for example, SQL never offers user-delegated OAuth. [cols="1,3"] |=== @@ -128,7 +128,7 @@ NOTE: Choosing between *Service-account OAuth* and *User-delegated OAuth* is the Toggle *Code mode* to add `{name}_search` and `{name}_execute` tools alongside the server's own tools. Agents can use `_search` to discover available tools and `_execute` to run sandboxed Python or JavaScript that orchestrates them. This is useful when you'd rather have the agent generate a small program than call tools one at a time. -When code mode is enabled, the server's detail page surfaces a second URL — the *code-mode MCP URL* — that clients can connect to instead of the standard one. +When code mode is enabled, the server's detail page surfaces a second URL, the *code-mode MCP URL*, that clients can connect to instead of the standard one. // TODO: screenshot of the code-mode toggle and the resulting two URLs on the detail page. @@ -136,19 +136,69 @@ NOTE: Defer advanced code-mode patterns (sandboxing limits, runtime selection, d == Save and verify -. Click *Create*. The server appears in the list with a *Type* badge — *Managed* or *Self-managed*. -. Open the detail page. The *Overview* tab shows the *API URL* — this is the MCP URL agents connect to. Copy it for use later. +. Click *Create*. The server appears in the list with a *Type* badge: *Managed* or *Self-managed*. +. 
Open the detail page. The *Overview* tab shows the *API URL*: this is the MCP URL agents connect to. Copy it for use later. . Open the *Inspector* tab. Redpanda performs a live `tools/list` against the server and lists every tool it discovered. See xref:test-tools.adoc[Test a server's tools] for how to call them. A populated tools list confirms that the connection works and credentials resolve correctly. If the list is empty or the tab shows an error, see <>. +== Create from the CLI + +The `rpk ai` plugin offers a non-UI path for the same create flow. Useful for scripting and CI. + +[source,bash] +---- +# Managed type (Workday example) +rpk ai mcp create workday-hr \ + --managed-config '{ + "@type": "type.googleapis.com/redpanda.mcps.workday.v1.WorkdayMCPConfig", + "tenant": "acme", + "host": "wd2-impl-services1.workday.com", + "oauth_refresh_token": { + "username": "isu_user@acme", + "password_secret_ref": "${secrets.WORKDAY_PASSWORD}", + "refresh_token_secret_ref": "${secrets.WORKDAY_REFRESH_TOKEN}" + } + }' + +# Managed type with user-delegated OAuth (Ramp example) +rpk ai mcp create my-ramp \ + --managed-config '{"@type":"type.googleapis.com/redpanda.mcps.ramp.v1.RampMCPConfig","environment":"production"}' \ + --user-oauth-provider ramp \ + --user-oauth-scopes transactions:read,cards:read,users:read + +# Update a server's user-delegated OAuth scopes +rpk ai mcp update my-ramp \ + --user-oauth-provider ramp \ + --user-oauth-scopes transactions:read,cards:read,cards:write,users:read +---- + +[cols="1,3"] +|=== +|Flag |Notes + +|`--managed-config` +|JSON blob carrying the managed type's `_config.proto` shape, including a `@type` URL. + +|`--user-oauth-provider` +|Name of an OAuth Provider already registered under *OAuth Providers*. See xref:oauth-providers.adoc[Configure an OAuth Provider]. The principal needs `dataplane_aigateway_oauthprovider_attach` on the named provider (AI-893). + +|`--user-oauth-scopes` +|Comma-separated scopes the server requires. 
Provide every scope any tool may need; user re-consent is required if scopes change later. + +|`--auth-config` +|Alternative when a server needs an auth shape that doesn't map to the convenience flags. Takes a JSON blob matching the appropriate auth oneof variant. +|=== + +The CLI uses your active `rpk ai` profile for the gateway URL and authentication. + == Edit, disable, and delete a server * *Edit:* most fields can change. The *name* and *type* are *immutable* after create. * *Disable:* toggle *Enabled* off. The server stays in the list, but every tool call returns an error until you re-enable it. * *Delete:* permanently removes the server. In-flight user OAuth connections for this server are also discarded; users will need to re-consent if you re-create a server with the same name. + -// TODO: confirm exact delete semantics with eng — does deletion drop tokens from the vault, or just unbind the server? +// TODO: confirm exact delete semantics with eng: does deletion drop tokens from the vault, or just unbind the server? [[troubleshooting]] == Troubleshooting @@ -176,7 +226,7 @@ A populated tools list confirms that the connection works and credentials resolv The following capabilities are not configured on this page; see the linked content instead. -* *User-delegated OAuth consent flow* — see xref:user-delegated-oauth.adoc[User-delegated OAuth]. -* *Inspector usage* — see xref:test-tools.adoc[Test a server's tools]. -* *Multi-server aggregation* — handled by AI Gateway. See xref:ai-gateway:aggregation.adoc[MCP aggregation]. -* *Per-type configuration depth* — see xref:managed/managed-catalog.adoc[Managed catalog] and the deep-dive pages. +* *User-delegated OAuth consent flow*: see xref:user-delegated-oauth.adoc[User-delegated OAuth]. +* *Inspector usage*: see xref:test-tools.adoc[Test a server's tools]. +* *Multi-server aggregation*: handled by AI Gateway. See xref:ai-gateway:aggregation.adoc[MCP aggregation]. 
+* *Per-type configuration depth*: see xref:managed/managed-catalog.adoc[Managed catalog] and the deep-dive pages. diff --git a/modules/mcp/pages/managed/ironclad.adoc b/modules/mcp/pages/managed/ironclad.adoc new file mode 100644 index 0000000..54dd2b2 --- /dev/null +++ b/modules/mcp/pages/managed/ironclad.adoc @@ -0,0 +1,164 @@ += Ironclad Managed MCP Server +:description: Read and manage contracts in Ironclad CLM. Per-user OAuth so each agent action runs as the calling end-user with their own Ironclad permissions. +:page-topic-type: how-to +:personas: app_developer, platform_admin +:learning-objective-1: Configure the Ironclad managed MCP server with per-user OAuth +:learning-objective-2: Pick the right region and scopes for your tenant +:learning-objective-3: List, fetch, and launch contract workflows from an agent + +The *Ironclad* managed MCP server gives an LLM read and write access to https://ironcladapp.com/[Ironclad], a contract lifecycle management (CLM) platform. Useful for agents that need to find contracts, check signature status, launch new contracts from templates, or retrieve executed documents. + +After completing this guide, you will be able to: + +* [ ] {learning-objective-1} +* [ ] {learning-objective-2} +* [ ] {learning-objective-3} + +== What this MCP server does + +Per-user OAuth 2.0 (Authorization Code grant). No static API key is stored in the MCP config; each user authorizes their own Ironclad account through AI Gateway's OAuth flow. + +It is *not* a replacement for the Ironclad web UI for complex workflow management or template authoring. + +== Prerequisites + +Before you create the server, make sure you have: + +* An Ironclad tenant where you can register an OAuth app. +* An OAuth Provider configured in ADP for Ironclad. See xref:../oauth-providers.adoc[Configure an OAuth Provider]. +* Familiarity with xref:../user-delegated-oauth.adoc[User-delegated OAuth]. 
+
+== Get Ironclad credentials
+
+Set up the OAuth app on Ironclad and the matching OAuth Provider in ADP:
+
+. Log in to your Ironclad account and go to *Settings > API > OAuth Apps*.
+. Create a new OAuth app. Set the redirect URI to your AI Gateway callback URL (typically `\https://aigw..clusters.rdpa.co/oauth/v1/callback`).
+. Select the following scopes:
++
+* `public.workflows.readWorkflows`
+* `public.workflows.readSchemas`
+* `public.workflows.createWorkflows`
+* `public.workflows.readDocuments`
+. Copy the *Client ID* and *Client Secret*.
+. In ADP, register an OAuth Provider with:
++
+* *Authorization endpoint*: `\https://na1.ironcladapp.com/oauth/authorize` (use `eu1` for EU-hosted accounts; `demo` for sandbox)
+* *Token endpoint*: `\https://na1.ironcladapp.com/oauth/token` (adjust region accordingly)
+* The Client ID and a secret-store reference for the Client Secret.
+
+== Configure
+
+Create a new Ironclad MCP server in the ADP UI:
+
+. Open *MCP Servers > Create Server*.
+. Pick *Ironclad* from the marketplace picker.
+. Fill in identity fields (`name`, `description`).
+. In the Ironclad configuration form:
++
+[cols="1,3"]
+|===
+|Field |Notes
+
+|*Region*
+|`IRONCLAD_REGION_NA` (default), `IRONCLAD_REGION_EU` for EU-hosted accounts, or `IRONCLAD_REGION_DEMO` for sandbox testing.
+
+|*OAuth Provider*
+|The Ironclad OAuth Provider you configured.
+
+|*Required scopes*
+|`public.workflows.readWorkflows`, `public.workflows.readSchemas`, `public.workflows.createWorkflows`, `public.workflows.readDocuments`.
+|===
++
+. Click *Create*.
+
+// TODO: capture screenshots of the Ironclad create form.
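The *Region* values in the configuration form line up with the endpoint hosts named in the credentials step. The following is a minimal Python sketch of that correspondence, not part of the product: the helper name `ironclad_oauth_endpoints` and the `REGION_HOSTS` table are hypothetical, and the mapping (NA to `na1`, EU to `eu1`, DEMO to `demo`) is an assumption inferred from the hosts this page mentions.

```python
# Hypothetical helper, for illustration only. It pairs the Region enum values
# from the configuration form with the OAuth endpoint hosts named on this page.
# Assumption: IRONCLAD_REGION_NA -> na1, IRONCLAD_REGION_EU -> eu1,
# IRONCLAD_REGION_DEMO -> demo.
REGION_HOSTS = {
    "IRONCLAD_REGION_NA": "na1",
    "IRONCLAD_REGION_EU": "eu1",
    "IRONCLAD_REGION_DEMO": "demo",
}


def ironclad_oauth_endpoints(region="IRONCLAD_REGION_NA"):
    """Return the authorization and token endpoints for a Region value."""
    try:
        host = REGION_HOSTS[region]
    except KeyError:
        # Fail loudly on typos instead of silently falling back to NA.
        raise ValueError(f"unknown region: {region!r}") from None
    base = f"https://{host}.ironcladapp.com/oauth"
    return {
        "authorization_endpoint": f"{base}/authorize",
        "token_endpoint": f"{base}/token",
    }
```

A check like this can catch a mismatch between the server's *Region* and the OAuth Provider's endpoints before users hit the consent flow.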
+
+=== Configure from the CLI
+
+[source,bash]
+----
+rpk ai mcp create my-ironclad \
+  --managed-config '{
+    "@type": "type.googleapis.com/redpanda.mcps.ironclad.v1.IroncladMCPConfig",
+    "region": "IRONCLAD_REGION_NA"
+  }' \
+  --user-oauth-provider ironclad-prod \
+  --user-oauth-scopes public.workflows.readWorkflows,public.workflows.readSchemas,public.workflows.createWorkflows,public.workflows.readDocuments
+----
+
+For EU-hosted accounts, use `"IRONCLAD_REGION_EU"`. For sandbox testing, use `"IRONCLAD_REGION_DEMO"`.
+
+== Tools
+
+The Ironclad MCP exposes the following tools:
+
+[cols="1,2"]
+|===
+|Tool |Description
+
+|`list_workflows`
+|List contracts with optional `status` filter and page/per_page pagination.
+
+|`get_workflow`
+|Get full details of a contract by `workflow_id`.
+
+|`list_workflow_schemas`
+|List available contract templates with their field IDs.
+
+|`create_workflow`
+|Launch a new contract from a template (`schema_id` + `attributes_json`).
+
+|`list_workflow_documents`
+|List documents attached to a contract.
+|===
+
+=== Example: Find all contracts awaiting signature
+
+[source,bash]
+----
+curl -X POST https://aigw..clusters.rdpa.co/mcp/v1/my-ironclad \
+  -H "Authorization: Bearer " \
+  -H "Content-Type: application/json" \
+  -d '{
+    "jsonrpc": "2.0", "id": 1, "method": "tools/call",
+    "params": {
+      "name": "list_workflows",
+      "arguments": {"status": "signing", "per_page": 10}
+    }
+  }'
+----
+
+== Troubleshooting
+
+Common symptoms and fixes:
+
+[cols="1,2"]
+|===
+|Symptom |What to check
+
+|`OAuthConnectionRequired`
+|First call from a user with no stored token. The user completes Ironclad's OAuth consent flow, the token lands in the vault, and subsequent calls reuse it.
+
+|`scope_upgrade_required`
+|Server's `required_scopes` was extended after users had already consented. Users re-consent with the higher scope.
+
+|Wrong region results
+|Confirm the *Region* field matches your Ironclad tenant.
EU-hosted accounts use `IRONCLAD_REGION_EU`; sandbox tenants use `IRONCLAD_REGION_DEMO`. + +|`schema_id` not found in `create_workflow` +|Run `list_workflow_schemas` first to get valid IDs for your tenant. +|=== + +== Out of scope + +This page does not cover: + +* *Template authoring*: Define templates in the Ironclad web UI, then reference them by `schema_id`. +* *eSignature flows*: Handled inside Ironclad; this MCP launches and reads workflow state. + +== Related topics + +* xref:../oauth-providers.adoc[Configure an OAuth Provider] +* xref:../user-delegated-oauth.adoc[User-delegated OAuth] +* xref:../create-server.adoc[Create an MCP Server] diff --git a/modules/mcp/pages/managed/managed-catalog.adoc b/modules/mcp/pages/managed/managed-catalog.adoc index 6ca3c38..72eabe1 100644 --- a/modules/mcp/pages/managed/managed-catalog.adoc +++ b/modules/mcp/pages/managed/managed-catalog.adoc @@ -85,6 +85,10 @@ If any of these answers are "no," prefer xref:register-remote.adoc[a self-manage |*Slack* |Post messages and read channels on Slack. |xref:managed/slack.adoc[See the deep-dive →] + +|*Zendesk* +|Search and manage Zendesk Support tickets, users, and Help Center articles. +|xref:managed/zendesk.adoc[See the deep-dive →] |=== == Database @@ -170,6 +174,10 @@ If any of these answers are "no," prefer xref:register-remote.adoc[a self-manage |Manage jobs, candidates, and applications in Greenhouse ATS. |— +|*Ironclad* +|Read and manage contracts in Ironclad CLM. +|xref:managed/ironclad.adoc[See the deep-dive →] + |*Okta* |Manage Okta users and groups. |— @@ -178,6 +186,10 @@ If any of these answers are "no," prefer xref:register-remote.adoc[a self-manage |Expose any OpenAPI/Swagger HTTP API as MCP tools. |xref:managed/openapi.adoc[See the deep-dive →] +|*Ramp* +|Manage Ramp corporate cards, transactions, spend limits, and reimbursements. 
+|xref:managed/ramp.adoc[See the deep-dive →] + |*Salesforce* |Query, create, update, and delete Salesforce CRM records using SOQL and the REST API. |— @@ -185,6 +197,10 @@ If any of these answers are "no," prefer xref:register-remote.adoc[a self-manage |*Text Chunker* |Split and chunk text for RAG and LLM ingestion pipelines. |— + +|*Workday* +|Drive Workday Human Resources business processes via SOAP. +|xref:managed/workday.adoc[See the deep-dive →] |=== == Where to go next diff --git a/modules/mcp/pages/managed/ramp.adoc b/modules/mcp/pages/managed/ramp.adoc new file mode 100644 index 0000000..596f93f --- /dev/null +++ b/modules/mcp/pages/managed/ramp.adoc @@ -0,0 +1,221 @@ += Ramp Managed MCP Server +:description: Manage Ramp corporate cards, transactions, spend limits, and reimbursements from an LLM agent. Per-user OAuth so each agent action runs as the calling end-user. +:page-topic-type: how-to +:personas: app_developer, platform_admin +:learning-objective-1: Configure the Ramp managed MCP server with per-user OAuth +:learning-objective-2: Pick the right scopes and environment for production vs sandbox +:learning-objective-3: List transactions, manage cards, and adjust spend limits from an agent + +The *Ramp* managed MCP server lets an LLM read and act on your company's Ramp spend data: listing and inspecting transactions, browsing cards, managing spend limits, querying users and departments, looking up vendors, and reviewing reimbursements. + +After completing this guide, you will be able to: + +* [ ] {learning-objective-1} +* [ ] {learning-objective-2} +* [ ] {learning-objective-3} + +== What this MCP server does + +Wraps the https://docs.ramp.com/developer-api/v1[Ramp Developer API v1] using per-user OAuth tokens, so each user's Ramp permissions are enforced automatically and no shared API key is stored. + +It is suitable for expense analysis, spend-policy enforcement, and corporate card management workflows. 
It is *not* intended for accounting system integrations or bulk data exports; use Ramp's native accounting sync or data export features for those tasks.
+
+== Prerequisites
+
+Before you create the server, make sure you have:
+
+* A Ramp account with admin access to the Ramp Developer Portal.
+* An OAuth Provider configured in ADP for Ramp. See xref:../oauth-providers.adoc[Configure an OAuth Provider].
+* Familiarity with xref:../user-delegated-oauth.adoc[User-delegated OAuth].
+
+== Get Ramp credentials
+
+Set up the OAuth app on Ramp and the matching OAuth Provider in ADP:
+
+. Sign in to the https://app.ramp.com/developer[Ramp Developer Portal].
+. Go to *Developer Settings > Applications* and click *Create Application*.
+. Set the redirect URI to your AI Gateway OAuth callback (typically `\https://aigw..clusters.rdpa.co/oauth/v1/callback`).
+. Note the *Client ID* and *Client Secret*.
+. Required scopes:
++
+* `transactions:read`
+* `cards:read`
+* `cards:write`
+* `users:read`
+* `departments:read`
+* `vendors:read`
+* `reimbursements:read`
+* `limits:read`
+* `limits:write`
+. In ADP, register an OAuth Provider with:
++
+* *Authorization endpoint*: `\https://app.ramp.com/v1/authorize`
+* *Token endpoint*: `\https://api.ramp.com/developer/v1/token`
+* The Client ID and a secret-store reference for the Client Secret.
+
+== Configure
+
+Create a new Ramp MCP server in the ADP UI:
+
+. Open *MCP Servers > Create Server*.
+. Pick *Ramp* from the marketplace picker.
+. Fill in identity fields (`name`, `description`).
+. In the Ramp configuration form:
++
+[cols="1,3"]
+|===
+|Field |Notes
+
+|*Environment*
+|`production` for the live Ramp API. `demo` for https://docs.ramp.com/developer-api/v1/testing/sandbox-setup[Ramp's sandbox environment]. Omit (or leave empty) for production.
+
+|*OAuth Provider*
+|The Ramp OAuth Provider you configured.
+
+|*Required scopes*
+|All nine scopes listed above.
Drop write scopes (`cards:write`, `limits:write`) if the MCP only needs to read.
+|===
++
+. Click *Create*.
+
+// TODO: capture screenshots of the Ramp create form.
+
+=== Configure from the CLI
+
+[source,bash]
+----
+rpk ai mcp create my-ramp \
+  --managed-config '{
+    "@type": "type.googleapis.com/redpanda.mcps.ramp.v1.RampMCPConfig",
+    "environment": "production"
+  }' \
+  --user-oauth-provider ramp \
+  --user-oauth-scopes transactions:read,cards:read,cards:write,users:read,departments:read,vendors:read,reimbursements:read,limits:read,limits:write
+----
+
+Set `environment` to `"demo"` to target Ramp's sandbox.
+
+== Tools
+
+The Ramp MCP exposes the following tools:
+
+[cols="1,2"]
+|===
+|Tool |Description
+
+|`list_transactions`
+|List transactions with optional filters. Supports pagination through the `start` cursor. Returns up to `page_size` results (max 100).
+
+|`get_transaction`
+|Retrieve a single transaction by ID, including line items, accounting selections, and policy violations.
+
+|`list_cards`
+|List corporate cards. Supports pagination.
+
+|`create_card`
+|Issue a new virtual card. Returns a deferred task ID, since Ramp creates cards asynchronously.
+
+|`suspend_card`
+|Suspend an active card by ID. Returns a deferred task ID.
+
+|`list_users`
+|List Ramp users in your organization. Supports pagination.
+
+|`list_departments`
+|List departments. Supports pagination.
+
+|`list_vendors`
+|List vendors. Supports pagination.
+
+|`list_reimbursements`
+|List out-of-pocket reimbursement requests. Supports pagination.
+
+|`list_limits`
+|List spend limits. Supports pagination.
+
+|`create_limit`
+|Create a new spend limit. Returns a deferred task ID, since Ramp creates limits asynchronously.
+
+|`update_limit`
+|Update an existing spend limit's display name or spending restrictions synchronously.
+|===
+
+=== Example: List recent transactions
+
+[source,bash]
+----
+curl -s https://aigw..clusters.rdpa.co/mcp/v1/my-ramp \
+  -H "Authorization: Bearer $TOKEN" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "jsonrpc": "2.0",
+    "id": 1,
+    "method": "tools/call",
+    "params": {
+      "name": "list_transactions",
+      "arguments": {
+        "page_size": 25
+      }
+    }
+  }'
+----
+
+=== Example: Create a virtual card for a vendor
+
+[source,bash]
+----
+curl -s https://aigw..clusters.rdpa.co/mcp/v1/my-ramp \
+  -H "Authorization: Bearer $TOKEN" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "jsonrpc": "2.0",
+    "id": 2,
+    "method": "tools/call",
+    "params": {
+      "name": "create_card",
+      "arguments": {
+        "display_name": "AWS Services",
+        "user_id": "usr_abc123",
+        "idempotency_key": "create-aws-card-2026",
+        "spending_limit_amount": 5000.0,
+        "spending_limit_interval": "MONTHLY",
+        "spending_limit_currency": "USD"
+      }
+    }
+  }'
+----
+
+== Troubleshooting
+
+Common symptoms and fixes:
+
+[cols="1,2"]
+|===
+|Symptom |What to check
+
+|`OAuthConnectionRequired`
+|First call from a user with no stored token. The user completes Ramp's OAuth consent flow, the token lands in the vault, and subsequent calls reuse it.
+
+|`scope_upgrade_required`
+|Server's `required_scopes` was extended after users had already consented. Users re-consent with the higher scope.
+
+|`create_card` / `create_limit` returns a task ID with no card / limit details
+|These operations are asynchronous on Ramp's side. The MCP returns a task ID that you can poll against Ramp's API; the actual card or limit appears once the task completes.
+
+|`403 Forbidden` reading or writing
+|The calling user's Ramp role doesn't grant the action. Ramp's role-based access control runs end-to-end: per-user OAuth means each user only sees what their Ramp account permits.
+|===
+
+== Out of scope
+
+This page does not cover:
+
+* *Bulk data export*: Use Ramp's native data export.
+* *Accounting system integration*: Use Ramp's accounting sync. +* *Receipt management and approvals*: Handled in the Ramp web UI. + +== Related topics + +* xref:../oauth-providers.adoc[Configure an OAuth Provider] +* xref:../user-delegated-oauth.adoc[User-delegated OAuth] +* xref:../create-server.adoc[Create an MCP Server] diff --git a/modules/mcp/pages/managed/slack.adoc b/modules/mcp/pages/managed/slack.adoc index 9c3a428..037385d 100644 --- a/modules/mcp/pages/managed/slack.adoc +++ b/modules/mcp/pages/managed/slack.adoc @@ -6,7 +6,7 @@ :learning-objective-2: Walk through the consent flow and verify the connection in My Connections :learning-objective-3: Send a test message through the Inspector -The *Slack* managed MCP server is the canonical user-delegated OAuth example for ADP. Each agent caller authenticates against Slack with their own credentials, and Redpanda injects their token at call time — so messages posted by the agent appear as the user, not as a shared bot. +The *Slack* managed MCP server is the canonical user-delegated OAuth example for ADP. Each agent caller authenticates against Slack with their own credentials, and Redpanda injects their token at call time: so messages posted by the agent appear as the user, not as a shared bot. After completing this guide, you will be able to: @@ -26,31 +26,35 @@ The Slack managed type exposes tools for: == Prerequisites +Before you create the server, make sure you have: + * A Slack workspace where you can install or authorize an OAuth app. * A Slack OAuth app registered (your own or a Redpanda-published reference app). + // TODO: confirm whether Redpanda ships a reference Slack OAuth app or whether each customer brings their own. Document the path. -* An OAuth Provider configured in the ADP UI under *OAuth Providers*, pointing at Slack's authorize/token URLs and carrying the OAuth app's client credentials. -+ -// TODO: link the OAuth Provider how-to once it exists. 
+* An OAuth Provider configured in the ADP UI under *OAuth Providers*, pointing at Slack's authorize/token URLs and carrying the OAuth app's client credentials. See xref:../oauth-providers.adoc[Configure an OAuth Provider]. * Familiarity with xref:../user-delegated-oauth.adoc[]. == Configure +Create a new Slack MCP server in the ADP UI: + . Open *MCP Servers > Create Server*. . Pick *Slack* from the marketplace picker. . Fill in the identity fields (`name`, `description`). . In the Slack configuration form: + -// TODO: enumerate exact fields. Slack's MCP type likely doesn't need workspace-specific configuration — the OAuth flow tells Slack which workspace. +// TODO: enumerate exact fields. Slack's MCP type likely doesn't need workspace-specific configuration: the OAuth flow tells Slack which workspace. + -* *Auth* — choose *User-delegated OAuth*. -* *OAuth Provider* — pick the Slack provider you configured. -* *Required scopes* — typical: `channels:read`, `chat:write`, `users:read`. Adjust to your need. +* *Auth*: choose *User-delegated OAuth*. +* *OAuth Provider*: pick the Slack provider you configured. +* *Required scopes*: typical: `channels:read`, `chat:write`, `users:read`. Adjust to your need. . Click *Create*. == Test the consent flow +After creating the server, run a tool that requires Slack auth to verify the consent flow end-to-end: + . Open the *Inspector* tab. . Run a tool that requires the user's identity, for example `chat_postMessage`. . The first call returns `OAuthConnectionRequired` with a Slack `authorize_url`. The Inspector surfaces it as a consent prompt. @@ -66,7 +70,7 @@ Slack distinguishes user tokens from bot tokens, and many tools need user-token * `chat:write` is a *user* scope; `chat:write.public` is a separate scope for posting in channels the user isn't a member of. * `channels:read` returns only channels the user can see. 
-* Tokens are workspace-scoped — the same user authorizing twice across two workspaces produces two separate connections. +* Tokens are workspace-scoped: the same user authorizing twice across two workspaces produces two separate connections. // TODO: confirm exact required scopes per tool from the Slack server's tool registration. @@ -76,6 +80,8 @@ Once the server is created and at least one user has consented, you can point an == Troubleshooting +Common symptoms and fixes: + [cols="1,2"] |=== |Symptom |What to check @@ -97,5 +103,7 @@ Once the server is created and at least one user has consented, you can point an == Out of scope -* *Configuring the Slack OAuth app* — Slack-side configuration (creating the app, picking redirect URIs, choosing scopes) happens in api.slack.com, not in ADP. -* *Posting as a Slack bot* — this page covers user-delegated auth. For bot-token posting, use static-key auth with a bot user OAuth token. +This page does not cover: + +* *Configuring the Slack OAuth app*: Slack-side configuration (creating the app, picking redirect URIs, choosing scopes) happens in api.slack.com, not in ADP. +* *Posting as a Slack bot*: This page covers user-delegated auth. For bot-token posting, use static-key auth with a bot user OAuth token. diff --git a/modules/mcp/pages/managed/sql.adoc b/modules/mcp/pages/managed/sql.adoc index b6684fe..869bd3c 100644 --- a/modules/mcp/pages/managed/sql.adoc +++ b/modules/mcp/pages/managed/sql.adoc @@ -6,7 +6,7 @@ :learning-objective-2: Run a canonical SELECT query through the Inspector :learning-objective-3: Pick the right authentication and connection-string pattern for your database -The SQL managed MCP server gives agents read access to a SQL database through MCP. Redpanda runs the server in-process; you provide a connection string and credentials. +The *SQL* managed MCP server gives agents read access to a SQL database through MCP. Redpanda runs the server in-process; you provide a connection string and credentials. 
After completing this guide, you will be able to: diff --git a/modules/mcp/pages/managed/workday.adoc b/modules/mcp/pages/managed/workday.adoc new file mode 100644 index 0000000..a2d7ed5 --- /dev/null +++ b/modules/mcp/pages/managed/workday.adoc @@ -0,0 +1,201 @@ += Workday Managed MCP Server +:description: Drive Workday Human Resources business processes from an LLM agent. The Workday managed MCP wraps Workday's Human_Resources SOAP web services and authenticates with a service-account refresh-token grant. +:page-topic-type: how-to +:personas: app_developer, platform_admin +:learning-objective-1: Configure the Workday managed MCP server with an Integration System User (ISU) refresh token +:learning-objective-2: Choose the right WSDL version and tenant settings +:learning-objective-3: Run a Change_Personal_Information business process from the Inspector or an agent + +The *Workday* managed MCP server lets agents drive Workday Human Resources business processes (multi-step, approval-driven workflows like onboarding, hiring, and personal-info changes) through Workday's `Human_Resources` SOAP API. + +After completing this guide, you will be able to: + +* [ ] {learning-objective-1} +* [ ] {learning-objective-2} +* [ ] {learning-objective-3} + +== What this MCP server does + +Workday is a SaaS HR and payroll platform. Customer mutations land through *business processes*: multi-step, approval-driven workflows. Workday's REST API covers a partial read-side surface, but the business processes themselves live behind the SOAP `Human_Resources` WSDL. This MCP wraps the SOAP surface so an LLM can drive a business process the same way it would call any other tool. + +It is *not* a generic Workday browser. There is no SQL/RaaS access, no report execution, and no general "search the tenant" tool. Each MCP tool maps 1:1 to one business process. + +The current build exposes a single tool, `change_personal_information`, with more business processes landing as customers ask for them. 
+
+== Authentication model
+
+Workday's `Human_Resources` SOAP API authenticates with the OAuth 2.0 *refresh-token grant* plus HTTP Basic on the token endpoint. Unlike most managed MCPs, this is a vendor-specific auth shape that doesn't fit the shared `static_key`, `service_account_oauth`, or `user_delegated_oauth` modes; Workday uses an `oauth_refresh_token` variant.
+
+The MCP exchanges the refresh token (in the request body) plus `username:password` (HTTP Basic) for a short-lived access token at `\https:///ccx/oauth2//token`, then sends `Authorization: Bearer ` on every SOAP call.
+
+Authentication is one ISU per MCP instance, not per end-user. Customers that need per-user-delegated access mount multiple MCP instances (one per ISU/scope), not multiple users behind one MCP.
+
+== Prerequisites
+
+Before you create the server, make sure you have:
+
+* A Workday tenant where you can create an Integration System User and register an API client.
+* Admin access to *Workday > Create Integration System User* and *Workday > Register API Client for Integrations*.
+* Two ADP secret-store entries:
++
+* `WORKDAY_PASSWORD`: The ISU password.
+* `WORKDAY_REFRESH_TOKEN`: The non-expiring refresh token.
+
+== Get Workday credentials
+
+Set up authentication on the Workday side before configuring the MCP:
+
+. *Create an Integration System User (ISU)* under *Workday > Create Integration System User*. Note the username; it usually ends up as `@`.
+. *Register an API Client for Integrations* under *Workday > Register API Client for Integrations*:
++
+* *Grant types*: Include both *Refresh Token* (required) and *Authorization Code*. Workday's UX requires both to be checked even when only the refresh-token grant is used at runtime.
+* *Non-Expiring Refresh Tokens*: Tick this option. Required for static-credential MCP usage; if Workday rotates the refresh token on every exchange, the cached value goes stale and authentication breaks.
+* *Scope*: Include *Human Resources* (and any other functional areas your business processes touch). +. *Issue a refresh token to the ISU* by completing the one-time authorization-code flow Workday walks you through, or by using *View API Clients > Manage Refresh Tokens for Integrations* to mint one directly. +. Save four values: the `tenant`, the `host` (the Workday data-center hostname, for example `wd2-impl-services1.workday.com`), the ISU `username`, and the ISU `password`. Save the `refresh_token` separately. + +== Configure + +Create a new Workday MCP server in the ADP UI: + +. Open *MCP Servers > Create Server*. +. Pick *Workday* from the marketplace picker. +. Fill in identity fields (`name`, `description`). +. In the Workday configuration form: ++ +[cols="1,3"] +|=== +|Field |Notes + +|*Tenant* +|Your Workday tenant identifier, for example `acme`. + +|*Host* +|The Workday data-center hostname, for example `wd2-impl-services1.workday.com`. The MCP exchanges credentials at `\https:///ccx/oauth2//token`. + +|*WSDL version* +|Optional; defaults to `v46.0`. Older tenants on `v44.x` or `v45.x` must set this explicitly to match the WSDL surface their tenant has enabled. + +|*Username* +|The ISU username (typically `@`). + +|*Password ref* +|Secret-store reference for the ISU password (`UPPER_SNAKE_CASE`). Example: `WORKDAY_PASSWORD`. + +|*Refresh token ref* +|Secret-store reference for the non-expiring refresh token (`UPPER_SNAKE_CASE`). Example: `WORKDAY_REFRESH_TOKEN`. +|=== ++ +. Click *Create*. + +// TODO: capture screenshots of the Workday create form on `adp-production`. 
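The token exchange described under *Authentication model* can be sketched as follows. This is a minimal illustration under stated assumptions, not the MCP's actual implementation: `build_token_request` is a hypothetical helper, the URL layout (data-center host first, tenant as the path segment before `/token`) is assumed from the *Host* field description, and the credential values are placeholders. The request is only assembled, never sent.

[source,python]
----
import base64
from urllib.parse import urlencode


def build_token_request(host, tenant, username, password, refresh_token):
    """Assemble (url, headers, body) for a refresh-token exchange without sending it."""
    # Assumed URL layout: data-center host, then the tenant before /token.
    url = f"https://{host}/ccx/oauth2/{tenant}/token"
    # HTTP Basic credential: base64 of "username:password".
    pair = f"{username}:{password}".encode("utf-8")
    headers = {
        "Authorization": "Basic " + base64.b64encode(pair).decode("ascii"),
        "Content-Type": "application/x-www-form-urlencoded",
    }
    # Standard OAuth 2.0 refresh-token grant body (RFC 6749, section 6).
    body = urlencode({"grant_type": "refresh_token", "refresh_token": refresh_token})
    return url, headers, body


# Placeholder values for illustration only.
url, headers, body = build_token_request(
    "wd2-impl-services1.workday.com",
    "acme",
    "isu_user@acme",
    "example-password",
    "example-refresh-token",
)
----

Keeping the refresh token in the form body and the ISU credentials in the header mirrors the split the page describes; a real client would then POST `body` to `url` and read the access token from the JSON response.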
+ +=== Configure from the CLI + +[source,bash] +---- +rpk ai mcp create workday-hr --managed-config '{ + "@type": "type.googleapis.com/redpanda.mcps.workday.v1.WorkdayMCPConfig", + "tenant": "acme", + "host": "wd2-impl-services1.workday.com", + "wsdl_version": "v46.0", + "oauth_refresh_token": { + "username": "isu_user@acme", + "password_secret_ref": "${secrets.WORKDAY_PASSWORD}", + "refresh_token_secret_ref": "${secrets.WORKDAY_REFRESH_TOKEN}" + } +}' +---- + +== Tools + +The Workday MCP exposes the following tools: + +[cols="1,2"] +|=== +|Tool |Description + +|`change_personal_information` +|Kicks off the *Change_Personal_Information* business process for a worker. All fields except `worker_id` are optional. Only fields you set are sent to Workday, leaving the rest of the worker's personal data unchanged. +|=== + +// TODO: refresh this list when additional Workday business processes ship. + +=== Example: Change a worker's date of birth and marital status + +[source,bash] +---- +curl -s https://aigw..clusters.rdpa.co/mcp/v1/workday-hr \ + -H 'Content-Type: application/json' -d '{ + "jsonrpc":"2.0","method":"tools/call","id":1, + "params":{ + "name":"change_personal_information", + "arguments":{ + "worker_id":"E1001", + "worker_id_type":"Employee_ID", + "effective_date":{"year":2026,"month":5,"day":1}, + "date_of_birth":{"year":1990,"month":5,"day":20}, + "marital_status":"Married" + } + } +}' +---- + +Dates use the `google.type.Date` shape (`{year, month, day}`); a missing field, or one with `year: 0`, is treated as "unset" and Workday applies its own default (today, for `effective_date`). 
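The date shape above can be produced from a native date value with a small helper. This is a sketch only: `to_date_shape` and `is_unset` are hypothetical names, and the unset check simply restates the rule in the previous paragraph rather than the server's actual code.

[source,python]
----
import datetime


def to_date_shape(value):
    """Convert a datetime.date into a {year, month, day} dict."""
    return {"year": value.year, "month": value.month, "day": value.day}


def is_unset(shape):
    """Return True when a date argument is missing or carries year 0."""
    return shape is None or shape.get("year", 0) == 0


# Only the fields that are set are included in the tool arguments.
arguments = {
    "worker_id": "E1001",
    "worker_id_type": "Employee_ID",
    "date_of_birth": to_date_shape(datetime.date(1990, 5, 20)),
}
----

Because `effective_date` is left out of `arguments` here, `is_unset(arguments.get("effective_date"))` is true, which corresponds to the case where Workday applies its own default.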
+ +A successful response surfaces the Workday Event WID and confirms the worker WID: + +[source,json] +---- +{ + "result": { + "content": [{ + "type": "text", + "text": "{\"event_wid\":\"ev-wid-001\",\"worker_wid\":\"worker-wid-002\",\"version\":\"v46.0\"}" + }] + } +} +---- + +If Workday returns a SOAP Fault (validation error, missing permissions, worker not found), the MCP surfaces the `faultstring` as a structured tool error so the LLM can decide whether to retry or ask the user. + +== Tenant-specific values + +`gender`, `marital_status`, and `citizenship_status_ids` accept Workday IDs from the *customer's* tenant configuration. Common defaults like `Single` / `Married` and ISO country codes work in most tenants, but check Workday's "Maintain Marital Status" and "Maintain Citizenship Status" reports if a value is rejected. + +== Troubleshooting + +Common symptoms and fixes: + +[cols="1,2"] +|=== +|Symptom |What to check + +|`401 Unauthorized` on token exchange +|ISU credentials wrong, or the refresh token has been rotated. Confirm `WORKDAY_PASSWORD` and `WORKDAY_REFRESH_TOKEN` in the secret store are correct, and re-mint the refresh token in *View API Clients > Manage Refresh Tokens for Integrations* if needed. + +|`invalid_grant` on every refresh +|*Non-Expiring Refresh Tokens* was not checked when you registered the API client. Edit the client, tick the option, and re-mint the refresh token. + +|SOAP fault: `Invalid_Field_Value` +|A tenant-specific field ID (marital status, citizenship status, ethnicity) doesn't match what your tenant accepts. Check the corresponding "Maintain ..." report in Workday for the exact IDs. + +|SOAP fault: `Insufficient_Permissions` +|The ISU lacks rights for the business process you're invoking. Grant the relevant security domain on the ISU's security group. + +|SOAP fault: `Worker_Not_Found` +|The `worker_id` plus `worker_id_type` doesn't resolve. Verify the type (`Employee_ID`, `Workday_ID`, `Contingent_Worker_ID`) and the value. 
+|=== + +== Out of scope + +This page does not cover: + +* *Per-user-delegated access*: Workday auth is one shared ISU per MCP. For per-user identities, mount multiple MCP instances (one per ISU/scope). +* *Custom report execution*: This MCP wraps SOAP business processes, not reports. Use Workday RaaS or the report API for custom reports. +* *Read-side data exploration*: There is no general _search Workday_ tool. Add specific business-process tools as needed. + +== Related topics + +* xref:create-server.adoc[Create an MCP Server] +* xref:test-tools.adoc[Test a server's tools] diff --git a/modules/mcp/pages/managed/zendesk.adoc b/modules/mcp/pages/managed/zendesk.adoc new file mode 100644 index 0000000..630ef60 --- /dev/null +++ b/modules/mcp/pages/managed/zendesk.adoc @@ -0,0 +1,279 @@ += Zendesk Managed MCP Server +:description: Search and manage Zendesk Support tickets, users, and Help Center articles. Two auth modes (service-account API token or per-user OAuth) plus token-efficient response shaping for LLM agents. +:page-topic-type: how-to +:personas: app_developer, platform_admin +:learning-objective-1: Configure the Zendesk managed MCP server in API-token or User-OAuth mode +:learning-objective-2: Pick the right scopes and Zendesk role for your workflows +:learning-objective-3: Search, read, create, and update tickets from the Inspector or an agent + +The *Zendesk* managed MCP server lets agents search, read, create, and update tickets in your Zendesk Support instance, look up users and organizations, and search Help Center articles. + +After completing this guide, you will be able to: + +* [ ] {learning-objective-1} +* [ ] {learning-objective-2} +* [ ] {learning-objective-3} + +== What this MCP server does + +Wraps the Zendesk REST API. Two auth modes are supported: + +* *API token (Basic auth)*: A long-lived agent token paired with the agent's email. Best for service-account-style use. 
+* *User OAuth*: Per-user Zendesk OAuth tokens resolved from the gateway's token vault. Best when you want each agent action attributed to the calling end-user. + +Responses are curated for token efficiency: HATEOAS URLs, transport metadata, and rarely-used fields are dropped before reaching the LLM. Related users, groups, and organizations are resolved into nested ref objects through Zendesk side-loading (single round trip), and Help Center article HTML is converted to GitHub-flavored markdown. Typical responses are 3–7× smaller than raw Zendesk JSON. + +It is *not* intended for Zendesk admin operations (managing macros, triggers, ticket forms, custom fields, schedules, or SLAs); use the Zendesk Admin Center or a Terraform provider for those. + +== Prerequisites + +Before you create the server, make sure you have: + +* A Zendesk Support instance. +* For *API token* mode: ability to create an API token under *Apps and integrations > APIs > Zendesk API*. +* For *User OAuth* mode: a Zendesk OAuth client and an OAuth Provider configured in ADP. See xref:../oauth-providers.adoc[Configure an OAuth Provider]. + +== Get Zendesk credentials + +=== Option 1: API token (recommended for service accounts) + +. In the Zendesk Admin Center, go to *Apps and integrations > APIs > Zendesk API*. +. On the *Settings* tab, enable *Token access*. +. Click *Add API token*, give it a descriptive label (for example, `redpanda-aigw`), and copy the token value. It is shown only once. +. Note the *agent email* the token will act as (the email of the user who created the token). The HTTP Basic auth string the MCP builds is `base64(/token:)`. The `/token` literal is Zendesk's API-token quirk. +. Store the token in the ADP secret store under a name like `ZENDESK_API_TOKEN`. + +*Required role*: Agents and Admins can use the API. 
Most ticket operations work for the *Agent* role; reading users with `search_users` requires *Light Agent* or higher; Help Center search works for any authenticated user. + +=== Option 2: User OAuth + +For per-user authentication, register an OAuth client on Zendesk and a matching OAuth Provider in ADP: + +. Configure a Zendesk OAuth client under *Apps and integrations > APIs > OAuth Clients* (Confidential client, Authorization Code grant). +. Register a matching OAuth Provider in ADP. See xref:../oauth-providers.adoc[Configure an OAuth Provider]. Use Zendesk's authorize and token endpoints. +. Each end-user authenticates once through the OAuth flow; tokens are stored in the gateway's token vault. + +*Required scopes*: `read tickets:write hc:read` covers all 12 tools. Drop `tickets:write` if the MCP only needs to read. + +== Configure + +Create a new Zendesk MCP server in the ADP UI: + +. Open *MCP Servers > Create Server*. +. Pick *Zendesk* from the marketplace picker. +. Fill in identity fields (`name`, `description`). +. In the Zendesk configuration form: ++ +[cols="1,3"] +|=== +|Field |Notes + +|*Subdomain* +|Your Zendesk subdomain (the part before `.zendesk.com`). For `acme.zendesk.com`, set this to `acme`. + +|*Auth* +|*Basic auth* for API-token mode, or *User-delegated OAuth* for per-user mode. + +|*Username* (Basic auth only) +|Agent email used with the API token (for example, `agent@acme.com`). + +|*Password ref* (Basic auth only) +|Secret-store reference holding the API token (for example, `ZENDESK_API_TOKEN`). `UPPER_SNAKE_CASE`. + +|*OAuth Provider* (User OAuth only) +|Pick the Zendesk OAuth Provider you configured. + +|*Required scopes* (User OAuth only) +|`read`, `tickets:write`, `hc:read` covers all 12 tools. +|=== ++ +. Click *Create*. + +// TODO: capture screenshots of the Zendesk create form on `adp-production`. 
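For API-token mode, the Basic credential that Zendesk expects can be sketched as follows. The helper name `zendesk_basic_header` and the token value are hypothetical; the only fixed part is the `/token` literal between the agent email and the API token. The MCP builds this header for you, so the sketch is useful mainly for checking credentials by hand.

[source,python]
----
import base64


def zendesk_basic_header(agent_email, api_token):
    """Return the Authorization header value for Zendesk API-token auth."""
    # Zendesk pairs "<email>/token" with the API token as the Basic credential.
    credential = f"{agent_email}/token:{api_token}"
    encoded = base64.b64encode(credential.encode("utf-8")).decode("ascii")
    return f"Basic {encoded}"


# Placeholder values for illustration only.
header = zendesk_basic_header("agent@acme.com", "example-token")
----

If a `401 Unauthorized` appears in API-token mode, decoding the header and confirming that it contains the `/token` suffix on the email is a quick first check.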
+ +=== Configure from the CLI + +[tabs] +====== +API-token mode:: ++ +[source,bash] +---- +rpk ai mcp create acme-zendesk --managed-config '{ + "@type": "type.googleapis.com/redpanda.mcps.zendesk.v1.ZendeskMCPConfig", + "subdomain": "acme", + "basic_auth": { + "username": "agent@acme.com", + "password_secret_ref": "ZENDESK_API_TOKEN" + } +}' +---- + +User-OAuth mode:: ++ +[source,bash] +---- +rpk ai mcp create acme-zendesk-oauth \ + --managed-config '{ + "@type": "type.googleapis.com/redpanda.mcps.zendesk.v1.ZendeskMCPConfig", + "subdomain": "acme" + }' \ + --user-oauth-provider zendesk-prod \ + --user-oauth-scopes read,tickets:write,hc:read +---- +====== + +== Tools + +The Zendesk MCP exposes 12 tools across tickets, users, organizations, and Help Center articles: + +[cols="1,2"] +|=== +|Tool |Description + +|`search_tickets` +|Search tickets with Zendesk's search syntax (`status:open priority:high tags:bug`). The handler enforces a `type:ticket` qualifier and post-filters results, so non-ticket records can never leak through. + +|`list_tickets` +|Filter tickets by discrete fields (`status`, `assignee_id`, `group_id`). Faster than `search_tickets` and not subject to the search-API quota. Prefer this for "show me all open tickets assigned to X" workflows. + +|`get_ticket` +|Fetch a single ticket by ID with side-loaded requester, submitter, assignee, group, and organization. Set `include_comments=true` to embed up to 500 comments inline in the same response. + +|`list_ticket_comments` +|List the comment thread on a ticket with explicit pagination. Use this when the thread exceeds 500 comments; otherwise prefer `get_ticket` with `include_comments=true`. + +|`create_ticket` +|Open a new ticket (subject, description, optional priority/type/assignee/group/tags). Subject ≤150 chars, description ≤65536 chars, tags ≤50 chars each. + +|`update_ticket` +|Modify a ticket: status, priority, type, assignee, group, tags. 
Optionally append a public or internal comment in the same call. Distinct `tags` (replace) / `add_tags` / `remove_tags` fields. + +|`search_users` +|Find a Zendesk user by name, email, or other user-search fields. Returns full User objects. + +|`get_user` +|Fetch a single user by ID. Used to drill into a `UserRef` from a side-load when the agent needs the full User shape. + +|`list_organizations` +|List organizations in the Zendesk account. + +|`get_organization` +|Fetch a single organization by ID. Drills into an `OrganizationRef` from a side-load. + +|`search_articles` +|Search Help Center articles. Body is converted from HTML to GitHub-flavored markdown (tables included). + +|`get_article` +|Fetch a single Help Center article by ID. +|=== + +=== Example: Triage open tickets + +[source,bash] +---- +curl -s https://aigw..clusters.rdpa.co/mcp/v1/acme-zendesk \ + -H "Authorization: Bearer $TOKEN" \ + -H "Content-Type: application/json" \ + -d '{ + "jsonrpc": "2.0", + "id": 1, + "method": "tools/call", + "params": { + "name": "search_tickets", + "arguments": { + "query": "status:open priority:urgent", + "max_results": 10 + } + } + }' +---- + +=== Example: Solve a ticket with a closing comment + +[source,bash] +---- +curl -s https://aigw..clusters.rdpa.co/mcp/v1/acme-zendesk \ + -H "Authorization: Bearer $TOKEN" \ + -H "Content-Type: application/json" \ + -d '{ + "jsonrpc": "2.0", + "id": 2, + "method": "tools/call", + "params": { + "name": "update_ticket", + "arguments": { + "ticket_id": 12345, + "status": "solved", + "add_tags": ["resolved-by-agent"], + "comment": { + "body": "Resetting your password should fix this. 
Reopen if it persists.", + "public": true + } + } + } + }' +---- + +=== Example: Read a ticket with its full comment thread + +For "summarize this ticket" flows, inline the comments: + +[source,bash] +---- +curl -s https://aigw..clusters.rdpa.co/mcp/v1/acme-zendesk \ + -H "Authorization: Bearer $TOKEN" \ + -H "Content-Type: application/json" \ + -d '{ + "jsonrpc": "2.0", + "id": 4, + "method": "tools/call", + "params": { + "name": "get_ticket", + "arguments": { + "ticket_id": 12345, + "include_comments": true + } + } + }' +---- + +The handler follows Zendesk's `next_page` URLs only when they point at the same host as the configured subdomain, so pagination cannot be hijacked by a malicious upstream response. + +== Troubleshooting + +Common symptoms and fixes: + +[cols="1,2"] +|=== +|Symptom |What to check + +|`401 Unauthorized` (API-token mode) +|Confirm `ZENDESK_API_TOKEN` content matches the value Zendesk showed at token creation, and the *Username* field is the agent email of the user who created the token. + +|`403 Forbidden` on `search_users` +|The agent role on Zendesk's side is below *Light Agent*. Upgrade the role or use API-token mode with a Light Agent or Admin email. + +|`OAuthConnectionRequired` (User-OAuth mode) +|First call from a user with no stored token. The user completes Zendesk's OAuth consent flow, the token lands in the vault, and subsequent calls reuse it. See xref:../user-delegated-oauth.adoc[User-delegated OAuth]. + +|`scope_upgrade_required` (User-OAuth mode) +|Server's `required_scopes` was extended after users had already consented. Users re-consent with the higher scope. + +|Search returns non-ticket records +|Cannot happen: the handler enforces a `type:ticket` qualifier and post-filters results. If you see something unexpected, file an issue. +|=== + +== Out of scope + +This page does not cover: + +* *Zendesk admin operations*: Managing macros, triggers, ticket forms, custom fields, schedules, or SLAs. 
Use the Zendesk Admin Center or a Terraform provider. +* *Voice / chat / Talk*: This MCP wraps Support tickets and Help Center; voice and chat are separate Zendesk products with their own APIs. + +== Related topics + +* xref:../oauth-providers.adoc[Configure an OAuth Provider] +* xref:../user-delegated-oauth.adoc[User-delegated OAuth] +* xref:../create-server.adoc[Create an MCP Server] +* xref:../test-tools.adoc[Test a server's tools] diff --git a/modules/mcp/pages/oauth-providers.adoc b/modules/mcp/pages/oauth-providers.adoc new file mode 100644 index 0000000..d0f06c4 --- /dev/null +++ b/modules/mcp/pages/oauth-providers.adoc @@ -0,0 +1,265 @@ += Configure an OAuth Provider +:description: Register an OAuth provider in ADP so MCP servers can authenticate users (or service accounts) against an upstream system like Slack, Jira, GitHub, or Salesforce. +:page-topic-type: how-to +:personas: platform_admin +:learning-objective-1: Register an OAuth provider for an upstream system you want MCP servers to authenticate against +:learning-objective-2: Grant the right permissions so principals can attach the provider to MCP servers +:learning-objective-3: Edit, rotate credentials on, or delete an OAuth provider + +An OAuth provider defines an upstream system (Slack, Jira, GitHub, Salesforce, Workday, and so on) that AI Gateway can authenticate against on behalf of users (user-delegated OAuth) or service accounts. After you register a provider, any MCP server that talks to that upstream can attach to it instead of carrying its own credentials. + +After completing this guide, you will be able to: + +* [ ] {learning-objective-1} +* [ ] {learning-objective-2} +* [ ] {learning-objective-3} + +[IMPORTANT] +==== +OAuth providers and OAuth clients are different resources. An *OAuth provider* (this page) is a definition of an upstream system the gateway authenticates against. 
An *OAuth client* is a per-application credential issued *by* the AI Gateway's own identity provider, used by external clients to call ADP. They live under separate sidebar entries (*OAuth Providers* and *OAuth Clients*) and have separate proto, permissions, and lifecycles. +==== + +== Prerequisites + +Before you register the provider, make sure you have: + +* An OAuth 2.0 application registered with the upstream provider, with the gateway's redirect URI configured. The redirect URI is the AI Gateway's OAuth callback (typically `\https://aigw..clusters.rdpa.co/oauth/v1/callback`). +* The OAuth app's *client ID* and *client secret*. +* A secret already created in the ADP secret store for the client secret. Secret references must be `UPPER_SNAKE_CASE`, for example `SLACK_CLIENT_SECRET`. ++ +// TODO: xref the ADP secrets-management page once confirmed. +* The list of *scopes* the upstream API needs. Include every scope any MCP server attached to this provider may need; users re-consent when scopes are added later. + +== Required permissions + +OAuth providers are governed by their own permission set. Granting create/update/delete to an admin role is independent from granting attach to MCP-editor roles. + +[cols="1,2"] +|=== +|Permission |Allows + +|`dataplane_aigateway_oauthprovider_create` +|Create new OAuth providers. + +|`dataplane_aigateway_oauthprovider_get` +|Read existing OAuth providers. + +|`dataplane_aigateway_oauthprovider_update` +|Edit an existing OAuth provider's endpoints, scopes, or credentials. + +|`dataplane_aigateway_oauthprovider_delete` +|Delete an OAuth provider. + +|`dataplane_aigateway_oauthprovider_attach` +|*Required to attach this provider to an MCP server.* Enforced as a sub-resource check in `CreateMCPServer` and `UpdateMCPServer` whenever `authConfig.userOauth.provider_name` is set or swapped. 
Without this check, a principal with `mcpserver_update` could bind any provider's token vault to an MCP they control and indirectly consume its tokens. +|=== + +NOTE: The `_attach` permission is independent from `_get`, `_create`, `_update`, and `_delete`. Grant it only to roles that should be able to bind a given provider's token vault to an MCP server. + +== Manage external identity providers for user-delegated MCP authentication + +The *OAuth Providers* page is your starting point. + +[cols="1,3"] +|=== +|Column |What it shows + +|*Name* +|The provider's machine identifier (used in MCP server configuration to attach this provider). + +|*Grant types* +|A badge per grant type. Typically *Browser consent* for user-delegated OAuth. + +|*Status* +|*Enabled* or *Disabled*. + +|*Scopes* +|A chip list of the supported scopes, for example `read:user`, `repo`, `read:org` for a GitHub provider. +|=== + +A *Filter* button narrows the list. The *Create provider* button opens the create form. + +== Register an OAuth provider in the UI + +Walk through the create form to register the upstream: + +. Sign in to ADP. +. Open *OAuth Providers* in the sidebar. +. Click *Create provider*. +. Pick the upstream's *Category*. The picker offers presets for popular providers. When you pick one, the form pre-fills the standard authorization and token endpoints. ++ +[cols="1,2"] +|=== +|Category |Examples + +|*Identity* +|Okta, Microsoft Entra (Azure AD), Google + +|*Source control* +|GitHub, GitLab, Bitbucket + +|*Productivity* +|Notion, Linear, Asana, Atlassian + +|*Storage* +|Dropbox, Box + +|*Communication* +|Slack, Zoom, Webex, Discord, Zendesk + +|*CRM* +|Salesforce, HubSpot + +|*Data warehouse* +|Snowflake, Databricks + +|*Monitoring* +|Datadog, Splunk, Sentry, PagerDuty + +|*Security* +|Cloudflare, Tailscale, 1Password + +|*Business* +|Workday, DocuSign, ServiceNow +|=== ++ +// TODO: capture screenshots of the well-known catalog picker. ++ +.
Fill in the identity fields: ++ +[cols="1,1,3"] +|=== +|Field |Required |Notes + +|*Name* +|Yes +|Lowercase letters, numbers, and hyphens only. Used to reference the provider in MCP server configuration. Immutable after create. + +|*Display name* +|No +|Human-readable label shown in the UI. + +|*Authorization endpoint* +|Yes +|The upstream's OAuth authorize URL, for example `\https://slack.com/oauth/v2/authorize`. + +|*Token endpoint* +|Yes +|The upstream's OAuth token URL, for example `\https://slack.com/api/oauth.v2.access`. + +|*Revocation endpoint* +|No +|RFC 7009 token-revocation URL. When set, the gateway calls it on disconnect (best-effort). Not all providers support this. +|=== ++ +. Pick a *Grant type*: ++ +* *Browser Consent*: The user approves access in their browser (OAuth 2.0 Authorization Code flow). The default for user-delegated OAuth. +* *JWT Bearer*: RFC 8693 token exchange. Phase 4 feature; the gateway swaps the user's identity-provider JWT for a provider-scoped token without browser interaction. ++ +. Pick a *Token-endpoint authentication method*: ++ +* *HTTP Basic*: `client_id:client_secret` sent as the Basic auth header. Most common. +* *POST body*: Credentials sent as form fields in the token-request body. +* *None*: For public clients that rely on PKCE only. ++ +. Provide the *Client ID* and a secret reference for the *Client secret* (for example, `SLACK_CLIENT_SECRET`). +. Define the *Supported scopes*. Include every scope any MCP server may need. +. Click *Create*. + +The provider appears in the *OAuth providers* list. 
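The three token-endpoint authentication methods in the form differ only in where the client credentials travel. The sketch below illustrates that difference under stated assumptions: `token_request_auth` and the method labels `http_basic`, `post_body`, and `none` are hypothetical names chosen to mirror the UI options, not gateway identifiers, and the Basic case skips the form-encoding step that RFC 6749 prescribes for credentials containing special characters.

[source,python]
----
import base64
from urllib.parse import urlencode


def token_request_auth(method, client_id, client_secret=None):
    """Return (headers, extra_form_fields) for a token-endpoint auth method."""
    if method == "http_basic":
        # Credentials travel in the Authorization header.
        pair = f"{client_id}:{client_secret}".encode("utf-8")
        return {"Authorization": "Basic " + base64.b64encode(pair).decode("ascii")}, {}
    if method == "post_body":
        # Credentials travel as form fields next to the grant parameters.
        return {}, {"client_id": client_id, "client_secret": client_secret}
    if method == "none":
        # Public client: only the client ID is sent; PKCE protects the exchange.
        return {}, {"client_id": client_id}
    raise ValueError(f"unknown method: {method}")


# Placeholder values for illustration only.
headers, fields = token_request_auth("post_body", "example-client", "example-secret")
body = urlencode({"grant_type": "authorization_code", "code": "example-code", **fields})
----

An `invalid_client` error during token exchange often means the upstream expects the other placement, so comparing the selected method with the upstream OAuth app's documentation is a reasonable first check.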
+ +== Register from the CLI + +Use the `rpk ai` plugin to script provider registration: + +[source,bash] +---- +rpk ai oauth-provider create --name ramp \ + --type oauth2 \ + --client-id "$RAMP_CLIENT_ID" \ + --client-secret-ref RAMP_CLIENT_SECRET \ + --auth-url "https://app.ramp.com/v1/authorize" \ + --token-url "https://api.ramp.com/developer/v1/token" \ + --scopes "transactions:read,cards:read,users:read" +---- + +[cols="1,3"] +|=== +|Flag |Notes + +|`--name` +|Resource name. Lowercase letters, numbers, hyphens. Immutable. + +|`--type` +|`oauth2` for the standard authorization-code flow. + +|`--client-id` +|Client ID from the upstream OAuth app. + +|`--client-secret-ref` +|Secret-store reference (`UPPER_SNAKE_CASE`). + +|`--auth-url` +|Authorization endpoint. + +|`--token-url` +|Token endpoint. + +|`--scopes` +|Comma-separated scope list. +|=== + +== Attach to an MCP server + +To attach an OAuth provider to an MCP server, the principal needs `dataplane_aigateway_oauthprovider_attach` on the named provider plus the usual `mcpserver_create` / `mcpserver_update` permission. See xref:create-server.adoc[Create an MCP Server] for the full attach flow and xref:user-delegated-oauth.adoc[User-delegated OAuth] for the consent flow that runs on first call. + +== Edit and rotate credentials + +You can change the provider's configuration or rotate its client secret without re-creating the resource: + +* *Edit*: Open the provider's detail page and click *Edit*. Endpoints, scopes, display name, and the client-secret reference can change. The `Name` is immutable. +* *Rotate credentials*: Update the secret content in the secret store under the same name (for example, `SLACK_CLIENT_SECRET`). The provider's reference is unchanged. Existing tokens in the vault stay valid; the new client secret is used the next time AI Gateway exchanges credentials. 
+* *Disable temporarily*: To pause traffic without losing user consent, disable the dependent MCP servers instead of deleting the provider. Deleting the provider invalidates every MCP server connection that references it. + +== Delete a provider + +Deleting an OAuth provider: + +* Removes the provider record. +* Causes every MCP server that referenced it to fail authentication on the next call (the `provider_name` reference no longer resolves). +* Leaves user-stored tokens in the vault until garbage-collected, but they're unusable without the provider definition. + +Plan the deletion: disable or reconfigure dependent MCP servers first, communicate the cutover to users so they can re-consent against a replacement provider, then delete. + +== Troubleshooting + +Common symptoms and fixes: + +[cols="1,2"] +|=== +|Symptom |What to check + +|`PermissionDenied` when creating an MCP server with this provider attached +|The principal lacks `dataplane_aigateway_oauthprovider_attach` on this provider. Have an admin grant the permission, or delegate the attach to a role that already has it. + +|Consent flow fails with `redirect_uri_mismatch` +|The OAuth app's registered redirect URI doesn't match the gateway's callback. Update the upstream OAuth app to include `\https://aigw..clusters.rdpa.co/oauth/v1/callback`. + +|`invalid_client` during token exchange +|Client ID or client secret is wrong, or the *Token-endpoint authentication method* doesn't match what the upstream expects. Check the upstream OAuth app's settings. + +|`invalid_scope` during consent +|A scope in *Supported scopes* isn't valid for the upstream. Check the upstream's scope reference and remove or rename the offending scope.
+|=== + +== Related topics + +* xref:user-delegated-oauth.adoc[User-delegated OAuth] +* xref:create-server.adoc[Create an MCP Server] +* xref:managed/slack.adoc[Slack managed MCP] +* xref:managed/jira.adoc[Jira managed MCP] +* xref:managed/zendesk.adoc[Zendesk managed MCP] +* xref:managed/workday.adoc[Workday managed MCP] +* xref:managed/ironclad.adoc[Ironclad managed MCP] +* xref:managed/ramp.adoc[Ramp managed MCP] diff --git a/modules/mcp/pages/test-tools.adoc b/modules/mcp/pages/test-tools.adoc index 9baec55..4c9eb45 100644 --- a/modules/mcp/pages/test-tools.adoc +++ b/modules/mcp/pages/test-tools.adoc @@ -6,7 +6,7 @@ :learning-objective-2: Inspect resources, prompts, and call history :learning-objective-3: Diagnose common errors (auth missing, scope upgrade required, transport mismatch) before pointing an agent at the server -Test your MCP server's glossterm:tool[,tools], glossterm:resource[,resources], and glossterm:prompt[,prompts] using the Inspector — a built-in MCP client in the ADP UI. It runs on the same JSON-RPC connection that agents use, so if a tool works in the Inspector, it will work for an agent. Use this after creating your server (xref:create-server.adoc[Create an MCP Server]) or whenever you change a tool's schema. +Test your MCP server's glossterm:tool[,tools], glossterm:resource[,resources], and glossterm:prompt[,prompts] using the Inspector: a built-in MCP client in the ADP UI. It runs on the same JSON-RPC connection that agents use, so if a tool works in the Inspector, it will work for an agent. Use this after creating your server (xref:create-server.adoc[Create an MCP Server]) or whenever you change a tool's schema. After completing this guide, you will be able to: @@ -49,7 +49,7 @@ If the server has *Code mode* enabled, the Tools panel also lists `{name}_search == Resources panel -The Resources panel lists any resources the server exposes through `resources/list`. 
Many MCP servers don't expose resources at all — if the panel is empty, that's fine. +The Resources panel lists any resources the server exposes through `resources/list`. Many MCP servers don't expose resources at all: if the panel is empty, that's fine. If your server does expose resources: @@ -68,7 +68,7 @@ If your server does expose prompts: == Session panel -The Session panel keeps a running history of every call you've made through the Inspector — request, response, latency. Use it to: +The Session panel keeps a running history of every call you've made through the Inspector: request, response, latency. Use it to: * Replay a previous call by clicking it in the history list. * Diff two responses side-by-side (useful when iterating on a tool's logic). @@ -101,8 +101,27 @@ The Session panel keeps a running history of every call you've made through the // TODO: screenshots of each error case once captured against `adp-production`. +== Test from the CLI + +The `rpk ai` plugin offers a non-UI path for the same tool calls. Use it when scripting smoke tests or running checks from CI. + +[source,bash] +---- +# List every tool exposed by a server +rpk ai mcp tools list + +# Call a tool with a JSON arg blob +rpk ai mcp tools call --input '{"arg1":"value"}' + +# Get server detail; includes the tool list by default. Add --no-tools +# to skip discovery (faster when you only want metadata). +rpk ai mcp get +---- + +The CLI uses your active `rpk ai` profile for the gateway URL and authentication. See xref:ai-gateway:connect-agent.adoc[Connect your agent] for installation and profile setup. + == Out of scope -* *Pointing an agent at the server* — see the Agents docs. -* *Aggregating multiple servers* — see xref:ai-gateway:aggregation.adoc[MCP aggregation]. -* *Debugging the upstream system itself* (your SQL database, your Slack app) — outside the scope of MCP tooling. +* *Pointing an agent at the server*: see the Agents docs. 
+* *Aggregating multiple servers*: see xref:ai-gateway:aggregation.adoc[MCP aggregation]. +* *Debugging the upstream system itself* (your SQL database, your Slack app): outside the scope of MCP tooling. diff --git a/modules/mcp/pages/user-delegated-oauth.adoc b/modules/mcp/pages/user-delegated-oauth.adoc index 27c02d9..b24fa5b 100644 --- a/modules/mcp/pages/user-delegated-oauth.adoc +++ b/modules/mcp/pages/user-delegated-oauth.adoc @@ -2,11 +2,11 @@ :description: Have each end-user authenticate against the MCP server's upstream system with their own credentials. Redpanda stores their token in the vault and injects it at call time. :page-topic-type: how-to :personas: platform_admin, app_developer -:learning-objective-1: Configure an MCP server to use user-delegated OAuth with a registered OAuth Provider +:learning-objective-1: Configure an MCP server to use user-delegated OAuth with a registered OAuth provider :learning-objective-2: Walk an end-user through the consent flow and verify the connection :learning-objective-3: Troubleshoot scope upgrades, token expiry, and refresh failures -User-delegated OAuth means each end-user authenticates against the MCP server's upstream system (Slack, Jira, Google, …) with their own credentials. Redpanda stores their token in the token vault and injects it at call time. Contrast with service-account OAuth, where one shared identity is used for every caller. +User-delegated OAuth means each end-user authenticates against the MCP server's upstream system (for example, Slack, Jira, Google) with their own credentials. Redpanda stores their token in the token vault and injects it at call time. Contrast with service-account OAuth, where one shared identity is used for every caller. After completing this guide, you will be able to: @@ -16,9 +16,7 @@ After completing this guide, you will be able to: == Prerequisites -* An OAuth Provider resource configured in the ADP UI under *OAuth Providers*. 
The provider declares the upstream's `authorize_url`, `token_url`, supported scopes, and client credentials. -+ -// TODO: link the OAuth Provider how-to once it exists. +* An OAuth provider resource configured in the ADP UI under *OAuth providers*. The provider declares the upstream's `authorize_url`, `token_url`, supported scopes, and client credentials. See xref:oauth-providers.adoc[Configure an OAuth Provider]. * The required scopes for the upstream API you plan to call. * For *self-managed* MCP servers: the server URL must be `https://` (proto rule `remote_mcp.user_oauth_requires_https`). HTTP is rejected at create time. * For *managed* MCP servers: the type must support user-delegated OAuth. SQL doesn't; Slack, Jira, and Google managed types do. Check xref:managed/managed-catalog.adoc[Managed catalog] before configuring. @@ -27,14 +25,16 @@ After completing this guide, you will be able to: . Create or edit your MCP server (see xref:create-server.adoc[Create an MCP Server]). . In the auth section, choose *User-delegated OAuth*. -. Pick the configured *OAuth Provider* (`UserOAuthAuth.provider_name`). +. Pick the configured *OAuth provider* (`UserOAuthAuth.provider_name`). . List the *required scopes* (`UserOAuthAuth.required_scopes`). Redpanda enforces these at consent time. . (Optional) Override token injection. By default Redpanda sends `Authorization: Bearer `. To use a different header, set `TokenInjection.header_name`. To omit the prefix entirely (for example, an upstream that expects a bare API key as the token), set `TokenInjection.header_prefix` to the empty string. + // TODO: confirm exact UI labels for `TokenInjection.header_name` and `header_prefix`. . Save. -NOTE: Choosing user-delegated OAuth instead of service-account OAuth *is* the credential-mode decision — there's no separate field. User-delegated gives each caller a per-user upstream identity; service-account gives every caller one shared identity. 
Switching between them later requires re-consent for every active user. +NOTE: Choosing user-delegated OAuth instead of service-account OAuth *is* the credential-mode decision: there's no separate field. User-delegated gives each caller a per-user upstream identity; service-account gives every caller one shared identity. Switching between them later requires re-consent for every active user. + +TIP: To configure user-delegated OAuth from the CLI, use `--user-oauth-provider` and `--user-oauth-scopes` on `rpk ai mcp create` or `rpk ai mcp update`. See xref:create-server.adoc[Create an MCP Server]. == The user connection flow @@ -70,8 +70,8 @@ For the field-by-field service-account-OAuth setup, see xref:create-server.adoc# == Worked examples -* xref:managed/slack.adoc[Slack] — consumer-facing user-delegated OAuth example. Shows the consent flow against a real Slack workspace. -* xref:managed/jira.adoc[Jira] — enterprise user-delegated OAuth example. Atlassian's OAuth flow differs from Slack's; this page calls out scope-management gotchas. +* xref:managed/slack.adoc[Slack]: consumer-facing user-delegated OAuth example. Shows the consent flow against a real Slack workspace. +* xref:managed/jira.adoc[Jira]: enterprise user-delegated OAuth example. Atlassian's OAuth flow differs from Slack's; this page calls out scope-management gotchas. == Troubleshooting @@ -80,7 +80,7 @@ For the field-by-field service-account-OAuth setup, see xref:create-server.adoc# |Symptom |What to check |"OAuth provider not found" -|The provider name on the server doesn't match an OAuth Provider in the ADP UI. Check spelling and that the provider exists. +|The provider name on the server doesn't match an OAuth provider in the ADP UI. Check spelling and that the provider exists. |"HTTPS required" on save (self-managed only) |User-delegated OAuth requires `https://` URLs on the MCP server (proto rule `remote_mcp.user_oauth_requires_https`). Switch the server's URL to HTTPS. 
@@ -97,6 +97,6 @@ For the field-by-field service-account-OAuth setup, see xref:create-server.adoc# == Out of scope -* *Configuring an OAuth Provider* — separate workflow under the *OAuth Providers* section. -* *Service-account OAuth setup* — see xref:create-server.adoc#configure-authentication[create-server.adoc]. -* *Token vault internals* — Redpanda manages the vault; users see their own connections under *My Connections*, but the underlying storage isn't user-configurable. +* *Configuring an OAuth provider*: separate workflow under the *OAuth providers* section. +* *Service-account OAuth setup*: see xref:create-server.adoc#configure-authentication[create-server.adoc]. +* *Token vault internals*: Redpanda manages the vault; users see their own connections under *My Connections*, but the underlying storage isn't user-configurable. diff --git a/modules/observability/pages/ingest-custom-traces.adoc b/modules/observability/pages/ingest-custom-traces.adoc index dc50f05..982c37d 100644 --- a/modules/observability/pages/ingest-custom-traces.adoc +++ b/modules/observability/pages/ingest-custom-traces.adoc @@ -621,5 +621,5 @@ If requests succeed but traces do not appear in `redpanda.otel_traces`: * xref:ai-agents:observability/transcripts.adoc[] * xref:ai-agents:agents/monitor-agents.adoc[Observability for declarative agents] -* xref:develop:connect/components/inputs/otlp_http.adoc[OTLP HTTP input reference] - Complete configuration options for the `otlp_http` component -* xref:develop:connect/components/inputs/otlp_grpc.adoc[OTLP gRPC input reference] - Alternative gRPC-based trace ingestion +* xref:develop:connect/components/inputs/otlp_http.adoc[OTLP HTTP input reference] +* xref:develop:connect/components/inputs/otlp_grpc.adoc[OTLP gRPC input reference] diff --git a/modules/observability/partials/observability-logs.adoc b/modules/observability/partials/observability-logs.adoc index c9c4a44..8c10e8f 100644 --- a/modules/observability/partials/observability-logs.adoc 
+++ b/modules/observability/partials/observability-logs.adoc @@ -767,4 +767,4 @@ Note: Cost estimates are approximate. Use provider invoices for billing. == Next steps -* xref:ai-gateway/observability-metrics.adoc[]: Aggregate analytics and cost tracking. \ No newline at end of file +* xref:ai-gateway/observability-metrics.adoc[] diff --git a/modules/observability/partials/observability-metrics.adoc b/modules/observability/partials/observability-metrics.adoc index 15b8ca5..371b3b2 100644 --- a/modules/observability/partials/observability-metrics.adoc +++ b/modules/observability/partials/observability-metrics.adoc @@ -855,4 +855,4 @@ Solution: == Next steps -* xref:ai-gateway/observability-logs.adoc[]: View individual requests and debug issues. +* xref:ai-gateway/observability-logs.adoc[]
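Review note on the token-injection step in the `user-delegated-oauth.adoc` hunk above: the following is a minimal sketch, not Redpanda's implementation, of how `TokenInjection.header_name` and `TokenInjection.header_prefix` could combine into the outgoing header. The function name `build_injected_header`, the `X-Api-Key` header, and the assumption that the default prefix is `Bearer ` with a trailing space are all hypothetical and used only for illustration.

```python
from typing import Dict


def build_injected_header(
    token: str,
    header_name: str = "Authorization",  # stands in for TokenInjection.header_name
    header_prefix: str = "Bearer ",      # stands in for TokenInjection.header_prefix (assumed default)
) -> Dict[str, str]:
    """Return the single header that would carry the user's token upstream.

    Hypothetical helper: it only models the documented behavior that the
    prefix is prepended to the token and that an empty prefix sends the
    token bare.
    """
    return {header_name: f"{header_prefix}{token}"}


# Default injection: the prefix followed by the token in the Authorization header.
default_header = build_injected_header("example-token")

# Override for an upstream that expects a bare API key in a custom header
# (the header name here is made up for the example).
bare_key_header = build_injected_header(
    "example-token", header_name="X-Api-Key", header_prefix=""
)

print(default_header)
print(bare_key_header)
```

Modeling the prefix as a plain string that is concatenated with the token makes the "empty string omits the prefix" case fall out without a special branch, which matches how the docs describe the override.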