diff --git a/modules/ai-gateway/pages/admin/setup-guide.adoc b/modules/ai-gateway/pages/admin/setup-guide.adoc
index 86eaa80..35e1913 100644
--- a/modules/ai-gateway/pages/admin/setup-guide.adoc
+++ b/modules/ai-gateway/pages/admin/setup-guide.adoc
@@ -379,4 +379,4 @@ Users can then discover and connect to the gateway using the information provide
 == Next steps
 
 * xref:routing-cel.adoc[CEL Routing Cookbook]
-* xref:integrations/index.adoc[Integrations]
+* xref:integrations:index.adoc[Integrations]
diff --git a/modules/ai-gateway/pages/connect-agent.adoc b/modules/ai-gateway/pages/connect-agent.adoc
index 55c34d0..cd4b35f 100644
--- a/modules/ai-gateway/pages/connect-agent.adoc
+++ b/modules/ai-gateway/pages/connect-agent.adoc
@@ -1,13 +1,13 @@
 = Connect Your Agent
-:description: Point your application or AI agent at an AI Gateway provider's proxy URL. Covers the URL shape, the local auth flow with the `rpk ai` plugin, the OIDC client-credentials flow for CI, and SDK examples for OpenAI, Anthropic, Google AI, AWS Bedrock, and OpenAI-compatible endpoints.
+:description: Point your application or AI agent at an AI Gateway provider's proxy URL. Covers the URL shape, the local development workflow with `rpk ai`, the OIDC client-credentials flow for CI and application code, and SDK examples for OpenAI, Anthropic, Google AI, AWS Bedrock, and OpenAI-compatible endpoints.
 :page-topic-type: how-to
 :personas: app_developer
 :page-aliases: redpanda-cloud:ai-agents:ai-gateway/builders/connect-your-agent.adoc
 :learning-objective-1: Construct the proxy URL for an LLM provider you have configured
-:learning-objective-2: Authenticate to AI Gateway using the `rpk ai` plugin for local development or OIDC client credentials for CI and programmatic clients
+:learning-objective-2: Authenticate to AI Gateway with `rpk` for local development or with OIDC client credentials for CI and programmatic clients
 :learning-objective-3: Send requests through the proxy URL with the SDK of your choice
 
-This guide shows how to connect your glossterm:AI agent[] or application to AI Gateway. You'll construct the proxy URL for a provider you have already created, authenticate (with the `rpk ai` plugin for local development or with OIDC client credentials for CI), and send your first request with the SDK of your choice.
+This guide shows how to connect your glossterm:AI agent[] or application to AI Gateway. You construct the proxy URL for a provider you have already created, authenticate (with `rpk cloud login` for local development or with OIDC client credentials for CI and application code), and send your first request with the SDK of your choice.
 
 After completing this guide, you will be able to:
@@ -17,8 +17,8 @@ After completing this guide, you will be able to:
 
 == Prerequisites
 
-* A configured LLM provider. If you haven't created one yet, see xref:configure-provider.adoc[Configure an LLM provider].
-* For local development: nothing else; you'll install the `rpk ai` plugin in the next section.
+* A configured LLM provider. If you haven't created one yet, see xref:ai-gateway:configure-provider.adoc[Configure an LLM provider].
+* For local development, nothing else. You'll install `rpk ai` in the next section.
 * For CI or programmatic clients: a Redpanda Cloud service account with OIDC client credentials. See xref:redpanda-cloud:security:cloud-authentication.adoc[Authenticate to Redpanda Cloud].
 +
 // TODO: confirm whether ADP hosts its own service-account IAM post-standalone, or continues to share Redpanda Cloud Organization IAM.
@@ -41,41 +41,84 @@ AI Gateway forwards the request to the upstream provider, attaches the configure
 
 TIP: The provider detail page generates ready-to-run snippets pre-filled with the correct proxy URL and paths. When in doubt, copy from the *Connect your app* section there.
 
+// Updated for PRs #30273 / #30327 / #30360 (rpk ai managed plugin).
 [[authenticate-with-rpk-ai]]
 [[authenticate-with-rpai]]
-== Authenticate with `rpk ai` (recommended for local development)
+== Use `rpk ai` for local development
 
-The `rpk ai` plugin is distributed through `rpk`'s plugin manager. The provider detail page surfaces an *Install* card with copy-pasteable steps. The flow is the same for every provider type:
+The `rpk ai` command is the Redpanda AI CLI. Use it to manage AI Gateway resources (LLM providers, MCP servers, OAuth providers) and call MCP tools from the command line. Authentication for `rpk ai` is handled by `rpk cloud login`. The active AI Gateway URL comes from your active rpk cloud profile.
 
-. Install the plugin:
+. Install `rpk ai`:
 +
 [source,bash]
 ----
-rpk plugin install ai
+rpk ai install
 ----
++
+Update later with `rpk ai upgrade`; remove with `rpk ai uninstall`.
 
-. Log in with the gateway URL from the provider's *Connection* card:
+. Log in to Redpanda Cloud:
 +
 [source,bash]
 ----
-rpk ai auth login --server https://aigw..clusters.rdpa.co
+rpk cloud login
 ----
++
+This caches a cloud token in `~/.config/rpk/rpk.yaml`. On every invocation, `rpk ai` reads the cached token automatically.
 
-. Point your SDK at the proxy URL and let `rpk ai auth token` mint a fresh token on each call. Set environment variables:
+. Select a profile that points at a cluster with AI Gateway v2 attached. The AI Gateway URL is cached on the profile when you create it.
 +
 [source,bash]
 ----
-export PROXY_URL="/llm/v1/providers/"
-export OPENAI_API_KEY="$(rpk ai auth token)" # or ANTHROPIC_API_KEY, etc.
+rpk profile use <profile-name>
+# or, to switch the cluster the active profile points at:
+rpk cloud cluster use <cluster-id>
 ----
 
-`rpk ai auth token` returns a short-lived OIDC access token. Refresh by running it again: most users wire it into a wrapper script or shell function.
+. Verify the connection:
++
+[source,bash]
+----
+rpk ai llm list
+----
 
-TIP: The plugin supports named profiles for pointing at multiple gateways. Run `rpk ai profile create --dataplane-url --auth-mode device` to create one, then `rpk ai profile use` to switch. See `rpk ai profile --help` for the full set of subcommands.
+If the cached cloud token has expired, `rpk ai` returns a 401 with a hint to rerun `rpk cloud login`.
+
+[TIP]
+====
+To target a specific gateway URL for a single invocation (for example, when running against a staging gateway without switching profiles), pass `--rpai-endpoint`:
+
+[source,bash]
+----
+rpk ai --rpai-endpoint https://aigw.<cluster-id>.clusters.rdpa.co llm list
+----
+
+You can also export `RPAI_ENDPOINT` to override for the shell session.
+====
+
+// TODO(rpk-ai): rpai suppresses auth/profile subtrees in plugin mode today (cloudv2 apps/rpai/internal/cmd/root.go:127-135). If that changes, document `rpk ai auth` and `rpk ai profile` here.
+
+=== Environment variables
+
+The `rpk ai` command honors the following environment variables:
+
+[cols="1,3"]
+|===
+|Variable |Purpose
+
+|`RPAI_TOKEN`
+|Bearer token for the gateway. Normally injected automatically from your cached `rpk cloud login` token; set explicitly to override.
+
+|`RPAI_ENDPOINT`
+|AI Gateway URL. Normally resolved from your active rpk cloud profile; set explicitly to override.
+
+|`RPAI_PROFILE`, `RPAI_CONFIG`, `RPAI_VERBOSE`, `RPAI_FORMAT`
+|Map to `--rpai-profile`, `--rpai-config`, `--rpai-verbose`, `--format`. Long flag names are renamed under `rpk ai` to avoid collision with `rpk`'s globals; short flags (`-p`, `-c`, `-v`, `-o`) are unchanged.
+|===
 
 == Authenticate with OIDC client credentials (CI and programmatic)
 
-When the `rpk ai` plugin isn't available (CI runners, server-side processes, headless agents), use the OIDC `client_credentials` grant directly. Values are surfaced on the provider's *Connection* card; defaults at the time of writing are below.
+For application code, CI runners, server-side processes, and headless agents, use the OIDC `client_credentials` grant directly. This is the canonical authentication path for SDK-style usage; `rpk ai` is for command-line workflows, not for embedding in application code. Values are surfaced on the provider's *Connection* card; defaults at the time of writing are below.
 
 [cols="1,2", options="header"]
 |===
@@ -146,6 +189,7 @@ Passing `token_endpoint` to the `OAuth2Session` constructor lets `authlib` handl
 
 Node.js (openid-client)::
 +
+--
 [source,javascript]
 ----
 import { Issuer } from 'openid-client';
@@ -166,6 +210,7 @@ const tokenSet = await client.grant({
 const accessToken = tokenSet.access_token;
 ----
+--
 ======
 
 === Token lifecycle management
 
@@ -175,7 +220,7 @@ IMPORTANT: Your client is responsible for refreshing tokens before they expire.
 
 * Proactively refresh at ~80% of the token's TTL to avoid failed requests.
 * `authlib` (Python) handles renewal automatically when you pass `token_endpoint` to `OAuth2Session`.
 * For other languages, cache the token and its expiry, then request a new token before the current one expires.
-* If you're using `rpk ai`, just rerun `rpk ai auth token`: it handles refresh against the same OIDC endpoint.
+* For SDK code, refresh OIDC client-credentials tokens through your client library (see the `authlib` example above).
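The cache-and-refresh rule in the bullets above (cache the token and its expiry, refresh at ~80% of TTL) can be sketched in a few lines of Python. This is an illustrative helper, not part of any documented SDK; `fetch_token` stands in for whatever `client_credentials` call your OIDC library performs:

```python
import time

class TokenCache:
    """Cache an OIDC access token and refresh it at ~80% of its TTL."""

    def __init__(self, fetch_token, refresh_fraction=0.8):
        # fetch_token() -> (access_token, expires_in_seconds)
        self._fetch_token = fetch_token
        self._refresh_fraction = refresh_fraction
        self._token = None
        self._refresh_at = 0.0

    def get(self):
        now = time.monotonic()
        if self._token is None or now >= self._refresh_at:
            token, expires_in = self._fetch_token()
            self._token = token
            # Schedule the next refresh before the token actually expires.
            self._refresh_at = now + expires_in * self._refresh_fraction
        return self._token
```

In practice `fetch_token` would POST to the token endpoint from the table above with `grant_type=client_credentials` and return the `access_token` and `expires_in` fields of the response.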
 
 == Send requests with your SDK
 
@@ -184,13 +229,14 @@ The examples in this section assume you've set:
 
 [source,bash]
 ----
 export PROXY_URL="/llm/v1/providers/"
-export AUTH_TOKEN="$(rpk ai auth token)" # or an OIDC access token from above
+export AUTH_TOKEN="" # from the client_credentials flow above
 ----
 
 [tabs]
 ======
 OpenAI SDK::
 +
+--
 [source,python]
 ----
 import os
@@ -198,7 +244,7 @@ from openai import OpenAI
 
 client = OpenAI(
     base_url=os.environ["PROXY_URL"], # .../llm/v1/providers/my-openai
-    api_key=os.environ["AUTH_TOKEN"], # rpk ai or OIDC access token
+    api_key=os.environ["AUTH_TOKEN"], # OIDC access token
 )
 
 response = client.chat.completions.create(
@@ -207,11 +253,13 @@ response = client.chat.completions.create(
 )
 print(response.choices[0].message.content)
 ----
-+
+
 The OpenAI SDK calls the proxy's `/v1/chat/completions` path, which AI Gateway forwards to OpenAI unchanged. Use it with any OpenAI provider and, with a different `base_url`, with any OpenAI-compatible provider (vLLM, Ollama, LM Studio, Together, Groq, OpenRouter).
+--
 
 Anthropic SDK::
 +
+--
 [source,python]
 ----
 import os
@@ -219,7 +267,7 @@ from anthropic import Anthropic
 
 client = Anthropic(
     base_url=os.environ["PROXY_URL"], # .../llm/v1/providers/my-anthropic
-    auth_token=os.environ["AUTH_TOKEN"], # rpk ai or OIDC access token
+    auth_token=os.environ["AUTH_TOKEN"], # OIDC access token
 )
 
 message = client.messages.create(
@@ -229,11 +277,13 @@ message = client.messages.create(
 )
 print(message.content[0].text)
 ----
-+
+
 The Anthropic SDK hits `v1/messages` on the proxy, which AI Gateway forwards to Anthropic. If the provider is configured with *Auth passthrough*, send your own Anthropic `Authorization` header instead of an `auth_token`. AI Gateway forwards it unchanged.
+--
 
 Google Gemini SDK::
 +
+--
 [source,python]
 ----
 import os
@@ -250,16 +300,18 @@ response = client.models.generate_content(
 )
 print(response.text)
 ----
-+
+
 [IMPORTANT]
 ====
 Gemini authenticates with the `x-goog-api-key` header, not `Authorization: Bearer`. Most Google SDKs set `x-goog-api-key` automatically from the `api_key` parameter. If you hand-roll the request, set the header yourself.
 ====
+--
 
 AWS Bedrock::
 +
-Bedrock is different: SigV4 signing is performed *server-side* by AI Gateway using the credentials on the provider. Your client only needs to call the proxy URL with an `rpk ai` or OIDC token.
-+
+--
+Bedrock is different: SigV4 signing is performed *server-side* by AI Gateway using the credentials on the provider. Your client only needs to call the proxy URL with an OIDC access token.
+
 [source,python]
 ----
 import os, httpx
@@ -278,14 +330,16 @@ response = httpx.post(
 
 print(response.json())
 ----
 
-See xref:configure-provider.adoc#bedrock-inference-profiles[the Bedrock provider reference] for inference-profile selection guidance.
-+
+See xref:ai-gateway:configure-provider.adoc#bedrock-inference-profiles[the Bedrock provider reference] for inference-profile selection guidance.
+
 TIP: Bedrock's `Converse` API works the same way: send to `/model/\{MODEL_ID}/converse` with a Converse-shaped body. Or use the AWS SDK's `bedrockruntime` client and set its `BaseEndpoint` to the proxy URL; the SDK signs the request, AI Gateway re-signs server-side with the provider's credentials, and your client never sees AWS keys.
+--
 
 OpenAI-compatible::
 +
+--
 Use the OpenAI SDK with the proxy URL of the OpenAI-compatible provider and whatever model identifier the upstream exposes:
-+
+
 [source,python]
 ----
 import os
@@ -301,6 +355,7 @@ response = client.chat.completions.create(
     messages=[{"role": "user", "content": "Hello"}],
 )
 ----
+--
 ======
 
 [NOTE]
@@ -354,18 +409,18 @@ AI Gateway returns standard HTTP status codes. The upstream provider's error bod
 
 == Best practices
 
-* *Use environment variables* for the proxy URL and token; never hard-code them.
-* *Wrap `rpk ai auth token`* in a script or shell function so refresh is invisible to your SDK code.
-* *Implement retry with exponential backoff* for 5xx and timeout conditions.
-* *Respect `Retry-After`* on 429 responses.
-* *Rotate service account credentials* on a schedule your organization accepts.
-* *Observe usage* through the ADP UI on each provider's detail page. A *Cost & usage* section is in development (the UI shows a "Coming soon" placeholder today).
+* Use environment variables for the proxy URL and token. Never hard-code them.
+* Refresh OIDC tokens through your client library so refresh is invisible to your SDK code (`authlib` for Python, `openid-client` for Node.js, etc.).
+* Implement retry with exponential backoff for 5xx and timeout conditions.
+* Respect `Retry-After` on 429 responses.
+* Rotate service account credentials on a schedule your organization accepts.
+* Observe usage through the ADP UI on each provider's detail page. A *Cost & usage* section is in development (the UI shows a "Coming soon" placeholder today).
 
 == Troubleshooting
 
 === 401 Unauthorized
 
-* If you're using `rpk ai`: rerun `rpk ai auth login` to refresh the session, then `rpk ai auth token` to mint a new token.
+* If you're using `rpk ai`: rerun `rpk cloud login` to refresh the cached cloud token. Token expiry surfaces as a 401 with this hint in the error.
 * If you're using OIDC client credentials: check the token hasn't expired and refresh it. Verify the audience is `cloudv2-production.redpanda.cloud` and the `Authorization` header is formatted `Bearer <token>`.
 * For Gemini: ensure the token is sent as `x-goog-api-key`, not `Authorization`.
 * For Anthropic with passthrough: ensure the client is sending a valid Anthropic `Authorization` header.
@@ -388,4 +443,4 @@ AI Gateway returns standard HTTP status codes. The upstream provider's error bod
 
 == Next steps
 
-* xref:configure-provider.adoc[Configure an LLM provider]
+* xref:ai-gateway:configure-provider.adoc[Configure an LLM provider]
diff --git a/modules/ai-gateway/pages/gateway-quickstart.adoc b/modules/ai-gateway/pages/gateway-quickstart.adoc
index 2b11f83..a6a623e 100644
--- a/modules/ai-gateway/pages/gateway-quickstart.adoc
+++ b/modules/ai-gateway/pages/gateway-quickstart.adoc
@@ -529,6 +529,6 @@ const openai = new OpenAI({
 
 * xref:routing-cel.adoc[]
 * xref:aggregation.adoc[]
-* xref:integrations/index.adoc[]
+* xref:integrations:index.adoc[]
 * xref:gateway-architecture.adoc[]
 * xref:overview.adoc[]
diff --git a/modules/ai-gateway/pages/overview.adoc b/modules/ai-gateway/pages/overview.adoc
index 5f38ffb..1736408 100644
--- a/modules/ai-gateway/pages/overview.adoc
+++ b/modules/ai-gateway/pages/overview.adoc
@@ -46,7 +46,7 @@ Use the provider's own SDK: OpenAI, Anthropic, Google AI, AWS Bedrock, or any Op
 
 === Managed authentication
 
-Applications authenticate to ADP with OIDC service accounts instead of long-lived provider API keys. Service accounts use the same role and audit model as every other ADP resource, and mint short-lived tokens that are easy to revoke. The recommended local flow uses the `rpk ai` plugin for token refresh; CI and programmatic clients use the OIDC client-credentials grant directly. See xref:connect-agent.adoc[Connect your agent].
+Applications authenticate to ADP with OIDC service accounts instead of long-lived provider API keys. Service accounts use the same role and audit model as every other ADP resource, and mint short-lived tokens that are easy to revoke. For local command-line workflows, use `rpk cloud login` to authenticate and `rpk ai` to talk to the gateway. CI and programmatic clients use the OIDC client-credentials grant directly. See xref:ai-gateway:connect-agent.adoc[Connect your agent].
 
 === Per-provider observability
 
@@ -78,7 +78,7 @@ AI Gateway supports five provider types. The UI labels and short descriptions ma
 |Call Claude Opus, Sonnet, and Haiku directly. Optionally forwards the client's `Authorization` header for enterprise and Max-plan subscription passthrough.
 
 |*Google AI*
-|Reach Gemini Pro, Flash, and multimodal models via Google AI Studio. Ideal for long-context workloads and image/video inputs.
+|Reach Gemini Pro, Flash, and multimodal models through Google AI Studio. Ideal for long-context workloads and image/video inputs.
 
 |*AWS Bedrock*
 |Invoke foundation models (Claude, Llama, Titan, Nova) hosted inside your AWS account. Use when data residency, IAM, or VPC egress matter more than raw feature parity. Signed with SigV4 server-side by AI Gateway.
@@ -87,7 +87,7 @@ AI Gateway supports five provider types. The UI labels and short descriptions ma
 |Point at any OpenAI-compatible endpoint (vLLM, Ollama, LM Studio, LocalAI, Together, Groq, OpenRouter). Useful for self-hosted models and aggregator gateways that ship `/v1/chat/completions`.
 |===
 
-See xref:configure-provider.adoc[Configure an LLM provider] for the full form reference for each type.
+See xref:ai-gateway:configure-provider.adoc[Configure an LLM provider] for the full form reference for each type.
 
 == When to use AI Gateway
 
@@ -116,5 +116,5 @@ AI Gateway does not provide these capabilities. For current status, consult the
 
 == Next steps
 
-. xref:configure-provider.adoc[Configure an LLM provider]
-. xref:connect-agent.adoc[Connect your agent]
+. xref:ai-gateway:configure-provider.adoc[Configure an LLM provider]
+. xref:ai-gateway:connect-agent.adoc[Connect your agent]
diff --git a/modules/governance/pages/budgets.adoc b/modules/governance/pages/budgets.adoc
index 5cb8ffa..b9cfc8a 100644
--- a/modules/governance/pages/budgets.adoc
+++ b/modules/governance/pages/budgets.adoc
@@ -61,7 +61,7 @@ Some guardrail evaluators call an LLM to do their work. A toxicity classifier, f
 
 Guardrail evaluator cost surfaces in the same spending pipeline as user-facing LLM calls. The evaluator's cost is attributed to the *evaluator's configured upstream provider* — usually a small classifier model, separate from the user-facing LLM — so per-provider breakdowns separate the two automatically.
 
-For the per-evaluator cost model and how it interacts with the dashboard's spend view, see xref:governance:guardrails.adoc[Configure guardrails].
+For the per-evaluator cost model and how it interacts with the dashboard's spend view, see xref:governance:guardrails/index.adoc[Configure guardrails].
 
 // TODO: confirm with eng that guardrail evaluator cost flows into the same SpendingService as user-facing LLM cost (vs. a separate stream). Open Q A3 in the companion plan, also flagged on the Guardrails plan.
 
@@ -87,7 +87,7 @@ Cap-management arrives after GA per the Governance V0 PRD. The planned feature s
 * *Alert hooks* — webhook, email, or chat notifications when a cap is approached or exceeded.
 * *Multi-tenant cap-setting* — per-tenant caps with override semantics.
 
-Until those features ship, treat the dashboard and breakdown queries as your visibility layer and use platform-level guardrails (xref:governance:guardrails.adoc[Configure guardrails]) for selective request blocking.
+Until those features ship, treat the dashboard and breakdown queries as your visibility layer and use platform-level guardrails (xref:governance:guardrails/index.adoc[Configure guardrails]) for selective request blocking.
 
 // TODO: once the cap-management surface lands, replace this section with a forward link to the configuration how-to. If cap-management content grows beyond a single section, split this page into a sub-folder. Open Q C1 in the companion plan.
diff --git a/modules/mcp/pages/create-server.adoc b/modules/mcp/pages/create-server.adoc
index 0620c8f..ccfc9b7 100644
--- a/modules/mcp/pages/create-server.adoc
+++ b/modules/mcp/pages/create-server.adoc
@@ -22,7 +22,7 @@ After completing this guide, you will be able to:
 * For any auth mode that uses upstream credentials: the credentials in hand and a secret already created in the ADP secret store. Secret references must be `UPPER_SNAKE_CASE` (proto regex `^[A-Z][A-Z0-9_]*$`). For example: `MCP_API_KEY`, `OPENAI_API_KEY`.
 +
 // TODO: xref the ADP secrets-management page once confirmed.
-* For user-delegated OAuth: an OAuth Provider resource already configured. See xref:user-delegated-oauth.adoc[User-delegated OAuth].
+* For user-delegated OAuth: an OAuth Provider resource already configured. See xref:mcp:user-delegated-oauth.adoc[User-delegated OAuth].
 
 == Open the MCP Servers page
 
@@ -40,7 +40,7 @@ The marketplace picker lists every managed type as a card and includes a *Remote
 
 // TODO: screenshot of the marketplace picker, with both a managed card and the Remote (Proxied) option visible.
 
-For a tour of every managed type and which one fits your use case, see xref:managed/managed-catalog.adoc[Managed catalog]. To go deep on the self-managed path (transport choices, TLS, multi-server aggregation), see xref:register-remote.adoc[Register a self-managed server].
+For a tour of every managed type and which one fits your use case, see xref:mcp:managed/managed-catalog.adoc[Managed catalog]. To go deep on the self-managed path (transport choices, TLS, multi-server aggregation), see xref:mcp:register-remote.adoc[Register a self-managed server].
 
 == Name and basic fields
 
@@ -67,17 +67,11 @@ Every server has the same identity fields.
 
 == Configure the managed flow (managed types only)
 
-Each managed type ships its own configuration schema. The form on this page is rendered from the type's `_config.proto`, so field labels and help text come from the proto definition itself. There is no hand-written form code per type.
+Each managed type ships its own configuration schema. The form on this page is rendered from the type's `_config.proto`, so field labels and help text come directly from the proto definition. No per-type hand-written form code is required.
 
 // TODO: screenshot of the SQL configuration form as the exemplar (it covers the most common field shapes).
 
-For per-type fields, see xref:managed/managed-catalog.adoc[Managed catalog] for one-line entries on every type, and the deep-dive pages for SQL, Kafka, Slack, Jira, and OpenAPI:
-
-* xref:managed/sql.adoc[SQL]
-* xref:managed/kafka.adoc[Kafka]
-* xref:managed/slack.adoc[Slack]
-* xref:managed/jira.adoc[Jira]
-* xref:managed/openapi.adoc[OpenAPI]
+For per-type fields, see the xref:mcp:managed/managed-catalog.adoc[Managed catalog]: a reference of every managed MCP type Redpanda hosts, grouped by category, with a description and a link to its deep-dive page where one exists.
 
 == Configure the self-managed flow (Remote/Proxied only)
 
@@ -93,7 +87,7 @@ Two fields on top of the identity fields:
 
 |Transport
 |Yes
-|`SSE` (server-sent events) or `Streamable HTTP` (newer bidirectional protocol). Pick whichever your server speaks. See xref:register-remote.adoc[Register a self-managed server] for how to test which transport your server uses.
+|`SSE` (server-sent events) or `Streamable HTTP` (newer bidirectional protocol). Pick whichever your server speaks. See xref:mcp:register-remote.adoc[Register a self-managed server] for how to test which transport your server uses.
 |===
 
 == Configure authentication
 
@@ -117,7 +111,7 @@ Both managed and self-managed servers offer the same five authentication modes.
 
 |2-legged OAuth client credentials. One shared upstream identity for every caller. Provide `client_id`, `client_secret_ref`, `token_url`, and any required `scopes`.
 
 |*User-delegated OAuth*
-|Each end-user authenticates against the upstream system with their own credentials, and Redpanda injects the user's token at call time. Pick the configured *OAuth Provider* and the required scopes. The first time a user calls a tool that needs this server, Redpanda surfaces a consent prompt; the resulting connection is stored in the token vault and shows up under *My Connections*. See xref:user-delegated-oauth.adoc[User-delegated OAuth] for the full flow.
+|Each end-user authenticates against the upstream system with their own credentials, and Redpanda injects the user's token at call time. Pick the configured *OAuth Provider* and the required scopes. The first time a user calls a tool that needs this server, Redpanda surfaces a consent prompt; the resulting connection is stored in the token vault and shows up under *My Connections*. See xref:mcp:user-delegated-oauth.adoc[User-delegated OAuth] for the full flow.
 |===
 
 NOTE: Choosing between *Service-account OAuth* and *User-delegated OAuth* is the credential-mode decision. Service-account auth gives every caller the same identity at the upstream; user-delegated auth gives each caller their own.
 
@@ -138,13 +132,13 @@ NOTE: Defer advanced code-mode patterns (sandboxing limits, runtime selection, d
 
 . Click *Create*. The server appears in the list with a *Type* badge: *Managed* or *Self-managed*.
 . Open the detail page. The *Overview* tab shows the *API URL*: this is the MCP URL agents connect to. Copy it for use later.
-. Open the *Inspector* tab. Redpanda performs a live `tools/list` against the server and lists every tool it discovered. See xref:test-tools.adoc[Test a server's tools] for how to call them.
+. Open the *Inspector* tab. Redpanda performs a live `tools/list` against the server and lists every tool it discovered. See xref:mcp:test-tools.adoc[Test a server's tools] for how to call them.
 
 A populated tools list confirms that the connection works and credentials resolve correctly.
 If the list is empty or the tab shows an error, see <>.
 
 == Create from the CLI
 
-The `rpk ai` plugin offers a non-UI path for the same create flow. Useful for scripting and CI.
+Use `rpk ai` for a non-UI path through the same create flow, useful for scripting and CI.
 
 [source,bash]
 ----
@@ -181,16 +175,15 @@ rpk ai mcp update my-ramp \
 |JSON blob carrying the managed type's `_config.proto` shape, including a `@type` URL.
 
 |`--user-oauth-provider`
-|Name of an OAuth Provider already registered under *OAuth Providers*. See xref:oauth-providers.adoc[Configure an OAuth Provider]. The principal needs `dataplane_aigateway_oauthprovider_attach` on the named provider (AI-893).
+|Name of an OAuth Provider already registered under *OAuth Providers*. See xref:mcp:oauth-providers.adoc[Configure an OAuth Provider]. The principal needs `dataplane_aigateway_oauthprovider_attach` on the named provider (AI-893).
 
 |`--user-oauth-scopes`
 |Comma-separated scopes the server requires. Provide every scope any tool may need; user re-consent is required if scopes change later.
-
-|`--auth-config`
-|Alternative when a server needs an auth shape that doesn't map to the convenience flags. Takes a JSON blob matching the appropriate auth oneof variant.
 |===
 
-The CLI uses your active `rpk ai` profile for the gateway URL and authentication.
+// TODO(rpk-ai): --auth-config does not exist on `rpk ai mcp create` today (verified against cloudv2/apps/rpai/internal/cmd/mcp/cmd.go on 2026-05-05). For non-convenience auth shapes, encode them inside `--managed-config` JSON. Re-add this row if a flag lands later.
+
+The command resolves the gateway URL from your active rpk cloud profile and reads the cached `rpk cloud login` token.
 
 == Edit, disable, and delete a server
 
@@ -226,7 +219,7 @@ The CLI uses your active `rpk ai` profile for the gateway URL and authentication
 
 The following capabilities are not configured on this page; see the linked content instead.
 
-* *User-delegated OAuth consent flow*: see xref:user-delegated-oauth.adoc[User-delegated OAuth].
-* *Inspector usage*: see xref:test-tools.adoc[Test a server's tools].
+* *User-delegated OAuth consent flow*: see xref:mcp:user-delegated-oauth.adoc[User-delegated OAuth].
+* *Inspector usage*: see xref:mcp:test-tools.adoc[Test a server's tools].
 * *Multi-server aggregation*: handled by AI Gateway. See xref:ai-gateway:aggregation.adoc[MCP aggregation].
-* *Per-type configuration depth*: see xref:managed/managed-catalog.adoc[Managed catalog] and the deep-dive pages.
+* *Per-type configuration depth*: see xref:mcp:managed/managed-catalog.adoc[Managed catalog] and the deep-dive pages.
diff --git a/modules/mcp/pages/oauth-providers.adoc b/modules/mcp/pages/oauth-providers.adoc
index d0f06c4..99b8186 100644
--- a/modules/mcp/pages/oauth-providers.adoc
+++ b/modules/mcp/pages/oauth-providers.adoc
@@ -54,11 +54,11 @@ OAuth providers are governed by their own permission set. Granting create/update
 |*Required to attach this provider to an MCP server.* Enforced as a sub-resource check in `CreateMCPServer` and `UpdateMCPServer` whenever `authConfig.userOauth.provider_name` is set or swapped. Without this permission, a principal with `mcpserver_update` could otherwise bind any provider's token vault to an MCP they control and indirectly consume its tokens.
 |===
 
-NOTE: The `_attach` permission is independent from `_get`, `_create`, `_update`, and `_delete`. Grant it only to roles that should be able to bind a given provider's token vault to an MCP server.
+NOTE: The `_attach` permission is independent from `_get`, `_create`, `_update`, and `_delete`. Grant it only to roles that need to bind a given provider's token vault to an MCP server.
 
-== Manage external identity providers for user-delegated MCP authentication
+== Browse OAuth providers
 
-The *OAuth Providers* page is your starting point.
+The *OAuth Providers* page lists every provider registered in your organization.
 The list shows the following columns:
 
 [cols="1,3"]
 |===
@@ -171,16 +171,19 @@ The provider appears in the *OAuth providers* list.
 
 == Register from the CLI
 
-Use the `rpk ai` plugin to script provider registration:
+Use `rpk ai` to script provider registration:
+
+// TODO(rpk-ai): aliases (oauth-provider, mcp-server, llm-provider) are kept for one release per cloudv2/apps/rpai/CLAUDE.md. Verified canonical command is `rpk ai oauth create` on 2026-05-05; flip examples cleanly to canonical names if the alias gets dropped.
 
 [source,bash]
 ----
-rpk ai oauth-provider create --name ramp \
-  --type oauth2 \
+rpk ai oauth create \
+  --name ramp \
+  --display-name "Ramp" \
+  --auth-endpoint "https://app.ramp.com/v1/authorize" \
+  --token-endpoint "https://api.ramp.com/developer/v1/token" \
   --client-id "$RAMP_CLIENT_ID" \
   --client-secret-ref RAMP_CLIENT_SECRET \
-  --auth-url "https://app.ramp.com/v1/authorize" \
-  --token-url "https://api.ramp.com/developer/v1/token" \
   --scopes "transactions:read,cards:read,users:read"
 ----
 
@@ -189,30 +192,45 @@ rpk ai oauth-provider create --name ramp \
 |Flag |Notes
 
 |`--name`
-|Resource name. Lowercase letters, numbers, hyphens. Immutable.
+|Resource name. Lowercase letters, numbers, hyphens. Immutable. Required.
+
+|`--display-name`
+|Human-readable display name shown in the UI. Required.
+
+|`--auth-endpoint`
+|OAuth authorization endpoint URL. Required.
 
-|`--type`
-|`oauth2` for the standard authorization-code flow.
+|`--token-endpoint`
+|OAuth token endpoint URL. Required.
 
 |`--client-id`
-|Client ID from the upstream OAuth app.
+|Client ID from the upstream OAuth app. Required.
 
 |`--client-secret-ref`
|Secret-store reference (`UPPER_SNAKE_CASE`).
 
-|`--auth-url`
-|Authorization endpoint.
-
-|`--token-url`
-|Token endpoint.
-
 |`--scopes`
 |Comma-separated scope list.
+
+|`--grant-types`
+|Grant types: `browser-consent` (default), `token-exchange`. Comma-separated.
+ +|`--token-auth-method` +|Token-endpoint authentication method: `client-secret-basic` (default), `client-secret-post`, `none`. + +|`--pkce` +|Require PKCE for authorization code grants. + +|`--revocation-endpoint` +|OAuth token revocation endpoint URL. + +|`--enabled` +|Whether the provider is enabled (default `true`). |=== == Attach to an MCP server -To attach an OAuth provider to an MCP server, the principal needs `dataplane_aigateway_oauthprovider_attach` on the named provider plus the usual `mcpserver_create` / `mcpserver_update` permission. See xref:create-server.adoc[Create an MCP Server] for the full attach flow and xref:user-delegated-oauth.adoc[User-delegated OAuth] for the consent flow that runs on first call. +To attach an OAuth provider to an MCP server, the principal needs `dataplane_aigateway_oauthprovider_attach` on the named provider plus the usual `mcpserver_create` / `mcpserver_update` permission. See xref:mcp:create-server.adoc[Create an MCP Server] for the full attach flow and xref:mcp:user-delegated-oauth.adoc[User-delegated OAuth] for the consent flow that runs on first call. 
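Concretely, the attach is expressed through the MCP server's auth config referencing the provider by its resource name. A hypothetical request-body fragment — only the `authConfig.userOauth.provider_name` path comes from the permission check described above; the surrounding shape is an assumption:

```json
{
  "authConfig": {
    "userOauth": {
      "provider_name": "ramp"
    }
  }
}
```

Setting or swapping this field is what triggers the `dataplane_aigateway_oauthprovider_attach` sub-resource check on `CreateMCPServer` and `UpdateMCPServer`.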
== Edit and rotate credentials @@ -255,11 +273,11 @@ Common symptoms and fixes: == Related topics -* xref:user-delegated-oauth.adoc[User-delegated OAuth] -* xref:create-server.adoc[Create an MCP Server] -* xref:managed/slack.adoc[Slack managed MCP] -* xref:managed/jira.adoc[Jira managed MCP] -* xref:managed/zendesk.adoc[Zendesk managed MCP] -* xref:managed/workday.adoc[Workday managed MCP] -* xref:managed/ironclad.adoc[Ironclad managed MCP] -* xref:managed/ramp.adoc[Ramp managed MCP] +* xref:mcp:user-delegated-oauth.adoc[User-delegated OAuth] +* xref:mcp:create-server.adoc[Create an MCP Server] +* xref:mcp:managed/slack.adoc[Slack managed MCP] +* xref:mcp:managed/jira.adoc[Jira managed MCP] +* xref:mcp:managed/zendesk.adoc[Zendesk managed MCP] +* xref:mcp:managed/workday.adoc[Workday managed MCP] +* xref:mcp:managed/ironclad.adoc[Ironclad managed MCP] +* xref:mcp:managed/ramp.adoc[Ramp managed MCP] diff --git a/modules/mcp/pages/register-remote.adoc b/modules/mcp/pages/register-remote.adoc index 897d1bb..b867c02 100644 --- a/modules/mcp/pages/register-remote.adoc +++ b/modules/mcp/pages/register-remote.adoc @@ -128,7 +128,6 @@ If the tools list is empty or stale, hit the *Refresh tools* action on the Overv |TLS errors when registering an `https://` URL |Confirm the server's certificate chains to a public CA (or the CA Redpanda's egress trusts). Self-signed certs aren't supported. -+ // TODO: confirm TLS / private CA story for the standalone ADP product surface. 
|`401 Unauthorized` from the upstream diff --git a/modules/mcp/pages/test-tools.adoc b/modules/mcp/pages/test-tools.adoc index 4c9eb45..7af74f6 100644 --- a/modules/mcp/pages/test-tools.adoc +++ b/modules/mcp/pages/test-tools.adoc @@ -6,7 +6,7 @@ :learning-objective-2: Inspect resources, prompts, and call history :learning-objective-3: Diagnose common errors (auth missing, scope upgrade required, transport mismatch) before pointing an agent at the server -Test your MCP server's glossterm:tool[,tools], glossterm:resource[,resources], and glossterm:prompt[,prompts] using the Inspector: a built-in MCP client in the ADP UI. It runs on the same JSON-RPC connection that agents use, so if a tool works in the Inspector, it will work for an agent. Use this after creating your server (xref:create-server.adoc[Create an MCP Server]) or whenever you change a tool's schema. +Test your MCP server's glossterm:tool[,tools], glossterm:resource[,resources], and glossterm:prompt[,prompts] using the Inspector: a built-in MCP client in the ADP UI. It runs on the same JSON-RPC connection that agents use, so a tool that works in the Inspector also works for an agent. Use this after creating your server (xref:mcp:create-server.adoc[Create an MCP Server]) or whenever you change a tool's schema. After completing this guide, you will be able to: @@ -84,7 +84,7 @@ The Session panel keeps a running history of every call you've made through the |Error |Meaning and fix |`OAuthConnectionRequired` -|User-delegated auth has no stored token for the calling user. Redpanda includes an `authorize_url` in the error detail; complete the consent flow per xref:user-delegated-oauth.adoc[User-delegated OAuth]. +|User-delegated auth has no stored token for the calling user. Redpanda includes an `authorize_url` in the error detail; complete the consent flow per xref:mcp:user-delegated-oauth.adoc[User-delegated OAuth]. |`OAuthTokenExpired` |Stored token is expired and refresh failed. 
Re-consent through *My Connections*. @@ -103,7 +103,7 @@ The Session panel keeps a running history of every call you've made through the == Test from the CLI -The `rpk ai` plugin offers a non-UI path for the same tool calls. Use it when scripting smoke tests or running checks from CI. +Use `rpk ai` for the same tool calls outside the UI, when scripting smoke tests or running checks from CI. [source,bash] ---- @@ -118,7 +118,7 @@ rpk ai mcp tools call --input '{"arg1":"value"}' rpk ai mcp get ---- -The CLI uses your active `rpk ai` profile for the gateway URL and authentication. See xref:ai-gateway:connect-agent.adoc[Connect your agent] for installation and profile setup. +The command resolves the gateway URL from your active rpk cloud profile and reads the cached `rpk cloud login` token. See xref:ai-gateway:connect-agent.adoc[Connect your agent] for installation and profile setup. == Out of scope diff --git a/modules/observability/pages/ingest-custom-traces.adoc b/modules/observability/pages/ingest-custom-traces.adoc index 982c37d..32b03b9 100644 --- a/modules/observability/pages/ingest-custom-traces.adoc +++ b/modules/observability/pages/ingest-custom-traces.adoc @@ -19,7 +19,7 @@ After reading this page, you will be able to: * A Redpanda Connect pipeline host (today: a Redpanda BYOC cluster with Connect enabled). Ability to manage secrets on that host. // TODO: Replace with the standalone-ADP ingestion target once defined (may no longer require a Redpanda Cloud cluster). 
-* The latest version of xref:manage:rpk/rpk-install.adoc[`rpk`] installed +* The latest version of xref:redpanda-cloud:manage:rpk/rpk-install.adoc[`rpk`] installed * Custom agent or application instrumented with OpenTelemetry SDK * Basic understanding of the https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-agent-spans/[OpenTelemetry span format^] and https://opentelemetry.io/docs/specs/otlp/[OpenTelemetry Protocol (OTLP)^] @@ -60,11 +60,11 @@ For non-LangChain applications or custom instrumentation, continue with the sect Custom agents are applications with OpenTelemetry instrumentation that operate independently of Redpanda's Remote MCP servers or declarative agents (such as LangChain, CrewAI, or manually instrumented applications). -When these agents send traces to `redpanda.otel_traces`, you gain unified observability alongside Remote MCP server and declarative agent traces. See xref:ai-agents:observability/concepts.adoc#cross-service-transcripts[Cross-service transcripts] for details on how traces correlate across services. +When these agents send traces to `redpanda.otel_traces`, you gain unified observability alongside Remote MCP server and declarative agent traces. See xref:observability:concepts.adoc#cross-service-transcripts[Cross-service transcripts] for details on how traces correlate across services. === Trace format requirements -Custom agents must emit traces in OTLP format. The xref:develop:connect/components/inputs/otlp_http.adoc[`otlp_http`] input accepts both OTLP Protobuf (`application/x-protobuf`) and JSON (`application/json`) payloads. For <>, use the xref:develop:connect/components/inputs/otlp_grpc.adoc[`otlp_grpc`] input. +Custom agents must emit traces in OTLP format. The xref:redpanda-connect:components:inputs/otlp_http.adoc[`otlp_http`] input accepts both OTLP Protobuf (`application/x-protobuf`) and JSON (`application/json`) payloads. For <>, use the xref:redpanda-connect:components:inputs/otlp_grpc.adoc[`otlp_grpc`] input. 
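To make the accepted payload shape concrete, here is a minimal single-span OTLP/JSON trace body of the kind an `otlp_http` input can receive. The service name, IDs, timestamps, and attribute values are illustrative assumptions, not values the pipeline requires:

```python
import json

# Minimal OTLP/JSON payload: hex-encoded trace/span IDs and
# stringified uint64 nanosecond timestamps follow the OTLP JSON
# encoding rules.
payload = {
    "resourceSpans": [{
        "resource": {"attributes": [
            # service.name identifies the transcript source in the UI
            {"key": "service.name",
             "value": {"stringValue": "my-custom-agent"}},
        ]},
        "scopeSpans": [{
            "scope": {"name": "example-instrumentation"},
            "spans": [{
                "traceId": "5b8aa5a2d2c872e8321cf37308d69df2",
                "spanId": "051581bf3cb55c13",
                "name": "invoke_agent my-assistant",
                "kind": 1,  # SPAN_KIND_INTERNAL
                "startTimeUnixNano": "1700000000000000000",
                "endTimeUnixNano": "1700000001000000000",
                "attributes": [
                    {"key": "gen_ai.agent.name",
                     "value": {"stringValue": "my-assistant"}},
                ],
            }],
        }],
    }],
}

# POST this with Content-Type: application/json to your pipeline's
# otlp_http endpoint (the endpoint URL depends on your deployment).
body = json.dumps(payload)
```

The same structure serialized as Protobuf (`application/x-protobuf`) is also accepted, and most OpenTelemetry SDK exporters produce it for you; hand-building JSON like this is mainly useful for smoke-testing the pipeline.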
Each trace must follow the OTLP specification with these required fields: @@ -96,7 +96,7 @@ Optional but recommended fields: - `parentSpanId` for hierarchical traces - `attributes` for contextual information -For complete trace structure details, see xref:ai-agents:observability/concepts.adoc#understand-the-transcript-structure[Understand the transcript structure]. +For complete trace structure details, see xref:observability:concepts.adoc#understand-the-transcript-structure[Understand the transcript structure]. == Configure the ingestion pipeline @@ -573,7 +573,7 @@ After your custom agent sends traces through the pipeline, they appear in the *T ==== Identify custom agent transcripts -Custom agent transcripts are identified by the `service.name` resource attribute, which differs from Redpanda's built-in services (`ai-agent` for declarative agents, `mcp-{server-id}` for MCP servers). See xref:ai-agents:observability/concepts.adoc#cross-service-transcripts[Cross-service transcripts] to understand how the `service.name` attribute identifies transcript sources. +Custom agent transcripts are identified by the `service.name` resource attribute, which differs from Redpanda's built-in services (`ai-agent` for declarative agents, `mcp-{server-id}` for MCP servers). See xref:observability:concepts.adoc#cross-service-transcripts[Cross-service transcripts] to understand how the `service.name` attribute identifies transcript sources. Your custom agent transcripts display with: @@ -581,7 +581,7 @@ Your custom agent transcripts display with: * **Agent name** in span details (from the `gen_ai.agent.name` attribute) * **Operation names** like `"invoke_agent my-assistant"` indicating agent executions -For detailed instructions on filtering, searching, and navigating transcripts in the UI, see xref:ai-agents:observability/transcripts.adoc[View Transcripts]. 
+For detailed instructions on filtering, searching, and navigating transcripts in the UI, see xref:observability:transcripts.adoc[View Transcripts]. ==== Token usage tracking @@ -619,7 +619,7 @@ If requests succeed but traces do not appear in `redpanda.otel_traces`: == Next steps -* xref:ai-agents:observability/transcripts.adoc[] -* xref:ai-agents:agents/monitor-agents.adoc[Observability for declarative agents] -* xref:develop:connect/components/inputs/otlp_http.adoc[OTLP HTTP input reference] -* xref:develop:connect/components/inputs/otlp_grpc.adoc[OTLP gRPC input reference] +* xref:observability:transcripts.adoc[] +* xref:agents:monitor.adoc[Observability for declarative agents] +* xref:redpanda-connect:components:inputs/otlp_http.adoc[OTLP HTTP input reference] +* xref:redpanda-connect:components:inputs/otlp_grpc.adoc[OTLP gRPC input reference] diff --git a/modules/observability/pages/transcripts.adoc b/modules/observability/pages/transcripts.adoc index 0a2b19e..7d2f033 100644 --- a/modules/observability/pages/transcripts.adoc +++ b/modules/observability/pages/transcripts.adoc @@ -8,7 +8,7 @@ Use the Transcripts view to read a complete record of an agent or MCP server execution, turn by turn. Each transcript captures the conversation between the user, the agent, any LLM calls, and any tools it invoked, along with token usage, USD cost, latency, and any errors. -For conceptual background on the underlying OpenTelemetry data model, see xref:ai-agents:observability/concepts.adoc[]. +For conceptual background on the underlying OpenTelemetry data model, see xref:observability:concepts.adoc[]. After reading this page, you will be able to: @@ -41,7 +41,7 @@ Each row in the list represents one execution (one trace). Columns include: * *USD cost* — total cost for the execution, derived from per-model pricing. See <> if this column shows `0`. * *Duration* — wall-clock time between the first and last span. 
-A transcript marked _reconstructed_ is one in which some turns were rebuilt from LLM message context after the original spans were evicted from `redpanda.otel_traces`. See xref:ai-agents:observability/concepts.adoc#history-reconstruction[Reconstructed transcript history] for what that means. +A transcript marked _reconstructed_ is one in which some turns were rebuilt from LLM message context after the original spans were evicted from `redpanda.otel_traces`. See xref:observability:concepts.adoc#history-reconstruction[Reconstructed transcript history] for what that means. // TODO: Confirm final column list on the GA Console UI. Today's labels likely shift. Verify against adp-production before merge. @@ -102,7 +102,7 @@ Turns are listed in order by role: * *ASSISTANT* — a response from the LLM. Shows the model, input/output token counts, USD cost for that turn, and latency. If the assistant turn called a tool, its tool calls are nested underneath. * *TOOL* — a tool invocation. Shows the tool name, the arguments passed, the result, and the latency of the call. -Any turn may carry the `is_reconstructed` marker. Reconstructed turns preserve role order and the high-level content of the conversation but do not carry per-turn token counts, latency, or tool-call arguments. See xref:ai-agents:observability/concepts.adoc#history-reconstruction[Reconstructed transcript history] for the mechanics. +Any turn may carry the `is_reconstructed` marker. Reconstructed turns preserve role order and the high-level content of the conversation but do not carry per-turn token counts, latency, or tool-call arguments. See xref:observability:concepts.adoc#history-reconstruction[Reconstructed transcript history] for the mechanics. === Errors @@ -148,7 +148,7 @@ If the failure happened during a tool call, the error is attached to the TOOL tu == Limitations * Large time windows sample the list to keep the UI responsive. 
The exact transcript you need may not be in the current page; narrow the time range or add filters. -* Reconstructed turns do not carry token counts, latency, or tool-call arguments for the reconstructed range. For byte-level fidelity, lower the ingestion lag or extend `redpanda.otel_traces` retention (see xref:ai-agents:observability/concepts.adoc#opentelemetry-traces-topic[How Redpanda stores trace data]). +* Reconstructed turns do not carry token counts, latency, or tool-call arguments for the reconstructed range. For byte-level fidelity, lower the ingestion lag or extend `redpanda.otel_traces` retention (see xref:observability:concepts.adoc#opentelemetry-traces-topic[How Redpanda stores trace data]). * USD cost is only populated for models covered by the pricing table. // TODO: List which providers/models are priced at GA and what users see for un-priced ones (`0`, `null`, or an explicit "unknown" marker). // TODO: If the GA Console UI ships transcript export, document the entry point and output format here; otherwise omit. @@ -165,7 +165,7 @@ A transcript stays in `RUNNING` until the root span closes. Common causes: === USD cost shows 0 -`TranscriptUsage.usd_cost` is populated by the cost-reporting pipeline from the `gen_ai.usage.*` attributes on each LLM-call span combined with a per-model pricing table. For the full list of cost-bearing attributes (including the explicit USD-cost fields), see xref:ai-agents:observability/concepts.adoc#key-attributes-by-layer[Key attributes by layer]. +`TranscriptUsage.usd_cost` is populated by the cost-reporting pipeline from the `gen_ai.usage.*` attributes on each LLM-call span combined with a per-model pricing table. For the full list of cost-bearing attributes (including the explicit USD-cost fields), see xref:observability:concepts.adoc#key-attributes-by-layer[Key attributes by layer]. // TODO: Document which providers/models are priced at GA. 
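The derivation described above reduces to multiplying the `gen_ai.usage.*` token counts on each LLM-call span by per-model unit prices, summed over the transcript. A minimal sketch under an assumed pricing table — the prices and model names here are illustrative, not Redpanda's actual table:

```python
# Hypothetical per-model prices in USD per token: (input, output).
# Illustrative assumptions only, not the real pricing table.
PRICING = {
    "example-model": (3e-06, 1.5e-05),
}

def usd_cost(model: str, span_attrs: dict) -> float:
    """Cost of one LLM-call span from its gen_ai.usage.* attributes."""
    prices = PRICING.get(model)
    if prices is None:
        return 0.0  # unpriced models surface as a 0 in the USD cost column
    in_price, out_price = prices
    return (span_attrs.get("gen_ai.usage.input_tokens", 0) * in_price
            + span_attrs.get("gen_ai.usage.output_tokens", 0) * out_price)

# A transcript's total is the sum over its LLM-call spans.
turns = [
    ("example-model", {"gen_ai.usage.input_tokens": 1000,
                       "gen_ai.usage.output_tokens": 200}),
    ("unpriced-model", {"gen_ai.usage.input_tokens": 500,
                        "gen_ai.usage.output_tokens": 50}),
]
total = sum(usd_cost(model, attrs) for model, attrs in turns)
```

Note how the second turn contributes nothing: a model missing from the table yields `0.0` rather than an error, which is why a transcript that clearly used tokens can still show a `0` cost.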
If cost is `0` for a transcript that clearly used tokens, check: diff --git a/modules/observability/partials/transcripts-ui-guide.adoc b/modules/observability/partials/transcripts-ui-guide.adoc index 39fba4a..fb1daa6 100644 --- a/modules/observability/partials/transcripts-ui-guide.adoc +++ b/modules/observability/partials/transcripts-ui-guide.adoc @@ -23,8 +23,8 @@ // Valid values: "agent" | "mcp" // // DEPENDENCIES: -// - xref:ai-agents:observability/concepts.adoc#agent-trace-hierarchy[] -// - xref:ai-agents:observability/concepts.adoc#mcp-server-trace-hierarchy[] +// - xref:observability:concepts.adoc#agent-transcript-hierarchy[] +// - xref:observability:concepts.adoc#mcp-server-transcript-hierarchy[] // // CONTENT TYPE: // UI navigation and interface explanation (procedural context for how-to pages) @@ -66,10 +66,10 @@ The trace list shows nested operations with visual duration bars indicating how // Link to appropriate concepts section based on context ifeval::["{context}" == "agent"] -For details on span types, see xref:ai-agents:observability/concepts.adoc#agent-trace-hierarchy[Agent trace hierarchy]. +For details on span types, see xref:observability:concepts.adoc#agent-transcript-hierarchy[Agent transcript hierarchy]. endif::[] ifeval::["{context}" == "mcp"] -For details on span types, see xref:ai-agents:observability/concepts.adoc#mcp-server-trace-hierarchy[MCP server trace hierarchy]. +For details on span types, see xref:observability:concepts.adoc#mcp-server-transcript-hierarchy[MCP server transcript hierarchy]. endif::[] ==== Summary panel @@ -91,6 +91,6 @@ ifeval::["{context}" == "mcp"] * Service: The MCP server identifier endif::[] -If any turns were rebuilt from LLM message context after their original spans were evicted, the panel shows a _reconstructed_ marker on those turns. For the mechanics, see xref:ai-agents:observability/concepts.adoc#history-reconstruction[Reconstructed transcript history]. 
+If any turns were rebuilt from LLM message context after their original spans were evicted, the panel shows a _reconstructed_ marker on those turns. For the mechanics, see xref:observability:concepts.adoc#history-reconstruction[Reconstructed transcript history]. // TODO: Re-verify this field list against the GA Console UI on adp-production. Beta labels may shift; update wording before GA.