Adopt WebMCP for Agent -> UI communication #35

@MiguelsPizza

Now that web UIs can be served over MCP and rendered inside host applications, we need a standard way for agents to communicate directly with those embedded UIs. I'm proposing WebMCP for this.

WebMCP is a W3C-incubated specification (https://github.com/webmachinelearning/webmcp) that lets websites expose JavaScript functions as tools. It's separate from the MCP protocol itself but follows similar patterns adapted for browser contexts.

This is the WebMCP demo that was shown at TPAC this year.

Current API draft:

window.navigator.modelContext.registerTool({
  // tool definition structure aligns with MCP TypeScript SDK
})
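
To make the draft call above more concrete, here is a minimal sketch of what a registration might look like. The field names (`name`, `description`, `inputSchema`, `execute`) and the result shape are assumptions modeled on MCP tool definitions, since the WebMCP draft is still changing. `ModelContextLike`, `stubContext`, and `parsePosition` are hypothetical helpers so the sketch can run outside a browser. The tool name is borrowed from the demo described further down.

```typescript
// Assumed shapes, modeled on MCP tool definitions. The WebMCP draft may differ.
interface ToolResult {
  content: { type: "text"; text: string }[];
}
interface ToolDefinition {
  name: string;
  description: string;
  inputSchema: Record<string, unknown>; // JSON Schema describing the arguments
  execute: (args: Record<string, unknown>) => Promise<ToolResult>;
}
interface ModelContextLike {
  registerTool(tool: ToolDefinition): void;
}

// Hypothetical in-memory stand-in so the sketch also runs outside a browser.
const registered = new Map<string, ToolDefinition>();
const stubContext: ModelContextLike = {
  registerTool: (tool) => {
    registered.set(tool.name, tool);
  },
};

// Prefer the browser-provided object when it exists, otherwise use the stub.
const modelContext: ModelContextLike =
  (globalThis as any).navigator?.modelContext ?? stubContext;

// Validate the single argument: a board index from 0 to 8.
function parsePosition(args: Record<string, unknown>): number {
  const position = args["position"];
  if (
    typeof position !== "number" ||
    !Number.isInteger(position) ||
    position < 0 ||
    position > 8
  ) {
    throw new Error("position must be an integer from 0 to 8");
  }
  return position;
}

modelContext.registerTool({
  name: "tictactoe_move",
  description: "Place a mark on the tic-tac-toe board.",
  inputSchema: {
    type: "object",
    properties: { position: { type: "integer", minimum: 0, maximum: 8 } },
    required: ["position"],
  },
  execute: async (args) => {
    const position = parsePosition(args);
    return { content: [{ type: "text", text: `moved to cell ${position}` }] };
  },
});
```

Keeping validation in a plain function makes the handler easy to test without a browser or an agent in the loop.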

In this demo app, an MCP app that declares WebMCP tools in its client-side JavaScript is served into an iframe. When the iframe loads, an MCP server connects to the host application over a postMessage transport similar to the one in the current spec (but inverted). From the host's perspective, the WebMCP tools look like any other tool source; they are just consumed over a different transport.
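
The "same as any tool source" point can be sketched as follows. This is an illustration under stated assumptions, not the demo's actual code: `ToolSource`, `LocalSource`, `TransportSource`, `Host`, the `Message` shapes, and `createPortPair` are all hypothetical names, and the linked in-memory ports stand in for `window.postMessage` between the host page and the iframe.

```typescript
// Hypothetical message shapes; not taken from the MCP or WebMCP specs.
type Message =
  | { kind: "register"; name: string; description: string }
  | { kind: "call"; id: number; name: string; args: unknown }
  | { kind: "result"; id: number; value: unknown };

interface Port {
  post(msg: Message): void;
  onMessage: (msg: Message) => void;
}

// Two linked ports standing in for postMessage between host and iframe.
function createPortPair(): [Port, Port] {
  const a: Port = { post: (m) => b.onMessage(m), onMessage: () => {} };
  const b: Port = { post: (m) => a.onMessage(m), onMessage: () => {} };
  return [a, b];
}

// The host only ever talks to this interface.
interface ToolSource {
  listTools(): string[];
  callTool(name: string, args: unknown): Promise<unknown>;
}

// A source backed by plain in-process functions.
class LocalSource implements ToolSource {
  private tools: Record<string, (args: unknown) => unknown>;
  constructor(tools: Record<string, (args: unknown) => unknown>) {
    this.tools = tools;
  }
  listTools(): string[] {
    return Object.keys(this.tools);
  }
  async callTool(name: string, args: unknown): Promise<unknown> {
    const fn = this.tools[name];
    if (!fn) throw new Error(`unknown tool: ${name}`);
    return fn(args);
  }
}

// A source whose tools live on the other side of a message port.
class TransportSource implements ToolSource {
  private names: string[] = [];
  private pending = new Map<number, (value: unknown) => void>();
  private nextId = 1;
  private port: Port;
  constructor(port: Port) {
    this.port = port;
    port.onMessage = (msg) => {
      if (msg.kind === "register") this.names.push(msg.name);
      if (msg.kind === "result") {
        this.pending.get(msg.id)?.(msg.value);
        this.pending.delete(msg.id);
      }
    };
  }
  listTools(): string[] {
    return [...this.names];
  }
  callTool(name: string, args: unknown): Promise<unknown> {
    return new Promise((resolve) => {
      const id = this.nextId++;
      this.pending.set(id, resolve); // store the resolver before sending
      this.port.post({ kind: "call", id, name, args });
    });
  }
}

class Host {
  private sources: ToolSource[] = [];
  addSource(source: ToolSource): void {
    this.sources.push(source);
  }
  listTools(): string[] {
    const all: string[] = [];
    for (const s of this.sources) all.push(...s.listTools());
    return all;
  }
  callTool(name: string, args: unknown): Promise<unknown> {
    const source = this.sources.find((s) => s.listTools().includes(name));
    if (!source) return Promise.reject(new Error(`unknown tool: ${name}`));
    return source.callTool(name, args);
  }
}

const [hostPort, appPort] = createPortPair();
const host = new Host();
host.addSource(new LocalSource({ echo: (args) => args }));
host.addSource(new TransportSource(hostPort));

// App side (inside the iframe): answer calls, then announce the tool.
appPort.onMessage = (msg) => {
  if (msg.kind === "call" && msg.name === "tictactoe_reset") {
    const board = new Array<null>(9).fill(null);
    appPort.post({ kind: "result", id: msg.id, value: { board } });
  }
};
appPort.post({ kind: "register", name: "tictactoe_reset", description: "Clear the board" });
```

After the app registers, `host.listTools()` contains both the local tool and the iframe's tool, and `host.callTool` routes each call without knowing which transport sits behind it.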

Proof-of-concept with MCP-UI & WebMCP: https://mcp-ui.mcp-b.ai/

webmcp_tictac_game_compressed.mp4

Proof-of-concept architecture:

sequenceDiagram
    participant AI as Agent
    participant MCP as MCP Server
    participant Chat as Chat UI (MCP Host)
    participant App as MCP App (with WebMCP Tools)

    AI->>MCP: Call tool "showTicTacToeGame"
    MCP-->>Chat: Return UI resource (iframe URL)
    Chat->>App: Load iframe & establish transport
    App->>Chat: Register "tictactoe_move" tool
    App->>Chat: Register "tictactoe_reset" tool
    Chat-->>AI: Tools now available
    AI->>Chat: Call "tictactoe_move" with position
    Chat->>App: Execute tool via postMessage
    App-->>Chat: Return game state
    Chat-->>AI: Tool result with updated state
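
The last four steps of the diagram (the move call, execution in the app, and the game state returned to the agent) could look roughly like this on the app side. This is a sketch under assumptions: the `GameState` shape, the alternating-turn rule, and the helper names (`newGame`, `winnerOf`, `move`, `toToolResult`) are hypothetical and are not taken from the demo's source.

```typescript
type Mark = "X" | "O";
type Cell = Mark | null;

// Hypothetical state shape returned to the agent after every move.
interface GameState {
  board: Cell[];
  next: Mark;
  winner: Mark | "draw" | null;
}

// All eight winning lines on a 3x3 board, as cell indices.
const LINES: [number, number, number][] = [
  [0, 1, 2], [3, 4, 5], [6, 7, 8],
  [0, 3, 6], [1, 4, 7], [2, 5, 8],
  [0, 4, 8], [2, 4, 6],
];

function newGame(): GameState {
  return { board: new Array<Cell>(9).fill(null), next: "X", winner: null };
}

function winnerOf(board: Cell[]): Mark | "draw" | null {
  for (const [a, b, c] of LINES) {
    const mark = board[a];
    if (mark && mark === board[b] && mark === board[c]) return mark;
  }
  return board.every((cell) => cell !== null) ? "draw" : null;
}

// Handler logic for a "tictactoe_move" call: returns a new state rather than
// mutating, so the result can be forwarded to the host unchanged.
function move(state: GameState, position: number): GameState {
  if (state.winner) throw new Error("game is already over");
  if (!Number.isInteger(position) || position < 0 || position > 8) {
    throw new Error("position must be an integer from 0 to 8");
  }
  if (state.board[position] !== null) {
    throw new Error(`cell ${position} is already taken`);
  }
  const board = state.board.slice();
  board[position] = state.next;
  return {
    board,
    next: state.next === "X" ? "O" : "X",
    winner: winnerOf(board),
  };
}

// Wrap the state as MCP-style text content (assumed result shape).
function toToolResult(state: GameState) {
  return { content: [{ type: "text" as const, text: JSON.stringify(state) }] };
}

// X takes cells 0, 1, 2 and completes the top row on the fifth move.
let state = newGame();
for (const position of [0, 3, 1, 4, 2]) {
  state = move(state, position);
}
console.log(JSON.stringify(toToolResult(state)));
```

Returning the full state on every call means the agent does not need a separate "read board" tool to stay in sync with the UI.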

More information: https://docs.mcp-b.ai/concepts/mcp-ui-integration

The spec currently needs feedback on iframe integration: should iframes surface tools directly to the embedding host, and what should the security and discovery model look like? Please leave thoughts there if you are interested in discussing the WebMCP web standard.
