What WebMCP actually is

On 19-20 May 2026, Google used the I/O developer keynote to introduce WebMCP — a proposed open web standard that lets a website declare its own actions as structured tools a browser-based AI agent can call. It is the headline plumbing announcement of an I/O cycle dominated by agents: Antigravity 2.0, Gemini Spark and a new stable Android CLI for AI agents all landed at the same event.

The pitch is simple. Today, when an agent like Spark or Antigravity opens your web app, it has to reverse-engineer what your site can do by scraping the DOM, guessing at CSS selectors and clicking through brittle visual affordances. WebMCP flips that around: the page itself tells the agent, in machine-readable form, which JavaScript functions and which HTML forms are real tools, what they accept, and what they return. The agent does not have to read your UI — it reads your manifest.

  • Origin: announced as a Google proposal at I/O 2026 on 19-20 May.
  • Shape: declared tools are either plain JavaScript functions exposed via a manifest, or HTML <form> elements annotated with WebMCP-style attributes.
  • Consumers: Gemini Spark in Chrome, Antigravity 2.0 agents, and other browser-resident agents that adopt the discovery format.
  • Companion track: a separate "Modern Web Guidance for AI coding agents" effort, with 100+ use cases shipped at preview, gives Antigravity and Gemini Spark a body of patterns to ground tool calls against.
  • Standards status: proposal-stage only. A W3C or WHATWG pathway has not yet been made public.
Pro tip

Treat WebMCP exactly as you treated JSON-LD in 2014. Ship a thin pilot on your two highest-intent pages — pricing, booking, account — keep the binding behind a single adapter file, and you will absorb the spec churn without rewrites.

Why this matters now

A B2C SaaS team in Bangalore selling a payroll tool, and a fintech in London onboarding new SME accounts, both have the same shape of problem the moment a customer's browser agent shows up on their site. The agent wants to complete a job — "process this month's run", "open a business account for this entity" — and it has two choices. It can drive the visible UI with vision-and-click, which is slow, expensive and breaks every time you ship a CSS change. Or it can call a tool the page itself has advertised. WebMCP is what makes the second option exist.

That changes how you scope agent traffic. Today an agent visit looks like noisy automation against your front end. Once WebMCP is live, an agent visit looks like an authenticated API call inside the user's own session — your CSRF tokens, cookies, MFA state, regional consent flags, all of it carries through automatically. You don't have to ship a parallel API surface, you don't have to issue separate keys, you don't have to design a separate auth model for agents. The browser is the integration.

WebMCP vs MCP vs REST — when to use which

The naming gets confusing fast, so it helps to lay them out side by side. Anthropic's Model Context Protocol (MCP) is server-side: a separate process exposes tools over JSON-RPC to any LLM agent that connects. REST is the API layer most builders already ship. WebMCP is what Google has just proposed for the browser.

Dimension REST API Anthropic MCP WebMCP
Where it runs Your backend Dedicated MCP server process Inside the rendered web page
How the agent finds tools OpenAPI / docs / scraping tools/list JSON-RPC call Page manifest + DOM attributes
Auth model API keys or OAuth tokens Server-issued credentials The user's existing browser session
Who consumes it Any HTTP client Server-side LLM agents Browser-based agents (Spark, Antigravity, etc.)
Where state lives Your database MCP server + backend The DOM, plus your backend
Standardisation Mature (HTTP, OpenAPI) Open spec, multi-vendor adoption Proposal-stage at Google I/O 2026
Right use case Headless service-to-service work Server-side agent backbones User-driven jobs in a browser

The honest reading: WebMCP does not replace REST or MCP. It is the layer that sits between the rendered web app and a browser-resident agent. If your product has a public web UI that a customer might one day visit with Gemini Spark steering the session, WebMCP is the cheapest path to making that visit useful instead of frustrating.

From the desk

"We run a B2C onboarding flow used heavily in Bangalore and a separate SME credit dashboard for our London book. Both have a 'kick off the application' button that sits behind three forms. With WebMCP we can declare those three forms as one tool — startOnboarding — and the agent completes in seconds what currently takes six clicks. Same code, same auth, no new API."

— Builder Desk · AI Tech Connect

A worked example: exposing a form as a WebMCP tool

The spec is still in flux, so what follows is illustrative rather than canonical — pin the exact attribute names to Google's I/O 2026 developer documentation once you start shipping. The shape, though, is settled: a tool is a name, a JSON-schema description of its inputs, and either a JavaScript handler or a backing HTML form.

Start with the HTML side. Take an existing booking form and annotate it.

<!-- Existing form, now WebMCP-discoverable -->
<form id="book-demo"
      data-webmcp-tool="bookDemoSlot"
      data-webmcp-description="Book a 30-minute product demo slot"
      action="/api/demo/book" method="post">

  <label for="email">Work email</label>
  <input id="email" name="email" type="email"
         data-webmcp-arg="email" required>

  <label for="slot">Preferred slot (ISO 8601)</label>
  <input id="slot" name="slot" type="datetime-local"
         data-webmcp-arg="slot" required>

  <button type="submit">Book demo</button>
</form>

Then publish the manifest. One file at /.well-known/webmcp.json tells any visiting agent which tools the page exposes and which fields map to which arguments.

{
  "version": "0.1",
  "tools": [
    {
      "name": "bookDemoSlot",
      "description": "Book a 30-minute product demo slot",
      "scope": "page",
      "input_schema": {
        "type": "object",
        "properties": {
          "email": { "type": "string", "format": "email" },
          "slot":  { "type": "string", "format": "date-time" }
        },
        "required": ["email", "slot"]
      },
      "binding": { "type": "html-form", "form_id": "book-demo" }
    },
    {
      "name": "cancelBooking",
      "description": "Cancel a previously booked demo by reference id",
      "scope": "page",
      "input_schema": {
        "type": "object",
        "properties": {
          "booking_id": { "type": "string" }
        },
        "required": ["booking_id"]
      },
      "binding": { "type": "js-function", "fn": "window.aitc.cancelBooking" }
    }
  ]
}

That is the full minimum. The agent fetches the manifest, sees two tools, calls bookDemoSlot with structured arguments, and your existing form submission handler does the work. cancelBooking is a plain JavaScript function on the page that the agent invokes directly — useful for actions that have no visible form to fill.

Good

Pin tool scope tightly. A scope: "page" tool only exists while the user is on that URL. That keeps the attack surface small and stops a misbehaving agent from invoking destructive actions from an unrelated tab.

Where it still breaks

  1. The spec is not finalised. Attribute names, manifest schema and discovery semantics will shift between today and the W3C or WHATWG track that Google has not yet announced. Keep your bindings behind an adapter file so a rename is a one-day job.
  2. Auth assumptions are implicit. Tools inherit the user's session, which is usually what you want — but if your form mutates server-side state, you still need rate limits, CSRF and a destructive-action confirmation step on the agent side. The protocol does not give you those for free.
  3. Cross-browser support is partial. At launch, expect Chrome and Antigravity-driven agents first. Safari, Firefox and third-party browser agents will follow only once the standards pathway is public.
  4. Multi-step flows still need orchestration. Declaring three forms as three tools is fine. Declaring "complete onboarding" as one tool that triggers all three is where most of the design work actually lives. Plan for compound tools.
  5. Telemetry is your job. Browser agents will hammer some tools and ignore others. You will want a clear event for every WebMCP invocation, distinguished from human form submissions, so you can size the agent traffic honestly.
Don't do this

Do not expose an unguarded deleteAccount or transferFunds tool just because the manifest is convenient. Destructive actions need an explicit confirmation contract that the agent must surface to the user — never assume the agent will get this right by default.

Shipping agent surfaces and want a sounding board?

Every article on AI Tech Connect is written by a Verified Builder. Browse profiles, shortlist who you want to hire or collaborate with.

Browse Builders →

So — should you ship WebMCP this quarter?

It depends on where your traffic is coming from and how tolerant your team is of spec churn.

  • Ship a pilot now if you already see browser-agent traffic in your access logs, if your two highest-intent flows are forms (booking, sign-up, account opening), or if your product roadmap has agent UX as a 2026 commitment.
  • Wait one cycle if your product is API-only, if your front end is mid-rewrite, or if your compliance team needs a published standards-body owner before they will sign off on any new browser surface.
  • Split the difference by keeping a thin webmcp-adapter.js in your codebase and exposing one safe, idempotent, read-only tool — for example, "list available demo slots" — to learn the shape of agent traffic without exposing any state changes.

The Google I/O 2026 developer highlights are at blog.google. Pair this with our earlier coverage of Gemini Spark, Antigravity 2.0 and the Android CLI for AI agents for the full picture of what Google shipped at this I/O.