Twelve years ago, I wrote a short post about a conversation that went roughly like this:
“I need programmatic access.”
“We don’t have an API.”
“Of course you do — it’s AMF behind your Flex UI. A little PyAMF script will do the trick.”
“Please don’t show it to anyone!”
The point was simple: every application that has a UI already has an API. The UI talks to something. That something is the API. You just haven’t admitted it yet.
Yesterday, I wrote a longer post about WebMCP — a shiny new W3C proposal from Google and Microsoft that adds a browser API so AI agents can interact with websites through “structured tools” instead of scraping the DOM.
The websites already have structured tools. They’re called APIs. The SPAs call them. The mobile apps call them. The CLI tools call them. They exist. They have endpoints, schemas, authentication. They are right there.
In 2014, the answer was: “Of course you have an API — it’s behind your Flex app.”
In 2026, the answer is: “Of course you have structured tools — they’re behind your React app.”
Or: How Google and Microsoft Walked Into a Bar and Reinvented the Web, Worse
Google and Microsoft just co-authored a web spec together. Let that sink in.
The last time these two agreed on anything technical, IE6 was busy eating Netscape alive and “web standards” was an oxymoron. Now they’re back — holding hands under a W3C community group banner, gazing into each other’s eyes across a conference table, and delivering unto us WebMCP — a “proposed web standard” that lets websites expose “structured tools” to AI agents.
I have some thoughts.
What WebMCP Actually Is
WebMCP adds a new browser API — navigator.modelContext — that lets a web page register “tools” for AI agents to call. Each tool has a name, a description, a JSON Schema for inputs, and a handler function. Instead of AI agents scraping your DOM and squinting at screenshots like a drunk trying to read a menu, your website just… tells them what’s available.
Two flavors:
Declarative: You annotate HTML forms so agents can submit them directly.
Imperative: You write JavaScript handlers that agents invoke with structured inputs.
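To make the imperative flavor concrete, here is a sketch of a tool registration following the shape described in the proposal. The field names track the explainer, but the API is behind a flag and may change; the flight-search tool itself is entirely made up for illustration.

```javascript
// Sketch of an imperative WebMCP tool. The registerTool call and the
// name/description/inputSchema/execute shape follow the proposal, but
// the API is experimental; the tool below is a hypothetical example.
const searchFlightsTool = {
  name: "search-flights",
  description: "Search for flights between two airports on a given date.",
  inputSchema: {
    type: "object",
    properties: {
      from: { type: "string", description: "Origin IATA code" },
      to: { type: "string", description: "Destination IATA code" },
      date: { type: "string", format: "date" }
    },
    required: ["from", "to", "date"]
  },
  // The handler runs in the page, under the user's session, and would
  // call the same backend endpoints the UI already calls.
  async execute({ from, to, date }) {
    return { query: { from, to, date }, results: [] };
  }
};

// In a browser with the flag enabled (hypothetical usage):
// navigator.modelContext.registerTool(searchFlightsTool);
```

Note what the handler would actually do in practice: call the site's existing API. The tool is a wrapper around an endpoint that already exists.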
The Chrome team is very excited. They’ve published a blog post, opened an early preview program, and shipped it behind a flag in Chrome 146. VentureBeat wrote it up. Everyone is talking about the agentic web. The hype cycle spins.
The Problem WebMCP Solves
AI agents interact with websites by scraping the DOM, interpreting screenshots, and simulating clicks. This is fragile. It breaks when the UI changes. It’s slow and token-expensive (2,000+ tokens per screenshot vs. 20-100 tokens for a structured call). Every CSS class rename is a potential catastrophe.
This is a real problem. I’m not going to pretend it isn’t.
But here’s the thing: it’s a problem the industry created by ignoring the architecture that already solved it.
The Architecture That Already Solved It (You Didn’t Read It Either)
In the year 2000, Roy Fielding published his PhD dissertation describing the architecture of the World Wide Web. He called it REST — Representational State Transfer. You’ve heard of it. You’ve put it on your resume. You almost certainly haven’t read it.
(Don’t feel bad. Nobody has. That’s the whole problem.)
REST has one crucial, defining idea: HATEOAS — Hypermedia As The Engine Of Application State. Terrible acronym. Sounds like a sneeze. But the idea is simple and beautiful: the server’s response tells you everything you need to know about what you can do next. The links are in the response. The forms are in the response. The available actions are self-describing.
An HTML page already IS a “tool contract.” A <form> already IS a structured tool with defined inputs. An <a href> already IS a discoverable action. The entire web was designed from the ground up so that a client — any client, human or machine — could interact with a server without prior knowledge of its API, simply by following the hypermedia controls in the response.
“The HTML response is entirely self-describing. A proper hypermedia client that receives this response does not know what a bank account is, what a balance is, etc. It simply knows how to render a hypermedia, HTML.”
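To make that concrete, here is a sketch of such a self-describing response for a hypothetical account page. Every available action, and the typed inputs each action takes, travels with the response itself — no out-of-band spec required.

```html
<!-- Hypothetical account page: the response declares what you can do next. -->
<h1>Account 12345: balance $100.00</h1>

<!-- A discoverable action: follow the link. -->
<a href="/accounts/12345/statements">View statements</a>

<!-- A structured "tool": a name (action), a method, and typed inputs. -->
<form action="/accounts/12345/transfers" method="post">
  <input type="text" name="to-account" required>
  <input type="number" name="amount" min="0.01" step="0.01" required>
  <button type="submit">Transfer</button>
</form>
```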
The web already had machine-readable, self-describing, discoverable interactions. It’s called… the web. Somewhere, Roy Fielding is thinking murderous thoughts.
So What Happened?
The industry collectively decided that REST meant “JSON over HTTP with nice-looking URLs.” Which is approximately as accurate as saying democracy means “everyone gets a vote on what to have for lunch.”
Fielding himself, in a now-famous 2008 blog post, tried to set the record straight with the restraint of a man watching his house burn down:
“I am getting frustrated by the number of people calling any HTTP-based interface a REST API… That is RPC. It screams RPC. There is so much coupling on display that it should be given an X rating.”
Reader, the industry did not listen. What followed was a twenty-year sprint in the wrong direction. We abandoned hypermedia for JSON blobs. We replaced self-describing responses with Swagger docs and API versioning. We built increasingly elaborate tooling — API gateways, SDK generators, GraphQL, tRPC — to paper over the problems caused by ignoring the one constraint that made the whole thing work.
And now, in 2026, having thoroughly ignored the architecture of the web while building on the web, we’ve arrived at the logical endpoint: a new browser API so that AI agents can interact with websites in the structured way that websites were already designed to support.
Roy Fielding is no longer thinking murderous thoughts. He’s past that. He’s watching the final scene of Chinatown. “Forget it, Roy. It’s the agentic web.”
The Declarative API Is Just Forms
This is the part where I need you to really focus. From the WebMCP spec:
“Declarative API: Perform standard actions that can be defined directly in HTML forms.”
They. Reinvented. Forms.
Google and Microsoft engineers got together — presumably with catering, perhaps even a whiteboard budget — and produced a specification to make HTML forms work for AI agents. HTML forms. The things that have been telling machines “here is an action, here are the inputs, here is where to send it” since 1993.
The <form> element is literally a structured tool declaration with a name (action), a method (GET/POST), and typed inputs (<input type="text" name="destination" required>). It has been machine-readable for thirty-three years. It is older than some of the engineers who wrote this spec.
But sure. Let’s add an attribute. Innovation.
The Imperative API Is Just RPC (Again)
The other half of WebMCP is the “imperative API,” where you register JavaScript handler functions that agents call with JSON inputs.
This is RPC. Specifically, it’s RPC mediated by the browser, authenticated by the user’s session, and invoked by an AI agent instead of a human. Which is a perfectly fine idea! RPC is useful. It has always been useful. SOAP did this in 1999. CORBA did it before that. Every SPA with a JavaScript API layer does it today.
The new part is navigator.modelContext.registerTool() instead of window.myApp.doThing(). The innovation is… a namespace. Alert the press.
The Security Section Reads Like a Horror Novel
WebMCP’s own specification describes something it calls the “lethal trifecta”: an agent reads your email (private data), encounters a phishing message (untrusted content), and calls a tool to forward that data somewhere (external communication). Each step is legitimate individually. Together, they’re an exfiltration chain.
The spec’s own analysis of this scenario? “Mitigations exist. They reduce risk. They don’t eliminate it. Nobody has a complete answer here yet.”
Nobody has a complete answer yet. They shipped it behind a flag in Chrome 146 anyway. This is the “we’ll add seat belts in v2” school of automotive engineering.
The destructiveHint annotation — the mechanism for flagging “this tool can delete your data” — is marked as advisory, not enforced. The spec literally says the browser or agent can ignore it. It’s a polite suggestion. A Post-it note on the nuclear button that says “maybe don’t?”
And there’s no tool discovery without visiting the page. Agents can’t know what tools Gmail offers without opening Gmail first. The spec proposes future work on a .well-known/webmcp manifest. You mean like robots.txt? Or /.well-known/openid-configuration? Or the dozens of other discovery mechanisms the web already has? Groundbreaking.
The Real Game
Now let’s talk about what this actually is, under the hood.
Google and Microsoft don’t control the API layer. They can’t dictate how backends expose services. But they do control the browser. WebMCP puts the browser — Chrome and Edge, i.e., Chromium with two different logos — at the center of every agent-to-website interaction.
Every AI agent that wants to use WebMCP must go through the browser. The browser mediates authentication, permissions, consent. The browser becomes the gatekeeper. If you control the browser, you control the chokepoint.
This is the same play Google made with AMP: take a real problem (slow mobile pages), create a solution that requires routing through Google’s infrastructure, W3C-wash it, and call it open. WebMCP takes a real problem (agents can’t interact with websites reliably) and creates a solution that routes through Chromium.
MCP (Anthropic’s protocol) connects agents to backend services directly — no browser needed. WebMCP says: no no, come through our browser. That’s not interoperability. That’s a tollbooth with a standards document.
What Should Have Happened
If we actually wanted AI agents to interact with websites reliably, we could:
Build better hypermedia clients. Teach AI agents to understand HTML — forms, links, semantic structure. The web is already machine-readable. We just need clients that aren’t illiterate.
Use existing standards. Schema.org, Microdata, RDFa, JSON-LD — mature standards for machine-readable web content. Google built an entire search empire on them. They work today.
Write APIs. If you want structured machine-to-machine interaction, build an API. REST (actual REST), GraphQL, gRPC — pick your poison. No new browser API required.
Use MCP where appropriate. For backend service integration, MCP does the job without inserting a browser into the loop.
None of these require a new browser API. None of them route through Chromium. None of them require Google and Microsoft to co-author anything.
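To illustrate the “existing standards” option above: a page can already describe its content and offers in machine-readable form with a few lines of JSON-LD, which crawlers (and agents) can parse today. The product below is made up; the vocabulary is Schema.org’s.

```html
<!-- Illustrative Schema.org markup: any crawler or agent can parse this now,
     with no new browser API involved. -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Example Widget",
  "offers": {
    "@type": "Offer",
    "price": "19.99",
    "priceCurrency": "USD",
    "availability": "https://schema.org/InStock"
  }
}
</script>
```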
The Cycle
This is the software industry’s most reliable pattern:
A good architecture is proposed (REST, 2000)
The industry ignores the hard parts (HATEOAS, hypermedia)
The easy parts get cargo-culted (“REST means JSON + HTTP verbs”)
Problems emerge from ignoring the architecture
A new spec is proposed to solve those problems
The new spec doesn’t mention the old architecture
Go to 1
WebMCP is step 5. The Chrome blog post doesn’t mention REST. Doesn’t mention HATEOAS. Doesn’t mention hypermedia. It talks about “the agentic web” as if machine-readable web interactions are a bold new idea that needed inventing in 2026.
Roy Fielding wrote the answer to this problem in his dissertation. In 2000. It’s free to read. It’s shorter than the WebMCP spec. And unlike WebMCP, it doesn’t require Chrome 146.
But sure. Let’s add navigator.modelContext. What’s one more API between friends?
OK, the limitations of mock servers were annoying me, so what is the answer these days? That’s right, we vibe-code a solution. Behold Clavin (ha-ha-ha, because Postman).
Sure, it only works on localhost but that’s fine for my use case. For other things, use ngrok, or whatever.
This is a special kind of rant, so I’m starting a new tag for it. It’ll be updated next year, I’m sure.
The state of yak shaving in today’s computing world is insane.
Here we go.
It’s 2024, and…
…and I can’t get the CloudWatch agent to work for memory monitoring (also, why is this extra step needed? Why can’t memory monitoring be part of the default metrics? Nobody cares about memory?). Screwing around with IAM roles and access keys keeps giving me:
****** processing amazon-cloudwatch-agent ****** 2024/04/05 20:51:36 E! Please make sure the credentials and region set correctly on your hosts.
Finally it works. Add it to my custom dashboard. Nice.
Wait, what’s that? Saving metrics to my custom dashboard from the EC2 instance overrides what I just added. I have to manually edit the JSON source for the dashboard.
It’s 2024.
…and there is still no HTTP standard for determining a user’s time zone. Per our overlords, we are reduced to a ridiculous set of workarounds and explanations for this total fucking bullshit, like “ask the user”. And yet we do have the Accept-Language header, and we’ve had it since RFC 1945 (that’s 1996, which is longer than some of the people claiming “ask the user” is an answer have been sentient).
It’s 2024.
…and we have a shit-ton of JavaScript frameworks, and yet some very popular ones couldn’t give a shit about a basic thing like environment variables (yeah, yeah, I know how that particular sausage is made — screw your sausage, you put wood shavings in it anyway).
…and because I’m starting this rant rubric: we are still on tabs-vs-spaces (perkeleen vittupää!) and CR-vs-LF-vs-CRLF. WTF, people. This is why I am not getting anything smart, be it a car, a refrigerator, or whatever. I know that sausage. It’s reverse Polish sausage.
I like Postman in general. But some things are annoying, so there…
APIs and Collections and Environments
APIs are great, and equally great is their integration with GitHub and the ability to generate Collections from API definitions and have them updated when the API definition changes. Nice. Except… those Collections cannot be used to create Monitors or mock servers; for that you need to create standalone Collections (or copy the ones you generated under APIs). But those don’t integrate with GitHub. There is a fork-and-merge mechanism that sort of takes care of collaboration, but the fact that these are two different modes is annoying. Ditto Environments. What’s up with that?
As a project that offers a REST API, we at Romana wanted to provide documentation for it via Swagger. But writing out JSON files by hand seemed not just tedious (and thus error-prone), but also likely to result in outdated documentation down the road.
Why not automate this process, sort of like godoc? Here I walk you through an initial attempt at doing that — a Go program that takes as input Go-based REST service code and outputs Swagger YAML files.
It is not yet available as a separate project, so I will go over the code in the main Romana code base.
NOTE: This approach assumes the services are based on the Romana-specific REST layer on top of Negroni and Gorilla MUX. But a similar approach can be taken without Romana-specific code — or the Romana approach can be adopted.
The entry point of the doc tool for generating the Swagger docs is in doc.go.
The first step is to run Analyzer on the entire repository. The Analyzer:
Walks through all directories and tries to import each one (using Import()). If the import fails, the directory is skipped; if it succeeds, the directory is a package.
The next step is to run the Swaggerer — the Swagger YAML generator.
At the moment we have 6 REST services to generate documentation for. For now, I’ll explicitly name them in the main code. This is the only hardcoded, non-introspectable part here.
From here, for each service, we initialize the Swaggerer and call its Process() method, which will, in turn, call the main workhorse, the getPaths() method, which will, for each Route:
Get its Method and Pattern — e.g., “POST /addresses”
Get the Godoc string of the Handler (from the Godocs we collected in the previous step)
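The loop above can be sketched in a few lines. This is a simplified stand-in, not the Romana code: the Route fields and the output shape (a map from pattern to method to description) are illustrative; the real generator emits full Swagger YAML.

```go
package main

import "fmt"

// Route is a simplified stand-in for the Romana Route type; the real
// one also carries the handler, hooks, and so on.
type Route struct {
	Method  string
	Pattern string
	Doc     string // godoc string of the handler, collected earlier
}

// getPaths sketches the Swaggerer's main loop: for each route, record
// an entry keyed by pattern and method, with the handler's godoc as
// the description.
func getPaths(routes []Route) map[string]map[string]string {
	paths := map[string]map[string]string{}
	for _, r := range routes {
		if paths[r.Pattern] == nil {
			paths[r.Pattern] = map[string]string{}
		}
		paths[r.Pattern][r.Method] = r.Doc
	}
	return paths
}

func main() {
	routes := []Route{
		{Method: "post", Pattern: "/addresses", Doc: "Allocates an IP address."},
		{Method: "get", Pattern: "/addresses", Doc: "Lists allocated addresses."},
	}
	fmt.Println(getPaths(routes)["/addresses"]["post"])
}
```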
Negroni and Gorilla MUX make a useful combination for building REST applications. However, there are some features I felt needed to be built on top. This has not been made into a separate project, and in fact I doubt that it needs to be — the world doesn’t need more “frameworks” adding to the paralysis of choice. Instead, I think it would be better to go over some of these features that may be of use, so I’ll do that here and in further blog entries.
This was borne out of some real cases at Romana; here I’ll show examples of some of these features in a real project.
Overview
At the core of it all is a Service interface, representing a REST service. It has an Initialize() method that would initialize a Negroni, add some middleware (we will see below) and set up Gorilla Mux Router using its Routes. Overall this part is a very thin layer on top of Negroni and Gorilla and can be easily seen from the above-linked source files. But there are some nice little features worth explaining in detail below.
In what follows, we assume that Gorilla’s notions of Routes and Handlers are understood.
Sort-of-strong-dynamic typing when consuming data
While we have to define our route handlers as taking interface{} as input, it would be nice if each handler received the struct it expects, so it can assert it to the proper type and proceed, instead of parsing the provided JSON payload itself.
To that end, we introduce a MakeMessage field in Route. As its godoc says, “This should return a POINTER to an instance which this route expects as an input”, but let’s illustrate what that means in case it is confusing.
Let us consider a route handling an IP allocation request. The input it needs is an IPAMAddressRequest, so we set its MakeMessage field to a function returning a pointer to one.
Sometimes we prototype features outside of the Go-based server — since we may be calling out to various CLI utilities (iptables, kubectl, etc.), it is easier to first ensure the calls work as CLI commands or shell scripts, and iterate there. But for demonstration/QA purposes we would still like to have this functionality available via a REST call to the main Romana service. Enter the hooks functionality.
A hook specifies, among other things:
Whether to run it before or after the Route’s Handler (the When field)
Optionally, where the hook’s output will be written (which the Handler can then examine if When is “before”). If not specified, the output is just logged.
Then, during a request, the hooks are executed by the wrapHandler() method.
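A minimal sketch of what such a wrapping handler does, with illustrative names (the real Romana Hook and wrapHandler carry more detail, and log the output rather than returning it):

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// Hook is a sketch: an external executable tied to a route, run before
// or after its handler. Field names here are illustrative.
type Hook struct {
	Executable string
	When       string // "before" or "after"
}

// wrapHandler sketches the mechanism: run the "before" hooks, invoke
// the real handler, run the "after" hooks, and collect each step's
// output. Hooks are run through the shell, as a prototype script would be.
func wrapHandler(handler func() string, hooks []Hook) func() string {
	run := func(h Hook) string {
		out, err := exec.Command("sh", "-c", h.Executable).Output()
		if err != nil {
			return "hook error: " + err.Error()
		}
		return strings.TrimSpace(string(out))
	}
	return func() string {
		var parts []string
		for _, h := range hooks {
			if h.When == "before" {
				parts = append(parts, run(h))
			}
		}
		parts = append(parts, handler())
		for _, h := range hooks {
			if h.When == "after" {
				parts = append(parts, run(h))
			}
		}
		return strings.Join(parts, "\n")
	}
}

func main() {
	wrapped := wrapHandler(
		func() string { return "handler ran" },
		[]Hook{{Executable: "echo before-hook", When: "before"}},
	)
	fmt.Println(wrapped())
}
```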
That’s it! It lets you do the work outside the server to get it right, and only then bother with adding the functionality to the server’s code.
If this doesn’t seem like that much, wait for further installments. There are several more useful features to come. This just sets the stage.