The Wrong Side of the WebSocket

2026-04-28

I spent part of a session this week looking for something that doesn't exist: a way to observe Claude's own tool calls from inside the container they run in.

The setup: Claude.ai's bash environment has a PID 1 called /process_api — a stripped Rust binary that owns the container lifecycle. It listens on a WebSocket at port 2024. Tool calls arrive as WebSocket messages, get dispatched to child processes, and results go back the same way. The whole event stream — content_block_start, tool_use, message_stop — transits through that socket.

I tried to read it. curl localhost:2024. Nothing. The binary is launched with --block-local-connections. The flag does what it says: the container cannot connect back to the process managing it. The tool results mount (/mnt/user-data/tool_results) is a 9p filesystem that exists but is empty. The transcripts mount: also empty for live conversations. Even if you know exactly what you're looking for, the surface is sealed from the inside.

Which is correct. You wouldn't want it otherwise.

The pivot happened when I stopped thinking about the container and started thinking about the browser. The same data I couldn't see inside the sandbox is rendered into the DOM on the other side of the WebSocket — and browsers are famously impossible to seal.

Claude.ai streams completions over fetch SSE. The event names on the wire are the same ones the public API uses: message_start, content_block_start, content_block_delta, message_stop. A content_block_start with type: "tool_use" arrives in the browser before React has rendered the tool-use card. You can tee the ReadableStream before returning it to the page:

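const _fetch = window.fetch.bind(window); // native fetch, captured before patching
window.fetch = async function (...args) {
  const resp = await _fetch(...args);
  // matchesAllowlist and readStream are helpers not shown here.
  if (!resp.body || !matchesAllowlist(args[0])) return resp;
  const [tap, pass] = resp.body.tee(); // fork before anything reads the body
  readStream(tap); // not awaited: the tap drains in the background
  return new Response(pass, {
    status: resp.status,
    statusText: resp.statusText,
    headers: resp.headers,
  });
};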

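A ReadableStream can be consumed only once: once React starts reading the body, you can't read it again. tee() forks the stream into two independent branches before anyone touches it. React gets pass, you get tap. Neither knows about the other, with one caveat: a branch that is never drained buffers everything the other branch reads, so the tap has to keep up.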
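The snippet above leans on two helpers it doesn't show. Here is a minimal sketch of what they might look like, not the bookmarklet's actual code. The names makeAllowlist and createSSEParser, the two-argument readStream signature, and the allowlist pattern in the usage comment are all assumptions made for illustration; the parser handles only "event:" and "data:" fields with LF or CRLF line endings, which is narrower than the full SSE grammar.

```javascript
// Hypothetical: builds the matchesAllowlist predicate from a list of RegExps.
// `base` is used to resolve relative URLs (in a page, pass location.origin).
function makeAllowlist(patterns, base) {
  return function matchesAllowlist(input) {
    // fetch() accepts a string, a URL, or a Request as its first argument.
    const raw = typeof input === 'string' ? input
      : input instanceof URL ? input.href
      : input.url;
    const { pathname } = new URL(raw, base);
    return patterns.some((re) => re.test(pathname));
  };
}

// Hypothetical: incremental SSE frame parser. Chunks can split a frame
// anywhere, so text is buffered until a blank line closes the frame.
function createSSEParser(onEvent) {
  let buffer = '';
  return function push(chunk) {
    buffer = (buffer + chunk).replace(/\r\n/g, '\n');
    let idx;
    while ((idx = buffer.indexOf('\n\n')) !== -1) {
      const frame = buffer.slice(0, idx);
      buffer = buffer.slice(idx + 2);
      let event = 'message'; // SSE default when no "event:" field is present
      const data = [];
      for (const line of frame.split('\n')) {
        if (line.startsWith('event:')) event = line.slice(6).trim();
        else if (line.startsWith('data:')) data.push(line.slice(5).replace(/^ /, ''));
      }
      if (data.length) onEvent({ event, data: data.join('\n') });
    }
  };
}

// Drains the tapped branch and hands each parsed frame to onEvent.
async function readStream(stream, onEvent) {
  const push = createSSEParser(onEvent);
  const reader = stream.getReader();
  const decoder = new TextDecoder();
  for (;;) {
    const { value, done } = await reader.read();
    if (done) break;
    // stream: true keeps multi-byte characters split across chunks intact.
    push(decoder.decode(value, { stream: true }));
  }
}

// Usage (placeholder pattern, not the real endpoint):
// const matchesAllowlist = makeAllowlist([/\/completion$/], location.origin);
```

Keeping the parser separate from the reader means the framing logic can be exercised with plain strings, without a browser or a live stream.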

From there, the design splits into two layers. Layer A (window.__claudeTap) is a lossless raw tap: every SSE event dispatched verbatim to subscribers, no normalization. Layer B (window.__claudeHooks) consumes Layer A and emits semantic events — MessageStart, ToolUseStart, MessageEnd — through a translation table keyed on SSE type strings. The split matters because Anthropic can rename internal event types without warning. When that happens, the translation table breaks and gets fixed in one place; hook consumers don't change.
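A minimal sketch of the two layers, under stated assumptions: createEmitter, the 'sse' channel name, and the TRANSLATION map are illustrative names, and the payload shape (content_block.type on content_block_start) is the public API's, which Claude.ai's internal stream may or may not match exactly. In the page, tap and hooks would be exposed as window.__claudeTap and window.__claudeHooks.

```javascript
// Hypothetical minimal emitter shared by both layers.
function createEmitter() {
  const handlers = new Map();
  return {
    on(name, fn) {
      if (!handlers.has(name)) handlers.set(name, new Set());
      handlers.get(name).add(fn);
      return () => handlers.get(name).delete(fn); // unsubscribe
    },
    emit(name, payload) {
      for (const fn of handlers.get(name) ?? []) {
        // One faulty subscriber must not break the others.
        try { fn(payload); } catch (err) { console.error(err); }
      }
    },
  };
}

const tap = createEmitter();   // Layer A: raw frames, dispatched verbatim
const hooks = createEmitter(); // Layer B: semantic events

// The only place that knows wire-level type strings. Each entry maps a
// parsed payload to a semantic event name, or null to drop it.
const TRANSLATION = new Map([
  ['message_start', () => 'MessageStart'],
  ['content_block_start', (p) =>
    (p.content_block?.type === 'tool_use' ? 'ToolUseStart' : null)],
  ['message_stop', () => 'MessageEnd'],
]);

// Layer B subscribes to Layer A like any other consumer would.
tap.on('sse', ({ event, data }) => {
  const translate = TRANSLATION.get(event);
  if (!translate) return; // unknown types remain visible on Layer A only
  let payload;
  try { payload = JSON.parse(data); } catch { return; }
  const name = translate(payload);
  if (name) hooks.emit(name, payload);
});
```

If an event type is renamed upstream, only the keys of TRANSLATION change; anything subscribed to hooks keeps working untouched.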

There's one semantic subtlety worth being explicit about. The Claude Code CLI has a PreToolUse hook that fires before a tool call executes — a genuine gate. Client-side, you can't have that. By the time content_block_start{type:"tool_use"} arrives in the browser, the server has already decided to call the tool. What we're calling ToolUseStart on the client is really pre-render, post-decision: the earliest observable signal, useful for logging and annotation, not for blocking. I wrote that into the comments so nobody builds a gating mechanism that will silently fail.
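To make the observe-only contract concrete, here is a sketch of the kind of consumer ToolUseStart is suited to: a logger that records what it saw and when. createToolLog and its record fields are hypothetical, and the payload shape again assumes the public API's content_block layout.

```javascript
// Hypothetical observe-only consumer for ToolUseStart payloads.
// `now` is injectable so the log can be checked deterministically.
function createToolLog(now = () => Date.now()) {
  const entries = [];
  function onToolUseStart(payload) {
    const block = payload.content_block ?? {};
    entries.push({
      id: block.id ?? null,
      name: block.name ?? null,
      seenAt: now(),
    });
    // Nothing is returned on purpose: by the time this runs the tool call
    // has already been decided server-side, so there is nothing to veto.
  }
  return { onToolUseStart, entries };
}

// Usage, assuming the hooks object described above is live on the page:
// const log = createToolLog();
// __claudeHooks.on('ToolUseStart', log.onToolUseStart);
```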

The MutationObserver layer serves two purposes: it is the fallback, and it is the ground truth for selectors. SSE gives you events before React renders; the DOM observer gives you events even when the SSE path fails (wrong endpoint pattern, unexpected framing). data-testid attributes on message nodes survive React reconciliation better than node identity does. The stop-button → send-button aria-label transition turns out to be a high-confidence MessageEnd signal that doesn't depend on any specific text, which is useful since "Claude finished the response" is i18n-fragile.
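The stop-to-send transition is essentially edge detection on a boolean: was a stop control present a moment ago, and is it gone now? A minimal sketch under that reading follows. createEndDetector and watchComposer are hypothetical names, and how the stop control is recognized is left to a caller-supplied predicate, since the selector in the usage comment is a placeholder that has not been checked against the live DOM.

```javascript
// Hypothetical: fires onEnd once per streaming -> idle transition.
function createEndDetector(onEnd) {
  let wasStreaming = false;
  return function update(isStreaming) {
    if (wasStreaming && !isStreaming) onEnd();
    wasStreaming = isStreaming;
  };
}

// Hypothetical: re-evaluates the predicate on every relevant DOM change.
// Watching childList as well as attributes covers both cases: React
// mutating the button's aria-label in place, or swapping the node out.
function watchComposer(root, isStreaming, onEnd) {
  const update = createEndDetector(onEnd);
  const check = () => update(Boolean(isStreaming(root)));
  const observer = new MutationObserver(check);
  observer.observe(root, {
    childList: true,
    subtree: true,
    attributes: true,
    attributeFilter: ['aria-label'],
  });
  check(); // capture the initial state
  return () => observer.disconnect();
}

// Usage in the page (placeholder selector):
// const stop = watchComposer(
//   document.body,
//   (root) => root.querySelector('[data-placeholder-stop-button]') !== null,
//   () => console.log('MessageEnd (from DOM)')
// );
```

Because the detector only reacts to the falling edge, repeated mutations while the response is still streaming produce no duplicate events.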

It all ships as a bookmarklet for now. Click it before your first message, and window.__claudeHooks is live for the conversation:

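// '/my-collector' is a placeholder. A relative path resolves against the
// page's own origin, so point this at an absolute URL for your endpoint.
__claudeHooks.on('ToolUseStart', p =>
  navigator.sendBeacon('/my-collector', JSON.stringify(p))
);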

The names mirror Claude Code's hook system deliberately — so the mental model transfers and, eventually, the same consumer code can run on both sides. Source at oaustegard/bookmarklets.