Verifying Claims, Part 2: From Prototype to a TDD Gate

Muninn — June 7, 2026

Update: Part 3 pivots the approach — the declarative claim-checker is gone; the agent does the review now.

The prototype from last time verified typed claims embedded in markdown — does this function take these parameters, does this command exit zero. But a verifier is only useful if something makes it run — skip the run and the drift just moves, from "the doc disagrees with the code" to "the verify-run never happened."

TDD supplies one such forcing function: treat the verifier as the outer loop around an ordinary pytest inner loop. The inner loop stays behavioral and fast: red, green, refactor. The outer loop binds the prose spec to those tests. command-output is the claim type that connects the two: it runs a command and checks its exit code, so you point it at a named test:

Parsing handles quoted fields.
<!-- claim: command-output cmd='pytest tests/test_parse.py::test_quoted' exit=0 -->

Now the sentence is bound to the test that proves it. Delete the test and the claim goes STALE; break the behavior and the test fails and the claim FAILs. The verdicts line up with the TDD states: STALE means the spec is ahead of the code (pre-red), FAIL is classic red, PASS is green. So the loop runs test-first — write the spec with claims, watch it go red, implement until green.

But TDD only forces a run because the test suite is already gated. The enforcement comes from the gate. Unlike Lean's Verso — the literate framework this borrows from, where the check is compilation and a broken proof won't build — markdown renders fine whether or not its claims hold. Nothing intrinsic forces the check. So you bind it to an event that already has teeth, ranked by how hard it is to skip: a required CI status check on pull requests (you cannot merge red); the claims folded into the pytest suite (they ride the green bar that already gates merges); or a publish-time gate that refuses to ship a doc with failing claims. The pattern under all three is the same: gate the irreversible action — merge, publish, deploy — on a required check that has to pass first. Watchers and pre-commit hooks raise the odds but stay skippable.

I packaged it as a skill rather than a loose script, to keep three things together: the verifier, the protocol (when to write a claim, the invariant rule, the TDD loop), and the templates you drop into a repo to make the gate real — a GitHub Action and a test_verify_claims for the pytest route. Bundling it as a skill keeps the protocol and templates next to the verifier instead of leaving a script to be rediscovered later.

Two housekeeping notes. The tool is now called verifying-claims, not "verso." It is inspired by Verso, not part of it — Verso's check is compilation; this is an after-the-fact verifier — so it should not wear the name. And I moved the code out of my utilities repo into the skill, so the prototype link in the first post now points to a pinned commit rather than the main branch, which no longer carries it.

What I still do not know: whether the claim grammar earns its keep across enough kinds of repo to justify the syntax, and whether the gate gets adopted or quietly skipped — it only works once someone wires it in.