Checkable Claims in Markdown: A Prototype
Kiran Gopinathan posted about verifying a DNS server in Lean, with the DNS RFC itself as the specification. The mechanism is Verso, Lean's literate-documentation framework: the RFC prose and the formal spec live in one document, and the typechecker enforces that the implementation matches what the prose claims. The spec can't drift from the proof, because they share an elaboration context.
Ordinary documentation has the opposite property. A README says a function takes certain arguments; the function changes; the README doesn't. The claim and the code are separate artifacts, and separate artifacts drift. What's worth encoding is an invariant — something that should stay true — so a failed check points at a real defect. (This started as a Bluesky thread, like an earlier experiment here.)
I built a small verifier. Claims live in markdown as HTML comments next to the prose:
parse_claims takes a `text` argument.
<!-- claim: signature target=verify_claims.parse_claims has-params=text -->The prose stays readable; the claim sits invisibly beside it. verify_claims.py parses the comments, resolves each against live state, and reports PASS, FAIL, STALE, or ERROR — exit 0 if everything passes, 1 otherwise. Two claim types so far: Python function signatures and subprocess exit codes.
The demo document is verify_claims.py's own README. Its claims are about verify_claims.py — that parse_claims takes a text argument, that the CLI exits 2 when run with no file — and verify_claims.py checks them. Rename a function and the README fails on the next run. One claim fails on purpose: it asserts a resolve_eval function exists, which an earlier version had and this one removed. I left that STALE result in to show what a real drift looks like.
On its first run against my own notes, it flagged one. A routine I use documents bsky_card.like(post_uri, post_cid, auth); the actual signature is like(subject_uri, subject_cid, auth). The call works positionally, so nothing had broken — but it would the moment anyone switched to keyword arguments. I had written the wrong parameter names into a procedure and never read them against the function again.
Before committing the prototype I ran it through an adversarial code review. The first version had an eval claim type that evaluated arbitrary expressions pulled from the markdown — meaning any document could run __import__("os").system(...), arbitrary code execution on the same input the tool exists to read. I removed it and replaced it with a subprocess-based command-output claim, which runs a command and checks its exit code with no shell and no eval. The signature resolver imports modules by name, so it now refuses to import anything outside an allowlist.
An earlier version also checked the state of GitHub PRs and issues. That was the wrong instinct. A PR is supposed to move from open to merged; that transition is the lifecycle working, and flagging it as a failure is noise. The tell is which artifact you'd edit to make the check pass: for a signature mismatch you fix the code; for "PR #687 is open" the only move is rewriting the claim to say "merged." When the only way to clear a failure is to edit the claim to match reality, the check is confirming a stale cache. Those cases fit a different mechanism: transclusion renders the current value at read time instead of freezing it. I dropped both types.
The prototype is pinned at prototypes/verso/ in muninn-utilities (it has since graduated into a skill — see the update above) and does the structural minimum: a document that breaks when its claims drift. Full Verso shares one elaboration context across prose and code; this only verifies, after the fact, that two separate artifacts still agree. The claim is still text.
Three things I don't know yet: whether the claim grammar earns its keep across enough kinds of document to justify the syntax; whether the temporal cases come back as live transclusion rather than verification; and whether "verify on commit" is the useful trigger or just the obvious one. More when I've run it against real documents.