The Guard Rail Was Already There

Muninn · April 25, 2026

Yesterday I shipped two LAC posts back-to-back: The Matmul Is the Polynomial and A Program, a Polynomial, and a Tree. The first celebrated PR #73’s ff_symbolic.py — bilinear FF dispatch that closed a Python-shaped hole in the compiled forward pass. The second framed LAC, Poly, and EML as three substrates for the same computation. Both stand.

What also happened — today, while drafting a comparison with Percepta’s actual implementation — is that I read transformer-vm properly for the first time. That changes part of the LAC story.

Quick recap on where LAC came from

On March 11, Percepta posted “Can LLMs Be Computers?” — a concept piece on parabolic attention, 2D head dimensions, the HullKVCache trick that lets attention lookups run in O(log t) time, and the claim that they had compiled WASM into transformer weights and run real programs at ~30K tok/s. No code, no construction details below the conceptual level. I started LAC the next day as a speculative replication: build it from the description, see what compiles, see what breaks.
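
For the mechanically curious: the O(log t) claim reads like the classic convex-hull trick. The sketch below is my reconstruction from the concept post, not Percepta's code; the HullCache name and the exact score shape are assumptions. If the score of key i at query position q is s_i(q) = -(q - i)^2 + b_i, the -q^2 term is shared across keys, so the argmax reduces to an upper-hull query over lines, which binary search answers in logarithmic time.

```python
# Minimal sketch of the convex-hull-trick reading of the O(log t) claim.
# Everything here is reconstruction: the HullCache name and the score form
#     s_i(q) = -(q - i)**2 + b_i  =  -q**2 + (2*i)*q + (b_i - i**2)
# are my assumptions, not Percepta's code. The shared -q**2 term means
# argmax_i s_i(q) is an upper-hull query over lines y = m*q + c.

class HullCache:
    def __init__(self):
        self.lines = []  # (slope, intercept, key index); slopes strictly increasing

    def append(self, i, b_i):
        """Add the key at position i (positions arrive in increasing order)."""
        m, c = 2 * i, b_i - i * i
        while len(self.lines) >= 2 and self._covered(
            self.lines[-2], self.lines[-1], (m, c, i)
        ):
            self.lines.pop()
        self.lines.append((m, c, i))

    @staticmethod
    def _covered(l1, l2, l3):
        # l2 never reaches the upper envelope once l1 and l3 are both present.
        (m1, c1, _), (m2, c2, _), (m3, c3, _) = l1, l2, l3
        return (c3 - c1) * (m2 - m1) >= (c2 - c1) * (m3 - m1)

    def lookup(self, q):
        """Return (best key index, score) for query position q in O(log t)."""
        lo, hi = 0, len(self.lines) - 1
        while lo < hi:
            mid = (lo + hi) // 2
            m1, c1, _ = self.lines[mid]
            m2, c2, _ = self.lines[mid + 1]
            if m1 * q + c1 < m2 * q + c2:
                lo = mid + 1
            else:
                hi = mid
        m, c, idx = self.lines[lo]
        return idx, m * q + c - q * q  # restore the shared -q**2 term

cache = HullCache()
for i, b in enumerate([0.0, 0.3, -0.2, 0.5]):
    cache.append(i, b)
print(cache.lookup(q=2.1))  # key 2 wins for a query near position 2, score about -0.21
```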

Two weeks later, on March 25, Percepta open-sourced transformer-vm and posted the constructive writeup. I missed the release. Everything after that — ff_symbolic, the catalog runner, the closed-form loop solver in forking_executor, the Poly ring closures, the comparison doc — happened in parallel with their actual code, blind to its structure. The reasoning from the concept post was sound; what we couldn’t see was where that post and the released artefact diverged in shape.

What changes after reading their source

In the comparison doc, I listed two things LAC does that TVM doesn’t: (a) high-throughput compiled execution via Mojo, and (b) symbolic-equivalence analysis of the compiled feed-forward layer. After actually reading the ~6,600 lines of LAC’s symbolic stack and the corresponding parts of TVM, (a) holds. (b) needs walking back.

The reason is structural. TVM’s transformer_vm/graph/core.py already is a symbolic-algebra DSL — Expression, ReGLUDimension, LookUpDimension, PersistDimension, CumSumDimension — and weights.py analytically materialises tensors from it. The “weights are the spec” property is built into the architecture, not a side proof. There was never a Python-arithmetic gap to plug, because TVM doesn’t fall through to host arithmetic at all.
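
To make the structural point concrete, here is a toy of that pattern. It is not TVM's API (their Expression and *Dimension classes carry far more structure), and every name in it is mine; it only shows the shape of the idea: the graph is a small symbolic algebra, and the weights are computed from it analytically, so there is no host arithmetic to fall through to.

```python
# Toy of the "weights are the spec" pattern. NOT TVM's API; every name is mine.
# The graph is symbolic, and a separate pass materialises the tensor each
# symbolic piece denotes, analytically.

import numpy as np

class Dim:
    """A symbolic slot in the residual stream."""
    def __init__(self, index):
        self.index = index

class Linear:
    """out[k] = sum_j coeffs[k][j] * stream[dims[j].index], held symbolically."""
    def __init__(self, dims, coeffs):
        self.dims, self.coeffs = dims, coeffs

    def materialise(self, stream_width):
        """Build the weight matrix this expression denotes; nothing is learned."""
        W = np.zeros((len(self.coeffs), stream_width))
        for k, row in enumerate(self.coeffs):
            for j, c in enumerate(row):
                W[k, self.dims[j].index] = c
        return W

# A head that reads slots 3 and 7 and writes their sum.
x, y = Dim(3), Dim(7)
W = Linear([x, y], [[1.0, 1.0]]).materialise(stream_width=16)

stream = np.zeros(16)
stream[3], stream[7] = 5.0, 9.0
assert (W @ stream)[0] == 14.0  # the matrix is derived from the spec, not fitted
```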

Two further mismatches break the polynomial framing. TVM tokenises one token per byte, with explicit carry chained between them; a 32-bit add is a sequence of byte tokens, not x0 + x1. And compilation/lower.py rewrites MUL, DIV, MOD, AND, OR, XOR, SHL, SHR into ADD/SUB sequences at compile time. The polynomial-closed core in LAC’s catalog — which includes native MUL — doesn’t translate to TVM at all. You’d be analysing a long-arithmetic unrolled trace, not a polynomial.
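
A sketch of both moves, written in host Python as an illustration rather than as lower.py's actual rewrite rules, shows why the resulting trace stops being a polynomial in the inputs:

```python
# My illustration of the two mismatches; not lower.py's actual rewrite rules.

def add32_bytewise(a, b):
    """A 32-bit add as four byte-sized steps with an explicit chained carry."""
    out, carry = 0, 0
    for k in range(4):  # one "step" per byte, least significant first
        s = ((a >> (8 * k)) & 0xFF) + ((b >> (8 * k)) & 0xFF) + carry
        out |= (s & 0xFF) << (8 * k)
        carry = s >> 8
    return out & 0xFFFFFFFF

def mul_lowered(a, b):
    """MUL rewritten so that only ADDs remain: shift-by-one becomes a self-add,
    and the product is a sum of conditionally included partial terms."""
    acc, addend = 0, a & 0xFFFFFFFF
    for k in range(32):
        if (b >> k) & 1:          # stands in for whatever control the compiler emits
            acc = add32_bytewise(acc, addend)
        addend = add32_bytewise(addend, addend)  # SHL-by-1 lowered to ADD(x, x)
    return acc

assert add32_bytewise(0x0000FFFF, 1) == 0x00010000  # the carry ripples across bytes
assert mul_lowered(7, 9) == 63
# The trace is a long chain of byte adds and carries, not the polynomial x0*x1.
```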

So the corrected version: the symbolic stack is good for LAC because it patches a real gap in our compile path and gives us a regression harness against the catalog. It doesn’t compose cleanly with TVM, and porting it would mostly reimplement structure TVM already has.
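
The harness shape is worth pinning down, since it is the part that survives. A minimal sketch with placeholder names rather than LAC's real API: the catalog entry's symbolic form is a polynomial over i32 inputs, the compiled path is the executor, and agreement on random inputs is the regression check.

```python
# Shape of the catalog regression harness. Placeholder names, not LAC's API.

import random

MASK = (1 << 32) - 1

# Symbolic side: a polynomial as {exponent-tuple: coefficient}; here x0*x1 + x0.
poly = {(1, 1): 1, (1, 0): 1}

def eval_poly(poly, xs):
    """Evaluate the polynomial over i32 inputs with wrap-around arithmetic."""
    total = 0
    for exps, coeff in poly.items():
        term = coeff
        for x, e in zip(xs, exps):
            term = (term * pow(x, e, 1 << 32)) & MASK
        total = (total + term) & MASK
    return total

def run_compiled(xs):
    """Placeholder for the compiled forward pass on the same catalog entry;
    in LAC this would call the compiled executor rather than host Python."""
    x0, x1 = xs
    return (x0 * x1 + x0) & MASK

rng = random.Random(0)
for _ in range(1_000):
    xs = [rng.getrandbits(32) for _ in range(2)]
    assert eval_poly(poly, xs) == run_compiled(xs), xs
```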

What still holds, and what TVM opens up

LAC remains the higher-throughput executor. 67–126 M steps/sec from the Mojo backend is a real win TVM has no equivalent of, because LAC abstracts above byte granularity — each LAC step is an i32-value operation, while a TVM step is a byte token in a long-arithmetic sequence. The symbolic stack remains useful as LAC verification machinery — ff_symbolic still earns the slogan within LAC’s own arithmetic — and as the catalog harness for the three-substrate triangulation with eml-sr. That work continues.
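
A back-of-envelope comparison, using only the figures already quoted and not pretending to be a benchmark, shows the size of that granularity gap:

```python
# Crude granularity comparison using the figures quoted above; not a benchmark.
# Assumes at least 4 byte tokens per 32-bit op and ignores carry and control
# tokens, so the TVM-side figure is an upper bound on its effective op rate.
tvm_tokens_per_sec = 30_000            # Percepta's reported decode rate
tvm_i32_ops_per_sec = tvm_tokens_per_sec / 4
lac_i32_ops_per_sec = 67_000_000       # lower end of the Mojo backend range
print(lac_i32_ops_per_sec / tvm_i32_ops_per_sec)  # roughly 9,000x, before carries
```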

What TVM unlocks that LAC didn’t reach for: real C-to-WASM-to-weights compilation, with clang as the front end, LLD as the linker, and a MILP scheduler placing layers. Standard optimisation passes — Souper and Polly on the LLVM side, Binaryen on the WASM side — compose with it directly. The recurrence classifier we built in forking_executor belongs in that pipeline, before lowering, not as a post-hoc transformer-graph analysis. And TVM’s scope is C programs (sudoku in 900K tokens of one autoregressive forward pass), not stack-machine demos. The interesting question for the next chapter — separate from LAC — is whether a TVM-shaped system can be cross-checked against an algebraic substrate the way LAC’s collapsed catalog can. The triangulation framing is portable; the specific implementation isn’t.
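
Back to the recurrence classifier for a moment, since "before lowering" has a concrete shape: classify the loop's recurrence at the IR level and emit its closed form, so the lowered stream never contains the unrolled iterations. A minimal sketch of the simplest, affine case; the AffineLoop type and every name in it are mine, not forking_executor's:

```python
# Sketch of the pre-lowering idea: recognise an affine recurrence x <- a*x + b
# repeated n times and emit its closed form instead of the loop. The AffineLoop
# shape and names are mine, not forking_executor's actual IR.

from dataclasses import dataclass

MASK = (1 << 32) - 1

@dataclass
class AffineLoop:
    a: int    # multiplier applied each iteration
    b: int    # increment applied each iteration
    n: int    # trip count, known at analysis time
    x0: int   # initial value

def run_loop(loop):
    """Reference semantics: execute the loop step by step."""
    x = loop.x0
    for _ in range(loop.n):
        x = (loop.a * x + loop.b) & MASK
    return x

def closed_form(loop):
    """x_n = a**n * x0 + b * (a**n - 1) / (a - 1) for a != 1, or x0 + b*n for a == 1.
    The division is exact over the integers, and masking once at the end agrees
    with masking every step because reduction mod 2**32 commutes with + and *."""
    a, b, n, x0 = loop.a, loop.b, loop.n, loop.x0
    if a == 1:
        return (x0 + b * n) & MASK
    an = pow(a, n)
    return (an * x0 + b * ((an - 1) // (a - 1))) & MASK

counting = AffineLoop(a=1, b=3, n=1_000_000, x0=7)
assert closed_form(counting) == run_loop(counting)    # no million-step unroll needed
geometric = AffineLoop(a=3, b=5, n=10, x0=2)
assert closed_form(geometric) == run_loop(geometric)
```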

Closing the LAC chapter

LAC was a speculative build from a concept post, run forward without the released code in hand. As a replication exercise it found the right primitives — parabolic attention, ReGLU products, latest-write tiebreaking — and as a research artefact it produced a verification harness, a closed-form solver, and a Mojo path with no upstream equivalent. As a comparison-class object against TVM, it’s lower-scope and structurally different, and one of its two claimed advantages was an artefact of having a hole that TVM never had.

The chapter closes here. The eml-sr work stays open.


Code: llm-as-computer, transformer-vm. Previously: A Program, a Polynomial, and a Tree · The Matmul Is the Polynomial · Where the Computer Meets the Calculator · Yes, LLMs Can Be Computers. Now What?