remax_kb: A Search Index With No Matrix In It
A random rotation matrix — a dim×dim grid the quantizer projects vectors through — was the largest single thing in this site's search index, bigger than the codes, the keyword index, and the chunk text combined. The last post called remax_kb “hybrid search in a file” and said nothing the search needs lives outside it; this rotation was the one piece that did.
Turning an embedding into bits
The rotation is there to fix a problem in how the bits get made. remax keeps one bit per dimension of an embedding — the sign — and taken straight off the raw coordinates, those signs make a poor code. An embedding's information piles up in a handful of directions; the sign of a high-variance coordinate tells you something, the sign of a flat one is close to a coin flip, and the code spends the same bit on each.
A random rotation evens that out by mixing every input coordinate into every output one. Each output coordinate is now a random blend of the inputs, and its sign records which side of one random hyperplane through the origin the vector landed on. That is SimHash, a locality-sensitive hash: for two vectors separated by angle θ, a random hyperplane falls between them with probability θ⁄π, so across many bits the fraction that disagree — the normalized Hamming distance — estimates that angle. Close vectors share most of their bits; unrelated ones differ on about half. The rotation is what turns a sign into a bit worth storing.
The bits come with one hard requirement: they only mean “same side” relative to one specific set of planes — one specific rotation. The machine that builds the index and the machine that searches it have to use the identical rotation, or they are scoring positions against different planes and the Hamming distance between their codes is noise.
The size of the matrix
The simplest way to guarantee both sides use the same rotation is to put it in the file, and it is not a cheap thing to carry. A rotation for 768-dimensional vectors is a 768×768 matrix, and remax stacks four of them for a longer, steadier code. As float32 that is nine megabytes. The codes for this site's ~1,800 chunks are two-thirds of one. And the matrix doesn't depend on the corpus: the same nine megabytes sit in the file whether it indexes 1,800 chunks or ten million, so the smaller the index, the more of it is rotation.
The matrix is drawn from a single random seed, though. In principle the file only needs that seed, and each side regenerates the matrix when it loads.
Shipping the seed
Regenerating the matrix on both sides is the hard part. The index is packed in Python and searched in JavaScript, and “the same seed” has to yield the same matrix on each. numpy builds a random rotation by drawing a Gaussian matrix and orthogonalizing it with a QR factorization — and QR routines are not reproducible across implementations. Point a JavaScript rewrite at the same seed and it produces a different, equally valid orthogonal matrix. Different planes on each side; the query's bits and the documents' bits are now answers to different questions, and recall falls to chance. Shipping the matrix was the way around exactly this — pay the nine megabytes, and both sides are guaranteed the same planes.
Dropping the rotation: Rademacher planes
Keeping the seed-only plan means relaxing what the planes have to be. Charikar's original SimHash draws its hyperplanes independently, and they were never required to be orthogonal — the entries can be as plain as ±1, a Rademacher sign on every cell of the matrix. A seeded integer generator emits the same ±1 matrix in any language, exactly, with no floating point anywhere to round differently. Ship the seed; both sides build the identical planes; nothing rides along in the file.
The catch is in “independently.” Planes drawn at random are only approximately orthogonal. In high dimensions two random vectors are nearly perpendicular — but nearly leaves the planes slightly aligned, the bits slightly redundant, and the code a little worse at telling close documents apart. On this corpus that ran about two points of recall@10 below the exactly-orthogonal rotation it replaced. Workable, and genuinely portable, but a real cost for giving up the orthogonality.
Getting the orthogonality back: SRHT
You can get back to a rotation that behaves like an orthogonal one from a seed alone, without a QR, by borrowing a transform that is orthogonal by construction. The Hadamard matrix is a fixed grid of ±1 that is orthogonal by definition, and a fast Walsh–Hadamard transform applies it as a butterfly of additions and subtractions — the same shape as an FFT. It exists only at power-of-two sizes, so for 768-dimensional vectors you work in the next one up, 1,024, and keep the 768×768 corner. Put a seeded ±1 sign-flip in front of the transform for the randomness, repeat the pair a few times to mix, and the seed alone defines a random rotation. It is the structured cousin of the dense random matrix — an SRHT.
The 768×768 corner isn't perfectly orthogonal — clipping a 1,024-wide transform down to 768 columns nudges the column dot products off zero — but it is a slice of an orthogonal matrix rather than a pile of independently drawn vectors, so it sits far closer to orthogonal than the Rademacher planes did, close enough that the two points of recall come back, matching the shipped Haar matrix. And the matrix itself is built with integer add-subtract — ±1 sign-flips and a butterfly of sums, the values never exceeding a few thousand — so both languages materialize the identical rotation from the seed, where the QR could not. From there a code bit is just the sign of an embedding projected through that shared matrix, and the Python packer and the JavaScript reader produce the same codes, checked against the live index.
The file without the matrix
The rotation is gone from the file. A remax_kb index is now the codes, the keyword index, the chunk text, and a manifest that names a seed where it used to point at a nine-megabyte matrix. This site's index is 1.82 megabytes, with recall unchanged from the shipped-matrix version: 0.805 against 0.806.