You are absolutely right. GPU parallelism (especially reduction ops) combined with floating-point non-associativity means the same model can produce slightly different embeddings on different hardware.
However, that makes deterministic memory more critical, not less.
Right now, we have 'Double Non-Determinism':
The Model produces drifting floats.
The Vector DB (storing f32) introduces more drift during indexing and search: float comparisons that round differently across CPUs can produce different HNSW graph structures, so the same data can return different neighbors.
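To make the first point concrete, here's a minimal illustration (a generic example, not Valori code): IEEE-754 addition is not associative, so different reduction orders, e.g. a GPU's parallel tree reduction vs. a CPU's sequential loop, can round differently and drift.

```python
# Same three numbers, two accumulation orders, two different results.
left = (0.1 + 0.2) + 0.3   # 0.6000000000000001
right = 0.1 + (0.2 + 0.3)  # 0.6
print(left == right)       # False
```

Scale that last-bit disagreement up to a dot product over thousands of dimensions and you get embeddings that differ per device.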
Valori acts as a Stabilization Boundary. We can't fix the GPU (yet), but once that vector hits our kernel, we normalize it to Q16.16 and freeze it. This guarantees that Input A + Database State B = Result C every single time, regardless of whether the server is x86 or ARM.
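A rough sketch of what freezing at that boundary looks like (assumed names like `to_q16_16`; the real kernel's rounding policy may differ): once a float is quantized to a signed 32-bit Q16.16 integer, all subsequent storage and comparison is exact integer math, which is bit-identical on x86 and ARM.

```python
FRAC_BITS = 16
SCALE = 1 << FRAC_BITS  # 65536 fractional steps per unit

def to_q16_16(x: float) -> int:
    """Quantize a float to signed 32-bit Q16.16, clamped to the i32 range."""
    q = round(x * SCALE)
    return max(-(1 << 31), min((1 << 31) - 1, q))

def from_q16_16(q: int) -> float:
    return q / SCALE

vec = [0.5, 1.0 / 3.0, -0.25]
frozen = [to_q16_16(v) for v in vec]
print(frozen)  # [32768, 21845, -16384]
```

The one-time rounding error at the boundary is bounded (half a step, ~7.6e-6), and crucially it is the *same* error everywhere, so identical queries against identical state replay identically.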
Without this boundary, you can't even audit where the drift came from: the model, the index, or the hardware.