I understand this is more the primitive you would build such a thing on top of; it's just that the first question I always have about novel compressors is "how do they do on these example streams of data?"
This is, for lack of a better term, a "metacompressor", but it will be interesting to see which of the choices end up dominating; in my past experience with metacompression, one algorithm is usually consistently ahead.
The new OpenZL SDDL2 (Simple Data Description Language) supports several different floating-point types. It would be worthwhile to contribute some of the FC project's experience to OpenZL. The types OpenZL currently supports:

| Type | Size | LE/BE variants |
|----------------|---------|----------------|
| `Int8` | 1 byte | N/A |
| `UInt8` | 1 byte | N/A |
| `Int16LE/BE` | 2 bytes | Yes |
| `UInt16LE/BE` | 2 bytes | Yes |
| `Int32LE/BE` | 4 bytes | Yes |
| `UInt32LE/BE` | 4 bytes | Yes |
| `Int64LE/BE` | 8 bytes | Yes |
| `UInt64LE/BE` | 8 bytes | Yes |
| `Float16LE/BE` | 2 bytes | Yes |
| `Float32LE/BE` | 4 bytes | Yes |
| `Float64LE/BE` | 8 bytes | Yes |
| `BFloat16LE/BE`| 2 bytes | Yes |
| `Bytes(n)` | n bytes | N/A |

Some links:

- https://github.com/facebook/openzl/releases/tag/v0.2.0
- https://openzl.org/getting-started/introduction/
If you want to fit a double into 32 bits, convert it to a single-precision float. That will beat the relative error of the code you linked to by orders of magnitude, and it gives you the range of float (~1e38) rather than limiting you to ±1e9.
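To put a number on the float side of that claim, here is a minimal sketch that rounds a few doubles to single precision and measures the relative error. The sample values are arbitrary picks for illustration, and it says nothing about the linked code itself; it only checks the usual round-to-nearest bound of 2^-24 (about 6e-8) for values in float's normal range.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (an IEEE-754 double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def relative_error(x: float) -> float:
    """Relative error introduced by storing x as a 32-bit float."""
    return abs(to_float32(x) - x) / abs(x)


# Arbitrary sample values spanning many orders of magnitude (illustrative only).
samples = [3.141592653589793, 1.0e-20, 123456789.123, 6.02214076e23, 1.0e38]

worst = max(relative_error(x) for x in samples)
print(f"worst relative error: {worst:.3e}")
print(f"round-to-nearest bound 2**-24: {2.0 ** -24:.3e}")
```

Note that `struct.pack("<f", x)` raises `OverflowError` once `x` is outside float's range, so the ~1e38 ceiling is a hard limit rather than a gradual loss of precision.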
It is not trying to replace zstd or lz4. The idea is narrower: take blocks of doubles, try a set of float-specific predictors/transforms/coders, and emit whichever representation is smallest for that block.
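The "try everything, keep the smallest" idea can be sketched roughly as follows. This is not FC's code or format: the three candidate encoders, the block size of 256, and the use of `zlib` as a stand-in entropy coder are all assumptions made purely for illustration.

```python
import math
import struct
import zlib


def encode_raw(block):
    """Candidate 1: store the doubles as-is (little-endian)."""
    return struct.pack(f"<{len(block)}d", *block)


def encode_deflate(block):
    """Candidate 2: deflate the raw bytes."""
    return zlib.compress(encode_raw(block), 9)


def encode_xor_delta(block):
    """Candidate 3: XOR each bit pattern with its predecessor, then deflate."""
    patterns = struct.unpack(f"<{len(block)}Q", encode_raw(block))
    prev, residuals = 0, []
    for p in patterns:
        residuals.append(p ^ prev)
        prev = p
    return zlib.compress(struct.pack(f"<{len(residuals)}Q", *residuals), 9)


# Hypothetical mode table; a real coder would have many more entries.
CANDIDATES = {"raw": encode_raw, "deflate": encode_deflate, "xor_delta": encode_xor_delta}


def encode_stream(values, block_size=256):
    """Encode block by block, keeping the smallest candidate and counting wins."""
    wins = {name: 0 for name in CANDIDATES}
    encoded = []
    for i in range(0, len(values), block_size):
        block = values[i:i + block_size]
        results = {name: fn(block) for name, fn in CANDIDATES.items()}
        best = min(results, key=lambda name: len(results[name]))
        wins[best] += 1
        encoded.append((best, len(block), results[best]))
    return encoded, wins


def decode_block(mode, n, payload):
    """Invert whichever candidate was chosen for this block."""
    data = payload if mode == "raw" else zlib.decompress(payload)
    if mode == "xor_delta":
        prev, patterns = 0, []
        for r in struct.unpack(f"<{n}Q", data):
            prev ^= r
            patterns.append(prev)
        data = struct.pack(f"<{n}Q", *patterns)
    return list(struct.unpack(f"<{n}d", data))


# One constant block followed by one smooth block.
values = [1.5] * 256 + [math.sin(i / 50.0) for i in range(256)]
encoded, wins = encode_stream(values)
decoded = [v for mode, n, payload in encoded for v in decode_block(mode, n, payload)]
assert decoded == values  # lossless round trip
print(wins)
```

The per-mode win counts are the interesting output here: they show directly whether one candidate dominates on a given data set.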
It is aimed at time-series, scientific, simulation, and analytics data where the numbers often have structure: smooth curves, repeated values, fixed increments, periodic signals, predictable deltas, or low-entropy mantissas.
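As a small illustration of why that kind of structure helps, the sketch below shows two of the listed cases collapsing to a single repeated residual. The helper names and sample data are made up for the example, and the increment of 0.25 was chosen because it is exactly representable in binary; an increment like 0.1 would leave small rounding noise in the deltas.

```python
import struct


def bits(x: float) -> int:
    """Return the 64-bit IEEE-754 pattern of a double as an integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def xor_residuals(values):
    """XOR each value's bit pattern with the previous one's."""
    return [bits(a) ^ bits(b) for a, b in zip(values, values[1:])]


def delta_residuals(values):
    """Arithmetic difference between consecutive values."""
    return [b - a for a, b in zip(values, values[1:])]


repeated = [20.5] * 8                        # repeated values
ramp = [100.0 + 0.25 * i for i in range(8)]  # fixed increments

print(set(xor_residuals(repeated)))  # {0}
print(set(delta_residuals(ramp)))    # {0.25}
```

Once a predictor turns a block into one repeated residual like this, almost any back-end coder can store it in a handful of bytes.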
The API is intentionally small: `fc_enc`, `fc_dec`, a config struct, and a few counters to inspect which modes won. Decode is parallel and meant to be fast; encode spends more CPU searching for a better representation.
Current caveats: x86-64 only for now, tuned for IEEE-754 doubles, research-grade rather than production-hardened.