The decompressed size should be okay, since merely holding 36MB of data is not the same as parsing and JITing 36MB of JS.
What are people's experiences with this?
Arrow is especially powerful across the Wasm <--> JS boundary! In fact, I wrote a library to interpret Arrow from Wasm memory into JS without any copies [0]. (Motivating blog post [1])
[0]: https://github.com/kylebarron/arrow-js-ffi
[1]: https://observablehq.com/@kylebarron/zero-copy-apache-arrow-...
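The zero-copy idea boils down to taking a typed-array view directly over Wasm linear memory instead of copying bytes out into JS. A minimal generic sketch (the offset and length here are made up for illustration, not arrow-js-ffi's actual API):

```javascript
// Sketch: reading numbers straight out of Wasm linear memory, no copy.
// In real FFI code the offset/length would come from the Wasm side
// (e.g. a pointer returned by an exported allocator); here they're invented.
const memory = new WebAssembly.Memory({ initial: 1 }); // one 64 KiB page

const byteOffset = 128; // pretend this pointer came from Wasm
const length = 2000;    // element count (2000 * 8 bytes fits in the page)
const view = new Float64Array(memory.buffer, byteOffset, length);

// Writes through the view land directly in Wasm memory -- the JS side is
// just a window onto the same bytes, which is what makes crossing the
// boundary cheap.
view[0] = 3.14;
const raw = new DataView(memory.buffer);
console.log(raw.getFloat64(byteOffset, true)); // 3.14 (little-endian read)
```

The library linked above builds Arrow arrays on top of exactly this kind of view, so the deserialization cost is bookkeeping rather than byte copying.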
For high-performance code, I'd have expected overhead measured in percents, not multiples, and I'm not surprised to hear slowdowns for anything straying beyond that -- cool to see folks have expanded further! More recently we've been having good experiences here with Perspective <-arrow-> Loaders, enough so that we haven't had to dig deeper. Our current code targets < 24 FPS, as genAI data analytics is more about bigger volumes than velocity, so I'm unsure beyond that. However, it's hard to imagine going much faster, given it's bulk typed arrays without copying, especially in real code.
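The "bulk typed arrays without copying" point can be sketched with transferable ArrayBuffers -- a generic illustration of moving column data between contexts, not Perspective's or Loaders' actual code:

```javascript
// Sketch: moving bulk binary data between JS contexts without copying.
// Transferring an ArrayBuffer (as postMessage to a worker does) moves
// ownership of the bytes instead of duplicating them.
const buf = new ArrayBuffer(8 * 1024 * 1024); // 8 MiB of pretend column data
new Float64Array(buf).fill(1.5);

// structuredClone with a transfer list models what postMessage does:
const moved = structuredClone(buf, { transfer: [buf] });

console.log(buf.byteLength);   // 0 -- the original is detached; nothing was copied
console.log(moved.byteLength); // 8388608 -- same bytes, new owner
```

This is why frame budgets can hold even at large volumes: the per-frame cost is handing off ownership, not re-serializing megabytes.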
When I benchmarked the fastest lib I could find to simply run the protobuf decode (https://github.com/mapbox/pbf), it was 5x slower than native JSON parsing in browsers for dataframe-like structures (e.g. a few dozen 2k-long arrays of floats) -- and that was before even hitting any ArrowJS iterators, etc.
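For anyone wanting to reproduce the JSON side of that comparison, here's a sketch of the payload shape described (a few dozen 2k-long float arrays; the column names and count are illustrative, and this only times `JSON.parse`, not pbf):

```javascript
// Build a dataframe-like JSON payload: 24 columns x 2000 float rows.
const cols = {};
for (let c = 0; c < 24; c++) {
  cols[`col${c}`] = Array.from({ length: 2000 }, (_, i) => Math.fround(i * 0.1));
}
const json = JSON.stringify(cols);

// Time the native parse; this is the baseline a protobuf decoder has to beat.
const t0 = performance.now();
const parsed = JSON.parse(json);
const t1 = performance.now();
console.log(`JSON.parse: ${(t1 - t0).toFixed(2)} ms for ${(json.length / 1e6).toFixed(1)} MB`);
```

Native `JSON.parse` runs inside the engine with no JS-level per-field dispatch, which is a big part of why a pure-JS protobuf decoder struggles to match it on numeric-heavy payloads.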
Grafana's Go backend uses Arrow dataframes internally, so using the same on the frontend seemed like a logical initial choice back then, but the performance simply didn't pan out.
There is a library by the same author called lonboard that provides the JS bits inside JupyterLab. https://github.com/developmentseed/lonboard
<speculation>I think it is based on the Kepler.gl / Deck.gl data loaders that go straight to GPU from network.</speculation>