Then maybe the right choice isn't to start a fresh DataFrame library from Arrow, but rather to leverage Polars and build out the distributed part (in Rust, of course, not in Python).
> We're trying to start with a simpler API that maps well to a distributed query query that we can execute well and then add the features that people request for.
That would have been a good approach in a field that had not been standardised around a single library since its infancy. Polars is beating Pandas in every possible benchmark, yet will continue to struggle for adoption "until the end". Do you really think Daft can do better? (If yes, go ahead and prove me wrong!)
As a comparison, it's like trying to introduce a new transport layer protocol (https://en.wikipedia.org/wiki/QUIC) against TCP. You can do that if and only if there are obvious benefits and no drawbacks, and you are prepared to wait 15 years for 30% market share.