I don't know much about DB internals, but to me this sounds like a lot of the compute is getting delegated to the storage layer. I would think that filtering, aggregation, and projection are a fairly big chunk of the computation a typical DB does?
> Many SQL aggregations are monotonic operations (e.g. MAX, SUM, etc) that can be partially completed on each node and then post-merged. Some (e.g. DISTINCT) can be transformed into monotonic ops with some effort. Some aren't possible to do this way. (Ref on monotonicity: arxiv.org/pdf/1901.01930)
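The partial-compute-then-post-merge idea in the quote can be sketched in a few lines. This is a toy illustration, not any real engine's code: "shards" are just Python lists standing in for storage nodes, and the DISTINCT transformation is the usual set-union (or sketch-based) trick.

```python
# Toy sketch of partially-completed aggregates with a post-merge step.
# Each inner list stands in for the rows held by one storage node.
from functools import reduce

shards = [
    [3, 1, 4, 1, 5],   # rows on storage node 0
    [9, 2, 6, 5, 3],   # rows on storage node 1
    [5, 8, 9, 7, 9],   # rows on storage node 2
]

# Monotonic aggregates (MAX, SUM): each node computes a partial result
# close to the data; the query processor merges the small partials.
partial_maxes = [max(s) for s in shards]
final_max = max(partial_maxes)            # 9

partial_sums = [sum(s) for s in shards]
final_sum = sum(partial_sums)             # 77

# COUNT(DISTINCT x) isn't mergeable as plain counts, but it can be
# transformed: ship a per-node set (or a sketch like HyperLogLog),
# union the sets, then count once at the end.
partial_sets = [set(s) for s in shards]
final_distinct = len(reduce(set.union, partial_sets))  # 9

print(final_max, final_sum, final_distinct)
```

The key property is that the merge operates on small partial results rather than raw rows, so only the partials cross the network.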
The benefit of this is that a lot more work is done _close_ to the data. The trend is that bandwidth is getting larger in data centers, but latency isn't improving at the same rate. Reducing the number of round trips between QP and storage greatly improves overall query latency, even if you have to do more work on the storage side.
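A back-of-envelope calculation shows why fewer round trips can win even when the storage side does the per-row work less efficiently. All numbers here are made up for illustration, not measurements:

```python
# Hypothetical costs: round trips dominate when latency is high
# relative to per-row compute. Numbers are illustrative only.
rtt_ms = 0.5         # assumed QP <-> storage round-trip time
row_cost_ms = 0.01   # assumed per-row processing cost
rows = 100

# Plan A: QP pulls the rows over 10 round trips, aggregates itself.
plan_a_ms = 10 * rtt_ms + rows * row_cost_ms          # 5.0 + 1.0 = 6.0

# Plan B: storage aggregates locally and returns one partial result
# in a single round trip, even at 2x the per-row cost.
plan_b_ms = 1 * rtt_ms + rows * (2 * row_cost_ms)     # 0.5 + 2.0 = 2.5

print(plan_a_ms, plan_b_ms)
```

With these (assumed) numbers, doubling the per-row work on storage still more than pays for itself by eliminating nine round trips, which is the trade-off the comment is describing.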
But isn't that fundamentally at odds with the central idea of disaggregation?
> At a fundamental level, scaling compute in a database system requires disaggregation of storage and compute. If you stick storage and compute together, you end up needing to scale one to scale the other, which is either impossible or uneconomical.
So it seems you can either get good perf by doing the work close to the data, or good scalability by separating compute and data, but I can't see how you can do both.
See https://news.ycombinator.com/item?id=42308716 for more.