You pay such a high overhead marshalling that data into an Arrow RecordBatch. It's far better to work directly with the Parquet file and not even decompress the chunks you don't need. Of course, this assumes you're writing summary statistics as part of the metadata, which we plan to do.