> It's so complex to work with
This is the opposite of my experience.
> To read a parquet file in Python, you need Apache Arrow and Pandas.
Or DuckDB.
import duckdb
# query() returns a lazy relation; call .fetchall() for tuples or .df() for a pandas DataFrame
rel = duckdb.query("select * from 'a.parquet'")
Want to look inside a Parquet file? Use VisiData.
vd a.parquet
> I remember dealing with Parquet files for a job a while back and this same question came up: Why isn't there a simpler way, for when you're not in the data science stack and you just need to convert a parquet file to csv/json/read rows? Is it a limitation of the format itself?
Do you consider Pandas a "data science" stack? To me, it's just a library like any other that makes it easy to work with tabular data. Even for CSV, Python has csv.reader, since parsing CSV by hand is usually a bad idea. Outputting to CSV is literally a one-liner in Pandas or DuckDB.
import pandas as pd
# output to CSV
pd.read_parquet("a.parquet").to_csv("a.csv")
# output to JSON (choose from any number of orientations)
pd.read_parquet("a.parquet").to_json("a.json", orient="table")
# read rows
for row in pd.read_parquet("a.parquet").itertuples():
    print(row)