Yeah, it's "low level" in the sense that you still have to manually chunk up your data. I agree that Dask, Polars, etc. are better if you want a more transparent distributed computing experience. Joblib is great if you already have working single-process code and just want to parallelize it. It's what scikit-learn uses internally, for example.
But as it pertains to the original thread topic, it's still fairly high-level. I'd consider it a bit higher-level than concurrent.futures, for example.
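
To make that comparison concrete, here's a rough sketch of the same manually chunked job written both ways. The names `square`, `chunked`, and `process_chunk` are made up for illustration, and the chunk size is arbitrary. I'm using threads in both versions only so the snippet runs anywhere; for CPU-bound work you'd probably want processes instead. The joblib half is wrapped in a try/except in case joblib isn't installed.

```python
from concurrent.futures import ThreadPoolExecutor


def square(x):
    # Stand-in for the existing single-process work (hypothetical).
    return x * x


def chunked(seq, size):
    # Manual chunking: split seq into consecutive lists of at most `size` items.
    return [seq[i:i + size] for i in range(0, len(seq), size)]


def process_chunk(chunk):
    # Run the single-process function over one chunk.
    return [square(x) for x in chunk]


data = list(range(10))
chunks = chunked(data, 4)  # chunk size of 4 is arbitrary

# concurrent.futures: you create and manage the executor yourself.
with ThreadPoolExecutor(max_workers=2) as pool:
    futures_result = [y for part in pool.map(process_chunk, chunks) for y in part]

# joblib: wrap the call in delayed() and hand a generator to Parallel.
try:
    from joblib import Parallel, delayed

    parts = Parallel(n_jobs=2, prefer="threads")(
        delayed(process_chunk)(c) for c in chunks
    )
    joblib_result = [y for part in parts for y in part]
except ImportError:
    joblib_result = None  # joblib not installed; skip this half

print(futures_result)
print(joblib_result)
```

Both versions still need the `chunked` step, which is the "low level" part. The joblib version just hides the executor setup and teardown behind a single call.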