I’ve worked professionally in quant finance, image processing, defense research, and at several mid-to-large ecommerce and payment processing companies.
In all of them, data provenance has been a first-class consideration for machine learning and data platform teams: a day-to-day concern, baked into architecture review guidelines and production checklists for every ML project.
In many of these companies we had teams of 20-40 ML scientists, all of whom treated data provenance as a first-class part of their work, had experience with it from past jobs and academic programs, and considered it on equal footing with data curation, model selection, model training, and model serving.