Hugging Face uses git-lfs for large datasets with good success, but git-lfs on GitHub gets very pricey at higher volumes of data. I would love the affordability of object storage, just with a better git blob storage interface, and one that will still be around in the future.
Most of these systems do their own hash calculations and are not interchangeable with each other. I feel like git-lfs has the momentum in data science at the moment, but it needs better options for people who want low-cost storage that they can control.
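To make the hashing point concrete, here is a rough sketch of how a git-lfs pointer is derived, going by the git-lfs v1 pointer spec as I understand it: the object ID is the SHA-256 of the raw file contents, stored next to the size in a small text file that git tracks in place of the blob. The function name `lfs_pointer` is made up for illustration and is not part of git-lfs. A tool that hashes differently (another algorithm, or chunked hashing) would produce IDs that don't line up with these.

```python
import hashlib


def lfs_pointer(data: bytes) -> str:
    """Build git-lfs v1 pointer text for the given file contents.

    Hypothetical helper for illustration only; git-lfs itself generates
    pointers through its clean filter.
    """
    # The object ID is the SHA-256 hex digest of the raw bytes.
    oid = hashlib.sha256(data).hexdigest()
    # Keys after the version line are sorted alphabetically: oid, then size.
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{oid}\n"
        f"size {len(data)}\n"
    )


if __name__ == "__main__":
    print(lfs_pointer(b"example dataset bytes"), end="")
```

The pointer is all that lives in the git repo. Where the actual bytes go (GitHub, Hugging Face, or your own bucket) is up to whatever LFS server you point the client at.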
Hugging Face is great, but it's one more service to onboard if you're in an enterprise. And data privacy/retention/governance means that many people would like their data to reside on their own infrastructure.
If AWS were to give us a low-cost hosted git-lfs service on top of S3, it would be very popular.
If anyone knows of some good alternatives, please let us know!