It's one of the backends available in Splink, our FOSS record linkage software, and it's revolutionary how it lets users run large-scale probabilistic record linkage ridiculously cheaply. It wasn't long ago that you needed very expensive proprietary software plus a big on-prem cluster, costing hundreds of thousands, to achieve this.
A lot of the magic for me is on the infrastructure side: how they can read and write large datasets to and from S3 so quickly. The value isn't just in the SQL engine.