At face value I'm suspicious. I thought there were genuine ambiguities or label errors in some of those datasets, which makes it very surprising that you could even really define 100% accuracy.
Also, having read the paper a bit, it's either badly written and I just don't understand what they're saying at all, or it's BS. It doesn't really explain anything about their implementation; it just says they did it and got 100% accuracy, and throws in a bunch of jargon. Maybe I'm just not familiar enough with this area, but the way it's laid out raises even more red flags.