> What's not straightforward is recognizing cover songs and the like. But that's not only non-trivial but AFAIK can't be done.
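Well, you could transcribe the music into actual notes (or better, the intervals between successive notes, so a cover played in a different key still matches), and use Smith-Waterman (or any more advanced and more recent sequence-alignment technique) to find the song with the highest local-alignment score, which is roughly the one at the smallest edit distance.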
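A rough sketch of what I mean, assuming the hard part (transcribing audio into a melody) is already done and each song is just a list of MIDI note numbers. The function names, the scoring values (`match=2`, `mismatch=-1`, `gap=-1`) and the `catalogue` dict are my own placeholders for illustration, not anything standard:

```python
def intervals(notes):
    """Turn a melody (MIDI note numbers) into successive pitch intervals.

    Intervals do not change when the whole melody is transposed.
    """
    return [b - a for a, b in zip(notes, notes[1:])]


def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local-alignment score between sequences a and b.

    Scoring values are arbitrary assumptions; tune them for real data.
    """
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    best = 0
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            diag = prev[j - 1] + (match if x == y else mismatch)
            # Clamping at 0 is what makes the alignment local.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def best_match(query_notes, catalogue):
    """Return the title in catalogue whose melody aligns best with the query."""
    q = intervals(query_notes)
    return max(
        catalogue,
        key=lambda title: smith_waterman(q, intervals(catalogue[title])),
    )
```

Since only intervals are compared, a cover transposed to another key scores the same as the original. Tempo changes, ornamentation and polyphony would need more than this, so treat it as a toy.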