Interesting, thank you for the link. I had a hunch this should be possible but I wasn't aware that it was already proven. I used a similar trick on image recognition: turn images into a single 32 bit word by heavy pixelation and then look up a matching description. It's interesting how often that will work once you feed it with enough data. After all, that gives you 4 billion inputs mapped onto 4 billion descriptions, and plenty of those will contain the Eiffel tower with various cloudy backgrounds apparently recognized perfectly.
It's a total cheat but it is funny how close that can get you to something that might be actually useful.