A counterpoint I'd raise: I wonder how aggressively Dall-E 2 makes assumptions about words it hasn't seen before.
That's hard to test, given that it's read essentially the entire internet, but someone could make up Latin-esque words whose meaning people would be able to guess.
If the model is as good as people at inferring the meaning of such made-up words, it stands to reason that, if it were aggressive enough about this, it might do the same with gibberish and end up with its own interpretation of the word, which would land it back in a more targeted concept space.
I'd love to see someone craft some words that most people could guess the meaning of, and see how Dall-E 2 fares.