But they do! Even an inexperienced phonetician can reconstruct speech given only a spectrogram, and it isn’t particularly hard to do. I can’t make any firm statements about these ACF images, but since they present no temporal information, I find it difficult to imagine the same being possible with them.
And as for ‘conveying the nature of sound’, I invite you to consider e.g. [0] or [1]. It’s easy to see on the spectrogram that some sounds are noisy, some are resonant, some are strong, some are weak, and so on.
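
To make that a bit more concrete, here is a rough sketch of my own, not taken from either link. It builds a toy signal (a 1 kHz tone followed by white noise), computes a crude spectrogram with a naive DFT, and labels each time frame by its spectral flatness. The sample rate, frame size, 0.1 threshold, and all function names are assumptions picked purely for illustration.

```python
import cmath
import math
import random

SAMPLE_RATE = 8000  # Hz, assumed for illustration
FRAME = 128         # samples per frame (no overlap, rectangular window)


def make_signal():
    """1024 samples of a 1 kHz tone followed by 1024 samples of white noise."""
    rng = random.Random(0)  # fixed seed so the run is repeatable
    tone = [math.sin(2 * math.pi * 1000 * n / SAMPLE_RATE) for n in range(1024)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(1024)]
    return tone + noise


def power_spectrum(frame):
    """Power in DFT bins 0 .. N/2 - 1, computed with a naive O(N^2) DFT."""
    n = len(frame)
    powers = []
    for k in range(n // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        powers.append(abs(acc) ** 2)
    return powers


def spectrogram(signal):
    """One power spectrum per frame; the list index is the time axis."""
    return [power_spectrum(signal[start:start + FRAME])
            for start in range(0, len(signal) - FRAME + 1, FRAME)]


def flatness(power, eps=1e-12):
    """Geometric mean over arithmetic mean: near 0 for a pure tone,
    much higher for white noise."""
    geo = math.exp(sum(math.log(p + eps) for p in power) / len(power))
    arith = sum(power) / len(power)
    return geo / (arith + eps)


spec = spectrogram(make_signal())
flat = [flatness(column) for column in spec]
# 0.1 is an arbitrary illustrative threshold, not a standard value.
labels = ["noisy" if f > 0.1 else "tonal" for f in flat]
print(labels)
```

The point is that the labels come out ordered in time: each column of the spectrogram tells you what kind of sound was present in that frame, which is what a phonetician reads off by eye.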
[0] https://home.cc.umanitoba.ca/~krussll/phonetics/acoustic/spe...