If you know of any implementations that can look at a spectrogram and say “hey there’s peaks at 150hz, 220hz and 300hz with standard deviations of 5hz, 7hz, and 10hz, decreasing in frequency over time thus this is a deep voice saying ‘ay’” and get it right every time I’d be really interested in seeing it (besides neural networks)