Deterministic encryption can be OK if the data that you’re encrypting already has high min-entropy, i.e. it already looks close to random. ECB only reveals which plaintext blocks are identical, so if blocks almost never repeat there is very little to see. Compressed audio and video streams have a decent amount of entropy. Probably not enough to satisfy a cryptographer, but probably enough to make it very difficult to learn much from 128-bit AES-ECB blocks.
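To make the point about entropy concrete, here is a rough sketch that counts repeated output blocks for a highly structured input versus a random one. Big caveat: to keep it standard-library only, it uses HMAC-SHA256 truncated to 16 bytes as a stand-in for AES-ECB. That is my assumption for illustration, it is not real encryption (you can't decrypt it), but it has the one property that matters here: the same key and the same input block always give the same output block. The names `ecb_like` and `repeated_blocks` are made up for this example.

```python
import hashlib
import hmac
import os

BLOCK = 16  # AES block size in bytes


def ecb_like(key: bytes, data: bytes) -> list:
    """Map each 16-byte block through a keyed deterministic function.

    Stand-in for AES-ECB (an assumption for illustration): truncated
    HMAC-SHA256 is not invertible, so this is not real encryption, but
    equal input blocks still produce equal output blocks.
    """
    blocks = [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]
    return [hmac.new(key, b, hashlib.sha256).digest()[:BLOCK] for b in blocks]


def repeated_blocks(blocks: list) -> int:
    """Count blocks that duplicate some earlier block."""
    return len(blocks) - len(set(blocks))


key = os.urandom(32)
n_blocks = 4096

# Low-entropy input: only two distinct block values.
structured = (b"\x00" * BLOCK) * (n_blocks // 2) + (b"\xff" * BLOCK) * (n_blocks // 2)
# High-entropy input: every block is 128 random bits.
random_like = os.urandom(n_blocks * BLOCK)

structured_repeats = repeated_blocks(ecb_like(key, structured))
random_repeats = repeated_blocks(ecb_like(key, random_like))

print("structured input:", structured_repeats, "repeated output blocks")
print("random input:    ", random_repeats, "repeated output blocks")
```

With the structured input nearly every output block is a repeat, which is exactly the pattern an eavesdropper can see. With the random input a repeat among a few thousand 128-bit blocks is astronomically unlikely, so there is nothing to see.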
Note that everyone’s favorite ECB example, the picture of Tux the Linux penguin, is not very realistic, because the plaintext is an uncompressed bitmap. If you ECB-encrypt a JPEG or a PNG, you won’t see the same patterns.
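A quick way to check this kind of claim without any crypto library: for a fixed key AES is a permutation on 16-byte blocks, so two ECB ciphertext blocks are equal exactly when the plaintext blocks are equal. Counting repeated plaintext blocks therefore tells you what ECB would leak. The sketch below does that for a synthetic "bitmap" and for the same bytes after zlib compression. The bitmap is something I made up (flat runs of one gray level), and zlib's DEFLATE is only a stand-in for real image formats (PNG uses DEFLATE, JPEG does not), so treat it as an illustration rather than a measurement.

```python
import random
import zlib

BLOCK = 16  # AES block size in bytes


def repeat_fraction(data: bytes) -> float:
    """Fraction of aligned, full 16-byte blocks that duplicate an earlier block."""
    blocks = [data[i:i + BLOCK] for i in range(0, len(data) - BLOCK + 1, BLOCK)]
    return (len(blocks) - len(set(blocks))) / len(blocks)


# Hypothetical "bitmap": 300 flat runs of a single gray level, each 64 to 511
# bytes long, loosely imitating the large uniform regions in the Tux image.
rng = random.Random(1)
raw = b"".join(
    bytes([rng.randrange(256)]) * rng.randrange(64, 512) for _ in range(300)
)

# DEFLATE at maximum compression level.
compressed = zlib.compress(raw, 9)

raw_fraction = repeat_fraction(raw)
compressed_fraction = repeat_fraction(compressed)

print(f"raw:        {len(raw)} bytes, {raw_fraction:.1%} repeated blocks")
print(f"compressed: {len(compressed)} bytes, {compressed_fraction:.1%} repeated blocks")
```

The uncompressed data is mostly repeated blocks, which is why the penguin shows through. The compressed stream is much shorter and should have far fewer repeats, though a degenerate input (say, one enormous run of zeros) can still compress into a repetitive stream.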
I teach the attacks on ECB in my network security class. ECB is bad, but AES is not the Caesar cipher. I’m not sure “trivially broken” is quite right.
That said, I am really curious what Zoom is actually doing here. Going to have to take a look today. My guess is that the real failure from using ECB mode is more likely to come from applying it to audio/video metadata, or to other more structured parts of the protocol.