“CPU” isn’t necessarily the right benchmark, though. Most smartphones going back years have had ML inference accelerators built in, and both Intel and AMD are starting to add instructions that accelerate inference. Apple’s M1 and M2 include the same Neural Engine inference hardware as their phones and tablets. The question is whether this model is a good fit for those accelerators and how well it performs there, or on the integrated GPUs these devices all have.
Brute-forcing the model with just traditional CPU instructions works, but… it’s obviously going to be pretty slow.
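As a rough illustration of why this matters (a toy sketch, not Whisper itself): inference time is dominated by matrix multiplies, and even on a plain CPU, running them through vectorized BLAS-style kernels instead of naive scalar loops is orders of magnitude faster. Dedicated accelerators push that gap further still.

```python
import time
import numpy as np

def naive_matmul(a, b):
    """Scalar triple loop -- roughly how a matmul looks with no
    vectorized kernels or accelerator support at all."""
    n, k = a.shape
    _, m = b.shape
    out = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            s = 0.0
            for t in range(k):
                s += a[i][t] * b[t][j]
            out[i][j] = s
    return out

n = 64
a = np.random.rand(n, n)
b = np.random.rand(n, n)

t0 = time.perf_counter()
naive = naive_matmul(a, b)
t_naive = time.perf_counter() - t0

t0 = time.perf_counter()
fast = a @ b  # BLAS-backed: SIMD instructions, cache blocking
t_fast = time.perf_counter() - t0

print(f"naive loops: {t_naive:.4f}s, vectorized: {t_fast:.6f}s")
```

Same math, same hardware; only the code path differs, and the vectorized one wins by a wide margin even at this tiny size.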
I have no experience with the accuracy of Talon, but I’ve heard that most open source models are basically overfit to the test datasets… so their posted accuracy numbers are often misleading. If Whisper is substantially better in the real world, that’s what matters, but I have no idea whether that’s the case.