Hey folks, I'm the developer behind muesli, a one-stop app for all your speech-to-text needs, whether that's voice dictation or meeting transcription. It runs on-device on the Apple Neural Engine using CoreML-based STT models (Parakeet, Whisper, Cohere transcribe). Everything is open source and we're at 160 stars, au naturel. I'd love for folks to use it and contribute to its development.
Currently, the on-device models such as Parakeet and Whisper are great for English: faster than cloud-hosted models, though a little less accurate. If you switch on post-processing, the ASR output goes through a fine-tuned Qwen 3.5 model that improves accuracy, formatting, etc. All of the code is open source, so feel free to inspect it and suggest perf improvements as a PR!
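To make the optional post-processing step concrete, here is a minimal Python sketch of the flow described above. It is illustrative only and not muesli's actual code: `transcribe`, `fake_asr`, and `fake_cleanup` are hypothetical names, and the two stand-in functions just mimic what an on-device ASR model and an LLM cleanup pass might return.

```python
from typing import Callable, Optional


def transcribe(
    audio: bytes,
    asr: Callable[[bytes], str],
    post_process: Optional[Callable[[str], str]] = None,
) -> str:
    """Run ASR, then optionally pass the raw text through a cleanup step."""
    raw_text = asr(audio)
    if post_process is None:
        # Post-processing switched off: return the ASR output unchanged.
        return raw_text
    return post_process(raw_text)


# Hypothetical stand-ins for the real models (assumptions, not muesli APIs).
def fake_asr(audio: bytes) -> str:
    return "hey folks this is a test"


def fake_cleanup(text: str) -> str:
    # Mimics a formatting fix: capitalize the first letter, add a period.
    return text[:1].upper() + text[1:] + "."


if __name__ == "__main__":
    print(transcribe(b"", fake_asr))
    print(transcribe(b"", fake_asr, fake_cleanup))
```

Keeping the cleanup pass as an optional callable mirrors the toggle: with it off you get the faster raw transcript, with it on you trade some latency for better formatting.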