I’ve got a local llama 3.2 3B running on my macOS. I can query it for recipes, autocomplete obvious code (this is the only thing GitHub Copilot was useful for). And answer simple questions when provided with a little bit of context.
All with much lower latency than an HTTP request to a random place, knowing that my data can’t be used to trading anything, and it’s free.
It’s absolutely insane this is the real world now.