Also the fact that an M5 version will be coming, and they likely know they are going to sell out on day one (I expect we'll see a price correction from Apple for higher end configs of M5 studios, base price will probably stay the same), so they need to build up stock reserves.
This piqued my interest in how it does it, and after briefly checking the project it seems it only has two features for automatic photo categorization: 1) it can group photos by date, and 2) it has face detection and recognition that uses trained weights (so ML "intelligence").
I got away from Google Images and now upload to my own Immich instance.
I also use an open source camera app from F-Droid to de-Google that whole path.
"They" fully well know that they current frontier model are maybe 6 month ahead of what people will have access to without their control. See Deepseek as Exibit B
The reason you can't run these locally has more to do with the fact that those Mythos-sized models require extreme amounts of memory and processing power to run at acceptable speeds. And neither you nor I can afford to pay for the resources to run those models locally. A big reason is that "running locally" means running on your own hardware, and for almost everyone this means "running on hardware that will spend a big portion of its time just sleeping". Because data centers and providers have higher utilization rates, they can easily outpace you. That, and the fact that when they place an order it's usually for hundreds of thousands of units.
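To put rough numbers on the utilization argument, here is a minimal sketch. The price, lifetime, and utilization figures are made up for illustration, not taken from any real machine or provider, and the model ignores power, bulk discounts, and resale value.

```python
def cost_per_useful_hour(hardware_price, lifetime_hours, utilization):
    """Amortized hardware cost for each hour the machine spends doing real work.

    utilization is the fraction of its lifetime the machine is busy (0 < u <= 1).
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    busy_hours = lifetime_hours * utilization
    return hardware_price / busy_hours


# Hypothetical inputs: a 10,000 unit-of-currency box written off over 3 years.
PRICE = 10_000
LIFETIME_HOURS = 3 * 365 * 24

# Assumed utilization: a home box that mostly sleeps vs. a shared, busy server.
home = cost_per_useful_hour(PRICE, LIFETIME_HOURS, utilization=0.05)
datacenter = cost_per_useful_hour(PRICE, LIFETIME_HOURS, utilization=0.60)

# With identical hardware, the cost gap is just the ratio of utilizations.
ratio = home / datacenter
print(f"home: {home:.2f}/h  datacenter: {datacenter:.2f}/h  ratio: {ratio:.1f}x")
```

Under these assumptions the per-hour price difference comes entirely from how busy the box is, before any volume discount on the hardware itself.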
That is why the huge lobby machine is grinding away to make those models illegal.
Rather I think it is just hard for local LLMs to compete in this early stage when the cloud providers are allowed by investors to be unprofitable.
You can grow the utilization rate well beyond that if you don't always care about getting a quick, real-time response. (And if you do, then maybe the cloud model was the better deal after all!)
And, assuming the allegations are true, don't things like Deepseek and Qwen offer existence proofs that frontier models are (and will forever be) trivially distilled down to run domain-specific tasks on boxes that cost a few months of Claude Max subscription?
Isn't that a function of RAM supply not being available now?
Even if that weren't the case, every corp _needs_ you to be on a subscription.
qwen3.5-2b and qwen3.5-4b are great at document parsing. They can run on CPU.
qwen3.6-27b and gemma4-31b are borderline better than the human eye in some cases. Their OCR isn't perfect, but they're seriously good. They can still run on the CPU but you'll be waiting minutes per document.
You can demand JSON, YAML, MD, or freeform text just by varying the prompt. Even if you have a custom template, you can just put that in the prompt and they'll do an OK-ish job.
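A rough sketch of what "just vary the prompt" can look like in a pipeline. The field names and helper functions here are hypothetical, and the actual model call is left out on purpose since it depends on whatever runtime you use; this only builds the prompt and sanity-checks a JSON reply with the standard library.

```python
import json

FENCE = "`" * 3  # models often wrap their answer in a markdown code fence


def build_prompt(fields, output_format="JSON"):
    """Build an extraction prompt; change output_format to YAML, MD, etc."""
    field_list = ", ".join(fields)
    return (
        f"Extract the following fields from the attached document: {field_list}. "
        f"Reply with {output_format} only, no commentary."
    )


def parse_json_reply(reply, fields):
    """Parse a JSON reply and check that every requested field is present."""
    text = reply.strip()
    if text.startswith(FENCE):
        lines = text.splitlines()[1:]  # drop the opening fence line
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]  # drop the closing fence line
        text = "\n".join(lines)
    data = json.loads(text)  # raises json.JSONDecodeError on malformed output
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [name for name in fields if name not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return data


# Hypothetical usage: 'reply' stands in for whatever your local model returned.
fields = ["invoice_number", "total"]
prompt = build_prompt(fields)
reply = FENCE + 'json\n{"invoice_number": "A-1", "total": "10.00"}\n' + FENCE
print(parse_json_reply(reply, fields))
```

Validating the reply matters more with the small models, since an "OK-ish" answer can be a missing key or a stray sentence around the JSON.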
There are also models that aren't in the r/LocalLLaMA zeitgeist. IBM released a new 4B-parameter model for structured text extraction last week, and there's a sea of recent Chinese OCR models too.
IMO the open-weights models are so good that in a lot of cases it's not worth paying frontier labs for OCR purposes. The only barriers to entry are the effort to set up a pipeline and having the spare CPU/GPU capacity.
Besides those, there are a few smaller open-weights models that are dedicated to OCR tasks, for instance DeepSeek-OCR-2 and IBM granite-vision-4.1-4b. (They can be found on huggingface.co)
The dedicated vision models can be run on much cheaper hardware, including smartphones, than the big models that can process images in addition to text.
Similarly, besides the bigger multimodal models that can accept audio, images or text as input, there are smaller open-weights models that are dedicated to speech recognition, e.g. Xiaomi MiMo-V2.5-ASR and IBM granite-speech-4.1-2b.
Apple doesn't even sell a model. They just have a deal to use Google's. They can't "protect" their cloud version of a model they don't have.
That's an interesting way to view the world. I mean, utterly stupid as it is, but interesting.
But the previous sentence is even stupider (a Perl script 10 years ago could write code like Qwen does now?), so I guess at least it's consistent.