When exactly does this roll out? And which inference actually runs on-device?
I think the marketing for this and the new Copilot+ PCs has fooled people into thinking that most of the AI really runs on-device. The models that run fast locally are still fairly limited compared to what runs in the cloud.
The public betas will be available later this month. The official OS releases are usually in September and October. Some of the AI stuff should be available right away, but rumors say that some of the more advanced Siri features (like app integration) might not launch until after the first of the year.
Specifically, the iOS update comes out with the new iPhones, usually on the same day the new devices become available. The other OSes are usually timed to release alongside it, since features are shared across the OSes and the beta periods are the same.