I think that is a very narrow perspective. Enormous numbers of consumers own $50,000 cars, but a pair of $2000 GPUs is "not consumer"?
I agree with your view that cheap tokens on SOTA are a trap-- people should use local AI or no AI.
A friend and I had previously worked on an entropy extraction scheme, and he recently got around to making a writeup about our work: https://wuille.net/posts/binomial-randomness-extractors/
I instructed the agent to read the URL, implement the technique in C++ for 32-bit registers, then make a SIMD version that interleaves several extractors in parallel for better performance. It implemented it (not hard, since there was an implementation there that it read), then wrote more extensive tests. Then it vectorized it. It got confused a few times during debugging because the algorithm uses some number theory tricks so that overflows of intermediate products don't matter, and it was obviously trained a lot on ordinary code where such overflows are usually fatal. I instructed it to comment the code explaining why the overflows are fine and had it continue, which mostly solved its confusion.
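For anyone wondering how an overflowing intermediate product can be harmless, here is a minimal sketch of one standard trick of that general kind. It is not taken from the writeup, and the names `inv_mod_2_32` and `muldiv_exact` are made up for illustration. Unsigned 32-bit arithmetic is arithmetic modulo 2^32, so if the divisor is odd, the division is exact, and the true quotient fits in 32 bits, multiplying by the divisor's modular inverse recovers the quotient even though the product wrapped.

```cpp
#include <cassert>
#include <cstdint>

// Inverse of an odd d modulo 2^32, via Newton iteration.
// Assumption: d is odd (even numbers have no inverse mod 2^32).
constexpr std::uint32_t inv_mod_2_32(std::uint32_t d) {
    std::uint32_t x = d;          // d*d == 1 (mod 8) for odd d, so 3 bits are correct
    for (int i = 0; i < 4; ++i)   // correct bits double each step: 3 -> 6 -> 12 -> 24 -> 48
        x *= 2u - d * x;          // wraps mod 2^32 by design
    return x;
}

// Computes (a*b)/d, assuming d is odd, d divides a*b exactly,
// and the true quotient fits in 32 bits. a*b is allowed to overflow:
// only its low 32 bits are needed, because q*d*inv(d) == q (mod 2^32).
constexpr std::uint32_t muldiv_exact(std::uint32_t a, std::uint32_t b, std::uint32_t d) {
    return a * b * inv_mod_2_32(d);
}

// 700000 * 900000 = 630,000,000,000, far beyond 2^32, yet the quotient is exact.
static_assert(muldiv_exact(700000u, 900000u, 1575u) == 400000000u,
              "wrapped product still yields the exact quotient");
```

The comments on the wrapping lines are the same kind of note the agent needed: they say explicitly that the wraparound is intended and why the result is still exact.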
It successfully got the initial 12MB/s scalar implementation to about 48MB/s. Then I told it to keep optimizing until it reached 100MB/s. I came back the next day and it had stopped after 6 hours, when it achieved just over 100MB/s. Reading what it did: it went off looking at disassembly, figured out what hardware it was running on, read microarch timing tables online, made some better decisions, tried a lot of things that didn't work, etc. (And of course, the implementation is correct.)
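A target like "100MB/s" only works as a stopping condition if the agent can measure it. A rough sketch of the kind of throughput harness that could serve that purpose follows; it is hypothetical, not the harness that was actually used, and `byte_sum` is just a stand-in workload rather than the extractor.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in workload (assumption): sums the bytes of the buffer.
inline std::uint64_t byte_sum(const std::uint8_t* p, std::size_t n) {
    std::uint64_t s = 0;
    for (std::size_t i = 0; i < n; ++i) s += p[i];
    return s;
}

// Runs `process` over a buffer of `bytes` bytes several times and
// returns the best observed throughput in MB/s (1 MB = 10^6 bytes).
template <typename F>
double throughput_mb_per_s(F&& process, std::size_t bytes, int repeats = 5) {
    std::vector<std::uint8_t> buf(bytes, 0xA5);
    volatile std::uint64_t sink = 0;  // keeps the work from being optimized away
    double best = 0.0;
    for (int r = 0; r < repeats; ++r) {
        auto t0 = std::chrono::steady_clock::now();
        sink = sink + process(buf.data(), buf.size());
        auto t1 = std::chrono::steady_clock::now();
        double secs = std::chrono::duration<double>(t1 - t0).count();
        if (secs > 0.0) best = std::max(best, static_cast<double>(bytes) / 1e6 / secs);
    }
    return best;
}
```

Taking the best of several runs, rather than the mean, reduces noise from other processes, which matters when the number is being used as a pass/fail threshold.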
I'm pretty skeptical about AI and borderline hateful of many people who (ab)use it and are deluded by it-- but I think this experience shows that a small local model can be objectively useful.
(oh and this experience was also while I only had the model running at 19tok/s)
Running the model in a loop where it can get feedback from actually testing stuff allows you to make progress in spite of making many mistakes.
I could have done this work myself but I didn't have to and I certainly spent less time checking in and prodding it than it would have taken me to do it. In my case I wondered how much faster parallel extractors using SIMD might be-- an idle curiosity that would have gone unanswered if not for the AI.
Congrats, but you're in the 0.0001% that's not just frying their brains, fapping to their local models or doing various magic tricks like a toddler entertained by playing with velcro.
At the end of the day you lost an opportunity to improve yourself and exercise your brain, maybe the opportunity cost is worth it idk, but I'm going to keep taking things slow.
Handmade Swiss watches > mass manufactured imitations. Handmade clothes > Walmart clothes.
$50k is a median priced car in the US. I'd guess >99.9% of people do not own $4000 of GPUs. I consider myself a computer person and I don't think I even own $4000 of computer hardware in total.
A car is super useful, so is an AI. But even if we decide cars are incomparably more useful, a great many people pay much more than $4000 over the minimum viable car, and that's money that could be deployed to secure access to private, secure, and autonomous AI facilities. A few thousand dollars in computing is consumer hardware, or at least could easily be with more reason and awareness driving adoption.
People spend a LOT of money on things less useful than a local copy of qwen3.6-27b can be.
A top-spec MacBook Pro is >$4k, so I assure you that plenty of computer people do own $4k of computer hardware.
Hell, most tech folks are wandering around with a ~$1k smartphone in their pocket too.