[1] https://blogs.nvidia.com/blog/open-models-data-tools-acceler...
Nvidia has always had its own family of models; it's nothing new and not something you should read too much into IMHO. They use those as templates other people can leverage, and they are of course optimized for Nvidia hardware.
Nvidia has been training models in the Megatron family, as well as many others, since at least 2019, and those were used as a blueprint by many players. [1]
It doesn't get a ton of attention on /r/LocalLLaMA but it is worth trying out, even if you have a relatively modest machine.
[0] https://huggingface.co/nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B...
[1] https://huggingface.co/unsloth/Nemotron-3-Nano-30B-A3B-GGUF
Megatron was a research project.
NVidia has professional services selling companies on using Nemo for user facing applications.
Commodity businesses are price chasers. That's the only thing to compete on when product offerings are similar enough. AI valuations are not set up for this. AI valuations are built on 'winner takes all' implications. These are clearly now falling apart.
As problematic as SWE-Bench is as a benchmark, the top commercial models are far better than anything else and it seems tough to see this as anything but a 3 horse race atm.
I'm not saying this is what will happen, but people obviously bet a lot of money on that.
Point me to these? Would like to have a look.
1. OpenAI bet largely on consumer. Consumers have mostly rejected AI. And in a lot of cases even hate it (can't go on TikTok or Reddit without people calling something slop, or hating on AI generated content). Anthropic on the other hand went all in on B2B and coding. That seems to be the much better market to be in.
2. Sam Altman is profoundly unlikable.
People like to complain about things, but consumers are heavily using AI.
ChatGPT.com is now up to the 4th most visited website in the world: https://explodingtopics.com/blog/chatgpt-users
Besides, OpenAI was never going to recoup the billions of dollars based on advertising or $20/month subscriptions.
Source on that?
Lots of organizations offer ChatGPT subscriptions, and Microsoft pushes Copilot as hard as it can which uses GPT models.
He says a lot of fluff, doesn’t try to be very extreme, and focuses on selling. I don’t know him personally but he comes across like an average person if that makes sense (in this environment that is).
I think I personally prefer that over Elon’s self induced mental illnesses and Dario being a doomer promoting the “end” of (insert a profession here) in 12 months every 6 months. It’s hard for me to trust a megalomaniac or a total nerd. So Sam is kinda in the middle there.
I hope OpenAI continues to dominate even if the margins of winning tighten.
> Anthropic relies heavily on a combination of chips designed by Amazon Web Services known as Trainium, as well as Google’s in-house designed TPU processors, to train its AI models. Google largely uses its TPUs to train Gemini. Both chips represent major competitive threats to Nvidia’s best-selling products, known as graphics processing units, or GPUs.
So which leading AI company is going to build on Nvidia, if not OpenAI?
If I were Nvidia I would be hedging my bets a little. OpenAI looks like it's on shaky ground, it might not be around in a few years.
https://blogs.nvidia.com/blog/open-models-data-tools-acceler...
Interesting times.
> So which leading AI company is going to build on Nvidia, if not OpenAI?
It's xAI.
But what matters is that there is more competition for Nvidia, and they bought Groq to reduce that. OpenAI is building its own chips, as is Meta.
The real question is this: What happens when the competition catches up with Nvidia and takes a significant slice out of their data center revenues?
https://techcrunch.com/2026/01/26/nvidia-invests-2b-to-help-...
For example, Amazon isn’t able to train its own models so it hedges by investing in Anthropic and OpenAI. Oracle, same with OpenAI deal. Nvidia wants to stay in OpenAI and Anthropic’s tech stack.
It’s all jockeying for position.
I guarantee you that in 10 years' time, you will get claims of unethical conduct by those companies only after the mania has ended (and by then the claimants will have sold all their RSUs).
https://github.com/openai/codex/issues/9253
OTOH, if Anthropic did that to Claude Code, there wasn’t a moderately straightforward workaround, and Anthropic didn’t revert it quickly, it might actually be a risk-the-whole-business issue. Nothing makes people jump ship quite like the ship refusing to go anywhere for weeks while the skipper fumbles around and keeps claiming to have fixed the engines.
Also, the fact that it’s not major news that most business users have been unable to log in to the agent CLI for two weeks running suggests that OpenAI has rather less developer traction than they would like. (Personal users are fine. Users who are running locally on an X11-compatible distro and thus have DISPLAY set are okay because the new behavior doesn’t trigger. It kind of seems like everyone else gets nonsense errors out of the login flow with precise failures that change every couple days while OpenAI fixes yet another bug.)
"Root Cause
The backend enforces an Enterprise-only entitlement for codex_device_code_auth on POST /backend-api/accounts/{account_id}/beta_features. Your account is on the Team plan, so the server rejects the toggle with {"detail":"Enterprise plan required."} "
and so on and so forth. On any given day I have several such long-term tickets that ultimately get escalated to me (I'm in dev and usually the guy who would pull the page with an SSH tunnel or credentials copying :)
The backstory here is that codex-rs (OpenAI’s CLI agent harness) launched an actual headless login mechanism, just like Claude Code has had forever. And it didn’t work, from day one. And they can’t be bothered to revert it for some reason.
Sure, big enterprises are inept. But this tool is fundamentally a command line tool. It runs in a terminal. It’s their answer to one of their top two competitors’ flagship product. For a company that is in some kind of code red, the fact that they cannot get their ducks in a row to fix it is not a good sign.
Keep in mind that OpenAI is a young company. They should not have a thicket of ancient garbage to wade through to fix this. It’s not as if this is some complex Active Directory issue that no one knows how to fix because the design is 30-40 years old and supports layers and layers of legacy garbage.
It’s also possible that the majority of people hitting it are using the actual website support (which is utterly and completely useless), since the bug is only a bug in codex-rs to the extent that codex-rs should have either reverted or deployed a workaround already.
And yes, Sam is incredibly unlikable. Every time I see him give an interview, I am shocked how poorly prepared he is. Not to mention his “ads are distasteful, but I love my supercar and ridiculous sunglasses.”
Microsoft has GitHub - the world’s biggest pile of code training data, plus infinite cash.
OpenAI has …… none of these advantages.
Google has data, TPUs, and a shitload of cash to burn
but in this case it is; the ChatGPT name is really, really strong. It's like "just google it" instead of "just search the web".
Companies like Google produce and operate AI models largely using their own TPUs rather than NVidia's GPUs. We've seen the Chinese produce pretty competitive open models with either older NVidia GPUs or alternative GPUs because they are not allowed to buy the newer ones. And AMD, Intel and other chip makers are also eager to get in on the action. Companies like Microsoft, Amazon, etc. have their own chips as well (similar to Google). All the hyperscalers are moving away from NVidia.
And then Apple runs a non Intel and non NVidia based range of workstations and laptops that are pretty popular with AI researchers because the M series CPU/GPU/NPU is pretty decent value for running AI models. You see similar movement with ARM chips from Qualcomm and others. They all want to run AI models on phones, tablets, laptops. But without NVidia.
NVidia's bubble is about vastly overcharging for a thing that only they can provide. Their GPU chips have enormous margins relative to CPU chips coming out of the same/similar machines. That's a bubble. As soon as you introduce competition, the companies with the best price performance win. NVidia is still pretty good at what they do. But not enough to justify an order of magnitude price/cost difference.
NVidia's success has been predicated on its proprietary software and instruction set (CUDA). That's a moat that won't last. The reason Google can use its own TPUs rather than CUDA is that it worked hard to get rid of its CUDA dependence. Same for the other hyperscalers. At this point they can do training and inference without CUDA/NVidia, and it's more cost effective.
The reason that this 100B deal is apparently being reconsidered is that it is a bad deal for OpenAI. It was going to overpay for a solution that they can get cheaper elsewhere. It's bad news for NVidia, good news for OpenAI. This deal started out with just NVidia. But at this point there are also deals with AMD, MS, and others. OpenAI like the other hyperscalers is not betting the company on NVidia/CUDA. Good for them.
Yes it is. I think even for multiple reasons. Competition in that space not sleeping is one, but it's also a huge overestimation of demand combined with the questionable belief that those GPUs and the datacenters housing them can actually be built and put into operation as fast as envisioned.
> The reason that this 100B deal is apparently being reconsidered is that it is a bad deal for OpenAI. It was going to overpay for a solution that they can get cheaper elsewhere. It's bad news for NVidia, good news for OpenAI. This deal started out with just NVidia. But at this point there are also deals with AMD, MS, and others. OpenAI like the other hyperscalers is not betting the company on NVidia/CUDA. Good for them.
I think in the case of OpenAI both may be true. While what you are saying makes sense (NV's first-mover advantage obviously can't last forever), OpenAI currently has little to no competitive advantage over other players. Combine this with the fact that some (esp. Google) sit on a huge pile of cash. In contrast, for OpenAI the party is pretty much over as soon as investors stop throwing money into the oven, so they might need to cut back a bit.
https://preview.redd.it/sam-altman-on-the-model-v0-7u2a2o7lr...
The tools on top of the models are the path and people building things faster is the value.
Those without models are hugely vulnerable to sudden rug pulls.
They’re never gonna recover their investment and eventually their partners will run away.
The GPT models are not a moat.
- Nvidia is the most valuable company. Why? It makes GPUs. Why does that matter? Because AI is faster on them than CPUs, ASICs are too narrowly useful, and because first-mover advantage. AMD makes GPUs that work great for AI, but they're a fraction of the value of Nvidia, despite the fact that they make more useful products than Nvidia. Why? Nvidia just got there first, people started building on them, and haven't stopped, because it's the path of least resistance. But if Nvidia went away tomorrow, investors would just pour money into AMD. So Nvidia doesn't have any significant value compared to AMD other than people are lazy and are just buying the hot thing. Nvidia was less valuable than AMD before, they'll return there eventually; all AMD needs is more adoption and investment.
- Every frontier model provider out there has invested billions to get models to the advanced state they're in today. But every single time they advance the state of the art, open weights soon match them. Very soon, there won't be any significant improvement, and open weights will be the same as frontier, meaning there's no advantage to paying for frontier models. So within a few years, there will be no point to paying OpenAI, Anthropic, etc. Again, these were just first-movers in a commodity market. The value just isn't there. They can still provide unique services, tailored polished apps, etc (Anthropic is already doing this by banning users who have the audacity to use their fixed-price plans with non-Anthropic tools). But with AI code tools, anyone can do this. They are making themselves obsolete.
- The final form of AI coding is orchestrated agent-driven vibe-coding with safeguards. Think an insane asylum with a bowling league: you still want 100 people to autonomously (and in parallel) knock the pins over, but you have to prevent the inmates from killing anyone. That's where the future of coding is. It's just too productive to avoid. But with open models and open source interfaces, anyone can do this, whether with hosted models (on any of 50 different providers), or a Beowulf cluster of cobbled together cheap hardware in a garage.
- Eventually, in like 5-10 years (a lifetime away), after AI Beowulfs have been a fad for a while, people will tire of it and move back to the cloud, where they can run any model they want on a K8s cluster full of GPUs, basically the same as today. Difference between now and then is, right now everyone is chasing Anthropic because their tools and models are slightly better. But by then, they won't be. Maybe people will use their tools anyway? But they won't be paying for their models. And it's not just price: one of the things you learn quickly by running models, is they're all good for different things. Not only that, you can tweak them, fine-tune them, and make them faster, cheaper, better than what's served up by frontier models. So if you don't care about the results or cost, you could use frontier, but otherwise you'll be digging deep into them, the same way some companies invest in writing their own software vs paying for it.
- Finally, there's the icing on the cake: LLMs will be cooked in 10 years. I keep reading from AI research experts that "LLMs are a dead end" - and it turns out it's true. LLMs are basically only good because we invest an unsustainable amount of money in the brute-forcing of a relatively dumb form of iteration: download all knowledge, do some mind-bogglingly expensive computational math on it, tweak the results, repeat. There are only so many iterations of that loop you can do, because fundamentally, all you're doing is trying to guess your way to an answer from a picture of the past. It doesn't actually learn, the way a living organism learns, from experience, in real-time, going forward; LLMs only look backward. Like taking a snapshot of all the books a 6 year old has read, then doing tweaks to try to optimize the knowledge from those books, then doing it again. There's only so much knowledge, only so many tweaks. The sensory data of the lived experience of a single year of life of a 6 year old is many times more information than everything ever recorded by man. Reinforcement Learning actually gives you progressive, continuously improved knowledge. But it's slow, which is why we aren't doing it much. We do LLMs instead because we can speed-run them. But the game has an end, and it's the total sum of our recorded knowledge and our tweaks.
So LLMs will plateau, frontier models will make no sense, all lines of code will be hands-off, and Nvidia will return to making hardware for video games. All within about 10 years. With the caveat that there might be a shift in global power and economic stability that interrupts the whole game.... but that's where we stand if things keep on course. Personally, I am happy to keep using AI and reap the benefits of all these moronic companies dumping their money into it, because the open weights continue being useful after those companies are dead. But I'm not gonna be buying Nvidia stock anytime soon, and I'm definitely not gonna use just one frontier model company.
The closed LLMs with the biggest amount of users will eventually outperform the open ones too, I believe. They have a lot of closed data that they can train their next generation on. Especially the LLMs that the scientific community uses will be a lot more valuable (for everyone). So in terms of quality, the closed LLMs should eventually outperform the open ones, I believe, which is indeed worrisome.
I also felt anxious in early December about the valuations, but one thing remains certain: compute is in heavy demand, regardless of which LLM people use. I can't go back to pre-AI. I want more and more and faster and faster AI. The whole world is moving that way, it seems. I'm invested in physical AI atm (chips, RAM, ...), whose valuations look decently cheap.
- LLMs have fixed limitations. The first one is training, the dataset you use. There's only so much information in the world and we've largely downloaded it all, so it can't get better there. Next you can do training on specific things to make it better at specific things, but that is by definition niche; and you can actually do that for free today with Google's Tensors in free Cloud products. Later people will pay for this, but the point is, it's ridiculously easy for anyone to fine-tune training, we don't need frontier companies for that. And finally, LLM improvements come by small tweaks to models that already come to open weights within a matter of months, often surpassing the frontier! All you have to do is sit on your ass for a couple months and you have a better open model. Why would anyone do this? Because once all models are extremely good (about 1 year from now) you won't need them to be better, they'll already do everything you need in 1-shot, so you can afford to sit and wait for open models. Then the only reason left to use frontier cloud is that they host a model; but other people do cloud-hosted models! Because it's a commodity! (And by the way, people like me are already pissed off at Anthropic because we're not allowed to use OAuth with 3rd party tools, which is complete bullshit. I won't use them on general principle now, they're a lock-in moat, and I don't need them) There will also be better, faster, more optimized open models, which everyone is going to use. For doing math you'll use one model, for intelligence you'll use a different model, for coding a different model, for health a different model, etc, and the reason is simple: it's faster, lower memory, and more accurate. Why do things 2x slower if you don't have to? Frontier model providers just don't provide this kind of flexibility, but the community does. Smart users will do more with less, and that means open.
On the hardware:
- Def it will continue to be investment-worthy, but be cautious. The growth simply isn't going to continue at pace, and the simple reason is we've already got enough hardware. They want more hardware so they can continue trying to "scale LLMs" the way they have with brute force. But soon the LLMs will plateau and the brute force method isn't going to net the kind of improvements that justify the cost. Demand for hardware is going to drop like a stone in 1-2 years; if they don't cease building/buying then, they risk devaluing it (supply/demand), but either way Nvidia won't be selling as much product so there goes their valuation. And RAM is eventually going to get cheaper, so even if demand goes up, spending is less. The other reason demand won't continue at pace is investors are already scared, so the taps are being tightened (I'm sure the "Megadeal" being put on-hold is the secret investment groups tightening their belts or trying to secure more favorable terms). I honestly can't say what the economic picture is going to look like, but I guarantee you Nvidia will fall from its storied heights back to normal earth, and other providers will fill the gap. I don't know who for certain, but AMD just makes sense, because they're already supported by most AI software the way Nvidia is (try to run open-source inference today, it's one of those two). Frontier and cloud providers have Tensors and other exotic hardware, which is great for them, but everyone else is gonna buy commodity chips. Watch for architectures with lower price and higher parts availability.
https://www.macrobusiness.com.au/2021/05/the-great-semicondu...
Here is a long article from last year about Sam Altman.
https://www.nytimes.com/2024/09/25/business/openai-plan-elec...
https://finance.yahoo.com/news/tsmc-rejects-podcasting-bro-s...
> TSMC’s leadership dismissed Altman as a “podcasting bro” and scoffed at his proposed $7 trillion plan to build 36 new chip manufacturing plants and AI data centers.
I thought it was ridiculous when I read it. I'm glad the fabs think he's crazy too. If he wants this then he can give them the money up front. But of course he doesn't have it.
After the dot com collapse my company's fabs were running at 50% capacity for a few years and losing money. In 2014 IBM paid Global Foundries $1.5 billion to take the fabs away. They didn't sell the fabs, they paid someone to take them away. The people who run TSMC are smart and don't want to invest $20-100 billion in new fabs that come online in 3-5 years just as the AI bubble bursts and demand collapses.
https://gf.com/gf-press-release/globalfoundries-acquire-ibms...
https://www.theregister.com/2026/01/29/oracle_td_cowen_note/
Edit: Another src https://www.cio.com/article/4125103/oracle-may-slash-up-to-3...
We all know this is a speculative run-up. We all know it'll end somehow. Crashes always start with something like this. Is this the tipping point? Damned if I know. But it'll come.