But to your point, that is exactly how American companies like to play now. No one is stopping them from screwing over the consumer.
I have a Micron near me and they are building another chip facility but we are years away still so I suspect China will beat them to the punch.
SK Hynix and Samsung are South Korean.
Basically:
China floods the market with cheaper but less QA'd parts, makes a gazillion dollars, and spends that money fixing yield and QA issues and streamlining operations. By the time that happens, Micron and maybe a few other existing players will have new memory production online, and then we'll have a flood of cheap, reliable memory. Four years, maybe?
China is very far away from flooding the DRAM market.
The DRAM fabs have been on a roundabout for 40 years going from getting accused of price fixing and cartel behavior, to struggling to keep the lights on.
And imo it's not really their fault, it's all the lead time of advanced semiconductors, combined with the commodity dynamics of oil. And the goal is to match that supply to the demand of everything from consumer electronics to more datacenters than you can shake a stick at.
It's maddening to try and solve that, so at this point I really don't fault them for prioritizing survival.
I'm guessing you are also unfamiliar with terms like "chicken game", which refers to the cutthroat, high-stakes price wars where dominant semiconductor manufacturers intentionally overproduce and slash prices. This is literally how the industry went from dozens of players in the '80s to just three majors today.
The industry is so naturally prone to oversupply that the only stable equilibrium is undersupply. Aggressive expansion kicks off a price war, which immediately undercuts the logic of the expansion.
This only changes with new entrants, which will come, especially from China. But it takes time to build fab capacity, so the medium-term modal outcome is consistent undersupply.
Memory in particular ... https://en.wikipedia.org/wiki/DRAM_price_fixing_scandal
The entry-cost to getting into memory is on the order of $billions and years - you can do just about anything...
If it costs you $1B and five years to build out new supply and you think demand will not sustain for more than three years, it does not make sense to expand supply.
Instead you will maintain your margins currently and await demand to decrease back to your current supply.
This is pretty common and as others have pointed out is even more common in markets where competition is slow and lead times are long.
Ammunition is a great example over the last decade or so as political turnover caused relatively short lived demand spikes and manufacturers didn't expand supply because they knew once political winds shift, demand would decrease.
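The build-or-wait arithmetic above ($1B and five years to build, three years of expected demand) can be sketched roughly like this. The function name, the flat annual margin, and the $400M figure are assumptions made up for illustration, not numbers from any real fab:

```python
def expansion_npv(capex, build_years, demand_years, annual_margin, discount_rate=0.0):
    """Net present value of building new capacity (toy model).

    The fab earns nothing while under construction, then earns
    `annual_margin` per year only for the part of the demand window
    that is still open once it comes online.
    """
    earning_years = max(0, demand_years - build_years)
    npv = -capex
    for year in range(build_years + 1, build_years + earning_years + 1):
        npv += annual_margin / (1 + discount_rate) ** year
    return npv


# Numbers from the comment: $1B capex, five-year build, three years of demand.
# The $400M/year margin is a hypothetical placeholder.
short_boom = expansion_npv(1e9, 5, 3, 4e8)
# Hypothetical longer boom: eight years of demand leaves three earning years.
long_boom = expansion_npv(1e9, 5, 8, 4e8)

print(f"three-year boom: {short_boom:,.0f}")
print(f"eight-year boom: {long_boom:,.0f}")
```

If the boom ends before the fab opens, the whole capex is lost no matter how fat the margin would have been, which is why waiting for demand to fall back can be the rational move.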
The thing is they tend to only do that when they can get a technological competitive advantage. The priority access gives them a locked in competitive edge, for a while. It’s not clear there is an opportunity like that in memory.
There’s a lot to criticize Sam Altman for saying or popularizing culturally but I’ve come to think his “this is the worst it will ever be” is, in the long run, actually a very intriguing and underrated point.
In a decade, training LLMs to the current level of sophistication (which is, in my opinion, rather advanced and probably has lots of additional upside just from constructing better RL training regimes, independent of hardware advancement) will be as much table stakes as running a database is now. I highly recommend everyone look into the Allen Institute's projects on GitHub and HF, because they have open source training materials (including an LLM from scratch off Common Crawl, and some quite interesting tunes of Qwen), to get a taste of what will in the near future be afternoon projects or educational material. The future is going to be wild.
Until everything matures, most likely the current iteration of OpenAI and Anthropic will be long gone, along with their current business models.
You can get 100x the output with the same energy use.
Posits do a little better if your numbers are biased enough toward 1, but not much better. A 16 bit posit in a near-ideal situation matches an 18 bit IEEE float, and in a pretty wide range of situations loses to either fp16 or bf16.
Training anything at 8 bits is going to be tough, and it's hard to say if the flexible exponent is worth the precision tradeoffs.
The difficult question is more whether foreseeable memory demand will remain at the current level, grow even further, or shrink again.
Likewise it's probably dwarfed by improvements in how we make dram - continuing the roughly exponential (maybe a bit less recently) scaling of chips - but not necessarily.
The 2x from returning to previous costs is interesting because it's practically guaranteed, and it's on top of everything else. We're just currently "overpaying" (relative to the stable market price) for the manufacture of dram because of a sudden increase in demand.
> this is just not true at all, there are massive leaps from algorithms, data, etc. every year. scale is one axis of many and you need to get them all correct.
Or the more likely scenario that the AI bubble bursts and the hyperscalers realize they have built too many data centers.
The up-front investment of a memory fab is measured in billions, and it takes years to construct and get running. The margin on the chips themselves is terrible, so without scale it's not worth even trying. DDR5 is an industry standard that takes some effort to conform to, but the license fees are a drop in the bucket compared to the cost of creating a fab.
The fabricators were cautious about increasing production, and slow to start planning. It takes further time to build up capacity, and if demand drops, they may end up producing DRAM at a loss when the market flips over to oversupply. The demand whiplash could kill any company that dared to bet on increasing production. See the "bullwhip effect" https://en.wikipedia.org/wiki/Bullwhip_effect which has killed semiconductor fabricators before.
There is a discussion to be had about how to maintain national semiconductor production in Europe and US as a strategic industry, but historic attempts have all failed.
Also that's not what the bullwhip effect is - although I know what you are saying. The bullwhip scenario is about the effect of communication and batching through various layers in the supply chain, this is more similar to the cobweb effect/theory.
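For anyone who hasn't run into the cobweb model: producers commit capacity based on the price they saw last period, and the price then has to clear whatever supply shows up. A minimal sketch, assuming linear demand and supply curves with made-up coefficients (a, b, c, d are illustrative, not fitted to any DRAM data):

```python
def cobweb_prices(p0, periods, a=100.0, b=1.0, c=10.0, d=0.8):
    """Price path when supply reacts to last period's price.

    Demand:  Q = a - b * P_t
    Supply:  Q = c + d * P_(t-1)   (capacity was committed a period ago)
    """
    prices = [p0]
    for _ in range(periods):
        supply = c + d * prices[-1]        # decided on the previous price
        prices.append((a - supply) / b)    # price at which demand absorbs that supply
    return prices


def equilibrium_price(a=100.0, b=1.0, c=10.0, d=0.8):
    """Price at which supply and demand agree with no lag."""
    return (a - c) / (b + d)


damped = cobweb_prices(p0=80.0, periods=10)            # d < b: swings shrink
explosive = cobweb_prices(p0=45.0, periods=10, d=1.2)  # d > b: swings grow

print([round(p, 1) for p in damped])
print([round(p, 1) for p in explosive])
```

In this toy version the price overshoots the equilibrium in alternating directions every period. Whether the swings die out or blow up depends only on how strongly supply reacts to price relative to demand; run long enough, the explosive case produces nonsense such as negative prices, which is a limit of the linear model and not a prediction.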
If it was just variable costs and new capacity was available today they’d do it. But there are substantial fixed costs and delays to increasing capacity, and that uncertainty makes it risky.
Now it's 2021 and someone gets a tanker stuck in the Suez, sending the price of oil sky-high. How long does the ship have to be stuck before you spend those billions of dollars on a bet that it'll recoup before someone gets the ship out?
Really?
How long do we have to wait until that ... cost reduction hits us?
It just takes that long to get a fab up and running.
Safe to say at least a year or two. It'd be shocking if it took a decade.
That is to say, at least you were able to buy them at $350 today. With the current trajectory there will be no supply at all in a few months.
That is not normal pricing. Before the pricing collapse in early 2020, 96GB DDR5 would have cost about $450 to $500. And I need to restate that the cost of DRAM hasn't really changed much in the past 20 years. Its price just goes up and down in cycles.
So in reality it is more like going from $500 to $1300. But to consumers it felt more like going from $200 to $1300.
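To put numbers on the gap between those two baselines, here is a quick sketch using only the dollar figures quoted above (the helper name is made up):

```python
def pct_increase(old, new):
    """Percentage increase going from `old` to `new`."""
    return (new - old) * 100 / old


print(pct_increase(500, 1300))  # 160.0 (from the "normal" cycle price)
print(pct_increase(200, 1300))  # 550.0 (from the bottom-of-cycle price)
```

Same $1300 sticker, but measured from the trough it looks more than three times as bad.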
Crucial is already selling DRAM made by CXMT, and China is already throwing money at it. I doubt the memory bubble will burst in the next 12-24 months, as in going back to money-losing DRAM pricing, because they will all pivot to HBM or other money-making products. But the bulk of lower-end consumer DDR5 or LPDDR5 will go to Chinese fabs, assuming they have figured out how to make it well. They have improved but are still far behind the industry leaders.
Normally memory makers would push the next DDR standard to market just to push out Chinese competitors. I am not sure it will work the same way this time around. DDR5 has plenty of other uses and demand.
Historically the price has always trended downward. When I first got into computing $200 could buy you 128 MB (yes M) of ram. Really nice systems had 512 MB.
That's obviously changed over the decades as process shrinks have led to higher memory density. We should generally expect that RAM will get cheaper up until the point where process shrinks stop happening. They've definitely slowed, but they haven't stopped.
I am keeping a piece of paper that came with my Tex Murphy game which stated that one could get 32MB of RAM for as little as $700 (1990s dollars) which would drastically improve the game!
Crucial was disestablished this year.
Looking at the current prices, even of the same RAM, is just insane. Those companies really need to pay us compensation damage here. The whole "free market" notion does not work when you have de-facto monopolies and mega-corporations abuse average Joe and average Jane.
I don't see it going away. I mean, it may not grow as fast as now, but I don't see it going away either. I get why the memory makers do not want to bankrupt themselves, but it feels like there's got to be some way to push that risk off onto model providers and other people in the ecosystem to allow us to grow RAM capacity more like 50% per year.
I don't actually know what the rate of growth before October was, I'm sure someone round here will though.
As for 20-25% growth not being enough, I think it's not that far off, if we assume data center build out plans hit a wall and slow down significantly, and the AI heat starts to cool off.
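For a feel of how far apart 20-25% and 50% annual growth really are, here is a small compounding sketch. The rates come from the comments above; the five-year horizon and the function name are arbitrary choices for illustration:

```python
import math


def years_to_multiply(growth_rate, factor=2.0):
    """Years of compound growth at `growth_rate` needed to scale capacity by `factor`."""
    return math.log(factor) / math.log(1 + growth_rate)


for rate in (0.20, 0.25, 0.50):
    print(f"{rate:.0%}/yr: doubles in {years_to_multiply(rate):.1f} years, "
          f"{(1 + rate) ** 5:.1f}x after five years")
```

The gap compounds quickly, which cuts both ways: it is what the 50% camp wants, and it is also how you end up with a glut if demand flattens partway through.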
I don't think 20-25% will be enough in the short term, but if the AI build-out stops within this year, we'll have a massive oversupply instead of an undersupply.
Let me explain. Imagine CXMT grows massively and builds a lot of fabs, so much so that it becomes the leader in multiple segments, and then market demand cools off.
Then CXMT, the company that invested massively, has an oversupply, so it undercuts every other memory company.
In other words, Samsung and SK Hynix are dead, and to protect Micron the US now has a 10000% tariff on memory imports.
Imagine that, because it has happened before. If you don't play the boom-and-bust game, someone else will, because the market is very large during a boom, and the player scaling up the most generally isn't the one with margins to protect and generally has the ability to undercut others.
Asian memory chip giants were made by undercutting European and American companies. American companies adapted by moving manufacturing to Asia, and European ones got bought for pennies or dissolved.
But can massive gains still be made? Definitely.
The entire AI hype is based on the paper "Attention Is All You Need", and attention basically means keeping a huge matrix of all the tokens in memory. How well you can optimize this attention layer is basically how most architectures try to solve for performance and memory usage.
The only one with significant gains there is DeepSeek (or so I would like to believe, because others don't make their work open for folks like me, not in Big AI Labs, to read). Their MLA architecture reduced KV-cache memory requirements by up to 90%, and ofc that's a purely architectural change.
With some quantization like TurboQuant from Google you could push it down to ~1/3 of that, so roughly 96-97% memory savings when talking about KV cache.
But the models are close to being saturated for quantization based memory optimizations. We will have to see some architectural changes for a significant shift now.
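Here is how the stacked savings work out, as a rough sketch. The KV-cache size formula is the usual one for plain multi-head attention; the model shape (32 layers, 32 KV heads, head dimension 128, 32k context, fp16) is a hypothetical example, and the 90% and 1/3 factors are just the figures quoted above applied as flat multipliers, not measurements:

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, bytes_per_value=2):
    """KV-cache size for one sequence under plain multi-head attention."""
    # Factor of 2 because both keys and values are cached for every token.
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_value


# Hypothetical model shape, fp16 cache, 32k-token context.
baseline = kv_cache_bytes(layers=32, kv_heads=32, head_dim=128, seq_len=32768)

after_mla = baseline * (1 - 0.90)   # "up to 90%" architectural reduction
after_quant = after_mla / 3         # quantized down to about a third of that
savings = 1 - after_quant / baseline

print(f"{baseline / 2**30:.1f} GiB -> {after_quant / 2**30:.2f} GiB ({savings:.1%} saved)")
```

Under these assumptions the two steps multiply out to a bit under 97%, and anything beyond that has to come from the architecture again.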
We just haven’t reached the diminishing return of gen AI capabilities yet.
Models will get more useful if you have higher context size or higher param size. Then people will just use the models even more, leading to even more memory demand.
What if it's in everyone's interest to buy computers at say 1/3rd the rate and switch everything over to HBM?
The discrepancy between compute and memory has been growing for ages. Perhaps a painful switch to HBM is exactly what we need?
Would you rather have 3 intermediate computers with low memory bandwidth, or wait a little longer statistically so that we can all enjoy a new computer at 1/3rd the rate but much higher bandwidth than the area ratio?
As always, some interpret certain recent events as reason to conclude "but this time it's different." Occasionally they are correct. But that doesn't change the fact that it's reasonable to assume some of the recent extreme, rapid price inflation is due to shorter term market distortion. It's also pretty clear that some of the recent increase in demand represents a stable increase in the long-term trendline. The question is how much is long-term stable and how much is short-term distortion.
People used to get into gaming pcs as an affordable hobby, now it’s making general aviation look like plan B.
The only hope left is really Apple, but even Apple has conspicuously delayed the launch of the M5-generation Mac mini and Mac Studio, mostly because even Apple can't source enough DRAM to fully supply all its product lines.
You don't even have to drop down to old indie games. You just have to turn off the FPS counter and stop pixel peeping screenshots.
Prices haven't risen THAT much and are quite affordable. And if you look at the improved quality of upscalers (DLSS 4.5 for example), gaming is now more affordable than ever, despite the increased cost of components.
Of course, the 5090 prices are insane, as are for SOME memory models, but that's nothing new and represents a fairly small market share.
> When I started building gaming pcs, the top top card was 750$ (NZD)
When I started building gaming PCs, the top $700 cards didn't even provide comfortable performance or graphics. Back then, you were supposed to have several of them connected via SLI or something. And even then, it wasn't always reliable, and it resulted in stuttering, lag, and graphical artifacts (in the cases when it worked). Today, even $700 graphics cards are a much better product from a user perspective than the high-end cards of that time (and that's not even taking into account that $700 back then was worth much more).
As for how much the prices have actually risen, it’s not hard to see if this is true or not. If doubling of prices doesn’t raise your eyebrows, I’m not sure what will.
I just don't see the cost savings of sharing a GPU overcoming the extra expense + profit such a service would need.
Can’t afford a computer because they bought up all the supply? They’ll conveniently sell it back to you with a subscription!
You’ll own nothing and be happy.
They've intentionally crafted an unsustainable business model in an effort to get users in the front door and raise their MAUs. We've seen this story before. We should know precisely where it's headed.
Sorry that “it is going unused”? From what I've read, most AI providers are capacity constrained.
However, that the hyperscalers and AI companies aren't doing this says a lot about their true beliefs about how much future demand AI will have.
AI companies claim they will need a ton of massive expansion, but are unwilling to take on the risk of the capital needed for that expansion.
I'm hearing a lot of sad whining from AI folks about how these chip makers are holding them back, but who actually has the money to finance the expansion easily? Chip makers have been through this game far longer. When Sam Altman went around claiming it was time for $7T of fabs, the AI companies made it clear that they were willing to make ridiculous claims, eliminating their credibility.
What's needed now is for them to funnel a tiny amount of their massive piles of cash into financing fabs directly.
With what money? They have to spend the money they get on hardware ASAP else they are left behind.
Just look at how Intel has struggled to compete in recent years, and they have been in the business for decades.
They forgot Moore's main lesson: only the paranoid survive. They thought they could coast, and it nearly killed them.
"Only the Paranoid Survive" is rather a quote and book title by Andrew S. Grove.
Most memory companies have backroom deals to exchange tit-for-tat patent violations against each other.
Not sure how a new memory manufacturer comes into being without getting sunk by licensing costs?
Most users don't seem to care about storing everything they generate in cloud services and this could easily be sold as an alternative to owning "expensive" desktop or laptop hardware.
If hyperscalers are using more RAM, and that RAM is not available for consumers, it means all the heavy stuff will happen in the cloud. Why would we want both the hyperscalers and consumers to have RAM simultaneously? Consumers would want more RAM to run local models but then hyperscalers capacity will be unused.
The VRAM in the 5090 is only made by one country in the world.
The 50xx series is special, because its ram is so dependent on a single commodity. It’s not like a 4090 or a 3090; their VRAM chips have been around for years.
If there’s a shortage or interruption in GDDR7 VRAM, it seems like every GPU that requires it would explode in value.
I hope I don’t regret posting this because I’d really like to buy one myself…
I really need to shut up, or bite the bullet and buy one.
If you graph the tokens per second on the 5090, your jaw will hit the floor at how cheap it is
If it's 4k instead of 2k msrp, that's a 100% increase.
The RTX 5090 is faster than an H200. It just has less RAM (32 GB vs 141 GB), doesn't have NVLink, and technically isn't allowed to be used in a datacenter.
The datacenter GPUs sell at an 80% margin. They're incredibly overpriced. But the laws of supply and demand are undefeated and so here we all are.
H200 has HBM and much more 64-bit compute
RTX 5090 has more CUDA cores that run at a higher clock speed. H200 has more RAM and significantly more RAM bandwidth.
Which one is net faster depends on your use case. But you may be very surprised that many workflows are faster on an RTX 5090!
Also had to do an Intel build, and there was no way we were going cudimm at current prices. =3
No doubt Cloud Gaming is in the cards for the future, only purists like myself with an RTX 5090 will pay premium for offline gaming
Once enough gaming compute runs at the edge it also allows for more technically advanced games than would currently be economically feasible (but aren’t made mostly for lack of a market/adoption of cloud gaming and the resulting lack of technical know-how). So I think it will stick and probably end up winning over the holdouts, once the cost of rendering the games they want to play with consumer hardware becomes too large to stomach.
NVIDIA in their recent quarterly report stopped categorizing "Geforce" as a single category, and merged it into "Edge-Computing".
If you are a PC Gamer or PC Enthusiast as I am, then we have some dark times ahead.
Or, we could be fucked.
"Order yours now, for just $99.99 per month, hardware included! Order today, and you will get three months of 'Office Suite' for free, with a small additional cost of $49.99 after month 4. On a tight budget? Switch to the yearly subscription, and pay comfortably in 18 installments."
As long as the discussion seems focused on memory, I'd suspect the latter, but if it's really the semiconductor boules/wafers, then I'd expect the boule growers to profit, not the memory makers, who just pass on the cost.
So which is it?
Dram is just extremely specialised.
I asked for evidence because different people keep feeding me opposite stories: one insists it's not fab capacity but wafer competition, with a recent article claiming HBM3E takes 3 times as much wafer area per bit as LPDDR5X. Others tell me the complete opposite: it's fab capacity, not a wafer shortage.
Do we have citable references to ground either set of claims?
From your sibling comment, I think you're interpreting the 3x HBM stat as contributing to making wafers scarce. It's more that the next wafer to be processed in a fab is especially precious, making the opportunity cost larger. The beach sand remains plentiful.
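The opportunity-cost point can be made concrete with a toy calculation. It assumes a fab with a fixed number of wafer starts and takes the quoted 3x area-per-bit figure at face value; the function name and the wafer shares are invented for illustration:

```python
def relative_bit_output(hbm_wafer_share, hbm_area_per_bit=3.0):
    """Total bits shipped, relative to the same fab making only commodity DRAM."""
    if not 0.0 <= hbm_wafer_share <= 1.0:
        raise ValueError("hbm_wafer_share must be between 0 and 1")
    commodity_bits = 1.0 - hbm_wafer_share           # one unit of bits per wafer
    hbm_bits = hbm_wafer_share / hbm_area_per_bit    # a third of the bits per wafer
    return commodity_bits + hbm_bits


for share in (0.0, 0.15, 0.30, 0.50):
    print(f"{share:.0%} of wafers to HBM -> {relative_bit_output(share):.0%} of the bits")
```

No raw wafer has to be scarce for this to bite: with processing capacity fixed, every wafer start moved to HBM removes roughly three wafers' worth of HBM-equivalent bits from the commodity side.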
So which is the bottleneck: fabs or boule growing?
Also consider that most solar panels are monocrystalline silicon. How credible is a silicon wafer shortage, really? There is so much disinformation in this market...
Surely they need GPU capacity and would need memory for those GPUs but OpenAI doesn't build GPUs or any hardware, right? So did they pay to keep the supply locked up, or do they have the ability to put that ram into use?
SeedLM from Apple is an interesting approach for inference memory efficiency. I'd like to see someone try and build that into training so that it's not a post training compression step.
If you made it 10x cheaper right now you would see a truly unimaginable wave of bot slop.
And by doing this, they ensure local LLMs never become feasible for the vast majority of people and AI companies solidify subscriptions forever.
The reason memory prices can stay high for years in this mega cycle is because the 3 players will be very cautious on overbuilding. They’d rather under build, make great profit (not maximum) and reduce the risk of going bust if this suddenly ends.
Same for TSMC in chips.
Great opportunity for Chinese companies though. This shortage is exactly what Chinese companies need to scale.
Then why do only 3 companies make it?
When Samsung had to sell memory at a loss after COVID, no one came to save them. They buffered their memory division using profits from their other businesses. That’s how Samsung survives memory downturns.
According to some stories, this is how Samsung convinced TSMC to not enter the memory business - that you need a nation or other lines of business to prevent bankruptcies.
The market has stabilized to 3 players.
Placing the bet isn't as hard as making an accurate prediction.
These two aren't related.
DRAM is a commodity because you can replace a chip from Hynix with a chip from Micron; they have the same behaviour.
And price-competitive DRAM isn't easy to manufacture, or China would have made it already.
Exactly, so what’s the incentive for anyone to sink half a billy into building out more capacity?
The existing players get to rest on their laurels and succeed whether or not the AI bubble busts.
Samsung, SK Hynix, and Micron all have to balance between capex spending, making as much profit as possible, and risk of bankruptcy.
Right now their opportunity cost is too high.
> risky it is to spin up a new fab
You don't need a new fab. You can build memory in a 20-year-old fab.
This boom is magnitudes higher than before. The attention will be endless.
Memory is a commodity, so I think you will be very lonely in your quest.
Why were tech-savvy investors unable to figure this out when the datacenter craze had already started?
How do you explain memory lagging behind the quickly rising demand for all the other datacenter components?
https://davidoks.blog/p/ai-is-killing-the-cheap-smartphone
Maybe long-term purchase agreements from big buyers might have helped convince them it's okay to build, but apparently it didn't happen.
The entire sector is now facing a critical RAM starvation crisis where memory manufacturers are actively slow-rolling supply just to keep prices high and avoid running out entirely.
This has created an unprecedented supply-and-demand distortion where desperate companies are getting rejected even at a 5x markup, and mission-critical SKUs are skyrocketing to 10x and 20x their baseline value.
It is a macroeconomic squeeze at a staggering scale, and the massive venture scale opportunity lies in capturing the value created by this memory gatekeeper.
From the perspective of an armchair economist, the winners will be the investors who invest in RAM wisely. The losers will likely be cash strapped SAAS companies. They’re almost completely dependent on a fleet of servers in the hyperscalers, and they’re leasing those servers and services. That leaves small SAAS companies exposed to incoming inflation in the cost of hosting.
Which they will pass on to their customers. If their product provides enough value the customers will pay.....
A lot of capex is supposed to go into the datacentres. Didn't they know that datacentres need to be filled with, among other stuff, RAM? I wonder if at some point we will discover that there is a shortage of fibre optic cables or SFPs ...
PS: Obviously armchair economist here too ... but it doesn't seem too difficult to foresee the increase in demand.
I only feel sorrow for the electron devs, they will have a hard time.
WallStreetBets has been disturbingly accurate in its predictions - basically anything related to AI.
Memory squeeze will get worse before it gets better.
As you stated it, it would merely be a property of (nearly) all demand curves. Jevons paradox only happens sometimes. It isn't a law.
We are going to have amazing cheap used hardware for a decade.