Going by CUDA core counts and clock speed alone, the 5090 just has way more CUDA cores and uses proportionally more power than the 4090.
All of the "massive gains" came from comparing DLSS and other optimization strategies against standard hardware rendering.
Something tells me Nvidia made next to no gains for this generation.
https://www.youtube.com/watch?v=ghT7G_9xyDU
We do see power requirements climb on the high-end parts every generation, but that may be to maintain the desired SKU price points. There are clearly some major perf/watt improvements if you zoom out. I don't know how much is arch vs node, but they have plenty of room to dissipate more power over bigger dies if needed for the high end.
The Intel problem was that their foundries couldn't keep improving the process while the other foundries kept improving theirs. But technically Nvidia can switch foundry if another one proves better than TSMC, even though that doesn't seem likely (at least without a major breakthrough that ASML hasn't already capitalized on).
None of this is magic. None of it is even particularly hard. There's no reason for any of it to get stuck. (Intel's problem was letting the beancounters delay EUV - no reason to expect there to be a similar mis-step from Nvidia.)
> Something tells me Nvidia made next to no gains for this generation.
Sounds to me like they made "massive gains". In the end, what matters to gamers is:
1. Do my games look good?
2. Do my games run well?
If I can go from 45 FPS to 120 FPS and the quality is still there, I don't care if it's because of frame generation and neural upscaling and so on. I'm not going to be upset that it's not lovingly rasterized pixel by pixel if I'm getting the same results (or better, in some cases) from DLSS.
To say that Nvidia made no gains this generation makes no sense when they've apparently figured out how to deliver better results to users for less money.
I use DLSS-type tech, but you lose a lot of fine detail with it. Faraway text looks blurry, textures aren't as rich, and the lines between individual models lose their sharpness.
Also, if you’re spending $2000 for a toy you are allowed to have high standards.
Making individual frames and benchmark numbers look better at the cost of a worse gameplay experience is an old tradition for these GPU makers.
- Data Center: Third-quarter revenue was a record $30.8 billion
- Gaming and AI PC: Third-quarter Gaming revenue was $3.3 billion
If the gains are for only 10% of your customers, I would put this closer to the "next to no gains" rather than the "massive gains".
I'd like to point you to r/FuckTAA
>Do my games run well
If the internal logic still runs at sub-120 Hz and it's a twitchy game, then no.
https://www.techpowerup.com/gpu-specs/nvidia-gb202.g1072
Maybe there is an RTX 5090 Ti being held in reserve. They could potentially increase its compute by 13% and its memory bandwidth by 25% versus the 5090.
I wonder if anyone will try to solder 36Gbps GDDR7 chips onto a 5090 and then increase the memory clock manually.
It isn't being kept a secret; it's being openly discussed that they need to leverage AI for better gaming performance.
If you can use AI to go from 40 fps to 120 fps with near-identical quality, then that's still an improvement.
So the biggest benefit is PCIe 5 and the faster/more memory (credit going to Micron).
This is one of the worst generational upgrades. They’re doing it to keep profits in the data center business.
I can understand lack of supply, but why can't I go on nvidia.com and buy something the same way I go on apple.com and buy hardware?
I'm looking for GPUs and navigating all these different resellers with wildly different prices and confusing names (on top of the already confusing set of available cards).
1. Many people knew the new series of nvidia cards was about to be announced, and nobody wanted to get stuck with a big stock of previous-generation cards. So most reputable retailers are just sold out.
2. With lots of places sold out, some scalpers have realised they can charge big markups. Places like Amazon and Ebay don't mind if marketplace sellers charge $3000 for a $1500-list-price GPU.
3. For various reasons, although nvidia makes and sells some "Founders Edition" cards, the vast majority of cards are made by other companies. Sometimes they'll do 'added value' things like adding RGB LEDs and factory overclocking, leading to a 10% price spread among cards with the same chip.
4. nvidia's product lineup is just very confusing. Several product lines (consumer, workstation, data centre) times several product generations (Turing, Ampere, Ada Lovelace) times several vram/performance mixes (24GB, 16GB, 12GB, 8GB) plus variants (Super, Ti) times desktop and laptop versions. That's a lot of different models!
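Multiplying those categories out shows the scale of the problem (a purely illustrative count; the category sizes below are assumptions, not nvidia's actual catalog):

```python
# Illustrative combinatorics of the lineup: product lines x generations
# x memory tiers x variants (base/Super/Ti) x desktop-vs-laptop.
lines, gens, memory_tiers, variants, form_factors = 3, 3, 4, 3, 2
print(lines * gens * memory_tiers * variants * form_factors, "possible models")  # 216
```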
nvidia also don't particularly want it to be easy for you to compare performance across product classes or generations. Workstation and server cards don't even have a list price, you can only get them by buying a workstation or server from an approved vendor.
Also nvidia don't tend to update their marketing material when products are surpassed, so if you look up their flagship from three generations ago it'll still say it offers unsurpassed performance for the most demanding, cutting-edge applications.
https://www.techpowerup.com/gpu-specs/rtx-6000-ada-generatio...
It's just not their main business model, it's been that way for many many years at this point. I'm guessing business people have decided that it's not worth it.
Saying that they are "resellers" isn't technically accurate. The 5080 you buy from ASUS will be different from the one you buy from MSI.
Most people don't realize that Nvidia is much more of a software company than a hardware company. CUDA in particular is like 90% of the reason why they are where they are while AMD and Intel struggle to keep up.
Also aren't most of the business cards made by Nvidia directly... or at least Nvidia branded?
> it's not worth it.
I wonder how much "it's not worth it". Surely it would have been at least somewhat profitable? (An honest question.)
So scalpers want to make a buck on that. That's all there is to it. Whenever demand surpasses supply, someone will try to make money off the difference. Unfortunately for consumers, that means scalpers use bots to clean out retail stores and then flip the cards to consumers.
You can? Thought this thread was about how they're sold out everywhere.
So nvidia wouldn't have the connections or skillset to do budget manufacturing of the low-cost boards the GPU sits on the way ASUS or EVGA does. Plus, with so many competitors angling to use the same nvidia GPU chips, nvidia collects all the margin regardless.
There is profit in this, but it’s also a whole set of skills that doesn’t really make sense for Nvidia.
3090 - 350W
3090 Ti - 450W
4090 - 450W
5090 - 575W
3x3090 (1050W) is less than 2x5090 (1150W), plus you get 72GB of VRAM instead of 64GB, if you can find a motherboard that supports 3 massive cards or good enough risers (apparently near impossible?).
I'm not buying GPUs that expensive nor energy consuming, no chance.
In any case, I think Maxwell/Pascal-level efficiency won't be seen again; with those RT cores you get more energy draw, and you can't get around that.
I built a gaming PC aiming to last 8-10 years. I spent $$$ on a MO-RA3 radiator for the water cooling loop.
My view:
1. a gaming PC is almost always plugged into a wall powerpoint
2. loudest voices in the market always want "MOAR POWA!!!"
1. + 2. = gaming PC will evolve until it takes up the max wattage a powerpoint can deliver.
For the future: "split system aircon" built into your gaming PC.
As for consumers: they don't care.
PCIe Gen 4 dictates tighter signalling tolerances to achieve a faster bus speed, and it took quite a while for good-quality Gen 4 risers to come to market. I have zero doubt in my mind that Gen 5 tightens that up even further, making the product design just that much harder.
But NVIDIA is claiming that the 5070 is equivalent to the 4090, so maybe they’re expecting you to wait a generation and get the lower card if you care about TDP? Although I suspect that equivalence only applies to gaming; probably for ML you’d still need the higher-tier card.
[1] https://nvidianews.nvidia.com/news/nvidia-puts-grace-blackwe...
[2] https://www.nvidia.com/en-us/geforce/graphics-cards/50-serie...
But it seems more aimed at inference from what I’ve read?
But the advantage is that you can load a much more complex model easily (32GB vs 24GB matters, since 24GB is only just barely enough for a 70B-parameter model).
More power consumption means more heat in my room; you can't escape thermodynamics. I have a small home office, just 6 square meters, and during summer the energy draw in my room makes a gigantic difference in temperature.
I have no intention of drawing more than 400W total while gaming, and I'd rather compromise by lowering settings.
Energy consumption can't keep increasing over and over forever.
I could even understand it on flagships, which are meant for enthusiasts, but all the tiers have been ballooning in energy consumption.
(with the suggested 1000 W PSU for the current gen, it's quite conceivable that at this rate of increase soon we'll run into the maximum of around 1600 W from a typical 110 V outlet on a 15 A circuit)
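For what it's worth, the back-of-the-envelope outlet math (assuming the common 80% continuous-load rule on North American circuits) makes that ceiling even tighter:

```python
# Outlet power budget on a typical North American 110 V / 15 A circuit.
volts, amps = 110, 15
peak_w = volts * amps        # 1650 W absolute maximum
continuous_w = peak_w * 0.8  # ~1320 W under the common 80% continuous-load rule
print(f"peak: {peak_w} W, sustained: ~{continuous_w:.0f} W")
```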
I do a lot of training of encoders, multimodal models, and vision models, which are typically small enough to fit on a single GPU; multiple GPUs enable data parallelism, where each GPU holds an independent copy of the model and the data is split between them.
Occasionally fine-tuning large models and need to use model-parallelism, where the model is split across GPUs. This is also necessary for inference of the really big models, as well.
But most tooling for training/inference of all kinds of models supports using multiple cards pretty easily.
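As a rough illustration, single-node data parallelism in PyTorch can be as simple as the sketch below (the model and batch here are made up for the example; `DistributedDataParallel` is what you'd reach for in serious multi-GPU runs):

```python
# Minimal data-parallel training step in PyTorch: one model replica per
# visible GPU, with each batch split across them automatically.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 10))
model = nn.DataParallel(model).cuda()   # replicates the model onto every GPU

opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
x = torch.randn(256, 512).cuda()        # this batch gets sharded across GPUs
y = torch.randint(0, 10, (256,)).cuda()

loss = nn.functional.cross_entropy(model(x), y)
opt.zero_grad()
loss.backward()
opt.step()
```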
32 GB VRAM on the highest end GPU seems almost small after running LLMs with 128 GB RAM on the M3 Max, but the speed will most likely more than make up for it. I do wonder when we’ll see bigger jumps in VRAM though, now that the need for running multiple AI models at once seems like a realistic use case (their tech explainers also mentions they already do this for games).
Conversely, this means you can pay less if you need less.
Seems like a win all around.
For more of the fast VRAM you would be in Quadro territory.
And yet it just so happens they work effectively the same. I've done research on an RTX 2070 with just 8 GB VRAM. That card consistently met or came close to the performance of a V100, albeit with less VRAM.
Why suggest people shouldn't use consumer cards? It's dramatically (like 10x-50x) cheaper. Is machine learning only for those who can afford a $10k-$50k workstation GPU? That's lame and frankly comes across as gatekeeping.
Honestly, I can't really imagine how a person could reasonably hold this stance. Just let folks buy hardware and use it however they want. Sure, it may be less than optimal, but it's important to remember that not everyone in the world can afford an H100.
Perhaps you can explain some other, better reason why people shouldn't use consumer cards for ML? It's frankly kind of a rude suggestion in the absence of a better explanation.
These are monumentally different. You can't really use your computer as an LLM machine; it's more of a novelty.
I'm not even sure why people mention these things. It's possible, but no one actually does it outside of testing purposes.
It falsely equates Nvidia GPUs with Apple CPUs. The winner is Apple.
However, I'm an AAA gamedev CTO, and they might have been telling me what the card means to me.
It won't stop crypto and LLM peeps from buying everything (one assumes TDP is proportional too). Gamers not being able to find an affordable option is still a problem.
I used to think about this often because I had a side hobby of building and selling computers for friends and coworkers who wanted to get into gaming but otherwise had no use for a powerful computer.
For the longest time I could still put together $800-$1000 PCs that could blow consoles away and provide great value for the money.
Nowadays I almost want to recommend they go back to console gaming. Seeing older PS5s on store shelves hit $349.99 during the holidays really cemented that idea. A PC build is so astronomically expensive at the moment unless you can be convinced to buy a gaming laptop on a deep sale.
Only way to fix this is for AMD to decide it likes money. I'm not holding my breath.
It probably serves to make the 4070 look reasonably priced, even though it isn't.
The 4090 was already priced for high-income people (in first-world countries). Nvidia saw 4090s being sold on the second-hand market way beyond $2k. They're merely milking the cow.
We'll have to see how much they charge for these cards this time, but I feel like the price bump has been massively exaggerated by people on HN.
https://www.okdo.com/wp-content/uploads/2023/03/jetson-agx-o...
The 3090 Ti had about 5 times the memory bandwidth and 5 times the compute capability. If that ratio holds for Blackwell, the 5090 will run circles around it when it has enough VRAM (or you have enough 5090 cards to fit everything into VRAM).
This will make it possible to run models up to 405B parameters, like Llama 3.1 405B at a 4-bit quant or Grok-1 314B at a 6-bit quant.
Who knows, maybe better models will be released in the future that are better optimized and won't need that much RAM, but it is easier to buy a second 'Digits' than to build a rack with 8x GPUs. For example, looking at the latest Llama models, Meta states: "Llama 3.3 70B approaches the performance of Llama 3.1 405B".
To run inference on Llama 3.3 70B Instruct with ~8k context length (without offloading), you'd need:
- Q4 (~44GB): 2x 5090; 1x 'Digits'
- Q6 (~58GB): 2x 5090; 1x 'Digits'
- Q8 (~74GB): 3x 5090; 1x 'Digits'
- FP16 (~144GB): 5x 5090; 2x 'Digits'
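Those card counts fall out of simple weight-size arithmetic; a rough sketch (the bits-per-weight and overhead figures below are assumptions, and real runtimes add their own overhead):

```python
# Rough VRAM estimate: weights (params x bits / 8) plus a few GB for
# KV cache and runtime overhead, then ceil-divide by 32 GB per 5090.
import math

def vram_gb(params_b: float, bits_per_weight: float, overhead_gb: float = 4.0) -> float:
    return params_b * bits_per_weight / 8 + overhead_gb

for name, bits in [("Q4", 4.5), ("Q6", 6.2), ("Q8", 8.0), ("FP16", 16.0)]:
    need = vram_gb(70, bits)
    print(f"{name}: ~{need:.0f} GB -> {math.ceil(need / 32)}x 5090")
```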
Let's wait and see which bandwidth it will have.
5070, 5070 Ti, 5080, 5090 to
5000, 5000 Plus, 5000 Pro, 5000 Pro Max.
:O
Presumably the pro hardware based on the same silicon will have 64GB, they usually double whatever the gaming cards have.
While they've come a long way, I'd imagine they're still highly specialized compared to general-purpose hardware, and maybe still graphics-oriented in many ways. One could test this by comparing them to SGI-style NUMA machines, Tilera's tile-based systems, or Adapteva's 1024-core design. Maybe Ambric, given it aimed for generality, though the Am2045's were DSP-style. They might still be GPUs if, side by side with such architectures, they still looked more like GPUs.
Press the power button, boot the GPU?
Surely a terrible idea, and I know system-on-a-chip makes this more confusing/complicated (like Apple Silicon, etc.)
Any modern card under $1000 is more than enough for graphics in virtually all games. The gaming crisis is not in a graphics card market at all.
Now, I do agree that $1000 is plenty for 95% of gamers, but for those who want the best, Nvidia is pretty clearly holding out intentionally. The gap between a 4080 Ti and a 4090 is GIANT. Check this great comparison from Tom's Hardware: https://cdn.mos.cms.futurecdn.net/BAGV2GBMHHE4gkb7ZzTxwK-120...
The biggest leap to the next offering up on the chart is the 4090.
I barely play video games but I definitely do
I disagree. I run a 4070 Super and a Ryzen 7700 with DDR5, and I still can't run Assetto Corsa Competizione in VR at 90 fps. MSFS 2024 runs at thirty-something fps at medium settings. VR gaming is a different beast.
Me. I do. I *love* raytracing; and, as has been said and seen with several of the newest AAA games, raytracing is no longer optional. It's required now. Those 1080s, wonderful as they have been (and they have been truly great cards), are definitely in need of an upgrade.
I went from 80 FPS (highest settings) to 365 FPS (capped to my Alienware 360 Hz monitor) when I upgraded from my old rig (i7-8700K and GTX 1070) to a new one (7800X3D and RTX 3090).
You will love the RTX 5080 then. It is priced at $999.
https://www.nvidia.com/en-us/geforce/graphics-cards/50-serie...
When was the last time Nvidia made a high end GeForce card use only 2 slots?
(Looks like Nvidia even advertises an "SFF-Ready" label for cards that are small enough: https://www.nvidia.com/en-us/geforce/news/small-form-factor-...)
Translation: No significant actual upgrade.
Sounds like we're continuing the trend of newer generations being beaten on fps/$ by the previous generations while hardly pushing the envelope at the top end.
A 3090 is $1000 right now.
Jensen thinks that "Moore's Law is Dead" and it's just time to rest and vest with regards to GPUs. This is the same attitude that Intel adopted 2013-2024.
I've heard this twice today so curious why it's being mentioned so often.
Not really worth it if you can get a 5090 for $1,999
2x faster in DLSS. If we look at the 1:1 resolution performance, the increase is likely 1.2x.
Seemingly NVIDIA is just playing number games: wow, 3352 is a huge leap compared to 1321, right? But how does it really help us with LLMs, diffusion models, and so on?
> DLPerf (Deep Learning Performance) - is our own scoring function. It is an approximate estimate of performance for typical deep learning tasks. Currently, DLPerf predicts performance well in terms of iters/second for a few common tasks such as training ResNet50 CNNs. For example, on these tasks, a V100 instance with a DLPerf score of 21 is roughly ~2x faster than a 1080Ti with a DLPerf of 10. [...] Although far from perfect, DLPerf is more useful for predicting performance than TFLops for most tasks.
The real jump is 26%, at 28% higher power draw and 25% higher price.
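The power and price figures check out from the published specs (assuming the 4090 at 450 W / $1,599 and the 5090 at 575 W / $1,999):

```python
# Gen-over-gen ratio check, 4090 -> 5090 (spec values assumed above).
power_delta = 575 / 450 - 1    # ~0.28 -> 28% more power
price_delta = 1999 / 1599 - 1  # ~0.25 -> 25% more money
print(f"power: +{power_delta:.0%}, price: +{price_delta:.0%}")
```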
A dud indeed.
Does anyone know what these might cost in the US after the rumored tariffs?
If DLSS4 and “MOAR POWAH” are the only things on offer versus my 3090, it’s a hard pass. I need efficiency, not a bigger TDP.
Going from 60 to 120 fps is cool. Going from 120 fps to 240 fps is in the realm of diminishing returns, especially because the added latency makes it a non-starter for fast-paced multiplayer games.
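The frame-time arithmetic makes the diminishing returns concrete:

```python
# Each doubling of frame rate saves half as many milliseconds as the last.
for lo, hi in [(60, 120), (120, 240)]:
    saved_ms = 1000 / lo - 1000 / hi
    print(f"{lo} -> {hi} fps saves {saved_ms:.1f} ms per frame")
# 60 -> 120 saves 8.3 ms; 120 -> 240 saves only 4.2 ms
```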
12GB VRAM for over $500 is an absolute travesty. Even today, cards with 12GB struggle in some games. 16GB is fine right now, but I'm pretty certain it will be an issue in a few years, and it's kind of insane at $1000. The amount of VRAM should really be double what it is across the board.
Likely 10-30%, going off both the CUDA core specs (nearly unchanged gen-over-gen for everything but the 5090) and the two benchmarks Nvidia published that didn't use DLSS 4 multi-frame generation: Far Cry 6 and A Plague Tale.
https://www.nvidia.com/en-us/geforce/graphics-cards/50-serie...
I'm expecting a minor bump that will look less impressive if you compare it to watts; these things are hungry.
It's hard to get excited when most of the gains will be limited to a few new showcase AAA releases, and maybe an update to a couple of your favourites if you're lucky.
You can try out pretty much all GPUs on a cloud provider these days. Do it.
VRAM is important for maxing out your batch size. It might make your training go faster, but other hardware matters too.
How much extra VRAM speeds things up also depends on your training code. If your next batch isn't ready by the time the current one has finished training, fix that first.
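A quick way to check for that bottleneck is to time how long the loop blocks waiting on the data loader versus how long the actual training step takes (a minimal PyTorch sketch; the dataset and model are made up just to show the measurement):

```python
# Time spent waiting on the DataLoader vs. time spent in the training step.
import time
import torch
from torch.utils.data import DataLoader, TensorDataset

data = TensorDataset(torch.randn(10_000, 128), torch.randint(0, 10, (10_000,)))
loader = DataLoader(data, batch_size=256, num_workers=4, pin_memory=True)

model = torch.nn.Linear(128, 10).cuda()
opt = torch.optim.SGD(model.parameters(), lr=0.01)

wait_s = step_s = 0.0
t0 = time.perf_counter()
for x, y in loader:
    t1 = time.perf_counter()
    wait_s += t1 - t0                 # blocked waiting on data loading
    loss = torch.nn.functional.cross_entropy(model(x.cuda()), y.cuda())
    opt.zero_grad()
    loss.backward()
    opt.step()
    torch.cuda.synchronize()          # so step time reflects real GPU work
    t0 = time.perf_counter()
    step_s += t0 - t1

print(f"waiting on data: {wait_s:.2f}s, training: {step_s:.2f}s")
```

If the waiting number dominates, more VRAM won't help; more DataLoader workers or faster storage will.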
Coil whine is noticeable on my machine. I can hear when the model is training/next batch is loading.
Don't bother with the Founders Edition.
> Don't bother with the Founders Edition.
Why?
That's not even close, the M4 Max 12C has less than a third of the 5090s memory throughput and the 10C version has less than a quarter. The M4 Ultra should trade blows with the 4090 but it'll still fall well short of the 5090.
By the way, this is even better as far as memory size is concerned:
https://www.asrockrack.com/minisite/AmpereAltraFamily/
However, memory bandwidth is what matters for token generation. The memory bandwidth of this is only 204.8GB/sec if I understand correctly. Apple's top level hardware reportedly does 800GB/sec.
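As a rule of thumb, memory-bound token generation tops out near bandwidth divided by the bytes read per token (roughly the model's size in memory), so the gap is easy to estimate (illustrative numbers, ignoring batching and compute overlap):

```python
# Upper-bound tokens/sec ~= memory bandwidth / model size in memory.
def max_tokens_per_sec(bandwidth_gb_s: float, model_gb: float) -> float:
    return bandwidth_gb_s / model_gb

for name, bw in [("Ampere Altra (~205 GB/s)", 204.8),
                 ("Apple top-end (~800 GB/s)", 800.0)]:
    print(f"{name}: ~{max_tokens_per_sec(bw, 40):.0f} tok/s on a ~40 GB model")
```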
If that holds up in the benchmarks, this is a nice jump for a generation. I agree with others that more memory would've been nice, but it's clear Nvidia are trying to segment their SKUs into AI and non-AI models and using RAM to do it.
That might not be such a bad outcome if it means gamers can actually buy GPUs without them being instantly bought by robots like the peak crypto mining era.
I would expect something like a 5080 Super with 20/24GB of VRAM. 16GB just seems wrong for their "target" consumer GPU.
This time around, I will save for the 5090 or just wait for the Ti/Super refreshes.
* Neural texture stuff - also super exciting, big advancement in rendering, I see this being used a lot (and helps to make up for the meh vram blackwell has)
* Neural material stuff - might be neat, Unreal strata materials will like this, but going to be a while until it gets a good amount of adoption
* Neural shader stuff in general - who knows, we'll see how it pans out
* DLSS upscaling/denoising improvements (all GPUs) - Great! More stable upscaling and denoising is very much welcome
* DLSS framegen and reflex improvements - bleh, ok I guess, reflex especially is going to be very niche
* Hardware itself - lower end a lot cheaper than I expected! Memory bandwidth and VRAM is meh, but the perf itself seems good, newer cores, better SER, good stuff for the most part!
Note that the material/texture/BVH/denoising stuff is all from research papers Nvidia and others have put out over the last few years, finally getting productionized. Neural textures and Nanite-like RT are things I've been hyped about for the past ~2 years.
I'm very tempted to upgrade my 3080 (that I bought used for $600 ~2 years ago) to a 5070 ti.
I'm hoping generative AI models can be used to generate more immersive NPCs.
- Memory: RTX 5090: 32 GB GDDR7, ~1.8 TB/s bandwidth. H100 (SXM5): 80 GB HBM3, ~3+ TB/s bandwidth.
- Compute: RTX 5090: ~318 TFLOPS in ray tracing, ~3,352 AI TOPS. H100: optimized for matrix and tensor computations, with ~1,000 TFLOPS for AI workloads (using Tensor Cores).
- Power: RTX 5090: 575W, higher for enthusiast-class performance. H100 (PCIe): 350W, efficient for data centers.
- Price: RTX 5090: expected MSRP ~$2,000 (consumer pricing). H100: starts at ~$15,000–$30,000+ per unit.
They're not really supposed to be, either, judging by how they priced this. For non-AI uses the 5080 is infinitely better positioned.
Source: https://www.nvidia.com/en-us/data-center/technologies/blackw...
It seems obvious to me that even NVIDIA knows that 5090s and 4090s are used more for AI Workloads than gaming. In my company, every PC has 2 4090s, and 48GB is not enough. 64GB is much better, though I would have preferred if NVIDIA went all in and gave us a 48GB GPU, so that we could have 96GB workstations at this price point without having to spend 6k on an A6000.
Overall I think 5090 is a good addition to the quick experimentation for deep learning market, where all serious training and inference will occur on cloud GPU clusters, but we can still do some experimentation on local compute with the 5090.
I always end up late to the party, and the prices end up being massively inflated; even now I can't seem to buy a 4090 for anywhere close to the RRP.
The cat loves laying/basking on it when it's putting out 1400W with the GPUs in 400W mode (plus 200W for the CPU), so I leave it turned up most of the time!
It's easy to get carried away with VRAM size, but keep in mind that most people with Apple Silicon (who can enjoy several times more memory) are stuck at inference, while training performance is off the charts on CUDA hardware.
The jury is still out on actual AI training performance, but I bet a 4090, if sold at $1k or below, would be better value than the lower-tier 50 series. The "AI TOPS" of the 50 series is only impressive for the top model, while the rest are either similar or have lower memory bandwidth despite the newer architecture.
I think by now training is best left to the cloud, and at this rate I'd rather own a 5070 Ti overall.
Gaming performance has plateaued for some time now; maybe an 8K monitor wave can revive things.
I miss when high-end GPUs were $300-400, and you could get something reasonable for $100-200. I guess that's just integrated graphics these days.
The most I've ever spent on a GPU is ~$300, and I don't really see that changing anytime soon, so it'll be a long time before I'll even consider one of these cards.
That time was 25 years ago, though; I think the GeForce DDR was the last high-end card to fit this price bracket. While cards have gotten a lot more expensive, those $300 high-end cards would be around $600 in today's money. And $200-400 for low-end still exists.
It doesn't matter if that's through software or hardware improvements.
This is the same thing they did with the RTX 4000 series. More fake frames, less GPU horsepower, "Moore's Law is Dead", Jensen wrings his hands: "Nothing I can do! Moore's Law is Dead!" It's the same way Intel has been slacking since 2013.
They may resurrect it at some stage, but at this stage yes.
I'm planning to upgrade (probably to a mid-range card) as my 5-year-old computer is starting to show its age, and with the new GPUs releasing this might be a good time.