
Next-Gen NVIDIA GeForce RTX 4090 With Top AD102 GPU Could Be The First Gaming Graphics Card To Break Past 100 TFLOPs

tusharngf

Member
Recent rumors regarding the next-generation NVIDIA GeForce RTX 4090 series suggest that the AD102-powered graphics card might be the first gaming product to break past the 100 TFLOPs barrier.

NVIDIA GeForce RTX 4090 Class Graphics Cards Might Become The First Gaming 'AD102' GPU To Break Past the 100 TFLOPs Barrier

Currently, the NVIDIA GeForce RTX 3090 Ti offers the highest compute performance among all gaming graphics cards, hitting anywhere between 40 and 45 TFLOPs of FP32 (single-precision) GPU compute. But with the next-generation GPUs arriving later this year, things are about to get a big boost.



As per rumors from Kopite7kimi and Greymon55, the next-generation graphics cards, not only from NVIDIA but AMD too, are expected to reach the 100 TFLOPs mark. This would be a huge milestone for the consumer graphics market, which has already seen a major jump in both performance and power with the current generation of cards. We went straight from 275W being the limit to 350-400W becoming the norm, and the likes of the RTX 3090 Ti are already drawing over 500W of power. The next generation is going to be even more power-hungry, but if the compute numbers are anything to go by, we already know one reason why.


As per the report, NVIDIA's Ada Lovelace GPUs, especially the AD102 chip, have seen major gains on TSMC's 4N process node. Compared to the previous 2.2-2.4 GHz clock speed rumors, the current estimates are that AMD and NVIDIA will have similar boost speeds of around 2.8-3.0 GHz. For NVIDIA specifically, the company is going to fuse a total of 18,432 cores coupled with 96 MB of L2 cache and a 384-bit bus interface. These will be arranged in a 12 GPC die layout with 6 TPCs per GPC and 2 SMs per TPC, for a total of 144 SMs.


Based on a theoretical clock speed of 2.8 GHz, you get up to 103 TFLOPs of compute performance, and the rumors are suggesting even higher boost clocks. Now, these definitely sound like peak clocks, similar to AMD's peak frequencies, which are higher than the average 'Game' clock. 100+ TFLOPs of compute performance means more than double the horsepower of the 3090 Ti flagship. One should keep in mind that compute performance doesn't necessarily indicate overall gaming performance, but despite that, it will be a huge upgrade for gaming PCs and an 8.5x increase over the current fastest console, the Xbox Series X.
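For reference, the headline number falls straight out of the usual peak-FP32 formula (cores × 2 FLOPs per FMA × clock). A quick sketch, with the rumored AD102 figures plugged in:

```python
# Peak FP32 throughput: each CUDA core can retire one FMA (2 FLOPs) per clock.
def fp32_tflops(cores: int, clock_ghz: float) -> float:
    # cores * 2 FLOPs * clock (GHz) gives GFLOPs; divide by 1000 for TFLOPs
    return cores * 2 * clock_ghz / 1000.0

print(fp32_tflops(18432, 2.8))   # ~103.2 TFLOPs for the rumored full AD102
print(fp32_tflops(18432, 2.85))  # ~105 TFLOPs at the ~2.85 GHz peak clock
```

Note this is a theoretical ceiling, not a measured gaming number.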




Upcoming Flagship AMD, Intel, NVIDIA GPU Specs (Preliminary)

| GPU Name | AD102 | Navi 31 | Xe2-HPG |
| --- | --- | --- | --- |
| Codename | Ada Lovelace | RDNA 3 | Battlemage |
| Flagship SKU | GeForce RTX 4090 Series | Radeon RX 7900 Series | Arc B900 Series |
| GPU Process | TSMC 4N | TSMC 5nm + TSMC 6nm | TSMC 5nm? |
| GPU Package | Monolithic | MCD (Multi-Chiplet Die) | MCM (Multi-Chiplet Module) |
| GPU Dies | Mono x 1 | 2 x GCD + 4 x MCD + 1 x IOD | Quad-Tile (tGPU) |
| GPU Mega Clusters | 12 GPCs (Graphics Processing Clusters) | 6 Shader Engines | 10 Render Slices |
| GPU Super Clusters | 72 TPCs (Texture Processing Clusters) | 30 WGPs (Per MCD), 60 WGPs (In Total) | 40 Xe-Cores (Per Tile), 160 Xe-Cores (Total) |
| GPU Clusters | 144 Streaming Multiprocessors (SM) | 120 Compute Units (CU), 240 Compute Units (In Total) | 1280 Xe VE (Per Tile), 5120 Xe VE (In Total) |
| Cores (Per Die) | 18432 CUDA Cores | 7680 SPs (Per GCD), 15360 SPs (In Total) | 20480 ALUs (In Total) |
| Peak Clock | ~2.85 GHz | ~3.0 GHz | TBD |
| FP32 Compute | ~105 TFLOPs | ~92 TFLOPs | TBD |
| Memory Type | GDDR6X | GDDR6 | GDDR6? |
| Memory Capacity | 24 GB | 32 GB | TBD |
| Memory Bus | 384-bit | 256-bit | TBD |
| Memory Speeds | ~21 Gbps | ~18 Gbps | TBD |
| Cache Subsystems | 96 MB L2 Cache | 512 MB (Infinity Cache) | TBD |
| TBP | ~600W | ~500W | TBD |
| Launch | Q4 2022 | Q4 2022 | 2023 |


Source: https://wccftech.com/next-gen-nvidi...aming-graphics-card-to-break-past-100-tflops/
 

Black_Stride

do not tempt fate do not contrain Wonder Woman's thighs do not do not
Weren't the rumors updated, with the RX 7000 now expected to peak at around 75 TFLOPs?

If AMD really has dropped the ball and is weaker in pure raster, while already being weaker in RT, it might have a tough fight coming up.

AMD might also have a Navi 30(?) chip, with the full complement of compute units active, that actually competes with the 4090.




But man, 100 TFLOPs... Nvidia really aren't fucking around, are they.
With mining on the decline and LHR 3.0 already proving to be a pretty big problem for miners, we might actually be able to purchase these cards near MSRP (wishful thinking).

Here's to hoping I can snag an xx80 at MSRP within the first 6 months, then forget upgrading my GPU till the next generation of consoles, cuz I only play at 3440x1440p.
 

DukeNukem00

Banned
Weren't the rumors updated, with the RX 7000 now expected to peak at around 75 TFLOPs?

If AMD really has dropped the ball and is weaker in pure raster, while already being weaker in RT, it might have a tough fight coming up.

AMD might also have a Navi 30(?) chip, with the full complement of compute units active, that actually competes with the 4090.


The rumours about a year ago were something like that, but Nvidia seems to have altered things due to AMD's moves. Unlike the issues Intel had, Nvidia in modern times is at the top of their game. They're never gonna allow AMD to claim the fastest GPU on the market.

At the moment I'm on 1440p since it's easier to drive 165 Hz at that. But man, I can't wait till the power is sufficient that we can go past 120 Hz at 4K. Until then I'm staying put, since 60 frames is unplayable shit once you adjust to high refresh rates.
 

winjer

Member
At this point, we don't know how SMs are configured in Ada Lovelace. But if it's something like Ampere, then those TFLOPs are not that impressive.
Remember that Ampere replaced the Int units with more FP32 units. This increased TFLOP count, but performance per TFLOP decreased.
Example: The 2070 and 3060 are almost identical in rasterization performance.
But the 2070 is an 8.9 TFLOP card, while the 3060 is a 14 TFLOP card.
On the AMD side, for example, a 6600 is only 4% slower than a 2070, with 8.9 TFLOPs.

The thing to keep in mind is that TFLOPs are no longer a good representation of real world performance.
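To put numbers on that, here's a small sketch. The performance index values are illustrative assumptions normalized from the comparisons above (2070 = 100, 3060 roughly equal, 6600 about 4% slower), not benchmark data:

```python
# Perf-per-TFLOP comparison using the rough relative raster standings cited above.
# (tflops, perf_index) pairs; perf_index is an assumed illustrative scale.
cards = {
    "RTX 2070 (Turing)": (8.9, 100.0),
    "RTX 3060 (Ampere)": (14.0, 100.0),
    "RX 6600 (RDNA 2)":  (8.9, 96.0),
}

for name, (tflops, perf_index) in cards.items():
    # Ampere ends up with far fewer index points per TFLOP than Turing or RDNA 2
    print(f"{name}: {perf_index / tflops:.2f} index points per TFLOP")
```

The point of the exercise: the Ampere card needs ~1.6x the TFLOPs to deliver the same frame rates, which is exactly why raw TFLOPs stopped being comparable across architectures.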
 

CuNi

Member
The thing to keep in mind is that TFLOPs are no longer a good representation of real world performance.

I have been saying it for ages, but TFLOPS never really were a representation of real-world GAMING performance. It's a metric that is, and always was, intended more for the scientific uses of those cards.
TFLOPS never tell you anything about a card's performance in games, since gaming utilizes the card and its parts in a fluctuating way. No card has all of its parts 100% saturated all of the time while gaming, so TFLOPS mean nothing in gaming.
 

LordOfChaos

Member
While I know Nvidia's TFLOPs-to-performance ratio got more inflated than before with Ampere, dang, that seems like a crazy fast turnaround to near 10x the current consoles, doesn't it?

It's possible this architecture again inflates the TFLOPs-to-actual-performance ratio, though.
 

DaGwaphics

Gold Member
At least there is a nice bump since the last series. A good sign that both AMD and Nvidia might need to compete for gamers a bit more this time around. Still the issue of possibly needing a second mortgage for this thing, but it does replace your furnace, maybe it balances out. :messenger_grinning_squinting:
 

winjer

Member
I have been saying it for ages, but TFLOPS never really were a representation of real-world GAMING performance. It's a metric that is, and always was, intended more for the scientific uses of those cards.
TFLOPS never tell you anything about a card's performance in games, since gaming utilizes the card and its parts in a fluctuating way. No card has all of its parts 100% saturated all of the time while gaming, so TFLOPS mean nothing in gaming.

TFLOPs have their use, but they are only one metric describing the computational capabilities of a GPU. Unfortunately, both gaming journalists and gamers with limited knowledge about tech tend to gravitate to one number and try to use it to represent performance.
But this ignores everything else in hardware and software that contributes to performance in games.
 
At this point, we don't know how SMs are configured in Ada Lovelace. But if it's something like Ampere, then those TFLOPs are not that impressive.
Remember that Ampere replaced the Int units with more FP32 units. This increased TFLOP count, but performance per TFLOP decreased.
Example: The 2070 and 3060 are almost identical in rasterization performance.
But the 2070 is an 8.9 TFLOP card, while the 3060 is a 14 TFLOP card.
On the AMD side, for example, a 6600 is only 4% slower than a 2070, with 8.9 TFLOPs.

The thing to keep in mind is that TFLOPs are no longer a good representation of real world performance.
Were TFLOPs ever a good representation of real-world performance?

The recent focus on TFLOPs seems to have come from console marketing, and even there it backfired.
 
Nvidia has pushed the GPU more than anyone and should be applauded for bringing things to the market ahead of everyone else.
They were first with Mesh Shaders and VRS.
They were first with Ray Tracing.
They were first with ML upscaling
AMD and now Intel are playing catch up.

I actually think the GPU advancements have gone too fast for developers to actually adopt.
We have no games with Mesh Shaders on the market.
Ray Tracing is only in its beginning stages.
Sampler Feedback Streaming isn't being used on the Xbox Series front.
VRS has been touched on slightly, and with the advancements in VRS 2.0 it promises a lot of upside.
DLSS and other upscaling methods are also starting to mature and are not yet fully utilised. We have the lower INT4 and INT8 capabilities on the Xbox Series consoles which, heading into year two of the console generation, have yet to be touched.
Game engines haven't caught up yet to what's already available.
 
TF can give a rough estimate of performance on AMD cards, but NVIDIA has made TF numbers pointless since the RTX 3000 cards.
 

IFireflyl

Member
I'm torn between loving the jump in performance and hating the energy efficiency (or should I say energy inefficiency?) of the new GPUs. Intel barely ekes out a win against AMD in real-world CPU performance, but AMD is far more energy efficient. I hope these companies start putting more R&D time into energy efficiency. I don't want my house heating up the neighborhood when I play a game.
 

Kenpachii

Member
At this point, we don't know how SMs are configured in Ada Lovelace. But if it's something like Ampere, then those TFLOPs are not that impressive.
Remember that Ampere replaced the Int units with more FP32 units. This increased TFLOP count, but performance per TFLOP decreased.
Example: The 2070 and 3060 are almost identical in rasterization performance.
But the 2070 is an 8.9 TFLOP card, while the 3060 is a 14 TFLOP card.
On the AMD side, for example, a 6600 is only 4% slower than a 2070, with 8.9 TFLOPs.

The thing to keep in mind is that TFLOPs are no longer a good representation of real world performance.

TFLOPs are fine if you compare them within the same generation from the same card maker, like Nvidia.

Comparing TFLOPs from AMD vs. TFLOPs from Nvidia was always pointless.
 
AMD may have fewer TFLOPS, but *may* win on price point, TDP, efficiency, and decent RT, and may finally be implementing DLSS-like upscaling, as the architecture reportedly has ML cores (not confirmed).
 
A German tech site recently tested how well three popular PC cases handled 450+ Watt GPUs in terms of thermals, and only one of them passed. The other two had serious issues getting rid of all the heat, causing temperature spikes in other components and increased case/CPU fan noise.

Some of these new cards will supposedly use up to 600 Watts. Unless Nvidia and AMD make water cooling mandatory, people are going to start running into all sorts of issues.
 

Thaedolus

Gold Member
600W is insane really; that means the aftermarket GPUs will probably be 1000W. Have fun making your room a sauna.
I run a space heater in my downstairs for the majority of the year, any inefficiencies will be comfort heat when it’s not late June to early September.
 

Ellery

Member
I always have a pretty decent rig and I am pretty likely to buy a 40 series card (or AMD equivalent depending on who offers the more compelling products).
The one thing that makes me sad is the lack of graphically insane, non-modded PC games (they don't necessarily have to be exclusives). I am not saying that multiplats don't look great on PC, but it would be really nice to have something special on PC, since the hardware is crazy expensive and capable of so much more.

However, even if we take the rumors in the most conservative way possible, the next generation of GPUs is expected to be a significant jump. Mostly because Nvidia was on Samsung's shitty 8nm, and that process node jump alone will bring the 40 series significant performance gains, even without factoring in anything else (architectural changes, new features, etc.).
This makes the transition for PCs to 4K pretty easy then (on top of DLSS and FSR).

Yup TFLOPs are somewhat useless. Good for marketing I guess, but I have never looked at TFLOPs as a deciding factor for any hardware (gpu, console etc.) purchase.

Hope the prices don't stupidly skyrocket across the board. I don't care for the ultra-enthusiast high-end segment like the 3090 and 3090 Ti, since those things are basically 200% of the price for 8% more performance.
 

Black_Stride

do not tempt fate do not contrain Wonder Woman's thighs do not do not
I doubt it, probably 50-60 TFLOPs range
That would be the worst gen-on-gen upgrade in history.......since the GTX 1080 Ti to 2080 Ti.
But at least the 20 series brought us a bunch of RT and Tensor goodness.
The 40 series doesn't seem to be bringing anything new 'cept a larger L2 cache and astronomical prices.

2x performance gain is generally what we expect gen on gen.

So with the 3080 being approx 30 TFLOPs, the 4080 should be ballpark 60 TFLOPs.
The 3090 being ~40 TFLOPs, we should expect the 4090 to be ballpark 80 TFLOPs; if they've gotten some architecture gains, 100 TFLOPs doesn't seem outta the realm of possibility.

It does mean the 4070 will likely, at least from a TFLOP perspective, beat the 3090 Ti....which is wild when you consider it just came out.
 

SlimySnake

Member
Were TFLOPs ever a good representation of real-world performance?

The recent focus on TFLOPs seems to have come from console marketing, and even there it backfired.
Yes, it used to be. That changed with Ampere and, to a lesser extent, the Series X.
 

Draugoth

Gold Member
That would be the worst gen-on-gen upgrade in history.......since the GTX 1080 Ti to 2080 Ti.
But at least the 20 series brought us a bunch of RT and Tensor goodness.
The 40 series doesn't seem to be bringing anything new 'cept a larger L2 cache and astronomical prices.

2x performance gain is generally what we expect gen on gen.

So with the 3080 being approx 30 TFLOPs, the 4080 should be ballpark 60 TFLOPs.
The 3090 being ~40 TFLOPs, we should expect the 4090 to be ballpark 80 TFLOPs; if they've gotten some architecture gains, 100 TFLOPs doesn't seem outta the realm of possibility.

It does mean the 4070 will likely, at least from a TFLOP perspective, beat the 3090 Ti....which is wild when you consider it just came out.

The RTX 3090 Ti can currently output 40 TFLOPs, so if we go with the generational leaps, it would be 80 TF. But I think we will still get 2x the performance in games, at around 60 TFLOPs of processing power.
 

SF Kosmo

...please disperse...
Weren't the rumors updated, with the RX 7000 now expected to peak at around 75 TFLOPs?

If AMD really has dropped the ball and is weaker in pure raster, while already being weaker in RT, it might have a tough fight coming up.
Let's face it, though, there's a lot of ground to be gained on affordability right now; this isn't about who can get the best $3000 card out.
 

SlimySnake

Member
After the 35 TFLOPs 3090 performing more like a 20 TFLOPs 2080 Ti, I have no faith in Nvidia's TFLOPs claims.

The 3080 is a 30 TFLOPs card. The 2080 was 11.5 TFLOPs at average clocks. It should be roughly 3x, or around 200%, more powerful. But instead it actually only offered 65% more performance. These fake TFLOPs numbers from Nvidia are embarrassing. Maybe they aren't lying this time around, but I am going to have to wait to see the receipts. I suspect we will see yet another 65-70% jump despite the 3x increase in TFLOPs.



I am more interested in AMD. Their TFLOPs scaled 1:1 with performance from RDNA 1.0 to RDNA 2.0, but they too have a habit of releasing high-TFLOPs cards that don't translate to performance; see the Vega series. But if they can fix their RT implementation, then their 90 TFLOPs card just might be better than Nvidia's 100 TFLOPs card, with a far lower TDP.

I find it funny that no one looks at the massive TDP gap between the RDNA 2.0 and Ampere cards. Some of these Ampere cards already consume 400 watts, while the biggest RDNA 2.0 card, the 6900 XT, tops out at 260 watts. I remember just last year a lot of Intel CPUs being trashed by reviewers just because they had a higher TDP and ran hotter, even though they more or less matched AMD's Zen 3 performance.
 

DukeNukem00

Banned
I find it funny that no one looks at the massive TDP gap between the RDNA 2.0 and Ampere cards. Some of these Ampere cards already consume 400 watts, while the biggest RDNA 2.0 card, the 6900 XT, tops out at 260 watts. I remember just last year a lot of Intel CPUs being trashed by reviewers just because they had a higher TDP and ran hotter, even though they more or less matched AMD's Zen 3 performance.


AMD's cards were never 270W; always 300 and above. And that is on a much denser and better node than Nvidia uses for Ampere, with more of the die used for traditional raster, compared to Nvidia, who is shoving in RT cores, tensor cores, etc.
 

FireFly

Member
After the 35 TFLOPs 3090 performing more like a 20 TFLOPs 2080 Ti, I have no faith in Nvidia's TFLOPs claims.

The 3080 is a 30 TFLOPs card. The 2080 was 11.5 TFLOPs at average clocks. It should be roughly 3x, or around 200%, more powerful. But instead it actually only offered 65% more performance. These fake TFLOPs numbers from Nvidia are embarrassing. Maybe they aren't lying this time around, but I am going to have to wait to see the receipts. I suspect we will see yet another 65-70% jump despite the 3x increase in TFLOPs.
It shouldn't be 3x more powerful, firstly because games don't rely just on compute but also on texture rate and fill rate, and these things barely changed. And also because in Ampere, when you use the extra FP32 unit per SM, you forgo the use of the INT unit. So in practice, floating-point performance gets somewhat traded off against integer performance.

None of these things makes the 3090's teraflops figures "fake", however, since in a purely compute-bound application you can hit them.
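That trade-off can be shown with a toy calculation. This is a simplified model of Ampere's dual SM datapaths (one FP32-only, one that issues either FP32 or INT32 each cycle), not an official spec, and the 36% integer mix is an assumed, commonly cited game-like ratio:

```python
# Toy model of an Ampere SM: datapath A is FP32-only,
# datapath B issues either FP32 or INT32 on any given cycle.
def effective_fp32_fraction(int_fraction: float) -> float:
    """Fraction of peak FP32 achieved when `int_fraction` of datapath B's
    issue slots go to INT32 instructions instead of FP32."""
    return (1.0 + (1.0 - int_fraction)) / 2.0

print(effective_fp32_fraction(0.0))   # 1.0: pure compute can hit the headline TFLOPs
print(effective_fp32_fraction(0.36))  # 0.82: a game-like INT mix loses ~18% of peak
```

So the headline TFLOPs are real for compute workloads, but a shader mix with meaningful integer work can never reach them, which is consistent with both points above.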
 

nkarafo

Member
I hope this gen I will be able to get a "60" series card for less than 400 euros.

It got ridiculous the last few years, and I'm still stuck with a 1060.

I also hope it's not going to be higher than 160-180W, because that's getting ridiculous too.
 

SlimySnake

Member

AMD's cards were never 270W; always 300 and above. And that is on a much denser and better node than Nvidia uses for Ampere, with more of the die used for traditional raster, compared to Nvidia, who is shoving in RT cores, tensor cores, etc.
That might be true for the OC cards, but the ones I've seen in numerous benchmarks stay well below 300 watts. My 12 GB 3080 is always around 400 watts and roughly on par with the 6900 XT in non-RT games.




 
It takes power to make power. We will supposedly hit some high, "inefficient" wattage draws on these cards, especially from Nvidia, as they are going with a monolithic design vs. chiplets like Intel and AMD. It'll be an interesting year in the GPU space, as they pull further away than ever from the previous generation of GPUs, as well as the consoles. May the best man win, AKA the customers, as we have more options than ever.
 
At this point, we don't know how SMs are configured in Ada Lovelace. But if it's something like Ampere, then those TFLOPs are not that impressive.
Remember that Ampere replaced the Int units with more FP32 units. This increased TFLOP count, but performance per TFLOP decreased.
Example: The 2070 and 3060 are almost identical in rasterization performance.
But the 2070 is an 8.9 TFLOP card, while the 3060 is a 14 TFLOP card.
On the AMD side, for example, a 6600 is only 4% slower than a 2070, with 8.9 TFLOPs.

The thing to keep in mind is that TFLOPs are no longer a good representation of real world performance.

Only on Nvidia's side, because of those inflated TFLOP numbers.
But when counted the same way, it's still valid.
The thing to notice is that during the GCN days, an AMD TFLOP was weaker than an Nvidia TFLOP because of the poor utilization rate; but since Navi 1, AMD is on par with Nvidia. RDNA 3 should bring actual IPC improvements this time, so if Nvidia is still counting their TFLOPs like Ampere, this new generation will again be very close, or even closer than now, because of AMD's improved RT and support of FSR 2.0 and similar without needing dedicated matrix units.

I am more interested in AMD. Their TFLOPs scaled 1:1 with performance from RDNA 1.0 to RDNA 2.0, but they too have a habit of releasing high-TFLOPs cards that don't translate to performance; see the Vega series. But if they can fix their RDNA implementation, then their 90 TFLOPs card just might be better than Nvidia's 100 TFLOPs card, with a far lower TDP.

You're contradicting yourself; AMD already did that.
Vega was still GCN, with poor utilization. RDNA was all about improving utilization, putting those flops to work.


I doubt it, probably 50-60 TFLOPs range

It's very possible, at least on AMD's side, because it'll be based on chiplets.
Not MCM with two whole chips, but chiplets, like with Zen CPUs. There'll be an IO die and two compute dies; that's why the performance may jump so much without increasing the TDP that much.
 

Allandor

Member
That would be the worst gen-on-gen upgrade in history.......since the GTX 1080 Ti to 2080 Ti.
But at least the 20 series brought us a bunch of RT and Tensor goodness.
The 40 series doesn't seem to be bringing anything new 'cept a larger L2 cache and astronomical prices.

2x performance gain is generally what we expect gen on gen.

So with the 3080 being approx 30 TFLOPs, the 4080 should be ballpark 60 TFLOPs.
The 3090 being ~40 TFLOPs, we should expect the 4090 to be ballpark 80 TFLOPs; if they've gotten some architecture gains, 100 TFLOPs doesn't seem outta the realm of possibility.

It does mean the 4070 will likely, at least from a TFLOP perspective, beat the 3090 Ti....which is wild when you consider it just came out.
Flops != Flops
Especially with the 3x00 gen, Nvidia increased the flop rate, but performance didn't increase in the same way. It was just an "easy" win to increase the flop rate, but at the expense of efficiency. A 10 TF 2x00 would still be faster than a 10 TF 3x00 (scaled the same way). E.g. the 3070 is more or less on par with the 2080 Ti in most situations (except memory-limited scenarios), but the 3070 is a 20 TF card, while the 2080 Ti is a ~13 TF card. They have different strengths and can't be that easily compared flop for flop. Funny enough, they made the opposite move of what AMD made from GCN to RDNA(2): AMD went all in for efficiency, while Nvidia went for high flop counts.

The same will be true for the next generation of GPUs. They might double or triple the flop count, but that doesn't have to mean anything.
Another thing is the tensor and RT cores. Those effectively help the GPU's main "core" with some things. On the other hand, now that the GPU has so many flops, tensor cores might not even be needed anymore, as the GPU might have enough calculation power to do everything itself. I can only guess here, but I don't think the tensor cores have a bright future. I think they were just a "bridge" technology, used while the GPU itself was not powerful enough to also do the things tensor cores are used for now. So far, tensor cores are only used for a minority of tasks. E.g. with DLSS 1.x we already saw the "just shader" DLSS, which wasn't bad at all. So the question is why Nvidia still uses those cores. Currently they make everything a bit more complicated while not being used in that many cases, whereas more general shader power could be used to accelerate almost all tasks.
 

SlimySnake

Member
Only on Nvidia's side, because of those inflated TFLOP numbers.
But when counted the same way, it's still valid.
The thing to notice is that during the GCN days, an AMD TFLOP was weaker than an Nvidia TFLOP because of the poor utilization rate; but since Navi 1, AMD is on par with Nvidia. RDNA 3 should bring actual IPC improvements this time, so if Nvidia is still counting their TFLOPs like Ampere, this new generation will again be very close, or even closer than now, because of AMD's improved RT and support of FSR 2.0 and similar without needing dedicated matrix units.



You're contradicting yourself; AMD already did that.
Vega was still GCN, with poor utilization. RDNA was all about improving utilization, putting those flops to work.
I meant to say if they can fix their RT implementation. Not RDNA.
 
It would be interesting to know how many of these cards are sold for (real, not crypto) applications. I have the feeling that, if not now then in the near future, they should represent the biggest chunk of their revenue. These consumer cards seem more like marketing. I worked for a while in HPC, and the introduction of these cards was a big step up in many applications.
 

Kenpachii

Member
What makes you think that? When did vendor GPUs ever draw 70% more power than a reference model?



3090 EVGA; as the normal wattage is 350W, which is almost half of this new card's 600W, it's not far-fetched that this thing will hit 1000W from aftermarket vendors.

I can tell you this: a 3090 with 500W consumption is no joke. Put two next to each other, and practically every YouTuber who had that setup going bitched about the insane heat that comes out of it. This card will be a toaster.

Honestly, my cutoff point for heat, especially when summer is a thing, is really 300W at best; I think it's still manageable then, with some downclocking to 200-250W at times. 500+W is laughable.
 

Black_Stride

do not tempt fate do not contrain Wonder Woman's thighs do not do not


3090 EVGA; as the normal wattage is 350W, which is almost half of this new card's 600W, it's not far-fetched that this thing will hit 1000W from aftermarket vendors.

I can tell you this: a 3090 with 500W consumption is no joke. Put two next to each other, and practically every YouTuber who had that setup going bitched about the insane heat that comes out of it. This card will be a toaster.

Honestly, my cutoff point for heat, especially when summer is a thing, is really 300W at best; I think it's still manageable then, with some downclocking to 200-250W at times. 500+W is laughable.
EVGA and MSI are known to up the power limits on their cards for overclockers.
But realistically, average power draw in gaming won't be in the 500W range.

 
3090 EVGA; as the normal wattage is 350W, which is almost half of this new card's 600W, it's not far-fetched that this thing will hit 1000W from aftermarket vendors.
That's after a pretty significant overclock using a custom BIOS though, right? IIRC even the 3090 FTW3 only consumed 350 Watts out of the box.

I think 1000-watt cards are far-fetched, and the reason is something you've already touched on in your post: you can't reliably cool a card drawing that much power using conventional means, and that includes most water-cooling solutions.
 