
Nvidia Ampere teraflops and how you cannot compare them to Turing

Krisprolls

Banned
That's because deep down they feel that I'm trying to attack their "precioussss"...
Although I will probably buy Ampere too.
Go figure...

Kudos for trying to make some sense here, I hope you took your flame shield with you.

I explained this number was bullshit too, but it was all circle jerk in recent days. They're really asking to be duped by believing every marketing number. Next time it will be AMD with a 100 Terafloops (tm) card and they'll gobble it up too.

Real world benchmarks of recent games from neutral outside sources are the only thing people should look at. The rest is marketing, including obviously all those Nvidia videos or numbers.

Which doesn't mean the card is bad (I may even upgrade), obviously. Just not the jump forward they say.
 
Last edited:

GHG

Member
Welp, at least you aren't claiming it's .5 Turing TF like you did in the other threads...but you're still wrong. Here's why.

Reddit Q&A



Read that bolded part carefully.



Good. Addition, yay! So let's see where you went wrong...



And there it is. You can't only account for raw numerical performance; node process efficiency gains also have to be taken into account. So even if the raw numbers average out to being the same across both architectures, Ampere still sees IPC gains simply by being on a newer process (not to mention having other hardware to offload certain workloads more efficiently than Turing can, such as RT, DLSS and AI through the Tensor cores. Equivalent performance in those areas on Turing would've required more raw GPU resources to cover the gap).



I don't see why you're doing the math this way. Nvidia says they see 36 ADDITIONAL INT32 OPs for each 100 FP32 OPs. Just previously you listed Turing as 64 INT32 + 64 FP32, and one of Ampere's configurations as the same. So wouldn't that division be meaningless at that point? In both cases you get 128 OPs per SM per cycle; the two Ampere numbers are clearly an either/or: the SMs can operate either in full FP32 mode or in a mixed FP32/INT32 mode on a given cycle.
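For what it's worth, here's a quick back-of-the-envelope sketch (my own simplification in Python, nothing official beyond the ~100:36 FP32:INT32 mix Nvidia quoted) of what that either/or mode works out to per SM per clock:

```python
# Toy per-SM, per-clock model. Assumptions (mine, not Nvidia's):
# - Turing SM: 64 dedicated FP32 lanes + 64 dedicated INT32 lanes, running concurrently.
# - Ampere SM: one 64-wide FP32 datapath + one 64-wide datapath that is FP32 OR INT32 per clock.
# - Instruction mix: ~36 INT32 ops per 100 FP32 ops, as quoted by Nvidia.

FP32_OPS, INT32_OPS = 100, 36

def turing_cycles(fp, i32):
    # Separate pipes: whichever pipe has more work sets the cycle count.
    return max(fp / 64, i32 / 64)

def ampere_cycles(fp, i32):
    # 128 lanes total: INT32 work occupies lanes of the second (flexible) datapath.
    return (fp + i32) / 128

t = turing_cycles(FP32_OPS, INT32_OPS)   # 1.5625 cycles
a = ampere_cycles(FP32_OPS, INT32_OPS)   # 1.0625 cycles

print(f"Turing: ~{FP32_OPS / t:.0f} FP32 ops per SM per clock")   # ~64
print(f"Ampere: ~{FP32_OPS / a:.0f} FP32 ops per SM per clock")   # ~94
print(f"Per-SM gain on this mix: ~{t / a:.2f}x (not 2x)")         # ~1.47x
```

On that mix Turing's FP32 pipe is the bottleneck, while Ampere's second datapath only loses a fraction of its cycles to INT32, so the per-SM gap is real but well short of double.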



None of your calculations make sense in context here. By the description at the top of your post, both Turing and Ampere are capable of the same number of INT32 OPs per clock cycle. Which means your numbers here should be reflective of both Ampere AND Turing, which ultimately means the performance delta between an Ampere TF and a Turing TF stays the same, i.e. 2 TF Ampere = 2 TF Turing (before factoring in node gains and improvements on earlier tech, API algorithms etc. present in Turing and continued in Ampere, which would actually increase Ampere's performance over Turing, not decrease it).



Wrong; you seem to have forgotten that even with Ampere's new pipeline architecture it is capable of the same INT32 OPs per clock cycle as Turing, and your numbers only applied the conditional to Ampere while ignoring doing the same for Turing (even though you claimed you were going to do so in the sentence before those calculations).



Do you have a source where they specifically phrased 3080 performance in this manner?



Same as above two.



Again, where is a source that quotes official reps from Nvidia claiming this exact figurative comparison metric? You can't claim they said this or that if not able to source it yourself.



Yes, "up to", as in, depending on what the game itself requires to be performed for calculations. Actually let's go back for a bit because I think you misread this following quote:



So reading this again, it really does look like you got wonky with your calculations because it's Turing that would be hindered by running INT32 instructions on a clock cycle, not Ampere, since FP32 instructions would have to wait their turn until INT32 instructions are completed.



Don't see where you're getting this from, especially considering I looked at your calculations and they seem dubious at best IMHO.



Interesting speculation, but in light of what you've posted before, I don't know if the foundation of this speculation is necessarily sound.




So this is basically a recap of your calculations that I already touched on above, no need to repeat myself.

Needless to say, I think the context and conclusions of your calculations are inaccurate, because I don't think you initialized conditions for those calculations correctly.



I'm not interested in discussing RDNA1 here as the crux of the discussion is on your (IMHO) flawed/inaccurate Ampere/Turing calculations, but needless to say I wouldn't be completely confident in these stated numbers either 🤷‍♂️

I don't understand what a lot of this means but thank you for dissecting his bullshit bit by bit. This isn't even a situation where there's a teraflop or 2 in it here; once everything is taken into account we are talking about at least double the amount of performance that will be in his precious plastic box of choice, regardless of which one he chooses. The guy has been running from thread to thread crying about Nvidia since the 1st of September telling everyone not to get excited about the new GPUs. How about...

tenor.gif


I wish we could round these fools up and put them in a special section of the forum so they can battle it out amongst themselves.
 
Last edited:

nochance

Banned
We need to stop discussing tflops as anything other than the number of a specific type of calculation that can be completed on a processor. PCs allow for the use of objective benchmarks to measure performance in real world scenarios. The absolute best that AMD is supposedly cooking up is supposed to compete with a 2080 Ti; assuming that the console parts will correspond to their low/mid range, we can expect gaming performance at about 30% of an RTX 3080 from them, regardless of the floating point figure.
 
Last edited:

Ascend

Member
This is my whole point all along.
TFLOPS were never meant to be used to compare gaming performance.
They are a metric for scientific applications that run only on FP Operations and compare devices based on this.
An Ampere GPU with 10 TF is just as good as a Pascal GPU with 10 TF. Obviously efficiency will be better, but the idea of FLOPS does not care about efficiency.
It is a raw performance metric, period. Nvidia says their cards will deliver up to 30 TF. They are not lying.
It's people thinking they can derive gaming performance from this, so they're fooling themselves.
This is not much different than when the GTX 970 was advertised as having 4GB. Technically it did, but the last 0.5GB was not up to par.

And I never said they will deliver those tf all the time!
Obviously they won't as its unrealistic as you said yourself!

That's what I meant by the fact that people don't understand the FLOPS metric at all.
If there were a metric by which we could exactly compare cards, we'd have no need for benchmarks. But that's the point.
We do not have a real metric on which to compare them scientifically accurately besides running multiple benchmarks and averaging out the results.
Have you noticed how many people are flailing the 30+TF numbers around on here? It causes unrealistic expectations, because none of these cards are 3x as fast as the ~10TF Turing cards.

Next time it will be AMD with a 100 Terafloops (tm) card and they'll gobble it up too.
I doubt it. AMD is scrutinized like crazy, but nVidia gets a pass for everything. If this were an AMD reveal, everyone would be slamming them for their TFs being unrealistic, whining about the power consumption, and talking about bad drivers. But I agree with the rest of your post.
 

psorcerer

Banned
It's because the utilization of those flops can only occur under specific circumstances, which are not being met by current software. Things may improve a bit with new compilers, but I doubt it.

Just shows you, your flops are only as good as the efficiency of your architecture.

No matter the architecture, you're always going to have data streams that don't align with the compute units on every cycle.

Again. When I said that Ampere is less efficient per FLOP, all hell broke loose.
 
But you should care about the opposite. If they are advertising 30 TF, but it's really 'only' 20 TF relative to the previous generation, it is deceiving, even if it is the fastest card available.

No. This just shows you don't understand what a TFLOP is. ( or how it's calculated )

Nvidia did in fact double the CUDA cores per SM. ( But they didn't double the TMUs or the ROPs )

So it's not like Nvidia changed the way TFLOPS are calculated. You're implying some kind of cheating where none exists.

The exact same way you calculate PS4 TFLOPS or Xbox TFLOPS is the exact same way you calculate Ampere TFLOPS, which in the 3080's case is ~30 TFLOPS.

Here's EXACTLY how it's calculated....

RTX 3080: 8704 (CUDA cores) x 1710 MHz (boost clock) x 2 (two operations per clock) = 29.77 TFLOPS.

And here's the kicker: the ACTUAL TFLOP count is likely somewhat higher than Nvidia is reporting (it always is). That reported boost clock is always LOW. The real in-game clocks are going to be higher, probably around 2000 MHz (I don't know exactly what they will be yet, that's just a rough guess based on their previous GPUs), which would make the 3080 actually a ~34.8 TFLOP GPU.
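If anyone wants to sanity-check that themselves, it's just a one-liner (the 2000 MHz figure below is the rough guess from the post above, not a confirmed clock):

```python
def tflops(cuda_cores, clock_mhz, ops_per_clock=2):
    # cores x clock x 2 ops/clock (an FMA counts as a multiply plus an add)
    return cuda_cores * clock_mhz * 1e6 * ops_per_clock / 1e12

print(tflops(8704, 1710))  # RTX 3080 at the official 1710 MHz boost -> ~29.77 TFLOPS
print(tflops(8704, 2000))  # at a guessed ~2000 MHz in-game clock    -> ~34.8 TFLOPS
```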
 
Last edited:

OmegaSupreme

advanced basic bitch




The rest of your post is kinda irrelevant.
Please read the OP carefully. There are no errors there. Sorry (at least not the ones that you claim)




I'm combative? :messenger_grinning:
Only with trolls and miserable cunts
 

psorcerer

Banned

Ascend

Member
Personally I've no problem with anyone who wants to deep-dive into the numbers these companies provide us. However, at least after looking over the OP's conditions for their calculations, I don't think they're accurate. Not the calculations themselves, but the foundation and context for them, because there are parts of the details NV provided that he either ignored or didn't catch, and he then did calculations with conditionals applied to only one side rather than both, as he seemingly said he would do at the outset.

If the conditions and context for calculations look suspect, I think that is worth questioning, as long as it's respectful. FWIW there's been a rather strong push by some to downplay Nvidia's stuff, especially on the I/O front, following their presentation. If you look a little deeper you can infer why some people are doing it, too, but I'll leave that for another time; in fact I don't think it's really necessary to say why at this point :LOL:
The I/O stuff is what it is.

I think his numbers are fine to be honest. Let me try and pitch it from another perspective, and maybe you'll understand...

RTX 2080: 10.7TF
RTX 3080: 29.8TF
RTX 3080 = 179% faster
DF benchmarks suggest RTX 3080 = ~80% faster at best

You don't see this as a problem? There is practically a 10TF mark-up on the RTX 3080.
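Or, to put the same numbers side by side (quick sketch; the ~80% figure is just the DF ballpark mentioned above):

```python
tf_2080, tf_3080 = 10.7, 29.8

paper_gap = tf_3080 / tf_2080 - 1   # ~1.79, i.e. "179% faster" on paper
measured_gap = 0.80                 # ~80% faster at best in DF's early look

print(f"Paper TF gap:   {paper_gap:.0%}")
print(f"Measured gap:   {measured_gap:.0%}")
# TF a 3080 would need if its flops scaled like Turing's: ~19.3 TF,
# which is roughly where the "~10TF mark-up" impression comes from.
print(f"Turing-equivalent TF: {tf_2080 * (1 + measured_gap):.1f}")
```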
 
Last edited:

OmegaSupreme

advanced basic bitch
I don't understand what a lot of this means but thank you for dissecting his bullshit bit by bit. This isn't even a situation where there's a teraflop or 2 in it here; once everything is taken into account we are talking about at least double the amount of performance that will be in his precious plastic box of choice, regardless of which one he chooses. The guy has been running from thread to thread crying about Nvidia since the 1st of September telling everyone not to get excited about the new GPUs. How about...

tenor.gif


I wish we could round these fools up and put them in a special section of the forum so they can battle it out amongst themselves.
Yep. Replying for visibility.
 
Ok, but when Nvidia talks about 30 teraflops, they clearly try to make it sound far better than it is. This part is on them. Nobody forced them to use bullshit numbers.

But that's the thing. Nvidia is NOT using "bullshit numbers"

The EXACT same way that you calculate PS5 TFLOPS or XSX TFLOPS is the exact same way you calculate Ampere TFLOPS, which in the 3080's case is ~30 TFLOPS. ( and in reality it's going to be closer to 35 TFLOPS for most 3080 owners, and I'd be more than happy to explain why this is true, if you'd like ).
 
Last edited:

psorcerer

Banned
Ok, but when Nvidia talks about 30 teraflops, they clearly try to make it sound far better than it is. This part is on them. Nobody forced them to use bullshit numbers.

I actually have a theory that they're trying to show they trump the XBSX.
Because Huang was comparing stuff to the XBSX in the presentation.
And these TFs are highly comparable to RDNA (1/2) ones, it seems.
 

Ascend

Member
But that's the thing. Nvidia is NOT using "bullshit numbers"

The EXACT same way that you calculate PS5 TFLOPS or XSX TFLOPS is the exact same way you calculate Ampere TFLOPS, which in the 3080's case is ~30 TFLOPS. ( and in reality it's going to be closer to 35 TFLOPS for most 3080 owners, and I'd be more than happy to explain why this is true, if you'd like ).
I'm listening.
 

CuNi

Member
This is not much different than when the GTX 970 was advertised as having 4GB. Technically it did, but the last 0.5GB was not up to par.
Have you noticed how many people are flailing the 30+TF numbers around on here? It causes unrealistic expectations, because none of these cards are 3x as fast as the ~10TF turing cards.

Listen, I do get your point. I'm not calling those practices right in any way, but they are not wrong either.
That's like saying my car has 4 seats. It can hold 4 people, but that doesn't mean there will be 4 people in it every time. Same for buses.
I don't know about US trucks, but I'd say they are like the EU ones where the bigger ones have a liftable axle. You might not use that axle 70% of the time, maybe you use it 100% of the time, maybe even never.
But that axle still is there. 30TF is a potential limit that the card can reach in specific situations, just as a 2080 can theoretically reach 10.7TF.
And yes, technically it was correct that the 970 (which, funnily enough, I own right now) has 4GB. It does, but it's also true that only 3.5GB was connected via high bandwidth and 0.5GB was slow.
Doesn't change the fact that it had 4GB. And that is a perfect example of why single numbers without context mean nothing.
The 970 was benchmarked and it performed exceptionally well for its price point. Then, at a later date, the whole thing with 3.5GB + 0.5GB leaked and there was a huge outcry.
Did it change how the card performed? No, it did not, but I bet people would have had different expectations if you had told them it's a 3.5GB + 0.5GB card instead of a 4GB card.
But in the end, it performed as well as it did, and that performance is what we care about, not some numbers on a spec sheet. We're not running scientific applications on this, so we don't care about theoretical TFLOPS or GB/s VRAM speeds.

If you are honest with yourself, we care about gaming performance. If a card with 1MB VRAM, 1 FLOP and PCIe 1.0 would magically deliver 240 FPS in 8K HDR with RTX on, we'd be happy and still buy that card.
And this is what I mean in the very end. So many gamers just look at some arbitrary number and pit cards against each other with it, when in reality GPUs do so much more work in the background that it's nearly impossible to infer in-game performance from data sheets. Heck, just look at the varying game performance between different engines/game types etc.
 
The I/O stuff is what it is.

I think his numbers are fine to be honest. Let me try and pitch it from another perspective, and maybe you'll understand...

RTX 2080: 10.7TF
RTX 3080: 29.8TF
RTX 3080 = 179% faster
DF benchmarks suggest RTX 3080 = ~80% faster at best

You don't see this as a problem? There is practically a 10TF mark-up on the RTX 3080.

TBH I forgot DF did another video, didn't they? I have to watch it. However, something I remember is that for other NV cards they usually state the lower-bound TF performance; generally you don't have to do much out of the box to get higher performance on their cards. I dunno why they list their card performance that way when they could market the higher numbers, but I guess that's NV for ya.

Still though, I think the conditions for his calculations are what's suspect, not the actual numbers themselves (i.e just looking at them as equations in isolation the results match up. I'm more suspicious about the terms he states to use those numbers in the first place, though).

I don't understand what a lot of this means but thank you for dissecting his bullshit bit by bit. This isn't even a situation where there's a teraflop or 2 in it here; once everything is taken into account we are talking about at least double the amount of performance that will be in his precious plastic box of choice, regardless of which one he chooses. The guy has been running from thread to thread crying about Nvidia since the 1st of September telling everyone not to get excited about the new GPUs. How about...

tenor.gif


I wish we could round these fools up and put them in a special section of the forum so they can battle it out amongst themselves.

True, he's been adamant about "being realistic" (more so downplaying, IMO) about Ampere since the announcement. I don't know why; I guess some people are looking at this from the perspective of the consoles and, from their interpretation, think these new cards make the consoles look bad?

Because they honestly don't. Both consoles still have two main advantages over PCs regardless of how good these and future cards are: performance-per-dollar (I'm talking in terms of the overall features and performance the consoles have compared to a graphics card), and a large install base that incentivizes 3rd parties to use consoles as the baseline. PCs are never going to have those advantages, so it's not really any surprise they (at least NV so far; AMD will hopefully be competitive here as well) have just increased the raw performance gap and narrowed the I/O gap between themselves and the consoles considerably (I think it's fair to say the consoles still have a few benefits here, but not stuff that makes a generational difference).

Maybe the surprise is more at how quickly this has all happened? TBH I wasn't really keeping up with GPU news all that much, and figured any robust I/O advancements would be at least a year out. Which I guess they still are for PCs in general outside of NV and (possibly) AMD GPUs; DirectStorage isn't coming until 2021 in that market. But how anyone can look at this and make it out as a bad thing is honestly a bit laughable IMHO. It just helps make things more balanced between all the options and definitely benefits 3rd-party devs a ton.

At the very least I've got a new GPU to look at when upgrading my PC a little later in the year. Just hope the prices don't suddenly creep up again :S.
 

Mentat02

Banned
Nvidia really hurt a lot of people with their reveal of the new GPUs. Now you have PS5 fanboys who have never had a PC gaming rig claiming they are going to buy a new 3000 series card because Xbox is irrelevant. On the other hand you have Xbox fanboys panicking because they think no one will buy their plastic box of choice because of the Ampere GPUs. What planet do these weirdos live on? A lot of PC gamers really don't care for consoles, and no, the sales of your precious plastic boxes won't be compromised because of one piece of hardware.
 

psorcerer

Banned
30TF is a potential limit that the card can reach in specific situations, just as a 2080 can theoretically reach 10.7TF.

But that assumption is incorrect.
2080 Ti can reach 15TF if half of the operations happen to be INT32 (and it again would be exactly 50% of 3080).
Either we don't count INT32 at all and then it seems like the whole INT32 separate Turing pipeline was not needed (it wasn't)
Or if we do count it, then we need to somehow take it into account.
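Rough sketch of the two counting conventions being described here, using the usual paper specs (4352 cores at a 1545 MHz boost for the 2080 Ti, so a bit under the ~15TF he mentions; at the higher real-world clocks discussed earlier in the thread it lands closer to that). The 50/50 FP32/INT32 split is his hypothetical, not mine:

```python
def tflops(cores, clock_mhz):
    return cores * clock_mhz * 1e6 * 2 / 1e12

fp32_2080ti = tflops(4352, 1545)   # ~13.4 TF FP32 (plus a matching INT32 pipe alongside it)
fp32_3080   = tflops(8704, 1710)   # ~29.8 TF FP32, but only if every lane runs FP32

# Convention 1: count FP32 only (what the spec sheets quote).
print(f"{fp32_2080ti:.1f} vs {fp32_3080:.1f} TF")            # ~13.4 vs ~29.8

# Convention 2: acknowledge the INT32 work. On a 50/50 FP32/INT32 stream,
# half of Ampere's lanes flip to INT32, halving its FP32 rate...
print(f"3080 FP32 on a 50/50 mix: {fp32_3080 / 2:.1f} TF")    # ~14.9 TF
# ...while Turing's dedicated INT32 pipe lets the 2080 Ti keep its full FP32 rate.
print(f"2080 Ti FP32 on a 50/50 mix: {fp32_2080ti:.1f} TF")   # ~13.4 TF
```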
 

KungFucius

King Snowflake
These cards are not just marketed to gamers. People use them for high performance computing. The marketing there is correct. It can do 30 TF. You cry that that is bullshit because gaming requires the use of some of those cores for INT OPs. So what? They showed how the gaming performance scales relative to their current lineup.

In terms of floating point operations per second, the 3080 does 30 TF and the 2080 does whatever its specs say. The gaming performance always differs from architecture to architecture, from game to game. The 3080 is the best card you can get for 700 bucks on 9/17, and may be for a year or so.
 

Hendrick's

If only my penis was as big as my GamerScore!
Yes I was gonna post about this but the tugging off merry-go-round (circle jerk) was so overwhelming the other day I let it slide.

This '30 tflops' is Nvidia BS marketing. Also marketing was the 1.9x perf-per-watt improvement. Yeah, only when reaching 60fps at certain settings in a certain title comparing certain cards!
As opposed to the console I/O nonsense, which isn't PR marketing at all, right?
 
But that assumption is incorrect.
2080 Ti can reach 15TF if half of the operations happen to be INT32 (and it again would be exactly 50% of 3080).
Either we don't count INT32 at all and then it seems like the whole INT32 separate Turing pipeline was not needed (it wasn't)
Or if we do count it, then we need to somehow take it into account.

This post just highlights that you don't understand what a TFLOP is or how it's calculated.

You also seem to believe that 2X the TFLOP count should equal 2X the framerate, but this isn't, and has never been, true ( because these two metrics are not directly tied to each other )

The fact that you aren't seeing 2X FLOPS = 2X framerate is leading you to believe there is some funny business going on that you can't quite put your finger on.
 

OmegaSupreme

advanced basic bitch
This post just highlights that you don't understand what a TFLOP is or how it's calculated.

You also seem to believe that 2X the TFLOP count should equal 2X the framerate, but this isn't, and has never been, true ( because these two metrics are not directly tied to each other )

The fact that you aren't seeing 2X FLOPS = 2X framerate is leading you to believe there is some funny business going on that you can't quite put your finger on.
This is the same guy who thinks 30fps is fine and 60fps and higher is overrated.
 
Yeah but DF compared it directly to Turing in games and there was a 50 - 70% performance increase. So it’s still a big increase.

Exactly. This is a HUGE increase.

The last time we saw an increase this large was the monstrous jump from the 9XX series to the 10XX series. Except this time it's probably a bit larger!

The jump from the 10XX series to the 20XX series was small ( only about 25-30%, and that's WITH a 70% increase in price ). This is ONE of the reasons why so many people felt that Turing ( 20XX series ) was such a joke and a ripoff.

Historically speaking, the jump from Turing to Ampere is about as good as it gets! PC gamers are extremely impressed and rightfully so.
 
Last edited:

Ascend

Member
Listen, I do get your point. I'm not calling those practices right in any way, but they are not wrong either.
That's like saying my car has 4 seats. It can hold 4 people, but that doesn't mean there will be 4 people in it every time. Same for buses.
I don't know about US trucks, but I'd say they are like the EU ones where the bigger ones have a liftable axle. You might not use that axle 70% of the time, maybe you use it 100% of the time, maybe even never.
But that axle still is there. 30TF is a potential limit that the card can reach in specific situations, just as a 2080 can theoretically reach 10.7TF.
And yes, technically it was correct that the 970 (which, funnily enough, I own right now) has 4GB. It does, but it's also true that only 3.5GB was connected via high bandwidth and 0.5GB was slow.
Doesn't change the fact that it had 4GB. And that is a perfect example of why single numbers without context mean nothing.
The 970 was benchmarked and it performed exceptionally well for its price point. Then, at a later date, the whole thing with 3.5GB + 0.5GB leaked and there was a huge outcry.
Did it change how the card performed? No, it did not, but I bet people would have had different expectations if you had told them it's a 3.5GB + 0.5GB card instead of a 4GB card.
But in the end, it performed as well as it did, and that performance is what we care about, not some numbers on a spec sheet. We're not running scientific applications on this, so we don't care about theoretical TFLOPS or GB/s VRAM speeds.

If you are honest with yourself, we care about gaming performance. If a card with 1MB VRAM, 1 FLOP and PCIe 1.0 would magically deliver 240 FPS in 8K HDR with RTX on, we'd be happy and still buy that card.
And this is what I mean in the very end. So many gamers just look at some arbitrary number and pit cards against each other with it, when in reality GPUs do so much more work in the background that it's nearly impossible to infer in-game performance from data sheets. Heck, just look at the varying game performance between different engines/game types etc.
I can't say that you're wrong. But I'm glad you get the point.

Yeah but DF compared it directly to Turing in games and there was a 50 - 70% performance increase. So it’s still a big increase.
RTX 3080 vs RTX 2080, not vs RTX 2080 Ti. This increase is big, but from all sides they are making the performance gap appear larger than it is, despite it being quite large already. One has to ask why.
 
Last edited:

Kerlurk

Banned
 
Last edited:

psorcerer

Banned
We still need to wait for more benchmarks, but at this point it's looking like the jump from Turing to Ampere will be equal to or greater than the jump from Maxwell to Pascal.

I think it's bigger only if you count the 3090, which is kinda bad perf per dollar compared to Pascal.
But maybe the 3080 Ti will obliterate everything.
 

CuNi

Member
But that assumption is incorrect.
2080 Ti can reach 15TF if half of the operations happen to be INT32 (and it again would be exactly 50% of 3080).

I'm sorry, I mixed up the 2080's ~10TF with the 2080 Ti's, which is roughly 13.5TF.
To my knowledge, if it's 100% FP32, the 2080 Ti does 13.5 TFLOPS at maximum boost clock.

Either we don't count INT32 at all and then it seems like the whole INT32 separate Turing pipeline was not needed (it wasn't)
Or if we do count it, then we need to somehow take it into account.

That's what I was trying to talk about. We can't just look at TFLOPS and try to infer a card's performance, because it's way more than just FP32. You have INT32, memory bandwidth, clocks, etc.
FLOPS are only for scientific applications where the card is only doing floating point calculations. I already wrote somewhere further up that we as gamers care more about overall performance than theoretical power in scientific applications. The only real way to pit cards against each other in a fair way and compare them is in benchmarks. Everything else is edge cases and synthetic benchmarking, which won't translate into real world performance well, or even at all.
 
Yes, it seems to be. But it's not 3 times as fast, as the TFLOP count would lead you to believe.

This is the crux of this argument.

It IS "3x as fast" but that sentence by itself is meaningless. 3X faster at what? At maximum theoretical floating point operations per second - it IS. Just because you have 3X the TFLOPS doesn't mean 3X the framerate is some kind of guarantee. Far from it.

Look at it from the perspective of PS4 ( 1.8TFLOPS) to PS5 (10TFLOPS).

That's 5.55X the FLOPS.

So now think of a game that runs at 1080p 60FPS on PS4. Are you really expecting the PS5 to be able to run the same game at 333FPS? Cuz that shit ain't happening. ( maybe in some rare, contrived example )

Does this mean that the PS5's TFLOP count is "bullshit"? Using the "logic" of some people ITT, it would.

Very rarely does double the power equate to double the framerate. Some engines are just inefficient and doubling the framerate will require far more than double the power.
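A tiny sketch of that PS4-to-PS5 example, just to show how silly the naive extrapolation gets:

```python
ps4_tf, ps5_tf = 1.8, 10.0

ratio = ps5_tf / ps4_tf        # ~5.6x the paper FLOPS
naive_fps = 60 * ratio         # a 1080p/60 PS4 title "should" therefore hit ~333 fps

print(f"{ratio:.2f}x the FLOPS -> naive {naive_fps:.0f} fps")
# No real game scales like this: resolution targets, memory bandwidth, CPU load
# and engine overhead all move the bottleneck long before the shader ALUs do.
```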
 
Last edited:

Ascend

Member
Listening to Broken Silicon right now... They basically agree. "An Ampere CUDA core is at least 33% weaker than a Turing one. If you compare IPC per CUDA core, they've lost a lot of IPC to pack in this many cores"

Timestamped;
 
Last edited:
That's a very narrow way of looking at it.

Especially if you're going to take that isolated factoid and use it to mean that "Nvidia is pushing BS numbers"

That's not what's happening here. We just have an architectural change that focuses the improvements in different places. For example, the RT cores and Tensor cores are far MORE efficient in Ampere than they were in Turing, and Ampere does in fact have DOUBLE the CUDA cores per SM.

And the result? A massive increase in performance over Turing, which is exactly what every gamer wanted. ( And didn't get in Turing over Pascal )

Nvidia gaming GPUs are literally made up of 3 completely different types of cores now: CUDA cores, RT cores and Tensor cores.

And when someone starts throwing around TFLOP numbers, they are only addressing a simple calculation that ONLY takes into account the number of shaders in the CUDA cores, and completely ignores the other 2/3 of the GPU.
 
Last edited:
Listening to Broken Silicon right now... They basically agree. "An Ampere CUDA core is at least 33% weaker than a Turing one. If you compare IPC per CUDA core, they've lost a lot of IPC to pack in this many cores"

Timestamped;


I wouldn't listen too closely to MLID; I won't get into specifics here, but he's made some rather inflammatory (and wrong) claims about other products in the recent past.
 
Last edited: