Welp, at least you aren't claiming it's .5 Turing TF like you did in the other threads...but you're still wrong. Here's why.
Reddit Q&A
Read that bolded part carefully.
Good. Addition, yay! So let's see where you went wrong...
And there it is. You can't only account for raw numerical performance; process node efficiency gains also have to be taken into account. So even if the raw numbers average out to the same across both architectures, Ampere still sees IPC gains simply by being on a newer process (not to mention having other hardware present to offload certain work more efficiently than Turing can, such as RT, DLSS and AI through the Tensor cores; equivalent performance in those areas on Turing would've required more raw GPU resources to cover the gap).
I don't see why you're doing the math this way. Nvidia says they see 36 ADDITIONAL INT32 OPs for each 100 FP32 OPs. Just previously you listed Turing as 64 INT32 + 64 FP32, and one of Ampere's modes as the same. So wouldn't the division above be worthless at that point? In both cases you get 128 OPs per SM per cycle; the two Ampere numbers are clearly an either/or: on any given cycle the SMs can operate in either full FP32 or mixed FP32/INT32 mode.
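To make the per-SM numbers concrete, here's a quick sketch. It assumes only the figures quoted in this thread (Turing at 64 FP32 + 64 INT32 per SM per clock; Ampere at either 128 FP32 or 64 FP32 + 64 INT32). The table layout and function name are made up for illustration, and it only counts peak issue slots, nothing about real-world throughput.

```python
# Peak ops per SM per clock, using only the figures quoted in this thread.
# The dictionary layout and names are illustrative, not from any Nvidia tool.
SM_MODES = {
    "Turing": {
        "mixed": {"fp32": 64, "int32": 64},
    },
    "Ampere": {
        "full_fp32": {"fp32": 128, "int32": 0},
        "mixed": {"fp32": 64, "int32": 64},
    },
}


def peak_ops_per_clock(arch: str, mode: str) -> int:
    """Total FP32 + INT32 ops one SM can issue per clock in the given mode."""
    lanes = SM_MODES[arch][mode]
    return lanes["fp32"] + lanes["int32"]


if __name__ == "__main__":
    for arch, modes in SM_MODES.items():
        for mode in modes:
            total = peak_ops_per_clock(arch, mode)
            print(f"{arch:6s} {mode:9s} {total} ops/SM/clock")
```

Every mode comes out to the same total per SM per clock; the only thing that changes on Ampere is how that total is split between FP32 and INT32.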
None of your calculations make sense in context here. By the description at the top of your post, both Turing and Ampere are capable of the same number of INT32 OPs per clock cycle. Which means your numbers here should apply to both Ampere AND Turing, which ultimately means that the performance delta between an Ampere TF and a Turing TF stays the same, i.e. 2 TF Ampere would = 2 TF Turing (before factoring in node gains and improvements on earlier tech, API algorithms etc. present on Turing and continued in Ampere, which would actually increase the Ampere performance over Turing, not decrease it).
Wrong; you seem to have forgotten that even with its new pipeline architecture, Ampere is capable of the same number of INT32 OPs per clock cycle as Turing, and your numbers only applied the conditional to Ampere while ignoring it for Turing (even though you claimed you were going to apply it to both in the sentence before these calculations).
Do you have a source where they specifically phrased 3080 performance in this manner?
Same as above two.
Again, where is a source that quotes official reps from Nvidia claiming this exact figurative comparison metric? You can't claim they said this or that if you can't source it yourself.
Yes, "up to", as in, depending on what the game itself requires to be performed for calculations. Actually let's go back for a bit because I think you misread this following quote:
So reading this again, it really does look like you got wonky with your calculations because it's Turing that would be hindered by running INT32 instructions on a clock cycle, not Ampere, since FP32 instructions would have to wait their turn until INT32 instructions are completed.
Don't see where you're getting this from, especially considering I looked at your calculations and they seem dubious at best IMHO.
Interesting speculation, but in light of what you've posted before, I don't know if the foundation of this speculation is necessarily sound.
So this is basically a recap of your calculations that I already touched on above, no need to repeat myself.
Needless to say, I think the context and conclusions of your calculations are inaccurate, because I don't think you initialized conditions for those calculations correctly.
I'm not interested in discussing RDNA1 here, as the crux of the discussion is your (IMHO) flawed/inaccurate Ampere/Turing calculations, but suffice it to say I wouldn't be completely confident in these stated numbers either.