
PS4 Pro and Half Floats. 16bit at 8.4 TF vs. 32bit at 4.2 TF. Explanation on this?

Status
Not open for further replies.
No, that's definitely not what I'm saying.

It's a 4.2 TF machine (since "TFlop" without additional qualifiers in graphics generally refers to FP32 in this day and age). Depending on the individual workload profile of a given game, and how much developers are willing to invest into optimizations which won't do anything for the larger (non-Pro) part of their audience, it could perform faster than that (but never close to twice as fast).
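
To put rough numbers on "faster than that (but never close to twice as fast)": the gain follows an Amdahl's-law-style curve in the fraction of ALU work that actually runs at double rate. A minimal sketch (the fractions are purely illustrative; 4.2 is the FP32 figure from the thread title):

```python
# Hypothetical Amdahl's-law estimate: only a fraction p of the ALU work
# can run at double-rate FP16; the rest stays at the FP32 rate.
def effective_speedup(p):
    """p = fraction of ALU work eligible for double-rate FP16, in [0, 1]."""
    return 1.0 / ((1.0 - p) + p / 2.0)

base_tflops = 4.2  # PS4 Pro FP32 rate
for p in (0.0, 0.3, 0.5, 1.0):
    print(f"p={p:.1f}: {base_tflops * effective_speedup(p):.2f} effective FP32-equivalent TFLOPS")
```

Even at p = 0.5 the effective gain is only about 1.33x, which is why "never close to twice as fast" is the right expectation.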

I'm not a game software dev, but with my limited coding knowledge, this is along the lines of what I was thinking. The fact that it's only on the PS4 Pro could severely limit the number of developers who choose to invest in coding for it, so it will likely make little difference in bridging the performance gap (necessary to get to native 4K) for 3rd-party titles. For first-party titles it could make some difference, though (no clue how much).
 

blu

Wants the largest console games publisher to avoid Nintendo's platforms.
I'm not a game software dev, but with my limited coding knowledge, this is along the lines of what I was thinking. The fact that it's only on the PS4 Pro could severely limit the number of developers who choose to invest in coding for it, so it will likely make little difference in bridging the performance gap (necessary to get to native 4K) for 3rd-party titles. For first-party titles it could make some difference, though (no clue how much).
And switch.
 
And switch.

Yes, sorry, I wasn't thinking of the Switch because... well, Nintendo doesn't usually get most of the big AAA games on its platforms, so the crossover between Pro and Switch games may be limited.
(Though I hope the Switch's success will convince more developers to start making games for it.)
 

blu

Wants the largest console games publisher to avoid Nintendo's platforms.
Games which will target both Switch and PS4Pro/Scorpio prooooooobably won't have any issues running on the latter without any mixed precision in use.
You mean we can't have games that run at 720@30 on the switch and 4K@60 on the pro?
 
Something tells me flops numbers just fell off the PR train.
That's too bad, TFLOPS has been my favorite marketing jargon that so many people don't understand.

Time for MEGAPIXEL to make a comeback?

I like the other one being thrown around too, "Bottleneck" - love me some good ol' fashioned "Bottleneck" posts.

As far as FP16, which has been around for a long time in the PC world, it will be interesting to see how this may be used by developers. I don't think the impact will be quite as great as some are touting, but I also don't think it's meaningless.
 
Actually, it is exactly the opposite of what you're saying. Mobile SoCs are usually advertised with FP16 benchmarks, while desktops are advertised with FP32 benchmarks, especially when it comes to graphics benchmarks.

Scientific research and simulation usually use FP64, and AI deep learning uses FP16.

It's all relative to what you're doing.

To answer the OP's question: yes, FP16 can be used in many cases in games.

Reading an actual developer's comments on this topic, it seems the idea is that almost all current games output images at 8 bits per channel, so it would be a waste to do all the intermediate math at 32-bit when you can do it at 16-bit without any distinguishable loss in quality.
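
That reasoning can be sketched in a few lines, assuming a made-up multiply-add shading sum (Python's struct `'e'` format round-trips a value through IEEE 754 binary16):

```python
import struct

def to_fp16(x):
    # Round-trip a Python float through IEEE 754 binary16 storage.
    return struct.unpack('<e', struct.pack('<e', x))[0]

def to_8bit(x):
    # Quantise a 0..1 colour value to an 8-bit output channel.
    return round(max(0.0, min(1.0, x)) * 255)

# Toy diffuse term: albedo * light + ambient (the values are made up).
albedo, light, ambient = 0.73, 0.61, 0.05
fp32_result = albedo * light + ambient
fp16_result = to_fp16(to_fp16(albedo) * to_fp16(light) + to_fp16(ambient))

# The intermediates differ slightly, but the 8-bit outputs agree.
print(fp32_result, fp16_result)
print(to_8bit(fp32_result), to_8bit(fp16_result))
```

The two intermediate results are not bit-identical, but once quantised to the 8-bit output channel the difference disappears, which is the developer's point.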

For example, Killzone Shadow Fall.
It's important to note the context here.

The keyword is "bandwidth bound", which is usually the case wherever FP16 can be used with decent precision (mostly in the -1 to 1 range). So doubling the math throughput isn't much of an advantage, but cutting the bandwidth in half is, and that benefit is already available even on architectures that don't support double-rate FP16.
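
The storage half of that argument is easy to quantify. A sketch for a hypothetical single-channel 1080p buffer (sizes only, no GPU involved):

```python
import struct

# Same number of texels, half the bytes when stored as binary16.
width, height = 1920, 1080
texels = width * height

fp32_bytes = texels * struct.calcsize('<f')  # 4 bytes per texel
fp16_bytes = texels * struct.calcsize('<e')  # 2 bytes per texel

print(f"FP32 buffer: {fp32_bytes / 2**20:.1f} MiB")
print(f"FP16 buffer: {fp16_bytes / 2**20:.1f} MiB")
```

Halving the bytes moved per frame helps a bandwidth-bound pass regardless of whether the ALUs can also do double-rate FP16 math, which is exactly the poster's point.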
 

Dynomutt

Member
That's too bad, TFLOPS has been my favorite marketing jargon that so many people don't understand.

Time for MEGAPIXEL to make a comeback?

I like the other one being thrown around too, "Bottleneck" - love me some good ol' fashioned "Bottleneck" posts.

As far as FP16, which has been around for a long time in the PC world, it will be interesting to see how this may be used by developers. I don't think the impact will be quite as great as some are touting, but I also don't think it's meaningless.

Marketing jargon is unprecedented...
 
Marketing jargon is unprecedented...



Uncompressed mega-teraflops

You forgot Bottlenecks.

Uncompressed bottlenecks featuring mega-teraflops
 

gofreak

GAF's Bob Woodward
As far as FP16, which has been around for a long time in the PC world, it will be interesting to see how this may be used by developers.

It hasn't, afaik. Not sure, but I think there's some confusion going on between FP16 compute-rate optimisation and the use of FP16 storage formats - the latter has been around forever, the former not, or at least not in this class of desktop/console GPU hardware.

For developers this is only becoming an option now... we'll have to wait to get some idea of how much optimisation mileage is typically there. It'll vary, obviously. I know the Mantis Burn Racing devs seemed to credit it heavily in letting them hit 4K/60 on Pro, but I guess they were quite compute/ALU bound at that res/fps.
 

Durante

Member
It hasn't afaik.
The Geforce FX series was twice as fast at FP16 compute compared to FP32 compute, in 2003.

Few developers were using it, gamers were blaming Valve (among others) for "intentionally crippling" game performance for not explicitly optimizing for it, all the good stuff.
 

ethomaz

Banned
The Geforce FX series was twice as fast at FP16 compute compared to FP32 compute, in 2003.

Few developers were using it, gamers were blaming Valve (among others) for "intentionally crippling" game performance for not explicitly optimizing for it, all the good stuff.
If I remember correctly, Valve's engine in HL2 could run in both modes... the blame came from Valve forcing FP32 by default on the GeForce FX, making performance tank... you needed to switch to FP16 mode via the app or config files to get good performance on FX cards... there are differences in image quality in HL2 between the two modes, too.

And you are right, FP16 is really old... even the Voodoo supported it... the actual modern change is that Pro and Vega run FP16 twice as fast as FP32 in consumer gaming GPUs (something that already happened in other markets).
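
As an aside on how the double-rate trick works mechanically: packed-math hardware operates on two half floats sharing one 32-bit register. The layout can be sketched with Python's binary16 (`'e'`) struct format (this shows the data packing only, not any particular GPU instruction set):

```python
import struct

def pack2(a, b):
    # Two IEEE 754 binary16 values occupy exactly one 32-bit word.
    return struct.unpack('<I', struct.pack('<ee', a, b))[0]

def unpack2(word):
    # Reinterpret one 32-bit word as two binary16 values.
    return struct.unpack('<ee', struct.pack('<I', word))

word = pack2(1.5, -2.0)
print(hex(word))      # both half-float encodings visible in one 32-bit value
print(unpack2(word))  # round-trips back to (1.5, -2.0)
```

One packed instruction can then operate on both halves at once, which is where the doubled rate comes from.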
 

gofreak

GAF's Bob Woodward
The Geforce FX series was twice as fast at FP16 compute compared to FP32 compute, in 2003.

Few developers were using it, gamers were blaming Valve (among others) for "intentionally crippling" game performance for not explicitly optimizing for it, all the good stuff.

Has it been consistently available ever since? I thought double-rate FP16 was something current consumer PC GPUs were only starting to flirt with (again) in recent times, at least in terms of being advantageous performance-wise.

From a 'what to expect from developers' point of view, if it hasn't been consistently available and performant, then the point stands that we don't really have much precedent to go on. If it has been consistently available, but ignored, that would speak for itself...
 

SappYoda

Member
I found an interesting article written in 2004 about FP16 vs FP32

http://www.hwupgrade.it/articoli/skvideo/1013/radeon-x800-e-il-momento-di-r420_15.html

Here's a sample, translated from the Italian:
Since VS and PS 2.0 introduced floating-point calculation, there have been many discussions about which bit count should be the standard. ATI chose 24 bits per component, which gives a total precision of 96 bits (24 bits * 3 (RGB) + 24 bits (alpha)). According to the Canadian company, this mode offers a good compromise between quality and speed. NVIDIA, however, took a more dynamic road: 16-bit and 32-bit. If a shader doesn't involve overly complex calculations, the American company believes FP16 precision is more than enough; otherwise it is possible to use 32 bits per channel for a total of 128 bits. However, the main problem with the NV3x architecture has always been that enabling FP32 calculation causes a real collapse in performance compared to FP16 mode. This was to be expected, given that higher precision requires more registers and more bandwidth.

In that article there are some screenshot comparison using Far Cry. Can you spot the difference?

FP16: [screenshot: shader_nv40_16_1.jpg]

FP32: [screenshot: shader_nv40_32_1.jpg]

FP16: [screenshot: shader_nv40_16_2.jpg]

FP32: [screenshot: shader_nv40_32_2.jpg]


And here are some benchmarks with FP16 vs FP32:
http://www.hwupgrade.it/articoli/skvideo/1013/radeon-x800-e-il-momento-di-r420_19.html
 

blu

Wants the largest console games publisher to avoid Nintendo's platforms.
In that article there are some screenshot comparison using Far Cry. Can you spot the difference?
Clearly (on my good monitor).

That said, it doesn't look so much like a difference due to general FP16 arithmetic as a borked power function (read: a low-precision software or hardware approximation).
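
That hypothesis is easy to reproduce in miniature: raising a quantised base to a large specular exponent amplifies the error exactly where the highlight peaks. A sketch (the exponent and angle values are arbitrary, and only the base is quantised here, not the pow itself):

```python
import struct

def to_fp16(x):
    # Round-trip a Python float through IEEE 754 binary16 storage.
    return struct.unpack('<e', struct.pack('<e', x))[0]

# Blinn-Phong-style highlight: pow(n_dot_h, shininess).
shininess = 64
for n_dot_h in (0.999, 0.99, 0.9):
    exact = n_dot_h ** shininess
    approx = to_fp16(n_dot_h) ** shininess
    print(f"n.h={n_dot_h}: exact={exact:.5f}, fp16 base={approx:.5f}")
```

A genuinely low-precision pow approximation in software or hardware would diverge further; the point is just that error near the top of the curve lands in the brightest pixels, i.e. the specular hotspots.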
 

ethomaz

Banned
In that article there are some screenshot comparison using Far Cry. Can you spot the difference?
The difference is clear, like I posted in the other thread (thanks for the link, btw).

The FP16 pictures are less sharp, the lighting loses intensity, and there are artifacts not present in the FP32 pictures... the FP32 pictures are clean, with better reflections.
 

blu

Wants the largest console games publisher to avoid Nintendo's platforms.
The difference is clear, like I posted in the other thread (thanks for the link, btw).
I don't know what you mean by 'less sharp' - I don't see any difference in sharpness, but yes, there are artifacts in a couple of high-intensity specular hotspots (again, pointing to a faulty power function).

ps: there are no 'reflections' whatsoever in these shots. It's all specular hotspots.
 

AgentP

Thinks mods influence posters politics. Promoted to QAnon Editor.
16 bits is going to be too little dynamic range for most data, but maybe it's useful in some places.
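
For reference, the actual limits of binary16 can be checked directly (Python's struct `'e'` format implements IEEE 754 binary16):

```python
import struct

def to_fp16(x):
    # Round-trip a Python float through IEEE 754 binary16 storage.
    return struct.unpack('<e', struct.pack('<e', x))[0]

print(to_fp16(65504.0))       # largest finite binary16 value
print(to_fp16(2049.0))        # rounds to 2048.0: integer spacing is 2.0 up here
print(to_fp16(1.0 + 2**-11))  # rounds back to 1.0: ~3 decimal digits near 1

try:
    struct.pack('<e', 70000.0)  # beyond 65504 there is no finite encoding
except OverflowError:
    print("70000.0 does not fit in binary16")
```

Which supports the point: fine for normalised colour and direction data in roughly the -1 to 1 range, risky for large-magnitude data such as world-space positions.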
 

ModBot

Not a mod, just a bot.
When we locked the other thread, that wasn't an indication you should start bumping every related thread as an anxiety outlet.
 