
VGLeaks: PS4 GPU Hardware Balanced at 14 CUs - 4 CUs only minor boost for rendering

Status
Not open for further replies.
Let's discuss the possible bottlenecks that cause this.

There aren't any bottlenecks.

So that 40% difference is only like a 10-20% difference? This explains the lack of resolution confirmation for multiplats. Neither console has the juice to meet the targets.

Cancel the generation

/conclusion jump

No. The performance difference is the same. It just seems that Sony recommended at one point that using 14 CUs for normal rendering and 4 CUs for things like particles etc. could result in better graphics than just using 18 CUs for normal rendering. But right now not even Killzone: Shadow Fall is using compute for particles etc.
 
Acknowledging that fact 2 is real, the power difference between the systems for rendering is not as substantial as it has been made out to be.

Umm, 400 GFLOPS of GPGPU doesn't suddenly stop mattering. That explains why people don't want to believe this, though. Flops are flops IMO.
 
No, you posted this because you think 12 CU is better than 18 CU.

Not to mention his pretending to be an insider on twitter...
 
So Microsoft probably knew something more when it talked about the Xbox One being a balanced system. This kind of proves Microsoft's point and it brings the performance of both systems closer together. Well played, MS.

Not sure what they knew, but I agree and have always thought these consoles will be pretty much like this gen. Both having great games and both having some great exclusives.
 
So is this PS4 only, or does a similar effect appear on PC cards, say the 7990 with its 32 CUs?

Do CUs scale down in performance all the way up to, say, 32?

If not, then it's a PS4-specific bottleneck or design choice. IMO RAM bandwidth surely can't be the bottleneck, since the 32-CU PC cards have the same or faster RAM available, so it would have to be the CPU or...?

No, it applies to all cards; the difference would be power limitations. These consoles have a much lower power footprint than desktop PCs, which means you only have a specific amount of power to work with.
 
I'm basing this on the Cerny presentation. He said he wanted a system that was easy to develop for but difficult to master. He then mentioned that going with eDRAM would have gotten them better performance but it would be a bit too difficult to deal with. So instead Sony decided to go with GDDR5 and the part that would require some time and depth to master was using the GPU for compute.

So Sony still accomplished its goal of creating a system that is easy to develop for yet difficult to master.

If the term "trade-off" offended you, then get over it. It's a $400 games console. There were trade-offs all over the place.

You remember wrong. Sony was always going with GDDR5, but in the alternative setup the bus would have been 128-bit and the GDDR5 bandwidth would have been smaller.
He never said devs would have gotten better performance, only that they would have had to use separate techniques because of that setup.
He then said no devs wanted that, so they went with a simpler setup. He also talked about a ray-tracing chip, but that was also something devs did not want.
 
This thread is basically over at this point, but I still want to ask: what is so hard to understand about this? Even I, someone with close to zero programming knowledge and no hardware knowledge, can understand this concept.
You can use 14 CUs for graphics, and that's where it starts bottlenecking. You are then advised to use the remaining 4 CUs on other tasks for more efficiency. The units are all the same power. It's not forced upon you either.
But I do still have a question, and it may be completely and utterly stupid: how much processing power would 4 of the CUs put out?
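For what it's worth, CU throughput is measured in FLOPS rather than MHz. A back-of-the-envelope sketch, assuming the commonly reported PS4 GPU figures (800 MHz clock, 64 ALUs per GCN CU, 2 FLOPs per ALU per cycle via fused multiply-add):

```python
# Rough GCN throughput arithmetic.
# Assumed specs: 800 MHz clock, 64 ALUs per CU, FMA = 2 FLOPs/cycle.
CLOCK_HZ = 800e6
ALUS_PER_CU = 64
FLOPS_PER_ALU = 2  # a fused multiply-add counts as 2 FLOPs

def gflops(cu_count):
    """Peak single-precision throughput for a given CU count, in GFLOPS."""
    return cu_count * ALUS_PER_CU * FLOPS_PER_ALU * CLOCK_HZ / 1e9

print(gflops(4))   # the 4 "extra" CUs: ~410 GFLOPS, the figure quoted earlier
print(gflops(18))  # the full GPU: ~1843 GFLOPS, i.e. ~1.84 TFLOPS
```

That 4-CU figure lines up with the "400 GFLOPS of GPGPU" number mentioned earlier in the thread.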
 
It is not a fact. It is an example of hardware balance that devs can use when delving into GPGPU tasks. It is absolutely not required, and does not incur some level of inefficiency when using all 18 CUs for rendering. It depends entirely on how developers utilize the hardware in their software.

Stop peddling this myth. The agenda, to attempt to reduce the perceived power disparity between the Xbone and PS4 through nonsense, is blatantly obvious.

You are so hostile to what I said that you didn't read it. I said in most situations...good Devs...and no hardware limitations. What myth am I peddling?
 
Here's a Cerny quote that might be relevant to this discussion:

And then for the GPU we went in and made sure it would work better for asynchronous fine-grain compute, because I believe that with regards to the GPU, we'll get a couple of years into the hardware cycle and it'll be used for a lot more than graphics.

Now when I say that many people say, "but we want the best possible graphics". It turns out that they're not incompatible. If you look at how the GPU and its various sub-components are utilised throughout the frame, there are many portions throughout the frame - for example during the rendering of opaque shadowmaps - that the bulk of the GPU is unused. And so if you're doing compute for collision detection, physics or ray-casting for audio during those times you're not really affecting the graphics. You're utilising portions of the GPU that at that instant are otherwise under-utilised. And if you look through the frame you can see that depending on what phase it is, what portion is really available to use for compute.

So it sounds like during some GPU tasks, the majority of the CUs are sitting idle. So why not use those idle CUs for non-graphics workloads during the frame?
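Cerny's point can be sketched as a toy scheduling model. The phase names and numbers below are invented for illustration; the idea is only that render work leaves CU-milliseconds idle within a frame, and async compute can soak them up:

```python
# Toy model of async compute filling idle CU time within one frame.
# Phase names and utilisation numbers are invented for illustration.
TOTAL_CUS = 18

# (phase name, duration in ms, CUs the render work actually keeps busy)
frame_phases = [
    ("shadowmap rendering", 3.0, 4),   # bulk of GPU unused, as in Cerny's example
    ("g-buffer fill",       5.0, 16),
    ("lighting",            6.0, 18),
    ("post-processing",     2.7, 12),
]

def frame_ms(phases):
    return sum(ms for _, ms, _ in phases)

def idle_cu_ms(phases):
    """CU-milliseconds left unused by rendering, available for compute."""
    return sum(ms * (TOTAL_CUS - busy) for _, ms, busy in phases)

print(frame_ms(frame_phases))    # ~16.7 ms total, i.e. a 60 fps frame
print(idle_cu_ms(frame_phases))  # CU-ms that physics/audio compute could reclaim
```

In this made-up frame, rendering never "gives up" CUs; compute just runs in the windows where render work isn't keeping them busy.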
 
Mark Cerny debunked this a while ago, as many mentioned before. Nonetheless somebody with a twitter account could ask yosp.
 
Uh...I keep seeing people say "debunked".

Then I read what Cerny said and he seems to CONFIRM it. All of them can be used, but beyond the 14 doesn't generate the same level of performance as the initial 14. Those other 4 can be utilized for other things.

Is that not correct?
 
So Microsoft probably knew something more when it talked about the Xbox One being a balanced system. This kind of proves Microsoft's point and it brings the performance of both systems closer together. Well played, MS.

But MS was talking in regard to MS' console. This is evident from the fact that AMD has GCN-based cards out that go well beyond 14 CUs with a performance increase. Hell, the R9 290X has 44 CUs; clearly AMD missed MS' newfound discovery.
 
Depends on the leak and which company it pertains to.

You mean MS?



Why would this be wishful thinking? Why would anyone wish for how best to use ALU resources? Why so defensive? The only thing that's been debunked is that the CUs are physically different from each other. If there is no balance between rendering and GPGPU allocation of resources, why does Cerny state it's not round? Why are people so hostile to this concept? What is threatening here?

Concept? You stated "facts". You chose not to back up those "facts". Twice. Even when specifically asked. If that is considered being hostile, then okay.
 
Uh...I keep seeing people say "debunked".

Then I read what Cerny said and he seems to CONFIRM it. All of them can be used, but beyond the 14 doesn't generate the same level of performance as the initial 14.

Is that not correct?

No. This has nothing to do with performance. See my previous post:

http://www.neogaf.com/forum/showpost.php?p=84198901&postcount=152

The performance is the same, it could just be that using a certain amount of processing power for particles etc instead of normal rendering results in better graphics.
 
The agenda you're pushing. Don't act dumb. It's insulting.

I posted this slide because it had just been made public by VGLeaks when I opened the thread. If you look at the OP, I added the comment from Cerny about it and addressed the age of the slide and that things could have changed.

But I can see why you assumed I want to speak badly about the PS4 given my posting history here: I'm obviously biased towards the X1 to some degree, but that doesn't mean I want to dismiss the HW advantage of the PS4.
 
So is this PS4 only, or does a similar effect appear on PC cards, say the 7990 with its 32 CUs?

Do CUs scale down in performance all the way up to, say, 32?

If not, then it's a PS4-specific bottleneck or design choice. IMO RAM bandwidth surely can't be the bottleneck, since the 32-CU PC cards have the same or faster RAM available, so it would have to be the CPU or...?

If you took two PC cards, the only difference being the number of CUs, then in 'typical games' you wouldn't see a linear scaling of performance with CUs.

Past a certain amount of CU utilisation other bounds would kick in.

It all completely depends on software and what its demands are, though. If you had a pipeline that had a heavy ALU:tex or ALU:pixel or ALU:bandwidth demand ratios then it may well continue to scale well.

Sony is just pointing out 'typical' behaviours to motivate devs to look for untapped CU time and use it for compute tasks. But it's not some fixed thing, it very much depends on software.

On a different note, in terms of comparing to XB1, we cannot assume that software that would scale well up to 14 CUs on PS4 would scale well up to 12 CUs on XB1. The ratio of resources available on each system is very different. Never mind that the amount of CU time available on each system, even assuming a '14 CU cap' on PS4, might vary a lot due to reservations etc. on XB1. If Sony thinks that in typical cases PS4 is 'balanced' at 14 CUs for rendering, matched to 32 ROPs and their bandwidth, XB1 looks very imbalanced with 12:16:XB1's bandwidth. Superficially Microsoft might find it useful to invoke Sony's words to defend their CU count, but it actually doesn't paint a very balanced picture for XB1 on the whole...
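The "other bounds kick in" argument can be illustrated with a crude min-of-bottlenecks model. All numbers below are arbitrary stand-ins, not real hardware figures; only the shape of the curve matters:

```python
# Crude bottleneck model: effective throughput is capped by the tightest resource.
# All caps are arbitrary illustrative units, not real hardware numbers.
ALU_PER_CU = 1.0       # ALU throughput each CU contributes
BANDWIDTH_CAP = 15.0   # memory-bandwidth-limited throughput
ROP_CAP = 16.0         # fill-rate-limited throughput

def effective_throughput(cu_count, alu_weight=1.0):
    """A workload runs at the speed of its most limiting resource.

    alu_weight > 1 models an ALU-heavy workload (more ALU work per pixel).
    """
    alu_limit = cu_count * ALU_PER_CU / alu_weight
    return min(alu_limit, BANDWIDTH_CAP, ROP_CAP)

for cus in (12, 14, 16, 18):
    print(cus, effective_throughput(cus))
# Past the bandwidth cap the curve flattens, so extra CUs add little --
# unless the workload is ALU-heavy, in which case it keeps scaling:
print(effective_throughput(18, alu_weight=2.0))
```

This is exactly the "ALU:bandwidth demand ratio" point: the same extra CUs are wasted on one workload and fully used by another.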
 
Here's a Cerny quote that might be relevant to this discussion:

Not to mention the X1 has 50% fewer ROPs and shader units; surely the PS4 in turn gets the performance of 50% extra GCN CUs when they're utilized for rendering etc...

Mark Cerny pointed out that compute is done in between render tasks, not by spare CUs. Doesn't that nullify this 14+4 (not separate CUs, just illustrating diminishing returns after the 14 point)?
 
So as I understand it: 14 CUs is what the hardware was balanced to. That could mean the entire system or just the GPU; not clear on that. The 4 additional CUs can be used for rendering, but that would only result in a minor boost, so they recommend using them for compute. Seems self-explanatory.
 
You remember wrong. Sony was always going with GDDR5, but in the alternative setup the bus would have been 128-bit and the GDDR5 bandwidth would have been smaller.
He never said devs would have gotten better performance, only that they would have had to use separate techniques because of that setup.
He then said no devs wanted that, so they went with a simpler setup. He also talked about a ray-tracing chip, but that was also something he said devs did not want.

Cerny's quote directly:

By the PS3 era of thinking, we would definitely have gone with that design [The 1 TB eDRAM]. But with the new way of thinking, we wanted to be sure that accessibility was there. We went with this combination of year-one accessibility and then a very interesting feature-set which... there's a Gamasutra article where I went through the major points... for year three or year four.

So Cerny said that if they wanted throughput above all else like Kutaragi's philosophy they would have gone with the eDRAM solution. But instead they went with GDDR5 because it was more accessible for developers.
 
No, you posted this because you think 12 CU is better than 18 CU.

Nnnnnnn expose his agenda.


Ekim, just give it up. I've noticed your desperation for hit threads for a while now. Trying to bring up old stuff that was debunked, and all these ridiculous rumors, trying to make it seem like some undiscovered thing. Just stop.
 
I posted this slide because it had just been made public by VGLeaks when I opened the thread. If you look at the OP, I added the comment from Cerny about it and addressed the age of the slide and that things could have changed.

But I can see why you assumed I want to speak badly about the PS4 given my posting history here: I'm obviously biased towards the X1 to some degree, but that doesn't mean I want to dismiss the HW advantage of the PS4.

Are you just repeating yourself here?
 
Gemüsepizza;84199725 said:
No. This has nothing to do with performance. See my previous post:

http://www.neogaf.com/forum/showpost.php?p=84198901&postcount=152

The performance is the same, it could just be that using a certain amount of processing power for particles etc instead of normal rendering results in better graphics.

So performance isn't the right way to approach it? So 14 = same graphics as the 18...so you can just use those 4 to offload something else (if you want..or not)?

I'm a bit confused. Need a better explanation..lol

So as I understand it: 14 CUs is what the hardware was balanced to. That could mean the entire system or just the GPU; not clear on that. The 4 additional CUs can be used for rendering, but that would only result in a minor boost, so they recommend using them for compute. Seems self-explanatory.

This doesn't help my confusion. So it IS performance related when it comes to rendering something? The other 4 are just a minor boost and are better used elsewhere IF needed.
 
Not to mention the X1 has 50% fewer ROPs and shader units; surely the PS4 in turn gets the performance of 50% extra GCN CUs when they're utilized for rendering etc...

Mark Cerny pointed out that compute is done in between render tasks, not by spare CUs. Doesn't that nullify this 14+4 (not separate CUs, just illustrating diminishing returns after the 14 point)?

A big, fairly unaddressed point here is that the 14:4 is just a convenient way of talking about a more complicated picture of CU utilisation on average across a frame. At different points in the frame the CUs might be completely saturated while at others that might be completely idle. That's what Cerny means by 'filling in between render tasks' with compute, using CUs for other things when they're idle.

Splitting it out into a ratio of average CU utilisation is just a simplification to make it easier to talk about, which is worth bearing in mind.
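To make the "ratio is just an average" point concrete, here is a hypothetical frame timeline (all numbers invented) whose time-weighted average render utilisation comes out near 14 of 18 CUs, even though no individual phase uses exactly 14:

```python
# Time-weighted average CU utilisation across a frame (invented numbers).
# No phase uses exactly 14 CUs, yet the frame averages out near 14-of-18.
phases = [
    # (duration in ms, CUs busy with render work)
    (4.0, 18),   # fully saturated
    (6.0, 17),
    (3.0, 10),
    (3.7, 8),    # mostly idle -- room for async compute here
]

total_ms = sum(ms for ms, _ in phases)
avg_busy = sum(ms * busy for ms, busy in phases) / total_ms
print(round(avg_busy, 1))  # one "average" number hides the per-phase swings
```

The single 14:4 figure is just this kind of average; the actual moment-to-moment picture swings between fully saturated and mostly idle.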
 
Not sure what they knew, but I agree and have always thought these consoles will be pretty much like this gen. Both having great games and both having some great exclusives.

I agree that both will have good games, and both will have good exclusives, but there is a power disparity between the two that is larger than the one between the 360 and the PS3. I don't think it will lead to huge differences with games on both consoles, but it is there.

And we've had this argument about 14+4 CUs before; it's painfully obvious why the OP created this thread.
 
Uh...I keep seeing people say "debunked".

Then I read what Cerny said and he seems to CONFIRM it. All of them can be used, but beyond the 14 doesn't generate the same level of performance as the initial 14. Those other 4 can be utilized for other things.

Is that not correct?

That goes without saying, because CUs don't scale linearly. For example, if you have 6 CUs, adding 2 more isn't going to generate the same level of performance as the initial 6; add another 2 and it's not going to generate the same level of performance as the last 8, and so on. However, that doesn't mean there isn't a performance increase at all, which is what everyone is assuming. The same thing applies to GCN hardware on desktop GPUs. You're never getting 100% of each additional CU that's added (this goes for Nvidia's CUDA cores as well), but you are getting a performance increase nevertheless. One thing we do know is that 12-14 CUs isn't the ceiling, or anywhere near it, for AMD's GCN architecture in terms of efficiency.

14:4 doesn't make sense simply because it undermines the entire point of increasing ACEs to 8 with 8 CLs.
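The "you never get 100% of each additional CU" claim is essentially Amdahl's law applied to the GPU. A sketch, with the serial fraction picked arbitrarily for illustration:

```python
# Amdahl-style diminishing returns: only part of the frame's work scales
# with CU count. The 10% serial fraction is an arbitrary illustrative choice.
SERIAL_FRACTION = 0.10

def speedup(cu_count, baseline=6):
    """Speedup over a baseline CU count when 90% of the work parallelises."""
    def time_for(n):
        return SERIAL_FRACTION + (1 - SERIAL_FRACTION) / n
    return time_for(baseline) / time_for(cu_count)

prev = None
for cus in (6, 8, 10, 12, 14, 16, 18):
    s = speedup(cus)
    gain = "" if prev is None else f"  (+{s - prev:.3f})"
    print(f"{cus:2d} CUs: {s:.3f}x{gain}")
    prev = s
# Each +2 CUs adds a smaller increment than the last,
# but the gain never drops to zero.
```

This matches the post above: every extra CU still helps, just by less each time, and where the curve flattens depends entirely on the workload.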
 
You are so hostile to what I said that you didn't read it. I said in most situations...good Devs...and no hardware limitations. What myth am I peddling?

There is no "most situations", because different games and engines have different priorities and drawbacks.
The point is you can use all 18 CUs for whatever you want; some games might use 4, 5, or 6 for GPGPU while others might not.
The whole idea that the perfect balance is 14 for rendering and 4 for GPGPU is a myth and makes no sense.
 
A big, fairly unaddressed point here is that the 14:4 is just a convenient way of talking about a more complicated picture of CU utilisation on average across a frame. At different points in the frame the CUs might be completely saturated while at others that might be completely idle. That's what Cerny means by 'filling in between render tasks' with compute, using CUs for other things when they're idle.

Splitting it out into a ratio of average CU utilisation is just a simplification to make it easier to talk about, which is worth bearing in mind.

Good clarification :)
 
So performance isn't the right way to approach it? So 14 = same graphics as the 18...so you can just use those 4 to offload something else (if you want..or not)?

I'm a bit confused. Need a better explanation..lol

The FLOPS performance doesn't change. And using 18 CUs for rendering is better than using 14 CUs for rendering. Like I said, Killzone uses most CUs for rendering and no compute for particles. But there is a possibility that, if you are using some of those CUs for things like particles instead of for normal rendering, it will look better.
 
Here's a Cerny quote that might be relevant to this discussion:



So it sounds like during some GPU tasks, the majority of the CUs are sitting idle. So why not use those idle CUs for non-graphics workloads during the frame?

And they will. But there's only so much work you can do in small windows of opportunity when the CUs are idle...

Obviously the CU utilization won't be the same for all developers, but it seems they are giving developers a bit more processing power than the system would need for its graphical tasks, and developers can either use it to try to push graphics further ahead, or to do other stuff without taking away from rendering.
 