
Ubisoft GDC EU Presentation Shows Playstation 4 & Xbox One CPU & GPU Performance- RGT

vpance

Member
I think we'll certainly see improvements, but on consoles those improvements are NOT going to be equivalent to what we saw last gen. There are major differences between the state of 3D rasterization back then and today, in terms of the talent with requisite expertise, asset-creation APIs, AND hardware.

GCN is already way more efficient at shading than previous-gen GPUs; the numbers don't tell the whole tale. Every generation, games come out that leave people thinking "how the hell did they do that?", and it'll happen again, diminishing returns or not. I don't think we should expect the moon here.
 
I think a lot of console only folk are going to be hella disappointed with just how quickly the disparity between what devs can do on console with their multiplats and what devs can do on PC with their multiplats grows.

The CPUs in the consoles simply don't have the grunt to carry out the evolution that we'll be seeing over the next 4 years as PC tech changes.

And I'm sure both Sony and especially MS are intimately aware of the fact that this will need to be a shorter generation.

This has been the case since before the last generation, so I don't believe anyone who was around for that will be surprised.
 
And I'm sure both Sony and especially MS are intimately aware of the fact that this will need to be a shorter generation.

This presupposes that they particularly care about wowing people with tech specs, when it seems that both Sony and MS are reluctant to keep bleeding each other out financially for marginal market-share gains (hence the extreme length of last gen, plus the cost-cutting and comparatively weaker, cheaper hardware this gen).

It just doesn't seem like either of them is economically stable enough (for entirely different reasons) to be able to go back to the old razor-blade models and subsidised hardware any time soon.
 

Kinthalis

Banned
GCN is already way more efficient at shading than previous gen's GPU's, the numbers don't tell the whole tale. Every generation games come out that leave people thinking "how the hell did they do that?" and it'll happen again, diminishing returns or not. I don't think we should expect the moon here.

The question is, how much of that power is being used efficiently today? The answer to that will be speculative, but I don't think I would be terribly off the mark to say very damn efficiently, with the exception of compute (that's the wild card, IMHO) and possibly memory.

The answer to that this time last gen was basically NOT AT ALL. Those are two very different answers.
 

Teletraan1

Banned
Ubisoft games routinely run heavy on CPU usage, even on i5 & i7 processors clocked at 4+GHz. They seem to be one of the only developers I have this issue with on PC. I am not sure why they are being held up as some shining example of what is wrong with the current-generation systems. This narrative is weaker than the CPUs.
 

Lord Error

Insane For Sony
With the PS4 and Xbox One CPUs themselves roughly equivalent to an overclocked Atom processor, the performance is nearly comparable. Now how do you think the performance of the PS4 and Xbox One CPUs would look in that chart if they had a variant as powerful as an i5 or i7 processor?
Going by the FLOPS count of non-monster i5 or i7 CPUs available at the time the consoles were designed, I think it'd be about twice as much, so nowhere near the GPU performance. The PS4/XB1 CPUs have weak cores, but there are eight of them (six available for games).
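
(Back-of-the-envelope for that "about twice": peak single-precision only, and assuming the usual 8 FLOPs/cycle per Jaguar core and 16 per Ivy Bridge-class core; real workloads land well below peak on both, and the clocks and core counts below are just representative picks.)

```cpp
#include <cstdio>

int main() {
    // Peak single-precision throughput, FLOPs per cycle per core (assumed):
    //   Jaguar: two 128-bit pipes, 1 mul + 1 add per cycle -> 8
    //   Ivy Bridge-class i5/i7: 256-bit AVX, 8 mul + 8 add per cycle -> 16
    const double jaguar_x8 = 8 * 1.6e9 * 8;   // 8 cores @ 1.6 GHz
    const double ivy_i5    = 4 * 3.4e9 * 16;  // 4 cores @ 3.4 GHz (i5-3570K-ish)
    std::printf("Jaguar x8 : %.0f GFLOPS\n", jaguar_x8 / 1e9); // ~102 GFLOPS
    std::printf("i5 class  : %.0f GFLOPS\n", ivy_i5 / 1e9);    // ~218 GFLOPS, roughly 2x
    return 0;
}
```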
 

Lord Error

Insane For Sony
Sure, more ACEs will benefit GPU compute; it's what they were designed for.
The question is, though, will more ACEs benefit GPGPU in PS4 games?
Will there even be games that use GPU compute in a way that needs more ACEs for wavefront scheduling?
The ACE count increase was mostly designed for very high-end cards that reach 5-6 TFLOPS, not for 1.8 TFLOPS. And still, we don't know whether it isn't technology designed more for GPGPU-based research applications than for games.

It could be a boost for the PS4, but it could also not be, and for now we don't have any proof that it helps in real-world game scenarios.
It is probably what made Tomorrow's Children possible on PS4. I think I got that from the dev's explanation posted here, specifically his comment about the GPU performing as the equivalent of a 2.3TF GPU for the task they needed, thanks to the higher ACE count and asymmetry. Also, if I had to guess, it's probably also what made Resogun possible.
 
I think a lot of console only folk are going to be hella disappointed with just how quickly the disparity between what devs can do on console with their multiplats and what devs can do on PC with their multiplats grows.

The CPUs in the consoles simply don't have the grunt to carry out the evolution that we'll be seeing over the next 4 years as PC tech changes.

And I'm sure both Sony and especially MS are intimately aware of the fact that this will need to be a shorter generation.


I think you're grossly overestimating what devs are doing with PCs today. My 4+ year old setup with an 8800GT still plays everything I want.

There is a distinct difference between games that demand extra CPU/GPU for the game itself and that power being required to drive higher resolutions like 4K, so I don't really count graphics "scaling" as demand. A dev isn't going to make a game that brutal on requirements, mainly because they want a return on their investment. That's why the majority of PC games will always be playable on lower-end systems. I don't see that changing anytime soon.

When Oculus Rift is fine tuned, that's when I will upgrade again. VR is the only dev path I see where the consoles will be left in the dust since if you want HD VR then the consoles really can't handle it.
 
I think a lot of console only folk are going to be hella disappointed with just how quickly the disparity between what devs can do on console with their multiplats and what devs can do on PC with their multiplats grows.

The CPUs in the consoles simply don't have the grunt to carry out the evolution that we'll be seeing over the next 4 years as PC tech changes.

And I'm sure both Sony and especially MS are intimately aware of the fact that this will need to be a shorter generation.

How is this different from any other generation? I don't think people care. Exclusives will always look visually advanced enough to appease everyone.
 

Vizzeh

Banned
I think a lot of console only folk are going to be hella disappointed with just how quickly the disparity between what devs can do on console with their multiplats and what devs can do on PC with their multiplats grows.

The CPUs in the consoles simply don't have the grunt to carry out the evolution that we'll be seeing over the next 4 years as PC tech changes.

And I'm sure both Sony and especially MS are intimately aware of the fact that this will need to be a shorter generation.

I'm not sure I see why this would be any different from any previous console generation. This generation more than any will require devs to get a bit more hands-on with the code or they will likely fall behind; where they can make it up is on GPGPU, and that seems to be a fact of life they will have to get used to. Third-party devs couldn't use the PS3 SPUs that well for years last gen, and they still couldn't compete with ND by any stretch come end of the cycle. That has to be a requirement this gen to hold onto some of that PC ground as long as possible.

I'm also hopeful, purely from a PS4 owner's point of view, that market share will dictate some better-quality games at the higher end of the performance spectrum, while allowing the disparity between X1 and PS4 to grow. (I don't wish that on the X1, but I want the best from our system, which is why this parity rubbish has to stop before it starts.)

Of course the exclusives will shine, especially where the PS4 comes in, on compute and visuals, with hopefully some further advancements in other fields such as AI + other realism tweaks.
 
Not really; it seems like having unnecessarily high GPU spec requirements, because the GPUs are being used to do work that should be done by the CPU, is going to be a thing that continues.
"Should be" run on the CPU? How do you figure? The function returns 7.3x faster on the XBone's GPU, and 16.3x faster on the PS4's GPU. Why spend ~5 ms of the CPU's time computing 100 dancers, when you can perform exactly the same work using 0.3-0.6 ms of time on the GPU instead? That seems like a poor use of one's resources, especially when you know there is code which can't be efficiently moved to the GPU.

The fact is, as the presenter says, just about everything can be moved to the GPU. The trick is doing it in a way that actually improves your performance in the process. Sony have a lot of experience in this particular area thanks to the work they did on the Cell, actually. Sony's current tools will actually compile two versions of every function; one to run on the CPU, and one to run on the GPU. The developer can then choose whichever gives him the best performance, and tweak the implementation as necessary to tune performance.
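
I obviously don't know what Sony's toolchain looks like internally, but the selection step itself is easy to sketch: build both paths behind one interface, time them on representative data, and keep whichever wins. In this toy version the "GPU" path is just a stand-in callable, since the real one would dispatch a compute shader:

```cpp
#include <chrono>
#include <cstdio>
#include <functional>
#include <numeric>
#include <vector>

using Kernel = std::function<double(const std::vector<float>&)>;

// Time a single invocation of a kernel on sample data (a real harness would warm up and average).
static double time_ms(const Kernel& k, const std::vector<float>& data) {
    const auto t0 = std::chrono::steady_clock::now();
    volatile double sink = k(data);   // keep the call from being optimized away
    (void)sink;
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}

int main() {
    std::vector<float> data(1 << 20, 1.0f);

    Kernel cpu_path = [](const std::vector<float>& d) {   // plain CPU implementation
        return std::accumulate(d.begin(), d.end(), 0.0);
    };
    Kernel gpu_path = cpu_path;   // stand-in: the real one would dispatch a compute shader

    // Keep whichever implementation measures faster on representative data.
    Kernel& chosen = time_ms(cpu_path, data) <= time_ms(gpu_path, data) ? cpu_path : gpu_path;
    std::printf("result = %f\n", chosen(data));
    return 0;
}
```

Swap the stand-in for a real dispatch and the pattern stays the same; the interesting work is in making the GPU version actually win.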


You are not magically getting more than 176GB/s. The hardware does not go beyond using that.
Correct, but what you're not realizing is that 176GB/s is far more bandwidth than the GPU even requires. The 7870 has 20 CUs clocked at 1000 MHz, and its GDDR5 is only 153.6GB/s. Even if the PS4's CPU were constantly sucking 20GB/s off of the main bus, that still leaves 156GB/s entirely to the GPU. That's more raw bandwidth than the 7870, and it's only feeding 18 CUs at 800 MHz. So per tick, it's effectively got 141% of the bandwidth a comparable PC card would provide, even if the CPU is simultaneously hammering the main bus as hard as it can. Conversely, if the XBone's CPU were pulling 20 GB/s off of its bus, that would leave only 48 GB/s for the GPU (12 CUs at 853MHz), which is 61% of the per-tick bandwidth of the PC card.
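
The per-tick figures are just division, if anyone wants to check them; peak numbers only, ignoring caches, the XBone's ESRAM and real-world contention:

```cpp
#include <cstdio>

int main() {
    // Bytes of main-memory bandwidth per CU per clock, peak figures only.
    auto per_tick = [](double gbps, int cus, double mhz) {
        return gbps * 1e9 / (cus * mhz * 1e6);
    };
    const double pc7870 = per_tick(153.6, 20, 1000.0);        // ~7.7 B per CU per clock
    const double ps4    = per_tick(176.0 - 20.0, 18, 800.0);  // CPU stealing 20 GB/s -> ~10.8
    const double xbone  = per_tick(68.0 - 20.0, 12, 853.0);   // DDR3 only, ESRAM ignored -> ~4.7
    std::printf("PS4  : %.0f%% of the 7870\n", 100.0 * ps4 / pc7870);   // ~141%
    std::printf("XBone: %.0f%% of the 7870\n", 100.0 * xbone / pc7870); // ~61%
    return 0;
}
```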


Its actually about the difficulties in programming games for those consoles due to their decision to go with weak CPUs and the constraints that that results in.
Actually, the thread is about the huge performance gains which can be achieved by transitioning certain code to the GPU, and how best to actually accomplish that.

Some people have taken that and tried to twist it in to a lolConsoles thread, which is pretty ironic, given that the PS4 — and to some extent the XBone — is actually much better suited to leverage these super helpful techniques than is a typical PC, thanks to things like the unified memory pool, the increased ACE count to reduce contention as more jobs are scheduled, improved cache management, and the "excessive" bandwidth needed to carry any extra load.

Given games made for the PS4 and Xbone are also games made for the PC, and given that the OP is pretty much outright saying "improvements in games are going to be mostly constrained to purely aesthetics as a result", it isn't irrelevant at all to question that decision by the platform holders.
Well, they say in the presentation that these techniques are also useful for things like collision detection, so strawman fail.
 

vpance

Member
The question is, how much of that power is being used efficiently today? The answer to that will be speculative, but I don't think I would be terribly off the mark to say very damn efficiently, with the exception of compute (that's the wild card, IMHO) and possibly memory.

The answer to that this time last gen was basically NOT AT ALL. Those are two very different answers.

I think you are far off the mark. Developers aren't thinking this at all, why should we? Example:

Trials dev on Beyond3d said:
This is an interesting question. Furmark has certainly caused thermal problems on past PC GPUs, but modern GPUs throttle automatically if the thermal budget is exceeded. Asynchronous compute can be used to utilize the GPU better. I expect to see at least some games utilizing asynchronous compute to reach Furmark-level thermals.

We aren't anywhere close to "very efficient" right now. To think otherwise, based on a glut of cross-gen games and games half-developed before the SDKs were mature, only 11 months in, is a bit presumptuous.
 

Ricky_R

Member
I think a lot of console only folk are going to be hella disappointed with just how quickly the disparity between what devs can do on console with their multiplats and what devs can do on PC with their multiplats grows.

The CPUs in the consoles simply don't have the grunt to carry out the evolution that we'll be seeing over the next 4 years as PC tech changes.

Welcome to every gen.
 
Actually, the thread is about the huge performance gains which can be achieved by transitioning certain code to the GPU, and how best to actually accomplish that.

From the OP:
"Technically we're CPU-bound. The GPUs are really powerful, obviously the graphics look pretty good, but it's the CPU [that] has to process the AI, the number of NPCs we have on screen, all these systems running in parallel.

"We were quickly bottlenecked by that and it was a bit frustrating," he continued, "because we thought that this was going to be a tenfold improvement over everything AI-wise, and we realised it was going to be pretty hard. It's not the number of polygons that affect the framerate. We could be running at 100fps if it was just graphics, but because of AI, we're still limited to 30 frames per second."

That really doesn't sound like someone excited about the huge performance gains that can be achieved by transitioning code to the GPU.
It sounds an awful lot like the exact fucking opposite.

I'm pretty sure he's talking about the visual appearance of said abilities. And he's right that last gen wasn't able to do such effects.

You can quite clearly see, from the quote that that is a response to and from his previous comments, that he isn't.
 

Marlenus

Member
I think a lot of console only folk are going to be hella disappointed with just how quickly the disparity between what devs can do on console with their multiplats and what devs can do on PC with their multiplats grows.

The CPUs in the consoles simply don't have the grunt to carry out the evolution that we'll be seeing over the next 4 years as PC tech changes.

And I'm sure both Sony and especially MS are intimately aware of the fact that this will need to be a shorter generation.

The CELL was great at a specific set of tasks, parallel computing being one of them, but it was very weak at more general tasks. The current consoles are relatively strong in both: the CPU is pretty good (not desktop-CPU good, but still good) at the general stuff and the GPU is fantastic at the parallel stuff. It will take time for the paradigm to shift towards this, but it will start with the 1st-party stuff and move into the 3rd-party stuff in the next few years, I would have thought. Some 1st parties are already using GPU compute, so it will definitely become more widely utilised.

With what was available, I really do not think Mark Cerny could have made a more balanced gaming system than the PS4 at the price it was released at.

When looking at the various compromises I get the following:
Steamroller-based AMD APU: More power consumption, more die space. At the $399 budget you would get a weaker GPU, resulting in a lower-performing system than what we have now. At the $499 budget you would perhaps have had a similar GPU to what it currently has and a CPU performance increase. Is it worth $100 though? For me, I am not sure.
Nvidia APU: ARM-based CPU with likely a Kepler-based GPU; a weaker CPU than currently, and the GPU would have worse compute performance, so the ability to offload tasks from the CPU would be even weaker than now. On top of that, Nvidia has no experience in manufacturing an APU of this nature, so there would be added R&D costs, and I doubt the console would have cost $399.
Intel APU: Depends on the CPU. Atom is no faster than Jaguar, so you need a Pentium at minimum, which would likely require a custom 4-core design, as I am sure a dual core would not be ideal for consoles meant to last 5+ years. Anything higher is too expensive. Even with an Iris Pro based GPU it would be weak compared to the current PS4 GPU, so overall it would at best cost the same but be weaker overall.
Intel CPU with AMD/Nvidia GPU: Again, within a realistic budget the best Intel can get you is a custom 4-core Pentium, and that might be a stretch. Multiple suppliers and multiple chips increase cost and complexity, so I doubt they could hit the same performance the current PS4 has at $399.

I just do not see a way they could have done more with their budget; it seems thoroughly optimised within the restrictions they were given, and without a higher budget or a subsidised pricing model nothing more was doable.
 

delta25

Banned
And you speak for everyone? How do you know that?

Why be concerned with the hardware when the end results are visually pleasing games for the vast majority of people who play them? Last I checked there was no shortage of games on the PS4 where people were gushing about the visuals.
 

Vizzeh

Banned
From the OP:


That really doesn't sound like someone excited about the huge performance gains that can be achieved by transitioning code to the GPU.
It sounds an awful lot like the exact fucking opposite.



You can quite clearly see, from the quote that that is a response to and from his previous comments, that he isn't.

That was a recent quote by Ubisoft in an attempt to justify 900p AC Unity "parity", which is why it may seem to contradict the GDC presentation, depending on how you interpret what the same developer is saying.

Not that AI having an impact on resolution makes sense to begin with. Why not run more of the AI number-crunching on the GPGPU, since their own presentation shows it could be fruitful? My guess is that Unity and its code were in development before they knew how to properly fine-tune for this gen with the devkits...
 
That was a recent quote by Ubisoft in an attempt to justify 900p AC Unity "parity", which is why it may seem to contradict the GDC presentation, depending on how you interpret what the same developer is saying.

Not that AI having an impact on resolution makes sense to begin with. Why not run more of the AI number-crunching on the GPGPU, since their own presentation shows it could be fruitful? My guess is that Unity and its code were in development before they knew how to properly fine-tune for this gen with the devkits...

Possibly, but then we're into the territory of second-guessing software architectures and claiming to be better programmers than the professionals responsible.
 

Kinthalis

Banned
I think you are far off the mark. Developers aren't thinking this at all, why should we? Example:



We aren't close to approaching very efficient at all right now. To think otherwise based on a glut of cross gen games and games half developed before SDK's were mature 11 months in is a bit presumptuous.


Every single dev out there has been saying that they are utilizing these consoles as best they can.

Sure, some of that is marketing BS, but for the most part, they are right.

As I mentioned, the situation isn't nearly what it was at the beginning of last gen, when there were very few people who understood the changes new APIs and hardware were bringing to the table in terms of asset creation and in terms of how rendering engines would change.

Today that's just simply not the case. Outside of compute (which will change things a bit, I'm sure, but isn't magic either), there's no shortage of talent out there which fully understands the capability of modern hardware.
 
Possibly, but then we're into the territory of second-guessing software architectures and claiming to be better programmers than the professionals responsible.
Your argument is an Appeal to Authority.

What about all of the developers who are able to make AI work at 60 Hz? Or the ones who don't spend half their CPU time decompressing textures? And what does any of this have to do with lightening the load on the CPU by offloading work to the GPU? It seems that the CPU is just being trotted out as a convenient excuse to explain parity to people who hopefully won't understand what's being explained to them. Then this flimsy excuse for parity is being waved about as "proof" that consoles are holding back PC development, which really has nothing to do with this thread, or the presentation it's based on.

Of course, I could make the opposite argument, and point out that the need to support the archaic memory architecture on PC is stifling PS4 development, but it would be equally off-topic.
 

RoboPlato

I'd be in the dick
Every single dev out there has been saying that they are utilizing these consoles as best they can.

Sure, some of that is marketing BS, but for the most part, they are right.

As I mentioned, the situation isn't nearly what it was at the beginning of last gen, when there were very few people who understood the changes new APIs and hardware were bringing to the table in terms of asset creation and in terms of how rendering engines would change.

Today that's just simply not the case. Outside of compute (which will change things a bit, I'm sure, but isn't magic either), there's no shortage of talent out there which fully understands the capability of modern hardware.

I still think we're going to see significant gains throughout the generation. Right now we have a lot of cross-gen titles being shipped, and next-gen versions aren't getting focused on as heavily as they will when games are only shipping on two or three platforms. While there is less mystery as to where improvements will come from (GPGPU and HSA features mainly), I believe the tight level of control devs have over console hardware will allow them to eke out some impressive work.
 

On Demand

Banned
Every single dev out there has been saying that they are utilizing these consoles as best they can.

Sure, some of that is marketing BS, but for the most part, they are right.

As I mentioned, the situation isn't nearly what it was at the beginning of last gen, when there were very few people who understood the changes new APIs and hardware were bringing to the table in terms of asset creation and in terms of how rendering engines would change.

Today that's just simply not the case. Outside of compute (which will change things a bit, I'm sure, but isn't magic either), there's no shortage of talent out there which fully understands the capability of modern hardware.


I don't think that just because the hardware is easy to understand, developers won't get more out of it over time. As with anything, there's still a learning curve. It happens every generation. Developers are already talking about it:


However, Rundvist also said that there's still the ability to push things further, “With a console generation there is a lot of growth once people learn to use the system," he said. I think the same will happen with the PS4. We will be able to push much more from the consoles when we learn to use all the details. They are incredibly powerful by default but there is more to get from them.

http://www.psu.com/news/24971/Tom-C...4-its-an-amazing-machine-says-Ubisoft-Massive


Properly utilizing Compute shader functionality can certainly add a lot of life to a game, and as developers mature with the new hardware we should expect massive gains from offloading highly parallel tasks from the CPU to the GPU.  The trick will be defining what processes are good for Compute shaders to take over. Mastering Compute on the PS4 will be similar to mastering CELL on the PS3, but it has the advantage of being a technology that is valid for all current platforms.


http://www.worldsfactory.net/2014/04/28/mastering-ps4-compute-yields-massive-gains-like-ps3-cell


Some feared that with PS4 hardware being much closer than his predecessor to PCs, the platform’s graphics output would be maxed much sooner by developers. These slides, however, tell a very different story: even a first party studio like Sucker Punch learned many things after completing their first PS4 game, and according to what we’re seeing, there’s a whole different approach to the hardware with compute (or GPGPU – using the GPU for tasks that had traditionally been reserved for CPU) that many games haven’t even tried yet. What’s more important, it could prove the most fruitful.

http://www.worldsfactory.net/2014/04/15/sucker-punch-ps4-future-compute-explains-tricks



You can't seriously expect performance to be static over the years. I expect games released towards the end of the generation to look drastically different than what we have now.
 
 

No it isn't.
I'm not inferring anything about a transitive property of authority in a different field as a result of an authority in one.

Accepting an expert in the field's expertise, on the subject they are an expert in, with no evidence to the contrary, is in no way, shape, or form a logical fallacy.

Saying it can't be true because other titles have AI in 60fps is a logical fallacy however, because different games do different things and you are using false equivalencies.

Of course, I could make the opposite argument, and point out that the need to support the archaic memory architecture on PC is stifling PS4 development, but it would be equally off-topic.

I mean, you could but it would be self evidently ridiculous to do so.
 

KKRT00

Member
It is probably what made Tomorrow's Children possible on PS4. I think I got that from the dev's explanation posted here, specifically his comment about the GPU performing as the equivalent of a 2.3TF GPU for the task they needed, thanks to the higher ACE count and asymmetry. Also, if I had to guess, it's probably also what made Resogun possible.

Asymmetric GPGPU is possible on PC even now with Mantle [Oxide was talking about this], it's definitely also possible on the Xbone, and it will be on DX12. You don't need more ACEs for this.
That's my point: there is currently no evidence or research showing that 2 ACEs wouldn't be enough, or would only be slightly worse, at performing the same GPGPU calculations as 8 ACEs in a normal gaming scenario.
And by normal gaming scenario, I mean a scenario where the GPU handles rendering most of the time and uses only 15-20% of its power for GPGPU-related tasks [that's counting async].

Can 8 ACEs boost PS4 performance over that 40% CU gap in the future? Maybe, but for now we don't have even slight proof of it. Throwing it around left and right along with hUMA and "supercharged architecture" is about as accurate as the 1 TFLOPS performance graphs from Nvidia at the PS3 launch event.
 
With the PS4 and Xbox One CPUs themselves roughly equivalent to an overclocked Atom processor, the performance is nearly comparable. Now how do you think the performance of the PS4 and Xbox One CPUs would look in that chart if they had a variant as powerful as an i5 or i7 processor?

Now how much bigger, hotter, more power-hungry and more expensive do you think the PS4/XBO would have had to be if they had a variant as powerful as an i5 or i7 processor?

Imagine if Sony overclocked their CPU at the last minute too...

There's no reason why they couldn't. They did something similar with the PSP: it ran clocked at 222MHz to save battery power, certain games could unlock it to 333MHz, and then they eventually fully unlocked it with a firmware update.

They could let Microsoft be the guinea pig and see if it shortens the lifespan and then overclock it down the line to give devs a little extra boost.
 
Since the Xbox One CPU is marginally better than the PS4's, imagine if Microsoft hadn't gimped the GPU...

I'd definitely take more GPU power over CPU any day... I always thought diminishing returns kicked in pretty quickly on the CPU side of things on PC. Obviously CPU power is becoming more relevant again through increased AI requirements etc...
 

Cyriades

Member
There's no reason why they couldn't. They did similar with PSP ran it clocked at 222MHz to save battery power, certain games could unlock it to 333MHz then they eventually fully unlocked it with a firmware update.

They could let Microsoft be the guinea pig and see if it shortens the lifespan and then overclock it down the line to give devs a little extra boost.

GPU/CPU clocks are sourced off the same clock generator in the APU. Increasing the CPU speed means also increasing the GPU clock speed; the multiple is 2x: 800MHz GPU, 1600MHz CPU. The PS4 is already at its thermal/yield limits, so there's no room for clock bumps.
 

KidJr

Member
"Should be" run on the CPU? How do you figure? The function returns 7.3x faster on the XBone's GPU, and 16.3x faster on the PS4's GPU. Why spend ~5 ms of the CPU's time computing 100 dancers, when you can perform exactly the same work using 0.3-0.6 ms of time on the GPU instead? That seems like a poor use of one's resources, especially when you know there is code which can't be efficiently moved to the GPU.

The fact is, as the presenter says, just about everything can be moved to the GPU. The trick is doing it in a way that actually improves your performance in the process. Sony have a lot of experience in this particular area thanks to the work they did on the Cell, actually. Sony's current tools will actually compile two versions of every function; one to run on the CPU, and one to run on the GPU. The developer can then choose whichever gives him the best performance, and tweak the implementation as necessary to tune performance.



Correct, but what you're not realizing is 176GB/s is far more bandwidth than the GPU even requires. The 7870 has 20 CUs clocked at 1000 MHz, and its GDDR5 is only 153.6GB/s. Even if the PS4's CPU were constantly sucking 20GB/s off of the main bus, that still leaves 156GB/s entirely to the GPU. That's more raw bandwidth than the 7870, and it's only feeding 18, 800 MHz CUs. So per tick, it's effectively got 141% of bandwidth a comparable PC card would provide, even if the CPU is simultaneously hammering the main bus as hard as it can. Conversely, if the XBone's CPU was pulling 20 GB/s off of its bus, that would leave only 48 GB/s for the (12@853MHz) GPU, which is 61% of the per-tick bandwidth of the PC card.



Actually, the thread is about the huge performance gains which can be achieved by transitioning certain code to the GPU, and how best to actually accomplish that.

Some people have taken that and tried to twist it in to a lolConsoles thread, which is pretty ironic, given that the PS4 — and to some extent the XBone — is actually much better suited to leverage these super helpful techniques than is a typical PC, thanks to things like the unified memory pool, the increased ACE count to reduce contention as more jobs are scheduled, improved cache management, and the "excessive" bandwidth needed to carry any extra load.


Well, they say in the presentation that these techniques are also useful for things like collision detection, so strawman fail.

Awesome post!

A question to bring this back round to topic: I get that there's a unified memory architecture, but from what I understand developers still spend a lot of time having to sync data and communicate with the rest of the chipset. Why is that? I imagine it has something to do with keeping the core algorithm in the GPU's local memory (which I'm assuming can only be accessed by each compute unit, as opposed to being actually shared)?
 

KidJr

Member
Asymmetric GPGPU is possible even on PC even now with Mantle [Oxide was talking about this] and definitely its also possible on Xbone, and will be on DX12. You dont need more ACE for this.
Thats my point, there is no evidence or research currently that would show that 2 ACE wouldnt be enough or only slightly worse at performing same GPGPU calculations as 8 ACE in normal gaming scenario.
And by normal gaming scenario, i mean scenario where GPU handles rendering most of the time and uses only 15-20% of its power for GPGPU related tasks [thats counting async].

Can 8 ACE boost PS4 performance over that 40% CU gap in the future? Maybe, but for now we dont have even slight proof of it. Throwing it left and right with hUMA and supercharged architecture is as much accurate as 1Tflops of performance graphs from Nvidia at PS3 launch event.

Yeah, I get what you're saying, but it doesn't really make sense the more I think about it (I don't mean this to be rude, I'm really enjoying our discussion :) so please do expand on my point). It goes back to what serversurfer said about bandwidth, where he shows there is more than enough for the GPU. So we know that bandwidth isn't going to be an issue for GPU compute, and the whole point of GPU compute, and how you REALLY get performance out of it, is to keep all the execution pipelines busy; the ACEs are responsible for monitoring queues and starting the launch process of a kernel. So for the basic GPU compute features we're seeing now, no, you don't really need more than 2 ACEs, but surely it's logical that as GPU compute becomes more involved, having more ACEs will allow for a serious performance advantage?

I mean, am I misunderstanding it?
 

SapientWolf

Trucker Sexologist
AI seems like a really poor fit for the GPGPU paradigm. The AI actors in a game like AC Unity have state and situational behavior depending on those states. Things like physics and swarm behavior are better suited.
 
No it isn't.
I'm not inferring anything about a transitive property of authority in a different field as a result of an authority in one.

Accepting an expert in the field's expertise, on the subject they are an expert in, with no evidence to the contrary, is in no way, shape, or form a logical fallacy.
What? Read the link.

In informal reasoning, the appeal to authority is a form of argument attempting to establish a statistical syllogism.[2] The appeal to authority relies on an argument of the form:
- A is an authority on a particular topic
- A says something about that topic
- A is probably correct


That's exactly what you're doing when you say, "but then we're into the territory of second-guessing software architectures and claiming to be better programmers than the professionals responsible." This guy is the expert, so he must be right.

Saying it can't be true because other titles have AI in 60fps is a logical fallacy however, because different games do different things and you are using false equivalencies.
I never said anything of the sort. I was simply pointing out that a single developer's woes can't be extrapolated into a general truism that's dragging down an entire industry, which is exactly what you're attempting to do with your appeal to authority. To add insult to injury, your Authority never even stated what you claimed: that console CPU limitations are holding back what can be achieved on PC. You put those words into his mouth, and are now claiming that because in your estimation an Authority implied it, it must be true.

I mean, you could but it would be self evidently ridiculous to do so.
In what way?


Asymmetric GPGPU is possible even on PC even now with Mantle [Oxide was talking about this] and definitely its also possible on Xbone, and will be on DX12. You dont need more ACE for this.
Thats my point, there is no evidence or research currently that would show that 2 ACE wouldnt be enough or only slightly worse at performing same GPGPU calculations as 8 ACE in normal gaming scenario.
And by normal gaming scenario, i mean scenario where GPU handles rendering most of the time and uses only 15-20% of its power for GPGPU related tasks [thats counting async].

Can 8 ACE boost PS4 performance over that 40% CU gap in the future? Maybe, but for now we dont have even slight proof of it. Throwing it left and right with hUMA and supercharged architecture is as much accurate as 1Tflops of performance graphs from Nvidia at PS3 launch event.
I think it's more about making full use of the "extra" CUs. The more power available, the more jobs you'll be trying to slot, whether you're slipping extra jobs into the cracks, are running a "14+4"-type setup, or have a combination of both.

Also, with the hUMA architecture, you're likely to see a lot more jobs being passed off to the GPU than you would on your typical PC. Because of the split pool on the PC, if the main thread needs something crunched by the GPU, the CPU needs to copy the data from the RAM to the VRAM. Then the GPU needs to do its thing and write the result back out to the VRAM. Once that is finished, the CPU then needs to copy the new data back to RAM from VRAM, and only then it can begin to work with the result provided by the GPU. All of that copying takes such a long time that if you've got a comparatively easy problem to solve, you lose more time shuffling data around than the GPU can actually save you, even if it's 10-15x faster than the CPU at solving that particular problem. Whittling a 1.0 ms problem down to 0.1 ms problem does you no good if it takes you 2.5 ms to ship it to and from the whittler.
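
Here's a toy version of that break-even test, with completely made-up numbers (real transfer costs depend on the bus, the driver and the job size):

```cpp
#include <cstdio>

// Offloading only pays off if the round-trip copies plus the GPU time
// come in under what the CPU would have spent doing the work itself.
static bool worth_offloading(double cpu_ms, double gpu_speedup, double transfer_ms) {
    return transfer_ms + cpu_ms / gpu_speedup < cpu_ms;
}

int main() {
    // Split pools (discrete GPU): 1.0 ms of CPU work, 10x faster on the GPU,
    // but 2.5 ms spent shuffling data both ways -> not worth it.
    std::printf("split pool : %s\n", worth_offloading(1.0, 10.0, 2.5) ? "offload" : "keep on CPU");
    // Shared (hUMA-style) pool: no copies, so even small jobs clear the bar.
    std::printf("shared pool: %s\n", worth_offloading(1.0, 10.0, 0.0) ? "offload" : "keep on CPU");
    return 0;
}
```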

hUMA avoids the shuffling by having a truly shared data pool. Think of it like two blacksmiths both tempering the same sword; the sword stays on the anvil, and they just take turns hammering it. Because eliminating the data shuffling also eliminates the barrier to entry for these jobs ("you must be this fat to ride this ride"), it stands to reason that we're going to be seeing a lot more jobs wanting to get in on the action. Not only do the additional queues mean shorter lines to get on board, the expanded look at upcoming demand allows for more efficient scheduling of resources.

The benefits of additional queues seem quite clear, and indeed, we already have devs putting them to good use. I'm not sure why you're so convinced they're unnecessary.


A question to bring this back round to topic: I get that there's a unified memory architecture, but from what I understand developers still spend a lot of time having to sync data and communicate with the rest of the chipset. Why is that? I imagine it has something to do with keeping the core algorithm in the GPU's local memory (which I'm assuming can only be accessed by each compute unit, as opposed to being actually shared)?
The GPU and CPU each maintain their own internal, crazy-fast caches of what the data looks like. If the GPU alters a variable the CPU is holding in its cache, it can update its own cache with the result, or not, and it can also update the CPU's cache with the result, or not. Sometimes it's hard for developers to keep track of what should be updated and what should be bypassed. Also, if you're doing a lot of bypassing, sometimes you need to do a periodic flush to make sure all of your data is rational, but doing so kinda reboots any processes you currently have running, so you don't want to do it willy nilly.
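
The same "publish your writes before the other side reads them" discipline shows up with plain CPU threads, so here's an analogy rather than the actual console API (which isn't public); a release/acquire pair plays the role of the selective flush:

```cpp
#include <atomic>
#include <cstdio>
#include <thread>

int result = 0;                   // the payload one side produces and the other consumes
std::atomic<bool> ready{false};   // the "this write is now visible" flag

void producer() {                 // stand-in for the GPU finishing a job
    result = 42;                                      // write the payload
    ready.store(true, std::memory_order_release);     // publish: everything before this becomes visible
}

void consumer() {                 // stand-in for the CPU picking the result up
    while (!ready.load(std::memory_order_acquire)) {} // spin until the publish is observed
    std::printf("result = %d\n", result);             // safe to read now
}

int main() {
    std::thread t1(producer), t2(consumer);
    t1.join();
    t2.join();
    return 0;
}
```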

Yeah, I get what you're saying, but it doesn't really make sense the more I think about it (I don't mean this to be rude, I'm really enjoying our discussion :) so please do expand on my point). It goes back to what serversurfer said about bandwidth, where he shows there is more than enough for the GPU. So we know that bandwidth isn't going to be an issue for GPU compute, and the whole point of GPU compute, and how you REALLY get performance out of it, is to keep all the execution pipelines busy; the ACEs are responsible for monitoring queues and starting the launch process of a kernel. So for the basic GPU compute features we're seeing now, no, you don't really need more than 2 ACEs, but surely it's logical that as GPU compute becomes more involved, having more ACEs will allow for a serious performance advantage?

I mean, am I misunderstanding it?
Nope, you nailed it. The additional hardware is there specifically to facilitate this kind of stuff. You don't see it in "normal game usage" because normal games haven't really had access to this capability for long. The devs who are exploring it are definitely putting the additional queues to use though.
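
If it helps, here's a crude analogy with CPU threads standing in for CUs and a mutex-guarded queue standing in for an ACE queue. The point is only that one shared queue serializes everyone on its lock, while independent queues don't; it's not a model of the real hardware:

```cpp
#include <atomic>
#include <cstdio>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Toy model: workers (stand-ins for CUs) drain either one shared queue or one queue each.
struct JobQueue {
    std::mutex m;
    std::queue<std::function<void()>> q;
};

static void run(int n_workers, int n_queues, int jobs_per_queue) {
    std::vector<JobQueue> queues(n_queues);
    std::atomic<long> done{0};
    for (auto& jq : queues)
        for (int i = 0; i < jobs_per_queue; ++i)
            jq.q.push([&done] { done.fetch_add(1, std::memory_order_relaxed); });

    std::vector<std::thread> workers;
    for (int w = 0; w < n_workers; ++w)
        workers.emplace_back([&queues, n_queues, w] {
            JobQueue& jq = queues[w % n_queues];   // each worker drains "its" queue
            for (;;) {
                std::function<void()> job;
                {
                    std::lock_guard<std::mutex> lock(jq.m);   // the contention point
                    if (jq.q.empty()) return;
                    job = std::move(jq.q.front());
                    jq.q.pop();
                }
                job();
            }
        });
    for (auto& t : workers) t.join();
    std::printf("%d queue(s): %ld jobs completed\n", n_queues, done.load());
}

int main() {
    run(8, 1, 80000);   // every worker fights over one lock
    run(8, 8, 10000);   // independent queues, no contention
    return 0;
}
```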
 

SmokedMeat

Gamer™
Why be concerned with the hardware when the end results are providing visually pleasing games to the vast majority of people who play them. Last I checked there was no shortage of games on the PS4 where people were gushing about the visuals.

That's the thing; even with Wii U we've reached the point where games are gorgeous. The average consumer isn't going to care that they're missing out on a little vegetation in their copy of Mordor, or the textures are equal to PC Medium settings. This stuff only matters to a small minority of the gaming community.
 

vpance

Member
Today that's just simply not the case. Outside of compute (which will change things a bit, I'm sure, but isn't magic either), there's no shortage of talent out there which fully understands the capability of modern hardware.

Ease of understanding and actually putting methods into practice are 2 separate things due to budget and time. Ramping things up to meet a certain level is no doubt much faster and easier than before. But just because the parts that went into these consoles are more PC like now doesn't mean they've approached max efficiency in the first year.

Things will improve, period. And we haven't seen the best there is to offer yet. People saying the gains are significantly diminished and that most will be disappointed have absolutely no ground to stand on.
 
I think a lot of console only folk are going to be hella disappointed with just how quickly the disparity between what devs can do on console with their multiplats and what devs can do on PC with their multiplats grows.

The CPUs in the consoles simply don't have the grunt to carry out the evolution that we'll be seeing over the next 4 years as PC tech changes.
It doesn't matter when virtually all AAA games are built for console specs. All PC versions will get is better IQ and a few effects. The core assets of the games are all built for console specs. No publisher in their right mind would greenlight a AAA game built to fully take advantage of high-end PC hardware... financial suicide.
 
AI seems like a really poor fit for the GPGPU paradigm. The AI actors in a game like AC Unity have state and situational behavior depending on those states. Things like physics and swarm behavior are better suited.
I don't understand why the actors couldn't behave independently. Yes, each actor has his own set of inputs, but then those inputs are all being processed by a common algorithm, assuming they're the same type of actor, right?

If you see a threat, flee. One guy sees a threat, the other guy doesn't, and they each act accordingly. Isn't that Single Instruction, Multiple Data? Why can't stuff like decision-making and path-finding be done on the GPU? Setting actor75's state to Fleeing shouldn't be any harder than setting pixel75's color to Blue, it seems to me.
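
Something like this, sketched in plain C++ for illustration: structure-of-arrays data, one branch-light rule applied to every actor. On the GPU each lane would be one actor; here it's just a loop the compiler can vectorize, and the field names and the flee rule are made up:

```cpp
#include <cstdio>
#include <vector>

enum State { Idle = 0, Fleeing = 1 };

// Structure-of-arrays actor data, the layout a compute shader would want anyway.
struct Actors {
    std::vector<float> x, y;    // positions
    std::vector<int>   state;   // per-actor state
};

// One rule applied uniformly to every actor: flee if a threat is within range.
// Single instruction stream, multiple data; each "lane" just sees different inputs.
static void update(Actors& a, float threat_x, float threat_y, float radius2) {
    for (std::size_t i = 0; i < a.state.size(); ++i) {
        const float dx = a.x[i] - threat_x;
        const float dy = a.y[i] - threat_y;
        const bool in_range = dx * dx + dy * dy < radius2;
        a.state[i] = in_range ? Fleeing : Idle;   // branch-free select, like writing a pixel
    }
}

int main() {
    Actors a;
    a.x = {0.0f, 5.0f, 40.0f};
    a.y = {0.0f, 0.0f, 0.0f};
    a.state.assign(3, Idle);
    update(a, 0.0f, 0.0f, 100.0f);   // threat at the origin, 10-unit radius (squared)
    for (std::size_t i = 0; i < a.state.size(); ++i)
        std::printf("actor %zu: %s\n", i, a.state[i] == Fleeing ? "fleeing" : "idle");
    return 0;
}
```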
 

KKRT00

Member
Yeah, I get what you're saying, but it doesn't really make sense the more I think about it (I don't mean this to be rude, I'm really enjoying our discussion :) so please do expand on my point). It goes back to what serversurfer said about bandwidth, where he shows there is more than enough for the GPU. So we know that bandwidth isn't going to be an issue for GPU compute, and the whole point of GPU compute, and how you REALLY get performance out of it, is to keep all the execution pipelines busy; the ACEs are responsible for monitoring queues and starting the launch process of a kernel. So for the basic GPU compute features we're seeing now, no, you don't really need more than 2 ACEs, but surely it's logical that as GPU compute becomes more involved, having more ACEs will allow for a serious performance advantage?

I mean, am I misunderstanding it?

Yes, I would agree, if the PS4 had more than 1.8 TFLOPS. 1.8 TFLOPS is not really that much, and most rendering time will still be consumed by normal rendering. Sure, some post-processing, physics, or even some lighting calculations will be moved to GPGPU, but for the majority of the frame time the GPU will still be used for shading, geometry processing, shadow maps, etc.
I don't think bandwidth is really a limitation for GPGPU; we have FleX from Nvidia, the most advanced GPGPU-based physics engine, working without a unified memory architecture without any problem.
Bandwidth will mostly be consumed by textures, framebuffers and alpha textures this gen. And I don't think the PS4 has enough bandwidth, going by the lack of AF in so many games, even from 1st-party studios.

---
I think it's more about making full use of the "extra" CUs. The more power available, the more jobs you'll be trying to slot, whether you're slipping extra jobs in to the cracks, are running a "14+4"-type setup, or have a combination of both.

Also, with the hUMA architecture, you're likely to see a lot more jobs being passed off to the GPU than you would on your typical PC. Because of the split pool on the PC, if the main thread needs something crunched by the GPU, the CPU needs to copy the data from the RAM to the VRAM. Then the GPU needs to do its thing and write the result back out to the VRAM. Once that is finished, the CPU then needs to copy the new data back to RAM from VRAM, and only then it can begin to work with the result provided by the GPU. All of that copying takes such a long time that if you've got a comparatively easy problem to solve, you lose more time shuffling data around than the GPU can actually save you, even if it's 10-15x faster than the CPU at solving that particular problem. Whittling a 1.0 ms problem down to 0.1 ms problem does you no good if it takes you 2.5 ms to ship it to and from the whittler.

hUMA avoids the shuffling by having a truly shared data pool. Think of it like two blacksmiths both tempering the same sword; the sword stays on the anvil, and they just take turns hammering it. Because eliminating the data shuffling also eliminates the barrier to entry for these jobs ("you must be this fat to ride this ride"), it stands to reason that we're going to be seeing a lot more jobs wanting to get in on the action. Not only do the additional queues mean shorter lines to get on board, the expanded look at upcoming demand allows for more efficient scheduling of resources.

The benefits of additional queues seem quite clear, and indeed, we already have devs putting them to good use. I'm not sure why you're so convinced they're unnecessary.
The benefits of hUMA, in the applications it has been used for so far, came mostly from very big data-driven workloads, so nothing like what you'd use GPGPU for in games.
I've also read many times that devs can avoid many reads and writes with smart scheduling and data structures.

Can you also post any examples of devs putting the extra ACEs to good use?

Nope, you nailed it. The additional hardware is there specifically to facilitate this kind of stuff. You don't see it in "normal game usage" because normal games haven't really had access to this capability for long. The devs who are exploring it are definitely putting the additional queues to use though.

We already have quite a lot of games using GPGPU, mostly through PhysX, and that's without the new low-level APIs.
 
The CELL was great at a specific set of tasks, parallel computing being one of them, but it was very weak at more general tasks. The current consoles are relatively strong in both: the CPU is pretty good (not desktop-CPU good, but still good) at the general stuff and the GPU is fantastic at the parallel stuff. It will take time for the paradigm to shift towards this, but it will start with the 1st-party stuff and move into the 3rd-party stuff in the next few years, I would have thought. Some 1st parties are already using GPU compute, so it will definitely become more widely utilised.

With what was available, I really do not think Mark Cerny could have made a more balanced gaming system than the PS4 at the price it was released at.

When looking at the various compromises I get the following:
Steamroller based AMD APU: More power consumption, more die space. at the $399 budget you would get a weaker GPU resulting in a lower performing system than what we have now. At the $499 budget you would perhaps have had a similar GPU to what it currently has and a CPU performance increase. Is it worth $100 though? For me, I am not sure.
Nvidia APU: ARM based CPU with likely Kepler based GPU, weaker CPU than currently and the GPU would have worse compute performance so the ability to offload tasks from the CPU would be even weaker than now and it would have a weaker CPU as well. On top of that Nvidia have no experience in manufacturing an APU of this nature so that would be added R&D costs so I doubt that the console would have cost $399.
Intel APU: Depends on the CPU, Atom is no faster than Jaguar so you need a pentium at minimum which would likely require a custom 4 core design as I am sure a dual core would not be ideal for the consoles if they are to last for 5+ years. Anything higher is too expensive. Even with an IRIS Pro based GPU it would be weak compared to the current PS4 GPU so overall it would at best cost the same but be weaker overall.
Intel CPU with AMD/Nvidia GPU: Again within a realistic budget the best Intel can get you is a custom 4 core Pentium and that might be a stretch. Multiple suppliers and multiple chips increases costs and complexity so I doubt they could hit the same performance the current PS4 has at $399.

I just do not see a way they could have done more with their budget, it seems totally optimised within the restrictions they were given and without a higher budget or a subsidised pricing model nothing more was doable.

An Intel APU wouldn't be Atom-level; it'd be Iris Pro with a much stronger CPU (probably mobile i5 level). It would probably be more like $500 rather than $400, though.

Intel CPU w/ Nvidia GPU is basically the OG Xbox, expensive as fuck but insane amounts of power.
 
What? Read the link.

In informal reasoning, the appeal to authority is a form of argument attempting to establish a statistical syllogism.[2] The appeal to authority relies on an argument of the form:
- A is an authority on a particular topic
- A says something about that topic
- A is probably correct

Why did you leave out the actual fallacy?

Fallacious examples of using the appeal include any appeal to authority used in the context of logical reasoning, and appealing to the position of an authority or authorities to dismiss evidence,[2][3][4][5] as, while authorities can be correct in judgments related to their area of expertise more often than laypersons,[citation needed] they can still come to the wrong judgments through error, bias, dishonesty, or falling prey to groupthink

Accepting an expert opinion on prima facie grounds is not only entirely acceptable, it is the norm, and you haven't actually said anything to dispute that other than "He's an expert, therefore we can't trust him", which is inherently nonsensical.

To add insult to injury, your Authority never even stated what you claimed: that console CPU limitations are holding back what can be achieved on PC. You put those words into his mouth, and are now claiming that because in your estimation an Authority implied it, it must be true.

No, I said that his arguments may be a self-serving case to 'justify' platform parity, but that stating so as an indisputable fact is very much getting into the territory of second-guessing someone else's coding practices without - AFAIK - any access to their source code or development methodologies.

In what way?

The PC is not the lead platform, therefore whatever engine architecture is chosen was not chosen to cater to the PC.

Nope, you nailed it. The additional hardware is there specifically to facilitate this kind of stuff. You don't see it in "normal game usage" because normal games haven't really had access to this capability for long. The devs who are exploring it are definitely putting the additional queues to use though.

I really have to ask what your professional qualification for making these sorts of statements is, because while I haven't done any game engine programming in almost a decade, do not maintain any interest or knowledge about current methodologies, and have never worked on a closed system, the stuff you are coming out with as "this is totally a way better way to do it" is completely at odds with all sorts of best practices I had hammered into me.

I mean, it's entirely possible that the GPGPU paradigm necessitates entirely rethinking software architecture, but the entire OP doesn't read like "here's why we had to change every single thing we have always done and why it's super cool and we are totally hyped"; it reads like "these are all the problems we ran into, the massive ballache they caused us, and some elegant hacks we found to get stuff running acceptably".
If this sort of balancing act is what's required, and things that never should have been done by the GPU now have to be due to speed discrepancies, the huge number of delays to titles and the guesswork involved in officially released PC minimum specs make a lot more sense to me.
 

Blazini

Neo Member
I'm fine with what's in my consoles. I already own a gaming PC and don't want to be spending 800-1000 on a console. Just give me fun games with a great MP experience.
 

blu

Wants the largest console games publisher to avoid Nintendo's platforms.
Also, with the hUMA architecture, you're likely to see a lot more jobs being passed off to the GPU than you would on your typical PC. Because of the split pool on the PC, if the main thread needs something crunched by the GPU, the CPU needs to copy the data from the RAM to the VRAM. Then the GPU needs to do its thing and write the result back out to the VRAM. Once that is finished, the CPU then needs to copy the new data back to RAM from VRAM, and only then it can begin to work with the result provided by the GPU. All of that copying takes such a long time that if you've got a comparatively easy problem to solve, you lose more time shuffling data around than the GPU can actually save you, even if it's 10-15x faster than the CPU at solving that particular problem. Whittling a 1.0 ms problem down to 0.1 ms problem does you no good if it takes you 2.5 ms to ship it to and from the whittler.
GPGPU is the way to go, hUMA is a step in the right direction, and you're giving the audience a nice picture of the benefits, but it should be noted that there are also monsters down this road.

Full coherency is great, but it's still communication: data still has to move in and out of caches, and when it comes to minimizing latencies (as in tight millisecond-budget scenarios), there's a tipping point where your computational problem becomes a communication problem. That is, you can reach a granularity level for your GPGPU candidate jobs where the coherency protocols would eat up your fabric's bandwidth.

This is a fundamental problem in NUMA/GPGPU-style UMA clusters ala KNC, KNL, and I cannot see that being much less of an issue for hUMA in the not-too-distant future.
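
That tipping point can be caricatured with a fixed per-job coherency cost: shrink the jobs enough and the protocol traffic, not the payload, is what fills the fabric. Purely illustrative numbers:

```cpp
#include <cstdio>

int main() {
    // Toy model: each GPGPU job moves `payload` bytes of useful data but also costs a
    // fixed `overhead` bytes of coherency/sync traffic on the fabric (all numbers made up).
    const double overhead = 256.0;               // bytes of protocol traffic per job
    const double total    = 64.0 * 1024 * 1024;  // 64 MiB of useful data per frame
    const double sizes[]  = {1048576.0, 65536.0, 4096.0, 256.0};  // job payload sizes in bytes
    for (double payload : sizes) {
        const double jobs    = total / payload;
        const double traffic = total + jobs * overhead;
        std::printf("job size %8.0f B -> %5.2f%% of fabric traffic is coherency overhead\n",
                    payload, 100.0 * jobs * overhead / traffic);
    }
    return 0;
}
```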
 