
Virtual testing of PS4 and XBO GPUs prove PS4 has bigger grafix numbers

Status
Not open for further replies.
I used the absolute "theoretical" max that the article authors get from their "benchmarking"

They specify that very clearly as 32 - 35 "relative points" and it's pretty easy to see they have the PS4 at around 42

If you wanted to say use the "only ESram" XB1 score then it would be around 42/28 -> 1.5 which is basically what you suggested

That will never be achieved because that's going above the actual performance. Realistically you will never hit Microsoft's theoretical limit which used non-practical tricks to accomplish. Not with that eSRAM limit. Meaning the eSRAM performance on the test is already biased in Microsoft's favor.
 
That will never be achieved because that's going above the actual performance. Realistically you will never hit Microsoft's theoretical limit which used non-practical tricks to accomplish. Not with that eSRAM limit.

23 + 27 = 50 is listed as the actual absolute theoretical limit of the XB1 as mentioned in the article

Let me elaborate: If they cannot properly utilize the eSRAM at all (which is physically impossible I might add), you will get the ‘no eSRAM’ performance. Depending on how much they are able to utilize the eSRAM some portion of the ‘eSRAM only’ bar will be added to the lowest bar. However they cannot and never will be fully added. You will never get 23 + 27 = 50 Relative Points. Since they are utilizing the same GPU, realistically speaking the absolute maximum performance you can expect is till 32 – 35 Relative points.

The 32 - 35 is the author somehow [seemingly magically] drawing what the actual performance maximum possible is.

I was incorrect to use theoretical as I did in my previous statement

That all being said, I pretty much see this as BS until we have some comparisons made to other benchmarking tests showing similar real world differences between PC cards
 
@SwiftDeath

You are missing the big picture. The eSRAM results are already above where they should be. They are using a theoretical max that relied on tricks which aren't applicable outside special circumstances designed to hit that new threshold. So the chart was already higher than it should be. Don't you remember all that eSRAM news when it was new?
 
Chloe Dykstra

Great! Something to do before I go to bed.

g7IsB.gif


i mean come on let's be real; we always ask the girl's name just so we can do the gif later. XD

Anyway as for the topic, well we kind of knew this already didn't we?
 

AmyS

Member
Also, don't forget that while Xbox One will be getting DirectX 12 support, PS4 is getting GPU driver and API optimizations from both ICE Team and SN Systems / Razor Team.


Microsoft has a whole operative PR team that explains and demonstrates the upcoming technologies and programming methods that the company uses to enhance the performance of its Xbox One video game console, but Sony on the other hand, likes to give some secrecy to the people who work for betterment of the PS4, and with no PR team, you will hear fairly less about what is going on behind the scenes at Sony, but that doesn’t mean the platform holder is not concerned about the performance of its console.

Until now, we have only heard about Sony’s ICE Team and how it is working to improve the overall PS4 performance by optimizing GPU drivers, graphics API and other important management systems used to run the console, but it seems that ICE Team is not the only unit that works for betterment of PS4, a little-known team called SN Systems is also working to optimize Sony’s latest console.


According to ICE Team Programmer, Cort Stratton, SN Systems (also known as the Razor Team) is a part of Sony Computer Entertainment Inc. Thanks to GamingBolt, we have some details about how the team is contributing in optimizing and increasing the offerings of the PS4 console. Folks over at the SN Systems actually work on the console’s GPU performance analysis and debugging tool, and Razor, which is a CPU profiler for the console.

The official website of SN Systems also reveals some interesting details on how they have been creating the tools that are used by video game developers around the world and how they are pushing the PlayStation platforms to gain every bit of the essential performance. The Razor performance analysis tools built by SN Systems have a feature called PC (or Program Counter) sampling. This feature helps the developers to find out which functions are being called and executed the most, which are sometimes termed “hot” functions.


PC sampling measures performance by regularly determining which function the program counter is in. This allows the developers to get a visual overview of each and every function that is being executed, along with timings. By analyzing the provided timings, developers can find out which functions are taking too long to execute, and this results in better-optimized code. Eventually, shaving off a few microseconds here and there allows the developers to run games at either 30fps or 60fps. In the screenshot shared below, Razor Profiler is showing which function is taking the most time, and the green bars are showing which function the program counter is in.
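To make the PC sampling idea concrete, here is a minimal sketch in Python. It is not Razor's implementation or API; the frame timeline, the function names and the 500 microsecond sampling interval are all made up for illustration. It simply checks at regular intervals which function is "running" and counts the hits, so the most-sampled function shows up as the hot one.

```python
from collections import Counter

# Hypothetical timeline of one frame: (function name, duration in microseconds).
# These names and timings are invented for the example.
FRAME_TIMELINE = [
    ("update_physics", 4000),
    ("cull_objects", 1500),
    ("render_scene", 9000),
    ("present", 500),
]


def sample_program_counter(timeline, interval_us):
    """Return a Counter of how many samples landed in each function."""
    hits = Counter()
    total = sum(duration for _, duration in timeline)
    t = 0
    while t < total:
        # Find the function whose time span contains the sample time t.
        elapsed = 0
        for name, duration in timeline:
            if t < elapsed + duration:
                hits[name] += 1
                break
            elapsed += duration
        t += interval_us
    return hits


def hot_functions(hits):
    """Sort functions by sample count, most sampled first."""
    return hits.most_common()


samples = sample_program_counter(FRAME_TIMELINE, interval_us=500)
for name, count in hot_functions(samples):
    print(f"{name}: {count} samples")
```

The sample counts are only an estimate of where time goes; a shorter interval gives a finer picture at the cost of more overhead, which is the usual trade-off with sampling profilers.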

The amount of storage can be limited when developing for a video game console, so SN Systems makes sure that the executable files are as small as they possibly can be, with the help of dead-stripping and de-duplicating. As the name suggests, dead-stripping is about removing the unused “dead” code and data blocks from the executable files since they are of no use. This reduces the size of the executable file by around 5-10%. De-duplication further optimizes the final executable file by eliminating duplicated copies of identical code and read-only data. Each reference to a duplicate is redirected to the single remaining copy, allowing a further 1-2% reduction in executable size.
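As a rough illustration of the two techniques, here is a toy sketch in Python. It is not how the SN Systems linker works internally; the symbol table layout, the symbol names and the single-pass folding are assumptions made for the example. Dead-stripping is modeled as keeping only what is reachable from an entry point, and de-duplication as folding symbols with identical bodies and redirecting references to the surviving copy.

```python
def dead_strip(symbols, entry):
    """Keep only symbols reachable from the entry point."""
    live, stack = set(), [entry]
    while stack:
        name = stack.pop()
        if name in live:
            continue
        live.add(name)
        stack.extend(symbols[name]["refs"])
    return {n: s for n, s in symbols.items() if n in live}


def deduplicate(symbols):
    """Fold symbols with identical body and references into the first one seen.

    Single pass only; a real linker would typically repeat this until
    nothing more can be folded.
    """
    canonical = {}  # (body, refs) -> surviving symbol name
    alias = {}      # every symbol name -> surviving symbol name
    for name, sym in symbols.items():
        key = (sym["body"], tuple(sym["refs"]))
        alias[name] = canonical.setdefault(key, name)
    return {
        name: {"body": sym["body"], "refs": [alias[r] for r in sym["refs"]]}
        for name, sym in symbols.items()
        if alias[name] == name
    }


# Hypothetical symbol table: names, bodies and references are invented.
symbols = {
    "main": {"body": b"call clamp_a; call clamp_b", "refs": ["clamp_a", "clamp_b"]},
    "clamp_a": {"body": b"min(max(x, 0), 1)", "refs": []},
    "clamp_b": {"body": b"min(max(x, 0), 1)", "refs": []},
    "debug_dump": {"body": b"print everything", "refs": ["clamp_a"]},
}

stripped = dead_strip(symbols, "main")  # debug_dump is never reached
folded = deduplicate(stripped)          # clamp_b folds into clamp_a
print(sorted(stripped), sorted(folded))
```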

There was already a thread on this but here's the rest:

http://wccftech.com/ps4-razor-gpucp...detailed-sn-systems-working-optimize-console/
 
@SwiftDeath

You are missing the big picture. The eSRAM results are already above where they should be. They are using a theoretical max that relied on tricks which aren't applicable outside special circumstances designed to hit that new threshold. So the chart was already higher than it should be. Don't you remember all that eSRAM news when it was new?

First off I really and truly do think it's BS but having conceded that I tried to interpret what they were saying and what their "results" were

From this

Let me elaborate: If they cannot properly utilize the eSRAM at all (which is physically impossible I might add), you will get the ‘no eSRAM’ performance. Depending on how much they are able to utilize the eSRAM some portion of the ‘eSRAM only’ bar will be added to the lowest bar. However they cannot and never will be fully added.

Depending on how successfully you utilize the ESram you add to the No Esram bar

Thus the absolute theoretical minimum XB1 score is 23 and the absolute theoretical maximum is 50 if you were fully utilizing the ESram 100% Both 23 and 50 are impossible states to achieve by the way.

All of this is clearly explained in the article as well.
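For anyone who wants to play with the numbers, the reading above can be written down as a tiny Python sketch. This is only an interpretation of the quoted passage, not the article's actual method: the linear "add a fraction of the eSRAM bar" formula and the function names are assumptions, and the 23, 27 and 32-35 figures are the relative points quoted in this thread.

```python
NO_ESRAM = 23         # 'no eSRAM' bar, relative points (as quoted from the article)
ESRAM_ONLY = 27       # 'eSRAM only' bar, relative points
AUTHOR_CAP = (32, 35)  # the article's stated realistic maximum


def xb1_score(utilization):
    """Assumed linear model: add a fraction of the eSRAM bar to the no-eSRAM bar."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return NO_ESRAM + utilization * ESRAM_ONLY


def utilization_for(score):
    """Fraction of the eSRAM bar needed to reach a given score under the same model."""
    return (score - NO_ESRAM) / ESRAM_ONLY


print(xb1_score(0.0), xb1_score(1.0))  # the two unreachable extremes
for cap in AUTHOR_CAP:
    print(cap, round(utilization_for(cap), 3))
```

Under this assumed model, the 32 - 35 range corresponds to adding roughly a third to under a half of the 'eSRAM only' bar, which at least shows the article's cap sits well inside the 23 to 50 span rather than near either end.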
 

klaus

Member
Actually the PS4 would be at 155% of the Xbox One.

That's incorrect, but an understandable mistake - the graph is highly misleading (and the whole "benchmark" is rather dubious anyway). Both bars for the Xbox One are absolute worst cases (no eSRAM bandwidth used at all or no DDR3 bandwidth used at all) which will never occur in a real situation. The text even clearly states so, and they assume an actual theoretical limit of 32-35 relative points.

This article gives no real new information, uses wrong terminology (no benchmarks here) and contains graphs with misleading stats. In other words, it's utterly useless, but I'm already looking forward to the countless quotes of the graph used to demonstrate how "huge" the power gap is :)
PS4 is obviously more powerful, but if we would put any faith in this article the difference would be around 20% - 30%, clearly contradicting the impression the graph is giving..
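The different percentages thrown around in this thread all come from dividing the same PS4 figure by a different Xbox One figure. A quick Python check, assuming the PS4 bar sits at roughly 42 relative points as read off the graph earlier in the thread (the helper name is just for illustration):

```python
PS4_SCORE = 42  # approximate reading of the PS4 bar, per the earlier post


def advantage_percent(ps4, xb1):
    """How much higher the PS4 score is than the Xbox One score, in percent."""
    return (ps4 / xb1 - 1) * 100


# Xbox One figures used by different posters in the thread.
for label, xb1 in [("eSRAM only bar", 27), ("rounded eSRAM only", 28),
                   ("article cap, low", 32), ("article cap, high", 35)]:
    print(f"{label}: PS4 is {advantage_percent(PS4_SCORE, xb1):.1f}% higher")
```

Against the 'eSRAM only' bar the gap comes out at about 55%, while against the article's own 32-35 estimate it is roughly 20% to 31%, which is where the "20% - 30%" figure comes from.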
 
Thus the absolute theoretical minimum XB1 score is 23 and the absolute theoretical maximum is 50 if you were fully utilizing the ESram 100% Both 23 and 50 are impossible states to achieve by the way.

No... You are still maxed out by the GPU. You could only add if it had dual gpus. I'm getting extremely technical now, maybe someone else can explain this who's better at English.

That's incorrect, but an understandable mistake - the graph is highly misleading (and the whole "benchmark" is rather dubious anyway). Both bars for the Xbox One are absolute worst cases (no eSRAM bandwidth used at all or no DDR3 bandwidth used at all) which will never occur in a real situation. The text even clearly states so, and they assume an actual theoretical limit of 32-35 relative points.

This article gives no real new information, uses wrong terminology (no benchmarks here) and contains graphs with misleading stats. In other words, it's utterly useless, but I'm already looking forward to the countless quotes of the graph used to demonstrate how "huge" the power gap is :)
PS4 is obviously more powerful, but if we would put any faith in this article the difference would be around 20% - 30%, clearly contradicting the impression the graph is giving..

Actually, without meaning to, the eSRAM bar is much higher than it should be, because they are using theoretical limits that can't be achieved in real time, and insane things had to be done to raise that theoretical limit. Also, you will never utilize both evenly unless you only use 2x the eSRAM's size in memory; after that it slows down closer to DDR3. The only chance of getting that bigger boost is if they just said "I'm not going to use that 7+ GB of RAM". Even then you're not using 2 GPUs, you're still only using one, so in the best case scenario where they don't need much RAM... you're still held back by one GPU.
 

klaus

Member
No... You are still maxed out by the GPU. You could only add if it had dual gpus. I'm getting extremely technical now, maybe someone else can explain this who's better at English.

Maybe I'm mistaken (and I see your point about the eSRAM bar being too high to begin with since it's using theoretical numbers), but aren't both bars maxed out by bandwidth (either DDR3 or eSRAM)? So of course after a certain point it will be maxed out by GPU, and that's the case the article is making with the 32-35 theoretical max?

Edit: To make my point clearer: isn't the graph missing the improved bandwidth achieved by using DDR3 & eSRAM in parallel?
 
Maybe I'm mistaken (and I see your point about the eSRAM bar being too high to begin with since it's using theoretical numbers), but aren't both bars maxed out by bandwidth (either DDR3 or eSRAM)? So of course after a certain point it will be maxed out by GPU, and that's the case the article is making with the 32-35 theoretical max?

Those maxes are within reason if they only use equal amounts of eSRAM and DDR3 memory. Once you use more DDR3 than eSRAM it drops quickly.
 

TheCloser

Banned
First off I really and truly do think it's BS but having conceded that I tried to interpret what they were saying and what their "results" were

From this



Depending on how successfully you utilize the ESram you add to the No Esram bar

Thus the absolute theoretical minimum XB1 score is 23 and the absolute theoretical maximum is 50 if you were fully utilizing the ESram 100% Both 23 and 50 are impossible states to achieve by the way.

All of this is clearly explained in the article as well.
Here, I'll explain it to you in plain English. The highest the xb1 can reach is 27 and that is if eSRAM is only used. In real life scenarios, the xb1 number will fall between 23 and 27 based on the combination of ram used. Adding numbers is not permitted as we are talking about a single gpu. Doing so is completely and utterly foolish. It displays a lack of understanding of the subject.
 

klaus

Member
Those maxes are within reason if they only use equal amounts of eSRAM and DDR3 memory. Once you use more DDR3 than eSRAM it drops quickly.

OK, got it now - the problem is that the eSRAM quickly fills & then stalls once full, while DDR3 can still be used, thus bringing down the average throughput. But I guess that would highly depend on how the eSRAM is used - if you can fit the whole image buffer into eSRAM you should be able to use its full bandwidth, since you're overwriting pixels many times, correct? (Or am I mixing up / forgetting about read / write operations?)
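On the "fit the whole image buffer into eSRAM" point, a quick size check in Python helps. This is only arithmetic under stated assumptions: 32 MB is taken as 32 MiB, each render target is assumed to use 4 bytes per pixel, and nothing here says how real games actually lay out their buffers.

```python
ESRAM_BYTES = 32 * 1024 * 1024  # assuming "32 MB" means 32 MiB


def render_target_bytes(width, height, bytes_per_pixel=4):
    """Size of one uncompressed render target (4 bytes per pixel assumed)."""
    return width * height * bytes_per_pixel


def targets_that_fit(width, height, bytes_per_pixel=4):
    """How many such render targets fit in the eSRAM at once."""
    return ESRAM_BYTES // render_target_bytes(width, height, bytes_per_pixel)


for width, height in [(1280, 720), (1600, 900), (1920, 1080)]:
    size_mib = render_target_bytes(width, height) / (1024 * 1024)
    print(f"{width}x{height}: {size_mib:.2f} MiB each, "
          f"{targets_that_fit(width, height)} fit in eSRAM")
```

So a single 1080p color buffer fits comfortably, but a renderer that keeps several full-resolution targets plus depth around at once runs out of room quickly under these assumptions.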
 

LiquidMetal14

hide your water-based mammals
Great! Something to do before I go to bed.

g7IsB.gif


i mean come on let's be real; we always ask the girl's name just so we can do the gif later. XD

Anyway as for the topic, well we kind of knew this already didn't we?

Somehow the thread always has some of these funny reactions.
 
Here, I'll explain it to you in plain English. The highest the xb1 can reach is 27 and that is if eSRAM is only used. In real life scenarios, the xb1 number will fall between 23 and 27 based on the combination of ram used. Adding numbers is not permitted as we are talking about a single gpu. Doing so is completely and utterly foolish. It displays a lack of understanding of the subject.

Please explain where the 32 - 35 numbers come from then?

Let me elaborate: If they cannot properly utilize the eSRAM at all (which is physically impossible I might add), you will get the ‘no eSRAM’ performance. Depending on how much they are able to utilize the eSRAM some portion of the ‘eSRAM only’ bar will be added to the lowest bar. However they cannot and never will be fully added. You will never get 23 + 27 = 50 Relative Points. Since they are utilizing the same GPU, realistically speaking the absolute maximum performance you can expect is till 32 – 35 Relative points.

Clearly they are still talking about a singular GPU scenario
 
OK, got it now - the problem is that the eSRAM quickly fills & then stalls once full, while DDR3 can still be used, thus bringing down the average throughput. But I guess that would highly depend on how the eSRAM is used - if you can fit the whole image buffer into eSRAM you should be able to use its full bandwidth, since you're overwriting pixels many times, correct? (Or am I mixing up / forgetting about read / write operations?)

This is the reason why we are seeing the huge gaps closer to the DDR3 only VS PS4 graph. More than double the eSRAM is needed visually for all modern games currently. Game developers are easily using all the DDR3, and a lot of that is visual information.

Please explain where the 32 - 35 numbers come from then?



Clearly they are still talking about a singular GPU scenario

They are using the bandwidth from both effectively, 64 MB (32 from eSRAM and 32 from DDR3) of memory only for visuals and GPGPU operations combined. Anything more and the performance lowers.
 

klaus

Member
Here, I'll explain it to you in plain English. The highest the xb1 can reach is 27 and that is if eSRAM is only used. In real life scenarios, the xb1 number will fall between 23 and 27 based on the combination of ram used. Adding numbers is not permitted as we are talking about a single gpu. Doing so is completely and utterly foolish. It displays a lack of understanding of the subject.

No that's also not correct - the article quite clearly states, that
"Depending on how much they are able to utilize the eSRAM some portion of the ‘eSRAM only’ bar will be added to the lowest bar. However they cannot and never will be fully added. You will never get 23 + 27 = 50 Relative Points."
 

npa189

Member
It's a known fact that Microsoft screwed up this gen, we need to stop beating a dead horse. The guts of a box don't define it, the games it has do. Both are going to have kick ass titles.
 

TheCloser

Banned
Please explain where the 32 - 35 numbers come from then?



Clearly they are still talking about a singular GPU scenario
I don't agree with the article at all. I don't know how he came up with his 32-35 numbers and I think the whole benchmark is a joke. The fastest possible path of getting data to the gpu is through eSRAM. You can't get data any faster than that. Obviously you are able to use both types of ram to transfer data simultaneously but there will be adverse effects. We already know that Esram is not automatically managed like eDRAM on the 360. The management of the data is left to the programmer and that data must be managed which always means wasted time. You cannot generalize the effects of this because it differs on a case to case basis. I honestly don't know how the author is getting the numbers and I'd love to hear more.

In my eyes, what this article does is benchmark two scenarios (DDR3 & eSRAM) and try to figure out what the combined performance level is. That's like boiling two eggs separately and trying to figure out how long it will take to boil the two eggs together. Completely pointless and a waste of time.
 

klaus

Member
This is the reason why we are seeing the huge gaps closer to the DDR3 only VS PS4 graph. More than double the eSRAM is needed visually for all modern games currently. Game developers are easily using all the DDR3, and a lot of that is visual information.



They are using the bandwidth from both effectively, 64 MB (32 from eSRAM and 32 from DDR3) of memory only for visuals and GPGPU operations combined. Anything more and the performance lowers.

Can you elaborate on that, please? I fully believe you, but am not knowledgeable enough about the inner workings of the hardware to understand why only 32 MB of DDR3 can be used in the given scenario. I guess it has something to do with the data flow, i.e. where stuff is written to / read from. Is your example assuming reading 32 MB from DDR3 & 32 MB from eSRAM in parallel (and writing to god knows where :)? And is the read / write bandwidth cumulative (i.e. can you use full bandwidth in both directions at the same time) for a given type of RAM?

Sorry for the noobish questions :)
 
Can you elaborate on that, please? I fully believe you, but am not knowledgeable enough about the inner workings of the hardware to understand why only 32 MB of DDR3 can be used. I guess it has something to do with the data flow, i.e. where stuff is written to / read from. Is your example assuming reading 32 MB from DDR3 & 32 MB from eSRAM in parallel (and writing to god knows where :)? And is the read / write bandwidth cumulative (i.e. can you use full bandwidth in both directions at the same time) for a given type of RAM?

To get the maximum bandwidth of both you have to use equal amounts, after that you start to go down because you can't continue to combine something that isn't available. All the DDR3 in excess now lowers the available potential bandwidth. Please don't make me do a chart.
 

TheCloser

Banned
No that's also not correct - the article quite clearly states, that
"Depending on how much they are able to utilize the eSRAM some portion of the ‘eSRAM only’ bar will be added to the lowest bar. However they cannot and never will be fully added. You will never get 23 + 27 = 50 Relative Points."
Yep I see your point and it just clicked in my brain. Please disregard my other post. It's way too late and I'm way too tired. I'm out.
 

klaus

Member
To get the maximum bandwidth of both you have to use equal amounts, after that you start to go down because you can't continue to combine something that isn't available. All the DDR3 in excess now lowers the available potential bandwidth. Please don't make me do a chart.

Eh I hope that won't be needed :)

Well I'm still a bit thick, if you say you combine them, wouldn't the 32 MB in the eSRAM be "used up" much faster since it has a higher bandwidth? But I guess I should stop asking and rather start reading up on how GPU hardware works - got a solid understanding of vertex / pixel shaders, SIMD and stuff, but never bothered to learn how the underlying buses, ROPs and whatnot are working :p
 
Eh I hope that won't be needed :)

Well I'm still a bit thick, if you say you combine them, wouldn't the 32 MB in the eSRAM be "used up" much faster since it has a higher bandwidth? But I guess I should stop asking and rather start reading up on how GPU hardware works - got a solid understanding of vertex / pixel shaders, SIMD and stuff, but never bothered to learn how the underlying buses, ROPs and whatnot are working :p

Combined means using the max output of both memory simultaneously. Something that will never be achieved in real-time, but used to make it easier from a mathematics standpoint.
 

KoopaTheCasual

Junior Member
Wait, Nelson himself isn't the joke?
Aww, I wish people weren't so hard on Nelson. He clearly knows he's been shoveling garbage PR (hence the passive aggressive twitter handle change he made after DRM-gate, about flipping switches). Can we lighten up on the poor chap?
 
Not like it isn't totally obvious the PS4 GPU is the stronger of the two, but a scenario with only eSRAM and only DDR3 isn't terribly realistic. And to make matters even worse, how a developer plans and then takes advantage of eSRAM can likely vary quite significantly from game to game.

One other thing is that it's just the DDR3 that is 256-bit on the Xbox One. The eSRAM is actually 1024-bit. The significance in this particular case? Hell if I know, but thought it required mentioning.
 
So most powerful console = 7850 and a bunch of Jaguars (AMD Atoms)?
Now all that talk about consoles power praising the "powerful" consoles became more ridiculous than ever.
 

klaus

Member
Do we really need more tests to say ps4 is more gpu powerful? Lol

We've known that for over a year now. The interesting thing would be to find out how much stronger it is :)

Apparently the answer is quite difficult and depending on a lot of assumptions / the given situation, since the X1 has a rather peculiar architecture where the theoretical limit changes with how the hardware is used (but still is in any case lower than the PS4 GPU's limit). At the end of the day, a difference of 20% is a whole different ballpark than 80% (pulling numbers out of my *ss ofc).
 

Daviii

Member
Please guys stop summing up the two XBone bars. That's just nonsense.

The peak maximum performance of the XBone is the longer bar. You cannot get past that.

And you cannot even get there in an actual game either.
 
Not thread whining as this is a legitimate question, but what is there supposed to be discussed here that hasn't been already, ever since the lead-up to the consoles' launch?
 
Please guys stop summing up the two XBone bars. That's just nonsense.

The peak maximum performance of the XBone is the longer bar. You cannot get past that.

And you cannot even get there in an actual game either.

There's no Xbox One game that will ever not make use of DDR3 for the record, so what you're saying doesn't make sense. And, yes, I'm well aware that they gave the Xbox One the benefit of the doubt by using Microsoft's own high mark for peak achievable eSRAM bandwidth, but the fact remains that there are key variables here that simply can't be overlooked, and that can't be very easily benchmarked.
 
Hopefully this helps people visualize what's happening. (This is napkin math and is only used to give us an idea of what is going on, I think I am being lenient on how fast it drops to be safe)
hk9ckEB.jpg


Assumptions
1.) Only eSRAM is used if under 32 MBs
2.) The numbers given are accurate
3.) We are using the unreachable theoretical max of eSRAM

When the two are used in equal amounts the chart goes up; once that max threshold is met the max goes down on a curve, and it would eventually drop to DDR3 numbers if we had infinite DDR3 RAM. As you can see, at around 500MB of total RAM used you're back to the eSRAM line. This is all theoretical. The question now is how much RAM the developers need for the visuals alone....
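For anyone who can't see the chart, the general shape can be reproduced with a toy model in Python. To be clear, this is a different and much cruder calculation than whatever went into the chart, so the exact crossover points will not match it: the model assumes the first 32 MB always lives in eSRAM, everything else lives in DDR3, both pools are read fully in parallel, and the article's 27 and 23 relative points are treated as transfer rates.

```python
ESRAM_MB = 32    # eSRAM capacity
ESRAM_RATE = 27  # 'eSRAM only' relative points, treated here as MB per time unit
DDR3_RATE = 23   # 'no eSRAM' relative points, treated the same way


def effective_rate(total_mb):
    """Combined rate when total_mb of data is split between eSRAM and DDR3.

    Assumes perfect overlap: both memories transfer at the same time and the
    slower of the two transfers decides when the work is finished.
    """
    if total_mb <= 0:
        raise ValueError("total_mb must be positive")
    if total_mb <= ESRAM_MB:
        return float(ESRAM_RATE)  # assumption 1: only eSRAM is used under 32 MB
    esram_time = ESRAM_MB / ESRAM_RATE
    ddr3_time = (total_mb - ESRAM_MB) / DDR3_RATE
    return total_mb / max(esram_time, ddr3_time)


for total in (16, 32, 48, 64, 128, 216, 500, 4096):
    print(f"{total:>5} MB -> {effective_rate(total):.1f}")
```

In this toy version the rate climbs towards the summed figure when roughly equal amounts sit in each pool, then decays towards the DDR3 number as more DDR3 is used. It falls back to the eSRAM line at about 216 MB rather than the roughly 500 MB read off the chart, so treat it as a picture of the trend only.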
 