
EuroGamer: More details on the BALANCE of XB1

It's suspiciously interesting that MS PR specifically mentioned the GPU overclock was better than running with 14 CUs...

The claim that MS couldn't use those 2 CUs and that the upclock was therefore better for them is BS (their games may have had gains from the upclock, but that's beside the point). CELL had terrible yields and disabled 1 SPE out of 8 for better yields; MS is disabling 1 CU out of 7. Who in their right mind would want 1/7 of the CU silicon to go to "waste"?
 
This isn't how the real world works; it isn't so cut and dried, but I could believe it for specific scenarios.

Which just goes back to what Cerny said: if you have extra cycles you can use them for GPGPU.
I would hope people know you're never maxing out the GPU all the time.
 

gofreak

GAF's Bob Woodward
PS4 unbalanced confirmed :)

That's really interesting though. Would that mean that cross-gen multiplatform titles might not show much difference between Xbox and PS4? Assuming they are relatively straightforward up-ports.

Actually, if you are not CU bound on - say - a 'typical' 1080p game today on either system, but are fillrate bound, it would spell bigger trouble for XB1.

The implications of MS's comments aren't really flattering on a number of fronts.

If I were MS, I'd be hoping PS4 was CU bound too... else that much larger ROP advantage could be brought to bear in a more straightforward fashion.

It's also funny to see them say 'look, Sony thinks we're right too!', when Sony's prescription for balance in the typical case for rendering actually looks a whole lot different than MS's. The funny thing about MS talking about balance while looking at one part of the pipeline...

(In reality, personally, I think performance is very much a YMMV thing... at a given res some games will be more CU bound, others more ROP bound, etc. In any case, all these variations will do better on PS4)
 

RoboPlato

I'd be in the dick
That's a huge misconception. He said nothing different from what I said; he just danced around or left out the information communicated to devs. He didn't actually lie, he just omitted it or didn't fully clarify.

In fact, Cerny greatly hinted that it was true.

http://www.eurogamer.net/articles/digitalfoundry-face-to-face-with-mark-cerny



When he says 'a little bit more ALU than if you were thinking strictly about graphics', and when he says the hardware is 'intentionally not 100 percent round', he means that returns for graphics don't scale per block of ALU the way you might think.

He was just making clear that performance doesn't scale linearly, which is true. He was saying that fact might encourage devs to use that performance for compute instead of rendering. You aren't forced into anything. I don't think devs would be reporting 40-50% gains over XBO performance if 14 was the max they could use efficiently.

Also this:
The Graphics Processing Unit (GPU) has been enhanced in a number of ways, principally to allow for easier use of the GPU for general purpose computing (GPGPU) such as physics simulation. The GPU contains a unified array of 18 compute units, which collectively generate 1.84 Teraflops of processing power that can freely be applied to graphics, simulation tasks, or some mixture of the two.

http://www.scei.co.jp/corporate/release/130221a_e.html
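For anyone who wants to sanity-check that figure, here's a rough back-of-the-envelope sketch of where 1.84 Teraflops comes from. The 64 ALUs per CU (standard GCN) and the 800MHz clock are assumptions on my part - neither number is in the press release itself:

Code:
# Rough sketch of the 1.84 TFLOPS figure from the press release.
# Assumed (not from the release): 64 ALUs per CU as in standard GCN,
# one fused multiply-add = 2 FLOPs per cycle, and an 800 MHz GPU clock.
cus = 18
alus_per_cu = 64
flops_per_alu_cycle = 2
clock_hz = 800e6

tflops = cus * alus_per_cu * flops_per_alu_cycle * clock_hz / 1e12
print(f"{tflops:.2f} TFLOPS")  # -> 1.84 TFLOPS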
 

mrklaw

MrArseFace
so to paraphrase

"you could use those extra CUs for graphics, but you'll hit a bit of diminishing returns - wouldn't that be a waste? Sure your game will look a little better, but imagine how amazing it'll look if you explored using those CUs for GPGPU instead. All that amazing physics, AI, fluid dynamics - it'll look fantastic"
 

ekim

Member
so to paraphrase

"you could use those extra CUs for graphics, but wouldn't that be a bit of a waste? Sure your game will look a little better, but imagine how amazing it'll look if you explored using those CUs for GPGPU instead. All that amazing physics, AI, fluid dynamics - it'll look fantastic"

That's how I understand it, yes.
 
You need to post whatever you have because you are interpreting it wrong.

This isn't my interpretation. It's the interpretation of an experienced games programmer that learned this at an official Sony devcon. I'm just repeating their understanding, so there's very little getting lost in translation as a result. Only way the info could be misinterpreted is if they misinterpreted it, which I doubt.

so to paraphrase

"you could use those extra CUs for graphics, but you'll hit a bit of diminishing returns - wouldn't that be a waste? Sure your game will look a little better, but imagine how amazing it'll look if you explored using those CUs for GPGPU instead. All that amazing physics, AI, fluid dynamics - it'll look fantastic"

More or less, absolutely.
 

KidBeta

Junior Member
This isn't my interpretation. It's the interpretation of an experienced games programmer that learned this at an official Sony devcon. I'm just repeating their understanding, so there's very little getting lost in translation as a result. Only way the info could be misinterpreted is if they misinterpreted it, which I doubt.



More or less, absolutely.

Well then they are misinterpreting it or you are not getting the full story because half of what you post is completely false.
 
Is the developer conference in question GDC, and is the presentation in question "Overview of the PS4 for Developers"? Because then I could understand why it couldn't be shared - it's behind a paywall in the GDC Vault.
Well, if what gofreak said is true, then it does apply to desktop GPUs too. He cited benchmarks where comparable GPUs with 50% more CUs only saw about a 25% gain in performance.
What's being described here seems to go beyond just diminishing returns - it's being made out as if improvement falls off a cliff entirely after 14 CUs, as if that's the magic number, while AMD is happily churning out 16, 20, 32 CU cards, etc.

From a review of the 7790, framerates (not the ideal comparison of course, since the 7850 has other advantages beyond the CU count, but the 7790 has the magic number of CUs)
Code:
Game                  Radeon HD 7790 1GB   Radeon HD 7850 1GB
BioShock Infinite     60.5 fps             81.5 fps
Hitman Absolution     35.5 fps             46.7 fps
Batman: Arkham City   65.0 fps             89.0 fps
Tomb Raider           53.4 fps             69.4 fps
Sleeping Dogs         45.9 fps             56.3 fps
Metro 2033            23.7 fps             32.6 fps
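If you run the numbers from that table, the 7850's lead works out to roughly 23-38% depending on the game - quick sketch below, using the framerates quoted above (and keeping in mind the 7850's advantage isn't only CUs):

Code:
# Per-game gain of the 7850 over the 7790, using the framerates above.
# Illustrative only - the 7850 also has more bandwidth etc., so this
# isn't a pure CU-count comparison.
results = {
    "BioShock Infinite":   (60.5, 81.5),
    "Hitman Absolution":   (35.5, 46.7),
    "Batman: Arkham City": (65.0, 89.0),
    "Tomb Raider":         (53.4, 69.4),
    "Sleeping Dogs":       (45.9, 56.3),
    "Metro 2033":          (23.7, 32.6),
}
for game, (hd7790, hd7850) in results.items():
    gain = (hd7850 / hd7790 - 1) * 100
    print(f"{game:20s} +{gain:.0f}%")   # gains land roughly between +23% and +38%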
Sony is saying it might be better to use those extra CUs for compute work, because you would gain more than the x% of performance you'd get by just using them on graphics.
Sure. That's all well and good and seems plausible. But that's not really the only claim being made here.
True. But they're just saying: have a look to see if there's a point where you could be using CUs better for other things.

They're asking devs to check if they're under-using ALU, holding the fairly attractive suggestion that quite a lot of power could be spare without hugely affecting render performance. But, again, it's very much your mileage may vary.

It wouldn't be pointless... like those PC benches show, you can still get a gain (variable depending on the game). In some the gain may even be linear. But if on average the gain isn't linear, and your software is being held back by some other point in the pipeline beyond a certain level of shading performance, it is simply wise to consider throwing in other work that may have a different ratio of resource demands, to get ALU utilisation back up, and get some nice features or work done into the bargain.

It's not a specific impact on PS4. This is true of any GPU.
As above, seems completely plausible - if the extra compute resources aren't really going to do much for a given piece of software in terms of graphics, it makes perfect sense to utilise them for something else; otherwise they're just wasted resources. Of course, as also noted above, that doesn't seem to be the sole claim being made here - the claim seems to be more that the PS4 is somehow different from desktop GPUs, and that beyond using 14 of its CUs for graphics the performance increase plummets to the point of being negligible.
And for sure it's in Sony's interests to push GPGPU. It could be a strong competitive advantage for them; it would be hard, if not impossible, to achieve parity on another console if a game is chucking a decent amount of GPGPU work around.
Also very true. If a multiplatform game really is utilising 400 GFLOPS of the GPU for non-graphical tasks, it probably doesn't end well for the XB1.
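(For reference, that "400 GFLOPS" figure is just the 4 CUs' share of the 1.84 TFLOPS - a quick sketch, again assuming the standard GCN 64 ALUs per CU and an 800MHz clock, neither of which is officially stated in this context:)

Code:
# The "~400 GFLOPS left over" figure: 4 of the 18 CUs' worth of ALU.
# Assumed: 64 ALUs per CU, 2 FLOPs per ALU per cycle (FMA), 800 MHz clock.
spare_cus = 4
gflops = spare_cus * 64 * 2 * 800e6 / 1e9
print(f"{gflops:.0f} GFLOPS")  # -> 410, i.e. roughly 400 GFLOPS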
 

onQ123

Member
Because the way you're saying it, why even have a CPU if part of a giant GPU can just act as one? It doesn't work that way: the GPU can only do certain things a CPU can, and if what you need is among the things it can't do well, or at all, then that power might go to waste.

What are you talking about?

That's the conclusion you came to in your head. I never said anything like that; I just pointed out that even if you split the GPU up into 14 + 4, that would still leave 400 GFLOPS of compute for the devs to use.


I know what a GPGPU is.

Could Next Gen see 1 of the biggest improvements between console gens due to GPGPUs?


GPGPU Computing & why you should be more excited about Kinect 2 & the Next PS-Eye.
 

viveks86

Member
This isn't my interpretation. It's the interpretation of an experienced games programmer that learned this at an official Sony devcon. I'm just repeating their understanding, so there's very little getting lost in translation as a result. Only way the info could be misinterpreted is if they misinterpreted it, which I doubt.

Doesn't this sound eerily similar to Penello's "I've got a technical fellow" argument? I'm sorry Senjutsu, but that justification holds no water here. When information changes hands, our own presumptions invariably creep in. Going by your last few posts, you are paraphrasing, not reporting. So there is always a possibility of misinterpretation.
 
so to paraphrase

"you could use those extra CUs for graphics, but you'll hit a bit of diminishing returns - wouldn't that be a waste? Sure your game will look a little better, but imagine how amazing it'll look if you explored using those CUs for GPGPU instead. All that amazing physics, AI, fluid dynamics - it'll look fantastic"

If bandwidth allows, use them for graphics if graphics is what you want. With similar engines on XO and PS4 multiplatform games, I would assume diminishing returns would come at some point. It depends on the scalability of tasks and whether the extra work gives noticeable results. I can't wait to design my own engine.
 

gofreak

GAF's Bob Woodward
He was just making clear that performance doesn't scale linearly, which is true. He was just saying that fact might encourage devs to use that performance for compute instead of rendering. You aren't forced into anything. I don't think devs would be reporting 40-50% gains over XBO performance if 14 was the max they could use efficiently.

A game's sweet spot for rendering actually wouldn't have to go far north of 14 to achieve that kind of difference, if rumours about XB1's OS reservations are true.

Say 15 CUs at 800MHz vs 10.8 CUs at 853MHz (12 CUs minus the rumoured ~10% OS reservation), with 32 ROPs vs 16 ROPs... that would give a pretty straightforward 30-90% gain on the shading and pixel-output end of the pipeline, with where perf lands depending on where in that segment of the pipeline the bound balances out.

And this example is assuming a case where rendering perf doesn't scale at all past a certain amount of ALU, whereas realistically we might be talking more about a less than linear scaling (i.e. some further improvement on the lower end of that range).

So back of a napkin stuff - but I don't think this idea is necessarily inconsistent with some of the rumours we've been hearing about rendering deltas. Remember too: it all depends on the shape of an individual game's pipeline.
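In code form, for anyone who wants to check the arithmetic (the CU counts, clocks and ROP counts are the assumed figures from the post above, not official numbers):

Code:
# The napkin math above. Assumed figures, not official specs:
# 15 CUs @ 800 MHz and 32 ROPs for PS4 (after a hypothetical reservation),
# 12 CUs minus ~10% OS reservation (= 10.8) @ 853 MHz and 16 ROPs for XB1.
ps4_cus, ps4_clock, ps4_rops = 15, 800e6, 32
xb1_cus, xb1_clock, xb1_rops = 10.8, 853e6, 16

shading_gain = ps4_cus * ps4_clock / (xb1_cus * xb1_clock) - 1
fill_gain = ps4_rops * ps4_clock / (xb1_rops * xb1_clock) - 1
print(f"shading: +{shading_gain:.0%}, pixel fill: +{fill_gain:.0%}")
# -> shading: +30%, pixel fill: +88% - the '30-90%' range quoted above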
 

bishoptl

Banstick Emeritus
Only saying it now so I don't have to repeat it. I'm not posting a link, because it isn't a link. It's just information that I think proves what I'm saying is true beyond doubt, but it also carries the risk of getting someone in trouble or violating their trust, which is why I can't just say everything, and why immediately after I said it, I said I would have no issue sharing the information with a mod if it made people more comfortable that I'm being transparent and honest.
Cool, you know where to find me then.
 
Cool, you know where to find me then.

Just sent.

He was just making clear that performance doesn't scale linearly, which is true. He was saying that fact might encourage devs to use that performance for compute instead of rendering. You aren't forced into anything. I don't think devs would be reporting 40-50% gains over XBO performance if 14 was the max they could use efficiently.

Also this:


http://www.scei.co.jp/corporate/release/130221a_e.html

Keep in mind I didn't say it couldn't be used however a dev wished. Just what Sony presented to devs about a bit of diminishing returns for graphics operations after the 14 CU mark. Devs if they wish can use all 18 for graphics, but apparently there's a significant drop off in the benefit that you see for the additional ALU resources for graphics past the 14 CU mark, which is why Sony suggests the most optimal use for the extra 4 CUs is to use them for compute work.
 
MS must genuinely be freaking out about the negative PR coming out of their '20% but 50% less' game console considering how much they're trying to get out ahead of this, but the numbers don't lie, and the gap is pretty substantial.

They keep trying to defend it, which means they're scared it will put off potential buyers; they know people don't want to spend $60 on a game and get an inferior product, let alone $500 up front for inferior hardware, and end up hugely dissatisfied customers. So they're trying to muddy the waters as much as possible, hoping Xbone buyers will be happy with their product.

Also, take a drink every time the guy refers to 'balance' in the architecture. I can only assume a small team of people at MS spent at least a few hours brainstorming a term they could use to make the situation more vague and questionable.

They're desperate and clutching at straws.
 
What exactly is being sent if what we're currently discussing is just a verbatim repetition of an interpretation of a developer presentation and not any form of official document? Like an ordinary PM with the same information that's been posted in the thread? :/

Oh well, iirc, bish is a developer as well...? So it would be interesting to get his take on this all anyway.
 

onQ123

Member
What exactly is being sent if what we're currently discussing is just a verbatim repetition of an interpretation of a developer presentation and not any form of official document? Like an ordinary PM :/

Yeah, he claims to know something and is only willing to give his word, no links... I hope the mods don't go too easy on that shit.
 

Fafalada

Fafracer forever
gofreak said:
They're asking devs to check if they're under-using ALU, holding the fairly attractive suggestion that quite a lot of power could be spare without hugely affecting render performance.
The whole notion of a discrete distribution of CUs in this conversation is stupid though. Async compute is there so you don't have to do that.
E.g. in a deferred renderer, all CUs will be effectively idle during the Z-prepass (meaning all ~18 would be available for compute during it), and all of them could be utilized during the shading pass.
Sure, it may average out to 4:14 over the course of a frame - but if you discretely partition it, chances are you're losing performance on both sides.
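To illustrate the point with completely made-up pass timings (none of these numbers come from a real profile), rendering leaves a big chunk of CU-time idle across a frame, and async compute fills those gaps pass by pass instead of fencing off a fixed set of CUs:

Code:
# Illustrative sketch of the point above. The pass timings and CU
# occupancy below are invented for the example, not real profile data.
TOTAL_CUS = 18

# (pass name, duration in ms, CUs the render work actually keeps busy)
frame = [
    ("z-prepass",    2.0,  2),   # ROP/fixed-function bound, ALU mostly idle
    ("shading pass", 8.0, 18),   # ALU-heavy, wants every CU
    ("post-process", 4.0, 12),
    ("UI / misc",    2.0,  4),
]

busy = sum(ms * cus for _, ms, cus in frame)
total = sum(ms for _, ms, _ in frame) * TOTAL_CUS
print(f"CU-time left idle by rendering: {1 - busy / total:.0%}")
# -> ~29% with these made-up numbers. Async compute can soak that up
# pass by pass; a static 14+4 partition would starve the shading pass
# while leaving CUs idle during the z-prepass.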
 

gofreak

GAF's Bob Woodward
Of course, as also noted above, that doesn't seem to be the sole claim being made here - the claim seems to be more that the PS4 is somehow different from desktop GPUs, and that beyond using 14 of its CUs for graphics the performance increase plummets to the point of being negligible.

I'm not sure that claim is being made. The same kind of behaviour - non-linear return on ALU in 'typical games' - can be observed on PC, in that DF comparison article for example.

The whole notion of a discrete distribution of CUs in this conversation is stupid though. Async compute is there so you don't have to do that.
E.g. in a deferred renderer, all CUs will be effectively idle during the Z-prepass (meaning all ~18 would be available for compute during it), and all of them could be utilized during the shading pass.
Sure, it may average out to 4:14 over the course of a frame - but if you discretely partition it, chances are you're losing performance on both sides.


Sure, but it's just easier to talk about in terms of discrete CUs to try and quantify the amount of performance that could be going a-begging. But yeah, that is worth clarifying. No one's going to actually be partitioning CUs, I don't think that's even possible through the API (?)
 

Thorgal

Member
What exactly is being sent if what we're currently discussing is just a verbatim repetition of an interpretation of a developer presentation and not any form of official document? Like an ordinary PM :/

Oh well, iirc, bish is a developer as well...? So it would be interesting to get his take on this all anyway.

I believe he is a developer for a well-known studio, but for obvious reasons doesn't say which one.

So it would be easy for him to check through his connections whether someone is making stuff up.

Feel free to correct me if I got this wrong, Bish.
 
so to paraphrase

"you could use those extra CUs for graphics, but you'll hit a bit of diminishing returns - wouldn't that be a waste? Sure your game will look a little better, but imagine how amazing it'll look if you explored using those CUs for GPGPU instead. All that amazing physics, AI, fluid dynamics - it'll look fantastic"

That's how I understand it, yes.


And IIRC, these 4 extra CUs are not the same as the other 14; they're specialized for compute, which would make it an even bigger waste.
 
And IIRC, these 4 extra CUs are not the same as the other 14; they're specialized for compute, which would make it an even bigger waste.

They're not specialized for anything, I can't believe we're back to this.
This has been debunked and explained already.

EDIT: Fafalada just gave us a perfect example of how things can go.
 
The whole notion of a discrete distribution of CUs in this conversation is stupid though. Async compute is there so you don't have to do that.
E.g. in a deferred renderer, all CUs will be effectively idle during the Z-prepass (meaning all ~18 would be available for compute during it), and all of them could be utilized during the shading pass.
Sure, it may average out to 4:14 over the course of a frame - but if you discretely partition it, chances are you're losing performance on both sides.

Wouldn't this apply to any number of CUs? So it would be the case with XO too? That's not what we're tinkering with here. And wouldn't a constant allocation, say 4 CUs for compute, be better, since you'd have more reliable access to resources? Are engines more forward or deferred these days? How would forward-rendered engines fare on XO and PS4?
 

jcm

Member
Keep in mind I didn't say it couldn't be used however a dev wished. Just what Sony presented to devs about a bit of diminishing returns for graphics operations after the 14 CU mark. Devs if they wish can use all 18 for graphics, but apparently there's a significant drop off in the benefit that you see for the additional ALU resources for graphics past the 14 CU mark, which is why Sony suggests the most optimal use for the extra 4 CUs is to use them for compute work.

But it doesn't make sense that 14 is some kind of magic number. It will completely depend on your game engine. That's like someone saying "after 6 GB of RAM you have diminishing returns in a PC". Well, sure, maybe if you're only using it to read your email, but if you're running a DB on that machine you'll easily soak up the 6 GB.

And you didn't even claim diminishing returns originally; you said this:
"The exact information was that given the rest of the design there is a huge knee in the performance curve, and anything beyond that point there is a significant drop off in the apparent value that you get from additional ALU resources for graphics, and the PS4 is said to be well to the right of that knee.
 
I'm glad you learned your lesson SenjutsuSage. Spinning makes your head dizzy.

I did :p

But it doesn't make sense that 14 is some kind of magic number. It will completely depend on your game engine. That's like someone saying "after 6 GB of RAM you have diminishing returns in a PC". Well, sure, maybe if you're only using it to read your email, but if you're running a DB on that machine you'll easily soak up the 6 GB.

And you didn't even claim diminishing returns originally; you said this:

That describes diminishing returns exactly. There is a drop-off in the value of the extra ALU resources for graphics after a certain point, that point being 14 CUs.
 
Mr Sage isn't banned yet. Docs real?

Hopefully bish gives us his interpretation of it as I think there is some truth to what Senjutsu said but likely not quite as big a deal as originally suggested

Curious to know what the returns are like from those 4 CU's

Although not sure I'd understand it to be honest
 

USC-fan

Banned
Just sent.



Keep in mind I didn't say it couldn't be used however a dev wished. Just what Sony presented to devs about a bit of diminishing returns for graphics operations after the 14 CU mark. Devs if they wish can use all 18 for graphics, but apparently there's a significant drop off in the benefit that you see for the additional ALU resources for graphics past the 14 CU mark, which is why Sony suggests the most optimal use for the extra 4 CUs is to use them for compute work.
Nothing about diminishing returns was ever stated by anyone but MS PR guys. The changes made by Sony would not be needed if you were always going to have some CUs just for GPGPU work. Not buying any of this nonsense.
 

viveks86

Member
Didn't Senjutsu PM a bunch of others as well? Does anyone have a take on it? Are you allowed to talk about it, if not quote? I'm new here, so I have no idea how that works! :D
 

Fafalada

Fafracer forever
gofreak said:
Sure, but it's just easier to talk about in terms of discrete CUs to try and quantify the amount of performance that could be going a-begging.
Yes, but the distinction is important when arguing about "wasted CU potential in graphics", as scenarios where you target max consumption at different parts of the frame are quite viable in closed boxes, even if bandwidth/fill suggests you can't in the average case.

NTIETS13A_Superiority said:
And wouldnt constant, say 4 CUs for compute better as you have more reliable access to resources?
No. Games are inherently spiky when it comes to performance; nothing is better off with an evenly distributed average, and that's the entire reason we've had unified shaders since 2005.
But yes, XB1 will operate like that as well (barring the differences in approach to ACU by each company).
 

goonergaz

Member
I did :p



That explains exactly diminishing returns. There is a drop off in the value of the extra ALU resources for graphics after a certain point, that point being 14 CUs.

Tbh my interpretation (fwiw) was that 14 was already past a very sharp drop off...now you're sounding like "a bit of a drop past 14" although I confess I have no idea where 'right of the knee' is
 

onQ123

Member
And IIRC, these 4 extra CUs are not the same as the other 14; they're specialized for compute, which would make it an even bigger waste.

They are all the same.


Basically, 14 CUs is good enough for 1080p @ 60fps using them just for fixed-function graphics, so the other 4 could get you something like 1080p @ 75fps or 1400p @ 60fps.

But why would you do that if those aren't standard targets? So you'd be better off using the 4 CUs to add more effects using compute.
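Rough arithmetic behind those numbers, for what it's worth (this assumes framerate and pixel count scale linearly with CU count, which the rest of the thread argues they generally don't):

Code:
# onQ123's numbers, assuming perfectly linear scaling with CU count
# (which the thread argues is optimistic).
scale = 18 / 14                                            # ~1.29x the CUs
print(f"60 fps * {scale:.2f} = {60 * scale:.0f} fps")      # ~77, i.e. "something like 75"
print(f"1080 lines * {scale:.2f} = {1080 * scale:.0f}")    # ~1389, i.e. roughly "1400p"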
 

gofreak

GAF's Bob Woodward
Hopefully bish gives us his interpretation of it as I think there is some truth to what Senjutsu said but likely not quite as big a deal as originally suggested

Curious to know what the returns are like from those 4 CU's

Although not sure I'd understand it to be honest

The DF comparison that tried to isolate 50%+ ALU showed a range of returns in 5 of today's games at some different levels of settings and effects. Those games showed a circa 17-30% improvement with 50% more ALU.

i.e. non-linear, 'might be better off spending it on something else' in some cases, but still fairly substantial
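Put another way (quick sketch using the figures quoted above):

Code:
# What fraction of perfectly linear scaling the extra ALU delivered,
# using the 17-30% range from the DF comparison quoted above.
extra_alu = 0.50
for gain in (0.17, 0.30):
    print(f"+{gain:.0%} fps from +{extra_alu:.0%} ALU -> {gain / extra_alu:.0%} of linear")
# -> roughly 34% to 60% of linear: clearly non-linear, but far from nothing.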
 