AmyS · Apr 8, 2017

If anything hinders Scorpio from pulling off native 4K consistently across a wide range of games, it's probably going to be due to having only 32 ROPs. I was expecting 64.

Good article.

Leyasu · Apr 8, 2017

AmyS said:
If anything hinders Scorpio from pulling off native 4K consistently across a wide range of games, it's probably going to be due to having only 32 ROPs. I was expecting 64.

Good article.

Yeah, 32 ROPs seems a little light. Has this been confirmed by MS?

JaggedSac · Apr 8, 2017

Shame MS didn't listen to AMD when they said, "You know, we tried that once, didn't work out, not worth it."

timlot · Apr 8, 2017

Well that didn't take long. He must want a trip to Redmond too. I'm sure Ars Technica won't be far behind with their blind hot take.

STEaMkb · Apr 8, 2017

Space_nut said:
Oh no not the fp16 doubling power theory

Wishmaster92 said:
Well it's not really blowing smoke, it gave DICE a 30% performance improvement.

VooFoo Studios also:

On paper though, the technical accomplishment here is impressive. VooFoo has quadrupled resolution over the base PS4 version, and it has done this using a GPU that only has 2.3x the compute power of the older hardware and only 25 per cent more memory bandwidth. Either the base PS4 is being significantly underutilised (in which case we would expect an improvement on its 2x MSAA) or there's something more going on behind the scenes. The team joked about 'Mantis magic' before revealing that exploiting enhancements made to the PlayStation 4 Pro GPU have paid dividends.

Of course, we already knew that the Pro graphics core implements a range of new instructions - it was part of the initial leak - but we didn't really know exactly what they could actually do. As we understand it, with the new enhancements, it's possible to complete two 16-bit floating point operations in the time taken to complete one on the base PS4 hardware. The end result from the new Radeon technology is the additional throughput required to make Mantis Burn Racing hit its 4K performance target.

It has limited applicability though so expectations must be tempered.
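To put the quoted numbers in perspective, here is a rough sketch of the per-pixel budget implied by the figures in the excerpt (4x the pixels, 2.3x the compute, 25 per cent more bandwidth). The variable names are made up for illustration, and the model assumes cost scales linearly with pixel count, which real games only approximate.

```python
# Scaling factors taken from the quoted Digital Foundry excerpt.
RESOLUTION_SCALE = 4.0   # 4K has 4x the pixels of 1080p
COMPUTE_SCALE = 2.3      # Pro GPU compute relative to base PS4
BANDWIDTH_SCALE = 1.25   # "25 per cent more memory bandwidth"

# Resources available per rendered pixel, relative to the base PS4 (1.0 = same).
compute_per_pixel = COMPUTE_SCALE / RESOLUTION_SCALE
bandwidth_per_pixel = BANDWIDTH_SCALE / RESOLUTION_SCALE

print(f"compute per pixel:   {compute_per_pixel:.3f}x of base PS4")
print(f"bandwidth per pixel: {bandwidth_per_pixel:.4f}x of base PS4")
```

Under that linear assumption each pixel gets well under the base PS4's budget, which is why the excerpt suggests something extra (such as double-rate FP16) has to make up part of the difference.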

wachie · Apr 8, 2017

Klocker said:
This isn't just a matter of DF vs Anand and what MS told them but they actually are publishing these specs on their page so I doubt they would be so confident if it were some bullshit

http://www.xbox.com/en-US/project-scorpio

Didn't they have the 204 GB/s for the Xbox One too?

timlot said:
Well that didn't take long. He must want a trip to Redmond too. I'm sure Ars Technica won't be far behind with their blind hot take.

lol, you got me.

Locuza · Apr 9, 2017

MisterXDTV said:
Yeah, but it's still a RX 480 with Vega enhancements[...]

Currently not one specific thing was mentioned in this regard, not even FP16.

Collingwood said:
But why state 326 if it can't be used?

That's a big gap.

They can be used if the game needs a certain format and this topic applies to all other consoles and hardware as well:

For the Scorpio:
1172 x 32 x 4 Bytes = 150 GB/s
1172 x 32 x 8 Bytes = 300 GB/s
1172 x 32 x 16 Bytes = 600 GB/s
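
The arithmetic above can be sketched in a few lines. This is only a theoretical peak under the figures quoted in this post (1172 MHz, 32 ROPs, one pixel per ROP per clock); the function name is made up for illustration.

```python
def rop_bandwidth_gb_s(clock_mhz: float, rops: int, bytes_per_pixel: int) -> float:
    """Theoretical peak ROP output in GB/s (1 GB = 1000 MB).

    Assumes every ROP writes one pixel of the given size each clock.
    """
    return clock_mhz * rops * bytes_per_pixel / 1000.0

SCORPIO_CLOCK_MHZ = 1172
SCORPIO_ROPS = 32

# 4, 8 and 16 bytes per pixel correspond to the three lines above.
for bytes_per_pixel in (4, 8, 16):
    bw = rop_bandwidth_gb_s(SCORPIO_CLOCK_MHZ, SCORPIO_ROPS, bytes_per_pixel)
    print(f"{bytes_per_pixel:2d} bytes/pixel: {bw:.1f} GB/s")
```

Only the widest formats push past the 326 GB/s stated for the memory interface, so whether the ROPs or the bus is the limit depends on the render target format.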

The CPU also needs a few GB/s and then you lose some because of memory contention and other inefficiencies.

And Tahiti wasn't the last chip which used 384-Bit and 32 ROPs, there was also Tonga.

I wouldn't be too surprised about 32 ROPs being tied to a 384-Bit wide bus, but 2048 KB L2$ on it:

Lom1lo said:

Is this good or not ?

In practice it might not matter much, but in theory it's suboptimal.
The L2$ slices are tied to the memory controllers, so for even load distribution you arrange L2$ slices with the same capacity across the memory controllers.
With 6 memory controllers you would expect 512 KB of L2$ per MC, 3 MB in total (or 256 KB per MC for 1.5 MB), and not 2 MB.

Another thing which makes this interesting is Vega.
Currently the ROPs are clients of the memory controllers, but on GCN they have an extra interconnect that makes it possible to scale them independently of the number of memory controllers.
Vega will change this: the ROPs will be clients of the L2$, and for equal load balancing the ROPs will be partitioned to match the L2$ slices, which in turn are partitioned to match the MCs.
https://www.extremetech.com/wp-content/uploads/2017/01/RenderL2.jpg

Scorpio must have 6 L2$ slices, which would mean 48 ROPs (4 per slice) if the ROPs were tied to the L2$. It has 32 ROPs, which means that Scorpio probably uses the current GCN backend design (up to Polaris) and not Vega's.

Although MS could make something like this:
4 ROPs for 4 256KB L2$ slices to 4 memory controllers
8 ROPs for 2 512KB L2$ slices to 2 memory controllers

You would get 32 ROPs, 2MB L2$ and a 384-Bit wide bus.
But I wouldn't bet on this scenario.
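
As a sanity check on that hypothetical layout, here is a small sketch that adds up the totals. The `SliceGroup` type and `totals` function are invented for illustration; the 64 bits per controller simply follows from a 384-bit bus split across 6 memory controllers.

```python
from dataclasses import dataclass

BITS_PER_CONTROLLER = 64  # 384-bit bus / 6 memory controllers

@dataclass
class SliceGroup:
    controllers: int      # memory controllers in this group, one L2$ slice each
    l2_kb_per_slice: int  # L2$ capacity per slice in KB
    rops_per_slice: int   # ROPs attached to each slice

def totals(groups):
    """Return (total ROPs, total L2$ in KB, bus width in bits)."""
    rops = sum(g.controllers * g.rops_per_slice for g in groups)
    l2_kb = sum(g.controllers * g.l2_kb_per_slice for g in groups)
    bus_bits = sum(g.controllers for g in groups) * BITS_PER_CONTROLLER
    return rops, l2_kb, bus_bits

# The mixed layout speculated about above: 4 small slices plus 2 large ones.
mixed = [SliceGroup(4, 256, 4), SliceGroup(2, 512, 8)]
print(totals(mixed))  # (32, 2048, 384)
```

The numbers do add up to 32 ROPs, 2 MB and 384 bits, but the slices are uneven, which is exactly the load-balancing concern raised earlier in the post.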

And for those people who asked if 32 ROPs are confirmed:

"As you can see, we doubled the amount of shader engines. That has the effect of improvement of boosting our triangle and vertex rate by 2.7x when you include the clock boost as well. We doubled the number of render back-ends, which has the effect of increasing our fill-rate by 2.7x. We quadrupled the GPU L2 cache size, again for targeting the 4K performance."

http://www.eurogamer.net/articles/digitalfoundry-2017-project-scorpio-tech-revealed

Angelus Errare · Apr 9, 2017

Wait, I thought Scorpio had 40 ROPs?

Also, PS4Pro using Vega enhancements while Scorpio is just a Polaris part (paraphrasing from the article)? Hmmm, I'm skeptical just a tad. Anandtech is very thorough in their observations and hardware breakdowns, but I dunno...skeptical.

Bsigg12 · Apr 9, 2017

Angelus Errare said:
Wait, I thought Scorpio had 40 ROPs?

It does. I don't know where people are getting 32.

It also says it in the OP.

wachie · Apr 9, 2017

Bsigg12 said:
It does. I don't know where people are getting 32.

Compute Units are not ROPs.

Angelus Errare · Apr 9, 2017

wachie said:
Compute Units are not ROPs.

Derp, Mandela effect on my part. Should have just checked DF instead of going off memory.

Space_nut · Apr 9, 2017

Collingwood said:
But why state 326 if it can't be used?

That's a big gap.

They're guessing. They don't even have the full detailed specs for the chip

mugurumakensei · Apr 9, 2017

Locuza said:
And for those people who asked if 32 ROPs are confirmed:

That doesn't confirm anything technically as it says double the rendering backends increased fill-rate by 2.7x. I'm thinking we should be taking that statement as >= 2x XBox One ROPs and not = 2x XBox One ROPs although the same statement is used for shaders. However, we got an official published number for the latter but not the former.

geordiemp · Apr 9, 2017

Guerrillas in the Mist said:
I think my satire detection is broken now: is this the new "secret sauce"?

http://gamingbolt.com/mass-effect-a...kerboard-rendering-30-improvement-due-to-fp16

Fp16 frostbite = 30 % ?

Ushay · Apr 9, 2017

Great read.

Will be interesting to see the hardware in action. I cannot wait.

GTR R35_Supra RZ · Apr 9, 2017

Primethius said:
I got no issues with Scorpio. Xbox? Maybe. But that's cause I'm a long time Xbox fan from the OG days so I'm bound to have issues.

Here though, I was just making an observation on the prediction/posts.

👊✌💪..all good no harm.

Space_nut · Apr 9, 2017

This article is just second hand analysis from the same DF article. Nothing but guesses. Was hoping they had detailed info on the actual chip and not trying to guess what's inside

wachie · Apr 9, 2017

Space_nut said:
They're guessing. They don't even have the full detailed specs for the chip

They're going off of a GPU that had a similar setup, it's not guessing. FWIW AMD dumped that design (decoupling ROPs from memory controllers) in their next architecture with Hawaii. It hasn't reappeared in Fiji or Polaris either.

The situation is more complicated as the same bandwidth will be shared with the CPU cores in the case of Scorpio. Anandtech article is asking the right questions.

Space_nut said:
This article is just second hand analysis from the same DF article. Nothing but guesses. Was hoping they had detailed info on the actual chip and not trying to guess what's inside

They are taking the same information presented by DF and breaking it down without any superlatives or buzzwords. For the unknowns, they have raised the questions.

quest · Apr 9, 2017

Angelus Errare said:
Wait, I thought Scorpio had 40 ROPs?

Also, PS4Pro using Vega enhancements while Scorpio is just a Polaris part (paraphrasing from the article)? Hmmm, I'm skeptical just a tad. Anandtech is very thorough in their observations and hardware breakdowns, but I dunno...skeptical.

Why? It is clear MS's goal was 6 TF in the smallest package possible. So instead of making the die larger with Vega enhancements they went mean and lean. Also, keeping it smaller helps with thermals. It will be interesting to see how this works out: 4.2 TF with more advanced tech vs 6 TF of older tech.

Space_nut · Apr 9, 2017

wachie said:
They're going off of a GPU that had a similar setup, it's not guessing. FWIW AMD dumped that design (decoupling ROPs from memory controllers) in their next architecture with Hawaii. It hasn't reappeared in Fiji or Polaris either.

The situation is more complicated as the same bandwidth will be shared with the CPU cores in the case of Scorpio. Anandtech article is asking the right questions.

But we need actual chip design analysis to see what exactly has been done. It's good to replicate something from a desktop gpu but let's wait till everyone has their hands on actual chips to evaluate it. Right now it's still not a hard definitive on what's on the chip. Even says in the article mostly speculating

Space_nut · Apr 9, 2017

quest said:
Why? It is clear MS's goal was 6 TF in the smallest package possible. So instead of making the die larger with Vega enhancements they went mean and lean. Also, keeping it smaller helps with thermals. It will be interesting to see how this works out: 4.2 TF with more advanced tech vs 6 TF of older tech.

Who says what pro has isn't in Scorpio already?

Locuza · Apr 9, 2017

mugurumakensei said:
That doesn't confirm anything technically as it says double the rendering backends increased fill-rate by 2.7x. I'm thinking we should be taking that statement as >= 2x XBox One ROPs and not = 2x XBox One ROPs although the same statement is used for shaders. However, we got an official published number for the latter but not the former.

They literally said they doubled the render backend, coupled with the increased clock speeds of the ROPs you get ~2.7x more pixel fillrate.

16 ROPs x 853 MHz ~ 13,648 MPix/s
32 ROPs x 1172 MHz ~ 37,504 MPix/s (2.7x)
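
For anyone who wants to check the ratio, a minimal sketch of the same calculation follows. It assumes one pixel per ROP per clock, and the function name is made up for illustration.

```python
def pixel_fill_rate_mpix(rops: int, clock_mhz: int) -> int:
    """Theoretical peak fill rate in MPix/s, assuming one pixel per ROP per clock."""
    return rops * clock_mhz

xbox_one = pixel_fill_rate_mpix(16, 853)   # 16 ROPs at 853 MHz
scorpio = pixel_fill_rate_mpix(32, 1172)   # 32 ROPs at 1172 MHz
ratio = scorpio / xbox_one

print(f"Xbox One: {xbox_one} MPix/s")
print(f"Scorpio:  {scorpio} MPix/s")
print(f"Ratio:    {ratio:.2f}x")
```

Doubling the ROPs alone gives 2x; the rest of the roughly 2.7x comes from the clock going from 853 to 1172 MHz.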

quest · Apr 9, 2017

Space_nut said:
Who says what pro has isn't in Scorpio already?

Because MS has kept quiet on that front. If it had those advanced features we would have heard about them in the DF article. Instead it was a generic 60 custom features. Makes sense honestly, they hit their goals. The question now is whether designing around the 6 TF number was worth it.

Space_nut · Apr 9, 2017

quest said:
Because MS has kept quiet on that front. If it had those advanced features we would have heard about them in the DF article. Instead it was a generic 60 custom features. Makes sense honestly, they hit their goals. The question now is whether designing around the 6 TF number was worth it.

I'm sure DF will be releasing more info on the 60 custom features. MS made all upgrades to Polaris tech before they built the chip to handle any previous bottlenecks. I'm sure they have the same tech as pro plus a few more. Whatever DF wrote isn't all there is. Let's wait till we get the full details on the chip design

wachie · Apr 9, 2017

Space_nut said:
But we need actual chip design analysis to see what exactly has been done. It's good to replicate something from a desktop gpu but let's wait till everyone has their hands on actual chips to evaluate it. Right now it's still not a hard definitive on what's on the chip. Even says in the article mostly speculating

And you're the same person who's pushing the 1070 based off of one game. Mkay.

BlackClouds · Apr 9, 2017

Niks said:
I dont understand?

GAF told me the scorpio was more powerful than a gtx 1070?

You know, I wonder if it had Vega compute units would it be approaching that power I guess that's down the drain since it's not using Vega.

Space_nut · Apr 9, 2017

wachie said:
And you're the same person who's pushing the 1070 based off of one game. Mkay.

DF got hands on the tech. What they wrote in the initial article wasn't everything. The article you posted is basing their info from what DF disclosed. Again if they had design docs and first hand then fine. I just see a article basing on info from DF article that didn't go into full details on the design

Locuza · Apr 9, 2017

wachie said:
They're going off of a GPU that had a similar setup, it's not guessing. FWIW AMD dumped that design (decoupling ROPs from memory controllers) in their next architecture with Hawaii. It hasn't reappeared in Fiji or Polaris either.
[...]

AMD used this design again with Tonga (GCN Gen 3):
http://neogaf.com/forum/showpost.php?p=233597567&postcount=107

AMD even built a card with the fully enabled chip but it never entered mass production:
http://www.overclock.net/t/1583638/amd-r9-285-380-380x-tonga-tonga-xt-owners-discussion-thread/440#post_25633363

+ his GPU-Z validation:
https://www.techpowerup.com/gpuz/details/anr8p

Jaguar Victory · Apr 9, 2017

Deku Tree said:
It sounds like a lot of this extra memory bandwidth would only possibly be taken advantage of by first party developers.

No multiplat is going to go out of their way to code to the metal just for Scorpio.

Engines. How do they work?

MisterXDTV · Apr 9, 2017

Locuza said:
Currently not one specific thing was mentioned in this regard, not even FP16.

I was talking about PS4 Pro, great post by the way

fade_ · Apr 9, 2017

Syrus said:
They wouldn't state that memory speed if they can't use it. I hope this gets clarified

The average consumer always thinks more is better, so it makes sense to state it regardless.

Jaguar Victory · Apr 9, 2017

SRTtoZ said:
It's no surprise that even the far back leaks always lead to an overclocked RX 480 and it still holds true today. I don't know why people expected Sony and MS to bump up specs to godly levels last minute. These consoles need to be cost efficient.

It's not overclocked. It's clocked lower actually. Just has 4 more CUs.

MisterXDTV · Apr 9, 2017

Jaguar Victory said:
It's not overclocked. It's clocked lower actually. Just has 4 more CUs.

For a console, it's overclocked: 1172 MHz is a lot. Usually it stays below 1 GHz (One was 853 MHz, One S 914, PS4 Pro 911)

timlot · Apr 9, 2017

Syrus said:
They wouldn't state that memory speed if they can't use it. I hope this gets clarified

Don't think there is much to clarify...

"For 4K assets, textures get larger and render targets get larger as well. This means a couple of things - you need more space, you need more bandwidth. The question, though, was how much?" asks Nick Baker, Distinguished Engineer, Silicon. "We'd hate to build this GPU and then end up having to be memory-starved. So all the analysis that Andrew was talking about, we were able to look at the effect of different memory bandwidths, and it quickly led us to needing more than 300GB/s memory bandwidth. So in the end we ended up choosing 326GB/s. On Scorpio we are using a 384-bit GDDR5 interface - that is 12 channels. Each channel is 32 bits."

wachie · Apr 9, 2017

Locuza said:
AMD used this design again with Tonga (GCN Gen 3):
http://neogaf.com/forum/showpost.php?p=233597567&postcount=107

AMD even built a card with the fully enabled chip but it never entered mass production:
http://www.overclock.net/t/1583638/amd-r9-285-380-380x-tonga-tonga-xt-owners-discussion-thread/440#post_25633363

+ his GPU-Z validation:
https://www.techpowerup.com/gpuz/details/anr8p

It always was a 256b chip, wasnt it? Even though built as 384b chip.

Dictator93 · Apr 9, 2017

geordiemp said:
http://gamingbolt.com/mass-effect-a...kerboard-rendering-30-improvement-due-to-fp16

Fp16 frostbite = 30 % ?

30% for a part of the checkerboarding, not total rendering time

Locuza · Apr 9, 2017

wachie said:
It always was a 256b chip, wasnt it? Even though built as 384b chip.

It was only sold as a 256-Bit product but there are 6 MCs and 32 ROPs just like on Tahiti (GCN Gen 1).

Tonga's die-shot:

wachie · Apr 9, 2017

Locuza said:
It was only sold as a 256-Bit product but there are 6 MCs and 32 ROPs just like on Tahiti (GCN Gen 1).

Tonga's die-shot:
https://www.3dnews.ru/assets/external/illustrations/2015/06/08/915323/tonga-crys800.jpg

Yes, I think AMD also said something to the effect that there wasn't a payoff to releasing it as a 384b design.

I remember Tahiti had some issues hitting its peak memory bandwidth, it was like ~70% at most compared to the 80%+ of other GPUs. I cant find those results unfortunately since that chip is so old now.

Fredrik · Apr 9, 2017

Asherdude said:
They didn't see it in action either. So no benchmarks. I'm trusting DF's take on it for now.

lol yeah this is exactly what DF predicted would've happened in one of their videos if the specs would've just leaked the regular way and analyzed on the paper with no inside knowledge. DF might still hype certain things up a bit too much though.

MisterXDTV · Apr 9, 2017

Locuza said:
It was only sold as a 256-Bit product but there are 6 MCs and 32 ROPs just like on Tahiti (GCN Gen 1).

So in the end, do you think Anandtech is right thinking Scorpio will struggle to use the full bandwidth in many cases?

PuppetMaster · Apr 9, 2017

MisterXDTV said:
So in the end, do you think Anandtech is right thinking Scorpio will struggle to use the full bandwidth in many cases?

It should not even be possible to fully max it with 32 rops.

I am starting to think Scorpio was initially designed as an 8-chip x 1 GB GDDR5 machine that got bumped up to 12 chips at the last minute, because 32 ROPs seems clearly best suited to 8.

MisterXDTV · Apr 9, 2017

PuppetMaster said:
It should not even be possible to fully max it with 32 rops.

I am starting to think Scorpio was initially designed as an 8-chip x 1 GB GDDR5 machine that got bumped up to 12 chips at the last minute, because 32 ROPs seems clearly best suited to 8.

Well, the rendering from last June suggested 12 GB from the beginning, so I'm confused

quest · Apr 9, 2017

PuppetMaster said:
It should not even be possible to fully max it with 32 rops.

I am starting to think Scorpio was initially designed as an 8-chip x 1 GB GDDR5 machine that got bumped up to 12 chips at the last minute, because 32 ROPs seems clearly best suited to 8.

No way the 384 bit bus was there from the initial design. They wanted a ton of bandwidth and more than 8 gigs of ram. 16 gigs of GDdr5x was never a realistic option.

PuppetMaster · Apr 9, 2017

MisterXDTV said:
Well, the rendering from last June suggested 12 GB from the beginning, so I'm confused

Well maybe not that last min then.

But like the article said. We want the width of the total ROPs and memory bus to be 1:1 for full utilization. 8 chips would have been 1:1

Proelite · Apr 9, 2017

PuppetMaster said:
Well maybe not that last min then.

But like the article said. We want the width of the total ROPs and memory bus to be 1:1 for full utilization. 8 chips would have been 1:1

Sounds like they should have gone with 16 chips of GDDR5X in clamshell on a 256-bit bus for 16 GB, because I have 20/20 hindsight and know more than Nick Baker.

Locuza · Apr 9, 2017

wachie said:
[...]
I remember Tahiti had some issues hitting its peak memory bandwidth, it was like ~70% at most compared to the 80%+ of other GPUs. I cant find those results unfortunately since that chip is so old now.

I will throw two things in here:

ROP rates, of course, include the pixel fill rate and, more crucially these days, the amount of blending power for multisampled antialiasing. The 7970 is barely faster than the 6970 on this front because it sports the same basic mix of hardware: eight ROP partitions, each capable of outputting four colored pixels or 16 Z/stencil pixels per clock. Rather than increasing the hardware counts here, AMD decided on a reorganization. In previous designs, two ROP partitions (or render back-ends) were associated with each memory controller, but AMD claims the memory controllers were "oversubscribed" in that setup, leaving the ROPs twiddling their thumbs at times. Tahiti's ROPs are no longer associated with a specific memory controller. Instead, the chip has a crossbar allowing direct, switched communication between each ROP partition and each memory controller. (The ROPs are not L2 cache clients, incidentally.) With this increased flexibility and the addition of two more memory controllers, AMD claims Tahiti's ROPs should achieve up to 50% higher utilization and thus efficiency. Higher efficiency is a good thing, but the big question is whether Tahiti's relatively low maximum ROP rates will be a limiting factor, even if the chip does approach its full potential more frequently.

http://techreport.com/review/22192/amd-radeon-hd-7970-graphics-processor/4

And here is a performance table with a fill rate test from 3D Mark Vantage:

http://techreport.com/review/25509/amd-radeon-r9-290x-graphics-card-reviewed/6

Tahiti really pushes much more than Cayman (6970) and Pitcairn, which also has 32 ROPs but only a 256-Bit interface.
Hawaii has 64 ROPs (+100%) but only 320 GB/s (+11%) vs. 288 GB/s which results in only 21% more fill rate, at least in the 3D Mark test.

Other tests and formats might show different results but the backend design with disproportionate ROP/MC distribution doesn't look problematic.
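
To make the Hawaii vs. Tahiti comparison concrete, here is a quick sketch of the percentages using only the figures in this post. The helper name is made up, and the 21% is the 3D Mark Vantage result cited above, not something computed here.

```python
def percent_gain(new: float, old: float) -> float:
    """Relative increase of `new` over `old`, in percent."""
    return (new / old - 1.0) * 100.0

rop_gain = percent_gain(64, 32)     # Hawaii vs. Tahiti ROP count
bw_gain = percent_gain(320, 288)    # Hawaii vs. Tahiti memory bandwidth in GB/s
observed_fill_gain = 21.0           # fill rate gain reported in the linked test

print(f"ROP count:        +{rop_gain:.0f}%")
print(f"Memory bandwidth: +{bw_gain:.1f}%")
print(f"Measured fill:    +{observed_fill_gain:.0f}%")
```

The measured gain lands much closer to the bandwidth increase than to the ROP increase, which fits the reading that this particular test is mostly bandwidth-bound rather than held back by the ROP/MC arrangement.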

MisterXDTV said:
So in the end, do you think Anandtech is right thinking Scorpio will struggle to use the full bandwidth in many cases?

Not in the way Anandtech speculates, because the interconnect is solid and then you are simply ROP-bound or BW-bound.
In my earlier post you can see that the ROPs can push 150 GB/s worth of data, 300 GB/s or even 600 GB/s, depending on the format.
I don't know in which proportions games use certain formats, but either way there is no hardware issue or special bottleneck coming from the design.
As I mentioned before, I'm more curious about the L2$ distribution and how it works together with the ROPs.

belvedere · Apr 9, 2017

Very interesting comments both here and in a couple of the other threads. Devs chiming in about not only Jaguar, but potential GPU limitations as well.

Once the dust settles I look forward to more in-depth analysis.

quest · Apr 9, 2017

belvedere said:
Very interesting comments both here and in a couple of the other threads. Devs chiming in about not only Jaguar, but potential GPU limitations as well.

Once the dust settles I look forward to more in-depth analysis.

I would love that I just worry it won't happen. If Scorpio has less Vega features than the pro I can see ms being tight lipped. They gave df almost nothing of substance outside the basic specs. This article would explain why.

Renekton · Apr 9, 2017

Erebus said:
What I can't quite grasp is how Scorpio will render games at native 4k with this hardware. I mean the same DF has said in the past that a GTX 1080 is the absolute minimum for native 4k gaming and I quote:

http://www.eurogamer.net/articles/digitalfoundry-2016-4k-gaming-is-finally-viable-and-its-stunning

But somehow, they didn't even argue about Scorpio's 4k capabilities.

That is 4K 60fps Ultra, clearly stated in the article.

PackAPunchedMick · Apr 9, 2017

This article, although based on theory is exactly what I wanted from digital foundry.

Instead we got buzzwords and clickbait.

It looks a good class above the PS4 Pro, but chasing native 4K will negate any power difference when Sony implements checkerboarding instead.

Exclusives notwithstanding, of course.

Which in digital foundrys own words is "a minimal difference".

Anandtech breaks down Scorpio specs + predictions