
VGLeaks - Orbis GPU Detailed - compute, queues and pipelines

It definitely seems like a capable and wisely designed system.

That's not what really made the PS2 successful though.
I know, but if they are really opening their platform to indie devs, that will play a huge part in their comeback. They also had a lot of IPs on PS3 even though it was so complex to develop for; now imagine an easy and powerful system on Sony's end. Plus if they don't do the always-online, used-games BS, and MS does... yeah, that's checkmate, and Wii U isn't even competing. You gotta be this powerful to ride the next-gen coaster :)
 

benny_a

extra source of jiggaflops
I don't follow what you are saying. The APU (CPU+GPU) is on one bus using one memory pool (8GB GDDR5). There is no PCIe bus like the one used in a PC to transfer data from main memory (DDR3) to video memory (GDDR5).
Okay let me try again.

In the benchmark I posted above (PCI-E 2.0 8GB/s vs. PCI-E 3.0 16GB/s) they tested the performance of released games while varying only the PCI-E generation.

Because we know writing to and from the GPU over PCI-E is slower than within the PS4, where the GPU can write to RAM at 176GB/s, I was curious how much of an impact this ACE change would have.

Because in the OP of the rumor thread it says:
Despite featuring PCI-Express 3.0 which doubles interface bandwidth from 8GB/s to 16GB/s, plus support for numerous data and protocol commands, the internal dual DMA engines can push data through the bus to saturate that bandwidth.

I was just curious if we would see a good jump based on this increase in throughput, and therefore in performance?
(From the benchmark in my previous post: 8GB/s vs. 16GB/s achieved at most a 10% FPS difference in the best-case scenario)

Edit: Or, asked another way: if we took high-end PC specs and increased the PCI-E bus 10-fold, would this give a massive performance boost, or is it not that important?
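
To put that in rough numbers, here is a minimal back-of-the-envelope sketch. The 256 MB per-frame payload is purely an assumed figure for illustration, not anything from the benchmark; the point is just that transfer time scales inversely with bandwidth:

```python
# Back-of-the-envelope only: time to move an assumed 256 MB of per-frame
# data over each link. The 256 MB payload is an illustrative assumption.
LINKS_GB_PER_S = {
    "PCI-E 2.0 x16 (8 GB/s)": 8.0,
    "PCI-E 3.0 x16 (16 GB/s)": 16.0,
    "PS4 unified GDDR5 (176 GB/s)": 176.0,
}

PAYLOAD_GB = 256 / 1024  # 256 MB expressed in GB

for name, bandwidth in LINKS_GB_PER_S.items():
    transfer_ms = PAYLOAD_GB / bandwidth * 1000
    print(f"{name:<30} {transfer_ms:6.2f} ms")
```

Doubling 8 GB/s to 16 GB/s halves the copy time (~31 ms to ~16 ms for this assumed payload), while a unified 176 GB/s pool shrinks it to under 2 ms, which is why the PCI-E generation only moves FPS by a few percent in games that aren't transfer-bound.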
 

RoboPlato

I'd be in the dick
Alright, good GPU? Check. Lots of RAM with good bandwidth? Check. The only thing left to worry about is the CPU and the GDDR5 latency problem. If Sony solved these issues then they have quite the capable system. I see the era of PS2 coming back :)

The latency won't be much of an issue, if any. Good data management can do a lot to counter it, and GDDR5 latency compared to DDR3 is a bit exaggerated.
 
Alright, good GPU? Check. Lots of RAM with good bandwidth? Check. The only thing left to worry about is the CPU and the GDDR5 latency problem. If Sony solved these issues then they have quite the capable system. I see the era of PS2 coming back :)

After talking to a non-insider friend of mine who is very knowledgeable about system design in general, I certainly wouldn't worry about the GDDR5 latency. In his overview (and to vastly simplify his thoughts as best I can), he thought most if not all of that issue would be mitigated by ample cache and the relatively low clock speed of the CPU, plus a few additional considerations.
 
The latency won't be much of an issue, if any. Good data management can do a lot to counter it, and GDDR5 latency compared to DDR3 is a bit exaggerated.
On B3D they were talking about how the low latency of the eSRAM provides a huge benefit for compute. I think DDR3-to-CPU latency is about 50 ns and GDDR5 is about 300 ns. Now if it takes the CPU longer to do its thing, then the GPU might have to take its place in some cases, which sounds like wasted resources to me. I want Sony to make the most out of that crappy tablet CPU so that the GPU can focus on rendering rather than doing compute jobs.

Edit: alright, thanks Pristine. The cache should help to have data available rather than fetching it with delay.
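
For what it's worth, the textbook way to see why cache absorbs most of that latency gap is the average memory access time (AMAT) formula. The numbers below are illustrative assumptions, not PS4 figures:

```latex
% Average memory access time: hit latency plus miss rate times miss penalty.
\[
\mathrm{AMAT} = t_{\mathrm{hit}} + m \cdot t_{\mathrm{miss}}
\]
% e.g. with an assumed 4-cycle cache hit, a 5% miss rate, and a
% 300-cycle GDDR5 miss penalty:
\[
\mathrm{AMAT} = 4 + 0.05 \cdot 300 = 19 \ \text{cycles}
\]
```

Under those assumed numbers, the effective latency the CPU sees is a small fraction of the raw GDDR5 round trip, which is why a high hit rate matters far more than the DDR3-vs-GDDR5 difference.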
 

DieH@rd

Banned
So, how much resources has VShell [OS overlay?] taken for itself?

It has exclusive access to the "High Priority Graphics (HP3D) ring and pipeline"; it seems developers can't touch that.
 
Because we know writing to and from the GPU over PCI-E is slower than on the PS4, where the GPU can write to RAM at 176GB/s, I was curious how much of an impact this ACE change would have.
From the same guy that said on the Ars Technica forums that the PS4 CPU is beefed up over vanilla: he also claimed the GPU reads from GDDR5 through the CPU, which runs a kind of "move engine" program that feeds the GPU ring buffers.
 

artist

Banned
From the same guy that said on the Ars Technica forums that the PS4 CPU is beefed up over vanilla: he also claimed the GPU reads from GDDR5 through the CPU, which runs a kind of "move engine" program that feeds the GPU ring buffers.
Wouldn't the memory controller handle this?
 
After talking to a non-insider friend of mine who is very knowledgeable about system design in general, I certainly wouldn't worry about the GDDR5 latency. In his overview (and to vastly simplify his thoughts as best I can), he thought most if not all of that issue would be mitigated by ample cache and the relatively low clock speed of the CPU, plus a few additional considerations.
The problem is the latency of GPU compute tasks. A good solution would be to increase the size of the GPU's L2 cache.
 

artist

Banned
The problem is the latency of GPU compute tasks. A good solution would be to increase the size of the GPU's L2 cache.
It's a good thing that the ACEs have direct access to the DMA as well as the L2/GDS pools. I wonder if the DMA count was also increased? The VGLeaks block diagram doesn't explicitly mention this, but Tahiti has dual DMAs.

I don't know, he said GPU access to memory was slow. We are missing a data-cache eSRAM somewhere...
That sounds rather strange.
 

Rafy

Member
The cloth quality was really noticeable in KZ: Shadow Fall, when he is hanging by the rope on the helicopter, specifically on the gloves. I found a screen of that part and I was amazed at the amount of detail visible in the glove's fabric texture. I showed the screen to my GF, who is studying fashion design, and she thought it was a still from a movie; she was blown away by the fabric visuals and said it looked lifelike.
 

i-Lo

Member
The cloth quality was really noticeable in KZ: Shadow Fall, when he is hanging by the rope on the helicopter, specifically on the gloves. I found a screen of that part and I was amazed at the amount of detail visible in the glove's fabric texture. I showed the screen to my GF, who is studying fashion design, and she thought it was a still from a movie; she was blown away by the fabric visuals and said it looked lifelike.

I think you're in the wrong thread.
 

DieH@rd

Banned
From what I've seen, the Durango GPU only has 2 queues; it'll be interesting to see the difference it makes.

Things will remain the same. Everything that Durango can do, PS4 can do better. :)

I think you're in the wrong thread.
Well no, that cloth dynamics must be processed by either the CPU or GPU. There is an extremely large chance the GPU did it.
edit- just saw it. He has an armored glove, no moving "flappable" parts. :(
 
The cloth quality was really noticeable in KZ: Shadow Fall, when he is hanging by the rope on the helicopter, specifically on the gloves. I found a screen of that part and I was amazed at the amount of detail visible in the glove's fabric texture. I showed the screen to my GF, who is studying fashion design, and she thought it was a still from a movie; she was blown away by the fabric visuals and said it looked lifelike.

Wrong thread?
 
Regardless of whether this is GCN 2.0, the fact that it is at least a hybrid of GCN 1.0 is a step in the right direction. It's great to see AMD/Sony design an APU which caters so well to the efficiencies of console hardware. AMD/ATi have lost quite a bit of ground to Intel/Nvidia over the last few years... Their spec for the PS4 (which we are only beginning to see) seems inspired. I hope it puts them back in the game... K7 was a great piece of hardware.
 

Kaako

Felium Defensor
So is it safe to assume that the whole "100% efficiency" comment from before applies to both PS4/720 because of the GCN 1.0/2.0 or modified architecture?
 

artist

Banned
So is it safe to assume that the whole "100% efficiency" comment from before applies to both PS4/720 because of the GCN 1.0/2.0 or modified architecture?
Based on this slide alone, you can average the utilization and fudge it to 100% or more:

[image: compute_benches_960W.jpg]
 
A poster on beyond3d brings up a good question. I'm not sure if he's implying what I think he's implying.

Yes, if not GCN 2.0 then some kind of precursor to it.

Could the TFLOPS effectively be worth more, i.e. more than you can measure in 1.0s?
http://forum.beyond3d.com/showpost.php?p=1713699&postcount=632

Seeing as PS4's GPU has 8 compute pipelines which support simultaneous asynchronous compute, allowing you to process compute and rendering tasks in parallel, could you achieve more than 1.84 TFLOPS, or more than you can measure in 1.0s?

A lot of this is just over my head, and pure speculation.
 

dr_rus

Member
8 active thread queues seems like overkill for just 18 CUs. While the idea behind this is pretty clear, the realization of it as a straight 4x increase in the number of active threads looks kind of stupid and brute-force. Well, I guess they didn't have the budget to do anything else here.

Does this mean it's possible to emulate the PS3?
No.
 
A poster on beyond3d brings up a good question. I'm not sure if he's implying what I think he's implying.


http://forum.beyond3d.com/showpost.php?p=1713699&postcount=632

Seeing as PS4's GPU has 8 compute pipelines which support simultaneous asynchronous compute, allowing you to process compute and rendering tasks in parallel, could you achieve more than 1.84 TFLOPS, or more than you can measure in 1.0s?

A lot of this is just over my head, and pure speculation.

We'll have to wait and see its Windows Experience Index #.
 

pixelbox

Member
8 active thread queues seems like overkill for just 18 CUs. While the idea behind this is pretty clear, the realization of it as a straight 4x increase in the number of active threads looks kind of stupid and brute-force. Well, I guess they didn't have the budget to do anything else here.


No.
Not even a little?
 

MaulerX

Member
What I find odd is that in this article: http://www.vgleaks.com/ps4-presentation-confirmed-leaks-and-what-can-we-expect-for-e3-2013/

VGleaks states:

"First of all, I would like to dedicate some sentences to the PS4 specifications presented by Sony. In our Orbis leak, we had written: RAM 4GB GDRR5. Sony has updated and improved these numbers. Developers were working with 4GB not 8GB, the devkit had 4GB. You could think that our documents (SDK) were old, but all our info about PS4 had less than one month and half. Speaking about the other specs, we were right (If I not mistake)."



So their info on PS4 was less than "one month and a half" old (whereas the Durango info is a year old), yet in that time frame not only did the PS4 RAM situation change, but now the GPU as well? Also, in the bolded part they claim to have been right on everything but RAM, yet this new article is saying they were wrong on the GPU (14+4 compute) as well?
 

SSM25

Member
I like how the main concern about PS4 is now overkill and brute force. I'd rather have this than the "not enough" complaints about past architectures.
 
So their info on PS4 was less than "one month and a half" old (whereas the Durango info is a year old), yet in that time frame not only did the PS4 RAM situation change, but now the GPU as well? Also, in the bolded part they claim to have been right on everything but RAM, yet this new article is saying they were wrong on the GPU (14+4 compute) as well?

Yeah, I'd like them to address the 14+4 compute thing as well. Some things are not adding up.

We'll have to wait and see its Windows Experience Index #.

What's that? And what does that have to do with what I was asking?
 
So their info on PS4 was less than "one month and a half" old (whereas the Durango info is a year old), yet in that time frame not only did the PS4 RAM situation change, but now the GPU as well? Also, in the bolded part they claim to have been right on everything but RAM, yet this new article is saying they were wrong on the GPU (14+4 compute) as well?

Yeah, I'm thinking Durango has seen some significant improvements; I don't see it any other way. It was probably foolish of Sony to show some of their advantages and features this early, giving MS time to respond. Oh well, competition is good.
 
Yeah, I'm thinking Durango has seen some significant improvements; I don't see it any other way. It was probably foolish of Sony to show some of their advantages and features this early, giving MS time to respond. Oh well, competition is good.

According to Reiko and thuway, who have received an up-to-date document, the specs haven't changed. edit: typo

I know what you have. Please don't leak it. Trying to act cool on the internet is not worth having someone lose their job. I can back you up though-

Specs for Durango haven't changed. The GPU is nearly 100% efficient though at 1.23 TF, so that's something most people didn't know.

Now I'm unclear about whether this contradicts bsassassin's source, who said the CPU was upgraded to have twice the flops of a vanilla Jaguar core.
 

RoboPlato

I'd be in the dick
What I find odd is that in this article: http://www.vgleaks.com/ps4-presentation-confirmed-leaks-and-what-can-we-expect-for-e3-2013/

VGleaks states:

"First of all, I would like to dedicate some sentences to the PS4 specifications presented by Sony. In our Orbis leak, we had written: RAM 4GB GDRR5. Sony has updated and improved these numbers. Developers were working with 4GB not 8GB, the devkit had 4GB. You could think that our documents (SDK) were old, but all our info about PS4 had less than one month and half. Speaking about the other specs, we were right (If I not mistake)."



So their info on PS4 was less than "one month and a half" old (whereas the Durango info is a year old), yet in that time frame not only did the PS4 RAM situation change, but now the GPU as well? Also, in the bolded part they claim to have been right on everything but RAM, yet this new article is saying they were wrong on the GPU (14+4 compute) as well?

14+4 may have just been a devkit thing. Now that the system has been formally announced and Sony's working to get more devs on board, they're sending out updated documentation. I don't think VGLeaks' info was wrong before; they just weren't going on final documentation.
 

spwolf

Member
A poster on beyond3d brings up a good question. I'm not sure if he's implying what I think he's implying.


http://forum.beyond3d.com/showpost.php?p=1713699&postcount=632

Seeing as PS4's GPU has 8 compute pipelines which support simultaneous asynchronous compute, allowing you to process compute and rendering tasks in parallel, could you achieve more than 1.84 TFLOPS, or more than you can measure in 1.0s?

A lot of this is just over my head, and pure speculation.

of course not.
 

Jburton

Banned
Yeah, I'm thinking Durango has seen some significant improvements; I don't see it any other way. It was probably foolish of Sony to show some of their advantages and features this early, giving MS time to respond. Oh well, competition is good.

It does not work like that; the only thing that really changed for the PS4 was the amount of RAM.

MS could add more of the same RAM but it would not be much of an upgrade.

It is way too late to change chips/architecture. There is no way you could add more or new features to a custom chip design this late unless you want to delay release by at least 6-12 months.


The information on PS4 was incorrect; the 18 CUs instead of 14+4 CUs was not a change, it was always that way. The leaked info was wrong.
 

MaulerX

Member
14+4 may have just been a devkit thing. Now that the system has been formally announced and Sony's working to get more devs on board, they're sending out updated documentation. I don't think VGLeaks' info was wrong before; they just weren't going on final documentation.


I doubt it was a devkit thing. I mean, if you remember, they had an article talking about the Orbis devkits, then another article with the actual Orbis specs. Then, when people got confused about the whole "18 CUs hardware balanced at 14 CUs", that's when they posted an update explaining the 14+4 CU setup.
 

Bsigg12

Member
So if the most recent Durango leak we have is a year old, how is it not possible that Microsoft has gone through and updated all the specs? Sure, responding to Sony now would fit more with the way they actually structure their reveal, but I firmly believe the hardware that has been rumored/leaked will be different in some way.

Unless Microsoft decides to release a new system every 4-5 years, if they don't future-proof the system 7+ years out, they could be hurting really badly later in the lifecycle. I say "really badly" lightly, though, because it will still be able to do some impressive things, just not as impressive as the PS4.
 
I doubt it was a devkit thing. I mean, if you remember, they had an article talking about the Orbis devkits, then another article with the actual Orbis specs. Then, when people got confused about the whole "18 CUs hardware balanced at 14 CUs", that's when they posted an update explaining the 14+4 CU setup.

It seems like the info they got was dated. I doubt the RAM is the only thing that Sony changed. The official doc that Sony released is the most up-to-date info.
 

vpance

Member
Why not? Could you elaborate please? The guy on beyond3d seems to have a valid question; I don't know why he would ask it if it was impossible.

It kind of looks like the Sea Islands spec sheet, which is supposed to have 8 ACEs too.

We know Kepler flops > GCN flops at the moment, so I don't think it's out of the question that some GCN+ or GCN 2 type modifications might improve on that flop quality.
 

spwolf

Member
Why not? Could you elaborate please? The guy on beyond3d seems to have a valid question; I don't know why he would ask it if it was impossible.

Well, he is probably asking for the same reason you are copying it :).

It is done so you can utilize the APU better, but you can't get more raw power that way. But sure, games will look better for it.
 
Does anyone know what VShell (is it the OS?) is and what this means?

- High Priority Graphics (HP3D) ring and pipeline

New for Liverpool
Same as GFX pipeline except no compute capabilities
For exclusive use by VShell


Well, he is probably asking for the same reason you are copying it :).

It is done so you can utilize the APU better, but you can't get more raw power that way. But sure, games will look better for it.

It kind of looks like the Sea Islands spec sheet, which is supposed to have 8 ACEs too.

We know Kepler flops > GCN flops at the moment, so I don't think it's out of the question that some GCN+ or GCN 2 type modifications might improve on that flop quality.

:( I see... Would it be able to affect benchmark performance? It seems like something only programmers would be able to take advantage of.
 

AgentP

Thinks mods influence posters politics. Promoted to QAnon Editor.
Why not? Could you elaborate please? The guy on beyond3d seems to have a valid question; I don't know why he would ask it if it was impossible.

FLOPS is a standard unit (floating-point operations per second); it is a theoretical number. I think what he is asking, in a strange way, is whether the new GPU will be more efficient and get closer to its theoretical max. I think that is the point of the changes: fewer stalled, unused CUs.
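
For reference, the 1.84 TFLOPS figure is exactly that theoretical ceiling: 18 CUs, 64 ALU lanes per CU, and one fused multiply-add (2 FLOPs) per lane per cycle at 800 MHz:

```latex
% Theoretical peak throughput of the PS4 GPU.
\[
18 \ \text{CUs} \times 64 \ \tfrac{\text{lanes}}{\text{CU}} \times 2 \ \tfrac{\text{FLOPs}}{\text{lane} \cdot \text{cycle}} \times 0.8 \ \text{GHz} = 1843.2 \ \text{GFLOPS} \approx 1.84 \ \text{TFLOPS}
\]
```

Async compute can only raise how much of that peak you actually sustain; it can never push past it.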

Does anyone know what VShell (is it the OS?) is and what this means?

This gives the OS quick, direct access to the GPU. This is for pop-up messages and other stuff that might overlay on the game screen.
 

vpance

Member
Are the 8 compute pipelines = 8 ACEs?

I think so.

http://beyond3d.com/showpost.php?p=1708222&postcount=1082

•Multi queue compute
Lets multiple user-level queues of compute workloads be bound to the device and processed simultaneously. Hardware supports up to eight compute pipelines with up to eight queues bound to each pipeline.

:( I see... Would it be able to affect benchmark performance? It seems like something only programmers would be able to take advantage of.

Gotta run 3DMark on a 7850 and the PS4's GPU to see for sure :) If devs need to spend more effort to squeeze out better flops, at least they can do it, which regular GCN doesn't allow to the same extent. This is nothing like the Cell SPE magic tricks you needed to know to harness the power there.
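
To illustrate why more queues help utilization but never raise the peak, here is a toy model of async compute filling idle CUs. It is my own sketch under assumed numbers (random graphics load between 6 and 18 CUs per tick), nothing like the real scheduler:

```python
# Toy model: each tick a GPU has 18 CU "slots". Graphics occupies a
# varying number of slots; with compute queues bound, queued compute
# jobs soak up the leftover slots instead of letting them idle.
import random

random.seed(42)
N_SLOTS = 18   # stand-in for the 18 CUs
TICKS = 1000

gfx_only = gfx_plus_compute = 0
compute_backlog = 10_000  # plenty of queued compute work items

for _ in range(TICKS):
    gfx = random.randint(6, N_SLOTS)  # graphics rarely saturates the GPU
    gfx_only += gfx
    filled = min(N_SLOTS - gfx, compute_backlog)
    compute_backlog -= filled
    gfx_plus_compute += gfx + filled

total = N_SLOTS * TICKS
print(f"utilization, graphics only:   {gfx_only / total:.0%}")
print(f"utilization, + async compute: {gfx_plus_compute / total:.0%}")
```

In this sketch the graphics-only run wastes the slots graphics leaves idle, while the async-compute run approaches 100% utilization, which is the "fudge it to 100%" effect discussed upthread; total slots per tick (the peak) stay fixed at 18 either way.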
 