Support NeoGAF

Takuya · Mar 13, 2013

http://www.vgleaks.com/durango-memory-system-overview/
&
http://www.vgleaks.com/durango-memory-system-example/

Memory

As you can see on the right side of the diagram, the Durango console has:

8 GB of DRAM.
32 MB of ESRAM.

DRAM

The maximum combined read and write bandwidth to DRAM is 68 GB/s (gigabytes per second). In other words, the sum of read and write bandwidth to DRAM cannot exceed 68 GB/s. You can realistically expect that about 80 – 85% of that bandwidth will be achievable (54.4 GB/s – 57.8 GB/s).
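The achievable range quoted above is just the peak figure scaled by the 80 – 85% rule of thumb. A minimal sketch of that arithmetic, where the function and variable names are illustrative and not taken from the article:

```python
# Peak DRAM bandwidth as quoted in the article (combined read + write).
DRAM_PEAK_GBPS = 68.0


def achievable_bandwidth(peak_gbps, efficiency):
    """Scale a peak bandwidth by an efficiency factor between 0.0 and 1.0."""
    return peak_gbps * efficiency


# The 80% and 85% bounds are the article's own estimate, not measured values.
low = achievable_bandwidth(DRAM_PEAK_GBPS, 0.80)
high = achievable_bandwidth(DRAM_PEAK_GBPS, 0.85)
print(f"{low:.1f} GB/s to {high:.1f} GB/s")
```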

DRAM bandwidth is shared between the following components:

- CPU
- GPU
- Display scan out
- Move engines
- Audio system

ESRAM

The maximum combined ESRAM read and write bandwidth is 102 GB/s. Having high bandwidth and lower latency makes ESRAM a really valuable memory resource for the GPU.

ESRAM bandwidth is shared between the following components:

- GPU
- Move engines
- Video encode/decode engine

System coherency

There are two types of coherency in the Durango memory system:

Fully hardware coherent

I/O coherent

[...]

The CPU

The Durango console has two CPU modules, and each module has its own 2 MB L2 cache. Each module has four cores, and each of the four cores in each module also has its own 32 KB L1 cache.

When a local L2 miss occurs, the Durango console probes the adjacent L2 cache via the north bridge. Since there is no fast path between the two L2 caches, to avoid cache thrashing, it’s important that you maximize the sharing of data between cores in a module, and that you minimize the sharing between the two CPU modules.

Typical latencies for local and remote cache hits are shown in this table.

- Remote L2 hit: approximately 100 cycles
- Remote L1 hit: approximately 120 cycles
- Local L1 hit: 3 cycles for 64-bit values, 5 cycles for 128-bit values
- Local L2 hit: approximately 30 cycles

Each of the two CPU modules connects to the north bridge by a bus that can carry up to 20.8 GB/s in each direction.

From a program standpoint, normal x86 ordering applies to both reads and writes. Stores are strongly ordered (becoming visible in program order with no explicit memory barriers), and reads are out of order.

Keep in mind that if the CPU uses Write Combined memory writes, then a memory synchronization instruction (SFENCE) must follow to ensure that the writes are visible to the other client devices.

The GPU

The GPU can read at 170 GB/s and write at 102 GB/s through multiple combinations of its clients. Examples of GPU clients are the Color/Depth Blocks and the GPU L2 cache.

The GPU has a direct non-coherent connection to the DRAM memory controller and to ESRAM. The GPU also has a coherent read/write path to the CPU’s L2 caches and to DRAM.

For each read and write request from the GPU, the request uses one path depending on whether the accessed resource is located in “coherent” or “non-coherent” memory.

Some GPU functions share a lower-bandwidth (25.6 GB/s), bidirectional read/write path. Those GPU functions include:

- Command buffer and vertex index fetch
- Move engines
- Video encoding/decoding engines
- Front buffer scan out

As the GPU is I/O coherent, data in the GPU caches must be flushed before that data is visible to other components of the system.

The available bandwidth and requirements of other memory clients limit the total read and write bandwidth of the GPU.

Move engines

The Durango console has 25.6 GB/s of read and 25.6 GB/s of write bandwidth shared between:

- Four move engines
- Display scan out and write-back
- Video encoding and decoding

The display scan out consumes a maximum of 3.9 GB/s of read bandwidth (multiply 3 display planes × 4 bytes per pixel × HDMI limit of 300 megapixels per second), and display write-back consumes a maximum of 1.1 GB/s of write bandwidth (multiply 30 bits per pixel × 300 megapixels per second).
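The two products described above can be worked through directly. This is only a sketch with illustrative function names, taking 1 GB as 10^9 bytes. Note that the plain product for scan out comes to 3.6 GB/s rather than the quoted 3.9 GB/s; the article does not say what accounts for the difference.

```python
def scan_out_read_gbps(planes, bytes_per_pixel, megapixels_per_s):
    """Read bandwidth in GB/s for scanning out several display planes."""
    return planes * bytes_per_pixel * megapixels_per_s * 1e6 / 1e9


def write_back_gbps(bits_per_pixel, megapixels_per_s):
    """Write bandwidth in GB/s for display write-back (bits converted to bytes)."""
    return bits_per_pixel / 8 * megapixels_per_s * 1e6 / 1e9


# Inputs are the figures given in the article.
scan_out = scan_out_read_gbps(planes=3, bytes_per_pixel=4, megapixels_per_s=300)
write_back = write_back_gbps(bits_per_pixel=30, megapixels_per_s=300)
print(f"scan out: {scan_out:.3f} GB/s, write-back: {write_back:.3f} GB/s")
```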

You may wonder what happens when the GPU is busy copying data and a move engine is told to copy data from one type of memory to another. In this situation, the memory system of the GPU shares bandwidth fairly between source and destination clients. The maximum bandwidth can be calculated by using the peak-bandwidth diagram at the start of this article.

stryke · Mar 13, 2013

I feel like we've seen all this already.

Cidd · Mar 13, 2013

stryke said:
I feel like we've seen all this already.

Well it's basically an overview of what we already know, Just with a little more detail.

Shao Kahn Brewing a Stew · Mar 13, 2013

stryke said:
I feel like we've seen all this already.

Yep. Stopped following them and just waiting for Microsoft to officially unveil it. There's nothing an extended explanation is going to add to the already known specs.

Reiko · Mar 13, 2013

Cidd said:
Well it's basically an overview of what we already know, Just with a little more detail.

It's how you milk the info you sit on for more hits. It's annoying actually. lol

PSGames · Mar 13, 2013

Interesting that the max bandwidth when using main ram and esram is 136.4 GB not 170.

CadetMahoney · Mar 13, 2013

Reiko said:
It's how you milk the info you sit on for more hits.

pretty much.

fratstar · Mar 13, 2013

I don't know what any of this means. It certainly doesn't make me want to click on that VGLeaks link.

itsgreen · Mar 13, 2013

PSGames said:
Interesting that the max bandwidth when using main ram and esram is 136.4 GB not 170.

Nevermind

Confusing since they say:

The GPU can read at 170 GB/s and write at 102 GB/s through multiple combinations of its clients.

Phonomezer · Mar 13, 2013

Why are they releasing the exact same stuff again? Are they fishing for more info? Hits?

Tomcat · Mar 13, 2013

Phonomezer said:
Why are they releasing the exact same stuff again? Are they fishing for more info? Hits?

of course lol

Cidd · Mar 13, 2013

PSGames said:
Interesting that the max bandwidth when using main ram and esram is 136.4 GB not 170.

I was expecting it to be lower like maybe around 155 or so, But 136 that's a pretty steep dip. Now I'm curious to see the PS4 RAM bandwidth, If it takes the same hit then things just got a lot more interesting.

Reiko · Mar 13, 2013

Phonomezer said:
Why are they releasing the exact same stuff again? Are they fishing for more info? Hits?

Yeah. It's really obvious with this article.

Maximilian E. · Mar 13, 2013

Still based on old information..

(or is it??)

Shayan · Mar 13, 2013

that is misleading

only 30mb of that esram will write at 170g/s

if the rumored 8g GDDR3 is true then the write speed will be 68g/s

Maximilian E. said:
Still based on old information..

(or is it??)

nothing prevents MS from delaying launch by 3/6 months with better RAM, unless they are going for a casual market where they feel they have everything to attract that casual crowd

SSM25 · Mar 13, 2013

No new info, sucks

Why are they adding bandwidth again?

PSGames · Mar 13, 2013

itsgreen said:
Explain your math junior PSGames... because I don't see 136.4. Because I read:

It's right there in the article. And please I'm one of the biggest xbox fans out there. lol

itsgreen · Mar 13, 2013

PSGames said:
It's right there in the article. And please I'm one of the biggest xbox fans out there. lol

Ah fair enough, didn't read the article

McHuj · Mar 13, 2013

PSGames said:
Interesting that the max bandwidth when using main ram and esram is 136.4 GB not 170.

That's for a memcopy. That theoretical ~170 was if the GPU could read from both the DRAM and ESRAM, which isn't in their table (and may not be possible anyways)

PSGames · Mar 13, 2013

Cidd said:
I was expecting it to be lower like maybe around 155 or so, But 136 that's a pretty steep dip. Now I'm curious to see the PS4 RAM bandwidth, If it takes the same hit then things just got a lot more interesting.

this hit is due to transferring information from dram to esram. The ps4 has one memory pool so it won't have this issue.

x-Lundz-x · Mar 13, 2013

So, someone who can understand these charts explain to me please.

What does this equal in terms of PS4 power?

Same, 75% etc????

Napalm_Frank · Mar 13, 2013

No Secret Sauce?

artist · Mar 13, 2013

PSGames said:
Interesting that the max bandwidth when using main ram and esram is 136.4 GB not 170.

Let the downgrades begin!1

Triple U · Mar 13, 2013

PSGames said:
this hit is due to transferring information from dram to esram. The ps4 has one memory pool so it won't have this issue.

Interesting, how is GDDR5 calculated then if they combine read and write speeds? Is it the same?

sangreal · Mar 13, 2013

PSGames said:
Interesting that the max bandwidth when using main ram and esram is 136.4 GB not 170.

136.4 is transferring data between the two memory pools. Reading from both pools is 170. It's pretty obvious transferring would be limited by the 68GB/s of the slower pool. Since you can only read 68GB/s you can't write that data faster to the esram, and since you can only write 68GB/s there wouldn't be any point to reading faster from the esram.
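A minimal sketch of that reasoning, assuming the rounded peak figures quoted in the thread (68 GB/s for DRAM, 102 GB/s for ESRAM). The function name is hypothetical, and with the rounded inputs the combined figure comes out at 136 rather than the 136.4 in the article's table:

```python
# Rounded peak figures from the thread; both are combined read + write limits.
DRAM_PEAK_GBPS = 68.0
ESRAM_PEAK_GBPS = 102.0


def copy_rate(source_read_gbps, dest_write_gbps):
    """A copy can go no faster than the slower of its two ends."""
    return min(source_read_gbps, dest_write_gbps)


# DRAM -> ESRAM copy: the data itself moves at the slower pool's rate.
rate = copy_rate(DRAM_PEAK_GBPS, ESRAM_PEAK_GBPS)

# Counting the read traffic and the write traffic together doubles the number.
combined_traffic = 2 * rate

# ESRAM bandwidth that the copy leaves free for other clients.
esram_left = ESRAM_PEAK_GBPS - rate

print(rate, combined_traffic, esram_left)
```

The copy works the same way in the other direction, since `min` does not care which pool is the source.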

McHuj · Mar 13, 2013

Triple U said:
Interesting, how is GDDR5 calculated then if they combine read and write speeds? Is it the same?

Yeah, it would be the same like the DRAM to DRAM copy line, except 176 for the total.

DieH@rd · Mar 13, 2013

According to their example, main RAM is ~55-58GB/s [theoretical 68GB/s, but that will never happen] and they presume that the usual CPU+northbridge workload will be around 25GB/s, with the possibility of hitting a saturation point of 30GB/s. That means that main RAM will be shared 50/50 between CPU/northbridge and GPU. The GPU will be able to work with ~130GB/s [~30 to main RAM [that's both read+write] and ~100 to ESRAM, presuming that move engines are shut down].
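That estimate can be reproduced as a rough budget. This is only a sketch: the names are illustrative, the 25 GB/s CPU/northbridge load and the ~100 GB/s ESRAM figure are the assumptions stated in the post, and real contention is unlikely to be this tidy.

```python
# Realistic DRAM range from the article (80% and 85% of the 68 GB/s peak).
DRAM_REALISTIC_GBPS = (54.4, 57.8)
# Assumed typical CPU + northbridge traffic, per the post.
CPU_NORTHBRIDGE_GBPS = 25.0
# Assumed ESRAM bandwidth available to the GPU with move engines idle.
ESRAM_GBPS = 100.0


def gpu_budget(dram_gbps, cpu_gbps, esram_gbps):
    """Return (DRAM share left for the GPU, GPU total including ESRAM)."""
    gpu_dram = dram_gbps - cpu_gbps
    return gpu_dram, gpu_dram + esram_gbps


budgets = [gpu_budget(d, CPU_NORTHBRIDGE_GBPS, ESRAM_GBPS) for d in DRAM_REALISTIC_GBPS]
for gpu_dram, total in budgets:
    print(f"GPU from DRAM: {gpu_dram:.1f} GB/s, GPU total: {total:.1f} GB/s")
```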

Nothing new really, we knew from before that this Durango architecture has memory problems that are really nicely fixed in PS4.

PSGames · Mar 13, 2013

sangreal said:
136.4 is transferring data between the two memory pools. Reading from both pools is 170

ah ok thanks

LukasTaves · Mar 13, 2013

PSGames said:
Interesting that the max bandwidth when using main ram and esram is 136.4 GB not 170.

That's the bandwidth when the gpu is moving data between them...

It's also a bit misleading to call it that way because you are both reading from one pool at 68GB/s and writing at the other at the same speed... That operation leaves you with ~40GB/s unused on esram, but you are not moving data at 136GB/s

Cidd · Mar 13, 2013

LukasTaves said:
That's the bandwidth when the gpu is moving data between them...

It's also a bit misleading to call it that way because you are both reading from one pool at 68GB/s and writing at the other at the same speed... That operation leaves you with ~40GB/s unused on esram, but you are not moving data at 136GB/s

So it's even worse than they posted? I don't like the sound of this.

gcubed · Mar 13, 2013

this is boring, another rehash. I want to know software information. Why are they reserving 3gb (if thats still true). Is the API requirement true?

LukasTaves · Mar 13, 2013

Cidd said:
So it's even worse than they posted? I don't like the sound of this.

It's not worse, it's actually the same thing they are saying, just worded differently... They wrote it as if Durango had 170GB/s of total memory bandwidth and that operation consumes 136... That's partially correct, but the actual transfer is happening at only 68GB/s...

I dunno how much the gpu itself will be transferring data between them, though. The point of DMEs is to operate in parallel with both the gpu and cpu doing these data transfers when they are busy doing other work...

Clear · Mar 13, 2013

What strikes me about that block diagram is that its complication is a result of very specific system software design goals, far more so than increasing/improving application performance.

Cidd · Mar 13, 2013

LukasTaves said:
It's not worse, it's actually the same thing they are saying, just worded differently... They wrote it as if Durango had 170GB/s of total memory bandwidth and that operation consumes 136... That's partially correct, but the actual transfer is happening at only 68GB/s...

I dunno how much the gpu itself will be transferring data between them, though. The point of DMEs is to operate in parallel with both the gpu and cpu doing these data transfers when they are busy doing other work...

Ah, thanks for clearing that up, so any idea why they decided to combine both bandwidths?
Is that something special to the Xbox 720?

LukasTaves · Mar 13, 2013

Everyone else being able to read directly from the cpu caches is new? I mean, was that touted as a feature for jaguar before?

On 360 the gpu could also access the L2 cache directly, and use it as a stream buffer, but not sure that was widely used in games, but that sounds like the same concept.

shandy706 · Mar 13, 2013

<--- Continues to wait for reveal. Time is crawling now!

amstradcpc · Mar 13, 2013

LukasTaves said:
Everyone else being able to read directly from the cpu caches is new? I mean, was that touted as a feature for jaguar before?

On 360 the gpu could also access the L2 cache directly, and use it as a stream buffer, but not sure that was widely used in games, but that sounds like the same concept.

Somewhere in that graphic there should be an HSA memory management unit (HMMU) that does this in hardware. On Xbox 360 memexport was software IIRC.

http://developer.amd.com/wordpress/media/2012/10/hsa10.pdf

gAg CruSh3r · Mar 13, 2013

Eideka · Mar 13, 2013

The gap in power between the 720 and the PS4 is terrifying. I wonder how multiplats on 720 will fare, there are reasons to be worried on that front.

I wouldn't be surprised if 3rd parties systematically enhance the PS4 versions of their games with graphical features.

Reiko · Mar 13, 2013

Eideka said:
The gap in power between the 720 and the PS4 is terrifying. I wonder how multiplats on 720 will fare.

The implications of Xbox going the weaker route are very interesting no doubt.

Eideka · Mar 13, 2013

Reiko said:
The implications of Xbox going the weaker route are very interesting no doubt.

That said I find hard to believe that the rumored specs are final, that's way too weak.

Osiris · Mar 13, 2013

LukasTaves said:
Everyone else being able to read directly from the cpu caches is new? I mean, was that touted as a feature for jaguar before?

On 360 the gpu could also access the L2 cache directly, and use it as a stream buffer, but not sure that was widely used in games, but that sounds like the same concept.

It's a big part of AMD's APU push, so something both PS4 and the next Xbox will be capable of. (Assuming the leaks are correct and MS are using Jaguar cores)

See here for more detail.

shandy706 · Mar 13, 2013

Eideka said:
The gap in power between the 720 and the PS4 is terrifying. I wonder how multiplats on 720 will fare, there are reasons to be worried on that front.

I wouldn't be surprised if 3rd parties systematically enhance the PS4 versions of their games with graphical features.

Unfortunately (fortunately for MS) that won't matter if the majority buys the new Xbox. Not talking Neogaf "hardcore/Sony supporting" gamers obviously, but if Microsoft comes out swinging with advertising and gets both many hardcore gamers and casual....the developers will most likely design/aim for the best profit level.

I have a feeling MS has something up their sleeve. These "leaks" are all so old now...they've got the current Durango information (even if it's the same) locked up in some nuclear silo. lol

PhatSaqs · Mar 13, 2013

Eideka said:
The gap in power between the 720 and the PS4 is terrifying.

mrklaw · Mar 13, 2013

shandy706 said:
Unfortunately (fortunately for MS) that won't matter if the majority buys the new Xbox. Not talking hardcore gamers obviously, but if Microsoft comes out swinging with advertising and gets both many hardcore gamers and casual....the developers will most likely design/aim for the best profit level.

PCs will still be more powerful and likely to be the lead development platform anyway, with optimisations for console.

If the strengths of both platforms can be properly leveraged (PS4 seems fairly transparent, Durango will depend on how automatic some of these subsystems are), then I can see Durango getting the downports this time round.

both should still look great, but considering the tiniest differences seem enough in face-offs..

Cidd · Mar 13, 2013

shandy706 said:
Unfortunately (fortunately for MS) that won't matter if the majority buys the new Xbox. Not talking hardcore gamers obviously, but if Microsoft comes out swinging with advertising and gets both many hardcore gamers and casual....the developers will most likely design/aim for the best profit level.

Well that's the problem isn't the majority of sales at launch the hardcore crowd? I can see the Xbox fans buying the 720 for first party games but if third party games have a clear advantage in appearance in favor of the PS4 then MS got a lot to worry about.

It's even more grim if they're launching around the same time.

LukasTaves · Mar 13, 2013

Cidd said:
Ah, thanks for clearing that up, so any idea why they decided to combine both bandwidths?
Is that something special to the Xbox 720?

Why they decided to add the bandwidth for data transfers or why they are adding 102+68 as the total bandwidth?

For the transfers they are probably adding to show that even when transferring data from one memory to the other, as esram still has some bandwidth available that other clients can consume...

The added total bandwidth is a bit curious too. It implies that while the gpu can read at both at the same time it can only write to either of them at once... That limits some scenarios they hinted it would be possible before...

Eideka · Mar 13, 2013

shandy706 said:
Unfortunately (fortunately for MS) that won't matter if the majority buys the new Xbox. Not talking Neogaf "hardcore/Sony supporting" gamers obviously, but if Microsoft comes out swinging with advertising and gets both many hardcore gamers and casual....the developers will most likely design/aim for the best profit level.

I don't see how enhancing the PS4 version harms the next Xbox, it's the hardware that matters after all. I have trouble believing they will cater to the weakest machine and not scale up from there given how easy the PS4 is to develop for.

I have a feeling MS has something up their sleeve. These "leaks" are all so old now...they've got the current Durango information (even if it's the same) locked up in some nuclear silo. lol

For the sake of competition, I hope so.

PCs will still be more powerful and likely to be the lead development platform anyway, with optimisations for console.

It has never been the case this generation save for a few exceptions, why would the PC be the lead for next-gen multiplatform games ?
Most likely one of the consoles.

@PhatSaqs : if those rumored specs are accurate then yes the PS4 is far ahead in every department.

PSGames · Mar 13, 2013

LukasTaves said:
It's not worse, it's actually the same thing they are saying, just worded differently... They wrote it as if Durango had 170GB/s of total memory bandwidth and that operation consumes 136... That's partially correct, but the actual transfer is happening at only 68GB/s...

I dunno how much the gpu itself will be transferring data between them, though. The point of DMEs is to operate in parallel with both the gpu and cpu doing these data transfers when they are busy doing other work...

Question: so the GPU can read both at 170GB/s, but wouldn't the data have to be transferred from the dram to the esram first at 68GB/s in 32MB chunks? So even if you can read both at 170GB/s, it takes a lot of lower bandwidth steps before it even gets to that point, correct?

specialguy · Mar 13, 2013

DieH@rd said:
According to their example, main RAM is ~55-58GB/s [theoretical 68GB/s, but that will never happen] and they presume that the usual CPU+northbridge workload will be around 25GB/s, with the possibility of hitting a saturation point of 30GB/s. That means that main RAM will be shared 50/50 between CPU/northbridge and GPU. The GPU will be able to work with ~130GB/s [~30 to main RAM [that's both read+write] and ~100 to ESRAM, presuming that move engines are shut down].

Nothing new really, we knew from before that this Durango architecture has memory problems that are really nicely fixed in PS4.

These seem like normal overheads that I'm quite sure exist on every architecture including PS4.

AKA PS4 wont be doing whatever it does at 176 GB/s, the practical limit will be like 140 GB/s or something (just an example not an actual accurate number)


VGLeaks Rumor: Durango Memory System Overview & Example
