
PS4 & Unified Memory Pool - How Substantial?

Dunk#7

Member
As an engineer I typically only see a lot of things on the surface and don't really dig too deep into the finer technical aspects of everything.

I am just wondering if somebody could break down or explain how substantial the gains can be from the unified memory architecture versus the way everything has been done in the past.

I know that the main processor and the GPU can now communicate much faster, but how does that relate to performance gains?

Are there any examples of this architecture in the PC market? Has anybody developed software in the past to take advantage of something like this?
 
Bethesda had problems with Skyrim because PS3 has a split pool of memory and they could not store everything they wanted in one pool. I guess they ironed it out by now, but 360 didn't have those problems with its unified memory pool.
 
Bethesda had problems with Skyrim because PS3 has a split pool of memory and they could not store everything they wanted in one pool. I guess they ironed it out by now, but 360 didn't have those problems with its unified memory pool.

Not the same thing. I'm not talking about the RAM that is split between the OS and the Games

I am talking about the fact that there are no busses to get in the way of the transfer of information between the CPU and the GPU. No bottleneck like there used to be.
 
I am just wondering if somebody could break down or explain how substantial the gains can be from the unified memory architecture versus the way everything has been done in the past.
The performance gains on algorithms in common use today are probably not all that significant, since those are already designed with the bottleneck in mind.

Are there any examples of this architecture in the PC market?
Yes, pretty much every integrated GPU, even from way back when they were integrated in the northbridge, and not the CPU. Just not at the same performance levels. And of course, it's not a new concept in consoles.

Has anybody developed software in the past to take advantage of something like this?
There is a recent paper from people associated with Intel that looks into using shared memory between GPUs and CPUs to implement asynchronous anti-aliasing, but that seems much less applicable in the console case with the comparatively weak CPUs and strong GPUs.
 
Imagine the benefits MS had with the 360 and their unified pool. Sony now has that advantage, and even more so, because the PS4 fully complies with AMD's Heterogeneous Uniform Memory Access (hUMA): it doesn't have to use another pool of memory like the eSRAM.

Everything the GPU needs from the CPU, it can access. Everything the CPU needs from the GPU, it also can access. There is NO need for it to copy data from the main ram to a different pool of RAM. That along with the huge bandwidth advantage, will put it ahead of the X1.
 
The performance gains on algorithms in common use today are probably not all that significant, since those are already designed with the bottleneck in mind.

Yes, pretty much every integrated GPU, even from way back when they were integrated in the northbridge, and not the CPU. Just not at the same performance levels. And of course, it's not a new concept in consoles.

There is a recent paper from people associated with Intel that looks into using shared memory between GPUs and CPUs to implement asynchronous anti-aliasing, but that seems much less applicable in the console case with the comparatively weak CPUs and strong GPUs.

Thanks for the information.

Sounds like it is similar to what we saw with the PS3. First party developers could see significant gains if they write their code to take advantage of the specific hardware architecture.

Or would they likely not try this, since it would be like re-inventing the wheel with certain segments of code? Do developers often build with "building blocks"? Meaning, do they take segments of old code and drop them into new development?
 
Bethesda had problems with Skyrim because PS3 has a split pool of memory and they could not store everything they wanted in one pool. I guess they ironed it out by now, but 360 didn't have those problems with its unified memory pool.
It didn't have a split pool. It had separate RAM for both the CPU and the GPU.

Not the same thing. I'm not talking about the RAM that is split between the OS and the Games

I am talking about the fact that there are no busses to get in the way of the transfer of information between the CPU and the GPU. No bottleneck like there used to be.

The problem with Skyrim was that it had lousy QA. Also, Leucrota is talking about a split between the CPU and GPU, not Games/OS.
 
Not the same thing. I'm not talking about the RAM that is split between the OS and the Games

No, they weren't talking about it being split between the OS and the games... both split 256MB pools on PS3 were involved in games. Not sure what you are talking about tbh.
 
Nvidia are going to be bringing it [UMA] up the rear with Maxwell too, aren't they? By that point all the vendors will be on board so you may start seeing some real gains from devs.
No, they weren't talking about it being split between the OS and the games... both split 256MB pools on PS3 were involved in games. Not sure what you are talking about tbh.

It was a poorly worded title.
 
No, they weren't talking about it being split between the OS and the games... both split 256MB pools on PS3 were involved in games. Not sure what you are talking about tbh.

Well, I must be wrong about the reason for the split in the RAM on the PS3, but it is still different.

Neither the PS3 nor the 360 was set up so that the CPU and GPU had direct access to the same RAM pool. They had to access the RAM through various buses.


Nvidia are going to be bringing it [UMA] up the rear with Maxwell too, aren't they? By that point all the vendors will be on board so you may start seeing some real gains from devs.


It was a poorly worded title.

How should I have worded it to get the point across that I was seeking?
 
It didn't have a split pool. It had separate RAM for both the CPU and the GPU.



The problem with Skyrim was that it had lousy QA. Also, Leucrota is talking about a split between the CPU and GPU, not Games/OS.

Bethesda definitely does not get a pass from me at all about their games, BUT...Sony really fucked up with Cell and their RAM configuration. It was a stupid decision to begin with and it continued the tradition of Sony's consoles being more problematic for developers to figure out, and any supposed advantage in specs is quickly washed away with the difficulty in development. They provided a set of encyclopedia-sized manuals that developers had to read through just to figure out how to make a polygon on that system. Once you learned it, it could do some nice things, but the learning curve was ridiculous compared to their chief competition.

None of that applies now, however, as both of them are using a very familiar architecture.

But yeah, in some ways Bethesda does suck.
 
Bethesda had problems with Skyrim because PS3 has a split pool of memory and they could not store everything they wanted in one pool. I guess they ironed it out by now, but 360 didn't have those problems with its unified memory pool.

This is bollox.
Split memory wasn't the issue at all; it was because there wasn't enough memory in the pool.
 
This is bollox.
Split memory wasn't the issue at all; it was because there wasn't enough memory in the pool.

PS3 and 360 have the same amount of memory available. The difference is that the PS3 has two different types of memory, running at different speeds, split between the CPU and GPU. (256 megs each.)

It takes a lot more memory management to work around this than it does on the 360, because the 360 treats all 512 megs of ram as a single pool that the GPU or CPU can draw from as needed.
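
To make the allocation point concrete, here is a toy sketch in Python. The pool sizes follow the figures in this post (256 MB + 256 MB versus a single 512 MB pool); the workload numbers and function names are hypothetical, and the model deliberately ignores OS reservations and any cross-pool access the real hardware allows.

```python
# Toy model only: pool sizes come from the post above,
# the workload below is a made-up example.

def fits_split(cpu_need_mb, gpu_need_mb, cpu_pool_mb=256, gpu_pool_mb=256):
    """Each side must fit inside its own fixed pool."""
    return cpu_need_mb <= cpu_pool_mb and gpu_need_mb <= gpu_pool_mb


def fits_unified(cpu_need_mb, gpu_need_mb, pool_mb=512):
    """Only the combined footprint has to fit."""
    return cpu_need_mb + gpu_need_mb <= pool_mb


# Hypothetical game that is heavy on CPU-side data and light on GPU-side data.
workload = {"cpu_need_mb": 300, "gpu_need_mb": 180}

print("split pools:  ", fits_split(**workload))
print("unified pool: ", fits_unified(**workload))
```

The total footprint is the same in both cases (480 MB), but only the unified pool can absorb an uneven split without extra juggling.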
 
This is bollox.
Split memory wasn't the issue at all; it was because there wasn't enough memory in the pool.

Split memory doesn't help. It makes allocation more difficult, especially when you are limited by how much memory you have. That, together with an OS that takes more memory than Microsoft's, made it difficult for devs to do what they needed to do.

That said, I think Bethesda's "BelieveUsIt'sNotGamebryo" engine is a sack of dildos. Even on PC it's a load of shit, can't imagine it being better on console, right?
 
There's no bottleneck on PS4 architecture.

WHAT THAT MEANS is that usually data has to pass through the CPU to get into ram. This is not the case on the PS4. The PS4 has no eram either. It doesn't need it.

The data is just put into the GDDR5, and then simply asks the CPU and GPU if this is "ok". They reply. That's it.
This also means that both the CPU and the GPU modify the same ram, allowing much, much more crazy things to happen dynamically.

When trying to spot the difference between the PS4 and the Xbone, look for things like cloth, particles, and other dynamics.
 
PS3 and 360 have the same amount of memory available. The difference is that the PS3 has two different types of memory, running at different speeds, split between the CPU and GPU. (256 megs each.)

It takes a lot more memory management to work around this than it does on the 360, because the 360 treats all 512 megs of ram as a single pool that the GPU or CPU can draw from as needed.

Diff in OS footprints are a thing as well, although the PS3's isn't quite the hog it used to be.
 
The performance gains on algorithms in common use today are probably not all that significant, since those are already designed with the bottleneck in mind.

Yes, pretty much every integrated GPU, even from way back when they were integrated in the northbridge, and not the CPU. Just not at the same performance levels. And of course, it's not a new concept in consoles.

There is a recent paper from people associated with Intel that looks into using shared memory between GPUs and CPUs to implement asynchronous anti-aliasing, but that seems much less applicable in the console case with the comparatively weak CPUs and strong GPUs.

I think the point is that something needs to be standard before developers will use it. APUs are still a minority in gaming, and APUs in PCs are really slow.
Integrated GPUs are no different from the case above. Integrated GPUs on motherboards are also not as valuable an asset in software development as a proper GPU.

I share the same view on this as Shifty from B3D. It is simply too early to even say what will happen. Most software will simply need to change, and so will the thinking behind it. We can assume certain predictions may come true, but HSA is like giving developers completely new hardware that they first need to explore.
 
Yes, pretty much every integrated GPU, even from way back when they were integrated in the northbridge, and not the CPU. Just not at the same performance levels. And of course, it's not a new concept in consoles.

I don't think anything like this has ever been done. Not only is memory unified, but the CPU and GPU can share the same address space, which I think is a first. AFAIK integrated GPUs use reserved space in memory.
 
I don't think anything like this has ever been done. Not only is memory unified, but the CPU and GPU can share the same address space, which I think is a first. AFAIK integrated GPUs use reserved space in memory.
Haswell supports direct access to the same memory space by GPU and CPU.

Hetero-core processors are an inevitable future anyway. Better start sooner than later.
Then I guess it's a good thing Sony started in 2006.
 
It's also worth mentioning that Mark Cerny explained that this architecture is a terra incognita for programmers. Utilizing the heterogeneous architecture of the PS4 to capacity is a long-term goal and won't happen right from the start. We won't see heterogeneous algorithms that use CPU and GPU in concert in any of the games that release at the end of the year.

So like every console ever, launch titles will look and perform significantly worse than games a few years in. Kinda exciting. I just want to know if they put a hardware scaler in this time or if they really plan on sticking with 1080p.
 
1) Things like particle effects will be easier in PS4.
http://research.microsoft.com/en-us/people/sscarle/complete_talk.pptx

2) In current gen, you only see muted colors and not rich vibrant colors. Even though the textures have high resolution, the color spectrum is very small. Only uses 10 bit to represent a pixel.
With PS4 we will start seeing high res, higher bits per color for textures.

In PC:

1) CPU has to setup DMA to move data from (HDD, Blu Ray) to DDR3
2) Game developer has to decide what has to be cached inside GPU and move them ahead of time to the GPU
3) Physics, Lighting and other calculation will require the data to be moved in and out of GPU GDDR5 and CPU DDR3. PCIE latency is 1000x higher than memory
4) As PCIE latency is very high, if moving large textures doesn't complete on time, it may cause screen tearing. Most of the games don't use 32 bit color textures.

360/Xbox one:
The cache inside GPU is very very small ~32 MB. So the game developer has to keep the core textures small. Either mute the color or bring down the resolution. That is why Gears and other games have detailed levels but almost all the texture within the level will look the same and muted.


With PS4:
- You can use the GPU for anything like physics, lighting, AI without any overhead.
- Game development becomes easy as you don't have to pre-load anything.
- You can have variety of textures within the same level. As no GPU caching of textures is needed. There is no movement of data, once the texture is loaded to memory GPU can use it for rendering directly.
 
Then I guess it's a good thing Sony started in 2006.

It's ironic that the "hard to develop for Cell" demanded the amount of "jobification" that modern game engines nowadays strive to achieve anyway.
 
There is a recent paper from people associated with Intel that looks into using shared memory between GPUs and CPUs to implement asynchronous anti-aliasing, but that seems much less applicable in the console case with the comparatively weak CPUs and strong GPUs.

Could you perhaps link this paper Durante?
 
The unified memory of the 360 and its edram is what brought it leagues ahead of the ps3 on certain performance functions.

Developers had easy optimization solutions to ram, as everything can be tweaked, instead of from the 256 here, and the 256 there.

It makes perfect sense on fixed hardware.
 
2) In current gen, you only see muted colors and not rich vibrant colors. They are high resolution but very limited color spectrum. Only use 10 bit to represent a pixel.
It's completely wrong to imply that the vibrance of colors has anything at all to do with the amount of bits used to represent them. And it has nothing at all to do with unified memory.

4) As PCIE latency is low, having moving large textures will cause screen tearing.
First of all, I think you mean "high". Secondly, latency has nothing to do with screen tearing.

360/Xbox one:
The cache inside GPU is very very small ~32 MB. So the game developer has to keep the core textures small. Either mute the color or bring down the resolution. That is why Gears and other games have detailed levels but almost all the texture within the level will look the same and muted.
This is just total bull. The eDRAM on 360 was only used for framebuffer storage, not textures, and while the XB1 eSRAM is by all accounts more flexible, that explanation and the connections it implies are still way off.
 
2) In current gen, you only see muted colors and not rich vibrant colors. Even though the textures have high resolution, the color spectrum is very small. Only uses 10 bit to represent a pixel.
With PS4 we will start seeing high res, higher bits per color for textures.

Grade A+ Comedy.
 
It's ironic that the "hard to develop for Cell" demanded the amount of parallelization that modern game engines nowadays strive to achieve anyway.
I said that in 2006. People wouldn't listen. Cell was amazingly forward-looking. In a way, I feel it's still more forward looking than the flat memory, everything-is-cached, "easy" PS4 architecture. That just doesn't scale up indefinitely in hardware.

Could you perhaps link this paper Durante?
Sure.
 
1) Things like particle effects will be easier in PS4.
http://research.microsoft.com/en-us/people/sscarle/complete_talk.pptx

2) In current gen, you only see muted colors and not rich vibrant colors. Even though the textures have high resolution, the color spectrum is very small. Only uses 10 bit to represent a pixel.
With PS4 we will start seeing high res, higher bits per color for textures.

In PC:

1) CPU has to setup DMA to move data from (HDD, Blu Ray) to DDR3
2) Game developer has to decide what has to be cached inside GPU and move them ahead of time to the GPU
3) Physics, Lighting and other calculation will require the data to be moved in and out of GPU GDDR5 and CPU DDR3. PCIE latency is 1000x higher than memory
4) As PCIE latency is very high, if moving large textures doesn't complete on time, it may cause screen tearing. Most of the games don't use 32 bit color textures.

360/Xbox one:
The cache inside GPU is very very small ~32 MB. So the game developer has to keep the core textures small. Either mute the color or bring down the resolution. That is why Gears and other games have detailed levels but almost all the texture within the level will look the same and muted.


With PS4:
- You can use the GPU for anything like physics, lighting, AI without any overhead.
- Game development becomes easy as you don't have to pre-load anything.
- You can have variety of textures within the same level. As no GPU caching of textures is needed. There is no movement of data, once the texture is loaded to memory GPU can use it for rendering directly.

i feel like this is a bot account that's been sifting through thousands of B3D posts but doesn't actually have the required intelligence to correctly put them together.
 
Imagine the benefits MS had with the 360 and their unified pool. Sony now has that advantage, and even more so, because the PS4 fully complies with AMD's Heterogeneous Uniform Memory Access (hUMA): it doesn't have to use another pool of memory like the eSRAM.

Everything the GPU needs from the CPU, it can access. Everything the CPU needs from the GPU, it also can access. There is NO need for it to copy data from the main ram to a different pool of RAM. That along with the huge bandwidth advantage, will put it ahead of the X1.

This. /thread.
 
I said that in 2006. People wouldn't listen. Cell was amazingly forward-looking. In a way, I feel it's still more forward looking than the flat memory, everything-is-cached, "easy" PS4 architecture. That just doesn't scale up indefinitely in hardware.

Although you still have to think about local cache sizes, especially when doing GPGPU. It's pretty much the same scenario as with the SPUs and their tiny local storage + DMA to main memory.
 
i feel like this is a bot account that's been sifting through thousands of B3D posts but doesn't actually have the required intelligence to correctly put them together.

Give my 11 year old cousin a few hours to read B3D and this is the kind of stuff he'd write. Buzz words, no logic, no independent thought.
 
I don't know what the heck you guys are talking about but I love this thread.

I'm going through these posts, nodding my head but really I have no idea what any of it means.

Still very interesting to read it though and understand it ( when it is simplified i mean )
 
It's not only about more flexibility. This new architecture allows for a completely new class of algorithms that utilize both CPU and GPU for a single task. Such algorithms are impossible on classical systems that handle communication of CPU and GPU over a PCIe bus.

This.

The reason why CPUs and GPUs still exist independently is because they excel at different types of processing. CPUs are great at processing relatively low numbers of branching, complex logic (smart) commands and GPUs are great at processing HUGE numbers of non branching, relatively simple logic (dumb) commands.

Without a shared memory pool the only high-performance route was a serial one, where the CPU created a set of data that was then sent to the GPU, and the results were then sent to the monitor directly. With shared memory both processors get to play with each other's data simultaneously, so you can pick whichever is faster for the type of code you want to run on any resulting data set.

Most games already do this on current platforms, but the bus bottlenecks and latency involved severely limit its usefulness. Hopefully the newer architecture makes it better in enough new cases to matter.
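
A rough way to see why fine-grained CPU/GPU ping-pong is what suffers most on a copy-based setup is to add up the cost of each handoff. The sketch below is a back-of-the-envelope model in Python; the latency, bandwidth and payload numbers are placeholders chosen for illustration, not measurements of any real bus, and `copy_cost_ms` is a made-up helper name.

```python
FRAME_BUDGET_MS = 1000.0 / 60.0  # time available per frame at 60 fps


def copy_cost_ms(round_trips, bytes_per_trip, latency_us, bandwidth_gb_s):
    """Time spent moving data when every CPU<->GPU handoff needs a copy."""
    latency_ms = round_trips * latency_us / 1000.0
    transfer_ms = round_trips * bytes_per_trip / (bandwidth_gb_s * 1e9) * 1000.0
    return latency_ms + transfer_ms


# Hypothetical bus: 10 microseconds per handoff, 8 GB/s, 1 MB moved each time.
for trips in (1, 10, 100):
    cost = copy_cost_ms(trips, 1_000_000, latency_us=10.0, bandwidth_gb_s=8.0)
    print(f"{trips:>3} handoffs: {cost:6.2f} ms of a {FRAME_BUDGET_MS:.1f} ms frame")
```

Under these assumed numbers a single handoff is cheap, but an algorithm that bounces data back and forth many times per frame eats most of the budget on copies alone. With a shared pool the copy term shrinks, although synchronization between the two processors still has a cost that this model leaves out.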
 
Although you still have to think about local cache sizes, especially when doing GPGPU. It's pretty much the same scenario as with the SPUs and their tiny local storage + DMA to main memory.
Absolutely, I just was amused how it often seemed like the same people that would decry Cell as an impossible monster to tame would embrace GPU computing. In many ways, especially until the most recent GPU architecture generations from AMD/NV, those GPUs were actually harder to use than SPEs.
 
If pcie bus is such a bottleneck, going from pcie 2.1 to 3.0 (double the bandwidth) should be a huge performance gain right? But benchmarks show no to little difference.
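
One possible reading of those benchmarks is Amdahl's law: if bus transfers are only a small share of the frame time in existing games (which were written to avoid the bus), doubling bandwidth can only shrink that small share. A minimal Python sketch, where the transfer fractions are assumed values and not measurements:

```python
def frame_speedup(transfer_fraction, bandwidth_factor):
    """Amdahl-style estimate: only the transfer share of frame time shrinks."""
    new_time = (1.0 - transfer_fraction) + transfer_fraction / bandwidth_factor
    return 1.0 / new_time


# Hypothetical shares of frame time spent waiting on bus transfers.
for fraction in (0.02, 0.10, 0.50):
    print(f"{fraction:.0%} of frame on transfers -> "
          f"{frame_speedup(fraction, 2.0):.3f}x with doubled bandwidth")
```

This only models bandwidth. It says nothing about latency per round trip, which is the other half of the argument made earlier in the thread.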
 
2) In current gen, you only see muted colors and not rich vibrant colors. Even though the textures have high resolution, the color spectrum is very small. Only uses 10 bit to represent a pixel.
With PS4 we will start seeing high res, higher bits per color for textures.

No. Just no.
 
Well, Cerny said that they modified the cache architecture a little. But as far as my knowledge goes, this is only true when comparing to a traditional system consisting of a CPU and a dedicated graphics card. PS4 will have a cache architecture that is optimized for GPGPU, since Cerny named the Onion Bus, Volatile Tags and Cache Bypasses as optimizations, but if I remember correctly these features are basic HSA stuff.

What I was referring to is a general "limitation" of GPU architectures. You can only leverage the horsepower of the GPU's cores if you don't stall them with main memory access requests. To avoid this, your shader program has to work within the size limitation of the shader core's private memory. If I am not mistaken, this would be the L1 cache in GCN. The PS4's optimizations don't affect this general issue.
 
Actually, this brings up a good related topic.

I've seen lots of comments from people, asking why it'll take a year or two to see hardware optimized games for the X1/PS4, if they're just X86 computers.

The answer is that X86 doesn't automatically mean freedom from unique architectural features. Each model of CPU and GPU is a little unique, with its own strengths and weaknesses, and when you look at how the components fit into each other you get even more unique features.

Basically, X86 greatly lowers the effort necessary to port something from Windows/OSX to these platforms, but doesn't lower the resources needed to optimize nearly as much. It helps a little, but it's not a magic bullet.
 
If I am not mistaken, this would be the L1 cache in GCN. The PS4's optimizations don't affect this general issue.

This was actually bull, I looked it up. In GCN, a unit of 16 ALUs has access to registers with a total memory of 64KB.
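
For a feel of what a 64KB register file means for a shader, here is a small Python sketch of the budget arithmetic. Only the 64KB figure comes from this post; the wavefront size of 64 work items, 32-bit registers, and a cap of 10 resident wavefronts per SIMD are my assumptions about GCN and should be checked against AMD's documentation. Allocation granularity is ignored.

```python
REGISTER_FILE_BYTES = 64 * 1024  # figure quoted in the post
WAVEFRONT_SIZE = 64              # assumption: work items per wavefront
BYTES_PER_REGISTER = 4           # assumption: 32-bit vector registers
MAX_WAVES = 10                   # assumption: hardware cap per SIMD


def resident_wavefronts(vgprs_per_work_item):
    """How many wavefronts fit in the register file at a given register use."""
    budget = REGISTER_FILE_BYTES // (WAVEFRONT_SIZE * BYTES_PER_REGISTER)
    return min(MAX_WAVES, budget // vgprs_per_work_item)


for vgprs in (24, 64, 128, 256):
    print(f"{vgprs:>3} registers per work item -> "
          f"{resident_wavefronts(vgprs)} resident wavefront(s)")
```

The more registers a shader needs per work item, the fewer wavefronts stay resident to hide memory stalls, which is the general issue described above.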
 