
PS4's memory subsystem has separate buses for CPU (20Gb/s) and GPU (176Gb/s)

Eurogamer has an interesting article on the process of porting Ubisoft racer The Crew from PC to PS4. In it, developers of the PS4 version speak in detail about the porting process. It's a fascinating read:

http://www.eurogamer.net/articles/digitalfoundry-how-the-crew-was-ported-to-playstation-4

What caught my eye was the mention of two separate system buses, one for the GPU and one for the CPU. While the GPU bus has the full bandwidth of 176GB/s, the CPU one is 20Gb/s. Also, according to the devs, allocating data correctly between the CPU and GPU bus is extremely important in order to achieve good performance.

The PS4 operates a system where memory is allocated either to the CPU or GPU, using two separate memory buses.

"One's called the Onion, one's called the Garlic bus. Onion is mapped through the CPU caches... This allows the CPU to have good access to memory," explains Jenner.

"Garlic bypasses the CPU caches and has very high bandwidth suitable for graphics programming, which goes straight to the GPU. It's important to think about how you're allocating your memory based on what you're going to put in there."

"The first performance problem we had was not allocating memory correctly... So the Onion bus is very good for system stuff and can be accessed by the CPU. The Garlic is very good for rendering resources and can get a lot of data into the GPU," Jenner reveals.

So I'd like the opinion of people far more knowledgeable than me in matters of technology as to what this means. If the memory pool is unified (which we know it is) but the buses are separate and of different speeds, is the memory subsystem truly unified?
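To make the Onion/Garlic allocation advice in the quotes above a bit more concrete, here is a minimal sketch in Python. It is not the PS4 SDK: `Bus`, `Resource`, `choose_bus` and both boolean flags are hypothetical names made up for illustration, and the heuristic is only one reading of Jenner's "system stuff" versus "rendering resources" split.

```python
from dataclasses import dataclass
from enum import Enum


class Bus(Enum):
    # Labels only; the descriptions paraphrase the Eurogamer quotes.
    ONION = "onion"    # mapped through the CPU caches, good CPU access
    GARLIC = "garlic"  # bypasses the CPU caches, high bandwidth to the GPU


@dataclass(frozen=True)
class Resource:
    name: str
    cpu_touches_often: bool  # hypothetical flag: CPU reads/writes it regularly
    gpu_streams_bulk: bool   # hypothetical flag: GPU pulls it in large volumes


def choose_bus(res: Resource) -> Bus:
    """Assumed heuristic: bulk GPU-only data goes to Garlic, the rest to Onion."""
    if res.gpu_streams_bulk and not res.cpu_touches_often:
        return Bus.GARLIC
    return Bus.ONION


if __name__ == "__main__":
    for res in (
        Resource("car_texture", cpu_touches_often=False, gpu_streams_bulk=True),
        Resource("game_state", cpu_touches_often=True, gpu_streams_bulk=False),
    ):
        print(res.name, "->", choose_bus(res).value)
```

The point of the sketch is only that the decision is made per allocation, up front, based on who will touch the data. That is the step the Reflections team says they got wrong at first.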
 

benny_a

extra source of jiggaflops
It's one of the major modifications by the hardware team:

Enabling the Vision: How Sony Modified the Hardware

The three "major modifications" Sony did to the architecture to support this vision are as follows, in Cerny's words:
"First, we added another bus to the GPU that allows it to read directly from system memory or write directly to system memory, bypassing its own L1 and L2 caches. As a result, if the data that's being passed back and forth between CPU and GPU is small, you don't have issues with synchronization between them anymore. And by small, I just mean small in next-gen terms. We can pass almost 20 gigabytes a second down that bus. That's not very small in today’s terms -- it’s larger than the PCIe on most PCs!"

5VFsgb4.png
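As a rough sanity check on Cerny's "larger than the PCIe on most PCs" remark, the sketch below works out the theoretical per-direction bandwidth of an x16 PCIe slot from the published transfer rates and line encodings for PCIe 2.0 and 3.0. The function name is made up for illustration, and real-world throughput is lower than these peak figures.

```python
def pcie_gb_per_s(gt_per_s: float, payload_bits: int, total_bits: int, lanes: int) -> float:
    """Peak one-direction bandwidth in GB/s for a PCIe link.

    gt_per_s: raw transfer rate per lane in gigatransfers per second
    payload_bits / total_bits: line-encoding efficiency (8b/10b or 128b/130b)
    """
    gbit_per_lane = gt_per_s * payload_bits / total_bits
    return gbit_per_lane / 8 * lanes


ONION_BUS_GB_PER_S = 20.0  # "almost 20 gigabytes a second" per the Cerny quote

pcie2_x16 = pcie_gb_per_s(5.0, 8, 10, 16)     # PCIe 2.0: 5 GT/s, 8b/10b
pcie3_x16 = pcie_gb_per_s(8.0, 128, 130, 16)  # PCIe 3.0: 8 GT/s, 128b/130b

if __name__ == "__main__":
    print(f"PCIe 2.0 x16: {pcie2_x16:.2f} GB/s")
    print(f"PCIe 3.0 x16: {pcie3_x16:.2f} GB/s")
    print(f"20 GB/s bus is faster than both: {ONION_BUS_GB_PER_S > max(pcie2_x16, pcie3_x16)}")
```

So the quoted figure does sit above the peak of an x16 slot of either generation, which is consistent with what Cerny said.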
 
I am not a technical person, but Mark Cerny said that there is a separate bus that links the GPU with the memory directly, so it can pass data back and forth without having to go through the cache.

*edit* What benny said.
 

onQ123

Member
Eurogamer has an interesting article on the process of porting Ubisoft racer The Crew from PC to PS4. In it, developers of the PS4 version speak in detail about the porting process. It's a fascinating read:

http://www.eurogamer.net/articles/digitalfoundry-how-the-crew-was-ported-to-playstation-4

What caught my eye was the mention of two separate system buses, one for the GPU and one for the CPU. While the GPU bus has the full bandwidth of 176Gb/s, the CPU one is 20Gb/s. Also, according to the devs, allocating data correctly between the CPU and GPU bus is extremely important in order to achieve good performance.





So I'd like the opinion of people far more knowledgeable than me in matters of technology as to what this means. If the memory pool is unified (which we know it is) but the buses are separate and of different speeds, is the memory subsystem truly unified?

There is also a bus that's shared between the 2
 

Killthee

helped a brotha out on multiple separate occasions!
Already had a similar thread on the same customizations.

I'm not sure how accurate this summary from the other thread is, but here it is:
lvp2.jpg


TL;DR: (SOMEONE CORRECT ME IF I'M WRONG)
This provides a super fast and efficient way of caching data without having to do redundant work. Cerny mentioned the CPU and GPU not having to copy redundant info from the cache in order to use it. It allows data to be transferred straight from CPU to GPU. All of this is integrated into the CPU, GPU and North Bridge (memory controller). All of this will significantly reduce latency beyond just providing a large L2 cache, because a lot of unnecessary work is cut out and more "shortcuts" are provided.


Even more TL;DR: Worried about GDDR5 latencies? Don't be. Large Cache, very capable bus, shortcuts and clever data transferring between CPU and GPU make that a non-issue.
http://www.neogaf.com/forum/showthread.php?t=533904
 

MORT1S

Member
The article also notes that two CPU cores are reserved for OS use.

This jibes with the Shadow Fall slides, no?
 
It's one of the major modifications by the hardware team:

Enabling the Vision: How Sony Modified the Hardware

The three "major modifications" Sony did to the architecture to support this vision are as follows, in Cerny's words:
"First, we added another bus to the GPU that allows it to read directly from system memory or write directly to system memory, bypassing its own L1 and L2 caches. As a result, if the data that's being passed back and forth between CPU and GPU is small, you don't have issues with synchronization between them anymore. And by small, I just mean small in next-gen terms. We can pass almost 20 gigabytes a second down that bus. That's not very small in today’s terms -- it’s larger than the PCIe on most PCs!"

This is a positive right?

I consider myself tech-savvy but this is a bit out of my pay grade......
 

BigDug13

Member
In one second, the CPU bus can pass as much data as 2 DVDs can hold. Nearly one single-layer Blu-ray's worth of data in a second. On the slower bus. That's pretty incredible.
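The disc comparison is easy to check with a few lines of Python. The capacities below are the nominal figures for each format (decimal gigabytes), and the 20 GB/s value is the "almost 20 gigabytes a second" from the Cerny quote, so treat the results as ballpark numbers.

```python
BUS_GB_PER_S = 20.0            # slower (CPU-side) bus, per the Cerny quote
DVD_SINGLE_LAYER_GB = 4.7      # nominal DVD-5 capacity
DVD_DUAL_LAYER_GB = 8.5        # nominal DVD-9 capacity
BLURAY_SINGLE_LAYER_GB = 25.0  # nominal single-layer Blu-ray capacity


def discs_per_second(bus_gb_per_s: float, disc_gb: float) -> float:
    """How many discs' worth of data the bus can move in one second."""
    return bus_gb_per_s / disc_gb


if __name__ == "__main__":
    print(f"Dual-layer DVDs per second:   {discs_per_second(BUS_GB_PER_S, DVD_DUAL_LAYER_GB):.2f}")
    print(f"Single-layer DVDs per second: {discs_per_second(BUS_GB_PER_S, DVD_SINGLE_LAYER_GB):.2f}")
    print(f"Single-layer Blu-rays per second: {discs_per_second(BUS_GB_PER_S, BLURAY_SINGLE_LAYER_GB):.2f}")
```

With dual-layer discs the "2 DVDs" figure holds (a bit over two per second), and the bus moves about 80% of a single-layer Blu-ray per second.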
 
but the core engineering effort in moving The Crew across to PlayStation 4 was accomplished in six months with a team of just two to three people working on it.

Gone are the days of complicated PS3 architecture.
 

TheCloser

Banned
Oh god please close this thread.

What's gonna happen is that we are going to have a whole bunch of people come here and try to use this information to prove that the Xbox One is more powerful. Obviously, these people have no idea what they are talking about but they will still argue for days. Lol.
 
I'm not sure how accurate this summary from the other thread is, but here it is:

Thanks for the link. My question is quite specific though: If you have to manage your data and make sure that you send the right data down the appropriate pipe, doesn't that practically negate part of the purpose behind a unified pool? I mean, you can still allocate the necessary amount of memory dynamically as needed, but you still have to juggle data between the CPU and GPU bus. I apologize in advance if I'm saying dumb things, this issue is way over my tech knowledge level.

Oh god please close this thread.

What? Why? It's about games tech, why shouldn't we be allowed to discuss it? It's a great opportunity for all of us to gain some knowledge on how the internals of a console work.
 

szaromir

Banned
It's GB/s, OP. And the memory setup seems quite efficient to me, it's not like Jaguars are beasts that require that much bandwidth.
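For anyone wondering why the capital B matters: a lowercase "b" means bits and an uppercase "B" means bytes, and there are 8 bits in a byte. A quick sketch of what the thread-title figures would mean if they really were gigabits (the helper name is made up for illustration):

```python
BITS_PER_BYTE = 8


def gigabits_to_gigabytes(gigabits: float) -> float:
    """Convert a rate in Gb/s (gigabits) to GB/s (gigabytes)."""
    return gigabits / BITS_PER_BYTE


if __name__ == "__main__":
    for label, value in (("CPU bus", 20.0), ("GPU bus", 176.0)):
        print(f"{label}: {value:g} Gb/s would be only {gigabits_to_gigabytes(value):g} GB/s")
```

Read as gigabits, the buses would be eight times slower than the 20 GB/s and 176 GB/s figures the article and Cerny actually quote.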
 

GameSeeker

Member
This is the impressive part:

Eurogamer said:
"We started off with a large codebase - there were about 12,000 source files. And we started with a 64-bit Windows version of the engine using D3D11," says Reflections' expert programmer (yes, that is an actual job title), Dr. Chris Jenner. [...] With the basic porting complete, the Ubisoft Reflections team is now ramping up its staff in order to complete the PS4 game ready for the Q1 2014 release, but the core engineering effort in moving The Crew across to PlayStation 4 was accomplished in six months with a team of just two to three people working on it. Overall, Reflections felt that the process of porting over the PC codebase was fairly simple and straightforward.

And this:

Eurogamer said:
"The PS4's GPU is very programmable. There's a lot of power in there that we're just not using yet. So what we want to do are some PS4-specific things for our rendering but within reason - it's a cross-platform game so we can't do too much that's PS4-specific," he reveals.

"There are two things we want to look into: asynchronous compute where we can actually run compute jobs in parallel... We [also] have low-level access to the fragment-processing hardware which allows us to do some quite interesting things with anti-aliasing and a few other effects."

This bodes really well for PS4 ports to be of very high quality and the best version on consoles.
 

kaching

"GAF's biggest wanker"
Thanks for the link. My question is quite specific though: If you have to manage your data and make sure that you send the right data down the appropriate pipe, doesn't that practically negate part of the purpose behind a unified pool?
You're never going to get away from data management, but at least a unified pool eliminates the need for additional data management around shuttling data between memory pools.
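A toy model of that difference, using plain Python buffers rather than anything PS4-specific: with split pools the "GPU" side needs its own copy of the data, while with a unified pool both sides can look at the same bytes. The function names are hypothetical and a `memoryview` is only standing in for a shared address space.

```python
def split_pools(cpu_data: bytearray):
    """Split-pool model: the GPU side gets a separate copy of the buffer."""
    gpu_copy = bytearray(cpu_data)  # explicit copy from one pool to the other
    bytes_copied = len(cpu_data)
    return gpu_copy, bytes_copied


def unified_pool(cpu_data: bytearray):
    """Unified-pool model: the GPU side views the same memory, nothing is copied."""
    gpu_view = memoryview(cpu_data)
    bytes_copied = 0
    return gpu_view, bytes_copied


if __name__ == "__main__":
    data = bytearray(b"\x01\x02\x03\x04")
    gpu_copy, copied_split = split_pools(data)
    gpu_view, copied_unified = unified_pool(data)

    data[0] = 99  # the CPU side updates the buffer afterwards

    print("split pools:  copied", copied_split, "bytes, GPU sees", gpu_copy[0])
    print("unified pool: copied", copied_unified, "bytes, GPU sees", gpu_view[0])
```

In the split case the copy goes stale and has to be shuttled across again. In the unified case you still have to decide how the buffer is allocated, but there is no second pool to keep in sync.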
 
this is actually a very good article, really interesting insight on PS4 dev in general. They mention again that Sony's tools apparently are more mature than Microsoft's. And I specifically loved this part :

Simon O'Connor did point out that Reflections considers its work on The Crew to end up being much more than a simple, feature-complete port. This is an opportunity to explore what the new hardware is capable of, and there's a sense that the PlayStation 4's graphics hardware is not being fully exploited.

"The PS4's GPU is very programmable. There's a lot of power in there that we're just not using yet. So what we want to do are some PS4-specific things for our rendering but within reason - it's a cross-platform game so we can't do too much that's PS4-specific," he reveals.

"There are two things we want to look into: asynchronous compute where we can actually run compute jobs in parallel... We [also] have low-level access to the fragment-processing hardware which allows us to do some quite interesting things with anti-aliasing and a few other effects."
 

benny_a

extra source of jiggaflops
My understanding based on previous discussion on this topic is that the 20GB/s bus is not really relevant to the external memory but rather was created as a way that the CPU can change memory while ignoring caches. Combine that with the hUMA (which allows for the shared address space where CPU and GPU can use the same language) it's mostly there to decrease latency for compute jobs.

I think it would be good to have someone with some authority elaborate on it, as I'm just repeating what was mentioned to me when I asked about it back when the bus was first revealed.
 
This is the impressive part:



And this:



This bodes really well for PS4 ports to be of very high quality and the best version on consoles.

Yeah, only two people working on a port in that short a time is amazing. Porting is one less issue devs have to worry about now.
 

MORT1S

Member
Yes, the profiler screen caps were used to determine 6 cores and 1.6 GHz frequency.

That seems to be a ton of time reserved for a system with a lot of dedicated hardware.

Would it be fair to assume a good amount of compute would be needed to make up for the two cores?
 
Every day I get in the queue (Too much, the Garlic Bus)
To get on the bus that takes me to you (Too much, the Garlic Bus)
I'm so nervous, I just sit and smile (Too much, the Garlic Bus)
Your console is only another mile (Too much, the Garlic Bus)
Thank you, Cerny, for getting me here (Too much, the Garlic Bus)
You'll be a developer, have no fear (Too much, the Garlic Bus)
I don't want to cause no fuss (Too much, the Garlic Bus)
But can I buy your Garlic Bus? (Too much, the Garlic Bus)
Nooooooooo!

I don't care how much I pay (Too much, the Garlic Bus)
I wanna play my PS4 each day (Too much, the Garlic Bus)
*[Garlic Bus, Garlic Bus, Garlic Bus
Garlic Bus, Garlic Bus, Garlic Bus
Give me 206 (Garlic Bus)
I won't take under (Garlic Bus)
Goes like thunder (Garlic Bus)
It's a four-stage wonder (Magic Bus!)

Garlic Bus, Garlic Bus, Garlic Bus, Garlic Bus
I want it, I want it, I want it...(You can't have it!)
Think how much you'll render...(You can't have it!)]
I want it, I want it, I want it, I want it ... (You can't have it!)
 
this is actually a very good article, really interesting insight on PS4 dev in general. They mention again that Sony's tools apparently are more mature than Microsoft's. And I specifically loved this part :

Good quote. Ubisoft is a good developer to have excited about your hardware (speaking from a total software sales point of view).

That seems to be a ton of time reserved for a system with a lot of dedicated hardware.

Would it be fair to assume a good amount of compute would be needed to make up for the two cores?

I think it's disingenuous to assume the two cores need to be made up for in most scenarios.
 

Zukuu

Banned
Wow. A port done by 2-3 people? That's literally the best part of the PS4. Ease of use. We might see a drop in budgets after all.
 

iceatcs

Junior Member
Wow. A port done by 2-3 people? That's literally the best part of the PS4. Ease of use. We might see a drop in budgets after all.

But I heard something similar about the Vita before it launched. I hope it turns out better this time.
 