The Xbox Series X RAM Setup
How is it really set up?
After thinking about it, I think the XSX memory setup is advertised in a slightly misleading way... They are basically telling you it is like this (each number is a memory chip and its capacity in GB):
1 1 1 1 1 1 1 1 1 1 + 1 1 1 1 1 1 (560 GB/s + 336 GB/s)
While in reality the memory config is more like this:
2 2 2 2 2 2 + 1 1 1 1
That makes it look like 336 GB/s + 224 GB/s, but technically that's not true either... Because the lanes from the 2GB chips and the 1GB chips are not 'separate'. The RAM is not split; it is one pool. So like this:
2 2 2 2 2 2 1 1 1 1
The question is, why don't they simply advertise 560 GB/s? That looks like a perfectly viable 10 x 56 GB/s setup... Right? Well... They are aware that if you do not allocate RAM efficiently, you'll run into problems. If you fill only the 2GB chips first, you get 336 GB/s. If you fill the 1GB chips first, you get 224 GB/s. If you fill them randomly, you'll get inconsistent performance, with the effective bandwidth constantly changing on you.
They want developers to use the RAM the way they are advertising it, which is entirely possible. The more lanes you use for data, the better, obviously. Even though it is not configured like that in reality, by artificially 'splitting' each 2GB module into two 1GB halves, you achieve the same result as what they are advertising.
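To make the arithmetic above easy to check, here is a minimal sketch. It assumes every chip contributes the same 56 GB/s (560 GB/s spread over 10 chips, as in the post); the variable and function names are made up for illustration.

```python
# Assumption taken from the post: each chip has its own 56 GB/s of lanes.
PER_CHIP_GBPS = 56

# Capacity of each chip in GB: six 2GB chips and four 1GB chips.
chips_gb = [2, 2, 2, 2, 2, 2, 1, 1, 1, 1]


def bandwidth(chips):
    """Peak bandwidth when data is spread over exactly these chips."""
    return PER_CHIP_GBPS * len(chips)


two_gb_chips = [c for c in chips_gb if c == 2]
one_gb_chips = [c for c in chips_gb if c == 1]

total_capacity = sum(chips_gb)
all_chips = bandwidth(chips_gb)
only_2gb = bandwidth(two_gb_chips)
only_1gb = bandwidth(one_gb_chips)

print(f"capacity: {total_capacity} GB")
print(f"all 10 chips: {all_chips} GB/s")
print(f"only the 2GB chips: {only_2gb} GB/s")
print(f"only the 1GB chips: {only_1gb} GB/s")
```

The point is simply that bandwidth follows the number of chips you touch, not the number of GB you fill.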
Aaaand here's where the complexity starts...
There is one caveat though. Obviously the 2GB modules use the lanes that they have. So even if you artificially split them, there aren't magically additional lanes for data transfer. The lanes need to be shared by the two sections of the 2GB chip... To put it another way, the 1GB chips get the full 56 GB/s per chip and thus per GB (please stick with me here). The 2GB chips, if not used correctly, rather than getting the 56 GB/s per GB needed to reach the advertised 560 GB/s, will get 28 GB/s per GB in the worst case. So you can't really advertise it as 560 GB/s + 336 GB/s here. In the worst case, the 10GB gets 4 x 56 + 6 x 28 = 392 GB/s and the other 6GB gets 6 x 28 = 168 GB/s. Now that is REALLY atrocious bandwidth.
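The per-GB figures can be sketched like this. It assumes, as a simplification, that a chip's 56 GB/s is divided evenly between the sections of that chip being accessed at the same time; the function name is hypothetical.

```python
# Assumption taken from the post: each chip has 56 GB/s of lanes in total.
PER_CHIP_GBPS = 56


def per_section_bandwidth(active_sections):
    """Bandwidth each 1GB section sees when `active_sections` sections
    of the same chip are being accessed at once (even split assumed)."""
    return PER_CHIP_GBPS / active_sections


one_gb_chip = per_section_bandwidth(1)      # a 1GB chip has a single section
two_gb_chip_best = per_section_bandwidth(1)  # only one half of a 2GB chip busy
two_gb_chip_worst = per_section_bandwidth(2)  # both halves busy at once

print(one_gb_chip, two_gb_chip_best, two_gb_chip_worst)
```

Real memory controllers do not necessarily split traffic this evenly, so treat the halved figure as the worst case of this simple model only.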
Is the RAM split or not?
The reality is that the RAM will work like a hybrid between a split and a unified RAM pool. What do I mean by that? It will work as a unified RAM pool in the sense that both the GPU and the CPU will have access to all the data in all 16GB. However, it will work as a split RAM pool in terms of data allocation. There will have to be two priority levels in the 2GB RAM chips. Only when there are no 1st priority calls on the RAM can the 2nd priority calls be executed. So whatever uses high bandwidth (like textures) will need to be given 1st priority, and whatever uses low bandwidth can go into 2nd priority.
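As a rough illustration of the two priority levels, here is a minimal sketch of strict-priority scheduling. This is only a toy model of the idea described above, not how the actual memory controller works; the class and method names are invented for the example.

```python
from collections import deque


class TwoLevelQueue:
    """Toy scheduler: 2nd priority requests only run when no
    1st priority requests are waiting."""

    def __init__(self):
        self.first = deque()   # high bandwidth work, e.g. textures
        self.second = deque()  # low bandwidth work

    def submit(self, request, first_priority):
        queue = self.first if first_priority else self.second
        queue.append(request)

    def next_request(self):
        if self.first:
            return self.first.popleft()
        if self.second:
            return self.second.popleft()
        return None  # nothing waiting


q = TwoLevelQueue()
q.submit("audio", first_priority=False)
q.submit("texture A", first_priority=True)
q.submit("texture B", first_priority=True)

order = [q.next_request() for _ in range(3)]
print(order)
```

Note that "audio" was submitted first but is served last, because it has to wait until the 1st priority queue is empty.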
It's getting more complicated...
And sadly, once again it is not that simple either... Because if you need something on screen right now that is low bandwidth, and it is set to low priority, you will get pop-in, for example. Or if you allocate all sound to the low priority section, then you'll get weird sound delays etc... Yes. That is quite complicated... If it is like that, I can see why we are getting many developers saying they like the PS5 more. It's simply much simpler. Despite the power of the XSX, it will require some creativity to learn and work with its RAM system. If it really is like this, it's actually possible (if not inevitable) that initially we see PS5 games looking better than XSX games, unless developers decide to keep all RAM usage under 10GB on both for ease of development. I don't think it will actually work that way. I certainly hope it doesn't...
Free cache lesson for you. It's relevant, I promise
The best way to really do it is to use the 2GB RAM chips as a sort of L1 and L2 cache. I don't know if people here know how cache works... I'll try and explain it briefly...
Say you have a processor, and the processor has two levels of cache. The first level, L1, can store two letters, and the second level, L2, can store four letters (typically, the L2 cache is larger, but for the XSX RAM it would be smaller). Cache basically keeps the most frequently used data close by for fast access. The 'closer' the cache is to the CPU, the faster it is.
In the beginning, the cache is empty, and as the processor does jobs, it fills the caches and changes the data in the caches accordingly.
Now imagine I am typing a long word, like pneumonoultramicroscopicsilicovolcanoconiosis (yeah, that's often cited as the longest word in the English language lol). The CPU has nothing in cache in the beginning, but it doesn't know that. It checks L1, no data. Then it checks L2, no data. Then it checks RAM, no data. Ultimately it arrives at the storage device, and copies all the used letters of the program into the caches and the RAM, in order. The most common letter will be saved in L1. The second most common letter will also be saved in L1. Now L1 is full. The 3rd, 4th, 5th, and 6th most common letters will be saved in L2. The rest are in RAM. Next time I type that word, the CPU can read much of the data from L1, then L2, then the RAM. It will do it much faster than before.
So if we go letter by letter, at first the L1 and L2 caches will look like this:
L1 [p,n]
L2 [e,u,m,o]
Now as we type further, things start to change... As we type pneumono, the o and the n become the most common, so p is 'downgraded' to L2, and o is added to L1. n stays:
L1 [n,o]
L2 [e,u,m,p]
As we type ultra, u has been used as often as n and o, but L1 is full, so it stays in L2, and everything stays the same. And so on and so on. When a letter has been used 3 times, it will push one of the letters in L1 down to L2, and since L2 is full, the least common letter there will be shifted to RAM if it's not already there.
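The letter example can be played through with a small simulation. This is a minimal sketch of the simplified, frequency-based scheme described above (real CPU caches use different replacement policies); it assumes a letter only moves up a tier when it has been used strictly more often than the least used letter in that tier. The function name `simulate` is made up for illustration.

```python
from collections import Counter


def simulate(text, l1_size=2, l2_size=4):
    """Return (l1, l2, ram) after 'typing' text one letter at a time."""
    counts = Counter()
    l1, l2, ram = [], [], []

    def promote(ch, lower, upper):
        # Swap ch with the least used letter of the upper tier, but only
        # if ch has been used strictly more often than that letter.
        if not upper:
            return
        victim = min(upper, key=lambda c: counts[c])
        if counts[ch] > counts[victim]:
            upper[upper.index(victim)] = ch
            lower[lower.index(ch)] = victim

    for ch in text:
        counts[ch] += 1
        if ch in l1:
            continue  # already in the fastest tier
        if ch not in l2 and ch not in ram:
            # First use: take the first tier that still has a free slot.
            if len(l1) < l1_size:
                l1.append(ch)
                continue
            if len(l2) < l2_size:
                l2.append(ch)
            else:
                ram.append(ch)
        if ch in ram:
            promote(ch, ram, l2)
        if ch in l2:
            promote(ch, l2, l1)
    return l1, l2, ram


for prefix in ("pneumo", "pneumono", "pneumonou"):
    l1, l2, ram = simulate(prefix)
    print(prefix, "L1", l1, "L2", l2, "RAM", ram)
```

After "pneumo" the tiers hold p,n and e,u,m,o; after "pneumono" the o has swapped places with p; and the extra u in "pneumonou" changes nothing because u is only tied with n and o.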
Enough caches. I want some more RAM sweetness!
So that was the short lesson on caches... Going back to the XSX RAM... If they let it work like cache, then everything is allocated to the 10GB first as data is accessed. When there's a data lookup, it will always look in the 10GB first. So, that means the 1GB RAM chips and the first 'tier' of the 2GB chips will have priority. Only when the required data is not found there will the lookup take place in the 2nd tier of the 2GB chips. If done this way, the two tiers will not compete with each other for bandwidth, and realistically you get 56 GB/s per GB in the 2GB RAM chips also, for both tiers. Now, the 560 GB/s is practically guaranteed, and so is the 336 GB/s.
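The "10GB first" idea can be sketched as a two-tier pool. This is a toy model under the assumptions of this post (a 10GB first tier and a 6GB second tier), not a description of the real allocator; `TieredPool` and its methods are hypothetical names.

```python
class TieredPool:
    """Toy model: allocate into the fast tier first, look up there first."""

    ORDER = ("fast", "slow")  # the 10GB tier is always tried first

    def __init__(self, fast_gb=10, slow_gb=6):
        self.free = {"fast": fast_gb, "slow": slow_gb}
        self.contents = {"fast": {}, "slow": {}}

    def allocate(self, name, size_gb):
        for tier in self.ORDER:
            if self.free[tier] >= size_gb:
                self.free[tier] -= size_gb
                self.contents[tier][name] = size_gb
                return tier
        raise MemoryError(f"no tier has {size_gb} GB free for {name}")

    def lookup(self, name):
        for tier in self.ORDER:
            if name in self.contents[tier]:
                return tier
        return None  # not resident in RAM at all


pool = TieredPool()
placed = [
    pool.allocate("textures", 8),
    pool.allocate("geometry", 2),
    pool.allocate("audio", 1),
]
print(placed, pool.lookup("audio"), pool.lookup("missing"))
```

Here the first two allocations fill the 10GB tier, so the third one falls through to the 6GB tier, and a lookup for it only reaches the second tier after missing in the first.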
If it works that way, it's actually a really smart design... And the XSX will have practically zero issues with RAM allocation. Then the XSX will truly have a great bandwidth advantage over the PS5. This is more likely the solution that MS came up with. Having developers manually tune it would be a nightmare. You might have to be a bit more careful with RAM than on the PS5, but it would not be a huge issue.