
Xbox Velocity Architecture - 100 GB is instantly accessible by the developer through a custom hardware decompression block

rntongo

Banned
And this is where we disagree strongly, since there has been zero evidence given, just wishful thinking and hype. They can't. If so, how does it achieve this? That's what we're discussing. There is zero evidence of this.

I don't think anyone claimed all cards prior to 2018 could do SF as efficiently as 2020 cards (they can do it, though). I'm pretty sure even GCN cards can. Hardware does get better incrementally, no doubt, just not a sudden 2x from something we are pretending is secret sauce yet at the same time is known.

So think logically. When the RTX cards get this DX12U update, does the introduction of SF in these 2018 cards result in needing half the VRAM on Nvidia cards? Meaning they can store 2x as much and have 2x as much bandwidth? Would it suddenly free up all these resources in all the games you say use this streaming tech from this gen, meaning they can do twice as much now?

We would have heard about that, right? Nvidia would be shouting about the update from the rooftops. Why are we suddenly finding out about this hyped, mysterious yet known feature that will offer 2x-3x the memory and bandwidth savings after a UE5 demo? I know why.

I'm saying this isn't XSX secret sauce, as it's in GPUs from 2018; it isn't even patented by MS. I'm not sure what you're expecting from it, but as somebody already said, I think you will be disappointed in the incremental jump of this particular technology. You will not get 2x-3x as much memory savings and bandwidth, and certainly not in comparison to other GPUs that are years old, let alone new ones that are going to come out. You will, however, get new games and engines that are more memory efficient and stream a hell of a lot better due to SSDs. That's the burger; you won't get the exclusive 3x-as-efficient hardware sauce.

I would have given you the benefit of the doubt of being more knowledgeable than the MSFT system architect for the XSX (Andrew Goossen), but unfortunately not even your explanation matches up technically to what he gave. You just posted something that sounds like conjecture.
 

rntongo

Banned
It is exactly the same.
Stop this delirium of hidden power.

2013: Hidden GPU.
2020: Hidden SSD bandwidth.

I was wondering why it is always on the Xbox side that we have this.

With the PS5 I didn't see anyone finding the hidden TF multiplier to make the PS5 have more TF than the SX, but on the SX SSD we have the bandwidth multiplier to match the PS5.


"Hidden Xbox technique increases SSD power by 300%."
"Hidden PS5 TF technique increases GPU power by 30%, GPU with 13TF"

Which of the two above is most likely to be real?

And yet I don't see anyone raving about hidden TF multiplier.

Ad hominem.

Edit: Thank you for editing out the remarks; they honestly were not improving the discussion.
 

jimbojim

Banned
I would have given you the benefit of the doubt of being more knowledgeable than the MSFT system architect for the XSX (Andrew Goossen), but unfortunately not even your explanation matches up technically to what he gave. You just posted something that sounds like conjecture.


But your explanation is beyond Goossen's knowledge.
 


jimbojim

Banned
So, my post was this at first (before the edit; later I added "OR HIGHER"):


When MSFTWTFBBQ said on the official page it is 2x-3x, but nevertheless it's much higher here for some reason, it's 4x.


Then this :

You respond without researching, then you start spreading lies based on your biases. They said 2x or 3x or even higher.

This is from the xsx technology glossary:
“it is an effective 2x or 3x (or higher) multiplier on both amount of physical memory and SSD performance.“

So you edited your post where you lied and didn’t even have the decency to apologize, just doubled down on an ad hominem.

You forgot to edit out the red part that you claimed was a lie. Please do some research before posting here. I already posted a tweet from an MSFT engineer with proof of the claim.


Yes, I'm a liar because I didn't take the info from the XSX PR link and wrote "OR HIGHER". Also, I apparently need to apologize. But Andrew Goossen said this in the Eurogamer article (for the record, I didn't have time to search further because the time to edit was running out, so I chose to take the info from the XSX PR link). Then rntongo, a few posts before, quoted Goossen from the Eurogamer article:

I'm honestly amazed by how much you have misunderstood this whole thing. You do realize all the custom hardware for texture streaming is under SFS? And that SFS is responsible for efficient texture streaming?

Here is a quote from the Eurogamer article with Andrew Goossen, where he explains how a 2-3x gain is made using SFS and its features:

"From this, we found a game typically accessed at best only one-half to one-third of their allocated pages over long windows of time," says Goossen. "So if a game never had to load pages that are ultimately never actually used, that means a 2-3x multiplier on the effective amount of physical memory, and a 2-3x multiplier on our effective IO performance."

A technique called Sampler Feedback Streaming - SFS - was built to more closely marry the memory demands of the GPU, intelligently loading in the texture mip data that's actually required, with the guarantee of a lower quality mip available if the higher quality version isn't readily available, stopping GPU stalls and frame-time spikes. Bespoke hardware within the GPU is available to smooth the transition between mips, on the off-chance that the higher quality texture arrives a frame or two later."
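To put numbers on Goossen's claim: the multiplier is just the reciprocal of the fraction of loaded pages a game actually touches. A minimal sketch of that arithmetic (illustrative only, using the one-half to one-third figures from the quote):

```python
# Goossen's 2-3x multiplier falls out of simple arithmetic: if a game only
# ever touches a fraction of the texture pages it loads, the effective
# memory/IO multiplier is the reciprocal of that fraction.
def effective_multiplier(fraction_of_pages_used: float) -> float:
    """Multiplier on effective physical memory and effective IO."""
    return 1.0 / fraction_of_pages_used

for fraction in (1 / 2, 1 / 3):  # "one-half to one-third" per the quote
    print(f"pages used: {fraction:.2f} -> {effective_multiplier(fraction):.1f}x effective")
# pages used: 0.50 -> 2.0x effective
# pages used: 0.33 -> 3.0x effective
```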

Link to the Eurogamer article:


So, Andrew Goossen isn't a liar. Or maybe he is. Depends on the situation, when there is a need in this thread to overstate SFS and the texture multiplier to 4x or even higher (let's say 6x) and show XSX superiority in every regard. Yeah, I'm a liar and I mislead people. And I need to apologize. I mean, WTF??

 

rnlval

Member
It's entirely valid for the comparison I was making - power usage. The 2080 and the 5700 are in the same range as the xsex, and both are ~250W.
From https://www.techpowerup.com/review/nvidia-geforce-rtx-2080-founders-edition/31.html

power_peak.png


The RTX 2080 FE has 226 watts for peak gaming, which is similar to Tom's RTX 2080 FE review.

power_average.png


RTX 2080 FE has 215 watts for average gaming.

Again, show your source for "250 watts" for RTX 2080.

The XSX GPU can reach higher clock speeds when CPU usage is reduced, like the SmartShift function.

Reminder: AMD claims 50% performance-per-watt improvements with RDNA 2, for the rumored 2x-scale RX 5700 XT Big Navi (~250 to 300 watts).

The R9 390X had ~47% performance-per-watt improvements over the 7970 GE/R9 280X.
 

THE:MILKMAN

Member
I've been reading up on this, trying to get my head around it all, and the best I can come up with is this... rightly or wrongly.

Currently a 4K 8MB texture has to be streamed into and 'parked' in RAM as a whole, whether all of it is needed or not, but with SF/SFS they can now stream and park just the precise amount of that texture in RAM that will be visible. This is where the 2-3x multiplier comes in: if they only need 2MB or 4MB of the texture, there's the saving.
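To make that concrete, here's a toy sketch of the saving (hypothetical sizes; the 64 KB tile size is the standard D3D tiled-resource granularity):

```python
# Toy model: stream only the visible tiles of a texture instead of the whole thing.
TILE_SIZE = 64 * 1024  # 64 KB, the standard D3D tiled-resource tile size

def bytes_streamed(texture_bytes: int, visible_fraction: float) -> int:
    """Bytes actually parked in RAM if only part of the texture is sampled."""
    total_tiles = texture_bytes // TILE_SIZE
    needed_tiles = max(1, round(total_tiles * visible_fraction))
    return needed_tiles * TILE_SIZE

whole = 8 * 1024 * 1024                # the 8 MB texture from the example
partial = bytes_streamed(whole, 0.25)  # only a quarter of it is visible
print(f"whole: {whole // 1024} KB, streamed: {partial // 1024} KB, "
      f"saving: {whole / partial:.1f}x")
# whole: 8192 KB, streamed: 2048 KB, saving: 4.0x
```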

Conversely on the PS5 (in theory) it is said to be so much faster/higher throughput in I/O and with such low latency that it doesn't need the parking lot (RAM) to park data as it just copies it in and out as needed (as you turn).

I'm sure I'm missing many steps where this isn't quite as good as either sounds, but this is where my head is right now.
 

rnlval

Member
Don't you think it's kind of weird that, with Microsoft being transparent, they didn't go into as much depth on their I/O system?

Same reason why Cerny avoided talking about the GPU a lot, while he didn't hesitate to give us details on the PS5 I/O.

Then there's this.

xboxseriesxvsps5.jpg
XSX GPU has 12.147 TFLOPS at a fixed frequency.

PS5 GPU has 10.275 TFLOPS at a variable frequency.

You round up with PS5 while you round down XSX. LOL You're biased.
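For what it's worth, both figures fall straight out of the public shader counts and clocks; a quick sanity check (shader ALUs x 2 FLOPs per cycle x clock in GHz):

```python
# FP32 TFLOPS = shader ALUs x 2 (one FMA = 2 FLOPs per cycle) x clock in GHz
def tflops(alus: int, ghz: float) -> float:
    return alus * 2 * ghz / 1000

print(f"XSX: {tflops(3328, 1.825):.3f} TFLOPS (52 CUs x 64 ALUs, fixed 1825 MHz)")
print(f"PS5: {tflops(2304, 2.230):.3f} TFLOPS (36 CUs x 64 ALUs, up to 2230 MHz)")
# XSX: 12.147 TFLOPS
# PS5: 10.276 TFLOPS (usually quoted as 10.28; the variable clock is a ceiling)
```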
 
People are just trying to estimate what that delta actually shrinks to.

I'm also reading about theories on how there are bottlenecks in the Xbox I/O system that don't allow it to operate at maximum speed. I saw a discussion on how Microsoft is having issues with BCPack, which is causing them to have bottlenecks within the system. I can definitely see them improving the software to reduce potential bottlenecks.

What I can't see them doing is matching Sony's I/O system, mostly due to the way the actual drive is designed. All the software and additional hardware that they have can't change the output of that SSD. If the physical limitation is 2.4 GB/s raw, then it will always be that way unless Microsoft changes the physical structure of the drive itself.

People just have to accept that the GPU delta (in favor of the XSX) and the I/O delta (in favor of the PS5) can't be eliminated at this point.


"But as you said, we'll have to wait until more official information arrives."

That's true but I'm not confident of the delta being narrowed due to the physical designs of the drives.
 

ToadMan

Member
From https://www.techpowerup.com/review/nvidia-geforce-rtx-2080-founders-edition/31.html

power_peak.png


The RTX 2080 FE has 226 watts for peak gaming, which is similar to Tom's RTX 2080 FE review.

power_average.png


RTX 2080 FE has 215 watts for average gaming.

Again, show your source for "250 watts" for RTX 2080.

The XSX GPU can reach higher clock speeds when CPU usage is reduced, like the SmartShift function.

Reminder: AMD claims 50% performance-per-watt improvements with RDNA 2, for the rumored 2x-scale RX 5700 XT Big Navi (~250 to 300 watts).

The R9 390X had ~47% performance-per-watt improvements over the 7970 GE/R9 280X.

You're comparing a stock 2080, which is why it's down on TFLOPS and specs compared to the xsex - the Super is closer but still less. And the 5700 XT is lower spec too. The other points I already know and never raised - but thanks for catching up.


The start of this was the claim that the xsex could be upclocked like the XB1 - it won't be; there's not enough wattage available. You haven't provided any answer to this point.


If you feel happier thinking the xsex will have the performance of a 2018 GPU... well, I never thought I'd hear such stupidity from even the most feverish PS fanboy. Enjoy your feels man - facts don't care about them...
 

Deto

Banned
The best way to end the delusion of hidden SSD bandwidth is to start saying that the PS5 has a hidden 3TF as well.

Because arguments and rationality don't work with these people.
 
I think you've very clearly misunderstood this technology then.

The previous technology uses the same feedbackBuffer and feedback texture for selecting the textures to stream. Texture space shading is different.

Even if I assume that it is doing this 300% faster, what you are misunderstanding is that this isn't a difference in efficiency in streaming assets in and out of memory; this would be compute efficiency.
Doing sampling "more frequently", as you said before, is worse for the memory.

You know what, turns out I did misunderstand something about this. Reading into this a bit more it doesn’t look like it saves you a sample.

This doesn’t really change much about our main argument though. I said the performance difference comes from an improvement in making sampling cheaper and an improvement in accuracy.

I wrongly emphasized the sampling performance improvement misunderstanding how texture sampling fit into this process. I speculated that more performant access to this information would let you sample more often giving you better results. All of that was wrong.

That said, both of my main points are still true in that:

- Sampler Feedback allows you to make more accurate decisions by providing information that previously wasn’t available from the texture sampler (as I described correctly multiple times)
- It’s possible to figure some of this information out (MIP level and which tile), but those methods are prohibitive; compared to them, Sampler Feedback is more performant (see the sketch below). The example given by Microsoft for a previous method is using CheckAccessFullyMapped to check each tile or MIP level. This is obviously not the same efficiency as Sampler Feedback, since it returns a true/false and not just the information you want.
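To illustrate the difference between the two approaches, here is a rough conceptual sketch in Python; this is pseudocode of the idea, not real D3D12/HLSL API calls:

```python
# Conceptual contrast only - not D3D12 code.

# Old approach: a residency probe per sample that only tells you pass/fail
# (analogous to CheckAccessFullyMapped), so working out *which* mip/tile
# you actually wanted takes extra probing and heuristics.
def old_style_probe(resident_tiles: set, tile: tuple) -> bool:
    return tile in resident_tiles

# Sampler Feedback: the sampler itself records what it wanted.
feedback_map: dict = {}  # (tile_x, tile_y) -> minimum mip level requested

def sample_with_feedback(x: int, y: int, mip: int) -> None:
    tile = (x // 64, y // 64)
    # The hardware writes the requested mip directly into the feedback map.
    feedback_map[tile] = min(feedback_map.get(tile, 15), mip)

# After rendering, the streaming system reads the feedback map and knows
# exactly which tiles and mip levels to load next frame - no guesswork.
sample_with_feedback(100, 40, mip=2)
sample_with_feedback(500, 40, mip=0)
print(feedback_map)  # {(1, 0): 2, (7, 0): 0}
```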

Sorry for inserting my own misinformation into this discussion.

Getting back to the main questions at hand:

1) Is this a new hardware feature or just some new API wrapping of existing hardware? The answer is this is obviously new hardware according to 4 different sources I’ve quoted that explicitly state this (Microsoft, Nvidia, Anandtech, that texture streaming middleware company). You have pointed me to a paper that does not clearly state that this information is directly available without any additional effort or heuristic. Maybe I’m a dunce but you haven’t proven that this information is already readily available. And if it was, why would multiple companies lie about the hardware nature of this new feature? Where are all the DX12 devs online shouting “Sampler Feedback is a lie!”

2) What is the mechanism by which this improves texture streaming that already exists and is already enhanced by hardware? Microsoft states in that video and in that blog post that Sampler Feedback gives you more accurate information by which you can make your streaming decisions. They show a “silly” example where the streaming decisions are so bad that all tiles are in memory (effectively no streaming), using 10X more memory than the example showing only the tiles that are used. In this example a perfectly accurate streaming system is 10X better. Clearly they are saying that Sampler Feedback helps you get closer to this perfect scenario and farther away from the other by helping you make more accurate streaming decisions.

3) Is it or can it be 2X-3X better? I think we can only speculate unless you’ve got a test bed to try it out on with an RTX card.
 

Ascend

Member
In summary, there's a theory that because of BCPACK, SFS and other features in the Xbox I/O system, it will be superior to Sony's I/O system.
I haven't seen anyone claim this. The angle is that the difference between them will be smaller. Sony will most likely still have the advantage.

That being said, people are mainly focusing only on SSDs right now, and forget that the consoles are a whole package. The whole angle of the SSD advantage for the PS5 assumes that the SSD on the PS5 is perfectly balanced with its CPU and GPU, while the XSX SSD is underpowered compared to the rest of the system. But in reality, we do not know that yet.

  • It is quite possible that the SSD on the PS5 can provide more data than its GPU can handle. In that case, the SSD is overpowered and we won't see graphical advantages on the PS5. It will still ease development, because efficient data transfer will not be required, but that would be it.
  • It is quite possible on the XSX that its GPU is overpowered compared to the data its SSD can deliver. In this case, the PS5 could produce more detailed textures in the final image, but the XSX will still have an advantage in other graphical processing due to its stronger GPU. You could possibly off-set that difference with some other method.
  • It is also equally possible that the SSD on the XSX is fast enough to feed what its GPU can handle. In that case, despite the SSD advantage of the PS5, the XSX will simply have the image quality advantage all-around.

All this will depend on a bunch of things. It might vary from game to game, engine to engine, developer to developer. We simply don't know yet.
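One way to picture those three scenarios is a per-frame supply/demand balance. The numbers below are completely made up; the sketch only shows the shape of the three cases:

```python
# Toy balance check: can the SSD supply what the renderer wants each frame?
def balance(ssd_gbps: float, demand_mb_per_frame: float, fps: int = 30) -> str:
    supply_mb = ssd_gbps * 1024 / fps  # MB deliverable per frame
    if supply_mb > demand_mb_per_frame * 1.5:
        return "SSD overpowered (headroom the GPU never uses)"
    if supply_mb < demand_mb_per_frame:
        return "SSD underpowered (GPU waits on data)"
    return "balanced (SSD keeps the GPU fed)"

for demand in (40.0, 70.0, 120.0):  # light, medium, heavy streaming loads
    print(f"2.4 GB/s raw, {demand:.0f} MB/frame -> {balance(2.4, demand)}")
```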
 

rnlval

Member
You're comparing a stock 2080, which is why it's down on TFLOPS and specs compared to the xsex - the Super is closer but still less. And the 5700 XT is lower spec too. The other points I already know and never raised - but thanks for catching up.


The start of this was the claim that the xsex could be upclocked like the XB1 - it won't be; there's not enough wattage available. You haven't provided any answer to this point.


If you feel happier thinking the xsex will have the performance of a 2018 GPU... well, I never thought I'd hear such stupidity from even the most feverish PS fanboy. Enjoy your feels man - facts don't care about them...
Reminder: the RTX 2080 has split integer and floating-point math processors, hence it has higher operations-per-cycle potential when combining both integer and floating-point workloads.

Each Turing CUDA FP core has a corresponding CUDA integer core.

int-fp-concurrent-execution.jpg


From https://www.gamersnexus.net/guides/3364-nvidia-turing-architecture-technical-deep-dive


clock_vs_voltage.jpg


The RTX 2080 FE has a 1897 MHz average clock speed, hence 11.134 TFLOPS average, not including TIOPS for INT32 performance.
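Checking that arithmetic with the RTX 2080's public shader count (2944 CUDA cores); the exact TFLOPS figure depends on which clock you plug in:

```python
# FP32 TFLOPS = CUDA cores x 2 FLOPs/cycle x clock in GHz (INT32 pipes are extra).
cores = 2944  # RTX 2080 shader count
for clock_ghz in (1.800, 1.897):  # FE spec boost vs the quoted average clock
    print(f"{clock_ghz * 1000:.0f} MHz -> {cores * 2 * clock_ghz / 1000:.2f} TFLOPS")
# 1800 MHz -> 10.60 TFLOPS (the FE spec-sheet boost figure)
# 1897 MHz -> 11.17 TFLOPS (the post's 11.134 implies a ~1891 MHz average)
```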

RDNA CUs' ALUs share integer and floating-point workloads.

The XSX GPU can be clocked higher when CPU usage is reduced, as per the SmartShift function.


46CTYSs.png


Atm, XSX has a fixed-frequency strategy, i.e. CPU at 3.6 GHz in 16-thread mode or 3.8 GHz in 8-thread mode, and GPU at 1.825 GHz.

XSX's fixed-frequency strategy is from before AMD's SmartShift is enabled, but the CPU itself has a mini-SmartShift, since disabling SMT reduces power consumption, which enables a higher clock speed. XSX's SmartShift exists in hardware while waiting to be fully enabled.

Any pure TFLOPS argument hides Turing's TIOPS CUDA core hardware.
 

Handy Fake

Member
Reminder: the RTX 2080 has split integer and floating-point math processors, hence it has higher operations-per-cycle potential when combining both integer and floating-point workloads.

Each Turing CUDA FP core has a corresponding CUDA integer core.

int-fp-concurrent-execution.jpg


From https://www.gamersnexus.net/guides/3364-nvidia-turing-architecture-technical-deep-dive


clock_vs_voltage.jpg


The RTX 2080 FE has a 1897 MHz average clock speed, hence 11.134 TFLOPS average, not including TIOPS for INT32 performance.

RDNA CUs' ALUs share integer and floating-point workloads.

The XSX GPU can be clocked higher when CPU usage is reduced, as per the SmartShift function.


46CTYSs.png


Atm, XSX has a fixed-frequency strategy, i.e. CPU at 3.6 GHz in 16-thread mode or 3.8 GHz in 8-thread mode, and GPU at 1.825 GHz.

XSX's fixed-frequency strategy is from before AMD's SmartShift is enabled, but the CPU itself has a mini-SmartShift, since disabling SMT reduces power consumption, which enables a higher clock speed. XSX's SmartShift exists in hardware while waiting to be fully enabled.
Can't it boost both CPU and GPU during the same workload?

"A new interface within AMD Radeon Software Adrenalin 2020 Edition makes it easy to see how power is being shifted to the CPU and GPU.
Unlike other implementations, AMD SmartShift can boost both components during the same workload."
 
I haven't seen anyone claim this.

I've seen people claim that we don't know which I/O system is superior. With the evidence that we have, we should be able to come to the conclusion that they are not equal. As in, one system's I/O is better than the other's. Like the GPU, for example.
 

rnlval

Member
Can't it boost both CPU and GPU during the same workload?

"A new interface within AMD Radeon Software Adrenalin 2020 Edition makes it easy to see how power is being shifted to the CPU and GPU.
Unlike other implementations, AMD SmartShift can boost both components during the same workload."
Clock speed alone doesn't tell you the usage factor; e.g. a lighter integer workload can trigger the CPU's boost mode while the GPU has its own boost mode.
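A minimal sketch of the SmartShift idea under that reading: one shared package budget, with whatever one side doesn't draw handed to the other. The wattages are made up for illustration and this is not AMD's actual algorithm:

```python
# Toy SmartShift: one shared power budget; watts the CPU doesn't use this
# frame can be handed to the GPU (and vice versa).
BUDGET_W = 200.0  # made-up total APU package budget

def split_power(cpu_demand_w: float, gpu_demand_w: float) -> tuple:
    cpu_w = min(cpu_demand_w, BUDGET_W)
    gpu_w = min(gpu_demand_w, BUDGET_W - cpu_w)  # GPU gets the leftovers
    return cpu_w, gpu_w

# A lighter CPU workload frees watts (and thus clock headroom) for the GPU.
for cpu_demand in (60.0, 40.0):
    cpu_w, gpu_w = split_power(cpu_demand, gpu_demand_w=180.0)
    print(f"CPU {cpu_w:.0f} W, GPU {gpu_w:.0f} W")
# CPU 60 W, GPU 140 W
# CPU 40 W, GPU 160 W
```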

PS: I own an HP Envy x360 15-inch 2-in-1 laptop/tablet with a mobile AMD Ryzen 5 2500U APU, and I use the RyzenAdj tool to change the power design limit beyond 25W, i.e. effectively restoring the hidden BIOS feature that lets the user change the power design limit from the Windows desktop. The laptop's PSU is rated at 65 watts and the cooling solution handles 35 watts without major issues, and my Surface Pro 5 tablet has 30 watts when connected to wall power.
 

Ar¢tos

Member
  • It is quite possible that the SSD on the PS5 can provide more data than its GPU can handle. In that case, the SSD is overpowered and we won't see graphical advantages on the PS5. It will still ease development, because efficient data transfer will not be required, but that would be it.
The idea that Sony would spend money creating a fully custom ssd that is faster than what the APU can handle is completely ridiculous.
It might surprise you, but they test these things.
They picked 5.5gb/s for a reason. Not 4.5, not 6.5, but 5.5gb/s.
If their system couldn't handle the speed, they would use a slower ssd and save money.
We are not talking about picking a retail SSD and having to choose between too slow or too fast. The PS5 SSD was designed specifically for the console, taking its capabilities into consideration.
 

Ascend

Member
The idea that Sony would spend money creating a fully custom ssd that is faster than what the APU can handle is completely ridiculous.
It might surprise you, but they test these things.
They picked 5.5gb/s for a reason. Not 4.5, not 6.5, but 5.5gb/s.
If their system couldn't handle the speed, they would use a slower ssd and save money.
We are not talking about picking a retail SSD and having to choose between too slow or too fast. The PS5 SSD was designed specifically for the console, taking its capabilities into consideration.
Aaaaaaand why would Sony be smart enough to do this, and Microsoft dumb enough to overpower their GPU compared to the SSD?
 

Deto

Banned
The idea that Sony would spend money creating a fully custom ssd that is faster than what the APU can handle is completely ridiculous.
It might surprise you, but they test these things.
They picked 5.5gb/s for a reason. Not 4.5, not 6.5, but 5.5gb/s.
If their system couldn't handle the speed, they would use a slower ssd and save money.
We are not talking about picking a retail SSD and having to choose between too slow or too fast. The PS5 SSD was designed specifically for the console, taking its capabilities into consideration.

You can answer that it is actually the SX GPU that is idle waiting for the slow SSD of the xbox sx.

But then it would be FUD, which is the monopoly of the "fans" of the xbox.
 

Panajev2001a

GAF's Pleasant Genius
I haven't seen anyone claim this. The angle is that the difference between them will be smaller.

Aaaaaaand why would Sony be smart enough to do this, and Microsoft dumb enough to overpower their GPU compared to the SSD?

Great point, so basically 12.3 TFLOPS might be purely a marketing figure they never really sustain, hence they are not really overpowering... is this what we are saying, since it seems that all is up for grabs ;).
 

rnlval

Member
The idea that Sony would spend money creating a fully custom ssd that is faster than what the APU can handle is completely ridiculous.
It might surprise you, but they test these things.
They picked 5.5gb/s for a reason. Not 4.5, not 6.5, but 5.5gb/s.
If their system couldn't handle the speed, they would use a slower ssd and save money.
We are not talking about picking a retail SSD and having to choose between too slow or too fast. The PS5 SSD was designed specifically for the console, taking its capabilities into consideration.
Factor this statement from https://www.pcgamer.com/unreal-engine-5-tech-demo/
Would this demo run on my PC with a RTX 2070 Super? Yes, according to Libreri, and I should get "pretty good" performance.
Epic Games chief technical officer Kim Libreri.

Smashing an AMD GPU with a heavy geometry workload may not be wise.
 

Ar¢tos

Member
Aaaaaaand why would Sony be smart enough to do this, and Microsoft dumb enough to overpower their GPU compared to the SSD?
There is no relation between gpu TFs and ssd speed. We could have a 20tf gpu and a 5400rpm hdd.
Sony simply decided to achieve the top speed possible while MS went for fast enough. Different approaches.
 

Ascend

Member
There is no relation between gpu TFs and ssd speed. We could have a 20tf gpu and a 5400rpm hdd.
Sure there is. The slower your drive compared to the GPU power, the more RAM you need.

Sony simply decided to achieve the top speed possible while MS went for fast enough. Different approaches.
Then I find it hilarious that you just claimed;

The idea that Sony would spend money creating a fully custom ssd that is faster than what the APU can handle is completely ridiculous.
I mean... Yeah... Ok... If the MS one is fast enough, wouldn't that mean that the PS5 one is overpowered? Just a question....

As for the APU not being able to handle what the SSD can transfer, we most likely already saw that, considering how Epic kept hammering on their billions of triangles assets, and the GPU outputting 'only' 20 million. But it's obvious that this is how it needs to be done.
 
There is no relation between gpu TFs and ssd speed. We could have a 20tf gpu and a 5400rpm hdd.
Sony simply decided to achieve the top speed possible while MS went for fast enough. Different approaches.

I would also like to add that Microsoft would probably have had to make sacrifices in the APU to have a better I/O solution. This could potentially lead them to having a narrower GPU, which would decrease the flop count. Flops are a lot easier to market than a superior I/O system, plus they lead to other advantages such as increased resolution.

71340_512_understanding-the-ps5s-ssd-deep-dive-into-next-gen-storage-tech.png


I don't think one is making dumb mistakes compared to the other; it's just that their goals for next gen are different.
 

rnlval

Member
There is no relation between gpu TFs and ssd speed. We could have a 20tf gpu and a 5400rpm hdd.
Sony simply decided to achieve the top speed possible while MS went for fast enough. Different approaches.
The UE5 demo on PS5 showed that excess data (with heavy geometry, e.g. triangle-per-pixel density) can overwhelm the PS5's GPU into mostly 1440p and 30 fps runtime performance, while an RTX 2070 Super (entry-level TU104) running the same UE5 demo reportedly has "pretty good performance".

AMD would have to surprise me with heavy geometry performance beating NVIDIA's geometry advantage.
 

Ar¢tos

Member
Sure there is. The slower your drive compared to the GPU power, the more RAM you need.


Then I find it hilarious that you just claimed;


I mean... Yeah... Ok... If the MS one is fast enough, wouldn't that mean that the PS5 one is overpowered? Just a question....

As for the APU not being able to handle what the SSD can transfer, we most likely already saw that, considering how Epic kept hammering on their billions of triangles assets, and the GPU outputting 'only' 20 million. But it's obvious that this is how it needs to be done.
Saying one is faster doesn't mean that the other one is slow or bad.
Not everything has to work at full speed, and games can be designed with the limitations of each console. Sony went for the fastest transfer speed their system can handle, but that doesn't mean all games will be designed around it, simply there is that option.
MS went for the max GPU they could get. Again, this doesn't mean that all games have to be designed around it, simply there is the option.
If you can't understand that and think the honor of your precious Xbox is under attack, that's your problem.
I'm done with pointless discussions.
 

Ascend

Member
Saying one is faster doesn't mean that the other one is slow or bad.
Not everything has to work at full speed, and games can be designed with the limitations of each console. Sony went for the fastest transfer speed their system can handle, but that doesn't mean all games will be designed around it, simply there is that option.
MS went for the max GPU they could get. Again, this doesn't mean that all games have to be designed around it, simply there is the option.
Good. So I guess we understand each other. Note that my previous comment on the different options is not universal; I clearly stated they would depend on the game, engine, etc.

If you can't understand that and think the honor of your precious Xbox is under attack, that's your problem.
I'm done with pointless discussions.
I have no interest in discussing 'console honor' or whatever. You were doing fine until that comment.

I would also like to add that Microsoft would probably have had to make sacrifices in the APU to have a better I/O solution. This could potentially lead them to having a narrower GPU, which would decrease the flop count. Flops are a lot easier to market than a superior I/O system, plus they lead to other advantages such as increased resolution.

71340_512_understanding-the-ps5s-ssd-deep-dive-into-next-gen-storage-tech.png


I don't think one is making dumb mistakes compared to the other; it's just that their goals for next gen are different.
That's quite fair. There are many factors to take into account when designing a system. In a sense, one could argue that the reason MS wants to work so hard on higher compression and SFS is exactly because the SSD is 'underpowered'. But we will not know how things work in practice until we have many games on the systems. And things might not work universally either. Maybe even in the same game, having a lot of foliage will give MS the advantage, while in another spot in the game where you're driving through the streets, the PS5 can stretch its legs more. It's a wait and see. The only thing that really is certain right now is that the PS5 will be easier to throw code at without causing bottlenecks.
 
MS wants to work so hard on higher compression and SFS is exactly because the SSD is 'underpowered'.

That makes a lot of sense. Kind of like how they boosted the X1's clocks to close the gap with the PS4.

But like I said before, there's only so much they can do at this point. Also, even if there's anything they can do to close the gap, it doesn't mean the competition isn't working to make improvements.

Like you said, we have to wait and see what the final results will be, but I'm pretty sure they will be in the PS5's favor where the I/O system is concerned.
 

FranXico

Member
As for the APU not being able to handle what the SSD can transfer, we most likely already saw that, considering how Epic kept hammering on their billions of triangles assets, and the GPU outputting 'only' 20 million. But it's obvious that this is how it needs to be done.
The mesh shaders (or, as they are called in UE5, Nanite) run on the GPU. The reduction to 20 million is for easing rendering.
 
You know what, turns out I did misunderstand something about this. Reading into this a bit more it doesn’t look like it saves you a sample.

...or maybe I was right? Probably not, but this reporting from guru3d had the same take:

"Sampler feedback solves this by allowing a shader to efficiently query what part of a texture would have been needed to satisfy a sampling request, without actually carrying out the sample operation."

 
I'm also reading about theories on how there are bottlenecks in the Xbox I/O system that don't allow it to operate at maximum speed. I saw a discussion on how Microsoft is having issues with BCPack, which is causing them to have bottlenecks within the system. I can definitely see them improving the software to reduce potential bottlenecks.

What I can't see them doing is matching Sony's I/O system, mostly due to the way the actual drive is designed. All the software and additional hardware that they have can't change the output of that SSD. If the physical limitation is 2.4 GB/s raw, then it will always be that way unless Microsoft changes the physical structure of the drive itself.

People just have to accept that the GPU delta (in favor of the XSX) and the I/O delta (in favor of the PS5) can't be eliminated at this point.


"But as you said, we'll have to wait until more official information arrives."

That's true but I'm not confident of the delta being narrowed due to the physical designs of the drives.

Where are you getting these theories from and who's discussing these things? I'd like some links, myself, because I've seen some people discussing these things around various places but IMO they are usually either missing big pieces of the picture or have some agenda of their own, fudging certain points they bring up to paint a different picture than what's probably actually there.

Regarding BCPack, the only thing I've heard personally is that they are looking to push the compression rate higher. I don't see how that suddenly means it is causing problems or bottlenecks, unless "bottlenecks" now pertains to temporary design challenges which every system goes through when under development (and more often than not resolves).
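The compression arithmetic itself is simple either way: effective throughput is just the raw drive speed times the average compression ratio, which is why a better BCPack ratio raises the effective figure without touching the 2.4 GB/s raw limit. A quick sketch using the publicly stated figures (2.4/4.8 GB/s for XSX, 5.5 and ~8-9 GB/s typical for PS5); the PS5 ratio below is inferred for illustration, not an official number:

```python
# Effective throughput = raw SSD speed x average compression ratio.
def effective_gbps(raw_gbps: float, ratio: float) -> float:
    return raw_gbps * ratio

print(f"XSX: {effective_gbps(2.4, 2.0):.1f} GB/s (2:1 average, MS's stated figure)")
print(f"PS5: {effective_gbps(5.5, 1.64):.1f} GB/s (~1.64:1 gets Cerny's ~9 GB/s typical)")
# Raising BCPack's ratio only changes the multiplier, not the 2.4 GB/s raw speed.
```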

In some ways it feels like some of these bottleneck concerns have only come up in light of Road to PS5 and seem like a talking point to warp some part of the discussion around. I've seen seemingly knowledgeable people around YouTube comments, for example, try implying XSX's SSD won't even reach half the stated speed, but if you look at their reasoning it turns out to be complete bunk they fabricate out of thin air. And then they might have a pattern to their discussion that further shows why they have the impression of that type of bottleneck, but none of that impression is really formed on anything MS have publicly stated.

For example, some people try saying since the CPU still does some of the work, that is now a bottleneck. But only 1/10th of one of the cores is actually doing any work in relation to the SSD I/O stack, not to mention even with SMT enabled XSX's CPU is still 100 MHz faster than PS5's, so essentially it cancels itself out. Plus if they're going with a more software-tailored approach they would want some portion of the management being handled on the CPU; on other systems it could be more than 1/10th of a Zen 2 core, depends on the processor. But that effectively means it should be deployed easily on a range of PC devices.

You're misunderstanding something here. I said going by the paper specs themselves does not paint the full picture. This applies to the GPUs; how is it suddenly impossible to see it being applicable to the SSD I/O? Because of physical structure? We don't even fully know what the physical structures of both are yet. One is 12 channels, the other is 4 channels, but that could be 12x 64 GB modules on one and either 4x 256 GB modules or 4x4 64 GB modules on the other. That in itself could point to a physical structure different than what some are assuming.
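Taking the channel counts in that paragraph at face value, the per-channel arithmetic looks like this; note the 4-channel XSX figure is this post's assumption, not a confirmed spec:

```python
# Raw bandwidth spread across flash channels (channel counts as assumed above).
def per_channel(total_gbps: float, channels: int) -> float:
    return total_gbps / channels

print(f"PS5: {per_channel(5.5, 12):.2f} GB/s per channel across 12 channels")
print(f"XSX: {per_channel(2.4, 4):.2f} GB/s per channel across 4 channels (assumed)")
# 0.46 vs 0.60 GB/s - broadly similar flash per channel, differing mainly
# in how many channels the controller runs in parallel.
```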

Whatever software optimizations are being taken, would not be taken if the companies didn't feel the hardware could support them. I find it a bit hard to believe MS are pushing optimizations that are basically creating bottlenecks in hardware they seem to have shown a great understanding for thus far, considering how many engineers and department divisions they have to leverage talent and development from. That part feels a bit like 4Chan drama bait, comparable to the rumors Sony PS5 devkits were overheating, or Sony panicking to up the GPU clock at the last minute. Those were pretty ridiculous rumors, true.

We'll find out more, for certain, but I'd say don't be surprised if the actual real-world performance in SSD I/O stack between the two systems is closer than what the (sparse) paper specs we know so far indicate. It won't close the gap, that's never been suggested. But it won't be of any real shock if the actual performance delta is notably smaller than it currently appears with what little is known so far. If you're only looking at the hardware side of the SSD I/O stack, you are not looking at the full picture. There are in fact other aspects of the way texture asset data can be stored, prioritized, accessed and drawn that could also factor into this, but that's getting too complicated for discussion here.

Just saw a tweet by a developer on Twitter. She believes that 100GB being instantly accessible means that the system will believe the XSX actually has 116GB of RAM rather than 16GB. So, you have the 10GB @ 560GB/s, the 6GB @ 320GB/s, and the 100GB @ SSD speed.


Interestingly, she posted this a few months ago;



Kind of gives me flashbacks to older microcomputers that basically allocated ROM as part of the memory addressing space. Unfortunately NAND is not as fast as ROM, but I'm curious if this is divergent in any way from what AMD's SSG cards do.
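As a purely speculative sketch of that ROM-mapping analogy: three tiers with very different bandwidths presented to the title as one address space. The tier layout mirrors the figures in the tweet above and is an illustration, not a confirmed XSX memory map:

```python
# Speculative: three memory tiers exposed as one flat address space,
# like ROM mapped into the RAM address range on old microcomputers.
TIERS = [
    ("GPU-optimal RAM", 10, 560.0),  # 10 GB @ 560 GB/s
    ("standard RAM",     6, 320.0),  #  6 GB @ 320 GB/s
    ("SSD-backed",     100,   2.4),  # 100 GB at raw SSD speed
]

def tier_for(address_gb: float) -> tuple:
    """Return (tier name, bandwidth in GB/s) for an address in GB."""
    base = 0.0
    for name, size_gb, gbps in TIERS:
        if address_gb < base + size_gb:
            return name, gbps
        base += size_gb
    raise ValueError("address out of range")

print(tier_for(4.0))   # ('GPU-optimal RAM', 560.0)
print(tier_for(14.0))  # ('standard RAM', 320.0)
print(tier_for(50.0))  # ('SSD-backed', 2.4)
```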

Gonna need to do some more research on this, curious about what they're trying to do here. I'm almost ready to leapfrog to August but wouldn't want to miss the PS5 and XSX gameplay events first xD.
 

quest

Not Banned from OT
The idea that Sony would spend money creating a fully custom ssd that is faster than what the APU can handle is completely ridiculous.
It might surprise you, but they test these things.
They picked 5.5gb/s for a reason. Not 4.5, not 6.5, but 5.5gb/s.
If their system couldn't handle the speed, they would use a slower ssd and save money.
We are not talking about picking a retail ssd and having to pick between too slow or too fast. The ps5 ssd was designed specifically for the console having in consideration its capabilities.
The Epic demo says otherwise: they didn't even have to compress the 8K textures, and neither next-generation APU could handle what was left over in bandwidth. God help them if they had compressed the textures. The Sony SSD clearly can overpower either APU, like one of us drinking from a firehose. The question left is whether the Microsoft APU is going to be left thirsty or not. We probably won't know for 3 years, when cross-generation is done and games designed around the SSD are coming to market. Both systems have questions to answer in 3-4 years. Will the Sony APU's throttling cause issues when games ditch Jaguar and push APU workloads? For Microsoft, can the SSD keep the APU fed?
 
Where are you getting these theories from and who's discussing these things?

R rntongo explained it to me a while ago. I believe the discussion started around the time the State of Decay demo was revealed. The theory behind the load times was that BCPack was still being worked on and wasn't fully ready yet. R rntongo quoted a Microsoft engineer who claims that they are still working on BCPack. The speculation is that once they finish with the improvements, it will allow the console to be closer to its maximum I/O capabilities.


"If you're only looking at the hardware side of the SSD I/O stack, you are not looking at the full picture."

The hardware needs to be there for a good I/O system to even be possible. I know there's a ton of software that's designed to function with the hardware, but it's still limited by the actual hardware of the device. The SSD itself, combined with the decompressors, co-processors, ESRAM, memory controller and other hardware, forms the basis of how the I/O system will function.

I keep hearing about how Microsoft is trying to close the gap but I'm wondering if the same isn't true for the competition trying to widen it?

I still believe that Microsoft will inform us if they close the gap, much like they did when they increased the clock speed of the X1. There's no way that Microsoft is going to let people believe there's a big I/O gap between the two if it isn't true.
 

jimbojim

Banned
Just saw a tweet by a developer on Twitter. She believes that 100GB being instantly accessible means that the system will believe the XSX actually has 116GB of RAM rather than 16GB. So, you have the 10GB @ 560GB/s, the 6GB @ 320GB/s, and the 100GB @ SSD speed.


Interestingly, she posted this a few months ago;




Whatever Louise said, I wouldn't take it seriously in any way (especially when he/she is only an XSX/W10 dev and also follows these guys):

p4MdIHE.jpg
 
R rntongo explained it to me a while ago. I believe the discussion started around the time the State of Decay demo was revealed. The theory behind the load times was that BCPack was still being worked on and wasn't fully ready yet. R rntongo quoted a Microsoft engineer who claims that they are still working on BCPack. The speculation is that once they finish with the improvements, it will allow the console to be closer to its maximum I/O capabilities.


"If you're only looking at the hardware side of the SSD I/O stack, you are not looking at the full picture."

The hardware needs to be there for a good I/O system to even be possible. I know there's a ton of software that's designed to function with the hardware, but it's still limited by the actual hardware of the device. The SSD itself, combined with the decompressors, co-processors, ESRAM, memory controller and other hardware, forms the basis of how the I/O system will function.

I keep hearing about how Microsoft is trying to close the gap but I'm wondering if the same isn't true for the competition trying to widen it?

I still believe that Microsoft will inform us if they close the gap, much like they did when they increased the clock speed of the X1. There's no way that Microsoft is going to let people believe there's a big I/O gap between the two if it isn't true.

Ah okay, I see. Yeah, R rntongo is right in that regard bringing up the quote, but I think there might be some difference between you and me on what is actually meant by "allow(ing) the console to be closer to its maximum I/O capabilities". You seem to think of it in terms of them having current SSD I/O that can't be fully utilized due to software issues being worked on. That might be true, but IMO that wouldn't be worth calling a bottleneck; a bottleneck would only be present if a certain software feature simply can't be implemented on the hardware due to a limitation of the hardware itself, considering the software feature fully complete and the hardware fully tuned to a retail spec. Neither of those is true at the moment.

The hardware you're speaking of is already there to an extent. XSX's memory controller is rated higher than the stated speed; around 3.75 GB/s from what I've heard, if it's the Phison E16 variety (which has probably been tuned for the system, at least whatever version MS is utilizing). I don't know where the perception that the XSX doesn't have the hardware in place for a good SSD I/O system comes from. Again, some of the work for example is done on the CPU, but only on 1/10th of a single core, and the cores are clocked higher. That should be an indication the hardware is present.

Other things like decompression, the hardware is there for it. Co-processors? We don't know, but there's a good chance co-processors of some sort are within the APU design. Probably not the same as Sony's, but I'd be surprised if some type of co-processor isn't present. There is no ESRAM in PS5's SSD I/O block; they've stated SRAM, which probably also means it's going to be relatively small in terms of a cache compared to the DRAM caches on higher-level SSDs.

Regarding Sony, I'm certain they are also tuning the software-stack side of their SSD I/O system, though the hardware is more or less locked in place. TBF, Sony doesn't seem as invested in specifically custom software for theirs compared to Microsoft; they are licensing Kraken, after all. Which they can customize to an extent, but there's going to be a limit to that since the foundation is not their own.

I don't see MS announcing if they've "closed the gap" because for one, it might be bad optics and secondly, they can't fully close the gap due to the physical differences in their respective I/O systems. Not to mention, both consoles are taking somewhat different approaches to resolving these issues, so some aspects are not apples-to-apples comparisons. I think MS will focus more on presenting an alternative approach to handling data throughput on the SSD I/O pipeline compared to Sony's, and demonstrate how effective it is. Different approaches that are equally valid in their own ways, and each having some advantages over the other.
 
jimbojim jimbojim Gave it a read; nothing they say is impossible, and it sounds like the type of solution MS would prefer to take, in all honesty. I'm interested to see what comes to fruition from the ideas presented.

Also I couldn't care less who someone follows on Twitter, just focused on the person in particular and what they have to say. It's like trying to deplatform someone because they follow Donald Trump.

MasterCornholio MasterCornholio C'mon man, you can be better than that. Focus on the content of the message, not who someone follows. Guilt by association is such a cancel-culture tactic IMHO. I don't care who someone follows on Twitter. If they bring up an interesting idea or theory that can potentially be plausible, I'll look into it and see what I think of it.

Also assuming someone is engaging in behavior without proof is...well, not good. Gotta show some receipts; would say the same of anyone accusing someone affiliated with a PS circle of FUD. No proof? No care from me :LOL:
 

jimbojim

Banned
jimbojim jimbojim Gave it a read; nothing they say is impossible, and it sounds like the type of solution MS would prefer to take, in all honesty. I'm interested to see what comes to fruition from the ideas presented.

Also I couldn't care less who someone follows on Twitter, just focused on the person in particular and what they have to say. It's like trying to deplatform someone because they follow Donald Trump.

MasterCornholio MasterCornholio C'mon man, you can be better than that. Focus on the content of the message, not who someone follows. Guilt by association is such a cancel-culture tactic IMHO. I don't care who someone follows on Twitter. If they bring up an interesting idea or theory that can potentially be plausible, I'll look into it and see what I think of it.

Also assuming someone is engaging in behavior without proof is...well, not good. Gotta show some receipts; would say the same of anyone accusing someone affiliated with a PS circle of FUD. No proof? No care from me :LOL:

Nope! Like I've said, I wouldn't take that person seriously. I need to find some previous tweets from her/him which are hilarious.
 