
Sampler Feedback Streaming appears to be the real deal. Game Stack Live real-time demo impressions. (video to come soon)

So I watched the session titled "Xbox Velocity Architecture: Fast Game Asset Streaming and Minimal Load Times for Games of Any Size"

Remember that demo they showed running on Series S? Well, they expanded on it significantly and ran it this time on Series X. They showed a lot more of it, gave a lot more detail, and showed it running in real time. Needless to say, the results are fucking impressive.

So the demo we saw has over 10GB of texture data in it, highly detailed for close up inspection.

They showed the numbers for streaming the same content on the equivalent of an Xbox One X, and how long it takes to load it all (about 22 seconds for 2.7GB of data).

On Series X this was dropped to only 565MB, and the Series X completed the task twice in real time in under 0.20 seconds: 0.19 seconds the first time, 0.17 seconds the second.

They also showed a highly optimized, intentionally conservative Gen 9 console equivalent of texture streaming without Sampler Feedback Streaming. They pointed out that the numbers shown are actually on the low side for the Gen 9 version without SFS, because many texture streaming systems over-stream since they aren't nearly as optimized as the example used here. Versus the last-gen equivalent, Series X did the equivalent of 2.68GB in less than 0.20 seconds. Versus the intentionally conservative, highly optimized Gen 9 equivalent (which they stress would actually be using more memory than what's shown), Series X did the equivalent of 1.57GB in less than 0.20 seconds.
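For a rough sense of the multiplier implied by those figures, here's a back-of-the-envelope calculation in plain C++ using only the numbers quoted above (the "equivalent" figures are as reported in the session, not something I measured):

```cpp
#include <cstdio>

int main() {
    // Figures quoted in the session above (taken as given, not re-measured).
    const double lastGenEquivalentGB = 2.68;   // what One X-style streaming would have needed
    const double gen9NoSfsGB         = 1.57;   // conservative Gen 9 streaming without SFS
    const double seriesXSfsGB        = 0.565;  // what Series X actually streamed with SFS

    std::printf("vs last-gen style streaming : %.1fx less data\n",
                lastGenEquivalentGB / seriesXSfsGB);
    std::printf("vs Gen 9 without SFS        : %.1fx less data\n",
                gen9NoSfsGB / seriesXSfsGB);
    return 0;
}
```

That works out to roughly 4.7x less data than the last-gen style system and roughly 2.8x less than the already-conservative Gen 9 comparison, for the same view.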

By far one of the most impressive parts for me was when they started doing really fast camera cuts of the kind that would typically give most streaming systems trouble and expose pop-in. Even in those extreme cases Sampler Feedback Streaming was insanely fast, with no pop-in that I could see, and with each cut you could be dealing with north of 1-2GB of texture data needed in an instant.

They pointed out that typically, whenever games need to do this type of loading, a loading screen or a cutscene is used to hide the pop-in, but SFS is so fast that none is needed, even when dealing with over 10GB of texture data.

The video session is over so the video isn't available now, but will surely be available later. Here's a screen cap from the demo.

[screen cap from the demo]


They point out that Xbox Series X and S games are guaranteed no less than a sustained 2GB/s of raw read speed. The drive still does 2.4GB/s raw, but only when the OS and hardware aren't doing anything else. They reiterate that SFS is a legitimate multiplier of SSD speed and performance as well as of graphics memory, so going by the minimum guaranteed read speed, that would be 6GB/s raw with SFS and 10GB/s compressed with SFS.
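Doing the arithmetic behind that kind of claim (the ~2.5x SFS factor and the ~2:1 compression ratio below are my assumptions for the sketch, not figures from the session; the real saving depends entirely on content and camera behaviour):

```cpp
#include <cstdio>
#include <initializer_list>

// Minimal arithmetic sketch: "effective" streaming rate = physical read rate
// multiplied by the SFS saving. The 2.5x multiplier and 2:1 compression ratio
// are assumptions for illustration only.
int main() {
    const double sfsMultiplier = 2.5;
    const double compression   = 2.0;

    for (double raw : {2.0 /* guaranteed floor */, 2.4 /* peak */}) {
        std::printf("raw %.1f GB/s -> effective raw ~%.1f GB/s, "
                    "effective compressed ~%.1f GB/s\n",
                    raw, raw * sfsMultiplier, raw * compression * sfsMultiplier);
    }
    return 0;
}
```

With those assumed factors, the guaranteed floor lands around 5GB/s raw and 10GB/s compressed effective, and the peak around 6GB/s and 12GB/s, which is the ballpark the presentation is describing.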

Now, what about far more complex-looking games? One could say it works that well only because it's a controlled demo, right? They say with confidence that Sampler Feedback Streaming's multiplier effect will be the same no matter what.

"the absolute numbers you see in the numbers bar aren't that high especially compared to the 10 - 16GB of memory you see in our consoles. Our content in this tech demo is fairly simple. A real AAA title would likely have significantly more complex materials and more objects visible. Crucially however, the comparison between sampler feedback streaming and traditional MIPS streaming holds true regardless of material complexity. The numbers will scale with content and the multiplier will still ring true."

Full Game Stack Live Presentation Video added:


 

cormack12

Gold Member
Need to see the video really, looks interesting though. Wasn't the differentiating feature about SFS that it passes the next read back in the same loading call or something?
 
Need to see the video really, looks interesting though. Wasn't the differentiating feature about SFS that it passes the next read back in the same loading call or something?

The differentiating factor, to my understanding, is that it can load bits and pieces of textures instead of the whole texture, and more accurately and quickly predict what will be needed for the scene, and just do away with what it has no need for. That's the layman's version, but there is a more complex explanation that they give in the video.
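To make that layman's version a bit more concrete, here's a toy sketch of the general idea in plain C++ (my own illustration, not Microsoft's implementation or the actual D3D12 sampler feedback API): keep resident only the tiles the sampler actually touched last frame, stream in the newly touched ones, and drop the rest.

```cpp
#include <cstdio>
#include <set>
#include <utility>

// Toy model of feedback-driven residency: a texture is split into tiles, the GPU
// reports which tiles were actually sampled last frame, and the streamer loads
// exactly those and drops what is no longer referenced. Sketch only.
using Tile = std::pair<int, int>;   // (mip level, tile index)

void UpdateResidency(std::set<Tile>& resident, const std::set<Tile>& sampledLastFrame) {
    for (const Tile& t : sampledLastFrame)
        if (resident.insert(t).second)                       // newly needed -> stream it in
            std::printf("load   mip %d, tile %d\n", t.first, t.second);

    for (auto it = resident.begin(); it != resident.end(); ) // no longer sampled -> evict
        if (!sampledLastFrame.count(*it)) {
            std::printf("evict  mip %d, tile %d\n", it->first, it->second);
            it = resident.erase(it);
        } else {
            ++it;
        }
}

int main() {
    std::set<Tile> resident;
    UpdateResidency(resident, {{0, 12}, {0, 13}, {1, 3}});   // camera looks one way
    UpdateResidency(resident, {{0, 13}, {2, 0}});            // camera cuts elsewhere
}
```

A real streamer would of course add hysteresis (keep tiles around for a few frames) and prioritise loads, but the shape of the loop is the same.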
 

cormack12

Gold Member
The differentiating factor, to my understanding, is that it can load bits and pieces of textures instead of the whole texture, and more accurately and quickly predict what will be needed for the scene, and just do away with what it has no need for. That's the layman's version, but there is a more complex explanation that they give in the video.

Yeah, I was thinking more of Sampler Feedback, which it gets confused with. Which was part of DX12? It's able to use the previous frame to know what to stream next, I guess?

 

longdi

Banned
does this mean we get really nice textures always?

looks really nice and CG like even without RT global illumination
 

Godfavor

Member
The differentiating factor, to my understanding, is that it can load bits and pieces of textures instead of the whole texture, and more accurately and quickly predict what will be needed for the scene, and just do away with what it has no need for. That's the layman's version, but there is a more complex explanation that they give in the video.
PRT+ and SF can do the same thing: they load only part of a texture into memory and save memory space by doing so. This is done in all games that support partial texture rendering.

What SFS does is stream bits of partial textures on the fly depending on the camera's distance and view. So if an object is far away and only covers 10 pixels on screen, SFS will load just enough texture for those 10 pixels, rather than the partial mapping of PRT+ or the texture LOD of that model.

Not sure how much better SFS is compared to the old PRT+ at accurately predicting the next texture on screen.
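The "10 pixels on screen" point is basically ordinary mip math. A rough sketch in plain C++ (my own illustration, with made-up numbers) of why a distant object only ever needs a tiny amount of texture resident:

```cpp
#include <algorithm>
#include <cmath>
#include <cstdio>

// Rough illustration: if an object only covers a handful of pixels, only a tiny
// mip of its texture is ever sampled, so that is all that needs to be resident.
// Numbers are made up for the example.
int MipForScreenCoverage(int textureSize, double pixelsCovered) {
    // Pick the mip whose resolution roughly matches the on-screen footprint.
    double texelsPerPixel = textureSize / std::max(pixelsCovered, 1.0);
    int mip    = (int)std::floor(std::log2(std::max(texelsPerPixel, 1.0)));
    int maxMip = (int)std::log2((double)textureSize);
    return std::min(mip, maxMip);
}

int main() {
    std::printf("4096x4096 texture on an object covering ~10 px -> mip %d\n",
                MipForScreenCoverage(4096, 10));    // a tiny mip, a few KB
    std::printf("same texture filling a 2160 px tall screen     -> mip %d\n",
                MipForScreenCoverage(4096, 2160));  // near mip 0, tens of MB
}
```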
 
Since this seems to be a software solution, I assume sony can implement it as well?

It's more than just software: there are specific GPU-related hardware enhancements to implement what's taking place, plus custom texture filtering hardware, as confirmed by a Microsoft engineer. Then there's the DirectStorage API and the hardware decompression block using Microsoft's BCPack texture compression. Sony has their own solution, but I don't think they have Sampler Feedback. Either way, both consoles will be amazing.
 

cormack12

Gold Member
PRT+ and SF can do the same thing: they load only part of a texture into memory and save memory space by doing so. This is done in all games that support partial texture rendering.

What SFS does is stream bits of partial textures on the fly depending on the camera's distance and view. So if an object is far away and only covers 10 pixels on screen, SFS will load just enough texture for those 10 pixels, rather than the partial mapping of PRT+ or the texture LOD of that model.

Not sure how much better SFS is compared to the old PRT+ at accurately predicting the next texture on screen.

Yeah, it looked like it can break the mipmap up and even mix and match. It looks quite cool. But obviously, when you see this and then see how an engine can ruin it (like in The Medium's collectible model viewer), we need to see it evolve. But I loved the demo and video.
 

Dampf

Member
Yep, it's great.

Also got the confirmation on Game Stack Live regarding hardware support for the PC gang here.

It is supported on all DX12 GPUs, and on PCs and laptops with an NVMe drive.

For the best experience, a DX12 Ultimate GPU is recommended (Turing, Ampere, RDNA2); this will enable SFS similar to the Xbox consoles.
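On the PC side, the usual way an engine would discover this at runtime is a D3D12 feature check. A minimal sketch (assumes a recent Windows SDK and that you already have a valid ID3D12Device from somewhere; device creation and proper error handling are omitted):

```cpp
#include <windows.h>
#include <d3d12.h>
#include <cstdio>

// Minimal sketch: ask whether the GPU exposes sampler feedback at all
// (TIER_0_9 on e.g. Turing, TIER_1_0 and up elsewhere). Assumes a valid
// ID3D12Device* created elsewhere.
void ReportSamplerFeedbackSupport(ID3D12Device* device) {
    D3D12_FEATURE_DATA_D3D12_OPTIONS7 options7 = {};
    if (FAILED(device->CheckFeatureSupport(D3D12_FEATURE_D3D12_OPTIONS7,
                                           &options7, sizeof(options7))) ||
        options7.SamplerFeedbackTier == D3D12_SAMPLER_FEEDBACK_TIER_NOT_SUPPORTED) {
        std::puts("Sampler feedback not supported: fall back to classic mip streaming.");
        return;
    }
    std::puts(options7.SamplerFeedbackTier == D3D12_SAMPLER_FEEDBACK_TIER_0_9
                  ? "Sampler feedback tier 0.9."
                  : "Sampler feedback tier 1.0 or better.");
}
```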
 
Here's another impressive-as-hell piece. On the quick camera cuts, the equivalent of over 5GB and 4GB was called for by the Xbox One X-equivalent streaming system and the Gen 9 console equivalent without SFS, respectively.

Sampler Feedback Streaming peaked at 2.1GB of texture data, and it was all still handled instantly without a sign of any pop-in.
 

DForce

NaughtyDog Defense Force
Since this seems to be a software solution, I assume sony can implement it as well?
"With Nanite, we don't have to bake normal maps from a high-resolution model to a low-resolution game asset; we can import the high-resolution model directly in the engine. Unreal Engine supports Virtual Texturing, which means we can texture our models with many 8K textures without overloading the GPU." Jerome Platteaux, Epic's special projects art director, told Digital Foundry. He says that each asset has 8K texture for base colour, another 8K texture for metalness/roughness and a final 8K texture for the normal map. But this isn't a traditional normal map used to approximate higher detail, but rather a tiling texture for surface details.
Something similar was used in the Unreal Engine 5 demo shown on the PS5.


 

Major_Key

perm warning for starting troll/bait threads
Here's another impressive-as-hell piece. On the quick camera cuts, the equivalent of over 5GB and 4GB was called for by the Xbox One X-equivalent streaming system and the Gen 9 console equivalent without SFS, respectively.

Sampler Feedback Streaming peaked at 2.1GB of texture data, and it was all still handled instantly without a sign of any pop-in.

XBOX VELOCITY ARCHITECTURE BEAST.

It could mean that the Series S is a much stronger machine in next-gen-only games developed with XVA in mind.
 
The differentiating factor, to my understanding, is that it can load bits and pieces of textures instead of the whole texture, and more accurately and quickly predict what will be needed for the scene, and just do away with what it has no need for. That's the layman's version, but there is a more complex explanation that they give in the video.
Is this being used currently in any games?
 

DForce

NaughtyDog Defense Force
This isn't evidence of it being the same thing. Virtual Texturing has been supported since like Unreal Engine 4.

Runtime Virtual Texture =/= Streaming Virtual Texturing.


Streaming Virtual Texturing (SVT) is an alternative way to stream textures in your project from disk, having several advantages—along with some disadvantages—when compared to existing mip-based Texture Streaming in Unreal Engine 4 (UE4).

Traditional mip-based texture streaming performs offline analysis of material UV usage and then at runtime decides which mip levels of a texture to load based on object visibility and distance. This process can be limiting because the streaming data considered is the full texture mip level. When using high-resolution textures, loading a higher mip level of a texture can potentially have significant performance and memory overhead. Also, mip-based texture streaming decisions are made by the CPU using CPU-based object visibility and culling. Visibility is more conservative (meaning something is more likely to be loaded than not) to avoid objects popping into view. So, if even a small part of the object is visible, the entire object is considered visible and is loaded, including any associated textures that may be required to stream in.

In contrast, the virtual texturing system only streams in the parts of the textures that are required to be visible. It does this by splitting all mip levels into tiles of a small, fixed size. The GPU determines which of the visible tiles are accessed by all visible pixels on the screen. This means that when an object is considered visible, it's communicated to the GPU, which loads the required tiles into a GPU memory cache. No matter the size of the texture, the fixed tile size means SVT only considers the tiles that are visible. Tile visibility is computed on the GPU using standard depth buffers, so SVT requests only happen for visible parts that affect pixels.


This was how the PlayStation 5 Unreal Engine demo was able to stream 8K textures. https://www.eurogamer.net/articles/...eal-engine-5-playstation-5-tech-demo-analysis
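The "fixed-size tiles" idea in the quoted docs is easy to picture with a little arithmetic. A toy sketch in plain C++ (tile size, texture size and the visible UV window are made-up example values, not UE4 defaults):

```cpp
#include <cmath>
#include <cstdio>

// Toy illustration of the tiling described in the quoted SVT docs: every mip is
// cut into fixed-size tiles, and only the tiles overlapped by the visible UV
// range need to be resident. All values are examples.
int main() {
    const int textureSize = 8192;   // 8K texture
    const int tileSize    = 128;    // fixed tile size in texels

    // Say only a small UV window [0.40, 0.55) x [0.10, 0.20) of mip 0 is visible.
    const double u0 = 0.40, u1 = 0.55, v0 = 0.10, v1 = 0.20;

    const int tilesPerRow = textureSize / tileSize;                 // 64
    const int firstX = (int)std::floor(u0 * tilesPerRow);
    const int lastX  = (int)std::ceil (u1 * tilesPerRow) - 1;
    const int firstY = (int)std::floor(v0 * tilesPerRow);
    const int lastY  = (int)std::ceil (v1 * tilesPerRow) - 1;

    const int neededTiles = (lastX - firstX + 1) * (lastY - firstY + 1);
    const int totalTiles  = tilesPerRow * tilesPerRow;
    std::printf("resident tiles: %d of %d (%.1f%% of mip 0)\n",
                neededTiles, totalTiles, 100.0 * neededTiles / totalTiles);
}
```

Even with a generous visible window, only a few percent of the top mip of an 8K texture ever needs to be in memory, which is where the big savings in both examples come from.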
 
Very nice. This DOES sound much more similar to how Sampler Feedback works. Xbox Series X has added hardware customization on top to further help improve texture streaming, according to an MS engineer, but this does sound as if PS5 does indeed have a similar solution, and for all we know hardware on top as well.
 

Very nice. This DOES sound much more similar to how Sampler Feedback works. Xbox Series X has added hardware customization on top to further help improve texture streaming, according to an MS engineer, but this does sound as if PS5 does indeed have a similar solution, and for all we know hardware on top as well.

Bollocks! PS5 doesn't have it because Sony didn't say anything about it. :/
 
It's very interesting to see both Sony and Microsoft coming up with solutions to better utilize memory, since they couldn't increase the amount of RAM as much as they would've wanted to. Do more with the same amount, pretty much.
It's clear the true revolution for both consoles isn't raytracing or the number of polygons on-screen, rather it's data management. I'm very excited to see what will be possible in the coming years. I expect early next-gen exclusive games to look laughably dated once games really take advantage of the new data-streaming pipelines.
 

Matsuchezz

Member
The video is not exciting at all; if this is a tech demo, it is really unimpressive. If I were a developer I could probably be excited. Is there anything to be excited about that is clearly visible in the video?
 

CamHostage

Member
It's clear the true revolution for both consoles isn't raytracing or the number of polygons on-screen, rather it's data management.

I think raytracing and # of polygons really are data management, aren't they?

To people, it's shapes and colors and lights and brilliance, but to the computer, it's all just math that it's constantly trying to organize well enough, through nested tables and algorithms and caches of asset locations, to crunch in time for the need. It's been a while since the numbers seemed to matter (we've been able to distract ourselves with the fun tangible stuff like texture quality and layers of effects and production values for a couple of generations now), but eventually the computer needs to take the next leap when manpower can only do so much to make a game wow us.
 

IntentionalPun

Ask me about my wife's perfect butthole
Not really the same thing, I don't think.

UE5 is insanely cool.. but you are comparing unrelated things. The "virtual" in UE5 means that the 8K texture isn't actually used during rendering, nor are the "billions of polygons" of the models. I believe it's different from UE4's virtual texturing because it's applied to virtual models (virtual geometry).

Arguably UE5 is more impressive than SFS, but it's just.. not the same thing.
 

Fafalada

Fafracer forever
The differentiating factor, to my understanding, is that it can load bits and pieces of textures instead of the whole texture
That's not the differentiator - you could do that all the way back on PS2 (and some have).

SFS customizations address some specific pain points with PRT that still persisted as late as last gen, making the whole thing more practical. E.g. making the process of identifying 'what' to load more efficient, etc.

Not sure how much better SFS is compared to the old PRT+ at accurately predicting the next texture on screen.
That's a software problem. SFS provides an efficient way to get a readout of what was just rendered, but you still need to do the heavy lifting of working out a prediction model (or whatever other heuristics may work).
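As a flavour of what that heavy lifting can look like, here's a toy prefetch heuristic in plain C++ (entirely my own illustration, not anything from the talk or from SFS itself): take the tiles the feedback says were just sampled, and also request their neighbours plus the next-sharper mip so a small camera move or zoom-in doesn't catch the streamer empty-handed.

```cpp
#include <cstdio>
#include <set>
#include <utility>

// Toy prefetch heuristic layered on top of a feedback readout: the hardware
// tells you what was just sampled, but deciding what to fetch *next* is still
// the engine's job. Purely illustrative.
using Tile = std::pair<int, std::pair<int, int>>;   // (mip, (x, y))

std::set<Tile> PredictRequests(const std::set<Tile>& sampledLastFrame) {
    std::set<Tile> requests;
    for (const Tile& t : sampledLastFrame) {
        int mip = t.first, x = t.second.first, y = t.second.second;
        for (int dy = -1; dy <= 1; ++dy)        // same mip, 3x3 neighbourhood
            for (int dx = -1; dx <= 1; ++dx)
                requests.insert({mip, {x + dx, y + dy}});
        if (mip > 0)                            // one mip sharper, in case we zoom in
            requests.insert({mip - 1, {x * 2, y * 2}});
    }
    return requests;
}

int main() {
    std::set<Tile> sampled = {{3, {5, 7}}};
    std::printf("%zu tiles requested from 1 sampled tile\n",
                PredictRequests(sampled).size());   // 9 neighbours + 1 sharper = 10
}
```

How far you prefetch, and in which direction, is exactly the kind of per-engine tuning Fafalada is talking about.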
 

IntentionalPun

Ask me about my wife's perfect butthole
That's also a UE4 feature.. it's right in the link you provided.

My understanding (which could be wrong TBH) is that UE5 extends this because the models themselves are also virtualized down to near-pixel-sized polygons (from polygons much smaller than a pixel). The virtual texturing is applied to the high-res models before they are virtualized, which is how the UE5 demo uses a ton of 8K textures for a single object.
 

Elog

Member
Since this seems to be a software solution, I assume sony can implement it as well?
Not really the same thing, I don't think.

UE5 is insanely cool.. but you are comparing unrelated things. The "virtual" in UE5 means that the 8K texture isn't actually used during rendering, nor are the "billions of polygons" of the models. I believe it's different from UE4's virtual texturing because it's applied to virtual models (virtual geometry).

Arguably UE5 is more impressive than SFS, but it's just.. not the same thing.
The key point of the I/O complex in the PS5 is exactly what MS is trying to achieve through SFS, DirectStorage etc, i.e. to expand the usable VRAM pool to increase the quality and number of assets that the GPU works with in a given virtual environment. That is also what they used to achieve the high amount of 8K textures in the UE5 demo since a normal graphics card would run out of VRAM very fast with so many high quality assets.
 

IntentionalPun

Ask me about my wife's perfect butthole
The key point of the I/O complex in the PS5 is exactly what MS is trying to achieve through SFS, DirectStorage etc, i.e. to expand the usable VRAM pool to increase the quality and number of assets that the GPU works with in a given virtual environment. That is also what they used to achieve the high amount of 8K textures in the UE5 demo since a normal graphics card would run out of VRAM very fast with so many high quality assets.
That's a good point that high SSD speed is achieving some of the same goals.

But that's not how UE5 achieves a high amount of 8K textures. It's not actually rendering with 8K textures. It's also not rendering 1 billion polys per scene. It's also a tech designed to scale down to cell phones.. it's actually incredibly efficient with how much data is needed.. it's honestly a bit confusing what the UE5 demo had that was only possible on PS5 after they revealed the details.
 
It's not a software solution on Xbox, but yeah, it could be implemented in software with some pretty blatant performance drawbacks.

According to the UE5 documentation, the method used seems to have some performance disadvantages and some limits on flexibility of use. It produces very impressive results nonetheless. Glossing over the Sampler Feedback Streaming documentation on GitHub, I've actually found what appear to be some of the same drawbacks. That said, Microsoft has added additional hardware to support Sampler Feedback that's custom to Series X|S, but it seems to be there to improve other parts of the texture streaming process, such as further minimizing the chances of pop-in. And personally, I think the drawbacks listed in both the UE5 documentation and the Sampler Feedback documentation may not prove to be much of a problem at all for games.

Appears the advantages far outweigh any downsides.
 

Andodalf

Banned
That's a good point that high SSD speed is achieving some of the same goals.

But that's not how UE5 achieves a high amount of 8K textures. It's not actually rendering with 8K textures. It's also not rendering 1 billion polys per scene. It's also a tech designed to scale down to cell phones.. it's actually incredibly efficient with how much data is needed.. it's honestly a bit confusing what the UE5 demo had that was only possible on PS5 after they revealed the details.

It was always very clear that they meant it wasn't possible on the PS4.
 

Three

Member
Is this being used currently in any games?
Most likely, yes. Believe it or not, virtual texturing like this has existed before. It's mostly software (which has very little performance penalty), and hardware support means any Turing card (before XVA was even a name). There is middleware available right now for UE4 that does the same thing. Look up Graphine.




4GB down to 1GB.
Performance improvement.
Any platform.

The only thing that makes it more attractive today in engines is a guaranteed SSD.
 
Most likely, yes. Believe it or not, virtual texturing like this has existed before. It's mostly software (which has very little performance penalty), and hardware support means any Turing card (before XVA was even a name). There is middleware available right now for UE4 that does the same thing. Look up Graphine.




4GB down to 1GB.
Performance improvement.
Any platform.

The only thing that makes it more attractive today in engines is a guaranteed SSD.


Virtual Texturing and Sampler Feedback Streaming are a little different. Using Sampler Feedback to handle texture streaming is what's new. You could do virtual texturing for many years before, but that was all without Sampler Feedback functionality; at least, no DirectX title had ever used it, and to my knowledge neither had any other major gaming platform or hardware until Nvidia's Turing launch in late 2018. Sampler Feedback isn't just the same old thing we've always been using. It's quite new in what it makes possible.
 

Elog

Member
That's a good point that high SSD speed is achieving some of the same goals.

But that's not how UE5 achieves a high amount of 8K textures. It's not actually rendering with 8K textures. It's also not rendering 1 billion polys per scene. It's also a tech designed to scale down to cell phones.. it's actually incredibly efficient with how much data is needed.. it's honestly a bit confusing what the UE5 demo had that was only possible on PS5 after they revealed the details.
The UE5 demo did many things - and it is of course a scalable engine across various platforms/hardware! - but one of the things they demonstrated was the consistent use of 8K assets that were dynamically streamed to VRAM by the PS5 I/O complex and dedicated API. That is why they could just zoom in on the rocks at the beginning of the demo without any loss in texture quality.

Personally, I think the virtual increase in VRAM through streaming methodologies such as SFS and the I/O complex in the PS5, together with increased geometry complexity (primarily through better geometry culling methodologies such as the GE API in the PS5 and mesh shaders on XSX/S), will have the highest impact on graphical fidelity this coming generation. Exciting!
 
It's virtual texturing; it's been done before. It's now faster with the Series X I/O, but it won't remove pop-in. Pop-in will always depend on your I/O latencies and SSD bandwidth, and sampler feedback won't solve that. For comparison, they are talking about 10GB of data in a small room here; UE5 was streaming hundreds of GB in a scene with 8K textures... I want to see how Series X handles that, since we've already seen pop-in in The Medium, a game that's already using the Series X Velocity Architecture. This is all marketing talk; we've heard the same talks since the Xbox 360. They called it megatextures on the 360, then tiled resources on Xbone, then partial resident textures in hardware on PS4, and now SFS. It's the same crap, only a bit faster. It's not the solution to memory, it just helps reduce pop-in.
 
It's virtual texturing; it's been done before. It's now faster with the Series X I/O, but it won't remove pop-in. Pop-in will always depend on your I/O latencies and SSD bandwidth, and sampler feedback won't solve that. For comparison, they are talking about 10GB of data in a small room here; UE5 was streaming hundreds of GB in a scene with 8K textures... I want to see how Series X handles that, since we've already seen pop-in in The Medium, a game that's already using the Series X Velocity Architecture. This is all marketing talk; we've heard the same talks since the Xbox 360. They called it megatextures on the 360, then tiled resources on Xbone, then partial resident textures in hardware on PS4, and now SFS. It's the same crap, only a bit faster. It's not the solution to memory, it just helps reduce pop-in.
Wonderful take, "it's the same but faster", so it's not the same then? The Medium is a last-generation game, not a next-generation engine.
 