
Sampler Feedback Streaming appears to be the real deal. Game Stack Live real-time demo impressions. (video to come soon)

Aug 28, 2019
7,051
13,313
530
We might define culling differently? If the number of triangles that the GPU needs to process for a given scene is reduced, culling is occurring. There are multiple ways of achieving this, such as off-face culling, but also if several triangles are reduced to a single pixel in the final rasterized image (which is one of the primary ways that UE5 seems to do polygon reduction), all those triangles do not need to be processed, i.e. culling. This allows the use of raw complex source geometry, which is great for development but also increases the polygon count of objects that you move close to.

They used 8K source data on that SSD; for example, the statue consisted of 24 8K assets. The assets are then streamed into VRAM on a per-view basis and held there in a compressed format, which came to 768MB. They did not state the compression ratio, except that it was lower than for a fully compressed file on your storage device, so I would assume 3x or something like that. In other words, a single view used roughly 2 GB of uncompressed assets to render a frame. Since the compression is better on your storage device, it probably means that around 0.5GB of compressed data needs to be read from storage, uncompressed, sorted and moved to VRAM in a fraction of a second - and that probably represents something in the vicinity of 100 individual files (really roughly - but it is not 10 and not 1000 files). That requires SFS or the PS5 I/O and an SSD to achieve.

That last sentence was my whole point in this thread - not sure anymore if we disagree or agree tbh. I am confused...
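For anyone who wants to check the numbers in the quoted post, here is a minimal Python sketch of that arithmetic. The 768MB pool is the figure quoted above; the 3x in-VRAM ratio and the 0.5GB on-disk size are the poster's own guesses, not anything Epic stated, and the function names are made up for illustration.

```python
# Back-of-the-envelope check of the quoted estimate.
VRAM_POOL_MB = 768         # compressed streaming pool, as quoted in the post
ASSUMED_VRAM_RATIO = 3.0   # assumption: in-VRAM compression ratio (poster's guess)
ASSUMED_DISK_GB = 0.5      # assumption: compressed size of one view on storage


def uncompressed_gb(pool_mb: float, ratio: float) -> float:
    """Uncompressed size implied by a compressed pool and a compression ratio."""
    return pool_mb * ratio / 1024


def implied_disk_ratio(uncompressed: float, on_disk_gb: float) -> float:
    """Compression ratio on storage implied by the two size estimates."""
    return uncompressed / on_disk_gb


view_gb = uncompressed_gb(VRAM_POOL_MB, ASSUMED_VRAM_RATIO)
disk_ratio = implied_disk_ratio(view_gb, ASSUMED_DISK_GB)
print(f"uncompressed view: {view_gb:.2f} GB, implied on-disk ratio: {disk_ratio:.1f}x")
```

Under those guesses the view works out to 2.25 GB uncompressed, which is where the "roughly 2 GB" comes from, and the on-disk ratio would have to be about 4.5x.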
Again, not being hostile.. but let's not forget how this started:

The key point of the I/O complex in the PS5 is exactly what MS is trying to achieve through SFS, DirectStorage etc, i.e. to expand the usable VRAM pool to increase the quality and number of assets that the GPU works with in a given virtual environment. That is also what they used to achieve the high amount of 8K textures in the UE5 demo since a normal graphics card would run out of VRAM very fast with so many high quality assets.

They are not using 8k assets at runtime.. The key to UE5 is not anything similar to what SFS is doing. Culling is removing something from an image; you can't scale by simply removing things.. that's not what scaling is, and scaling is not a type of culling. You repeated this idea that UE5 depended on streaming tons of 8K assets when it's not true.

You also described.. culling.. not scaling, in your post I was responding to. They aren't doing some "pre-raster step to determine which geometry to send to the GPU".. they are, scaling.. both in the engine way before the game is even put on the console, and at runtime.

And yes, it's still insanely detailed.. a ton of data.. and requires a fast SSD to hit the detail level of the UE5 demo running on PS5. But it's not streaming a bunch of 8k assets.. that's not why it needs so much memory for the view. It's extremely efficient compared to what it is losslessly scaling.. it's just also starting at such detail that the scaled objects/maps/textures are still an insane amount of data if you truly want 1440p levels of detail w/ that technique. If you had half the SSD speed, instead of their 768MB streaming pool, you'd need 1.5 GB or so.. but that would only be a small loss of detail on the same amount of RAM, because the vast majority of the RAM pool is dedicated to the view. The UE5 demo is likely going to look nearly identical to the naked eye even with an SSD 1/3rd the speed of Sony's, w/ a 16GB VRAM system.
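The "half the speed, double the pool" reasoning can be written down as a tiny Python sketch. It assumes the streaming pool scales inversely with SSD throughput, which is how the claim above reads rather than anything Epic has published. The 5.5 GB/s baseline is the PS5's advertised raw SSD speed, and `pool_needed_mb` is a made-up helper name.

```python
# Assumption: a slower drive needs a proportionally larger look-ahead pool
# to hide the same amount of streaming work.
BASE_POOL_MB = 768        # streaming pool quoted for the PS5 demo
BASE_SPEED_GBPS = 5.5     # PS5 advertised raw SSD throughput


def pool_needed_mb(base_pool_mb: float, base_speed: float, speed: float) -> float:
    """Pool size needed at `speed`, if pool size is inversely proportional to speed."""
    return base_pool_mb * base_speed / speed


half = pool_needed_mb(BASE_POOL_MB, BASE_SPEED_GBPS, BASE_SPEED_GBPS / 2)
third = pool_needed_mb(BASE_POOL_MB, BASE_SPEED_GBPS, BASE_SPEED_GBPS / 3)
print(f"half speed: {half:.0f} MB, one-third speed: {third:.0f} MB")
```

With that model, half the speed needs about 1.5 GB and a third of the speed needs about 2.25 GB, both small next to a 16GB pool.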
 
  • Thoughtful
Reactions: CurtBizzy

ToTTenTranz

Member
Feb 4, 2021
239
473
235
The Epic China engineer never claimed he's running the demo live. He showed the demo off in the video and talked about how it ran back at the office.

"I have a laptop here where I could be running the demo live but instead I'm just running the same video you can see on youtube. It totally works in my work laptop I promise, I'm just not showing it to you because my dog ate my homework true story".

- Translated by ToTTenTranz
 

Elog

Member
Aug 28, 2016
736
2,974
470
They are not using 8k assets at runtime.. The key to UE5 is not anything similar to what SFS is doing. Culling is removing something from an image; you can't scale by simply removing things.. that's not what scaling is, and scaling is not a type of culling. You repeated this idea that UE5 depended on streaming tons of 8K assets when it's not true.

You also described.. culling.. not scaling, in your post I was responding to. They aren't doing some "pre-raster step to determine which geometry to send to the GPU".. they are, scaling.. both in the engine way before the game is even put on the console, and at runtime.

And yes, it's still insanely detailed.. a ton of data.. and requires a fast SSD to hit the detail level of the UE5 demo running on PS5. But it's not streaming a bunch of 8k assets.. that's not why it needs so much memory for the view. It's extremely efficient compared to what it is losslessly scaling.. it's just also starting at such detail that the scaled objects/maps/textures are still an insane amount of data if you truly want 1440p levels of detail w/ that technique. If you had half the SSD speed, instead of their 768MB streaming pool, you'd need 1.5 GB or so.. but that would only be a small loss of detail on the same amount of RAM, because the vast majority of the RAM pool is dedicated to the view. The UE5 demo is likely going to look nearly identical to the naked eye even with an SSD 1/3rd the speed of Sony's, w/ a 16GB VRAM system.
A few things - I do not believe we reach each other here.

1) They used the 8k assets and the source geometry raw at run-time. The engine handles it so that the GPU is not overloaded, but it puts a lot of strain on the I/O given the VRAM limitation. See 12:30 in their UE5 demo explained video you linked yourself. Exactly what the engine is doing with these assets at runtime before pushing them to the GPU is a bit of a black box, at least to me.

2) Reduction in the number of polygons that enter the rendering pipeline compared to the original source material at run-time = culling. Not sure what 'scaling' even means from a technical point of view. They take polygons away from the source geometry so that they are never touched by the GPU, because they are off-face or far away (i.e. a single pixel in the final image). I agree that I am making an assumption that the way they know that a large number of polygons will be a single pixel before rendering is to compute some sort of pre-rasterization image mathematically - just my assumption - but I cannot see how they otherwise would know that several triangles will end up as single pixels.

3) When you write about SSDs and I/O you are talking about bandwidth. That is the least interesting parameter when doing this. It of course needs to be high enough, but the key aspect is latency, and that is much more complex. SFS is primarily about latency reduction - peak bandwidth was already there - same with the PS5 I/O complex. That is why SFS and the PS5 I/O complex are needed to be able to cherry-pick files and read 0.5 GB of compressed data, uncompress it, sort it and transfer it to VRAM in a fraction of a second. If you lower the quality of the assets or the source geometry, the latency and bandwidth requirements of course become lower. But in the demo they obviously wanted to show off what was possible to do.
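To make the latency-versus-bandwidth point concrete, here is a rough Python sketch of a very simple cost model: total time = number of requests x per-request latency + data size / bandwidth. The 100 requests and 0.5 GB come from the estimates in this thread; the two latency figures and the 5 GB/s bandwidth are illustrative assumptions, not measurements of any console, and the model ignores request queuing and overlap.

```python
def load_time_s(n_requests: int, total_gb: float,
                latency_s: float, bandwidth_gbps: float) -> float:
    """Naive serial model: every request pays its latency, then data is transferred."""
    latency_cost = n_requests * latency_s
    transfer_cost = total_gb / bandwidth_gbps
    return latency_cost + transfer_cost


N_REQUESTS = 100     # rough file count estimated earlier in the thread
TOTAL_GB = 0.5       # rough compressed data per view estimated earlier
BANDWIDTH = 5.0      # assumption: same peak bandwidth in both cases

slow = load_time_s(N_REQUESTS, TOTAL_GB, latency_s=0.010, bandwidth_gbps=BANDWIDTH)
fast = load_time_s(N_REQUESTS, TOTAL_GB, latency_s=0.0001, bandwidth_gbps=BANDWIDTH)
print(f"10 ms per request: {slow:.2f} s, 0.1 ms per request: {fast:.2f} s")
```

With identical bandwidth, the high-latency case takes about 1.1 s and the low-latency case about 0.11 s, so in this toy model the per-request cost dominates rather than the peak transfer rate.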
 
  • Like
Reactions: Panajev2001a

Kataploom

Member
Jan 30, 2014
1,340
892
725
Colombia
And this is why Series S is gonna make the One X feel like what it is, a last gen machine... Finally we can start seeing a generational leap we didn't see last gen. Games definitely looked better, but a jump like PS2 to PS3 wasn't there, not even close... This time I/O and efficiency will definitely push a big graphical leap (and I don't mean DeSR, that's still a last gen game in my eyes)... Or I hope so, since the raw power increases seem to be "just ok", but seeing all this SFS, mesh shaders, etc. talk makes me hope for it!
 

Kazekage1981

Member
Apr 7, 2019
1,230
2,146
410
I have a question about PC SFS and SSDs

1) Can PCs with PCIe 5 and PCIe 6 connectivity have SSDs with raw bandwidth greater than 10GB/sec? With that higher raw bandwidth, along with an updated SFS (which is part of the Velocity Architecture), how many GB/sec of texture streaming could theoretically be achieved?

2) The new SD Express 8.0 specification for PCIe G4L2 can achieve bandwidth close to 4.0GB/sec. Can that also be part of SFS and the Velocity Architecture?
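On the raw bandwidth part of the question, the theoretical link rates are easy to work out. The Python sketch below uses the standard per-lane transfer rates for PCIe 3.0 to 5.0 and their 128b/130b line encoding; the function name is made up, and these are link ceilings before protocol overhead, so a real SSD will land lower. PCIe 6.0 is left out because it uses a different signaling and encoding scheme.

```python
# Per-lane raw transfer rate in GT/s for generations that use 128b/130b encoding.
GT_PER_S = {3: 8.0, 4: 16.0, 5: 32.0}
ENCODING_EFFICIENCY = 128 / 130


def pcie_gb_per_s(gen: int, lanes: int) -> float:
    """Theoretical one-direction link bandwidth in GB/s (before protocol overhead)."""
    return GT_PER_S[gen] * ENCODING_EFFICIENCY / 8 * lanes


print(f"Gen4 x2 (SD Express 8.0 style link): {pcie_gb_per_s(4, 2):.2f} GB/s")
print(f"Gen4 x4 (typical NVMe slot):         {pcie_gb_per_s(4, 4):.2f} GB/s")
print(f"Gen5 x4:                             {pcie_gb_per_s(5, 4):.2f} GB/s")
```

So a Gen4 x2 link tops out just under 4 GB/s, matching the SD Express figure, and a Gen5 x4 link has a ceiling above 10 GB/s. Whether SFS can make use of it is a separate question about the GPU and API, not the link.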
 
Aug 28, 2019
7,051
13,313
530
A few things - I do not believe we reach each other here.

1) They used the 8k assets and the source geometry raw at run-time. The engine handles it so that the GPU is not overloaded, but it puts a lot of strain on the I/O given the VRAM limitation. See 12:30 in their UE5 demo explained video you linked yourself. Exactly what the engine is doing with these assets at runtime before pushing them to the GPU is a bit of a black box, at least to me.

What video? Do you mean 1:30? I didn't link to any 12 minute long video. I'll respond when you clarify here.

2) Reduction in the number of polygons that enter the rendering pipeline compared to the original source material at run-time = culling. Not sure what 'scaling' even means from a technical point of view. They take polygons away from the source geometry so that they are never touched by the GPU, because they are off-face or far away (i.e. a single pixel in the final image). I agree that I am making an assumption that the way they know that a large number of polygons will be a single pixel before rendering is to compute some sort of pre-rasterization image mathematically - just my assumption - but I cannot see how they otherwise would know that several triangles will end up as single pixels.

I'm sorry but this is silly; scaling doesn't just remove triangles.. they can't go from "There are over a billion triangles of source geometry in each frame, that Nanite crunches down losslessly to around 20 million drawn triangles" by removing triangles that are "off-face or far away." 98% of the polygons aren't off-face or far away.. they scale the objects, even before they are ever stored on disk. They then scale the objects at runtime, as well as cull invisible parts of the objects. They are making the objects SMALLER... not removing things from the objects. They are doing this in the engine before it is ever even stored.. because it would be impossible to store 1 billion triangles per scene in a game.. hell, there are questions surrounding how they are ever going to have 20 million triangles per scene stored on disk.
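The 98% figure follows directly from the two numbers in the Epic quote. Here is a quick Python check, treating "over a billion" as exactly one billion and "around 20 million" as exactly 20 million, which is a simplification; the function names are made up.

```python
SOURCE_TRIANGLES = 1_000_000_000   # "over a billion" in the quote, rounded down
DRAWN_TRIANGLES = 20_000_000       # "around 20 million" in the quote


def reduction_factor(source: int, drawn: int) -> float:
    """How many source triangles there are per drawn triangle."""
    return source / drawn


def reduction_percent(source: int, drawn: int) -> float:
    """Share of source triangles that never get drawn, in percent."""
    return (1 - drawn / source) * 100


factor = reduction_factor(SOURCE_TRIANGLES, DRAWN_TRIANGLES)
percent = reduction_percent(SOURCE_TRIANGLES, DRAWN_TRIANGLES)
print(f"{factor:.0f}x fewer triangles drawn, {percent:.0f}% reduction")
```

That is a 50x reduction per frame. The numbers alone do not say how much of it comes from culling and how much from level-of-detail scaling, which is the actual disagreement here.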

Find me one place where a technical document somewhere describes scaling as culling. Either way, this is a bit silly.. semantics.. but I have never heard anyone use "culling" to describe "scaling" other than on this forum, when people seem to not really realize what UE5 does. It's not the same as SFS.. SFS is not scaling things.

3) When you write about SSDs and I/O you are talking about bandwidth. That is the least interesting parameter when doing this. It of course needs to be high enough, but the key aspect is latency, and that is much more complex. SFS is primarily about latency reduction - peak bandwidth was already there - same with the PS5 I/O complex. That is why SFS and the PS5 I/O complex are needed to be able to cherry-pick files and read 0.5 GB of compressed data, uncompress it, sort it and transfer it to VRAM in a fraction of a second. If you lower the quality of the assets or the source geometry, the latency and bandwidth requirements of course become lower. But in the demo they obviously wanted to show off what was possible to do.

We are talking about memory/streaming usage.. not the latency involved in computing something. That's what SFS is talking about w/ latency.. the cycle to compute what shouldn't be rendered.

Scaling is not deciding what should/shouldn't be rendered. It's.. scaling. lol
 

MonarchJT

Member
Sep 25, 2020
2,180
3,113
350
What are your credentials for being able to accurately translate what they said?
Are they speaking in Cantonese or Mandarin?


Funny how the goalposts keep moving.
"It was shown running in a laptop" -> but they showed a video -> "They showed a video but in reality he said it could run in his laptop at work, he just didn't show it running live in a stream dedicated to showing off UE5 because reasons.".

Next step is saying the laptop with a RTX2080 Max-Q is capable of running the game through the power of the cloud.


And after all the fuss, we got this officially from a serious PC Gamer interview. Now you can trust whatever you want.


"I couldn't get any exact specifications from Epic, but on a conference call earlier this week I asked how an RTX 2070 Super would handle the demo, and Epic Games chief technical officer Kim Libreri said that it should be able to get "pretty good" performance. But aside from a fancy GPU, you'll need some fast storage if you want to see the level of detail shown in the demo video."
 

Kholinar

Member
Nov 9, 2020
318
941
335
You fell for Sweeney's damage control. The Epic China engineer never claimed he's running the demo live. He showed the demo off in the video and talked about how it ran back at the office. Sweeney is the one who purposefully misunderstood what was happening in order to do some weak ass damage control.
There's no evidence that the demo on the laptop - assuming that's actually the case - is running on the same settings as the PlayStation 5. UE5 is supposed to be exceptionally scalable, to the point that it can run on mobile phones. A cut-down 2070 laptop variant is nowhere near the same ballpark as a PS5 graphics-wise and definitely wouldn't be able to run the same demo, nor does it have the data management necessary. Can't believe folks are this down bad that they rely on dodgy Chinese translations and whatnot to get their little warring jabs in, lmao.
 

SenjutsuSage

Member
Jan 23, 2010
10,667
3,768
1,260
So, what changed from last year?

What changed from last year is we have a more direct statement as to what they are comparing to with the numbers to back it up. They even showed this year a highly optimized gen 9 equivalent texture streaming system with a fast SSD that would run on Xbox Series X or a high end PC and further compared that against Sampler Feedback Streaming running on Series X. So we know it isn't just being compared to worst case scenarios not taking full advantage of the various tricks of the trade.

They also stress that the gen 9 bar is purposely conservative because traditionally the ram usage will be even higher due to over-streaming.

What also changed is we see the direct load time compared to Xbox One X game streaming. A load equivalent to 2.68GB on Xbox One X was loaded on Series X with SFS 3 different times, in 0.19 seconds, 0.17 seconds and again 0.19 seconds.

Last year we also didn't see the incredible speed with which this technique is able to handle quick camera cuts, and how with each successive camera cut, well over 1GB of new data was being loaded instantly. The load after multiple successive loads hit as much as over 5.5GB on the Xbox One X simulation, and over 4GB on the gen 9 console streaming without SFS. Series X with SFS barely budged past 2.1GB/s through all of that, meaning performance of this type of texture streaming is really mind-blowingly fast. We didn't get any of these kinds of specific details last year.

What changed from last year is that they told us that this level of 2.5x+ multiplier effect will hold true even for more complex AAA titles. It isn't just a function of the controlled demo environment.

There's a lot of new information. It's tempting to pretend like there isn't, but there is.
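For a sense of scale, the load times above can be turned into an "equivalent throughput" with a few lines of Python. This divides the One X-equivalent load size by the measured time, so it describes how fast a non-SFS system would have to stream to match it, not how many bytes Series X actually read. The helper name is made up.

```python
ONE_X_EQUIVALENT_GB = 2.68                 # load size as it would be on Xbox One X
SFS_LOAD_TIMES_S = [0.19, 0.17, 0.19]      # the three Series X runs quoted above


def equivalent_gb_per_s(equivalent_gb: float, seconds: float) -> float:
    """Throughput a non-SFS loader would need to match the measured time."""
    return equivalent_gb / seconds


rates = [equivalent_gb_per_s(ONE_X_EQUIVALENT_GB, t) for t in SFS_LOAD_TIMES_S]
for t, r in zip(SFS_LOAD_TIMES_S, rates):
    print(f"{t:.2f} s -> {r:.1f} GB/s equivalent")
```

That works out to roughly 14 to 16 GB/s of equivalent throughput across the three runs.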
 

ToTTenTranz

Member
Feb 4, 2021
239
473
235
And after all the fuss, we got this officially from a serious PC Gamer interview. Now you can trust whatever you want.

Except that's not a "serious PC Gamer interview", it's a PC Gamer article talking about the same video I posted above, where the author has no idea whether or not the translation is accurate.

The conference call bit is from the week before the "laptop 2080" video, and only says the "(desktop) 2070 Super would get pretty good performance" with zero reference to IQ settings, and they still say it'd need "a hefty SSD".
 
  • Like
Reactions: Panajev2001a

Panajev2001a

GAF's Pleasant Genius
Jun 7, 2004
18,820
12,401
2,110
What changed from last year is we have a more direct statement as to what they are comparing to with the numbers to back it up. They even showed this year a highly optimized gen 9 equivalent texture streaming system with a fast SSD that would run on Xbox Series X or a high end PC and further compared that against Sampler Feedback Streaming running on Series X. So we know it isn't just being compared to worst case scenarios not taking full advantage of the various tricks of the trade.

They also stress that the gen 9 bar is purposely conservative because traditionally the ram usage will be even higher due to over-streaming.

What also changed is we see the direct load time compared to Xbox One X game streaming. A load equivalent to 2.68GB on Xbox One X was loaded on Series X with SFS 3 different times, in 0.19 seconds, 0.17 seconds and again 0.19 seconds.

Last year we also didn't see the incredible speed with which this technique is able to handle quick camera cuts, and how with each successive camera cut, well over 1GB of new data was being loaded instantly. The load after multiple successive loads hit as much as over 5.5GB on the Xbox One X simulation, and over 4GB on the gen 9 console streaming without SFS. Series X with SFS barely budged past 2.1GB/s through all of that, meaning performance of this type of texture streaming is really mind-blowingly fast. We didn't get any of these kinds of specific details last year.

What changed from last year is that they told us that this level of 2.5x+ multiplier effect will hold true even for more complex AAA titles. It isn't just a function of the controlled demo environment.

There's a lot of new information. It's tempting to pretend like there isn't, but there is.

There is new info, yes, but it is more of an XVA (SSD + BCPack decompression engine + DirectStorage + SFS) comparison than a pure comparison between PRT/best of the best virtual texturing schemes vs SFS, which was a big point in defining what the multiplier meant in terms of external I/O efficiency improvements.
As others already said, relying on a mechanical HDD would limit how much you would use that storage to continuously stream data in and out for short-term use (you would stream in a lot more data early on and then update it continuously, keeping a good number of seconds of gameplay worth of data in memory).

With that said they are selling XVA (and SFS as part of it) very well.
 

MonarchJT

Member
Sep 25, 2020
2,180
3,113
350
Except that's not a "serious PC Gamer interview", it's a PC Gamer article talking about the same video I posted above, where the author has no idea whether or not the translation is accurate.

The conference call bit is from the week before the "laptop 2080" video, and only says the "(desktop) 2070 Super would get pretty good performance" with zero reference to IQ settings, and they still say it'd need "a hefty SSD".
No, no and absolutely not..... PC Gamer contacted Kim Libreri and did a phone interview. It's better for you to accept that a PC will run the demo without any type of problem and without having a super duper customized SSD, because I suppose you could lose your mind otherwise.
 

SenjutsuSage

Member
Jan 23, 2010
10,667
3,768
1,260
There is new info, yes, but it is more of an XVA (SSD + BCPack decompression engine + DirectStorage + SFS) comparison than a pure comparison between PRT/best of the best virtual texturing schemes vs SFS, which was a big point in defining what the multiplier meant in terms of external I/O efficiency improvements.
As others already said, relying on a mechanical HDD would limit how much you would use that storage to continuously stream data in and out for short-term use (you would stream in a lot more data early on and then update it continuously, keeping a good number of seconds of gameplay worth of data in memory).

With that said they are selling XVA (and SFS as part of it) very well.

Fair enough, fair enough. It isn't just SFS vs PRT/best virtual texturing schemes, but the entire XVA package vs whatever the alternative might be without it.
 
Mar 8, 2015
3,420
1,913
635
Chicago, IL USA
steamcommunity.com
With every new thread I keep wondering which console is Gary and which console is Ash.

 
  • LOL
Reactions: Keihart

Kazekage1981

Member
Apr 7, 2019
1,230
2,146
410
Will the DirectStorage API make the Windows OS on PC run faster by utilizing the GPU? Possibilities:

Faster Boot Time
Apps and programs running faster such as photoshop, video editing
Faster Web Browsing
Better Media Playback
More Fluid UI?
 

Keihart

Member
Jun 23, 2013
5,187
3,577
775
Will the DirectStorage API make the Windows OS on PC run faster by utilizing the GPU? Possibilities:

Faster Boot Time
Apps and programs running faster such as photoshop, video editing
Faster Web Browsing
Better Media Playback
More Fluid UI?
What is this DirectStorage API? Something to use M.2 SSDs better on Windows? That would be really cool.
 
  • Like
Reactions: Kazekage1981

Kazekage1981

Member
Apr 7, 2019
1,230
2,146
410
What is this DirectStorage API? Something to use M.2 SSDs better on Windows? That would be really cool.
Tom's Hardware article on the DirectStorage API

I am hoping this is MS's equivalent to Apple's custom silicon and software in terms of OS performance. I haven't tried the new iMacs, but allegedly everything runs super snappy and instant, and I don't think it has anything to do with newer SSDs with faster bandwidth. I'm not sure what Apple uses for their SSDs.

SD Express 8.0

Also, the new SD card read/write standard can go up to 4GB/sec. So can the DirectStorage API, along with SFS, be used for this too?
 
  • Like
Reactions: CamHostage

Kazekage1981

Member
Apr 7, 2019
1,230
2,146
410
This is a noob question but does the Xbox Series X|S SSD feed data directly to GDDR6 RAM or directly to the CPU?