
Epic sheds light on the data streaming requirements of the Unreal Engine 5 demo

Mister Wolf

Member
So then, you think the systems need more GPU and not to focus so much on SSD? Even when JayzTwoCents is posting a video saying the Series X is somewhere between a 2080 Super and a Ti?

Somethin don't add up here.

We already know the GPU in the Series X is RTX 2080 level in raster performance from it being benchmarked running Gears 5, a UE4 game. The Coalition, an in-house Microsoft studio and one of the best developers in the world at using UE, provided this information.
 

geordiemp

Member
All Lumen is doing is cost cutting. It's no more complex, and certainly not better, than CryEngine's Voxel GI, which uses voxels whether the objects are near or far. Lumen's use of screen space inherits all the flaws of screen-space representations of geometry, flaws that realtime triangle raytracing was meant to remove, on top of raytracing being more accurate. The same goes for Signed Distance Fields, which are inferior to voxels as well.

The important part was the fidelity and low cost, 4.5 ms, for that amount of detail in Nanite at 1440p. Also, the temporal upscaling is the best we have seen so far.

I would be happier with Nanite at 1440p than any ray tracing game I have seen currently. It looks better.

People are still discussing and arguing over Nanite 2 months on because the fidelity was THAT GOOD....

If Nanite did not excel, nobody would be here, including you.
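To put that 4.5 ms figure in context, here is a rough frame-budget sketch. Only the 4.5 ms geometry cost comes from the thread; the frame-rate targets and the idea that everything else fits in the remainder are purely illustrative assumptions.

```python
# Rough frame-budget arithmetic around the 4.5 ms Nanite geometry figure quoted above.
# The frame-rate targets are illustrative; only the 4.5 ms number comes from the thread.

def frame_budget_ms(fps):
    """Total GPU time available per frame at a given frame rate."""
    return 1000.0 / fps

nanite_geometry_ms = 4.5  # quoted cost of Nanite geometry at 1440p

for fps in (30, 60):
    total = frame_budget_ms(fps)
    remaining = total - nanite_geometry_ms
    print(f"{fps} fps: {total:.1f} ms per frame, "
          f"{remaining:.1f} ms left for lighting, shading and post after geometry")
```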
 
Last edited:

sainraja

Member
I understand what you wrote, you just don't like the answer, and that's not my problem.

Oh, and here come the genetic fallacies.

:messenger_tears_of_joy:

If you are making a valid point, there is no need to be immature about it. Just answer or respond to what he is saying.
 
Last edited:

NickFire

Member
I personally am no tech wizard. But I am reasonably capable of applying logic. And logic says that after countless leaks and rumors on the net about next gen have turned into nonsense, the last thing I should do is trust a forum poster's claim that Sony's engineers suddenly shit the bed and designed a feature that will never be used. If it turns out true in 3 years, then bravo. But for now I think it's best to trust Sony's engineers over anyone's claim that they designed a useless feature months before launch.
 

JeloSWE

Member
All Lumen is doing is cost cutting. It's no more complex, and certainly not better, than CryEngine's Voxel GI, which uses voxels whether the objects are near or far. Lumen's use of screen space inherits all the flaws of screen-space representations of geometry, flaws that realtime triangle raytracing was meant to remove, on top of raytracing being more accurate. The same goes for Signed Distance Fields, which are inferior to voxels as well.
It's really expensive to do full-scene dynamic GI in realtime. Finding smart solutions to do this is exactly what Lumen is doing.
I speculate that Lumen will have a significant performance advantage over the current Nvidia RTX solutions. The screen-space GI only affects smaller local details from what I can see, and while I could spot occasional occlusion artifacts, I'll take it any day over not having any at all. What will contribute most to the overall distribution of light in the scene is likely the SDF part. The Godot Engine's latest progress with its SDF GI development looks VERY impressive, so I don't understand why you are scoffing at such solutions. Doing this with pure raytracing will probably be too cost-prohibitive for weaker hardware. Who's to say we won't see Lumen on mobile or the Switch, etc.?

So far I haven't seen anything that is completely dynamic the way Lumen is and looks this good with this level of on-screen detail. All the other examples either use simpler scene geometry or mostly baked light maps.
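For anyone wondering what "tracing against a signed distance field" actually means in practice, here is a minimal sphere-tracing sketch. This is just the general technique, not Epic's Lumen code; the single-sphere scene and all names are made up for illustration.

```python
import math

# Minimal sphere tracing against a signed distance field (SDF).
# Toy scene: a single unit sphere at the origin. Not Lumen's implementation,
# just the basic idea its SDF tracing builds on.

def scene_sdf(p):
    """Signed distance from point p to the nearest surface in the toy scene."""
    x, y, z = p
    return math.sqrt(x * x + y * y + z * z) - 1.0

def sphere_trace(origin, direction, max_steps=64, max_dist=100.0, eps=1e-3):
    """March along the ray in steps of the SDF value until a surface is hit."""
    t = 0.0
    for _ in range(max_steps):
        p = tuple(o + t * d for o, d in zip(origin, direction))
        d = scene_sdf(p)
        if d < eps:
            return t          # hit: distance along the ray
        t += d                # safe to step this far: nothing is closer than d
        if t > max_dist:
            break
    return None               # miss

# A ray starting at z = -3 aimed at the sphere hits its surface at t = 2.0
print(sphere_trace((0.0, 0.0, -3.0), (0.0, 0.0, 1.0)))
```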
 

Mister Wolf

Member
The important part was the fidelity and low cost, 4.5 ms, for that amount of detail in Nanite at 1440p. Also, the temporal upscaling is the best we have seen so far.

I would be happier with Nanite at 1440p than any ray tracing game I have seen currently. It looks better.

People are still discussing and arguing over Nanite 2 months on because the fidelity was THAT GOOD....

If Nanite did not excel, nobody would be here, including you.

Why are you talking to me about Nanite when I was discussing the LUMEN GLOBAL LIGHTING ENGINE?
 

Mister Wolf

Member
It's really expensive to do full-scene dynamic GI in realtime. Finding smart solutions to do this is exactly what Lumen is doing.
I speculate that Lumen will have a significant performance advantage over the current Nvidia RTX solutions. The screen-space GI only affects smaller local details from what I can see, and while I could spot occasional occlusion artifacts, I'll take it any day over not having any at all. What will contribute most to the overall distribution of light in the scene is likely the SDF part. The Godot Engine's latest progress with its SDF GI development looks VERY impressive, so I don't understand why you are scoffing at such solutions. Doing this with pure raytracing will probably be too cost-prohibitive for weaker hardware. Who's to say we won't see Lumen on mobile or the Switch, etc.?

So far I haven't seen anything that is completely dynamic the way Lumen is and looks this good with this level of on-screen detail. All the other examples either use simpler scene geometry or mostly baked light maps.

I will reserve judgement until an actual, real game decides to use Lumen, not just some demo. If you don't see Hellblade 2 using it, then that will tell you they didn't consider it worth the performance hit, no different from how you don't see complex/detailed Unreal Engine console games using Voxel GI this generation.
 
Last edited:

JeloSWE

Member
I will reserve judgement until an actual, real game decides to use Lumen, not just some demo. If you don't see Hellblade 2 using it, then that will tell you they didn't consider it worth the performance hit, no different from how you don't see complex/detailed Unreal Engine console games using Voxel GI this generation.
Fair enough. I'm super stoked to see the recent advances in raytraced reflections and global illumination. It's all happening much sooner than expected, in large part thanks to the advances in denoising and upscaling techniques.
 
It's going to be a shocker for some, and for others like me not so much, because I've been saying this for months now. I hate to break it to some of you, but that demo's data streaming could be handled by a 5-year-old SATA SSD.

8wl1rua.png


768MB is the in-view streaming requirement on the hardware to handle that demo. 768 MEGABYTES... COMPRESSED. And what was the cost of this on the rendering end?

Well, this is the result...

dQOnqne.png


This confirms everything I've said. Not that these SSDs are useless, because they're 100% not; that data streaming would be impossible with mechanical drives. However, and this is a big however, that amount of visual data and asset streaming is already bottlenecking the renderer; it's bringing that GPU to its knees. There's very little cost to the CPU, as you will see below, but as noted about 100 different times on this website and scoffed at constantly by detractors: the GPU will always be the limiting factor.

lNv2lKl.png


I've maintained this since square one: Microsoft and Sony both went overkill on their SSDs. That amount of I/O increase cannot be matched by the rendering pipeline in terms of the on-demand volume of data streaming these SSDs allow.

So what's the point here? You've got two systems with SSDs far more capable than they need to be, but one came at a particularly high cost everywhere else in the system. I'll let you figure out which one that is and where.
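For scale, here is a back-of-the-envelope look at what refilling a 768 MB pool costs on different drives. The throughput figures are typical spec-sheet numbers (SATA ~550 MB/s, XSX raw 2.4 GB/s, PS5 raw 5.5 GB/s), not measurements from the demo.

```python
# How long a cold refill of the 768 MB streaming pool would take at raw drive speeds.
# Throughputs are typical spec-sheet figures, not measurements from the demo.

pool_mb = 768

drives_mb_per_s = {
    "SATA SSD (~550 MB/s)":       550,
    "XSX NVMe, raw (2400 MB/s)":  2400,
    "PS5 NVMe, raw (5500 MB/s)":  5500,
}

for name, speed in drives_mb_per_s.items():
    print(f"{name}: {pool_mb / speed:.2f} s to fill the whole pool from scratch")
```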

deadest.png
Seems like the GPU doesn't have a problem with this demo's assets; it only has a problem with the new GI, Lumen. So games with baked lighting should have no problem running at 60 FPS and probably at a higher resolution.

Also, TBH, by the time developers start implementing such tech in their games (it will take at least 3 years; even Guerrilla Games hasn't implemented similar tech yet, and the engine won't release officially before next year), the PS5 Pro and XSXXX will be out and they will push the requirements further.

I expect the XSXXX to put its money on an enhanced SSD, while the PS5 Pro will comfortably focus on a stronger GPU.
 

Grinchy

Banned
Sony and MS are developing pieces of hardware that will bring in many billions of dollars. They get to hire the absolute best engineers in the world for that task.

Imagine being a random internet guy who looked at a single number somewhere and being completely convinced that he knew more than those world-class engineers. It's a very high level of delusion.

This current thread and this one make me think of the manic phase of bipolar disorder. Not saying that's what this is, just noting that it's all I can think of when I read some of these posts.

QB2OIE4.png
 

Alexios

Cores, shaders and BIOS oh my!
Disclaimer: I have no idea about tech and just want to add another point of view.

Apart from what Dr. Bass has said about latency, if I remember correctly they stated in an interview that they only used a handful of different textures to create this demo (I guess apart from the statues, which seem to be unique). The rest was done by manipulating those textures to create the illusion that it's all different stuff.
So from that point of view, this is a very simple scene with just a few rock textures, one character, no NPCs, hardly any audio sources, no AI, and except for the flying part at the end (which I still find impressive no matter what) it's rather slow-paced, so nothing needs to get in quickly.

So how would that number grow in, say... a city like New York? With loads of different and unique textures, massive numbers of NPCs with AI, hundreds of sound sources and some guy going really fast on top of it all?

Genuine question. Would it go up several times, or is it just a minor increase since none of these things take that much more data? Or how does it work?
A city would divide the same resources among more types of models and textures. So (uber-simplification ahoy) instead of having one 8K rock texture you'd have two 2K building textures and two 2K road textures and so on. Same for triangle count per object and whatnot. This one character in the tech demo uses many more polys than some random crowd of GTA6 NPCs will use, because that's all they had to show and they used up all the budget they had to make it look its best, to showcase the engine and system. Just like fighting game characters look better than random action game characters, because you just have two of them and an arena; all the budget goes into that, not a whole city scene with tens of NPCs in it (though of course with tessellation and such, those could have similar enough detail when the camera is right up in their face while the rest of the scene gets lower detail).
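A rough illustration of that budget-split arithmetic. The texture sizes come from the example above; the uncompressed 4-bytes-per-texel assumption is mine, just to keep the numbers simple (real assets use block compression, but the ratios stay the same).

```python
# Texel/memory arithmetic for the one-big-texture vs several-smaller-textures point above.
# Assumes uncompressed RGBA (4 bytes per texel) purely for simplicity.

def texture_mb(resolution, bytes_per_texel=4):
    """Memory footprint of a square texture, in MB."""
    return resolution * resolution * bytes_per_texel / (1024 * 1024)

one_8k_rock = texture_mb(8192)
four_2k_city = 4 * texture_mb(2048)   # e.g. two building + two road textures

print(f"one 8K texture:   {one_8k_rock:.0f} MB")
print(f"four 2K textures: {four_2k_city:.0f} MB")
print(f"ratio:            {one_8k_rock / four_2k_city:.0f}x")
```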
 

mckmas8808

Mckmaster uses MasterCard to buy Slave drives
I personally am no tech wizard. But I am reasonably capable of applying logic. And logic says that after countless leaks and rumors on the net about next gen have turned into nonsense, the last thing I should do is trust a forum poster's claim that Sony's engineers suddenly shit the bed and designed a feature that will never be used. If it turns out true in 3 years, then bravo. But for now I think it's best to trust Sony's engineers over anyone's claim that they designed a useless feature months before launch.

Plus all of MS' engineers. The OP is saying both SSDs are over-engineered and a waste of money. Some people really think they know better than some of the best engineers the world offers.
 
So then, you think the systems need more GPU and not to focus so much on SSD? Even when JayzTwoCents is posting a video saying the Series X is somewhere between a 2080 Super and a Ti?

Somethin don't add up here.
Of course it adds up. Games now are fed streamed data in double-digit megabytes. If they need new data, it all has to be pulled off of these slow mechanical drives and either passes directly through the RAM to the GPU or is stored in RAM to continuously feed a fixed set of data to the GPU.

These SSDs fundamentally change both how data is delivered and how much of it can be delivered. We're talking 20 or more times the amount of visual data that can be instantaneously accessed, and that has major rendering implications.
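As a rough sketch of that jump (the HDD and SSD throughputs and the ~2x compression factor are assumed typical values, not measured figures):

```python
# Per-frame data delivery at 60 fps, old mechanical drive vs next-gen NVMe.
# Throughputs and the ~2x compression factor are assumed typical values.

fps = 60
drives_mb_per_s = {
    "7200rpm HDD (~100 MB/s)":       100,
    "PS5 NVMe raw (5500 MB/s)":      5500,
    "PS5 NVMe with ~2x compression": 11000,
}

hdd_per_frame = drives_mb_per_s["7200rpm HDD (~100 MB/s)"] / fps
for name, speed in drives_mb_per_s.items():
    per_frame = speed / fps
    print(f"{name}: {per_frame:.1f} MB per frame ({per_frame / hdd_per_frame:.0f}x the HDD)")
```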
 

psorcerer

Banned
psorcerer might be able to shed some light on the matter if he has time.

768MB of streaming buffer.
Probably 3 frames of 256MB i.e. it needs to load 256-512MB of data per frame.
Raw speed of PS5 SSD at 30 fps is ~185MB per frame. With 2x compression we get to 370MB per frame. Seems to be inside the 256-512 target.
Dunno what confuses OP here. Maybe math wasn't the strongest subject in school...
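Spelled out, that arithmetic looks like this, using psorcerer's own assumptions (a 768 MB pool split across roughly 3 frames' worth of data, 5.5 GB/s raw, ~2x compression, 30 fps):

```python
# psorcerer's arithmetic spelled out, using the assumptions in his post:
# 768 MB pool ~= 3 frames of data, 5.5 GB/s raw PS5 throughput, ~2x compression, 30 fps.

pool_mb = 768
per_frame_need_mb = pool_mb / 3                 # ~256 MB per frame (his low estimate)

raw_gb_per_s = 5.5
fps = 30
compression = 2.0

raw_mb_per_frame = raw_gb_per_s * 1000 / fps               # ~183 MB per frame
compressed_mb_per_frame = raw_mb_per_frame * compression   # ~367 MB per frame

print(f"needed (low estimate):  {per_frame_need_mb:.0f} MB/frame")
print(f"raw drive speed:        {raw_mb_per_frame:.0f} MB/frame")
print(f"with ~2x compression:   {compressed_mb_per_frame:.0f} MB/frame")
```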
 

psorcerer

Banned
768MB of streaming buffer.
Probably 3 frames of 256MB i.e. it needs to load 256-512MB of data per frame.
Raw speed of PS5 SSD at 30 fps is ~185MB per frame. With 2x compression we get to 370MB per frame. Seems to be inside the 256-512 target.
Dunno what confuses OP here. Maybe math wasn't the strongest subject in school...

P.S. seems like 9-18 fps for the XBSX in the same scene.
Depends on how well its Velocity Architecture really works.
 
Last edited:
768MB of streaming buffer.
Probably 3 frames of 256MB i.e. it needs to load 256-512MB of data per frame.
Raw speed of PS5 SSD at 30 fps is ~185MB per frame. With 2x compression we get to 370MB per frame. Seems to be inside the 256-512 target.
Dunno what confuses OP here. Maybe math wasn't the strongest subject in school...
P.S. seems like 9-18 fps for the XBSX in the same scene.
Depends on how well its Velocity Architecture really works.
The things you people pull out of your asses are incredible lol...
 
One number comes from the OP, and from that number everything he said is just made up.

Don't be so obtuse.

If I don't get the math? Every line and number that guy just rambled off came out of nowhere and from nothing. He's just making up figures.

This is beyond me being pedantic, btw. Every line and number he rambled off came from the number in your OP. It IS all theoretical, but theoretical based off the number YOU shared.
 
Last edited:
Having trouble with a figure of speech? Jesus...

Dude, you are. I just said the problem with what you're saying is that the numbers are derived from what you shared; the fact that you don't understand the theoretical aspect of this is beside the point. You tried downplaying the SSD in the PS5 by telling us the size of the data being streamed, which is NOT the same as how fast that data can be streamed.
 
Dude, you are. I just said the problem with what you're saying is that the numbers are derived from what you shared; the fact that you don't understand the theoretical aspect of this is beside the point. You tried downplaying the SSD in the PS5 by telling us the size of the data being streamed, which is NOT the same as how fast that data can be streamed.
There's one number and no possible way to build ANY corresponding figures off of it unless of course you're just making things up...

It's total nonsense, everything that guy said is nonsense. Is that clear enough or do you want a diagram?
 
There's one number and no possible way to build ANY corresponding figures off of it unless of course you're just making things up...

It's total nonsense, everything that guy said is nonsense. Is that clear enough or do you want a diagram?

You gave us the number for how big the data being streamed was. Using known numbers for SSD throughput, we can then figure out how long it takes to stream that amount and how fast it can be done. So far the only thing you HAVEN'T been given by people responding is a diagram, so maybe you're the one who needs it?
 
You gave us the number for how big the data being streamed was. Using known numbers for SSD throughput, we can then figure out how long it takes to stream that amount and how fast it can be done. So far the only thing you HAVEN'T been given by people responding is a diagram, so maybe you're the one who needs it?
Quit wasting my time with these devil's advocate side arguments.
 

JeloSWE

Member
768MB of streaming buffer.
Probably 3 frames of 256MB i.e. it needs to load 256-512MB of data per frame.
Raw speed of PS5 SSD at 30 fps is ~185MB per frame. With 2x compression we get to 370MB per frame. Seems to be inside the 256-512 target.
Dunno what confuses OP here. Maybe math wasn't the strongest subject in school...
I don't think that is a streaming buffer in the video sense. While it's called a streaming pool, I find it implausible that it has to be refilled continuously every frame. The pool simply holds the data that is needed for the time being, and you can swap data into it as you move through the level or rotate the camera.
 
Last edited:
I don't think that is a streaming buffer in the video sense. While it's called a streaming pool, I find it implausible that it has to be refilled continuously every frame. The pool simply holds the data that is needed for the time being, and you can swap data into it as you move through the level or rotate the camera.

See, now this is an actual response to that post, not just assuming the numbers are made up or an ass-pull. Thank you for saving the thread.
 
P.S. seems like 9-18 fps for the XBSX in the same scene.
Depends on how well its Velocity Architecture really works.

That's assuming the compression method, data streaming method, etc. were the same. They wouldn't be, since it implements its I/O differently, but it also has other ways to offset areas it might not particularly excel at in order to reach a vaguely similar fidelity.

Don't forget the GPU actually still has to do work here, and the GPU RAM pool can be polled faster in the XSX's case, which would also be a benefit.
 

MrMiyagi

Banned
Well, I also questioned why, for the PlayStation 4 Pro, they put in a 128% more powerful GPU and a 31% more powerful CPU but only bothered to increase the memory bandwidth by a mere 23%, which resulted in their GPU being bottleneck-prone.

After the original PS4, their hardware decisions have been a bit questionable.

You're kind of missing the point: that mere 768MB is bottlenecking their GPU already... It's never been about how much you can stream; it's about how much the renderer can take before it collapses. And that is proving to be a considerably lower figure than the drives themselves are spec'd for.
Didn't Epic say the demo used about the same GPU resources as Fortnite? I'm also no tech expert, but from what I understand Sony set a hard target of 5 GB/s throughput because that's what's required to stream 2GB of data in the time it takes to turn your game character around (about 0.25 seconds).

This all sounds like pretty good news to me too lol.
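The arithmetic behind that kind of target is straightforward. The throughput values below are illustrative ballpark figures, not official Sony numbers.

```python
# Time to stream a 2 GB chunk of assets at various effective throughputs.
# The throughput values are illustrative ballpark figures.

chunk_gb = 2.0

for label, gb_per_s in [
    ("SATA SSD (~0.55 GB/s)", 0.55),
    ("PS5 raw (5.5 GB/s)", 5.5),
    ("PS5 with typical compression (~9 GB/s)", 9.0),
]:
    print(f"{label}: {chunk_gb / gb_per_s:.2f} s to stream {chunk_gb:.0f} GB")
```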
 

Eliciel

Member
I've read a lot of back and forth in here, and whenever there's an idea of what it could have meant, minutes later there's a lot of use of the word "assuming".
In that case it's like talking to a wall. Let's just wait and see, maybe? I've read through 7 pages and we can't really seem to agree on the very vague meaning of the message.
It seems the information isn't sufficient for us to agree. Maybe let's call it a day for the time being, guys!
 
Last edited:

Gradly

Member
There's a difference between a limited tech demo and full open-world games. It's not hard to wait and see.

Edit: at least we won't see pop-in anymore, and we'll be able to travel like The Flash at high speed without buffering or loading.
 
Last edited:

01011001

Banned
Didn't Epic say the demo used about the same GPU resources as Fortnite? I'm also no tech expert, but from what I understand Sony set a hard target of 5 GB/s throughput because that's what's required to stream 2GB of data in the time it takes to turn your game character around (about 0.25 seconds).

This all sounds like pretty good news to me too lol.

the GEOMETRY took the same amount of resources as the GEOMETRY in Fortnite.
this is not including shading said geometry, texturing, rendering effects and particles, or anti-aliasing and other post-processing.
Epic Games/Nick Penwarden said:
I can say that the GPU time spent rendering geometry in our UE5 demo is similar to the geometry rendering budget for Fortnite running at 60fps on consoles.

they gave that estimate out there to show that Nanite is extremely effective at scaling geometry according to the resolution needed.
 
Last edited:

onQ123

Member
You do know that this 768MB pool is for less than a second of gameplay, right? This is a pool that's being refilled as you move. If you're thinking that the 768MB of data is some kind of "ha ha, who needs 5GB/s when the streaming pool is only 768MB", you're wrong.
 
Didn't Epic say the demo used about the same GPU resources as Fortnite? I'm also no tech expert, but from what I understand Sony set a hard target of 5 GB/s throughput because that's what's required to stream 2GB of data in the time it takes to turn your game character around (about 0.25 seconds).

This all sounds like pretty good news to me too lol.
No, the G-buffer latency was the same as Fortnite's, not the actual rendering demand.
 

onQ123

Member
768MB of streaming buffer.
Probably 3 frames of 256MB i.e. it needs to load 256-512MB of data per frame.
Raw speed of PS5 SSD at 30 fps is ~185MB per frame. With 2x compression we get to 370MB per frame. Seems to be inside the 256-512 target.
Dunno what confuses OP here. Maybe math wasn't the strongest subject in school...

I'm wondering if the 768MB is a hint to the size of the large pool of SRAM on the PS5 chip

 

Jigsaah

Gold Member
I just don't understand how PCs using similar tech with an SSD in them can render and stream just fine, but the consoles somehow wouldn't. It just doesn't make sense.
 

Allandor

Member
You do know that this 768MB pool is for less than a second of gameplay, right? This is a pool that's being refilled as you move. If you're thinking that the 768MB of data is some kind of "ha ha, who needs 5GB/s when the streaming pool is only 768MB", you're wrong.
The streaming pool size has nothing to do with time.
It's just a cache. Some data must be updated for the next x frames, and most data can stay. You would only reload everything if everything changed from one frame to the next, but in games you almost always have some static content shared across multiple frames.
It is just not as simple as dividing the whole bandwidth by the number of frames per second; that way you would use your bandwidth very inefficiently. In 99% of rendered frames you have mostly the same content as the previous frame, with only minor differences. Even a few frames ahead you still have mostly the same content, and that is data you don't want to throw away.

Also, you don't want to max out the frame-to-frame budget, so you always leave headroom, or else it could get pretty ugly during fast movement.
For example, in the Unreal demo the fast-moving flying sequence at the end had to change the cache contents much faster than the slow-moving sections did. So either the game needs to preload, or the slow sequence was never even close to touching the max bandwidth. The demo always loaded the max-quality assets at all times (according to the video from Epic).

BTW, they mention that they compress the cached data to reduce its size. That also costs latency and memory bandwidth, but that could be the use case for why Sony and MS implemented not only decompression but also compression in hardware.
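A toy model of that "mostly the same content between frames" point. The asset names, sizes, and per-asset granularity are invented for illustration; real engines stream per tile/mip, but the principle is the same: only the frame-to-frame delta ever touches the drive.

```python
# Toy streaming pool as a cache: only the frame-to-frame delta is read from the SSD.
# Asset names and sizes are invented; real engines stream at tile/mip granularity.

pool = {}  # asset name -> resident size in MB

def stream_frame(needed):
    """Evict what's no longer needed, load what's missing, return MB read this frame."""
    for name in list(pool):
        if name not in needed:
            del pool[name]            # eviction costs no drive bandwidth
    loaded = 0
    for name, size in needed.items():
        if name not in pool:
            pool[name] = size         # only this part hits the SSD
            loaded += size
    return loaded

frame1 = {"rock_a": 120, "rock_b": 90, "statue": 200, "cliff_near": 150}
frame2 = {"rock_a": 120, "rock_b": 90, "statue": 200, "cliff_far": 160}  # camera moved a little

print(f"frame 1 reads {stream_frame(frame1)} MB")   # cold start: everything loads
print(f"frame 2 reads {stream_frame(frame2)} MB")   # only the delta loads
```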
 

Allandor

Member
I just don't understand how PCs using similar tech with an SSD in them can render and stream just fine, but the consoles somehow wouldn't. It just doesn't make sense.
Because a PC uses its main memory to cache assets and, when the GPU requests them, copies them into GPU memory. That is why PC games use a large portion of main memory even though they often have large GPU memory pools. PC games do that because they also have to support HDDs and SSDs of various speeds. Game logic usually has a very small memory footprint. The current-gen consoles, with one major memory pool, don't need that double caching. PCs still use it because the memory is there to be used and, like I wrote, game logic is normally quite small.
Using the textures etc. directly from main memory (which would be possible) is still too slow, so those resources get copied over. BTW, that concept destroys the theories that no caching at all would be required with the new SSD in the PS5. Even main memory, which is much, much faster than an SSD, cannot feed such big GPUs with assets. That (with system RAM) only works for small integrated GPUs, and even there it can become a bottleneck.
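Roughly, the two-hop PC path described above looks like this. It's a sketch with invented asset names and sizes; real engines use dedicated staging/upload heaps rather than plain dictionaries.

```python
# Sketch of the two-hop PC path: disk -> system RAM cache -> GPU memory.
# Asset names and sizes are invented for illustration.

system_ram_cache = {}   # staging copies of assets read from disk
vram = {}               # what the GPU can actually sample from

def request_asset(name, size_mb):
    if name in vram:
        return f"{name}: already in VRAM, nothing to do"
    if name not in system_ram_cache:
        system_ram_cache[name] = size_mb     # hop 1: disk -> system RAM (drive-speed bound)
        first_hop = "read from disk, "
    else:
        first_hop = "hit in RAM cache, "
    vram[name] = system_ram_cache[name]      # hop 2: system RAM -> VRAM copy over PCIe
    return f"{name}: {first_hop}copied to VRAM ({size_mb} MB)"

print(request_asset("rock_a", 120))   # cold request: both hops
print(request_asset("rock_a", 120))   # warm request: no work at all
```

On a console's single unified memory pool, the second hop and the duplicate copy in system RAM disappear, which is the point being made above.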
 
Last edited:

Jigsaah

Gold Member
Because a PC uses its main memory to cache assets and, when the GPU requests them, copies them into GPU memory. That is why PC games use a large portion of main memory even though they often have large GPU memory pools. PC games do that because they also have to support HDDs and SSDs of various speeds. Game logic usually has a very small memory footprint. The current-gen consoles, with one major memory pool, don't need that double caching. PCs still use it because the memory is there to be used and, like I wrote, game logic is normally quite small.
Using the textures etc. directly from main memory (which would be possible) is still too slow, so those resources get copied over. BTW, that concept destroys the theories that no caching at all would be required with the new SSD in the PS5. Even main memory, which is much, much faster than an SSD, cannot feed such big GPUs with assets. That (with system RAM) only works for small integrated GPUs, and even there it can become a bottleneck.
So what would be the end result for the XSX and the PS5 due to this supposed disadvantage?
 