
Epic sheds light on the data streaming requirements of the Unreal Engine 5 demo

psorcerer

Banned
I don't think that is a streaming buffer as with videos; while it's called a streaming pool, I find it implausible that it has to be filled continuously every frame. The pool simply holds the data that is needed for the time being, and you can swap data into it as you move through the level or rotate the camera.

Yup. I think it holds data for 3 frames.
Because that's actually the smallest pool possible for a triple-buffered render.
It may be more. But why?
 

psorcerer

Banned
That's assuming the compression method, data streaming method etc. were the same, which they wouldn't be, since it implements its I/O differently but also has other ways to offset areas it might not particularly excel at to reach a vaguely similar fidelity.

Don't forget the GPU actually still has to do work here, and the GPU RAM pool can be polled faster in XSX's case which would also be a benefit.

You should understand that it's tongue in cheek. Obviously it can run much better. We don't know yet.
 

Paracelsus

Member

G3WYAka.png


Comments?
 

Dory16

Banned
OP, just clarify the "the GPU is bottlenecked at 768MB/s" part. That the throughput demand was "only" 768 MB/s, I get that. But how do we know that the GPU couldn't have rendered smoothly even when receiving a higher throughput than that? Is it impossible that with more optimisation, the framerate and resolution of the demo could have been higher?
 
OP, just clarify the "the GPU is bottlenecked at 768MB/s" part. That the throughput demand was "only" 768 MB/s, I get that. But how do we know that the GPU couldn't have rendered smoothly even when receiving a higher throughput than that? Is it impossible that with more optimisation, the framerate and resolution of the demo could have been higher?
The assumption is that they wouldn't have run it on such a "low" resolution if they didn't have to.
 

JeloSWE

Member
Yup. I think it holds data for 3 frames.
Because that's actually the smallest pool possible for a triple-buffered render.
It may be more. But why?
Ok, let me be less vague. I'm saying that there is no need to constantly stream data to the memory pool. It can just sit there with the data needed to render the environment where you currently are. Only as you move around to a location where data is missing would it be read into the pool. It's not like a buffered movie. I don't think there is any reason to break it up into smaller pieces; you just add to it and remove the stuff no longer needed.
 

psorcerer

Banned
Ok, let me be less vague. I'm saying that there is no need to constantly stream data to the memory pool. It can just sit there with the data needed to render the environment where you currently are. Only as you move around to a location where data is missing would it be read into the pool. It's not like a buffered movie. I don't think there is any reason to break it up into smaller pieces; you just add to it and remove the stuff no longer needed.

But why? If you can stream in stuff fast enough why not do it?
Earlier you just couldn't do it on demand.
 

Allandor

Member
The assumption is that they wouldn't have run it on such a "low" resolution if they didn't have to.
But that should really be because of the other stuff, not because of geometry or streaming. But what is a game without more or less dynamic lighting...
 
Last edited:

bitbydeath

Member
I personally am no tech wizard. But I am reasonably capable of applying logic. And logic says that after countless leaks and rumors on the net about next gen have turned into nonsense, the last thing I should do is trust a forum poster's claim that Sony's engineers suddenly shit the bed and designed a feature that will never be used. If it turns out true in 3 years then bravo. But for now I think its best to trust Sony's engineers over anyone's claims they designed a useless feature months before launch.

Those features are going to be standard in every UE5 game, and it looks like Square is jumping on board with their own version too, as Project Athia looks similar graphically from what I can tell and it's due out next year.

 

JeloSWE

Member
But why? If you can stream in stuff fast enough why not do it?
Earlier you just couldn't do it on demand.
You could in theory stream in assets at 5.5~9GB/s when needed, of course, but the point of the pool is not to be constantly refreshed with data. You would only read in new stuff as needed. My point is not to think of the pool as a streaming buffer specifically. :messenger_smiling:
 
I personally am no tech wizard. But I am reasonably capable of applying logic. And logic says that after countless leaks and rumors on the net about next gen have turned into nonsense, the last thing I should do is trust a forum poster's claim that Sony's engineers suddenly shit the bed and designed a feature that will never be used. If it turns out true in 3 years then bravo. But for now I think its best to trust Sony's engineers over anyone's claims they designed a useless feature months before launch.
Nobody says it's useless. It's just overengineered for the purpose; possibly as a marketing tool.
You're not trusting their engineers here, you're trusting their marketing.
 

carlosrox

Banned
Developers and publishers will always ride Sony dick so I'm expecting games to perform solid on PS5 no matter what.

The worst Sony ever did was the PS3 and they still saw all the support and ended up selling more that gen.

I almost expect the PS5 to be lead platform for most games but we'll see how the developers feel.
 
768MB/s is already like 20x the throughput of what we see now in games. People are having a tough time grasping that this is an immense amount of data, and at the same time, given the specifications of these SSDs, they're largely underutilized.

:pie_eyeroll:

No, it's not 768MB/s. The pool is 768MB for the in-view data, and that is for mesh data only. How much you need to update that pool depends on how much you want or need to change per frame, but that is only about mesh data, so a scene that changes quickly (or a moving camera) may need to update huge chunks of that 768MB buffer. In practice you may not need to update all of it per frame (that would be the worst-case scenario), so they can probably increase the buffer size to account for the delay; that is why they say they want to further compress mesh data. Add textures to the equation and you realize you need as much SSD speed as possible, especially at 60 fps, or a bigger buffer in RAM, or a combination of both.
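To put rough numbers on that, here's a back-of-the-envelope sketch (the throughput figures are just the publicly quoted raw/compressed rates, not anything measured from the demo) of how much of a 768MB pool could even be rewritten per frame:

```python
# Back-of-the-envelope: what fraction of a 768 MB streaming pool can be
# rewritten each frame at a given sustained SSD throughput? Illustrative only.
POOL_MB = 768

def pool_fraction_per_frame(throughput_gb_s: float, fps: int) -> float:
    budget_mb = throughput_gb_s * 1024 / fps   # MB deliverable within one frame
    return budget_mb / POOL_MB

for label, gb_s in [("XSX raw", 2.4), ("PS5 raw", 5.5), ("PS5 compressed (typ.)", 9.0)]:
    for fps in (30, 60):
        print(f"{label:22} @ {fps} fps: ~{pool_fraction_per_frame(gb_s, fps):.0%} of the pool")
```

So even at compressed rates, only a fraction of the pool can turn over in any one frame at 60 fps, which is exactly why the worst case described above matters.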
 
Last edited:

psorcerer

Banned
You could in theory stream in assets at 5.5~9GB/s when needed, of course, but the point of the pool is not to be constantly refreshed with data. You would only read in new stuff as needed. My point is not to think of the pool as a streaming buffer specifically. :messenger_smiling:

Obviously you don't need to stream in stuff when the camera is still (although I suspect that Nanite does temporal improvements to the IQ).
But if you can stream in stuff constantly when the camera moves, I would do it. It just simplifies stuff a lot.
 

JeloSWE

Member
Obviously you don't need to stream in stuff when the camera is still (although I suspect that Nanite does temporal improvements to the IQ).
But if you can stream in stuff constantly when the camera moves, I would do it. It just simplifies stuff a lot.
But if you are in a closed room with one statue, it all fits in the pool. There is no need to stream anything into the pool. That is what I'm trying to convey in the simplest form. You only stream stuff in as needed.
 

bitbydeath

Member
It's just an image, no link?

I don’t know how to Twitter 🤷‍♂️

Edit: Here’s the article link-

Also found this interesting in my search.

“The ability to stream in content at extreme speeds enables developers to create denser and more detailed environments, changing how we think about streaming content. It’s so impactful that we’ve rewritten our core I/O subsystems for Unreal Engine with the PlayStation 5 in mind,” he added.

 
Last edited:

Elog

Member
Sony's approach maximizes a focus on bandwidth, Microsoft's maximizes a focus on latency. This doesn't mean they are necessarily lacking in the other area, just that their main priority is in one of them. They are both very valid approaches but a large contingent of people view Cerny's approach as the only possible solution. The truth is that he is not the first person to notice "Hm, there's some bottlenecks here. Let's try fixing them!", not even by a long shot. And he is not the only one who has found solutions to this problem, either.

Thanks as always for a good discourse!

The only statement I disagreed with is the bolded one above. From the information I have seen and received, the PS5 actually delivers both higher bandwidth and lower latency in terms of I/O, and the difference between the two platforms is actually significantly larger in latency than in bandwidth (which in turn means that the practical bandwidth difference is very much in favor of the PS5 when reading hundreds of files such as textures, since latency then dominates the use case). The reason for this is two-fold: 1) limited to no overhead in reading and decompressing textures from the SSD into RAM due to dedicated silicon for both steps, and 2) cache scrubbers that allow a faster move of data from RAM to the GPU cache. The XSX, AFAIK, has CPU/driver overhead in reading the textures into RAM after decompression and a more standard PC solution for moving data from RAM to the GPU cache (my two sources for this are public information and dialogue with one developer who has direct access to one dev kit and indirect access to the other dev kit - please note that he takes the NDAs seriously but has made a few remarks regarding publicly available information and how it relates to what he has seen).

May I ask what you base your statement on?

Edit: Missed one word!
 
Last edited:

RaySoft

Member
It's going to be a shocker for some and for others like me not so much, because I've been saying this for months now. I hate to break it to some of you but that demo's data streaming could be handled by a 5 year old SATA SSD.

8wl1rua.png


768MB is the in-view streaming requirement on the hardware to handle that demo, 768 MEGABYTES... COMPRESSED. And what was the cost of this on the rendering end?

Well, this is the result...

dQOnqne.png


This confirms everything I've said, not that these SSDs are useless, because they're 100% not. That data streaming would be impossible with mechanical drives, however, and this is a big however. That amount of visual data and asset streaming is already bottlenecking the renderer; it's bringing that GPU to its knees. There's very little cost to the CPU as you will see below, but as noted about 100 different times on this website and scoffed at constantly by detractors: the GPU will always be the limiting factor.

lNv2lKl.png


I've maintained this since square one: Microsoft and Sony both went overkill on their SSDs. That amount of I/O increase is not capable of aligning with the rendering pipeline in terms of the on-demand volume of data streaming these SSDs allow.

So what's the point here? You've got two systems with SSDs far more capable than their usefulness, but one came at a particularly high cost everywhere else in the system. I'll let you figure out which one that is and where.

deadest.png
Are you sure you interpret the term «streaming pool» correctly?
Could you please enlighten me as to what that is?
 
I guess someone should talk to MS and Sony to change their hardware design to good old SATA SSDs. Since this shows there's absolutely no benefit from using the revamped I/O that both manufacturers are using.

Or, this is just some early demo from an unfinished product on unfinished hardware that may well see further improvements.

Hmmmmmmm. I wonder which one.
 

onQ123

Member
But if you are in a closed room with one statue, it all fits in the pool. There is no need to stream anything into the pool. That is what I'm trying to convey in the simplest form. You only stream stuff in as needed.

768 MB of data in the pool depending on where you look, so going close up on the statue while still keeping the same amount of data streaming in would be like unlimited detail.
 
Last edited:
You should understand that it's tongue in cheek. Obviously it can run much better. We don't know yet.
It's hard to tell through text-only, at least for me :p

This isn't necessarily about XvA in particular, and it's utilizing a multi-hardware setup that obviously can't be directly applied to a single video game console, but the following does give some indication of the type of bandwidth Nvidia is getting via GPUDirectStorage (basically their implementation of DirectStorage) on 4x DGX-2 systems:



So they're hitting peaks of around 168 GB/s raw bandwidth on the SSD I/O with 4x DGX-2 systems. Cut that down by four and it's 42 GB/s raw bandwidth for a single DGX-2 system. Each DGX-2 has 16 Nvidia Tesla v100s, and each of those is about 14 TF on Volta architecture (which is older than Turing). Honestly though the GPUs aren't important here because we're just looking at SSD I/O capability with GPUDirectStorage in mind.

Each DGX-2 comes with 8x 3.84 TB SSDs. 42/8 gives 5.25 GB/s raw bandwidth per SSD. However, if you look at the SSD they actually use, the Micron 9200, it provides actual raw bandwidth of 3.5 GB/s, so the physical peak in this setup would be 112 GB/s. It's more likely that, since this is data NASA is using for simulation and would be uncompressed, they are using 4 additional Micron 9200s, because 42/12 gets you 3.5 GB/s.
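For what it's worth, that arithmetic checks out; reproducing it (the drive counts and the 3.5 GB/s per-drive figure are taken from the post above, so treat them as assumptions rather than verified specs):

```python
# Reproducing the DGX-2 figures quoted above; inputs are from the post, not verified.
measured_peak_gb_s = 168                   # observed across 4 DGX-2 systems
systems = 4
per_system = measured_peak_gb_s / systems  # 42 GB/s per DGX-2

drives_per_system = 8                      # SSDs per DGX-2 as shipped
micron_9200_gb_s = 3.5                     # per-drive sequential read, spec-sheet ballpark

print(per_system / drives_per_system)                    # 5.25 GB/s per drive -> above spec
print(systems * drives_per_system * micron_9200_gb_s)    # 112 GB/s ceiling with 8 drives each
print(per_system / micron_9200_gb_s)                     # 12.0 drives needed to reach 42 GB/s
```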

Unfortunately that doesn't really tell us much about compression capabilities, just that GPUDirectStorage (and therefore DirectStorage) is very scalable and networks really well with clusters of machines, which will have obvious benefits. I'm hoping I can find some figures from GPUDirectStorage tests run with compressed data, because it would be relatively easy to do a bit of work to figure out how that could translate to Series X SSD I/O performance at compressed rates (provided those tests are using all the same features as XvA; DirectStorage is just one part of XvA).

Thanks as always for a good discourse!

The only statement I disagreed with is the bolded one above. From the information I have seen and received, the PS5 actually delivers both higher bandwidth and lower latency in terms of I/O, and the difference between the two platforms is actually significantly larger in latency than in bandwidth (which in turn means that the practical bandwidth difference is very much in favor of the PS5 when reading hundreds of files such as textures, since latency then dominates the use case). The reason for this is two-fold: 1) limited to no overhead in reading and decompressing textures from the SSD into RAM due to dedicated silicon for both steps, and 2) cache scrubbers that allow a faster move of data from RAM to the GPU cache. The XSX, AFAIK, has CPU/driver overhead in reading the textures into RAM after decompression and a more standard PC solution for moving data from RAM to the GPU cache (my two sources for this are public information and dialogue with one developer who has direct access to one dev kit and indirect access to the other dev kit - please note that he takes the NDAs seriously but has made a few remarks regarding publicly available information and how it relates to what he has seen).

May I ask what you base your statement on?

Edit: Missed one word!

Well, to address what you're bringing up, I guess the best way I can put my perspective on it is like so:

Yep, we've known for a while about PS5's dedicated central processor for moving data in/out of RAM. But I think what isn't being considered there is that it still has to contend with other processors on the bus when doing this. I actually think this is another reason they need the cache scrubbers: if the dedicated processor in the I/O block is DMA'ing to the memory bus for read/write operations to RAM, the CPU, GPU etc. will have to wait their turn, as is to be expected with hUMA architectures. And rather than having the GPU, after finally getting bus access back, spend extra cycles copying data wholesale from RAM into its caches, the cache scrubbers are there to cut that time down. Hence the selective eviction of info within the GPU caches.

So the Series systems might not have the cache scrubbers (they may or may not have some equivalent, not necessarily the ECC-configured memory that's already been mentioned, which would serve a different role anyway), but part of the reason their design doesn't require it is because, at least in regards to CPU-bound tasks, they don't need to wait while a dedicated I/O processor does its thing to get access to the memory bus. So CPU-bound game logic can still access the bus if it needs to. Maybe not at the full bandwidth of the slower pool of GDDR6, but the capability is there. And keep in mind it's the same OS core on the Series systems handling that task, and we know what kind of CPUs these systems are using. Do we actually have information on what exactly the PS5's dedicated I/O processor is? Is it a repurposed Zen 2 core? If so, how? As in, is it cut down on local caches (which might also explain the reason for the Cache Coherency Engines, if it comes to that)?

Overall I think the estimate of overhead incurred by the Series systems for what you're describing is a bit much; keep in mind I believe MS were already aware of this, and therefore it could've been a factor in them clocking their CPUs higher, to account for such overhead, however much it may be. You also have to keep in mind they are not literally dropping some PC version of Windows 10 into the system and leaving it there. Whatever overhead you might associate with W10 (speaking of which, you can always cut it down to even under 512 MB if you really wanted, granted you lose out on a lot of features), you can't really automatically associate with Series X, because they already use their own OS, Xbox OS, that's built specifically for the console, even if it leverages Windows tech.

I don't know what you mean by "standard PC-like" solution regarding how Series X is moving data to/from RAM to/from storage. DirectStorage hasn't actually been fully deployed in the PC space yet, and other parts of XvA won't even be available on PC for a while. Nvidia GPUs support their own implementation of DirectStorage called GPUDirectStorage, and it appears to be extremely good (you can check out the clip I linked in this post replying to psorcerer). If that is "standard PC implementation", then it doesn't really matter much if it works well. MS are simply leveraging their strengths here, same as Sony with theirs, but one other thing to keep in mind is that MS are also using their developments with the Series systems to then leverage in other markets they're involved in, such as PC, mobile, server/data center and more.

Similarly I don't see where you are hearing the latency on PS5 side is leagues better. Sony actually haven't talked much of anything about latency in any aspect of their system. I don't doubt they have low latency, but I think MS have simply prioritized this moreso, while Sony have prioritized bandwidth moreso. One factor that benefits MS in latency, as I said before, is that they're using faster NAND devices with larger storage. You usually get higher bandwidth per NAND module with the bigger modules, and the latency figures also improve. If you'd like some thoughts on what could perhaps be providing them an advantage in latency figures (plus maybe some speculation on some parts of XvA, particularly the "100 GB pool partition"), I'll quote a couple of insightful posts I read over on B3D:

function:

I've been wondering about the "Velocity Architecture" and MS's repeated mentions of low latency for their SSD - something that's been sort of backed up by the Dirt 5 technical director. There's also the talk of the "100GB of instantly accessible data" aka "virtual RAM".

Granted I could be reading too much into some fairly vague comments, but I think there's probably something to them, and also that the two things are possibly related. So I think that maybe one of the key things that allows MS to have such (presumably) low latency from the SSD is also responsible for the strange-seeming "100GB" figure.

Now I'm assuming that the "virtual memory" is storing data as if it were already in, well, memory. So the setup, initialisation and all that is already done, and that saves you some time and overhead when accessing from storage compared to, say, loading assets from an SSD on PC. But this virtual memory will need to be accessed via a page table that then has to go through a Flash Translation Layer. Normally this FTL is handled by the flash controller on the SSD, accessing, if I've got this right, an FTL stored either in an area of flash memory, or in DRAM on the SSD or on the host system.

XSX has a middling flash controller, and no DRAM on the SSD. So that should be relatively slow. But apparently it's not (if we optimistically run with the comments so far).

My hypothesis is that for the "100 GB of virtual RAM", the main SoC is handling the FTL, doing so more quickly than the middling flash controller with no DRAM of its own, and storing a 100GB snapshot of the FTL for the current game in an area of system reserved/protected memory to make the process secure for the system and transparent to the game. Because this is a proprietary drive with custom firmware, MS can access the drive in a "raw mode"-like way, bypassing all kinds of checks and driver overhead in a way that simply couldn't be done on PC, and because it's mostly or totally read access other than during install/patching, data coherency shouldn't be a worry either.

My thought is that this map of physical addresses for the system-managed FTL would be created at install time, updated when wear-levelling operations or patching take place, and stored perhaps in some kind of metadata file for the install. So you just load it in with the game.

And as for the "100GB" number, well, the amount of reserved memory allocated to the task might be responsible for the arbitrary seeming 100GB figure too.

The best I could find out from Google, in an MS research paper from 2012 (https://static.usenix.org/events/fast12/tech/full_papers/Grupp.pdf), was that they estimated the FTL might be costing about 30 microseconds of latency, which wouldn't be insignificant if you could improve it somewhat.
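To make the idea concrete, here's a purely illustrative sketch (invented page size, entry width and numbers; nothing here reflects real Xbox or SSD firmware) of what a host-managed logical-to-physical page map would amount to:

```python
# Purely illustrative: a host-side FTL snapshot as hypothesized above.
# Page size, entry width and all numbers are assumptions, not real firmware details.
PAGE_SIZE = 4096                      # bytes per flash page (a typical value)
WINDOW_BYTES = 100 * 1024**3          # the "100 GB" instantly-accessible window

entries = WINDOW_BYTES // PAGE_SIZE   # one mapping entry per logical page
table_mb = entries * 4 / 1024**2      # assuming a 4-byte physical page index
print(f"{entries} entries -> ~{table_mb:.0f} MB of reserved RAM for the map")

# Filled at install time, patched on updates/wear levelling, loaded with the game.
logical_to_physical = {0: 7_340_032, 1: 7_340_033}   # toy entries

def resolve(logical_page: int) -> int:
    """Translate on the main SoC instead of in the drive's own controller."""
    return logical_to_physical[logical_page]
```

Under those made-up assumptions the map itself is on the order of 100 MB, which is the kind of reserved-memory cost mentioned in the downsides below.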

So the plus side of this arrangement would be, by my thinking:
- Greatly reduced read latency
- Greatly improved QoS guarantees compared to PC
- No penalty for a dram-less SSD
- A lower cost SSD controller being just as good as a fast one, because it's doing a lot less
- Simplified testing for, and lower requirements from, external add on SSDs

The down sides would be:
- You can only support the SSDs that you specifically make for the system, with your custom driver and custom controller firmware
- Probably some additional use of system reserved dram required (someone else will probably know more!)

dsoup:

I can't offer much insight because, as you said, these are thoughts based on a number of vague comments, and much of my commentary was about the Windows I/O stack, which is likely very different from the Xbox Series X, but it would indeed be truly amazing if Sony have prioritised raw bandwidth and Microsoft have prioritised latency.

My gut tells me that if this is what has happened, they'll largely cancel each other out except in cases where one scenario favours bandwidth over latency and another favours latency over bandwidth. Next-gen consoles have 16 GB of GDDR6, so raw bandwidth is likely to be preferable in cases where you want to start/load a game quicker, e.g. loading 10 GB in 1.7 seconds at 100ms latency compared to 3.6 seconds at 10ms latency. Where the latency could make a critical difference is frame-to-frame rendering and pulling data off the SSD for the next frame, or the frame after.
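To put toy numbers on that trade-off (purely illustrative; the device figures just echo the example above), model the total time as per-request latency times the number of requests, plus bytes over bandwidth:

```python
# Toy model of the bandwidth-vs-latency trade-off described above (requests issued serially).
def load_time(total_gb: float, requests: int, bandwidth_gb_s: float, latency_ms: float) -> float:
    return requests * latency_ms / 1000 + total_gb / bandwidth_gb_s

high_bw = dict(bandwidth_gb_s=6.0, latency_ms=100)   # bandwidth-focused device
low_lat = dict(bandwidth_gb_s=3.0, latency_ms=10)    # latency-focused device

print(load_time(10, 1, **high_bw), load_time(10, 1, **low_lat))        # one bulk 10 GB load
print(load_time(0.2, 200, **high_bw), load_time(0.2, 200, **low_lat))  # 200 small 1 MB reads
```

The bulk load favours the bandwidth-focused device, while the many small per-frame reads favour the latency-focused one, which is exactly the distinction being drawn above.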

At the end of the day, I can't claim I have connections to actual developers. Yes, I've talked to developers, but it's been mainly through this forum and in public posts, and a couple in private, discussing the next-gen systems. With one in particular we've ultimately ended up in a pretty stark disagreement, because a few things they mentioned didn't add up with actual publicly available information, and that kept happening too often.

However, I've definitely taken the time to read up on and research so much of this stuff, it's not even funny. Because I like trying to make sense of all of this. I'm also the kind of guy who likes forming their own opinion by listening to as many well-reasoned perspectives as possible, even if they come into conflict with each other on where they stand here or there. That helped a lot when discussing the GPU leaks, and I think it's helping a lot here, too.

Trust me, if you can name a particularly technically well-reasoned person here, on Era, Beyond3D wherever, or the insiders, data miners through Twitter (Rogame, Komachi etc.)...chances are I've heard of them and seen what they've had to say. And hard data too, of course ;) Actual developer statements, too, even the controversial ones like the Crytek guy's. I've always seen merit in all the stuff they've presented forward even if there are parts I either don't understand at first or don't agree with. The fun's in taking all of those points and trying to see what's all correlating together, however disparately, and how.

And from that I try forming my own perspective on it, even if it's in flux on parts. I'm no expert on this, none of us are, and we all have our preferences when it comes to certain technological features, standards, methodologies, systems, architectures etc. shaped from first-hand experience, education, learned knowledge and critical thinking & logical reasoning.
 
Last edited:

Cock of War

Member
The entire point of the SSD is volume capability not possible with the limited capacity of dedicated RAM; you're going to swap out data as infrequently as possible. That's the entire point: scene volume, complexity and consistency.

In terms of Lumen and Nanite, that's a non-starter conversation because you still need lighting, you still need shadows, you still need AA, AF, possibly RT and so on and so forth. The combined effect of full scene rendering is heavy demand on the GPU, and the capability of something like Nanite requires higher accuracy and scaling from a system like Lumen; there's a knock-on effect across the entire rendering stack.

It's all related. I've already got my bases covered, I'm not some shave tail louie.

I feel like OP is doing some 4D-chess-level concern trolling with added salt and gaslighting. Spouting random tech words with reckless abandon, and when I read the bolded I was reminded of this:



Subsequently I've read his posts in this thread in that kid's voice.

I don't have much to add to this discussion other than I guess paradigm shifts are really hard for some people to comprehend. At this point tho, it's bordering on modern day flat earthers level of logic.
 

RaySoft

Member
I guess someone should talk to MS and Sony to change their hardware design to good old SATA SSDs. Since this shows there's absolutely no benefit from using the revamped I/O that both manufacturers are using.

Or, this is just some early demo from an unfinished product on unfinished hardware that may well see further improvements.

Hmmmmmmm. I wonder which one.
OP just didn't understand the streaming aspect of it.
Take that 768MB pool and multiply it by 30fps (768x30 = 23040MB) and you end up with 22.5GB/s.
In other words, right at the PS5's decompression hardware's max throughput, which is 22GB/s.

Even though the engine would never need to handle that kind of load, your logic would still need to support it; otherwise it would break if that unlikely scenario did indeed happen.
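In code form, that worst case (every byte of the pool replaced every single frame) looks like this; the 22GB/s ceiling is the decompressor figure quoted above:

```python
# Worst case implied above: the entire 768 MB pool replaced every single frame.
POOL_MB = 768

for fps in (30, 60):
    print(f"{fps} fps full refresh: {POOL_MB * fps / 1024:.1f} GB/s")

# 30 fps -> 22.5 GB/s, right at the quoted 22 GB/s decompressor ceiling;
# 60 fps -> 45.0 GB/s, which nothing could sustain, so in practice only a
# fraction of the pool actually changes per frame.
```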
 
No one in here will reply to you.

So, what is the correlation between 768MB of that data in RAM and the SSD speed? I'm not a tech wizard or anything, not like 80% of the people in this thread. The main point of this thread is that the 768MB in RAM is proof that the SSDs are over-engineered and the console makers are idiots. In the video the guy specifically says "streaming from RAM"; do I just not understand how that relates to SSD and I/O?

So far, I don't think anyone has explained how 768MB of data in RAM is connected to SSD and I/O speeds.

There's no sensible way to articulate the exact correlation between the 768MB working set and SSD I/O speed directly, because it depends entirely on a number of factors, from compression time on and off disk to the exact patterns of how data is streamed based on the geometry virtualisation model.

You'll have some form of sparse octree data structure that defines the hierarchy of streamable world segments made up of collections of objects. There will be some heuristics that govern the on-demand preemptive scheduling of streaming in new areas of the scene that may soon come into view according to current camera position, rotation and motion trajectory through the world, and similarly dropping streamable segments that the game expects can be safely removed as they become less likely to be needed across immediately subsequent frames.
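A minimal sketch of what such a scheduling heuristic could look like (entirely illustrative; Nanite's actual scheme isn't public, and every name, granularity and threshold here is invented):

```python
# Illustrative toy prefetch/evict heuristic for camera-driven streaming.
# Not Nanite's real scheduler; chunk granularity and thresholds are invented.
from dataclasses import dataclass

@dataclass
class Chunk:
    id: int
    distance: float        # metres from the predicted camera path
    resident: bool         # currently held in the streaming pool?

def schedule(chunks, cam_speed_m_s, latency_frames, frame_s):
    """Prefetch chunks the camera could reach before a load completes; evict far ones."""
    reach = cam_speed_m_s * latency_frames * frame_s   # distance covered during one load
    to_load  = [c for c in chunks if not c.resident and c.distance <= reach]
    to_evict = [c for c in chunks if c.resident and c.distance > 2 * reach]  # hysteresis
    return to_load, to_evict

chunks = [Chunk(0, 1.0, True), Chunk(1, 1.5, False), Chunk(2, 50.0, True)]
print(schedule(chunks, cam_speed_m_s=20.0, latency_frames=3, frame_s=1 / 30))
```

The lower the streaming latency, the smaller `reach` gets and the less you have to keep resident "just in case", which is the working-set point the variables below feed into.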

There are a lot of variables to consider here:

- How fast can the camera move? (i.e. the faster the camera, the lower the streaming latency required; again, that's "latency", not simply bandwidth)
- How much occlusion is present in the environment?
- How many frames does it take to stream in a chunk? (i.e. what is the streaming latency?)
- How much geometry can we fit in a chunk?
- How densely populated is that geometry? (i.e. streaming helps less the more densely populated the world geometry is, as you have to fit more data into the working set that might be visible in the view frustum at any given time, and thus you'll need a much bigger working set)
- Etc etc

This isn't as clear cut as you're trying to make it out to be.

But one thing is absolutely certain: the higher the streaming bandwidth and the lower the latency for the same world density, the more efficiently you can minimise the memory working set required to load very highly detailed geometry on demand for rendering, doubly so with very fast geometry compression. That is exactly what you're aiming for to allow rich, detailed worlds, with a lot of resource overhead left over that can go towards simulating the rest of the game (of which there is a lot going on in most games).

So again, the conclusion of the OP is simply false, as a traditional SSD does not have the low latency required to allow you to minimise the working set sufficiently, and therefore you end up having to fill up RAM with far more geometry in order to ensure it's available for rendering on demand as it's needed (with no guarantee you can sustain that, depending on how slow the SSD is and how fast your view frustum updates such that new segments must be loaded and unloaded frame-to-frame).
 
Last edited:

hyperbertha

Member
I guess someone should talk to MS and Sony to change their hardware design to good old SATA SSDs. Since this shows there's absolutely no benefit from using the revamped I/O that both manufacturers are using.

Or, this is just some early demo from an unfinished product on unfinished hardware that may well see further improvements.

Hmmmmmmm. I wonder which one.
Read some of the posts on this thread. OP is rambling nonsense.
 

JeloSWE

Member
768 MB of data in the pool depending on where you look, so going close up on the statue while still keeping the same amount of data streaming in would be like unlimited detail.
Well, at some point having that infinite detail on the statue asset up close will just be too large to manage, both to sculpt in ZBrush and in terms of space on the drive. I'm not sure how much that 33M-poly statue asset takes up in RAM on the PS5, but an object containing 1M polys exported from Blender in FBX format is 25MB; multiply that by 33 and you get 825MB in total. The mesh is probably stored in a more efficient format in the demo.
 

Bo_Hazem

Banned
The Xbox also has no CPU requirement for streaming and can load GPU datasets natively... it's becoming obvious you know you're spouting blatant lies now...

Who's talking about the CPU anyway? That $5000, 15GB/s drive can't stream that. Open your mind and listen to Epic Games themselves instead of nitpicking.

“We’ve been working super close with Sony for quite a long time on storage,” he says. “The storage architecture on the PS5 is far ahead of anything you can buy on anything on PC for any amount of money right now. It’s going to help drive future PCs. [The PC market is] going to see this thing ship and say, ‘Oh wow, SSDs are going to need to catch up with this.”


You can "unread" that if you want and continue the mental gymnastics.
 

JeloSWE

Member
Who's talking about the CPU anyway? That $5000, 15GB/s drive can't stream that. Open your mind and listen to Epic Games themselves instead of nitpicking.

“We’ve been working super close with Sony for quite a long time on storage,” he says. “The storage architecture on the PS5 is far ahead of anything you can buy on anything on PC for any amount of money right now. It’s going to help drive future PCs. [The PC market is] going to see this thing ship and say, ‘Oh wow, SSDs are going to need to catch up with this.”


You can "unread" that if you want and continue the mental gymnastics.
Sweeney is obviously lying /S
:messenger_tears_of_joy:
 

Bo_Hazem

Banned
That...that's not what he's saying, Bo. At least as regards the Series systems, which can also do raw transfer in GPU-native format. They have their own proprietary texture format, after all xD

Why isn't it running on XSX anyway? Or PC? So now we'll go back to them being bribed and a bunch of liars, and use funny theories instead?

It will scale down to less capable platforms, no one needs to panic.
 

hyperbertha

Member
OP has been disproven, as it's not 768MB/s. But just imagine the OP being true. Imagine Cerny actually designing for 5.5 GB/s when games max out at 768MB/s before the GPU tanks. Makes a lot of sense, doesn't it?
 

Elog

Member
Do we actually have information on what exactly the PS5's dedicated I/O processor is? Is it a repurposed Zen 2 core? If so, how? As in, is it cut down on local caches (which might also explain the reason for the Cache Coherency Engines, if it comes to that)?

At least I do not know. The key is that - assuming Cerny was truthful - the dedicated I/O hardware allows bypassing the kernel buffers straight into RAM (so VRAM on a console). The key latency in any system (albeit consoles can make this much more streamlined than the PC platform) comes from the kernel, AFAIK.

Overall I think the estimate of overhead incurred by the Series systems for what you're describing is a bit much; keep in mind I believe MS were already aware of this, and therefore it could've been a factor in them clocking their CPUs higher, to account for such overhead, however much it may be. You also have to keep in mind they are not literally dropping some PC version of Windows 10 into the system and leaving it there. Whatever overhead you might associate with W10 (speaking of which, you can always cut it down to even under 512 MB if you really wanted, granted you lose out on a lot of features), you can't really automatically associate with Series X, because they already use their own OS, Xbox OS, that's built specifically for the console, even if it leverages Windows tech.

This is true, but as long as you need to run the I/O through the kernel (though to your point it can be much more bare-bones on a console, so a key question here is just how bare-bones the software solution on the XSX is), it will add latency. This is, from what I understand, the heart of the matter.

Similarly I don't see where you are hearing the latency on PS5 side is leagues better. Sony actually haven't talked much of anything about latency in any aspect of their system. I don't doubt they have low latency, but I think MS have simply prioritized this moreso, while Sony have prioritized bandwidth moreso.

Cerny's presentation was very much about latency, as he talked about being able to transform the on-paper bandwidth of the SSD into practical bandwidth for the GPU. The first step is of course to ensure that all components have the correct bandwidth, but since latency is the key aspect in achieving the above, latency was at the core of his presentation. System latency dominates when reading hundreds of files into RAM continuously.
 

tryDEATH

Member
Those features are going to be standard in every UE5 game, and it looks like Square is jumping on board with their own version too, as Project Athia looks similar graphically from what I can tell and it's due out next year.



That game is supposed to come out next year? Any source on that? Because that doesn't even sound like the real name of the game, just a placeholder, and we also only got a minuscule snippet of "gameplay".
 

DavidGzz

Member
Imagine being me. I will buy all platforms cause I have the money and I love games. Watching the fanboys fight is amazing. So many chuckles. On top of it all I'm juicy. Life is grand.
 