
Next-Gen PS5 & XSX |OT| Console tEch threaD

Status
Not open for further replies.
 

Ptarmiganx2

Member
Gotta agree with this. Gears 5 is a damn impressive looking game. I think it's more a question of taste, but overall I think it's awfully hard to best Horizon: Zero Dawn and God of War. God of War was super impressive, but there were areas and scenes in HZD where I literally just stood around looking and taking pictures because the scenery was just incredible.
I was awestruck at the graphics. I never thought a game would replace TLOU as my all time favorite game...I think HZD did...I go back and forth...
 

3liteDragon

Member
Has anyone thought about the possibility of a GTA 6 reveal on Thursday? I mean, they could show us like a 1 minute teaser running on PS5 and not even have the release year on it. Just saying there’s a chance, could happen or it won’t.
 
Has anyone thought about the possibility of a GTA 6 reveal on Thursday? I mean, they could show us like a 1 minute teaser running on PS5 and not even have the release year on it. Just saying there’s a chance, could happen or it won’t.

All they would have to do is show a teaser of Trevor dancing in his underwear.



It would be cool to see the next GTA at any event.
 
can someone simplify this for us dummies? thank you lol
I will try to explain each tweet as simply as I can:

1) Raising the GPU clock speed has effects beyond raising the TF number, because the GPU contains many parts besides the CUs, like the cache (think of it as super-fast memory with low latency), the ROPs (the render output units, which write finished pixels out at the end of rasterization) and the TMUs (which manipulate textures), among many others. Some of them scale with the number of CUs, others do not.

So raising the GPU clock also makes all of these parts work faster. That is why a GPU like the PS5's could, in the real world, perform better than you would expect against, for example, 44 CUs at 1.825 GHz. And since some parts of the GPU are not tied to the number of CUs, some kinds of operations can be done faster than even on a bigger GPU.

But he is talking specifically about one component, the cache (there are several levels, each one bigger but slower). Even though the cache is the fastest memory in the system, there is still a price to pay every time you want to read something, for example 100 cycles just to access the L2, and the work waiting on that read is doing nothing while this time passes. 100 cycles probably sounds like nothing, but remember that in a game running at 60 fps the engine only has 16.6 milliseconds to process all the data for a frame, so every lost cycle hurts.

This is one of the reasons flops are only a measure of theoretical performance (for part of the GPU): the cycles wasted just accessing memory already cost you part of that peak performance. If you raise the clock, each of those cycles takes less time (the loss still exists), so you make better use of each flop (please don't put a number on these gains, it is too complex for that).

2) Every time a GPU needs to refresh the data in the cache, the cache is basically flushed completely, even when part of it could still be valid, and this costs more cycles and bandwidth. This is why the Cache Scrubbers and Coherency Engines exist. Imagine that instead of flushing all of it, you only need to do it for a specific part, so the price you pay is smaller. If your question is how much smaller: only a dev with years of experience, with access to the PS5 devkit and specialized in that part of the system can tell you exactly.

3) The advantage of a system designed for a specific purpose is that you can add hardware that helps reduce bottlenecks which are considered normal on PC, which as we know handles many things by brute force.


This is one of the reasons Matt Hargett finds it hilarious when fans on either side talk about specs without any kind of knowledge. It's like a virologist hearing someone say "why is it so hard to make a vaccine for Covid if it's just a normal flu?" :lollipop_neutral:
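
For anyone who wants to see the arithmetic behind point 1, here is a rough sketch. It is not anything from Matt's tweets. The CU counts and clocks are the announced specs, the 64 lanes per CU and 2 ops per lane per cycle are the usual assumptions behind the headline TF figures, and the 100-cycle wait is just the example number used above, not a measured latency. It also assumes the cache runs at the GPU clock.

```python
# Back-of-the-envelope numbers for point 1.
# Assumptions: 64 lanes per CU, 2 floating-point ops per lane per cycle
# (one fused multiply-add), and a hypothetical 100-cycle cache wait.
LANES_PER_CU = 64
FLOPS_PER_LANE_PER_CYCLE = 2


def peak_tflops(cus: int, clock_ghz: float) -> float:
    """Theoretical peak: every lane of every CU busy on every cycle."""
    return cus * LANES_PER_CU * FLOPS_PER_LANE_PER_CYCLE * clock_ghz / 1000.0


def stall_ns(cycles: int, clock_ghz: float) -> float:
    """Wall-clock time spent waiting for the given number of cycles."""
    return cycles / clock_ghz


def cycles_per_frame(clock_ghz: float, fps: int = 60) -> float:
    """How many GPU cycles fit into one frame at the given frame rate."""
    return clock_ghz * 1e9 / fps


if __name__ == "__main__":
    for name, cus, ghz in [("PS5", 36, 2.23), ("XSX", 52, 1.825)]:
        print(
            f"{name}: {peak_tflops(cus, ghz):.2f} TF peak, "
            f"{stall_ns(100, ghz):.1f} ns per 100-cycle wait, "
            f"{cycles_per_frame(ghz) / 1e6:.1f}M cycles per 60 fps frame"
        )
```

The peak figures come out at about 10.28 TF and 12.15 TF, and the same 100-cycle wait costs roughly 45 ns at 2.23 GHz against roughly 55 ns at 1.825 GHz. That is the "better use of each flop" part: the wait does not go away, it just takes less wall-clock time.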
 

Three Jackdaws

Unconfirmed Member
Another great explanation, I understand that there is only so much you can simplify because some of this stuff is a bit more sophisticated than usual. Thank you
 

Saberus

Member
Now, I don't know how true this is, so please take this as BS, but my buddy has a friend at EA up here in Vancouver, told him that the SSD is so fast (game code is assigned to priority levels?) and certain priority (lower) can bypass the ddr6 and get pumped directly into the GPU, said it makes it look like the PS5 has more memory than it has.. something like that.. dunno if true but always fun to speculate.
 

Shmunter

Member
Now, I don't know how true this is, so please take this as BS, but my buddy has a friend at EA up here in Vancouver, told him that the SSD is so fast (game code is assigned to priority levels?) and certain priority (lower) can bypass the ddr6 and get pumped directly into the GPU, said it makes it look like the PS5 has more memory than it has.. something like that.. dunno if true but always fun to speculate.
Unlikely, but the net effect is the same as loading it into RAM because of how quickly things can be cycled in and out on PS5. The priority channels increase the effectiveness even further.
 
Now, I don't know how true this is, so please take this as BS, but my buddy has a friend at EA up here in Vancouver, told him that the SSD is so fast (game code is assigned to priority levels?) and certain priority (lower) can bypass the ddr6 and get pumped directly into the GPU, said it makes it look like the PS5 has more memory than it has.. something like that.. dunno if true but always fun to speculate.
That is what I understand is happening. I'm no tech head, but this much was alluded to by some more knowledgeable members in this thread.
Hopefully they can expand on that.
 

Bo_Hazem

Banned
Think of the GPU CUs as buckets.

The theoretical maximum teraflops is how much water they can all hold if they were all filled to the brim multiplied by how many times per minute they are all moved from the source pool of water and poured out into the destination pool of water.

How full each bucket gets to be depends on the efficiency of the game code and the rest of the GPU machine that moves them around.

XSX is like having 52 buckets moving at 182mph.
PS5 is like having 36 buckets moving at 223mph.

Typical game code can only fill each bucket 30% full each time it comes around to the source pool again to fill up.

It takes time for the water to flow into these buckets as they scoop up water.

Sometimes when a bucket scoops up some water it can also catch a fish in it, and we don’t want that because my weird ass analogy says so.

In the 52 bucket version, if a fish is detected in the bucket as it’s moving from pool to pool, it has to empty the entire bucket out and go back and start scooping again.

In the 36 bucket version, if a fish is detected in the bucket as it’s moving from pool to pool, Mr Coherency Engine can tell Mr GPU cache scrubber to yoink the fish out and let the bucket and remaining water carry on its way.

This is simplified and the analogy could be expanded to account for more of what is going on and be a better fit for the kind of process that is happening, but this is a rough notion of what Matt was talking about.

Because the fish’d bucket hasn’t got to empty everything and start again the 36 bucket system averages a higher fill percentage per bucket because it never has empty ones to bring the average down.

Because the fish’d bucket doesn’t need to be refilled again from scratch, there are never any buckets being filled that will be tipped out and need to come back, meaning the average fill rate of buckets is higher in the 36 bucket system.

Because of this Matt is saying it’s not just as simple as counting the amount of buckets or the speed they travel.
In a perfect world with no fish the 52 bucket system delivers more water.
In a world with fish in your source water pool, you can’t make that calculation, and more than just that, Matt seems to think this fish plucking system is actually more significant than just counting buckets and bucket speed and multiplying them.
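
To put rough numbers on the analogy, here is a small sketch. It only multiplies the figures already used above (bucket count, speed, and the 30% fill). The function names are made up for illustration, and none of this is a measurement of either console.

```python
# Toy arithmetic for the bucket analogy.
# "Water delivered" = buckets * speed * average fill.
# The 30% fill is the analogy's own figure, not measured data.

def water_delivered(buckets: int, speed_mph: float, fill: float) -> float:
    """Water moved per unit of time for a given average fill (0.0 to 1.0)."""
    return buckets * speed_mph * fill


def break_even_fill(buckets: int, speed_mph: float, target: float) -> float:
    """Average fill this system needs in order to deliver `target` water."""
    return target / (buckets * speed_mph)


if __name__ == "__main__":
    big = water_delivered(52, 182, 0.30)
    small = water_delivered(36, 223, 0.30)
    needed = break_even_fill(36, 223, big)
    print(f"52 buckets at 30% fill: {big:.1f}")
    print(f"36 buckets at 30% fill: {small:.1f}")
    print(f"36 buckets need {needed:.1%} average fill to match")
```

With these numbers the 36 bucket system would need an average fill of roughly 35% to match the 52 bucket system at 30%. Whether fish plucking really gets it there is exactly the part nobody outside the devkits can put a figure on.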

He’s suggesting the Coherency Engine and GPU cache scrubbers effectively increase CU occupancy by having them stall less often due to cache misses caused by entire caches being flushed instead of selectively pruned.
He’s suggesting that because refetching of cache data from system memory when entire caches are invalidated isn’t required, the system memory bandwidth requirement is less, because it’s not having to keep refetching the same pages as the good get thrown out with the bad.

It’s about efficiency vs brute force.
It’s about smarter buckets versus shit loads of them.

It’s suggesting what has been murmured for a while by developers in that there’s just not much difference between the two in real world game code. Not as much as comparing theoretical numbers might suggest.

PS5 is quite exotic, and these cache scrubbers are just another example of how streamlined and efficient the entire IO pipeline is to the point comparing numbers while assuming all other things are equal doesn’t really work. At least not to the extent some people are thinking.

It’s like if we had gasoline engines whereby spark plugs only fired 30% of the time in real world driving, even if they could do so 100% of the time in an engine stand in ideal conditions. If we wanted to compare two engines with different cylinder counts and maximum RPMs, but neglected to factor in that one has got some new and unconventional spark plugs that fire more than 30% of the time in real world driving, just multiplying the cylinder count by the RPM wouldn’t make for an accurate comparison.

PS5’s Coherency Engines and GPU cache scrubbers will make a real world difference to CU occupancy and effective GPU system memory bandwidth. We have no idea to what extent in typical game code. Matt thinks it will be significant.
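
The difference between emptying the whole bucket and plucking the fish can be sketched as a toy model. This is not how the PS5 hardware is implemented, since those details are not public in this thread. The function name, the cache size and the stale fraction are all made up for illustration, and the model assumes every flushed line is needed again and has to be refetched.

```python
import random

# Toy model: count how many cache lines get refetched from memory when
# some lines go stale, under two policies. All numbers are hypothetical.


def refetch_counts(cache_lines: int, stale_fraction: float,
                   rounds: int, seed: int = 0) -> tuple[int, int]:
    """Return (full_flush_refetches, selective_refetches) over all rounds."""
    rng = random.Random(seed)
    full_flush = 0
    selective = 0
    for _ in range(rounds):
        # Each line independently goes stale with the given probability.
        stale = sum(1 for _ in range(cache_lines)
                    if rng.random() < stale_fraction)
        if stale == 0:
            continue  # nothing to invalidate this round
        full_flush += cache_lines  # whole cache dropped, everything refetched
        selective += stale         # only the stale lines are refetched
    return full_flush, selective


if __name__ == "__main__":
    full, sel = refetch_counts(cache_lines=1024, stale_fraction=0.05,
                               rounds=100)
    print(f"full flush refetches: {full}")
    print(f"selective refetches:  {sel}")
```

In this model the selective policy never refetches more than the full flush, and the gap grows as the stale fraction shrinks. That is the "good thrown out with the bad" bandwidth point. How much of it shows up in real game code is still the open question.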

One of the best posts I've ever read in this thread.



Bookmarked! Thank you for your wonderful input!🤓
 

FeiRR

Banned
Because I think next gen games might cross the threshold and look believable enough to me. Also no, they shouldn't all look ugly, tons of games make up for their lack of technical prowess with good art design. The Last of Us is just a game trying very hard to impress with a realism that it just can't achieve.

I like tons of games that make the most of what the system is good at and can achieve, instead of ones that aim for realism when they are so far from it.
It's an interesting take. As gamers, we're used to game graphics and to tiny differences in the ways devs choose to show us the world. But what if you look at it from a pure layman's perspective? I know a video which tries to answer that question. They asked an elderly lady who had no experience with games to rate the graphics in various games. I think no gamer would agree with her ratings, but it's intriguing. Unfortunately, the video is only in Polish with no chance of subtitles, but at least they display the scores she gives, so maybe take a look. I had a lot of fun trying (and failing) to guess her scores.

 