Not sure if this has been linked here before, but here's the full video presentation of the Series X Hot Chips talk.
One interesting point was on ML.
Said that the XSX ML can be used for super resolution.
They added a small amount of extra logic to the compute unit, for up to a 10 times performance gain over using standard shader ops.
The whole talk went straight over my head.
This is the optional (originally RDNA1) vector ALU option that extends and scales RPM from FP16 to INT8 and INT4 (2x, 4x, 8x of the FP32 throughput respectively).
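As a rough sketch of the scaling described above: if each halving of operand width doubles the packed rate, the peak figures follow directly from the FP32 number. The 12.15 TFLOPS FP32 baseline for Series X is an assumption taken from the publicly quoted spec, not from this thread, and `packed_throughput` is a hypothetical helper name.

```python
# Peak packed throughput relative to FP32, assuming each halving of the
# operand width doubles the number of operations per lane per clock.
SCALE = {"FP32": 1, "FP16": 2, "INT8": 4, "INT4": 8}


def packed_throughput(fp32_tflops):
    """Return theoretical peak tera-ops per second for each format."""
    return {fmt: fp32_tflops * factor for fmt, factor in SCALE.items()}


if __name__ == "__main__":
    # Assumed FP32 peak for Series X (not stated in the thread).
    for fmt, tops in packed_throughput(12.15).items():
        print(f"{fmt}: {tops:.2f} tera-ops/s (theoretical peak)")
```

These are theoretical peaks only; they say nothing about whether a given game or ML workload actually reaches them.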
I found their stereotypical nerdiness comforting. It's the kind of grating, nasal tone that lets you trust something complicated is in safe hands.
The whole talk went straight over my head.
After hearing these Microsoft tech nerds talk, it makes me appreciate Mark Cerny's delivery lol
Those were two talks targeted at different audiences. The Hot Chips conference is where you want to give as much detail as possible about the innards of a chip. What Cerny did, without wanting to minimize it whatsoever, was a semi-marketing talk.
After hearing these Microsoft tech nerds talk, it makes me appreciate Mark Cerny's delivery lol
I really like that bit about squashing the HAL so games are closer to the metal. The narrative here is usually that the HAL having to support so many different configurations (Xbox + PC) is really inefficient. Removing the unused parts makes that not a bad prospect at all.
Also nice to have MS engineers confirm their APIs and DirectX tools are indeed the PC versions with extra Xbox-specific tools, and flat out state that the relative performance of a console is vastly (VASTLY) greater than its PC equivalent.
Indeed, and that is actually one of my bigger concerns with the XSS/X/PC unified coding environment, so it's nice to have it directly addressed and know they've actively attempted to minimise any issues with the approach.
I really like that bit about squashing the HAL so games are closer to the metal. The narrative here is usually that the HAL having to support so many different configurations (Xbox + PC) is really inefficient. Removing the unused parts makes that not a bad prospect at all.
Sounds like their audio hardware is actually pretty robust.
This is the optional (originally RDNA1) vector ALU option that extends and scales RPM from FP16 to INT8 and INT4 (2x, 4x, 8x of the FP32 throughput respectively).
Where is this coming from? Do you have any proof for that statement or is it 100% guesses?
[..] or just like RPM, will remain a feature that's never been used even once [..]
Exactly, and that's why I'm concerned about whether DirectML will ever actually be utilized or whether, just like RPM, it will remain a feature that's never been used even once. I think that before they figure it out there will already be new consoles on the horizon, and they'll go "nah, screw it, we have so much power now we don't need it".
After RPM was introduced with the PS4 Pro, the feature has become standard in both NVIDIA and AMD GPUs, so there is no need to talk about it anymore. It is a standard feature now, every vendor supports it and I assume it is heavily used in modern games. For developers there is just no downside to using it for shaders that do not need high-precision data types. RPM is automatically used when the shader is compiled on supported hardware.
It was (supposedly) used in FC5, where it didn't bring anything to the table; the X1X version was the better one, with the mythical 8.4TF on Pro nowhere to be found, and... that's pretty much it. The tech hasn't been mentioned ever since.
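For anyone unsure what "packed" means here, a minimal CPU-side sketch of the idea: two FP16 values share one 32-bit word, and a single packed operation works on both halves at once. This is only an emulation in Python to show the data layout; it is not how a shader compiler or the GPU implements it, and `pack_half2`, `unpack_half2` and `packed_add` are hypothetical helper names.

```python
import struct


def pack_half2(lo, hi):
    """Pack two FP16 values into one 32-bit word (lo in the low 16 bits)."""
    return struct.unpack("<I", struct.pack("<2e", lo, hi))[0]


def unpack_half2(word):
    """Split a 32-bit word back into its two FP16 lanes."""
    return struct.unpack("<2e", struct.pack("<I", word))


def packed_add(a, b):
    """Emulate one packed FP16 add: both 16-bit lanes are added together."""
    a_lo, a_hi = unpack_half2(a)
    b_lo, b_hi = unpack_half2(b)
    return pack_half2(a_lo + b_lo, a_hi + b_hi)


if __name__ == "__main__":
    x = pack_half2(1.5, 2.0)
    y = pack_half2(0.25, 4.0)
    print(unpack_half2(packed_add(x, y)))
```

On real hardware the two lanes are processed by one instruction in the same 32-bit register slot, which is where the 2x FP16 rate comes from. The catch is the reduced precision, so it only helps for shaders that tolerate it.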
Do you mean the "bloated" shader arrays? I'd imagine that both PS5 and XSX look like Big Navi cut in half horizontally, but XSX with larger shader arrays.
Spot the differences in architecture, not counting Infinity Cache. Winner gets a special GIF
I'm sure you are "concerned".
Exactly, and that's why I'm concerned about whether DirectML will ever actually be utilized or whether, just like RPM, it will remain a feature that's never been used even once. I think that before they figure it out there will already be new consoles on the horizon, and they'll go "nah, screw it, we have so much power now we don't need it".
This is essentially Rapid Packed Math: the added FP16, INT4, etc.
One interesting point was on ML.
Said that the XSX ML can be used for super resolution.
They added a small amount of extra logic to the compute unit, for up to a 10 times performance gain over using standard shader ops.
Check out the performance of Gears 5 running at 120Hz. Split cache doesn't seem to be having an impact on that game.
We know that the Xbox SX has 2x4MB of L3 cache, like the Zen 2 Renoir APUs. I'm curious about PS5: if it has a unified core complex and cache (similar to Zen 3), for example 1x8MB with 1x8 cores, that would explain the better performance in 120Hz modes.
What has Gears 5 to do with it? We can't compare it with PS5. A low amount of split cache doesn't mean that you can't hit 120fps in all games. Split vs unified cache would give the edge to PS5 in CPU-heavy places, especially when they both use GDDR6 (high-latency memory).
Check out the performance of Gears 5 running at 120Hz. Split cache doesn't seem to be having an impact on that game.
Not sure, care to elaborate?
Spot the differences in architecture, not counting Infinity Cache. Winner gets a special GIF
That game is built for the XSX. Doesn't mean it can't run better or be easier to optimise if it were on other systems.
Check out the performance of Gears 5 running at 120Hz. Split cache doesn't seem to be having an impact on that game.
Spot the differences in architecture, not counting Infinity Cache. Winner gets a special GIF
I never said otherwise.
That game is built for the XSX. Doesn't mean it can't run better or be easier to optimise if it were on other systems.
I don't think the red box on the XSX slide should include the command processor as part of the shader engine.
So more Work Group Processors and fewer shader engines? Can someone elaborate? I really don't follow GPUs closely unless they're Nvidia...
5 vs 7 Work Group Processors in each shader array, and 2 vs 4 shader engines
However, Microsoft changed that balance to give its console some leeway for when games really start using RT. Suddenly those underutilized extra CUs will find work to do and the difference will start to show.
5 vs 7 Work Group Processors in each shader array, and 2 vs 4 shader engines
Two points work against that positive outlook.
Will the 4 extra CUs tacked onto the end of each shader array provide measurable benefits for RT? Surely AMD would have given its own RDNA2 GPUs such a configuration if it thought the extra CUs were worth it, rather than subject to diminishing returns?
Admittedly I don't have much knowledge of how RT works, but from what I gather its performance depends very heavily on memory access, and a simple compute increase would not fix it.
Exactly, and that's why I'm concerned about whether DirectML will ever actually be utilized or whether, just like RPM, it will remain a feature that's never been used even once. I think that before they figure it out there will already be new consoles on the horizon, and they'll go "nah, screw it, we have so much power now we don't need it".
Instead of having all these concerns about hardware, people should be more concerned about Microsoft's grit and tenacity to endure. They have a bad habit of giving up halfway, right? Will they keep working this generation with all the new studios as promised, or will they lose spirit again and start promising to do better next time?
Also, the smaller detail: look at the prim and raster units on the 6800. They are shared across both shader arrays, likely for better distribution of work.
The XSX slide even says distributed primitives and raster; it's per shader array, which is the same as the 5700. It is likely due to the server application keeping shader arrays more private... So you get an A-
Well, Ninja Theory are on record saying they've been utilizing it a lot in experiments and testing behind-the-scenes, so we should see if and how it's leveraged in their next game, which should be Hellblade II (and/or Project Mara).
That's where I'm at. Just looking at the temperature in the room, Sony's really started this gen running full-speed and between games already confirmed for 2021 (R&C Rift Apart, Horizon Forbidden West, GT7, Ragnarok) and some rumored stuff (Silent Hill, MGS Remake possibly with Cerny involved, etc.), plus the fact they might have demos for Ragnarok and MGS Remake "ready to go", that's going to build up a snowball that could be extremely difficult if not impossible to stop for MS if their 2021 is anything but amazing.
So first off, they have to ensure 3P performance is up to snuff across the board. That means sorting out any bugs or missing features in the GDK APIs, and devs having enough time to familiarize themselves with them. They also need some of those smaller exclusives like BMI, Exo-Mecha, Scorn and The Medium to be well-received and at least modestly successful as commercial ventures.
They also need that port for FS2020 out by first half 2021, with a lot of extra content and some "gamey" play modes. If they can have Hellblade II ready by late 2021, all the better. Halo Infinite needs to be extremely solid to make up for the delay and tepid July presentation, and I'd still consider it necessary for MS to scale back on 343i and/or let one of the other teams (id Software perhaps?) handle the next game while 343i focuses on something new.
Lastly, MS hopefully have been working with some 3P studios on some ecosystem exclusives that play into nostalgia. Sega and Tecmo would be two obvious picks, but if they've gotten something going with Capcom or Konami as well, that'd be nice. Preferably something that's both got a lot of nostalgia to it and could make for a pretty big release. That ties into something else: some kind of strong showing with a Japanese studio in 2021 would help round the year out for them. Again, it needs to be an ecosystem exclusive, something with a lot of cachet that otherwise probably wouldn't have been made without funding by a platform holder.
If MS can do all of that, they can realistically at least stand pretty well against a potential onslaught from Sony in 2021. But if they drop the ball and too many of those things I mentioned just don't happen, and Sony's releases stay on course PLUS those rumored games end up dropping, IMHO the gen is already over. And while MS may not care about console sales, I think a tepid 2021 from them for Series X WILL hurt their push for Gamepass and Xcloud as well, in a trickle-down effect. Especially since Jim Ryan's out already saying they're working on something similar to Gamepass with PS5...whatever that actually is. And we also can't forget they still have peripherals like PSVR2 in the pipe for (most likely) 2022.
When MS said they didn't see Sony and Nintendo as competitors, that honestly didn't rub me the right way, because who truly gives a shit about Amazon or Google when it comes to gaming? And Xbox is, all things considered, a video game brand. I understand in the grander corporate space what MS means when they say they're more focused on Amazon, Google etc., but those are competitors in fields outside of gaming predominantly. Gamers don't care about that, they care about what's up in the gaming space, which means they also care about Xbox in relation to PlayStation and Nintendo.
So whether MS likes it or not, their gaming initiatives are fundamentally rooted in competition with Sony's and Nintendo's, even if MS as a corporation is more concerned with other giants like Amazon, Google, Apple etc. However, since arguably their biggest weapons against those giants are predicated on gaming (Gamepass, Xcloud), they can't "throw the baby out with the bathwater", as the saying goes. They still need to take the traditional console market very seriously, and that means they need to seriously consider what Sony are doing and do their absolute best to match, if not surpass, them on those metrics.
Because if they do, that makes their Gamepass and Xcloud push in more hardware-agnostic areas orders of magnitude easier, with a greater likelihood of meeting (even exceeding) their projections.
Yeah, like I suggested over on B3D, MS might've gone with this set-up in order to better virtualize 4x One S instances on a single system. With that as a design goal, it's likely better to give each Shader Array its own Raster Unit.
The fact it's set up that way could require a tad more work to ensure good distribution of work across the full cluster of SAs and the four RUs, while (potentially) on PS5 this isn't the case if that part of their frontend more closely mirrors the PC cards. By default it'd mean a bigger need to schedule work so that more CUs are saturated on Series X, and we could therefore be seeing some of these 3P cross-gen games not doing so well with that. That could come down to a combination of things: balancing that scheduling being a bit trickier with more CUs (I'd imagine not that much trickier though; it's a GPU after all), missing features or the incomplete state of some GDK tools that would assist devs with it, and a probable lack of time to get familiar with those tools, even for bigger 3P devs.
Hopefully those issues are resolved sooner rather than later.
If you read what Cerny said about bottlenecks, which will also be worse on larger shader arrays, the issues detailed below will be amplified.
In summary: the parameter cache / LDS bottlenecks seen on a 10 CU system will be worse on 14 CUs by an unknown amount.
This is from the Cerny / ND patent, and it suggests the new pipeline will be less costly for effects and overcomes the above bottlenecks...
The live blog is not a direct translation, but it has everything that was discussed posted.
Is this video translated in English somewhere?
I mean, that could happen, but we don't know if MS have made any customizations to how some of these elements operate within their design. I doubt they just took the raw specifications from RDNA 1.0 and went with that, no changes made. They would be aware of these drawbacks, doubly so since they have made other design choices to allow the APU to service multiple streaming instances.
Those kinds of microscopic changes, if they've been made, wouldn't be covered at Hot Chips; we'd need some patents to surface to see what work's been done there. I'd love to come across some of them. MS and Sony may not have the same frontend setup on this particular part of the pipeline, but I don't think MS would go with something that completely borks their design, either, to be realistic.
My gut tells me MS are really mainly interested in the cloud for expansion... but I don't know.
But Hot Chips is for the nerds, no?
The whole talk went straight over my head.
After hearing these Microsoft tech nerds talk, it makes me appreciate Mark Cerny's delivery lol
Lol, mythical. Educate yourself. PS5 has 20.56 TF, and XSX | S have 24.30 TF and 8.012 TF of compute perf at half precision (FP16), respectively. This isn't some myth.
It was (supposedly) used in FC5, where it didn't bring anything to the table; the X1X version was the better one, with the mythical 8.4TF on Pro nowhere to be found, and... that's pretty much it. The tech hasn't been mentioned ever since.
Go look up the GPU specs tab on AMD's website or look up the difference between FP16 and FP32. Again, it's not mythical, it's math. FP16 TF is meant to be 2x the FP32 compute perf.
1) Source please.
2) Even if it's there, it doesn't mean it's being used at all, like in PS4 Pro, hence "mythical".
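For what it's worth, the "it's math" part can be sketched like this. The CU counts and clocks below are assumptions taken from the publicly quoted console specs (they are not given in this thread), and `peak_tflops` is a hypothetical helper. The formula is the usual theoretical peak: CUs x 64 lanes x 2 ops per FMA x clock, doubled again for packed FP16.

```python
def peak_tflops(cus, clock_ghz, packing=1, lanes_per_cu=64, ops_per_lane=2):
    """Theoretical peak TFLOPS; packing=2 models packed FP16 (RPM)."""
    return cus * lanes_per_cu * ops_per_lane * packing * clock_ghz / 1000.0


# Assumed public specs: (compute units, GPU clock in GHz).
CONSOLES = {
    "PS5": (36, 2.23),   # clock is the quoted maximum
    "XSX": (52, 1.825),
    "XSS": (20, 1.565),
}

if __name__ == "__main__":
    for name, (cus, clock) in CONSOLES.items():
        fp32 = peak_tflops(cus, clock)
        fp16 = peak_tflops(cus, clock, packing=2)
        print(f"{name}: FP32 {fp32:.2f} TF, FP16 {fp16:.2f} TF")
```

The results land within rounding distance of the 20.56 / 24.30 / 8.012 FP16 figures quoted above. Note this only shows where the peak number comes from; it does not settle whether any game actually uses FP16 shaders, which is the other half of the argument.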
Ah, right... RPM is only useful if it is used in Xbox titles... gotcha.
Quoting myself once again, since I see there are some serious reading comprehension issues:
I never said that, quite the opposite actually, but what to expect from one of the most famous die-hard PS fanboys who looks for console warring wherever possible. Colour me surprised... Same as above: learn to read first, then write.