
AMD working closely with Microsoft on DX12, details Asynchronous Shading

So if I get it right, you have one main task stream that tasks are pumped into, but if an ACE sees a gap big enough for its smaller task, it will interleave that task into the gap in the main stream?

Example:
ACE 1 is putting tasks on the main stream, and there is a gap worth 10 GFLOPS of computation. Now ACE 4, which has a task worth 9 GFLOPS, could interleave that task into the main stream.
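If that's right, the D3D12-level picture would look something like this - a minimal C++ sketch (hypothetical: it assumes an already-created ID3D12Device named device and pre-recorded command lists gfxList and computeList; error handling and the usual Windows/COM boilerplate are omitted):

#include <d3d12.h>

// Two queues: the main graphics ("direct") queue and an async compute queue.
D3D12_COMMAND_QUEUE_DESC gfxDesc = {};
gfxDesc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;   // graphics + compute + copy
D3D12_COMMAND_QUEUE_DESC cmpDesc = {};
cmpDesc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE;  // compute + copy only

ID3D12CommandQueue *gfxQueue = nullptr, *cmpQueue = nullptr;
device->CreateCommandQueue(&gfxDesc, IID_PPV_ARGS(&gfxQueue));
device->CreateCommandQueue(&cmpDesc, IID_PPV_ARGS(&cmpQueue));

// Submit independently recorded work to each queue. Whether the GPU's
// schedulers (GCN's ACEs, in this discussion) actually slot the compute
// work into gaps in the graphics workload is up to the hardware, not the API.
ID3D12CommandList *g[] = { gfxList };
ID3D12CommandList *c[] = { computeList };
gfxQueue->ExecuteCommandLists(1, g);
cmpQueue->ExecuteCommandLists(1, c);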
 
All GCN GPUs should be DX12 compatible.
GCN1/2/3 GPUs (excluding the first-gen "GCN0") should be able to handle DX12 FL12_0.
Fiji is likely to be the only discrete GPU from AMD which will be able to handle all DX12 features -- FL12_1+.
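For reference, the feature level a card actually exposes can be queried at runtime rather than argued from charts; a minimal C++ sketch (assumes a valid ID3D12Device named device; error handling omitted):

D3D_FEATURE_LEVEL requested[] = {
    D3D_FEATURE_LEVEL_11_0, D3D_FEATURE_LEVEL_11_1,
    D3D_FEATURE_LEVEL_12_0, D3D_FEATURE_LEVEL_12_1,
};
D3D12_FEATURE_DATA_FEATURE_LEVELS fl = {};
fl.NumFeatureLevels = sizeof(requested) / sizeof(requested[0]);
fl.pFeatureLevelsRequested = requested;
// Ask the driver for the highest level it supports out of the requested set.
if (SUCCEEDED(device->CheckFeatureSupport(
        D3D12_FEATURE_FEATURE_LEVELS, &fl, sizeof(fl)))) {
    // fl.MaxSupportedFeatureLevel is then e.g. D3D_FEATURE_LEVEL_12_0 or
    // D3D_FEATURE_LEVEL_12_1, matching the FL12_0 / FL12_1 talk above.
}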


Mantle and other APIs have nothing to do with asynchronous compute. It has been available in NV's Kepler since the GK104 launch and in AMD's Tahiti since the 7970 launch.


Yes. But it won't support all features of DX12.


No currently available AMD GPU will be "fully" DX12 compatible.


Mantle and AMD have nothing to do with asynchronous compute.


The efficiency of asynchronous compute in graphics is a moot point. It certainly is nice to have the feature, but how much performance can actually be gained from asynchronous compute in games is up for discussion.


Maxwell 2 has 32 active queues while the latest GCN chips have 8. Not sure why you would get a GPU based on that metric, as it is rather unclear how much that feature actually helps in games.


You can stop looking: all Maxwell 2 GPUs are fully compatible with the highest DX12 feature level.


Better compute capabilities are known to be a reason for a loss of gaming-GPU market share. This is why gaming Keplers were cut down and why some compute features are cut from Maxwell 2 as well. Then again, if the number of active queues is an indication of better compute capabilities, then Maxwell 2 is actually four times ahead of Hawaii here.

So how much of this post is factual?
 

Irobot82

Member
So how much of this post is factual?

[Image: chart of Direct3D 12 feature support by GPU architecture]


Here, I posted this in another thread. This is about the most complete information I've seen on it.

Edit: So yeah, most of that post is full of shit. There are measurable performance gains using Async Compute; devs have talked about it. Also, no, Maxwell 2 is not fully DX12. No graphics card currently is.
 

dr_rus

Member
[Image: chart of Direct3D 12 feature support by GPU architecture]


Here, I posted this in another thread. This is about the most complete information I've seen on it.

Edit: So yeah, most of that post is full of shit. There are measurable performance gains using Async Compute; devs have talked about it. Also, no, Maxwell 2 is not fully DX12. No graphics card currently is.

What exactly is "full of shit" in this post? Async compute does not lead to "measurable performance gains" in all workloads or without a lot of tweaking, which is why its efficiency on PC is still a moot point. Those "devs" are on AMD's payroll or are developing for a fixed-h/w console platform, which is a very different situation. And nowhere did I say that "Maxwell 2 is fully DX12".

So how about you do us a favor and try to actually read the post?
 

Kezen

Banned
What exactly is "full of shit" in this post? Async compute does not lead to "measurable performance gains" in all workloads or without a lot of tweaking, which is why its efficiency on PC is still a moot point. Those "devs" are on AMD's payroll or are developing for a fixed-h/w console platform, which is a very different situation. And nowhere did I say that "Maxwell 2 is fully DX12".

So how about you do us a favor and try to actually read the post?

Sebbi on Beyond3D talked about a 30% performance boost with async compute. The Oxide dev mentioned something along those lines, referring to those who have used it on consoles.

It's unfortunate that I will not see such gains considering Maxwell does not support this feature as well as GCN cards.

It's not a magic button but AMD invested in it for a good reason. If it didn't matter we would not be talking about async compute.

There is no need to downplay this feature. I would have preferred Nvidia to offer something nearly identical; alas, that will probably have to wait until Pascal. It seems to be a major D3D12 feature, so I'm surprised Nvidia are behind the curve.

I don't think Kepler or Maxwell cards will do poorly in DX12 games, but they won't do as well as their equivalently priced GCN cards.

I'm curious to see what gains we will see in DX12 games using this feature; to my knowledge there are none at the moment.
 

ZOONAMI

Junior Member
What exactly is "full of shit" in this post? Async compute does not lead to "measurable performance gains" in all workloads or without a lot of tweaking, which is why its efficiency on PC is still a moot point. Those "devs" are on AMD's payroll or are developing for a fixed-h/w console platform, which is a very different situation. And nowhere did I say that "Maxwell 2 is fully DX12".

So how about you do us a favor and try to actually read the post?

"You can stop looking: all Maxwell 2 GPUs are fully compatible with the highest DX12 feature level."

You can argue semantics if you want, but people are going to interpret that as you saying Maxwell 2 has full DX12 support.
 

dr_rus

Member
Sebbi on Beyond3D talked about a 30% performance boost with async compute. The Oxide dev mentioned something along those lines, referring to those who have used it on consoles.
This is the same as Q-Games' assessment of the gains from async compute in The Tomorrow's Children. This example is a) true for the PS4 console only, and b) actually somewhat light on asynchronous processing, if their presentations on the matter are accurate. It is unknown whether this is in fact a gain from running things asynchronously or a gain from doing stuff differently on the GCN architecture in question.

It's unfortunate that I will not see such gains considering Maxwell does not support this feature as well as GCN cards.
It is unknown right now whether Maxwell handles the feature worse than GCN does. We don't have enough data to draw such a conclusion yet.

It's also debatable what is really unfortunate here - that you won't get some free performance out of async compute on Maxwell in D3D12/Vulkan, or that GCN owners will never get this additional performance anywhere but in D3D12/Vulkan.

It's not a magic button but AMD invested in it for a good reason. If it didn't matter we would not be talking about async compute.

There is no need to downplay this feature. I would have preferred Nvidia to offer something nearly identical; alas, that will probably have to wait until Pascal. It seems to be a major D3D12 feature, so I'm surprised Nvidia are behind the curve.
Nobody is downplaying anything. The only thing I'm trying to accomplish is to kill some of the PR/marketing FUD around the issue, as things in GPUs are hardly as simple as the interested parties present them.

As for the major D3D12 feature and NV being behind the curve - NV supports several major D3D12 features which are completely unsupported by GCN. What's even more important, async compute as a feature is not required to be implemented in any specific way. The only requirement is to support it in the drivers - the hardware can run it serially or asynchronously, with varying results depending on the architecture.
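To be concrete, the only cross-queue contract the API gives you is ordering via fences; a hedged sketch (reusing the hypothetical gfxQueue/cmpQueue names from the sketch earlier in the thread, plus an assumed ID3D12Fence named fence):

// The compute queue signals the fence when its work is done; the graphics
// queue waits on the GPU timeline (the CPU is not blocked) before consuming
// the results. Nothing here obliges the two queues to overlap - a driver is
// free to run them back to back, which is exactly the point above.
UINT64 fenceValue = 1;
cmpQueue->Signal(fence, fenceValue);
gfxQueue->Wait(fence, fenceValue);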

As I've already said, from a pure architectural excellence point of view, a GPU which runs well in all APIs, with one workload as well as with several, is a better-designed GPU than one which runs at its peak only in D3D12/Vulkan with several workloads in parallel. So instead of being disappointed you should be glad, as your GPU doesn't have much sitting idle right now, without any need for new APIs and games.

I don't think Kepler or Maxwell cards will do poorly in DX12 games, but they won't do as well as their equivalently priced GCN cards.

I'm curious to see what gains we will see in DX12 games using this feature; to my knowledge there are none at the moment.

Kepler will do badly in DX12 games, as it is a purely DX11 architecture targeted at the lighter workloads of three years ago. GCN 1.0 won't do much better than Kepler, though.
And Maxwell 2 will do DX12 just fine, as it is a DX12 architecture. It's a safe bet that Pascal and even Volta will be better DX12 architectures - but that's how things always are.

"You can stop looking: all Maxwell 2 GPUs are fully compatible with the highest DX12 feature level."

You can argue semantics if you want, but people are going to interpret that as you saying Maxwell 2 has full DX12 support.

I'm sorry but I'm not responsible for people's reading skills. Maxwell 2 supports the highest feature level of DX12 while GCN 1.2 doesn't - this is what I've said and this is completely true.
 

Irobot82

Member
I'm sorry but I'm not responsible for people's reading skills. Maxwell 2 supports the highest feature level of DX12 while GCN 1.2 doesn't - this is what I've said and this is completely true.

Not according to the chart. It's missing a few features.
 

dr_rus

Member
Not according to the chart. It's missing a few features.

It supports the highest feature level of DX12 -- FL12_1. Your table shows this as well, by the way. For all intents and purposes this is the maximum feature level to be used in DX12, unless an update brings a higher feature level with it. It doesn't support every feature in DX12, and I never said that it does.
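The feature level and individual feature caps are also separate queries, which is where much of this confusion comes from; a minimal sketch (same assumed device as before):

// A feature level guarantees a fixed bundle of capabilities; everything
// else is reported per-cap and can vary between cards at the same level.
D3D12_FEATURE_DATA_D3D12_OPTIONS opts = {};
device->CheckFeatureSupport(D3D12_FEATURE_D3D12_OPTIONS, &opts, sizeof(opts));
// e.g. opts.ConservativeRasterizationTier, opts.ROVsSupported,
// opts.ResourceBindingTier, opts.TiledResourcesTier - a FL12_1 card can
// still lack optional features that sit outside the feature level.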
 

Kezen

Banned
This is the same as Q-Games' assessment of the gains from async compute in The Tomorrow's Children. This example is a) true for the PS4 console only, and b) actually somewhat light on asynchronous processing, if their presentations on the matter are accurate. It is unknown whether this is in fact a gain from running things asynchronously or a gain from doing stuff differently on the GCN architecture in question.


It is unknown right now whether Maxwell handles the feature worse than GCN does. We don't have enough data to draw such a conclusion yet.

It's also debatable what is really unfortunate here - that you won't get some free performance out of async compute on Maxwell in D3D12/Vulkan, or that GCN owners will never get this additional performance anywhere but in D3D12/Vulkan.


Nobody is downplaying anything. The only thing I'm trying to accomplish is to kill some of the PR/marketing FUD around the issue, as things in GPUs are hardly as simple as the interested parties present them.

As for the major D3D12 feature and NV being behind the curve - NV supports several major D3D12 features which are completely unsupported by GCN. What's even more important, async compute as a feature is not required to be implemented in any specific way. The only requirement is to support it in the drivers - the hardware can run it serially or asynchronously, with varying results depending on the architecture.

As I've already said, from a pure architectural excellence point of view, a GPU which runs well in all APIs, with one workload as well as with several, is a better-designed GPU than one which runs at its peak only in D3D12/Vulkan with several workloads in parallel. So instead of being disappointed you should be glad, as your GPU doesn't have much sitting idle right now, without any need for new APIs and games.



Kepler will do badly in DX12 games, as it is a purely DX11 architecture targeted at the lighter workloads of three years ago. GCN 1.0 won't do much better than Kepler, though.
And Maxwell 2 will do DX12 just fine, as it is a DX12 architecture. It's a safe bet that Pascal and even Volta will be better DX12 architectures - but that's how things always are.



I'm sorry but I'm not responsible for people's reading skills. Maxwell 2 supports the highest feature level of DX12 while GCN 1.2 doesn't - this is what I've said and this is completely true.

I get that making full use of async compute requires a pipeline suited to it, but the benefits seem worth it. I would certainly not say no to 20-30% more performance. DX12 games using async compute are coming our way (Fable Legends, Deus Ex: Mankind Divided), as are games using it on consoles (Mirror's Edge Catalyst, Tomb Raider); it's frustrating that those tasks will be serialized on PC due to DX11, but we can't do anything about that. That alone doesn't mean performance will be disappointing on PC, but there will be a hit somewhere.
Basically I see no reason to be sceptical about those features, but that won't make me switch to AMD. I'll upgrade to Pascal next year.

I've noticed that this particular subject makes Nvidia owners very defensive, and I don't get why. Nvidia hardware does not seem sufficiently equipped to handle async compute as well as GCN cards - so what? We can't change that, and I don't regret my 980 purchase.
I'm pleased to see more GPU features used where applicable, even if I won't benefit much from them myself.
 

dr_rus

Member
I get that making full use of async compute requires a pipeline suited to it, but the benefits seem worth it. I would certainly not say no to 20-30% more performance. DX12 games using async compute are coming our way (Fable Legends, Deus Ex: Mankind Divided), as are games using it on consoles (Mirror's Edge Catalyst, Tomb Raider); it's frustrating that those tasks will be serialized on PC due to DX11, but we can't do anything about that. That alone doesn't mean performance will be disappointing on PC, but there will be a hit somewhere.
Basically I see no reason to be sceptical about those features, but that won't make me switch to AMD. I'll upgrade to Pascal next year.
The benefit from TLP is completely dependent on the architecture in question. Both Intel and AMD have been on and off and on again about TLP in their CPU architectures. The same can be said of GPU architectures - even more so, since GPUs already run thousands of threads inside a typical graphics workload, so they have less need of higher-level TLP switching between wavefront groups / job contexts / whatever to keep their resources occupied.

I'm not skeptical of the feature; I'm skeptical of the current Internet whining that not having GCN-level efficiency for the feature is somehow the end of the world. The Skylake iGPU is the most feature-complete DX12 implementation at the moment, and as far as I know it doesn't support async compute at all - meaning it runs everything serially. This is as much an architectural choice as anything else - async compute is not something that will improve performance automatically all the time; the rest of your GPU architecture must be suited to it. And while GCN clearly is, it is unknown whether Maxwell or any other architecture would see the same level of benefit, and some evidence tends to suggest that at least Maxwell wouldn't.

Then there's also the question of how much performance, even on GCN, we are talking about. We have a peak number from the PS4 of +30%, which is definitely nice, but it is unknown right now whether this can even be transferred to the PC platform with its loads of different h/w + s/w configurations. The AoS dev didn't provide any numbers on this at all, which is rather fishy to me, as it suggests that the performance improvements from async shaders on PC aren't that large even on GCN h/w.

Basically, what I'm saying is that we need more real use cases tested before we start running around with our cocks out, throwing our 980 Tis and Titan Xs out of the window. Async compute is pretty hard to make use of on an unstable PC platform, and we need to see how it pans out in games coming this year and next.
 
[Image: chart of Direct3D 12 feature support by GPU architecture]


Here, I posted this in another thread. This is about the most complete information I've seen on it.

Edit: So yeah, most of that post is full of shit. There are measurable performance gains using Async Compute; devs have talked about it. Also, no, Maxwell 2 is not fully DX12. No graphics card currently is.

Can we please have this chart updated to include new architectures such as Pascal and Polaris? Perhaps we could add rumored features of future architectures as well?
 

dr_rus

Member
Can we please have this chart updated to include new architectures such as Pascal and Polaris? Perhaps we could add rumored features of future architectures as well?

This chart is based on the Wikipedia table, which you can find here (above the "See also" section): https://en.wikipedia.org/wiki/Feature_levels_in_Direct3D
It's updated (and maintained, because some people are constantly breaking it by including nonsense and conjecture) pretty regularly there.

As for Pascal and Polaris: Polaris has zero changes in supported DX features; Pascal bumped Maxwell's Conservative Rasterization tier from 1 to 2, which is a very minor change.
 