
Oxide: Nvidia GPUs do not support DX12 Asynchronous Compute/Shaders.

Not seeing the difference in these shots to be quite honest, the curves don't seem massively impacted by tessellation. Are there specific points in any of those images that look substantially worse? I mean the usual candidates (perfect circles) like the weapon sights or the curves on that big rock seem identical. Crysis 3 is a bit of a pain in the arse for showing this anyway as the screenspace effects like CA are completely washing out details.

Surely you opened them in tabs and switched between them?

the first and last comparisons show the biggest diff, but those pictures do more to hurt his argument than mine imo

Would you say the non-rounded and non-displaced geometry looks better?

A single graphical effect is rarely sweeping enough to change every single pixel of the image, and expecting that of tessellation is unrealistic given what it does (round or displace geometry). What makes these differences insignificant, and/or how do they work against my argument?

In that case, the differences between console and PC versions are always insignificant, because a dev rarely makes an entirely different renderer, asset catalogue, and so on for PC hardware.
 
the first and last comparisons show the biggest diff, but those pictures do more to hurt his argument than mine imo

From what I understand from artists on Beyond3D, tessellation is a technique best deployed sparingly anyway, as it imposes restrictions on the model creation pipeline. There was a great gif of GT6 showing dynamic tessellation on the dash dials during a replay sequence that showed off the effect quite well. NV have always offered the best tessellation tech, but I struggle to come up with any stand-out 'Wow' examples of it in the wild (could be a chicken-and-egg problem: AMD are weak at this, so why bother?).
 

Crisium

Member
But then the first Nvidia series that allows DX12 compatibility (Fermi) is more than a year older than the first AMD card that does the same (Southern Islands).

And the current top end Nvidia card is several feature levels above the current top end AMD card.

Replying to every untenable opinion I read here makes me look like an Nvidia fanboy. There are just too many.

It really only comes down to recent history, when AMD launched GCN, as that is when they settled on the same architecture for half a decade. GCN is older than Kepler, but no one could argue the latter was more forward-thinking at all. And speaking of DX feature sets, GCN is again above Kepler and even Maxwell v1 despite being older. Meanwhile Nvidia is Fermi, Kepler, Maxwell, Pascal every two years like clockwork.
 
Surely you opened them in tabs and switched between them?



Would you say the non-rounded and non-displaced geometry looks better?

A single graphical effect is rarely sweeping enough to change every single pixel of the image, and expecting that of tessellation is unrealistic given what it does (round or displace geometry). What makes these differences insignificant, and/or how do they work against my argument?

In that case, the differences between console and PC versions are always insignificant, because a dev rarely makes an entirely different renderer, asset catalogue, and so on for PC hardware.

i think in the first and last ones, tessellation looks marginally better. in the rest it just looks different. couldn't honestly say which is better.

i also think plenty of individual effects can make a huge difference.
 
Surely you opened them in tabs and switched between them?



Would you say the non-rounded and non-displaced geometry looks better?

A single graphical effect is rarely sweeping enough to change every single pixel of the image, and expecting that of tessellation is unrealistic given what it does (round or displace geometry). What makes these differences insignificant, and/or how do they work against my argument?

In that case, the differences between console and PC versions are always insignificant, because a dev rarely makes an entirely different renderer, asset catalogue, and so on for PC hardware.

Lesson #467: Phone tabs != looking on my PC.

Very obvious difference on the bigger screen. I could think of alternate ways to achieve the same effect (displacement mapping, more complex base geometry) and I hope devs do this, as the current consoles are all based on AMD's weaker tessellation implementation. I think tessellation suffered in my mind thanks to the outsize claims made for it on the first cards; the Crysis 2 Barco of a Billion Vertexes didn't help either in making it seem like a vendor scam (like AMD's crappy mip map 'optimisations').
 
i think in the first and last ones, tessellation looks marginally better. in the rest it just looks different. couldn't honestly say which is better.
Moving around an object and having its shading not show immediate edges is something that helps immerse one in a world (something that is obviously not captured in a still image). Even then, I think any bit helps and, as long as it is not some naive Phong tessellation, it is unambiguously better. Splitting hairs about whether something looks "marginally" better or not is something you hear from the worst opinions in DF threads regarding differences in PC SKUs.

Vegetation not having polygonal edges seems like a nice thing for me and is significant enough for me to enjoy it:
[Screenshots: crysis3_2015_08_30_17xwuqt.png, crysis3_2015_08_30_174yu0b.png, crysis3_2015_08_30_1725u8h.png, crysis3_2015_08_30_17b1uha.png, crysis3_2015_08_30_17epuq3.png, crysis3_2015_08_30_17eau2y.png]

Lesson #467: Phone tabs != looking on my PC.

Very obvious difference on the bigger screen. I could think of alternate ways to achieve the same effect (displacement mapping, more complex base geometry) and I hope devs do this, as the current consoles are all based on AMD's weaker tessellation implementation. I think tessellation suffered in my mind thanks to the outsize claims made for it on the first cards; the Crysis 2 Barco of a Billion Vertexes didn't help either in making it seem like a vendor scam (like AMD's crappy mip map 'optimisations').
Higher base geometry should always be something we want, so I can only agree. BTW though, the whole "Crysis 2 uses tessellation inefficiently" thing was debunked multiple times by Cryengineers and community alike years ago.
 

x3sphere

Member
I doubt this is going to turn out the way some are thinking. Whenever there's news that looks extremely positive for AMD, it seems NV always still manages to pull ahead somehow.
 

Locuza

Member
And then, we are talking about how Nvidia lacks Asynchronous Compute™ when that's the commercial name AMD gives to that feature. For your info, Nvidia calls it Dynamic Parallelism, and it has been around for several generations with CUDA.

FUD sprinkler.
No, Dynamic Parallelism (nested parallelism) does not describe the ability to execute compute command buffers asynchronously.
It is the ability to enqueue and spawn kernels from the GPU side, without going back to the CPU every time.
It's a feature that AMD and Intel also support, but it didn't make it into DX12.
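To illustrate what async compute actually means on the DX12 side, here's a minimal sketch (standard D3D12 calls, nothing from Oxide's actual engine): you create a separate compute queue next to the normal direct/graphics queue, and hardware with async compute can drain both concurrently.

Code:
// Minimal D3D12 sketch: a dedicated compute queue alongside the graphics queue.
// Hardware with async compute can execute work from both queues concurrently;
// hardware without it still runs the work, just serialized by the driver/scheduler.
#include <d3d12.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

void CreateQueues(ID3D12Device* device,
                  ComPtr<ID3D12CommandQueue>& graphicsQueue,
                  ComPtr<ID3D12CommandQueue>& computeQueue)
{
    D3D12_COMMAND_QUEUE_DESC gfxDesc = {};
    gfxDesc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;      // graphics + compute + copy
    device->CreateCommandQueue(&gfxDesc, IID_PPV_ARGS(&graphicsQueue));

    D3D12_COMMAND_QUEUE_DESC compDesc = {};
    compDesc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE;    // compute + copy only
    device->CreateCommandQueue(&compDesc, IID_PPV_ARGS(&computeQueue));
}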

And the current top end Nvidia card is several feature levels above the current top end AMD card.
12.0 AMD --> 12.1 Nvidia

One feature level = several?
 

frontieruk

Member
I think the main concern is not that the developer is blocking access to Nvidia, but rather that they set out to design their engine in such a way that it plays to AMD's strengths. We know the Nitrous engine was originally designed to show off the advantages of Mantle before DX12 came about. They make this clear in their own marketing materials:
https://www.youtube.com/watch?t=40&v=6PKxP30WxYM

Now, there is nothing wrong with that on its own, IF what we're seeing is representative of what other games/engines will do in the future. On the other hand, if it is not representative, then it only serves to act as misleading propaganda. So is it representative? We know that Star Swarm / Ashes of Singularity is different from other games in that it focuses on having a large number of independent objects on screen at once and issues a very large number of draw calls, and that this is the reason it sees such a big performance increase from Mantle / DX12.

Did Oxide set out to build the best engine possible, or did they set out to build an engine that would be advantageous to AMD's architecture? Will other engines go down the same path? And if not, will they still see the same improvements? That is what remains to be seen, and I don't think there's anything wrong with being skeptical about it. Oxide has no track record that we can use to gauge their credibility.

See, this is a reasonable argument, but as the game is an RTS it would make sense to have an engine that focuses on having a large number of independent objects on screen at once and issues a very large number of draw calls, rather than UE4, right? Is it representative of 90% of AAA games? No.

I remember seeing benchmarks comparing DX11 / DX12 and Mantle, and DX12 had a slight advantage over Mantle on AMD hardware, which doesn't surprise me as DX12 and Mantle were both designed to overcome the same problems. If I was a conspiracy nut I'd probably suggest AMD released Mantle to force Microsoft's hand with DX12, as if Mantle / OpenGL / Vulkan gained traction within the PC market, their desktop market would probably fall a lot quicker than if DX remained the API of choice, since porting to Linux would become easier.

I'm of the mind that AMD knew they could claw back their driver inefficiencies with Mantle, which is why they invested heavily with EA and DICE on the Frostbite engine (hasn't DICE said all Frostbite games next year will support DX12?), which pulled their performance in BF4 up to match Nvidia.

Will there be a huge difference in next year's games? It really depends on the engines: are they using async GFX calls because the engines are designed for the consoles? If so, worst case we may see more effects on screen, or maybe just higher performance at 4K for AMD where CPU/GPU bottlenecks occur more frequently. But at the mid range Nvidia will hold their own no matter what, as their cards are efficient enough to be ahead; it'll just be that AMD will be closer.
 

DieH@rd

Banned
See, this is a reasonable argument, but as the game is an RTS it would make sense to have an engine that focuses on having a large number of independent objects on screen at once and issues a very large number of draw calls, rather than UE4, right? Is it representative of 90% of AAA games? No.

Tomorrow's Children is far from some narrow genre that aims for max object count, and it still managed to get significant gains with async compute [they used it to calculate fancy lighting].

So if done correctly, many tasks can be re-arranged with this technique, either making everything run faster or freeing up room for more rendering tasks that were not possible to process before.
 

mrklaw

MrArseFace
I like to think of async compute a bit like running furmark on your GPU. Everyone says that is an unrealistic stress test because when playing games you are never running your GPU at 100%. With async compute there is the potential to get more out of the GPU you have and get closer to 100% usage. One downside is not enough PC GPUs having the capability, but you should be able to brute force through that if you have enough power
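To put the analogy into code, here's a rough D3D12 sketch (variable names are made up; it assumes the queues, command lists and fence were created earlier) of compute work overlapping graphics work, with a fence only where they actually depend on each other:

Code:
// Hypothetical frame snippet: compute work (e.g. lighting/post) overlaps later graphics
// work, which is how async compute pushes utilization closer to that 100% 'furmark' ceiling.
#include <d3d12.h>

void SubmitFrame(ID3D12CommandQueue* graphicsQueue, ID3D12CommandQueue* computeQueue,
                 ID3D12GraphicsCommandList* gfxList, ID3D12GraphicsCommandList* computeList,
                 ID3D12Fence* fence, UINT64 fenceValue)
{
    ID3D12CommandList* gfxLists[] = { gfxList };
    graphicsQueue->ExecuteCommandLists(1, gfxLists);
    graphicsQueue->Signal(fence, fenceValue);         // inputs for the compute pass are ready

    computeQueue->Wait(fence, fenceValue);            // GPU-side wait; the CPU is not blocked
    ID3D12CommandList* compLists[] = { computeList };
    computeQueue->ExecuteCommandLists(1, compLists);  // can overlap subsequent graphics submissions
}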
 

DieH@rd

Banned
I like to think of async compute a bit like running furmark on your GPU. Everyone says that is an unrealistic stress test because when playing games you are never running your GPU at 100%. With async compute there is the potential to get more out of the GPU you have and get closer to 100% usage. One downside is not enough PC GPUs having the capability, but you should be able to brute force through that if you have enough power

Yeah. PS4 is gonna get louder when heavy async compute games start being introduced. :)
 

frontieruk

Member
Tomorrow's Children is far from some narrow genre that aims for max object count, and it still managed to get significant gains with async compute [they used it to calculate fancy lighting].

So if done correctly, many tasks can be re-arranged with this technique, either making everything run faster or freeing up room for more rendering tasks that were not possible to process before.

Someone didn't read the whole post...

frontieruk said:
Will there be a huge difference in next year's games? It really depends on the engines: are they using async GFX calls because the engines are designed for the consoles? If so, worst case we may see more effects on screen, or maybe just higher performance at 4K for AMD where CPU/GPU bottlenecks occur more frequently. But at the mid range Nvidia will hold their own no matter what, as their cards are efficient enough to be ahead; it'll just be that AMD will be closer.
 

mrklaw

MrArseFace
Tomorrow's Children is far from some narrow genre that aims for max object count, and it still managed to get significant gains with async compute [they used it to calculate fancy lighting].

So if done correctly, many tasks can be re-arranged with this technique, either making everything run faster or freeing up room for more rendering tasks that were not possible to process before.

Also quite a few of the PS3 first party devs would be very job oriented and comfortable with decoupling some graphics tasks from the classic GPU handling - they'll have used CELL SPEs to augment the RSX. That potentially leads to them being more comfortable with using multiple compute jobs running in parallel alongside the normal GPU activities.

PS4 is kind of a logical extension of how programmers got the best from PS3
 

Easy_D

never left the stone age
Colors and that.

Please do! I want others to feel the pain I did when I bought into the DX11 hype and got a 5870 because it had tessellation. Then it turned out the 5870 was really fucking shit at tessellation :lol

Always wait for actual in-game results when drivers and such are in place to fully support the new stuff. Wouldn't surprise me if the gap was significantly smaller between AMD/Nvidia once everything is set in place. I do hope my 280x sees a nice little boost though, along with my 6300, which will most likely benefit more due to low IPC
 

Locuza

Member
I don't know what you are thinking, but for AMD's GPUs the FLs are:

GCN Gen 1 = 11.1
GCN Gen 2 = 12.0
GCN Gen 3 = 12.0

and for Nvidia

Fermi = 11.0
Kepler = 11.0
Maxwell Version 1 = 11.0
Maxwell Version 2 = 12.1

Intel:

Haswell = 11.1
Broadwell = 11.1
Skylake = 12.1
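If someone wants to check their own card, this is roughly how the maximum supported feature level is queried through the standard D3D12 API (minimal sketch, nothing vendor specific):

Code:
// Ask D3D12 for the highest feature level the adapter/driver reports.
#include <d3d12.h>

D3D_FEATURE_LEVEL QueryMaxFeatureLevel(ID3D12Device* device)
{
    const D3D_FEATURE_LEVEL requested[] = {
        D3D_FEATURE_LEVEL_11_0, D3D_FEATURE_LEVEL_11_1,
        D3D_FEATURE_LEVEL_12_0, D3D_FEATURE_LEVEL_12_1
    };
    D3D12_FEATURE_DATA_FEATURE_LEVELS levels = {};
    levels.NumFeatureLevels        = static_cast<UINT>(sizeof(requested) / sizeof(requested[0]));
    levels.pFeatureLevelsRequested = requested;
    device->CheckFeatureSupport(D3D12_FEATURE_FEATURE_LEVELS, &levels, sizeof(levels));
    return levels.MaxSupportedFeatureLevel;   // e.g. 12_0 on GCN Gen 2/3, 12_1 on Maxwell v2
}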
 

Arkanius

Member
I don't know what you are thinking, but for AMD's GPUs the FLs are:

GCN Gen 1 = 11.1
GCN Gen 2 = 12.0
GCN Gen 3 = 12.0

and for Nvidia

Fermi = 11.0
Kepler = 11.0
Maxwell Version 1 = 11.0
Maxwell Version 2 = 12.1

Intel:

Haswell = 11.1
Broadwell = 11.1
Skylake = 12.1

Actually, if I'm not mistaken, GCN 1.0 has a feature level of 12.0 as well.
 

tokkun

Member
:) then I can do this!

The fact that Nvidia had a marketing agreement with Project CARS does mean you should be skeptical of people trying to draw broad conclusions off of just that game. The thing is that Project CARS is just one game that gets used in a benchmarking suite these days, so you have many other data points to use to judge if it is representative of broader performance trends. Any site that tried to benchmark a new GPU and only included numbers from Project CARS would get laughed out of the room.

The problem with the Ashes of Singularity benchmark is that you have all of these sites using it as the only data point and trying to make very broad conclusions about DX12 and future performance. And it isn't even based on a complete game. That is foolish.

Tomorrow's Children is far from some narrow genre that aims for max object count, and it still managed to get significant gains with async compute [they used it to calculate fancy lighting].

So if done correctly, many tasks can be re-arranged with this technique, either making everything run faster or freeing up room for more rendering tasks that were not possible to process before.

There is a concept in computer science called Amdahl's Law; to summarize it briefly, it states that the overall benefit you can get from speeding up any one part of a computation is limited by how large a fraction of the total computation time that part occupies.
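As a quick made-up example of what that means here: if async compute only touches 20% of a frame's GPU time and doubles the speed of that part, the whole frame only gets about 11% faster.

Code:
#include <cstdio>

// Amdahl's Law: overall speedup when a fraction p of the work is sped up by factor s.
double amdahl(double p, double s) { return 1.0 / ((1.0 - p) + p / s); }

int main() {
    // Hypothetical numbers: async compute accelerates 20% of the frame by 2x.
    std::printf("overall speedup: %.3fx\n", amdahl(0.20, 2.0));  // ~1.111x, i.e. ~11% faster
    return 0;
}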

With Tomorrow's Children, the question is how directly you can extrapolate the performance increases on a PS4 to what they would be on a PC. PCs have more powerful CPUs and GPUs, a different memory architecture, and the games are often run at higher resolution and with more effects. The bottlenecks on a PC may be different than on the PS4.

The issue becomes more convoluted because developers are aware of bottlenecks and design their engines around them. If Nvidia holds an 80% marketshare in PC discrete graphics, developers may hesitate to design an engine around AMD's strengths if they are doing PC-first development. We saw this pattern on consoles, where few game engines were designed to take advantage of Cell's strengths outside of PS3 exclusives. And Nvidia is already pushing their marketshare advantage hard with GameWorks. I think the hope for AMD that a few people have expressed is that engines designed for AMD chips in consoles will also run well on AMD chips in PCs and that developers will do console-first development. If so, it was a masterful strategic move on AMD's part, but it is not exactly guaranteed to play out that way.
 

anothertech

Member
Wow. Asynchronous compute, HBM, and cloud processing all seem to actually be a thing in the near future.

Future looks bright!
 
You're saying he is wrong to correct you?

Yup:

DX12FeatureLEvels-640x436.png


Every feature level is composed of several features and those features have their own tiers, as you can see on the graph above.

So it's safe to talk about several differences between feature sets, beyond just a 0.1 difference.
 

tuxfool

Banned
Yup:

DX12FeatureLEvels-640x436.png


Every feature level is composed of several features and those features have their own tiers, as you can see on the graph above.

So it's safe to talk about several differences between feature sets, beyond just a 0.1 difference.

Let me quote you:

And the current top end Nvidia card is several feature levels above the current top end AMD card.

You didn't call them DX12 features or sets. You called them Feature Levels.
 

vpance

Member
I like to think of async compute a bit like running furmark on your GPU. Everyone says that is an unrealistic stress test because when playing games you are never running your GPU at 100%. With async compute there is the potential to get more out of the GPU you have and get closer to 100% usage. One downside is not enough PC GPUs having the capability, but you should be able to brute force through that if you have enough power

I was thinking the same thing a while back about the Furmark-like utilization. Devs may be using async compute loosely here and there, but the true examples of implementation won't be seen until next year and beyond.
 

frontieruk

Member
The fact that Nvidia had a marketing agreement with Project CARS does mean you should be skeptical of people trying to draw broad conclusions off of just that game. The thing is that Project CARS is just one game that gets used in a benchmarking suite these days, so you have many other data points to use to judge if it is representative of broader performance trends. Any site that tried to benchmark a new GPU and only included numbers from Project CARS would get laughed out of the room.

The problem with the Ashes of Singularity benchmark is that you have all of these sites using it as the only data point and trying to make very broad conclusions about DX12 and future performance. And it isn't even based on a complete game. That is foolish.



There is a concept in computer science called Amdahl's Law; to summarize it briefly, it states that the overall benefit you can get from speeding up any one part of a computation is limited by how large a fraction of the total computation time that part occupies.

With Tomorrow's Children, the question is how directly you can extrapolate the performance increases on a PS4 to what they would be on a PC. PCs have more powerful CPUs and GPUs, a different memory architecture, and the games are often run at higher resolution and with more effects. The bottlenecks on a PC may be different than on the PS4.

The issue becomes more convoluted because developers are aware of bottlenecks and design their engines around them. If Nvidia holds an 80% marketshare in PC discrete graphics, developers may hesitate to design an engine around AMD's strengths if they are doing PC-first development. We saw this pattern on consoles, where few game engines were designed to take advantage of Cell's strengths outside of PS3 exclusives. And Nvidia is already pushing their marketshare advantage hard with GameWorks. I think the hope for AMD that a few people have expressed is that engines designed for AMD chips in consoles will also run well on AMD chips in PCs and that developers will do console-first development. If so, it was a masterful strategic move on AMD's part, but it is not exactly guaranteed to play out that way.


Well said!

This gen is different from last gen, when a game was made for the console and then ported due to the hardware differences. This gen you basically have an x64-based Windows game machine and a Linux games machine, which means you could create a PC engine and port it to the console, then optimise it there (a la Elite Dangerous). It's more likely, though, as with the recent Batman release, that publishers are getting devs to aim at consoles first, meaning the PC is using an engine optimised for consoles.
 

nib95

Banned
Pretty interesting OP and tidbits of information. I wonder what this will mean for the Nvidia GPUs in question going forward. Also interested to see how the consoles, notably the PS4, use asynchronous compute to better effect in future.
 
Pretty interesting OP and tidbits of information. I wonder what this will mean for the Nvidia GPUs in question going forward. Also interested to see how the consoles, notably the PS4, use asynchronous compute to better effect in future.

Only problem is the OP is unspecific about how necessary it is on Maxwell 2 and whether Maxwell 2 even has it. But it not being part of Maxwell 1 (basically one GPU) and Kepler has been known for a while, I think.
 

Locuza

Member
Yup:

DX12FeatureLEvels-640x436.png


Every feature level is composed of several features and those features have their own tiers, as you can see on the graph above.

So it's safe to talk about several differences between feature sets, beyond just a 0.1 difference.
But you said:
And the current top end Nvidia card is several feature levels above the current top end AMD card.

Going from your original statement, it was of course safe to say that there is only a 0.1 difference, because that's what it is.
And a Feature Level is a clearly defined set; there are no differences within one.
Everything else is optional and must be explicitly queried.

Now, if you want to talk about every feature, then of course we are looking at different things.

The table is wrong by the way, as are many tables before and after it.
 
There aren't any AMD forums besides semiaccurate. The market share implies as much.

Every forum big enough will have at least one overenthusiastic AMD fan, always ready to help and inform people about good and evil.

Call it CM if you want.


Let me quote you:

You didn't call them DX12 features or sets. You called them Feature Levels.

What would you call one card having no conservative rasterization while the competitor has it at tier 2? Or having Rasterizer Ordered Views versus not having them? You only need a given feature to be at tier 1 to be eligible for that feature level, but GM2xx surpasses those minimum requirements for DX12_1.

You can try to rethink the matter as much as you want, attack my wording or sacrifice a virgin engineer, but that doesn't change the fact that current Nvidia top end cards are better suited for DX12 according to Microsoft, the makers of the thing.

Reading this thread and some reactions, it looks like the other way around.


The table is wrong by the way, as are many tables before and after it.

Yup, I know, but it serves its purpose.
 

frontieruk

Member
Every forum big enough will have at least one overenthusiastic AMD fan, always ready to help and inform people about good and evil.

Call it CM if you want.




What would you call one card having no conservative rasterization while the competitor has it at tier 2? Or having Rasterizer Ordered Views versus not having them? You only need a given feature to be at tier 1 to be eligible for that feature level, but GM2xx surpasses those minimum requirements for DX12_1.

You can try to rethink the matter as much as you want, attack my wording or sacrifice a virgin engineer, but that doesn't change the fact that current Nvidia top end cards are better suited for DX12 according to Microsoft, the makers of the thing.

Reading this thread and some reactions, it looks like the other way around.




Yup, I know, but it serves its purpose.

I missed that memo. Source?

I completely agree on the Rasterizer Ordered Views btw, blows my mind AMD don't have HW support for it
 

Locuza

Member
What would you call one card having no conservative rasterization while the competitor has it at tier 2? Or having Rasterizer Ordered Views versus not having them? You only need a given feature to be at tier 1 to be eligible for that feature level, but GM2xx surpasses those minimum requirements for DX12_1.

You can try to rethink the matter as much as you want, attack my wording or sacrifice a virgin engineer, but that doesn't change the fact that current Nvidia top end cards are better suited for DX12 according to Microsoft, the makers of the thing.

Reading this thread and some reactions, it looks like the other way around.
Simply looking at the FLs, it's of course 12.0 against 12.1.
Not looking at the FLs, it becomes a lot more complicated.

GCN Gen 3 supports FL12.0 + Resource Binding Tier 3, Pixelshader Specified Stencil Reference and effective async computing.
(+FP16 and some additional UAV formats.)

Maxwell v2 supports FL12.1 + Conservative Rasterization Tier 1, Rasterizer Ordered Views and Tiled Resources Tier 3.

It's hard to say which is clearly better than the other and it could stay that way.
There are huge benefits for GCN coming with async compute; Nvidia could do some cool things with GameWorks using Conservative Rasterization, Tiled Resources Tier 3, and ROVs.
There is no clear conclusion to draw yet.
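For reference, the optional caps listed above can all be checked with the standard D3D12 options query; a minimal sketch (the printout is just illustrative):

Code:
// Query the optional D3D12 caps discussed above (resource binding, tiled resources,
// conservative rasterization, ROVs, stencil ref from the pixel shader, extra UAV formats).
#include <d3d12.h>
#include <cstdio>

void PrintD3D12Options(ID3D12Device* device)
{
    D3D12_FEATURE_DATA_D3D12_OPTIONS opts = {};
    device->CheckFeatureSupport(D3D12_FEATURE_D3D12_OPTIONS, &opts, sizeof(opts));

    std::printf("ResourceBindingTier:            %d\n", (int)opts.ResourceBindingTier);
    std::printf("TiledResourcesTier:             %d\n", (int)opts.TiledResourcesTier);
    std::printf("ConservativeRasterizationTier:  %d\n", (int)opts.ConservativeRasterizationTier);
    std::printf("ROVsSupported:                  %d\n", (int)opts.ROVsSupported);
    std::printf("PSSpecifiedStencilRefSupported: %d\n", (int)opts.PSSpecifiedStencilRefSupported);
    std::printf("TypedUAVLoadAdditionalFormats:  %d\n", (int)opts.TypedUAVLoadAdditionalFormats);
}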

Yup, I know, but it serves its purpose.
The purpose of showing and saying wrong things?
I thought you were against spreading FUD.
 

Easy_D

never left the stone age
Man, starting to think I should've gotten a 960 over the 280x considering the 900 series supports the 12.0 featureset.
 

tuxfool

Banned
What would you call one card having no conservative rasterization while the competitor has it at tier 2? Or having Rasterizer Ordered Views versus not having them? You only need a given feature to be at tier 1 to be eligible for that feature level, but GM2xx surpasses those minimum requirements for DX12_1.

You can try to rethink the matter as much as you want, attack my wording or sacrifice a virgin engineer, but that doesn't change the fact that current Nvidia top end cards are better suited for DX12 according to Microsoft, the makers of the thing.

Reading this thread and some reactions, it looks like the other way around.

My objection here is you misusing labels and acting smug when corrected. Additionally, your other comments reek of a persecution complex that isn't warranted.
 

tuxfool

Banned
Simply looking at the FLs, it's of course 12.0 against 12.1.
Not looking at the FLs, it becomes a lot more complicated.

GCN Gen 3 supports FL12.0 + Resource Binding Tier 3, Pixelshader Specified Stencil Reference and effective async computing.
(+FP16 and some additional UAV formats.)

Are you sure it supports FP16?
 