There have been lots of hardware features that were supposed to speed things up but, due to poor implementations, ended up as graphics decelerators:
dynamic branching
geometry shaders
deferred contexts
plenty more
The Twitter post I linked earlier is by a reputable guy, according to Dictator. Pretty sure it was ROV he was saying was slow on Nvidia.
You are right on this one.
In the past, several good-looking ideas turned out underwhelming and didn't work out in practice.
But PixelSync (now exposed as ROVs in DX12) was already used in Grid 2 and ran very efficiently.
Nvidia also published real-time numbers for VXGI on a 980; it is far slower on Kepler, which lacks CR.
I have a good feeling that this time around the new graphics features will be very useful.
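For what it's worth, both ROV and CR support can be queried at runtime in D3D12, so an engine can fall back cleanly on hardware that lacks them. A minimal sketch, assuming you already have a valid ID3D12Device (error handling mostly omitted):

```cpp
#include <windows.h>
#include <d3d12.h>
#include <cstdio>

// Query hardware support for ROVs (PixelSync) and Conservative Rasterization.
void CheckRasterFeatures(ID3D12Device* device)
{
    D3D12_FEATURE_DATA_D3D12_OPTIONS options = {};
    if (SUCCEEDED(device->CheckFeatureSupport(
            D3D12_FEATURE_D3D12_OPTIONS, &options, sizeof(options))))
    {
        std::printf("ROVs supported: %s\n",
                    options.ROVsSupported ? "yes" : "no");

        // Kepler reports D3D12_CONSERVATIVE_RASTERIZATION_TIER_NOT_SUPPORTED,
        // which is why VXGI has to fall back to a much slower path there.
        std::printf("Conservative rasterization tier: %d\n",
                    static_cast<int>(options.ConservativeRasterizationTier));
    }
}
```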
Reaching here, but... AMD has apparently got priority on HBM2 manufacturing next year. We know they are also more experienced with the new memory type, as they had a year's head start on Nvidia. Add to that, they've apparently got some kind of architectural advantage in DX12 games.
What's that I see in the distance, AMD rising from the ashes next-gen?
Maybe priority at Hynix, but Samsung will also produce HBM next year.
AMD has a long journey ahead before they can truly rise from the ashes.
They need products that can beat the competition on several key points, they need to be financially stable, they need a robust revenue stream, and they need to invest in future products.
The outlook is tough for them.
2016 is a milestone for AMD; then we will know whether this company can survive or not.
This doesn't make sense, as Hyper-Q simply would not work as intended if that were the case. There's also no way of actually _forcing_ async compute (i.e. making the pipeline preempt prior to finishing its current workload), so it can't make the performance worse. The worst case* for async compute is zero performance increase, not worse performance.
There should be the possibility of halting the workload and switching the context.
This would of course lead to worse performance.
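For context, this is roughly all that "using async compute" amounts to on the API side in D3D12: you create an additional compute queue next to the direct queue and submit to it, and whether the work actually overlaps with graphics is entirely up to the driver/hardware scheduler. A minimal sketch, assuming a valid ID3D12Device (error handling omitted, names are mine):

```cpp
#include <windows.h>
#include <d3d12.h>

// Create a dedicated compute queue next to the usual direct (graphics) queue.
// D3D12 offers no knob to force the two queues to run concurrently; the
// scheduler decides whether compute work overlaps with graphics.
ID3D12CommandQueue* CreateComputeQueue(ID3D12Device* device)
{
    D3D12_COMMAND_QUEUE_DESC desc = {};
    desc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE; // compute-only queue
    desc.Priority = D3D12_COMMAND_QUEUE_PRIORITY_NORMAL;

    ID3D12CommandQueue* computeQueue = nullptr;
    device->CreateCommandQueue(&desc, IID_PPV_ARGS(&computeQueue));
    return computeQueue;
}

// Cross-queue ordering is expressed with fences, e.g. to make the graphics
// queue wait for compute results:
//   computeQueue->Signal(fence, value);
//   graphicsQueue->Wait(fence, value);
```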
NV is able to get DX12-like CPU overhead in DX11 with deferred contexts, so they're clearly worth a lot when used properly.
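For reference, that's the D3D11 multithreading mechanism in question: commands are recorded on deferred contexts (usually on worker threads) and replayed on the immediate context. A minimal sketch, assuming an existing device and immediate context (error handling omitted):

```cpp
#include <d3d11.h>

// Record commands on a deferred context, then replay them on the immediate
// context. NV's driver reportedly maps this pattern well, which is where
// the DX12-like CPU overhead in DX11 comes from.
void RecordAndReplay(ID3D11Device* device, ID3D11DeviceContext* immediateCtx)
{
    ID3D11DeviceContext* deferredCtx = nullptr;
    device->CreateDeferredContext(0, &deferredCtx);

    // ... record state changes and draw calls on deferredCtx,
    //     typically from a worker thread ...

    ID3D11CommandList* commandList = nullptr;
    deferredCtx->FinishCommandList(FALSE, &commandList);

    // Replay on the immediate context (typically on the main thread).
    immediateCtx->ExecuteCommandList(commandList, FALSE);

    commandList->Release();
    deferredCtx->Release();
}
```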
No, only if the developer really screwed up with DX12.
Considering that the benchmark in question is a pure PC engine for a PC-exclusive game, I find it very strange that these guys are using some FL12_0+ features of GCN but don't use the FL12_1 features of Maxwell, as that is something I would mostly expect from a console-focused engine.
Of course, not all features are equally easy to implement.
Conservative Rasterization and Tiled Resources Tier 3 are definitely not straightforward to use without a clear target and use case in mind.
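To make the FL12_0 vs FL12_1 distinction concrete, this is how an engine would ask the device for its maximum feature level; a minimal sketch, assuming a valid ID3D12Device (error handling omitted):

```cpp
#include <d3d12.h>

// Query the highest feature level the adapter exposes. At the time,
// Maxwell 2 reports D3D_FEATURE_LEVEL_12_1 (CR + ROV), while GCN reports
// 12_0 but exposes other FL12_0+ capabilities such as higher resource
// binding tiers.
D3D_FEATURE_LEVEL QueryMaxFeatureLevel(ID3D12Device* device)
{
    static const D3D_FEATURE_LEVEL requested[] = {
        D3D_FEATURE_LEVEL_11_0, D3D_FEATURE_LEVEL_11_1,
        D3D_FEATURE_LEVEL_12_0, D3D_FEATURE_LEVEL_12_1,
    };

    D3D12_FEATURE_DATA_FEATURE_LEVELS levels = {};
    levels.NumFeatureLevels = sizeof(requested) / sizeof(requested[0]);
    levels.pFeatureLevelsRequested = requested;

    device->CheckFeatureSupport(
        D3D12_FEATURE_FEATURE_LEVELS, &levels, sizeof(levels));
    return levels.MaxSupportedFeatureLevel;
}
```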
Completely agree, but I still think that several accusations made in this post must be answered by NV, even if it's just to officially confirm them. It's very strange that async compute would lead to "abysmal" performance on NV's h/w, especially as there is no clear way of actually switching it on or off in DX12 code. What the original post implies (that NV provided them with code which switched it off on their h/w) doesn't make any sense either, as there is no way of choosing whether or not to use async compute from inside the DX12 code. So there are a lot of questions around the whole post which I think should be answered by the IHV in question.
The developer can choose to put every command in one queue instead of dispatching to additional compute queues alongside it, with some synchronization points.
I would guess that's what Oxide did for Nvidia GPUs.
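If so, the per-IHV switch is simple to implement at record time; a hypothetical sketch (names and structure are my invention, not Oxide's actual code):

```cpp
#include <d3d12.h>

// Hypothetical toggle: record a compute pass either on a compute command
// list (submitted to a separate compute queue, the async path) or inline
// on the direct command list, where it runs strictly in order with the
// graphics work. Command lists must match their queue's type, so the
// fallback is decided when recording.
void RecordLightingPass(ID3D12GraphicsCommandList* directList,
                        ID3D12GraphicsCommandList* computeList,
                        UINT groupsX, UINT groupsY,
                        bool useAsyncCompute)
{
    ID3D12GraphicsCommandList* target =
        useAsyncCompute ? computeList : directList;

    // ... SetPipelineState / SetComputeRootSignature / bindings go here ...
    target->Dispatch(groupsX, groupsY, 1);
}
```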