Support NeoGAF

justsomeguy · Sep 3, 2015

Basically for any issue like this, go read what sebbbi says and ignore everyone else. He understands the tech and doesn't have any technology bias that I've noticed.

W!CK!D · Sep 3, 2015

dr. apocalipsis said:
It will be interesting to see how some people here will try to discredit Sebastian now.

You're still not getting it?

What he is saying is that this benchmark is not a real-world scenario. This is directly adressed to the people who said "Why should I even care? Nvidia is still faster in this benchmark, anyway...". (Didn't you say something alike?) He also says that you won't see CPU -> GPU -> CPU rountrips. DX12 is a PC API and these roundtrips are not rational in any way because of the copy overhead that you get with discrete GPUs. No unified architecture, no roundtrips. On PC, devs will use this feature for fancy graphics.

dr. apocalipsis · Sep 3, 2015

You got me with the coding for GDDR5 sentence.

Somebody should put it as your status.

Hamster Plugin · Sep 3, 2015

My current most anticipated games for PC are Just Cause 3 and Deus Ex Mankind Divided. They're both Square Enix games so they have an AMD marketing deal. Deus Ex will even use DirectX 12.

Should I expect these to perform much better on an AMD card, or at least the DX12-enabled Deus Ex?

Irobot82 · Sep 3, 2015

VariantX said:
Yep. I'm not going to pretend im super educated about this stuff, but I know enough that there's not enough info available right now, particularly with actual games being built with DX12 in mind to be worried about it at this point.

Hamster Plugin said:
My current most anticipated games for PC are Just Cause 3 and Deus Ex Mankind Divided. They're both Square Enix games so they have an AMD marketing deal. Deus Ex will even use DirectX 12.

Should I expect these to perform much better on an AMD card, or at least the DX12-enabled Deus Ex?

You should expect them to run very well on both cards. Avalanche makes great PC games and I think Nixes is doing the Deus Ex port, they also did a good job last time. Also no PhysX so that's a plus. They should both be well made PC games.

mckmas8808 · Sep 3, 2015

W!CK!D said:
First of all, there is no need for personal attacks.

A CPU is a processor that consists of a small number of big processing cores. A GPU is a processor that consists of a very huge number of small processing cores. Therefore most home PCs have multiple processors. "APU" is a marketing term for a single processor that consists of different kinds of processing cores. In the case of the PS4, the APU has two Jaguar modules with four x86 cores each and 18 GCN compute units with 64 shader cores each. You distinguish processors as the following: Single core (like Intel Pentium), multi core (Intel Core i7 or any GPU), hetero core (APUs like the one in PS4) and cloud core (Microsoft Azure for example).

If you take a look back, the evolution of computer technology was always about maximum integration. The reason for that you is want to minimize latency as much as possible. A couple of years ago, GPUs only had fixed-function hardware. That means that every core of the GPU was specialized for a certain task. That changed with the so called unified shader model. Today, the shader cores of a modern GPU are freely programmable. Just think of them as extremely stupid CPU cores. The advantage of a freely programmable GPU, however, is that you have thousands of those cores. The PS4 has 1156 shader cores. That makes a GPU perfectly suited for tasks that benefit from mass parallelization like graphics rendering. You can also utilize them for general purpose computations (GPGPU) which, in theory, opens up a whole new world of possibilities since the brute force of a GPU is much higher than the computational power of a traditional CPU. In practice, however, the possibilities of GPGPU are limited by latency.

If you want to do GPGPU on a traditional gaming PC, you have to copy your data from your RAM pool over the PCIe to your VRAM pool. The process of copying costs latency. A roundtrip from CPU -> GPU -> CPU usually takes so long that the performance gain from utilizing the thousands of shader cores gets immediately eaten up by the additional latency: Even if the GPU is much faster at solving the task than the CPU, the process of copying the data back and forth will make the GPGPU approach slower than letting the CPU do it on its own. That's the reason why GPGPU today is only used for things that don't need to be send back to the CPU. The possibilities on a traditional PC are very limited.

The next step in integration is the so-called hetero core processor. You integrate the CPU cores as well as the GPU shader cores on a single processor die and give them one unified RAM pool to work with. That will allow you to get rid of that nasty copy overhead. Till this day, the PS4 has the most powerful hetero core processor (2 TFLOPS @ 176GB/s) available. Not only that, since the APU in PS4 was built for async compute (see Cerny interviews), it can do GPGPU without negatively affecting graphics rendering performance. It's a pretty awesome system architecture, if you want my opinion.

The only problem is, that PC gamers don't have a unified system architecture. The developers of multiplatform engines have to consider that fact. 1st-party console devs can fully utilize the architecture, though.

WOW!! That was a great explanation!

mephixto · Sep 3, 2015

dr. apocalipsis said:
You got me with the coding for GDDR5 sentence.

Somebody should put it as your status.

I remember that avatar, I think he was banned like a year ago, at that time he was following Pope Cerny and the hUMA church.It seems he made a new account.

Caayn · Sep 3, 2015

W!CK!D said:
First of all, there is no need for personal attacks.

A CPU is a processor that consists of a small number of big processing cores. A GPU is a processor that consists of a very huge number of small processing cores. Therefore most home PCs have multiple processors. "APU" is a marketing term for a single processor that consists of different kinds of processing cores. In the case of the PS4, the APU has two Jaguar modules with four x86 cores each and 18 GCN compute units with 64 shader cores each. You distinguish processors as the following: Single core (like Intel Pentium), multi core (Intel Core i7 or any GPU), hetero core (APUs like the one in PS4) and cloud core (Microsoft Azure for example).

If you take a look back, the evolution of computer technology was always about maximum integration. The reason for that you is want to minimize latency as much as possible. A couple of years ago, GPUs only had fixed-function hardware. That means that every core of the GPU was specialized for a certain task. That changed with the so called unified shader model. Today, the shader cores of a modern GPU are freely programmable. Just think of them as extremely stupid CPU cores. The advantage of a freely programmable GPU, however, is that you have thousands of those cores. The PS4 has 1156 shader cores. That makes a GPU perfectly suited for tasks that benefit from mass parallelization like graphics rendering. You can also utilize them for general purpose computations (GPGPU) which, in theory, opens up a whole new world of possibilities since the brute force of a GPU is much higher than the computational power of a traditional CPU. In practice, however, the possibilities of GPGPU are limited by latency.

If you want to do GPGPU on a traditional gaming PC, you have to copy your data from your RAM pool over the PCIe to your VRAM pool. The process of copying costs latency. A roundtrip from CPU -> GPU -> CPU usually takes so long that the performance gain from utilizing the thousands of shader cores gets immediately eaten up by the additional latency: Even if the GPU is much faster at solving the task than the CPU, the process of copying the data back and forth will make the GPGPU approach slower than letting the CPU do it on its own. That's the reason why GPGPU today is only used for things that don't need to be send back to the CPU. The possibilities on a traditional PC are very limited.

The next step in integration is the so-called hetero core processor. You integrate the CPU cores as well as the GPU shader cores on a single processor die and give them one unified RAM pool to work with. That will allow you to get rid of that nasty copy overhead. Till this day, the PS4 has the most powerful hetero core processor (2 TFLOPS @ 176GB/s) available. Not only that, since the APU in PS4 was built for async compute (see Cerny interviews), it can do GPGPU without negatively affecting graphics rendering performance. It's a pretty awesome system architecture, if you want my opinion.

The only problem is, that PC gamers don't have a unified system architecture. The developers of multiplatform engines have to consider that fact. 1st-party console devs can fully utilize the architecture, though.

I do wonder though, do you work for Sony's PR team?

Locuza · Sep 4, 2015

dr. apocalipsis said:
This is like saying Tessellation was a key feature of DX10 because many cards of the era were able to support it, but it wasn't a requisite of DX set until DX11.

Who knows, maybe Horse Armour is right and it becomes a requisite for DX13.

Except the little difference that Tessellation wasn't available in DX10 at all, in contrast to Async Compute in DX12.
Actually it is a requisite in DX12, but there is no requisite how to implement it.

justsomeguy said:
Interesting - so maybe the XBox's 2 ACE units are fine after all, if I've understood correctly.

I would claim it is.

W!CK!D said:
It's too early to judge the value of Sony's additional resources. They didn't pack 8 ACEs and 64 queues for no reason.

I think for most cases they did.
Kabini has 4 ACEs with 32 queues, for 128 ALUs...
But AMD wouldn't pack so much for no reasons right?

dr. apocalipsis said:
Sweet lord...

He's not necessarily wrong. The access granularity of memory becomes bigger with HBM, which is affecting which code is ideal.

Arkanius said:
Beyond3D have gotten more results, and it Async Computing in Maxwell 2 is "supported" through the Driver offloading the Computing calculations to the CPU and back, hence the huge delay added, and why it was faster for Oxide t

This is a very wrong speculation.
No calculations are offloaded to the CPU.

W!CK!D · Sep 4, 2015

mephixto said:
I remember that avatar, I think he was banned like a year ago, at that time he was following Pope Cerny and the hUMA church.It seems he made a new account.

Okay, let's get this out of the way once and for all:

I never received a single ban on NeoGAF.

More than a year ago, there was a small group of NeoGAF users, all of them PC gamers, that were extremely angry about the things I wrote. Back in that time, I tried to explain how ACEs are going to work within the new consoles, what GPGPU is, the benefits of hetero-core processors, what low-level APIs are (that was before Mantle was even announced, by the way) and that the Microsoft PR is lying about the power of the new Xbox.

The problem was, that the aforementioned group of users wasn't able to prove me wrong with arguments. So they decided to do pretty much the same as you're doing now: They attacked and discredited me whenever possible which led to dozens of derailed threads. I thought the problem was me and so I contacted Bish (go ask him!) to delete my account and all my posts. If you need a prove, try searching for any of old my old posts (username W!CKED). They're all deleted.

I became a "silent" NeoGAF user and even though pretty much everything I claimed during my time as an active user turned out to be true, I realized that I was completely wrong about another thing: The group that discredited me didn't stop when I was gone. They didn't even understand why I was gone. They interpreted the situation as a victory for themselves and so they kept spreading their bullshit.

A couple of weeks ago I signed up for NeoGAF again with the exact same email adress I used last time.

Freshmaker · Sep 4, 2015

Caayn said:
I do wonder though, do you work for Sony's PR team?

Don't really get the salt. He just explained the basic idea behind the PS4 architecture. He didn't say it was more powerful than PC hardware except vs other integrated solutions which doesn't really amount to much when the competition is Intel 4000 graphics or whatever.

As far as system warz stuff goes, that was mild as heck.

dr. apocalipsis · Sep 4, 2015

Locuza said:
Except the little difference that Tessellation wasn't available in DX10 at all, in contrast to Async Compute in DX12.
Actually it is a requisite in DX12, but there is no requisite how to implement it.

No, stop this already. That claim IS WRONG.

Kollock said:
I don't believe there is any specific requirement that Async Compute be required for D3D12, but perhaps I misread the spec.

He's not necessarily wrong. The access granularity of memory becomes bigger with HBM, which is affecting which code is ideal.

Nothing to do. It is related to GCN being geometry limited, thing that Nvidia exploited. Some time ago, I remember explaining in this forum how GPUs can't perfectly scale just by adding shader processors because they would hit some other architectural bottlenecks. This is one of those cases. Of course, many enthusiastic users reacted against me because it was against their agenda.

W!CK!D said:
Okay, let's get this out of the way once and for all:

You are just saying nonsense and leading people to confusion.

I have to admit that terms like 'cloud core' are totally genius, though. But that post that many users are giving you thanks for is undocumented bullshit for the most part, misleading and tendentious. You are just blowing the embers of ignorance and console brawls.

There are 2 concepts that are leading to confusion here:

Asynchronous Shaders: The AMD GCN hardware capability of injecting compute tasks during the rendering pipeline, hyperthreading alike, allowing them to be completed without slowing down significantly the original rendering tasks. This is an AMD only HW feature and Nvidia won't be able to match it using software or patches of any kind because Maxwell doesn't have that capacity. Period.

Asynchronous Compute: Being able to run compute along rendering tasks. Most DX11 GPUs are able to do this, but most of them with performance losses because of the need of context switching. Nvidia still can improve their performance offloading scheduling tasks to the CPU via drivers, this is, software.

Anything else about unified memory, clouds and GDDR5 is just pure bias and the chance some people found to stick the tip in.

derExperte · Sep 4, 2015

W!CK!D said:
Okay, let's get this out of the way once and for all:

I never received a single ban on NeoGAF.

More than a year ago, there was a small group of NeoGAF users, all of them PC gamers, that were extremely angry about the things I wrote.

This is a gross misrepresentation of what happened. You pretty much went off the rails at the end and ruined your already flimsy reputation with posts like this. That infamous Crysis thread was a mess too thanks in large parts to you. Trying to put the blame solely on others is just pathetic and I urge everyone to be very careful about believing anything W!CKED says. The 'don't attack me, attack my arguments' shtick is just part of the act, many people tried to reason with him back in the day but he'll wiggle around, evade, delete parts of comments he can't counter and post more bullcrap until someone gets fed up. Too bad some are starting to fall for it again.

Locuza · Sep 4, 2015

dr. apocalipsis said:
No, stop this already. That claim IS WRONG.

No.

Look how Nvidia themself see it:

Multi-Engine is an API, part of DX12, which every vendor must support.
That's a statement also Andrew Lauritzen did from Intel:

Andrew Lauritzen said:
When someone says that an architecture does or doesn't support "async compute/shaders" it is already an ambiguous statement (particularly for the latter). All DX12 implementations must support the API (i.e. there is no caps bit for "async compute", because such a thing doesn't really even make sense), although how they implement it under the hood may differ. This is the same as with many other features in the API.

Another valuable thing Andrew is stating:

Andrew said:
From an architecture point of view, a more well-formed question is "can a given implementation ever be running 3D and compute workloads simultaneously, and at what granularity in hardware?" Gen9 cannot run 3D and compute simultaneously, as we've referenced in our slides. However what that means in practice is entirely workload dependent, and anyone asking the first question should also be asking questions about "how much execution unit idle time is there in workload X/Y/Z", "what is the granularity and overhead of preemption", etc. All of these things - most of all the workload - are relevant when determining how efficiently a given situation maps to a given architecture.

Without that context you're effectively in making claims like 8 cores are always better than 4 cores (regardless of architecture) because they can run 8 things simultaneously. Hopefully folks on this site understand why that's not particularly useful.

... and if anyone starts talking about numbers of hardware queues and ACEs and whatever else you can pretty safely ignore that as marketing/fanboy nonsense that is just adding more confusion rather than useful information.

https://forum.beyond3d.com/threads/intel-gen9-skylake.57204/page-6#post-1869935"]https://forum.beyond3d.com/threads/intel-gen9-skylake.57204/page-6#post-1869935[/URL]

Nothing to do. It is related to GCN being geometry limited, thing that Nvidia exploited. Some time ago, I remember explaining in this forum how GPUs can't perfectly scale just by adding shader processors because they would hit some other architectural bottlenecks. This is one of those cases. Of course, many enthusiastic users reacted against me because it was against their agenda.

I can't follow you here. The context was optimized code for GDDR5 and HBM on Fury. (although I believe HBM1 from Hynix has the same granularity like GDDR5, only later on they will double on this).

hooijdonk17 · Sep 4, 2015

dr. apocalipsis is still arguing that multi-engine ergo ACE are not a DX12 specification and people still have not put him on ignore. What is the world coming to..

Dictator93 · Sep 4, 2015

Locuza said:
Another valuable thing Andrew is stating:

I can't follow you here. The context was optimized code for GDDR5 and HBM on Fury. (although I believe HBM1 from Hynix has the same granularity like GDDR5, only later on they will double on this).

Andrews comment is pretty valuable thanks for sharing.

ALso, I am pretty sure the differences between Gen 1 HBM and GDDR5 should not be the reasons why FUry X is performing only a bit better than A 290X in the benchmark. I do not think there is such a large distinction between the two... as was made above. NAmely you have to code differently for higher bandwidth...

Andrew's comments and even the Oxide dev's comment about Async not being very germane to all architectures is starting to sound like one of the more reasonable opinions out there.

Yes, the scheduling is non-trivial and not really something an application can do well either, but GCN tends to leave a lot of units idle from what I can tell, and thus it needs this sort of mechanism the most. I fully expect applications to tweak themselves for GCN/consoles and then basically have that all undone by the next architectures from each IHV that have different characteristics. If GCN wasn't in the consoles I wouldn't really expect ISVs to care about this very much. Suffice it to say I'm not convinced that it's a magical panacea of portable performance that has just been hiding and waiting for DX12 to expose it.

SeeNoWeevil · Sep 4, 2015

DX12 Async Compute Latency thread

So it looks like the B3D test isn't resulting in work going in the compute queue on a 980Ti but instead it's going in the graphics queue. PhysX and the Window manager *are* putting stuff in the compute queue and showing signs of asynchronous behaviour. Interesting.

MaLDo · Sep 4, 2015

mephixto said:
I remember that avatar, I think he was banned like a year ago, at that time he was following Pope Cerny and the hUMA church.It seems he made a new account.

http://www.neogaf.com/forum/showpost.php?p=83107485&postcount=25

Locuza · Sep 4, 2015

hooijdonk17 said:
dr. apocalipsis is still arguing that multi-engine ergo ACE are not a DX12 specification and people still have not put him on ignore. What is the world coming to..

Allow me to be nitpicky.
Ergo ACE is not quite right, since ACE is an AMD specific term and also a hardware specific implementation ergo not part of the DX12 specifications, which "only" specify what and how you can do certain things, but not how the hardware must handle it.

Dictator93 said:
ALso, I am pretty sure the differences between Gen 1 HBM and GDDR5 should not be the reasons why FUry X is performing only a bit better than A 290X in the benchmark. I do not think there is such a large distinction between the two... as was made above. NAmely you have to code differently for higher bandwidth...

No, I agree, it shouldn't be in that case.
Actually I just wanted to say that HBM brings some differences were certain adjustments needs to be made.
(The memory-controller handles more channels now, the load-balancing may be tricky, the access granularity of HBM2 should be 64-Byte in the future)

But I didn't made a clear statement.

Diablos · Sep 4, 2015

I'm so confused. Are nVidia GPU's gimped when it comes to Async Computer or not??

dr. apocalipsis · Sep 4, 2015

Locuza said:
No.

Look how Nvidia themself see it:

Apparently More Control and Reduced Overhead are requisites of DX12 instead of benefits too, if that makes any sense for you.

That slide is talking about DX12 features AND their benefits, the ability for Async Compute being a benefit, NOT a requisite. I was about to post that slide yesterday, but I thought devs word would suffice.

Both Maxwell 2 and Intel Skylakes Gen9 Intel HD are considered higher DX12 tier than GCN, and neither of those 2 support Async Shaders as GCN does. What does that tell to you?

Some of you having the need of pushing this is beyond me.

Locuza said:
Multi-Engine is an API, part of DX12, which every vendor must support.
That's a statement also Andrew Lauritzen did from Intel:

... and if anyone starts talking about numbers of hardware queues and ACEs and whatever else you can pretty safely ignore that as marketing/fanboy nonsense that is just adding more confusion rather than useful information.

Click to expand...

As some of us have been saying all this time.

But then, the GPUs engines AREN'T an API. An API is the software layer capable -or not- of making use of a given hardware. When we talk about DX12 or Vulkan being able to discover certain HW feature we talk about that API being able to make use of it, not of it being it.

Locuza said:
I can't follow you here. The context was optimized code for GDDR5 and HBM on Fury. (although I believe HBM1 from Hynix has the same granularity like GDDR5, only later on they will double on this).

All I say is it doesn't makes any sense at all. DDR1, DDR4 or magic, it doesn't matter, that just refer to memory standards. All your processor will care about if for more bandwidth and storage being good, less latencies and stalls too. There is not such a thing as coding for GDDR5. As much, you should accomodate to the bandwidth or bus width available when optimizing for certain devices. That's all.

FuryX not being better than 290x have little to do with the memory interface, and it's related to architecture limitations that aren't solved throwing there more shaders.

hooijdonk17 said:
dr. apocalipsis is still arguing that multi-engine ergo ACE are not a DX12 specification and people still have not put him on ignore. What is the world coming to..

Of course ACE is not a DX12 specification, it's just the name of the AMD compute controller. Just one more proof of how clueless you are at this subject.

dr. apocalipsis · Sep 4, 2015

Diablos said:
I'm so confused. Are nVidia GPU's gimped when it comes to Async Computer or not??

All this time NVIDIA has been bypassing DX10/11 limits through their propietary GPGPU solution CUDA. Now DX12 is standarizing all this compute stuff and we still have to know if Nvidia will be able to accomodate their current architectures to this new standard efficiently.

We know Nvidia has been pushing GPGPU since G80 era. We know Maxwell 2 has strong compute abilities where little Kepler (GTX700 consumer GPUs) doesn't.

All we need to know is if Nvidia will be able to make Maxwell cards work as DX12 wants or if they will need to make another software workaround. Basically, we are on the dark for now.

Diablos · Sep 4, 2015

dr. apocalipsis said:
All this time NVIDIA has been bypassing DX10/11 limits through their propietary GPGPU solution CUDA. Now DX12 is standarizing all this compute stuff and we still have to know if Nvidia will be able to accomodate their current architectures to this new standard efficiently.

We know Nvidia has been pushing GPGPU since G80 era. We know Maxwell 2 has strong compute abilities where little Kepler (GTX700 consumer GPUs) doesn't.

All we need to know is if Nvidia will be able to make Maxwell cards work as DX12 wants or if they will need to make another software workaround. Basically, we are on the dark for now.

I see. CAN they make a software workaround or are we just assuming they could?

Also, what about cards such as my 660?

Kezen · Sep 4, 2015

Diablos said:
Also, what about cards such as my 660?

I'm no expert but I don't foresee your 660 doing very well, and it's decidedly not great at compute at all.

Diablos · Sep 4, 2015

Kezen said:
I'm no expert but I don't foresee your 660 doing very well, and it's decidedly not great at compute at all.

I'm just wondering if it would ever be capable, period. Bear in mind I use it for 1080p gaming with med/high settings.

dr. apocalipsis · Sep 4, 2015

You can't expect much about a mid-range Kepler card concerning compute to start with.

Anyway, you will enjoy the baseline boost coming from DX12. Just not any of the new features.

Locuza · Sep 4, 2015

dr. apocalipsis said:
That slide is talking about DX12 features AND their benefits, the ability for Async Compute being a benefit, NOT a requisite. I was about to post that slide yesterday, but I thought devs word would suffice.

I really need help with your viewpoint.

Multi-Engine (Async compute/Async Copy) is like ExecuteIndirect an integral part of DX12.
It's not optional to support it or not, every vendor must support Multi-Engine.

tuxfool · Sep 4, 2015

Yikes

http://www.twitch.tv/thetechreport/v/14316298

go to 01:21:05 (roughly, hang around for a bit)

This seems to explain why Oculus is using AMD gpus on all their demos.

Easy_D · Sep 4, 2015

tuxfool said:
Yikes

http://www.twitch.tv/thetechreport/v/14316298

go to 01:28:40

What. That's when the video ends

tuxfool · Sep 4, 2015

Easy_D said:
What. That's when the video ends

Fixed it. Thanks

DouglasteR · Sep 4, 2015

tuxfool said:
Yikes

http://www.twitch.tv/thetechreport/v/14316298

go to 01:21:05

This seems to explain why Oculus is using AMD gpus on all their demos.

Well, they explicit said that their demos ran on the 980ti (Valve said that too)

tuxfool · Sep 4, 2015

DouglasteR said:
Well, they explicit said that their demos ran on the 980ti (Valve said that too)

They have previously stated that all their demo machines for recent events are running Fury cards. Of course it runs on the 980ti, just have to do more work to get it working nicely for VR purposes.

dr. apocalipsis · Sep 4, 2015

Locuza said:
I really need help with your viewpoint.

Multi-Engine (Async compute/Async Copy) is like ExecuteIndirect an integral part of DX12.
It's not optional to support it or not, every vendor must support Multi-Engine.

Multi-engine isn't Async compute/Async Copy.

It is the queue and priority structure that allows parallelization. Every DX11 card have it, every DX11 card will benefit from it on DX12. This is what allows Async Compute, but it is not limited to Async compute. This can, for example, allow a background texture streaming with a low level priority.

Arulan · Sep 4, 2015

derExperte said:
This is a gross misrepresentation of what happened. You pretty much went off the rails at the end and ruined your already flimsy reputation with posts like this. That infamous Crysis thread was a mess too thanks in large parts to you. Trying to put the blame solely on others is just pathetic and I urge everyone to be very careful about believing anything W!CKED says. The 'don't attack me, attack my arguments' shtick is just part of the act, many people tried to reason with him back in the day but he'll wiggle around, evade, delete parts of comments he can't counter and post more bullcrap until someone gets fed up. Too bad some are starting to fall for it again.

Oh, I remember. I encourage everyone to take his offer, look up his posts under W!CKED, you can still find the quotes, and see for yourself. This fabrication of persecution and unreasonable users who couldn't stand up to his arguments is hysterical.

W!CK!D said:
If you need a prove, try searching for any of old my old posts (username W!CKED). They're all deleted.

Locuza · Sep 4, 2015

dr. apocalipsis said:
Multi-engine isn't Async compute/Async Copy.

It is the queue and priority structure that allows parallelization. Every DX11 card have it, every DX11 card will benefit from it on DX12. This is what allows Async Compute, but it is not limited to Async compute. This can, for example, allow a background texture streaming with a low level priority.

Well, this sounds nicer to me.
It depends on the Hardware, how the commands will be executed.
But like you said, that's the thing which allows async compute/copy.

Async Copy should be possible to some extend under all three IHVs.
Intel mentioned that their DMA-Engine(s?) is/are quite slow and that the driver may decide how the GPU will execute the copy-queue.
The presentations rendering BF4 with Mantle also named a few restrictions for DMA-Queues, where they had a fallback if necessary back to the main-queue to address some issues.

AMD's GCN should eat Async-Compute well and execute effectively, Intel doesn't support the execution of graphics and compute at the same time, but stated the possibility of pre-emption.
Nvidia may also work like Intel, but with much higher pre-emption costs.

SeeNoWeevil · Sep 4, 2015

There's now no evidence Maxwell can't do async compute as the B3D test doesn't result in work going onto the compute queue at all.

Locuza · Sep 5, 2015

Kollock said:
We actually just chatted with Nvidia about Async Compute, indeed the driver hasn't fully implemented it yet, but it appeared like it was. We are working closely with them as they fully implement Async Compute. We'll keep everyone posted as we learn more.

Also, we are pleased that D3D12 support on Ashes should be functional on Intel hardware relatively soon, (actually, it's functional now it's just a matter of getting the right driver out to the public).

http://www.overclock.net/t/1569897/various-ashes-of-the-singularity-dx12-benchmarks/2130#post_24379702

Curious to see what "fully implement Async Compute" actually means for Nvidia.
And I'm also very interested in the Intel results.

Dictator93 · Sep 5, 2015

Locuza said:
http://www.overclock.net/t/1569897/various-ashes-of-the-singularity-dx12-benchmarks/2130#post_24379702

Curious to see what "fully implement Async Compute" actually means for Nvidia.
And I'm also very interested in the Intel results.

Well isn't that something.

Unknown Soldier · Sep 5, 2015

Dictator93 said:
Well isn't that something.

I guess the previous 8 pages of fanboy flamewars was all a waste after all :3

icecold1983 · Sep 5, 2015

Dictator93 said:
Well isn't that something.

Sounds like damage control to me

mephixto · Sep 5, 2015

I have a feeling this is gonna end like the salty nvidia thread and turn into a joke.

Turin Turambar · Sep 5, 2015

tuxfool said:
Yikes

http://www.twitch.tv/thetechreport/v/14316298

go to 01:21:05 (roughly, hang around for a bit)

This seems to explain why Oculus is using AMD gpus on all their demos.

Better on youtube
https://www.youtube.com/watch?v=tTVeZlwn9W8&feature=youtu.be&t=1h21m35s

Alej · Sep 5, 2015

This thread is the fucking example why we can't have any real discussion about consoles capabilities here on GAF. Any positive for consoles is nucked to death by a very vocal minority who distributes personal attacks to anyone talking positively about console architecture.

I mean, seeing DerExperte spreading FUD about users talking here, despite those claims can be returned to him the same way, what the fuck am I reading really. It's gameFAQs level of insanity and it's really burning me down.

I walked by NeoGAF in 2007 coming from B3D. I was mostly interested in console architecture back in the day and games in general. I was frankly a PC gamer at this time and my PS2 was collecting dust. I remember the big flamewars on this board about 360 and PS3, that was very cringeworthy to read. I made an account but failed to see a reason about participating.

This thread proves me I was so right about this. That was a good time to talk about async compute because at this time we don't really know the fuck why PS4 was so designed around that feature. And the OP talks about having performance advantages while using it on consoles. I never implied that async compute would save the consoles or anything like that, or even making PS4 an highend PC or something. Never i was saying that, and i never will. I just can't have a serious discussion about this facet of the OP, some guys are totally allergic to that.

And yet, like that pink panther avatar guy, in every discussion I come I always find me flagged as a console warrior that shouldn't be trusted or something. I mean, seriously? Maybe i'm naive or very enthusiast about my PS4, i don't care. But why attacking me? Why do I represent something to be defended from. Seriously? There is no tech discussion to have here about consoles. I won't regret this shit bath.

I'll play some CIV games instead. (*wink* at the mod banning me last year while saying "go back playing your fantastic PS4 instead" while I was answering to a bunch of angry guys why I like to play on PS4 because reasons in a thread whining about why people like consoles).

SeeNoWeevil · Sep 5, 2015

Did anyone actually get rid of their green gpu on the back of this?!

Dictator93 · Sep 5, 2015

icecold1983 said:
Sounds like damage control to me

Why do you interpret a statement like that as damage control? It wasn't towards a comunity or towards a journalist, but towards a developer and he was just relaying it to us.

SeeNoWeevil said:
Did anyone actually get rid of their green gpu on the back of this?!

If someone did, I would laugh heartidly. Nothing out of this entire situation has pointed toward NV GPU's being difficient (Maxwell 2) at all.

Alej said:
This thread is the fucking example why we can't have any real discussion about consoles capabilities here on GAF. Any positive for consoles is nucked to death by a very vocal minority who distributes personal attacks to anyone talking positively about console architecture.

I mean, seeing DerExperte spreading FUD about users talking here, despite those claims can be returned to him the same way, what the fuck am I reading really. It's gameFAQs level of insanity and it's really burning me down.

I walked by NeoGAF in 2007 coming from B3D. I was mostly interested in console architecture back in the day and games in general. I was frankly a PC gamer at this time and my PS2 was collecting dust. I remember the big flamewars on this board about 360 and PS3, that was very cringeworthy to read. I made an account but failed to see a reason about participating.

This thread proves me I was so right about this. That was a good time to talk about async compute because at this time we don't really know the fuck why PS4 was so designed around that feature. And the OP talks about having performance advantages while using it on consoles. I never implied that async compute would save the consoles or anything like that, or even making PS4 an highend PC or something. Never i was saying that, and i never will. I just can't have a serious discussion about this facet of the OP, some guys are totally allergic to that.

And yet, like that pink panther avatar guy, in every discussion I come I always find me flagged as a console warrior that shouldn't be trusted or something. I mean, seriously? Maybe i'm naive or very enthusiast about my PS4, i don't care. But why attacking me? Why do I represent something to be defended from. Seriously? There is no tech discussion to have here about consoles. I won't regret this shit bath.

I'll play some CIV games instead. (*wink* at the mod banning me last year while saying "go back playing your fantastic PS4 instead" while I was answering to a bunch of angry guys why I like to play on PS4 because reasons in a thread whining about why people like consoles).

Sorry you feel that way.

JimPanzer · Sep 5, 2015

SeeNoWeevil said:
Did anyone actually get rid of their green gpu on the back of this?!

I just bought a second GTX970. I've read some of the posts in this thread previously. Tbh I don't think I understand the whole situtation, but it doesn't sound that severe to me.

Can anyone link me to a post / article without too much technical gibberish?

KKRT00 · Sep 5, 2015

JimPanzer said:
I just bought a second GTX970. I've read some of the posts in this thread previously. Tbh I don't think I understand the whole situtation, but it doesn't sound that severe to me.

Can anyone link me to a post / article without too much technical gibberish?

Conclusion is that we dont have finalized drivers, so current async is not working as intended on Nvidia GPUs for developers.

Will Nvidia GPUs be penalized by async in comparison to similarly speced AMD GPUs in future is yet to be seen, but for now, we dont have any info except for a fact that Nvidia said that final drivers that enables fully async are not ready.

icecold1983 · Sep 5, 2015

Dictator93 said:
Why do you interpret a statement like that as damage control? It wasn't towards a comunity or towards a journalist, but towards a developer and he was just relaying it to us.

If someone did, I would laugh heartidly. Nothing out of this entire situation has pointed toward NV GPU's being difficient (Maxwell 2) at all.

Sorry you feel that way.

i just suspect nvidia gpus are indeed bad at async compute based on everything ive read and nvidias history. i obv didnt sell my nvidia card, but should nv fare poorly in dx12 games ill just switch to amd on the next wave of gpus.

Kinthalis · Sep 5, 2015

I doubt we'll be seeing many games implementing much async compute on PC for a while, unless AMD is willing to spend money and help developers implement it in games on their PC releases.

Odds are it will continue ot be Nvidia who does the leg work and the investment in their own tools.

Time will tell I guess.

SeeNoWeevil · Sep 5, 2015

The only benchmark for DX12, for a game which uses a 'moderate' amount of async compute, performs as well on Nvidia's flagship gpu which has busted async implementation, as AMD's flagship gpu. Which is awesome at async compute.

Does not compute.

Support NeoGAF

Oxide: Nvidia GPU's do not support DX12 Asynchronous Compute/Shaders.

Member

Banned

Banned

Member

Member

Mckmaster uses MasterCard to buy Slave drives

Banned

Member

Member

Banned

I am Korean.

Banned

Member

Member

Member

Member

Member

Member

Member

Member

Banned

Banned

Member

Banned

Member

Banned

Member

Banned

never left the stone age

Banned

Member

Banned

Banned

Member

Member

Member

Member

Member

Member

Member

Banned

Member

Banned

Member

Member

Member

Member

Member

Banned

Member

Similar threads