Please stop spreading FUD.
Err
You're more likely to be limited by the horsepower of the Xbone/PS4 than by the separate memory pools and types PCs have.
Right, because PCs are even more powerful than physics. ><
Perhaps you should explain how you think hUMA works, and we can explain where you've got it wrong.
Because DX11 doesn't support it, and because DX12 won't be relevant in the market for a couple of years yet? I don't know, just putting the question out there to those who do...
So, async isn't really important because we're gonna keep ignoring it a while longer? lol
If Maxwell is already being well utilized. Obviously every bit helps though, as mentioned earlier in the thread.
Well, it's not being well utilized. That's the problem async compute aims to solve, actually. You know a GPU isn't homogeneous inside, right? Within the GPU there are various types of processors, specialized for different kinds of math. Throughout the rendering process, different groups of processors will sit idle at various times, because the math they know how to do isn't needed on that particular cycle; they're waiting for their neighbors to finish preparing the data they need.
Async compute can improve utilization by assigning additional jobs to those idle processors. This is illustrated in the diagrams DieH@rd posted earlier, where anything black represents processors going unused.
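To ground that a bit: here's a rough sketch of what this looks like from the API side under DX12 (a minimal sketch, not anyone's actual engine code; assumes Windows and linking d3d12.lib, and obviously skips the rest of the renderer). The whole trick is that compute work gets its own queue beside the graphics queue, so the driver and hardware are allowed to overlap the two instead of serializing everything behind the draws.

```cpp
#include <d3d12.h>
#include <wrl/client.h>
#pragma comment(lib, "d3d12.lib")

using Microsoft::WRL::ComPtr;

int main() {
    // Create a device on the default adapter.
    ComPtr<ID3D12Device> device;
    if (FAILED(D3D12CreateDevice(nullptr, D3D_FEATURE_LEVEL_11_0,
                                 IID_PPV_ARGS(&device))))
        return 1;

    // The usual graphics ("direct") queue: draws, copies, and compute
    // can all funnel through here, one after another.
    D3D12_COMMAND_QUEUE_DESC gfxDesc = {};
    gfxDesc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;
    ComPtr<ID3D12CommandQueue> gfxQueue;
    device->CreateCommandQueue(&gfxDesc, IID_PPV_ARGS(&gfxQueue));

    // A separate compute queue. Work submitted here is *allowed* to run
    // concurrently with rendering on the direct queue; whether it truly
    // fills idle units (GCN) or gets time-sliced via context switches
    // is up to the GPU underneath.
    D3D12_COMMAND_QUEUE_DESC computeDesc = {};
    computeDesc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE;
    ComPtr<ID3D12CommandQueue> computeQueue;
    device->CreateCommandQueue(&computeDesc, IID_PPV_ARGS(&computeQueue));

    return 0;
}
```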
Plus, the GPU is simply better at some workloads than the CPU is, so having the GPU do them instead is a win, especially if you can do it without interrupting the rendering. Maxwell 2 finally adds a 32-queue compute scheduler, but it's sounding like it's still not truly asynchronous: the GPU rapidly switches contexts between render and compute. So you can switch between job types quickly, but you can't actually run them simultaneously, using idle transistors while the rendering goes on around them. Compare that with the eight 8-queue schedulers on GCN, which are fully independent of the render scheduling and basically have free access to any transistors not actively being used for rendering.
Oh, and another advantage of a truly asynchronous system is that once the CPU dispatches a job to the GPU, it doesn't need to sit there doing nothing while it waits for the result. As a crude example, the CPU can ask the GPU what actor1 can see. Instead of simply waiting for the result, the CPU can then ask what actor2 can see, and actor3, and so on. As the results start coming back from the GPU, the CPU can decide what action actor1 will take based on what he can see.
GPUs are good at ray-casting, and CPUs are good at branchy decision-making, so the flexibility async provides again lets us use our resources more efficiently. Running strictly on a CPU, a typical AI routine would spend 90% of its cycles determining perception and pathfinding for the actor. So if you normally spend 10ms on AI, 9ms of that goes to perception and pathfinding. The GPU may be able to do that part of the job in 2ms, so you get your results a lot faster this way, but perhaps more importantly, you've freed up 9ms on the CPU, giving it time to work on other tasks.
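The CPU-side pattern looks something like this sketch. To keep it self-contained I'm using std::async as a stand-in for a GPU dispatch, and queryVisibility() is a made-up placeholder for the ray-cast batch; in a real engine you'd submit to a compute queue and wait on a fence instead. The shape is the same either way: fire off all the queries, do other work, then make the branchy decisions as results arrive.

```cpp
#include <future>
#include <iostream>
#include <vector>

// Stand-in for a GPU ray-cast batch: which actors can this actor see?
// (Hypothetical function -- on real hardware this would be a compute
// dispatch on the GPU, not a CPU thread.)
std::vector<int> queryVisibility(int actorId) {
    return {actorId + 1, actorId + 2};  // dummy result
}

int main() {
    const int actorCount = 3;
    std::vector<std::future<std::vector<int>>> pending;

    // Dispatch all the perception queries up front -- don't block on any.
    for (int id = 0; id < actorCount; ++id)
        pending.push_back(std::async(std::launch::async, queryVisibility, id));

    // ...the CPU is free to do other work here while results cook...

    // Collect results as they come back and do the decision-making,
    // which is the branchy part CPUs are actually good at.
    for (int id = 0; id < actorCount; ++id) {
        std::vector<int> visible = pending[id].get();
        std::cout << "actor" << id << " sees " << visible.size()
                  << " actors; pick an action based on that\n";
    }
}
```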
So no matter how powerful your system is, async compute just makes it that much more powerful. hUMA makes it more powerful still, because the shared memory increases your opportunities to leverage GPGPU overall.