That's just not true at all.
A. All GPUs have been natively multithreaded since the early days of programmable shaders.
That was an autocorrect; it was supposed to say hyperthreaded.
B. Both GCN and Pascal allow concurrent execution of different contexts. The difference is that GCN can mix threads of different contexts in flight on one CU (aka SM), while Pascal can mix them only between SMs (aka CUs).
They both allow a core to receive instructions as soon as a task is finished, but a Pascal core can do only graphics or compute at any given time, while a GCN core can do both at once. So a task that requires both needs two Pascal cores to complete, while one GCN core can handle both.
C. What Pascal does requires h/w support. Otherwise they'd do it on Kepler and up.
What Pascal does did need hardware support, but that doesn't mean it's full asynchronous compute integration; it's just a half step.
D. Async compute running serially doesn't impact performance at all, as can be seen from most Maxwell benchmarks. It doesn't provide any performance boost, yes, but it doesn't make things slower.
It makes things slower compared to cores that fully integrate async compute; otherwise we wouldn't be having this conversation.
E. What you still seem not to get is that there's no "proper" way of handling this execution in h/w. Theoretically speaking, it would be great to have Pascal support the same context-agnostic execution of threads on its SMs, but in practice there's no such thing as a free feature. Just as an example: what if implementing GCN-style async execution resulted in Pascal losing its frequency advantage over GCN? Would a ~1-5% performance gain from such an implementation be a good trade-off for losing ~25-33% of the clocks? A rhetorical question.
So you agree that Nvidia released a poorly designed DX12 architecture in order to increase theoretical performance over actual capability to run code fully optimized for DX12? One of the greatest advantages of async compute is the ability to hyperthread graphics and compute on the same core. Nvidia chose not to include that, and the comparison of 4-year-old GCN against modern Pascal shows the actual performance that trade-off cost them. If you can't see that, you're in denial.
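
For reference, the arithmetic behind the rhetorical question in point E can be sketched roughly as below. This assumes performance scales linearly with clock speed, which is a simplification. The function name `net_performance` is made up for illustration, and the percentages are the hypothetical figures from point E, not benchmark results.

```python
def net_performance(async_gain: float, clock_loss: float) -> float:
    """Relative performance after trading clock speed for an async-compute gain.

    Assumption: performance scales linearly with clock frequency.
    1.0 means "same as before the trade".
    """
    return (1.0 + async_gain) * (1.0 - clock_loss)


# Hypothetical ranges quoted in point E: ~1-5% gain, ~25-33% clock loss.
best_case = net_performance(async_gain=0.05, clock_loss=0.25)
worst_case = net_performance(async_gain=0.01, clock_loss=0.33)

print(f"best case:  {best_case:.2f}x")   # about 0.79x
print(f"worst case: {worst_case:.2f}x")  # about 0.68x
```

Under that linear assumption, even the most favorable pairing of those hypothetical numbers comes out below 1.0. Whether the real trade-off looks anything like those figures is exactly what is being disputed here.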