Theoretical is the key word. The Xbox 360 has a shader efficiency of around 60%, and its shader array is actually limited to only 217 GFLOPS iirc, which means you can only extract about 130 GFLOPS per frame, at least without extreme care around GPU bottlenecks like low-level cache behavior and thread/wavefront allocation.
VLIW5 in the Wii U is capable of about 80% efficiency, so its 176 GFLOPS only yields about 140 GFLOPS per frame. Again, you can painstakingly extract a bit more, but threads flowing through the architecture will hit cases where only 4 of the 5 available stream processors in a VLIW bundle can process data.
The Maxwell architecture is over 96% efficient, so those 196 GFLOPS in handheld mode are effectively about 188 GFLOPS, or 377 GFLOPS when docked. Since 2005 these processors have also gotten better at handling effects efficiently, so you shouldn't directly compare GFLOPS numbers; it just doesn't work that way. This is why a given number of Nvidia GFLOPS will generally perform about 4/3 as well as the same number of GCN GFLOPS. That comparison is without heavy async compute engine utilization, which has only recently become widespread and hasn't been a focus of Nvidia's drivers, so there is still ground to be gained back there.
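Here's a quick back-of-the-envelope sketch of the math above. The theoretical GFLOPS and efficiency figures are the estimates from this post (the docked number is implied by the 377 figure), not official specs:

```python
# Effective per-frame throughput = theoretical GFLOPS * assumed shader efficiency.
# All numbers are this post's estimates, not vendor-published specs.
gpus = {
    "Xbox 360 (Xenos)":          {"theoretical": 217, "efficiency": 0.60},
    "Wii U (VLIW5)":             {"theoretical": 176, "efficiency": 0.80},
    "Switch handheld (Maxwell)": {"theoretical": 196, "efficiency": 0.96},
    "Switch docked (Maxwell)":   {"theoretical": 393, "efficiency": 0.96},  # ~2x handheld clock
}

for name, gpu in gpus.items():
    effective = gpu["theoretical"] * gpu["efficiency"]
    print(f"{name:28s} ~{effective:4.0f} GFLOPS effective")
```

Run it and you get roughly 130, 141, 188, and 377 GFLOPS respectively, which is where the comparison above comes from.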
We also haven't touched on FP16. Many graphics operations don't need 32-bit values; it varies on a game-by-game basis, but with some developers reporting around 70% FP16 utilization in their own games, it is safe to say it can certainly gain further ground as developers get more and more used to writing code with 16-bit floating-point values in mind.
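As a rough illustration of why half precision is often good enough (this is a hypothetical NumPy sketch, not how any particular engine does it): normalized color/lighting values in the 0..1 range lose very little when stored and processed as 16-bit floats, while taking half the space and bandwidth.

```python
import numpy as np

# One million RGB samples in fp32 as "ground truth", then round-tripped through fp16.
rng = np.random.default_rng(0)
colors32 = rng.random((1_000_000, 3), dtype=np.float32)
colors16 = colors32.astype(np.float16)

max_err = np.abs(colors32 - colors16.astype(np.float32)).max()
print(f"max per-channel error after fp16 round-trip: {max_err:.6f}")
print(f"storage: {colors32.nbytes // 2**20} MiB fp32 vs {colors16.nbytes // 2**20} MiB fp16")
```

The worst-case error comes out around 0.0002 per channel, well below what shows up on screen, which is why shaving operations down to FP16 where possible is basically free image-quality-wise and pays off in throughput.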