When we tested our first code on Wii U, we were amazed how much we could throw at it without any slowdowns, even though at that time we had zero optimizations. The performance problem of hardware nowadays is not clock speed but RAM latency. Fortunately, Nintendo took great efforts to ensure developers can really work around that typical bottleneck on Wii U. They put a lot of thought into how the CPU, GPU, caches and memory controllers work together to amplify your code speed. For instance, with only some tiny changes we were able to optimize certain heavy-load parts of the rendering pipeline to 6x the original speed, and that was even without using any of the extra cores.
......
We didn’t have such problems (about CPU power). The CPU and GPU are a good match. As said before, today’s hardware has bottlenecks with memory throughput when you don’t care about your coding style and data layout. This is true for any hardware and can’t be cured just by throwing more megahertz and cores at it. Fortunately, Nintendo made very wise choices for cache layout, RAM latency and RAM size to work against these pitfalls. Nintendo also took care that other components, like the Wii U GamePad screen streaming or the built-in camera, don’t put a burden on the CPU or GPU.
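To illustrate the data layout point, here is a minimal C++ sketch of one common cache-friendly technique: storing each field in its own contiguous array (structure of arrays) instead of packing everything into one large struct per object. The interview does not say which techniques were actually used in the game, so the particle example and every name in it (ParticleAoS, ParticlesSoA, integrate_x, sum_x) are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical "array of structures" layout: a loop that only needs x and vx
// still pulls y, z, color and lifetime through the cache with every element.
struct ParticleAoS {
    float x, y, z;
    float vx, vy, vz;
    float color[4];
    float lifetime;
};

// Hypothetical "structure of arrays" layout: each field is stored
// contiguously, so a loop over x and vx touches only the bytes it uses.
struct ParticlesSoA {
    std::vector<float> x, y, z;
    std::vector<float> vx, vy, vz;
};

// Advance the x coordinate of every particle by one time step.
// Both arrays are read linearly, which suits hardware prefetching.
void integrate_x(ParticlesSoA& ps, float dt) {
    assert(ps.x.size() == ps.vx.size());
    for (std::size_t i = 0; i < ps.x.size(); ++i) {
        ps.x[i] += ps.vx[i] * dt;
    }
}

// Sum all x coordinates; only the x array is read.
float sum_x(const ParticlesSoA& ps) {
    float total = 0.0f;
    for (float value : ps.x) {
        total += value;
    }
    return total;
}
```

How much a change like this helps depends on the access pattern and the cache sizes of the target hardware, so it is something to measure rather than assume.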
......
Nano Assault Neo needs only a fraction of the memory, even when all assets are unpacked and processed, so we use all remaining memory as a cache. As a result, loading times are nearly zero after a short while. It feels like playing from a SNES ROM.
etc.
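The idea of spending spare memory on a cache of processed assets can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the game's actual code: the AssetCache class, its byte budget, and the loader callback are all hypothetical, and a real engine would likely add eviction and avoid copying the data on every lookup.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using Bytes = std::vector<unsigned char>;

// Hypothetical cache that keeps already unpacked and processed assets in
// otherwise unused memory, so a repeated request skips the slow load path.
class AssetCache {
public:
    using Loader = std::function<Bytes(const std::string&)>;

    AssetCache(std::size_t budget_bytes, Loader loader)
        : budget_(budget_bytes), loader_(std::move(loader)) {
        assert(loader_ != nullptr);
    }

    // Return the asset, invoking the (slow) loader only on a cache miss.
    // Assets that do not fit into the remaining budget are returned
    // without being stored.
    Bytes get(const std::string& name) {
        auto it = cache_.find(name);
        if (it != cache_.end()) {
            return it->second;  // hit: no loading needed
        }
        Bytes data = loader_(name);
        if (used_ + data.size() <= budget_) {
            used_ += data.size();
            cache_.emplace(name, data);
        }
        return data;
    }

    // Number of bytes currently held by cached assets.
    std::size_t used() const { return used_; }

private:
    std::size_t budget_;
    std::size_t used_ = 0;
    Loader loader_;
    std::unordered_map<std::string, Bytes> cache_;
};
```

With a budget close to the free memory left after the game itself is loaded, most assets would be served from the cache after their first use, which matches the near-zero loading times described above.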