Half precision has been a thing in GPUs since DX9, but for one reason or another it never took off the way it is taking off now; it has mostly been hidden from the public eye.
FP16 was used for pixel shaders, but vertex shaders were always FP32 (vertices really need the increased accuracy to prevent artifacts). Back then, we had separate pixel/vertex shader pipelines, so different precision made sense. ATi (AMD) had FP24 for pixel shaders, which was a good compromise between FP16 and FP32.
When the industry migrated from DX9 (discrete shader pipelines) to DX10 (unified shaders), engineers had to adopt FP32 universally because of the vertex shader requirements for high accuracy.
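To put a number on the "artifacts" part: FP16 only has an 11-bit significand, so whole numbers above 2048 can no longer be represented exactly and nearby vertex positions start collapsing onto the same value (cracks, wobbling geometry). Here's a tiny CUDA sketch of that round-trip, purely my own illustration and not from any particular engine:

```
// Tiny sketch: round-trip a few coordinates through FP16 the way a
// half-precision shader would. With an 11-bit significand, 2049 is no
// longer exactly representable and comes back as 2048 (2051 as 2052).
#include <cstdio>
#include <cuda_fp16.h>
#include <cuda_runtime.h>

__global__ void roundtrip_fp16()
{
    const float vals[] = { 2048.0f, 2049.0f, 2050.0f, 2051.0f };
    for (float v : vals) {
        float back = __half2float(__float2half(v));   // FP32 -> FP16 -> FP32
        printf("%6.1f -> FP16 -> %6.1f\n", v, back);
    }
}

int main()
{
    roundtrip_fp16<<<1, 1>>>();
    cudaDeviceSynchronize();   // flush device-side printf
    return 0;
}
```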
Nowadays we have a new type of shader: Compute Shaders. This type of shader doesn't necessarily require 32-bit precision (e.g. AI pathfinding). nVidia even supports INT8 (8-bit integer) precision in HPC for deep neural network applications. People who compare this with the GeForce FX fiasco are really ignorant, since Compute Shaders didn't exist back then; GPUs were merely toys for 3D graphics processing/video games. Not anymore, though (links below, plus a quick CUDA sketch after them):
https://www.theregister.co.uk/2016/09/13/nvidia_p4_p40_gpu_ai/
https://petewarden.com/2015/05/23/why-are-eight-bits-enough-for-deep-neural-networks/
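To make the INT8 point a bit more concrete, here is a minimal CUDA sketch (my own illustration, not taken from the linked articles) of the kind of packed 8-bit dot product those inference cards are built around, using the Pascal-era __dp4a intrinsic: four 8-bit multiplies accumulated into a 32-bit integer per instruction.

```
// Minimal sketch of an INT8 dot product with __dp4a (compute capability
// 6.1+, e.g. the Tesla P4/P40 from the article):  nvcc -arch=sm_61 demo.cu
#include <cstdio>
#include <cuda_runtime.h>

__global__ void dot_int8_demo()
{
    int acc = 0;
    // 0x01010101 packs four int8 values of 1; 0x02020202 packs four 2s.
    for (int i = 0; i < 256; ++i)
        acc = __dp4a(0x01010101, 0x02020202, acc);   // 4 MACs per call
    printf("dot = %d (expected 256 * 4 * 1 * 2 = 2048)\n", acc);
}

int main()
{
    dot_int8_demo<<<1, 1>>>();
    cudaDeviceSynchronize();
    return 0;
}
```

Weights and activations get quantized from FP32 down to INT8 beforehand, while the accumulator stays 32-bit, which is why the accuracy loss is usually tolerable for inference.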
Mixed FP16/FP32 coding will become the norm going forward, since next-gen consoles (PS5/XB2) are going to support it as a baseline feature (most likely with a Navi GPU). RPM (Rapid Packed Math), CBR (checkerboard rendering) and Ryzen should give us more 4K60 games. Good times ahead.
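For reference, the "2xFP16" idea behind RPM looks roughly like this on the CUDA side, using the half2 intrinsics from cuda_fp16.h. This is just a sketch of the packed-math concept, not how any console SDK actually exposes it:

```
// Minimal sketch of packed FP16 math: one __hfma2 instruction performs
// two FP16 fused multiply-adds, which is where the "double rate" comes
// from. Needs compute capability 5.3+:  nvcc -arch=sm_53 demo.cu
#include <cstdio>
#include <cuda_fp16.h>
#include <cuda_runtime.h>

__global__ void fp16x2_demo()
{
    __half2 x = __floats2half2_rn(1.0f, 2.0f);   // pack (1.0, 2.0)
    __half2 y = __floats2half2_rn(0.5f, 0.5f);   // pack (0.5, 0.5)
    __half2 a = __float2half2_rn(3.0f);          // broadcast 3.0 to both lanes

    __half2 r = __hfma2(a, x, y);                // (a*x + y) on both lanes at once

    // Expect (3*1 + 0.5, 3*2 + 0.5) = (3.5, 6.5).
    printf("r = (%.1f, %.1f)\n", __low2float(r), __high2float(r));
}

int main()
{
    fp16x2_demo<<<1, 1>>>();
    cudaDeviceSynchronize();
    return 0;
}
```

The catch, as always, is that shaders have to be written (or rewritten) with the lower precision in mind, which is why engine support matters so much.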
One could argue that PS4 Pro is beta testing "experimental" technologies for the PS5, just like PS3 (Cell SPUs) was beta testing for Compute Shaders... people may mock these technologies all they want, until they become mainstream enough.
Last but not least, don't forget that Moore's Law has slowed down quite a bit. Some people expected FP64 to replace FP32 in consumer GPUs, but that would either halve performance or require roughly double the transistors for the same performance. That's why FP16 and even INT8 are becoming a thing. Mobile GPU design is another factor to consider.
~
Regarding the Switch (Tegra X1), does anyone know if any Nintendo exclusives (Zelda BotW, MK8 Deluxe, Splatoon 2) utilize RPM/2xFP16? 3rd-party devs are even less likely to use it, but who knows... it's an interesting feature nonetheless.
When it comes to PS4 exclusives, I haven't heard of ND or GG utilizing it in current games. Maybe in the future.
There have always been games that implemented new technology early, often with the manufacturer's help, as a way to demonstrate the new tech. It happened with 3DNow! many years ago.
That doesn't mean much on its own, though. It would be something if it were implemented in all major engines. But as with asynchronous compute, which is even mentioned in this article, we will have to see:
a) if it will be implemented
b) when it will be implemented
c) what the exact benefits will be
Remember when nVidia released the GeForce 256 (very expensive back then) and everyone said that hardware T&L was a "useless gimmick"? Most people had 3dfx graphics cards and thought that a fast enough CPU would suffice for vertex transformation. T&L/DX7-ready games took a while to become mainstream... and look where we are now.
People will always be skeptical when a paradigm shift disrupts the status quo.