The GCN architecture in PS4/XB1 is about 30% faster per flop than the R700 architecture used in Wii U, and Pascal is ~40% faster than Polaris, which is more or less the same as GCN flop for flop. It's a ROUGH estimate, but that puts Pascal at about 80% faster than Wii U, flop for flop (1.3 × 1.4 ≈ 1.82). (X1 is Maxwell based, and again there isn't much difference in performance per flop between Maxwell and Pascal.)
For illustration purposes, X1 is 512 GFLOPs. Once you count that per-flop advantage, this is over 5 times faster than Wii U's 176 GFLOPs (nearly 3 times faster than the 352 GFLOPs it is sometimes mistakenly credited with) and somewhere around 55 to 60% of XB1's performance.
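To make the "flop to flop" comparison above concrete, here is a quick sketch of the arithmetic. The per-flop factors are the rough estimates from this post, not measurements, and the 1310 GFLOPs figure for XB1 is the commonly cited number, which I'm assuming here since it isn't stated above. The function name is just for illustration.

```python
# Rough per-flop efficiency factors quoted in the post (estimates, not benchmarks).
GCN_OVER_R700 = 1.30    # GCN vs. Wii U's R700
PASCAL_OVER_GCN = 1.40  # Pascal/Maxwell vs. GCN/Polaris

# Peak FP32 throughput in GFLOPs.
WII_U_GFLOPS = 176.0
X1_GFLOPS = 512.0
XB1_GFLOPS = 1310.0     # assumption: commonly cited figure, not from the post


def effective_ratio(gflops_a, gflops_b, per_flop_advantage):
    """How much faster chip A is than chip B once A's per-flop advantage is applied."""
    return gflops_a * per_flop_advantage / gflops_b


# Chain the two factors to compare Pascal/Maxwell against R700 directly.
pascal_over_r700 = GCN_OVER_R700 * PASCAL_OVER_GCN

x1_vs_wii_u = effective_ratio(X1_GFLOPS, WII_U_GFLOPS, pascal_over_r700)
x1_vs_xb1 = effective_ratio(X1_GFLOPS, XB1_GFLOPS, PASCAL_OVER_GCN)

print(f"Pascal vs R700 per flop: {pascal_over_r700:.2f}x")
print(f"X1 vs Wii U (effective): {x1_vs_wii_u:.1f}x")
print(f"X1 vs XB1 (effective):   {x1_vs_xb1:.0%}")
```

With these inputs the X1 lands a bit over 5x the Wii U and a bit over half of XB1, which is where the numbers above come from.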
If NX is using Pascal (I don't see Nvidia pushing out another Maxwell chip tbh), it can hit much higher clocks when docked. Even in the same configuration as X1 (256 CUDA cores), if it reaches 1.5 GHz or 1.6 GHz you'll have 768 to 819 GFLOPs, which with the +40% per-flop advantage is equivalent to about 1.08 to 1.15 TFLOPs of GCN, slightly under XB1. If the Pascal chip is 3 SM (384 CUDA cores), it would be completely possible to hit 1.228 TFLOPs at 1.6 GHz, which with the same +40% is equivalent to about 1.7 TFLOPs, or just under PS4's 1.843 TFLOPs.
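The clock-speed numbers above follow from the usual peak-FP32 formula (cores × 2 flops per clock × clock speed). A minimal sketch, assuming 128 CUDA cores per SM (which is what the 256-core and 3 SM figures in this post imply) and treating the +40% as a simple multiplier:

```python
CORES_PER_SM = 128      # assumption: implied by 2 SM = 256 cores in the post
PASCAL_OVER_GCN = 1.40  # rough per-flop advantage quoted in the post


def peak_gflops(cuda_cores, clock_ghz):
    """Peak FP32 GFLOPs: each core does one fused multiply-add (2 flops) per clock."""
    return cuda_cores * 2 * clock_ghz


def gcn_equivalent_gflops(cuda_cores, clock_ghz):
    """Peak GFLOPs scaled by the post's rough per-flop advantage over GCN."""
    return peak_gflops(cuda_cores, clock_ghz) * PASCAL_OVER_GCN


# Hypothetical docked configurations discussed in the post.
configs = [
    ("2 SM @ 1.5 GHz", 2 * CORES_PER_SM, 1.5),
    ("2 SM @ 1.6 GHz", 2 * CORES_PER_SM, 1.6),
    ("3 SM @ 1.6 GHz", 3 * CORES_PER_SM, 1.6),
]

for label, cores, clock in configs:
    raw = peak_gflops(cores, clock)
    adjusted = gcn_equivalent_gflops(cores, clock)
    print(f"{label}: {raw:.0f} GFLOPs raw, ~{adjusted:.0f} GFLOPs GCN-equivalent")
```

Plug in other clocks or SM counts to see where a given configuration would land relative to XB1 and PS4.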
In order to reach those clocks, NX will need a fan in the body of the device, but that fan can stay off while on the go, with the GPU downclocked to 1/4 of its docked clock, which is perfect for 540p. It is also possible to go with 1/2 of its docked clock and hit 720p, at the cost of terrible battery life. The dock can also offer a blower to help move air through the device, as long as the vents allow it and the passive cooler fins are designed to be cooled from the side. An active cooler inside a device like this would be very interesting, as it could bridge the gap between handheld and home console.
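The 1/4 and 1/2 clock figures line up with pixel counts if you assume the docked target is 1080p and that GPU load scales roughly linearly with pixels rendered. Both of those are my assumptions for this sketch, not something established above:

```python
# Common 16:9 render resolutions (width, height).
RESOLUTIONS = {
    "540p": (960, 540),
    "720p": (1280, 720),
    "1080p": (1920, 1080),
}


def pixel_fraction(name, reference="1080p"):
    """Fraction of the reference resolution's pixels that `name` has to render."""
    width, height = RESOLUTIONS[name]
    ref_width, ref_height = RESOLUTIONS[reference]
    return (width * height) / (ref_width * ref_height)


# Assumption: docked mode targets 1080p, so portable clocks are compared against it.
for name in ("540p", "720p"):
    print(f"{name}: {pixel_fraction(name):.0%} of the pixels of 1080p")
```

540p is exactly a quarter of the pixels of 1080p, so a quarter of the docked clock fits neatly, while 720p needs a bit under half.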