The 1 Tflop figure is only for half precision (FP16). You can't use half precision on everything; most tests show about a 10-20% overall uplift from it. Unfortunately, "tech" sites that barely understand this stuff regularly compare the 1 Tflop directly to the XBO/PS4 numbers, which creates this misunderstanding.
Switch is indeed roughly 196 Gflops in handheld mode and roughly 393 Gflops docked, using full precision (FP32), which is what everyone else uses for comparisons.
Just think of it this way: 1 Tflop would put it barely below the OG Xbox One, on a 20nm node that is only one step past the 28nm node the XBO launched on. The XBO draws 137 watts, while the Switch draws something under 15. There was no way it was barely below the XBO.
The math on it is shader cores * operations per core per clock (2, since a fused multiply-add counts as two) * clock speed. TX1 has 256 shaders, and Switch clocks are 768 MHz docked and 384 MHz undocked, so you can go from there.
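If you want to run the numbers yourself, here is a quick sketch of that formula in Python. The function name `gflops` and the constant `TX1_SHADERS` are just names made up for this example; the shader count and clocks are the ones quoted above.

```python
def gflops(shader_cores, clock_mhz, ops_per_clock=2):
    """Peak FP32 throughput in Gflops: cores * ops per clock * clock speed."""
    # cores * ops * MHz gives Mflops; divide by 1000 to get Gflops.
    return shader_cores * ops_per_clock * clock_mhz / 1000.0

TX1_SHADERS = 256  # hypothetical constant name; value from the post

docked = gflops(TX1_SHADERS, 768)    # 768 MHz docked
handheld = gflops(TX1_SHADERS, 384)  # 384 MHz undocked

print(f"docked:   {docked:.1f} Gflops FP32")
print(f"handheld: {handheld:.1f} Gflops FP32")
```

This is theoretical peak only, so real workloads will land below it, but it is the same yardstick used for the XBO/PS4 figures.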