Not the one I had in mind (that was an older demo), but this works too. It even shows the latency for a bigger scene than just the helmet.
They have. It's even a requirement of NTC and one of its "drawbacks": since it does not use hardware filtering, it absolutely needs STF (stochastic texture filtering). NTC gives one texel per decode, so trilinear or aniso would be too expensive.
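To make the "one decode per sample" point concrete, here's a rough Python sketch of the idea behind stochastic filtering for the bilinear case. It is not the NTC or STF implementation: `decode` is a hypothetical stand-in for one network inference returning one texel, and the function names are mine. Instead of decoding all four neighbours and blending, you pick one neighbour with probability equal to its bilinear weight, so the average over many samples converges to the filtered value.

```python
import math
import random

def bilinear_footprint(u, v, width, height):
    """Return the four (x, y, weight) taps of a bilinear lookup at UV in [0, 1]."""
    # Shift by half a texel so texel centers sit at integer + 0.5.
    x = u * width - 0.5
    y = v * height - 0.5
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0
    taps = []
    for dy, wy in ((0, 1.0 - fy), (1, fy)):
        for dx, wx in ((0, 1.0 - fx), (1, fx)):
            # Clamp-to-edge addressing.
            tx = min(max(x0 + dx, 0), width - 1)
            ty = min(max(y0 + dy, 0), height - 1)
            taps.append((tx, ty, wx * wy))
    return taps

def sample_bilinear(decode, u, v, width, height):
    """Reference path: four decodes per sample, blended by weight."""
    taps = bilinear_footprint(u, v, width, height)
    return sum(w * decode(tx, ty) for tx, ty, w in taps)

def sample_stochastic(decode, u, v, width, height, rng):
    """Stochastic path: exactly one decode per sample.

    A tap is chosen with probability equal to its bilinear weight, so the
    expected value matches sample_bilinear; the result itself is noisy.
    """
    taps = bilinear_footprint(u, v, width, height)
    r = rng.random()
    acc = 0.0
    for tx, ty, w in taps:
        acc += w
        if r < acc:
            return decode(tx, ty)
    tx, ty, _ = taps[-1]  # guard against floating-point rounding
    return decode(tx, ty)
```

Each stochastic sample is noisy on its own, so something downstream has to average it out, typically a temporal pass. The same trick extends to trilinear and aniso footprints, which is where doing a full decode per tap would really hurt.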
NVIDIA Neural Texture Compression SDK: NVIDIA-RTX/RTXNTC on github.com
There's actually a bug with the reference. All modes in the demo have STF.
We don't know that in the end. We have no idea what the GPU occupancy is for this. Sometimes inference has a fixed cost that won't budge no matter how many NPUs you throw at it. DLSS comes to mind: it's impacted to some degree, of course, but nowhere near what the TOPS difference between a 2060 and a 5090 would suggest. When you get into the ~0.5-1 ms frametime range for an inference effect in a pipeline, you might be capped there for a long time no matter how many tensor cores you throw at it.
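A toy way to see the fixed-cost argument, Amdahl-style. This is only a sketch: the split into a fixed part and a part that scales with throughput is my simplification, and the numbers in the usage note are made up, not measurements of NTC, DLSS, or any GPU.

```python
def pass_time_ms(fixed_ms, scalable_ms_at_1x, throughput_scale):
    """Time of an inference pass modelled as fixed overhead plus scalable work.

    fixed_ms            part that does not shrink with more compute (hypothetical)
    scalable_ms_at_1x   part that scales with throughput, timed on the baseline GPU
    throughput_scale    how many times more throughput the faster GPU has
    """
    return fixed_ms + scalable_ms_at_1x / throughput_scale

def speedup(fixed_ms, scalable_ms_at_1x, throughput_scale):
    """Observed speedup over the baseline GPU for the same pass."""
    baseline = pass_time_ms(fixed_ms, scalable_ms_at_1x, 1.0)
    faster = pass_time_ms(fixed_ms, scalable_ms_at_1x, throughput_scale)
    return baseline / faster
```

With an illustrative 0.5 ms fixed part and 1.0 ms of scalable work, `speedup(0.5, 1.0, 10.0)` comes out at 2.5x for 10x the throughput, and it can never exceed 3x in this model however far you push `throughput_scale`.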
Tensor cores always work concurrently, of course. On top of that, these can be decoded independently, even in the G-buffer fill or ray tracing passes.
They also talk about it in their papers.
That's why I said the performance impact in the small initial demos is not really representative of a full game frametime. There's a shitload of things in a modern game frametime that you can run this concurrently with, at minimal impact.
There's also a mode in NTC that will use sampler feedback streaming, in a future DX12 implementation.
Inference on feedback is a middle ground between on-sample and on-load. It reduces VRAM usage a lot, not as much as on-sample, but with better performance.
And as Kepler says, inference on load is basically a free lunch: you get the disk space and bandwidth benefits. VRAM usage is still a problem though, which is why I think everyone will look at inference on sample, with the price of VRAM nowadays.
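Rough sketch of how the three modes trade VRAM, as I understand them. Everything here is an assumption for illustration: the function and parameter names are mine, the model assumes on-load keeps only the transcoded BCn copy, on-sample keeps only the NTC data, and on-feedback keeps the NTC data plus BCn for whatever fraction of tiles the current view needs. The numbers in the usage note are placeholders, not measurements.

```python
def vram_mb(mode, bcn_mb, ntc_mb, resident_fraction=1.0):
    """Toy VRAM footprint of a texture set under each inference mode.

    bcn_mb             size of the set once transcoded to block-compressed textures
    ntc_mb             size of the same set in neural-compressed form
    resident_fraction  share of tiles the current view needs (on_feedback only)
    """
    if not 0.0 <= resident_fraction <= 1.0:
        raise ValueError("resident_fraction must be in [0, 1]")
    if mode == "on_load":
        # Everything is transcoded up front; VRAM holds the full BCn set.
        return bcn_mb
    if mode == "on_sample":
        # Shaders decode texels directly; VRAM holds only the NTC data.
        return ntc_mb
    if mode == "on_feedback":
        # Assumption: NTC data stays resident, plus BCn for the needed tiles.
        return ntc_mb + resident_fraction * bcn_mb
    raise ValueError(f"unknown mode: {mode}")
```

With placeholder sizes of 1000 MB as BCn, 150 MB as NTC, and a quarter of the tiles resident, this gives 1000 MB on load, 400 MB on feedback, and 150 MB on sample, which is the ordering being argued here. How big the gaps are in a real game depends entirely on the content and the view.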
There's an explosion of papers on neural compression, and not just for textures. There are even papers on applying neural compression to grids of light probes, so you can have baked global illumination maps for different times of day or environment states.
Oh, and sampler feedback streaming on its own: when it was implemented in Half-Life 2 RTX it gave huge VRAM savings, even before NTC.