How long does it take to decompress?
We don't really know for a full-fledged game. In the GitHub demo, on-sample has a big impact on FPS because the demo is running that and nothing else,
so a ~0.2-1 ms frametime cost makes a big difference when you're in the thousands of fps.
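To put numbers on why a fixed cost looks huge in the demo but small in a real game, here is a quick back-of-the-envelope sketch. The base frame rates (2000 fps for the bare demo, 60 fps for a game) are my own illustrative assumptions, not measurements; only the ~0.2-1 ms range comes from the discussion above.

```python
def fps_after_added_cost(base_fps: float, added_ms: float) -> float:
    """Return the frame rate after adding a fixed per-frame cost in milliseconds."""
    base_frametime_ms = 1000.0 / base_fps
    return 1000.0 / (base_frametime_ms + added_ms)


# Assumed base frame rates: a bare demo vs. a typical game target.
for base_fps in (2000.0, 60.0):
    # Added cost range quoted above for on-sample inference.
    for added_ms in (0.2, 1.0):
        new_fps = fps_after_added_cost(base_fps, added_ms)
        print(f"{base_fps:6.0f} fps + {added_ms} ms -> {new_fps:7.1f} fps")
```

Under those assumptions the demo drops from 2000 fps to roughly 1430 or 667 fps, while the 60 fps game only drops to about 59.3 or 56.6 fps. Same cost in milliseconds, very different-looking FPS numbers.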
Nvidia, and I guess other companies, are optimistic about deployment in games. From Nvidia's paper:
"Although NTC is more expensive than traditional hardware-accelerated texture filtering, our results demonstrate that our method achieves high performance and is practical for use in real-time rendering. Furthermore, when rendering a complex scene in a fully-featured renderer, we expect the cost of our method to be partially hidden by the execution of concurrent work (e.g., ray tracing) thanks to the GPU latency hiding capabilities. The potential for latency hiding depends on various factors, such as hardware architecture, the presence of dedicated matrix-multiplication units that are otherwise under-utilized, cache sizes, and register usage. We leave investigating this for future work."
PlayStation 6, at least, is confirmed to have NTC on-load.
So I guess these questions will come back often:
- On-load
  - You'll save on game size on SSD only.
  - You'll stress VRAM bandwidth more due to the NTC→BCn transcode.
  - No VRAM saving; textures go back to BCn size.
  - No impact on render pipeline performance.
- On-sample
  - You'll save on game size on SSD.
  - You'll save on VRAM size, and bandwidth usage should be lower too, since there's no decompression pass going on there.
  - You'll take a performance hit in the render pipeline, since it does inference on the tensor cores. More than likely it's "hidden" or near negligible in a modern game's frametime. Inference is in the ~0.5-1 ms range per frame.
  - Stochastic texture filtering is a requirement here.
- On-feedback
  - You'll save on game size on SSD.
  - This uses sampler feedback (SFS) to find the set of texture tiles needed for rendering and decompresses only those tiles to BCn at runtime.
  - Better than on-load for VRAM usage.
  - Lower performance hit than on-sample.
  - They mention potentially uneven frame times in their presentations. Probably an area for improvement.
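To make the on-feedback idea more concrete, here's a toy sketch of the "decompress only the tiles that were actually sampled" logic. Everything in it is hypothetical: `FeedbackTileCache`, the `(mip, x, y)` tile id layout, and the dummy `_decompress_to_bcn` are my own stand-ins, not the NTC SDK's API, and there's no eviction.

```python
from typing import Dict, Iterable, Set, Tuple

TileId = Tuple[int, int, int]  # (mip level, tile x, tile y); hypothetical layout


class FeedbackTileCache:
    """Toy model of on-feedback: only tiles the sampler touched become resident."""

    def __init__(self) -> None:
        self.resident: Dict[TileId, bytes] = {}

    def _decompress_to_bcn(self, tile: TileId) -> bytes:
        # Placeholder for the real NTC -> BCn transcode; returns dummy bytes.
        return b"BCn:" + repr(tile).encode()

    def update(self, sampled_tiles: Iterable[TileId]) -> Set[TileId]:
        """Decompress sampled tiles that are not resident yet; return the new ones."""
        new_tiles = {t for t in sampled_tiles if t not in self.resident}
        for tile in new_tiles:
            self.resident[tile] = self._decompress_to_bcn(tile)
        return new_tiles


cache = FeedbackTileCache()
frame1 = cache.update([(0, 0, 0), (0, 1, 0)])  # two tiles seen for the first time
frame2 = cache.update([(0, 1, 0), (1, 0, 0)])  # one already resident, one new
print(len(frame1), len(frame2), len(cache.resident))
```

The amount of decompression work per frame depends on how many new tiles show up, which would be one plausible reason for the uneven frame times mentioned above.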
Anyone can test Nvidia's GitHub demo btw, even on AMD and Intel cards. I think AMD only has a bug with on-feedback.
At a minimum, any NTC mode saves SSD space, and it has been tested all the way down to low-tier Turing GPUs.
High-end PCs will likely have the margin to run on-sample, because in the future they'll also be running a ton of neural rendering, where saving space is important: upscaling, framegen, ray reconstruction, neural shaders, neural skin, neural radiance cache path tracing, neural everything, DLSS 5.
Lower-end hardware will likely pick NTC on-load.
Really curious if Switch 2 can use NTC on-load for certain games to save on cartridge size and its minuscule storage, if the VRAM bandwidth permits it. Likely not for all games, but certain developers could probably afford the feature.
Neural compression also opens up many other compression fields, with research going on now into compressing light probes for baked lighting in huge open worlds. Although I now think that by the time you have GPUs doing NTC, going with baked lighting over ray tracing doesn't make much sense. Maybe it would help on lower-end hardware and for SSD savings again.