Where else are you going to store the data if you're not "putting it directly in memory"?
They are putting it directly in memory. They use a pool of memory that holds recently used textures and other data, and what the SSD allows them to do is quickly swap data in and out of that pool. Because the swap is fast, they don't need to preload a lot of data long before it's used. That lets them shrink the pool, improving performance and reducing memory occupancy at the same time.
Now, GPU caches that live within a cache hierarchy (registers -> L1 -> L2 -> L3) are filled based on whatever code happens to be running at the time. The control a developer has over these caches comes mainly from data structures and the operations performed on them (e.g. aligned arrays of a certain size are more cache friendly, certain loop orders are more cache friendly, adding memory barriers or grouping reads and writes together can help). The developer can basically only hint at what actually ends up in these caches; there isn't much more to go on.

These caches are also very small. You cannot just dump texture data into them, especially not from an SSD. The CPU evicts cache lines on its own, using a hardware replacement policy (typically an approximation of least-recently-used, based on age and usage); a developer doesn't do this. You cannot put anything into the cache directly, because it always has to be loaded via RAM. Programmers do not, in any direct way, interact with CPU caches. It just isn't something that happens. This is why I can dismiss it so easily.