
If Senua's Saga: Hellblade II is actually what we can expect from next-gen consoles, does that mean the RTX 2080 will be outdated by 2020?

psorcerer

Banned
If we are to go there, even graphics in general is a trick. I think the main goal is to get as close to reality as possible, and for that you need powerful hardware more so than tricks. It's a nice trick to have an implicit light source be a point or a direction vector. It gives OK results, but it's time for sampling arbitrarily shaped light sources (which is what the real world is made of). That gives far more accurate results that look much better than the tricks. So I'd love to see a trick that gives the results of a path tracer without using importance sampling.
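
(For concreteness, here's a minimal sketch of the difference being described: a point light needs one deterministic evaluation, while an arbitrarily shaped light has to be Monte Carlo sampled over its surface. This is generic textbook-style code, not from any engine; the types and the `RectLight` layout are made up for illustration, and it assumes a plain Lambertian surface with no shadowing.)

```cpp
#include <algorithm>
#include <cmath>
#include <random>

struct Vec3 { float x, y, z; };
static Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
static Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static Vec3 operator*(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }
static Vec3 mul(Vec3 a, Vec3 b) { return {a.x * b.x, a.y * b.y, a.z * b.z}; }
static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}
static float length(Vec3 a) { return std::sqrt(dot(a, a)); }
static Vec3 normalize(Vec3 a) { return a * (1.0f / length(a)); }

constexpr float kPi = 3.14159265f;

// Hypothetical rectangular area light: a corner plus two edge vectors.
struct RectLight {
    Vec3 corner, edge_u, edge_v, normal;
    Vec3 radiance; // emitted radiance, assumed constant over the surface
};

// Point light: the classic trick. One deterministic evaluation per shading point.
Vec3 shade_point_light(Vec3 p, Vec3 n, Vec3 albedo, Vec3 light_pos, Vec3 intensity) {
    Vec3 d = light_pos - p;
    Vec3 wi = normalize(d);
    float cos_i = std::max(0.0f, dot(n, wi));
    // Lambertian BRDF = albedo / pi; point light falls off with 1 / r^2.
    return mul(albedo, intensity) * (cos_i / (dot(d, d) * kPi));
}

// Area light: estimate the same integral by sampling points on the light surface.
Vec3 shade_rect_light(Vec3 p, Vec3 n, Vec3 albedo, const RectLight& L,
                      int samples, std::mt19937& rng) {
    std::uniform_real_distribution<float> u01(0.0f, 1.0f);
    float area = length(cross(L.edge_u, L.edge_v)); // uniform sample pdf = 1 / area
    Vec3 sum{0.0f, 0.0f, 0.0f};
    for (int i = 0; i < samples; ++i) {
        Vec3 q = L.corner + L.edge_u * u01(rng) + L.edge_v * u01(rng); // point on light
        Vec3 d = q - p;
        float dist2 = dot(d, d);
        Vec3 wi = normalize(d);
        float cos_i = std::max(0.0f, dot(n, wi));
        float cos_l = std::max(0.0f, dot(L.normal, wi * -1.0f));
        // Geometry term converts the area-measure pdf into a solid-angle integrand.
        float geom = cos_i * cos_l / dist2;
        sum = sum + mul(albedo, L.radiance) * (geom / kPi);
    }
    return sum * (area / float(samples)); // divide by N and by the 1/area pdf
}
```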

To me the ultimate lesson on how to make game graphics is here https://www.froyok.fr/documents/making_of_sotc.pdf
Simple things, clever tricks, good approximations.
Brute-forcing everything is not gonna work.

I wouldn't call it a bad choice for games because it looks better than its counterparts and runs faster.

It doesn't look better if it was not designed to look better on PC.
Somehow you translate additional effort into a platform advantage, where the platform itself has no advantage at all: for the same hardware you can extract far less performance.
 

VFXVeteran

Banned
To me the ultimate lesson on how to make game graphics is here https://www.froyok.fr/documents/making_of_sotc.pdf
Simple things, clever tricks, good approximations.

Thanks for the paper. I'll read it in its entirety this holiday break. From reading just the first few pages, I've implemented all of that before. The projected shadow volume is an old technique; it looks pretty OK, but it's decoupled from the real lighting equation, and I have issues with that. Also, the pseudo-HDR technique doesn't describe its tone mapping algorithm, which is important. I'm guessing it's just the basic log-max luminance method. The bloom is the typical downscale, blur, upscale, and composite.
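
(To make that guess concrete: the log-average luminance step usually looks something like the sketch below, essentially the core of the Reinhard operator. The buffer layout, the key value of 0.18, and the function names are illustrative assumptions, not anything taken from the SOTC paper, and the bloom chain - downscale, blur, upscale, composite - is omitted.)

```cpp
// Rough sketch of log-average luminance tone mapping (Reinhard-style).
// Buffer layout and the key value are illustrative assumptions.
#include <cmath>
#include <vector>

struct Pixel { float r, g, b; };

float luminance(const Pixel& p) {
    return 0.2126f * p.r + 0.7152f * p.g + 0.0722f * p.b; // Rec. 709 weights
}

void tonemap_reinhard(std::vector<Pixel>& hdr, float key = 0.18f) {
    // 1. Log-average luminance of the whole frame.
    double log_sum = 0.0;
    const float eps = 1e-4f; // avoid log(0) on black pixels
    for (const Pixel& p : hdr) log_sum += std::log(eps + luminance(p));
    float log_avg = std::exp(float(log_sum / hdr.size()));

    // 2. Scale each pixel so the average maps to the chosen key value,
    //    then compress with L / (1 + L).
    for (Pixel& p : hdr) {
        float L = luminance(p);
        float Ls = key * L / log_avg;
        float Ld = Ls / (1.0f + Ls);
        float scale = (L > 0.0f) ? Ld / L : 0.0f;
        p.r *= scale; p.g *= scale; p.b *= scale;
    }
}
```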

Brute-forcing everything is not gonna work.

Yes, that's correct up to a point. Where you and I disagree is that you think doing tricks (because you have to on a console) is more respectable than not doing them; I don't. I think if you have the power, implement the real Fresnel equation instead of using (1 - cos(theta)) ^ 5. Obviously the GGX equation is a much better approximation than Phong or Blinn, so those crude approximations are no longer enough for a material to look good. PBR makes things look much better; that you cannot deny. Yes, an inverted Lambertian computation will get you far with light passing through leaves, but that won't work in a game with caustics or subsurface scattering. God of War is a perfect example of using this inverse Lambert's law for the ears on Kratos: it looks odd and weird and "glows". I'd rather take the hit and implement the multiple-dipole (multipole) method across the entire skin and make it look amazing during gameplay, but I need more power to do so.
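
(For reference, the (1 - cos(theta))^5 term is Schlick's approximation; here's a rough sketch of it next to the exact unpolarized dielectric Fresnel it stands in for. The function names and the use of a single relative IOR `eta` are illustrative assumptions, not code from any of the games mentioned.)

```cpp
// Schlick's (1 - cos θ)^5 approximation next to the exact unpolarized
// Fresnel reflectance for a dielectric. eta is the relative IOR (e.g. 1.5 for glass).
#include <algorithm>
#include <cmath>

float fresnel_schlick(float cos_i, float eta) {
    float f0 = (eta - 1.0f) / (eta + 1.0f); // reflectance at normal incidence
    f0 *= f0;
    return f0 + (1.0f - f0) * std::pow(1.0f - cos_i, 5.0f);
}

float fresnel_dielectric(float cos_i, float eta) {
    cos_i = std::clamp(cos_i, 0.0f, 1.0f);
    // Snell's law: find the transmitted angle; handle total internal reflection.
    float sin2_t = (1.0f - cos_i * cos_i) / (eta * eta);
    if (sin2_t >= 1.0f) return 1.0f;
    float cos_t = std::sqrt(1.0f - sin2_t);
    // Average the s- and p-polarized reflectances.
    float rs = (cos_i - eta * cos_t) / (cos_i + eta * cos_t);
    float rp = (eta * cos_i - cos_t) / (eta * cos_i + cos_t);
    return 0.5f * (rs * rs + rp * rp);
}
```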

It doesn't look better if it was not designed to look better on PC.

I don't understand this. Can you give an example of a game that looks great on a console due to tricks (baking, using less complex equations, etc.) but where the same game doesn't look better on a PC?
 

-kb-

Member
Give me an example. I know x86 assembly, so you can put up some code.

There's no need to put up any code for the obvious microarchitectural differences between manufacturers and generations of CPUs. Instructions run at different speeds. If you want me to hold your hand and find some examples I can, but I don't see why someone who understands x86 assembly so well doesn't grasp microarchitectures.

If you wish to learn.


Are pretty good references.
 

VFXVeteran

Banned
There's no need to put up any code for the obvious microarchitectural differences between manufacturers and generations of CPUs. Instructions run at different speeds. If you want me to hold your hand and find some examples I can, but I don't see why someone who understands x86 assembly so well doesn't grasp microarchitectures.

If you wish to learn.


Are pretty good references.

I didn't say I knew x86 assembly "so well". I just said I knew it - so no hand-holding here. I haven't worked on any project using x86 assembly. It's my own venture.

Thanks for the references.
 
PC defenders are out in full force. The consoles are gonna look and perform great. Even if it's not as great as a 2080, it's gonna be close, and that's all that matters. PC will always be number one in power, but at 4x the price. I'd rather buy a Switch, a PS4 Pro, and an Xbox One X than spend money on a PC again. I'd still save a ton of money.

This gen it was Pro -------X-----------------------PC
Next gen, at least for a while: PS5 (10 TFLOPS) ----- Series X (12 TFLOPS) -- PC (14 TFLOPS)
 
PC defenders are out in full force. The consoles are gonna look and perform great. Even if it's not as great as a 2080, it's gonna be close, and that's all that matters. PC will always be number one in power, but at 4x the price. I'd rather buy a Switch, a PS4 Pro, and an Xbox One X than spend money on a PC again. I'd still save a ton of money.

This gen it was Pro -------X-----------------------PC
Next gen, at least for a while: PS5 (10 TFLOPS) ----- Series X (12 TFLOPS) -- PC (14 TFLOPS)
They are going to perform better than a 2080 because of the memory tech behind the SSD, and it'll take a year for PCs to catch up, and yes, at 4x the price. It's beyond me why PC gamers give Nvidia and Intel their money; they are simply getting scammed, but since they treat PC gaming as a cult, they'll keep doing it just to prove a point.

It's 5 percent of PC gamers around the world with high-end cards, and 95 percent are at 1080p medium settings, having only bought a PC to play Fortnite. The funny thing is it's that 5 percent of PC gamers doing the shouting online.
 

Panajev2001a

GAF's Pleasant Genius
From reading just the first few pages, I've implemented all of that before. The projected shadow volume is an old technique; it looks pretty OK, but it's decoupled from the real lighting equation, and I have issues with that. Also, the pseudo-HDR technique doesn't describe its tone mapping algorithm, which is important. I'm guessing it's just the basic log-max luminance method. The bloom is the typical downscale, blur, upscale, and composite.

Sure, but we are talking about a title whose development traces back to the 2001-2006 period on PS2, which, as you know, did not have programmable fragment shaders. The techniques and the outcomes in that paper are novel for PS2 (not that nobody else attempted them, perhaps discarding them due to the frame rate impact), as was ICO with its limited but impactful use of per-pixel lighting, shadow volumes, and soft self-shadowing.
Also, what is most remarkable is how they went about making it run fast / at an acceptable frame rate on that particular restricted HW. More than just implementing each technique in and of itself, I think it must have required close collaboration between developers and artists, where together they chose the most effective method to get the quality they wanted at the lowest cost (hence why you will spot point-sampled textures in some parts of ICO's scenes, where higher-quality textures and bilinear filtering would have required too much storage and just wasted texture read bandwidth).

Nice B3D reaction thread: https://forum.beyond3d.com/threads/ico-devs-discuss-how-they-made-sotc-in-full-detail.25104/

PS2 EE, GS, SPU2, and more PDFs (some of them come from the PS2 Linux kit, but they are the same ones that come with the TOOL): https://hwdocs.webs.com/ps2
Very good VU and GS HW docs there.

Overall architecture presentations (old news to you): http://unina.stidue.net/Universita' di Trieste/Ingegneria Industriale e dell'Informazione/Tuzzi/Architetture_Avanzate_dei_Calcolatori/Emotion_2.pdf

 

Fafalada

Fafracer forever
There is never enough graphics memory. I can choke my 2080Ti very easily with just 1 load of assets for 1 character.
You're using software that is solely designed around 'keep everything in memory just in case' because that's easier. So yes, there's never enough memory if the sole objective is minimizing challenges around architecting software. There's also never enough L1 cache, as really everything should just fit in there to REALLY get optimal, but you know...

Where can you show me a real world example of this optimized speed due to using GNM?
It's not really a question of APIs as such - it's being able to target a specific hardware profile. That matters more to 1st party software, and any conversation about benefits should be kept in the context of a concrete implementation (i.e. what was faster by doing 'A', and by how much), not some randomly moving goalposts asking for market-generalized, single-number % improvements measured against non-existent visual guidelines.

Give me an example. I know x86 assembly, so you can put up some code.
To my own surprise I actually wrote 'assembly-level' code this gen (been like 10 years since I've done it last prior to that, and I normally don't get to code to begin with anymore), but none of it was x86. It 'was' useful in context though.
 

VFXVeteran

Banned
Sure, but we are talking about a title whose development traces back to the 2001-2006 period on PS2, which, as you know, did not have programmable fragment shaders. The techniques and the outcomes in that paper are novel for PS2 (not that nobody else attempted them, perhaps discarding them due to the frame rate impact), as was ICO with its limited but impactful use of per-pixel lighting, shadow volumes, and soft self-shadowing.
Also, what is most remarkable is how they went about making it run fast / at an acceptable frame rate on that particular restricted HW. More than just implementing each technique in and of itself, I think it must have required close collaboration between developers and artists, where together they chose the most effective method to get the quality they wanted at the lowest cost (hence why you will spot point-sampled textures in some parts of ICO's scenes, where higher-quality textures and bilinear filtering would have required too much storage and just wasted texture read bandwidth).

Nice B3D reaction thread: https://forum.beyond3d.com/threads/ico-devs-discuss-how-they-made-sotc-in-full-detail.25104/

PS2 EE, GS, SPU2, and more PDFs (some of them come from the PS2 Linux kit, but they are the same ones that come with the TOOL): https://hwdocs.webs.com/ps2
Very good VU and GS HW docs there.

Overall architecture presentations (old news to you): http://unina.stidue.net/Universita' di Trieste/Ingegneria Industriale e dell'Informazione/Tuzzi/Architetture_Avanzate_dei_Calcolatori/Emotion_2.pdf


Thanks man for all of this. Much appreciated.

Yes, those were the days when you had to optimize for a closed platform. My point was that today that's not nearly as necessary.
 

VFXVeteran

Banned
It's not really a question of APIs as such - it's being able to target a specific hardware profile. That matters more to 1st party software, and any conversation about benefits should be kept in the context of a concrete implementation (i.e. what was faster by doing 'A', and by how much), not some randomly moving goalposts asking for market-generalized, single-number % improvements measured against non-existent visual guidelines.

So what you are saying is that a 1st party developer had to use tricks in order to show visuals that most 3rd party developers could get without doing the tricks? Let's bring Uncharted 4 to the table, or even God of War. I understand that the lead developer at ND had to use a parallel queue system to take advantage of all CPU cores, but I would consider that just reverting back to a standard because of how proprietary the PS3 hardware was. Can you point out any other optimization techniques that you declare could not have been possible on a PS4 and that are found all over the place in 3rd party software? Let's specifically talk about features we know are common to all platforms: PBR? Hair? Foliage shading? True 4K vs. upsampled? I'm trying to get a sense of what was needed that wouldn't run otherwise.
 

Fafalada

Fafracer forever
So what you are saying is that a 1st party developer had to use tricks
I find the wording/connotation here a bit weird. Everything we do in CG is using tricks, realtime and offline alike - simulation keeps getting more precise, but it's fundamentally still all going to be 'cheap tricks' until we get to simulating sub-atomic interactions some day.

But to your other point(s).
I understand that the lead developer at ND had to use a parallel queue system to take advantage of all CPU cores, but I would consider that just reverting back to a standard because of how proprietary the PS3 hardware was.
If that's the pipelining with fibers they did for TLOU:R first, I wouldn't exactly call that 'standard', as it's quite unlike what any of the most popular middleware solutions do, nor do the tradeoffs they made work for every scenario. That said, I'm pretty sure the approach they took could be done on most platforms (including Windows), but when you can't target a fixed number of usable cores, your utilization will vary, potentially by a lot, obviously.
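
(To illustrate the general 'parallel queue' idea only - this is a bare-bones worker-pool sketch, not Naughty Dog's fiber-based scheduler, and every name in it is made up. The fixed vs. variable core count point shows up in the worker count you pick.)

```cpp
// Bare-bones job queue with a fixed worker pool - a sketch of the general
// "parallel queue" idea, not any shipped engine's scheduler.
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

class JobQueue {
public:
    explicit JobQueue(unsigned worker_count) {
        for (unsigned i = 0; i < worker_count; ++i)
            workers_.emplace_back([this] { worker_loop(); });
    }
    ~JobQueue() {
        { std::lock_guard<std::mutex> lock(mutex_); done_ = true; }
        cv_.notify_all();
        for (auto& w : workers_) w.join();
    }
    void submit(std::function<void()> job) {
        { std::lock_guard<std::mutex> lock(mutex_); jobs_.push(std::move(job)); }
        cv_.notify_one();
    }
private:
    void worker_loop() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return done_ || !jobs_.empty(); });
                if (done_ && jobs_.empty()) return;
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job(); // run outside the lock
        }
    }
    std::vector<std::thread> workers_;
    std::queue<std::function<void()>> jobs_;
    std::mutex mutex_;
    std::condition_variable cv_;
    bool done_ = false;
};

// On a console the worker count is a known constant; on PC you would query
// std::thread::hardware_concurrency() and accept that utilization varies.
```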

Let's specifically talk about features we know are common to all platforms: PBR? Hair? Foliage shading? True 4K vs. upsampled? I'm trying to get a sense of what was needed that wouldn't run otherwise.
I'm not sure I understand what you're actually asking here. You say 'specifics', but all your counter-examples are very broad and not in a performance context at all (actually, I'm not sure what the context is - implementability?)
 

VFXVeteran

Banned
I find the wording/connotation here a bit weird. Everything we do in CG is using tricks, realtime and offline alike - simulation keeps getting more precise, but it's fundamentally still all going to be 'cheap tricks' until we get to simulating sub-atomic interactions some day.

Of course everything is "tricks" per se. I'm talking about, for example, pre-baking light probes for global illumination instead of actually path-tracing light to resolve the solution. Pre-baking would be more of a trick pipeline, whereas using physically based equations and simulating how light actually interacts with the environment would be the latter.
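
(As a concrete example of the "trick pipeline" side: evaluating a pre-baked probe at runtime often amounts to something like the sketch below, an order-2 spherical-harmonic irradiance lookup using the Ramamoorthi-Hanrahan convolution weights. The struct layout and names are illustrative assumptions; the coefficients themselves would come from an offline bake or path trace.)

```cpp
// Toy evaluation of a pre-baked irradiance probe stored as 9 spherical
// harmonic coefficients per color channel (the usual L2 SH encoding).
struct Vec3 { float x, y, z; };

struct SHProbe {
    // 9 RGB coefficients: band 0 (1), band 1 (3), band 2 (5),
    // in the order L00, L1-1, L10, L11, L2-2, L2-1, L20, L21, L22.
    Vec3 c[9];
};

// Irradiance in direction n, using the standard convolution weights for the
// first three SH bands (Ramamoorthi & Hanrahan 2001).
Vec3 eval_irradiance(const SHProbe& p, Vec3 n) {
    const float c1 = 0.429043f, c2 = 0.511664f,
                c3 = 0.743125f, c4 = 0.886227f, c5 = 0.247708f;
    auto term = [&](auto get) {
        return c4 * get(p.c[0])
             + 2.0f * c2 * (get(p.c[1]) * n.y + get(p.c[2]) * n.z + get(p.c[3]) * n.x)
             + 2.0f * c1 * (get(p.c[4]) * n.x * n.y + get(p.c[5]) * n.y * n.z
                            + get(p.c[7]) * n.x * n.z)
             + c3 * get(p.c[6]) * n.z * n.z - c5 * get(p.c[6])
             + c1 * get(p.c[8]) * (n.x * n.x - n.y * n.y);
    };
    // Evaluate the same polynomial once per color channel.
    return { term([](Vec3 v) { return v.x; }),
             term([](Vec3 v) { return v.y; }),
             term([](Vec3 v) { return v.z; }) };
}
```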

I'm not sure I understand what you're actually asking here. You say 'specifics', but all your counter-examples are very broad and not in a performance context at all (actually, I'm not sure what the context is - implementability?)

My point is this: if you have worked with a really well-optimized closed solution for a particular hardware box (the PS4, for example), then what game can you show that uses this solution and would be slow if the same game were ported to the (underutilized) PC? More specifically, find several examples of features being run on the console that just wouldn't be possible on PC (e.g. tessellation + POM at the same time on every piece of terrain). If we have no such data, then I'm wondering why there is even an argument for the advantage of low-level API access to closed hardware when it doesn't show any superior results relative to the PC, which proves my point that the PC will always be able to brute-force its way ahead of closed hardware.
 

Fafalada

Fafracer forever
Pre-baking would be more of a trick pipeline, whereas using physically based equations and simulating how light actually interacts with the environment would be the latter.
That's fair - but I don't think there's a really hard line we can draw. E.g. I can do just-in-time evaluation and generate probes in real time, on demand (and thus run path tracing at runtime, just not quite in 'real' time), giving me the results of a baked solution without the drawbacks. Not that this would be a platform-specific solution; it's just that we were on the topic of simulation vs. tricks.

More specifically, find several examples of features being run on the console that just wouldn't be possible on PC
OK, that helps. First, let me make the point that context is rather important for these, so what works in one title may not work that well everywhere else. Also, the only thing we can really talk about is 'prohibitively expensive' at any given time; nothing is ever truly impossible.

That said - one of the most obvious examples of the past few years is VR and the various lens-shape optimizations for rendering. These work exceptionally well on fixed platforms (PSVR and Quest most notably) and are incredibly cumbersome to use/implement on PC - no standard features in hardware (even within a single vendor's solutions), no API access to certain acceleration structures (like H-Tile buffers), the largest percentage of the market actually missing useful hw or API support altogether, and unpredictable results where gains on one GPU can be non-existent or even detrimental on another.
On top of that, solving for CPU overhead is even harder - while on a box that gives you access to push-buffers you can literally draw multiple views N times faster (in CPU overhead), even on GPUs that offer no hw tricks for variable-resolution render targets.
The use case is highly specific, because most of these types of optimizations are. And obviously mileage will vary and results can be much less dramatic than this case.

Another thing worth calling out (less for performance, more for flexibility) is e.g. the mesh-shader-style pipeline setup, which has been in use on current-gen consoles for quite some time now, thanks to async compute and some lower-level API access - i.e. it isn't even a new pipeline feature to consoles, even though it is to the PC platform.
An older analogue would be things like the PS2's hw-accelerated single-pass shadow volumes with NO CPU intervention needed (where the PC had to do multiple passes, CPU extrusion, and CPU vertex generation), but that involved specific hardware features as well as low-level access.
 

VFXVeteran

Banned
Good point about VR! :messenger_tears_of_joy: Definitely clumsy to use and I'd much rather have a closed system like the Quest.

Another thing worth calling out (less for performance, more for flexibility) is e.g. the mesh-shader-style pipeline setup, which has been in use on current-gen consoles for quite some time now, thanks to async compute and some lower-level API access - i.e. it isn't even a new pipeline feature to consoles, even though it is to the PC platform.

Yeah, that doesn't really make the algorithm unique for a rendering advantage, but I'll read up some more on mesh shaders. I'm not sure what they're used for, but if it's for more detailed terrain, I'd rather use old tried-and-true displacement mapping. It's still used in film and gives the best results.

An older analogue would be things like the PS2's hw-accelerated single-pass shadow volumes with NO CPU intervention needed (where the PC had to do multiple passes, CPU extrusion, and CPU vertex generation), but that involved specific hardware features as well as low-level access.

Yes, the PS2 era definitely had that, but these days I'm trying to get most gamers to see that there is a convergence going on. The PC will become the standard platform, and the unique optimizations of old are quickly fading away (especially since the programmable shader pipeline was introduced).
 

Fafalada

Fafracer forever
Yeah, that doesn't really make the algorithm unique for a rendering advantage, but I'll read up some more on mesh shaders.
It's basically a proper replacement for the now-defunct Geometry Shader pipeline, which just never worked well for any GPU or vendor. So it's usable across a variety of things - from primitive-culling optimizations to GPU-based LOD selection and more.
The main point is that PC is just receiving it as of 2019, while on consoles we've had access to that kind of feature set since 2013 - so, you know, not entirely unlike some things introduced by the PS2 or the original Xbox way ahead of the PC timeline.
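
(For anyone following along, here's a toy CPU-side illustration of the kind of per-cluster work a task/mesh-shader pipeline - or an async-compute prepass on consoles - typically does: backface cone culling of "meshlets". The struct layout and cull test follow one common formulation; all names are made up for illustration.)

```cpp
// Per-meshlet backface cone culling, written as plain CPU code just to show
// the test itself; in practice this runs on the GPU ahead of rasterization.
#include <cmath>
#include <vector>

struct Vec3 { float x, y, z; };
static Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static float length(Vec3 v) { return std::sqrt(dot(v, v)); }

struct Meshlet {
    Vec3 bounds_center; float bounds_radius;   // bounding sphere
    Vec3 cone_axis;     float cone_cutoff;     // normal cone (cos of half-angle)
};

// A meshlet can be skipped when its whole normal cone faces away from the camera.
bool backface_cull(const Meshlet& m, Vec3 camera_pos) {
    Vec3 to_meshlet = m.bounds_center - camera_pos;
    float dist = length(to_meshlet);
    return dot(to_meshlet, m.cone_axis) >= m.cone_cutoff * dist + m.bounds_radius;
}

// Stand-in for the dispatch: emit only the indices of the surviving clusters.
std::vector<int> visible_meshlets(const std::vector<Meshlet>& meshlets, Vec3 camera_pos) {
    std::vector<int> out;
    for (int i = 0; i < (int)meshlets.size(); ++i)
        if (!backface_cull(meshlets[i], camera_pos)) out.push_back(i);
    return out;
}
```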

The PC will become the standard platform, and the unique optimizations of old are quickly fading away (especially since the programmable shader pipeline was introduced).
That's a good point, and I agree to a point. Yes, hardware consolidation is obviously a real thing; after all, everyone's using the same two or three vendors by now, and some people have already described the coming boxes as the most similar we've ever seen in console space.
That said, even just looking at local devices, mobile is driving very different optimization needs (and the influence of that will only grow), and in the longer term we're transitioning away from local compute altogether. Stadia may have been a misfire, but it's only the beginning, and it has more in common with console development than PC, with optimization realities that are actually (much) harsher than any closed box, as you're literally driving $ value with every millisecond.
So, to cut the long-winded rant short, I see the industry going through significant changes in the coming years that actually point to the increased relevance of fixed-spec optimizations again in the longer term, even as our hw grows powerful and unified enough to make direct access less important on local devices.
I know I was kind of going all over the place, so hopefully I'm making enough sense here.
 