
Nvidia has been using tile-based rasterization since Maxwell

Jonnax

Member
Source Article:
http://www.realworldtech.com/tile-based-rasterization-nvidia-gpus/

Video explanation with demonstration, which includes:
A brief refresher on the 3D pipeline
An explanation of the DirectX shader code, which is available on Github
Behavior of the code on an AMD GPU
Behavior of the code on Nvidia Maxwell and Pascal GPUs
Discussion and analysis of the results

https://youtu.be/Nc6R1hwXhL8


Using tiled regions and buffering the rasterizer data on-die reduces the memory bandwidth for rendering, improving performance and power-efficiency. Consistent with this hypothesis, our testing shows that Nvidia GPUs change the tile size to ensure that the pixel output from rasterization fits within a fixed size on-chip buffer or cache.
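
Here's a quick back-of-the-envelope sketch of that last point (mine, not from the article; the buffer size and per-pixel costs below are invented purely for illustration):

# Rough sketch: choose a tile size so the rasterizer's pixel output fits
# a fixed-size on-chip buffer. Buffer size and bytes-per-pixel here are
# invented for illustration, not Nvidia's actual values.

def tile_side(buffer_bytes, bytes_per_pixel):
    """Largest power-of-two square tile whose pixel data fits the buffer."""
    side = 1
    while (side * 2) ** 2 * bytes_per_pixel <= buffer_bytes:
        side *= 2
    return side

BUFFER = 128 * 1024  # hypothetical on-chip buffer: 128 KiB

# More data per pixel (wider render targets, MSAA, ...) -> smaller tiles,
# which is exactly the behavior the article describes.
for bpp in (4, 8, 16, 32):
    s = tile_side(BUFFER, bpp)
    print(f"{bpp:2d} B/pixel -> {s}x{s} tile ({s * s * bpp // 1024} KiB)")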
 

THE:MILKMAN

Member
I won't pretend to understand this but does it go a long way to explaining Maxwell's great improvement in performance while still stuck on 28nm?

And yeah, can the competition do the same/similar?
 

DonMigs85

Member
I won't pretend to understand this but does it go a long way to explaining Maxwell's great improvement in performance while still stuck on 28nm?

And yeah, can the competition do the same/similar?
Yes, pretty much, along with other enhancements over Kepler.
It's what mobile GPUs from Imagination and ARM use to improve efficiency and reduce power consumption. It was in the Dreamcast's PowerVR chip and helped it punch above its weight.
 

THE:MILKMAN

Member
Yes, pretty much, along with other enhancements over Kepler.
It's what mobile GPUs from Imagination and ARM use to improve efficiency and reduce power consumption. It was in the Dreamcast's PowerVR chip and helped it punch above its weight.

Makes me wonder why everyone doesn't do this as a matter of course. Is this thing patented, or can anyone use it?
 

Window

Member
Makes me wonder why everyone doesn't do this as a matter of course. Is this thing patented, or can anyone use it?

I don't believe the idea of tile based rendering is patented by Imagination (I may very well be wrong). Certain implementations probably are.

Ever since finding out about it, I've never understood why desktop GPUs didn't adopt this either. Maybe they didn't need to (until now)? Mobile GPUs do face a lot more resource constraints in design.
 
NX to have better real-world performance than PS4/XB1 confirmed. Hype levels maximum.

Nah, just kidding. But it's pretty interesting, and it sort of explains why Maxwell/Pascal punch so hard.
 

DonMigs85

Member
Makes me wonder why everyone doesn't do this as a matter of course. Is this thing patented, or can anyone use it?
That's what I'd like to know. Come to think of it, Adreno 300 and later series GPUs can also switch between tiled and direct mode rendering. Maybe Nvidia implemented something similar.
 

Renekton

Member
Why did PowerVR allow Nvidia to use its tile tech?

Back in 2014, NV sued to stop shipments of SoCs containing PowerVR architecture.
 

dr_rus

Member
So now I wonder if AMD and Intel will ever implement something similar, or if Nvidia has a lock on it.
I may be wrong here, but I thought that Intel iGPUs have been tile based since their inception. As for AMD, let's see them implement FL12_1 first, at least.

This is incredibly surprising. Woh.
Eh, not really ;) The hints have been dropping for the last couple of years. This is also the main reason for Maxwell's memory efficiency; DCC plays a lesser role there.

Why did PowerVR allow Nvidia to use its tile tech?
Because it's not "its tech". The general rendering principle is free for anyone to use. Maxwell's handling of rendering isn't really the same as what PowerVR does either, as it's hybridized with the traditional approach.
 

Renekton

Member
Because it's not "its tech". The general rendering principle is free for anyone to use. Maxwell's handling of rendering isn't really the same as what PowerVR does either, as it's hybridized with the traditional approach.
First time I've heard that it's a general principle.

IINM Imagination famously came up with the tile approach first, during the Voodoo years, and would have patented most concepts related to it.

Maybe they have some unofficial ceasefire.

edit: oops nvm https://en.wikipedia.org/wiki/Tiled_rendering#Early_work
 

DonMigs85

Member
First time I've heard that it's a general principle.

IINM Imagination famously came up with the tile approach first, during the Voodoo years, and would have patented most concepts related to it.

Maybe they have some unofficial ceasefire.
Adreno GPUs can switch modes, so maybe they did something with Qualcomm - or it's really their own design.
 

dr_rus

Member
First time I've heard that it's a general principle.

IINM Imagination famously came up with the tile approach first, during the Voodoo years, and would have patented most concepts related to it.

Maybe they have some unofficial ceasefire.

Most of the work on tiled rendering was done prior to PowerVR's GPU implementation, and it has been used by a lot of companies since then, including MS and Intel. Parts of tiled rendering are used in offline renderers, I think. Both Mali and Adreno are tile based. It's as much a general principle now as the traditional approach. I highly doubt that PowerVR has a hold on the principle itself.
 

psy18

Member
Making it perfectly compatible with the traditional rendering method sounds like the real breakthrough in this.

I wonder if any of the engineers from the competing side reading this news will slap their forehead while saying "So that's it!!!".
 

dr_rus

Member
Making it perfectly compatible with the traditional rendering method sounds like the real breakthrough in this.

I wonder if any of the engineers from the competing side reading this news will slap their forehead while saying "So that's it!!!".

If these engineers needed a web article to figure that out, then they aren't very good engineers.
 

BAW

Banned
I had the impression that desktop GPUs gained the ability to "just render what you see" (e.g. similar to the Dreamcast) around the GeForce 3 days, so this must not be a 100% new thing, unless tile-based is even more efficient than what existed before?
 

Locuza

Member
Making it perfectly compatible with the traditional rendering method sounds like the real breakthrough in this.

I wonder if any of the engineers from the competing side reading this news will slap their forehead while saying "So that's it!!!".

11 November 2015:
Andrew Lauritzen from Intel said:
Yeah Maxwell already has one foot towards TBDR and I imagine everyone will do more of that in the future. Should be interesting!
https://twitter.com/AndrewLauritzen/status/664513680374034432

So no, they are not surprised ;)

Edit: Because it's funny:
Andrew Lauritzen from Intel said:
They do a quasi-tiling/buffering/reordering of triangles (up to ~2k or so IIRC) within a draw call and play games with the ROPs/mem.
Andrew Lauritzen from Intel said:
But shhhh, it's a big secret ;)
 

DonMigs85

Member
I had the impression that desktop GPUs gained the ability to "just render what you see" (e.g. similar to the Dreamcast) around the GeForce 3 days, so this must not be a 100% new thing, unless tile-based is even more efficient than what existed before?
That was HSR, or hidden surface removal. It also falls under ATI's old HyperZ feature set (early Z rejection and fast Z clears).
 

Window

Member
I had the impression that desktop GPUs gained the ability to "just render what you see" (e.g. similar to the Dreamcast) around the GeForce 3 days, so this must not be a 100% new thing, unless tile-based is even more efficient than what existed before?

This is essentially (from my limited understanding) about breaking up a frame into smaller tiles, which can then be stored in an on-chip buffer, improving memory usage efficiency. There is also less work to be done per tile than for an entire frame. As the tiles are largely independent, the process can also be parallelized.
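
A toy CPU-side sketch of that flow, just to make it concrete (tile size and triangles are made up, and real hardware does the binning in fixed-function units):

# Toy tile binner: map each triangle to the screen tiles its bounding box
# overlaps, then handle each tile's triangle list independently. A real
# GPU bins against on-die storage and writes each finished tile to
# memory once instead of per-triangle.

from collections import defaultdict

TILE = 16  # tile side in pixels (hypothetical)

def bin_triangles(triangles):
    """Return {(tile_x, tile_y): [triangles overlapping that tile]}."""
    bins = defaultdict(list)
    for tri in triangles:
        xs = [v[0] for v in tri]
        ys = [v[1] for v in tri]
        # Conservative binning by axis-aligned bounding box.
        for ty in range(min(ys) // TILE, max(ys) // TILE + 1):
            for tx in range(min(xs) // TILE, max(xs) // TILE + 1):
                bins[(tx, ty)].append(tri)
    return bins

tris = [((2, 2), (30, 4), (10, 28)),     # spans four tiles
        ((40, 40), (60, 42), (50, 60))]  # spans four other tiles
for tile, batch in sorted(bin_triangles(tris).items()):
    print(f"tile {tile}: {len(batch)} triangle(s)")  # shade per tile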
 

Durante

Member
Just watched the video.

Now that is good tech reporting; what an incredible treat compared to the opinionated slop that usually passes for technology reviews on YouTube. I'd still rather have an article, but I guess it's a sign of the times :p

Anyway, this explains a whole lot. It was always a bit suspicious just how much Maxwell improved in efficiency on the same node. More cache and compression are nice, but they normally aren't game-changers -- and now we see that the game changer was something else.

Now I'd love to know the exact technical details of why they moved from the diagonal rasterization pattern in Maxwell to a vertical one in Pascal.
 

PnCIa

Member
Does the entire rasterization stage need to be finished as a whole before pixel shading can be done... or is it possible that individual tiles, once finished, can be pushed forward in the pipeline?
 

Durante

Member
This also explains why Maxwell increased significantly more in performance than Kepler did after its release. It's not some secret Nvidia conspiracy to make older GPUs look bad (surprise); it's because wrangling an API designed for immediate mode into something that runs efficiently under the hood on an architecture like this is obviously a massive engineering effort.

I have even more respect for the achievements of the NV driver engineering team now.
 

tuxfool

Banned
Old time PowerVR people must be kicking themselves. They're slowly being proven right after all these years.
 
The crazy conspiracy theory of Nvidia ruining older GPU performance to make you buy Maxwell/Pascal can now be thrown out into the garbage bin where it belongs.
 

blu

Wants the largest console games publisher to avoid Nintendo's platforms.
The crazy conspiracy theory of Nvidia ruining older GPU performance to make you buy Maxwell/Pascal can now be thrown out into the garbage bin where it belongs.
Well, they did drop Vulkan@Fermi at the last moment.
 

Totbjorn

Neo Member
Interesting.

Tile based rendering is only one of several tricks that the mobile GPU designs use to reduce power consumption.
Now that heat seems to be the hard limit on high-end GPUs, perhaps it's time for these designs to be scaled up to the PC high end?

Some more competition in the PC GPU market would be great.

Does anyone have any insight into what would be needed to make, e.g., PowerVR and ARM's Mali scale to Nvidia Titan levels?
They already scale from ultra-low to kind of mid-range by PC standards, and Pascal scales from Titan down to mid-mobile levels.
I assume they'd at least need to add memory controllers for HBM or GDDR, but are there other roadblocks?

Historically I believe D3D driver support has been a problem, but with the rise of more direct APIs like D3D 12 and Vulkan that should be easier, right?
 

KOCMOHABT

Member
Source Article:
http://www.realworldtech.com/tile-based-rasterization-nvidia-gpus/

Video explanation with demonstration, which includes:
A brief refresher on the 3D pipeline
An explanation of the DirectX shader code, which is available on Github
Behavior of the code on an AMD GPU
Behavior of the code on Nvidia Maxwell and Pascal GPUs
Discussion and analysis of the results

https://youtu.be/Nc6R1hwXhL8

Thanks, interesting topic.

Cool stuff; interesting that they changed the approach between generations.
 

martino

Member
Well, they did drop Vulkan@Fermi at the last moment.

Is it really useful for 2012 GPU tech to support a modern game API that it will be bad at?
But why not, then? I hope you're also unhappy that there's no TeraScale support.
 

Durante

Member
Historically I believe D3D driver support has been a problem, but with the rise of more direct APIs like D3D 12 and Vulkan that should be easier, right?
A GPU which doesn't effectively run DX11 (and even DX9 and OpenGL) would be a hard sell to the vast majority of PC gamers.

I really think driver support is a major roadblock.
 

pottuvoi

Banned
Most of the work on tiled rendering was done prior to PowerVR's GPU implementation, and it has been used by a lot of companies since then, including MS and Intel. Parts of tiled rendering are used in offline renderers, I think. Both Mali and Adreno are tile based. It's as much a general principle now as the traditional approach. I highly doubt that PowerVR has a hold on the principle itself.
Also, once upon a time there was a company called GigaPixel which had a TBDR architecture in the works.
It was bought by 3dfx, which was later bought by Nvidia.
Yes, pretty much, along with other enhancements over Kepler.
It's what mobile GPUs from Imagination and ARM use to improve efficiency and reduce power consumption. It was in the Dreamcast's PowerVR chip and helped it punch above its weight.
Sadly, even PowerVR dropped order-independent transparency from their implementation after the Dreamcast.

Proper OIT is still one of the most-wanted features in real-time graphics.
 

blu

Wants the largest console games publisher to avoid Nintendo's platforms.
Is it really useful for 2012 GPU tech to support a modern game API that it will be bad at?
But why not, then? I hope you're also unhappy that there's no TeraScale support.
As a matter of fact I'm not happy.
 

Totbjorn

Neo Member
A GPU which doesn't effectively run DX11 (and even DX9 and OpenGL) would be a hard sell to the vast majority of PC gamers.

I really think driver support is a major roadblock.

Sure, but less and less so as they get older, and I believe both ARM and Imagination support DX11.
In a couple of years it will probably be OK to have subpar DX11 drivers, as long as they support it.

Competing in the consumer space could be another problem, though. These companies are used to selling to other companies, not directly to consumers, but perhaps someone like Samsung or even Sony would be interested in entering the PC GPU market with licensed IP.

Another option: some of the bigger AIB manufacturers like ASUS could be interested in making their own chips.
 

dr_rus

Member
11 November 2015:

https://twitter.com/AndrewLauritzen/status/664513680374034432

So no, they are not surprised ;)

Edit: Because it's funny:
Yeah, the hints about this have been around for some time now.

This also explains why Maxwell increased significantly more in performance than Kepler did after its release. It's not some secret Nvidia conspiracy to make older GPUs look bad (surprise); it's because wrangling an API designed for immediate mode into something that runs efficiently under the hood on an architecture like this is obviously a massive engineering effort.
It's worth mentioning that it's not a complete TBDR; it's a hybrid which is able to use tiled rendering inside the L2 when it makes sense, and when it doesn't, it falls back to the usual immediate mode. It can do this intraframe IIRC, between different "passes" (not sure this still means what it used to, but I don't know what else to call it).
Kepler's performance was increased with drivers as well. Unfortunately, Kepler's DCC wasn't very robust or uniform, and it breaks very often in recent games, which is partially the reason for its subpar performance in >DX11.0 titles.
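
To make the hybrid idea concrete, here's a purely speculative sketch of such a decision; the actual conditions and thresholds are not public, so every name and number below is invented:

# Speculative sketch of a hybrid tiled/immediate decision. The real
# heuristics are not public; the budget and trigger here are invented
# purely for illustration.

L2_TILE_BUDGET = 512 * 1024  # hypothetical slice of L2 reserved for tiling

def pick_mode(tile_side, bytes_per_pixel, attrib_bytes_per_tri, binned_tris):
    """Tile the pass if one tile's pixels plus its binned triangle
    attributes fit the budget; otherwise fall back to immediate mode."""
    working_set = (tile_side * tile_side * bytes_per_pixel
                   + binned_tris * attrib_bytes_per_tri)
    return "tiled" if working_set <= L2_TILE_BUDGET else "immediate"

# A typical pass tiles fine; one with huge per-vertex payloads might not.
print(pick_mode(64, 8, attrib_bytes_per_tri=96, binned_tris=1200))    # tiled
print(pick_mode(64, 8, attrib_bytes_per_tri=1536, binned_tris=1200))  # immediate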

Historically I believe D3D driver support has been a problem, but with the rise of more direct APIs like D3D 12 and Vulkan that should be easier, right?
A more direct API usually means more complexity, not less. However, both modern D3D and Vulkan (obviously) have built-in support for TBDRs, so this isn't an issue. The reason ARM/Img/Qualcomm don't try their GPUs on PC desktops is that they understand the effort there is much more than just scaling their mobile chips to >300mm^2, while the market itself isn't growing fast enough to accommodate more players. Basically, it's unlikely that they would gain anything by this, at least not until the end of physical scaling.
 

Durante

Member
It's worth mentioning that it's not a complete TBDR; it's a hybrid which is able to use tiled rendering inside the L2 when it makes sense, and when it doesn't, it falls back to the usual immediate mode. It can do this intraframe IIRC, between different "passes" (not sure this still means what it used to, but I don't know what else to call it).
Well, that makes sense; it should always be possible to write code which breaks the underlying assumptions to the point where you (at best) don't get any advantage from tiling. It would be extremely interesting to get stats on various current games and how much of their rendering (if any) is tiling-friendly and how much isn't.

For example, I imagine that doing multiple deferred shading passes over a G-buffer should be a good use case.
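
As toy arithmetic (made-up sizes, and this is the full-TBDR ideal rather than what the hybrid necessarily achieves): chaining passes per tile keeps the intermediate reads on-chip.

# Toy DRAM-traffic comparison for N shading passes over a G-buffer.
# Sizes are invented; the point is only that per-tile chaining lets
# intermediate results stay on-chip between passes.

GBUF_BYTES = 1920 * 1080 * 20  # hypothetical 20 B/pixel G-buffer
PASSES = 3

# Immediate: every pass streams the whole G-buffer through DRAM.
immediate = PASSES * 2 * GBUF_BYTES  # read + write per pass

# Tiled ideal: each tile is fetched once, all passes run on it
# in-cache, and the final result is written once.
tiled = 2 * GBUF_BYTES  # one read + one write total

print(f"immediate: {immediate / 2**20:.0f} MiB, tiled: {tiled / 2**20:.0f} MiB")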
 

LordOfChaos

Member
Nice... Given Nvidia's power and bandwidth advantages, this makes sense. It's dominating mobile for these reasons.

The things we still don't know about the GPUs we buy... I wonder what other surprises they pack, and if AMD still has some surprises in theirs.
 

ethomaz

Banned
I do question how they handle compatibility with old hardware... they really did create such magical drivers for that... I mean, the effort that must have gone into it.

Nvidia is as far ahead of the competition in GPU tech as Intel is in CPU tech.
 

LordOfChaos

Member
Nvidia is as far ahead of the competition in GPU tech as Intel is in CPU tech.

It's why I'm coming around to the idea of Intel buying RTG if AMD parts it out. Radeon Technologies Group has some capable engineers, but they can't get much funding from AMD, hence Polaris and Vega being largely evolutionary over GCN while trying to compete with Pascal.

Imagine Intel's funding, plus Intel's still 18-24 month fab advantage over the universe (everyone else's 14/16nm is what Intel would call 22nm with FinFET), applied to RTG GPUs.


I really think it's why RTG started branding itself RTG and now you see RTG stamped so many times over every RTG article.

RTG.
 

Lumberjackson

Neo Member
I was surprised to see the older AMD GPU doing its rasterization from right to left.

Out of curiosity, has anyone compiled the application and run it on GCN, Kepler, or even Intel iGPU-equipped machines to see how those behave?
 

Durante

Member
I played around a bit with the tool myself just now.

At 8 floats per vertex (let's call it FPV), Pascal (in my 1080) bins up to 1200 triangles. Interestingly, this is quite non-linear in FPV: with 16 FPV it's down to ~485, and with 32 FPV it's down to ~88.
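
Back-of-the-envelope on those numbers (assuming, on my part, 3 unshared vertices per binned triangle and 4-byte floats):

# Rough check of the binning numbers above, assuming 3 unshared vertices
# per binned triangle and 4-byte floats (both assumptions on my part).
for fpv, tris in [(8, 1200), (16, 485), (32, 88)]:
    kib = tris * 3 * fpv * 4 / 1024
    print(f"{fpv:2d} FPV: {tris:4d} triangles -> ~{kib:.0f} KiB of vertex data")

The implied footprint shrinks (~112, ~91, ~33 KiB) instead of staying constant, which fits the non-linearity: by this naive model, vertex data alone isn't the only thing being budgeted.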


Also, the in-tile pattern is diagonal on my 1080:
[image: in-tile rasterization pattern on a GTX 1080]
 