WiiU "Latte" GPU Die Photo - GPU Feature Set And Power Analysis

Status
Not open for further replies.
The Digital Foundry Hypothesis

SPUs: 8 blocks (N1-N8)
TMUs: 4 blocks (J1-J4)
ROPS: 2 blocks ? (no proposed location)
ARM: 1 block (no proposed location)
DSP: 1 block (no proposed location)
Video encode/decode: 1 block (no proposed location)
Command processor: 1 block (no proposed location)

Total blocks explained: 18 (12 with locations)
Total blocks unexplained: 22
 
The DF article thread got locked, anyone know why?
Because it is a most basic example of GAF->internet->GAF, and the OP of that thread was framed in a very poor way that encouraged drive-by trolling.
We already have this thread which was the source for the Digital Foundry article to begin with.
 
The DF article thread got locked, anyone know why?
Because the article is shit, or gaffers got a bit too excited. Maybe both.
edit: beaten with slime and cat fur.

The Digital Foundry Hypothesis



SPUs: 8 blocks (N1-N8)
TMUs: 4 blocks (J1-J4)
ROPS: 2 blocks ? (no proposed location)
ARM: 1 block (no proposed location)
DSP: 1 block (no proposed location)
Video encode/decode: 1 block (no proposed location)

Total blocks explained: 17 (12 with locations)
Total blocks unexplained: 23
Thanks for showing that nothing is solved yet, Thraktor!
 
Because it is a most basic example of GAF->internet->GAF, and the OP of that thread was framed in a very poor way that encouraged drive-by trolling.
We already have this thread which was the source for the Digital Foundry article to begin with.
Thanks. Also, thanks for the title change, I should have put something more descriptive in the first place.
 
Because it is a most basic example of GAF->internet->GAF, and the OP of that thread was framed in a very poor way that encouraged drive-by trolling.
We already have this thread which was the source for the Digital Foundry article to begin with.
You'll have to explain that to me sometime. Thanks for telling me though. Other boards ban you for asking why a board was locked/someone was banned.
 
I'm fascinated by this whole thing. The impression I get is that most of the people with a stick in their ass about the Wii U don't own it and have no plans to. Odd, then, that they're so upset by it.

Even more fascinating are the folks who just, no matter what, really have venom for Nintendo fans. Not for any good reason, mind you...they're just affected deeply by someone who would DARE to have different interests than themselves.
 
The Digital Foundry Hypothesis



SPUs: 8 blocks (N1-N8)
TMUs: 4 blocks (J1-J4)
ROPS: 2 blocks ? (no proposed location)
ARM: 1 block (no proposed location)
DSP: 1 block (no proposed location)
Video encode/decode: 1 block (no proposed location)

Total blocks explained: 17 (12 with locations)
Total blocks unexplained: 23
According to your diagram, there are an additional 3 repeating sections. 1 of them must be the ROPS. Any clue what the other two could be?

Also, DF did mention the command processor.
 
Awesome work

Do you have any understanding why the repeating logic sections differ slightly from each other? Is this common or also unusual?
The logic is all laid out by computer, which will change the layout of components to fit them in as compactly as possible. Incidentally, this is also why all the logic blocks are just a flat brown colour. They're packed so tightly you can't actually make out any patterns in them, unlike the eDRAM, SRAM and I/O, which are arranged in repeating patterns so are easy to make out.

According to your diagram, there are an additional 3 repeating sections. 1 of them must be the ROPS. Any clue what the other two could be?

Also, DF did mention the command processor.
Thanks, updated the post with the command processor. Regarding the extra repeating blocks, I'll put up my own theories in a minute.
 
I'm fascinated by this whole thing. The impression I get is that most of the people with a stick in their ass about the Wii U don't own it and have no plans to. Odd, then, that they're so upset by it.

Even more fascinating are the folks who just, no matter what, really have venom for Nintendo fans. Not for any good reason, mind you...they're just affected deeply by someone who would DARE to have different interests than themselves.
Solipsism. For whatever reason in the console warz, the most heinous quality a human being can possess is not conforming to the hive mind.
 
Someone should do a more in-depth investigation of what's the fraction of "unexplained" die space on a normal AMD GPU (e.g. RV770) compared to this. I think it's not as different as people would expect, particularly considering BC, the ARM core and the DSP.

If no one does it before I'll do it this evening.

Even more fascinating are the folks who just, no matter what, really have venom for Nintendo fans. Not for any good reason, mind you...they're just affected deeply by someone who would DARE to have different interests than themselves.
Oh, there are plenty of good reasons to have a distaste for some Nintendo fans. Go read some of the Wii U speculation threads.
 
I'm fascinated by this whole thing. The impression I get is that most of the people with a stick in their ass about the Wii U don't own it and have no plans to. Odd, then, that they're so upset by it.

Even more fascinating are the folks who just, no matter what, really have venom for Nintendo fans. Not for any good reason, mind you...they're just affected deeply by someone who would DARE to have different interests than themselves.
The thing that baffles me the most about the console wars is the self-proclaimed graphics whores who would rather buy a console than build their own PC for maximum performance (that's what I do: PC for "power" titles, Nintendo for exclusives and third parties able to think outside the box). Anyway, enough OT. It's been really educational so far. I really wonder if we'll ever be able to decode the rest of the GPU...
 
The Digital Foundry Hypothesis



SPUs: 8 blocks (N1-N8)
TMUs: 4 blocks (J1-J4)
ROPS: 2 blocks ? (no proposed location)
ARM: 1 block (no proposed location)
DSP: 1 block (no proposed location)
Video encode/decode: 1 block (no proposed location)

Total blocks explained: 17 (12 with locations)
Total blocks unexplained: 23


They leave all that unexplained, claim with no real evidence that "we can now finally rule out any next-gen pretensions for the Wii U", and don't even link to the original article back here at GAF?

WTF. Maybe they should do the GTTV special on DF; it sure sounds like there are a lot of assholes there.
 
About the Mem1/Mem2 thing

How exactly does that work?

Is the 32 MB eDRAM considered the main RAM? Why else would you name it Mem1?

If so, does the Wii U store the data it needs in Mem2 and preload what it needs next into Mem1, the eDRAM, to make use of its high bandwidth compared to GDDR3?

Is the eDRAM now 140 GB/s or 70 GB/s?

Sorry for all the questions...
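
On the last question: the two figures differ by exactly a factor of two, which is what doubling the interface width at a fixed clock would give. A rough sketch of that arithmetic follows. The 550 MHz clock and the 1024-bit and 2048-bit widths are assumptions picked for illustration; none of them is established in this thread.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, clock_hz: float) -> float:
    """Theoretical peak bandwidth in GB/s (1 GB = 1e9 bytes), one transfer per clock."""
    bytes_per_clock = bus_width_bits / 8
    return bytes_per_clock * clock_hz / 1e9


ASSUMED_CLOCK_HZ = 550e6  # assumption, not taken from the thread

for width_bits in (1024, 2048):  # assumed interface widths
    bandwidth = peak_bandwidth_gb_s(width_bits, ASSUMED_CLOCK_HZ)
    print(f"{width_bits}-bit @ 550 MHz: {bandwidth:.1f} GB/s")
```

Under those assumptions the narrower interface lands near 70 GB/s and the wider one near 140 GB/s, so the open question is really about the width of the eDRAM interface.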
 
Someone should do a more in-depth investigation of what's the fraction of "unexplained" die space on a normal AMD GPU (e.g. RV770) compared to this. I think it's not as different as people would expect, particularly considering BC, the ARM core and the DSP.

If no one does it before I'll do it this evening.
Going by the RV770 die shot on the first page, it looks like something in the 10%-20% range. Of course, the ratios may change for a lower-end chip such as this.
 
We know it has a tessellator of some kind, plus the command processor. Do we know how large the tessellator was on the RV770? How many blocks could it possibly take up on a 40nm die? Plus we have to figure we have legacy logic of some kind. Has anyone seen a die shot of Flipper or Hollywood to know what a TEV unit looks like?
 
Looking things over, I think we can assume that the core parts of the chip are based on the 5550 LE (Redwood core). Everything seems to fit: same clock speed, same GFLOPS, same core configuration, 40nm process. It just seems too similar for it not to be based on that design. Of course, there's all the other stuff on the die that we have little clue about.

http://www.tomshardware.com/reviews/radeon-hd-5550-radeon-hd-5570-gddr5,2704.html

This is a review of a DDR3 version of the 5550. It's low-wattage and requires only passive cooling. Performance-wise it's close to the 4670 but much more efficient. It plays Crysis at about 30fps on medium settings.
 
Looking things over, I think we can assume that the core parts of the chip are based on the 5550 LE (Redwood core). Everything seems to fit: same clock speed, same GFLOPS, 40nm process. It just seems too similar for it not to be based on that design. Of course, there's all the other stuff on the die that we have little clue about.
There's that "V" block that some people have mentioned to be similar to others in the newest AMD GPUs. Maybe the tessellation unit?
 
About the Mem1/Mem2 thing

How exactly does that work?

Is the 32 MB eDRAM considered the main RAM? Why else would you name it Mem1?

If so, does the Wii U store the data it needs in Mem2 and preload what it needs next into Mem1, the eDRAM, to make use of its high bandwidth compared to GDDR3?

Is the eDRAM now 140 GB/s or 70 GB/s?

Sorry for all the questions...

It does appear the GPU is a console in itself.
Like the Wii or GameCube shrunk down to a GPU.
And the GDDR3 would act like the DVD.
 
I'm guessing W1 and W2 are DDR3 memory controllers (even though it's low bandwidth, it might still need two for multichannel layout, no?)

I think U1 and U2 are ROPs. Would make sense with their proximity to the eDRAM and also DDR3 interface.

I think the X/Y area is media decode stuff. I'm guessing the ARM core is F and the DSP perhaps D.

Lots of guesswork, but it doesn't hurt to throw around alternate theories.

Take a look at the ifixit teardown. You can get a general idea of what part of the GPU matches up to the traces on the motherboard and where they lead.

http://www.ifixit.com/Teardown/Nintendo+Wii+U+Teardown/11796/2
 
The Asymmetric Shaders Hypothesis

(This is my own pet theory, and I have no pretensions of expertise here, I'm just putting it up for discussion.)

SPUs: 12 blocks (J1-J4, N1-N8)
TMUs: 4 blocks (Q1, Q2, T1, T2)
ROPs: 4 blocks (U1, U2, W1, W2)
ARM: 1 block (no proposed location)
DSP: 1 block (Y)
Video encode/decode: 1 block (X)
Command processor: 1 block (no proposed location)

Total blocks explained: 24 (22 with locations)
Total blocks unexplained: 16

The above proposal is based on the theory that the shader groupings are not symmetric. In particular, four of the shader groupings have less register memory than the other 8 (the logic being these would handle tasks with lower data reuse). TMUs and ROPs would similarly be split asymmetrically.
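
To keep the block counts honest, here is a small sketch that recomputes the totals for both hypotheses. The unit lists are copied from the two tallies in this thread; the total of 40 blocks is inferred from explained plus unexplained in each, and the helper names are made up for illustration.

```python
TOTAL_BLOCKS = 40  # inferred: explained + unexplained in both tallies

# Each entry: (unit, number of blocks, has a proposed location)
HYPOTHESES = {
    "Digital Foundry": [
        ("SPUs", 8, True), ("TMUs", 4, True), ("ROPs", 2, False),
        ("ARM", 1, False), ("DSP", 1, False),
        ("Video encode/decode", 1, False), ("Command processor", 1, False),
    ],
    "Asymmetric shaders": [
        ("SPUs", 12, True), ("TMUs", 4, True), ("ROPs", 4, True),
        ("ARM", 1, False), ("DSP", 1, True),
        ("Video encode/decode", 1, True), ("Command processor", 1, False),
    ],
}


def tally(units):
    """Return (explained, explained with locations, unexplained) block counts."""
    explained = sum(count for _, count, _ in units)
    located = sum(count for _, count, has_location in units if has_location)
    return explained, located, TOTAL_BLOCKS - explained


for name, units in HYPOTHESES.items():
    explained, located, unexplained = tally(units)
    print(f"{name}: {explained} explained ({located} with locations), "
          f"{unexplained} unexplained")
```

Either way, a large part of the die is still unassigned, which is the point of listing the counts in the first place.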
 
Great work on the diagram, Thraktor!

Of course, now marcan steps up to reveal more info! What use is a 2 MB eFB besides Wii BC? It's only enough for an SD image. Oh yeah...
Considering Marcan calls that pool "MEM0", that might actually be what it is in native mode. That would suggest that the pool is freely accessible.
 
I'm fascinated by this whole thing. The impression I get is that most of the people with a stick in their ass about the Wii U don't own it and have no plans to. Odd, then, that they're so upset by it.

Even more fascinating are the folks who just, no matter what, really have venom for Nintendo fans. Not for any good reason, mind you...they're just affected deeply by someone who would DARE to have different interests than themselves.
Because some people have a natural inability to see things from the perspective of another. Therefore they fear/hate anything that does not conform to their narrow vision of the world. This is true for more than just video games. See politics, race, religion, morality, etc...
 
About the Mem1/Mem2 thing

How exactly does that work?

Is the 32 MB eDRAM considered the main RAM? Why else would you name it Mem1?

If so, does the Wii U store the data it needs in Mem2 and preload what it needs next into Mem1, the eDRAM, to make use of its high bandwidth compared to GDDR3?

Is the eDRAM now 140 GB/s or 70 GB/s?

Sorry for all the questions...
I'd say it's a safe bet the larger 1GB pool is still the "main" memory in the sense that it stores more, like how in your computer your processor has an L1, L2, maybe L3 cache but still has larger main memory. The eDRAM is probably used for high-bandwidth GPU operations, a framebuffer, maybe a scratch space between GPU and CPU, but not most of the game memory since that obviously needs more than 32MB.



And the GDDR3 would act like the DVD.
For Wii emulation mode, most of the Wii's memory could fit in the eDRAM, yeah. Not for Wii U mode, of course. The Wii still had 64 MB of external GDDR3 to account for, but I guess the DDR3 is fast enough for that part; the eDRAM would stand in for the 1T-SRAM and eDRAM on the Wii GPU.
 
Considering Marcan calls that pool "MEM0", that might actually be what it is in native mode. That would suggest that the pool is freely accessible.
Yep. They may have needed it for Wii BC and decided to just give devs access to it anyway, giving them a neat place to store their gamepad FB.
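
A quick back-of-the-envelope check on what fits in a 2 MB pool. This sketch assumes a single 32-bit colour buffer with no depth buffer or multisampling, treats 2 MB as 2 MiB, and uses 854x480 for the gamepad screen; those figures are assumptions brought in from outside this thread.

```python
POOL_BYTES = 2 * 1024 * 1024  # the 2 MB pool, taken as 2 MiB


def framebuffer_bytes(width: int, height: int, bytes_per_pixel: int = 4) -> int:
    """Size of one colour buffer, assuming 32-bit pixels by default."""
    return width * height * bytes_per_pixel


def fits(width: int, height: int) -> bool:
    """True if a single colour buffer of this size fits in the pool."""
    return framebuffer_bytes(width, height) <= POOL_BYTES


TARGETS = {
    "SD 640x480": (640, 480),
    "Gamepad 854x480 (assumed)": (854, 480),
    "720p 1280x720": (1280, 720),
}

for label, (w, h) in TARGETS.items():
    size_mib = framebuffer_bytes(w, h) / (1024 * 1024)
    print(f"{label}: {size_mib:.2f} MiB, fits in pool: {fits(w, h)}")
```

Under those assumptions an SD or gamepad-sized buffer fits with room to spare, while a 720p buffer does not, which is consistent with the "only enough for an SD image" remark.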
 
I'd say it's a safe bet the larger 1GB pool is still the "main" memory in the sense that it stores more, like how in your computer your processor has an L1, L2, maybe L3 cache but still has larger main memory. The eDRAM is probably used for high-bandwidth GPU operations, a framebuffer, maybe a scratch space between GPU and CPU, but not most of the game memory since that obviously needs more than 32MB.
I think the eDRAM is the main memory in the sense that it holds the most important things (framebuffer, etc.). Less important data (e.g. textures) goes into the much larger secondary DDR3 memory pool.

Edit: I think that A and/or B may be dedicated to managing all the eDRAM and SRAM.
 
I think the eDRAM is the main memory in the sense that it holds the most important things (framebuffer, etc.). Less important data (e.g. textures) goes into the much larger secondary DDR3 memory pool.
So the idea is to write out as much as you can to DDR3, and then read it all back through EDRAM as necessary? Due to the penalty that comes with direction switching to write to DDR3, I would think you'd want to avoid writing out as much as possible.
 


1) The non-TMU/non-SIMD/non-IO parts of RV770 are about as big as the SIMD/TMU parts. And that GPU has a lot of SIMD/TMUs.
2) The non-TMU/non-SIMD/non-IO parts of Wii U's GPU, using the original assumptions, are about 2.4 times as big as the SIMD/TMU parts.
3) The latter includes at least 1 ARM CPU, a DSP, logic for managing and interfacing with 2 separate pools of eDRAM, and maybe some BC components.

My conclusion: If there is some special GPU sauce, it's not all that significant.
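
To put the two ratios on the same footing, here is a minimal sketch that converts "other logic is r times the SIMD/TMU area" into a share of the non-IO logic. The ratios 1.0 and 2.4 are the rough estimates from points 1 and 2; the function name is made up for illustration.

```python
def other_logic_share(ratio_other_to_shader: float) -> float:
    """Share of non-IO logic that is NOT SIMD/TMU, given other/shader area ratio."""
    return ratio_other_to_shader / (1.0 + ratio_other_to_shader)


ESTIMATES = {
    "RV770": 1.0,      # other logic about as big as SIMD/TMU
    "Wii U GPU": 2.4,  # other logic about 2.4x SIMD/TMU (original assumptions)
}

for chip, ratio in ESTIMATES.items():
    share = other_logic_share(ratio)
    print(f"{chip}: {share:.0%} of non-IO logic is not SIMD/TMU")
```

That works out to roughly half of the non-IO logic on RV770 and roughly 70% on the Wii U GPU, and the gap has to cover everything listed in point 3.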
 

blu

Wants the largest console games publisher to avoid Nintendo's platforms.
Ok, re GPU blocks/processors, the following can all be assumed to be in there:

  • Command Processor and Thread Scheduler (not necessarily the same block)
  • Trisetup and rasterizer (R800 dropped that and delegated the workload to SPs)
  • Global Data Share (traditionally not very large, and likely encased nicely by some of the numerous embedded pools, in a much larger size)
  • A bunch of caches (vertex, texture) which could be really tiny or not so much (again, memory pools ahoy)
  • DMA engines
  • Ring buses
  • Tessellator (likely still sitting in fixed-function silicon)

BTW, a quick google for ARM9 die area got me to this article discussing a Qualcomm broadband/app processor (yes, ARM9 stand-alone dies are a bit hard to track down these days), where the core appears to be ~0.8 mm² at 40nm (the original part is 90nm, so I've scaled the area by the square of the process ratio).
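
For reference, the squares rule amounts to scaling area by the square of the feature-size ratio. A minimal sketch, assuming ideal scaling (real shrinks are usually less generous); the 90nm figure printed below is back-derived from the ~0.8 mm² estimate and is not a number quoted from the article.

```python
def scale_area(area_mm2: float, from_nm: float, to_nm: float) -> float:
    """Ideal area scaling between process nodes: area goes with (node ratio)^2."""
    return area_mm2 * (to_nm / from_nm) ** 2


AREA_AT_40NM_MM2 = 0.8  # the estimate given above

# Back-derive what that estimate implies for the original 90nm part.
implied_90nm = scale_area(AREA_AT_40NM_MM2, from_nm=40, to_nm=90)
print(f"Implied area at 90nm: {implied_90nm:.2f} mm^2")

# Round trip: shrinking that back to 40nm recovers the estimate.
print(f"Back at 40nm: {scale_area(implied_90nm, from_nm=90, to_nm=40):.2f} mm^2")
```

Either way, an ARM9-class core is tiny next to the unexplained area on this die.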
 


1) The non-TMU/non-SIMD/non-IO parts of RV770 are about as big as the SIMD/TMU parts. And that GPU has a lot of SIMD/TMUs.
2) The non-TMU/non-SIMD/non-IO parts of Wii U's GPU, using the original assumptions, are about 2.4 times as big as the SIMD/TMU parts.
3) The latter includes at least 1 ARM CPU, a DSP, logic for managing and interfacing with 2 separate pools of eDRAM, and maybe some BC components.

My conclusion: If there is some special GPU sauce, it's not all that significant.
Do you think this disqualifies Thraktor's theory about asymmetric shaders?
 
While it is disappointing to hear this thing is anemic on the power side, as long as the games come rolling along it really doesn't matter in the end. Nintendo is legendary because of their first-party support, so they will be fine.
Usually legendary.

I felt the Wii was a huge let down when it came to Nintendo games, personally.
 
Someone should do a more in-depth investigation of what's the fraction of "unexplained" die space on a normal AMD GPU (e.g. RV770) compared to this. I think it's not as different as people would expect, particularly considering BC, the ARM core and the DSP.

If no one does it before I'll do it this evening.
Going by the RV770 die shot on the first page, it looks like something in the 10%-20% range. Of course, the ratios may change for a lower-end chip such as this.
I think it's more than that, it just looks like less since its around the periphery. I'll measure it later.
If you rearrange the orange parts in your preferred picture editing software, you can see that they take up just under a third of the entire chip (189px out of 588px, about 32%). I used the measure tool in Photoshop after stacking them from left to right.

For what it's worth.
...
 
Yeah, but comparing to the size of the whole chip (including IO) doesn't make as much sense for our purposes as comparing to just the SIMD+TU block, which I did.
 