
WiiU "Latte" GPU Die Photo - GPU Feature Set And Power Analysis

Everyone should just remember that long ago brain_stew (a trusted insider) posted that nobody would be disappointed w/ Wii U's memory subsystem. I think this die photo shows this to be true. The 64-bit bus to main memory definitely does not tell the whole story - or even most of it!
 
So, do we have hard numbers?

They're not really hard, but we're on our way there:



Wii U!

3 x OOE CPU at 1.2 GHz
~300-400 GFLOPS GPU (???)
2GB 1T-SRAM @ 12.8 GB/s (???)
32MB eDRAM @ 140 GB/s (???)
1MB SRAM/eDRAM (???)

Durango!

6 x OOE CPU at 1.6 GHz
1.243 TFLOPS GPU (3-4x more than Wii U)
8GB DDR3 RAM @ 68 GB/s (4x more than the Wii U, ~5x faster than the Wii U)
32MB eSRAM @ 102 GB/s (not eDRAM)

Orbis!

6 x OOE CPU at 1.6 GHz
1.843 TFLOPS GPU (5-6x more than Wii U; 1.5x more than Durango)
4GB GDDR5 RAM @ 176 GB/s (2x more than Wii U, ~14x faster than Wii U; half as much as Durango, ~2.6x faster than Durango)
0MB eDRAM @ 0 GB/s (infinitely less than Wii U; infinitely less than Durango)
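
If anyone wants to sanity-check the multipliers in these lists, the arithmetic is just division. A minimal Python sketch, taking the rumored figures above at face value (the 0.35 TFLOPS Wii U entry is my assumed midpoint of the ~300-400 GFLOPS guess, not a confirmed spec):

# Sanity-check of the multipliers quoted above; every input is a rumored
# figure, not a confirmed spec.
specs = {
    "Wii U":   {"tflops": 0.35,  "ram_gb": 2, "bw": 12.8},  # 0.35 = assumed midpoint
    "Durango": {"tflops": 1.243, "ram_gb": 8, "bw": 68.0},
    "Orbis":   {"tflops": 1.843, "ram_gb": 4, "bw": 176.0},
}
base = specs["Wii U"]
for name in ("Durango", "Orbis"):
    s = specs[name]
    print(f"{name}: {s['tflops'] / base['tflops']:.1f}x GPU FLOPS, "
          f"{s['ram_gb'] / base['ram_gb']:.0f}x RAM, "
          f"{s['bw'] / base['bw']:.1f}x main-RAM bandwidth")
# Durango: 3.6x GPU FLOPS, 4x RAM, 5.3x main-RAM bandwidth
# Orbis: 5.3x GPU FLOPS, 2x RAM, 13.8x main-RAM bandwidth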
 
Not too long ago, I saw people in the Sony speculation threads guessing 2.5 TFLOPS minimum for both next-gen consoles. Just like in the WUST, it was just a bunch of random people being excited and saying shit they wanted to say. I honestly can't believe all of the grudges people have held against those speculation threads - it blows my mind. I was in and out of the threads the entire time, and nowhere did I get the impression that the specs would be much higher than they are now.

The way a large number of people behaved when people like Arkham and, to a much lesser extent, lherre came in as confirmed sources and told people to lower their expectations left a bad taste in my mouth.

edit: also, a lot of the early speculation for PS4/720 was people telling us that anything more than 2GB of RAM was unrealistic
 

BlackJace

Member
They're not really hard, but we're on our way there:



Wii U!

3 x OOE CPU at 1.2 GHz
~300-400 GFLOPS GPU (???)
2GB 1T-SRAM @ 12.8 GB/s (???)
32MB eDRAM @ 140 GB/s (???)
1MB SRAM/eDRAM (???)

Durango!

6 x OOE CPU at 1.6 GHz
1.243 TFLOPS GPU (3-4x more than Wii U)
8GB DDR3 RAM @ 68 GB/s (4x more than the Wii U, ~5x faster than the Wii U)
32MB eSRAM @ 102 GB/s (not eDRAM)

Orbis!

6 x OOE CPU at 1.6 GHz
1.843 TFLOPS GPU (5-6x more than Wii U; 1.5x more than Durango)
4GB GDDR5 RAM @ 176 GB/s (2x more than Wii U, ~14x faster than Wii U; half as much as Durango, ~2.6x faster than Durango)
0MB eDRAM @ 0 GB/s (infinitely less than Wii U; infinitely less than Durango)

Thank you!
 
The way a large number of people behaved when people like Arkham and, to a much lesser extent, lherre came in as confirmed sources and told people to lower their expectations left a bad taste in my mouth.

I'm glad you said people.

Because after a few days of not really posting about him, I was a little princess towards Arkam. AND SPELL IT ARKAM, DAMMIT! He's not Batman. More like the commissioner.
 

Meelow

Banned
It's all just numbers on a page for now. As many people have said, the proof will be in the games, so let's wait until we see a major next-gen multiplatform third-party game across all three platforms before we go jumping to conclusions (we should see at least one at E3 in June).

Even if the Wii U GPU is at the low end of the guesses so far, a 176 GFLOPS GPU is still a 14.6x leap over the original Wii GPU, with the console also having 22x more system RAM and 32MB of eDRAM (I don't know how many times more powerful the Wii U CPU is than the Wii CPU).

I'm sure Nintendo, and the third-party developers that work to the strengths of the system, will create some astounding-looking games, just as Nintendo did with Mario Galaxy 1 & 2 and Skyward Sword, Retro did with Metroid Prime 3, and Monolith Soft did with Xenoblade Chronicles on near-decade-old hardware.

I bought the Wii U for exclusive games (PS4/720 will be my consoles for multiplatform games), so I for one am truly excited to see the major exclusive games at E3. Can't wait! :)

Exactly.

The usual consensus/excuse with Nintendo owners is that people buy them for Nintendo games and get other systems for multiplats.
But I really want to see some good multiplat support on the system, because I want the option of off-TV play, which looks like it won't be available on the other consoles. So I think there is enough interest and reason to want and expect ports that support off-TV play, even at the cost of IQ.

I really wish Nintendo made it easier for devs to do this. Hopefully the custom hardware isn't too much of a barrier.

I 100% agree with this. I would like to play first-party Nintendo games and third-party games on one system and not have to buy two systems (I'm not a PC gamer).

We know the Wii U dev kits at E3 2011 were weaker, yet it was still really easy to port 360 games to Wii U. Maybe the architecture changed, because the Wii U architecture looks closer to the PS4/720 than the Wii ever was to the PS3/360 (even though weren't they all different?).

I guess we'll have to see in the future how the Wii U stands up against the PS4/720. Ideaman did confirm that a next-gen game most people thought the Wii U couldn't handle is coming to Wii U.
 

Darryl

Banned
The way a large number of people behaved when people like Arkham and, to a much lesser extent, lherre came in as confirmed sources and told people to lower their expectations left a bad taste in my mouth.

I completely missed what you're talking about - no clue who that dude is - but if it was a bunch of people dismissing him, then that shit happens in these threads all of the time. It's like a constant process in any spec discussion forum.
 
Orbis!

6 x OOE CPU at 1.6 GHz
1.843 TFLOPS GPU (5-6x more than Wii U; 1.5x more than Durango)
4GB GDDR5 RAM @ 176 GB/s (2x more than Wii U, ~14x faster than Wii U; half as much as Durango, ~2.6x faster than Durango)
0MB eDRAM @ 0 GB/s (infinitely less than Wii U; infinitely less than Durango)

Orbis is the odd one out here; it has the most PC-like configuration of the three consoles, if its lack of eDRAM/eSRAM (as found in the Durango) is to be believed.

PC graphics cards rely on that massive GDDR5 bandwidth to offset the lack of eDRAM's monster on-die bandwidth.
 
I completely missed what you're talking about - no clue who that dude is - but if it was a bunch of people dismissing him, then that shit happens in these threads all of the time. It's like a constant process in any spec discussion forum.

Well, you said you followed the WUST threads, so I assumed you would have known about one of the bigger things to happen in them.
 
A lot of that is wrong. Both Durango and Orbis have 8 cores, and the Wii U uses gDDR3, not 1T-SRAM. I don't even want to know how large a motherboard would be if it had that much SRAM.
Don't tell anyone, but 1T-SRAM is not really SRAM; it's a hybrid design of sorts. True SRAM is 4T/6T/8T or 10T, the nomenclature referring to the number of transistors per cell.

1T-SRAM is a single access transistor interfacing a DRAM cell (instead of an SRAM cell), presented behind an SRAM-style interface.


Also, the Wii U doesn't use gDDR3.
 

Schnozberry

Member
Don't tell anyone, but 1T-SRAM is not really SRAM; it's a hybrid design of sorts. True SRAM is 4T/6T/8T or 10T, the nomenclature referring to the number of transistors per cell.

1T-SRAM is a single access transistor interfacing a DRAM cell (instead of an SRAM cell), presented behind an SRAM-style interface.

Sorry, PSRAM. It's a lot like eDRAM, from what I've read, but smaller on die and consumes less power. Google is my friend.
 

plank

Member
They're not really hard, but we're on our way there:



Wii U!

3 x OOE CPU at 1.2 GHz
~300-400 GFLOPS GPU (???)
2GB 1T-SRAM @ 12.8 GB/s (???)
32MB eDRAM @ 140 GB/s (???)
1MB SRAM/eDRAM (???)

Durango!

6 x OOE CPU at 1.6 GHz
1.243 TFLOPS GPU (3-4x more than Wii U)
8GB DDR3 RAM @ 68 GB/s (4x more than the Wii U, ~5x faster than the Wii U)
32MB eSRAM @ 102 GB/s (not eDRAM)

Orbis!

6 x OOE CPU at 1.6 GHz
1.843 TFLOPS GPU (5-6x more than Wii U; 1.5x more than Durango)
4GB GDDR5 RAM @ 176 GB/s (2x more than Wii U, ~14x faster than Wii U; half as much as Durango, ~2.6x faster than Durango)
0MB eDRAM @ 0 GB/s (infinitely less than Wii U; infinitely less than Durango)

What were the comparison specs for the Wii, 360, and PS3?
 

ozfunghi

Member
The way a large number of people behaved when people like Arkham and to a much lesser extent llhere came in as confirmed sources and told people to lower their expectations left a bad taste in my mouth.

Like Guek said earlier, you can be right and still make your point in a trollish fashion. The way Arkam leaked his info was, for many people (myself included), a bigger issue than the content of his info. It was very hit-and-run as well. Later he admitted not having hands-on time with the hardware himself, and later still he admitted his studio had massively increased performance by tweaking the code for the hardware - meaning his initial comments were inaccurate and premature.
 
Whats was the comparison specs with Wii, 360, PS3?

The gap between Wii and 360/PS3 was a whole lot greater than the gap between Wii U and Durango, or Wii U and Orbis.

What's new is that unlike last gen, where there was an insignificant gap between the 360 and the PS3, there is a pretty significant gap between Durango and Orbis - with Orbis being the more powerful one.
 
They're not really hard, but we're on our way there:



Wii U!

3 x OOE CPU at 1.2 GHz
~300-400 GFLOPS GPU (???)
2GB 1T-SRAM @ 12.8 GB/s (???)
32MB eDRAM @ 140 GB/s (???)
1MB SRAM/eDRAM (???)

Durango!

6 x OOE CPU at 1.6 GHz
1.243 TFLOPS GPU (3-4x more than Wii U)
8GB DDR3 RAM @ 68 GB/s (4x more than the Wii U, ~5x faster than the Wii U)
32MB eSRAM @ 102 GB/s (not eDRAM)

Orbis!

6 x OOE CPU at 1.6 GHz
1.843 TFLOPS GPU (5-6x more than Wii U; 1.5x more than Durango)
4GB GDDR5 RAM @ 176 GB/s (2x more than Wii U, ~14x faster than Wii U; half as much as Durango, ~2.6x faster than Durango)
0MB eDRAM @ 0 GB/s (infinitely less than Wii U; infinitely less than Durango)

You could add: Durango and Orbis CPUs: ~100 GFLOPS; Wii U CPU: ~12 GFLOPS.

Jesus, the Wii U seems to have a fairly good GPU (if you consider what the original Wii had...), but its CPU is just... weak.
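
For context, ballpark CPU figures like these fall out of cores x clock x FLOPs-per-cycle. A rough sketch, assuming 8 Jaguar cores at 1.6 GHz with 8 single-precision FLOPs per cycle (per the 8-core correction earlier in the thread), and 3 Espresso-style cores at ~1.24 GHz with paired singles giving ~4 FLOPs per cycle - the per-cycle throughputs here are assumptions, not confirmed specs:

def peak_gflops(cores, clock_ghz, flops_per_cycle):
    # Theoretical peak: every core issues its maximum FLOPs every cycle.
    return cores * clock_ghz * flops_per_cycle

print(peak_gflops(8, 1.6, 8))    # 102.4 -> the "~100 GFLOPS" Durango/Orbis figure
print(peak_gflops(3, 1.24, 4))   # 14.88 -> same ballpark as the ~12 GFLOPS quoted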
 

PetrCobra

Member
In any comparison of the Wii U versus other consoles, we must keep in mind that the console actually has a second screen. It needs to render content for that second screen at the same time as it runs the game on the main screen. That means a fair amount of overhead is needed to keep the graphics on the main screen at a certain standard.

It seemed to me that this wasn't getting much attention from the people who like to compare FLOPS, so this is just a little reminder.
 

guek

Banned
Like Guek said earlier, you can be right and still make your point in a trollish fashion. The way Arkam leaked his info was, for many people (myself included), a bigger issue than the content of his info. It was very hit-and-run as well. Later he admitted not having hands-on time with the hardware himself, and later still he admitted his studio had massively increased performance by tweaking the code for the hardware.

While I agree that the initial treatment of Arkam was unacceptable (I was overly hostile as well), he did pop up out of nowhere, drop some knowledge as if we should obviously just take his word for it, and then disappear for 1-2 weeks, all before he was verified as a dev by the mods. It wouldn't have been as big of an issue if he had actually presented himself properly, stuck around to clarify what he meant, and gotten himself verified. That's why people were far less hostile to lherre, who also tended to try to rein in expectations.


To those more knowledgeable than I: is there any chance you'll be able to learn more about the fixed functions on the chip, or is that a task beyond the limits of your wizardry?

In any comparison of the Wii U versus other consoles, we must keep in mind that the console actually has a second screen. It needs to render content for that second screen at the same time as it runs the game on the main screen. That means a fair amount of overhead is needed to keep the graphics on the main screen at a certain standard.

It seemed to me that this wasn't getting much attention from the people who like to compare FLOPS, so this is just a little reminder.

Depends on the function in question. Mirroring to the GamePad, for example, is essentially free, from what I've been told.
 

Alexios

Cores, shaders and BIOS oh my!
In any comparison of the Wii U versus other consoles, we must keep in mind that the console actually has a second screen. It needs to render content for that second screen at the same time as it runs the game on the main screen.
Eh? So? If they want to push graphical fidelity on the main screen, they can relegate the GamePad's screen to a minimap, HUD, menu, or anything else they can think of, just to have it be anything but blank (in fact, if they also want to allow off-TV play for the given title, they will have to make the second screen's use simplistic and non-mandatory in this manner, otherwise the game won't be playable with anything but two screens). Not every game is going to render two normal 3D views, just like not every PS360 game employs picture-in-picture and split screen just because in theory it can. It's up to the developer.
 

Thraktor

Member
Alright, I've been doing a few calculations around the die size. The total die is 146.48mm², and of that about 50.78%, or 74.38mm² is what I'll call "GPU logic" (that is, everything except the memory pools and interfaces). Now, looking at Durante's image from the first page:

[Durante's annotated die shot from page one]

Let's assume, for a minute, that the four blue sections are TMUs, the red sections are shader clusters, and the two yellow sections down at the bottom right are ROP bundles. This would, we assume, produce a core configuration of 320:16:8. Now, if you measure out the sizes of these, you get only 28.9% of the total GPU logic space, just 21.48mm². What's going on with the other 52.9mm² of GPU logic? There's probably a DSP and ARM on there, but that accounts for a couple of mm² at most.
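
The bookkeeping is easy to replay (a throwaway sketch restating the measurements above; the last line assumes the rumored 550 MHz clock, which is not confirmed):

die_area = 146.48                      # mm^2, total die (Chipworks photo)
gpu_logic = die_area * 0.5078          # ~74.38 mm^2: everything except memory pools/interfaces
identified = gpu_logic * 0.289         # ~21.5 mm^2: the blue/red/yellow blocks above
unaccounted = gpu_logic - identified   # ~52.9 mm^2 of unexplained logic
print(round(gpu_logic, 2), round(identified, 2), round(unaccounted, 2))

# If the 320:16:8 config is right, peak shader throughput at a rumored
# 550 MHz would be 320 ALUs * 2 FLOPs (multiply-add) * 0.55 GHz:
print(320 * 2 * 0.55, "GFLOPS")   # 352.0 - inside the ~300-400 range quoted earlier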

There are basically two possibilities here:

- The 320:16:8 core config is accurate, and there's ~50mm² of custom logic or "secret sauce" (more than twice the size of the conventional GPU logic).

- The 320:16:8 core config isn't accurate.

Here's the interesting thing about the second possibility: it challenges one assumption that has gone unquestioned during our analysis - that all shaders are equal. What if they aren't?

What if Nintendo has gone for an asymmetrical architecture? What if they've decided that some of the shaders will be optimised for some tasks, and some for others? This doesn't necessarily require a complete reworking of the shader microarchitecture, and it could be as simple as having different shader clusters with different amounts of register memory. The ones with lots of register memory would be suited for compute tasks (we can assume that these are the red ones) and the others could be dedicated to graphical tasks with low memory reuse (the blue squares above the red squares might be a fit for these).

Why would I think Nintendo would do something like this? Well, for one, they've done exactly the same thing with the CPU. Although this is pending the CPU die photo, it appears that Nintendo have gone with three identical cores with very different amounts of cache: two of the cores get 512KB of L2 each, and the other core gets 2MB. The assumed reason for this is that different threads naturally have different cache requirements, so if developers are writing code specifically for the hardware, they can run cache-intensive threads on the cache-rich core and less cache-intensive threads on the other cores. The logic would be the same here. Not all GPU threads are created equal, so why give them all equal resources? Why not have register-heavy shader bundles for compute tasks to run on, alongside register-light shader bundles for other tasks?
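
As a toy illustration of that thread-placement idea (nothing to do with Nintendo's actual SDK - the core numbering and the Linux-only affinity call are stand-ins), it might look like this:

import os
import threading

# Hypothetical layout based on the rumored Espresso config discussed above:
# two cores with 512KB of L2, one core with 2MB.
BIG_CACHE_CORE = 2
SMALL_CACHE_CORES = (0, 1)

def cache_hungry_worker():
    # Big working set, heavy reuse -> pin to the cache-rich core.
    # (os.sched_setaffinity is a Linux-only stand-in for whatever the
    # real SDK would expose.)
    os.sched_setaffinity(0, {BIG_CACHE_CORE})
    ...  # e.g. AI with large lookup tables

def streaming_worker(core):
    # Low data reuse (decompression, audio mixing) -> a small L2 is fine.
    os.sched_setaffinity(0, {core})
    ...

threading.Thread(target=cache_hungry_worker).start()
for core in SMALL_CACHE_CORES:
    threading.Thread(target=streaming_worker, args=(core,)).start()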

I don't know as much about texture units and ROPs, but could the same principle be applied to them? Might there be different texture units specialised for different tasks? Could we have asymmetrical ROPs, for instance with some specialised for the Gamepad?
 

ozfunghi

Member
In any comparison of the Wii U versus other consoles, we must keep in mind that the console actually has a second screen. It needs to render content for that second screen at the same time as it runs the game on the main screen. That means a fair amount of overhead is needed to keep the graphics on the main screen at a certain standard.

It seemed to me that this wasn't getting much attention from the people who like to compare FLOPS, so this is just a little reminder.

Not necessarily. Devs can choose to keep the screen blank and just use it for off-TV play if they want to push the graphics. They could also choose to have a simple map or inventory that refreshes at 1fps.
 

ozfunghi

Member
Alright, I've been doing a few calculations around the die size. The total die is 146.48mm², and of that about 50.78%, or 74.38mm² is what I'll call "GPU logic" (that is, everything except the memory pools and interfaces). Now, looking at Durante's image from the first page:



Let's assume, for a minute, that the four blue sections are TMUs, the red sections are shader clusters, and the two yellow sections down at the bottom right are ROP bundles. This would, we assume, produce a core configuration of 320:16:8. Now, if you measure out the sizes of these, you get only 28.9% of the total GPU logic space, just 21.48mm². What's going on with the other 52.9mm² of GPU logic? There's probably a DSP and ARM on there, but that accounts for a couple of mm² at most.

There are basically two possibilities here:

- The 320:16:8 core config is accurate, and there's ~50mm² of custom logic or "secret sauce" (more than twice the size of the conventional GPU logic).

- The 320:16:8 core config isn't accurate.

Here's the interesting thing about the second possibility: it challenges one assumption that has gone unquestioned during our analysis - that all shaders are equal. What if they aren't?

What if Nintendo has gone for an asymmetrical architecture? What if they've decided that some of the shaders will be optimised for some tasks, and some for others? This doesn't necessarily require a complete reworking of the shader microarchitecture, and it could be as simple as having different shader clusters with different amounts of register memory. The ones with lots of register memory would be suited for compute tasks (we can assume that these are the red ones) and the others could be dedicated to graphical tasks with low memory reuse (the blue squares above the red squares might be a fit for these).

Why would I think Nintendo would do something like this? Well, for one, they've done exactly the same thing with the CPU. Although this is pending the CPU die photo, it appears that Nintendo have gone with three identical cores with very different amounts of cache: two of the cores get 512KB of L2 each, and the other core gets 2MB. The assumed reason for this is that different threads naturally have different cache requirements, so if developers are writing code specifically for the hardware, they can run cache-intensive threads on the cache-rich core and less cache-intensive threads on the other cores. The logic would be the same here. Not all GPU threads are created equal, so why give them all equal resources? Why not have register-heavy shader bundles for compute tasks to run on, alongside register-light shader bundles for other tasks?

I don't know as much about texture units and ROPs, but could the same principle be applied to them? Might there be different texture units specialised for different tasks? Could we have asymmetrical ROPs, for instance with some specialised for the Gamepad?

So either way... you are thinking there still is some secret sauce to be had?
 

Chronos24

Member
Not necessarily. Devs can choose to keep the screen blank and just use it for off-TV play if they want to push the graphics. They could also choose to have a simple map or inventory that refreshes at 1fps.

Well, actually, not true. I remember reading about a Crytek dev saying they could get more performance (something like 1080p 30fps?) out of the console if they could turn the screen off, which there wasn't an option to do.
 

Donnie

Member
Well, actually, not true. I remember reading about a Crytek dev saying they could get more performance (something like 1080p 30fps?) out of the console if they could turn the screen off, which there wasn't an option to do.

Link? I really doubt the system forces 854x480 frames to the pad 60 times per second. Also note that even if it did, that wouldn't have to affect FLOPS, only fillrate (if a dev chose to make those frames 2D, like a map, for instance).
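
The arithmetic backs this up. Even streaming a full 854x480 image to the pad at 60fps is a rounding error next to the fillrate of the 8-ROP config discussed above, assuming the rumored (unconfirmed) 550 MHz clock:

pad_rate = 854 * 480 * 60    # ~24.6 Mpix/s sent to the GamePad
peak_fill = 8 * 550e6        # 8 ROPs * 550 MHz = 4.4 Gpix/s
print(f"{100 * pad_rate / peak_fill:.2f}% of peak fillrate")   # ~0.56%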
 

Thraktor

Member
So either way... you are thinking there still is some secret sauce to be had?

Well, what I'm saying is that it might be something a lot simpler than "secret sauce" (at least for the most part). Asymmetric registers would definitely be unusual, but it's not like creating some crazy fixed-function hardware from scratch; it would in fact be pretty simple to implement and only a relatively minor hassle to code for.
 

majik13

Member
Well, actually, not true. I remember reading about a Crytek dev saying they could get more performance (something like 1080p 30fps?) out of the console if they could turn the screen off, which there wasn't an option to do.

I remember this from the WUST, but was it ever confirmed? And was it actually Crytek?
 

ozfunghi

Member
Well, actually, not true. I remember reading about a Crytek dev saying they could get more performance (something like 1080p 30fps?) out of the console if they could turn the screen off, which there wasn't an option to do.

They could still leave it blank. Have it show a black image. It wouldn't need to push any polygons or high-res textures. Output could be minimized, or they could just mirror the TV image, just like Trine 2 does, which doesn't eat into performance.

Well, what I'm saying is that it might be something a lot simpler than "secret sauce" (at least for the most part). Asymmetric registers would definitely be unusual, but it's not like creating some crazy fixed-function hardware from scratch; it would in fact be pretty simple to implement and only a relatively minor hassle to code for.

I was just kidding. I meant that either it's 320 SPUs plus some additional SPUs scattered across the chip, or it's 320 SPUs plus some fixed functions.
 

PetrCobra

Member
Well, actually, not true. I remember reading about a Crytek dev saying they could get more performance (something like 1080p 30fps?) out of the console if they could turn the screen off, which there wasn't an option to do.

Careful there... I ended up getting banned for citing that exact rumor some time ago (I remembered it as being confirmed as well and presented it as fact).
 

guek

Banned
There are basically two possibilities here:

- The 320:16:8 core config is accurate, and there's ~50mm² of custom logic or "secret sauce" (more than twice the size of the conventional GPU logic).

- The 320:16:8 core config isn't accurate.

Here's the interesting thing about the second possibility: it challenges one assumption that has gone unquestioned during our analysis - that all shaders are equal. What if they aren't?

What if Nintendo has gone for an asymmetrical architecture? What if they've decided that some of the shaders will be optimised for some tasks, and some for others?

Which do you feel would be preferable? If the core config isn't accurate, doesn't that throw a wrench into estimating its theoretical performance?
 
First of all, big thanks to the Chipworks folks for doing this pro bono!

Now, I have a question for the qualified techies in this thread. Given what we know about the hardware, are we going to see noticeable improvement over high-end current-generation games throughout the Wii U's life?

Obviously, I'm not expecting them to ever look as good as what we'll see on Orbis/Durango. I'm thinking more along the lines of Halo 3 vs Halo 4.

It's less conventional and off-the-shelf than was thought, which means it potentially has room for improvement.
 

Chronos24

Member
Careful there... I ended up getting banned for citing that exact rumor some time ago (I remembered it as being confirmed as well and presented it as fact).

Gotcha. Back on topic... With Thraktor's analysis a few posts ago, is it plausible that there could actually be some sort of "secret sauce", as he says? I'm nowhere near a technical expert, but looking at the die photo, what he says about the shaders makes sense. Also, if there was one thing devs seemed to agree on, it was their satisfaction with the GPU.
 

Thraktor

Member
Which do you feel would be preferable? If the core config isn't accurate, doesn't that throw a wrench into estimating its theoretical performance?

To be honest, I simply can't find any logical reason for the first. What would you do with that quantity of custom hardware? Have they discovered some sort of magical Mario rendering technique that demands over two thirds of the GPU logic?
 

OryoN

Member
- Any estimates of the bandwidth of the 4MB high-speed eDRAM pool, and whether or not it's accessible to Wii U games?

Statements in the "Iwata Asks" did seem to suggest that Nintendo would pursue a design for backward compatibility that would not require fully separate Wii hardware, so I'm kinda hoping all these resources can be used in Wii U mode too.

- What about estimates for the SRAM bandwidth?

- I'm familiar with eDRAM, but is that amount of SRAM/cache common in GPUs?

Perhaps all these significant pools of memory/caches help underscore Takeda's comments:

Takeda:
I would draw attention to how efficient it is. For a computer to function efficiently, memory hierarchy structure is very important, and this time the basic memory hierarchy is tightly designed. Although that is an orthodox solution, it makes the foremost feature of this machine's high efficiency.

Sorry for all the questions! I think even more mysteries have come up since the new details emerged. But that's part of the fun, I guess.
 

StevieP

Banned
The way a large number of people behaved when people like Arkham and, to a much lesser extent, lherre came in as confirmed sources and told people to lower their expectations left a bad taste in my mouth.

edit: also, a lot of the early speculation for PS4/720 was people telling us that anything more than 2GB of RAM was unrealistic

It's the same way people behaved towards me when I told them the specs of the PS4/Xbox 3 GPUs last summer, you mean? And I'm not even a developer. Enough with this persecution crap in the Wii U GPU thread.
 