
WiiU "Latte" GPU Die Photo - GPU Feature Set And Power Analysis


Lenardo

Banned
So after getting the picture... we can "guess" and say around 320 GFLOPS + whatever the fixed-function units do?

I find it all fascinating. Don't care how it ends up, but it's fascinating to observe how far we go for our hobby!
 
Most estimates I've heard have given the GPU 25-30W to work with. Besides, we know the GPU is clocked at 550MHz, and we know how big it is. That logic is doing something, and it's doing it at 550MHz. Whether it's being used for computation (GFLOPS) or something else won't have much effect on the energy consumption.
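For anyone wanting to sanity-check the GFLOPS figures floating around this thread, here's a quick back-of-the-envelope sketch. It just applies the usual ALUs × 2 ops/cycle × clock rule of thumb at the known 550MHz clock; the ALU counts are only numbers being speculated in the thread, not confirmed specs.

# Rough GFLOPS sanity check for a VLIW-style GPU: ALUs * 2 ops/cycle (MADD) * clock in GHz.
# The ALU counts below are thread speculation, not confirmed specs.
CLOCK_GHZ = 0.55  # the known 550 MHz "Latte" clock

def gflops(alu_count, clock_ghz=CLOCK_GHZ):
    return alu_count * 2 * clock_ghz

for alus in (160, 320):
    print(f"{alus} ALUs @ 550 MHz -> {gflops(alus):.0f} GFLOPS")
# prints: 160 ALUs -> 176 GFLOPS, 320 ALUs -> 352 GFLOPS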



I thought we were going to actually get somewhere fast with this photo? Isn't Fourth Storm coming back with some more concrete info on this?

Regardless, very exciting stuff today!
 

Khrno

Member
Maybe in America... in Italy a cappuccino is made from espresso + latte (milk).

Edit: nope, even Wikipedia in English says that espresso + milk = cappuccino
http://en.wikipedia.org/wiki/Cappuccino

By the way, for the CPU we have to wait another 2-3 days, right?

Spoiler tagged because this is totally off-topic, but:

Maybe I should have been more specific and said "steamed" milk http://en.wikipedia.org/wiki/Latte.

Either way the codename Latte is referring to that particular type of coffee drink, not to just milk.
 

kinggroin

Banned
It's customized by Nintendo. It matches no known architecture, going by the OP.

I'm having a real hard time wrapping my head around that kind of performance at that power draw.

This would make this the most efficient console ever created. I'm not sure I buy it.


Edit: Thraktor, not meaning to offend you or question your sensibilities, it's just that it seems like a high performance number given the power draw and past (heck, even upcoming) architectures.

I think I'm just a tech fan erring on the low side of what I believe Nintendo would give us. I know they've been keen on customizing towards efficiency with the GC and Wii, but this SEEMS to be on another level given some of the data we have. I'd like to think you're closer to being right than USC-fan if it means anything.
 

ozfunghi

Member
15W, Thraktor. How?

And IF it's as you say, where is the desktop equivalent?! I'd love to throw one of these in a media center box.
I'm having a real hard time wrapping my head around that kind of performance at that power draw.

This would make this the most efficient console ever created. I'm not sure I buy it.



I'm not a tech head, but someone else in the other topic said it would be feasible to get this chip down to about 18W after stripping some unnecessary stuff, without GDDR5 etc...

http://www.amd.com/la/Documents/AMD-Radeon-E6760-Discrete-GPU-product-brief.pdf
 
I'm having a real hard time wrapping my head around that kind of performance at that power draw.

This would make this the most efficient console ever created. I'm not sure I buy it.


It's the GameCube of this generation


Having an asymmetric shader/TMU/ROP design should give it a lower TDP than having a uniform design.
 

Kai Dracon

Writing a dinosaur space opera symphony
I'm having a real hard time wrapping my head around that kind of performance at that power draw.

This would make this the most efficient console ever created. I'm not sure I buy it.

The pikmin are fed with tiiiiiiiny little icy pops.

Keeps things cool and on the down low.
 

AzaK

Member
Going clockwise from bottom left, you're basically looking at the render back ends (they're the last step before outputting to RAM).

The bottom middle I believe is geometry setup/command processor. Bottom right is the UVD/display controllers.

Most of the area to the top and left is ROPs. At the bottom right there's some video decode hardware, etc. There is certainly some miscellaneous logic on there, but it's a lot closer to 10-20% than it is to 70%.

DOH! I was going off a faulty memory of when I looked at a die shot with an overlaid diagram. Looking at the diagram again, I can see it didn't show the ROPs.

Thanks.

The 33 watts is what the console uses at the wall. We can all measure that, and it's the same when running any Wii U game. It's been tested by many different sites.

Is it the same for all Wii U games though, or just with no game running and NSMBU?
 

AlStrong

Member
Thanks. Would there be any reason that separating them out would make sense to you from a design perspective?

Well, they're pretty much isolated/independent functions. :)

There are certainly some such units, but it simply wouldn't make sense to me that they should take up that proportion of the die.

Indeed... Well, it's hard to say IMO.
 
I'm having a real hard time wrapping my head around that kind of performance at that power draw.

This would make this the most efficient console ever created. I'm not sure I buy it.

What kind of performance? We know it currently runs roughly equivalent PS3/360 games and we know the power draw. Unless it's already been maxed out by those early ports (which would also be unprecedented), it still has some theoretical performance gains to be had.

We may all be underwhelmed with its specs as they come to light, but maybe we should also be impressed by its efficiency.
 

Schnozberry

Member
Maybe they aren't accurately reading the draw?

Dunno :/

It's possible if they have a poorly calibrated wattmeter. Iwata said in a Nintendo Direct that the Wii U would generally consume 45W, but there is overhead there for accessory power via USB, the SD card reader, and such.
 

Baki

Member
By the supposed theoretical GFLOPS performance and the real-world performance, thanks to fixed functions and efficiency.

The GameCube's GPU was theoretically 8 GFLOPS while the Xbox's was 3 times more, yet they were on par in real-world application.

So MS should sue them for using the secret sauce(tm) haha
 

guek

Banned
It's possible if they have a poorly calibrated wattmeter. Iwata said in a Nintendo Direct that the Wii U would generally consume 45W, but there is overhead there for accessory power via USB, the SD card reader, and such.

It's not really wise to go down the road of "what ifs." There's no evident reason to doubt the figure.
 

kinggroin

Banned
What kind of performance? We know it currently runs roughly equivalent PS3/360 games and we know the power draw. Unless it's already been maxed out by those early ports (which would also be unprecedented), it still has some theoretical performance gains to be had.

We may all be underwhelmed with its specs as they come to light, but maybe we should also be impressed by its efficiency.


No, don't get me wrong. Even at the theorized 176 GFLOPS, I'm impressed with the results so far. The ports being as good as they are, given that, just makes what they built here that much more impressive IMO.

In any event, I've PM'd Thraktor my clarification. Last thing I want is to seem like an unappreciative asshole.
 
I may be crazy, but I think I may have zeroed in on the purpose of the 2 distinct eDRAM pools. The idea must have primarily sprung from the need for Wii BC and very low latencies. Follow me here...

You're trying to emulate the main pool of 24 MB 1t-SRAM from Flipper/Hollywood. Fine. You've got 32 MB of eDRAM to do that. That takes 3/4 of your allotted eDRAM. Now you've got 8 MB to emulate the additional 2 MB frame buffer and 1 MB texture RAM.

But you can't do it in 8 MB because that 1 MB of texture cache alone on Flipper was amazingly rigged to a 512-bit bus! This gave it not merely high (for then) bandwidth, but extremely low latency. If the 32 MB eDRAM is hooked up to a 2048-bit bus, then you're only left with a 512-bit bus to share between the texture cache and the frame buffer.

Which leads to the proposed 4x1MB eDRAM modules on top. It's probably simply the smallest amount that Renesas could offer on a 512-bit bus. Or perhaps they figured a 4 MB texture cache could somehow help them in Wii U mode more than 2 MB would.

I'm probably completely off, but it stands to be refuted.

Edit: According to their website, Renesas were planning to sell 1 MB modules with a 256-bit bus per module, but you never know what's really available or financially sensible.
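For what it's worth, here's the back-of-the-envelope bandwidth math behind those bus widths, assuming the eDRAM runs at the 550MHz GPU clock and that the 2048-bit/512-bit figures above hold; all of these widths are speculation, not confirmed.

# Peak bandwidth = (bus width in bits / 8) bytes per cycle * clock.
# The 2048-bit and 512-bit widths and the 550 MHz eDRAM clock are thread speculation.
CLOCK_HZ = 550e6

def peak_bandwidth_gbs(bus_bits, clock_hz=CLOCK_HZ):
    return (bus_bits / 8) * clock_hz / 1e9

print(f"2048-bit bus: {peak_bandwidth_gbs(2048):.1f} GB/s")  # ~140.8 GB/s for the 32 MB pool
print(f" 512-bit bus: {peak_bandwidth_gbs(512):.1f} GB/s")   # ~35.2 GB/s for the small pool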
 
No, don't get me wrong. Even at the theorized 176 GFLOPS, I'm impressed with the results so far. The ports being as good as they are, given that, just makes what they built here that much more impressive IMO.

In any event, I've PM'd Thraktor my clarification. Last thing I want is to seem like an unappreciative asshole.


It's theorized to be 352 GFLOPS now though :/

I don't understand how the ports can perform worse when given more GFLOPS and more RAM... The only explanation is rushed products made on weaker dev units.
 

Schnozberry

Member
It's not really wise to go down the road of "what ifs." There's no evident reason to doubt the figure.

Nope. I don't think they're wrong. All that unidentified space on the GPU has to be doing something, though. Getting a full performance figure pretty much relies on figuring out what.
 

kinggroin

Banned
If it turns out that the fixed functions are an aid to the programmable shader units in the Wii U...


...is there a documented way of measuring performance of something so customized?

Honest question.
 
It's theorized to be 352 GFLOPS now though :/

I don't understand how the ports can perform worse when given more GFLOPS and more RAM... The only explanation is rushed products made on weaker dev units.

There's more to it than just GFLOPS and RAM amount. Ports could be bottlenecked by the CPU and RAM speed.

Same goes for your 'GameCube of this gen' comparison. GFLOPS doesn't tell the whole story.
 

LCGeek

formerly sane
More food for thought: Nintendo is unifying its development structure for consoles and handhelds. I always assumed this referred to the software development tools side of things. The 3DS uses pretty modern fixed-function graphics. So besides keeping their knowledge about TEV useful even on Wii U (which is a given, as the units for Wii BC are also usable in Wii U), it's also in Nintendo's best interest to have all the fixed-function graphics capabilities of the 3DS available on the Wii U as well, now in Full HD...

You should realize why Nintendo would have an interest in doing so: a changing market that most don't realize is about to change a lot of dynamics. For Nintendo, shrinking things, merging both ends, and then making products that tie into that is much better than the status quo. Let them grow power from there; anything else and we end up with the earlier situations and tons of money lost.
 

FLAguy954

Junior Member
There's more to it than just GFLOPS and RAM amount. Ports could be bottlenecked by the CPU and RAM speed.

Same goes for your 'GameCube of this gen' comparison. GFLOPS doesn't tell the whole story.

The RAM isn't a bottleneck anymore (actually it never was), so try again (you can only make a case for the CPU being a bottleneck of sorts).
 

Ryoku

Member
I may be crazy, but I think I may have zeroed in on the purpose of the 2 distinct eDRAM pools. The idea must have primarily sprung from the need for Wii BC and very low latencies. Follow me here...

You're trying to emulate the main pool of 24 MB 1t-SRAM from Flipper/Hollywood. Fine. You've got 32 MB of eDRAM to do that. That takes 3/4 of your allotted eDRAM. Now you've got 8 MB to emulate the additional 2 MB frame buffer and 1 MB texture RAM.

But you can't do it in 8 MB because that 1 MB of texture cache alone on Flipper was amazingly rigged to a 512-bit bus! This gave it not merely high (for then) bandwidth, but extremely low latency. If the 32 MB eDRAM is hooked up to a 2048-bit bus, then you're only left with a 512-bit bus to share between the texture cache and the frame buffer.

Which leads to the proposed 4x1MB eDRAM modules on top. It's probably simply the smallest amount that Renesas could offer on a 512-bit bus.

I'm probably completely off, but it stands to be refuted.

If this is true, then wouldn't developers have 32+4MB of eDRAM to play with? Or am I missing something completely?

EDIT: Another way to put it: Why would the additional 4MB be locked off for Wii U game development?
 
If this is true, then wouldn't developers have 32+4MB of eDRAM to play with? Or am I missing something completely?

Yes, it would basically be like a modern Flipper. Like most other modern chips, bandwidth and size have increased much more than latency has decreased. However, the latency is still much reduced and obviously a focus.
 

Thraktor

Member
Right folks, it's late here and I'm off to bed. I've cleaned up the OP a bit, and I expect lots of newly deciphered info for it by the time I wake up tomorrow :)
 
The RAM isn't a bottleneck anymore (actually it never was), so try again (you can only make a case for the CPU being a bottleneck of sorts).
Well, a new, unknown architecture can be quite the "bottleneck", in particular if you're on a tight budget with limited time.
 
If it turns out that the fixed functions are an aid to the programmable shader units in the Wii U...


...is there a documented way of measuring performance of something so customized?

Honest question.
That 352 isn't counting any proposed fixed-function assistance, AFAIK. The number you quoted was a knee-jerk response thrown out before the chip had even begun to be analyzed properly.
 

Kai Dracon

Writing a dinosaur space opera symphony
You should realize why Nintendo would have an interest in doing so: a changing market that most don't realize is about to change a lot of dynamics. For Nintendo, shrinking things, merging both ends, and then making products that tie into that is much better than the status quo. Let them grow power from there; anything else and we end up with the earlier situations and tons of money lost.

Even though Nintendo has, for the moment, denied plans to merge their console and handheld divisions in a literal sense, I have to think that option is absolutely on their mind, and it would be one side benefit of their current strategy.

I think it's been hard to get a grasp on what might actually be going on here, because everything surrounding the situation has been drowned out by cries of how dumb, cheap, and incompetent Nintendo is. (For not having dropped everything to make a supercomputer in a box and lose a few hundred a unit on it.)

Nintendo might be frugal (to say the least), but whether they're being dumb, or merely playing dumb, remains to be seen.
 

Ryoku

Member
Yes, it would basically be like a modern Flipper. Like most other modern chips, bandwidth and size have increased much more than latency has decreased. However, the latency is still much reduced and obviously a focus.

I just realized this. The smaller block of eDRAM is apparently of a greater density and higher bandwidth than the larger block. Why would that 4MB be higher speed than the larger pool of 32MB, especially if it was put in there [primarily] because of Wii emulation?
 

ozfunghi

Member
I just realized this. The smaller block of eDRAM is apparently of a greater density and higher bandwidth than the larger block. Why would that 4MB be higher speed than the larger pool of 32MB, especially if it was put in there [primarily] because of Wii emulation?

And what are the consequences for WiiU games? How much faster would this be?
 

japtor

Member
More food for thought: Nintendo is unifying its development structure for consoles and handhelds. I always assumed this referred to the software development tools side of things. The 3DS uses pretty modern fixed-function graphics. So besides keeping their knowledge about TEV useful even on Wii U (which is a given, as the units for Wii BC are also usable in Wii U), it's also in Nintendo's best interest to have all the fixed-function graphics capabilities of the 3DS available on the Wii U as well, now in Full HD...
Well, from the Q&A answer on top of the investor presentation stuff, it definitely seems like it was referring to just the software side to me. I wouldn't take that as having anything necessarily to do with hardware, other than portable hardware becoming more capable these days (i.e. supporting the same shaders or whatever 3D APIs and stuff as the console hardware). I figure they're working on middleware or some standardized higher-level dev setup to work with, to make development faster and more portable down the line.

If I'm right it seems like something that should've been done long ago, but I guess the 3DS and Wii U are their first systems to be modern enough for them to consider it, plus it sounds like they ran into some of the development hurdles everyone else did with the HD transition (...except a few years later).
It's theorized to be 352 GFLOPS now though :/

I don't understand how the ports can perform worse when given more GFLOPS and more RAM... The only explanation is rushed products made on weaker dev units.
From the investor stuff it sounds like the finalized dev kits weren't out until around Q3 so I imagine a lot of stuff was rushed, on top of the hardware being completely different. Standard crappy launch stuff basically, albeit without the usual generational big performance overhead to easily mask issues.
If it turns out that the fixed functions are an aid to the programmable shader units in the Wii U...


...is there a documented way of measuring performance of something so customized?

Honest question.
Might be possible for a dev to make their own benchmarks to test this or that, or ideally there'd be performance metrics for everything in the documentation so they know what to expect rather than having to test things out themselves.
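Something along those lines, maybe. Here's a minimal sketch of the kind of homemade benchmark a dev could run: time a fixed workload and compare the two render paths. The dummy_scene function below is a purely hypothetical stand-in, not anything from an actual SDK.

import time

def benchmark(render_fn, frames=1000):
    """Time a fixed workload and report the average per-frame cost in milliseconds."""
    start = time.perf_counter()
    for _ in range(frames):
        render_fn()
    return (time.perf_counter() - start) / frames * 1000.0

# Stand-in workload; on real hardware this would be the actual draw call for the
# effect being tested, run once via the fixed-function path and once via shaders.
def dummy_scene():
    sum(i * i for i in range(10_000))

print(f"avg frame time: {benchmark(dummy_scene):.3f} ms")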
 
I just realized this. The smaller block of eDRAM is apparently of a greater density and higher bandwidth than the larger block. Why would that 4MB be higher speed than the larger pool of 32MB, especially if it was put in there [primarily] because of Wii emulation?

It's a higher speed for its size, not overall. But it should be extremely low latency even relative to the 32 MB MEM1.
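A quick illustration of the "for its size, not overall" point, reusing the bus widths speculated earlier in the thread (2048-bit for the 32 MB pool, 512-bit for the 4 MB pool); these are assumptions, not measurements.

# Bandwidth per megabyte: the small pool is narrower overall but wider for its size.
# Bus widths and the 550 MHz clock are carried over from the thread's speculation.
CLOCK_HZ = 550e6

def gb_per_s(bus_bits):
    return (bus_bits / 8) * CLOCK_HZ / 1e9

for name, size_mb, bus_bits in (("32 MB MEM1", 32, 2048), ("4 MB pool", 4, 512)):
    bw = gb_per_s(bus_bits)
    print(f"{name}: {bw:.1f} GB/s total, {bw / size_mb:.2f} GB/s per MB")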
 

FLAguy954

Junior Member
I just realized this. The smaller block of eDRAM is apparently of a greater density and higher bandwidth than the larger block. Why would that 4MB be higher speed than the larger pool of 32MB, especially if it was put in there [primarily] because of Wii emulation?

Why aren't we considering that it could be used both to help with Wii emulation and to further lessen the impact of the DDR3 bandwidth?
 

Ryoku

Member
It's a higher speed for its size, not overall. But it should be extremely low latency even relative to the 32 MB MEM1.

Then why does the diagram from Chipworks refer to the larger pool as "slower"? Same reason (for its size, not overall)? I don't mean to sound antagonistic, I'm genuinely curious.
 