
WiiU "Latte" GPU Die Photo - GPU Feature Set And Power Analysis

Status
Not open for further replies.

The_Lump

Banned
Gotta love these answers, right next to each other.


Yeah seems those not fond of WiiU like to make out the speculation was all crazy off-the-charts optimistic;

Those fiercely defending WiiU like to make out everyone actually expected the lower figures all along;

And those who've been simply looking at the facts/evidence from the last year or so were pretty much spot on and don't need to spin the figures one way or the other. 350 - 500 was always the realistic consensus as far as I can recall (apart from very early on in the WUST threads)
 

Thraktor

Member

Thanks for that.

This is an interesting theory, considering that this behaviour (the "spilling" of registers) was often bandied about as a reason for the R700 architecture's disappointing GPGPU performance.

Spilling registers off to DDR3/GDDR5 just seems downright bizarre, as you're looking at several orders of magnitude increase in latency. What with the amount of eDRAM and SRAM on the die, I would be very surprised if any such spilling would be taking place from Latte to the DDR3.

That's actually not crazy at all, looking at the macros. But I have my doubts. There are three off-the-shelf macros:

8MB, 256bit
1MB, 256bit
1MB, 128bit

And this is exactly where things get weird. There are apparently eight macros of 4MB each. That doesn't seem to make sense. 4 x 8MB would give you 32MB on a 1024bit bus. Why not use that? Why go with 8 x 4MB instead? It looks like each macro is connected via a 64bit bus in this particular case, so 8 * 64bit * 550MHz = 32.8GB/s. Not bad, but not really enough to "emulate" eFB and eTC. Maybe that's where the second eDRAM pool and the SRAM pool come into play?
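The arithmetic in the post above can be checked with a quick sketch. It assumes one transfer per clock on every bus line and a 550MHz eDRAM clock, neither of which the die photo confirms; the function name is made up for illustration. Under those assumptions the quoted 32.8GB/s figure only comes out if a gigabyte is counted as 2^30 bytes.

```python
def bandwidth_bytes_per_s(macros: int, bus_bits: int, clock_hz: float) -> float:
    """Peak bandwidth, assuming one transfer per clock on every bus line."""
    return macros * bus_bits / 8 * clock_hz

GB = 10 ** 9   # decimal gigabyte
GIB = 2 ** 30  # binary gigabyte

# Assumption: all macros run at the 550MHz figure used in the post.
eight_by_64 = bandwidth_bytes_per_s(8, 64, 550e6)    # 8 macros, 64bit each
four_by_256 = bandwidth_bytes_per_s(4, 256, 550e6)   # 4 macros, 256bit each

print(f"8 x 64bit:  {eight_by_64 / GB:.1f} GB/s = {eight_by_64 / GIB:.1f} GiB/s")
print(f"4 x 256bit: {four_by_256 / GB:.1f} GB/s = {four_by_256 / GIB:.1f} GiB/s")
```

In decimal units the 8 x 64bit layout works out to 35.2 GB/s, and a 1024bit bus at the same clock would give 70.4 GB/s.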

Isn't the Renesas 40nm eDRAM info we had pretty old? Furthermore, Nintendo might be a sufficiently high-volume customer to get a more customised solution. Why do you say the macros are connected via 64 bit busses, out of interest?
 
I don't know where you may have read that, but if I understand you right (which I may not, as it's past bedtime here), that's not true. Each shader stage (or was it a clause - don't remember ATM; anyhow, think of it as a subroutine) gets assigned a certain number of GP registers (up to a limit, as you note, of 128, IIRC) from a common pool, but it also gets constant regs and registers which carry over results from previous clauses/to following clauses. The registers of the active shader code could/should not be swapped out, or else the entire clause (across the entire SIMD wavefront!) will suffer badly.

Yes, it was per clause. Is there a chance of "inactive" shader code filling in registers to be used later? Sorry for what may be a foolish question. Here's the link to where I read that info. Check out what it says about GPRs in the glossary at the end when you get a chance...

http://developer.amd.com/wordpress/media/2012/10/R700-Family_Instruction_Set_Architecture.pdf
 

EloquentM

aka Mannny
So, does this explain to a certain degree why 360/PS3 ports don't look great? Is this a similar situation to the GC/Wii and its TEV where you need to write custom code to get the best quality and ports using standard coding will be nothing special?
I'll requote this since it was a good question that got stuck at the bottom of the page.
 

The_Lump

Banned
So, does this explain to a certain degree why 360/PS3 ports don't look great? Is this a similar situation to the GC/Wii and its TEV where you need to write custom code to get the best quality and ports using standard coding will be nothing special?


They don't look great because they're not running on their native xb360 ;)
 
Chipworks really came through for us, thank you guys. As for the rest, I only got to this thread now, so I'll start digging.

So, it's probably 352 GFlops?
 
Are we finally getting close to a compare spec comparison between all three next gen consoles?

Wii U!

352 Gflops GPU
2GB DDR3 RAM @ 48MB/s
32MB eDRAM @ 78MB/s
???? CPU

Durango!

1,200 Gflops GPU (3,4x more powerful than Wii U)
8GB DDR3 RAM @ 78MB/s
32MB eDRAM @ 178MB/s (???)
???? CPU

Orbis!

1,800 Gflops GPU (5x more powerful than Wii U, 1,5x more powerful than Durango)
4GB GDDR5 RAM @ 192 MB/s
???? CPU
 

wsippel

Banned
Isn't the Renesas 40nm eDRAM info we had pretty old? Furthermore, Nintendo might be a sufficiently high-volume customer to get a more customised solution. Why do you say the macros are connected via 64 bit busses, out of interest?
Yes, it's old, and yes, I think it's custom. It has to be, it doesn't seem to use any of the three macros listed on the site. As I wrote, there are no 4MB macros on the site to begin with.

For the bus, I was just counting pins. Figured that if it works for external memory, it should work for eDRAM as well.;)
 
To Blu, Thraktor and company: for a thread this important, you need to seek mod assistance.

It's not too late. Close this one and open another where people who can contribute debate and educate, so the rest of us can pay attention and maybe learn something. Any commentary out of context gets deleted.
 
Yeah. The result of kneejerk reactions, especially when there was barely any info posted at that point in the thread

?? It's practically the same information; it was just unsure whether it was 20 ALUs per block or 40. Initial estimates were at 20, and it was clearly stated where the numbers were coming from. 20 = 176 GFLOPS, 40 = 352 GFLOPS.
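For anyone wondering how the ALU count turns into those two figures, here is a rough sketch. The block count (8), the 550MHz clock and the 2 FLOPs per ALU per clock (one multiply-add) are assumptions chosen because they reproduce the numbers in the post; none of them is confirmed here.

```python
def peak_gflops(alus_per_block: int, blocks: int = 8, clock_mhz: float = 550.0) -> float:
    """Peak GFLOPS, counting one multiply-add (2 FLOPs) per ALU per clock.

    blocks=8 and clock_mhz=550 are assumptions, not confirmed values.
    """
    flops_per_alu_per_clock = 2  # assumption: one multiply-add per clock
    return alus_per_block * blocks * flops_per_alu_per_clock * clock_mhz / 1000.0

for alus in (20, 40):
    print(f"{alus} ALUs per block -> {peak_gflops(alus):.0f} GFLOPS")
```

Doubling the ALUs per block doubles the result, which is why the two estimates sit exactly a factor of two apart.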
 

The_Lump

Banned
Are we finally getting close to a compare spec comparison between all three next gen consoles?

Wii U!

352 Gflops GPU
2GB DDR3 RAM @ 48MB/s
32MB eDRAM @ 78MB/s
???? CPU

Durango!

1,200 Gflops GPU (3,4x more powerful than Wii U)
8GB DDR3 RAM @ 78MB/s
32MB eDRAM @ 178MB/s (???)
???? CPU

Orbis!

1,800 Gflops GPU (5x more powerful than Wii U, 1,5x more powerful than Durango)
4GB GDDR5 RAM @ 192 MB/s
???? CPU


The word 'Powerful' should be substituted with 'FLOPS' in each case, as that's all that that multiplication technically tells us :p


Edit: oh and I think you mean GB/s not MB/s for the bandwidths?
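The multipliers in the quoted list are plain ratios of the GFLOPS figures, which a short sketch makes explicit. The numbers are the rumoured ones from the quoted post, not confirmed specs, and the helper name is made up for illustration.

```python
# Rumoured peak figures from the quoted post (not confirmed specs).
gpu_gflops = {"Wii U": 352, "Durango": 1200, "Orbis": 1800}

def flops_ratio(a: str, b: str) -> float:
    """How many times more peak FLOPS console a has than console b."""
    return gpu_gflops[a] / gpu_gflops[b]

for a, b in (("Durango", "Wii U"), ("Orbis", "Wii U"), ("Orbis", "Durango")):
    print(f"{a} vs {b}: {flops_ratio(a, b):.2f}x the FLOPS")
```

This only compares peak arithmetic throughput. It says nothing about bandwidth, CPU or fixed-function hardware.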
 
Thanks for that.
Spilling registers off to DDR3/GDDR5 just seems downright bizarre, as you're looking at several orders of magnitude increase in latency. What with the amount of eDRAM and SRAM on the die, I would be very surprised if any such spilling would be taking place from Latte to the DDR3.

Does seem bizarre, but I've also heard of this spilling of registers. Of course, now I can't recall where for the life of me.


Thraktor said:
Isn't the Renesas 40nm eDRAM info we had pretty old? Furthermore, Nintendo might be a sufficiently high-volume customer to get a more customised solution. Why do you say the macros are connected via 64 bit busses, out of interest?

I agree that we might be looking at a custom solution. Blu and I communicated a while back and 70.4 GB/s is the bare minimum it would take for per-clock parity with Xenos. And that's, I believe, without a Z pass or MSAA. 140.8 GB/s would service things a lot better.
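To see what bus widths those two bandwidth figures would imply, here is a back-of-the-envelope sketch. It assumes a 550MHz eDRAM clock and one transfer per clock, both of which are guesses rather than confirmed details, and the function name is hypothetical.

```python
CLOCK_HZ = 550e6  # assumption: eDRAM clocked at 550MHz

def implied_bus_width_bits(bandwidth_gb_s: float, clock_hz: float = CLOCK_HZ) -> float:
    """Bus width needed for a given bandwidth, assuming one transfer per clock."""
    bytes_per_clock = bandwidth_gb_s * 1e9 / clock_hz
    return bytes_per_clock * 8

for bw in (70.4, 140.8):
    print(f"{bw} GB/s -> {implied_bus_width_bits(bw):.0f}bit bus")
```

Under those assumptions, 70.4 GB/s corresponds to a 1024bit interface and 140.8 GB/s to a 2048bit one.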
 

Donnie

Member
Thraktor

Don't know if this has been mentioned yet but just FYI in your OP you mention Broadway GPU a few times. Wii's GPU is Hollywood.
 

NBtoaster

Member
Are we finally getting close to a compare spec comparison between all three next gen consoles?

Wii U!

352 Gflops GPU
2GB DDR3 RAM @ 48MB/s
32MB eDRAM @ 78MB/s
???? CPU

Durango!

1,200 Gflops GPU (3,4x more powerful than Wii U)
8GB DDR3 RAM @ 78MB/s
32MB eDRAM @ 178MB/s (???)
???? CPU

Orbis!

1,800 Gflops GPU (5x more powerful than Wii U, 1,5x more powerful than Durango)
4GB GDDR5 RAM @ 192 MB/s
???? CPU

48MB/s for the DDR3? has something changed from the 12.8MB/s?

edit: didn't see you using MB/s instead of GB/s, lol
 

FourMyle

Member
Are we finally getting close to a compare spec comparison between all three next gen consoles?

Wii U!

352 Gflops GPU
2GB DDR3 RAM @ 48MB/s
32MB eDRAM @ 78MB/s
???? CPU

Durango!

1,200 Gflops GPU (3,4x more powerful than Wii U)
8GB DDR3 RAM @ 78MB/s
32MB eDRAM @ 178MB/s (???)
???? CPU

Orbis!

1,800 Gflops GPU (5x more powerful than Wii U, 1,5x more powerful than Durango)
4GB GDDR5 RAM @ 192 MB/s
???? CPU

Wow. The power disparity is going to be even bigger than I thought. That is nuts.
 

LeleSocho

Banned
Are we finally getting close to a compare spec comparison between all three next gen consoles?
No. In case you guys didn't notice, between all your GFLOP/s and numbers invented on the fly, the tech guys are still trying to figure out the GPU.

You can't pull any comparison yet.

To Blu, Thraktor and company: for a thread this important, you need to seek mod assistance.

It's not too late. Close this one and open another where people who can contribute debate and educate, so the rest of us can pay attention and maybe learn something. Any commentary out of context gets deleted.

THIS.
 
I'll requote this since it was a good question that got stuck at the bottom of the page.

*some* don't look great and yes, *some* look better. the speculation for that has been based on the fact that games designed for the 360 and PS3 are designed around a system which (comparatively) is CPU centric, rather than GPU centric. adapting a game for a system which leans more of its power on the GPU isn't straightforward, even if running code you built for the 360 and PS3 games is straightforward.

these numbers wouldn't shine any new light there really. we'd still be looking at a system with a more powerful GPU than either the 360 or PS3 had. 1.5 times more powerful than 360, and more than that in comparison to the PS3.

presumably the multiplatform games that look better on Wii U are ones that already leant more heavily on GPU than CPU. the ones that don't, either require more out of the CPU than the Wii U can handle, or weren't as well optimized as they could have been (or both).

when you're straining to make launch and you have your game up and running at what you deem an acceptable framerate (even if it isn't as good as that seen on the other consoles) I don't think you're going to lose too much sleep over it. Obviously EA deemed Mass Effect 3's framerate as good enough, because it averages higher than the PS3 version. Activision got COD:BlOps 2 running at a locked 60 fps in multiplayer, and probably weren't too concerned that single player was running more around 45 fps.

I'm sure going forwards we'll still see the occasional multiplat that runs worse on Wii U (because the lead design on 360 doesn't fit too well into its architecture) but I'm sure we'll see this less often now that people have more time to get things right.
 

guek

Banned
What you trying say, boy? That I don't know my DBZ? SCREW YOU! DON'T BLAME ME BECAUSE CARTOON NETWORK STOPPED AIRING DBZ HALFWAY THROUGH THE BOO SAGA AND I DIDN'T GET TO SEE THE OTHER HALF TILL YEARS LATER, THUS DISTORTING MY DBZ MEMORIES!

I actually took the time one night to look at all the consoles and handhelds that have released since the Casio PV-1000 up to the PS4/XB3, try to do as much research as possible, and give them DBZ rankings.

It was a fun night...
 
Yeah seems those not fond of WiiU like to make out the speculation was all crazy off-the-charts optimistic;

Those fiercely defending WiiU like to make out everyone actually expected the lower figures all along;

And those who've been simply looking at the facts/evidence from the last year or so were pretty much spot on and don't need to spin the figures one way or the other. 350 - 500 was always the realistic consensus as far as I can recall (apart from very early on in the WUST threads)
The idea of 350 GFLOPS was floated but largely dismissed compared to the general groupthink settling on a higher number. This is in reference to recent speculations - not even considering much earlier and much higher consensus speculation. The only one who seemed adamant that it was ~350 GFLOPS was roundly derided as trolling.
 
To Blu, Thraktor and company: for a thread this important, you need to seek mod assistance.

It's not too late. Close this one and open another where people who can contribute debate and educate, so the rest of us can pay attention and maybe learn something. Any commentary out of context gets deleted.

That would be nice. Or maybe just keep using the old "Wii U technical discussion" thread?

In any case, more tech and less Dragonball would be really nice.
 
I never understood DBZ because I have never seen it. Wii U should do fine for what it is. It will get great games regardless, just like Wii did. Nintendo will have to woo third parties to do something exclusive for this.

I personally can live with any spec but the biggest mistake they made is the controller. With wii they had something revolutionary in their hand. They lost it. They should have improved that and kept it.
 
Are we finally getting close to a compare spec comparison between all three next gen consoles?

Wii U!

352 Gflops GPU
2GB DDR3 RAM @ 48MB/s
32MB eDRAM @ 78MB/s
???? CPU

Durango!

1,200 Gflops GPU (3,4x more powerful than Wii U)
8GB DDR3 RAM @ 78MB/s
32MB eDRAM @ 178MB/s (???)
???? CPU

Orbis!

1,800 Gflops GPU (5x more powerful than Wii U, 1,5x more powerful than Durango)
4GB GDDR5 RAM @ 192 MB/s
???? CPU

What? From what I remember the main pool of RAM is not DDR3, and its bandwidth was reported to be 12.8 GB/s. Has this changed???

Edit: also you put "MB/s" on everything when you meant to put "GB/s". Anyway, Wii U bandwidth to the main pool is a lot less than you think it is.
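As a rough sanity check on the 12.8 GB/s figure, here is one configuration that would produce it. The 1600 MT/s transfer rate and 64bit bus are assumptions picked because they match the number; nothing posted here confirms them, and the function name is made up for illustration.

```python
def memory_bandwidth_gb_s(transfers_per_s: float, bus_bits: int) -> float:
    """Peak bandwidth in decimal GB/s for a given transfer rate and bus width."""
    return transfers_per_s * bus_bits / 8 / 1e9

# Assumption: 1600 million transfers per second on a 64bit bus.
main_pool = memory_bandwidth_gb_s(1600e6, 64)
print(f"{main_pool:.1f} GB/s")
```

That is roughly a quarter of the 48 figure listed in the comparison, once the units there are read as GB/s.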
 

guek

Banned
In any case, more tech and less Dragonball would be really nice.

True that. *sorry*

If anyone's gonna write up a console spec comparison thread though, it should be Thraktor with the help of the others mentioned. He's usually incredibly neutral in analysis.
 

Thraktor

Member
To Blu, Thraktor and company: for a thread this important, you need to seek mod assistance.

It's not too late. Close this one and open another where people who can contribute debate and educate, so the rest of us can pay attention and maybe learn something. Any commentary out of context gets deleted.

Thanks for the advice, but even for the five of us who were tasked with analysing the photo (who had been following Wii U hardware news in detail for over a year and a half), we simply aren't capable of figuring out all the details on our own. By posting the photo up with our current theories and "crowdsourcing" the problem to GAF, we stand a much better chance of finding out what's going on in there. Perhaps when we get to a point where we know the bulk of the die's functionality we can go to a new thread, but at the moment I'm happy keeping the conversation going and updating the OP as needed.

Edit: Perhaps I misunderstood what you were saying. It's inevitable on a forum as broad as GAF that you're going to get a handful of people jumping to conclusions in a thread like this, however I'm fairly happy with some of the progress we've already made, and I wouldn't feel entirely comfortable asking mods to operate some sort of crazy strict banning policy in the thread. Simply ignore the people who aren't making any contribution to the thread and we'll be fine.

Thraktor

Don't know if this has been mentioned yet but just FYI in your OP you mention Broadway GPU a few times. Wii's GPU is Hollywood.

Thanks! A bit too much coffee today, methinks.
 
one thing for sure, we can't easily tell marginal differences in power by looking at games. X might be something the 360 and PS3 couldn't do... but you can't easily tell. and 'can't do' means what exactly?

maybe they could run the game with a minor drop in graphics. would that mean they 'couldn't do' X? I remember arguments on whether or not the 360 could handle Uncharted. It seems laughable now.

It's not likely to ever return to the situation we had in PS2 vs Xbox, or Wii vs 360/PS3 where there were specific effects that we could definitively say the weaker hardware couldn't do.
 

ozfunghi

Member
The idea of 350 GFLOPS was floated but largely dismissed compared to the general groupthink settling on a higher number. This is in reference to recent speculations - not even considering much earlier and much higher consensus speculation. The only one who seemed adamant that it was ~350 GFLOPS was roundly derided as trolling.

This is false. It has always been considered a possibility, albeit the low-end estimate. There was even speculation among Wsippel and BGassassin well over half a year ago, that stated these flop numbers in combination with fixed functions. Guys like USC-fan were stating that reaching 360 numbers would be as good as it got, later changing their tune to "best case scenario/we should be lucky to get 350".

If anyone predicted this in the speculation threads, you would get neutered.

Funny how two of the leading persons of the WUST threads actually contemplated that exact option.
 
The idea of 350 GFLOPS was floated but largely dismissed compared to the general groupthink settling on a higher number. This is in reference to recent speculations - not even considering much earlier and much higher consensus speculation. The only one who seemed adamant that it was ~350 GFLOPS was roundly derided as trolling.

If anyone predicted this in the speculation threads, you would get neutered.
 

AzaK

Member
Yes, it's old, and yes, I think it's custom. It has to be, it doesn't seem to use any of the three macros listed on the site. As I wrote, there are no 4MB macros on the site to begin with.

For the bus, I was just counting pins. Figured that if it works for external memory, it should work for eDRAM as well.;)

Re pin counting. Do we just read each small block as essentially a pin that goes to the internal edge, then all of those go out to the outer edge? If so, each big block of that eDRAM is 2048 (edited from 1024) bits, right?
 

Plinko

Wildcard berths that can't beat teams without a winning record should have homefield advantage
If anyone predicted this in the speculation threads, you would get neutered.

I don't think anybody reasonable would have ever considered Nintendo would have gone as underpowered as they did with this system.

It makes absolutely no sense unless the tablet is ultra-expensive.
 

Schnozberry

Member
Seems like we learned quite a bit today. But of course, with answers comes more questions. I'm still really interested at what all that extra metal is doing that has yet to be identified.
 
The speculated bandwidth in the OP is more than sufficient for Wii U, in terms of eDRAM.

Probably, but this new doubled speed speculation is much more to my liking :)

If true @140GB/s it then poses the obvious question - WTF is Microsoft doing? Maybe there's a secret speed doubling there too.
 
Well, at least now we know why the GPU felt so much better than everything else in the case, making it an unbalanced architecture. It just wasn't that much better!
 
This is false. It has always been considered a possibility, albeit the low-end estimate. There was even speculation among Wsippel and BGassassin well over half a year ago, that stated these flop numbers in combination with fixed functions.
I've been following those threads. The idea of "Oh, well we all thought it would be ~350 GFLOPS. This is just what we expected." is ridiculous revisionist history. The number was floated occasionally more recently, yes, but never seriously expected by the bulk of posters. The expectations have progressively tempered, yes, but they were still higher even recently. It's what happens in echo chambers/groupthink havens - the same scenario occurred with Durango and its secret sauce.

Well over half a year ago people were floating crazy, stupid numbers like 800 GFLOPS.
 

Kimawolf

Member
If anyone predicted this in the speculation threads, you would get neutered.

The issue with all those people is, while there was good speculation going on at the low and high end, you had people coming into those threads literally losing their minds and being so antagonizing that you had to outright dismiss them as troll fools. So don't make it out like people came in there with good intentions. They generally went far beyond ribbing into trolling.
 

Thraktor

Member
Yes, it's old, and yes, I think it's custom. It has to be, it doesn't seem to use any of the three macros listed on the site. As I wrote, there are no 4MB macros on the site to begin with.

For the bus, I was just counting pins. Figured that if it works for external memory, it should work for eDRAM as well.;)

Which pins? My counts seem to be coming up different.
 

guek

Banned
The idea of 350 GFLOPS was floated but largely dismissed compared to the general groupthink settling on a higher number. This is in reference to recent speculations - not even considering much earlier and much higher consensus speculation. The only one who seemed adamant that it was ~350 GFLOPS was roundly derided as trolling.

You can be right and still be a troll, they're not mutually exclusive.

It's nice in hindsight to pretend that everyone was a raving fanboy (there were those, I may have been occasionally guilty) but the reality is there were a fair share of pessimistic people that were usually involved in the conversation. It's also pretty unfair to discount the pervading rumors and leaks at the time, especially in the very beginning around 2 years ago, which seemed to hint at substantially more. People were overly optimistic but there were reasons for that beyond fanboy fanaticism.

Just because someone was correct doesn't automatically make everyone else irrational in hindsight.
 