
WiiU "Latte" GPU Die Photo - GPU Feature Set And Power Analysis

Read my last post. The question I am asking is because the performance we are seeing can in no way, shape or form be justified by what we know of the memory setup, even if it were worse. The fact that so many games suffer from it so excessively makes me think there has to be another explanation.
Really simple to understand. Current engines were made with the PS3/360 in mind, which means 22-40 GB/s of bandwidth to the bigger pool of memory and, in the 360's case, a small amount of eDRAM that can't be read directly by the GPU.

Even though memory management is one of the easier parts to adapt, the kits weren't even finalized, games were totally rushed, and the maximum bandwidth of the Wii U's big pool of RAM is only 12.8GB/s, so it's easy to see where these "artificial" bottlenecks appear.

The CPU is also totally different, with a much more modest SIMD design but much better general-purpose performance.*

So until engines are adapted to a friendlier design (and both the PS4 and 720 will have that design, by the way), we won't be able to speak about the console's flaws based on the games currently on sale.


*Hell, even the claim that "one thread of the 360 CPU was only 20% faster than Broadway in a general-case scenario" is not a fair comparison, because the 360 had a shared L2 cache, which means that if only one thread was used, that thread had the whole 1MB of L2 cache to itself.

So the correct sentence would be "one thread of the 360 CPU, with the whole 1MB of L2 cache available to it, was only 20% faster than Broadway". With more threads in use, and since we know that cache misses were EXPENSIVE on the 360 (500 cycles waiting for data XD), it's obvious that the Wii U CPU, which is at LEAST three Wii cores overclocked to 1.24GHz (70% more clock), with two of them having the L2 cache DOUBLED and the other one having it multiplied by EIGHT, will be in another league in real-world scenarios. And that's not even considering that the Wii U has a proper DSP and some other processors to do what the 360 had to use its CPU for, which meant even more accesses to main memory and thus less real-world performance...
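As a rough sanity check on those bandwidth figures, here is a minimal back-of-the-envelope sketch in Python. The bus widths and speed grades (Wii U DDR3-1600 on a 64-bit bus, 360 GDDR3-1400 on a 128-bit bus) are the commonly reported ones, assumed here for illustration rather than confirmed by the die shot:

    # Peak main-RAM bandwidth = transfer rate x bus width (in bytes).
    # Assumed figures, purely illustrative.
    def peak_gb_per_s(mega_transfers, bus_bits):
        return mega_transfers * 1e6 * (bus_bits / 8) / 1e9

    print(peak_gb_per_s(1600, 64))   # Wii U DDR3, 4 x 16-bit chips -> 12.8 GB/s
    print(peak_gb_per_s(1400, 128))  # 360 GDDR3, 128-bit bus       -> 22.4 GB/s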
 

Earendil

Member
I thought I was the only one who noticed that. I know B3D are supposed to be up there in the tech field, but they are certainly not without bias and agendas.

No, you definitely aren't the only one who noticed. There's a reason I stopped posting over there.

I feel like I've mentioned Nintendoland a number of times and it hasn't so much as been addressed

I haven't played it yet, so I cannot comment. But based on videos, I would agree with you.
 

kinggroin

Banned
I feel like I've mentioned Nintendoland a number of times and it hasn't so much as been addressed

For what it's worth, not only do I agree with you, but I feel it hits the highest visual plateau at launch so far. Alphas and radiosity everywhere, gorgeous self-shadowing and, when fully populated, a nice amount of geometry, all running locked at 60fps. Having a nice aesthetic doesn't hurt either.
 
the only thing he could possibly 'regret' is agreeing with the hypothesis that we're seeing alpha issues on multiple Wii U titles because of memory bandwidth. the things he calls facts... all are. Epic Mickey 2 does allow you to create the issue on demand, and multiple titles show the same issue.

Exactly. I made sure to constantly use the word hypothesis for a reason; if it turns out to be wrong, someone will make a different hypothesis and then it can be tested. I have no beef with being wrong here, the only problem I have is with the ridiculousness of people here in dealing with ideas they don't agree with.
 

ozfunghi

Member
Really simple to understand. Current engines were made with the PS3/360 in mind, which means 22-40 GB/s of bandwidth to the bigger pool of memory and, in the 360's case, a small amount of eDRAM that can't be read directly by the GPU.

Even though memory management is one of the easier parts to adapt, the kits weren't even finalized, games were totally rushed, and the maximum bandwidth of the Wii U's big pool of RAM is only 12.8GB/s, so it's easy to see where these "artificial" bottlenecks appear.

The CPU is also totally different, with a much more modest SIMD design but much better general-purpose performance.*

So until engines are adapted to a friendlier design (and both the PS4 and 720 will have that design, by the way), we won't be able to speak about the console's flaws based on the games currently on sale.


*Hell, even the claim that "one thread of the 360 CPU was only 20% faster than Broadway in a general-case scenario" is not a fair comparison, because the 360 had a shared L2 cache, which means that if only one thread was used, that thread had the whole 1MB of L2 cache to itself.

So the correct sentence would be "one thread of the 360 CPU, with the whole 1MB of L2 cache available to it, was only 20% faster than Broadway". With more threads in use, and since we know that cache misses were EXPENSIVE on the 360 (500 cycles waiting for data XD), it's obvious that the Wii U CPU, which is at LEAST three Wii cores overclocked to 1.24GHz (70% more clock), with two of them having the L2 cache DOUBLED and the other one having it multiplied by EIGHT, will be in another league in real-world scenarios. And that's not even considering that the Wii U has a proper DSP and some other processors to do what the 360 had to use its CPU for, which meant even more accesses to main memory and thus less real-world performance...

Yes.

But this is still nothing more than a logical hypothesis. I would like actual confirmation from someone knowledgeable (like Blu, AlStrong...) that the memory setup is more than adequate to pull off alpha the way games so far have "tried" to, and that they have thus simply gone about it in a bad way.
 

ozfunghi

Member
Exactly. I made sure to constantly use the word hypothesis for a reason; if it turns out to be wrong, someone will make a different hypothesis and then it can be tested. I have no beef with being wrong here, the only problem I have is with the ridiculousness of people here in dealing with ideas they don't agree with.

I think the confusion comes from you stating that you are going on "facts", while in fact, you are only going on "facts" to support a hypothesis. Hence, the hypothesis still isn't a "fact".
 

guek

Banned
Exactly. I made sure to constantly use the word hypothesis for a reason; if it turns out to be wrong, someone will make a different hypothesis and then it can be tested. I have no beef with being wrong here, the only problem I have is with the ridiculousness of people here in dealing with ideas they don't agree with.

What ridiculousness? You're the one shouting in caps and sighing about fanboys. Who attacked you?

I might as well do what you did and say "Sigh, what a typical rant by a nintendo hater" but that wouldn't help anything now would it?
 
For what it's worth, not only do I agree with you, but I feel it hits the highest visual plateau at launch so far. Alphas and radiosity everywhere, gorgeous self-shadowing and, when fully populated, a nice amount of geometry, all running locked at 60fps. Having a nice aesthetic doesn't hurt either.

Couldn't agree more. The plaza with lots of gifts and fully populated looks really awesome! Even the zoomed out view.
 
That part is confusing. Do you mean the WiiU's eDRAM is faster than the 360's? Or their respective RAM? If it's the latter, it's not faster, no.

It is, the Wii U can use its full bandwidth to either read or write, while the 360 had only half available for each.
 
It is, the Wii U can use its full bandwidth to either read or write, while the 360 had only half available for each.

This image says that the 360's CPU gets half but the GPU gets the full 22GB/s. At least that's how I read it.

[Image: X360bandwidthdiagram.jpg]
 

ozfunghi

Member
That part is confusing. Do you mean the WiiU's eDRAM is faster than the 360's? Or their respective RAM? If it's the latter, it's not faster, no.

Define "faster". If you're talking about latency, then the Wii U is faster. If going by theoretical numbers, the 360 has higher bandwidth. But it has 22GB/s... divided in two, for read and write. The Wii U has 12.8GB/s, which can be used for either (but not both at the same time). Supposedly, the Wii U's real-world performance would also come closer to its theoretical peak. So the maximum read bandwidth for the Wii U is actually higher than the 360's. The same goes for write, but never both at once. Since the gDDR3 on the Wii U is over twice the size, it will also need to do less reading/writing at once than the 360.

This is how I understand it.
 
Define "faster". If you're talking about latency, then the Wii U is faster. If going by theoretical numbers, the 360 has higher bandwidth. But it has 22GB/s... divided in two, for read and write. The Wii U has 12.8GB/s, which can be used for either (but not both at the same time). Supposedly, the Wii U's real-world performance would also come closer to its theoretical peak. So the maximum read bandwidth for the Wii U is actually higher than the 360's. The same goes for write, but never both at once. Since the gDDR3 on the Wii U is over twice the size, it will also need to do less reading/writing at once than the 360.

This is how I understand it.

This is how I understood it as well. I didn't mean theoretical bandwidths, I meant real-world performance.
 

neo-berserk

Neo Member
I think it's equally silly to dismiss offhand the idea that unfamiliarity may have caused the developer in question to not leverage the memory subsystem properly. If the documentation was nondescript and the SDK/final kits were very late, it certainly seems possible to me.

Exactly, like Nintendo helping Shin'en with Neo. They (Shin'en) said that they got twice the performance out of memory, or something like that, I don't remember well...
 
According to the image on page 1, the Wii U's eDRAM and GPU are on the same chip. I think people were throwing around 140GB/s for that connection, though I don't know if that was revised down or up. We also see DDR interfaces on page 1, which we know the speed of (12.8GB/s). Not sure about the CPU->GPU speed though, maybe that was discovered earlier and I didn't see it. I guess that because the GPU and eDRAM are on the same chip, there will be one bus from that chip to the CPU. I think the Southbridge is on the GPU chip, but I don't think it's important in any case.

CPU <-???-> SoC (eDRAM <-140GB/s estimate-> GPU) <-12.8GB/s-> DDR3 RAM
 

tipoo

Banned
Define "faster". If you're talking about latency, then the Wii U is faster. If going by theoretical numbers, the 360 has higher bandwidth. But it has 22GB/s... divided in two, for read and write. The Wii U has 12.8GB/s, which can be used for either (but not both at the same time). Supposedly, the Wii U's real-world performance would also come closer to its theoretical peak. So the maximum read bandwidth for the Wii U is actually higher than the 360's. The same goes for write, but never both at once. Since the gDDR3 on the Wii U is over twice the size, it will also need to do less reading/writing at once than the 360.

This is how I understand it.

This image says that the 360's CPU gets half but the GPU gets the full 22GB/s. At least that's how I read it.

[Image: X360bandwidthdiagram.jpg]


Good point here; if it's the case that only the CPU on the 360 gets half that speed for both read and write, and the GPU gets the full 22GB/s for reads, that's a bigger deal. GPUs are more bandwidth intensive/bound. The 360 also has a similar eDRAM to take away some higher-bandwidth operations, although the Wii U does have 3x more of it. Would the extra eDRAM make up for the other's GPU having twice the speed to the larger pool of RAM? I have no idea, but instinct says no.
 

Raist

Banned
Interesting. Can a tech guru clear this up? Or show a comparable image for the Wii U's layout?

It's rather simple.

The 360's RAM is faster than the WiiU's; there's no arguing about it (I mean it's GDDR3 vs DDR3, that's expected).

The 360's CPU access to the RAM might be slower than the WiiU's, but the only scenario where that would happen is when the CPU's basically utilizing almost all the RAM's bandwidth, which is unlikely.
 
It is, the Wii U can use its full bandwidth to either read or write, while the 360 had only half available for each.
No, the 360 can use its full bandwidth to read; the one that is limited to 11.1GB/s (10.8GB/s according to the picture posted above) is the CPU, as far as I understand it.
But I don't think the CPU would need more than that, and since the GPU can read at the full 22.2GB/s, it's safe to assume the whole 22.2GB/s as the maximum theoretical speed available. Latencies are something else, and I don't doubt for a moment that the DDR3 on the Wii U has lower latencies than the GDDR3 used on the 360, and that the aggregate latencies are even lower since the Wii U uses an MCM design and the CPU is much closer to the memory than it was on the 360.

But if used in the same way, the Wii U's theoretical maximum bandwidth from its biggest pool of memory is nearly half what the 360 has, which is a possible source of "artificial" bottlenecks.
 

tipoo

Banned
Also the image appears to be right: The CPU bandwidth each way is cut in half but the GPU gets the full 22. So says Anandtech.

http://www.anandtech.com/show/1719/7

With the Wii U, the markings on the DRAM chips themselves tell us how much data they can move per second, so it's not like the 360's case where the GPU can have double the bandwidth of the CPU; that's just how much they can move in total, period, regardless of how they are connected.

In the Wii U, the RAM connects to the GPU first, right? I wonder if the CPU bandwidth is similarly cut down from that.
 

neo-berserk

Neo Member

krizzx

Junior Member
It's rather simple.

The 360's RAM is faster than the WiiU's; there's no arguing about it (I mean it's GDDR3 vs DDR3, that's expected).

The 360's CPU access to the RAM might be slower than the WiiU's, but the only scenario where that would happen is when the CPU's basically utilizing almost all the RAM's bandwidth, which is unlikely.

The GC and the Wii both used GDDR3 RAM but the Wii U's RAM is faster, so I don't understand that logic. Also, there is far more to RAM speed than clock speed. The CAS latency is just as, if not more, important. There are other aspects that also affect the RAM speed that escape me at the moment.

Also, it's gDDR3 RAM for the Wii U, I thought. Have they found out what the lower case g means yet? I do not understand why people keep omitting it when discussing the Wii U's RAM.

There is one other thing I don't understand that seems to contradict most of the slow RAM bottleneck theory. All of the Wii U ports of 360 and PS3 games have much faster loading times on the Wii U. Is that related to the RAM?
 

prag16

Banned
In the Wii U, the RAM connects to the GPU first, right? I wonder if the CPU bandwidth is similarly cut down from that.

I would probably doubt that, but I could be wrong. It seems reasonable that it may be more likely to affect the latency, though.

There is one other thing I don't understand that seems to contradict most of the slow RAM bottleneck theory. All of the Wii U ports of 360 and PS3 games have faster loading times on the Wii U. Is that related to RAM?

That's news to me. Very interesting if true.
 
Also, it's gDDR3 RAM for the Wii U, I thought. Have they found out what the lower case g means yet? I do not understand why people keep omitting it when discussing the Wii U's RAM.

The Wii U has several RAM providers, all with chips rated at the same speed. IIRC, while two vendors supply regular DDR3, one vendor supplies gDDR3, which is their lower-voltage RAM module, rated at the same speed as the other two vendors' chips.

There is one other thing I don't understand that seems to contradict most of the slow RAM bottleneck theory. All of the Wii U ports of 360 and PS3 games have much faster loading times on the Wii U. Is that related to the RAM?

No. That's more so flushing data from the disc: like copying a file from one location (BD) to another (RAM), with whatever decompression comes along the way. It depends solely on BD disc speed.
 
The GC and the Wii both used GDDR3 RAM but the Wii U's RAM is faster, so I don't understand that logic. Also, there is far more to RAM speed than clock speed. The CAS latency is just as, if not more, important. There are other aspects that also affect the RAM speed that escape me at the moment.

Also, it's gDDR3 RAM for the Wii U, I thought. Have they found out what the lower case g means yet? I do not understand why people keep omitting it when discussing the Wii U's RAM.

There is one other thing I don't understand that seems to contradict most of the slow RAM bottleneck theory. All of the Wii U ports of 360 and PS3 games have faster loading times on the Wii U. Is that related to the RAM?

We know what gDDR3 RAM is. It's the same as DDR3 RAM.

The faster loading times could be down to a faster disc drive, or to the developer pre-loading some assets, given that the Wii U has double the RAM available to devs compared to the 360.
 

neo-berserk

Neo Member
The GC and the Wii both used GDDR3 RAM but the Wii U's RAM is faster, so I don't understand that logic. Also, there is far more to RAM speed than clock speed. The CAS latency is just as, if not more, important. There are other aspects that also affect the RAM speed that escape me at the moment.

Also, it's gDDR3 RAM for the Wii U, I thought. Have they found out what the lower case g means yet? I do not understand why people keep omitting it when discussing the Wii U's RAM.

There is one other thing I don't understand that seems to contradict most of the slow RAM bottleneck theory. All of the Wii U ports of 360 and PS3 games have much faster loading times on the Wii U. Is that related to the RAM?

Don't forget it's not normal RAM; it's heavily customized by IBM to meet Nintendo's needs.
 

blu

Wants the largest console games publisher to avoid Nintendo's platforms.
Ok, so for the less tech savvy me, that argument is void? Unless the 32MB eDRAM is slower than that? But that is unlikely? And what about tipoo's argument, that not everything could be stored in MEM1?
I was referring to peak fillrate alone. From there on everything is open for debate.

12.8GB/s for static assets is not little by any measure. Of course 22.4GB/s is better. But then come the questions of what part of the dynamic assets (i.e. mostly render-target textures) Latte can keep onboard - for Xenos that amount is 0 (Xenos cannot texture from eDRAM - it has to first resolve a render target to main mem, and then fetch it back from there).

Can there be a hypothetical scenario where 360's 22.4GB/s could give it an advantage over WiiU, based on static assets fetching alone? Sure. How often would that happen in practice? We need to hear actual multi-platform developers' comments on the subject to be able to get an idea.

As re the alpha-blending issue that Epic Mickey is supposed to demonstrate - that's just a hypothesis. The bottleneck, if in hw, does not have to necessarily be in the ROPs - it could be in trisetup just as well. Or the bottleneck might not be in the hw at all. Claiming a game demonstrates it without actually having analyzed the situation (read: profiled it and/or ran synthetic tests isolating all but the crucial aspects) is just not serious.
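To put a rough number on that resolve cost, here is a hedged sketch; the 720p RGBA8 render target and 60fps figures below are example assumptions, not measured data:

    # Main-RAM traffic caused by resolving one render target out of eDRAM and
    # then texturing from it again, as Xenos has to do. Example figures only.
    width, height, bytes_per_pixel, fps = 1280, 720, 4, 60
    target_mb = width * height * bytes_per_pixel / 1e6      # ~3.7 MB per target
    traffic_gb_s = target_mb * 2 * fps / 1e3                # resolve out + fetch back
    print(target_mb, traffic_gb_s)                          # ~3.7 MB, ~0.44 GB/s per full-screen target

Multiply that by however many render-target textures a frame reuses, and it becomes clearer why being able to keep them in eDRAM (if Latte can) would relieve the main-RAM bus.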
 

Schnozberry

Member
Regarding the Toki Tori Dev comment, does anyone know what kind of magical texture compression might be taking place? Is there some kind of hardware compression and decompression of textures that wasn't available on the PS360 GPUs?
 

QaaQer

Member
The Wii U has several RAM providers, all with chips rated at the same speed. IIRC, while two vendors supply regular DDR3, one vendor supplies gDDR3, which is their lower-voltage RAM module, rated at the same speed as the other two vendors' chips.

The lower case 'g' refers to the package type. Somebody went through the Samsung catalog to figure that out; it is just DDR3 memory, 800MHz IIRC.
 

Raist

Banned
The GC and the Wii both used GDDR3 RAM but the Wii U's RAM is faster, so I don't understand that logic. Also, there is far more to RAM speed than clock speed. The CAS latency is just as, if not more, important. There are other aspects that also affect the RAM speed that escape me at the moment.

It's not faster >.> And we're talking bandwidth, not clock speed.
As for latency, I don't think there is a massive difference between DDR3 and GDDR3, but don't quote me on that.
 

krizzx

Junior Member
It's not faster >.> And we're talking bandwidth, not clock speed.
As for latency, I don't think there is a massive difference between DDR3 and GDDR3, but don't quote me on that.

Even amongst RAM of the same type and clock speed, latency makes a lot of difference as well as other things.

http://www.overclockersclub.com/reviews/corsair_vengeance_12gb/3.htm
http://www.techradar.com/us/reviews...tracer-1333mhz-717099/review/2#articleContent
http://pro-clockers.com/memory/2589-crucial-ballistix-sport-vlp-1600mhz-16gb-kit-review.html?start=5

Just as with a CPU, there is far more to RAM performance than clocks. I would imagine that while the clock in the Wii U's ram might be lower, the performance, hertz for hertz, may well be better than the 360's memory. Just a hypothesis of course.

What do you mean by "its not faster"? Are you inferring that the RAM in the Wii U is slower than the RAM in the Wii and Gamecube?
 

wsippel

Banned
Regarding the Toki Tori Dev comment, does anyone know what kind of magical texture compression might be taking place? Is there some kind of hardware compression and decompression of textures that wasn't available on the PS360 GPUs?
3Dc comes to mind. Though Xenos supports 3Dc - RSX doesn't.
 

NBtoaster

Member
Guess we know why Pikmin 3 doesn't feature AF. 16 TMUs at 550MHz means only 8.8 GTexels/s of texture fill rate. Slightly more than the 360 but less than the PS3.
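For comparison, a quick sketch of that arithmetic; the TMU counts and clocks below are the commonly cited specs for each GPU, assumed here rather than taken from the die shot:

    # Texel fill rate = number of TMUs x core clock.
    def gtexels_per_s(tmus, clock_mhz):
        return tmus * clock_mhz / 1000

    print(gtexels_per_s(16, 550))  # Wii U Latte (assumed): 8.8 GT/s
    print(gtexels_per_s(16, 500))  # 360 Xenos:             8.0 GT/s
    print(gtexels_per_s(24, 500))  # PS3 RSX:              12.0 GT/s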
 

ozfunghi

Member
I was referring to peak fillrate alone. From there on everything is open for debate.

12.8GB/s for static assets is not little by any measure. Of course 22.4GB/s is better. But then come the questions of what part of the dynamic assets (i.e. mostly render-target textures) Latte can keep onboard - for Xenos that amount is 0 (Xenos cannot texture from eDRAM - it has to first resolve a render target to main mem, and then fetch it back from there).

Can there be a hypothetical scenario where 360's 22.4GB/s could give it an advantage over WiiU, based on static assets fetching alone? Sure. How often would that happen in practice? We need to hear actual multi-platform developers' comments on the subject to be able to get an idea.

As re the alpha-blending issue that Epic Mickey is supposed to demonstrate - that's just a hypothesis. The bottleneck, if in hw, does not have to necessarily be in the ROPs - it could be in trisetup just as well. Or the bottleneck might not be in the hw at all. Claiming a game demonstrates it without actually having analyzed the situation (read: profiled it and/or ran synthetic tests isolating all but the crucial aspects) is just not serious.

Ok, thanks.
 

guek

Banned
Can there be a hypothetical scenario where 360's 22.4GB/s could give it an advantage over WiiU, based on static assets fetching alone? Sure. How often would that happen in practice? We need to hear actual multi-platform developers' comments on the subject to be able to get an idea.

Dammit blu, you're supposed to be able to give us a definite, concrete answer!

I still <3 u
 

Raist

Banned
Even amongst RAM of the same type and clock speed, latency makes a lot of difference as well as other things.

http://www.overclockersclub.com/reviews/corsair_vengeance_12gb/3.htm
http://www.techradar.com/us/reviews...tracer-1333mhz-717099/review/2#articleContent
http://pro-clockers.com/memory/2589-crucial-ballistix-sport-vlp-1600mhz-16gb-kit-review.html?start=5

Just as with a CPU, there is far more to RAM performance than clocks. I would imagine that while the clock in the Wii U's ram might be lower, the performance, hertz for hertz, may well be better than the 360's memory. Just a hypothesis of course.

What do you mean by "its not faster"? Are you inferring that the RAM in the Wii U is slower than the RAM in the Wii and Gamecube?

You don't seem to understand the difference between clock speed and bandwidth. I've pointed that out before.

And no, neither the GC's (which never had GDDR3, btw) nor the Wii's RAM was faster.

The 360's memory is split into two channels, 10.4GB/s for reads and 10.4GB/s for writes. That's for anything that uses that memory.

Did you somehow miss the graph summarizing the architecture like 4 posts above what you quoted? The split is only in regards to the CPU<->GPU bandwidth (and the CPU accesses the RAM via the GPU). The GPU<->RAM access is not affected by this.
 

Schnozberry

Member
Did you somehow miss the graph summarizing the architecture like 4 posts above what you quoted? The split is only in regards to the CPU<->GPU bandwidth (and the CPU accesses the RAM via the GPU). The GPU<->RAM access is not affected by this.

The split occurs with the GPU as well while it is reading and writing simultaneously. It can write or read at full speed, but not both at the same time. Also, based on developer commentary, real-world bandwidth didn't reach theoretical peaks in either situation.
 
Big kudos to the Chipworks guys.

Glad to see that there is some good speculation mixed in here amongst the trolling.

Agreed, I haven't tossed my thanks in yet but I do very much appreciate the efforts of those who got the die shot and then are interpreting it.

Also, why do I feel like your avatar more every day?
 

prag16

Banned
The split occurs with the GPU as well while it is reading and writing simultaneously. It can write or read at full speed, but not both at the same time. Also, based on developer commentary, real-world bandwidth didn't reach theoretical peaks in either situation.

Are you saying the Wii U CAN read/write simultaneously, with both operations occurring at potentially over 10 GB/s? Not sure if that was made clear yet (probably was, and I just missed it).
 

Schnozberry

Member
Are you saying the Wii U CAN read/write simultaneously, with both operations occurring at potentially over 10 GB/s? Not sure if that was made clear yet (probably was, and I just missed it).

No, it cannot. The 360 can read and write simultaneously, but it must split its bandwidth when it does so. The Wii U reads and writes at full bandwidth, but not at the same time.
 

prag16

Banned
No, it cannot. The 360 can read and write simultaneously, but it must split its bandwidth when it does so. The Wii U reads and writes at full bandwidth, but not at the same time.

Understood. The 360 definitely has a theoretical advantage then (if not an enormous one; definitely less than the almost 2x the raw speeds suggest). It would be interesting to see how that would play out in real-world conditions, both in a vacuum and considering the rest of the respective memory setups.
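A toy model of that comparison, taking the descriptions in this exchange at face value (a hedged sketch: it ignores latency, burst overheads and how traffic is actually scheduled, and simply assumes the 360 splits into two ~11GB/s directions while the Wii U serialises reads and writes at 12.8GB/s):

    # Time (in seconds) to move a workload of R GB of reads and W GB of writes
    # under the two schemes described above. Purely illustrative.
    def time_360(reads_gb, writes_gb):
        # reads and writes proceed in parallel, each on an ~11 GB/s half
        return max(reads_gb / 11.1, writes_gb / 11.1)

    def time_wiiu(reads_gb, writes_gb):
        # one direction at a time, but each at the full 12.8 GB/s
        return (reads_gb + writes_gb) / 12.8

    print(time_360(2.0, 2.0), time_wiiu(2.0, 2.0))   # balanced mix favours the 360
    print(time_360(3.5, 0.5), time_wiiu(3.5, 0.5))   # a very read-heavy mix roughly evens out

Under those assumptions the 360 wins whenever reads and writes are fairly balanced, and the gap shrinks as the workload becomes almost entirely one-directional.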
 