
WiiU "Latte" GPU Die Photo - GPU Feature Set And Power Analysis

256bit? What pins did you count? The things I believe should be the interface pins are pretty tiny and hard to count precisely, but the closest ballpark seems to be 512 - per macro. So 2048bit for the 2MB pool.

We have to use a bit of logic as to why MEM0 is there to begin with, and split into two pools at that. Plainly, if the 32 MB eDRAM was fast enough to be used for Wii's eFB and eTC, it would be - there's enough of it on there. So using that logic, the 1 MB texture cache is faster than the 2 MB framebuffer, which is in turn faster than MEM1. Of course, when we say faster, we are talking relatively - bandwidth per kilobyte of SRAM. The real issue is the individually addressable banks. Just as we would expect, the SRAM texture cache is broken into 32 banks, just as on Hollywood. It follows that it should be on a 512 bit bus as previously, otherwise why not have it as eDRAM? If that 2 MB pool is carrying that much bandwidth, the SRAM pool would be senseless. It makes more sense for that 2 MB pool to be on the same 256-bit bus as its predecessor.

Now take a look at the arrangement of the rows and columns in the eDRAM pools. In the 2 MB pool the columns run linearly across the interface on the bottom of the macros. There are about 64 columns across, with a couple extra on the ends (for redundancy/copy protection, I believe it was said). That visually seems to suggest that each macro is running on a 64-bit bus.

For the macros that make up the 32 MB pool, the columns are arranged in a different fashion, coming together like a sandwich perpendicular to the interface. The columns are 32 across and there are 4 of these intersections on each macro. That makes 1024-bit for the whole deal - a number which was previously speculated on and seems to make sense in performance analysis.
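For reference, here is a rough sanity check on what the bus widths being tossed around here would imply, assuming the eDRAM runs at the Wii U GPU's reported 550MHz clock and moves one transfer per cycle (my own arithmetic, not a measurement):

```python
# Peak bandwidth implied by a bus width, assuming a 550 MHz clock and
# one transfer per cycle (both assumptions, not measured figures).
def bandwidth_gbs(bus_bits: int, clock_mhz: float = 550.0) -> float:
    return bus_bits * clock_mhz * 1e6 / 8 / 1e9

for bits in (256, 512, 1024, 2048):
    print(f"{bits:>4}-bit @ 550 MHz -> {bandwidth_gbs(bits):.1f} GB/s")

#  256-bit @ 550 MHz -> 17.6 GB/s
#  512-bit @ 550 MHz -> 35.2 GB/s
# 1024-bit @ 550 MHz -> 70.4 GB/s
# 2048-bit @ 550 MHz -> 140.8 GB/s
```

The 70.4 and 140.8 GB/s figures are the same ones that come up later in the thread for the 32 MB pool.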
 

Donnie

Member
Wasn't the 2MB eDRAM in Flipper/Hollywood on a 384-bit bus? AFAIR its bandwidth was 7.6GB/s at 162MHz (GC) and 11.4GB/s at 243MHz (Wii).
 
I wrote it down wrong. I said "half as powerful again" as the 360. I didn't mean that the Wii U is only half as powerful as the 360; I meant as powerful as the 360 plus half as much again on top (that would be 1.5 times as powerful, then?). I have been told the answer, just clearing something up.

I knew what you meant :)

Considering how little power the Wii U uses, we really must not be far away from mobiles that have more graphical power than the 360/PS4.

I think you hit the nail on the head here. I'd be very surprised if the next Nintendo console wasn't a handheld/console hybrid that streams video to the TV.
 
I knew what you meant :)



I think you hit the nail on the head here. I'd be very surprised if the next Nintendo console wasn't a handheld/console hybrid that streams video to the TV.

And I'm completely of the opposite opinion. There is little to no way that in 4 to 5 years' time (the minimum for a gen leap) technology will be a decent leap above the Wii U in a mobile form.
 
And I'm completely of the opposite opinion. There is little to no way that in 4 to 5 years' time (the minimum for a gen leap) technology will be a decent leap above the Wii U in a mobile form.
I don't know why Nintendo would put so much effort into making the Wii U compact and energy efficient while combining their handheld and console divisions if they weren't planning on releasing a single device that can act as both next gen. I doubt it'll be a decent leap above the Wii U, like you said, but does it have to be? It not being a gen above the Wii U graphically will save them money on developing games. And it will solve the problem they have right now where they're spreading themselves thin developing for two systems.

The only reason I can think that they won't do this is because they wouldn't want to put all their eggs in one basket.

Why do you think they won't do this?
 

sfried

Member
I think you hit the nail on the head here. I'd be very surprised if the next Nintendo console wasn't a handheld/console hybrid that streams video to the TV.

Everything about their architectural philosophy (about using less electric power) has been leading up to this. It was prominent with the Wii, and now even more prominent with the Wii U. I don't know why else they would be looking for something that draws so few watts.
 

blu

Wants the largest console games publisher to avoid Nintendo's platforms.
We have to use a bit of logic as to why MEM0 is there to begin with, and split into two pools at that. Plainly, if the 32 MB eDRAM was fast enough to be used for Wii's eFB and eTC, it would be - there's enough of it on there.
It could be fast enough to cover Flipper's aggregate eFB + TC BW and yet not meet Flipper's needs for TC latencies.
 

japtor

Member
I don't know why Nintendo would put so much effort into making the Wii U compact and energy efficient while combining their handheld and console divisions if they weren't planning on releasing a single device that can act as both next gen. I doubt it'll be a decent leap above the Wii U, like you said, but does it have to be? It not being a gen above the Wii U graphically will save them money on developing games. And it will solve the problem they have right now where they're spreading themselves thin developing for two systems.

The only reason I can think that they won't do this is because they wouldn't want to put all their eggs in one basket.

Why do you think they won't do this?
Another reason would be additional complexity introducing additional costs on top of the likely high costs if they target that power spec for a portable to begin with. And otherwise while it might make development easier from a basic tech perspective, it's also harder.

Games would have to be designed to be played on the portable by itself as well as on the big screen. Stuff like HUD elements would have to be designed to take into account the different sizes, they'd have to optimize for possible different performance characteristics (unless they standardize on 720p for the portable screen?), something possible on the big screen might not be as playable on the small screen without tweaking (like the Mario Galaxy example), stuff designed for portable play might not work well for people expecting a console experience (or vice versa), etc. It'd basically be developing portable and console versions and getting a single payout for it.

...which could work if the sales are there, which gets back to the eggs in one basket thing, which could be an issue in the markets where portables are weak vs consoles (and phones), since ultimately it'd look like a portable that connects to a TV.
 

wsippel

Banned
We have to use a bit of logic as to why MEM0 is there to begin with, and split into two pools at that. Plainly, if the 32 MB eDRAM was fast enough to be used for Wii's eFB and eTC, it would be - there's enough of it on there. So using that logic, the 1 MB texture cache is faster than the 2 MB framebuffer, which is in turn faster than MEM1. Of course, when we say faster, we are talking relatively - bandwidth per kilobyte of SRAM. The real issue is the individually addressable banks. Just as we would expect, the SRAM texture cache is broken into 32 banks, just as on Hollywood. It follows that it should be on a 512 bit bus as previously, otherwise why not have it as eDRAM? If that 2 MB pool is carrying that much bandwidth, the SRAM pool would be senseless. It makes more sense for that 2 MB pool to be on the same 256-bit bus as its predecessor.

Now take a look at the arrangement of the rows and columns in the eDRAM pools. In the 2 MB pool the columns run linearly across the interface on the bottom of the macros. There are about 64 columns across, with a couple extra on the ends (for redundancy/copy protection, I believe it was said). That visually seems to suggest that each macro is running on a 64-bit bus.

For the macros that make up the 32 MB pool, the columns are arranged in a different fashion, coming together like a sandwich perpendicular to the interface. The columns are 32 across and there are 4 of these intersections on each macro. That makes 1024-bit for the whole deal - a number which was previously speculated on and seems to make sense in performance analysis.
I don't think that's how it works. The two small pools need to be separate because that's how they were organized on Flipper and Hollywood. They need to be on two separate busses. And MEM1 needs its own bus as well. It's hardware BC after all, the memory needs to be split in three pools on three busses regardless of speed.

Also, I believe the small pads organized in tables at the bottom of each eDRAM macro are the actual interface, and there are ~500 of those per macro.
 

Clefargle

Member
I knew what you meant :)



I think you hit the nail on the head here. I'd be very surprised if the next Nintendo console wasn't a handheld/console hybrid that streams video to the TV.
Fucking this

Everything about their architectural philosophy (about using less electric power) has been leading up to this. It was prominent with the Wii, and now even more prominent with the Wii U. I don't know why else they would be looking for something that draws so few watts.

Another reason would be additional complexity introducing additional costs on top of the likely high costs if they target that power spec for a portable to begin with. And otherwise while it might make development easier from a basic tech perspective, it's also harder.

Games would have to be designed to be played on the portable by itself as well as on the big screen. Stuff like HUD elements would have to be designed to take into account the different sizes, they'd have to optimize for possible different performance characteristics (unless they standardize on 720p for the portable screen?), something possible on the big screen might not be as playable on the small screen without tweaking (like the Mario Galaxy example), stuff designed for portable play might not work well for people expecting a console experience (or vice versa), etc. It'd basically be developing portable and console versions and getting a single payout for it.

...which could work if the sales are there, which gets back to the eggs in one basket thing, which could be an issue in the markets where portables are weak vs consoles (and phones), since ultimately it'd look like a portable that connects to a TV.

Just make it a Dual Screen system and you have the ability to play it either way. Make the charge cradle something like the Wii's but hooked through a cable to the TV. Then you can use the system itself as a controller with one screen blank. Also make both screens capacitive HD touch screens to provide devs with options. It could be the ultimate fusion, offering full BC, even WiiU.
 
Well, regarding GC's eDRAM bus, it had the following bandwidth:
10.4 GB/s Texture cache bandwidth
7.6 GB/s Framebuffer bandwidth

10.4 GB/s at 162 MHz means a 512-bit bus for the eTC
7.6 GB/s at 162 MHz means a 384-bit bus for the framebuffers

Considering that we know for sure that hardware emulation is done via a downclock, those buses have to have at least the same width as they had on the Wii/GC.
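As a quick check of that arithmetic (my own, not from the post): the implied bus width is simply bandwidth divided by clock, assuming one transfer per cycle, and since the quoted GB/s figures are rounded, snapping to the nearest 128-bit multiple recovers the widths claimed above.

```python
# Implied bus width = bandwidth / clock (one transfer per cycle), snapped to
# the nearest 128-bit multiple since the quoted GB/s figures are rounded.
def implied_width_bits(bandwidth_gbs: float, clock_mhz: float) -> int:
    raw_bits = bandwidth_gbs * 1e9 * 8 / (clock_mhz * 1e6)
    return round(raw_bits / 128) * 128

print(implied_width_bits(10.4, 162))  # GC texture cache -> 512
print(implied_width_bits(7.6, 162))   # GC framebuffer   -> 384
print(implied_width_bits(11.4, 243))  # Wii framebuffer  -> 384
```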
 
Another reason would be additional complexity introducing additional costs on top of the likely high costs if they target that power spec for a portable to begin with. And otherwise while it might make development easier from a basic tech perspective, it's also harder.

Games would have to be designed to be played on the portable by itself as well as on the big screen. Stuff like HUD elements would have to be designed to take into account the different sizes, they'd have to optimize for possible different performance characteristics (unless they standardize on 720p for the portable screen?), something possible on the big screen might not be as playable on the small screen without tweaking (like the Mario Galaxy example), stuff designed for portable play might not work well for people expecting a console experience (or vice versa), etc. It'd basically be developing portable and console versions and getting a single payout for it.

...which could work if the sales are there, which gets back to the eggs in one basket thing, which could be an issue in the markets where portables are weak vs consoles (and phones), since ultimately it'd look like a portable that connects to a TV.

Yeah, that could turn out too expensive. I don't know though, Nvidia keeps talking about how their tablet and smartphone GPUs will be better than this past gen's consoles'. And those are coming out when, next year?

I don't think it's too farfetched to have a 720p screen of a decent size on a portable. I'm looking at my Wii U GamePad and I'm sure I could play Galaxy on a screen that size. Comparing that screen to the lid of my 3DS XL, it's just ever so slightly bigger. If they moved the camera and speakers from the top of the XL they could squeeze a decent sized screen in there. In 5 years, it could even be 720p without being too expensive. They could leave the bottom screen as is for DS backwards compatibility. Battery life though...

I could be totally off the mark, but I think the signs are there.
 
Fucking this





Just make it a Dual Screen system and you have the ability to play it either way. Make the charge cradle something like the Wii's but hooked through a cable to the TV. Then you can use the system itself as a controller with one screen blank. Also make both screens capacitive HD touch screens to provide devs with options. It could be the ultimate fusion, offering full BC, even WiiU.
You had me until capacitive.
 
You had me until capacitive.

I know we'd be arguing in circles, back to the pre-Wii U launch days where there were dozens of threads on the issue, but I think everyone came to the consensus that modern capacitive tech is just as precise and accurate as the resistive Nintendo used. The only downfall would be that you can't use ANY object or your fingernail to tap the screen, just capacitive pens or conductive metals.

But you can go ahead and post that 2008 jpg comparing phone screens.
 
I know we'd be arguing in circles, back to the pre-Wii U launch days where there were dozens of threads on the issue, but I think everyone came to the consensus that modern capacitive tech is just as precise and accurate as the resistive Nintendo used. The only downfall would be that you can't use ANY object or your fingernail to tap the screen, just capacitive pens or conductive metals.

But you can go ahead and post that 2008 jpg comparing phone screens.
Umm... Down Cujo.

I have never, nor did I intend to post any photos. I would, however, prefer a multi touch resistive screen for the reasons you just mentioned.

Also, if you want a capacitive screen as accurate as the gamepad screen (and at that size) it would be very cost prohibitive.
 
Umm... Down Cujo.

I have never, nor did I intend to post any photos. I would, however, prefer a multi touch resistive screen for the reasons you just mentioned.

Also, if you want a capacitive screen as accurate as the gamepad screen (and at that size) it would be very cost prohibitive.

The Note 7 and the Nexus 7 have very accurate screens, more than decent for the gaming we have seen on the DS, 3DS and Wii U. The capacitive glass is not a large line item in the price for either of those.

What we're talking about is the theoretical application of a precise touchscreen whose resolution is many times higher than the screen's itself, like the ones manufactured by Wacom, and we've seen several applications of those at decent price points in the last year or so.
 

Randdalf

Member
I knew what you meant :)

I think you hit the nail on the head here. I'd be very surprised if the next Nintendo console wasn't a handheld/console hybrid that streams video to the TV.

Nintendo will surely go with VR in their next console now that the technology is becoming cheaper and more consumer-friendly.
 
The Note 7 and the Nexus 7 have very accurate screens, more than decent for the gaming we have seen on the DS, 3DS and Wii U. The capacitive glass is not a large line item in the price for either of those.

What we're talking about is the theoretical application of a precise touchscreen whose resolution is many times higher than the screen's itself, like the ones manufactured by Wacom, and we've seen several applications of those at decent price points in the last year or so.

The decently priced screen you mention costs more than the entire GamePad.

You realize that the Nexus 7 screen has a BOM of around $62 when a capacitive interface is added.

The GamePad's BOM for all of its components is estimated at around $50.

This discussion is pointless and thread-derailing, though.
 

joesiv

Member
Just make it a Dual Screen system and you have the ability to play it either way. Make the charge cradle something like the Wii's but hooked through a cable to the TV. Then you can use the system itself as a controller with one screen blank. Also make both screens capacitive HD touch screens to provide devs with options. It could be the ultimate fusion, offering full BC, even WiiU.
Makes sense for the consumer, but I don't think it makes sense for Nintendo. I think they'd prefer to just have the console and the portable talk to each other better, and have software moving forward leverage that. The main reason would be that they could target two price points (and power envelopes); typically the portable systems are cheap, allowing the masses to buy them (including young people). If they merged the functionality to be the same, that would essentially mean you'd sell fewer portables, and you'd also miss the consumers that would have bought both.
 

joesiv

Member
I have never, nor did I intend to post any photos. I would, however, prefer a multi touch resistive screen for the reasons you just mentioned.

Also, if you want a capacitive screen as accurate as the gamepad screen (and at that size) it would be very cost prohibitive.
Capacitive is actually pretty good for accuracy, at least my iPhone is. The problem is they're really sensitive, so if you use a big nubby finger, it's only going to be as accurate as the tip of your finger. Use a capacitive stylus and you're fine for accuracy. HOWEVER, the reason why I don't like capacitive touch for things like stylus use (and thus DS/GamePad) is that you have to continuously hover the stylus off the surface unless you want it to be sensed. This makes drawing and other uses a bit more clumsy and annoying (whereas with resistive you can rest the stylus on the surface and just press harder when you're in the right spot). Perhaps it's just a personal thing.

I would say that capacitive touch is far better at tracking changes; things like swipes and pinch zoom wouldn't work as well with resistive (even if it had multi-touch).
 
Wasn't the 2MB eDRAM in Flipper/Hollywood on a 384-bit bus? AFAIR its bandwidth was 7.6GB/s at 162MHz (GC) and 11.4GB/s at 243MHz (Wii).

Right you are, Donnie. Man, I've been at this nonsense so long, I'm forgetting things I used to know! haha. Anyway, I was looking over the shot of Hollywood that Marcan did. The texture cache looks just like one would imagine from the descriptions. 32 macros, each of which contains 16 columns and a 16-bit bus for 512 total pools and a 512-bit aggregate bus.

On the other hand, the eFB is broken up into what appears to be 8 macros of 128 total pools (so far so good). But each macro is comprised of 16 columns (the top rows are actually 17, but that's for error protection supposedly). So each of those ends (the blue parts - need a name for them, anyone?) would need to carry 3 bits of data to get a composite 384-bit pool.

Now on Wii U, that 2 MB pool is somewhat different. Each module has 64 (actually 66) columns for a total of 256. If we're figuring the same 384-bit bus, that breaks down to 1.5 bits for each. But that doesn't make sense to me. From what I know of comp sci, I don't see how you can have half a bit. So maybe some smart person can chime in and help make sense of this.

My only other guess is that each of the modules is actually on a 128-bit bus for the reason that those are the closest Renesas could supply (at the price Nintendo was willing to pay).
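To make the "half a bit" problem concrete, here is the division being described, under the assumption that all four macros split the total bus evenly and every column carries an equal share (my arithmetic, not a claim about the die):

```python
# Wii U 2 MB pool: 4 macros x 64 columns = 256 columns total.
# Bits per column and per macro for a few candidate total bus widths.
MACROS = 4
COLUMNS_PER_MACRO = 64
total_columns = MACROS * COLUMNS_PER_MACRO  # 256

for total_bus_bits in (256, 384, 512):
    per_column = total_bus_bits / total_columns
    per_macro = total_bus_bits // MACROS
    print(f"{total_bus_bits}-bit total: {per_column:.1f} bits/column, {per_macro} bits/macro")

# 256-bit total: 1.0 bits/column, 64 bits/macro
# 384-bit total: 1.5 bits/column, 96 bits/macro   <- the awkward case above
# 512-bit total: 2.0 bits/column, 128 bits/macro  <- the 128-bit-per-macro guess
```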

It could be fast enough to cover Flipper's aggregate eFB + TC BW and yet not meet Flipper's needs for TC latencies.

I hear yah. That's kind of what I meant - that whatever is running Wii BC would need access to a certain number of pools of a certain size in a certain amount of time. Actually, even if MEM1 is on a 2048-bit bus, it still would be slower per megabit than both Wii's eFB and eTC.
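A back-of-envelope version of that comparison, with the 2048-bit/550MHz case assumed for MEM1 and the Wii pools at the figures quoted earlier in the thread (all assumptions on my part, not measurements):

```python
# Bandwidth per unit of capacity: assumed MEM1 best case vs. Wii's embedded pools.
pools = {
    # name: (bandwidth in GB/s, size in MB)
    "Wii U MEM1 (2048-bit @ 550 MHz, assumed)": (2048 * 550e6 / 8 / 1e9, 32.0),
    "Wii eTC (512-bit @ 243 MHz)":              (512 * 243e6 / 8 / 1e9, 1.0),
    "Wii eFB (11.4 GB/s as quoted)":            (11.4, 2.0),
}
for name, (bw, size_mb) in pools.items():
    print(f"{name}: {bw:.1f} GB/s total, {bw / size_mb:.1f} GB/s per MB")

# MEM1: 140.8 GB/s total,  4.4 GB/s per MB
# eTC:   15.6 GB/s total, 15.6 GB/s per MB
# eFB:   11.4 GB/s total,  5.7 GB/s per MB
```

Even in that best case, MEM1 comes out behind both Wii pools per unit of capacity, which is the point being made.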

I don't think that's how it works. The two small pools need to be separate because that's how they were organized on Flipper and Hollywood. They need to be on two separate busses. And MEM1 needs its own bus as well. It's hardware BC after all, the memory needs to be split in three pools on three busses regardless of speed.

Also, I believe the small pads organized in tables at the bottom of each eDRAM macro are the actual interface, and there are ~500 of those per macro.

True, the pools probably need to be on separate buses. But see what I wrote above. Even if the 32 MB is on a 2048-bit bus, I reckon it still wouldn't be enough. So there is good reason to believe those two additional pools would be a necessary inclusion even if they perform the same as on Hollywood/Flipper.

Actually, it's good we came back to this, as it isn't very clear, and the door could still be open on that 32 MB pool running at 140.8 GB/s. I assumed (and it appears) that the ends which come together "sandwich" style would share a bus (or portion of the total bus for the whole macro I should probably say), but indeed, the jury is still out. It would certainly help if we had another Renesas eDRAM module of known specifications to compare against.
 

krizzx

Junior Member
Right you are, Donnie. Man, I've been at this nonsense so long, I'm forgetting things I used to know! haha. Anyway, I was looking over the shot of Hollywood that Marcan did. The texture cache looks just like one would imagine from the descriptions. 32 macros, each of which contains 16 columns and a 16-bit bus for 512 total pools and a 512-bit aggregate bus.

On the other hand, the eFB is broken up into what appears to be 8 macros of 128 total pools (so far so good). But each macro is comprised of 16 columns (the top rows are actually 17, but that's for error protection supposedly). So each of those ends (the blue parts - need a name for them, anyone?) would need to carry 3 bits of data to get a composite 384-bit pool.

Now on Wii U, that 2 MB pool is somewhat different. Each module has 64 (actually 66) columns for a total of 256. If we're figuring the same 384-bit bus, that breaks down to 1.5 bits for each. But that doesn't make sense to me. From what I know of comp sci, I don't see how you can have half a bit. So maybe some smart person can chime in and help make sense of this.

My only other guess is that each of the modules is actually on a 128-bit bus for the reason that those are the closest Renesas could supply (at the price Nintendo was willing to pay).



I hear yah. That's kind of what I meant - that whatever is running Wii BC would need access to a certain number of pools of a certain size in a certain amount of time. Actually, even if MEM1 is on a 2048-bit bus, it still would be slower per megabit than both Wii's eFB and eTC.

Slower? Wouldn't that cause a large number of performance issues when playing in Wii mode?
 
Slower? Wouldn't that cause a large number of performance issues when playing in Wii mode?

Nope, I'm talking about MEM1 - the 32 MB eDRAM pool. That is analogous to Wii's 24 MB 1t-SRAM, and is certainly on a much faster bus than Wii's Hollywood MCM had going on.
 
Quick question: how much more expensive would it REALLY have been just to include a Wii SoC onto the MCM and have, say, a PowerPC A2 @ 1.8GHz and a modified early DX11 Radeon GPU instead of just modifying Wii components to run at (or perhaps 50-75% above) current-gen speeds?
 
Quick question: how much more expensive would it REALLY have been just to include a Wii SoC onto the MCM and have, say, a PowerPC A2 @ 1.8GHz and a modified early DX11 Radeon GPU instead of just modifying Wii components to run at (or perhaps 50-75% above) current-gen speeds?

A2 is in-order and a lot more power hungry than espresso
 
A2 is in-order and a lot more power hungry than espresso

Never said that it couldn't be modified for additional features (and removing those deemed "unnecessary"). And A2 isn't THAT power-hungry; an 18-core running at 1.8GHz only has a TDP of 55 watts. Scaling it down to 3 cores at 1.8GHz only consumes about 11 watts (probably a LOT more than Espresso's *IIRC* 5-6 watts, but it's STILL pretty damn low compared to other IBM CPUs).
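For what it's worth, naive linear scaling of that quoted figure gives a slightly lower number; the ~11 watt estimate presumably leaves headroom for the shared uncore (my reading, not a published spec):

```python
# Naive per-core scaling of the 55 W / 18-core figure quoted above.
full_chip_tdp_w = 55.0
cores_kept = 3
print(full_chip_tdp_w * cores_kept / 18)  # ~9.2 W for the cores alone
```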
 
Never said that it couldn't be modified for additional features (and removing those deemed "unnecessary"). And A2 isn't THAT power-hungry; an 18-core running at 1.8GHz only has a TDP of 55 watts. Scaling it down to 3 cores at 1.8GHz only consumes about 11 watts (probably a LOT more than Espresso's *IIRC* 5-6 watts, but it's STILL pretty damn low compared to other IBM CPUs).

Would a scaled-down A2 actually be any more powerful, though?
 
Would a scaled-down A2 actually be any more powerful, though?
In floating point? Sure.

Otherwise most likely not. A2 would be a proper follow-up to the PPE/Xenon designs; it's basically their design, twice refined (first by the Power6 implementation and then by the PPC A2). Being 4-way SMT capable probably makes it a good step up, provided multi-threading gets used, though.

It wouldn't make any sense for Nintendo. For that they might as well have gone with a Power7 part. It could have made sense for Microsoft and Sony had they not gone with x86 this time around.
 

AzaK

Member
Quick question: how much more expensive would it REALLY have been just to include a Wii SoC onto the MCM and have, say, a PowerPC A2 @ 1.8GHz and a modified early DX11 Radeon GPU instead of just modifying Wii components to run at (or perhaps 50-75% above) current-gen speeds?

That's what I wonder too. It seems they spent so much time and expense on getting BC in there when they could have just designed their new system optimally for just that system and added a Wii in there for BC.

Personally I don't want to pay for BC when I'll never really use it. I can just use my Wii if I really want it.

A2 is in-order and a lot more power hungry than espresso

This is their big failing: thinking that they needed to hit 30-odd watts. It's stupid; they could have still been more efficient than the current generation at double that, and imagine what sort of power we could have had in the box.
 

krizzx

Junior Member
That's what I wonder too. It seems they spent so much time and expense on getting BC in there when they could have just designed their new system optimally for just that system and added a Wii in there for BC.

Personally I don't want to pay for BC when I'll never really use it. I can just use my Wii if I really want it.



This is their big failing: thinking that they needed to hit 30-odd watts. It's stupid; they could have still been more efficient than the current generation at double that, and imagine what sort of power we could have had in the box.

What's stupid about it? I would think they have reiterated enough by now how little big numbers matter to them.

Their primary goal is cost effectiveness. Their goal is to make their console accessible, not a powerhouse.

I honestly don't see the reasoning behind people wanting it to have the same specs as the 720/PS4. Why do we need three of the same console?
 

nikatapi

Member
That's what I wonder too. It seems they spent so much time and expense on getting BC in there when they could have just designed their new system optimally for just that system and added a Wii in there for BC.

I somehow feel that they made those hardware choices just so they wouldn't need to build new engines from the ground up. I guess the architectural similarities are a way of ensuring a smoother transition to the new machine.
 

AzaK

Member
What's stupid about it? I would think they have reiterated enough by now how little big numbers matter to them.

Their primary goal is cost effectiveness. Their goal is to make their console accessible, not a powerhouse.

I honestly don't see the reasoning behind people wanting it to have the same specs as the 720/PS4. Why do we need three of the same console?
Well, they spent lots of money, it seems, to engineer a low TDP, a small box, and BC, none of which I care about AT ALL, so to me it was stupid. Go off the shelf with something more powerful (no need for 8GB or 8 cores) and put the R&D into a bit more powerful tech. That lets them compete better and market their machine with an obvious power jump over the current gen. They can still keep the GamePad as a USP.
 

Thraktor

Member
In floating point? Sure.

Otherwise most likely not. A2 would be a proper follow-up to the PPE/Xenon designs; it's basically their design, twice refined (first by the Power6 implementation and then by the PPC A2). Being 4-way SMT capable probably makes it a good step up, provided multi-threading gets used, though.

It wouldn't make any sense for Nintendo. For that they might as well have gone with a Power7 part. It could have made sense for Microsoft and Sony had they not gone with x86 this time around.

I'd have to disagree on the A2 being a "follow-up" to the PPE/Xenon designs. In fact, it represents a rather significant departure from design principles of CELL and Xenon in a number of ways. For one, it marks a shift in IBM's focus from the heterogeneous many-core architecture of CELL to a homogenous one, which we can see the implications of when we compare BG/Q's HPC market penetration to that of PowerXCell at a similar point in its life-cycle.

Secondly, we see a rather dramatic difference when it comes to SIMD implementation. Xenon was based largely around big beefy VMX128 units, with a large instruction set and 14-stage pipeline (just for the SIMD unit!). The A2, by contrast, uses a much simpler Quad-FPU, which integrates FPU and SIMD functionality, uses a pared-down instruction set, and has a much shorter 6-stage pipeline. Also, although I don't have confirmed numbers, I understand the A2 has a generally shorter pipeline than Xenon (I've heard 15 stages, but as I say I don't have a source on that).

And, although an in-order design, the larger cache, higher degree of multi-threadedness and shorter pipeline should ensure it runs much more efficiently in general code than Xenon.
 
I'd have to disagree on the A2 being a "follow-up" to the PPE/Xenon designs. In fact, it represents a rather significant departure from design principles of CELL and Xenon in a number of ways. For one, it marks a shift in IBM's focus from the heterogeneous many-core architecture of CELL to a homogenous one, which we can see the implications of when we compare BG/Q's HPC market penetration to that of PowerXCell at a similar point in its life-cycle.

Secondly, we see a rather dramatic difference when it comes to SIMD implementation. Xenon was based largely around big beefy VMX128 units, with a large instruction set and 14-stage pipeline (just for the SIMD unit!). The A2, by contrast, uses a much simpler Quad-FPU, which integrates FPU and SIMD functionality, uses a pared-down instruction set, and has a much shorter 6-stage pipeline. Also, although I don't have confirmed numbers, I understand the A2 has a generally shorter pipeline than Xenon (I've heard 15 stages, but as I say I don't have a source on that).

And, although an in-order design, the larger cache, higher degree of multi-threadedness and shorter pipeline should ensure it runs much more efficiently in general code than Xenon.
That's one way to look at things, but I never meant to imply they were the same thing (reading it again I do realize how it might come across that way, as I do believe A2 could be considered a proper "follow-up", but it's just that, and I'll explain). I do believe the in-order lineage in PowerPC follows a rather well defined line, and thus Cell's and Xenon's contributions can't be ignored for follow-up products.

Let's go back: way back in 1997 IBM tested the guTS (GigaHertz unit Test Site). This test CPU was a 64-bit design that ran at 1 GHz, and it made that possible by focusing on higher frequency, meaning more pipeline stages and yet simpler execution (read: in-order).

Sound familiar?

The core of the whole CELL technology was never "many-core"; instead, it was the result of balancing for an end (same for the SIMD pipeline stages). Yet you can consider it a relative maturing of a different beast altogether.


Power6 and A2 are different [from Cell and Xenon, and] they're not code-compatible between themselves (like variants within PPC75x and PPC74xx often are); that's not what's at stake. But the whole 2-way SMT thing on an in-order design first appearing on them was important to pave the way (and ultimately for A2's 4-way even existing). It's more in that sense. And I always realized pipeline length on the A2 would be shorter.

As for how they fare per clock against short-pipeline OoO CPUs like PPC75x, I dunno, but I'm guessing in-order still makes it less efficient one way or the other (either per clock or per watt).
 

HTupolev

Member
No, but isn't speculation pointing to not very fast? By that I mean 70GB/s, which compared to even the main RAM of the PS4 is slow.
The main RAM of the PS4 is GDDR5. That stuff is a modern RAM architecture entirely oriented at satisfying the ridiculous bandwidth needs of modern PC GPUs. Having lower bandwidth than it hardly implies "being slow," even for eDRAM.

For comparison, the 360's eDRAM is usually listed as 256Gbit/s, which is 32GB/s. So if the WiiU eDRAM is 70GB/s, that's a healthy doubling over the 360's eDRAM. And as on-die eDRAM, it doubtless has respectable latency performance.

70GB/s isn't crazy high, but for a console that seems to be targeting a graphical performance ballpark not all that far above PS360, it's probably more than adequate. If 70GB/s is what the eDRAM actually is, that's probably something to be celebrated, not a horrifying bottleneck of doom.
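The unit conversion behind that comparison, for anyone skimming (just arithmetic; the 70.4 GB/s figure is the 1024-bit speculation from earlier in the thread, not a confirmed spec):

```python
# 360 eDRAM figure in Gbit/s converted to GB/s, and the ratio to the
# speculated Wii U figure.
edram_360_gb_s = 256 / 8          # 32.0 GB/s
wiiu_edram_gb_s = 70.4            # speculated, not confirmed
print(edram_360_gb_s, round(wiiu_edram_gb_s / edram_360_gb_s, 1))  # 32.0 2.2
```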
 

krizzx

Junior Member
A bit quiet around here....

That's because no one has expressed anything great about the Wii U GPU in a while. Just post some advantage or next gen feature it seems to have and watch people dash out from all over to dismiss it or try to promote one of the other consoles over it to the highest extreme. j/k

It does seem that people have stopped seeking the truth, though. I'm still interested in the RAM analysis from that other forum. From what I can tell, when you take bottlenecks into the equation, the Wii U has the overall better system RAM performance, which completely contradicts the "bandwidth starved" analysis that other folk were slamming it with.

Then there was the claim that it had problems with transparencies, which was clearly not present in Nintendo Land at launch. In fact, I would say that every claim of inferiority of the hardware that has been made since the Wii U launch has been disproven by games released on it since then. Yet not a single person who made such claims has corrected himself or even attempted to cross-analyze the new data. I may never understand the minds of these analyzers.

What I am the most interested in is still the tessellation capabilities. I wish Shin'en would show us some shots of the game they have in development.
 
There is still one question I asked a while back that I never got a straight answer for.

Around how many more polygons can the Wii U GPU output than the PS3/360? Note that I'm not asking about how much polygons matter or anything else. I'm just looking for probable numbers.

The shot from the Bayonetta 2 dev video still comes to mind.
http://www.neogaf.com/forum/showpost.php?p=47294548&postcount=371
I don't think the Wii U GPU has enough horsepower to push that kind of polycount in real time in a map with a lot of detail (I hope I'm wrong).

I think the model will be downscaled when it's used in game.
 
I don't think the Wii U GPU has enough horsepower to push that kind of polycount in real time in a map with a lot of detail (I hope I'm wrong).

I think the model will be downscaled when it's used in game.
Not necessarily. Some polygon models for more recent games are a bit higher than they were a few years ago. For example, some models for NG3: Razor's Edge have exceeded 100,000 polygons.

Check out this beyond3d thread.

http://beyond3d.com/showpost.php?p=1728755&postcount=1389

It should be noted that a reason for the increase on a lot of polygon models is usually additional weapons, fancy clothes, and hair.
 

krizzx

Junior Member
I don't think the Wii U GPU has enough horsepower to push that kind of polycount in real time in a map with a lot of detail (I hope I'm wrong).

I think the model will be downscaled when it's used in game.

That is not what I asked. I was simply using that scene as an example.

I want to know how much more the Latte should be able to output than the PS3/360 GPUs.
 
That is not what I asked. I was simply using that scene as an example.

I want to know how much more the Latte should be able to output than the PS3/360 GPUs.
The max is likely 550 million polygons/sec (1 poly/cycle @ 550MHz). The 360 is 500 million, the PS3 is 250 million (Cell had to help it keep up with the 360), and the PS4/Durango clock in at 1.6 billion. The real-world numbers are a bit more complicated than that, considering that no current-gen game was able to reach close to 500 million polys/sec.
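Those peak figures are just triangles-per-clock times clock speed; one way to reproduce them is below. The per-clock rates and clocks here are assumptions chosen to match the totals quoted above, and they are theoretical setup rates, not in-game numbers:

```python
# Peak polygon throughput = polygons per clock x clock speed (theoretical).
peak_rates = {
    "Wii U Latte (1 poly/clk @ 550 MHz)":        1.0 * 550e6,
    "Xbox 360 Xenos (1 poly/clk @ 500 MHz)":     1.0 * 500e6,
    "PS3 RSX (0.5 poly/clk @ 500 MHz)":          0.5 * 500e6,
    "PS4/Durango class (2 poly/clk @ 800 MHz)":  2.0 * 800e6,
}
for name, rate in peak_rates.items():
    print(f"{name}: {rate / 1e6:.0f} million polys/sec")
```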
 
Then there was the claim that it had problems with transparencies which was clearly not present in Nintendo land at launch. In fact, I would say that every claim of inferiority of the hardware that has been made since the Wii U launch has been disproven by games released on it since then. Yet, not a single person who made such claims has corrected himself or even attempted to cross analyze the new data. I may never understand the minds of these analyzers.

Has Wii U not been struggling with transparencies as much in recent games? I seem to recall that some concluded that those hiccups were a result of either not having true ROPs (a theory I was never a huge supporter of - it's got at least 8 - same as current gen, but likely more advanced) or not having enough bandwidth to eDRAM. I still doubt that eDRAM bandwidth is unlimited. However, as stated on the last page, even the worst case (70.4 GB/s) is not bad at all. That's over twice as much bandwidth as Xenos has to its eDRAM.
 