
Wii U CPU |Espresso| Die Photo - Courtesy of Chipworks

blu

Wants the largest console games publisher to avoid Nintendo's platforms.
Choosing an IBM processor over x86 is going to hurt them long-term unless Nintendo plans to move over everything to IBM.
The "IBM processor" is licenseable. Nintendo can become an OpenPOWER licensee and go produce their "IBM processors" at a plant they fancy. Or stick with IBM production facilities - it's entirely nintendo's choice.
 
x86 processors are archaic and need to go away in my opinion. The efficiency of IBM CPUs is so much greater watt for watt, nm for nm.

The only benefit of x86 is that it makes it easier for people who were taught on the x86 architecture to move their code around to different systems.

My ideal gaming consoles are systems that offer me something unique to my PC or something my PC can't do better.

I still stand behind my belief that we do not need 3 of the same console. Variety is a good thing, and I don't like the idea of having all systems getting hardware from the same company either. At that point, the only things that would really make a difference in gaming are raw horsepower and the name molded/taped to the side of the casing. There would be no real reason to buy 3 different consoles, as the one with the highest specs would be the only one that mattered. That is an ill scenario.

The Power8 or a Power8/Espresso hybrid would be a nice choice for Nintendo moving forward.



Being different is what is hurting Nintendo.
 

Raist

Banned
x86 processors are archaic and need to go away in my opinion. The efficiency of IBM CPUs is so much greater watt for watt, nm for nm.

The only benefit of x86 is that it makes it easier for people who were taught on the x86 architecture to move their code around to different systems.

My ideal gaming consoles are systems that offer me something unique to my PC or something my PC can't do better.

"Fancy" architectures have a history of not being healthy in the end, a notable exception being the PS2.
It doesn't matter if the hypothetical better efficiency is there (and if it is it's most likely some %) when overall the hardware is a lot underpowered compared to the more traditional and "less efficient" alternatives.
 

DonMigs85

Member
If IBM/PowerPC is so great, why did Apple ditch them?
They're actually not terribly efficient from a power standpoint either. Intel can't really be touched in that regard.
 
If IBM/PowerPC is so great, why did Apple ditch them?
They're actually not terribly efficient from a power standpoint either. Intel can't really be touched in that regard.

Exactly. Even they knew it was time to move the moment Intel got a good thing going with the Core microarchitecture in terms of performance per watt (PPW), moving away from NetBurst.

Speculation does abound, though, that they might move to an ARM hybrid the moment PPW reaches parity with x86. They always strive for the lowest-power chipsets and greatest battery life in their mobile PC products.

Frankly speaking, Intel's x86 CPUs have never been better in the PPW arms race (from Ivy Bridge onwards / 22nm 3D transistors), and as the landscape shifts to x86-powered tablets (next-gen Surface Pro, Atom-based tablets, etc.), it can only improve even more.
 

LordOfChaos

Member
x86 processors are archaic and need to go away in my opinion. The efficiency of IBM CPUs is so much greater watt for watt, nm for nm.

Lol, no. The instruction set is such a tiny fraction of a CPU nowadays that it hardly matters for overall performance per watt; the architecture around it matters so much more, and IBM's current designs aren't ahead of, say, Intel's in efficiency. In fact, where they are still strong is high-wattage, high-performance parts, whereas Intel targets a bit under that in terms of power draw and gets higher performance per watt doing it. They do OK in embedded, but they're hardly the leader. I think you're making things up. I just spent the workday beside Power7 systems, by the way.

It's true, x86 isn't as neat as PPC and had to gain extensions over the years for a bunch of things PPC did natively, but with the hundreds of millions of transistors in CPUs these days, the efficiency loss you're talking about is like worrying about the weight of a grain of sand while carrying a mountain. Everything around the ISA matters more; the ISA mostly just matters for compatibility. You could build an architecture around ARM's ISA that was as big as a POWER8, or one around PPC as small as a Cortex-A7.
 
I've heard people say before that Power/PowerPC and other RISC architectures are more efficient than x86 and Intel's CISC CPUs. But I don't really know much about it.

I don't have the technical understanding, but it sounds like it isn't true. And it seems to me that IBM's hardware is becoming more unusual and isn't really competing with Intel much anymore - at least not for consumers like me. I assume some businesses are still using it for supercomputers or enterprise machines.

Intel has done a lot to make sure that its chips are useful in the mobile market, and a lot of tablets are now using Intel processors. And I'm sure there's a reason the PlayStation 4 and Xbox One are both using AMD CISC processors. I don't hear of too many people using IBM's CPUs anymore. And it seems to me that if Nintendo were to use something other than CISC, it would be ARM rather than PowerPC.

I kind of hope that if Nintendo makes another console, it's x86 based rather than IBM. I don't know enough about CPUs to understand if x86 is a good architecture investment for decades into the future, but it has a long history of support and is really good right now.

If x86 is just as good as PowerPC, then there's no reason to go with Power or PowerPC, because IBM's stuff isn't as well supported. x86 CISC is used by almost every PC and is being used by a lot of tablets.

I think the best option would probably be for Nintendo to either use AMD like Sony and Microsoft are, Intel, or some ARM-based CPU - and at least move on from PowerPC to something more supported.
 
PowerPC was more efficient about a decade ago.

That, and IBM basically threw in the towel in the consumer market long ago. The PowerPC the Wii U uses is just kind of a touched-up holdover.

Now granted, Nintendo really needs to fix their lines of communication and overall relationship with third parties, but this won't be helping either.
 

defferoo

Member
x86 processors are archaic and need to go away in my opinion. The efficiency of IBM CPUs is so much greater watt for watt, nm for nm.

The only benefit of x86 is that it makes it easier for people who were taught on the x86 architecture to move their code around to different systems.

My ideal gaming consoles are systems that offer me something unique to my PC or something my PC can't do better.

I still stand behind my belief that we do not need 3 of the same console. Variety is a good thing, and I don't like the idea of having all systems getting hardware from the same company either. At that point, the only things that would really make a difference in gaming are raw horsepower and the name molded/taped to the side of the casing. There would be no real reason to buy 3 different consoles, as the one with the highest specs would be the only one that mattered. That is an ill scenario.

The Power8 or a Power8/Espresso hybrid would be a nice choice for Nintendo moving forward.

Please stop... Nintendo will NEVER use a server CPU like POWER8. It is WAY too expensive to use in a game console (it must be several hundred or a thousand dollars a pop?). If anything, they need to look at AMD's x86 APUs, Intel's Bay Trail line of chips and its successors, or high-end ARM chips.

Those are more in line with the price and power consumption that they would want to target. Given that they already have a good relationship with AMD for the GPU side of things, using them for the CPU makes a whole lot of sense.

and you can't just "fuse" two chips with completely different architectures like you're suggesting with POWER8 and Espresso, things don't work that way.
 

blu

Wants the largest console games publisher to avoid Nintendo's platforms.
If IBM/PowerPC is so great, why did Apple ditch them?
They're actually not terribly efficient from a power standpoint either. Intel can't really be touched in that regard.
Actually, PPCs are very efficient from a power standpoint, which is why they formed the bulk of the embedded industry over the past 10 years, before ARM reached the required performance levels (and the PPC G2 & G3 are still _the_ processors in automotive and the networking data plane - cars, gigabit switches & routers, etc). As for why Apple ditched them - it was mainly a political move. Apple did not feel comfortable with Motorola (the one PPC vendor trying hard to please Apple) often missing their roadmaps. Had Motorola had Intel's fabbing track record, history would've been quite different. Your statement that Intel 'could not be touched from a power efficiency standpoint' couldn't be further from the truth - if it were anywhere near true, ARM wouldn't be here today. But they are, and they own the entire power-sensitive segment despite Intel having the world's best fabbing tech - mobile, a large chunk of embedded (shared with PPC and MIPS), and now, with AArch64, segments of the server space - it's all ARM there. So yeah, so much for Intel's power efficiency.
 
Intel doesn't have a product that is really competitive with ARM in the low-power/low-cost segment, but there is no competitive PPC product either. That's why the market is dominated by ARM.

On the higher end, as far as I know, Xeons are absolutely in POWER's ballpark from a performance/watt standpoint. PPC had a significant market share in supercomputers and servers 10+ years ago, but they lost most of it. Intel is the leader now, and not without reason.
 

LordOfChaos

Member
Your statement that Intel 'could not be touched from a power efficiency standpoint' couldn't be further from the truth - if it were anywhere near true, ARM wouldn't be here today.

He was talking about the time of the transition, before the ARM boom, and nothing ARM had was up to the requisite performance levels. Intel did kill IBM in power efficiency during the switch. And if you can use ARM to dispel Intel's power efficiency, you can now do so for IBM too. This topic is more about PPC vs x86, and the myth that x86 intrinsically requires more power is just that, a myth. The ISA is such a small fraction of things now.

A reminder of what 2006 was like:
http://www.anandtech.com/show/2064

[benchmark charts from the linked AnandTech article]
 

v1oz

Member
If IBM/PowerPC is so great, why did Apple ditch them?
They're actually not terribly efficient from a power standpoint either. Intel can't really be touched in that regard.

Yes, Apple ditched them because they produced too much heat and consumed too much power to put into laptops, and IBM couldn't be bothered to engineer low-power chips. It wasn't because PPCs are not powerful chips, or that the underlying PPC architecture isn't efficient - PPC still finds use in embedded processors.
 

MrJoe

Banned
Actually, PPCs are very efficient from a power standpoint, which is why they formed the bulk of the embedded industry over the past 10 years, before ARM reached the required performance levels (and the PPC G2 & G3 are still _the_ processors in automotive and the networking data plane - cars, gigabit switches & routers, etc). As for why Apple ditched them - it was mainly a political move. Apple did not feel comfortable with Motorola (the one PPC vendor trying hard to please Apple) often missing their roadmaps. Had Motorola had Intel's fabbing track record, history would've been quite different. Your statement that Intel 'could not be touched from a power efficiency standpoint' couldn't be further from the truth - if it were anywhere near true, ARM wouldn't be here today. But they are, and they own the entire power-sensitive segment despite Intel having the world's best fabbing tech - mobile, a large chunk of embedded (shared with PPC and MIPS), and now, with AArch64, segments of the server space - it's all ARM there. So yeah, so much for Intel's power efficiency.

one of the problems with wii-u is that it uses an entirely different architecture vs. PS4/XB1. I for one hope that nintendo doesn't make that same mistake with their next system. porting between the systems should be as simple as possible. make things as easy as possible for third parties. otherwise the system might as well be called the wii-u2.
 

v1oz

Member
one of the problems with wii-u is that it uses an entirely different architecture vs. PS4/XB1. I for one hope that nintendo doesn't make that same mistake with their next system. porting between the systems should be as simple as possible. make things as easy as possible for third parties. otherwise the system might as well be called the wii-u2.

They did that for backwards compatibility with the Wii. At the time they made that decision, backwards compatibility was a good idea.
 

BuggyMike

Member
Sorry for the random noob questions, but can someone explain how compute shaders help improve the graphics of a game? Are there any games out for Wii U now that utilize compute shaders? Can anyone spot whether Mario Kart 8 uses compute shaders or tessellation? I know the tessellation part is not easy to see. If they haven't used compute shaders or tessellation in any games currently out, does anyone see the Wii U getting a decent upgrade in visuals when devs start taking advantage of these features? What kinds of things would we start to see as a result of taking advantage of compute shaders on Wii U?
 

wsippel

Banned
one of the problems with wii-u is that it uses an entirely different architecture vs. PS4/XB1. I for one hope that nintendo doesn't make that same mistake with their next system. porting between the systems should be as simple as possible. make things as easy as possible for third parties. otherwise the system might as well be called the wii-u2.
Well, then we better hope Sony and MS switch to ARM next gen, because I'm pretty damn sure that's what Nintendo's going to use.
 

LordOfChaos

Member
Sorry for the random noob questions, but can someone explain how compute shaders help improve the graphics of a game? Are there any games out for Wii U now that utilize compute shaders?

Doing something like simulating cloth, water, or any fluid is very taxing on a CPU. A CPU is good at long, serial tasks, whereas those simulations require many independent calculations - and that's what a GPU is good at: mass processing of many similar small tasks. Offloading such things to the GPU can therefore either free up the CPU for other work, or add in effects that were never possible on the CPU alone.
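
To make the "many similar small tasks" point concrete, here's a rough sketch (my own toy illustration, not code from any actual game or the Wii U SDK) of the kind of per-particle update that maps well to a compute shader. Written as plain C++ it's one big loop; on the GPU, each iteration would run as its own shader invocation.

// Hypothetical sketch: GPU-friendly work is thousands of tiny, independent updates.
// On a CPU this runs as one serial loop; a compute shader would instead launch
// update_particle() once per particle, all in parallel.
#include <vector>

struct Particle { float x, y, z, vx, vy, vz; };

void update_particle(Particle& p, float dt) {
    p.vy -= 9.81f * dt;                              // gravity
    p.x += p.vx * dt;                                // integrate position
    p.y += p.vy * dt;
    p.z += p.vz * dt;
    if (p.y < 0.0f) { p.y = 0.0f; p.vy *= -0.5f; }   // bounce off the floor
}

void step(std::vector<Particle>& particles, float dt) {
    // No iteration depends on any other, which is why this kind of work can be
    // offloaded to the GPU without changing the per-particle logic.
    for (Particle& p : particles)
        update_particle(p, dt);
}

The heavier the per-element math gets (cloth constraints, fluid pressure solves), the bigger the win from moving it off the CPU.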


Look at PhysX

https://www.youtube.com/watch?v=-x9B_4qBAkk
 

LordOfChaos

Member
God, the last thing I want is a completely homogenised video games industry, afraid of anything different.

Using a common architecture that's easy to port between doesn't mean everything else has to be the same. That's just silly. The Wii U could still be the Wii U with an AMD APU.
 

MrJoe

Banned
God, the last thing I want is a completely homogenised video games industry, afraid of anything different.

are you talking about games or the technology which runs those games? those are two completely different things.

as shown by wii-u, being different technologically isn't a guarantor of success. I'm more concerned with nintendo achieving success with their platforms. that way they won't burn through their cash reserves and (like sega) be forced to become a simple third party developer.
 

LordOfChaos

Member
The only way I see is that Wii U 2 uses 4 shrunken down Wii U CPUs to have BC with Wii U :/

The problem I see is that this architecture isn't expected to scale much further in clock speed with its short pipeline, so they'd have to increase performance by adding more cores. That's all fine and well - the other two have 8 cores - but what if they need more power than 8 Espresso cores? I.e. your 4x example, making 12 cores? That would be a pain to use every core efficiently, for one, and cache coherency could start becoming an issue (it is already with the PS4: jumping between the two separate cache pools [one per cluster of 4 cores] has a large latency hit).
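
For anyone wondering why that coherency traffic hurts, here's a quick toy sketch (my own illustration, nothing console-specific): two threads incrementing counters that sit on the same cache line force that line to bounce between cores on every write, while padding each counter onto its own line avoids the traffic. The gap between the two timings is roughly the cost of keeping the caches coherent, and crossing a cluster boundary only makes it worse.

// Toy demonstration (hypothetical, not console code): cache-line ping-pong.
// Two threads incrementing counters on the SAME cache line force that line to
// bounce between cores; padding the counters onto separate lines avoids it.
#include <atomic>
#include <chrono>
#include <cstdio>
#include <thread>

constexpr long kIters = 20'000'000;

struct alignas(64) PaddedCounter { std::atomic<long> value{0}; };  // one 64-byte line each

static long run_ms(std::atomic<long>& a, std::atomic<long>& b) {
    auto start = std::chrono::steady_clock::now();
    std::thread t1([&] { for (long i = 0; i < kIters; ++i) a.fetch_add(1, std::memory_order_relaxed); });
    std::thread t2([&] { for (long i = 0; i < kIters; ++i) b.fetch_add(1, std::memory_order_relaxed); });
    t1.join();
    t2.join();
    auto end = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count();
}

int main() {
    // Shared line: the two counters are adjacent in memory, so they almost
    // certainly share one cache line and every write invalidates the other core's copy.
    struct { std::atomic<long> a{0}, b{0}; } shared;
    std::printf("same cache line:      %ld ms\n", run_ms(shared.a, shared.b));

    // Separate lines: each counter padded to its own 64-byte line, no ping-pong.
    PaddedCounter pa, pb;
    std::printf("separate cache lines: %ld ms\n", run_ms(pa.value, pb.value));
    return 0;
}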
 
Why don't one of you guys create a thread about Nintendo's possible next console (even if it has to go in community) as it's fascinating hearing the discussion of possible hardware but I fear some of your are going to get this thread closed by going off topic just like the GPU thread.
 

blu

Wants the largest console games publisher to avoid Nintendo's platforms.
The problem I see is that this architecture isn't expected to scale much further in clock speed with its short pipeline, so they'd have to increase performance by adding more cores. That's all fine and well - the other two have 8 cores - but what if they need more power than 8 Espresso cores? I.e. your 4x example, making 12 cores? That would be a pain to use every core efficiently, for one, and cache coherency could start becoming an issue (it is already with the PS4: jumping between the two separate cache pools [one per cluster of 4 cores] has a large latency hit).
The latency issues with Jaguar's L2 round trip are Jaguar's alone. I'm not saying coherency protocols are cheap, but AMD simply pooched it with Jaguar's uncore.
 

LordOfChaos

Member
The latency issues with Jaguar's L2 round trip are Jaguar's alone. I'm not saying coherency protocols are cheap, but AMD simply pooched it with Jaguar's uncore.

Right; it was just an example, though. A hypothetical 12-16 core expansion of Espresso would take some doing for full coherency, and I'm throwing that many cores into the hypothetical because the clock speed is unlikely to get much higher for a next-gen successor based on the same design (and because I'm replying to someone who mentioned that many).

Even Intel's quads were clusters of 2 until very recently.
 

DonMigs85

Member
He was talking about the time of the transition, before the ARM boom, and nothing ARM had was up to the requisite performance levels.

Indeed. Even today, a single Bobcat core is still a bit faster than a similarly-clocked Cortex-A15.
Back then, around 2005-2006, ARM11MP was the best you could expect from ARM. Things only really started ramping up once the Cortex-A8 came out in 2009.
 

blu

Wants the largest console games publisher to avoid Nintendo's platforms.
He was talking about the time of the transition, before the ARM boom, and nothing ARM had was up to the requisite performance levels.
The timeframe is irrelevant. If x86 were power efficient in any shape or form, its scaled-down implementations would be dominating embedded today. ARM scaled up with AArch64 into 'traditionally x86' territories. The opposite is not true - how many routers do you know that run x86? If anything, x86 was always open to attacks on the power efficiency front. The Core architecture stabilized them somewhat in a certain relatively high power envelope, and Intel's advanced fabtech is about half the story there. At the same time, Atom has been a flop.

Intel did kill IBM in power efficiency during the switch.
Yes, at a generation-and-a-half fab lead.

And if you can use ARM to dispel Intel's power efficiency, you can now do so for IBM too. This topic is more about PPC vs x86, and the myth that x86 intrinsically requires more power is just that, a myth. The ISA is such a small fraction of things now.
There's nothing mythical about x86 requiring more power - those ginormous decoders for the oldest ISA on the market are not free in any shape or form, and the smaller the x86 CPU is, the larger its decoder portion gets. Have you seen Bobcat's floorplan?

A reminder of what 2006 was like:
http://www.anandtech.com/show/2064

[benchmark charts from the linked AnandTech article]
You do realize you're looking at nothing less than a generation difference in every aspect in those charts, right?

* G5 - 90nm (2004 tech), Intel Core Xeon - 65nm (Q3 2006 tech).
* G5 - 2x 0.5MB L2, Xeon - 4MB L2
* G5 - 2x ~52M transistors, Core Xeon - ~290M transistors

How about some apples to apples?
 
It's hard to offer an apples to apples comparison when IBM had effectively ceased development of consumer grade CPUs at that time. Which was the problem in the first place.
 

blu

Wants the largest console games publisher to avoid Nintendo's platforms.
It's hard to offer an apples to apples comparison when IBM had effectively ceased development of consumer grade CPUs at that time. Which was the problem in the first place.
I'm not saying doing a proper comparison is trivial or easy. I'm just saying that picture does not show what its poster thinks it does.
 
The picture shows Apple switched to Intel because they offered faster, more power efficient CPUs at the time. That is true. The fact that IBM did not offer a competitive product is their own fault. It's nothing to do with the ISA in either case.
 

LordOfChaos

Member
I'm not saying doing a proper comparison is trivial or easy. I'm just saying that picture does not show what its poster thinks it does.

Except it does exactly that. To remind you of why I posted that, it was about the time of the PPC-Intel transition at Apple, and you quoting the other guy and denying Intel's efficiency lead at the time. IBM/Motorola were not making low power designs fast enough, and with the shift to laptops the writing was on the wall. To quote my post that you are criticizing:

He was talking about the time of the transition, before the ARM boom, and nothing ARM had was up to the requisite performance levels. Intel did kill IBM in power efficiency during the switch.
during the switch



The picture shows Apple switched to Intel because they offered faster, more power efficient CPUs at the time. That is true. The fact that IBM did not offer a competitive product is their own fault. It's nothing to do with the ISA in either case.

Yep.
 

blu

Wants the largest console games publisher to avoid Nintendo's platforms.
The picture shows Apple switched to Intel because they offered faster, more power efficient CPUs at the time. That is true. The fact that IBM did not offer a competitive product is their own fault. It's nothing to do with the ISA in either case.
And yet LordOfChaos used those charts in the context of 'dispelling the x86 power inefficiency myth'. And you're saying it had nothing to do with the ISA? Right. Which brings us to..

Except it does exactly that. To remind you of why I posted that, it was about the time of the PPC-Intel transition at Apple, and you quoting the other guy and denying Intel's efficiency lead at the time.
When your CPUs are produced at a generational lithography advantage over the competition, you don't have an 'efficiency lead', you have a fabtech lead. You'd have an efficiency lead when your CPUs are at comparable fabtech vis-a-vis the competition; then you can post charts such as those you posted. Otherwise you can take a 22nm Haswell and a 130nm G5 and "prove" a bunch of things. Let me remind you this entire argument is about architectures (which somehow nobody remembers now, I wonder why..).

IBM/Motorola were not making low power designs fast enough, and with the shift to laptops the writing was on the wall. To quote my post that you are criticizing
Why not quote my answer to your post? Here, let me help:

you said:
Intel did kill IBM in power efficiency during the switch.
me said:
Yes, at a generation-and-a-half fab lead.
You're welcome.
 

DonMigs85

Member
IBM had their own fabs so it was also their own fault they were behind Intel in that regard. That's another major reason Apple switched.
 

Vanillalite

Ask me about the GAF Notebook
What on the system processes audio for voice chat via the mic built into the gamepad, and how much would this affect system resources in-game?

I wonder about this because I play MH, which uses the mic, and MK8 only uses the mic in the lobby. I wonder how much not having in-game voice chat helps the game out in terms of framerate and latency.
 

prag16

Banned
IBM had their own fabs so it was also their own fault they were behind Intel in that regard. That's another major reason Apple switched.

Yes, but as blu said, that's not what this argument was originally about. IBM fucked up by not offering a competitive product at that point in time, but that's immaterial with regard to determining which architecture is more power efficient when making an apples to apples comparison.
 

LordOfChaos

Member
And yet LordOfChaos used those charts in the context of 'dispelling the x86 power inefficiency myth'. And you're saying it had nothing to do with the ISA? Right. Which brings us to..


When your CPUs are produced at a generational lithography advantage over the competition, you don't have an 'efficiency lead', you have a fabtech lead. You'd have an efficiency lead when your CPUs are at comparable fabtech vis-a-vis the competition; then you can post charts such as those you posted. Otherwise you can take a 22nm Haswell and a 130nm G5 and "prove" a bunch of things. Let me remind you this entire argument is about architectures (which somehow nobody remembers now, I wonder why..).


Why not quote my answer to your post? Here, let me help:



You're welcome.

To this part alone:
When your CPUs are produced at a generational lithography advantage over the competition, you don't have an 'efficiency lead', you have a fabtech lead
Intel has a fabrication process advantage over AMD right now. It still means they are more efficient in the end. If you can find a modern example of IBM processors on the same lithography and at the same power draw handily beating Intel ones in real-world benchmarks, I'd love to see it.

But yes, let's get back to the ISA side of things:

You've missed my continually repeated point. x86 could be more efficient, yes, but the ISA is still a pittance compared to everything else in the chip in terms of power use. You mentioned looking at the floorplan of a tiny core - indeed, let's do that.

[annotated floorplan of an AMD Bobcat core]


Oh yes, so so huge that x86 decoder is.

As I said
The ISA is such a small fraction of things now.
 

LordOfChaos

Member
For the record, Anand from AnandTech seems to be on my side of this debate, Blu, mate. I can see you know your stuff, but I'm going with the industry titan.

The aptly titled "The x86 Power Myth Busted" (and let me be clear here - I acknowledge that the x86 ISA is not perfect and others are more efficient; my whole point has been that it matters less and less as the ISA becomes such a tiny piece of the puzzle, as the article also explains):

http://www.anandtech.com/show/6529/busting-the-x86-power-myth-indepth-clover-trail-power-analysis
 

blu

Wants the largest console games publisher to avoid Nintendo's platforms.
To this part alone:

Intel has a fabrication process advantage over AMD right now. It still means they are more efficient in the end.
Good. I think we've reached a point where we can drop the entire 'efficiency and lithography' angle, as it does zilch for the subject of this thread. Agree?

If you can find a modern example of IBM processors with the same lithography and power draw handily beating Intel ones in real world benchmarks, I'd love to see it.
We should compare POWER7 and Nehalem EX, both at 45nm. Or Bobcats and Espressos (Bobcat is on TSMC's 40nm, which is quite comparable to Intel's 45nm, and Espresso is 45nm). Apparently I cannot provide benchmarks for the high-grade server chips, but I could do so for the lightweights. We just need a way to measure the power draw, or to find such reliable figures (i.e. power draw at peak) from third-party sources.

Re the heavyweights, here's Ars Technica's blurb, just to give some basic perspective:
http://arstechnica.com/gadgets/2009/09/ibms-8-core-power7-twice-the-muscle-half-the-transistors/

But yes, let's get back to the ISA side of things:

You've missed my continually repeated point. x86 could be more efficient, yes, but the ISA is still a pittance compared to everything else in the chip in terms of power use. You mentioned looking at the floorplan of a tiny core - indeed, let's do that.

[annotated floorplan of an AMD Bobcat core]


Oh yes, so so huge that x86 decoder is.
Erm, I don't think you realize what you're looking at. Let me help:

1. Remove all parts of the chip dubbed 'cache' or 'tag/TLB'
2. Combine the 'x86 decoder' and 'ucode ROM' parts of the floorplan, since those are both needed to issue valid uops to the rest of the pipeline.
3. Compare that to some other chip part, something of a traditionally massive size, like the fpu/simd block.

Since this picture is rather badly rotated, we could also use the better-annotated, flat-on Jaguar floorplan (same category of chip):

[annotated floorplan of an AMD Jaguar core]


Notice how the 'x86 decoder' alone (i.e. without the ucode ROM) is ~1/2 the FP unit on Bobcat, and ~1/3 the FP unit on Jaguar (Jaguar got proper 128-bit FP ALUs)? Now add the 'ucode ROM'... Still thinking x86 decoding is cheap?

ps: people like Jim Keller and Mark Papermaster are industry titans. Anand is a journalist.
 

LordOfChaos

Member
1. Remove all parts of the chip dubbed 'cache' or 'tag/TLB'
2. Combine the 'x86 decoder' and 'ucode ROM' parts of the floorplan, since those are both needed to issue valid uops to the rest of the pipeline.
3. Compare that to some other chip part, something of a traditionally massive size, like the fpu/simd block.
...

Notice how the 'x86 decoder' alone (i.e. without the ucode ROM) is ~1/2 the FP unit on Bobcat, and ~1/3 the FP unit on Jaguar (Jaguar got proper 128-bit FP ALUs)? Now add the 'ucode ROM'... Still thinking x86 decoding is cheap?

Have you seen an ARM (or ARMv8, for the modernized 64-bit version) or, more importantly, a PowerPC floorplan for comparison? All this means nothing without one; I haven't found floorplans with the instruction decode outlined in such a way for either of the above.

1. Remove all parts of the chip dubbed 'cache' or 'tag/TLB'

Too bad that those are intrinsic parts of processors that contribute to their die size and power draw, which is what we're talking about? "Remove half the stuff and the other stuff starts to look bigger, eh?"

Not sure what the Power8 article has to do with anything. It doesn't make any comparisons with Intel, nor is it overly technical compared to AnandTech, so I'm not sure why it was thrown in here. I'll be following its launch with interest, that said (and I wonder if we'll move our Power7 systems to them...). Also, if that was your meaning, I never said IBM wasn't strong in the mainframe market - in fact, just the opposite, a page back: I said where IBM has excelled is in high-performance, high-power-draw uses (usually with crazy cooling rigs and die sizes much larger than what Intel bothers with).
 

blu

Wants the largest console games publisher to avoid Nintendo's platforms.
Have you seen an ARM (or ARMv8, for the modernized 64-bit version) or, more importantly, a PowerPC floorplan for comparison? All this means nothing without one; I haven't found floorplans with the instruction decode outlined in such a way for either of the above.
Here's a Cortex-A5 (single-issue, in-order) floorplan from the CPU's official page nevertheless:

[Cortex-A5 floorplan from ARM's official page]


The unit that's responsible for feeding the pipeline with ops is dubbed the PFU (prefetch unit). The A5's PFU does the job equivalent to the ISA decoder + ucode ROM + branch prediction units on the Bobcat/Jaguar floorplans. Also, the A5 is a much smaller CPU than those AMD CPUs in transistor count, so the frontend naturally gets a bigger portion of the floorplan.

Too bad that those are intrinsic parts of processors that contribute to their die size and power draw, which is what we're talking about? "Remove half the stuff and the other stuff starts to look bigger, eh?"
Erm, every single transistor on the CPU and support logic - heck, the entire motherboard - is 'contributing to the power draw' (some of them negatively) and to performance. Also, the AMD x86 cores we've been discussing here are both parts of APUs - i.e. technically they're on the same die as the entire darn computer. Shall we count all of that too?

In reality, there are logical units in the CPU core design, plus L1 caches (and tags), and sometimes L2 caches (and tags). Comparing logical units to caches is pointless - caches can be reconfigured relatively easily (from the design perspective, if not necessarily the production perspective), have their associativity changed, etc. That's why CPU IP vendors (like ARM) offer the same CPU core design with various cache configurations.

Logical units are much more rigid - notice how Jaguar's floorplan is Bobcat's with small modifications (a 50% larger fp/simd unit, the bus unit moved around to make room for that expansion, and of course no L2 blocks, which are arbitrarily shown on the Bobcat floorplan), and yet Jaguar still constitutes a separate design? Since AMD does not target that many markets with these CPUs, their design features a single cache configuration (multiple APU versions based on caches would be infeasible). But if we take a CPU with bigger target markets (ARM's, Intel's), you'll find various cache configurations. Those cache configurations, though they affect performance in the general case, do not change the inherent CPU core logic design - a Cortex-A8 with a 128KB L2$ is as much a Cortex-A8 as its 256KB L2$ sibling.

Not sure what the Power8 article has to do with anything. It doesn't make any comparisons with Intel, nor is it overly technical compared to AnandTech, so I'm not sure why it was thrown in here. I'll be following its launch with interest, that said (and I wonder if we'll move our Power7 systems to them...). Also, if that was your meaning, I never said IBM wasn't strong in the mainframe market - in fact, just the opposite, a page back: I said where IBM has excelled is in high-performance, high-power-draw uses (usually with crazy cooling rigs and die sizes much larger than what Intel bothers with).
It's a POWER7 article. If you had actually read it, you'd have seen how the 8-core (32 threads with 4x SMT) POWER7 design uses half the transistors of the 8-core (16 threads with 2x SMT) Nehalem EX. Of course that's thanks to IBM's L3 eDRAM tech, so there's not much point in comparing those numbers, as POWER7 would wipe the floor with Intel's chip (higher performance, half the transistors).

edit: fixed A5 OOO and superscalarity misinformation (thanks, DonMigs85) and a few typos
 

Durante

Member
It's a POWER7 article. If you had actually read it, you'd have seen how the 8-core (32 threads with 4x SMT) POWER7 design uses half the transistors of the 8-core (16 threads with 2x SMT) Nehalem EX. Of course that's thanks to IBM's L3 eDRAM tech, so there's not much point in comparing those numbers, as POWER7 would wipe the floor with Intel's chip (higher performance, half the transistors).
In my experience comparing Power7 with Nehalem EX (yes, those exact models), the floor wiping happens entirely in the other direction, in pretty much every benchmark code I ever ran on them.

Now, you can of course blame compiler maturity or something for that -- what I executed are actual real-world C(++) programs, not some highly tuned assembly DGEMM -- but in the end what it means to me is that any x86(-64) deficiencies that HW people like to go on about (and boy do they go on about them) seem pretty irrelevant in the grand scheme of things.
 