
WiiU technical discussion (serious discussions welcome)

tipoo

Banned
It's not slower than SRAM for certain amounts and up.

Not sure what you mean. If you're saying that the capacity makes up for the lower speed, that's essentially what I was getting at; I'm just not sure how it would scale for a gaming use rather than a server use. The reason I say it's slower is that eDRAM has to refresh its memory banks and eSRAM doesn't, so for timing reasons eSRAM ends up faster.

I don't know either, but it was certainly more R&D effort to go with eDRAM and the chip would cost the same if they used 1MB SRAM, so 3MB eDRAM being superior is the most logical conclusion.

Not necessarily. As mentioned, IBM already has chips that use eDRAM, so they don't need to reinvent the wheel there. It's possible Nintendo just wanted 3MB of cache, period, and then chose eDRAM for cost reasons rather than performance reasons.
 

Margalis

Banned
I can't believe people are seriously debating the "two WiiUs" comment. It was a joke.

Anyway, if it was two Us duct-taped together it would have 64MB of eDRAM, not 32.

This is assuming devs will be able to compress the textures etc. to fit in the small memory space.

You can compress any texture by making it smaller, and if you're targeting a lower-res output you won't see much of a relative hit in image quality. Ramping down texture sizes is like the single easiest thing possible. Scaling down stuff like collision detection or even poly counts is a million times harder. Program complexity is hard to ramp down; textures are extremely easy to ramp down.
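
Purely as a back-of-the-envelope illustration of how quickly texture memory falls off with resolution (my own rough sketch; the 4 bits-per-pixel figure assumes a DXT1/BC1-style block-compressed format, and the mip chain adds the usual ~1/3 on top):

```python
# Rough texture-memory arithmetic: halving resolution roughly quarters the footprint.
# Assumes a DXT1/BC1-style block-compressed format at 4 bits per pixel and a full
# mip chain (~1/3 extra on top of the base level). Illustrative numbers only.

def texture_mib(width, height, bits_per_pixel=4, mip_chain=True):
    base_bytes = width * height * bits_per_pixel / 8
    total_bytes = base_bytes * (4 / 3 if mip_chain else 1)
    return total_bytes / (1024 * 1024)

for size in (2048, 1024, 512):
    print(f"{size}x{size}: ~{texture_mib(size, size):.2f} MiB")

# Prints roughly 2.67, 0.67 and 0.17 MiB respectively, which is why dropping one
# mip level across the board is such a cheap way to fit a smaller memory budget.
```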
 

wsippel

Banned
Not necessarily. As mentioned, IBM already has chips that use eDRAM, so they don't need to reinvent the wheel there. It's possible Nintendo just wanted 3MB of cache, period, and then chose eDRAM for cost reasons rather than performance reasons.
That makes no sense. Nintendo doesn't care about big numbers, they only care about efficiency. Also, A2 is the only core using eDRAM. All 750s so far used SRAM, so staying with that would have been simpler. Obviously.
 

tipoo

Banned
That makes no sense. Nintendo doesn't care about big numbers, they only care about efficiency. Also, A2 is the only core using eDRAM. All 750s so far used SRAM, so staying with that would have been simpler. Obviously.

Not what I said...IBM already has the technology for eDRAM on processors, so it's not like they have to reinvent that whole thing to put it on a new chip. And as we've said probably dozens of times here already, "PowerPC 750 based" doesn't mean it's a 750 with some tweaks, any more than a Core 2 Duo is a Pentium 3 with some tweaks; it could be substantially new, but if the front end is compatible you could still say it's "Pentium 3 based".

What's this straw man about "big numbers"? Maybe they needed 3MB to hit their target performance, but 3MB SRAM would go over their size/cost budget so they went eDRAM.

Not necessarily. As mentioned, IBM already has chips that use eDRAM, so they don't need to reinvent the wheel there. It's possible Nintendo just wanted 3MB of cache, period, and then chose eDRAM for cost reasons rather than performance reasons.

I should have worded this better: I meant maybe they needed 3MB of cache for whatever performance reason, and the performance hit from eDRAM vs SRAM was worth it.
 

blu

Wants the largest console games publisher to avoid Nintendo's platforms.
Not sure what you mean. If you're saying that the capacity makes up for the lower speed, that's essentially what I was getting at; I'm just not sure how it would scale for a gaming use rather than a server use. The reason I say it's slower is that eDRAM has to refresh its memory banks and eSRAM doesn't, so for timing reasons eSRAM ends up faster.
No. I'm saying that for certain amounts and up, the overall latency (i.e. cell access + wire delay) of eDRAM catches up with and actually drops below that of SRAM. From what I recall, for IBM's 1Mb eDRAM macros that 'turning point' was at ~4MB, but I can't be bothered to seek out the paper right now.
 

tipoo

Banned
No. I'm saying that for certain amounts and up, the overall latency (i.e. cell access + wire delay) of eDRAM catches up with and actually drops below that of SRAM. From what I recall, for IBM's 1Mb eDRAM macros that 'turning point' was at ~4MB, but I can't be bothered to seek out the paper right now.

So for 3MB it would be over the latency of eSRAM?
 

wsippel

Banned
Not what I said...IBM already has the technology for eDRAM on processors, so it's not like they have to reinvent that whole thing to put it on a new chip.

What's this straw man about "big numbers"? Maybe they needed 3MB to hit their target performance, but 3MB SRAM would go over their size/cost budget so they went eDRAM.
Even though they have the tech, just leaving the cache as is (SRAM in that case) would still be less work.

And what's with the 3MB SRAM straw man? 3MB of eDRAM offered better performance than 1MB of SRAM; that's all that matters, because cost is always, always the limiting factor for pretty much everything. That's where the difference between efficiency and effectiveness comes into play.
 

Thraktor

Member
Perhaps the most interesting thing to me about Espresso is that, even though it's a relatively modest CPU, as far as I can tell it holds two records as far as CPUs are concerned:

1. I'm pretty sure it's the highest clocked CPU with a 4-stage pipeline ever.

2. The "special" core, with 2MB of L2 cache, has the most per-thread on-die cache of any consumer CPU ever. In fact, depending on when POWER7+ systems actually started shipping, when the Wii U released that core might have had the most per-thread cache of any CPU core ever made.

Incorrect, apparently, but still a big chunk of cache for one core.

(I'm open to correction on both of these, by the way)

It makes me all the more interested to see what it was that convinced Nintendo to go with so much cache for that core (I assume it was based on performance of some first party/third party code, but I'd be fascinated to know if it's some particular engine or middleware that saw such a large performance boost to warrant it).
 

blu

Wants the largest console games publisher to avoid Nintendo's platforms.
Perhaps the most interesting thing to me about Espresso is that, even though it's a relatively modest CPU, as far as I can tell it holds two records as far as CPUs are concerned:

1. I'm pretty sure it's the highest clocked CPU with a 4-stage pipeline ever.

2. The "special" core, with 2MB of L2 cache, has the most per-thread cache of any consumer CPU ever. In fact, depending on when POWER7+ systems actually started shipping, when the Wii U released that core might have had the most per-thread cache of any CPU core ever made.

(I'm open to correction on both of these, by the way)

It makes me all the more interested to see what it was that convinced Nintendo to go with so much cache for that core (I assume it was based on performance of some first party/third party code, but I'd be fascinated to know if it's some particular engine or middleware that saw such a large performance boost to warrant it).
I'm almost positive you're right about the first. Re the second, it really depends what you'd dub a 'consumer CPU', and whether the L2 has to be on-die, on-chip, or off-chip. The R10K supported 16MB of L2 back in 1997, but that was off-chip, speeds back then were in the low hundreds of MHz, and the R10K chip was a 'bit' pricey even on its own, let alone with 16MB worth of SRAM chips : )
 

Thraktor

Member
I'm almost positive you're right about the first. Re the second, it really depends what you'd dub a 'consumer CPU', and whether the L2 has to be on-die, on-chip, or off-chip. The R10K supported 16MB of L2 back in 1997, but that was off-chip, speeds back then were in the low hundreds of MHz, and the R10K chip was a 'bit' pricey even on its own, let alone with 16MB worth of SRAM chips : )

I'm limiting myself to on-die here, otherwise we have to include things like the zEC12, which has 384MB of off-die L4. That's enough to fit the entire usable RAM of the PS3 (at launch) in cache at once.

Edit: Actually, off-die cache is largely independent of the CPU in any case. I'm sure you could take a stock Pentium 1 and insert about 100MB of cache between itself and the RAM if you really wanted.
 
Aren't there some serious limitations to a 4-stage pipeline besides the fact that it doesn't clock up?

There has got to be a reason IBM went with more than 20 pipeline stages for their newer CPUs. Even Intel's modern CPUs have just under 20 stages.
 
I'm pretty sure Crytek has said they could port Crysis 3 to the WiiU if they wanted with little to no visual fidelity loss.

And the very strong rumors that the WiiU can run Unreal Engine 4 lead me to believe it'll be a stronger system than most now believe it is.

This was already confirmed true by Mark Rein at Epic Games. The Wii U will be able to get ports of most games on the PS4 and Xbox 720; it's just up to the developers to do it. Regardless, they should have an easier time porting to Wii U than the nightmare they had making ground-up Wii ports.

These new consoles are all being developed in very similar ways, which bodes well for the Wii U.
 

blu

Wants the largest console games publisher to avoid Nintendo's platforms.
I'm limiting myself to on-die here, otherwise we have to include things like the zEC12, which has 384MB of off-die L4. That's enough to fit the entire usable RAM of the PS3 (at launch) in cache at once.
Well, I was strictly limiting myself to L2 and single-threaded CPUs, as that was my understanding of your criteria ; )

Edit: Actually, off-die cache is largely independent of the CPU in any case. I'm sure you could take a stock Pentium 1 and insert about 100MB of cache between itself and the RAM if you really wanted.
Actually cache size and organisation are largely imprinted in the CPU. Whether it's on-die or elsewhere is just a matter of wiring.
 

Thraktor

Member
Aren't there some serious limitations to a 4-stage pipeline besides the fact that it doesn't clock up?

There has got to be a reason IBM went with more than 20 pipeline stages for their newer CPUs. Even Intel's modern CPUs have just under 20 stages.

It's really mostly about clocks. Each added stage allows you to increase the clock speed that bit further, but reduces the performance per clock a bit because of increased branch misprediction penalties. These days the clock benefits seem to outweigh the misprediction losses up to about the high teens of stages, so that tends to be the sweet spot in terms of pipeline depth. It does depend on things like branch prediction accuracy and the various techniques used to limit the effect of misprediction, so different architectures are going to have different sweet spots.
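
To make that tradeoff concrete, here's a deliberately crude toy model (all constants are made up for illustration, not taken from any real CPU): deeper pipelines shrink the per-stage logic delay so the clock can rise, but every mispredicted branch flushes more stages.

```python
# Toy model of the pipeline-depth tradeoff: deeper pipelines cut the logic delay
# per stage (higher clock), but each mispredicted branch flushes more stages.
# All constants are made-up illustrative values, not numbers for any real CPU.

T_LOGIC = 18.0                 # total logic delay per instruction (arbitrary units)
T_LATCH = 0.8                  # fixed per-stage latch/overhead cost
MISPREDICTS_PER_INSTR = 0.05   # mispredicted branches per instruction (pessimistic)

def time_per_instruction(depth):
    cycle_time = T_LOGIC / depth + T_LATCH        # deeper => shorter cycle
    flush_cost = MISPREDICTS_PER_INSTR * depth    # deeper => bigger flush penalty
    return cycle_time * (1.0 + flush_cost)

best = min(range(2, 31), key=time_per_instruction)
for depth in (4, 10, best, 28):
    print(f"{depth:2d} stages: {time_per_instruction(depth):.2f} time units/instruction")
print(f"sweet spot in this toy model: ~{best} stages")
```

With these made-up constants the optimum lands at around 20 stages; improve the branch predictor or shrink the latch overhead and it moves, which is exactly why different architectures end up with different sweet spots.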
 

Thraktor

Member
Well, I was strictly limiting myself to L2 and single-threaded CPUs, as that was my understanding of your criteria ; )

I've edited my original post to clarify a little.

Actually cache size and organisation are largely imprinted in the CPU. Whether it's on-die or elsewhere is just a matter of wiring.

In general, yes, but if you have a CPU with an off-die memory controller (e.g. the Pentium, where it's located on a discrete northbridge), then surely you can intercept memory requests between the CPU and northbridge, and implement a cache in between the two without the CPU "knowing"?
 

Rolf NB

Member
It's not slower than SRAM for certain amounts and up.
eDRAM is slower than SRAM at any size.
Typical L2 (SRAM) latency for POWER7 is 6 cycles*, and it takes between 18 and 128 clocks to access the eDRAM L3 cache (depending on "local" vs "nonlocal" L3 partitioning).

*cycles spent on checking lower-level caches factored out

The advantage of eDRAM is density, and density only. You can have more of it in the same space. IBM, in their server processors, does this: they kill it with sheer size (a 32MB eDRAM L3 cache in addition to 256kB of SRAM as L2). Nintendo does not.
 

tipoo

Banned
Aren't there some serious limitations to a 4-stage pipeline besides the fact that it doesn't clock up?

There has got to be a reason IBM went with more than 20 pipeline stages for their newer CPUs. Even Intel's modern CPUs have just under 20 stages.


I'll admit I don't know enough about this first off, but I don't think a short pipeline is all good apart from limiting clock speed. Or at least, shorter isn't always better, and the optimal depth seems to be 10-20 depending on what the processor is used for. Pipelining allows multiple instructions to be fed into the processor at once, so with a 4 stage pipeline after 4 instructions if the first one had not completed, the system would simply have to stop feeding the processor new instructions. With a deeper pipeline it could keep filling it up. The problems with this have already been mentioned, if it predicts a branch wrong it has to evacuate some of the pipeline stages.

With earlier pipelined processors, it would evacuate ALL the pipeline stages, but two things have made this less of an issue: now it only evacuates the missed-branch pipeline stages, and branch prediction rates themselves have gotten more and more accurate.

So there's the tradeoff, and just as longer wasn't always better, shorter isn't always better either.

All that said, I still have no damn clue if the 4 stage pipeline is worth the 1.2GHz clock, or if they went with that design for being cost and power effective. Anyone else care to guess?

Edit: Here are two studies on it
https://docs.google.com/viewer?a=v&...WagxhG&sig=AHIEtbRECrbEwXkZAiVsuLSC2DLnGmNkrg

https://docs.google.com/viewer?a=v&...kWrVYa&sig=AHIEtbQ3iRfc4wcUPlROHSCOOusYLrfGbA


So yeah, shorter isn't always better. There is a sweet spot between too-long and too-short pipelines.
 

blu

Wants the largest console games publisher to avoid Nintendo's platforms.
eDRAM is slower than SRAM at any size.
Typical L2 (SRAM) latency for POWER7 is 6 cycles*, and it takes between 18 and 128 clocks to access the eDRAM L3 cache (depending on "local" vs "nonlocal" L3 partitioning).
Did you read what I wrote? POWER7 has 256KB of L2 SRAM per core, and 32MB of L3 eDRAM, with local/remote partitioning, etc. The latencies you're quoting are meaningless in the context of what I said - POWER7 has neither 256KB of eDRAM / 32MB of SRAM, nor any other SRAM/eDRAM pools of comparable size.

The advantage of eDRAM is density, and density only. You can have more of it in the same space. IBM, in their server processors, does this: they kill it with sheer size (a 32MB eDRAM L3 cache in addition to 256kB of SRAM as L2). Nintendo does not.
Has it occurred to you that better density could translate to lower wiring latencies for the same number of cells?

Here, help yourself: https://www.google.com/search?q=ibm+edram+pdf+Subramanian+Iyer (first hit, page 10).
 

nordique

Member
It's really mostly about clocks. Each added stage allows you to increase the clock speed that bit further, but reduces the performance per clock a bit because of increased branch misprediction penalties. These days the clock benefits seem to outweigh the misprediction losses up to about the high teens of stages, so that tends to be the sweet spot in terms of pipeline depth. It does depend on things like branch prediction accuracy and the various techniques used to limit the effect of misprediction, so different architectures are going to have different sweet spots.

Given Nintendo's (perceived) targets for a system, I suppose that the CPU they included is ideal for what they wanted.

Something to be efficient power draw wise, backwards compatible, yet powerful enough for modern HD gaming purposes, within a specific budget range

Makes sense they went with lower clocks and a shorter pipeline

The asymmetrical cache aspect has me curious; how is that ideal/good/beneficial? Other than a "master core" being active for perhaps OS routines, why have that design over 1MB of cache per core?

Though 8 cores vs 3 cores is a fairly large difference, I wonder how Espresso stacks up core-to-core against its next-gen counterparts' Jaguar cores? (I suppose that would depend on whether a 512KB-cache core or the 2MB-cache core is compared to the Jaguar core used, which is only clocked ~29% faster at 1.6 vs 1.243125 GHz, aside from the fact that IBM vs AMD design varies)

It's just interesting to me since I am a believer that the Wii U CPU is a much smarter design, all things considered, than the interwebs give it credit for. That said, I'm not a particularly tech-fluent person so I don't understand any deeper levels beyond that.
 

Thraktor

Member
They went with the design because of backwards compatibility with Wii. The only people designing CPUs with as few stages these days are ARM, and those are embedded CPUs like the M0, which is absolutely minuscule and designed to operate at only a couple hundred MHz. If Nintendo and IBM were to design a CPU completely from scratch for Wii U, I wouldn't see them go with anything less than about 10 stages, even considering a desire for energy efficiency.


That slide actually gives us the exact numbers we're looking for:

512KB cache - ~1.7ns latency with eDRAM vs ~1.1ns latency with SRAM
2MB cache - ~2ns latency with eDRAM vs ~1.7ns latency with SRAM

Fairly small total latency increases when you consider you're getting three times the cache in trade, to be honest.
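
For a rough sense of scale, here's those figures converted into cycles at the commonly reported ~1.24GHz Espresso clock (my own back-of-the-envelope, not something from the slide):

```python
# Convert the slide's latencies from nanoseconds into CPU cycles, assuming the
# commonly reported ~1.24 GHz Espresso clock. Sense-of-scale only: the slide's
# figures are for IBM's macros, not necessarily the exact arrays in the console.

CLOCK_HZ = 1.243125e9
cycle_ns = 1e9 / CLOCK_HZ   # ~0.80 ns per cycle

for label, ns in [("512KB eDRAM", 1.7), ("512KB SRAM", 1.1),
                  ("2MB eDRAM", 2.0), ("2MB SRAM", 1.7)]:
    print(f"{label:12s}: {ns:.1f} ns ≈ {ns / cycle_ns:.1f} cycles")

# The eDRAM-vs-SRAM gap works out to well under one cycle at either size, before
# any pipelining of the cache access is even considered.
```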
 

blu

Wants the largest console games publisher to avoid Nintendo's platforms.
I'll admit I don't know enough about this first off, but I don't think a short pipeline is all good apart from limiting clock speed. Or at least, shorter isn't always better, and the optimal depth seems to be 10-20 depending on what the processor is used for. Pipelining allows multiple instructions to be fed into the processor at once, so with a 4 stage pipeline after 4 instructions if the first one had not completed, the system would simply have to stop feeding the processor new instructions. With a deeper pipeline it could keep filling it up. The problems with this have already been mentioned, if it predicts a branch wrong it has to evacuate some of the pipeline stages.
You got the concept of pipelines a bit wrong. A pipeline stall is a pipeline stall, regardless of the length of the pipeline - you cannot move ops positioned above the stalled stage any further down, or feed any new ops to the pipeline. Now, how long each instruction spends at a given stage can vary, though most instructions in most pipelined CPUs spend a single cycle at each stage, before they leave the pipeline AKA retire (which, apropos, might be earlier than the final stage, for various special cases).

With earlier pipelined processors, it would evacuate ALL the pipeline stages, but two things have made this less of an issue: now it only evacuates the missed-branch pipeline stages, and branch prediction rates themselves have gotten more and more accurate.
That would be valid for shorter pipelines of the same capability.

So there's the tradeoff, and just as longer wasn't always better, shorter isn't always better either.
The tradeoff is in the amount of work you can do - a longer pipeline can do more work per op over the course of more stages, or do the same amount of work, but broken down into smaller pieces over the more stages, thus allowing the pipeline to clock higher. For a fixed-clock, fixed amount of work, though, the shorter the pipeline you can 'encode' that in, the better the design's overall performance.
 

Pociask

Member
They went with the design because of backwards compatibility with Wii. The only people designing CPUs with as few stages these days are ARM, and those are embedded CPUs like the M0, which is absolutely minuscule and designed to operate at only a couple hundred MHz. If Nintendo and IBM were to design a CPU completely from scratch for Wii U, I wouldn't see them go with anything less than about 10 stages, even considering a desire for energy efficiency.

Which is really, really curious, given that Nintendo treats backwards compatibility as some kind of red-headed stepchild, and even dropped Gamecube backward compatibility midway through the Wii's lifespan.

And why did they think energy efficiency is so important? For something that completely drove their hardware design, have they made any public statements touting how energy efficient the system is?
 

Durante

Member
Program complexity is hard to ramp down; textures are extremely easy to ramp down.
Which brings us back to the point that the CPU could be a much larger issue for downports than the GPU.

2. The "special" core, with 2MB of L2 cache, has the most per-thread on-die cache of any consumer CPU ever. In fact, depending on when POWER7+ systems actually started shipping, when the Wii U released that core might have had the most per-thread cache of any CPU core ever made.
Maybe I'm missing a part of your criteria, but I don't think this is true at all. For example, a large number of Core 2 Duo processors released around '08 have 6MB of L2 for 2 cores/threads.
 

The Boat

Member
Which is really, really curious, given that Nintendo treats backwards compatibility as some kind of red-headed stepchild, and even dropped Gamecube backward compatibility midway through the Wii's lifespan.

And why did they think energy efficiency is so important? For something that completely drove their hardware design, have they made any public statements touting how energy efficient the system is?

I don't see anyone doing BC as well as Nintendo.
 

Thraktor

Member
Maybe I'm missing a part of your criteria, but I don't think this is true at all. For example, a large number of Core 2 Duo processors released around '08 have 6MB of L2 for 2 cores/threads.

Nah, you're quite right. I skimmed through Intel's processors before making the claim, but obviously missed those.
 

Pociask

Member
I don't see anyone doing BC as well as Nintendo.

That's what's so crazy about it! They do perfect hardware BC, and even design their new systems for that sole purpose - but apparently, at the same time, are happy to ditch BC completely mid-way through a gen. Also, they don't seem to want to reap any benefits of built in BC - why is there a Wii mode, where Wii Ware and Wii Virtual Console are hidden away? Why isn't that integrated into the new eShop? Where are DS games on the 3DS eShop?
 
Ok. Thanks for clarifying.



I'm not sure if it was ever officially confirmed, though Marcan stated that there is an ARM processor in the system that at least does the tasks that the Wii's ARM "Starlet" did. I believe Wsippel's research led him to unofficially confirm that the ARM is multi-core.

As for the background OS tasks running off the ARM, I honestly think that just makes sense considering that Espresso only has 3 cores to work with and we haven't heard anything about any of those cores being locked up.

Wasn't that strictly in regards to security? The same setup was used in GameCube and Wii.

And yeah no offense but Wsippel's research often turns up extremely optimistic.
 

Thraktor

Member
That's what's so crazy about it! They do perfect hardware BC, and even design their new systems for that sole purpose - but apparently, at the same time, are happy to ditch BC completely mid-way through a gen. Also, they don't seem to want to reap any benefits of built in BC - why is there a Wii mode, where Wii Ware and Wii Virtual Console are hidden away? Why isn't that integrated into the new eShop? Where are DS games on the 3DS eShop?

The reason they ditched Gamecube BC mid-gen was simple: it required extra hardware (controller ports and memory card slots) and 99% of people who wanted the feature already owned a Wii by that point. This time around there's no extra hardware required (every Wii component is used in some way by Wii U) and there are an order of magnitude more people with Wii games they might want to play than there were people with GC games. I'd be extremely surprised if BC doesn't last the entire gen.*

*I'll qualify this by saying that Nintendo might at some point release a Wii U without an optical drive, and therefore it wouldn't be compatible with Wii discs, but if they do it'll be sold alongside the main version of the console, so I'll be correct in the sense of "You'll be able to buy a new Wii U which can play Wii games right up to the point the Wii U's successor is released".
 

The Boat

Member
That's what's so crazy about it! They do perfect hardware BC, and even design their new systems for that sole purpose - but apparently, at the same time, are happy to ditch BC completely mid-way through a gen. Also, they don't seem to want to reap any benefits of built in BC - why is there a Wii mode, where Wii Ware and Wii Virtual Console are hidden away? Why isn't that integrated into the new eShop? Where are DS games on the 3DS eShop?

They removed BC in 2011, hardly mid-gen, more like near the end of it. It was a way of cutting costs because most people didn't care about it.
The Wii mode is there because it's the easiest, safest way to guarantee 100% Wii BC, with less work and more security that (in theory) doesn't allow Wii's security flaws to spread to Wii U.

Talk of eShop and VC strays from actual BC purposes.
 

Pociask

Member
They removed BC in 2011, hardly mid-gen, more like near the end of it. It was a way of cutting costs because most people didn't care about it.
The Wii mode is there because it's the easiest, safest way to guarantee 100% Wii BC, with less work and more security that (in theory) doesn't allow Wii's security flaws to spread to Wii U.

Talk of eShop and VC strays from actual BC purposes.

You're right that it's later in the gen when they removed it. But still, you brought out the important point - "most people didn't care about it."

I guess you have to know why someone values BC - to keep playing old games during new system droughts, or to be able to play two generations of games on one device. If it's reason two, then that never loses value, and shouldn't be removed. If it's reason one, then BC loses all value at some point post-launch, and removing BC makes perfect sense.

So, let's say Nintendo is doing the thing that makes more sense, and they believe that it's just a feature that is only important to early adopters. How many early adopters is it really important to? How many sales would they lose if they didn't include that feature? How many sales could they pick up if they weren't constrained by trying to do perfect hardware BC, which in the long run of the console lifecycle is a feature that many people don't care about, at all?

I'm not attacking Nintendo here - I really have no idea what Nintendo is doing. The Wii U is the necessary result of choices Nintendo made early on - power consumption, backwards compatibility, Gamepad inclusion, etc. Why they made those early decisions is still pretty much an open question, I think.

And I don't think talk of eShop and VC strays from the purpose of BC - if you're going to make a system natively run another system's code, why aren't you making that other system's games available digitally? It goes to show whether BC is important or not to Nintendo, or whether Nintendo is working at cross purposes with itself.
 

Thraktor

Member
You're right that it's later in the gen when they removed it. But still, you brought out the important point - "most people didn't care about it."

I guess you have to know why someone values BC - to keep playing old games during new system droughts, or to be able to play two generations of games on one device. If it's reason two, then that never loses value, and shouldn't be removed. If it's reason one, then BC loses all value at some point post-launch, and removing BC makes perfect sense.

So, let's say Nintendo is doing the thing that makes more sense, and they believe that it's just a feature that is only important to early adopters. How many early adopters is it really important to? How many sales would they lose if they didn't include that feature? How many sales could they pick up if they weren't constrained by trying to do perfect hardware BC, which in the long run of the console lifecycle is a feature that many people don't care about, at all?

I'm not attacking Nintendo here - I really have no idea what Nintendo is doing. The Wii U is the necessary result of choices Nintendo made early on - power consumption, backwards compatibility, Gamepad inclusion, etc. Why they made those early decisions is still pretty much an open question, I think.

And I don't think talk of eShop and VC strays from the purpose of BC - if you're going to make a system natively run another system's code, why aren't you making that other system's games available digitally? It goes to show whether BC is important or not to Nintendo, or whether Nintendo is working at cross purposes with itself.

Regarding the last question, I fully expect Wii games to become available on the eShop at some point. It might be a couple of years before they do so, but it seems a no-brainer by the end of the gen. Being able to play Gamecube games almost natively is also a bonus when it comes to VC.

Back to the CPU, while I would expect Nintendo and IBM to have taken a different microarchitectural route had they started the CPU from scratch, I don't think the outcome would have been all that different, as they would still have been working within the same cost and thermal constraints. You probably would have got a CPU which clocked to around 1.6GHz, but with very similar per-clock performance as Espresso. Backwards compatibility probably cost them about 30% CPU performance, and saved them a lot of R&D cash, which I'm sure they considered to be worth it.
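
Rough arithmetic behind that ballpark, taking the hypothetical ~1.6GHz clock and the equal per-clock performance above at face value (both are assumptions, not measurements):

```python
# Back-of-the-envelope check on the "~30% CPU performance" figure, taking the
# assumptions above at face value: a hypothetical from-scratch core at ~1.6 GHz
# with per-clock performance similar to Espresso at ~1.24 GHz.

espresso_ghz = 1.243125
hypothetical_ghz = 1.6

gain = hypothetical_ghz / espresso_ghz - 1   # ~0.29 -> hypothetical core ~29% faster
loss = 1 - espresso_ghz / hypothetical_ghz   # ~0.22 -> Espresso ~22% slower

print(f"hypothetical 1.6 GHz core: ~{gain:.0%} faster than Espresso")
print(f"Espresso: ~{loss:.0%} slower than the hypothetical core")
```

So "about 30%" is in the right ballpark if you read it as how much faster the clock-for-clock-equivalent design would have been.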
 

Lonely1

Unconfirmed Member
Kinda amusing that, now that the NextBox is using a similar setup, the EDRAM stopped being "A Nintard wishful-thinking setup" and is now a serious solution to a (relatively) low main memory bandwidth architecture.
 
Kinda amusing that, now that the NextBox is using a similar setup, the EDRAM stopped being "A Nintard wishful-thinking setup" and is now a serious solution to a (relatively) low main memory bandwidth architecture.

EDRAM was in the 360. It helped a lot. There are more things missing in the Wii U compared to the next gen.
 

Durante

Member
Kinda amusing that, now that the NextBox is using a similar setup, the EDRAM stopped being "A Nintard wishful-thinking setup" and is now a serious solution to a (relatively) low main memory bandwidth architecture.
Please provide a link when you make such accusations. I certainly don't remember anyone calling eDRAM "A Nintard wishful-thinking setup". (Which would be strange even if your implicit supposition that everyone hates Nintendo were true -- it's not like the Wii U is the first system to use embedded memory)
 

Durante

Member
If the current rumours are accurate, a factor of 4-6 in GPU performance (and a newer architecture as well), up to 7 in CPU performance and around 6 in external memory bandwidth. (The latter is 15 instead of 6 for Orbis, but its memory setup is not comparable)
 

The Boat

Member
You're right that it's later in the gen when they removed it. But still, you brought out the important point - "most people didn't care about it."

I guess you have to know why someone values BC - to keep playing old games during new system droughts, or to be able to play two generations of games on one device. If it's reason two, then that never loses value, and shouldn't be removed. If it's reason one, then BC loses all value at some point post-launch, and removing BC makes perfect sense.

So, let's say Nintendo is doing the thing that makes more sense, and they believe that it's just a feature that is only important to early adopters. How many early adopters is it really important to? How many sales would they lose if they didn't include that feature? How many sales could they pick up if they weren't constrained by trying to do perfect hardware BC, which in the long run of the console lifecycle is a feature that many people don't care about, at all?

I'm not attacking Nintendo here - I really have no idea what Nintendo is doing. The Wii U is the necessary result of choices Nintendo made early on - power consumption, backwards compatibility, Gamepad inclusion, etc. Why they made those early decisions is still pretty much an open question, I think.

And I don't think talk of eShop and VC strays from the purpose of BC - if you're going to make a system natively run another system's code, why aren't you making that other system's games available digitally? It goes to show whether BC is important or not to Nintendo, or whether Nintendo is working at cross purposes with itself.

I don't disagree with you there, but there probably isn't one sole reason for it. Wii was a small bump from GC hardware-wise because they wanted out of the hardware race and were making something very risky; starting from GC's architecture made sense financially, and VC was an added bonus. I doubt many people bought a Wii just to play GC or a PS2 just to play PS1, but they're a nice bonus, especially at launch. Later on, especially in GC's case since it wasn't a runaway success, most people didn't care, so it didn't make financial sense for Nintendo to include it. They didn't design the hardware around BC, so I don't see a problem here.

Wii U though, I know as much as you do about why they decided what they did in regards to it being a continuation of sorts of Wii's CPU, but it's not the same situation as GC -> Wii. My guess is that it made more sense to do so than to make something from scratch, considering their familiarity with the hardware. I also doubt they designed the hardware around BC, even if there are loads of people with Wii games.

I don't think that putting games up for download is BC in its purest sense. When I think backwards compatibility, I think about being able to pop whatever physical media I had for my old consoles into my new console and play it. Digitally downloading them is a different thing, even if hardware similarities make it easy to run the games. Wii isn't backwards compatible with the NES, SNES, N64, Mega Drive/Genesis, etc., and it still runs games from those consoles.
I agree with you that they need to embrace their old catalog more aggressively, like they did with Wii. Despite what people say about VC, there's no other console with something like this on this scale.
 

Lonely1

Unconfirmed Member
Please provide a link when you make such accusations. I certainly don't remember anyone calling eDRAM "A Nintard wishful-thinking setup". (Which would be strange even if your implicit supposition that everyone hates Nintendo were true -- it's not like the Wii U is the first system to use embedded memory)

The thread where the RAM type Nintendo was using was disclosed is full of such posts. And I'm not pointing at you. The "A Nintard wishful-thinking setup" bit is paraphrasing DrinkyCow, which I know is a troll account, but many were marching to that tune.
 

magash

Member
If the current rumours are accurate, a factor of 4-6 in GPU performance (and a newer architecture as well), up to 7 in CPU performance and around 6 in external memory bandwidth. (The latter is 15 instead of 6 for Orbis, but its memory setup is not comparable)

My understanding of oversitting's sentence "...more things missing" is that it relates to hardware features, not raw power.
 

Ryoku

Member
Please provide a link when you make such accusations. I certainly don't remember anyone calling eDRAM "A Nintard wishful-thinking setup". (Which would be strange even if your implicit supposition that everyone hates Nintendo were true -- it's not like the Wii U is the first system to use embedded memory)

It was widely referred to as "GPGPU/eDRAM magic" :p
 

Durante

Member
It was widely referred to as "GPGPU/eDRAM magic" :p
Well, criticizing "GPGPU magic" has some validity, when it is proposed as a general cure for weak CPU performance. Just like criticizing eDRAM as a general solution for all memory bandwidth issues is valid.

For an example, you need not look farther than how harsh people reacted in the current Durango thread when it was proposed that you can simply see it as a system with 8 GB of 170 GB/s memory. Or even the reaction to the "magic move engines". Basically, unrealistic scenarios are pointed out as such regardless of where they come from, or which hardware they pertain to.
 
Well, criticizing "GPGPU magic" has some validity, when it is proposed as a general cure for weak CPU performance. Just like criticizing eDRAM as a general solution for all memory bandwidth issues is valid.

For an example, you need not look farther than how harsh people reacted in the current Durango thread when it was proposed that you can simply see it as a system with 8 GB of 170 GB/s memory. Or even the reaction to the "magic move engines". Basically, unrealistic scenarios are pointed out as such regardless of where they come from, or which hardware they pertain to.
The general issue is people not understanding why it's "magic" depending on or deriding the setup.

Dumbasses being dumbasses, basically. Acting like it's a magic bandaid, or insulting it when not realizing that its inclusion really does help the design... just not to the lengths some try to embellish.
 

Ryoku

Member
Well, criticizing "GPGPU magic" has some validity, when it is proposed as a general cure for weak CPU performance. Just like criticizing eDRAM as a general solution for all memory bandwidth issues is valid.

For an example, you need not look farther than how harsh people reacted in the current Durango thread when it was proposed that you can simply see it as a system with 8 GB of 170 GB/s memory. Or even the reaction to the "magic move engines". Basically, unrealistic scenarios are pointed out as such regardless of where they come from, or which hardware they pertain to.

Oh, I'm not arguing the facts. I'm just pointing it out since you asked.
 

Thraktor

Member
The general issue is people not understanding why it's "magic" depending on or deriding the setup.

This is true, but every technical thread is going to contain people saying "X is great" and "Y is terrible" without really understanding what X and Y are. Confirmation bias is a fact of life when it comes to things like this.
 

Durante

Member
The general issue is people not understanding why it's "magic" depending on or deriding the setup.
I'd say the general issue is people trying to use things as an argument that they don't really understand, or don't have enough information about. Of course, that's true for all "sides".

Edit:
This is true, but every technical thread is going to contain people saying "X is great" and "Y is terrible" without really understanding what X and Y are. Confirmation bias is a fact of life when it comes to things like this.
Exactly.

One funny thing for me personally is that I find myself downplaying the impact of GPGPU recently on GAF, when I spent 3 years or so of my life (~2005-2008) convincing people how great it is ;)
 

Raist

Banned
The thread where the RAM type Nintendo was using was disclosed is full of such posts. And I'm not pointing at you. The "A Nintard wishful-thinking setup" bit is paraphrasing DrinkyCow, which I know is a troll account, but many were marching to that tune.

It wasn't about the EDRAM, it was about the slow as fuck main RAM they're using.
 
This is true, but every technical thread is going to contain people saying "X is great" and "Y is terrible" without really understanding what X and Y are. Confirmation bias is a fact of life when it comes to things like this.

I'd say the general issue is people trying to use things as an argument that they don't really understand, or don't have enough information about. Of course, that's true for all "sides".

Edit:
Exactly.

One funny thing for me personally is that I find myself downplaying the impact of GPGPU recently on GAF, when I spent 3 years or so of my life (~2005-2008) convincing people how great it is ;)

My edit is actually more concise. I'm not entirely sure what I was trying to say with that first one. Just kind of comes off like mangled English.
 