
The iPhone 7 will be the most powerful gaming system any Nintendo game has run on

LordOfChaos

Member
What games did Nintendo ever release on PC? Like some old Japanese PC? iPhone 7 is probably faster than most old PCs unless they're just a few years old

Mario is Missing, and yeah, an iPhone 7 is way more powerful than PCs from when that came out. But Mario Run also runs on much older iPhones, so applying the same rule both ways, PCs would be the most powerful gaming platform a Nintendo game ever ran on, since a modern PC could still run it.

That makes the title utterly meaningless, but it already was, since the game wasn't built from the ground up for the 7. I still get the OP's point though; it's an interesting tidbit that if Nintendo focused on top-end iPhone hardware, it would be more powerful than any hardware they ever made.
 
Mario is Missing, and yeah, an iPhone 7 is way more powerful than PCs from when that came out. But Mario Run also runs on much older iPhones, so applying the same rule both ways...

The oldest iPhone that can run Super Mario Run will almost certainly still be faster than any consumer-friendly PC that came out in 1992, the year the game came out.
 
Mario is Missing

Oh come on, that doesn't count. The MS-DOS version was published by Mindscape, not Nintendo, and I seriously doubt it would run natively today without DOSBox, meaning it's emulated. And if we're going to bundle emulation into the equation, then...
 

LordOfChaos

Member
The oldest iPhone that can run Super Mario Run will almost certainly still be faster than any consumer-friendly PC that came out in 1992, the year the game came out.

Oh come on, that doesn't count. The MS-DOS version was published by Mindscape, not Nintendo, and I seriously doubt it would run natively today without DOSBox, meaning it's emulated. And if we're going to bundle emulation into the equation, then...

Well, if Mario Run runs on a 4S or 5, it would invalidate "most powerful gaming system a Nintendo game has run on" by way of it not being exclusively for the 7's hardware; it's for the iOS platform in general, which includes newer, more powerful hardware, if we apply the same rule that forward compatibility doesn't count.

32-bit Windows can run 16-bit applications, and you could certainly find a 32-bit computer out there with more GPU grunt than an iPhone.

This is getting into uninteresting semantic games anyway. As I said, I do get the OP's point; it's an interesting factoid that, if they took full advantage of it, modern iOS hardware would be more powerful than any dedicated Nintendo hardware.
 

Zil33184

Member
Interesting.
Pixel (not vertex) shader processing was not something that Cell excelled at.

I agree with everything else you said though.

...

Software rasterization is inefficient for pixel shaders, so they definitely needed dedicated hardware for that, unless people were OK with PS2-era graphics (no programmable pixel shaders at all, minus a few exceptions).

Quite a few games used deferred shading on the SPUs, especially later on in the console generation, as the SPUs were better than Xenos at deferred lighting passes.
 

wilstreak

Member
TC's claim likely won't hold up.

The game will release in the holiday season, and that means the iPad Pro 2 will probably release before the game.
So the iPad Pro 2 will be the most powerful gaming system any Nintendo game has run on.
 
I understand the difference between RISC and CISC. For pure computation the Cell remains faster, which even given the architectural differences is sad for a machine that released 8 years later.
CISC vs RISC? LOL, where did that come from?!

I said MIPS and flops... you know, integer vs floating point operations?

I guess you still don't understand why Jaguar is 2.5-3x faster than the Xenon CPU in general purpose code, right?

You even ignored this site: http://7-cpu.com/

I wonder why...

Regarding flops, do you realize that Jaguar at 1.6 GHz (half the frequency of Cell) can perform 102.4 Gflops? Cell has 200 Gflops at 3.2 GHz. This says to me that Jaguar has a really efficient design (it reminds me of this tweet: https://twitter.com/marcan42/status/274179630599131136).
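
Spelling the arithmetic out (a back-of-the-envelope sketch; the 8-FLOPs-per-core-per-cycle figures below are the commonly quoted peak numbers, Jaguar's 4-wide add + 4-wide mul and the SPE's 4-wide FMA, not measurements):

Code:
#include <stdio.h>

/* peak single-precision GFLOPS = cores x clock (GHz) x FLOPs per core per cycle */
static double peak_gflops(int cores, double ghz, int flops_per_cycle)
{
    return cores * ghz * flops_per_cycle;
}

int main(void)
{
    /* Jaguar: two 128-bit FP pipes, 4-wide add + 4-wide mul = 8 SP FLOPs/cycle */
    printf("8x Jaguar @ 1.6 GHz: %.1f GFLOPS\n", peak_gflops(8, 1.6, 8));
    /* Cell SPE: one 4-wide fused multiply-add = 8 SP FLOPs/cycle */
    printf("8x SPE    @ 3.2 GHz: %.1f GFLOPS\n", peak_gflops(8, 3.2, 8));
    return 0;   /* prints 102.4 and 204.8 */
}

Same FLOPs per clock on both; the gap in the headline numbers comes entirely from the clock.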

You are contradicting yourself. If you think only highly parallelizable code matters in gaming, then the Jaguar is a downgrade. In fact, if only parallelizable code mattered, then they could have put in an ARM CPU and focused solely on GPGPU instead, like Nvidia is doing with Tegra.
I'm not contradicting anything. For number crunching, we have GPUs these days. Cell is obsolete, but its legacy still lives in every AMD APU.

ND is even using the GPU for AI pathfinding in Uncharted 4. If the PS4 had a hypothetical Cell 2 with 32 SPUs, they would use those instead. Proficient developers will always use the most appropriate tool at their disposal, not the other way around.

ps: AMD doesn't have good ARM CPUs. There's a reason they put x86-64 cores in their APUs.

Quite a few games used deferred shading on the SPUs, especially later on in the console generation, as the SPUs were better than Xenos at deferred lighting passes.
AFAIK, the bulk of pixel shader operations were performed by the RSX.

It was vertex shaders that the RSX was really weak at and it needed some Cell "assistance".
 

diehard

Fleer
AMD does have a higher-end, ARM-based SoC for servers.

The reason they put AMD64-based processors in their APUs is so they can run Windows, and because they are one of two current companies with the rights to manufacture them.
 
AMD does have a higher-end, ARM-based SoC for servers.

The reason they put AMD64-based processors in their APUs is so they can run Windows, and because they are one of two current companies with the rights to manufacture them.
http://www.fudzilla.com/news/processors/39179-jim-keller-was-not-a-big-fan-of-k12

x86-64 = better performance
ARM = more suitable for mobile devices

Also, I don't think ARM64 was ready back in 2012 when they finalized the APU designs... they wanted 8GB of RAM (Microsoft mostly and then Sony), so ARM32 wouldn't cut it.
 

KeigoNiwa

Member
I was expecting to be stunned by the fact that the iPhone 7's specs were in fact better than those of the Wii U. Instead, I got a comparison chart between iPhones.

Incredibly misleading thread :(
 

Zil33184

Member
AFAIK, the bulk of pixel shader operations were performed by the RSX.

It was vertex shaders that the RSX was really weak at and it needed some Cell "assistance".

RSX mainly rendered the g-buffers, but in games that used deferred shading pretty much all of the pixel shading operations were done on the SPUs, including post processing and anti-aliasing. That's not to say RSX was horrible at it, but SPUs were much better.
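
Roughly the kind of work those SPU jobs were doing. A minimal, hypothetical sketch of a per-pixel diffuse lighting pass over a G-buffer tile; the struct layout and names are invented for illustration, not anyone's actual engine code:

Code:
#include <stddef.h>

/* Hypothetical G-buffer contents for one tile: a normal and an albedo per pixel. */
typedef struct { float x, y, z; } vec3;

static float dot3(vec3 a, vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

/* Accumulate simple N.L diffuse lighting for every pixel in the tile.
 * On the PS3, a loop like this would be DMA'd into an SPU's local store,
 * run there, and DMA'd back out. */
void shade_tile(const vec3 *normals, const vec3 *albedo, vec3 *out,
                size_t pixel_count, vec3 light_dir, vec3 light_color)
{
    for (size_t i = 0; i < pixel_count; ++i) {
        float ndotl = dot3(normals[i], light_dir);
        if (ndotl < 0.0f) ndotl = 0.0f;              /* clamp back-facing pixels */
        out[i].x = albedo[i].x * light_color.x * ndotl;
        out[i].y = albedo[i].y * light_color.y * ndotl;
        out[i].z = albedo[i].z * light_color.z * ndotl;
    }
}

Small, self-contained tiles like this are exactly the shape of work that streams cleanly through an SPU's local store, which is why the lighting passes ended up there.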
 
I think you might have a fundamental misunderstanding of processor ISAs if that is what you believe it boils down to.
I think you don't know who Jim Keller is and why he dismissed K12 (ARM-based) in favor of Zen (x86-based). There's a reason ARM is restricted to mobile devices mostly.

Not even Apple uses their (spectacular) ARM cores outside of mobile devices.

ps: Servers run Linux for the most part. Windows is a more popular OS in desktop computers.

Because comparing instructions per second between RISC and CISC is misleading.
Not when you have a cross-platform benchmark: http://7-cpu.com/

Are you still going to pretend that Cell PPE was a great CPU back in 2005? Because even a shitty, inefficient Pentium 4 from 2004 could beat it in plenty of benchmarks.

Do you understand the microarchitectural differences between in-order and out-of-order designs? Microarchitectures have nothing to do with instruction set architectures.

Not to mention that the CISC vs RISC distinction is pretty much obsolete these days, since all modern x86 CPUs have RISC cores that process micro-ops. x86 is merely a frontend that translates/decodes x86 opcodes to micro-ops.

Yes you are. You claimed that there's no gaming code that can't run on GPGPU while at the same time using general purpose CPU benchmarks to prove your point.
Now you're putting words in my mouth and that's not very cool of you...

Alright, so is there any gaming-related SIMD algorithm (like physics) that can run faster on Jaguar than on the GPU?
I said gaming-related SIMD code, like physics, destruction and stuff like that, which can be GPGPU-accelerated.

Do you even understand what SIMD is? Have you ever studied linear algebra and more precisely matrix multiplications? The 7-cpu benchmark has nothing to do with flops, if you bothered to read what it says: "The test code doesn't use FPU and SSE. Most of the code is 32-bit integer code."

I think you're confused by the "general purpose" term. GPGPU doesn't mean that GPUs can run everything, otherwise we wouldn't need a CPU. You'll always need a CPU to orchestrate the GPU and execute code with lots of branches.

A CPU is like the conductor in an orchestra, while GPU Compute Units are the equivalent of guys playing the violins. They're really good at it, but they can't really do anything by themselves.
 

PFD

Member
Let me get this right... People are saying that the iPhone 7 will be stronger than consoles like the Wii U, PS3 and Xbox 360?

That machine will be able to handle games like GTA 5 without a problem?

Is that really what I'm reading?

[attached image: PS3_110.bmp.jpg]

Yeah, I've seen iPhone games that look better than the PS3 screenshot you posted.
(Yes, I replaced the PC screenshot you actually posted.)
 

blu

Wants the largest console games publisher to avoid Nintendo's platforms.
Regarding flops, do you realize that Jaguar at 1.6 GHz (half the frequency of Cell) can perform 102.4 Gflops? Cell has 200 Gflops at 3.2 GHz. This says to me that Jaguar has a really efficient design (it reminds me of this tweet: https://twitter.com/marcan42/status/274179630599131136).
So 8 Jaguars @ 1.6 do 100 GFLOPS, and 8 SPUs @ 3.2 do 200 GFLOPS - where exactly is the FLOPS efficiency advantage of the Jaguars?

ps: AMD doesn't have good ARM CPUs. There's a reason they put x86-64 cores in their APUs.
I think you don't know who Jim Keller is and why he dismissed K12 (ARM-based) in favor of Zen (x86-based). There's a reason ARM is restricted to mobile devices mostly.

AMD had to prioritize their product lines and they naturally went after the better short-to-mid-term ROI option. Keller did not 'dismiss' K12 - he designed it. AMD had to decide what to launch first on the market, and they made the right choice.

And since you brought up apple:

http://daringfireball.net/linked/2016/09/14/geekbench-android-a10

UPDATE: The iPhone 7 scores better on both single- and multi-core than most MacBook Airs ever made, and performs comparably to a 2013 MacBook Pro.

UPDATE 2: Here’s another eye-opener. Matt Mariska tweets:

@gruber Grain of salt and all, but Geekbench has the iPhone 7 beating the $6,500 12-core Mac Pro in single-thread.
 
So 8 Jaguars @ 1.6 do 100 GFLOPS, and 8 SPUs @ 3.2 do 200 GFLOPS - where exactly is the FLOPS efficiency advantage of the Jaguars?
Maybe the fact that Jaguar has the same amount of flops per Hz? If it could run @ 3.2 GHz, it would be on par with Cell.

Not to mention power consumption (Jaguar @ 1.6 consumes 30 watts)...

That still doesn't explain why Apple has to buy CPUs from Intel, instead of putting their own ARM-based ones... they made the switch from PPC to x86, so it shouldn't be that hard to go full ARM.
 

Moreche

Member
If the A10 is so powerful, why don't Apple put it in the Apple TV with a nice heatsink and overclock and blow us away with games?
As much as Apple say it isn't, it's obvious that Apple TV is still a by-product to them.
 

blu

Wants the largest console games publisher to avoid Nintendo's platforms.
Maybe the fact that Jaguar has the same amount of flops per Hz? If it could run @ 3.2 GHz, it would be on par with Cell.
Leaving aside the fact Jaguar cannot run at 3.2GHz by design (it'd have a very different pipeline if it could), is CELL some measure for FLOP efficiency these days? I can think of a dozen CPUs that do the same (or higher) FLOPS/clock. CELL surely had lots of FLOPS for its time, but my 4-core desktop has those FLOPS and then some. So your impressive efficiency comment is puzzling. Heck, a 16-core Epiphany does 32GFLOPS at 2W and 65nm - now that's impressive for a non-GPU. Jaguar is a run-of-the-mill mobile-class design. Sure it has the best IPC in the x86 entry-level mobile world, but that alone is not much of an achievement.

Not to mention power consumption (Jaguar @ 1.6 consumes 30 watts)...
That's what lithography advancements can do to your designs. For reference, my Intel IGP does 320GFLOPS @ ~8W @ 22nm fabnode, whereas my 2009 Tesla does 34GFLOPS @ ~20W @ 40nm. Woe is nvidia, I guess. Drawing efficiency conclusions out of thin air, across multiple lithography node generations, taking TDP ratings as linear (how many Watts would an 8-core Jaguar consume at 3.2GHz again?) does not really help make a point. Just saying.

That still doesn't explain why Apple has to buy CPUs from Intel, instead of putting their own ARM-based ones... they made the switch from PPC to x86, so it shouldn't be that hard to go full ARM.
Profit margins would be my uneducated guess.
 
Leaving aside the fact Jaguar cannot run at 3.2GHz by design (it'd have a very different pipeline if it could), is CELL some measure for FLOP efficiency these days? I can think of a dozen CPUs that do the same (or higher) FLOPS/clock. CELL surely had lots of FLOPS for its time, but my 4-core desktop has those FLOPS and then some. So your impressive efficiency comment is puzzling. Heck, a 16-core Epiphany does 32GFLOPS at 2W and 65nm - now that's impressive for a non-GPU. Jaguar is a run-of-the-mill mobile-class design. Sure it has the best IPC in the x86 entry-level mobile world, but that alone is not much of an achievement.
I never said that it would be able to run @ 3.2 GHz. I know that it has a short pipeline.

It's like comparing P3 vs P4. They were not designed with the same goals in mind.

That's what lithography advancements can do to your designs. For reference, my Intel IGP does 320GFLOPS @ ~8W @ 22nm fabnode, whereas my 2009 Tesla does 34GFLOPS @ ~20W @ 40nm. Woe is nvidia, I guess. Drawing efficiency conclusions out of thin air, across multiple lithography node generations, taking TDP ratings as linear (how many Watts would an 8-core Jaguar consume at 3.2GHz again?) does not really help make a point. Just saying.
I never said that consumption scales linearly. Jaguar @ 2 GHz consumes 50 watts (@28nm).

Lithography advancements do not always help that much (e.g. Polaris vs Maxwell/Pascal).

Profit margins would be my uneducated guess.
Apple has a higher profit margin by designing their own CPUs.
 

blu

Wants the largest console games publisher to avoid Nintendo's platforms.
I never said that it would be able to run @ 3.2 GHz. I know that it has a short pipeline.

It's like comparing P3 vs P4. They were not designed with the same goals in mind.
That still leaves me puzzled about your 'impressive efficiency' comment. What exactly is impressive in Jaguar's 4-way SIMD * 2 ops /clock in this day and age?

I never said that consumption scales linearly. Jaguar @ 2 GHz consumes 50 watts (@28nm).
But you quoted a power rating @ a given clock, in the context of Jaguar having the same FLOPS/clock as CELL. And where are you taking those TDPs from? I hope you're not extrapolating from APU numbers, as those never give the CPU ratings.

Lithography advancements do not always help that much (e.g. Polaris vs Maxwell/Pascal).
Lithography advancements always help. Clearly NV have had the architecture TDP edge with Maxwell/Pascal, but AMD cannot blame lithography for that.

Apple has a higher profit margin by designing their own CPUs.
Apple are already designing their own CPUs. But they'd also have to produce them. Whereas Intel are masters of production but they do not quite design stellar tablet CPUs. It'd be curious to know why you think TSMC/Samsung can provide better prices per transistor than Intel.
 
So yeah, the Apple A10 Fusion is an absolute beast. It only ever uses two cores at once, but the fact that the two low-power cores (which are clocked at 1.1 GHz) are faster than my iPhone 5s, which is already competitive with flagship Android phones in single-core performance, blows my mind.

I don't think anything Nvidia can provide Nintendo with, whether it be the Tegra X1 or X2, will be able to match the A10.

If the A10 is so powerful, why don't Apple put it in the Apple TV with a nice heatsink and overclock and blow us away with games?
As much as Apple say it isn't, it's obvious that Apple TV is still a by-product to them.

Supply and demand. Every A10 fabricated could go towards a new iPhone 7 or 7 Plus instead of the lower-margin Apple TV.

It's a shame; I was hoping the new Apple TV would have had, at the time, an A9X on board. But A9Xes were worth much more in the iPad Pros.
 
@blu

It all boils down to the fact that people are so eager to shit on Jaguar, just because "Cell has moar flopz!!!!!!1111ONEONE"
(even though most of them don't really know what flops are, but I guess it's a cool buzzword)

Yeah, it has double the amount of flops, but it also has double the frequency of Jaguar. That's all I was saying.

Honestly, what's so impressive about Cell minus the SPU part? Would Cell be a memorable processor if it had no SPUs at all? I appreciate the legacy it left behind, but that's it.

PPE was mediocre back in 2005 (let alone in 2016) and it should not be praised at all.

Regarding Intel, if what you're saying were true, then all smartphones/tablets would have x86 SoCs. Intel could have outsold ARM a long time ago if they'd wanted to, but they never cared about selling cheaply, even though they always had the lithography advantage.

Apple does not produce chips, they merely design them. nVidia does the same.
 
The iPhone 7 benchmarks are really impressive, but the problem with smartphone games is that they are usually made targeting a min spec that is far less powerful than the current flagship, so we don't often get to see what the latest hardware is capable of outside of tech demos in press conferences.
 

SURGEdude

Member
If the A10 is so powerful, why don't Apple put it in the Apple TV with a nice heatsink and overclock and blow us away with games?
As much as Apple say it isn't, it's obvious that Apple TV is still a by-product to them.

Maybe they just don't want to get into the hardcore gaming side of the market and go against consoles. They seem largely cautious about going in big on markets they don't see immediate returns on. Just speculation on my part though.

Also I don't think they want to put the Apple TV on a yearly upgrade cycle yet. I suspect it will be more like the Apple Watch, with a year and a half or two between updates.
 

MacTag

Banned
The iPhone 7 benchmarks are really impressive, but the problem with smartphone games is that they are usually made targeting a min spec that is far less powerful than the current flagship, so we don't often get to see what the latest hardware is capable of outside of tech demos in press conferences.
Min spec for Super Mario Run seems to be the 5S. It won't run on anything older than an A7, apparently.
 
Min spec for Super Mario Run seems to be the 5S. It won't run on anything older than an A7, apparently.

Yes, but not because the game is too demanding; it's because of iOS constraints.

An iPhone is not a dedicated gaming system, so you can't use the resources like you would on a console.

Do people really think that a runner game needs the power of an iPhone 7 (or even 5S) to run properly? You can't be serious...
 

MacTag

Banned
Chû Totoro;217354797 said:
Yes, but not because the game is too demanding; it's because of iOS constraints.

An iPhone is not a dedicated gaming system, so you can't use the resources like you would on a console.

Do people really think that a runner game needs the power of an iPhone 7 (or even 5S) to run properly? You can't be serious...
Well, I mean SMR looks like it could run on a PSP even but that doesn't mean anything. The 5 and 5C can run the current iOS update but not SMR so I'm not sure that's really the limiting factor here either.
 

LordOfChaos

Member
I think you don't know who Jim Keller is and why he dismissed K12 (ARM-based) in favor of Zen (x86-based). There's a reason ARM is restricted to mobile devices mostly.

Not even Apple uses their (spectacular) ARM cores outside of mobile devices.

ps: Servers run Linux for the most part. Windows is a more popular OS in desktop computers.


Not when you have a cross-platform benchmark: http://7-cpu.com/

Are you still going to pretend that Cell PPE was a great CPU back in 2005? Because even a shitty, inefficient Pentium 4 from 2004 could beat it in plenty of benchmarks.

How was that written for Cell? As I was getting at in earlier pages, code complexity blew up dramatically with it, but it could also do some good pushing once all was said and done.

This routine for breadth-first graph search (the benchmark counts graph edges checked per second)


Code:
#include <stdio.h>
#include <stdlib.h>

#define TRUE 1

/* adjacency-list vertex (layout reconstructed from how the excerpt uses it) */
typedef struct {
  unsigned  length;      /* number of neighbors              */
  unsigned *neighbors;   /* indices of the adjacent vertices */
} vertex_t;

/* the graph */
vertex_t * G;

/* number of vertices in the graph */
unsigned card_V;

/* root vertex (where the visit starts) */
unsigned root;

/* elided in the original excerpt: argument parsing and graph loading */
void parse_input( int argc, char** argv );
void graph_load( void );

int main(int argc, char ** argv)
{
  unsigned *Q, *Q_next, *marked;
  unsigned  Q_size=0, Q_next_size=0;
  unsigned  level = 0;

  parse_input(argc, argv);
  graph_load();

  /* current frontier, next frontier, and visited flags */
  Q      = (unsigned *) calloc(card_V, sizeof(unsigned));
  Q_next = (unsigned *) calloc(card_V, sizeof(unsigned));
  marked = (unsigned *) calloc(card_V, sizeof(unsigned));

  Q[0]         = root;
  Q_size       = 1;
  marked[root] = TRUE;   /* mark the root so it is never re-enqueued */

  while (Q_size != 0)
    {
      /* scanning all vertices in queue Q */
      unsigned Q_index;
      for ( Q_index=0; Q_index<Q_size; Q_index++ )
        {
          const unsigned vertex = Q[Q_index];
          const unsigned length = G[vertex].length;
          /* scanning each neighbor of each vertex */
          unsigned i;
          for ( i=0; i<length; i++ )
            {
              const unsigned neighbor = G[vertex].neighbors[i];
              if ( !marked[neighbor] )
                {
                  /* mark the neighbor */
                  marked[neighbor]      = TRUE;
                  /* enqueue it to Q_next */
                  Q_next[Q_next_size++] = neighbor;
                }
            }
        }
      /* advance one BFS level: the next frontier becomes the current one */
      level++;
      unsigned * swap_tmp;
      swap_tmp    = Q;
      Q           = Q_next;
      Q_next      = swap_tmp;
      Q_size      = Q_next_size;
      Q_next_size = 0;
    }
  return 0;
}


While it's a mere 60 lines on a general-purpose processor (P4, Jaguar, whatever) and an exasperating 1,200 on the Cell, you also go from the Pentium 4 HT at 3.4 GHz checking 24 million edges per second to 538 million edges per second on the Cell. A 22x speedup, with a lot of elbow grease.

Remember when devs figured out how to move anti-aliasing to the Cell? That would be why. Modern GPUs are now into the billions of edges per second, but it was impressive for a 2006 CPU.

Would a P4 have produced better performing games at the start of the gen? Surely. Would it have carried through as much to the end of the gen? Probably not.



You're absolutely right that a general benchmark written for generic processor X will have even a P4 run circles around the Cell, but that ignores everything the Cell was built around: it's a deliberate move of complexity from the hardware (prefetching, branch prediction, caches) onto the programmer, which in turn frees transistors for the stuff that makes it fast.
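
For instance, where a conventional CPU's caches and prefetcher feed the core automatically, an SPU program streams its own data through the 256KB local store by hand. A schematic sketch of the usual double-buffering pattern; dma_get_async(), dma_wait(), process() and LS_CHUNK are hypothetical stand-ins, not the real MFC intrinsics:

Code:
/* Double-buffered SPU-style streaming loop (schematic).
 * Assumes total_bytes is a multiple of LS_CHUNK. */
enum { LS_CHUNK = 4096 };

extern void dma_get_async(void *local, unsigned long long remote,
                          unsigned size, int tag);   /* start an async DMA read        */
extern void dma_wait(int tag);                       /* block until that read is done  */
extern void process(float *data, unsigned count);    /* the actual work                */

void spu_job(unsigned long long src, unsigned total_bytes)
{
    static float buf[2][LS_CHUNK / sizeof(float)];
    unsigned offset = 0;
    int cur = 0;

    dma_get_async(buf[cur], src, LS_CHUNK, cur);          /* prefetch the first chunk   */
    while (offset < total_bytes) {
        int next = cur ^ 1;
        if (offset + LS_CHUNK < total_bytes)              /* kick off the next transfer */
            dma_get_async(buf[next], src + offset + LS_CHUNK, LS_CHUNK, next);
        dma_wait(cur);                                    /* wait for the current one   */
        process(buf[cur], LS_CHUNK / sizeof(float));      /* compute while next loads   */
        offset += LS_CHUNK;
        cur = next;
    }
}

All of that book-keeping is stuff a cache hierarchy does for you on a conventional core; on the SPEs it's the programmer's problem, which is where the "1,200 lines instead of 60" comes from.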

I don't agree with preferring it over the Jaguar, but I quite disagree with looking at a generic 7Zip benchmark as a way of seeing its potential. All it says is that it's not great at unmodified general purpose code.


Nor, by the way, will I agree that ARM is bad for large processors. Developing large cores is an ordeal that can cost into the billions, so the market has to show strong demand for ARM cores first. The ISA isn't the limit to scaling up; in fact, its sane front end is a saving grace.

Honestly, what's so impressive about Cell minus the SPU part? Would Cell be a memorable processor if it had no SPUs at all? I appreciate the legacy it left behind, but that's it.

Would Cell be notable without its primary feature of note? Er, no, I guess?
 

LordOfChaos

Member
If the A10 is so powerful, why don't Apple put it in the Apple TV with a nice heatsink and overclock and blow us away with games?
As much as Apple say it isn't, it's obvious that Apple TV is still a by-product to them.

How I wish, especially as they dropped the requirement for games to support the touchpad/wand controller. I guess they want to keep the iPad Pro and iPhone 7 positioned as the powerful ones, though; Apple has been going for chip appeal a lot of late, I find.

If they did start putting the A9X in there with more storage, and updated it to the newest A*X every year, the consoles might not need to be afraid, but they'd certainly be looking over their shoulders... A yearly A*X update cadence could rapidly bridge gaps, though then again the 3-year console update cadence may be a response to devices like it, meant to keep them ahead.
 
Well, I mean SMR looks like it could run on a PSP even but that doesn't mean anything. The 5 and 5C can run the current iOS update but not SMR so I'm not sure that's really the limiting factor here either.

I don't know for sure either, but from what I saw the game doesn't seem that demanding... maybe high-res assets are something to take into consideration, but that's about it.

But maybe the 5S is the minimum because of this (iOS Metal feature sets).
Which GPU family is it?

I'm not a mobile game dev nor a console game dev though.
 
Chû Totoro;217358109 said:
I don't know for sure either, but from what I saw the game doesn't seem that demanding... maybe high-res assets are something to take into consideration, but that's about it.

But maybe the 5S is the minimum because of this (iOS Metal feature sets).
Which GPU family is it?

I'm not a mobile game dev nor a console game dev though.

The iPhone 5s was the first consumer device to ship with a PowerVR Series 6 GPU, if I remember correctly. The iPhones 4s-5 had Series 5 GPUs, with a similar architecture to the PS Vita's GPU.

As for Super Mario Run, 5s is probably the minimum to ensure 60fps gameplay. Given it's multiplatform I'm not expecting any low level programming using Metal etc., and the visuals are pretty basic too. Miitomo was built using Cocos3D, I expect something similar with Super Mario Run.
 
How was that written for Cell? As I was getting at in earlier pages, code complexity blew up dramatically with it, but it could also do some good pushing once all was said and done.

This routine for breadth-first graph search

Code:
*code snip*

While it's a mere 60 lines on a general-purpose processor (P4, Jaguar, whatever) and an exasperating 1,200 on the Cell, the Pentium 4 HT at 3.4 GHz checks 24 million edges per second, versus 538 million on the Cell.

You're absolutely right that a general benchmark written for generic processor X will have even a P4 run circles around the Cell, but that ignores everything the Cell was built around: it's a deliberate move of complexity from the hardware (prefetching, branch prediction, caches) onto the programmer, which in turn frees transistors for the stuff that makes it fast.

I don't agree with preferring it over the Jaguar, but I quite disagree with looking at a generic 7Zip benchmark as a way of seeing its potential. All it says is that it's not great at unmodified general purpose code.

Nor, by the way, will I agree that ARM is bad for large processors. Developing large cores is an ordeal that can cost into the billions, so the market has to show strong demand for ARM cores first. The ISA isn't the limit to scaling up; in fact, its sane front end is a saving grace.
The majority of game code is written in high-level languages (C/C++), just like the 7z benchmark. It's mostly vectorized code that is written in hand-tuned assembly.

And yes, I know that the Cell could beat P4 flops-wise. It was made to be an SIMD beast at the expense of PPE's IPC.
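
For reference, "vectorized" here means something like the following: the same loop written scalar and then four floats at a time with plain SSE intrinsics. A generic sketch, nothing console-specific:

Code:
#include <xmmintrin.h>   /* SSE intrinsics */

/* out[i] = a[i] * b[i] + c[i], one element per iteration */
void madd_scalar(const float *a, const float *b, const float *c,
                 float *out, int n)
{
    for (int i = 0; i < n; ++i)
        out[i] = a[i] * b[i] + c[i];
}

/* Same thing, four floats per iteration; assumes n is a multiple of 4 */
void madd_sse(const float *a, const float *b, const float *c,
              float *out, int n)
{
    for (int i = 0; i < n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        __m128 vc = _mm_loadu_ps(c + i);
        _mm_storeu_ps(out + i, _mm_add_ps(_mm_mul_ps(va, vb), vc));
    }
}

Whether that gets hand-written in assembly, done with intrinsics like the above, or left to the compiler is an implementation detail; the point is that only this kind of inner loop tends to get that treatment, while the bulk of game code stays ordinary C/C++.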

~

People have to understand that this trend of moving CPU tasks to the graphics processor is not something new... it has been there for almost 20 years. So, unless you're really young, you should not be surprised by this trend.

Before 3D accelerators (3DFX Voodoo etc.) became the norm, it was the CPU (more precisely the FPU) that had to render 3D graphics. There was even an era (386SX) when CPUs didn't even have a built-in FPU (it used to be a separate chip).

When GeForce 256 came out (the first "GPU"), it moved T&L from the CPU to the GPU with the advent of the DX7 API. People were sceptical of that, since 3DFX reigned supreme back then and it didn't support T&L. People are now sceptical because only AMD offers proper Async Compute support. Déjà vu, right?

All I'm trying to say is I don't understand why some people think that the CPU must do everything. This isn't consistent with the historical trend I'm describing here.

So, why are people so fussed about physics code being processed by the GPU in this generation of consoles instead of the CPU in the previous generation?

~

Regarding ARM, what's the benefit of developing a large core? ARM became popular because of mobile devices that need small, less power-hungry cores. Contrary to popular belief, even the Jaguar is not a good fit for tablets (5-10W devices), let alone smartphones (2-3W).

ARM is adopting features (like OoO) that x86 had a long time ago, and this increases the transistor count. Increasing the complexity always comes with a price.

The CISC vs RISC battle is a moot point these days. Why would the market show a strong preference for ARM cores in the desktop/laptop/server segment? x86-64 cores are already a perfect fit for the job.
 

LordOfChaos

Member

You posted AMD putting ARM on the backburner as proof that ARM isn't good for high-end cores; I said high-end cores cost a lot to make, and the market has to show strong demand or else few will risk it. That's all. You're going somewhere else with what I said that is rather complicated to get into, but I'd point to ARM's simpler front end as a boon for developing high-end cores, while x86 takes increasingly Intel-ian R&D to push forward. Sure, ARM would need more complex, longer-pipelined cores that eat into some of that advantage, but the ISA is no barrier to scaling up; that's my main point.

Similarly, I'm not sure what I said that launched you into the GPU talk. I'm fully aware of it, I've programmed for it, and my post itself supported the fact that GPUs are now better at what the Cell, for a CPU in the year it launched, was good at. Which was a saving grace for the lackluster RSX, in the end.

It was made to be an SIMD beast at the expense of PPE's IPC.

Yes, it was SPE-heavy. You seem keen to ignore its defining feature. The PPE was nothing special, sure. The idea was to move to the SPEs anything that took longer to execute than the overhead of transferring it. The SPEs were in fact also capable of running C/C++ code, but a copy-paste job obviously wouldn't get everything they could do out of them.

From the Metro dev,
Oles Shishkovstov: It's difficult to compare such different architectures. SPUs are crazy fast running even ordinary C++ code, but they stall heavily on DMAs if you don't try hard to hide that latency.
Oles Shishkovstov: No, it was not that difficult. The PS3 is simple in one aspect. If some code slows down the frame, move it to SPUs and forget about it. We've built a simple and beautiful system based on virtual threads (fibres) running on two real hardware threads. The beauty comes from the fact that we can synchronously (from the look of the code) offload any task to SPU and synchronously wait for results to continue.

The actual execution, when you look at the full machine, is fully asynchronous. The direct and indirect overhead of that offloading is less than 15 microseconds (as seen from PPU), so every piece of code that takes more than that to execute can be offloaded. All we were doing was profiling some real scene, finding the reason for the slowdown, then moving it to the SPUs. In the shipping product there are almost a hundred different SPU tasks, executing about 1000 times per frame, resulting at up to 60 per cent total load on the SPUs.

As for the RSX, the only thing we offload from it to the SPUs was post-process anti-aliasing. We were out of main memory to offload something else.
 

JoduanER2

Member
Considerably more capable, but as is always the problem, the issue comes back to budgets and price points. iPhone 7 could have a game that looks better than Breath of the Wild, but you won't see it because no one would spend the kind of money to make that game, nor could they get away with charging an appropriate price for it.

Mobile is a very, very different market than console, and Super Mario Run makes a hell of a lot more sense than making a 3D platformer for it.

Can you imagine having a pocket "console-looking" Nintendo Zelda or Mario Galaxy?! The big problem is the controller (touch controls <<<<<< buttons and sticks).
 

LordOfChaos

Member
Can you imagine having a pocket "console-looking" Nintendo Zelda or Mario Galaxy?! The big problem is the controller (touch controls <<<<<< buttons and sticks).

This feels great, and is reminiscent of what the NX concept appears to be, though the issue is still how many people can be assumed to have a physical controller like it, so most games are hampered by having to design around virtual controls for control depth.



Also, app install sizes are still a limit, I think? And prices. No one is making AAA games for $6.99 the way they would if you moved the decimal point over.
 

Moreche

Member
Maybe they just don't want to get into the hardcore gaming side of the market and go against consoles. They seem largely cautious about going in big on markets they don't see immediate returns on. Just speculation on my part though.

Also I don't think they want to put the Apple TV on a yearly upgrade cycle yet. I suspect it will be more like the Apple Watch, with a year and a half or two between updates.
Personally if Apple TV could run all of the indie games that are on my PS4, I would question keeping my PS4 as it is mostly indie games I'm playing. iCloud saving when playing between phone, iPad and Apple TV would be fantastic.
 

Servbot24

Banned
If the A10 is so powerful, why don't Apple put it in the Apple TV with a nice heatsink and overclock and blow us away with games?
As much as Apple say it isn't, it's obvious that Apple TV is still a by-product to them.

High-fidelity games aren't very important. Around 450 million PlayStations sold over 22 years. Over 1 billion iPhones sold over 9 years. And the iPhones are sold at a way higher profit margin.
 

J-Skee

Member
Maybe they just don't want to get into the hardcore gaming side of the market and go against consoles. They seem largely cautious about going in big on markets they don't see immediate returns on. Just speculation on my part though.

Also I don't think they want to put the Apple TV on a yearly upgrade cycle yet. I suspect it will be more like the Apple Watch, with a year and a half or two between updates.

The new Apple TV was damn near marketed as a direct competitor to dedicated consoles. It fell flat though.
 