LordOfChaos
Member
So is the OP correct?
Will the iPhone 7 be the most powerful hardware system Nintendo has officially released a game on?
PC, if we're going with the same allowance of "not exclusively for the top end hardware"
> PC, if we're going with the same allowance of "not exclusively for the top end hardware"
What games did Nintendo ever release on PC? Some old Japanese PC? An iPhone 7 is probably faster than most old PCs unless they're only a few years old.
Mario is Missing, and yeah, an iPhone 7 is way more powerful than PCs from when that came out, but Mario Run also runs on much older iPhones, so the same rule applies both ways.
> Mario is Missing
The oldest iPhone that can run Super Mario Run will almost certainly still be faster than any consumer-friendly PC from 1992, the year Mario is Missing came out.
Oh come on, that doesn't count. The MS-DOS version was published by Mindscape, not Nintendo, and I seriously doubt it would run natively today without DOSBox, meaning it's emulated. And if we're going to bundle emulation into the equation, then...
Interesting.
Pixel (not vertex) shader processing was not something that Cell excelled at.
I agree with everything else you said though.
Software rasterization is inefficient for pixel shaders, so they definitely needed dedicated hardware for that, unless people were OK with PS2-era graphics (no programmable pixel shaders at all, minus a few exceptions).
> I understand the difference between RISC and CISC. For pure computation the Cell remains faster, which even given the architectural differences is sad for a machine that released 8 years later.
CISC vs RISC? LOL, where did that come from?!
> You are contradicting yourself. If you think only highly parallelizable code matters in gaming, then the Jaguar is a downgrade. In fact, if only parallelizable code mattered, they could have used an ARM CPU and focused solely on GPGPU instead, like Nvidia is doing with Tegra.
I'm not contradicting anything. For number crunching, we have GPUs these days. Cell is obsolete, but its legacy still lives on in every AMD APU.
> Quite a few games used deferred shading on the SPUs, especially later in the console generation, as the SPUs were better than Xenos at deferred lighting passes.
AFAIK, the bulk of pixel shader operations were performed by the RSX.
> AMD does have a higher end ARM based SoC, for servers. The reason they put AMD64 based processors in their APUs is so they can run Windows, and because they are one of two current companies with the rights to manufacture them.
http://www.fudzilla.com/news/processors/39179-jim-keller-was-not-a-big-fan-of-k12
x86-64 = better performance
ARM = more suitable for mobile devices
AFAIK, the bulk of pixel shader operations were performed by the RSX.
It was vertex shaders that the RSX was really weak at and it needed some Cell "assistance".
> I think you might have a fundamental misunderstanding of processor ISAs if that is what you believe it boils down to.
I think you don't know who Jim Keller is and why he dismissed K12 (ARM-based) in favor of Zen (x86-based). There's a reason ARM is mostly restricted to mobile devices.
> Because comparing instructions per second between RISC and CISC is misleading.
Not when you have a cross-platform benchmark: http://7-cpu.com/
> Yes you are. You claimed that there's no gaming code that can't run on GPGPU while at the same time using general-purpose CPU benchmarks to prove your point.
Now you're putting words in my mouth, and that's not very cool of you...
> Alright, so is there any gaming-related SIMD algorithm (like physics) that can run faster on Jaguar than on the GPU?
I said gaming-related SIMD code, like physics, destruction and stuff like that, that can be GPGPU accelerated.
Let me get this right... People are telling me that the iPhone 7 will be stronger than consoles like the Wii U, PS3 and Xbox 360?
That machine will be able to handle games like GTA 5 without a problem?
Is that really what I'm reading?
> Regarding flops, do you realize that Jaguar at 1.6 GHz (half the frequency of Cell) can perform 102.4 Gflops? Cell has 200 Gflops at 3.2 GHz. This says to me that Jaguar has a really efficient design (it reminds me of this tweet: https://twitter.com/marcan42/status/274179630599131136).
So 8 Jaguars @ 1.6 do 100 GFLOPS, and 8 SPUs @ 3.2 do 200 GFLOPS - where exactly is the FLOPS efficiency advantage of the Jaguars?
ps: AMD doesn't have good ARM CPUs. There's a reason they put x86-64 cores in their APUs.
UPDATE: The iPhone 7 scores better on both single- and multi-core than most MacBook Airs ever made, and performs comparably to a 2013 MacBook Pro.
UPDATE 2: Here’s another eye-opener. Matt Mariska tweets:
@gruber Grain of salt and all, but Geekbench has the iPhone 7 beating the $6,500 12-core Mac Pro in single-thread.
> So 8 Jaguars @ 1.6 do 100 GFLOPS, and 8 SPUs @ 3.2 do 200 GFLOPS - where exactly is the FLOPS efficiency advantage of the Jaguars?
Maybe the fact that Jaguar has the same amount of flops per Hz? If it could run @ 3.2 GHz, it would be on par with Cell.
That still doesn't explain why Apple has to buy CPUs from Intel, instead of putting their own ARM-based ones... they made the switch from PPC to x86, so it shouldn't be that hard to go full ARM.
> Maybe the fact that Jaguar has the same amount of flops per Hz? If it could run @ 3.2 GHz, it would be on par with Cell.
Leaving aside the fact Jaguar cannot run at 3.2 GHz by design (it'd have a very different pipeline if it could), is Cell some measure for FLOP efficiency these days? I can think of a dozen CPUs that do the same (or higher) FLOPS/clock. Cell surely had lots of FLOPS for its time, but my 4-core desktop has those FLOPS and then some. So your impressive-efficiency comment is puzzling. Heck, a 16-core Epiphany does 32 GFLOPS at 2 W on 65 nm - now that's impressive for a non-GPU. Jaguar is a run-of-the-mill mobile-class design. Sure, it has the best IPC in the x86 entry-level mobile world, but that alone is not much of an achievement.
> Not to mention power consumption (Jaguar @ 1.6 consumes 30 watts)...
That's what lithography advancements can do to your designs. For reference, my Intel IGP does 320 GFLOPS @ ~8 W on a 22 nm fab node, whereas my 2009 Tesla does 34 GFLOPS @ ~20 W @ 40 nm. Woe is Nvidia, I guess. Drawing efficiency conclusions out of thin air, across multiple lithography node generations, taking TDP ratings as linear (how many watts would an 8-core Jaguar consume at 3.2 GHz again?) does not really help make a point. Just saying.
> That still doesn't explain why Apple has to buy CPUs from Intel, instead of putting their own ARM-based ones... they made the switch from PPC to x86, so it shouldn't be that hard to go full ARM.
Profit margins would be my uneducated guess.
> Leaving aside the fact Jaguar cannot run at 3.2GHz by design (it'd have a very different pipeline if it could), is CELL some measure for FLOP efficiency these days? [...]
I never said that it would be able to run @ 3.2 GHz. I know that it has a short pipeline.
> That's what lithography advancements can do to your designs. [...] Taking TDP ratings for linear (how many Watts would an 8-core Jaguar consume at 3.2GHz again?) does not really help make a point.
I never said that consumption scales linearly. Jaguar @ 2 GHz consumes 50 watts (@ 28 nm).
> Profit margins would be my uneducated guess.
Apple has a higher profit margin by designing their own CPUs.
> I never said that it would be able to run @ 3.2 GHz. I know that it has a short pipeline.
That still leaves me puzzled about your "impressive efficiency" comment. What exactly is impressive about Jaguar's 4-way SIMD × 2 ops/clock in this day and age?
It's like comparing P3 vs P4. They were not designed with the same goals in mind.
> I never said that consumption scales linearly. Jaguar @ 2 GHz consumes 50 watts (@28nm).
But you quoted a power rating @ a given clock, in the context of Jaguar having the same FLOPS/clock as Cell. And where are you taking those TDPs from? I hope you're not extrapolating from APU numbers, as those never give the CPU ratings.
> Lithography advancements do not always help that much (e.g. Polaris vs Maxwell/Pascal).
Lithography advancements always help. Clearly NV have had the architecture TDP edge with Maxwell/Pascal, but AMD cannot blame lithography for that.
> Apple has a higher profit margin by designing their own CPUs.
Apple are already designing their own CPUs, but they'd also have to produce them. And whereas Intel are masters of production, they do not quite design stellar tablet CPUs. I'd be curious to know why you think TSMC/Samsung can provide better prices per transistor than Intel.
Emulators on PC exist.
If the A10 is so powerful, why don't Apple put it in the Apple TV with a nice heatsink and overclock and blow us away with games?
As much as Apple say it isn't, it's obvious that Apple TV is still a by product to them.
> Game companies make more money on iPhone than on Wii U. Is the Wii U even a gaming system?
I listen to more music inside my car than anywhere else.
The A10 is a real killer in single-threaded tasks... that's a no-brainer.
> The iPhone 7 benchmarks are really impressive, but the problem with smartphone games is that they are usually made targeting a min spec that is far less powerful than the current flagship, so we don't often get to see what the latest hardware is capable of outside of tech demos in press conferences.
Min spec for Super Mario Run seems to be the 5S. It won't run on anything older than an A7, it seems.
> Chû Totoro said: Yes, but not because the game is too demanding; because of iOS constraints. An iPhone is not a dedicated gaming system, so you can't use the resources like you would on a console. Do people really think that a runner game needs the power of an iPhone 7 (or even a 5S) to run properly? You can't be serious...
Well, I mean, SMR looks like it could run on a PSP even, but that doesn't mean anything. The 5 and 5C can run the current iOS update but not SMR, so I'm not sure that's really the limiting factor here either.
I think you don't know who Jim Keller is and why he dismissed K12 (ARM-based) in favor of Zen (x86-based). There's a reason ARM is restricted to mobile devices mostly.
Not even Apple uses their (spectacular) ARM cores outside of mobile devices.
ps: Servers run Linux for the most part. Windows is a more popular OS in desktop computers.
Not when you have a cross-platform benchmark: http://7-cpu.com/
Are you still going to pretend that Cell PPE was a great CPU back in 2005? Because even a shitty, inefficient Pentium 4 from 2004 could beat it in plenty of benchmarks.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define TRUE 1

/* ... */

/* adjacency-list representation of one vertex */
typedef struct {
    unsigned length;      /* number of neighbors */
    unsigned *neighbors;  /* neighbor indices */
} vertex_t;

/* the graph */
vertex_t *G;
/* number of vertices in the graph */
unsigned card_V;
/* root vertex (where the visit starts) */
unsigned root;

void parse_input(int argc, char **argv);
void graph_load(void);
int main(int argc, char **argv)
{
    unsigned *Q, *Q_next, *marked;
    unsigned Q_size = 0, Q_next_size = 0;
    unsigned level = 0;

    parse_input(argc, argv);
    graph_load();

    Q      = (unsigned *) calloc(card_V, sizeof(unsigned));
    Q_next = (unsigned *) calloc(card_V, sizeof(unsigned));
    marked = (unsigned *) calloc(card_V, sizeof(unsigned));

    Q[0] = root;
    Q_size = 1;
    marked[root] = TRUE;  /* mark the root so it is never re-enqueued */

    while (Q_size != 0)
    {
        /* scanning all vertices in queue Q */
        unsigned Q_index;
        for (Q_index = 0; Q_index < Q_size; Q_index++)
        {
            const unsigned vertex = Q[Q_index];
            const unsigned length = G[vertex].length;
            /* scanning each neighbor of each vertex */
            unsigned i;
            for (i = 0; i < length; i++)
            {
                const unsigned neighbor = G[vertex].neighbors[i];
                if (!marked[neighbor]) {
                    /* mark the neighbor */
                    marked[neighbor] = TRUE;
                    /* enqueue it to Q_next */
                    Q_next[Q_next_size++] = neighbor;
                }
            }
        }
        /* advance to the next BFS level: swap the queues */
        level++;
        unsigned *swap_tmp = Q;
        Q = Q_next;
        Q_next = swap_tmp;
        Q_size = Q_next_size;
        Q_next_size = 0;
    }

    free(Q);
    free(Q_next);
    free(marked);
    return 0;
}
Honestly, what's so impressive about Cell minus the SPU part? Would Cell be a memorable processor if it had no SPUs at all? I appreciate the legacy it left behind, but that's it.
Chû Totoro said: I don't really know either, but from what I saw the game doesn't seem that demanding... maybe high-res assets are something to take into consideration, but that's about it.
But maybe the 5S is the minimum because of this (iOS Metal feature sets).
Which GPU family is it?
I'm not a mobile game dev nor a console game dev, though.
> How was that written for Cell? As I was getting at in earlier pages, code complexity blew up dramatically with it, but it could also do some good pushing once all was said and done.
The majority of game code is written in high-level languages (C/C++), just like the 7z benchmark. It's mostly the vectorized code that is written in hand-tuned assembly.
This routine for breadth-first graph traversal:
Code: *code snip*
While a mere 60 lines on a general processor (P4, Jaguar, whatever) and an exacerbating 1,200 on the Cell, you also go from 24 million edges checked per second on a Pentium 4 HT running at 3.4 GHz to 538 million edges per second on the Cell.
You're absolutely right that a general benchmark written for a generic processor X will have even a P4 run circles around the Cell, but that ignores everything the Cell was built with in mind: it was a deliberate move of complexity away from hardware (prefetching, branch prediction, caches) and onto the programmer, which in turn frees up transistors for the stuff that makes it fast.
I don't agree with preferring it over the Jaguar, but I do disagree with using a generic 7-Zip benchmark as a way of gauging its potential. All it says is that it's not great at unmodified general-purpose code.
Nor, by the way, will I agree that ARM is bad for large processors. Developing large cores is an ordeal that can cost billions, so the market has to show strong demand for big ARM cores first. The ISA isn't the limit to scaling up; in fact, its sane front end is a saving grace.
It was made to be a SIMD beast at the expense of the PPE's IPC.
Oles Shishkovstov: It's difficult to compare such different architectures. SPUs are crazy fast running even ordinary C++ code, but they stall heavily on DMAs if you don't try hard to hide that latency.
Oles Shishkovstov: No, it was not that difficult. The PS3 is simple in one aspect. If some code slows down the frame, move it to SPUs and forget about it. We've built a simple and beautiful system based on virtual threads (fibres) running on two real hardware threads. The beauty comes from the fact that we can synchronously (from the look of the code) offload any task to SPU and synchronously wait for results to continue.
The actual execution, when you look at the full machine, is fully asynchronous. The direct and indirect overhead of that offloading is less than 15 microseconds (as seen from PPU), so every piece of code that takes more than that to execute can be offloaded. All we were doing was profiling some real scene, finding the reason for the slowdown, then moving it to the SPUs. In the shipping product there are almost a hundred different SPU tasks, executing about 1000 times per frame, resulting at up to 60 per cent total load on the SPUs.
As for the RSX, the only thing we offload from it to the SPUs was post-process anti-aliasing. We were out of main memory to offload something else.
Considerably more capable, but as is always the problem, the issue comes back to budgets and price points. iPhone 7 could have a game that looks better than Breath of the Wild, but you won't see it because no one would spend the kind of money to make that game, nor could they get away with charging an appropriate price for it.
Mobile is a very, very different market than console, and Super Mario Run makes a hell of a lot more sense than making a 3D platformer for it.
Can you imagine having a pocket, console-looking Nintendo Zelda or Mario Galaxy?! The big problem is the controller (touch controls <<<<<< buttons and sticks).
> Maybe they just don't want to get into the hardcore gaming side of the market and go against consoles. They seem largely cautious about going in big on markets they don't see immediate returns on. Just speculation on my part though.
> Also I don't think they want to put the Apple TV on a yearly upgrade cycle yet. I suspect it will be more like the Apple Watch, with a year and a half or two between updates.
Personally, if the Apple TV could run all of the indie games that are on my PS4, I would question keeping my PS4, as it's mostly indie games I'm playing. iCloud saving when playing between phone, iPad and Apple TV would be fantastic.