
The iPhone 7 will be the most powerful gaming system any Nintendo game has run on

Peltz

Member
The "larger numbers = better" game is annoying; you can see it with the complaints that the NX's screen will "only" be 720p, which is apparently terrible, despite having a dot pitch that'll make the pixels indistinguishable to the eye at a reasonable viewing distance. Its ppi is slightly higher than the Vita's, and its screen is larger and higher resolution, so that extra resolution will either be used to make things bigger, or to show even more than developers could on the Vita.

It's the same story with phones. 1440p sounds nice in theory, but 1) the phone ends up rendering more pixels that are indistinguishable to the eye, and 2) OLED screens utilise a PenTile subpixel structure, which gives an effective resolution of around 2/3-3/4 of the actual pixel count because every third subpixel is shared between neighbouring pixels (this leads to a "brick"-like arrangement when using devices like the HTC Vive and Rift, which is absent on PS VR despite PS VR having a lower-resolution panel).

But yeah, the metric seems to be how big the number is instead of how it works in practice. Sure, the iPhone 7 might "only" be 750p, but 1) the pixel count is fine for the majority of viewing instances, 2) the device renders a sensible resolution for performance and battery life, and 3) it's the first display shipping on a phone to accurately support both the sRGB and wider DCI-P3 colour gamuts, and it automatically switches between them depending on the content being shown to ensure sRGB content isn't oversaturated. But of course that's not something worth applauding or buying the device for, because it's not a bigger number like 1440p.
Fucking thank you.

Too many people are jacked up about the numbers, especially regarding resolution. Resolution is only one piece of the whole puzzle when evaluating hardware... particularly on handhelds more than anything else.
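For anyone who wants to sanity-check the pixel-density claim, here's a quick back-of-the-envelope calculation in C. The 6.2" 1280x720 figure for the NX screen is only an assumption based on the rumours (the post above doesn't give a size), and the 2/3 factor for PenTile is the rough approximation mentioned above, so treat the output as illustrative rather than exact.

Code:
#include <stdio.h>
#include <math.h>

/* Diagonal pixels per inch for a given resolution and panel size. */
static double ppi(double w, double h, double diag_inches)
{
    return sqrt(w * w + h * h) / diag_inches;
}

int main(void)
{
    /* Vita: 5.0" 960x544 OLED. */
    double vita = ppi(960.0, 544.0, 5.0);
    /* NX screen: 6.2" 1280x720 -- rumoured spec, not confirmed. */
    double nx = ppi(1280.0, 720.0, 6.2);

    printf("Vita: %.0f ppi\n", vita);  /* ~221 ppi */
    printf("NX:   %.0f ppi\n", nx);    /* ~237 ppi */

    /* PenTile OLED at 1440p, using the ~2/3 effective-resolution
       approximation from the post above. */
    double qhd = 2560.0 * 1440.0;
    printf("1440p PenTile: ~%.1f of %.1f million effective pixels\n",
           qhd * (2.0 / 3.0) / 1e6, qhd / 1e6);
    return 0;
}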
 

BDGAME

Member
Let me get this right... People are saying that the iPhone 7 will be stronger than consoles like the Wii U, PS3 and Xbox 360?

That machine will be able to handle games like GTA 5 without a problem?

Is that really what I'm reading?

17258149236_8b656d78a9_o.png
 

CronoShot

Member
Let me get this right... People are saying that the iPhone 7 will be stronger than consoles like the Wii U, PS3 and Xbox 360?

That machine will be able to handle games like GTA 5 without a problem?

Is that really what I'm reading?

17258149236_8b656d78a9_o.png
In theory, yes.

In practice, you have to think of things like heat, battery life, and the fact that no one is going to build a game that only works on the newest iPhone.

Also, why did this turn into an iOS vs Android thread.
 

LordOfChaos

Member
Let me get this right... People are saying that the iPhone 7 will be stronger than consoles like the Wii U, PS3 and Xbox 360?

That machine will be able to handle games like GTA 5 without a problem?

Is that really what I'm reading?

17258149236_8b656d78a9_o.png


I think all the OP was going for was a very literal statement: in theory, yes, the iPhone's CPU and GPU are far above anything Nintendo has shipped.

In practice, phone games don't receive all the development cost of games you can charge 60 dollars for. The best ones are often older games from other platforms.
 
I think all the OP was going for was a very literal statement: in theory, yes, the iPhone's CPU and GPU are far above anything Nintendo has shipped.

In practice, phone games don't receive all the development cost of games you can charge 60 dollars for. The best ones are often older games from other platforms.

The OP states that the "iPhone 7 will be the most powerful gaming system any Nintendo game has run on" which, even if we take it to mean "any Nintendo game officially targeting said system", is factually incorrect because Mario is Missing is a PC game, and there are PCs more powerful than the iPhone 7.

Honestly this is the same type of situation as releasing a PC game, since this will run on older-model iPhones (and Android phones) too, with a wide range of power levels, so I don't see why we shouldn't count Mario is Missing as a Nintendo game targeting hardware much more powerful than an iPhone 7.
 
I wonder if the topic creator was reading Gruber before making this thread...

John Gruber said:
Also, consider this: the iPhone 7 (and probably 6S too) will be the most powerful computing hardware any Nintendo game has ever run on.

---

In theory, yes.

In practice, you have to think of things like heat, battery life, and the fact that no one is going to build a game that only works on the newest iPhone.

Also, why did this turn into an iOS vs Android thread.

Heat and battery life aren't really the biggest problems, in my eyes. An iPad Pro, for example, drains 19% of its battery for every hour of play in an intensive 3D game, which works out to about 5 hours under sustained load, but most iOS games don't max out the hardware and drain the battery at the same rate as other tasks like browsing over 4G.

But we've had that discussion before. The bigger issue, to me, is one of storage. While Apple has lifted the old 3GB app size limit, developers aren't going to be releasing 20GB games when the majority of owners have 16 or 64GB devices.

Which is actually the same problem Vita faces right now. 4GB game cards are pricey to buy from Sony, or they just aren't big enough for ports of PS4 games. World of Final Fantasy requires a separate download onto the user's memory card for voiceovers since they didn't fit on the game card, to name one example.

Luckily Nintendo has realised this and seems to have partnered with Macronix to deliver 32GB game cards as the recommended standard for the format.
 

LordOfChaos

Member
The OP states that the "iPhone 7 will be the most powerful gaming system any Nintendo game has run on" which, even if we take it to mean "any Nintendo game officially targeting said system", is factually incorrect because Mario is Missing is a PC game, and there are PCs more powerful than the iPhone 7.

Honestly this is the same type of situation as releasing a PC game, since this will run on older-model iPhones (and Android phones) too, with a wide range of power levels, so I don't see why we shouldn't count Mario is Missing as a Nintendo game targeting hardware much more powerful than an iPhone 7.


Ah, well the PC game does invalidate it then.
 

Genio88

Member
Really? My Xiaomi Mi5 with Snapdragon 820 is almost as fast as iPhone 7, can't wait to play Zelda Breath of the Wild and Mario Maker on it!
 

BigEmil

Junior Member
Let me get this right... People are saying that the iPhone 7 will be stronger than consoles like the Wii U, PS3 and Xbox 360?

That machine will be able to handle games like GTA 5 without a problem?

Is that really what I'm reading?

17258149236_8b656d78a9_o.png
Yes, the last-gen PS3/360 version of GTA V. Not the maxed-out-settings PC version in your screenshot.
 
Not exactly.

Mario is Missing was released in '92 for DOS.

You'd have to emulate it through DOSBox to play it on a modern PC, same as you'd have to emulate a SNES or N64 game.

As others have said you can install DOS on modern machines, even if it doesn't run all that well. It's semantics, but the OP is still factually incorrect no matter how you spin it.

Either way this whole thing is an argument of semantics, as you cannot play Wii U level games on an iPhone for any appreciable amount of time, so this all depends on how you define powerful.
 

ethomaz

Banned
Let me get this right... People are saying that the iPhone 7 will be stronger than consoles like the Wii U, PS3 and Xbox 360?

That machine will be able to handle games like GTA 5 without a problem?

Is that really what I'm reading?

17258149236_8b656d78a9_o.png
GTA V doesn't look anywhere near your pic, not even on PS4/XB1.

And yes... the iPhone 7 can run GTA V like it was on PS3/360, easily.

GTAPS32_zpsb42e8a9d1_zpsc6f1df0d.jpg
 

Killyoh

Member
GTA V doesn't look anywhere near your pic, not even on PS4/XB1.

And yes... the iPhone 7 can run GTA V like it was on PS3/360, easily.

GTAPS32_zpsb42e8a9d1_zpsc6f1df0d.jpg
I'd not be surprised if Rockstar release GTA IV on iOS and Android, after all even San Andreas is out.

But running GTA V? I'd be amazed.
 

Interfectum

Member
Let me get this right... People are saying that the iPhone 7 will be stronger than consoles like the Wii U, PS3 and Xbox 360?

That machine will be able to handle games like GTA 5 without a problem?

Is that really what I'm reading?

Did you just post a picture of the PC version of GTA5?

Is that really what I'm seeing?
 
I mean technically isn't this the case whenever a new system that runs Nintendo games comes out? The Wii was the most powerful system any Nintendo game has run on when it came out. Just seems like a case of semantics. And regardless it's not like any game is really going to take advantage of all the extra power on a phone.

The GBA was not more powerful than the N64 or SNES. The DS was not more powerful than a Gamecube. This is a handheld device that's more powerful than their latest console, which makes it more than a case of semantics.
 
The GBA was not more powerful than the N64 or SNES. The DS was not more powerful than a Gamecube. This is a handheld device that's more powerful than their latest console, which makes it more than a case of semantics.

The GBA is more powerful than the SNES. The former's CPU is a powerhouse compared to the puny Ricoh in the older system.

The GBA could play Yoshi's Island without additional enhancement chips, and it could do 3D software rendering.

It was only lacking in the sound department, which consisted of the original Game Boy PSG channels and two 8-bit (noisy) software channels that used the CPU to mix the sound.
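To make the "software channels that used the CPU to mix the sound" part concrete, here's a minimal sketch of what per-sample software mixing looks like. It's plain C rather than actual GBA code (no hardware registers, no timers, and the function name is made up), but it shows the kind of work the CPU has to do for every output sample, since the two PCM channels have no hardware mixing behind them.

Code:
#include <stdint.h>
#include <stddef.h>

/* Mix two signed 8-bit PCM channels into one output buffer.
   On the GBA this per-sample work falls on the ARM7 itself. */
static void mix_channels(const int8_t *ch_a, const int8_t *ch_b,
                         int8_t *out, size_t samples)
{
    for (size_t i = 0; i < samples; i++) {
        int mixed = (int)ch_a[i] + (int)ch_b[i];
        /* Clamp to the signed 8-bit range to avoid wrap-around distortion. */
        if (mixed > 127)  mixed = 127;
        if (mixed < -128) mixed = -128;
        out[i] = (int8_t)mixed;
    }
}

Every cycle spent in a loop like that is a cycle taken away from game logic, which is part of why GBA sound engines typically mixed at fairly low sample rates.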
 
Remember when the iPhone was announced and it was said to be as powerful as a Dreamcast for gaming? Amazing how fast tech evolves in the mobile space.
 

HMD

Member
I'd not be surprised if Rockstar release GTA IV on iOS and Android, after all even San Andreas is out.

I honestly never thought that PS3/360-quality games could ever run on a phone. It's crazy how fast mobile technology moves. In five years, will phones be able to run The Witcher 3?
 
The GBA is more powerful than the SNES. The former's CPU is a powerhouse compared to the puny Ricoh in the older system.

The GBA could play Yoshi's Island without additional enhancement chips, and it could do 3D software rendering.

It was only lacking in the sound department, which consisted of the original Game Boy PSG channels and two 8-bit (noisy) software channels that used the CPU to mix the sound.


Wasn't aware of that. I based that statement off of the lackluster ports the system received of SNES and Genesis games.

The primary comparison stands, however: Nintendo's handheld tech has, in the past, lagged behind its most recent console release. The GBA arrived after the N64 and was significantly weaker.
 

BDGAME

Member
Did you just post a picture of the PC version of GTA5?

Is that really what I'm seeing?

Yes. My mistake.
But, in the end, seeing a smartphone with enough power to run a last-generation AAA game is very crazy.

As for the NX, according to the rumors, it will be something like 300 to 600 GFLOPS, which is above the ~230 GFLOPS of the iPhone 7.
 
"gaming system"

Also Apple has been going with that "as powerful as consoles or more so" bullshit for YEARS. Fuck off and show me the results. Every game on mobile is a trash time waster. No physical controls = shit. I don't know how or why people waste their time (:D) with that garbage. Nintendo being on mobile just means more money for them to make real games, and I'm fine with that.
 
Not modern PC processors for sure. Not even old ass PC processors. But last gen console CPUs, which guy 1 was comparing it to? Huge pipeline, huge pipeline flush penalty, nearly no branch prediction, no prefetchers, last gen console CPUs?

The Cell could go places if you put a whole lot of manual work in, and on paper, sure, more GFLOPS than 6.5 Jaguar cores. But people hugely understate the scale-up in software complexity it brought to make up for all of the above. Take this breadth-first graph search as an example:

Code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
 
/* ... */
 
/* the graph */
vertex_t * G;
 
/* number of vertices in the graph */
unsigned card_V;
 
/* root vertex (where the visit starts) */
unsigned root;
 
void parse_input( int argc, char** argv );
 
int main(int argc, char ** argv)
{
  unsigned *Q, *Q_next, *marked;
  unsigned  Q_size=0, Q_next_size=0;
  unsigned  level = 0;
 
  parse_input(argc, argv);
  graph_load();
 
  Q      = 
          (unsigned *) calloc(card_V, sizeof(unsigned));
  Q_next = 
          (unsigned *) calloc(card_V, sizeof(unsigned));
  marked = 
          (unsigned *) calloc(card_V, sizeof(unsigned));
 
  Q[0] = root;
  Q_size  = 1;
  while (Q_size != 0)
    {
      /* scanning all vertices in queue Q */
      unsigned Q_index;
      for ( Q_index=0; Q_index<Q_size; Q_index++ )
        {
          const unsigned vertex = Q[Q_index];
          const unsigned length = G[vertex].length;
          /* scanning each neighbor of each vertex */
          unsigned i;
          for ( i=0; i<length; i++ )
            {
              const unsigned neighbor = G[vertex].neighbors[i];
              if ( !marked[neighbor] )
                {
                  /* mark the neighbor */
                  marked[neighbor]      = TRUE;
                  /* enqueue it to Q_next */
                  Q_next[Q_next_size++] = neighbor;
                }
            }
        }
      level++;
      unsigned * swap_tmp;
      swap_tmp    = Q;
      Q           = Q_next;
      Q_next      = swap_tmp;
      Q_size      = Q_next_size;
      Q_next_size = 0;
    }
  return 0;
}

60 lines of source code for any general processor.

1200 lines of code for Cell. Twelve fricking hundred.



I agree with guy 1 here: Jaguars are far preferable to last gen, and the Cell would only edge them out in corner cases, for a lot more work, and in cases where GPGPU is far better now anyway.
Interesting.

So, I guess executable code is a lot smaller in Jaguar's case? I knew RISC was more memory-hungry than CISC, but I didn't expect it to be that bad.

And to think that the PS3 only had 256MB RAM & 256MB VRAM...

--

Btw, is there anything in mobile SoC technology that matches the sheer bandwidth of Wii U's eDRAM? People always tend to forget that flops mean nothing without an equally fast memory to feed them.

Apple - and most mobile SoCs - have serious memory speed and bandwidth issues vs conventional consoles, and apps have pretty strict size limits. So while you can do pretty amazing things, the market doesn't lean in that direction, because there is way more money in games that are small enough to download over LTE and cheap enough that they don't cost $25 up front.

Mobile stands to see the biggest gains in GPU performance related to games when they can start taking advantage of HBM (or when Apple allows for 10GB+ games...).
Since we're talking about stacked DRAM, would HBM increase the thickness of a smartphone? Apple is a bit obsessed with that. :p

You're assuming that Cell's only market was a CPU/GPU type combo. Cell BE was intended to be used in smart TVs where its SIMD implementation would have allowed for practically infinite decoding power (Toshiba showed off 48 1080p MPEG-2 streams at one point) and smarts for things like web browsing, voice recognition and what not. But instead ARM came with fixed function hardware at half the price and ate their breakfast on that one.

IBM was also running it in the HPC market for a little while where you basically throw any sort of single precision math at it and have it processed ridiculously quickly. The batch job nature of Cell SPEs made it basically IBM's mainframe wheelhouse as far as software went.

They probably picked Cell because it was infinitely flexible for a console. You need extra geometry? Extra shader power? Extra lighting? The SPEs have your back!
Pixel (not vertex) shader processing was not something that Cell excelled at.

I agree with everything else you said though.

they reportedly invested less than $400 million in the R&D of Cell
On the other hand, MS paid $3 billion for a semi-custom APU:

https://www.vg247.com/2013/05/27/amd-microsoft-deal-for-xbox-one-cost-over-3-billion/

I wonder how much a modern Cell equivalent would cost...

Still, Sony was thinking about a GPU far earlier than you would think; the pure software rasterisation dream probably did not make it past the initial lab theorisations and patents filed to cover future ground. CPU makers have yet to give that dream up... see Intel a few years later with Larrabee, for example. It may just not have been with nVidia.
Software rasterization is inefficient for pixel shaders, so they definitely needed dedicated hardware for that, unless people were OK with PS2-era graphics (no programmable pixel shaders at all, minus a few exceptions).

I remember using a software renderer to emulate GeForce 3 pixel shaders 15 years ago on an AMD Duron & GeForce 2 MX rig... it was insanely slow compared to native hardware support.

I'd not be surprised if Rockstar release GTA IV on iOS and Android, after all even San Andreas is out.

But running GTA V? I'd be amazed.
A 20GB mobile game? Very unlikely to happen.
 
Since we're talking about stacked DRAM, would HBM increase the thickness of a smartphone? Apple is a bit obsessed with that.

It increases the size of the chip stack but the actual physical object you see on a chipboard is virtually unchanged because most of what you're seeing is actually a heat spreader. (It's why you don't see the actual lithography.) For Apple's purposes it doesn't affect anything because there are too many other bigger objects within the phone already, like the battery, frame, etc.

tapantaola said:
A 20GB mobile game? Very unlikely to happen.

Apple increased the maximum app size last year from 2GB to 4GB, which basically puts iOS on par with the 3DS. Developers can only exceed this size by downloading data from within the app after it is installed, but in some cases iOS is allowed to delete this data to make space for other things so it's not a great solution.

In addition, to download over LTE the maximum app size is 100MB, so anyone who tries to buy such a game anywhere but home would be prevented from doing so.
 

LordOfChaos

Member
Shared cache among the big and LITTLE cores surely would differentiate it from big.LITTLE, where the big cores' cluster has its own shared L2, and likewise with the LITTLE cluster, and coherency is provided by the CCI. But I'm not sure that'd be sufficient to claim Fusion is not big.LITTLE. As long as there's coherency, and straight-forward migratability of a task between the different power-domain cores, I don't see what'd make Fusion non-big.LITTLE.

Chipworks' die shot is out, fwiw. They're unsure where the little cores are right now, but if their two guesses to the left are right, the caches don't seem like they could be shared. However, if the rightmost guess is right and the four cores are homogenized in one area, then yes.

oPxqfzA.jpg
 

LordOfChaos

Member
Interesting.

So, I guess executable code is a lot smaller in Jaguar's case? I knew RISC was more memory-hungry than CISC, but I didn't expect it to be that bad.


Btw, is there anything in mobile SoC technology that matches the sheer bandwidth of Wii U's eDRAM? People always tend to forget that flops mean nothing without an equally fast memory to feed them.

Not about RISC vs CISC, but specifically the Cell. Since you had to do loop unrolling, manually feed the local memory (not a cache), etc., code got a lot bigger to do simple things.

As for the sheer bandwidth of the Wii U eDRAM: maybe not in total combined bandwidth just yet, but the A9X was worth 50GB/s. If the Wii U's eDRAM is around there, as I recall it was, then the Wii U is a bit faster at full bore with the extra 12GB/s of DDR3 thrown in, but that's for a small memory pool, vs the mobile chip getting that bandwidth to all of system memory. Sort of reminiscent of the PS4's GDDR5 vs the XBO's DDR3+eSRAM.

http://www.anandtech.com/show/9766/the-apple-ipad-pro-review/2
 

blu

Wants the largest console games publisher to avoid Nintendo's platforms.
Chipworks' die shot is out, fwiw. They're unsure where the little cores are right now, but if their two guesses to the left are right, the caches don't seem like they could be shared. However, if the rightmost guess is right and the four cores are homogenized in one area, then yes.

oPxqfzA.jpg
It appears they got something fundamentally wrong - the big.LITTLE (let's leave the semantics arguments for now and keep calling it that for the sake of clarity) scheme is 2+2, not 4+4. So the LITTLE cores marked on the picture cannot be dual-core blocks, and thus are most likely not the LITTLE cores at all. That's before even considering that cores using the SRAM blocks would not be that far from the latter.
 
The Cell was over-marketed and over-BSed as a 1Tflops CPU while in fact it was a 100Gflops one that's for sure, but PS4's Jaguar is by no means an upgrade.
Wrong. You need to get your facts straight.

First, it was the RSX that was marketed as a 1.8TF "beast" (BS nVflops).

PS3s2Tflops_zpse0c83ba7.jpg


Sony never claimed that the Cell could do more than 200 Gflops (it's 25.6 GFlops per SPU, do the math).

Second, did you check this site? -> http://7-cpu.com/

IPC-wise the in-order Cell PPE was worse than even the ancient (out-of-order) Pentium 3 (the basis of Intel Core CPUs). OG XBOX had a P3 CPU.

Cell PPE was almost as bad as Intel Atom (an in-order x86 CPU)... you know, the laptop/tablet CPU that everyone should be mocking? Oh wait, no one did that!

A developer in Beyond3D has said that the octa-core Jaguar is 2.5-3x faster than the Xenon (triple-core PPC XBOX 360 CPU). The PS3 only had one PPC core, so the Jaguar is like a 9x upgrade over the shitty, in-order Cell PPE. Whether we like it or not, Jaguar is the fastest CPU that a PlayStation/XBOX console ever had.

You guys need to learn the difference between MIPS and flops. Flops are only good for certain algorithms.

IBM/Sony/Toshiba had a limited transistor budget at that time (2005, 90nm). An in-order CPU is slower, but it also requires fewer transistors than an out-of-order one. Sony wanted to devote a lot of transistors to dedicated SIMD-heavy cores. This is no different from the semi-custom console AMD APU designs: more transistors are allocated to GPU Compute Units than to CPU cores. It has been like that for over 15 years (even the GameCube devoted more transistors to the GPU than to the CPU).
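Spelling out the "do the math" bit above, assuming the commonly cited 3.2 GHz clock and single-precision FMA throughput (the PPE figure is a rough estimate on my part, not an official spec):

Code:
#include <stdio.h>

int main(void)
{
    /* Each SPU: 4-wide single-precision FMA per cycle = 8 flops/cycle,
       which at 3.2 GHz gives the 25.6 GFLOPS per SPU quoted above. */
    double spu_gflops = 8.0 * 3.2;  /* 25.6 */

    /* The PS3's Cell ships with 7 working SPUs (one of 8 disabled for yield). */
    double spus = 7.0;

    /* The PPE's VMX unit has roughly the same single-precision peak;
       treat this as a ballpark estimate only. */
    double ppe_gflops = 8.0 * 3.2;

    printf("SPUs only:       %.1f GFLOPS\n", spus * spu_gflops);               /* 179.2 */
    printf("With PPE (est.): ~%.0f GFLOPS\n", spus * spu_gflops + ppe_gflops); /* ~205  */
    return 0;
}

Either way, that lands in the ~200 GFLOPS ballpark described above, nowhere near 1 TFLOPS; the headline marketing numbers came from stacking the RSX's "nVflops" on top.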
 

blu

Wants the largest console games publisher to avoid Nintendo's platforms.
A developer in Beyond3D has said that the octa-core Jaguar is 2.5-3x faster than the Xenon (triple-core PPC XBOX 360 CPU). The PS3 only had one PPC core, so the Jaguar is like a 9x upgrade over the shitty, in-order Cell PPE. Whether we like it or not, Jaguar is the fastest CPU that a PlayStation/XBOX console ever had.

You guys need to learn the difference between MIPS and flops. Flops are only good for certain algorithms.
There's one problem with that statement - those certain algorithms were quite popular in games, and were exclusively done on the CPU, before the GPGPU advent. So yes, Xenon, let alone CELL, were FLOPS monsters, and if you decided to port their FLOPS-centric code to the Jaguars in the current crop of consoles you'd have certain issues (of the sort PS3/360/Wii U multiplatform ports had, just not as severe).

So Jaguars are the fastest general-purpose CPUs a console has had yet, but that does not cover all classes of problems a console CPU has/had to solve.
 
There's one problem with that statement - those certain algorithms were quite popular in games, and were exclusively done on the CPU, before the GPGPU advent. So yes, Xenon, let alone CELL, were FLOPS monsters, and if you decided to port their FLOPS-centric code to the Jaguars in the current crop of consoles you'd have certain issues (of the sort PS3/360/Wii U multiplatform ports had, just not as severe).

So Jaguars are the fastest general-purpose CPUs a console has had yet, but that does not cover all classes of problems a console CPU has/had to solve.
And why would you do that?

jq7lIDu.png


There's a reason PS4/XB1 have fully programmable GPUs, let alone Sony (Mark Cerny) adding more CUs and ACEs to the PS4 GPU for this very reason (this is no different than making Cell more SIMD-heavy than Xenon).
 

LordOfChaos

Member
It appears they got something fundamentally wrong - the big.LITTLE (let's leave the semantics arguments for now and keep calling it that for the sake of clarity) scheme is 2+2, not 4+4. So the LITTLE cores marked on the picture cannot be dual-core blocks, and thus are most likely not the LITTLE cores at all. That's before even considering that cores using the SRAM blocks would not be that far from the latter.

I thought that at first, but I think they're just pointing to three options. Big and little cores are in one cluster of four, or there's two big on the right and two somewhere else on the right in either spot, not both.

Which is to say, they're saying either there's two in the big block and two in either of the two highlighted second blocks, or four on the right. Not all three together. They still think it's 2+2, just not sure where the 2 little are.


[Attached benchmark table excerpt - Power7 (8 cores / 40 threads): 37000 / 56000; Power8 (10 cores / 80 threads): 57000 / 74000]


Also reconfirms that Xenon and Cell had lower IPC than even IBM's contemporary PowerPC G5 (970) at the time. That should never have been a surprise, given how narrow those architectures were and how much of the machinery that lets processors manage themselves was stripped out, but I remember lots of people fighting over it from 2005 on.
http://www.anandtech.com/show/1702/2
 

blu

Wants the largest console games publisher to avoid Nintendo's platforms.
And why would you do that?
Because GPGPUs are not CPUs and they show different performance characteristics. Not every piece of CPU SIMD code is portable to GPGPU without performance implications.

I thought that at first, but I think they're just pointing to three options. Big and little cores are in one cluster of four, or there's two big on the right and two somewhere else on the right in either spot, not both.

Which is to say, they're saying either there's two in the big block and two in either of the two highlighted second blocks, or four on the right. Not all three together. They still think it's 2+2, just not sure where the 2 little are.
Ok, but that still leaves the question why the LITTLE cores would be so far from the cache pools.
 
Because GPGPUs are not CPUs and they show different performance characteristics. Not every piece of CPU SIMD code is portable to GPGPU without performance implications.
I don't know of any SIMD algorithm that cannot run on the GPU much faster... from physics processing to cryptocurrency mining.

Of course that doesn't mean that it's a copy/paste job. It needs to be written in an appropriate language that the processor can understand. Companies like ND, DICE and even Bluepoint have done it.

Even hand-tuned VMX128 assembly can't just be copy-pasted and expected to run on Jaguar's AVX units.
 

blu

Wants the largest console games publisher to avoid Nintendo's platforms.
I don't know of any SIMD algorithm that cannot run on the GPU much faster... from physics processing to cryptocurrency mining.
Ignoring data roundtrips, anything with low-enough coherency will run less efficiently on GPUs than on CPUs. Small enough batches of such tasks (where aggregate throughput cannot tip the balance in favor of the GPU pipeline) can run faster on SMP CPUs than on GPUs. There's a reason why, on some workloads, multi-TFLOP GPUs only barely edge out several-hundred-GFLOP CPUs.

Of course that doesn't mean that it's a copy/paste job. It needs to be written in an appropriate language that the processor can understand. Companies like ND, DICE and even Bluepoint have done it.

Even hand-tuned VMX128 assembly can't just be copy-pasted and expected to run on Jaguar's AVX units.
I don't think anybody in this thread has implied anything about copy-pasting yet ; )
 
Ignoring data roundtrips, anything with low-enough coherency will run less efficiently on GPUs than on CPUs. Small enough batches of such tasks (where aggregate throughput cannot tip the balance in favor of the GPU pipeline) can run faster on SMP CPUs than on GPUs. There's a reason why, on some workloads, multi-TFLOP GPUs only barely edge out several-hundred-GFLOP CPUs.


I don't think anybody in this thread has implied anything about copy-pasting yet ; )
Alright, so is there any gaming-related SIMD algorithm (like physics) that can run faster on Jaguar than on the GPU?
 

Cynar

Member
Let me get this right... People are telling that the iPhone 7 will be stronger than Videogames like Wii U, Ps3 and Xbox 360?

That machine will be able to handle games like GTA 5 without a problem?

Is really that I reading?

17258149236_8b656d78a9_o.png
Yes but in reality that won't be the case.
 

blu

Wants the largest console games publisher to avoid Nintendo's platforms.
Alright, so is there any gaming-related SIMD algorithm (like physics) that can run faster on Jaguar than on the GPU?
You missed what I said. "SIMD algorithm" is a very broad generalisation of a class of algorithms that can work in parallel on data in lock-step. Unfortunately, the algorithms that can run in lock-step 100% of the time are not that many - that means full coherency, or IOW no control flow divergences to any measurable degree. So normally by SIMD algorithms we refer to any algorithm that can have a non-negligible SIMD portion. Unfortunately, most meaningful tasks also have parts like:
Code:
if (runtime_condition)
    foo();
else
    bar();
Where the GPU has to execute both foo() and bar(), and depending on the complexity of those two, that can cause large decoherentisation in a SIMD block (workgroup/warp/wavefront/you-name-it bunch). Now, imagine an algorithm that has the following structure:
Code:
lockstep_blockA;
if (runtime_condition)
    foo();
else
    bar();
lockstep_blockB;
Depending on how much more efficient a CPU is at that middle bottleneck (for the GPU) section, a CPU with sufficient SIMD instrumentarium can beat a GPU 1:N-hundred. That is, a single CPU can perform equivalently-or-better to N-hundred GPU threads. Now, the next step is, imagine your original batch of work is less than N-hundred data units. In this case it doesn't matter if your GPU has less than N-hundred, N-hundred, or a bazillion threads - it will never outperform said CPU unit.
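A toy model of that argument, with completely made-up cost numbers, just to show how the arithmetic stacks up: divergence makes the wavefront pay for both foo() and bar(), and a small batch leaves most of the GPU idle, so the effective throughput of a multi-TFLOP part can fall into the range where a several-hundred-GFLOP CPU is competitive.

Code:
#include <stdio.h>

int main(void)
{
    /* Made-up per-element costs, in arbitrary instruction units. */
    double lockstep = 2.0;   /* lockstep_blockA + lockstep_blockB          */
    double branch   = 8.0;   /* cost of foo() or bar(), whichever is taken */

    /* Divergence: the wavefront executes BOTH foo() and bar(). */
    double useful   = lockstep + branch;
    double executed = lockstep + 2.0 * branch;
    double divergence_eff = useful / executed;             /* ~56% */

    /* Occupancy: a small batch can't fill a wide GPU. */
    double batch = 64.0, gpu_threads = 1024.0;
    double occupancy = batch / gpu_threads;                /* 6.25% */

    double peak_tflops = 1.8;                              /* paper spec, made up */
    double effective = peak_tflops * 1000.0 * divergence_eff * occupancy;

    printf("divergence efficiency: %.0f%%\n", divergence_eff * 100.0);
    printf("occupancy:             %.1f%%\n", occupancy * 100.0);
    printf("effective throughput:  ~%.0f GFLOPS of %.1f TFLOPS peak\n",
           effective, peak_tflops);
    return 0;
}

With these (arbitrary) numbers the "1.8 TFLOPS" GPU is effectively delivering ~60 GFLOPS on that batch, which is exactly the regime where a strong SIMD CPU stops losing.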
 

LordOfChaos

Member
Because GPGPUs are not CPUs and they show different performance characteristics. Not every piece of CPU SIMD code is portable to GPGPU without performance implications.

I agree, I think they just took a few stabs in the dark. The four cores being contained in the large rightmost block would make the most sense to me, especially as Fusion was described as sharing caches rather than big.LITTLE's per-cluster sharing, as you said with coherency over CCI.

It will be interesting to see if someone can test how much faster it is at switching than big.LITTLE.
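If anyone actually wants to poke at switching behaviour, a crude starting point is to alternate idle periods with short bursts of work and see how long the first chunks of each burst take compared to the later ones. This is just a sketch (plain POSIX C, nothing platform-specific), and it can't separate cluster migration from clock ramp-up, so the numbers would only be a rough proxy.

Code:
#include <stdio.h>
#include <time.h>
#include <unistd.h>

/* Wall-clock time in nanoseconds. */
static long long now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* A fixed chunk of busy work; volatile keeps the compiler from removing it. */
static void burn(void)
{
    volatile unsigned long long x = 0;
    for (unsigned long long i = 0; i < 2000000ULL; i++)
        x += i;
}

int main(void)
{
    for (int round = 0; round < 10; round++) {
        usleep(200000);                 /* idle: give the scheduler a reason to
                                           drop the task onto the small cores   */
        for (int chunk = 0; chunk < 8; chunk++) {
            long long t0 = now_ns();
            burn();
            long long t1 = now_ns();
            printf("round %d chunk %d: %.2f ms\n",
                   round, chunk, (t1 - t0) / 1e6);
        }
    }
    return 0;
}

On a big.LITTLE-style design you'd expect the first chunk or two after each idle period to be slower while the task sits on (or migrates off) a small core; how quickly the times settle is a rough proxy for how fast the switch happens.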
 