
Recent PS4 SDK update unlocked 7th CPU core for gaming

You do know that GCN CUs could actually run an OS, right?
Then explain why Sony needs at least 1 Jaguar CPU core to run the OS? You've just contradicted yourself.

The GPU is more powerful, so according to your logic it would make sense to free all 8 Jaguar cores and reserve maybe 1 CU to run the OS. Too bad it's not that simple.

SIMD cores are specialized and they are very bad at running branchy code (e.g. AI). Fact.
 

truth411

Member
I don't think you know what you're talking about.

GPU CUs are akin to Cell SPUs. That's what GPGPU is for. Plenty of devs have said this, including Mark Cerny and ICE Team programmers. Why are you surprised? Cell was an early example of an "APU" (PPE -> Jaguar, SPUs -> GPU Compute Units). It heavily influenced the industry.

It's ridiculous to claim that a fully-fledged OS can run solely on the SPU. I've never accused anyone of lying, so do me a favour and stop putting words in my mouth. I just said that Sony has never elaborated on the PPE resource allocation. Yes, they've explained the SPU allocation, but that's it. You still need the PPE for more generic CPU-oriented tasks. Fact.

Calm down, read what I've said carefully (don't twist my words again) and please check the facts.
That's what Sony said, and you're saying that's false. Draw your own conclusions on what that means. In any case, if you want to disagree with facts, go ahead; we'll just agree to disagree.

Edit: Also, they did say the PPE and 6 SPEs are for game development. The OS is dedicated to the 7th SPE; they did say that.
 
That's what Sony said, and you're saying that's false. Draw your own conclusions on what that means. In any case, if you want to disagree with facts, go ahead; we'll just agree to disagree.
Where has Sony said that? You got a source/link?

AFAIK, they utilized the 7th SPE to accelerate specific tasks like encryption/decryption (anti-piracy measures). I bet you don't even know what branch prediction is and you're here to claim that an SPU can run an entire OS.

Show me an OS that can run solely on an SPU/GPU and explain why they still need 1 Jaguar core for the OS. I'm waiting.
 
PS3 PPU ran the OS and game in separate soft threads. 1 SPU was locked down for security. OS also used part of another SPU for audio.
 
PS3 PPU ran the OS and game in separate soft threads. 1 SPU was locked down for security. OS also used part of another SPU for audio.
Thank you. I think that's a sensible response. That's what I've been saying all along and some people misunderstood me.

ps: Are you a dev, if I may ask?
 

slapnuts

Junior Member
Laptop CPU, completely different from mobile CPU

Just to paint the picture more completely, one Jaguar module [4 cores] draws 15W at the default 1.6GHz clock. PS4 and Xbone have two of those modules each.

Yes, we know Jaguar is older and on an older fab process, but we're also comparing a console to a phone, people.

You can tell them that until you're blue in the face, and yet they'll still want to believe their phone or tablet is more powerful. Really, some people believe this.
 

androvsky

Member
Where has Sony said that? You got a source/link?

AFAIK, they utilized the 7th SPE to accelerate specific tasks like encryption/decryption (anti-piracy measures). I bet you don't even know what branch prediction is and you're here to claim that an SPU can run an entire OS.

Show me an OS that can run solely on an SPU/GPU and explain why they still need 1 Jaguar core for the OS. I'm waiting.
Branch prediction isn't the killer there; the biggest problem is Cell SPUs can't access memory outside their local storage. All they can do is schedule a DMA transfer; they were made that way to force devs to use efficient algorithms. A Cell running an OS would be like running a business with nothing but a firehose for communication. No page tables, no interrupt handler.
 
Branch prediction isn't the killer there; the biggest problem is Cell SPUs can't access memory outside their local storage. All they can do is schedule a DMA transfer; they were made that way to force devs to use efficient algorithms. A Cell running an OS would be like running a business with nothing but a firehose for communication. No page tables, no interrupt handler.
No disagreement here. The SPUs are optimized for SIMD tasks, just like the GPU CUs, therefore they're unsuitable for running an entire OS.
 

Elandyll

Banned
Wtf are people talking about? Afaik the SPEs are simply not capable of running a full OS, for the simple reason that the "orchestra conductor" that assigns tasks to them is the PPE. SPEs are incapable of assigning themselves tasks, or assigning tasks to each other directly.
It's akin to saying the whole orchestra would run just fine with the cellist holding the baton, conducting the entire orchestra from their chair while playing at the same time...

There's a big difference between saying that a coprocessor has the "capability" of running an OS (not even the case in reality here) and whether it would make any sense...
 

onQ123

Member
Then explain why Sony needs at least 1 Jaguar CPU core to run the OS? You've just contradicted yourself.

The GPU is more powerful, so according to your logic it would make sense to free all 8 Jaguar cores and reserve maybe 1 CU to run the OS. Too bad it's not that simple.

SIMD cores are specialized and they are very bad at running branchy code (e.g. AI). Fact.

Just because you can doesn't mean you should; why would they waste a CU on the OS?

GCN CUs are not just SIMD; they can also do MIMD & SMT.

[image: amd_new_era_small.jpg]
 
Wtf are people talking about? Afaik the SPEs are simply not capable of running a full OS, for the simple reason that the "orchestra conductor" that assigns tasks to them is the PPE. SPEs are incapable of assigning themselves tasks, or assigning tasks to each other directly.
It's akin to saying the whole orchestra would run just fine with the cellist holding the baton, conducting the entire orchestra from their chair while playing at the same time...

There's a big difference between saying that a coprocessor has the "capability" of running an OS (not even the case in reality here) and whether it would make any sense...
Exactly. That's why it doesn't make any sense to claim that the 7th SPU can run the entire PS3 OS by itself.

In layman's terms, SPUs are like factory workers. They're numerous, they do one specific job and they're pretty good at it. They don't run the business though and that's why they need the "factory manager" (PPU) to "orchestrate" them.

Just because you can doesn't mean you should; why would they waste a CU on the OS?

GCN CUs are not just SIMD; they can also do MIMD & SMT.

[image: amd_new_era_small.jpg]
I'm still waiting for you to show me that GCN CU-powered OS (assuming it doesn't rely on traditional CPUs at all):

http://www.neogaf.com/forum/showpost.php?p=187364673&postcount=648

Don't you think it's dumb to buy both a CPU and a GPU? If a CPU is not really needed, then what's the point of spending extra money? PC gaming would be cheaper as well if that was the case.

This discussion reminds me of the "GPUs will replace CPUs" nonsense. You guys got a dev response, so I don't know why you keep insisting.

If you're wrong, just admit it and move on. I was wrong about the Vita expanded RAM allocation (77MB vs 109MB), because reddit misinformed me. I didn't keep insisting on it.
 

onQ123

Member
Exactly. That's why it doesn't make any sense to claim that the 7th SPU can run the entire PS3 OS by itself.

In layman's terms, SPUs are like factory workers. They're numerous, they do one specific job and they're pretty good at it. They don't run the business though and that's why they need the "factory manager" (PPU) to "orchestrate" them.


I'm still waiting for you to show me that GCN CU-powered OS (assuming that it doesn't rely on traditional CPUs at all):

http://www.neogaf.com/forum/showpost.php?p=187364673&postcount=648

Don't you think it's dumb to buy both a CPU and a GPU? If a CPU is not really needed, then what's the point of spending extra money? PC gaming would be cheaper as well if that was the case.

This discussion reminds me of the "GPUs will replace CPUs" nonsense. You guys got a dev response, so I don't know why you keep insisting.

If you're wrong, just admit it and move on. I was wrong about the Vita expanded RAM allocation (77MB vs 109MB), because reddit misinformed me. I didn't keep insisting on it.

No dev responded to me about GCN CUs being able to run an OS.


AMD's Heterogeneous Queuing (hQ) technology promises to unlock the true potential of APUs, allowing a GPU to queue its own work without the CPU or operating system getting in the way.


The GPU in the PS4 could run an OS if it had to, but why would they run the OS on the GPU?
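
To make the "queue its own work" idea concrete, here's a rough sketch. It's not PS4/GNM code (that SDK is under NDA) and none of these names come from any real engine; it's plain CUDA using dynamic parallelism (compute capability 3.5+, build with -rdc=true), where a kernel decides on its own to launch follow-up work without a round trip through the CPU:

Code:
// Hedged sketch: CUDA dynamic parallelism standing in for AMD's hQ idea.
// The CPU submits one kernel; the GPU then queues its own follow-up kernels.
#include <cstdio>

__global__ void refineChunk(const float *data, float *out, int offset, int n)
{
    int i = offset + blockIdx.x * blockDim.x + threadIdx.x;
    if (i < offset + n)
        out[i] = data[i] * 0.5f;   // stand-in for "more work on this chunk"
}

__global__ void scanChunks(const float *data, float *out, int nChunks, int chunkSize)
{
    int c = blockIdx.x * blockDim.x + threadIdx.x;
    if (c >= nChunks) return;

    // The GPU decides this chunk needs more processing and enqueues the
    // follow-up kernel itself -- no CPU or OS involvement at this point.
    if (data[c * chunkSize] > 0.0f)
        refineChunk<<<(chunkSize + 255) / 256, 256>>>(data, out, c * chunkSize, chunkSize);
}

int main()
{
    const int nChunks = 8, chunkSize = 1024, n = nChunks * chunkSize;
    float *data, *out;
    cudaMallocManaged(&data, n * sizeof(float));
    cudaMallocManaged(&out,  n * sizeof(float));
    for (int i = 0; i < n; ++i) data[i] = 1.0f;

    scanChunks<<<1, nChunks>>>(data, out, nChunks, chunkSize);  // one host-side submit...
    cudaDeviceSynchronize();                                    // ...the GPU spawned the rest
    printf("out[0] = %f\n", out[0]);
    cudaFree(data);
    cudaFree(out);
    return 0;
}

That's the sense in which the GPU can feed itself work without the CPU getting in the way.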
 
Hey, glad to see I'm not totally off base for once. :)
*bows*


Eurogamer did not guess; they have a source who confirmed that it's shared between the OS and the game.
That's not what was said. Firelight — makers of FMOD — told Eurogamer that the core had been unlocked for developers, they had no idea what the OS reservation was, if any, and Razor was able to report both user and OS activity on Core6. They never said it was doing so; just that it could. Then Firelight independently told GamingBolt that the OS reservation was not dynamic, and GamingBolt then assumed the reservation was fixed at some non-zero value. Zoetis then confirmed that the OS reservation is indeed fixed, at 0%. Also, only first-party currently have access.

So that would seem to explain all of the confusion that EG and GB are getting. Basically, Firelight don't know squat. They're not first-party; they're a third-party developer of middleware used on the PS4. As such, they don't actually have direct access to the super-secret SDK and Core6. Sony just told them to enable access to Core6 from within FMOD, so first-party could continue using the middleware in their apps. Firelight did so, and that was pretty much the end of their involvement with the unlocking; they just tweaked their app to leverage it, but they can't even see how it works, because they can't access it themselves, being third-party.
 

androvsky

Member
No dev responded to me about GCN CUs being able to run an OS.


AMD's Heterogeneous Queuing (hQ) technology promises to unlock the true potential of APUs, allowing a GPU to queue its own work without the CPU or operating system getting in the way.


The GPU in the PS4 could run an OS if it had to, but why would they run the OS on the GPU?
Not needing an OS to hand off function pointers is different from actually running an OS. Can a GCN CU take system level interrupts? Can it talk to peripherals without going through the CPU? Can it take over system init from the UEFI?
 
Exactly. That's why it doesn't make any sense to claim that the 7th SPU can run the entire PS3 OS by itself.
You could totally run an entire OS on an SPU though. Use DMA to page memory. It's possible the IO is part of the PPU, but you could handle that on a teeny tiny interrupt.

Be some serious old school voodoo shit though.
 

onQ123

Member
Not needing an OS to hand off function pointers is different from actually running an OS. Can a GCN CU take system level interrupts? Can it talk to peripherals without going through the CPU? Can it take over system init from the UEFI?


If someone wrote that code for the GPGPU, yes.
 

DonMigs85

Member
Lol at some of these armchair hardware and programming experts here. GPUs are good with highly parallelizable code, but not branchy code.
Info below taken from http://superuser.com/questions/308771/why-are-we-still-using-cpus-instead-of-gpus

GPGPU is still a relatively new concept. GPUs were initially used for rendering graphics only; as technology advanced, the large number of cores in GPUs relative to CPUs was exploited by developing computational capabilities for GPUs so that they can process many parallel streams of data simultaneously, no matter what that data may be. While GPUs can have hundreds or even thousands of stream processors, they each run slower than a CPU core and have fewer features (even if they are Turing complete and can be programmed to run any program a CPU can run). Features missing from GPUs include interrupts and virtual memory, which are required to implement a modern operating system.

In other words, CPUs and GPUs have significantly different architectures that make them better suited to different tasks. A GPU can handle large amounts of data in many streams, performing relatively simple operations on them, but is ill-suited to heavy or complex processing on a single or few streams of data. A CPU is much faster on a per-core basis (in terms of instructions per second) and can perform complex operations on a single or few streams of data more easily, but cannot efficiently handle many streams simultaneously.

As a result, GPUs are not suited to handle tasks that do not significantly benefit from or cannot be parallelized, including many common consumer applications such as word processors. Furthermore, GPUs use a fundamentally different architecture; one would have to program an application specifically for a GPU for it to work, and significantly different techniques are required to program GPUs. These different techniques include new programming languages, modifications to existing languages, and new programming paradigms that are better suited to expressing a computation as a parallel operation to be performed by many stream processors. For more information on the techniques needed to program GPUs, see the Wikipedia articles on stream processing and parallel computing.

Modern GPUs are capable of performing vector operations and floating-point arithmetic, with the latest cards capable of manipulating double-precision floating-point numbers. Frameworks such as CUDA and OpenCL enable programs to be written for GPUs, and the nature of GPUs make them most suited to highly parallelizable operations, such as in scientific computing, where a series of specialized GPU compute cards can be a viable replacement for a small compute cluster as in NVIDIA Tesla Personal Supercomputers. Consumers with modern GPUs who are experienced with Folding@home can use them to contribute with GPU clients, which can perform protein folding simulations at very high speeds and contribute more work to the project (be sure to read the FAQs first, especially those related to GPUs). GPUs can also enable better physics simulation in video games using PhysX, accelerate video encoding and decoding, and perform other compute-intensive tasks. It is these types of tasks that GPUs are most suited to performing.

AMD is pioneering a processor design called the Accelerated Processing Unit (APU) which combines conventional x86 CPU cores with GPUs. This approach enables graphical performance vastly superior to motherboard-integrated graphics solutions (though no match for more expensive discrete GPUs), and allows for a compact, low-cost system with good multimedia performance without the need for a separate GPU. The latest Intel processors also offer on-chip integrated graphics, although competitive integrated GPU performance is currently limited to the few chips with Intel Iris Pro Graphics. As technology continues to advance, we will see an increasing degree of convergence of these once-separate parts. AMD envisions a future where the CPU and GPU are one, capable of seamlessly working together on the same task.

Nonetheless, many tasks performed by PC operating systems and applications are still better suited to CPUs, and much work is needed to accelerate a program using a GPU. Since so much existing software uses the x86 architecture, and because GPUs require different programming techniques and are missing several important features needed for operating systems, a general transition from CPU to GPU for everyday computing is very difficult.
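
For anyone who wants to see what "branchy" means in practice, here's a tiny, hedged CUDA sketch (made-up kernels, not from any real engine or benchmark). In the first kernel, neighbouring threads in a warp take different paths, so the hardware ends up running both sides of the branch one after the other; the second keeps every lane doing the same multiply-add, which is the kind of work the quote above says GPUs are built for:

Code:
#include <cstdio>

// Adjacent lanes take different paths, so each warp executes both branches serially.
__global__ void branchy(const int *in, int *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    if (in[i] % 2 == 0)
        out[i] = in[i] * 3 + 1;
    else
        out[i] = in[i] / 2;
}

// Every lane executes the same instruction on its own element: ideal GPU work.
__global__ void uniform(const float *a, const float *b, float *c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        c[i] = a[i] * b[i] + c[i];
}

int main()
{
    const int n = 1 << 20;
    int *in, *out;
    float *a, *b, *c;
    cudaMallocManaged(&in,  n * sizeof(int));
    cudaMallocManaged(&out, n * sizeof(int));
    cudaMallocManaged(&a, n * sizeof(float));
    cudaMallocManaged(&b, n * sizeof(float));
    cudaMallocManaged(&c, n * sizeof(float));
    for (int i = 0; i < n; ++i) { in[i] = i; a[i] = 1.0f; b[i] = 2.0f; c[i] = 0.0f; }

    branchy<<<(n + 255) / 256, 256>>>(in, out, n);
    uniform<<<(n + 255) / 256, 256>>>(a, b, c, n);
    cudaDeviceSynchronize();
    printf("out[3] = %d, c[3] = %.1f\n", out[3], c[3]);
    return 0;
}

And a two-way even/odd split is the mild case; OS code is page-table walks, interrupt handlers and pointer chasing, where every thread wants to go its own way and the width of the GPU stops helping.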
 

The_Dama

Member
When Sony increased the CPU on the PSP to 333MHz or something like that for GoW, did it help other games too? Games that were originally programmed for the 222MHz clock?
 
That doesn't mean it can run an OS, nor does it mean it doesn't need an OS at all.

I dunno if you're familiar, but I've talked in the past about how hUMA and GPGPU can be used to boost performance on stuff like AI. Most AI code is branchy prediction stuff that runs really well on the CPU, but you also need to determine what the actor can see around them (visibility), and also how they plan to move from where they are to where they want to be (pathfinding). So you look around, and based on what you see, you decide where you'd like to be instead of where you are, and then you figure out how you're going to get there. (Ladder? Stairs?)

The problem is, the CPU is really really shitty at visibility and pathfinding. In fact, even though it makes up a fairly small portion of the total AI code, a CPU will spend about 90% of its cycles just on visibility and pathfinding. The other 10% of your CPU time is what makes the actual decisions.

Enter the GPU, which happens to be super awesome at both visibility and pathfinding. So we ask the GPU what the actor can see, and then the CPU can make a decision on what the GPU spots. Not only do we figure out what we see much faster overall, we've freed up 90% of the AI core. Now we can make decisions for ten times as many actors, for example. Or we can teach them to make far more complicated decisions.

Now let's add heterogeneous queuing to the mix. Without it, the CPU basically needs to say, "What can actor1 see? Okay, then how does he get to where he's headed now? Okay, how about actor2?" That wastes a lot of the CPU's precious time, and is unnecessary on GCN. Basically, the GPU already knows it needs to rapidly update visibility and pathfinding for every entry in the actors table. Thanks to hUMA, the very same table is shared between the GPU and CPU, so they don't have to pass that back and forth either; either can directly access any data they need. That means the GPU just takes the first actor, casts rays out of its face, determines which object(s) of interest those rays intersect, sets the resulting array in the actor's canSee property, and moves on to the next actor. After that, the GPU can look up each actor's currentLocation and desiredLocation, and write out a nice path for them to start following in the actors table. Hell, it can probably do the visibility and pathfinding simultaneously, actually, since they're not directly interdependent.

All of this occurs with no input from the CPU whatsoever. "But wait! All of the input is coming from the CPU!" Sure, but the GPU doesn't care (nor even know) if it's looking up data the CPU wrote moments ago, or reading from a script written a year ago. It's just reading two points in space and tracing a line between them through the environment, which it also has a record of. All the GPU knows is "input comes from A; output goes in B", and it just does that, again, and again, and again… because that's what it does.

Meanwhile, from the CPU's perspective, everything is equally simple, but reversed: "input comes from B; output goes in A".

So, yeah, that's what hQ is all about. It doesn't change what the GPU is capable of; it just means the GPU already knows what it's supposed to be doing with the data, basically. Not only does that free up time on the CPU, it reduces interdependency, which helps you make your engine not just asynchronous but also largely autonomous. Ideally, you want each task to be completely independent from the others. That lets you do all kinds of neat tricks. For example, obviously, you want rendering done at 120 Hz in VR, but does your AI really need to make 120 decisions a second? I don't know that humans change their mind that quickly. Running your AI at 30 Hz or even 15 Hz would save you a lot of cycles, and seems like it would provide sufficient realism, especially if you're running more sophisticated scripts now, with all the time you saved offloading visibility and pathfinding.
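
If it helps, here's roughly what that scheme looks like in code. This is only a sketch: CUDA unified memory stands in for hUMA, one kernel stands in for the compute job, the "visibility" test is a distance check instead of real ray casts, and the "pathfinding" is a straight line instead of A*. The names (Actor, canSee, currentLocation, desiredLocation) are just the hypothetical ones from the post, not from any real engine or the PS4 SDK:

Code:
#include <cstdio>
#include <cmath>

struct Vec3 { float x, y, z; };

struct Actor {
    Vec3 currentLocation;
    Vec3 desiredLocation;
    int  canSee;     // 1 if the target is "visible" to this actor, else 0
    Vec3 pathDir;    // next step along the (trivial, straight-line) "path"
};

// GPU pass: refresh visibility and pathfinding for every entry in the shared actors table.
__global__ void visibilityAndPathfinding(Actor *actors, int n, Vec3 target)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    Actor a = actors[i];

    // "Visibility": a stand-in distance test instead of casting rays out of the actor's face.
    float dx = target.x - a.currentLocation.x;
    float dy = target.y - a.currentLocation.y;
    float dz = target.z - a.currentLocation.z;
    a.canSee = (sqrtf(dx * dx + dy * dy + dz * dz) < 50.0f) ? 1 : 0;

    // "Pathfinding": head straight for desiredLocation instead of running A*.
    float px = a.desiredLocation.x - a.currentLocation.x;
    float py = a.desiredLocation.y - a.currentLocation.y;
    float pz = a.desiredLocation.z - a.currentLocation.z;
    float len = sqrtf(px * px + py * py + pz * pz) + 1e-6f;
    a.pathDir = { px / len, py / len, pz / len };

    actors[i] = a;   // written straight back into the shared table
}

int main()
{
    const int n = 1024;
    Actor *actors;
    cudaMallocManaged(&actors, n * sizeof(Actor));   // the one actors table both sides touch
    for (int i = 0; i < n; ++i) {
        actors[i].currentLocation = { (float)i, 0.0f, 0.0f };
        actors[i].desiredLocation = { 0.0f, 0.0f, 0.0f };
        actors[i].canSee = 0;
    }
    Vec3 player = { 10.0f, 0.0f, 0.0f };

    // Pretend we render at 120 Hz but only rerun the AI pass every 4th frame (~30 Hz).
    for (int frame = 0; frame < 120; ++frame) {
        if (frame % 4 == 0) {
            visibilityAndPathfinding<<<(n + 255) / 256, 256>>>(actors, n, player);
            cudaDeviceSynchronize();
            // CPU side: cheap, branchy decision-making on what the GPU just wrote.
            for (int i = 0; i < n; ++i)
                if (actors[i].canSee)
                    actors[i].desiredLocation = player;   // e.g. "chase the player"
        }
        // ... render the frame here ...
    }
    printf("actor 5 sees the player: %d\n", actors[5].canSee);
    cudaFree(actors);
    return 0;
}

The explicit synchronise here is the lazy, CPU-driven version of the handoff; the hQ/GCN point above is that on the real hardware the compute pass can be queued and consumed without the CPU babysitting each dispatch.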
 

onQ123

Member
Lol at some of these armchair hardware and programming experts here. GPUs are good with highly parallelizable code, but not branchy code.
Info below taken from http://superuser.com/questions/308771/why-are-we-still-using-cpus-instead-of-gpus

GPGPU is still a relatively new concept. GPUs were initially used for rendering graphics only; as technology advanced, the large number of cores in GPUs relative to CPUs was exploited by developing computational capabilities for GPUs so that they can process many parallel streams of data simultaneously, no matter what that data may be. While GPUs can have hundreds or even thousands of stream processors, they each run slower than a CPU core and have fewer features (even if they are Turing complete and can be programmed to run any program a CPU can run). Features missing from GPUs include interrupts and virtual memory, which are required to implement a modern operating system.

In other words, CPUs and GPUs have significantly different architectures that make them better suited to different tasks. A GPU can handle large amounts of data in many streams, performing relatively simple operations on them, but is ill-suited to heavy or complex processing on a single or few streams of data. A CPU is much faster on a per-core basis (in terms of instructions per second) and can perform complex operations on a single or few streams of data more easily, but cannot efficiently handle many streams simultaneously.

As a result, GPUs are not suited to handle tasks that do not significantly benefit from or cannot be parallelized, including many common consumer applications such as word processors. Furthermore, GPUs use a fundamentally different architecture; one would have to program an application specifically for a GPU for it to work, and significantly different techniques are required to program GPUs. These different techniques include new programming languages, modifications to existing languages, and new programming paradigms that are better suited to expressing a computation as a parallel operation to be performed by many stream processors. For more information on the techniques needed to program GPUs, see the Wikipedia articles on stream processing and parallel computing.

Modern GPUs are capable of performing vector operations and floating-point arithmetic, with the latest cards capable of manipulating double-precision floating-point numbers. Frameworks such as CUDA and OpenCL enable programs to be written for GPUs, and the nature of GPUs make them most suited to highly parallelizable operations, such as in scientific computing, where a series of specialized GPU compute cards can be a viable replacement for a small compute cluster as in NVIDIA Tesla Personal Supercomputers. Consumers with modern GPUs who are experienced with Folding@home can use them to contribute with GPU clients, which can perform protein folding simulations at very high speeds and contribute more work to the project (be sure to read the FAQs first, especially those related to GPUs). GPUs can also enable better physics simulation in video games using PhysX, accelerate video encoding and decoding, and perform other compute-intensive tasks. It is these types of tasks that GPUs are most suited to performing.

AMD is pioneering a processor design called the Accelerated Processing Unit (APU) which combines conventional x86 CPU cores with GPUs. This approach enables graphical performance vastly superior to motherboard-integrated graphics solutions (though no match for more expensive discrete GPUs), and allows for a compact, low-cost system with good multimedia performance without the need for a separate GPU. The latest Intel processors also offer on-chip integrated graphics, although competitive integrated GPU performance is currently limited to the few chips with Intel Iris Pro Graphics. As technology continues to advance, we will see an increasing degree of convergence of these once-separate parts. AMD envisions a future where the CPU and GPU are one, capable of seamlessly working together on the same task.

Nonetheless, many tasks performed by PC operating systems and applications are still better suited to CPUs, and much work is needed to accelerate a program using a GPU. Since so much existing software uses the x86 architecture, and because GPUs require different programming techniques and are missing several important features needed for operating systems, a general transition from CPU to GPU for everyday computing is very difficult.

[image: 6BRUglx.png]

[image: AikcvIp.png]
 

onQ123

Member
The GPU can't do everything by itself. The x86 cores are still essential. That's the point you don't seem to get. There aren't even any purely CUDA or OpenCL operating systems at all.

I know that CPUs are better at most OS tasks, but my point was that it could be done, not that it should be done. At the end of the day, both are GP processors.


That doesn't mean it can run an OS, nor does it mean it doesn't need an OS at all.

I dunno if you're familiar, but I've talked in the past about how hUMA and GPGPU can be used to boost performance on stuff like AI. Most AI code is branchy prediction stuff that runs really well on the CPU, but you also need to determine what the actor can see around them (visibility), and also how they plan to move from where they are to where they want to be (pathfinding). So you look around, and based on what you see, you decide where you'd like to be instead of where you are, and then you figure out how you're going to get there. (Ladder? Stairs?)

The problem is, the CPU is really really shitty at visibility and pathfinding. In fact, even though it makes up a fairly small portion of the total AI code, a CPU will spend about 90% of its cycles just on visibility and pathfinding. The other 10% of your CPU time is what makes the actual decisions.

Enter the GPU, which happens to be super awesome at both visibility and pathfinding. So we ask the GPU what the actor can see, and then the CPU can make a decision on what the GPU spots. Not only do we figure out what we see much faster overall, we've freed up 90% of the AI core. Now we can make decisions for ten times as many actors, for example. Or we can teach them to make far more complicated decisions.

Now let's add heterogeneous queuing to the mix. Without it, the CPU basically needs to say, "What can actor1 see? Okay, then how does he get to where he's headed now? Okay, how about actor2?" That wastes a lot of the CPU's precious time, and is unnecessary on GCN. Basically, the GPU already knows it needs to rapidly update visibility and pathfinding for every entry in the actors table. Thanks to hUMA, the very same table is shared between the GPU and CPU, so they don't have to pass that back and forth either; either can directly access any data they need. That means the GPU just takes the first actor, casts rays out of its face, determines which object(s) of interest those rays intersect, sets the resulting array in the actor's canSee property, and moves on to the next actor. After that, the GPU can look up each actor's currentLocation and desiredLocation, and write out a nice path for them to start following in the actors table. Hell, it can probably do the visibility and pathfinding simultaneously, actually, since they're not directly interdependent.

All of this occurs with no input from the CPU whatsoever. "But wait! All of the input is coming from the CPU!" Sure, but the GPU doesn't care (nor even know) if it's looking up data the CPU wrote moments ago, or reading from a script written a year ago. It's just reading two points in space and tracing a line between them through the environment, which it also has a record of. All the GPU knows is "input comes from A; output goes in B", and it just does that, again, and again, and again… because that's what it does.

Meanwhile, from the CPU's perspective, everything is equally simple, but reversed: "input comes from B; output goes in A".

So, yeah, that's what hQ is all about. It doesn't change what the GPU is capable of; it just means the GPU already knows what it's supposed to be doing with the data, basically. Not only does that free up time on the CPU, it reduces interdependency, which helps you make your engine not just asynchronous but also largely autonomous. Ideally, you want each task to be completely independent from the others. That lets you do all kinds of neat tricks. For example, obviously, you want rendering done at 120 Hz in VR, but does your AI really need to make 120 decisions a second? I don't know that humans change their mind that quickly. Running your AI at 30 Hz or even 15 Hz would save you a lot of cycles, and seems like it would provide sufficient realism, especially if you're running more sophisticated scripts now, with all the time you saved offloading visibility and pathfinding.

What would keep the GPGPU from running the OS code if the instructions were written for it?
 

c0de

Member
You do know that GCN CUs could actually run an OS, right?

What? How? The GPU would still have to be instructed by the CPU. That's how it currently works. The GPU is there to help CPUs, not replace them. They can execute a lot of things faster, but they don't live "on their own".
 

DieH@rd

Banned
When Sony increased the CPU on the PSP to 333MHz or something like that for GoW, did it help other games too? Games that were originally programmed for the 222MHz clock?

It helped games that were run on CFW PSPs. There, users had the ability to force CPU clocks no matter what game was active.
 

Jenotron

Banned
I'd hate to see the PS4's OS slow down because of this redistribution. I bought an Xbox One on Friday, and even though my expectations were super low, I'm still disappointed in how sluggishly it operates. It's like running Android on a really crappy phone trying to do anything.
 

DieH@rd

Banned
I'd hate to see the PS4's OS slow down because of this redistribution. I bought an Xbox One on Friday, and even though my expectations were super low, I'm still disappointed in how sluggishly it operates. It's like running Android on a really crappy phone trying to do anything.

And have you ever tried operating the PS4 OS?
 

driver116

Member
I'm guessing the OS switches to a blocking single core mode when in game and back to dual core when the user goes back to the OS. Makes sense to free the extra core up when out of the GUI.
 

d9b

Banned
Could it be that some of the first-party games are already utilising this advantage (Uncharted collection, last patch)?
 
No dev responded to me about GCN CUs being able to run an OS.

AMD's Heterogeneous Queuing (hQ) technology promises to unlock the true potential of APUs, allowing a GPU to queue its own work without the CPU or operating system getting in the way.

The GPU in the PS4 could run an OS if it had to, but why would they run the OS on the GPU?
I'm afraid you're confusing Asynchronous Compute with running a full-blown OS.

A modern x86-64 processor has tons of features that a GPU doesn't have, and it supports over 1000 different instructions. A GPU/SPU is specialized for certain tasks (mainly linear algebra/matrix multiplication). Its feature/instruction set is limited compared to a CPU, and that's why it's able to push several teraflops. It's a different design philosophy, depending on what you want to do with a limited transistor budget. A CPU is a jack of all trades and master of none, while the GPU/SPU is a specialized, streaming processor.
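
To put rough numbers on that trade-off (back-of-the-envelope math from the public specs): the PS4 GPU reaches its ~1.84 TFLOPS the wide-and-simple way, 18 CUs × 64 lanes × 2 FLOPs per clock (multiply-add) × 0.8 GHz ≈ 1843 GFLOPS, while the 8 Jaguar cores at 1.6 GHz manage only on the order of 100 GFLOPS, but they're the ones carrying the branch predictors, out-of-order logic, interrupts and virtual memory that an OS actually lives on.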
 

Kayant

Member
So the other vetted insider earlier (Zoetis?) was wrong when he said there'd be full access to the core?

Maybe there are two things here, I think:

1. FMOD, Eurogamer's source (which might be FMOD) and lherre (thanks for the info as always) all point to it not being completely unlocked, although they are all third-party devs.

2. Zoetis says it's completely unlocked for first-party devs atm and says this has been the case for about 2 months, which lines up with what lherre said (2-3 months).

So hard to say atm.
 

nortonff

Hi, I'm nortonff. I spend my life going into threads to say that I don't care about the topic of the thread. It's a really good use of my time.
I bet the OS could run a lot faster if we could hide/delete/organize everything we wanted.
Just folders for what we want right away and a library for everything else.
 
What would keep the GPGPU from running the OS code if the instructions were written for it?
Well, apart from the fact that it's just not very good at it: potentially, missing circuitry. I don't know enough to say for sure, but I wouldn't be at all surprised if the GPU simply lacked the transistors required to perform vital operations, because why would you waste silicon on functionality nobody will ever call? So maybe it can run the entire OS and maybe it can't, but what difference does it really make if it's a terrible idea to begin with?

That's what I was trying to get across in my hQ explanation, actually. It's dumb to force your decision-maker to waste all of its time crunching numbers, but it's equally stupid to make your number-cruncher try to make decisions. That's the advantage of GCN. Everything is designed around the idea of letting the appropriate chip handle the appropriate function, and letting the two chips work as independently from one another as possible. Running the entire OS on the GPU is just as dumb as running the entire OS on the CPU.

Having information like visibility magically update for the CPU is a huge fucking win, so we should be talking about stuff like that, or what devs are gonna run on this extra CPU core, instead of bickering about moot points like running the OS entirely on the GPU. Yes? :)


full core for gaming
2 modes: 10% or 50% depending on the mode used
Fight!!
 