What Is an APU? The AMD Jaguar Featured in the PS4, Explained
The age of the APU is firmly here, and for low-power devices it brings a plethora of benefits.
An APU (or, to give it its full name, Accelerated Processing Unit) integrates the CPU and GPU on one die. The idea really started to hit home after AMD acquired ATI. AMD then started a new project by the name of Fusion. The concept behind merging the two processors is to reduce the number of components, along with heat and power draw, while bringing performance up at the same time.
AMD Fusion (as it was formerly known) was recently rebranded as the AMD Accelerated Processing Unit. It's a type of SoC (System on Chip). For home PC systems there are obvious drawbacks, including a limited upgrade path; however, for cheaper systems, laptops and other lower-power devices, an APU makes increasing sense.
Dataflow for AMD's Bobcat APU, similar to the Jaguar
It's becoming increasingly common for tasks that aren't just graphics to be given to the GPU. The nature of a GPU is such that it's extremely good at parallel calculations, and certain applications on your PC take heavy advantage of this. Video editing applications, such as Adobe Premiere, are a fantastic example of the GPU being used for more than just game graphics. Hardware PhysX (using Nvidia's CUDA technology, which you would see on, say, their GeForce cards) is another.
Sony's PlayStation 4 APU has been constructed to allow the GPU to help the CPU out with certain calculations when required, using its GCN (Graphics Core Next) compute units. The PS4 sports 18 of these GCN units, which combined put out around 1.84 teraflops of compute power. The CPU part of the APU package is the weaker side of things. The AMD Jaguar uses 8 cores, clocked at 1.6GHz. The Jaguar is itself based on a redesign and improvement of AMD's Bobcat.
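As a rough sanity check on that 1.84 teraflops figure, the arithmetic can be sketched as below. Only the count of 18 GCN units comes from this article. The 64 shader lanes per unit, the 2 floating-point operations per lane per clock (one fused multiply-add) and the 800MHz GPU clock are assumptions based on commonly quoted PS4 specifications, so treat this as an illustration rather than an official breakdown.

```python
# Rough peak-throughput estimate for the PS4's GPU.
GCN_UNITS = 18                 # from the article
LANES_PER_UNIT = 64            # assumption: shader lanes per GCN unit
FLOPS_PER_LANE_PER_CLOCK = 2   # assumption: one fused multiply-add per clock
GPU_CLOCK_HZ = 800_000_000     # assumption: 800MHz GPU clock


def peak_teraflops(units, lanes, flops_per_clock, clock_hz):
    """Peak single-precision throughput in teraflops."""
    return units * lanes * flops_per_clock * clock_hz / 1e12


ps4_tflops = peak_teraflops(
    GCN_UNITS, LANES_PER_UNIT, FLOPS_PER_LANE_PER_CLOCK, GPU_CLOCK_HZ
)
print(f"{ps4_tflops:.2f} teraflops")
```

This is a theoretical peak: it assumes every lane does useful work on every clock, which real workloads rarely manage.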
The APU drastically improves data flow, eliminating the need for data to travel out and around the motherboard, and so lets the GPU and CPU access far more data in the same amount of time. The thinking behind it is simple: combine general-purpose CPU cores with programmable vector processing engines. Or, to put it another way, the PS4's APU is based on technology that ties CPU scalar processing to high-performance parallel processing.
Architecture of the Southern Islands GPU, which is inside the APU of the PS4
In the AMD APU solution (for example, the PlayStation 4's Jaguar-based APU), a fully programmable x86 CPU (the PS4 version also supports the 64-bit x86-64 instructions) and the vector processing power of a GPU are tied together on a single piece of silicon by a very high-performance bus. Both are able to access the same RAM (in the PlayStation 4's case, GDDR5 with 176GB/s of bandwidth). The silicon of the APU generally contains other important system elements too, for example the I/O controller and memory controllers. A lot of the functionality normally placed on the northbridge moves onto the die, reducing the bandwidth needed to shuttle data around the board.
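The 176GB/s figure can be reproduced with some simple arithmetic. The sketch below assumes a 256-bit memory bus and an effective data rate of 5.5 gigabits per second per pin; both numbers are commonly quoted for the PS4's GDDR5 but are not stated in this article, so they are labelled as assumptions.

```python
# Rough peak memory bandwidth estimate for the PS4's GDDR5.
BUS_WIDTH_BITS = 256      # assumption: width of the memory bus
GBITS_PER_PIN = 5.5       # assumption: effective data rate per pin (Gbit/s)


def bandwidth_gb_per_s(bus_width_bits, gbits_per_pin):
    """Peak bandwidth in gigabytes per second (8 bits per byte)."""
    return bus_width_bits / 8 * gbits_per_pin


ps4_bandwidth = bandwidth_gb_per_s(BUS_WIDTH_BITS, GBITS_PER_PIN)
print(f"{ps4_bandwidth:.0f} GB/s")
```

As with any peak figure, this is the ceiling shared by the CPU and GPU, not what either will see in sustained use.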
Vector processors have been built into GPUs for a while now, and are made up of hundreds or even thousands of compute cores. These compute cores (known as shaders) can each operate simultaneously and are fully programmable.
The PlayStation 4's Jaguar APU is the most complicated that AMD have built so far, and AMD are planning to sell cut-down versions of the core to computer vendors (although currently they'll have only 4 cores and will, obviously, lack the Sony-specific changes).
Some numerically intensive problems lend themselves to parallel algorithms, and others don't. When a machine optimized for parallel computation encounters a problem that cannot be computed in a parallel manner, the machine operates as an inefficient scalar processor, and most of its parallel computing resources sit idle. Conversely, a processor optimized for scalar calculations cannot exploit the parallelism in many algorithms, and thus is limited by its scalar processing speed.
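One common way to put numbers on this trade-off is Amdahl's law. The article doesn't name it, so take this as an illustrative aside: if only a fraction of a job can run in parallel, the serial remainder caps the overall speedup no matter how many units you throw at it. The function name and the example fractions below are made up for illustration.

```python
def amdahl_speedup(parallel_fraction, units):
    """Overall speedup when only part of a job can be spread over `units`."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / units)


# Hypothetical workloads running on 10 parallel units.
for fraction in (0.0, 0.5, 0.9, 1.0):
    print(f"{fraction:.0%} parallel -> {amdahl_speedup(fraction, 10):.2f}x faster")
```

A job with nothing to parallelise gains nothing from the extra units, while a half-parallel job on 10 units ends up less than twice as fast.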
As said earlier, the PlayStation 4 uses the Jaguar APU. The Jaguar offers numerous improvements over the Bobcat core used in the company's existing Brazos platform. Jaguar not only doubles the number of cores per module, from two to four (the PS4 pairs two of these four-core modules to reach its 8 cores), but also doubles the size of the Level 2 cache to 2MB, and it makes the cache shareable among all the cores in the module.
AMD Bobcat APU specs vs the PS4's Jaguar APU specs
What are Scalar CPUs?
Scalar processors (so, the Jaguar CPU part of the APU inside the PlayStation 4) operate on arrays of data element by element. Let's say an application needs to add the 1000 elements in vector A to the 1000 elements in vector B, and then store the results in vector C. It will generally set pointers to each of the vectors, load the values pointed to by A and B into the CPU registers, add them together, and store the result at the location pointed to by C. However, to get through the whole array the CPU has to repeat this work 1000 times!
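In code, the scalar approach looks something like the minimal sketch below. The names scalar_add, vector_a, vector_b and vector_c are made up for illustration, and a Python loop is only a stand-in for what the CPU does with registers and pointers.

```python
def scalar_add(a, b):
    """Add two equal-length vectors one element at a time."""
    c = [0] * len(a)
    steps = 0
    for i in range(len(a)):
        # Load a[i] and b[i], add them, store the result in c[i].
        c[i] = a[i] + b[i]
        steps += 1
    return c, steps


vector_a = list(range(1000))
vector_b = list(range(1000, 2000))
vector_c, scalar_steps = scalar_add(vector_a, vector_b)
print(f"{scalar_steps} add steps for {len(vector_c)} results")
```

Every result costs one trip around the loop, so the work grows in lockstep with the length of the vectors.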
So what about Vector Processors, like the PS4's GCN Units?
Advanced GPUs have hundreds of units available to execute instructions, and they can all operate simultaneously. Let's assume the same problem as before: the application wants to add two one-thousand-element vectors using, say, just 10 of the processing units on the GPU. The software sends a share of the work to each of the units at the same time, so they'll all be working together and will calculate the results in around a tenth of the time that the CPU would take.
But like most things in IT, it's never quite that easy.
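The idea of sharing the vector addition out over 10 units can be sketched as follows. This is only a model of how the work is divided: the function names are hypothetical, and Python threads stand in for GPU units, so don't expect this code itself to run ten times faster.

```python
from concurrent.futures import ThreadPoolExecutor


def add_chunk(a_chunk, b_chunk):
    """Work done by a single unit: add its own slice of the vectors."""
    return [x + y for x, y in zip(a_chunk, b_chunk)]


def parallel_add(a, b, units):
    """Split the vectors into one chunk per unit and add the chunks concurrently."""
    chunk_size = -(-len(a) // units)  # ceiling division
    starts = range(0, len(a), chunk_size)
    with ThreadPoolExecutor(max_workers=units) as pool:
        parts = pool.map(
            lambda s: add_chunk(a[s:s + chunk_size], b[s:s + chunk_size]),
            starts,
        )
        # map() yields the chunks in their original order.
        c = [value for part in parts for value in part]
    return c, chunk_size


vec_a = list(range(1000))
vec_b = list(range(1000, 2000))
vec_c, steps_per_unit = parallel_add(vec_a, vec_b, units=10)
print(f"each unit performs {steps_per_unit} adds instead of {len(vec_a)}")
```

Each unit only has to handle 100 elements, which is where the "around a tenth of the time" figure comes from.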
The architecture of Bobcat, the Jaguar's predecessor
You've also got to consider several other variables, such as the time it takes to set these operations up to begin with (after all, it's not just magic that things work), the time taken to check that the calculations have been completed, and correctly too, and the time taken to actually move the data between the scalar CPU's RAM and the RAM of the GPU. In other words, there is latency introduced by the simple act of the two devices telling each other what they are doing.
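A toy cost model makes the point. Every number below is invented purely for illustration (arbitrary time units, not measurements of any real hardware), and the function names are hypothetical.

```python
def scalar_time(n, time_per_element=1.0):
    """Time for the CPU to process n elements one after another."""
    return n * time_per_element


def offload_time(n, units=10, setup=500.0, transfer_per_element=0.5,
                 check=100.0, time_per_element=1.0):
    """Time to hand the same job to a separate GPU, overheads included."""
    transfer = n * transfer_per_element          # copying data between the two RAMs
    compute = n * time_per_element / units       # the work itself, shared out
    return setup + transfer + compute + check


for n in (1000, 100_000):
    print(f"n={n}: scalar {scalar_time(n):.0f}, offloaded {offload_time(n):.0f}")
```

With these made-up numbers the 1000-element job is actually slower when offloaded, because setup and copying swamp the saving, while the larger job still comes out ahead.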
Because an APU combines the two on the same die, it largely avoids these issues. Sony have worked with AMD in constructing an absolute monster of a die. The PlayStation 4's APU will be able to move data around extremely quickly, with the GCN cores tightly integrated with the CPU and both having direct access to the same memory controller. In the case of the PlayStation 4 (which obviously will require plenty of bandwidth to keep the CPU and GPU satisfied), the system's designers went with GDDR5 RAM. The CPU and GPU will each be able to complement the weaknesses of the other.