vg247-PS4: new kits shipping now, AMD A10 used as base, final version next summer

Both machines seem quite capable, although there is no doubt in my mind that Microsoft will win the graphics race (past emphasis on GPU-centric machines, the heavy customization involved this time... :). Not by a huge margin though!
You are approaching this in completely the wrong way. Both consoles are designed with different goals in mind. One will have faster frame rates, better texture streaming, and the ability to push more polygons. The other will offer more texture variety in static scenes (Heavy Rain), higher resolution textures, and better detail.

If the rumored specs turn out to be 100% on the money, the machines will have advantages for different genres, and exclusives will look even more exceptional than they already do this generation. Third parties might have porting headaches, but all things considered, it will be a wash.

Why the sudden change of heart? A few weeks ago you even thought that 2.2+ TFlops was possible, and now you "know" that it won't be more than 1.84 TFlops? The same goes for Proelite - on one hand posting the "weak" specs, while in a B3D thread he posts that 8 times the performance of a PS3 is a conservative estimate. That makes the discussion, and credibility, rather difficult.
In the next month or two, more leaks will come. Right now this is what I know, and there is a reason quite a few people are emphasizing the specs above.

So you claim to know the (January) devkit hardware, or at least know someone who does? Instead of talking about rather useless flop numbers, we could talk about whether the APU will feature GCN 2.0, whether it is a mobile version of a card, whether it is Steamroller/Jaguar, whether there is BC, and so on. Especially if there are no custom parts and no delay for the Steamroller/HD8xxx series, why is there such a long hold-up? Why are there worries about a 2014 release, and why does Aegies only hear that Sony is quiet and doesn't know the "direction"...
Aegies, I honestly believe, knows absolutely nothing. Sony has not been as quiet as you would believe. We've had more leaks, and one leak with almost the exact specs of the Orbis. Sony being behind could allude to Steamroller being delayed, the GDDR5 headaches, and design decisions on how much RAM to include. Microsoft does not dick around with schedules, especially in the Ballmer era. You can damn well bet they will launch on time and in fashion.
 
I'd really like to see Sony employees' faces when reading this thread. Are they happy that they got it right or - even better - that it will surprise us, or on the contrary: terrified, because it's already too late to change anything... That would be precious :)

BTW, does anyone know what Matt Swoboda, Principal Engineer at SCEE, is up to recently? His last presentation, from GDC 2012, was really interesting and hinted at some quite exciting developments in raytracing, SSAO, procedural rendering, liquid simulation, etc.

Or this guy?
 
If the rumored specs turn out to be 100% on the money, the machines will have advantages for different genres, and exclusives will look even more exceptional than they already do this generation. Third parties might have porting headaches, but all things considered, it will be a wash.

The potential downside is that multiplatform development will go lowest common denominator. But hopefully good teams will find ways to leverage the strengths of each platform.
 
I'd really like to see Sony employees' faces when reading this thread. Are they happy that they got it right or - even better - that it will surprise us, or on the contrary: terrified, because it's already too late to change anything... That would be precious :)

BTW, does anyone know what Matt Swoboda, Principal Engineer at SCEE, is up to recently? His last presentation, from GDC 2012, was really interesting and hinted at some quite exciting developments in raytracing, SSAO, procedural rendering, liquid simulation, etc.

Or this guy?
Holy shit! Amazing find on the twitter. "FPGA" guy- :) good work man.
 
The potential downside is that multiplatform development will go lowest common denominator. But hopefully good teams will find ways to leverage the strengths of each platform.
I don't mind if the graphics get tuned down, but I hope we won't get another Skyrim disaster, or the PS3 seriously suffering when it comes to third-party titles. That brings me to another question:

How would both consoles compare when it comes to power for things like physics, proper destruction, loading times... basically everything that is often far more important than a real-time realistic shadow that takes longer to process than everything else ;-)

From my post in the 720 thread:


[...]I could live with a minor bump in graphics if the overall credibility of the "world" gets better. More people wandering the streets, realistic physics, proper destruction, etc. add much more to the atmosphere than graphics alone. Of course it depends on the game (e.g. Splinter Cell) - proper shadows and correct lighting are important as well. I couldn't care less for all those glossy combat suits if I can't shoot a tin can off a table[...]
 
The only customizations MS could make to improve a 1.2 TFlops GPU are adding HSA, or a really fast new tessellation unit with more than the 2 polygon setup engines GCN has - lots of them, like in the GTX 680. But I'm not sure that could compensate for a 1.84 TFlops one...
 
So the PS4 may actually have 4GB of GDDR5? Funny how some were so adamant that it wouldn't be the case due to motherboard complexity/cost.
I doubt it (more for heat reasons than for cost). It must be 2.5D-stacked DDR3, but with an interposer allowing 1024 bits of bandwidth or something like that. I think this (huge bandwidth via interposer) is the major R&D investment in the PS4.
 
The only customizations MS could make to improve a 1.2 TFlops GPU are adding HSA, or a really fast new tessellation unit with more than the 2 polygon setup engines GCN has - lots of them, like in the GTX 680. But I'm not sure that could compensate for a 1.84 TFlops one...
Microsoft is not foolish. Durango's customized parts are an unknown quantity, but an emphasized one. Don't just look at the TF number and try to derive its potential.
 
Microsoft is not foolish. Durango's customized parts are an unknown quantity, but an emphasized one. Don't just look at the TF number and try to derive its potential.
But what is physically impossible is still impossible. No known person (insider, engineer, or whatever) here or on B3D has a clue what magic could make a 1.2 TFlop GPU behave like a GTX 680. Seriously. In a graphics engine with only closed rooms, TBDR could make that happen (and that is a PowerVR patent), but... what else?

Another thing that is possible, but outside the rumors, that could help reach this: 16 CPU threads in an HSA environment. That would suppose a new game-engine paradigm and make things feasible in another way (Tim Sweeney talked about multi-CPU-core-centric engines in some of his interviews).
 
But what is physically impossible is still impossible. No known person (insider, engineer, or whatever) here or on B3D has a clue what magic could make a 1.2 TFlop GPU behave like a GTX 680. Seriously. In a graphics engine with only closed rooms, TBDR could make that happen (and that is a PowerVR patent), but... what else?

Another thing that is possible, but outside the rumors, that could help reach this: 16 CPU threads in an HSA environment. That would suppose a new game-engine paradigm and make things feasible in another way (Tim Sweeney talked about multi-CPU-core-centric engines in some of his interviews).
You can look at expected engine processing flows (e.g. for a deferred renderer) and put in optimisations that drastically speed certain sections up, giving you overall speed equivalent to a faster GPU that has to brute-force it.

E.g. writing stuff to memory is the big one, I guess. Being able to keep things on die in eDRAM for as long as possible is potentially huge (although we said a lot of that about the 360).

Also, having the CPU on the same die as the GPU gives you huge latency savings.
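To put a rough number on the memory-write point, here's a back-of-envelope in Python. The resolution, target count, and formats are purely illustrative assumptions, not anything from the leaks:

```python
# Purely illustrative: external-memory traffic of a deferred renderer's
# G-buffer. Assumed numbers (not from any leak): 1080p, four 32-bit
# render targets plus a 32-bit depth buffer, each written and read once.

WIDTH, HEIGHT = 1920, 1080
TARGETS = 5                # 4 G-buffer render targets + depth
BYTES_PER_PIXEL = 4
FPS = 60

bytes_per_frame = WIDTH * HEIGHT * TARGETS * BYTES_PER_PIXEL * 2  # write + read
gb_per_second = bytes_per_frame * FPS / 1e9

print(f"~{gb_per_second:.1f} GB/s just for G-buffer traffic")
```

Keep those targets on die in eDRAM and that traffic never touches main memory - that's the kind of section-by-section saving being described.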
 
Could someone post a small recap for me, how do we know that PS4 suddenly has 4 GB GDDR5? And that it's 30% over Durango in terms of TFLOPS? Something new leaked (January devkits)?
 
thuway said:
Holy shit! Amazing find on the twitter. "FPGA" guy- :) good work man.
antic604 said:
Matt Swoboda, Principal Engineer at SCEE is up to recently? His last presentation from GDC 2012 was really interesting

I thought it sounded important, though honestly I don't know what it means (even after skimming through the Wikipedia entry :)). Still, he says part-time, which might suggest it's only a hobby...
If you look through the link, over and over again the milliseconds taken by ray tracing or intersection calculations are the stumbling block, so he finds ways around it by blurring a smaller sample. An FPGA can be 100x faster at intersection calculations, and only a small FPGA is needed. That's my guess anyway, from reading on this months ago when the rumors of programmable arrays from the Sony CTO filtered through SemiAccurate as "FPGA". First-generation games will not use the FPGA, if it's in there, as only the beta versions of the developer hardware will have it.
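For anyone wondering what an "intersection calculation" actually looks like, here's a minimal Python sketch of the classic ray/AABB slab test - cheap per ray, but executed millions of times per frame, which is why offloading it is attractive. This is a textbook illustration, not anything from Swoboda's talk:

```python
# Minimal sketch of the ray/box (slab) intersection test that dominates
# these timings. Cheap individually, but run millions of times per frame.

def ray_aabb_hit(origin, inv_dir, box_min, box_max):
    """Slab test: returns True if a ray starting at `origin` with
    per-axis inverse direction `inv_dir` hits the axis-aligned box."""
    tmin, tmax = 0.0, float("inf")
    for o, inv, lo, hi in zip(origin, inv_dir, box_min, box_max):
        t1, t2 = (lo - o) * inv, (hi - o) * inv
        tmin = max(tmin, min(t1, t2))
        tmax = min(tmax, max(t1, t2))
    return tmin <= tmax

# Ray from the origin along +x (1e9 stands in for 1/0 on unused axes),
# box straddling x in [1, 2]:
print(ray_aabb_hit((0, 0, 0), (1.0, 1e9, 1e9), (1, -1, -1), (2, 1, 1)))  # True
```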

If you read the cites I posted, the best way to produce a cutting-edge, powerful, and inexpensive next-generation game console is to break the design up into smaller, higher-yielding parts and then assemble them, after testing, onto a silicon interposer. Using the same interposer you can also have 3D-stacked ultra-wide-IO memory.

You come to this design choice by reading the AMD reasoning in a 2011 article (below) and understanding that yields have been decreasing as we have moved to smaller node sizes. If it's APU plus GPU plus custom hardware IP on an SI, it's still the same idea... all will be pretested and either 3D- or 2.5D-attached to an SI that also acts like an MCM.

http://eda360insider.wordpress.com/2011/12/14/3d-week-driven-by-economics-its-now-one-minute-to-3d/ said:
According to the data gleaned from presentations by Samsung, Toshiba, AMD, and others, 3D IC assembly gives you the equivalent performance boost of 2 IC generations (assuming Dennard scaling wasn’t dead). Garrou then quoted AMD’s CTO Bryan Black, who spoke at the Global Interposer Technology 2011 Workshop last month. AMD has been working on 3D IC assembly for more than five years but has intentionally not been talking about it. AMD’s 22nm Southbridge chips will probably be the last ones to be “impacted by scaling” said Black. AMD’s future belongs to partitioning of functions among chips that are process-optimized for the function (CPU, Cache, DRAM, GPU, analog, SSD) and then assembled as 3D or 2.5D stacks.
The same AMD philosophy is echoed in a 2012 article. Kabini/Temash are the first SoCs; Richland is not a SoC, so it again supports Jaguar CPUs.

I'm at a loss to understand the one-large-chip Xbox 3 rumor, as such a chip at 28nm is going to have very low yields at first, and yield will be an issue for its entire life; even FPGAs are broken up into 2 to 4 chips and 2.5D-attached to an SI. This has been done since 2010. Also, a single large custom chip cannot fully take advantage of AMD R&D going forward.

Near-future off-the-shelf AMD designs would be the best way to take advantage of AMD R&D: 20nm 2014 designs (Pennar) ported to 28nm, and those include SI and 3D-stacked memory as seen in the Amkor PDF. Those designs are waiting on 3D-stacked memory, which console volumes are going to drive and make available.

This rumor of a one-chip Xbox 3 design is so stupid I have a hard time believing it. Only if this design was chosen for a 2012 launch (no Jaguar, likely no 8000 series GPU, and 32nm SOI not 28nm) would it be believable. That both Kryptos and Thebe were outed at the same time by HWinfo.com makes me believe the designs are similar, and the large one-chip design is confusing a large silicon interposer at 32nm with a large monolithic silicon chip.

If we can believe Sweetvar26, whose info has been correct so far, Kryptos has a Jaguar CPU and was forged after Thebe, which also has a Jaguar CPU. HWinfo.com seems to support this (Oct 8, 2012 info release for both).

Even in 2008, AMD increased yield by connecting a pretested CPU and GPU together to create a fusion chip:

http://www.reghardware.com/2008/08/18/amd_swift_apu/ said:
AMD's first 'Fusion' processor isn't due until mid-2009, but specs are leaking out. Given how AMD has touted its 'native' multi-core designs, AMD's next-gen part isn't as integrated as you'd expect

The chip, codenamed 'Swift', brings together CPU and GPU. However, according to Taiwanese mobo-maker moles, cited by Chinese-language site HKEPC, Swift's 'Kong' graphics core is a separate die built into the CPU package.

Does this matter? From a technology standpoint, not much, and from a business perspective it's a smart move. AMD can be sure both GPU and CPU work properly before they're sealed in their black ceramic shell. That means better yields and better profitability.

As AMD puts it, Swift will be an "optimised design using more existing IP for less risk and faster time to market".

However, having spent so long needling Intel over the fact that the chip giant's quad-core chips were simply two dual-core dies in the same package whereas its quad-core CPUs were single slabs of silicon, it's pleasing to see that the boot is now firmly on the other foot.

AMD's use of an established GPU core for Kong - it's based on the 'RV710' - will mean Swift gets DirectX 10.1 support and will feature the company's UVD video decoding core. Kong, like Swift's CPU die, will be fabbed at 45nm.

Curiously, the moles claim Swift will not use HyperTransport but a new bus codenamed 'Onion'. The GPU's link to the on-board memory controller is called 'Garlic', apparently. However, the description is vague, so it's probably best not to read too much into this at the moment.

Kong is said to clock at 600-800MHz and link to DDR 3 graphics memory over a 128-bit bus.

The CPU will be a multi-core 'Stars' design.

Swift's northbridge components come from those designed for AMD's 'Griffin' mobile CPUs - aka the Turion X2 Ultra.

And, lastly, what do you call a CPU+GPU combo? AMD's acronym is APU - Accelerated Processing Unit.
Likely the APU + GPU will be connected together by a similar custom "vegetable bus".
 
Could someone post a small recap for me, how do we know that PS4 suddenly has 4 GB GDDR5? And that it's 30% over Durango in terms of TFLOPS? Something new leaked (January devkits)?
A source has recently come out and said the latest devkit with PS4 has 4 GB GDDR5. He also mentions there will be no APU + GPU, but it will be APU only. The rest of the specs match up almost exactly with the VGleaks. I could say something to get people fired, but the VGleaks page has the majority of things correct. Things can change from now til launch.
 
A source has recently come out and said the latest devkit with PS4 has 4 GB GDDR5. He also mentions there will be no APU + GPU, but it will be APU only. The rest of the specs match up almost exactly with the VGleaks. I could say something to get people fired, but the VGleaks page has the majority of things correct. Things can change from now til launch.
Thx thuway, helpful as always!
 
A source has recently come out and said the latest devkit with PS4 has 4 GB GDDR5. He also mentions there will be no APU + GPU, but it will be APU only. The rest of the specs match up almost exactly with the VGleaks. I could say something to get people fired, but the VGleaks page has the majority of things correct. Things can change from now til launch.
One more question, is it UMA (GDDR5)?
 
A source has recently come out and said the latest devkit with PS4 has 4 GB GDDR5. He also mentions there will be no APU + GPU, but it will be APU only. The rest of the specs match up almost exactly with the VGleaks. I could say something to get people fired, but the VGleaks page has the majority of things correct. Things can change from now til launch.
http://www.vgleaks.com/world-exclusive-ps4-in-deep-first-specs/ Really old spec, from before Jaguar was announced. APU-only and GDDR5 both have issues: AMD recommends APU + GPU until the 2014 GPUs, and GDDR5 is too expensive and hot.

CPU

4 core (2 core pairs) 3.2 GHz AMD x86 (Steamroller) — moved on to 2 Jaguar packages (Sweetvar26; Kabini (Jaguar) is the first SoC that can add third-party IP)
aggregate, 10x PS3 PPU performance
512 KByte L2 per core pair — with Jaguar, L2 is 2 MB/package
64 bit pointers

GPU

AMD R10x series GPU @ 800 MHz (Tahiti) — moved on from the 7000 series GPU to the 8000 series GPU
aggregate 10x RSX performance, 1.84 TFlops — the aggregate is APU GPU + second GPU
DX "11.5" — the current Microsoft DX spec is 11.1, so this must include some future features. At the time this was posted, OpenGL was catching up with DX
cheap branching
18 compute units, focus on fine grained compute & tessellation — 2 CUs + 16 CUs (the Temash APU has an 8280 with 2 CUs)

MEMORY:

2 GByte UMA, pushing for 4 GByte — 4 GB UMA, likely wide IO
192 GByte/sec non-coherent access for e.g. GPU — Yole speculation: 102 GByte/sec, approx. equal to wide IO memory (512-bit); could be 1024-bit and approx. 200 GB/sec
12 GByte/sec coherent access for CPU — both explained by this link to an AMD developer slide show.
>50% longer memory latency for CPU/GPU compared to PC!!!
DXT is lowest bpp texture compression format

MEDIA:

50 GByte BD drive — likely upgraded to 4 layers, which will allow 4K Blu-ray using HEVC
PCAV, between 3.3 and 8x BD, most likely clamped to 6x
automatic background caching to HDD
HDD SKU with at least 380 GByte
downloadable games

EXTRA HARDWARE:

video encoder/decoder (decoder in all AMD APUs) — the encoder may also be for chat
audio processing unit, ~200 concurrent MP3 streams
HW zlib decompressor — already in all AMD APUs, part of the W3C specs
Missing here is the DSP for the depth camera
FPGA — speculation based on the CTO's "programmable arrays" and the SemiAccurate FPGA report. A small FPGA for ray intersection-database calculations is 100x faster than a CPU (may be the DX 11.5 feature).
16 GB flash memory for fast loading of the OS and memory swaps
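For what it's worth, the 1.84 TFlops figure drops straight out of the standard GCN arithmetic, assuming the rumored 18 CUs at 800 MHz (a sketch under those assumptions, not confirmed hardware):

```python
# Rough sketch: theoretical single-precision throughput of a GCN-style GPU.
# Each GCN compute unit has 64 shader lanes, and a fused multiply-add
# counts as 2 FLOPs per lane per cycle.

def gcn_tflops(compute_units, clock_mhz, lanes_per_cu=64, flops_per_cycle=2):
    """Peak TFLOPS = CUs * lanes * FLOPs/cycle * clock."""
    return compute_units * lanes_per_cu * flops_per_cycle * clock_mhz * 1e6 / 1e12

print(f"18 CU @ 800 MHz: {gcn_tflops(18, 800):.2f} TFLOPS")  # -> 1.84
print(f"12 CU @ 800 MHz: {gcn_tflops(12, 800):.2f} TFLOPS")  # -> 1.23
```

The second line shows why a 12-CU part at the same clock sits near the oft-quoted 1.2 TFlops.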

Yole attended the Sony VP Technology Platform lecture and speculated 512-bit wide IO for the PS4. In the lecture, Yutaka was comparing the PS3's single S3D view to what would be needed by the PS4 to support 5 S3D views. 5x PS3 bandwidth is about 102 GBytes/sec, equal to or less than 512-bit wide IO bandwidth. The costs for TSVs and the interposer are fixed; 3D-stacked DDR memory should be much cheaper and cooler than GDDR5. AMD GPUs would still be memory-bandwidth-starved at 100 GB/sec, so some internal cache for the GPU will be needed. Since AMD has plans to use wide IO memory in 2014, there should be some near-future AMD GPU design including an eDRAM or eSRAM cache that will be used.
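The bus-width arithmetic behind those figures, hedged: the 1.6 Gbps effective per-pin rate below is my assumption, chosen because it makes a 512-bit interface land on ~102 GB/s:

```python
# Back-of-envelope: peak bandwidth of a memory interface.
# bandwidth (GB/s) = bus width in bytes * effective data rate per pin (Gbps).
# 1.6 Gbps per pin is an assumed figure, not from any spec or leak.

def bandwidth_gbs(bus_width_bits, gbps_per_pin):
    return bus_width_bits / 8 * gbps_per_pin

print(bandwidth_gbs(512, 1.6))   # 102.4 GB/s -- matches the ~5x PS3 estimate
print(bandwidth_gbs(1024, 1.6))  # 204.8 GB/s -- the "could be 1024-bit" case
```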

The specs above were taken from a developer platform and are close to final target.

HWinfo.com outed the Thebe and Kryptos AMD designs in their Oct 8 2012 update.

AMD Southern Islands: Radeon HD 7950 (TAHITI PRO), Radeon HD 7970 (TAHITI XT), Radeon HD 7990 (NEW ZEALAND), IBIZA, COZUMEL, KAUAI.
AMD London series: WIMBLEDON, PITCAIRN, HEATHROW, CHELSEA.
AMD THAMES, LOMBOK, GREAT WALL, SUMMER PALACE, CAPE VERDE.
AMD Sea Islands: Oland, Bonaire, Hainan, Curacao, Aruba.
AMD Solar Systems: Mars, Sun, Neptune, Venus.
AMD Fusion: SUMO, WRESTLER, TRINITY DEVASTATOR/SCRAPPER, RICHLAND, THEBE, KRYPTOS.
AMD Kabini, Samara, Kaveri, Pennar.
Pennar/Samara (2014 Jaguar APU) ported from the 20nm TSMC database to GlobalFoundries. It has two 128-bit LPDDR3 memory interfaces and a new GNB. It fits in the same product lineup as Kabini. My guess is that Pennar is the near-future AMD APU design being used by the PS4.

Links indicating the PS4 is using an interposer and stacked (likely DDR3) memory:

Professional papers predicting the PS4 will use an interposer and stacked RAM

Low power XTV mode needed and stacked wide IO memory needed

PS4 includes encoder to serve to handhelds

Amkor now assembling multiple interposer and stacked RAM projects
 
http://www.vgleaks.com/world-exclusive-ps4-in-deep-first-specs/ Really old spec before Jaguar was announced.

CPU

4 core (2 core pairs) 3.2 GHz AMD x86 (Steamroller) moved on to 2 Jaguar packages (Sweetvar26 and Kabini (Jaguar) is the first SoC that can add third-party IP)
aggregate, 10x PS3 PPU performance
512 KByte L2 per core pair with Jaguar L2 is 2 meg/package
64 bit pointers
GPU

AMD R10x series GPU @ 800 MHz (Tahiti)
aggregate 10x RSX performance, 1.84 TFlops — the aggregate is APU GPU + second GPU
DX”11.5″
cheap branching
18 compute units, focus on fine grained compute & tessellation — 2 CUs + 16 CUs (the Temash APU has an 8280 with 2 CUs)
MEMORY:

2 GByte UMA, pushing for 4 GByte — 4 GB UMA, likely ultra-wide IO
192 GByte/sec non-coherent access for e.g. GPU — GDDR5 is 193 GByte/sec, approx. equal to wide IO2 memory (512-bit), not ultra-wide IO (1024-bit, and generally considered eDRAM attached to the logic)
12 GByte/ sec coherent access for CPU
>50% longer memory latency for CPU/ GPU compared to PC!!!
DXT is lowest bpp texture compression format
MEDIA:

50 GByte BD drive
PCAV, between 3.3 and 8x BD, most likely clamped to 6x
automatic background caching to HDD
HDD SKU with at least 380 GByte
downloadable games
EXTRA HARDWARE:

video encoder/decoder (decoder in all AMD APUs) — the encoder may also be for chat
audio processing unit, ~200 concurrent MP3 streams
HW zlib decompressor — already in all AMD APUs, part of the W3C specs
Rigby, I never said this was 100% exactly the same thing, but it's 90% there.
 
If a single APU has decent enough performance (10x PS3), why would you still want a discrete GPU? Pro forma?

I'm thinking that with a single APU, the system should be much smaller than with a discrete GPU.
Are there any APUs that have something around 2.5 teraflops, which would be the minimum for a decent next-gen jump?
 
Waiting for some real news is so painful. We're after CES and for now got nothing. So maybe we have to wait for GDC?
This is about as real as it gets. Expect more leaks to come very soon, including code names :). The information I have is all relatively new. Once the heat from CES dies down, more people will chime in.

Are there any APUs that have something around 2.5 teraflops, which would be the minimum for a decent next-gen jump?
No, and don't expect Sony or MS to waste silicon on developing a machine that can push 2.5 TF. The GPUs in both systems are most likely Frankensteins, with the most important parts included and any excess waste cut off. TF is not a good metric for judging system potential.
 
What from VGLeaks has ever been accurate to warrant this trust?
You will just have to trust your fellow GAF members on this one :). In a few weeks someone will leak out new information. New kits should ship soon in any case and the information we have will once more become irrelevant. The more important thing is we can ballpark what Sony is attempting.
 
Are there any APUs that have something around 2.5 teraflops, which would be the minimum for a decent next-gen jump?
No, at least not commercially. Probably, with proper customization, it can be done, but as was mentioned before, don't rely on FLOPS too much. They will tell you very little about a system's true power.
 
You will just have to trust your fellow GAF members on this one :). In a few weeks someone will leak out new information. New kits should ship soon in any case.
Ok cool, thanks! One question - any idea if these new kits will be the final version, or could there be further tweaks? If there could be further tweaks, then doesn't a 2013 launch date seem highly unlikely at this point?
 
Rigby I never said this was 100% exactly the same thing, but its 90% there.
I agree. The specs align with the Sony VP Technology platform; it's just revising the hardware choices to get there. The biggest difference is building using TSV/AMD packages and an SI, which was outlined in the Sony CTO interview and Yole's take on the Sony VP Technology lecture. Those allow higher yields and lower costs from day one. A 10-year design requires wide IO memory, not GDDR5.
 
I agree. The specs align with the Sony VP Technology platform; it's just revising the hardware choices to get there. The biggest difference is building using TSV/AMD packages and an SI, which was outlined in the Sony CTO interview and Yole's take on the Sony VP Technology lecture. Those allow higher yields and lower costs from day one. A 10-year design requires wide IO memory, not GDDR5.
I agree. I think the devkits right now are using GDDR5, but if they can move to Wide IO it'll be for the best.

Ok cool, thanks! One question - any idea if these new kits will be the final version, or could there be further tweaks? If there could be further tweaks, then doesn't a 2013 launch date seem highly unlikely at this point?
None of this is final. Final devkits will ship in the summer. Do not look at this and expect everything to look 100% the same. Things can change in either a positive or a negative direction. Do not expect any major revisions (switching from an 8850 to an 8970), but expect things such as clock speeds to be tweaked until release. This is a giant balancing act.

Was there some new development [looks at last few pages]? New VGleak? Confirmation of some old vgleak?
Some good information has come down the grapevine that the VGleaks article was almost 100% correct and that Sony has upped the RAM to 4 GB. It is 192 GB/s unified :). Also, the Durango specs have been leaked, and apparently some mystery sauce is at work.

Through PM I was just told I am almost right. This is your Orbis folks :).
 
Some good information has come down the grapevine that the VGleaks article was almost 100% correct and that Sony has upped the RAM to 4 GB. It is 192 GB/s unified :). Also, the Durango specs have been leaked, and apparently some mystery sauce is at work.

Through PM I was just told I am almost right. This is your Orbis folks :).
The only question I have is how Sony managed to get that bandwidth with 4 GB GDDR5 - either they went crazy (aka expensive) on the bus, logic, etc., or there are bigger modules available...
 
The only question I have is how Sony managed to get that bandwidth with 4 GB GDDR5 - either they went crazy (aka expensive) on the bus, logic, etc., or there are bigger modules available...
1. Densities for GDDR5 are improving in 2013.
2. Clamshell mode allows fewer chips to be used.

So if densities improve by 100% and clamshell mode is implemented, it will allow GDDR5 to be put in place.

3. The third option requires Wide IO configuration.

The real takeaway is 192 GB/s to the GPU. GDDR5 is in the devkits; I don't know what the final system will have.
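For the curious, here's one GDDR5 layout that hits both numbers - a 256-bit bus at an effective 6.0 Gbps per pin, with sixteen 2-Gbit chips in clamshell. These parameters are my assumptions for illustration, not confirmed specs:

```python
# Sketch of how 4 GB / 192 GB/s could be reached with GDDR5, under
# assumed (not confirmed) parameters: a 256-bit bus at 6.0 Gbps effective,
# and 16 x 2-Gbit chips, with pairs sharing a 32-bit channel (clamshell).

def gddr5_bandwidth_gbs(bus_width_bits, gbps_per_pin):
    """Peak bandwidth = bus width in bytes * effective per-pin data rate."""
    return bus_width_bits / 8 * gbps_per_pin

def capacity_gb(chip_count, gbit_per_chip):
    """Total capacity in GB from chip count and per-chip density."""
    return chip_count * gbit_per_chip / 8

print(gddr5_bandwidth_gbs(256, 6.0))  # 192.0 GB/s
print(capacity_gb(16, 2))             # 4.0 GB from 16 clamshelled 2-Gbit chips
```

Double the density to 4-Gbit chips (the 2013 improvement mentioned above) and the same 4 GB needs only eight chips.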