Two days later, in this thread on BY3D, my take is confirmed.
Still, no one on BY3D is doing research to find out whether memory on logic or 3D stacked memory will be available for use inside the SOC, nor are they looking for memory faster than GDDR5. All seem confirmed for 2012-2013, with 3D stacked memory in 2014 at the outside. Stacks of more than two 3D memory wafers (on top of each other) are said to wait for 2015 (the aim is to keep all chips that are 2.5D attached to the SOC substrate at nearly the same height). The 2 gig PS4 rumors might support two 1 gig 3D memory wafers inside the SOC. (All of these, including memory on logic, can be 2.5D attached to the substrate.)
If the developer rumors of 2 gig GDDR5 are true, then the link from the SOC to the second GPU will not be a PCIe bus but a unified memory (one memory pool) GDDR5 bus. Also, since we will have 2 GPUs and a minimum of 2 X86 CPUs as well as accelerators accessing this same memory, it will need to be very fast. The Sony patent calls for memory transfer speeds of 35 GB/sec out of the 4 module 1PPU4SPU. (The Sony patent of Dec 2010 was for 2 PPUs and 16 SPUs; that may have, and probably has, changed to an AMD X86 APU (2 Jaguar X86 CPUs + 300 GPU math elements), which is essentially the same performance.) With several other CPU-GPUs using the same bus, as well as display I/O like Infinity view, I expect the requirements for memory speed inside the SOC to be even higher. Memory on logic in the SOC (80-100 meg) can reduce the load on the memory outside the SOC (the MMUs and logic in the SOC take care of this transparently). If the second GPU is going to be outside the SOC due to heat or size issues, then 3D stacked memory inside the SOC would still require a memory bus to the second GPU.
If 3D stacked memory in quantity is expensive or not available, then 100 meg or so of 256-512 bit wide RAM inside the SOC, with 2 gigs of GDDR5 on an external 256 bit bus connected to the second GPU, seems likely. A later refresh might include 3D stacked memory inside the SOC (replacing the differential GDDR5 or 256 bit GDDR5), as well as moving the second GPU inside the SOC when smaller die sizes reduce size and heat.
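To put rough numbers on the external 256 bit GDDR5 bus, here is a minimal sketch of the usual peak-bandwidth arithmetic (bus width times per-pin data rate, divided by 8). The per-pin rates are my own illustrative assumptions, not rumored specs, and the function and constant names are made up for this example. The 35 GB/sec figure is the one from the Sony patent mentioned above.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, gbit_per_pin: float) -> float:
    """Peak GB/s for a bus: width in bits times per-pin Gbit/s, divided by 8 bits per byte."""
    return bus_width_bits * gbit_per_pin / 8.0


# ASSUMPTION: illustrative per-pin data rates in Gbit/s, not rumored console specs.
ASSUMED_PIN_RATES = (4.0, 5.0, 6.0)

# From the post: the 35 GB/sec transfer speed quoted from the Sony patent.
PATENT_GB_S = 35.0

for rate in ASSUMED_PIN_RATES:
    peak = peak_bandwidth_gb_s(256, rate)
    print(f"256 bit GDDR5 @ {rate} Gbit/s per pin: {peak:.0f} GB/s peak, "
          f"{peak / PATENT_GB_S:.1f}x the 35 GB/sec patent figure")
```

These are theoretical peaks only; sustained rates with several CPUs, GPUs and display I/O contending for the same pool would be lower.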
Sony and Microsoft should wait for 3D stacked memory, and they have to wait for a new full HSA GPU, which won't be ready until late this year.
All could take advantage of SOC efficiencies to release early @ 32nm, but to get 28nm, a full HSA GPU, 3D stacked memory and the economies of the consortium's building-block 3D stacking, they must release nearly a year after WiiU.
Didn't think of this until now, but a full HSA GPU and 3D stacked memory are key to next generation support and are the limiting factor in a release date. A SOC design with some HSA efficiencies and GDDR5 speed memory (multiple DDR3 controllers and 4 or more banks of interleaved DDR3) has been possible, and with such a design, a redesign to support smaller die sizes, as well as economies from the consortium building blocks, can be implemented in the near future. But 3D stacked memory and an HSA GPU could not be incorporated later, as they drastically change how the next generation game consoles will be designed and massively impact their performance.
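As a rough check on the "GDDR5 speed from interleaved DDR3" idea, here is a small sketch of the channel arithmetic. It assumes each "bank" is an independent 64 bit DDR3 channel with its own controller and that interleaving scales peak bandwidth linearly, which is the best case. The DDR3 speed grades are standard ones, but the 100 GB/s target is a hypothetical number picked only for illustration.

```python
import math


def ddr3_channel_gb_s(mega_transfers_per_s: int, channel_bits: int = 64) -> float:
    """Peak GB/s of one DDR3 channel: transfers per second times bytes per transfer."""
    return mega_transfers_per_s * channel_bits / 8 / 1000


def channels_needed(target_gb_s: float, mega_transfers_per_s: int) -> int:
    """Interleaved channels needed to reach a target peak, assuming linear scaling."""
    return math.ceil(target_gb_s / ddr3_channel_gb_s(mega_transfers_per_s))


# ASSUMPTION: hypothetical target bandwidth, not a figure from any console spec.
TARGET_GB_S = 100.0

for grade in (1600, 2133):
    one = ddr3_channel_gb_s(grade)
    print(f"DDR3-{grade}: {one:.1f} GB/s per channel, {4 * one:.1f} GB/s for 4 channels, "
          f"{channels_needed(TARGET_GB_S, grade)} channels for {TARGET_GB_S:.0f} GB/s")
```

Under these assumptions, 4 channels of DDR3-1600 land at about 51 GB/s peak, so the "or more" part matters if the goal is to match a wide GDDR5 bus.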
Talk on BY3D about the number of compute units and relative CPU and GPU performance does not take into account the impact of 3D stacked memory (2X) and prefetch by the CPU for the GPU (2X) (which requires the 2014 design GPU). There is overlap between the two, so the total increase in performance is going to be less than the 4X you would get by multiplying them; I'd guess below 3X total. This is something professionals should be talking about but probably can't because of NDAs. Also obvious is that the clock speed of GPUs can/should be higher and fewer GPU elements would be needed, which should increase yield and reduce price.
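To show how two overlapping 2X gains can end up below 3X, here is a toy model. It is purely my own assumption for illustration, not a measured or published formula: with no overlap the gains multiply, with full overlap only the larger gain counts, and in between the result is interpolated geometrically. The `overlap` parameter is hypothetical.

```python
def combined_speedup(s_memory: float, s_prefetch: float, overlap: float) -> float:
    """Toy model: overlap=0 means the gains multiply, overlap=1 means only the larger counts."""
    if not 0.0 <= overlap <= 1.0:
        raise ValueError("overlap must be between 0 and 1")
    smaller = min(s_memory, s_prefetch)
    # Divide out the share of the smaller gain that is already covered by the other one.
    return s_memory * s_prefetch / smaller ** overlap


# 2X for 3D stacked memory and 2X for CPU prefetch, as guessed in the post.
for overlap in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"overlap {overlap:.2f}: {combined_speedup(2.0, 2.0, overlap):.2f}X")
```

In this model the total drops below 3X once the overlap is a bit over 0.4, so a "below 3X" guess corresponds to the two techniques sharing somewhat less than half of their benefit.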
So talk about the performance difference between WiiU, which can have none of the efficiencies above, and Xbox 3 or PS4, which can, is missing the point if it relies only on comparisons using current GPUs and CPUs. Assuming performance is going to be disappointing using the same logic is also missing the point.