
WiiU "Latte" GPU Die Photo - GPU Feature Set And Power Analysis

Are people really expecting something on the level of KZ4 for WiiU ? :O.

There is no way imo, that game probably has a budget close to $100 million and it's made by one of the most renowned first party developers (in terms of visuals) in the industry.

Then you have the hardware, which is about 6x as powerful as the WiiU and more up to date (DX11.1-level features vs the WiiU's DX10.1).

Even at 720p I don't think WiiU is pulling off anything close to KZ4 but that is not to say it won't have some amazing looking first party games.

People need to lower their expectations though.
 
I'm not implying that ;]



Shadow Fall, not KZ2 or 3 lol.

Also, CPU doesn't dictate graphics (though the Wii U GPU is more powerful than PS360 as well).

Don't you think that Wii U could run a game like Shadow Fall, but with Killzone 3 level polygons, slightly better textures, better particle effects, a much better draw distance, and a higher native resolution than the PS3 would be capable of?

I think that's what Nintendo is going for with Wii U. A console that is capable of putting out anything more powerful consoles can do, scaled down - but not to such an extent where the average consumer would care about the difference. The same, solid, experience, but with less gravy and subsequent cost.
 
Are people really expecting something on the level of KZ4 for WiiU ? :O.

There is no way imo, that game probably has a budget close to $100 million and it's made by one of the most renowned first party developers (in terms of visuals) in the industry.

Then you have the hardware, which is about 6x as powerful as the WiiU and more up to date (DX11.1-level features vs the WiiU's DX10.1).

Even at 720p I don't think WiiU is pulling off anything close to KZ4 but that is not to say it won't have some amazing looking first party games.

People need to lower their expectations though.

I've been under the impression that Wii U hardware is DX11 equivalent. Is that incorrect? Link?
 

Donnie

Member
Anything more than a 5% clock increase would've made for more than 'slightly' adjusted clockspeeds, IMO. I wish we could ask those guys directly for more info. It'd kind of be a disaster IMO if a devkit didn't have identical hardware 6 months after the console launch. This can't be the latest devkit.

In your opinion, but you didn't write the article. A 10% increase could be considered slight, even a 15% or 20% increase could; it depends on the person. As you say it would be interesting to hear more on the subject, but I suspect if they knew more they'd say. I mean, if they're talking about the clock increase we already heard about a while ago (1GHz/400MHz to 1.25GHz/550MHz), that's an increase of 25% CPU and 37.5% GPU. Far from slight IMO, but their opinion? Who knows.
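(For reference, the percentage jumps above are simple arithmetic; a minimal sketch using the clock figures quoted in this post:)

```python
# Percentage increase between the early-devkit clocks and the bumped clocks
# quoted above (1GHz/400MHz -> 1.25GHz/550MHz).
def pct_increase(old, new):
    return (new - old) / old * 100

print(pct_increase(1.0, 1.25))  # CPU: 25.0%
print(pct_increase(400, 550))   # GPU: 37.5%
```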

Then again the devkit may have had identical performance in comparison to the retail unit. But perhaps they decided to adjust the clock speeds of the retail unit and this is a preemptive act for the next planned update.

By the way, WiiU hasn't been out for 6 months. It's just over 3, though yes, IF they're just getting devkits adjusted to the performance of retail units 3 months after launch then that's quite poor, though not a disaster.
 
Can you please be more specific about what I don't know and enlighten me? Maybe you know something better than me; I don't recall saying that I know everything.
Whoa there partner, I was talking about AOC83. NOT you.

What things do the PS3 or Xbox 360 do better? Have you developed a game on any of these systems? Did you do any comparison of their architectures against a custom CPU built by IBM, like the Cell of the PS3, the MCM PowerPC CPU of the Wii U, and the Xenon? My questions are sincere.
Lol seriously man calm down... one example is SIMD processing: I think Xenon at least matches it while Cell completely crushes it. A second example is SMT: Xenon has SMT while, IIRC, Espresso does not.
As for the Jaguar, it is not an upgraded x2. It's custom all right, but it's built as a single-die APU with 4MB of cache for the 8-core processor, where the normal Jaguar has 2MB of cache. Unless there is some magic trick or sorcery done by AMD or Sony engineers that I do not know about, the specs of the CPU reveal that the CPU is JUST TWO Jaguars DUCT TAPED together clocked at 1.6GHz.
*cough*.... that's where the whole "2x" thing came from... Jaguar is just 4 cores, PS4 and Durango use... 8. 4x2=8.

The Jaguar is a cheap solution for small portable devices and "mews" out enough power for its size: just 3.1mm² per core, without L2 cache, with low power consumption and no overheating problems. It is a very good choice for cost but not for performance.

http://www.xbitlabs.com/news/mainbo...t_Core_AMD_Jaguar_Microprocessors_Report.html

http://www.xbitlabs.com/news/cpu/di...ext_Generation_Jaguar_Micro_Architecture.html
It still outperforms the measly 1.2GHz tri-core "Gekko". It's also on a much more efficient 28nm process... has more overall L1 and L2 cache, higher clock rates AND a higher flop rate (~190 GFLOPS per 4 cores...).


Also, where exactly are the elements you present as facts about the PS4 GPU?
It's an AMD GPU based on the GCN architecture and it's more powerful than the 16CU 1.7TFLOP 7850. What more do you need to know? It's basically confirmed. They apparently have some other "extra" tweaks we don't know about, but it's powerful and we know it is.

http://www.polygon.com/2013/2/20/4009940/playstation-4-tech-specs-hardware-details

The official specs describe it as a "next generation" BASED graphics engine.

I would love to see that be true, but on a single APU from AMD? The most powerful GPU they have planned for the market, the RADEON HD 8850, consumes 130 watts for ONLY the GPU, has 2.99 TFLOPS of performance, and IT IS NEXT GEN.

http://www.cpu-world.com/news_2012/2012092201__Sea_Islands_Radeon_HD_8800-Series_Specs_Revealed.html

The RADEON HD 7850 is previous generation and its specs are: 860MHz Engine Clock, 2GB GDDR5 Memory, 1200MHz Memory Clock (4.8 Gbps GDDR5), 153.6GB/s memory bandwidth (maximum), 1.76 TFLOPS Single Precision compute power, GCN Architecture, and it runs at a peak power of 96 watts.

http://www.techpowerup.com/reviews/AMD/HD_7850_HD_7870/24.html
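(As a sanity check on the 153.6GB/s figure in those specs: it falls out of the memory clock and the 7850's 256-bit bus. The bus width isn't listed in the snippet above, so it's an assumption here.)

```python
# GDDR5 bandwidth for the 7850 figures quoted above.
# The 256-bit bus width is assumed; it isn't listed in the spec snippet.
memory_clock_mhz = 1200
effective_gbps_per_pin = memory_clock_mhz * 4 / 1000  # GDDR5 transfers 4x per clock -> 4.8 Gbps
bus_width_bits = 256

print(effective_gbps_per_pin * bus_width_bits / 8)  # 153.6 GB/s
```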

I don't know why you're talking about 8850 or whatever...
If by any means AMD manages to fit ALL that in a single APU that runs at the imaginary (your reference) 130 watts of power consumption, I will BE VERY impressed. Let's not forget the amazing 1.8 TFLOPS of performance from the PS3 specs.

GPU: RSX @550MHz

1.8 TFLOPS floating point performance
Full HD (up to 1080p) x 2 channels
Multi-way programmable parallel floating point shader pipelines

http://playstation.about.com/od/ps3/a/PS3SpecsDetails_3.htm
:lol

So you are guessing without even knowing the facts about what the Wii U can or cannot do? Volumetric particle effects at 1080p with AA? You have a very sharp eye to notice AA on a direct feed video of a game.
How can you NOT notice it? It's using MLAA; that's why there is the sub-pixel shimmering artifact. Seriously man, you're going crazy at this point. We have direct feed footage; there is no reason not to be able to tell what's going on there.


Sure you do.
:lol :lol
 

Donnie

Member
Even the V3 kits are really old by now. Final silicon supposedly didn't become available until close to launch, so V4.2 or V5 or something.

We know that initial kits were clocked at 1GHz/400MHz, and we know the clocks were bumped pretty early on, so V3 might have been the first revision running at 1.25/550.

What record do we have as far as WiiU devkits go?
 
Don't you think that Wii U could run a game like Shadow Fall, but with Killzone 3 level polygons, slightly better textures, better particle effects, a much better draw distance, and a higher native resolution than the PS3 would be capable of?

I think that's what Nintendo is going for with Wii U. A console that is capable of putting out anything more powerful consoles can do, scaled down - but not to such an extent where the average consumer would care about the difference. The same, solid, experience, but with less gravy and subsequent cost.

I already said Wii U would surpass PS3. You guys are getting too trigger happy for no reason at all.
 

USC-fan

Banned
Nintendo (and Microsoft) have sacrificed bandwidth for latency. Think of that pipe/water analogy above, but with a bigger tap to fill a kettle for the PS4. That's great. You can turn your tap on and get more water flowing through the tap at once. The only downside is that with the Wii U and 720 taps the water flows as soon as you turn your tap on, but your PS4 tap waits a few seconds before water comes out. RAM latency was a huge problem for developers last gen.

As for the CPU, as I've already mentioned, it's better at general processing than Xenon. Xenon is miles better at floating point work but that's because the PS3 and 360 were poorly designed. The CPUs in the PS4 and 720 will also be 'considerably worse' than Xenon at floating point work. That's what GPUs are for.
You say latency was a huge problem with last gen consoles. Have you really looked into this? Because I have been researching this and have yet to find anything on it at all. I'd love to hear this info.

I have to say I enjoy reading your posts.
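(To put a rough number on the tap analogy: for small scattered reads latency dominates, for big streaming reads bandwidth does. This is only an illustrative toy model; the latency and bandwidth values below are made up, not measurements of any console.)

```python
# Toy model: time to fetch a block = latency + size / bandwidth.
# All numbers are illustrative assumptions, not real console figures.
def fetch_time_ns(size_bytes, latency_ns, bandwidth_gb_s):
    return latency_ns + size_bytes / (bandwidth_gb_s * 1e9) * 1e9

# A single 64-byte cache line: the lower-latency pool wins.
print(fetch_time_ns(64, latency_ns=150, bandwidth_gb_s=12.8))   # ~155 ns
print(fetch_time_ns(64, latency_ns=300, bandwidth_gb_s=176.0))  # ~300 ns

# A 1MB streaming read: the higher-bandwidth pool wins.
print(fetch_time_ns(1 << 20, latency_ns=150, bandwidth_gb_s=12.8))   # ~82,000 ns
print(fetch_time_ns(1 << 20, latency_ns=300, bandwidth_gb_s=176.0))  # ~6,300 ns
```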

Are people really expecting something on the level of KZ4 for WiiU ? :O.

There is no way imo, that game probably has a budget close to $100 million and it's made by one of the most renowned first party developers (in terms of visuals) in the industry.

Then you have the hardware, which is about 6x as powerful as the WiiU and more up to date (DX11.1-level features vs the WiiU's DX10.1).

Even at 720p I don't think WiiU is pulling off anything close to KZ4 but that is not to say it won't have some amazing looking first party games.

People need to lower their expectations though.
shhhhh It's part of the fun of WiiU threads. People build up massive expectations that are just so far from reality that there can only be disappointment. Just look back at the WiiU thread before E3 last year. Meltdowns are going to be epic.
 
shhhhh It's part of the fun of WiiU threads. People build up massive expectations that are just so far from reality that there can only be disappointment. Just look back at the WiiU thread before E3 last year. Meltdowns are going to be epic.

They'll just set themselves up for disappointment again.
 

Donnie

Member
It still outperforms the measly 1.2GHz tri-core "Gekko". It's also on a much more efficient 28nm process... has more overall L1 and L2 cache, higher clock rates AND a higher flop rate (~190 GFLOPS per 4 cores...).

First of all I think you're miscalculating the theoretical floating point performance of Jaguar cores (8 cores won't even reach 190 GFLOPS theoretically, never mind 4 cores). Also, can you tell me what theoretical floating point performance Bobcat is supposed to have? It might put things into perspective in comparison to the "measly" Espresso cores. Which, by the way, are around the same size as Jaguar cores (relative to the different process technology) and two generations evolved from Gekko.
 
First of all I think you're miscalculating the theoretical floating point performance of Jaguar cores (8 cores won't even reach 190 GFLOPS, never mind 4 cores). Also, can you tell me what theoretical floating point performance Bobcat is supposed to have? It might put things into perspective in comparison to the "measly" Espresso cores. Which, by the way, are around the same size as Jaguar cores (relative to the different process technology) and two generations evolved from Gekko.

EDIT: I can't count, and I apparently don't know how many digits are in a GFLOP. Punch my face.
[Image: AMD Jaguar ISSCC 2013 slide]


Straight from AMD.

And yes, Measly. Espresso is old technology and they held back a great CPU by keeping Wii BC.
 

Donnie

Member
shhhhh It's part of the fun of WiiU threads. People build up massive expectations that are just so far from reality that there can only be disappointment. Just look back at the WiiU thread before E3 last year. Meltdowns are going to be epic.

You're a very.. very sad person.
 
It's an AMD GPU based on the GCN architecture and it's more powerful than the 16CU 1.7TFLOP 7850.

Not to be pedantic, but I just googled and a stock 7850 is 1.76 teraflops. I routinely refer to the PS4's GPU as essentially a 7850 (just as Durango's rumored GPU is basically a 7770).

Both have more CUs, but are clocked lower than their PC counterparts, thus the teraflops are close to equal, thus they are basically equal.
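(For anyone wondering where those teraflop figures come from: GCN's peak single-precision rate is just CUs x 64 shaders x 2 FLOPs per clock x clock speed. A quick sketch; the 18 CU / 800MHz PS4 and 12 CU / 800MHz Durango figures are the rumoured ones, not confirmed in this thread.)

```python
# Peak single-precision throughput of a GCN GPU:
# compute units * 64 shaders per CU * 2 FLOPs per shader per clock (multiply-add) * clock.
def gcn_tflops(cus, clock_ghz):
    return cus * 64 * 2 * clock_ghz / 1000

print(gcn_tflops(16, 0.860))  # Radeon HD 7850: ~1.76 TFLOPS
print(gcn_tflops(18, 0.800))  # PS4 (rumoured 18 CUs @ 800MHz): ~1.84 TFLOPS
print(gcn_tflops(10, 1.000))  # Radeon HD 7770: 1.28 TFLOPS
print(gcn_tflops(12, 0.800))  # Durango rumour (12 CUs @ 800MHz): ~1.23 TFLOPS
```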
 

Donnie

Member
[Image: AMD Jaguar ISSCC 2013 slide]


Straight from AMD.

And yes, Measly. Espresso is old technology and they held back a great CPU by keeping Wii BC.

That states 19 GFLOPS per core, and 16 GFLOPS per core for Bobcat (a 1.33GHz CPU). Blu's benchmark shows that Broadway (prior to Espresso's enhancements) pushes significantly more flops per clock than Bobcat. You really fall for the whole "brand name" PR, don't you? So many CPUs are based on older models, but if they don't mention the progression some people may just think they're something new and revolutionary.
 
Not to be pedantic, but I just googled and a stock 7850 is 1.76 teraflops. I routinely refer to the PS4's GPU as essentially a 7850 (just as Durango's rumored GPU is basically a 7770).

Both have more CUs, but are clocked lower than their PC counterparts, thus the teraflops are close to equal, thus they are basically equal.

You're right.

I honestly can't believe people are bringing this back.

Old as in it could be better because we have had advancements in CPU architectures.

Also, 45nm. Old.

Is that core flop count supposed to mean per core or per 4 cores, and is it in FLOPS or MFLOPS? Clarification needed.

You're right, my bad.
 
I already said Wii U would surpass PS3. You guys are getting too trigger happy for no reason at all.

I'm just getting at the point that the Wii U has the specific newer abilities, along with enough power, that whatever on-screen advantages more powerful hardware may have aren't something the average consumer would readily notice or care about. Especially when they'd have to spend more money.
 

Donnie

Member
You're right.



Old as in it could be better because we have had advancements in CPU architectures.

Also, 45nm. Old.

So put Espresso on a 28nm process and suddenly it would be new technology in your opinion? Otherwise, what advancements do you think are required for this CPU to be considered new technology? Or does it even matter if performance is comparable?
 

USC-fan

Banned
That states 19 GFLOPS per core, and 16 GFLOPS per core for Bobcat (a 1.33GHz CPU). Blu's benchmark shows that Broadway (prior to Espresso's enhancements) pushes significantly more flops per clock than Bobcat. You really fall for the whole "brand name" PR, don't you? So many CPUs are based on older models, but if they don't mention the progression some people may just think they're something new and revolutionary.
Broadway: 2.9 GFLOPS @ 729MHz
Broadway is 4 x 0.729 = 2.9 GFLOPS per core

Jaguar is 8 x 1.6 = 12.8 GFLOPS per core, or 102.4 GFLOPS for 8 cores
WiiU using 3x Broadway at 1.24GHz would be 14.88 GFLOPS

Maybe I'm missing something, because at 1GHz it would be:

Broadway: 4 GFLOPS per core
Jaguar: 8 GFLOPS per core

Must be talking about some fuzzy "real world" numbers.
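(All of the per-core figures in that post come from the same formula, peak GFLOPS = FLOPs per cycle x clock in GHz. A minimal sketch using the FLOPs-per-cycle values assumed above: 4 for Broadway/Espresso paired singles, 8 for Jaguar.)

```python
# Peak per-core throughput: FLOPs issued per cycle * clock in GHz = GFLOPS.
# The FLOPs-per-cycle values are the ones assumed in the post above.
def core_gflops(flops_per_cycle, clock_ghz):
    return flops_per_cycle * clock_ghz

print(core_gflops(4, 0.729))      # Broadway @ 729MHz: ~2.9 GFLOPS
print(core_gflops(4, 1.24))       # one Espresso core @ 1.24GHz: ~5.0 GFLOPS
print(core_gflops(4, 1.24) * 3)   # three Espresso cores: ~14.9 GFLOPS
print(core_gflops(8, 1.6))        # one Jaguar core @ 1.6GHz: 12.8 GFLOPS
print(core_gflops(8, 1.6) * 8)    # eight Jaguar cores: 102.4 GFLOPS
```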
 

Donnie

Member
Broadway: 2.9 GFLOPS @ 729MHz
Broadway is 4 x 0.729 = 2.9 GFLOPS per core

Jaguar is 8 x 1.6 = 12.8 GFLOPS per core, or 102.4 GFLOPS for 8 cores

Maybe I'm missing something, because at 1GHz it would be:

Broadway: 4 GFLOPS per core
Jaguar: 8 GFLOPS per core

Must be talking about some fuzzy "real world" numbers.

I'm talking about the numbers phosphor112 posted direct from AMD. As well as Blu's floating point benchmark that compares Bobcat to Broadway.

Seriously, I don't see why this needs to be explained; was my post unclear in some way? Also, fuzzy real world numbers? Surely real world numbers are worth more than theoretical mumbo jumbo.
 

AkiraGr

Banned
Whoa there partner, I was talking about AOC83. NOT you.

Sorry, but you quoted me in your message, so I thought you were talking to me.

Lol seriously man calm down... one example is SIMD processing: I think Xenon at least matches it while Cell completely crushes it. A second example is SMT: Xenon has SMT while, IIRC, Espresso does not.

These examples are moot. The PowerPC architecture with multithreaded cores synchronized at the same speed does not reach the performance of an MCM single-threaded multicore PowerPC CPU, even at lower speeds.

*cough*.... that's where the whole "2x" thing came from... Jaguar is just 4 cores, PS4 and Durango use... 8. 4x2=8.

Then you should not have put a multiplication symbol but a plus: 4+4=8 for cores and 2+2=4 for cache memory. Just adding two of the same CPU together doesn't x2 the performance.

It still outperforms the measly 1.2GHz tri-core "Gekko". It's also on a much more efficient 28nm process... has more overall L1 and L2 cache, higher clock rates AND a higher flop rate (~190 GFLOPS per 4 cores...).

Gekko? Really now, are you serious? Three GameCube CPUs duct taped together, is that what you're implying here? I am at a loss for words. Do you have any evidence for what you are saying? Also, your numbers on Jaguar may be correct, BUT how on earth do you know what numbers the Wii U Espresso pushes? Any source or links would be welcome for all the above.

It's an AMD GPU based on the GCN architecture and it's more powerful than the 16CU 1.7TFLOP 7850. What more do you need to know? It's basically confirmed. They apparently have some other "extra" tweaks we don't know about, but it's powerful and we know it is.

I was not debating the fact that the 7850 is a good PC GPU, but whether AMD will fit all that on a single APU die without serious sacrifices.

I don't know why you're talking about 8850 or whatever...

Because even on the next generation of their PC GPUs, which is more efficient on power consumption with better performance, they did not manage to lower power consumption by more than 7%. How on earth, with present technology and off-the-shelf cheap parts, will AMD manage 1.8 TFLOPS in a single APU with low power consumption? From my point of view they are lying about the performance of the GPU, same as with the PS3, or they have discovered some powerful alien technology that powers the APU of the PS4.

How can you NOT notice it? It's using MLAA; that's why there is the sub-pixel shimmering artifact. Seriously man, you're going crazy at this point. We have direct feed footage; there is no reason not to be able to tell what's going on there.

Sorry man but no, I could not see what you are saying. I saw very good HDR lighting, standard geometry when the action started, and particle effects that are possible even on PS3. The DoF was outstanding in the scripted scenes on the hover plane at the beginning and the end of the demo, something current gen cannot produce at 1080p, but the scale, animation and geometry took a sharp fall in the action part; especially the NPC characters that were running were like robots. Of course the game is not ready and I believe they will fix all this and it will be a very polished game, like GG know how to make.
 

USC-fan

Banned
Please read my post, it's pretty clear. I'm talking about the numbers phosphor112 posted direct from AMD. As well as Blu's floating point benchmark that compares Bobcat to Broadway.

Seriously, I don't see why this needs to be explained; was my post unclear in some way? Also, fuzzy real world numbers? That's amazingly rich.. seriously :D

The reason for the performance hit was the 64-bit FPU datapath in Bobcat. If you ran that test on Jaguar you would get very different results.
 

Schnozberry

Member
It still outperforms the measly 1.2GHz tri-core "Gekko". It's also on a much more efficient 28nm process... has more overall L1 and L2 cache, higher clock rates AND a higher flop rate (~190 GFLOPS per 4 cores...).

It's an AMD GPU based on the GCN architecture and it's more powerful than the 16CU 1.7TFLOP 7850. What more do you need to know? It's basically confirmed. They apparently have some other "extra" tweaks we don't know about, but it's powerful and we know it is.

The PPC750 was never in an SMP configuration, never reached clock rates as high as Espresso, and never had this much on-die cache. It's actually a pretty nice piece of hardware, despite your fatuous assertion of Gekkos being duct taped together. Theoretical floating point numbers are of dubious value in the real world (see the Cell CPU), and floating point calculations aren't particularly useful for game AI, which predominantly uses fixed point math. I think the Wii U was probably designed for the GPU to do most of the floating point math, as is the PS4.

Also, the Radeon 7850 uses GCN. If the PS4 is a newer architecture than current PC tech, it will use GCN 2. If the PS4 GPU is 1.8 TFLOPS, it's pretty close to the power of a 7850. It's definitely not a 7870 or similar, because that part reached 2.5 TFLOPS at single precision. The extra tweaks are probably related to whatever efficiency advantages were made through the API, which should allow the hardware to get closer to its maximum performance compared to PC AMD hardware running on DirectX.
 

Donnie

Member
The reason for the performance hit was the 64-bit FPU datapath in Bobcat. If you ran that test on Jaguar you would get very different results.

You don't know what performance difference that makes, and neither do I until we're shown benchmarks. But my point is it's pretty ridiculous to call Espresso cores measly when what people say is the weakest part of their design (floating point performance) is significantly better per clock than a Bobcat core. AMD claim Jaguar is 20% more powerful in floating point performance than Bobcat according to the numbers phosphor112 posted. Now it may be that in reality it's more than that (though PR rarely if ever exceeds reality), but even if it was, calling Espresso cores measly is still ludicrous given what we've seen even from older Broadway cores vs Bobcat. It's not even based on any knowledge of each CPU's performance, simply the idea that one is "based off older technology" while the other is apparently brand spanking new. Forgetting the fact that Jaguar is going to be based off some older core or another, who honestly cares what each CPU is based on? Performance is all that matters. 8 Jaguar cores at 1.6GHz are going to be quite a bit better than 3 Espresso cores at 1.24GHz, that much is obvious. But I take issue with exaggerations based purely on the perception of "newness" making one piece of hardware massively better than another.

I mean Bobcat would be considered newer technology than Espresso, right? At 1.33GHz it's specced to produce around 16 GFLOPS per core, while Espresso's younger brother (Broadway) should output 5.3 GFLOPS at the same 1.33GHz. But Broadway has been shown to perform better per clock than Bobcat (so better floating point performance from a 1.33GHz Broadway than a 1.33GHz Bobcat). Hurray for old measly technology.
 
These examples are moot. The PowerPC architecture with multithreaded cores synchronized at the same speed does not reach the performance of an MCM single-threaded multicore PowerPC CPU, even at lower speeds.
The benefits of the MCM are part of the package, not the CPU itself.


Then you should not have put a multiplication symbol but a plus: 4+4=8 for cores and 2+2=4 for cache memory. Just adding two of the same CPU together doesn't x2 the performance.
My point is that it was the CPU... twice.



Gekko? Really now, are you serious? Three GameCube CPUs duct taped together, is that what you're implying here? I am at a loss for words. Do you have any evidence for what you are saying? Also, your numbers on Jaguar may be correct, BUT how on earth do you know what numbers the Wii U Espresso pushes? Any source or links would be welcome for all the above.
It's the same architecture base; it's not 3 GameCube CPUs. Just like the Wii CPU wasn't the same as a GC CPU. Also, take a look at that Wii U CPU thread.

http://www.neogaf.com/forum/showthread.php?t=513471

It's got the breakdown of all the parts that we can tell from the die shot.

I was not debating the fact that the 7850 is a good PC GPU, but whether AMD will fit all that on a single APU die without serious sacrifices.

Because even on the next generation of their PC GPUs, which is more efficient on power consumption with better performance, they did not manage to lower power consumption by more than 7%. How on earth, with present technology and off-the-shelf cheap parts, will AMD manage 1.8 TFLOPS in a single APU with low power consumption? From my point of view they are lying about the performance of the GPU, same as with the PS3, or they have discovered some powerful alien technology that powers the APU of the PS4.
You mention the 7850 power draw... 96W at max. That includes the 2GB GDDR5, at 2Gbit per chip, which would be... 8 chips? It also has a giant fan... power to the display ports... etc. The APU will be fine in terms of power draw.

Sorry man but no, I could not see what you are saying. I saw very good HDR lighting, standard geometry when the action started, and particle effects that are possible even on PS3. The DoF was outstanding in the scripted scenes on the hover plane at the beginning and the end of the demo, something current gen cannot produce at 1080p, but the scale, animation and geometry took a sharp fall in the action part; especially the NPC characters that were running were like robots. Of course the game is not ready and I believe they will fix all this and it will be a very polished game, like GG know how to make.

Lololol. They demonstrated tessellation. That's more than "standard geometry." DOF was outstanding (and it still exists when you scope in on someone, not just in the scripted parts). You're also mistaken to think that we won't see full environments like that. And the "robot" thing probably has to do with (as you said) it being at an early stage... like alpha stage; so much in an alpha state that all the marines were the same model.
 

Donnie

Member
It's the same architecture base; it's not 3 GameCube CPUs. Just like the Wii CPU wasn't the same as a GC CPU. Also, take a look at that Wii U CPU thread.

But you actually called it a "1.2GHz tri-core Gekko", which is why he's questioning your assertion. Maybe you should stop making sensationalist remarks?

Also, do you have any response to my posts? Specifically the ones in response to yourself.
 

USC-fan

Banned
You don't know what performance difference that makes, and neither do I until we're shown benchmarks. But my point is it's pretty ridiculous to call Espresso cores measly when what people say is the weakest part of their design (floating point performance) is significantly better per clock than a Bobcat core. AMD claim Jaguar is 20% more powerful in floating point performance than Bobcat according to the numbers phosphor112 posted. Now it may be that in reality it's more than that (though PR rarely if ever exceeds reality), but even if it was, calling Espresso cores measly is still ludicrous given what we've seen even from older Broadway cores vs Bobcat. It's not even based on any knowledge of each CPU's performance, simply the idea that one is "based off older technology" while the other is apparently brand spanking new. Forgetting the fact that Jaguar is going to be based off some older core or another, who honestly cares what each CPU is based on? Performance is all that matters. 8 Jaguar cores at 1.6GHz are going to be quite a bit better than 3 Espresso cores at 1.24GHz, that much is obvious. But I take issue with exaggerations based purely on the perception of "newness" making one piece of hardware massively better than another.

I mean Bobcat would be considered newer technology than Espresso, right? At 1.33GHz it's specced to produce around 16 GFLOPS per core, while Espresso's younger brother (Broadway) should output 5.3 GFLOPS at the same 1.33GHz. But Broadway has been shown to perform better per clock than Bobcat (so better floating point performance from a 1.33GHz Broadway than a 1.33GHz Bobcat). Hurray for old measly technology.
OK, take that one data set and run with it. Flawed logic, but it's par for the course here.

But you actually called it a "1.2GHz tri-core Gekko", which is why he's questioning your assertion. Maybe you should stop making sensationalist remarks?

Also, do you have any response to my posts? Specifically the ones in response to yourself.
Moving it to 28nm wouldn't change anything, but if they added modern SIMD support I would consider it modern.
 

Donnie

Member
OK, take that one data set and run with it. Flawed logic, but it's par for the course here.

Moving it to 28nm wouldn't change anything, but if they added modern SIMD support I would consider it modern.

Why don't you explain why it's flawed logic to discuss the only real benchmark we have for Broadway vs Bobcat and its relevance to Espresso's possible performance? I gave you quite a thorough argument; I think I deserve more than a one-liner with no substance whatsoever.

As far as SIMD support goes, we can once again refer to Bobcat. 16 GFLOPS per core of oh-so-"modern" SIMD magic vs 5.3 GFLOPS of paired singles, and they come out about equal. That's not to say that more modern SIMD units are bad, but there needs to be more balanced reasoning than "SIMD unit good" and "no SIMD unit bad".

By the way, it may only be one benchmark and I accept it's not definitive, but it's one benchmark more than you or anyone else has to back up your "Espresso is measly old technology" claims.
 
But you actually called it a "1.2GHz tri-core Gekko", which is why he's questioning your assertion. Maybe you should stop making sensationalist remarks?

Also, do you have any response to my posts? Specifically the ones in response to yourself.

About what? Would making it 28nm make it modern? Certainly, more so than the 45nm process, which is now defunct. ARM, AMD and Intel (I think i5/i7 are off of 32nm now) are all on 28nm or below now. There are huge advantages to that. The 45nm process is 2008 tech; they could have gone for better, but they haven't. Hell, even 32nm would have been a full die shrink down, and they could have put in a better GPU and CPU.
 
Lol seriously man calm down... one example is SIMD processing: I think Xenon at least matches it while Cell completely crushes it. A second example is SMT: Xenon has SMT while, IIRC, Espresso does not.
Not to rain on anyone's parade, and I'm pretty sure you know this, but...

Xenon and CELL's PPE both support 2-way SMT, and it's there first and foremost because they are in-order execution designs. That means the pipeline will stall, and during those stalls a single-threaded core can't do anything in between jobs; waiting "in line" for something means ignoring smaller jobs/calls, making it very ineffective. So instead of just filling unused overhead/bandwidth like hyperthreading on P4/i7s (which gives at most a 30% boost, usually less), SMT in that configuration actually has full access to the CPU pipeline as if it were concurrently the main thread, which eases the problem of having the CPU do nothing for too many cycles. In that regard, it's either use the second thread or get zero performance for those cycles.

That "problem" doesn't apply to the Wii U CPU, as an out-of-order execution pipeline with SMT works very differently from an in-order one with SMT, as already stated. Of course it being SMT-able would probably help with ports of current generation titles whose code is heavily counting on it, but it's still a very different beast and certainly wouldn't work as a magic switch either.

Current gen having SMT is no "plus" over the Wii U; other next gen platforms having it though, might.
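(A rough way to see why SMT matters much more on an in-order core: if a single thread loses a big fraction of cycles to stalls, a second hardware thread can reclaim most of them, whereas an out-of-order core already hides part of that on its own. This is only a toy utilisation model with made-up numbers, not a claim about the actual chips.)

```python
# Toy model of pipeline utilisation.
# stall_fraction: cycles a single thread loses to stalls.
# hidden_by_ooo: share of those stalls an out-of-order core hides by itself.
# All values are illustrative assumptions only.
def utilisation(stall_fraction, smt_threads=1, hidden_by_ooo=0.0):
    busy = 1.0 - stall_fraction * (1.0 - hidden_by_ooo)
    # extra hardware threads can soak up the remaining stall cycles
    return min(1.0, busy * smt_threads)

print(utilisation(0.5))                     # in-order, single thread: 0.5
print(utilisation(0.5, smt_threads=2))      # in-order with 2-way SMT: 1.0
print(utilisation(0.5, hidden_by_ooo=0.6))  # out-of-order, single thread: 0.8
```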
 

Donnie

Member
I don't think what phosphor112 posted clearly says what he thought it did or what you think it does. It doesn't specify what it means or provide units.

It certainly doesn't say what phosphor112 thought it meant. To me it says 15.9 GFLOPS per core for Bobcat and 19.4 GFLOPS per core for Jaguar. Of course I realise no clock speed is given, and fair enough, that's a big sticking point.
 

Donnie

Member
Lol seriously man calm down... one example is SIMD processing: I think Xenon at least matches it while Cell completely crushes it. A second example is SMT: Xenon has SMT while, IIRC, Espresso does not.

I missed this gem earlier. You realise Jaguar doesn't have SMT? Old technology then, yes?
 
It certainly doesn't say what phosphor112 thought it meant. To me it says 15.9 GFLOPS per core for Bobcat and 19.4 GFLOPS per core for Jaguar; can there be any other interpretation there? Of course I realise no clock speed is given, and fair enough, that's a big sticking point.
[Image: AMD Jaguar ISSCC 2013 slide]


All it says is:
Core flop count = 194490

No unit, no clarification on meaning.

The only way it becomes 19.4 GFLOPS is if their unit in question is 10^5 FLOPS i.e. 100 kFLOPS or 0.1 MFLOPS; which would be a very odd unit.

They could also be referring collectively to the 4 cores of one Jaguar.

Meanwhile, someone in the original thread for that image said that it wasn't actually referring to floating point operations per second at all, but rather an actual component called a flop.
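(For what it's worth, the arithmetic behind the competing readings is easy to lay out; the slide's bare number only gives a plausible per-core figure if you guess an unusual unit, and reading it as MFLOPS is likely where the earlier ~190 GFLOPS figure came from. The unit guesses below are just that, guesses.)

```python
# The bare number from the slide, with no unit attached.
core_flop_count = 194490

# Read as multiples of 10^5 FLOPS it gives the 19.4 GFLOPS interpretation above.
print(core_flop_count * 1e5 / 1e9)  # ~19.4 GFLOPS

# Read as plain FLOPS it's obviously not a throughput figure.
print(core_flop_count / 1e9)        # ~0.0002 GFLOPS

# Read as MFLOPS it lands near the ~190 GFLOPS figure quoted earlier in the thread.
print(core_flop_count * 1e6 / 1e9)  # ~194 GFLOPS
```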
 
Not to rain on anyone's parade, and I'm pretty sure you know this, but...

Xenon and CELL's PPE both support 2-way SMT, and it's there first and foremost because they are in-order execution designs. That means the pipeline will stall, and during those stalls a single-threaded core can't do anything in between jobs; waiting "in line" for something means ignoring smaller jobs/calls, making it very ineffective. So instead of just filling unused overhead/bandwidth like hyperthreading on P4/i7s (which gives at most a 30% boost, usually less), SMT actually has access to the CPU in those moments as if it were concurrently the main thread, easing the problem of having the CPU do nothing for too many cycles. In that regard, it's either use the second thread or get zero performance for those cycles.

That "problem" doesn't apply to the Wii U CPU, as an out-of-order execution pipeline with SMT works very differently from an in-order one with SMT, as already stated. Of course it being SMT-able would probably help with ports of current generation titles whose code is heavily counting on it, but it's still a very different beast and certainly wouldn't work as a magic switch either.

Current gen having SMT is no "plus" over the Wii U; other next gen platforms having it though, might.

I missed this gem earlier. You realise Jaguar doesn't have SMT? Old technology then, yes?

I was comparing PS360 to Wii U. IIRC AMD hasn't done multithreading since they started their APUs.

And I'm sure I noted out-of-order processing as an advantage for Espresso. If I didn't, that was a given from the start; that's why people saying Xenon/Cell > Wii U CPU are stupid.
 
The PPC750 was never in an SMP configuration, never reached clock rates as high as Espresso, and never had this much on-die cache. It's actually a pretty nice piece of hardware, despite your fatuous assertion of Gekkos being duct taped together. Theoretical floating point numbers are of dubious value in the real world (see the Cell CPU), and floating point calculations aren't particularly useful for game AI, which predominantly uses fixed point math. I think the Wii U was probably designed for the GPU to do most of the floating point math, as is the PS4.

Also, the Radeon 7850 uses GCN. If the PS4 is a newer architecture than current PC tech, it will use GCN 2. If the PS4 GPU is 1.8 TFLOPS, it's pretty close to the power of a 7850. It's definitely not a 7870 or similar, because that part reached 2.5 TFLOPS at single precision. The extra tweaks are probably related to whatever efficiency advantages were made through the API, which should allow the hardware to get closer to its maximum performance compared to PC AMD hardware running on DirectX.
Very true.

There's a lot to be said about the concept of old architecture. Fact is, the G3/PPC750 is still relevant or viable enough; it's pretty tight for what it is, a 6-stage pipeline with out-of-order execution, and performance has stayed strong per clock while consumption is still low by today's standards.

Short pipelines have advantages: latency is very low, the cache-miss penalty is small, it's efficient. The problem was scaling that to higher frequencies against the Pentium 3, which had more stages, and as the G3 started to struggle in that race the G4 (PPC7400) and G5 (PPC970) appeared. Both performed worse at the same frequency (I still remember the G3 kicking the G4's butt at the same frequency) but were clocked way higher and introduced stuff like AltiVec and the like.

Basically it was left behind because it couldn't scale up all that well, just like Intel left the Pentium 3 behind for a time to focus on NetBurst/Pentium 4. Turns out the Pentium 4 was an error, so the Pentium M, Core Duo and later CPUs are based on the Pentium 3 line of succession again. This is not unheard of either: the Intel Atom was based on the Pentium 1, as is/was Larrabee and Intel MIC. Taking old stuff out of the drawer and building on top of it is common practice, because despite manufacturing process and some technology changes, chances are it is still relevant.

Of course, IBM now has a lot of "lines of succession", and modifying a G3 to scale up would just result in a G4, G5 or a further evolution of those architectures, hence it stays forever in the dead end where it is.

But it's still not a bad CPU. It's not the best Nintendo could have gone for, but it's certainly not the worst either.

Of course I'd like for it to have AltiVec and SMT; but then again the advantage of having SMT on short-pipeline CPUs is very slim if not non-existent. It's probably non-existent here, so we're left with what-if wishful thinking.
I was comparing PS360 to Wii U. IIRC AMD hasn't done multithreading since they started their APUs.

And I'm sure I noted out-of-order processing as an advantage for Espresso. If I didn't, that was a given from the start; that's why people saying Xenon/Cell > Wii U CPU are stupid.
Hmmm, don't look at it as an attack of sorts.

I don't know if you did or not, but I wasn't focusing on that, just that SMT on an in-order CPU is a means to reduce the impact of its own nature. It's not a "plus" over the Wii U; instead it's kinda apples to oranges.
 

Donnie

Member
[Image: AMD Jaguar ISSCC 2013 slide]


All it says is:
Core flop count = 194490

No unit, no clarification on meaning.

The only way it becomes 19.4 GFLOPS is if their unit in question is 10^5 FLOPS i.e. 100 kFLOPS or 0.1 MFLOPS; which would be a very odd unit.

They could also be referring collectively to the 4 cores of one Jaguar.

Meanwhile, someone in the original thread for that image said that it wasn't actually referring to floating point operations per second at all.

Fair enough, it does say "core flop count", which seemed relatively safe to assume meant per-core floating point performance, and 19.4 GFLOPS would be the only corresponding number to make any sense. But it may not be correct, and if so I'd be interested to see what the true numbers are.
 
Fair enough, it does say "core flop count", which seemed relatively safe to assume meant per core floating point performance. If that's not correct then no problem.

I'd be interested to see a source that gives us unequivocal theoretical floating point performance numbers for Bobcat and Jaguar.
I'm not sure there is one for the latter.

Anyway, here's the source of that image: http://www.hardware.fr/marc/ISSCC2013-Final-v5.pdf
 
The PPC750 was never in an SMP configuration, never reached clock rates as high as Espresso, and never had this much on-die cache. It's actually a pretty nice piece of hardware, despite your fatuous assertion of Gekkos being duct taped together. Theoretical floating point numbers are of dubious value in the real world (see the Cell CPU), and floating point calculations aren't particularly useful for game AI, which predominantly uses fixed point math. I think the Wii U was probably designed for the GPU to do most of the floating point math, as is the PS4.

Also, the Radeon 7850 uses GCN. If the PS4 is a newer architecture than current PC tech, it will use GCN 2. If the PS4 GPU is 1.8 TFLOPS, it's pretty close to the power of a 7850. It's definitely not a 7870 or similar, because that part reached 2.5 TFLOPS at single precision. The extra tweaks are probably related to whatever efficiency advantages were made through the API, which should allow the hardware to get closer to its maximum performance compared to PC AMD hardware running on DirectX.

I don't even know if I replied to this (on muh phone), but this is a great post. The stock 750 was locked at a lower clock rate, wasn't it? And yes, I agree that the Wii U GPU is supposed to do most of the work. It's capable of compute functions, and those fixed functions can make up for any general tasks on the GPU taking up processing power.
 

USC-fan

Banned
Fair enough, it does say "core flop count", which seemed relatively safe to assume meant per-core floating point performance, and 19.4 GFLOPS would be the only corresponding number to make any sense. But it may not be correct, and if so I'd be interested to see what the true numbers are.
I have already posted the real numbers.
Why don't you explain why it's flawed logic to discuss the only real benchmark we have for Broadway vs Bobcat and its relevance to Espresso's possible performance? I gave you quite a thorough argument; I think I deserve more than a one-liner with no substance whatsoever.

As far as SIMD support goes, we can once again refer to Bobcat. 16 GFLOPS per core of oh-so-"modern" SIMD magic vs 5.3 GFLOPS of paired singles, and they come out about equal. That's not to say that more modern SIMD units are bad, but there needs to be more balanced reasoning than "SIMD unit good" and "no SIMD unit bad".

By the way, it may only be one benchmark and I accept it's not definitive, but it's one benchmark more than you or anyone else has to back up your "Espresso is measly old technology" claims.

Look, if you want to learn about this stuff, go read the old thread.
 

Donnie

Member
Wrong, sorry. AMD Jaguar has the equivalent of SMT, called CMT.

I'm not wrong; Jaguar is 1 thread per core, so 8 cores equals 8 threads. CMT is clustered multithreading, like AMD uses in Bulldozer, where two cores in fact share the resources of one module in order to simulate something similar to SMT. If Jaguar used the same kind of technology then you could call the Durango/Orbis CPU a 4-core dual-threaded CPU, but not an 8-core dual-threaded one.
 