
PS4 Rumors: APU code named 'Liverpool', Radeon HD 7970 GPU, Steamroller CPU, 16GB Flash


coldfoot

Banned
As for the Cell 1.5, no matter how well it could work next gen, it will still take longer to produce results on that chip than on an x86 core where developers have far more experience.
The 360 is easy to program for and it does not have x86. I just don't get this love for x86, a chip where a lot of silicon is wasted for the sake of compatibility with PC business software from the 1980s. For me, that kind of chip has no business being in a console. Imagine using smaller (and slower) PPC general-purpose cores (like the WiiU/360) and using the saved silicon budget to have some more CUs in the GPU, or if you're feeling adventurous, some SPU units on die for BC/physics/vector processing.
 
The 360 is easy to program for and it does not have x86. I just don't get this love for x86, a chip where a lot of silicon is wasted for the sake of compatibility with PC business software from the 1980s. For me, that kind of chip has no business being in a console. Imagine using smaller (and slower) PPC general-purpose cores (like the WiiU/360) and using the saved silicon budget to have some more CUs in the GPU.

This. There is no need for that. Will it make PC porting to PS4 easy? Yes, but they still have to work through all the intricacies of the hardware anyway, so why not put something in that's... worthwhile? I also feel like every hacker and their mother knowing how to program for x86 would cause loads of fun for Sony and consumers...
 
Agreed. I think an improved 28nm Cell with either 3PPU/8SPUs or 2PPU/12SPUs (add some more memory if the die size remains <200mm²) would have done the job easily. As it seems, AMD was able to pull all three parties into their camp and Sony even went for the APU thing (this could bite them in the a** in the end, given the turds AMD currently markets).
Think of it this way: a PPU is needed to manage the SPUs just as x86 is needed to manage GPGPU; the branching/general-purpose ability of both the PPU and x86 is needed. But AMD has already created a library and infrastructure that uses x86, not a PPU. A 1PPU/4SPU block can be added to AMD HSA SoCs to serve the special-purpose best-use cases for SPUs; this is the idea behind HSA.
 

KageMaru

Member
The 360 is easy to program for and it does not have x86. I just don't get this love for x86, a chip where a lot of silicon is wasted for the sake of compatibility with PC business software from the 1980s. For me, that kind of chip has no business being in a console. Imagine using smaller (and slower) PPC general-purpose cores (like the WiiU/360) and using the saved silicon budget to have some more CUs in the GPU, or if you're feeling adventurous, some SPU units on die for BC/physics/vector processing.

I would assume that Sony and MS would remove any unnecessary features that do not benefit the goals they have for their respective systems; at least that's what they did (to an extent) with the PS360, IIRC. This would make the chip smaller, easier to cool, and cheaper overall.

I also think it, again, comes down to experience. While I'm sure a PPC core would also work, I'm not sure if they could scale the Power7 or Power8 core down for consoles as well as AMD can scale down their chips. Single-threaded performance may also be a concern, since it was rather poor in both consoles this gen and I'm not really sure how IBM's and AMD's chips compare in this category.

There's no telling what, if any, modifications MS and Sony are doing with the CPU or GPU, so things may turn out very well in the end.

Edit:

This. There is no need for that. Will it make PC porting to PS4 easy? Yes, but they still have to work through all the intricacies of the hardware anyway, so why not put something in that's... worthwhile? I also feel like every hacker and their mother knowing how to program for x86 would cause loads of fun for Sony and consumers...

There isn't a gaming computer today that couldn't beat the snot out of the PS3 or 360, so IMO that automatically makes their components worthwhile. =p
 

ekim

Member
German Gameswelt has "learned from reliable sources" that the PS4 will be released in Q1 2014.

Translated article here.

Why Google translates "Gameswelt" to "Gamespot" is beyond me, I don't think they are affiliated.

There are a lot of people out there that "learn from reliable sources". If you shoot 1000 times, you'll probably hit something. This is just the usual way for German sites to generate clicks.
 
German Gameswelt has "learned from reliable sources" that the PS4 will be released in Q1 2014.

Translated article here.

Why Google translates "Gameswelt" to "Gamespot" is beyond me, I don't think they are affiliated.
Could be true, but so could any other "reliable" source. Also, their "PSx launched in March" argument is pretty weak. Nice flag btw. ;-)


On a different note, do we really want an AMD-based CPU? I even doubt that 2 Steamroller cores beat 1 Sandy Bridge core. I would rather look away from x86 if AMD is my only option...
 
Yeah, what happened to AMD CPUs? They used to be right up there with Intel, but since Bulldozer I haven't heard a single good thing about them.
 
Yeah, what happened to AMD CPUs? They used to be right up there with Intel, but since Bulldozer I haven't heard a single good thing about them.
Even before Bulldozer, AMD CPUs have been way too power hungry for their performance. After the Athlon XP I never even looked at AMD. Especially in a small console, heat is a big issue. I don't want Sony to sacrifice GPU TDP because of an inefficient Steamroller or similar setup which needs too much power. AMD GPUs caught up with Nvidia - CPUs not so much... (in general)
 
German Gameswelt has "learned from reliable sources" that the PS4 will be released in Q1 2014.

Translated article here.

Why Google translates "Gameswelt" to "Gamespot" is beyond me, I don't think they are affiliated.

http://www.eetimes.com/electronics-news/4371752/GlobalFoundries-installs-gear-for-20nm-TSVs said:
SAN JOSE – GlobalFoundries is installing equipment to make through-silicon vias in its Fab 8 in New York. If all goes well, the company hopes to take production orders in the second half of 2013 for 3-D chip stacks using 20 and 28 nm process technology.

The systems should be in place and qualified by the end of July, with about half of them installed today, McCann said. The company aims to run its first 20 nm test wafers with TSVs in October and have data on packaged chips from its partners by the end of the year.

GlobalFoundries’ schedule calls for having reliability data in hand early next year. The data will be used to update the company’s process design kits so its customers can start their qualification tests in the first half of the year.

If all goes well, first commercial product runs of 20 and 28 nm wafers with TSVs can start in the second half of 2013 and ramp into full production in 2014, McCann said.
Which might explain the Sony statements that they will wait until processes are in place; a 3-6 month delay until March 2014 if Sony or Microsoft want to take advantage of the new processes. 28nm and 2.5D with an interposer are available for 2013, but 3D with some critical parts at 20nm (like the AMD Southbridge quoted at 22nm) will wait for 2014.
 

KageMaru

Member
German Gameswelt has "learned from reliable sources" that the PS4 will be released in Q1 2014.

Translated article here.

Why Google translates "Gameswelt" to "Gamespot" is beyond me, I don't think they are affiliated.

They may launch in some region in Q1 2014, such as Europe, but the system will first launch in Q4 2013.

Not sure if serious.

We are comparing 2005 products to... 2012.

That's kind of my point though. Basically, no matter what is packed into these consoles, the gap in time between current gen and next gen will provide a worthwhile performance increase.

I guess I should ask, if you don't consider x86 architecture worthwhile, then what would you be happy with?
 
That's kind of my point though. Basically, no matter what is packed into these consoles, the gap in time between current gen and next gen will provide a worthwhile performance increase.

I guess I should ask, if you don't consider x86 architecture worthwhile, then what would you be happy with?

Something more attuned to a console's needs? I mean... sure, you want SOME general processing in there, but you likely won't need high double-precision performance for games...
 
Even before Bulldozer, AMD CPUs have been way too power hungry for their performance. After the Athlon XP I never even looked at AMD. Especially in a small console, heat is a big issue. I don't want Sony to sacrifice GPU TDP because of an inefficient Steamroller or similar setup which needs too much power. AMD GPUs caught up with Nvidia - CPUs not so much... (in general)

Which is why it's likely they switched to Jaguar cores, so they can stick to their GPU target.
 
Which is why it's likely they switched to Jaguar cores, so they can stick to their GPU target.
A Kabini APU with a 9-25W TDP would leave Sony with 175W of TDP for the standalone GPU, 75% more than the RSX got. Even Kaveri would allow a 165W TDP Southern Islands GPU. That can't be right, can it? A desktop Radeon 7870 or 7950 would "easily" fit that number with some changes and stay in the same TDP region as Cell/RSX.
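Roughly, the arithmetic behind those numbers looks like this. The ~200W total budget and the RSX figure are assumptions for illustration, not confirmed specs:

```c
#include <stdio.h>

/* Rough TDP budget sketch for the numbers quoted above.
 * All figures are assumptions/rumors, not confirmed specs. */
int main(void) {
    const int total_budget_w = 200;  /* assumed total budget, roughly Cell+RSX class */
    const int kabini_tdp_w   = 25;   /* upper end of the quoted 9-25W Kabini range   */
    const int rsx_tdp_w      = 100;  /* very rough figure for the original RSX        */

    int gpu_budget_w = total_budget_w - kabini_tdp_w;               /* -> 175W */
    double vs_rsx = 100.0 * (gpu_budget_w - rsx_tdp_w) / rsx_tdp_w; /* -> 75%  */

    printf("GPU budget: %dW (%.0f%% more than the assumed RSX figure)\n",
           gpu_budget_w, vs_rsx);
    return 0;
}
```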
 
I'm very shocked that Sony hasn't elected to go with the PowerXCell 8i for the PS4.

Is that the CPU in one of IBM's supercomputers? I don't know if it'd be feasible. I would love it if Cell development went on; the libraries are there, and devs wouldn't have the problems they did at the start of the gen.
 

KageMaru

Member
Something more attuned to a console's needs? I mean... sure, you want SOME general processing in there, but you likely won't need high double-precision performance for games...

I'm sorry but what is high double performance?

Also, what would you think a console's needs are? There's no reason an AMD CPU can't be as good as, if not better than, other possible options out there. Who knows, maybe they can add some vector registers to the CPU if floating point performance is a big concern; it would be useful for physics calculations and such. Though the vast majority of floating point performance is going to come from your GPU anyway.
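For what "vector registers" mean in practice on an x86 core, here is a minimal SSE sketch - illustrative only, nothing here is a confirmed detail of the console CPUs:

```c
#include <stdio.h>
#include <xmmintrin.h>   /* SSE: 128-bit registers, 4 packed floats */

/* Integrate 4 particle positions at once: pos += vel * dt.
 * One packed instruction does 4 single-precision ops, which is the kind
 * of FP throughput people mean when they say vector units help physics. */
int main(void) {
    float pos[4] = {0.0f, 1.0f, 2.0f, 3.0f};
    float vel[4] = {1.0f, 1.0f, 1.0f, 1.0f};
    float dt = 0.016f;                         /* roughly one 60Hz frame */

    __m128 p = _mm_loadu_ps(pos);
    __m128 v = _mm_loadu_ps(vel);
    p = _mm_add_ps(p, _mm_mul_ps(v, _mm_set1_ps(dt)));
    _mm_storeu_ps(pos, p);

    printf("%.3f %.3f %.3f %.3f\n", pos[0], pos[1], pos[2], pos[3]);
    return 0;
}
```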

Is that the CPU in one of IBM's supercomputers? I don't know if it'd be feasible. I would love it if Cell development went on; the libraries are there, and devs wouldn't have the problems they did at the start of the gen.

There are plenty of posts from developers over at the Beyond3D forums on why continuing with the Cell would likely not be the best option. Check them out; they are always interesting reads.
 
I'm sorry but what is high double performance?

Also, what would you think a console's needs are? There's no reason an AMD CPU can't be as good as, if not better than, other possible options out there. Who knows, maybe they can add some vector registers to the CPU if floating point performance is a big concern; it would be useful for physics calculations and such. Though the vast majority of floating point performance is going to come from your GPU anyway.

"Double" the data type... as in double-precision floating point? There is single precision and double precision. General-purpose CPUs are relatively strong at double precision, while GPUs are far stronger at single precision. Cell is like a stream processor... a GPU, with very high single-precision throughput. They can use the Jaguar cores for doubles and have the GPU do its graphics processing, but I really like the idea of having those 1PPU/4SPE modules that Jeff talks about, because it allows for low-power media processing, backwards compatibility, and dedicated hardware for physics calculations in PS4 games.
 
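To make the single vs. double distinction concrete, here is a tiny illustration of what the two types actually buy you; the GPU throughput penalty mentioned in the comment varies a lot by chip, so treat it as a generalization:

```c
#include <stdio.h>

/* "Single" vs "double" is just 32-bit vs 64-bit floating point.
 * Game math (positions, physics) usually fits in float; double buys
 * extra precision at the cost of bandwidth and, on most GPUs of this
 * era, a large throughput penalty (often 1/4 to 1/24 of the FP32 rate,
 * depending on the chip). */
int main(void) {
    float  pos32 = 16777216.0f;   /* 2^24: the last integer a float holds exactly */
    double pos64 = 16777216.0;

    printf("float:  %.1f + 1 = %.1f\n", pos32, pos32 + 1.0f);  /* precision lost */
    printf("double: %.1f + 1 = %.1f\n", pos64, pos64 + 1.0);   /* still exact    */
    printf("sizeof(float)=%zu, sizeof(double)=%zu\n", sizeof(float), sizeof(double));
    return 0;
}
```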

KageMaru

Member
8-core ARM A15 on a performance-optimized process (>3 GHz) with SPUs as FP units instead of NEON.

It would be interesting to see how an ARM setup would perform in a console.

"Double" the data type... as in double-precision floating point? There is single precision and double precision. General-purpose CPUs are relatively strong at double precision, while GPUs are far stronger at single precision. Cell is like a stream processor... a GPU, with very high single-precision throughput. They can use the Jaguar cores for doubles and have the GPU do its graphics processing, but I really like the idea of having those 1PPU/4SPE modules that Jeff talks about, because it allows for low-power media processing, backwards compatibility, and dedicated hardware for physics calculations in PS4 games.

Jeff is wrong, a 1PPU 4SPU module would not allow BC. Also by including such a module, they would likely have to sacrifice something else in the CPU or GPU area.
 

drkohler

Banned
Which might explain the Sony statements that they will wait until processes are in place; a 3-6 month delay until March 2014 if Sony or Microsoft want to take advantage of the new processes. 28nm and 2.5D with an interposer are available for 2013, but 3D with some critical parts at 20nm (like the AMD Southbridge quoted at 22nm) will wait for 2014.
Will you give up your 3D stacking wet dream? Consoles are not the playing field for implementing untested, expensive new technologies. You will NOT see 3D memory stacking in any of the new consoles. If Sony wants to bring out their new console in early 2014, they will have ordered all the materials by early 2013. This is how mass manufacturing works - otherwise your costs would skyrocket.
 

drkohler

Banned
Jeff is wrong, a 1PPU 4SPU module would not allow BC. Also by including such a module, they would likely have to sacrifice something else in the CPU or GPU area.
Also think of the address/data bus nightmares. Supposedly we have a CPU and GPU in the APU and an external GPU fighting for the address/data bus, and now you want to add another multiprocessor unit (is there redundancy in this unit?) to this mess? Good luck with that - you probably just designed a $600 console.
 
Jeff is wrong, a 1PPU 4SPU module would not allow BC.
With some small "emulation" you could make two of them work, as it would be working with the same hardware, right? You'd just have an extra PPU. Also, if they can have it in HSA with the rest of the hardware, latency wouldn't be much of an issue......? (I'm taking a stab at things...)

Also by including such a module, they would likely have to sacrifice something else in the CPU or GPU area.
The CPU is already a Jaguar, rumored at 4 cores... take it down to 2? Downsize the GPU just a tad since it won't have to worry about doing physics calculations.
 

Nachtmaer

Member
They can use the Jaguar cores for doubles and have the GPU do its graphics processing, but I really like the idea of having those 1PPU/4SPE modules that Jeff talks about, because it allows for low-power media processing, backwards compatibility, and dedicated hardware for physics calculations in PS4 games.

I'm not an expert on all this, but the last few pages of this thread made me think about this as well and how interesting it might be.

Would it actually be doable and viable to have an SoC with all these things that is not too big and within TDP limits?

Jeff is wrong, a 1PPU 4SPU module would not allow BC. Also by including such a module, they would likely have to sacrifice something else in the CPU or GPU area.

Wouldn't using that make it easier to have BC, maybe even with some form of emulation?
 
I hope you realize you repeated sections of your post. It was like scrolling through a whole page of posts just to get to the bottom of yours.

Sorry, Gateway error 504. Once posted I went away to do fanboy things.

I don't understand why you continue this argument about Cell. Multiple people have told you Cell >>> Xenon. Both the 360 and PS3 have their strengths and weaknesses: the 360's is its GPU, while the PS3's is its CPU. This has been known for years now. I don't even understand how this is still debatable. This argument is from 2006.

"Cell > Xenon" is not an argument. Neither is "Cell >>> Xenon".

It's not as easy as that. Cell is weak at the integer level, which is the main task of a CPU. GTAIV lagging and dropping frames is a CPU-bound issue, for example. Cell has strengths outside of typical CPU requirements, and weaknesses at those requirements. That's my point. Go to Beyond3D and have a read. Not even the biggest Cell fanboy there would claim something like Cell >>> Xenon 'because everyone knows'. To me, someone claiming that is no different from someone thinking the PS2 was stronger than the NGC.

As far as GDDR5 vs GDDR3 goes: obviously 4GB of GDDR5 would be accompanied by a 256-bit bus as well. The question is whether GDDR3 with 6-8GB of RAM and a 256-bit bus would be better than 4GB of GDDR5 on a 256-bit bus. Essentially, a lot more RAM vs a lot more bandwidth. That's the question.

Infinite budget? Over $500 again? You will not have 4GB of GDDR5 on a 256-bit bus. That's for sure.

http://www.newegg.com/Product/Produ...0006662&isNodeId=1&Description=gtx680&x=0&y=0

Regarding the memory choices - as a consumer I don't really care about patents and proprietary technology as long as it benefits me through price, speed, or quality. As an engineer my viewpoint is very different, but I won't go into that now. Sony worked with Rambus and XDR before. IF (and that is the problem) they now have an agreement for XDR2, which is "faster" (probably not cheaper) than GDDR5 and will be available in bulk, I don't see why Sony shouldn't go that route.

With 4GB of GDDR5, Sony hopefully won't cut costs with a smaller bus, because their engineers would simply waste the GDDR5 advantage. A lot of synthetic benchmarks really profit from more (graphics) memory bandwidth, but not so much from a size increase.

I am not expert enough (or maybe a bit lazy too *g*) to calculate the maximum bandwidth for DDR3, GDDR5 and XDR2, but I guess there is a reason why AMD uses GDDR5 for their top-tier GPUs, and as an example my GeForce FX880m only has DDR3...

Wishful thinking:

4GB of XDR2 in the PS4, which is 50% faster than GDDR5 at the same power consumption. To minimize heat, Sony could adjust the clock speed/voltage of the XDR2 to reach GDDR5 speeds with less power.

You are not bandwidth bound with GDDR5 and a 256-bit bus in the recent top performer, the GTX 680. Why go crazy with RAM in a console which will have lower specs?

In the past, Nvidia was on the wide-bus/slower-GDDR3 wagon because of the overhead of a narrow bus in a GPU environment. ATI went with a narrower bus and higher-speed RAM because it was cheaper and provided the same raw bandwidth numbers. Right now there is no bandwidth bottleneck. If you go crazy with resolutions, your GPU muscle will fail before your memory does.
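For anyone following along, the bandwidth numbers being argued about come from a simple formula: bus width in bits divided by 8, times the effective data rate. The data rates below are illustrative assumptions, not leaked clocks:

```c
#include <stdio.h>

/* bandwidth (GB/s) = bus_width_bits / 8 * effective_rate (GT/s) */
static double bw_gbs(int bus_bits, double gtps) {
    return bus_bits / 8.0 * gtps;
}

int main(void) {
    /* Illustrative data rates only. */
    printf("GDDR5 @ 5.5 GT/s, 256-bit:   %6.1f GB/s\n", bw_gbs(256, 5.5));   /* ~176 */
    printf("GDDR5 @ 5.5 GT/s, 128-bit:   %6.1f GB/s\n", bw_gbs(128, 5.5));   /* ~88  */
    printf("DDR3  @ 2.133 GT/s, 256-bit: %6.1f GB/s\n", bw_gbs(256, 2.133)); /* ~68  */
    return 0;
}
```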

He has SOME good points. I lost him when he compared the GTX 280 to the Cell. Even more so after he tried to discredit every advantage the Cell has over other architectures.

I compared Cell vs a Q6600 in real-world FP usage. Some other guy claimed Cell was more efficient than a GTX 280 because it was in the table I posted. I replied to that too.

I do not discredit Cell's advantages. I discredit it as a general-purpose CPU. As I pointed out earlier, and jeff_rigby was able to understand and expand on it (so you should be able to as well), Cell is efficient as a custom DSP processor for media devices. People here actually believe Cell is a good CPU, and actually believe it's only not a good streamer because their favourite codec isn't optimized for Cell. My head just implodes.

IIRC (I'm no tech expert), the cache each SPE had was adequate. It was the memory interface between the XDR and the Cell that was the bottleneck. If an extra-wide memory I/O were used today, that would resolve the problem (correct me if I'm wrong, Jeff).

Cache is not appropriate for the SPUs because it is local and not accessible by the PPU or any other SPU. That's the main bottleneck and fiasco of the current Cell: some good-looking raw numbers in synthetic benchmarks, but unreachable in real-world environments.


Amazing how console warriors can turn a thread into shit, isn't it? =p

I'm a master race guy :p

Not sure the lead-free solder was the core issue, since the PS3 had to use it as well, no?

The issue is the cheap lead-free solder. You need a more expensive lead-free solder, based on silver etc., to match the performance of ugly, cheap leaded solder.



Looking at performance on a PC isn't really an accurate way to determine how things would turn out in a console. Different games are programmed to take advantage of different types of CPUs. Some scale well with the number of cores while others do not, because of the number of configurations out there and developers working around a lowest common denominator.

PC is accurate at showing how game engines work and what they demand.


Agreed. I think an improved 28nm Cell with either 3PPU/8SPUs or 2PPU/12SPUs (add some more memory if the die size remains <200mm²) would have done the job easily. As it seems, AMD was able to pull all three parties into their camp and Sony even went for the APU thing (this could bite them in the a** in the end, given the turds AMD currently markets).

Cell lacks IPC at the PPE; it's at current Atom-like levels. Then you have to rearrange the SPU subsystem.

Once you've done that, you have to match current Intel/AMD CPU performance and AMD/Nvidia GPU performance.

And after that, you have to teach developers to program to a whole different paradigm and convince them to invest in that new platform.

And after that you have to manufacture some exotic chips that only you use, so there won't be anyone else helping to drive costs down.

I just can't see that as a wise business model.

The 360 is easy to program for and it does not have x86. I just don't get this love for x86, a chip where a lot of silicon is wasted for the sake of compatibility with PC business software from the 1980s. For me, that kind of chip has no business being in a console. Imagine using smaller (and slower) PPC general-purpose cores (like the WiiU/360) and using the saved silicon budget to have some more CUs in the GPU, or if you're feeling adventurous, some SPU units on die for BC/physics/vector processing.

As I said earlier, it's not about the instruction set. It's about core architecture. x86 wasting silicon and not being RISC or whatever is an old cliché. Current CPUs just use decoders for that. CPUs don't work with x86 or PowerPC instructions internally, but with micro-ops. Complex instructions like x86's haven't been a problem since the Pentium days.

Changing the instruction set would have little to no performance advantage, but would carry a high cost by forcing every program out there to change.

Yeah, what happened to AMD CPUs? They used to be right up there with Intel, but since Bulldozer I haven't heard a single good thing about them.

Bad management. They wasted too much money buying ATI. They had to cut CPU research and sell off some divisions. Then they failed at the planning level with the future of x86 and Bulldozer.

Which might explain the Sony statements that they will wait until processes are in place; a 3-6 month delay until March 2014 if Sony or Microsoft want to take advantage of the new processes. 28nm and 2.5D with an interposer are available for 2013, but 3D with some critical parts at 20nm (like the AMD Southbridge quoted at 22nm) will wait for 2014.

Southbridges are not critical. Usually chipsets are one or two nodes behind CPUs/GPUs, so they can make some more profit from older fabs.

A Kabini APU with a 9-25W TDP would leave Sony with 175W of TDP for the standalone GPU, 75% more than the RSX got. Even Kaveri would allow a 165W TDP Southern Islands GPU. That can't be right, can it? A desktop Radeon 7870 or 7950 would "easily" fit that number with some changes and stay in the same TDP region as Cell/RSX.

You need a bare minimum of CPU to feed the GPU. Those APUs are just too weak; some of them are even too weak for their own integrated GPU. Steamroller is not ideal, but it would not be such a bottleneck for a beefier GPU.

8-core ARM A15 on a performance-optimized process (>3 GHz) with SPUs as FP units instead of NEON.

You can't be serious comparing a mobile CPU against desktop parts. ARM is far from current x86 CPUs in performance.

It's just not as easy as overclocking an ARM chip to catch up to x86 CPUs. You would need a longer pipeline, a reworked cache subsystem, and a long etcetera.

ARM is a low-TDP design, not a horsepower beast.
 

Just Lazy

Banned
Supposedly they were getting new dev kits around E3. I figure there are a few different possibilities why nothing's leaked about it yet.

A. New dev kits didn't end up releasing around E3 and were delayed.
B. There were no major changes in this revision.
C. They were just able to keep a better lid on things this time around/no one's cared to leak anything.
D. A combination of B and C.
E. None of the above - then explain why.

What do you guys think?

Didn't the new E3 dev kit rumour come from the pastebin post in April?

If so it's fake.
 

KageMaru

Member
Will you give up your 3D stacking wet dream? Consoles are not the playing field for implementing untested, expensive new technologies. You will NOT see 3D memory stacking in any of the new consoles. If Sony wants to bring out their new console in early 2014, they will have ordered all the materials by early 2013. This is how mass manufacturing works - otherwise your costs would skyrocket.

I've already been down this road with Jeff. IMO part of the problem is that many people here see his walls of posts, illustrations, bolding, etc. and are inclined to believe his theories. If he didn't have this following on GAF, I don't think he would post as often or in the same manner. We'll just have to wait until these next-gen consoles launch before he stops repeating the same assumptions.

With some small "emulation" you could make two of them work, as it would be working with the same hardware, right? You'd just have an extra PPU. Also, if they can have it in HSA with the rest of the hardware, latency wouldn't be much of an issue......? (I'm taking a stab at things...)

The CPU is already a Jaguar, rumored at 4 cores... take it down to 2? Downsize the GPU just a tad since it won't have to worry about doing physics calculations.

Look at drkohler's post above, moving all this data among all these chips would be a nightmare on many levels. It would also impact cost reduction down the line, which is one of the most important pillars with console design.

You mean like Cell and Blu-ray?

And how well did that work for Sony this gen? Nice to see them swimming in all that money from such decisions, oh wait...

And emotion engine?

Different time and different era in terms of console development. Back then there was merit to designing your own chips in order to keep up with the standards. It allowed Sony to produce chips that had great performance for their time and kept them in complete control for future revisions, cost reductions, BC, etc.

It's no longer possible for Sony to produce their own chips that can keep up with the expected standards now.
 
Infinite budget? Over $500 again? You will not have 4GB of GDDR5 on a 256-bit bus. That's for sure.

http://www.newegg.com/Product/Produ...0006662&isNodeId=1&Description=gtx680&x=0&y=0
Good fucking lord.

Cache is not appropriate for the SPUs because it is local and not accessible by the PPU or any other SPU. That's the main bottleneck and fiasco of the current Cell: some good-looking raw numbers in synthetic benchmarks, but unreachable in real-world environments.
That cache isn't meant to be shared; that's why it's local to the SPEs. The main bottleneck was the memory controller into the XDR. They specifically used XDR because the Cell requires extremely fast access to memory... that is the specific reason for that local SPE cache. So the problem wasn't with the Cell, and it wasn't with the XDR; it was the controller between the two.


PC is accurate at showing how game engines work and what they demand.
You used GTA4 in one of your posts as an example of why the PS3 chugged a lot... that game ran like shit even on a really good PC. So no, it's not a good way to judge an engine.
 
Good fucking lord.

You have to be kidding to expect GDDR5 + a 256-bit bus in a console.

GDDR5 is meant to be paired with a 128-bit bus. That way you would have smaller chips and simpler motherboard logic, saving money and silicon budget. And later on it would be easier to shrink into slim models with fewer RAM chips.

That cache isn't meant to be shared; that's why it's local to the SPEs. The main bottleneck was the memory controller into the XDR. They specifically used XDR because the Cell requires extremely fast access to memory... that is the specific reason for that local SPE cache. So the problem wasn't with the Cell, and it wasn't with the XDR; it was the controller between the two.

I'm not sure you understand how an IMC works.



You used GTA4 in one of your posts as an example of why the PS3 chugged a lot... that game ran like shit even on a really good PC. So no, it's not a good way to judge an engine.

I used GTA as an example of a CPU-bound engine. We can move to Source if you want. There are a few other examples.
 
You have to be kidding to expect GDDR5 + a 256-bit bus in a console.

GDDR5 is meant to be paired with a 128-bit bus. That way you would have smaller chips and simpler motherboard logic, saving money and silicon budget. And later on it would be easier to shrink into slim models with fewer RAM chips.
I'm referring to you posting Newegg prices. Yes, GDDR5 is expensive. Yes, 2 gigs of it is expensive. Yes, a 256-bit bus is expensive. There are many reasons why it's expensive, but don't post a desktop GPU to explain your position for a console one.

I'm not sure you understand how an IMC works.
Yes... I do know. It's nothing complex. It's the "highway", or better yet, the "traffic lights" that control the data flow to and from the CPU and memory.

There was the on-chip EIB between the Cell components, but what I'm saying (from what I understand; I may be wrong) is that the FlexIO between the Cell and the XDR wasn't fast enough to cater to the speeds the XDR had. Either that, or the bus to the XDR wasn't wide enough... I'm not sure which it was...


I used GTA as an example of a CPU-bound engine. We can move to Source if you want. There are a few other examples.
That engine still ran like crap. And just because an engine is CPU-bound doesn't make it inherently good, which is what it sounded like you were saying (I may be mistaken).
 

Triple U

Banned
dr. apocalipsis said:
"Cell > Xenon" is not an argument. Neither is "Cell >>> Xenon".

It's not as easy as that. Cell is weak at the integer level, which is the main task of a CPU. GTAIV lagging and dropping frames is a CPU-bound issue, for example. Cell has strengths outside of typical CPU requirements, and weaknesses at those requirements. That's my point. Go to Beyond3D and have a read. Not even the biggest Cell fanboy there would claim something like Cell >>> Xenon 'because everyone knows'. To me, someone claiming that is no different from someone thinking the PS2 was stronger than the NGC.

What are you talking about? The main task of a CPU is integer work? What does this even mean? The main task of what type of CPU? You do realize that there are a lot more fields outside of the Wintel spectrum of computing.

I really don't think you understand what you are ranting about. Integer performance only really matters when there are 2 coordinates, X & Y (2D). You decrease X once you move left one pixel; very discrete. As soon as you add Z, your numbers start to float. This is basic stuff; the fact that you seem ignorant of this tells me you are not really qualified to discuss it in such detail.

Not even going to get into what you have wrong about GTAIV or CELL V Xenon. Go read some more books :)




dr. apocalipsis said:
I compared Cell vs a Q6600 in real-world FP usage. Some other guy claimed Cell was more efficient than a GTX 280 because it was in the table I posted. I replied to that too.


What exactly are you using for "Real World" usage of Cell, if you don't mind me asking?
dr. apocalipsis said:
Cache is not appropriate for the SPUs because it is local and not accessible by the PPU or any other SPU. That's the main bottleneck and fiasco of the current Cell: some good-looking raw numbers in synthetic benchmarks, but unreachable in real-world environments.

Cache is not appropriate for SPUs? Oh really? So are they supposed to just pull local data from the cloud, or?








dr. apocalipsis said:
PC is accurate at showing how game engines work and what they demand.

This is true but only on a fundamental level.

dr. apocalipsis said:
Cell lacks IPC at the PPE; it's at current Atom-like levels. Then you have to rearrange the SPU subsystem.

Once you've done that, you have to match current Intel/AMD CPU performance and AMD/Nvidia GPU performance.

And after that, you have to teach developers to program to a whole different paradigm and convince them to invest in that new platform.

And after that you have to manufacture some exotic chips that only you use, so there won't be anyone else helping to drive costs down.

I just can't see that as a wise business model.

Cell's IPC isn't something to write home about, but until Intel dropped Core it was par for the course. What exactly do you mean by rearranging the SPU subsystem? And what about matching CPU/GPU performance? Performance at what?

And this whole "different paradigm" is pretty much what is going to be standard in the very near future. What, is parallelization bad now too or something?


dr. apocalipsis said:
As I said earlier, it's not about the instruction set. It's about core architecture. x86 wasting silicon and not being RISC or whatever is an old cliché. Current CPUs just use decoders for that. CPUs don't work with x86 or PowerPC instructions internally, but with micro-ops. Complex instructions like x86's haven't been a problem since the Pentium days.

Changing the instruction set would have little to no performance advantage, but would carry a high cost by forcing every program out there to change.
It's not about the instruction set? :facepalm:

What do you think helps determine core architecture?
dr. apocalipsis said:
You need a bare minimum of CPU to feed the GPU. Those APUs are just too weak; some of them are even too weak for their own integrated GPU. Steamroller is not ideal, but it would not be such a bottleneck for a beefier GPU.

HUH?
 

DieH@rd

Banned
I'm referring to you posting Newegg prices. Yes, GDDR5 is expensive. Yes, 2 gigs of it is expensive. Yes, a 256-bit bus is expensive. There are many reasons why it's expensive, but don't post a desktop GPU to explain your position for a console one.

Aren't those the parts used in consoles? What are consoles made of? Nintendo magic?

I can't provide you with wholesale prices from Hynix or Elpida, I'm sorry.

Yes... I do know. It's nothing complex. It's the "highway", or better yet, the "traffic lights" that control the data flow to and from the CPU and memory.

There was the on-chip EIB between the Cell components, but what I'm saying (from what I understand; I may be wrong) is that the FlexIO between the Cell and the XDR wasn't fast enough to cater to the speeds the XDR had. Either that, or the bus to the XDR wasn't wide enough... I'm not sure which it was...

I can't see the EIB as the problem. With the disclosed information available, it looks like an efficient, silicon-wise bus; very similar to the later ring bus interconnect used by Sandy Bridge, with obvious performance differences.

The main problem I see with XDR is the pathetic width. 2x32-bit channels mean insane overhead at the lower layers, even more so with the data chunks the SPUs will use. FlexIO is fast, just too narrow. Even low-end Atoms have a 64-bit bus width.

If the IMC wasn't fast enough, the XDR would not be able to run at its rated speed. That isn't the problem.
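Putting the "narrow but fast" point into numbers, using the commonly cited PS3 XDR figures (treat the exact values as approximate):

```c
#include <stdio.h>

int main(void) {
    /* Commonly cited PS3 XDR figures; approximate. */
    double effective_rate_gtps = 3.2;  /* effective transfers per second, in G */
    int    channels            = 2;
    int    channel_width_bits  = 32;

    double bytes_per_transfer = channels * channel_width_bits / 8.0;      /* 8 bytes   */
    double bandwidth_gbs      = bytes_per_transfer * effective_rate_gtps; /* ~25.6 GB/s */

    printf("XDR: %d-bit total width -> %.1f GB/s\n",
           channels * channel_width_bits, bandwidth_gbs);
    return 0;
}
```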


That engine still ran like crap. And just because an engine is CPU-bound doesn't make it inherently good, which is what it sounded like you were saying (I may be mistaken).

I never said the GTA engine was good or bad. I was just talking about a CPU-bound engine or workload. Being GPU-, CPU-, or Wii Remote-bound does not make an engine better.
 
What are you talking about? The main task of a CPU is integer work? What does this even mean? The main task of what type of CPU? You do realize that there are a lot more fields outside of the Wintel spectrum of computing.

I really don't think you understand what you are ranting about. Integer performance only really matters when there are 2 coordinates, X & Y (2D). You decrease X once you move left one pixel; very discrete. As soon as you add Z, your numbers start to float. This is basic stuff; the fact that you seem ignorant of this tells me you are not really qualified to discuss it in such detail.

Not even going to get into what you have wrong about GTAIV or CELL V Xenon. Go read some more books :)

What the hell?!

Try to solve the CPU Queen problem on a GPU.

OMFG, just try to run some highly serialized lossless data compression, like .rar, on a GPU.

Just... Just... OMFG.

Infinite facepalm.
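For reference, "CPU Queen" is an N-queens backtracking benchmark: a branch-heavy, serial, integer-only search, which is exactly the kind of workload being described here. A minimal sketch:

```c
#include <stdio.h>

/* Minimal N-queens backtracking: place a queen per row, checking
 * column and diagonal conflicts against all previous rows. The
 * recursion and data-dependent branching are why this kind of code
 * maps badly to wide SIMD/GPU hardware. */
static int cols[16];

static long solve(int row, int n) {
    if (row == n) return 1;
    long count = 0;
    for (int c = 0; c < n; c++) {
        int ok = 1;
        for (int r = 0; r < row; r++) {
            if (cols[r] == c || row - r == c - cols[r] || row - r == cols[r] - c) {
                ok = 0;
                break;
            }
        }
        if (ok) {
            cols[row] = c;
            count += solve(row + 1, n);
        }
    }
    return count;
}

int main(void) {
    printf("8-queens solutions: %ld\n", solve(0, 8)); /* prints 92 */
    return 0;
}
```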



What exactly are you using for "Real World" usage of Cell, if you don't mind me asking?

Any program that uses processing power to obtain some valuable data from an input. Not just a test algorithm used as a benchmark.


Cache is not appropriate for SPUs? Oh really? So are they supposed to just pull local data from the cloud, or?

I was talking about local private caches vs the shared cache pool needed for multithreading.


Cell's IPC isn't something to write home about, but until Intel dropped Core it was par for the course. What exactly do you mean by rearranging the SPU subsystem? And what about matching CPU/GPU performance? Performance at what?

And this whole "different paradigm" is pretty much what is going to be standard in the very near future. What, is parallelization bad now too or something?



It's not about the instruction set? :facepalm:

What do you think helps determine core architecture?

Cell/Xenon IPC is low not only by today's standards, but by 2006 ones. Lack of OoO execution, poor prefetch, branch prediction, frontend, etc... Nothing to do with the instruction set.

And I LOLed at the Cell paradigm becoming the new standard.



Yes. What the fuck happened there?
 

dogmaan

Girl got arse pubes.
What the hell?!

Try to run the CPU Queen benchmark on a GPU.

OMFG, just try to run some highly serialized lossless data compression, like .rar, on a GPU.

Just... Just... OMFG.

Infinite facepalm.





Any program that uses processing power to obtain some valuable data from an input. Not just a test algorithm used as a benchmark.




I was talking about local private caches vs the shared cache pool needed for multithreading.




Cell/Xenon IPC is low not only by today's standards, but by 2006 ones. Lack of OoO execution, poor prefetch, branch prediction, frontend, etc... Nothing to do with the instruction set.

And I LOLed at the Cell paradigm becoming the new standard.




Yes. What the fuck happened there?

Regardless of how correct or incorrect your opinions are, to me, you appear to be overly pretentious in how you are presenting them.
 
Regardless of how correct or incorrect your opinions are, to me, you appear to be overly pretentious in how you are presenting them.

I could blame my poor English, but that's not going to happen.

I just found some things funny; I don't want to be offensive, though.

I'm sorry if you, or anyone else, felt disturbed.
 

Triple U

Banned
dr. apocalipsis said:
What the hell?!

Try to solve the CPU Queen problem on a GPU.

OMFG, just try to run some highly serialized lossless data compression, like .rar, on a GPU.

Just... Just... OMFG.

Infinite facepalm.

How old are you again?

Anyways,

What does RAR's algorithm's performance on GPUs have to do with anything I just said? Or the CPU Queen benchmark on GPUs? How about you respond to what I actually posted instead of going off on random-ass tangents about nothing?


dr. apocalipsis said:
Any program that uses processing power to obtain some valuable data from an input. Not just a test algorithm used as a benchmark.

You do know that the number of these types of programs is practically unlimited, right? For you to give a definitive answer either way is dumbfounding.

dr. apocalipsis said:
I was talking about local private caches vs the shared cache pool needed for multithreading.

Each SPE can run only one HW thread. I'm not sure what you are going on about. This is what you posted...
you said:
Cache is not appropriate for the SPUs because it is local and not accessible by the PPU or any other SPU. That's the main bottleneck and fiasco of the current Cell: some good-looking raw numbers in synthetic benchmarks, but unreachable in real-world environments.
Which is patently false. Each SPE has a DMA engine to move data between the SPEs (amongst each other) and to the PPE.
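For anyone who hasn't seen Cell code, the SPU-side pattern being described looks roughly like this. It's a sketch based on the Cell SDK's spu_mfcio.h intrinsics as I remember them; treat the exact names and arguments as illustrative rather than authoritative:

```c
/* SPU-side sketch: pull a block from an effective address (main memory
 * or another SPE's local store) into this SPE's local store via the MFC.
 * Sketch only - intrinsic names/arguments recalled from the Cell SDK's
 * spu_mfcio.h and may not be exact. */
#include <spu_mfcio.h>

#define TAG 3

static char local_buf[16384] __attribute__((aligned(128)));

void fetch_block(unsigned long long remote_ea, unsigned int size)
{
    /* Queue the DMA: local store address, effective address, size, tag. */
    mfc_get(local_buf, remote_ea, size, TAG, 0, 0);

    /* Block until all DMAs with this tag have completed. */
    mfc_write_tag_mask(1 << TAG);
    mfc_read_tag_status_all();

    /* local_buf now holds the data; the PPE or another SPE never touched
     * this local store directly - the MFC moved the bytes. */
}
```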

dr. apocalipsis said:
Cell/Xenon IPC is low not only by today's standards, but by 2006 ones. Lack of OoO execution, poor prefetch, branch prediction, frontend, etc... Nothing to do with the instruction set.

And I LOLed at the Cell paradigm becoming the new standard.

By 2006 ones, do you mean like Pentium 4s and AMD FXs? LOL.

Low-IPC, high-clock Pentium 4s were common as hell in that era. It wasn't until Core that Intel flipped the script...

You really should hit the books again, if you have at all.
 
How old are you again?

Anyways,

What does RAR's algorithm's performance on GPUs have to do with anything I just said? Or the CPU Queen benchmark on GPUs? How about you respond to what I actually posted instead of going off on random-ass tangents about nothing?

It's pretty difficult for me to try to argue with someone who has just spat on the usefulness of integer performance in a CPU. It's pretty embarrassing to have to explain, to someone who calls me ignorant, the absolute need for serial processing in a CPU and how it's the paradigm of current computing systems.




You do know that the number of these types of programs is practically unlimited, right? For you to give a definitive answer either way is dumbfounding.

However many programs there are out there, none will reach, or even come close to, the numbers of any synthetic benchmark. Some architectures just hold up better in the real world than others. Don't you agree?



Each SPE can run only one HW thread. I'm not sure what you are going on about. This is what you posted...

Each core of a Q6600 can run only one HW thread too. What's your point?

Which is patently false. Each SPE has a DMA to move data between the SPEs(amongst each other) and to the PPE.

It's not false since:
1st It's not simultaneous.
2nd PPE can't see or access any SPE cache.
3rd SPE's can't access any other SPE cache.

SPEs works as a black box to PPE. You send data to SPE. SPE process data. SPE delivers results. Cell uses a list system to organize itself, a ring bus to manage data transfers.

That it's NOT simultaneous multi threading such as any modern multi core CPU. Call it Phenom or Xenon.


By 2006 ones, do you mean like Pentium 4s and AMD FXs? LOL.

Low-IPC, high-clock Pentium 4s were common as hell in that era. It wasn't until Core that Intel flipped the script...

Conroe was launched in 2006. AMD K9 is from 2005.

You really should hit the books again, if you have at all.

How can you be so disrespectful without having proven anything?
 

Triple U

Banned
dr. apocalipsis said:
It's pretty difficult for me to try to argue with someone who has just spat on the usefulness of integer performance in a CPU. It's pretty embarrassing to have to explain, to someone who calls me ignorant, the absolute need for serial processing in a CPU and how it's the paradigm of current computing systems.



It must also be embarrassing to try and argue with someone who actually knows the topic at hand and can easily sift through this drivel you are trying to push, eh? You've even resorted to putting words in my mouth to save face. Tsk tsk.

Where did I say anything about integer performance, besides that it's relatively worthless in 3D graphics compared to floats? And where did I mention anything at all about serial processing?



dr. apocalipsis said:
However many programs there are out there, none will reach, or even come close to, the numbers of any synthetic benchmark. Some architectures just hold up better in the real world than others. Don't you agree?
Yeah, I do. But um, what does this have to do with you claiming that a Q6600 would best a Cell at these programs?



dr. apocalipsis said:
Each core of a Q6600 can run only one HW thread too. What's your point?



It's not false, since:
1st, it's not simultaneous.
2nd, the PPE can't see or access any SPE's cache.
3rd, SPEs can't access any other SPE's cache.

SPEs work as a black box to the PPE. You send data to an SPE, the SPE processes the data, and the SPE delivers results. Cell uses a list system to organize itself and a ring bus to manage data transfers.

That is NOT simultaneous multithreading such as in any modern multi-core CPU, call it Phenom or Xenon.

Welp, I guess that's that. I don't know why I trusted IBM to know how their chip works.

IBM said:
If the SPE needs to access main memory or other memory regions within the system, then it needs to initiate DMA data transfer operations. DMAs can be used to effectively transfer data between the main memory and SPE's local store, or between two different SPE local stores. Each SPE has its own DMA engine that can move data from its own local store to any other part of the system including local stores belonging to other SPEs and IO devices.

Wow. You totally have taught me a few things today. You have any more lessons Doc?


dr. apocalipsis said:
Conroe was launched in 2006. AMD K9 is from 2005.

So did your mind just turn off when I said "before Intel launched Core", or?

dr. apocalipsis said:
How can you be so disrespectful without having proven anything?

How can you be so smug and condescending when you know next to nothing about what you are in here parading about?

Look, we can play this game for however long you like, but to the people who are actually informed you're doing little more than making an ass of yourself. I give you kudos for knowing a few things, but a lot of what you say is just plain silly (and I'm not talking about the English).
 
I don't need more proof of your lack of knowledge.

You can't distinguish between memory access and data transfer. I guess you don't know what a DMA interrupt is and how it hurts other processors' performance. You certainly don't know how queue logic and sequenced data transfers have no place in a multithreaded system aiming for high performance.

You linked a doc explaining in detail what I already pointed out, and you actually think it can somehow prove me wrong. That's only because you can't understand it. This is the very same mechanism used in the SEGA Genesis to transfer data between EEPROM and VRAM.

I claim a Q6600 is a way better overall CPU than a Cell. Yes, I do. You can't claim otherwise without looking like a fool.

The Core architecture and K9 were launched before Cell.

I'm tired of this.
 

Triple U

Banned
I don't need more proof of your lack of knowledge.

You can't distinguish between memory access and data transfer. I guess you don't know what a DMA interrupt is and how it hurts other processors' performance. You certainly don't know how queue logic and sequenced data transfers have no place in a multithreaded system aiming for high performance.

You linked a doc explaining in detail what I already pointed out, and you actually think it can somehow prove me wrong. That's only because you can't understand it. This is the very same mechanism used in the SEGA Genesis to transfer data between EEPROM and VRAM.

I claim a Q6600 is a way better overall CPU than a Cell. Yes, I do. You can't claim otherwise without looking like a fool.

The Core architecture and K9 were launched before Cell.

I'm tired of this.

Data transfer and memory access aren't mutually exclusive. If SPE 0 needs to "access" a variable in SPE 1's local store, you simply fetch it through the DMA. Data is obviously going to transfer. This is basic, basic stuff, but maybe even that's too much for you.

I love how you can in one breath tell me what I don't know and in the same breath tell me that DMA interrupts do nothing but "hurt other processors' performance". And I also glance at the gem about the Q6600 being "way better" than the Cell while not giving any scenarios (oh wait, I forgot: processing valuable data after input!!) despite the fact that they are designed for two totally different applications. Oh, and the one about Conroe (7/27/06) launching before Cell ('05).

I'm glad you are tired, because my IQ nosedives every time I read one of your rants.
 
I'm tired of this.

FINALLY! Thought this would never end.


Now back to the topic at hand. I've been doing some research on the Piledriver APU, which the Steamroller APU is based on. The performance doesn't seem so hot: ~30fps in Crysis 2, for example. A lot of the articles I've read say it isn't well suited to gaming. I understand that as far as power consumption goes it's making some pretty good strides, but I don't see how this would be a good fit for the PS4. I don't know if I was reading it right, but the GPUs in those APUs were around the 560-700 GFLOP range, so a 1.84 TFLOP GPU would be quite a lot more; but then wouldn't the PS4 be pretty CPU bound? These performance analyses were for the mobile version of the Piledriver APU.
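For context on where a figure like 1.84 TFLOPS comes from: peak single-precision throughput for a GCN-style GPU is shader lanes x 2 ops (a fused multiply-add counts as two) x clock. A sketch using the rumored 18 CU / 800 MHz configuration (rumor, not a confirmed spec):

```c
#include <stdio.h>

int main(void) {
    /* Rumored figures only: 18 GCN compute units at 800 MHz. */
    int    compute_units = 18;
    int    lanes_per_cu  = 64;   /* GCN: 4 SIMDs x 16 lanes          */
    double clock_ghz     = 0.8;
    int    ops_per_cycle = 2;    /* fused multiply-add = 2 FLOPs     */

    double gflops = compute_units * lanes_per_cu * ops_per_cycle * clock_ghz;
    printf("Peak FP32: %.0f GFLOPS (%.2f TFLOPS)\n", gflops, gflops / 1000.0);
    /* -> 1843 GFLOPS, i.e. the oft-quoted 1.84 TFLOPS */
    return 0;
}
```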
 