
Albert Penello puts dGPU Xbox One rumor to rest


Bitmap Frogs

Mr. Community
Albert Penello said:
At Microsoft, we have a position called a "Technical Fellow." These are engineers across disciplines at Microsoft that are basically at the highest stage of technical knowledge. There are very few across the company, so it's a rare and respected position.

We are lucky to have a small handful working on Xbox.

I've spent several hours over the last few weeks with the Technical Fellow working on our graphics engines. He was also one of the guys that worked most closely with the silicon team developing the actual architecture of our machine, and knows how and why it works better than anyone.

So while I appreciate the technical acumen of folks on this board - you should know that every single thing I posted, I reviewed with him for accuracy. I wanted to make sure I was stating things factually, and accurately.

So if you're saying you can't add bandwidth - you can. If you want to dispute that ESRAM has simultaneous read/write cycles - it does.

I know this forum demands accuracy, which is why I fact checked my points with a guy who helped design the machine.

This is the same guy, by the way, that jumps on a plane when developers want more detail and hands-on review of code and how to extract the maximum performance from our box. He has heard first-hand from developers exactly how our boxes compare, which has only proven our belief that they are nearly the same in real-world situations. If he wasn't coming back smiling, I certainly wouldn't be so bullish dismissing these claims.

I'm going to take his word (we just spoke this AM, so his data is about as fresh as possible) versus statements by developers speaking anonymously, and also potentially from several months ago before we had stable drivers and development environments.

I like how you dropped half your claims after your previous post was debunked.

Also, the argument from authority didn't work with your "we invented directx" so why do you attempt it again? I don't know who is advising you on your social media presence but it's not working, at least here.
 

The Flash

Banned
Albert Penello said:
At Microsoft, we have a position called a "Technical Fellow." These are engineers across disciplines at Microsoft that are basically at the highest stage of technical knowledge. There are very few across the company, so it's a rare and respected position.

We are lucky to have a small handful working on Xbox.

I've spent several hours over the last few weeks with the Technical Fellow working on our graphics engines. He was also one of the guys that worked most closely with the silicon team developing the actual architecture of our machine, and knows how and why it works better than anyone.

So while I appreciate the technical acumen of folks on this board - you should know that every single thing I posted, I reviewed with him for accuracy. I wanted to make sure I was stating things factually, and accurately.

So if you're saying you can't add bandwidth - you can. If you want to dispute that ESRAM has simultaneous read/write cycles - it does.

I know this forum demands accuracy, which is why I fact checked my points with a guy who helped design the machine.

This is the same guy, by the way, that jumps on a plane when developers want more detail and hands-on review of code and how to extract the maximum performance from our box. He has heard first-hand from developers exactly how our boxes compare, which has only proven our belief that they are nearly the same in real-world situations. If he wasn't coming back smiling, I certainly wouldn't be so bullish dismissing these claims.

I'm going to take his word (we just spoke this AM, so his data is about as fresh as possible) versus statements by developers speaking anonymously, and also potentially from several months ago before we had stable drivers and development environments.

And on that bombshell, I am out. My "speed sense" tells me that this thread will explode in the next 10 seconds. Keep posting Albert, you seem like a nice guy and I do enjoy reading your posts.

10... 9... 8... 7... 6... 5...
 

StevieP

Banned
Where did you get Sony's final CPU clock speed from and where did you get the information that the PS4 has a maximum of 10gb/s coherent bandwidth?

Durante's post earlier today:
http://www.neogaf.com/forum/showpost.php?p=80713669&postcount=677
Durante said:
Exactly. Having a unified memory space with (potentially) cache-coherent access is great, but it's not likely to make up for a significant overall CPU/GPU performance discrepancy in a majority of real-world gaming scenarios.

Also, generally when you want to do GPGPU, it's because you have a highly parallel problem with high bandwidth requirements. In such cases you wouldn't want to go through the cache-coherent bus in PS4 anyway (as that only provides 10 GB/s compared to the 176 GB/s of the traditional "GPU-like" bus).

Oh, and PCIe 3.0 tops out at 15.75 GB/s bidirectional, which is actually more than the 10 GB/s bidirectional on the cache-snooping "Onion" bus in PS4. Of course, there should be a pronounced latency difference between the two.
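As a sanity check on the numbers Durante quotes, here is a rough sketch of where the 15.75 GB/s PCIe figure comes from (assuming a 16-lane PCIe 3.0 link at 8 GT/s per lane with 128b/130b encoding; the Onion/Garlic values are simply the ones cited in the posts above):

```python
# Back-of-the-envelope check of the bus figures quoted above (peak, per direction).
# Assumes PCIe 3.0 x16: 8 GT/s per lane, 128b/130b line encoding.

lanes = 16
transfers_per_sec = 8e9            # per lane
encoding_efficiency = 128 / 130    # 128b/130b overhead

pcie3_x16 = lanes * transfers_per_sec * encoding_efficiency / 8 / 1e9
print(f"PCIe 3.0 x16 peak: {pcie3_x16:.2f} GB/s per direction")   # ~15.75

# Figures quoted in the thread for the PS4's buses:
print("Onion (coherent):  10 GB/s")
print("Garlic (GDDR5)  : 176 GB/s")
```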
 

IN&OUT

Banned
Are you kidding? I mean really - are you kidding?

This is part and parcel of the territory here. You have to answer for your statements, especially if you're here in an official capacity. People get banned for being out of line, but poking holes in the arguments of other posters is well within the rules.

I've had my work here both praised and eviscerated, called out by numerous forum folks both publicly and via PM when I got stuff wrong, and I'm a goddamn admin. Guess what - I wouldn't have it any other way. That is what makes NeoGAF what it is.

There are many, many people who are more than capable of assessing, vetting and debunking technical claims and they have every right to do so. That's the price of doing business here. If we had official Nintendo or Sony reps on board, they would be subject to the same process.

If you're scared, buy a dog.

To show my appreciation, I read this post standing up, SALUTE.
 

Spongebob

Banned
I think it's a matter of giving a poster the benefit of the doubt unless you know otherwise.

In this case Albert has stated two things as facts, one regarding adding bandwidth and other regarding a read/write cycle. Seems very unlike standard PR spin and FUD to hold fast to such statements. And if he's wrong, surely somebody can point that out with similar certainty?

While this is great and all, what I (and I presume many others) want are numbers that correspond with the previous numbers given.

109GB/s is easily attained with the following formula:

(1024 bits/cycle) * (1 Byte/8 bits) * (853MHz) = 109GB/s.

However, we have no such formula for the oft-reported 204GB/s number. Assuming simultaneous read/writes as a theoretical max for each cycle, we get the following numbers:

(2048 bits/cycle) * (1 Byte/8 bits) * (853MHz) = 218GB/s

This is the main point of contention when talking about ESRAM bandwidth; if you could get it cleared up, it would calm the forum wars quite a bit.
The math doesn't add up.
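For what it's worth, the arithmetic being argued over is easy to reproduce. A minimal sketch, assuming a 1024-bit ESRAM interface at 853MHz and treating "simultaneous read/write" as a straight doubling:

```python
# ESRAM peak-bandwidth arithmetic from the post above.
# Assumes: 1024-bit interface, 853 MHz, decimal GB (1e9 bytes).

bus_width_bits = 1024
clock_hz = 853e6

one_way = bus_width_bits / 8 * clock_hz / 1e9   # read OR write
both_ways = one_way * 2                          # naive read + write doubling

print(f"read or write  : {one_way:.0f} GB/s")    # ~109 GB/s
print(f"read and write : {both_ways:.0f} GB/s")  # ~218 GB/s
```

Neither result is 204GB/s, which is exactly the point of contention: the widely reported figure doesn't fall out of this simple formula.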
 

JaggedSac

Member
Albert Penello said:
At Microsoft, we have a position called a "Technical Fellow." These are engineers across disciplines at Microsoft that are basically at the highest stage of technical knowledge. There are very few across the company, so it's a rare and respected position.

We are lucky to have a small handful working on Xbox.

I've spent several hours over the last few weeks with the Technical Fellow working on our graphics engines. He was also one of the guys that worked most closely with the silicon team developing the actual architecture of our machine, and knows how and why it works better than anyone.

So while I appreciate the technical acumen of folks on this board - you should know that every single thing I posted, I reviewed with him for accuracy. I wanted to make sure I was stating things factually, and accurately.

So if you're saying you can't add bandwidth - you can. If you want to dispute that ESRAM has simultaneous read/write cycles - it does.

I know this forum demands accuracy, which is why I fact checked my points with a guy who helped design the machine.

This is the same guy, by the way, that jumps on a plane when developers want more detail and hands-on review of code and how to extract the maximum performance from our box. He has heard first-hand from developers exactly how our boxes compare, which has only proven our belief that they are nearly the same in real-world situations. If he wasn't coming back smiling, I certainly wouldn't be so bullish dismissing these claims.

I'm going to take his word (we just spoke this AM, so his data is about as fresh as possible) versus statements by developers speaking anonymously, and also potentially from several months ago before we had stable drivers and development environments.

Boom
 

onanie

Member
I do want to be super clear: I'm not disparaging Sony. I'm not trying to diminish them, or their launch or what they have said. But I do need to draw comparisons since I am trying to explain that the way people are calculating the differences between the two machines isn't completely accurate. I think I've been upfront I have nothing but respect for those guys, but I'm not a fan of the mis-information about our performance.

Some of your points are misleading, or otherwise need clarifying.

• 18 CU's vs. 12 CU's =/= 50% more performance. Multi-core processors have inherent inefficiency with more CU's, so it's simply incorrect to say 50% more GPU.
Graphics processing is inherently parallel, so 18 vs 12 is indeed 50% more, given the same clock rate.

• Adding to that, each of our CU's is running 6% faster. It's not simply a 6% clock speed increase overall.
Each CU being 6% faster still means only 6% speed increase overall compared to YOUR baseline, not Sony's. You can't have it both ways, Albert. Having 50% more CU is not quite 50% more GPU, but having a 6% clock speed increase is more significant than the number implies?

1.8TF > 1.3TF, to the degree that is universally understood about laws of conservation.

• We have more memory bandwidth. 176gb/sec is peak on paper for GDDR5. Our peak on paper is 272gb/sec. (68gb/sec DDR3 + 204gb/sec on ESRAM). ESRAM can do read/write cycles simultaneously so I see this number mis-quoted.
As some others may have pointed out, you don't just add the numbers together. At no point in time can the GPU see more than the maximum ESRAM bandwidth. Cold, hard fact.

This leads to my question. Can ESRAM sustain simultaneous read/write cycles ALL the time? If not, then how much of the time?

And please allow me to help out your PR department a little. 204gb/sec, according to your understanding of the number, actually implies the old clock rate of 800mhz. The new number should be 853 * 128 * 2 (simultaneous read/write per cycle) = 218gb/sec. They can thank me later.

• We have at least 10% more CPU. Not only a faster processor, but a better audio chip also offloading CPU cycles.
Please do tell us more about Sony's audio chip, or do you actually not know like the rest of us?

• Speaking of GPGPU - we have 3X the coherent bandwidth for GPGPU at 30gb/sec which significantly improves our ability for the CPU to efficiently read data generated by the GPU.
You are combining read and write bandwidth again for your side, while using the one-way bandwidth for your competition. Not quite being honest there, are you?
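For reference on the CU/clock argument, here is a quick sketch of where the commonly cited ~1.3TF and ~1.8TF figures come from (assuming GCN-style CUs with 64 ALUs each and an FMA counted as 2 FLOPs; the 853MHz/800MHz clocks are the ones discussed in this thread):

```python
# Theoretical peak FP32 throughput for the two GPUs as discussed above.
# Assumes GCN: 64 ALUs per CU, FMA counted as 2 FLOPs per ALU per cycle.

def peak_tflops(cus, clock_ghz, alus_per_cu=64, flops_per_alu=2):
    return cus * alus_per_cu * flops_per_alu * clock_ghz / 1000

xb1 = peak_tflops(cus=12, clock_ghz=0.853)
ps4 = peak_tflops(cus=18, clock_ghz=0.800)

print(f"Xbox One: {xb1:.2f} TF")        # ~1.31 TF
print(f"PS4     : {ps4:.2f} TF")        # ~1.84 TF
print(f"ratio   : {ps4 / xb1:.2f}x")    # ~1.41x on paper
```

Since graphics workloads scale across CUs, the clock bump narrows the paper gap only slightly.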
 
D

Deleted member 80556

Unconfirmed Member
I remember when Major Nelson pulled the same "adding memory bandwidth" move back during the 360 days to tout that system's superiority.

Can't believe it still goes on to this day.

I have no doubt the technical fellow knows his stuff, but that doesn't mean people with experience don't stretch the truth or obfuscate facts in order to make their designs look better. Appealing to authority isn't something I'm going to accept on blind loyalty.

Good thing that this time people can call them out if he's wrong (which he most likely is).

I will be waiting for an extensive write up to prove what he's saying, and for someone like Durante to comment on it. Shame that most of those tech guys don't really like to enter this kind of thread because of some people.
 

RoboPlato

I'd be in the dick
Reading Albert's clarification post makes it sound like this "Technical Fellow" told him what to say in hopes that we wouldn't understand things like bandwidth and GPU CU scaling. It's misleading at best but I wouldn't take it out on Albert himself, he's just passing on the info he was given.
 

ElTorro

I wanted to dominate the living room. Then I took an ESRAM in the knee.

That is based on an apparently outdated slide. I can quote Mark Cerny himself.

Mark Cerny said:
"First, we added another bus to the GPU that allows it to read directly from system memory or write directly to system memory, bypassing its own L1 and L2 caches. As a result, if the data that's being passed back and forth between CPU and GPU is small, you don't have issues with synchronization between them anymore. And by small, I just mean small in next-gen terms. We can pass almost 20 gigabytes a second down that bus. That's not very small in today’s terms -- it’s larger than the PCIe on most PCs!

http://www.gamasutra.com/view/feature/191007/inside_the_playstation_4_with_mark_.php?page=2
 
Albert Penello said:
At Microsoft, we have a position called a "Technical Fellow." These are engineers across disciplines at Microsoft that are basically at the highest stage of technical knowledge. There are very few across the company, so it's a rare and respected position.

We are lucky to have a small handful working on Xbox.

I've spent several hours over the last few weeks with the Technical Fellow working on our graphics engines. He was also one of the guys that worked most closely with the silicon team developing the actual architecture of our machine, and knows how and why it works better than anyone.

So while I appreciate the technical acumen of folks on this board - you should know that every single thing I posted, I reviewed with him for accuracy. I wanted to make sure I was stating things factually, and accurately.

So if you're saying you can't add bandwidth - you can. If you want to dispute that ESRAM has simultaneous read/write cycles - it does.

I know this forum demands accuracy, which is why I fact checked my points with a guy who helped design the machine.

This is the same guy, by the way, that jumps on a plane when developers want more detail and hands-on review of code and how to extract the maximum performance from our box. He has heard first-hand from developers exactly how our boxes compare, which has only proven our belief that they are nearly the same in real-world situations. If he wasn't coming back smiling, I certainly wouldn't be so bullish dismissing these claims.

I'm going to take his word (we just spoke this AM, so his data is about as fresh as possible) versus statements by developers speaking anonymously, and also potentially from several months ago before we had stable drivers and development environments.

I can't wait for launch day! It's going to be a long drunken epic day.
 

astraycat

Member
The math doesn't work and it never will because it's a lie.

There's no reason to take such a hard line about it, IMO. It could be this number comes out of some very specific operation like when we use FMA and claim "2 FLOPs" out of it for comparison. It could just be that there's some magic ROP operation specifically to ESRAM that can do a read-modify-write at a very strange rate.

Of course, they could just be mistaken. It happens. Personally I think it's come from some silly-but-non-obvious reason for an unexplained observation. Perhaps they've observed 204GB/s, but forgotten to account for the bandwidth granted by good L1/L2 coherency and instead, since it's so close to the 2x number, just incorrectly concluded that it's due to being able to read and write at the same time.
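To make astraycat's FMA analogy concrete: peak FLOPS figures conventionally count a fused multiply-add as two operations, which is how a fixed piece of hardware gets quoted at double the "obvious" rate. A toy sketch (the ALU count is just 12 CUs x 64 lanes, purely for illustration):

```python
# How counting an FMA as "2 FLOPs" doubles a quoted peak rate.
# Hypothetical: 768 ALUs (12 CUs x 64 lanes), one FMA (a*b + c) per cycle each.

alus = 768
clock_ghz = 0.853
ops_per_fma = 2                     # convention: multiply + add both counted

peak_gflops = alus * ops_per_fma * clock_ghz
print(f"quoted peak: {peak_gflops:.0f} GFLOPS")   # ~1310

# A similar convention applied to some ESRAM operation (e.g. a combined
# read-modify-write counted as two transfers) could plausibly produce a
# near-2x bandwidth figure without a literal second port.
```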
 
Me too. You would think being the designated tech guy and getting consistently called out for your bullshit would be embarrassing enough. The "We CREATED DirectX" line was hilarious, but that post took the cake. He really believed he could just add the peak bandwidths together and everyone would fall for it.

That is definitely his last post on GAF, guarantee it.

Albert Penello said:
At Microsoft, we have a position called a "Technical Fellow." These are engineers across disciplines at Microsoft that are basically at the highest stage of technical knowledge. There are very few across the company, so it's a rare and respected position.

We are lucky to have a small handful working on Xbox.

I've spent several hours over the last few weeks with the Technical Fellow working on our graphics engines. He was also one of the guys that worked most closely with the silicon team developing the actual architecture of our machine, and knows how and why it works better than anyone.

So while I appreciate the technical acumen of folks on this board - you should know that every single thing I posted, I reviewed with him for accuracy. I wanted to make sure I was stating things factually, and accurately.

So if you're saying you can't add bandwidth - you can. If you want to dispute that ESRAM has simultaneous read/write cycles - it does.

I know this forum demands accuracy, which is why I fact checked my points with a guy who helped design the machine.

This is the same guy, by the way, that jumps on a plane when developers want more detail and hands-on review of code and how to extract the maximum performance from our box. He has heard first-hand from developers exactly how our boxes compare, which has only proven our belief that they are nearly the same in real-world situations. If he wasn't coming back smiling, I certainly wouldn't be so bullish dismissing these claims.

I'm going to take his word (we just spoke this AM, so his data is about as fresh as possible) versus statements by developers speaking anonymously, and also potentially from several months ago before we had stable drivers and development environments.

Not getting involved in this discussion except to say:

Glorified G is fucking terrible at 'guarantees'.

That is all.
 
onanie said:
Some of your points are misleading, or otherwise need clarifying.


Graphics processing is inherently parallel, so 18 vs 12 is indeed 50% more, given the same clock rate.


Each CU being 6% faster still means only 6% speed increase overall compared to YOUR baseline, not Sony's. You can't have it both ways, Albert. Having 50% more CU is not quite 50% more GPU, but having a 6% clock speed increase is more significant than the number implies?

1.8TF > 1.3TF, to the degree that is universally understood about laws of conservation.


As some others may have pointed out, you don't just add the numbers together. At no point in time can the GPU see more than the maximum ESRAM bandwidth. Cold, hard fact.

This leads to my question. Can ESRAM sustain simultaneous read/write cycles ALL the time? If not, then how much of the time?

And please allow me to help out your PR department a little. 204gb/sec, according to your understanding of the number, actually implies the old clock rate of 800mhz. The new number should be 853 * 128 * 2 (simultaneous read/write per cycle) = 218gb/sec. They can thank me later.


Please do tell us more about Sony's audio chip, or do you actually not know like the rest of us?


You are combining read and write bandwidth again for your side, while using the one-way bandwidth for your competition. Not quite being honest there, are you?

Personally, I'd like to see these points addressed. Good job.
 

KidBeta

Junior Member
Albert Penello said:
At Microsoft, we have a position called a "Technical Fellow." These are engineers across disciplines at Microsoft that are basically at the highest stage of technical knowledge. There are very few across the company, so it's a rare and respected position.

We are lucky to have a small handful working on Xbox.

I've spent several hours over the last few weeks with the Technical Fellow working on our graphics engines. He was also one of the guys that worked most closely with the silicon team developing the actual architecture of our machine, and knows how and why it works better than anyone.

So while I appreciate the technical acumen of folks on this board - you should know that every single thing I posted, I reviewed with him for accuracy. I wanted to make sure I was stating things factually, and accurately.

So if you're saying you can't add bandwidth - you can. If you want to dispute that ESRAM has simultaneous read/write cycles - it does.

I know this forum demands accuracy, which is why I fact checked my points with a guy who helped design the machine.

This is the same guy, by the way, that jumps on a plane when developers want more detail and hands-on review of code and how to extract the maximum performance from our box. He has heard first-hand from developers exactly how our boxes compare, which has only proven our belief that they are nearly the same in real-world situations. If he wasn't coming back smiling, I certainly wouldn't be so bullish dismissing these claims.

I'm going to take his word (we just spoke this AM, so his data is about as fresh as possible) versus statements by developers speaking anonymously, and also potentially from several months ago before we had stable drivers and development environments.

Can we speak directly to this 'technical fellow' guy? You're either misinterpreting what he is saying, or he is plain and simple wrong about some things.
 
I compared it to the ease of accessibility of the 68gb/s for the DDR3 in the XB1; the maximum theoretical performances of both of those, as stated, are reasonable representations, I suppose I should have said.

Simultaneous ESRAM read+write is not always applicable.


Here's the deal. ESRAM on its own should be able to list its maximum just as the theoretical maximum of GDDR5 is being listed; it's only fair... however...

When counting total system bandwidth, the PS4's 176GB/s total is much easier to approach than DDR3 + ESRAM. Only the best programming is going to get you an average of, say, 133GB/s, whereas with the PS4 reaching that will be a breeze.

Honestly, based on the CPU and GPU that both consoles are running, I don't think bandwidth will be a problem for either system. Enough bandwidth = enough bandwidth; there's no benefit to having more bandwidth than needed, and I think each pipe will supply enough bandwidth for the work that the CPU/GPU will pump out. The PS4 having 32 ROPs means that some serious bandwidth is required; it's no wonder 176GB/s was set up for it. 16 ROPs will require a lot less bandwidth, and that means less performance as well, but that's the GPU's fault and has nothing to do with the bus. Overall, however, both systems will be competitive, much like GameCube vs Xbox.
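The ~133GB/s "good case" above only makes sense if both pools are moving data at once; a toy sketch of that blending, with purely hypothetical utilisation numbers for illustration:

```python
# Toy model of achieved XB1 bandwidth: DDR3 and ESRAM can transfer in the
# same cycle, so achieved total = sum of what each pool actually sustains.
# The utilisation pairs below are made up, purely to show the sensitivity.

DDR3_PEAK = 68     # GB/s
ESRAM_PEAK = 109   # GB/s one-way (up to ~2x if reads and writes overlap)

for ddr3_util, esram_util in [(1.0, 0.6), (0.8, 0.8), (1.0, 1.0)]:
    total = ddr3_util * DDR3_PEAK + esram_util * ESRAM_PEAK
    print(f"DDR3 {ddr3_util:.0%}, ESRAM {esram_util:.0%} -> ~{total:.0f} GB/s")
```

Full DDR3 plus roughly 60% ESRAM utilisation lands near the 133GB/s figure, whereas the PS4's single 176GB/s pool needs no such traffic splitting, which is the poster's point about ease of reaching peak.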
 
I wish we could get DF comparisons now so I can stop pretending to care about clock speeds, CUs, hUMA, ESRAM, adding separate bandwidths, olive, onion, or whatever GB/s these boxes are running at.
 

hawk2025

Member
Each CU being 6% faster still means only 6% speed increase overall compared to YOUR baseline, not Sony's. You can't have it both ways, Albert. Having 50% more CU is not quite 50% more GPU, but having a 6% clock speed increase is more significant than the number implies?


Thank you!

This is what I was pointing out with my long-winded example back on page 6. This is quite literally basic algebra, and the two points he makes are logically inconsistent. There's absolutely zero technical knowledge required to see this.
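hawk2025's algebra point can be made explicit: CU count and clock speed enter the paper-spec formula as the same kind of multiplicative factor, so you can't discount one while banking the other. A one-line sketch:

```python
# Relative paper GPU throughput = (CU ratio) * (clock ratio).
cu_ratio    = 12 / 18      # XB1 CUs vs PS4 CUs
clock_ratio = 853 / 800    # XB1 GPU clock vs PS4 GPU clock (MHz)

print(f"XB1 paper throughput ~ {cu_ratio * clock_ratio:.2f}x of PS4")  # ~0.71x
```

Claiming the 6% clock advantage counts in full while the 50% CU advantage somehow doesn't is arithmetically inconsistent; both are just factors in the same product.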
 

Curufinwe

Member
That is based on an apparently outdated slide. I can quote Mark Cerny himself.



http://www.gamasutra.com/view/feature/191007/inside_the_playstation_4_with_mark_.php?page=2

Which is backed up here.

http://www.eurogamer.net/articles/digitalfoundry-how-the-crew-was-ported-to-playstation-4

Jenner wouldn't go into details on the levels of bandwidth available for each bus owing to confidentiality agreements, but based on our information the GPU has full access to the 176GB/s bandwidth of the PS4's GDDR5 via Garlic, while the Onion gets by with a significantly lower amount, somewhere in the 20GB/s region (this ExtremeTech analysis of the PS4 APU is a good read). Whatever the precise figure is for the more constrained CPU area, Jenner would only confirm that it's "enough". Optimising the PS4 version of The Crew once the team did manage to get the code compiling required some serious work in deciding what data would be the best fit for each area of memory.
 

jayu26

Member
As some others may have pointed out, you don't just add the numbers together. At no point in time can the GPU see more than the maximum ESRAM bandwidth. Cold, hard fact.

This leads to my question. Can ESRAM sustain simultaneous read/write cycles ALL the time? If not, then how much of the time?

And please allow me to help out your PR department a little. 204gb/sec, according to your understanding of the number, actually implies the old clock rate of 800mhz. The new number should be 853 * 128 * 2 (simultaneous read/write per cycle) = 218gb/sec. They can thank me later.

This is masterfully done. So subtle, it is awesome...
 

ElTorro

I wanted to dominate the living room. Then I took an ESRAM in the knee.
There's no reason to take such a hard line about it, IMO. It could be this number comes out of some very specific operation like when we use FMA and claim "2 FLOPs" out of it for comparison. It could just be that there's some magic ROP operation specifically to ESRAM that can do a read-modify-write at a very strange rate.

That would be my guess as well, but the statement that this additional bandwidth was "found" rather surprisingly is still curious. I would suspect that such an operation would have been specified long before, but I have little knowledge about the design process of silicon.
 

tfur

Member
Albert's PR misinformation is embarrassing at this point. I'm glad there are people with the energy to call out the BS.

It seems like just seed PR that is meant to be picked up by a few news (lol game journalism) sites.
 

Plinko

Wildcard berths that can't beat teams without a winning record should have homefield advantage
For somebody who knows nearly nothing about technical stuff like this, this thread is fascinating.

From my perspective, though, Penello's posts come off somewhat like Baghdad Bob. Logical questions are being asked and I'm fascinated to see his responses as others are certain he's either lying or wrong.
 
I'll preface this post by saying that, having never owned a Sony console and thus not become invested in any of their franchises, I'm an Xbox fan. I've preordered an Xbox One and am very happy with that decision. That said, I'm fully willing to admit that the PS4 is a more technically capable machine on paper. I'm just not sure that the difference will manifest itself at launch.

When Penello says that the difference is not as great as you might think, he's referring to right now. Developers are still learning the intricacies of both systems. It makes sense that they'll be pretty comparable during the early stages. Heck, I'm pretty sure Sony itself has said it will likely be a couple years before the true potential of the PS4 is capitalized upon. As for right now, I'm content waiting for the actual launches before making assumptions about wide gaps in performance.
 

npm0925

Member
I'm going to take his word (we just spoke this AM, so his data is about as fresh as possible) versus statements by developers speaking anonymously, and also potentially from several months ago before we had stable drivers and development environments.
You are yourself citing an anonymous source in an effort to debunk claims made anonymously. What is the name of this fellow?
 

Toki767

Member
It amuses me so much that a software company like Microsoft can't think of a more technical title for their cream of the crop engineers than "Technical Fellow"
 

USC-fan

Banned
While this is great and all, what I (and I presume many others) want are numbers that correspond with the previous numbers given.

109GB/s is easily attained with the following formula:

(1024 bits/cycle) * (1 Byte/8 bits) * (853MHz) = 109GB/s.

However, we have no such formula for the oft-reported 204GB/s number. Assuming simultaneous read/writes as a theoretical max for each cycle, we get the following numbers:

(2048 bits/cycle) * (1 Byte/8 bits) * (853MHz) = 218GB/s

This is the main point of contention when talking about ESRAM bandwidth; if you could get it cleared up, it would calm the forum wars quite a bit.

Good luck! People have been trying to get these numbers to work for months. It takes some very fuzzy math to get them to fit.

The ESRAM accounts for 0.39% of the total RAM in the system.
 

nib95

Banned
Albert Penello said:
I see my statements the other day caused more of a stir than I had intended. I saw threads locking down as fast as they pop up, so I apologize for the delayed response.

I was hoping my comments would lead the discussion to be more about the games (and the fact that games on both systems look great) as a sign of my point about performance, but unfortunately I saw more discussion of my credibility.

So I thought I would add more detail to what I said the other day, that perhaps people can debate those individual merits instead of making personal attacks. This should hopefully dismiss the notion I'm simply creating FUD or spin.

I do want to be super clear: I'm not disparaging Sony. I'm not trying to diminish them, or their launch or what they have said. But I do need to draw comparisons since I am trying to explain that the way people are calculating the differences between the two machines isn't completely accurate. I think I've been upfront I have nothing but respect for those guys, but I'm not a fan of the mis-information about our performance.

So, here are couple of points about some of the individual parts for people to consider:

• 18 CU's vs. 12 CU's =/= 50% more performance. Multi-core processors have inherent inefficiency with more CU's, so it's simply incorrect to say 50% more GPU.
• Adding to that, each of our CU's is running 6% faster. It's not simply a 6% clock speed increase overall.
• We have more memory bandwidth. 176gb/sec is peak on paper for GDDR5. Our peak on paper is 272gb/sec. (68gb/sec DDR3 + 204gb/sec on ESRAM). ESRAM can do read/write cycles simultaneously so I see this number mis-quoted.
• We have at least 10% more CPU. Not only a faster processor, but a better audio chip also offloading CPU cycles.
• We understand GPGPU and its importance very well. Microsoft invented Direct Compute, and have been using GPGPU in a shipping product since 2010 - it's called Kinect.
• Speaking of GPGPU - we have 3X the coherent bandwidth for GPGPU at 30gb/sec which significantly improves our ability for the CPU to efficiently read data generated by the GPU.

Hopefully with some of those more specific points people will understand where we have reduced bottlenecks in the system. I'm sure this will get debated endlessly but at least you can see I'm backing up my points.

I still I believe that we get little credit for the fact that, as a SW company, the people designing our system are some of the smartest graphics engineers around – they understand how to architect and balance a system for graphics performance. Each company has their strengths, and I feel that our strength is overlooked when evaluating both boxes.

Given this continued belief of a significant gap, we're working with our most senior graphics and silicon engineers to get into more depth on this topic. They will be more credible then I am, and can talk in detail about some of the benchmarking we've done and how we balanced our system.

Thanks again for letting my participate. Hope this gives people more background on my claims.

Ok just a quick run through of points...


  • What is this inherent inefficiency you speak of? Can you elaborate? It is not something I've ever heard mentioned.
  • Your second point contradicts your first. If 50% more CU performance is liable to inefficiencies, why would 6% extra performance not also be privy to the same thing?
  • How did you arrive at the 204gb/s figure for the ESRAM? Can you elaborate? Also, you realise this is a very disingenuous claim. YES, the bandwidth can be added together in that the DDR3 and ESRAM can function simultaneously, but this tells only a small part of the full story. The ESRAM still only accounts for a meagre 32MB of space. The DDR3 RAM, which is the bulk of the memory (8GB), is still limited to only 68gb/s, whilst the PS4's GDDR5 RAM has an entire 8GB with 176gb/s of bandwidth. This is a misleading way to present the argument of bandwidth differences.
  • How do you know you have 10% more CPU speed? You said you are unaware of the PS4's final specs, and rumours of a similar upclock have been floating around. It could also be argued that the XO has the more capable audio chip because the system's Kinect audio features are more demanding, something the PS4 does not have to cater to. Add to that, the PS4 also has a (less capable) audio chip, along with a secondary custom chip (supposedly used for background processing). There's that to consider too.
  • That's good that Microsoft understands GPGPU, but that does not take away from the inherent GPGPU customisations afforded to the PS4. The PS4 also has 6 additional compute units, which is a pretty hefty advantage in this field.
  • This is factually wrong. With Onion plus Onion+, the PS4 also has 30gb/s of bandwidth.
 
And please allow me to help out your PR department a little. 204gb/sec, according to your understanding of the number, actually implies the old clock rate of 800mhz. The new number should be 853 * 128 * 2 (simultaneous read/write per cycle) = 218gb/sec. They can thank me later.

 
Oh yes, the party has arrived.

Sometimes it would be better for MS if they just didn't say anything, like Sony does.

What party? Seriously, it's funny how competitive people like you get when it comes to these consoles.

Sony, through Cerny, has done a pretty lengthy interview, or set of interviews, and even a presentation, on a lot of the architectural details about the PS4. I don't see what's wrong with wanting similarly low-level details on the Xbox One. Sony has been anything but quiet on what's been done with the PS4, so why should Microsoft be quiet about the Xbox One? That literally makes no sense. They believe their system is more capable than people are thinking, and if that's the case, releasing more lower-level details on the system would help in better showcasing that.
 

Gaz_RB

Member
Man, I appreciate big names like Penello and Major posting here, but it really seems to be doing more harm than good. Trying to do PR damage control in the trenches of the internet is a job I'd never like to have...
 
Can someone please clarify something for me?
Is the 176GB/s bandwidth of GDDR5 only available whilst either reading or writing, or simultaneous like with the apparent 204GB/s of the XBOne's ESRAM? Or is it that, if we used Microsoft math, it would in fact be 352GB/s of bandwidth for GDDR5?
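For reference, here is where the 176GB/s number comes from (assuming a 256-bit GDDR5 interface at 5.5Gbps per pin, which matches the announced PS4 memory). GDDR5's data bus is shared between reads and writes, so 176GB/s is already the aggregate figure; there's no extra doubling to apply:

```python
# PS4 GDDR5 peak bandwidth. Reads and writes share one data bus, so this
# is the total in either direction, not a per-direction figure to double.

bus_width_bits = 256
data_rate_gbps = 5.5        # effective per-pin rate

gddr5_peak = bus_width_bits * data_rate_gbps / 8
print(f"GDDR5 peak: {gddr5_peak:.0f} GB/s")   # 176 GB/s
```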
 

Klocker

Member
It amuses me so much that a software company like Microsoft can't think of a more technical title for their cream of the crop engineers than "Technical Fellow"


Fellow


From Wikipedia, the free encyclopedia

In academia, a fellow is a member of a group of learned people who work together as peers in the pursuit of mutual knowledge or practice. The fellows may include visiting professors, postdoctoral researchers and doctoral researchers.

Sounds pretty respectable to me.
 