
WiiU "Latte" GPU Die Photo - GPU Feature Set And Power Analysis


krizzx

Junior Member
Everyday.
That salt must taste delicious.

I was pondering that as well, surely they don't mean the whole custom OpenGL library is just dumped on there? Where does the library sit on other consoles?

I took that as being a reserved spot for the libraries to be loaded directly into RAM every time without need for other management. I didn't think it was odd. That would be one less burden on the programmer. Automated memory management.


MEM1 is reserved for graphics libraries (although over the time the size of MEM1 will be decreased for graphics libraries). Therefore, applications cannot use MEM1 directly.
 

OryoN

Member
Could it be that inaccessible MEM1 was only the case in older dev kits? Perhaps that's why Criterion talked so much about "the hardware was always capable, but the tools to exploit it were not initially available" (paraphrasing)?
 
I was pondering that as well, surely they don't mean the whole custom OpenGL library is just dumped on there? Where does the library sit on other consoles?

In my limited programming experience, the libraries are exactly what the name implies - a reference for all the various objects and whatnot to call on so that you don't have to reinvent the wheel every time you issue a command. Basically prewritten routines for the most common functions. (I'm probably mixing up terms and know someone can explain that much better :p). I'm wondering how large they actually are and why they need to be in MEM1. Perhaps there is actually a large advantage in having these libraries in a low latency/high bandwidth position, since they are being referenced so often. Still seems like a hell of a way to blow through that precious 32MB, though.
 
I didn't say "bandwidth". I said performance. That means all things considered.

Other aspects people are not factoring into the bandwidth are the possibility of it being accessed in a dual channel fashion (it is 2X512 after all) and the latency.

I believe blu replied to one of your posts and explained quite succinctly why the bandwidth figures we have for MEM2 are accurate. I'm beginning to wonder if you are, in fact, trolling us.
 

tipoo

Banned
In my limited programming experience, the libraries are exactly what the name implies - a reference for all the various objects and whatnot to call on so that you don't have to reinvent the wheel every time you issue a command. Basically prewritten routines for the most common functions. (I'm probably mixing up terms and know someone can explain that much better :p). I'm wondering how large they actually are and why they need to be in MEM1. Perhaps there is actually a large advantage in having these libraries in a low latency/high bandwidth position, since they are being referenced so often. Still seems like a hell of a way to blow through that precious 32MB, though.

I take from this that all the OpenGL function calls live in the main memory of PCs, so it would seem the library has to be fairly bandwidth- and latency-insensitive for GPU interaction not to be slowed down by it.






This question is almost impossible to answer because OpenGL by itself is just a front end API, and as long as an implementation adheres to the specification and the outcome conforms to it, it can be done any way you like.

The question may have been: How does an OpenGL driver work on the lowest level. Now this is again impossible to answer in general, as a driver is closely tied to some piece of hardware, which may again do things however the developer designed it.

So the question should have been: "How does it look on average behind the scenes of OpenGL and the graphics system?". Let's look at this from the bottom up:

1. At the lowest level there's some graphics device. Nowadays these are GPUs which provide a set of registers controlling their operation (exactly which registers is device dependent), have some program memory for shaders, bulk memory for input data (vertices, textures, etc.) and an I/O channel to the rest of the system over which they receive/send data and command streams.

2. The graphics driver keeps track of the GPU's state and all the resources of application programs that make use of the GPU. It is also responsible for conversion or any other processing of the data sent by applications (converting textures into the pixel format supported by the GPU, compiling shaders into the machine code of the GPU). Furthermore, it provides some abstract, driver-dependent interface to application programs.

3. Then there's the driver-dependent OpenGL client library/driver. On Windows this gets loaded by proxy through opengl32.dll; on Unix systems it resides in two places:
X11 GLX module and driver dependent GLX driver
and /usr/lib/libGL.so may contain some driver dependent stuff for direct rendering

On MacOS X this happens to be the "OpenGL Framework".

It is this part that translates the OpenGL calls as you make them into calls to the driver-specific functions in the part of the driver described in (2).

4. Finally, the actual OpenGL API library: opengl32.dll on Windows, and /usr/lib/libGL.so on Unix; this mostly just passes the commands down to the OpenGL implementation proper.

How the actual communication happens can not be generalized:

In Unix the 3<->4 connection may happen either over sockets (yes, it may, and does, go over the network if you want it to) or through shared memory. In Windows the interface library and the driver client are both loaded into the process address space, so that's not so much communication as simple function calls and variable/pointer passing. In MacOS X this is similar to Windows, only that there's no separation between OpenGL interface and driver client (that's the reason why MacOS X is so slow to keep up with new OpenGL versions; it always requires a full operating system upgrade to deliver the new framework).

Communication between 3<->2 may go through ioctl, read/write, or through mapping some memory into process address space and configuring the MMU to trigger some driver code whenever changes to that memory are made. This is quite similar on any operating system, since you always have to cross the kernel/userland boundary: ultimately you go through some syscall.

Communication between the system and the GPU happens through the peripheral bus and the access methods it defines, so PCI, AGP, PCI-E, etc., which work through port I/O, memory-mapped I/O, DMA, and IRQs.

http://stackoverflow.com/questions/6399676/how-does-opengl-work-at-the-lowest-level
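To make that layering a bit more concrete, here's a minimal sketch of my own (assuming a Linux box with libGL.so.1 installed; build with gcc example.c -ldl). All it shows is that the top-level GL entry points live in a userspace shared library mapped into the process, as step 4 describes, with the actual work happening in the driver it forwards to.

```c
/* Minimal sketch, not Wii U specific: the OpenGL "API library" is just a
 * shared object loaded into the application's address space. */
#include <stdio.h>
#include <dlfcn.h>

int main(void)
{
    /* Step 4: the interface library (libGL.so on Unix, opengl32.dll on
     * Windows) gets mapped into the process. */
    void *libgl = dlopen("libGL.so.1", RTLD_LAZY);
    if (!libgl) {
        fprintf(stderr, "could not load libGL: %s\n", dlerror());
        return 1;
    }

    /* Resolving a core entry point just hands back a function pointer inside
     * that library; the real work happens in the driver it dispatches to. */
    void *fn = dlsym(libgl, "glDrawArrays");
    printf("glDrawArrays resolved at %p (inside the userspace GL library)\n", fn);

    dlclose(libgl);
    return 0;
}
```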
 

krizzx

Junior Member
I believe blu replied to one of your posts and explained quite succinctly why the bandwidth figures we have for MEM2 are accurate. I'm beginning to wonder if you are, in fact, trolling us.


I never disagreed with blu or made any claims that the bandwidth figures for MEM2 were inaccurate. I was just saying that the possibility of dual-channel access should not be entirely ruled out this early on.

We have 2 sets of RAM. 2x512MB DDR3 for games and 2x512MB of gDDR3 for the OS. I was suggesting that 2 identical chips for the unreserved ram could be accessed simultaneously, "effectively" doubling the bandwidth, not actually changing the bandwidth rating.
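For what it's worth, here's a rough back-of-the-envelope sketch of what that "effective doubling" would mean numerically. The DDR3-1600-on-a-64-bit-bus assumption is mine, chosen because it lines up with the ~12 GB/s figure quoted in this thread; none of this says anything about how Latte's memory controller is actually wired.

```c
/* Rough sketch only; assumes DDR3-1600 on a 64-bit bus per channel. */
#include <stdio.h>

int main(void)
{
    const double transfers_per_sec = 1600e6; /* DDR3-1600: 1600 MT/s      */
    const double bytes_per_transfer = 8.0;   /* 64-bit bus = 8 bytes/xfer */

    double one_channel = transfers_per_sec * bytes_per_transfer / 1e9;
    double two_channels = 2.0 * one_channel; /* the hypothetical "doubling" */

    printf("single channel: %.1f GB/s\n", one_channel);  /* ~12.8 GB/s */
    printf("dual channel:   %.1f GB/s\n", two_channels); /* ~25.6 GB/s */
    return 0;
}
```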

tipoo cleared that up before you even made this post, though. It's moot at this point.
 

tipoo

Banned
I never disagreed with blu or made any claims that the bandwidth figures for MEM2 were inaccurate. I was just saying that the possibility of dual channel should not be entirely ruled out this early on.

Yes, it should. I said why above.
Edit: Nevermind, I see your edit.
 

ozfunghi

Member
Yes, I've accounted for that - 1GB is set aside for the app. From there on, the system areas mapped into the app address space total to ~700MB. Which means that the system-reserved portions not mapped into app space are ~300MB. Basically naively put, OS + system buffers = 300MB.

Forgive my ignorance... how does this relate to the 2GB of total memory? OS only uses 300MB?
 
That salt must taste delicious.
I simply responded to a question.

In my limited programming experience, the libraries are exactly what the name implies - a reference for all the various objects and whatnot to call on so that you don't have to reinvent the wheel every time you issue a command. Basically prewritten routines for the most common functions. (I'm probably mixing up terms and know someone can explain that much better :p). I'm wondering how large they actually are and why they need to be in MEM1. Perhaps there is actually a large advantage in having these libraries in a low latency/high bandwidth position, since they are being referenced so often. Still seems like a hell of a way to blow through that precious 32MB, though.

I feel like that isn't right. As you said, it is blowing through very valuable eSRAM. Do we know what the other embedded ram pools do? Certainly they aren't just leaving 1-3mb for rendering... seems silly.
 

krizzx

Junior Member
Yes, it should. I said why above.

I know, I responded to it. He is requoting the exact statement you replied to. It's redundant. Also, I've never at any point in this thread tried to dispute the bandwidth being 12 GB/s. I was suggesting ways in which "performance" may be higher than the bandwidth suggests.

It was just a logical suggestion. Nothing more. I don't know why Fourth Storm is trying to blow it up into some trolling attempt. Recently, he's seemed hell-bent on taking issue with what I say in a very passive-aggressive manner. He seems to have some kind of chip on his shoulder regarding me.

I have no interest in fighting with forum members. I'm here for the tech.
 
I feel like that isn't right. As you said, it is blowing through very valuable eSRAM. Do we know what the other embedded ram pools do? Certainly they aren't just leaving 1-3mb for rendering... seems silly.

It is certainly bewildering. Who knows how old this documentation actually is. Perhaps some of MEM1 has already opened up. But it also says that Wii U does not allow uncached access, so my wild guess is that there is some automatic caching of data into MEM0 for all calls to the DDR3. This would definitely help speed things up. A very complex memory subsystem indeed, but seemingly rigidly controlled...

Edit: Also interesting that MEM0 is the same size as the L2 on Espresso. Since the GPU functions as the NB, perhaps MEM0 is there to help that data along its way, so to speak.
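Since the pools keep coming up, here's a quick reference sketch of the figures quoted in this thread. MEM0's size is my own assumption (matching the ~3MB of total L2 on Espresso, per the speculation above); nothing here is from official documentation.

```c
/* Reference sketch of the memory pools discussed in this thread.
 * Only the thread's own figures are used; MEM0's size is an assumption. */
#include <stdio.h>

struct pool { const char *name; unsigned size_mb; const char *notes; };

int main(void)
{
    const struct pool pools[] = {
        { "MEM0", 3,    "assumed ~3 MB; speculated cache/northbridge buffer"   },
        { "MEM1", 32,   "on-die, currently reserved for graphics libraries"    },
        { "MEM2", 2048, "DDR3: ~1 GB app, ~1 GB system (~300 MB not mapped)"   },
    };

    for (unsigned i = 0; i < sizeof pools / sizeof pools[0]; ++i)
        printf("%-5s %5u MB  %s\n", pools[i].name, pools[i].size_mb, pools[i].notes);
    return 0;
}
```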
 
It is certainly bewildering. Who knows how old this documentation actually is. Perhaps some of MEM1 has already opened up. But it also says that Wii U does not allow uncached access, so my wild guess is that there is some automatic caching of data into MEM0 for all calls to the DDR3. This would definitely help speed things up. A very complex memory subsystem indeed, but seemingly rigidly controlled...

Edit: Also interesting that MEM0 is the same size as the L2 on Espresso. Since the GPU functions as the NB, perhaps MEM0 is there to help that data along its way, so to speak.

It just seems so crazy that they put that much embedded ram on the chip, only to be used for libraries.
 
It just seems so crazy that they put that much embedded ram on the chip, only to be used for libraries.

Seems like a gigantic waste, I agree. Kind of like the 1 GB of RAM used for system applications that run like molasses.

Unless Nintendo are intentionally gimping the system so that there is room for improvement in each wave of titles. Insane, but perhaps true. And as we can see, even without access to MEM1, Wii U is holding its own against current gen, so you could argue that the 32MB really isn't necessary this early in the console's life. Iwata did claim that only 50% of Wii U's power was being utilized - a vague general statement, but this could in part explain his reasoning for saying that.
 
Seems like a gigantic waste, I agree. Kind of like the 1 GB of RAM used for system applications that run like molasses.

Unless Nintendo are intentionally gimping the system so that there is room for improvement in each wave of titles. Insane, but perhaps true. And as we can see, even without access to MEM1, Wii U is holding its own against current gen, so you could argue that the 32MB really isn't necessary this early in the console's life. Iwata did claim that only 50% of Wii U's power was being utilized - a vague general statement, but this could in part explain his reasoning for saying that.

They should have gone x86 like the other two. Going forward with PPC and highly custom expensive hardware is only going to give them trouble.
 

prag16

Banned
My two cents: it doesn't look better than Mario Galaxy, except maybe for the higher-res ground textures. It looks unfinished.

Eh, it's tough. With all the fast-moving action, the compression artifacts are fucking awful in both the IGN and youtube feeds. Hard to judge off of that.

Parts of it did kind of look like CG; if the entire video was 100% real time and representative of what gameplay will look like, it's fairly impressive. During what was obviously gameplay, for the most part it moved so fast all the detail was lost due to compression. Jury's completely out right now, imo.
 
Eh, it's tough. With all the fast-moving action, the compression artifacts are fucking awful in both the IGN and youtube feeds. Hard to judge off of that.

Parts of it did kind of look like CG; if the entire video was 100% real time and representative of what gameplay will look like, it's fairly impressive. During what was obviously gameplay, for the most part it moved so fast all the detail was lost due to compression. Jury's completely out right now, imo.

The parts where it showed those... monster things, those were clearly pre-rendered, but the actual gameplay is easily above Galaxy, graphically.

The amount of sprites just for the grass alone puts it above Galaxy. I don't think they were capable of doing all that on the Wii. Then again, Xenoblade used a lot of it. Anyway, the models seem high-poly too.
 

krizzx

Junior Member
I like where this is going. It's using the old aesthetic, like it used to, with the orange and brown blocks. I hope it's mostly 3D movement. I was never fond of the Hedgehog Engine's stiffness in the 2D plane.

 
The PS4 is a generational leap over the Wii U; the X1 is like 75% of a leap over the Wii U. Your comparison isn't accurate.
M2 was always the better comparison anyway.

The M2 could crunch marginally higher poly counts than the PS1 and N64, but generally improved on that era in texturing and featureset. WiiU crunches polygons within PS3/360 levels while featuring the potential for better texturing and a more modern featureset.

Neither are close to the potential of later releasing systems. But that does mean vastly different things now than it did then.
 

krizzx

Junior Member
M2 was always the better comparison anyway.

The M2 could crunch marginally higher poly counts than the PS1 and N64, but generally improved on that era in texturing and featureset. WiiU crunches polygons within PS3/360 levels while featuring the potential for better texturing and a more modern featureset.

Neither are close to the potential of later releasing systems. But that does mean vastly different things now than it did then.

How do you know this? From what I've been seeing, Latte is likely using a dual graphics engine setup as there are five duplicate components on the chip.

This would make its peak polygon performance around double the last gen consoles and half the other 2 next gen.
 
The parts where it showed those.. monster things, that was clearly pre-rendered, but the actual gameplay is easily above Galaxy, graphically.

The amount of sprites just for grass alone puts it above Galaxy. I don't think they were capable of doing all that on the Wii. Then again, xenoblade used a lot of it. Anyway, they models seem high poly too.

Yes, the grass is what stood out for me, but besides that I don't see how it looks better than an up-ressed Mario Galaxy.

Edit: there was a pic of SMG2 on Dolphin here, but now it has disappeared. Weird.

But maybe I'm just being blind. I look forward to seeing what we can extrapolate from that video.
 
How do you know this? From what I've been seeing, the Latte is likely using a dual graphics engine setup as there are five duplicate components on the chip. This would make its polygon performance around double the last gen consoles and half the other 2 next gen.
I don't.

I'm just going by conjecture we've heard. "Modern featureset and PS3/360 grunt." And we know it has more available memory.
 

joesiv

Member
A trailer has been released for Sonic Lost World. Those graphics look nice. The gameplay reminds me of the old unreleased Sonic X footage.
http://www.ign.com/videos/2013/05/28/sonic-lost-worlds-debut-trailer

Let the analysis begin.

Are those intermediate sequences CG or in-game?

Definitely CG. There is a particular part near the end where it transitions out of in-game footage, where the grass is made up of large flat planes, and the CG immediately after shows individual blades of grass.
 
How do you know this? From what I've been seeing, Latte is likely using a dual graphics engine setup as there are five duplicate components on the chip.

This would make its peak polygon performance around double the last gen consoles and half the other 2 next gen.
It was an analogy. This time, the newer featureset will include tessellation and that will bring much higher polygon counts, but the analogy is still valid:
Dreamcast was much better than both the N64 and PS not because it could draw many more polygons per second or push much higher numbers than those systems, but because it had a much more modern featureset.

Featureset -> Numbers, that's what is important.
 

prag16

Banned
That was stated towards the shader efficiency, though, not the peak polygon performance.

The GPU apparently has two Rasterizers.

I'm not sure I'd characterize that as "apparent" at this juncture. "Possible" is a better starting point, and may even be too optimistic. But either way it's all still speculation.
 

krizzx

Junior Member
It was an analogy. This time, the newer featureset will include tessellation and that will bring much higher polygon counts, but the analogy is still valid:
Dreamcast was much better than both N64 and PS not because it could draw much more polygons per second or push much higher numbers than those systems but because it had a much more modern featureset.

Featureset -> Numbers, that's what is important.

I was speaking purely in regard to polygon performance, not tessellation. 2 Rasterizers, 2 Geometry Assemblers, 2 Vertex Assemblers = double the polygon output.

I'm not sure I'd characterize that as "apparent" at this juncture. "Possible" is a better starting point, and may even be too optimistic. But either way it's all still speculation.

True, it is not absolutely confirmed. That is why I said "likely" and "apparently". It's seeming more likely than any other suggestion, so I am leaning toward it.

We do know that there are 5 duplicate components. We also know that AMD GPUs with dual graphics engines have exactly 5 duplicate components.
 
How do you know this? From what I've been seeing, Latte is likely using a dual graphics engine setup as there are five duplicate components on the chip.

This would make its peak polygon performance around double the last gen consoles and half the other 2 next gen.

There is no dual graphics engine. It's fiction. I realize nothing can convince you, though. But for the sake of others who read this thread and haven't followed the entire discussion, I feel the need to state this.
 

tipoo

Banned
Seems like a gigantic waste, I agree. Kind of like the 1 GB of RAM used for system applications that run like molasses.

Unless Nintendo are intentionally gimping the system so that there is room for improvement in each wave of titles. Insane, but perhaps true. And as we can see, even without access to MEM1, Wii U is holding its own against current gen, so you could argue that the 32MB really isn't necessary this early in the console's life. Iwata did claim that only 50% of Wii U's power was being utilized - a vague general statement, but this could in part explain his reasoning for saying that.

What would they be holding out on though? I can't understand that strategy at all, they'll want a strong install base and momentum by the time the PS4 and One are out.


Plus, I don't recall stationary hardware ever being deliberately gimped and then freed up like that. Sure, there are speed enhancements and OS shrinks, but I mean deliberately locking away a hardware feature?

I guess there's one way I could mentally justify that, if they started showing some incrediballs games just as the PS4/One were gearing up to go, but that would be such a kick to the pants of third party devs.
 

krizzx

Junior Member
There is no dual graphics engine. It's fiction. I realize nothing can convince you, though. But for the sake of others who read this thread and haven't followed the entire discussion, I feel the need to state this.

Dude, will you stop harassing me and trying to vilify my statements. Unlike you, I am not beyond reason.

I stated why I have drawn this conclusion and provided facts to support it. If someone provides data that explains why this is not likely, and it seems more plausible then I will accept that instead. As it stands, the facts are just as I stated above. There are 5 duplicate components on Latte and the AMD GPUs with dual graphics engine sets also have 5 duplicate components. If that is not an indicator of a dual graphics engine setup then please, enlighten me to what it is then, since you claim to know far better than I.

My conclusions are based on logic, reasoning, and facts.

I am not here to fight with you. I am not here for fanboyish arguments. I am not here to debate the business practices of Nintendo.
 
I was speaking purely in regard to polygon performance, not tessellation. 2 Rasterizers, 2 Geometry Assemblers, 2 Vertex Assemblers = double the polygon output.
Yes, and this is why I talked about tessellation. The WiiU is a 352 GFLOP card at best; it will never reach the 550 million polygons/second limit. Even my HD5870, which has 1+ teraflops, has only a single rasterizer and can draw up to 800 million polygons at max.

This is why tessellation was invented. Having a huge number of polygons from the get-go meant that you needed tons of flops to transform and manipulate those polygons.
If you have a competent tessellation unit, on the other hand, you can perform all the vertex operations at a low cost and then increase the polycount through the tessellator.

If the WiiU is confirmed to have such an architecture (and I think it's quite plausible given the amount of repeated blocks and what you said about them) then tessellation is the only thing that makes sense on a design like that.
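A rough sketch of that argument, with purely illustrative numbers of my own (nothing here reflects Latte's actual capabilities): the vertex stage only pays for the coarse mesh, and the tessellator then multiplies the triangle count before rasterization.

```c
/* Illustrative only: uniform tessellation of a triangle patch at factor n
 * yields roughly n*n sub-triangles, so a coarse mesh can be amplified
 * without adding vertex-shader work for every final triangle. */
#include <stdio.h>

int main(void)
{
    const long coarse_triangles = 100000; /* what the vertex stage processes */
    const int  tess_factor      = 16;     /* hypothetical amplification      */

    long amplified = coarse_triangles * (long)tess_factor * tess_factor;

    printf("vertex stage sees:  %ld triangles\n", coarse_triangles);
    printf("rasterizer sees:  ~ %ld triangles\n", amplified);
    return 0;
}
```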
 

krizzx

Junior Member
Yes, and this is why I talked about tessellation. The WiiU is a 352 GFLOP card at best; it will never reach the 550 million polygons/second limit. Even my HD5870, which has 1+ teraflops, has only a single rasterizer and can draw up to 800 million polygons at max.

This is why tessellation was invented. Having a huge number of polygons from the get-go meant that you needed tons of flops to transform and manipulate those polygons.
If you have a competent tessellation unit, on the other hand, you can perform all the vertex operations at a low cost and then increase the polycount through the tessellator.

If the WiiU is confirmed to have such an architecture (and I think it's very plausible given the amount of repeated blocks) then tessellation is the only thing that makes sense on a design like that.

Okay, I see where you're coming from.

Though now I am confused about how you are calculating polygon output again. Zomie said it was equal to the MHz.

The reason this confuses me is the 360/PS3.
The 360 GPU is clocked at 500 MHz and has 240 ALUs, but its polygon performance is rated at 500 million polygons per second.
The PS3 GPU is clocked at 550 MHz and its polygon performance is listed as 333 million polygons per second. I can't find its shader count, though.

The Wii U GPU is at 550 MHz and has at most 320 ALUs, but apparently with duplicate components on the GPU.

Can someone clear this up for me?
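For anyone else puzzling over the same thing, here's the back-of-the-envelope version of the "polygons equal to the MHz" rule of thumb: a setup/rasterizer unit that accepts one triangle per clock tops out at the clock rate in triangles per second. These are theoretical peaks under that one-per-clock assumption; the rated figures quoted above differ because real per-clock rates and counting methods differ between GPUs.

```c
/* Theoretical peak triangle setup rate under an assumed triangles-per-clock
 * figure. Illustration only; not a statement about any particular GPU. */
#include <stdio.h>

static double peak_tris_per_sec(double clock_hz, double tris_per_clock, int setup_units)
{
    return clock_hz * tris_per_clock * setup_units;
}

int main(void)
{
    /* One triangle per clock, one setup engine: peak equals the clock rate. */
    printf("500 MHz, 1 unit : %.0f Mtri/s\n", peak_tris_per_sec(500e6, 1.0, 1) / 1e6);
    printf("550 MHz, 1 unit : %.0f Mtri/s\n", peak_tris_per_sec(550e6, 1.0, 1) / 1e6);

    /* The hypothetical dual-setup case being debated in this thread. */
    printf("550 MHz, 2 units: %.0f Mtri/s\n", peak_tris_per_sec(550e6, 1.0, 2) / 1e6);
    return 0;
}
```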
 
Dude, will you stop harassing me and trying to vilify my statements. Unlike you, I am not beyond reason.

I stated why I have drawn this conclusion and provided facts to support it. If someone provides data that explains why this is not likely, and it seems more plausible then I will accept that instead. As it stands, the facts are just as I stated above. There are 5 duplicate components on Latte and the AMD GPUs with dual graphics engine sets also have 5 duplicate components. If that is not an indicator of a dual graphics engine setup then please, enlighten me to what it is then.

My conclusions are based on logic, reasoning, and facts.

I am not here to fight with you. I am not here for fanboyish arguments.

Who is making fanboyish arguments? I realize we are all speculating here, but your assessments are just way off base. Sorry if some of my replies have been blunt, but it's the truth. I have tried to point out some reasons why your conclusions are extremely unlikely in the past, yet you seem to ignore any facts which aren't aligned with your very generous portrait of the system components. I know I am not the only poster who sees this and have tried to just let it be, but I feel it necessary to chime in from time to time when you appear to be misleading people who just stop in for some quick info.
 

Absinthe

Member
Who is making fanboyish arguments? I realize we are all speculating here, but your assessments are just way off base. Sorry if some of my replies have been blunt, but it's the truth. I have tried to point out some reasons why your conclusions are extremely unlikely in the past, yet you seem to ignore any facts which aren't aligned with your very generous portrait of the system components. I know I am not the only poster who sees this and have tried to just let it be, but I feel it necessary to chime in from time to time when you appear to be misleading people who just stop in for some quick info.

For those who are just stopping in, could you quickly clarify once more why you believe "5 duplicate components" do not point to a dual graphics engine?

Edit: Is this where the dual graphics engine idea gained steam? http://m.neogaf.com/showpost.php?p=57899636
 

Blades64

Banned
From an outsider looking in, krizz, I think you need to take a break or something and relax a little. It seems you're getting worked up bro.
 

krizzx

Junior Member
Who is making fanboyish arguments? I realize we are all speculating here, but your assessments are just way off base. Sorry if some of my replies have been blunt, but it's the truth. I have tried to point out some reasons why your conclusions are extremely unlikely in the past, yet you seem to ignore any facts which aren't aligned with your very generous portrait of the system components. I know I am not the only poster who sees this and have tried to just let it be, but I feel it necessary to chime in from time to time when you appear to be misleading people who just stop in for some quick info.

I do not ignore facts. I have taken every reasonable argument brought about into consideration.

What I am saying is what I have concluded at the end of the day after analyzing all of the facts presented by everyone.

Your explanations have been duly noted. I have taken what you, Zomie, blu and BGAssassin have said and analyzed it myself.

Bg correlated direct design similarities between Brazos and Latte and provided direct visual proof to support it. Logically, it would follow that manufacturers will reuse components in chip design. I see no reason to discount this at this point. I have seen arguments brought against it, but none that outweigh it.

Then there are still the two facts that we can confirm.

Latte has 5 duplicate components near the base of the GPU, and it is a custom design. From there, I did research in order to look for an explanation.

I came across this.

It was 2011 tech, so it was in line with the Wii U announcement. We know the chip is custom made and that its final form is not the same as the form it had at announcement.

I'm suggesting that the Wii U may be using a custom dual graphics engine design in conjunction with the RV700 tech. A hybrid chip of sorts. I have seen no better explanation provided for the 5 duplicate components.

It was the same before, when I suggested it could be an HD5550-based chip. That's what the facts were leaning toward at that point. Evidence came out that greatly discounted that, so I dropped the claim as it no longer seemed plausible.
 

tipoo

Banned
I think I need a refresher on how the 6900 dual graphics engine worked; all I can find on it is AMD saying they "help keep the GPU well-fed with data". Was this carried forward with GCN? The 6900s had the same essential makeup as the Cypress-powered Radeon HD 5800s, but the feature did add a bit of performance.

http://anandtech.com/bench/Product/587?vs=509

EDIT: Actually, the performance may have come from a few extra shaders; the dual graphics engine seemed to have more to do with not killing visual performance while doing tessellation:
http://www.techradar.com/reviews/pc...hics-cards/amd-radeon-hd-6970-915716/review/2

Not endorsing this theory, just wondering what it would do.
 
For those who are just stopping in, could you quickly clarify once more why you believe "5 duplicate components" do not point to a dual graphics engine?

Sure, for one, the components that make up a "setup engine" needn't each occupy an entire block. In Brazos, we have vertex setup, geometry setup, and the tessellator on one block. The rasterizer is a separate block. In all documentation I've read, the tessellator is listed as being within the vertex setup block. Meanwhile, HiZ is listed as a function performed by the Rasterizer and depth buffer (one of the ROPs).

Common sense also dictates that a dual setup engine would not make sense in a GPU the size of Latte. The only GPUs they are found in are ~2TFLOP behemoths. In short, they are necessary to keep that many ALUs fed.

Finally, after doing an extensive comparison, I am near certain that 3 of the duplicated blocks may be identified as TMUs, L1 texture cache, and ROPs. The other two may be memory controllers and L2 texture cache, but I am not quite as sure on those two. Basically, these are things which are found in various multiples in all Radeon cards and should have been the first thing we turned to in order to explain the duplicate blocks. But since the floor plan of Latte is so different from what we are used to seeing, it was not as immediately apparent.
 

MDX

Member
I'm suggesting that the Wii U may be using a custom dual graphics engine design in conjunction with the RV700 tech. A hybrid chip of sorts. I have seen no better explanation provided for the 5 duplicate components.


And developers have only access to half of it???
 

krizzx

Junior Member
Sure, for one, the components that make up a "setup engine" needn't each occupy an entire block. In Brazos, we have vertex setup, geometry setup, and the tessellator on one block. The rasterizer is a separate block. In all documentation I've read, the tessellator is listed as being within the vertex setup block. Meanwhile, HiZ is listed as a function performed by the Rasterizer and depth buffer (one of the ROPs).

Common sense also dictates that a dual setup engine would not make sense in a GPU the size of Latte. The only GPUs they are found in are ~2TFLOP behemoths. In short, they are necessary to keep that many ALUs fed.

Finally, after doing an extensive comparison, I am near certain that 3 of the duplicated blocks may be identified as TMUs, L1 texture cache, and ROPs. The other two may be memory controllers and L2 texture cache, but I am not quite as sure on those two. Basically, these are things which are duplicated in all Radeon cards and should have been the first thing we turned to in order to explain the duplicate blocks. But since the floor plan of Latte is so different from what we are used to seeing, it was not as immediately apparent.

If 2 of the duplicate components are TMUs, then what are blocks J1-4/N1-4?

I'm not saying it is actually one of those chips. I'm simply suggesting that it could be borrowing the setup and using it on a smaller scale. Though if what you are saying is true, then I need to do some more research into this. I will shoot BG a message as well, because I'm mostly following his analysis.

And developers have only access to half of it???
I wouldn't say that. I would say that they are just not fully utilizing it, with most games being ports. I was also presenting it to explain the huge geometry increase observed in some of the first party games when compared to last gen games.
 

Absinthe

Member
Sure, for one, the components that make up a "setup engine" needn't each occupy an entire block. In Brazos, we have vertex setup, geometry setup, and the tessellator on one block. The rasterizer is a separate block. In all documentation I've read, the tessellator is listed as being within the vertex setup block. Meanwhile, HiZ is listed as a function performed by the Rasterizer and depth buffer (one of the ROPs).

Common sense also dictates that a dual setup engine would not make sense in a GPU the size of Latte. The only GPUs they are found in are ~2TFLOP behemoths. In short, they are necessary to keep that many ALUs fed.

Finally, after doing an extensive comparison, I am near certain that 3 of the duplicated blocks may be identified as TMUs, L1 texture cache, and ROPs. The other two may be memory controllers and L2 texture cache, but I am not quite as sure on those two. Basically, these are things which are found in various multiples in all Radeon cards and should have been the first thing we turned to in order to explain the duplicate blocks. But since the floor plan of Latte is so different from what we are used to seeing, it was not as immediately apparent.

Thank you.
 
If 2 of the duplicate components are TMUs, then what are blocks J1-4/N1-4?

N1-N8 are definitely the shaders. J1-J4 are in all likelihood fixed function interpolation units. Admittedly, I have not found much support for that notion, but the only counter seems to be that, starting with DirectX11 cards, AMD made interpolation a programmable function of the SPUs. That needn't be the case in Latte, and we must not fall into the trap of saying, "This would be better, so Nintendo must have included it." That logic has come back to bite us again and again.
 

krizzx

Junior Member
N1-N4 are definitely the shaders. J1-J4 are in all likelihood fixed function interpolation units. Admittedly, I have not found much support for that notion, but the only counter seems to be that, starting with DirectX11 cards, AMD made interpolation a programmable function of the SPUs. That needn't be the case in Latte, and we must not fall into the trap of saying, "This would be better, so Nintendo must have included it." That logic has come back to bite us again and again.

That is where I run into a problem with what you are suggesting. Didn't Marcan and even Iwata confirm something along the lines of there being no fixed function hardware?

I can see where you are getting the 160 claim from now, if you are only counting 4 of the chips as shaders. That makes more sense than 8 chips that were 90% smaller than the company's 40-SP blocks but 55% larger than the 20-SP blocks being 160 altogether.

Though, all of that seems to ride on the assumption that those 4 blocks are fixed function. I'm still finding it hard to swallow.
 