
WiiU "Latte" GPU Die Photo - GPU Feature Set And Power Analysis


krizzx

Junior Member
M°°nblade;59725757 said:
Yet you literally did say that the bird tech demo was the same as any other game on the hardware except it wasn't sold.

So I just asked you again, apart from not being sold, navigating a camera is the same as playing a game?

That's not really a strawman is it? I'm just asking you to clarify.

I did say that it was like a real retail game, because it is. As I said above, there are many games like that. Flower on the PS3 is a good example. Journey is a good example.

You are discounting the game based purely on the fact that it's a tech demo, but not showing why its being a tech demo discredits it.

It's not a prebaked demo like the ones shown for the PS3 that the console couldn't actually produce at the end of the day. It's a real-world, real-time capability tech demo running live on actual hardware, like the Peach's Castle demo for the Gamecube. It demonstrates what the console can do in real time.

All of you jumped on the fact that it wasn't a retail game, but did not specify why that made any difference.
 
What? They are relative to one another.

FPS = Frames per second
It is the same in this instance.

It's not. Polygons per frame is obviously a different measurement than polygons per second. I already gave you an easy-to-understand example in my previous post. Read it again.
 
I did say that it was like a real retail game, because it is. As I said above, there are many games like that. Flower on the PS3 is a good example.
And I repeat. But in Flower you actually control/move an object which interacts with the environment. Can you control or move the bird in Nintendo's tech demo?
If you can't, it's not exactly the same, is it?

You are discounting the game based purely on the fact that it's a tech demo, but not showing why its being a tech demo discredits it.
By now, I had kind of hoped the questions I'm posing would have shown you the difference I'm trying to explain: why you can't extrapolate from tech demo assets to a game.
 
M°°nblade;59726561 said:
And I repeat. But in Flower you actually control/move an object which interacts with the environment. Can you control or move the bird in Nintendo's tech demo?
If you can't, it's not exactly the same
Well, you can move the camera.
 

z0m3le

Banned
http://www.ign.com/articles/2001/08/29/rogue-leader-chat-transcript

While it was still a work in progress.

Question: To satisfy all the forum spec and number adorers, how many millions of polygons does RL tend to push in a standard situation per second, and how long does it tend to vary? and has their been any performance increases since previous showings?

LucasArts/Factor 5: If we would start counting the polygons now the game wouldn't be done, but we estimate most scenes at 12-15 million polygons per second. The version being shown in Europe is quite a performance increase in the Hoth level compared to previous showings.

However, polygon count is usually given per frame; the internet has bastardized this into per second, probably around the beginning of the PS2, GCN and Xbox generation. AMD GPUs are rated per second now as well, I believe, so an AMD GCN part @ 800MHz would offer 1.6 billion polygons per second, whereas GPU7 is either 550 million or 1.1 billion depending on whether it has a dual graphics engine.
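Back-of-the-envelope, those per-second figures just come from clock rate times triangles set up per clock; the triangles-per-clock values below are assumptions about the number of graphics engines, which is exactly the open question for GPU7:

```python
# Rough peak polygon (triangle setup) rates: peak polys/sec = core clock *
# triangles set up per clock. The per-clock figures are assumptions about
# how many setup/graphics engines each part has.

def peak_polys_per_second(clock_hz: float, tris_per_clock: int) -> float:
    return clock_hz * tris_per_clock

print(peak_polys_per_second(800e6, 2))  # 1.6e9 -> "1.6 billion" for an 800MHz AMD GCN part
print(peak_polys_per_second(550e6, 1))  # 5.5e8 -> 550 million for GPU7 @ 550MHz, single engine
print(peak_polys_per_second(550e6, 2))  # 1.1e9 -> 1.1 billion if it's a dual graphics engine
```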

One thing I think is very odd about GPU7 is its size. Taking the eDRAM out of the equation, it is around 104mm² if I remember correctly, which is simply huge for a 160 ALU part. For comparison, 40nm GPUs had 400 to 480 ALUs; Brazos, most of whose bulk is the Bobcat processor cores IIRC, is only 75mm² on the 40nm process and that is 80 ALUs, so even that chip is far smaller than GPU7... GPU7 being 160 ALUs would be very odd, especially with how packed the die is.
 

D-e-f-

Banned
M°°nblade;59726561 said:
And I repeat. But in Flower you actually control/move an object which interacts with the environment. Can you control or move the bird in Nintendo's tech demo?
If you can't , it's not exactly the same is it?

What you're arguing is that in-engine, real-time cut-scenes are not representative of the visual quality of a game because you're not doing anything in them.
 

z0m3le

Banned
That bird demo was running on Wii U dev kits that were underclocked and overheating. We know the dev kits were 1GHz and 400MHz for the CPU and GPU at some point, but they could actually have been below that, because afaik 1GHz and 400MHz was the target for the system and those numbers were "underclocked" at that E3.

Having said that, the bird demo wasn't running any real game logic. However, the game could be the bird picking up bugs from the grass or maybe catching fish; that wouldn't affect the performance very much at all and would make it qualify as a game. I find it a bit ridiculous to completely dismiss an apparent graphics demo of the hardware as something the Wii U couldn't do "in game", because that is a very broad term. It might not be able to have that sort of fidelity in some games, but in that type of environment, passing the controls onto the bird shouldn't drastically hurt performance, and with the clock speed bumps it probably wouldn't change a thing about frame rate or image quality; those could possibly even be improved.
 
Well, what qualifies as a game? Can we argue about something new, guys? :)

Anyway, I do know they had button prompts in the demo. I'm not so smart with this tech stuff, so I'm not sure if that kind of thing is normal for tech demos.
 

HTupolev

Member
Where did you hear that? The polygon throughput is directly proportional to the frame rate.

The count you can achieve at 120 FPS is half what you achieve at 60 FPS, which is half what you achieve at 30. FPS means frames per second, so that means 20 million polygons were drawn 60 times every second. Every time the frame rate doubles, the tax on the system doubles.

If the polygon count was frame rate independent, then pretty much every game would be using its consoles max.
So Rogue Squadron was pushing 1,200,000,000 polygons per second? Flipper was pushing through 7.4 polygons per clock on average? There were 65 polygons on-screen for every single pixel?

Then every WiiU naysayer is spot-on, because it's vastly outclassed by a console released in 2001.

(Polygons/Frame)*(Frames/Second)=(Polygons/Second)

(Polygons/Second/Frame)*(Frames/Second)=(Polygons/Second/Second)

That latter one is the polygon throughput acceleration, which would be an absolutely hilarious (but awesome) metric to find a use for.
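If anyone wants to check the arithmetic behind that reductio, here's the quick version (the only assumed inputs are Flipper's 162MHz clock and a 640x480 output):

```python
# Sanity check: treat Factor 5's "20 million" as a per-FRAME figure at 60 fps
# and see what it would imply for the GameCube's Flipper GPU.

polys_per_frame = 20_000_000      # the (mis)reading being tested
fps = 60
flipper_clock_hz = 162e6          # assumed Flipper core clock
pixels = 640 * 480                # assumed output resolution

polys_per_second = polys_per_frame * fps
print(polys_per_second)                      # 1,200,000,000 polygons per second
print(polys_per_second / flipper_clock_hz)   # ~7.4 polygons per clock
print(polys_per_frame / pixels)              # ~65 polygons per on-screen pixel
```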
 

z0m3le

Banned
So Rogue Squadron was pushing 1,200,000,000 polygons per second? Flipper was pushing through 7.4 polygons per clock on average? There were 65 polygons on-screen for every single pixel?

Then every WiiU naysayer is spot-on, because it's vastly outclassed by a console released in 2001.

(Polygons/Frame)*(Frames/Second)=(Polygons/Second)

(Polygons/Second/Frame)*(Frames/Second)=(Polygons/Second/Second)

That latter one is the polygon throughput acceleration, which would be an absolutely hilarious (but awesome) metric to find a use for.

Question: To satisfy all the forum spec and number adorers, how many millions of polygons does RL tend to push in a standard situation per second, and how long does it tend to vary? and has their been any performance increases since previous showings?

LucasArts/Factor 5: If we would start counting the polygons now the game wouldn't be done, but we estimate most scenes at 12-15 million polygons per second. The version being shown in Europe is quite a performance increase in the Hoth level compared to previous showings.

Since you missed my post up the page a bit: that was from August of that year, I believe, and the game was later confirmed to push up to 20 million per second. So you are right that polygon count traditionally was per frame, but the internet likes to misrepresent things all the time.
 
So Rogue Squadron was pushing 1,200,000,000 polygons per second? Flipper was pushing through 7.4 polygons per clock on average? There were 65 polygons on-screen for every single pixel?

Then every WiiU naysayer is spot-on, because it's vastly outclassed by a console released in 2001.
What? That's not what krizzx said at all...
Correct me if I'm wrong but I think all krizzx is saying is having higher fps (60 vs. 30) is more taxing on a system, which is obvious.
 
You do know that when they talk about 20 million polygons they are talking about polygons per second; this makes any talk about fps meaningless.
Simply put, framerate doesn't change the polygon throughput.
Yes, but it's not completely irrelevant information. I mean, although in terms of polygon count 20 million at 60fps is the same as 20 million at 30fps, there are other processes, like lighting, that are calculated every frame, so reaching 20 million at 60fps has more merit than doing it at 30fps.

krizzx said:
Factor 5 indicated they could get 20 million polygons per second with all effects.
Since it is 20 million polygons per second, in terms of polygon count it doesn't matter if it's 30fps or 60fps.
Don't you see that 20 million x 60 = 1200 million, which is more than double the theoretical polygon throughput of the Xbox 360, and even more than the WiiU even if the dual graphics engine gets confirmed?

20 million per second was a great feat on the GC/PS2/Xbox/DC generation, a high enough value to compete even against some Xbox 360 games in fact.

z0m3le said:
One thing I think is very odd about GPU7 is its size. Taking the eDRAM out of the equation, it is around 104mm² if I remember correctly, which is simply huge for a 160 ALU part. For comparison, 40nm GPUs had 400 to 480 ALUs; Brazos, most of whose bulk is the Bobcat processor cores IIRC, is only 75mm² on the 40nm process and that is 80 ALUs, so even that chip is far smaller than GPU7... GPU7 being 160 ALUs would be very odd, especially with how packed the die is.
I don't think it's 160 ALUs at all. I mean: the die area; the fact that there are no wasted spaces; the fact that the clock doesn't need to be as scalable as a PC GPU part (which can be sold as two different models simply by down-clocking it), which surely allows for a higher transistor density; the fact that some functions or parts that may be needed on PC for compatibility with older games or some extra functionality won't be included in a customized design like this; etc. etc.

160 ALUs was a plausible theory when speculation about TEV emulation was still in the air, but now that we know that this emulation is done through a "translator" chip and that the ALUs may have nothing special about them, I think that 160 ALUs is really difficult to justify considering all the facts.
 

z0m3le

Banned
What? That's not what krizzx said at all...
Correct me if I'm wrong but I think all krizzx is saying is having higher fps (60 vs. 30) is more taxing on a system, which is obvious.

Actually, I think he got this one wrong. I believe he said the Xbox pushed 12 million polygons @ 30fps; that means the Xbox was pushing ~400k polygons a frame while GCN's best was ~334k per frame.

I don't think it's 160 ALUs at all. I mean, the die area, the fact that there are no wasted spaces, the fact that the clock doesn't need to be as scalable as a PC GPU part (which can be sold as 2 different models simply down-clocking it) and surely allows for a higher transistor density, the fact that some functions or parts that may be needed on PC due to compatibility with older games or some extra functionalities won't be included in a customized design like this, etc. etc.
Yes, the SPU block is almost 92% larger than Brazos', so I find it unlikely as well.
160 ALUs was a plausible theory when speculation about TEV emulation was still on the air, but now that we know that this emulation is done through a "translator" chip and that the ALUs may have nothing of special, then I think that 160 ALUs is really difficult to justify considering all the facts.
Flipper in its entirety is 21 million transistors; GPU7 is nearly a billion, and ~750 million without the eDRAM, so I find it completely unrealistic that the Gamecube's entire GPU could bump the size of the SPUs so much.
 

z0m3le

Banned
Interesting, it looks like developers have no direct access to the 32MB of embedded memory, if true.

I doubt it's true. Since you cannot write and read from the Wii U's MEM2 at the same time, and the 360/PS3's read and write equal nearly twice the speed of the Wii U's memory, it wouldn't handle ports from those systems at all; developers have to be able to write to MEM1 in order to avoid this, IMO.
 
Actually I think he got this one wrong, I believe he said Xbox pushed 12 million polygons @ 30fps, that means Xbox was pushing ~400k Polygons a frame while GCN's best was ~334k per frame.

I'm not surprised if that's true. I had all 4 systems that gen and it seemed pretty obvious to me that, power-wise, they went Dreamcast < PS2 < Gamecube < Xbox. I've heard the Dreamcast is more powerful than the PS2, but it died an early death and couldn't show off its power. I don't know how accurate that is.

Sorry, that was way off topic. Back to lurking I go. But first I'd like to say I find this thread incredibly interesting, and don't want to see it locked. Could everyone please try to keep the attacks (on both sides) to a minimum? Thanks! ^_^
 
I doubt it's true, since you can not write and read from Wii U's memory2 at the same time, and 360/PS3's read and write equal nearly twice the speed of Wii U's memory, it wouldn't handle ports from those systems at all, developers have to be able to write to Mem1 in order to avoid this IMO.
Also unlikely, because the Wii U's core philosophy is a way for Nintendo to keep their development pipeline alive and intact (on the CPU side) while adding a new GPU and SMP, keeping it so that they have to further develop their engines and tech to embrace them, but not port them per se and go through an optimization bout.

And in that GC/Wii pipeline there's the fully accessible ETC (embedded texture cache), which was a very viable way to spam the GPU and pull off "free" EMBM, fur shading and the like.
 
Actually I think he got this one wrong, I believe he said Xbox pushed 12 million polygons @ 30fps, that means Xbox was pushing ~400k Polygons a frame while GCN's best was ~334k per frame.
I don't think this means much (polys per frame, I mean). Given that RS is running at 60 fps compared to the Xbox game's 30 fps, the only aspect I think would be harder to achieve in the 30 fps scenario is that the assets would be bigger, so more RAM would be needed.

All the other aspects involved in rendering a game that I can think of are more demanding in a 300k-at-60fps scenario than in a 600k-at-30fps one, let alone 334k polys at 60fps in comparison to 400k at 30fps.


Yes, the SPU is almost 92% larger than brazo's so I find it unlikely as well.
The entire size for flipper is 21 million transistors, GPU7 is nearly a billion and ~750 without the eDRAM so I find it completely unrealistic that Gamecube's entire GPU could bump the size of the SPUs so much.
Well, I wasn't thinking of a Flipper/Hollywood being included as-is, but of each one of the 160 ALUs being able to perform the TEV operations at the same speed.
 

v1oz

Member
Those were only 2 small areas out of a huge number of things in which the Genesis dwarfed the SNES. The Genesis, using just its core hardware, could run FPS games better and produce more polygons. The Genesis also had a far superior sound system.
The Genesis had inferior sound quality to the SNES. The only advantage the Genesis had was built in FM synthesis. But that's not too useful for gaming purposes.

Yes, the Genesis has a faster CPU, but the SNES countered that with DSP chips. Most cross-platform games like Mortal Kombat 2 and Super Street Fighter looked & sounded better on the SNES. I won't even mention SNES exclusives.
 
M°°nblade;59727629 said:
But a camera doesn't interact with the environment which is what you usually do in a game. You just change your viewing point in a scripted scene.
I should have mentioned that I don't care about real-time physics. Give me scripted physics and better graphics and I'm all set.
 
According to Marcan, you can access it (unlike MEM0), but I guess it's possible that you're not allowed to.
I don't think so. I mean, would it be reasonable to cram 32MB of memory (wasting nearly 33% of the total die space) onto the GPU die and then not give direct access to it? For MEM0 I could understand it, because although it was accessible on GC/Wii, I think that blocking its direct use on the Wii U will allow Nintendo to eliminate those parts on future consoles without breaking backwards compatibility.
 
Actually I think he got this one wrong, I believe he said Xbox pushed 12 million polygons @ 30fps, that means Xbox was pushing ~400k Polygons a frame while GCN's best was ~334k per frame.
I'm not surprised if true. I had all 4 systems that gen and it seemed pretty obvious to me that power wise they went Dreamcast<PS2<Gamecube<Xbox. I've heard the Dreamcast is more powerful than the PS2 but it died an early death and couldn't show off its power. I don't know how accurate that is.
Gah.

Ok, let's tackle this. Not every developer of that generation released polygon figures, so there's no way to really know. The PS2 did 10 million polygons per second in some games, not more; the GC did quite a few 15-million-polygons-per-second games, it was its thing (the Metroid Prime series was regarded as 15 million, as was F-Zero GX); and I've heard the Xbox went as high as 15 million polygons per second at 30 fps in Rallisport Challenge.

We have no figures for RE4, but that one runs at 30 fps and has Ganados with 5,000 polygons everywhere, plus detailed closed environments and Leon and Ashley accounting for 10,000 polygons each. That's huge by that generation's standards, so on a per-frame basis it should be using a lot. It basically comes down to choice: they were doing 60 frames on the GC because they could. On the Xbox, though, not so much, and I'll explain. Not only did the Xbox take a bigger hit from texturing than the GC, hence lower textured polygon throughput, it also had a pipeline limitation of 4 textures per pass. The GC did 8 textures per pass, and it turns out everyone wanted that that generation, but nobody besides the GC had it. So the Xbox, in order to do it, had to render the geometry in multiple passes, which is the same as saying you rendered it twice; that cut your effective polygon throughput (and nice framerate) in half, yet it was done all the time.

The PS2 also did multiple passes often (it was actually designed for that, so the hit in doing so wasn't huge), and sometimes the announced polygon throughputs took that into account, meaning a game pushing 10,000 polygons on it might be doing as few as 5,000 with a mere two passes. Hence the polygon pushers on both PS2 and Xbox avoided multiple passes (VF4 looks somewhat simple, despite being high-polygon as hell), but most games that gen saw more advantage in going that route; hence how Halo 2 actually pushes fewer polygons per second than Halo 1 (which was a 10-million-polygon game), and Chronicles of Riddick and Doom 3 on Xbox did it as well; hell, the Splinter Cell games were doing it.

So no, the Xbox had the advantage in places, but not in polygon throughput or texturing capability (the latter being a byproduct of its polygon texturing limits).


As for the DC: it was more feature-rich, which meant it supported things the PS2 couldn't do or took too big a hit doing. It also supported texture compression and tile-based rendering, was a texturing beast, and did 480p for all games (something that had to be implemented per game on the PS2, and most devs wouldn't bother). The PS2 was terrible at texturing, hence most games used 4-8 bit textures, which was a huge downgrade from the DC days (and again, without texture compression lending a hand, which is also why they were downgraded in color depth). Then again, the DC's max throughput was 5 million polygons per second, in Le Mans 24 Hours (and it simply couldn't go higher due to RAM limitations, not to mention the GPU probably couldn't handle it); as previously said, the PS2 did 10 million polygons per second, so there's a palpable difference between the two systems.

So no, the DC is not more powerful than the PS2 and would have had a hard time competing throughout that generation; but the PS2 did some things spectacularly wrong in comparison to the DC, hence it pretty much only won because it came two years later. (Moore's law says performance should double every 18 months, and that was certainly true and easily achievable back then.)
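The multi-pass point above is just arithmetic, by the way: if a chip can only apply so many textures per pass, layering more textures means re-submitting the geometry, which divides the effective polygon throughput by the number of passes. A minimal sketch with made-up round numbers:

```python
# Illustrative only: effective polygon throughput when texture layers force
# multiple geometry passes. The 10M figure and layer counts are made up.
import math

def effective_polys_per_second(raw_pps: float, texture_layers: int,
                               textures_per_pass: int) -> float:
    passes = math.ceil(texture_layers / textures_per_pass)
    return raw_pps / passes

print(effective_polys_per_second(10e6, 8, 8))  # 10,000,000 (GC-style: 8 per pass -> 1 pass)
print(effective_polys_per_second(10e6, 8, 4))  # 5,000,000  (4 per pass -> 2 passes, half the throughput)
```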
 

wsippel

Banned
I don't think so. I mean, would it be reasonable to cram 32MB of memory (wasting near 33% of the total die space) on the GPU die and then not give direct access to it? For MEM0 I could understand it, because although on GC/Wii it was accessible, I think that blocking its direct use on WiiU will allow Nintendo to eliminate those parts on future consoles without breaking backwards compatibility.
MEM0 is apparently still used, but managed by the OS. Could be some sort of buffer between CPU and GPU, or between GPU and RAM.

But I really don't see the point in reserving MEM1 for libraries. Unless VG Leaks got confused, and the memory isn't actually used to store graphics libraries, just managed by those libraries. In which case developers could still use it by going through certain GX2 functions.
 
I've heard the Dreamcast is more powerful than the PS2 but it died an early death and couldn't show off its power. I don't know how accurate that is.

I don't think it is accurate. Actually, the situation is somewhat similar to what we see with 360/PS3 in comparison to Wii U. Early PS2 ports looked worse compared to their counterparts on Dreamcast, so some people said PS2 was weaker. Today, some people claim Wii U is slower than 360 or PS3 because some early ports perform worse.
The difference, of course, is that there was only a year and a half between the Dreamcast and PS2 releases.

Sorry, that was way off topic. Back to lurking I go. But first I'd like to say I find this thread incredibly interesting, and don't want to see it locked. Could everyone please try to keep the attacks (on both sides) to a minimum? Thanks! ^_^

I don't think it's that bad being off topic every now and then as long as it's still about console technology. Less personal attacks would be nice though.
 
MEM0 is apparently still used, but managed by the OS. Could be some sort of buffer between CPU and GPU, or between GPU and RAM.

But I really don't see the point in reserving MEM1 for libraries. Unless VG Leaks got confused, and the memory isn't actually used to store graphics libraries, just managed by those libraries. In which case developers could still use it by going through certain GX2 functions.
I don't doubt that MEM0 is still used, but blocking developers' access to it and limiting its uses to what Nintendo wants would allow a future console with (for example) 64MB of eDRAM to mimic those functionalities in the big pool of eDRAM and ditch the secondary ones, if this is indeed the reason Nintendo blocked access.

Regarding the MEM1 pool, I agree with you that it doesn't seem like the reasonable thing to do. I mean, where were the graphics libraries stored on the Xbox 360? Isn't a graphics library simply a bunch of functions related to graphics?
If Nintendo's strategy is to ditch those MEM0 pools in future designs, then I think that the 2MB pool of eDram would be a good place to store those libraries.
 

KidBeta

Junior Member
I don't doubt that MEM0 is still used, but blocking its access to developers and limiting its uses to what Nintendo wants would allow, in a future console with (for example) 64MB of eDram, to mimic those functionalities on the big pool of eDram and ditch the secondary ones if this is the reason Nintendo blocked its access.

Regarding the MEM1 pool, I agree with you that this is not what it seems reasonable to do. I mean, where were the graphic libraries stored on the Xbox 360? Isn't a graphic library simply a bunch of functions related to graphics?
If Nintendo's strategy is to ditch those MEM0 pools in future designs, then I think that the 2MB pool of eDram would be a good place to store those libraries.

Yeah, the reason it's there is because it's a dev kit.

The highest address is near 4GB, after all; the highest 2GB memory address is 0x80000000.
 

v1oz

Member
So I guess we can say you know exactly what's going on under the hood then...?

Nobody outside of Capcom does.

You know, I'd agree, but look at MH3U. Same engine, same situation, launch game, similar improvements, no stutter, runs at 1080p. But it's also a more high profile project, exclusive no less, and more effort went into it.
Same with RE:R. Some people have reported frame rate issues and stuttering with MH3U; I don't know why not everyone experiences it.

Where did you hear that? Because we recently got a statement from Shin'en on that topic, and they apparently disagree. And you'll have a hard time finding a studio that has more experience on Nintendo hardware. The system might be "easy to develop for", but that doesn't mean it's easy to exploit - it just means you don't have to bend over backwards to get something up and running on the system. The architecture doesn't really matter, either, Espresso, Xenon and Cell are still extremely different animals.

Exploiting performance gets easier with experience on any hardware. However, Nintendo's philosophy is to make hardware that is easy to develop for, keeping the architecture extremely balanced and straightforward, with very predictable performance and high efficiency. Exploiting performance is not difficult compared to other hardware (N64/PS2/PS3) because it's easier to anticipate the kind of performance you will hit pretty early on. Basically, you will rarely find any real show-stoppers or bottlenecks that kill performance on Nintendo hardware, and the hardware usually hits target specs whatever you throw at it. They learned their lessons after the N64, which was a nightmare to work with.

As for what we've heard: Tom Crago, CEO of Straight Right in Australia, said the console has been very easy to work with ("very straightforward to code for"). Fumihiko Yasuda from Team Ninja has also said the hardware is "very easy to develop for". Shin'en, from what I've read, also said the hardware is easy to work with.

The only issues we've heard of are that the tools are not as good or mature as the ones from Sony and Microsoft.
 
Eh, let's drop this discussion, because even though it gets brought up a lot in here, it's OT. Besides, if the claims of a developer who's worked on all three last-gen platforms don't convince you, nothing said in this thread will.

We also have devs, and plenty of games, that tell us otherwise; sounds like someone isn't being accurate with you.


http://www.ign.com/articles/2001/08/29/rogue-leader-chat-transcript

While it was still a work in progress.

Question: To satisfy all the forum spec and number adorers, how many millions of polygons does RL tend to push in a standard situation per second, and how long does it tend to vary? and has their been any performance increases since previous showings?

LucasArts/Factor 5: If we would start counting the polygons now the game wouldn't be done, but we estimate most scenes at 12-15 million polygons per second. The version being shown in Europe is quite a performance increase in the Hoth level compared to previous showings.

To add to this, I talked to a Factor 5 guy back in the day on the Maya email listserv. I remember asking him about polygon counts and such for their games, and he gave roughly the same range for RL (though he said 12 - 16) and said for RS they were nearly doubling that number.
 

blu

Wants the largest console games publisher to avoid Nintendo's platforms.
MEM0 is apparently still used, but managed by the OS. Could be some sort of buffer between CPU and GPU, or between GPU and RAM.

But I really don't see the point in reserving MEM1 for libraries. Unless VG Leaks got confused, and the memory isn't actually used to store graphics libraries, just managed by those libraries. In which case developers could still use it by going through certain GX2 functions.
Of course. Just because a pool of memory is not mapped into the game address space does not mean it's not usable by games/apps. Unless somebody would also claim that PC GPU memory is not accessible by applications.

Apropos, the much more interesting thing from that map is the sum of non-app areas:

0x2000000 + (0xe0000000 − 0xc0000000) + (0xffffffff − 0xf4000000) = 0x2e000000 = 771751936. What about the remaining 1GB - 771751936 = 301989888 bytes?
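(For anyone wanting to double-check the hex arithmetic, a quick sketch; the spans are taken as sizes straight from the figures above, with the last range counted inclusively up to the top of the 32-bit space:)

```python
# Double-checking the non-app totals from the quoted memory map.

non_app_sizes = [
    0x02000000,                 # first reserved block
    0xe0000000 - 0xc0000000,    # 0xc0000000 .. 0xe0000000
    0x100000000 - 0xf4000000,   # 0xf4000000 .. top of the 32-bit space (inclusive)
]

non_app_total = sum(non_app_sizes)
print(hex(non_app_total), non_app_total)   # 0x2e000000, 771751936 bytes (~736MB)

one_gb = 1024**3
print(one_gb - non_app_total)              # 301989888 bytes (exactly 288MB) left over
```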

Yeah, the reason is there is because its a dev kit.

The highest address is near 4GB after all, the highest 2GB memory address is 0x80000000
Address space != physical memory.
 

The_Lump

Banned
Interesting, it looks like developers have no direct access to the 32MB of embedded memory, if true.

Interesting indeed. Wonder where they got this? Thought devs had no detailed info at all?


(Also: Wow, this might be the first time I brought actual news to a thread. I've no idea what any of it means, of course, but still!)
 

wsippel

Banned
Of course. Just because a pool of mem is not mapped in the game address space that does not mean it's not usable by games/apps. Unless somebody would also claim that PC GPU mem is not accessible by applications.

Apropos, the much more interesting thing from that map is the sum of non-app areas:

0x2000000 + (0xe0000000 − 0xc0000000) + (0xffffffff − 0xf4000000) = 0x2e000000 = 771751936. What about the remaining 1GB - 771751936 = 301989888 bytes?
The map uses the full 32bit address space, no? The game seems to use 1GB starting at 0x02000000, so 0x02000000 - 0x42000000.
 

blu

Wants the largest console games publisher to avoid Nintendo's platforms.
The map uses the full 32bit address space, no? The game seems to use 1GB starting at 0x02000000, so 0x02000000 - 0x42000000.
Yes, I've accounted for that - 1GB is set aside for the app. From there on, the system areas mapped into the app address space total to ~700MB. Which means that the system-reserved portions not mapped into app space are ~300MB. Basically naively put, OS + system buffers = 300MB.
 

wsippel

Banned
Yes, I've accounted for that - 1GB is set aside for the app. From there on, the system areas mapped into the app address space total to ~700MB. Which means that the system-reserved portions not mapped into app space are ~300MB. Basically naively put, OS + system buffers = 300MB.
Ah, so you're talking about the OS reserved memory? The rest could simply be omitted I guess. Developers don't really have to know how that part works, not to mention most of the memory seems to be unused while a game is running.
 

krizzx

Junior Member
http://www.vgleaks.com/wii-u-memory-map/

Clearly from a dev kit, but interesting nonetheless.

Nice find. Though this more or less just reaffirms things we were already aware of. It also kind of contradicts some things we know, if I'm reading it right.

I have to ask, though: how did they come by this info? From reading it, it sounds like they got it by hacking in Wii mode, which would impose limits. It brings up one interesting point, though, if the 32 MB of 1T-SRAM really is inaccessible.

That would make the claims that it was offsetting the DDR3 false. That would in turn mean that performance of the DDR3 is much better than people were insisting on.
 

MDX

Member
Is there a summary/theory on the memory bandwidth
between DDR3 and the MCM, on the CPU die, on the GPU die,
and between the CPU and GPU?



 

tipoo

Banned
That would make the claims that it was offsetting the DDR3 false. That would in turn mean that performance of the DDR3 is much better than people were insisting on.

IMO that's too much conclusion-hopping, even if we knew that source was right. The markings on the DRAM modules themselves told us the bandwidth, and those don't lie. I remember the possibility of clocking beyond the bin was brought up a long time ago, but come on, for a console shipping in the millions that's a tough proposition. The argument, I know, will be how the Wii U surpasses the 360 in some games without faster RAM, but that could be down to the GPU architectures' different requirements; plus the 360, iirc, was 22GB/s read OR write, while this would be read AND write. But the max bandwidth would not change in either case.
 

krizzx

Junior Member
imo that's too much conclusion hopping even if we knew that source was right. The markings on the DRAM modules themselves told us the bandwidth, and those don't lie. I remember the possibility of clocking beyond the bin was brought up a long time ago, but come on, for a console shipping in the millions that's a tough proposition. The argument, I know, will be of how the Wii U surpasses the 360 in some games without faster RAM, but that could be down to GPU architectures different requirements, plus the 360 iirc was 22GB/s read OR write, while this would be read AND write. But the max bandwidth would not change in either case.

I didn't say "bandwith". I say performance. That means all things considered.

Other aspects people are not factoring into the bandwidth are the possibility of it being accessed in a dual-channel fashion (it is 2x512 after all) and the latency.
 
imo that's too much conclusion hopping even if we knew that source was right. The markings on the DRAM modules themselves told us the bandwidth, and those don't lie. I remember the possibility of clocking beyond the bin was brought up a long time ago, but come on, for a console shipping in the millions that's a tough proposition. The argument, I know, will be of how the Wii U surpasses the 360 in some games without faster RAM, but that could be down to GPU architectures different requirements, plus the 360 iirc was 22GB/s read OR write, while this would be read AND write. But the max bandwidth would not change in either case.

the 360 is 11 read 11 write
 

tipoo

Banned
I didn't say "bandwith". I say performance. That means all things considered.

Other aspects people are not factoring into the bandwidth are the possibility of it being accessed in a dual channel fashion (it is 2X512 after all) and the latency.



Ah, ok. Yes, performance could be higher than the bandwidth implies.

But no, that's not how dual channel works. The calculated bandwidth was already accounting for every single memory module. Dual Channel has to do with addressing multi module banks separately on multiple memory buses, but the total bus width is already accounted for here (4 16-bit devices). Not applicable. Dual channel doesn't magically create more bandwidth than the physical modules allow.
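To put numbers on that: peak bandwidth is just total bus width times transfer rate, and dual channel only changes how that bus is addressed, not its width. (The DDR3-1600 rating below is the commonly reported figure, treated here as an assumption; the four 16-bit devices are the ones mentioned above.)

```python
# Peak DDR3 bandwidth from the module configuration: bus width in bytes times
# transfers per second. Splitting the devices into channels doesn't change the
# total bus width, so it can't raise this peak.

devices = 4
bits_per_device = 16
transfers_per_second = 1600e6     # DDR3-1600 = 1600 MT/s (assumed rating)

bus_width_bytes = devices * bits_per_device // 8          # 8 bytes (64-bit bus)
peak_bytes_per_second = bus_width_bytes * transfers_per_second
print(peak_bytes_per_second / 1e9)                        # 12.8 GB/s total, read or write
```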


the 360 is 11 read 11 write

Yes, as I said. It's 22 for read OR write, half for both.

the way you said it wasn't clear at all

22/2 = 11, for your clarity.
 

krizzx

Junior Member
Ah, ok. Yes, performance could be higher than the bandwidth implies.

But no, that's not how dual channel works. The calculated bandwidth was already accounting for every single memory module. Dual Channel has to do with addressing banks separately, but the total bus width is already accounted for here. Not applicable. Dual channel doesn't magically create more bandwidth than the physical modules allow.




Yes, as I said. It's 22 for read OR write, half for both.

Then I guess it just boils down to latency.
 
Ah, ok. Yes, performance could be higher than the bandwidth implies.

But no, that's not how dual channel works. The calculated bandwidth was already accounting for every single memory module. Dual Channel has to do with addressing multi module banks separately on multiple memory buses, but the total bus width is already accounted for here (4 16-bit devices). Not applicable. Dual channel doesn't magically create more bandwidth than the physical modules allow.




Yes, as I said. It's 22 for read OR write, half for both.

the way you said it wasn't clear at all
 
This might be a dumb question, but what do they mean exactly by "graphics libraries"? I would be very surprised if MEM1 isn't being used at least as a frame buffer at this point - perhaps doing its job automatically? That could also explain why Vsync is so abundant, if there's double buffering but not much dev control over memory management in the eDRAM.

This does bode well, perhaps, if Nintendo allow more direct control over MEM1 in the future. It would mean we're not seeing it used for render to texture or storage of local render targets at the moment, would it not?

It also makes it seem a bit more likely that MEM0 is acting as some type of buffer/cache for the DDR3 interface (not that the GPU doesn't have its own caches as well). Perhaps it is specifically for CPU access to MEM2, going by its position on die.
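(Rough numbers on why 32MB is plenty for double-buffered render targets at 720p; the 32-bit color/depth formats here are just assumptions for illustration:)

```python
# Illustrative framebuffer sizing: how much of a 32MB MEM1 pool double-buffered
# 720p render targets would occupy. Formats are assumed, not confirmed details.

def buffer_mb(width: int, height: int, bytes_per_pixel: int) -> float:
    return width * height * bytes_per_pixel / (1024 * 1024)

color = buffer_mb(1280, 720, 4)   # ~3.5 MB per 32-bit color buffer
depth = buffer_mb(1280, 720, 4)   # ~3.5 MB for a 32-bit depth/stencil buffer

total = 2 * color + depth         # double-buffered color + one depth buffer
print(round(color, 2), round(total, 2))   # 3.52 MB each, ~10.55 MB total, well under 32 MB
```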
 

tipoo

Banned
This might be a dumb question, but what do they mean exactly by "Graphics libraries?" I would be very surprised if MEM1 isn't being used at least as a frame buffer at this point - perhaps doing its job automatically? Also could explain why Vsync is so abundant if there's double buffering but not much dev control over memory management in the eDRAM.


I was pondering that as well; surely they don't mean the whole custom OpenGL library is just dumped in there? Where does the library sit on other consoles?
 