
Wii U GPU base specs: 160 ALUs, 8 TMUs, 8 ROPs; Rumor: Wii U hardware was downgraded

Status
Not open for further replies.
Completely lost with the 192 threads. I have heard of threading in CPUs but not in GPUs. Time to learn a bit more. How does that work, for dummies?


WiiU also supports multi-threaded rendering that scales perfectly with the number of cores you throw at it, unlike PC DX11 deferred contexts which don't scale very well.

This is also very interesting. Does the CPU help in rendering? Most thought the CPU was crap and that the GPGPU was going to help the CPU, not the other way around.
 
Completely lost with the 192 threads. I have heard of threading in CPUs but not in GPUs. Time to learn a bit more. How does that work, for dummies?

Same way, but on a much larger scale, pretty much. They just give everything its own thread... and these GPUs can juggle more threads than they can execute at any given time.
 
How does this compare to PS4/XOne.

I don't care to look up the specifics, but PS4/Xbone are certainly in a different league.

So would you lose that efficiency if bandwidth from eDRAM was less than ideal?

Naturally, if you are talking about "efficiency," anything more or less than the ideal will decrease that value. But what is that ideal? For 160 shaders, a shitton of bandwidth shouldn't be necessary. It would be a waste.
 

Soren01

Member
So assuming this is an actual dev who has worked on Wii U... can we assume this is the best leaked info we have? Some are saying he mixed up threads and ALUs... is that possible for a dev in today's world, and if so, can we mention that in the forums over there and see if he corrects his initial comments? We do have the fact that Project CARS was canceled on PS360 because the devs had a certain vision and visual quality they wanted to hit... but it wasn't canceled on Wii U, and now a dev from that game comes out and gives us a good bit of leaked information.

I am intrigued. If true, that shows this custom GPU is exactly that, custom, and we will only know more if someone leaks info.

Wow. Didn't know that. Slightly Mad put Wii U in a good position.
 
I don't care to look up the specifics, but PS4/Xbone are certainly in a different league.



Naturally, if you are talking about "efficiency," anything more or less than the ideal will decrease that value. But what is that ideal? For 160 shaders, a shitton of bandwidth shouldn't be necessary. It would be a waste.

Could these threads also explain the size of each ALU?
 

Soren01

Member
yep here is the actual press release.

“Project CARS has always led the pack in terms of insane detail,” said head of studio Ian Bell in a statement. “Whether that’s graphically in the craftsmanship of our cars and tracks, technically in the way we’ve approached weather and time of day, or emotionally in how each car feels and responds to your touch.”

“These powerful new platforms allow us therefore to not compromise on the quality of our vision and ultimately that means players are going to experience something truly breathtaking when they get behind the wheel.”

So yeah, they grouped Wii U in with that group. Again, it's a customized console, and if it's developed for the right way and not looked down on because it doesn't have 8GB of RAM, developers with a passion for making great games won't find excuses... guess Shin'en was right: it's on the developers, not the hardware.

I'll buy the Wii U version, then. :)
 

Argyle

Member
Completely lost with the 192 threads. I have heard of threading in CPUs but not in GPUs. Time to learn a bit more. How does that work, for dummies?

This probably explains it better than I could:

http://fgiesen.wordpress.com/2011/07/10/a-trip-through-the-graphics-pipeline-2011-part-8/
http://fgiesen.wordpress.com/2011/10/09/a-trip-through-the-graphics-pipeline-2011-part-13/

(also, linked from above: http://bps10.idav.ucdavis.edu/talks/03-fatahalian_gpuArchTeraflop_BPS_SIGGRAPH2010.pdf)

This is also very interesting. Does the CPU help in rendering? Most thought the CPU was crap and that the GPGPU was going to help the CPU, not the other way around.

These are not referring to the same things, the "multithreaded rendering scaling across CPUs" refers to feeding the GPU with commands (draw calls) from more than one CPU thread. This is one of the problems that Mantle is expected to solve on PC.
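To make the "feeding the GPU from more than one CPU thread" idea concrete, here is a minimal sketch in Python. It is an illustration only, not the Wii U or DX11 API: `record_commands` and `build_frame` are hypothetical names, the "commands" are plain tuples, and Python threads will not give a real speedup here. The point is the structure: each worker records its own command list, and the lists are then submitted in a fixed order.

```python
from concurrent.futures import ThreadPoolExecutor


def record_commands(chunk):
    """Record draw commands for one slice of the scene (runs on a worker thread)."""
    # Hypothetical command format: a ("draw", object_id) tuple per object.
    return [("draw", obj) for obj in chunk]


def build_frame(objects, workers=3):
    """Record command lists on several threads, then submit them in order."""
    # One contiguous chunk per worker; max(1, ...) avoids a zero step for empty scenes.
    size = max(1, -(-len(objects) // workers))  # ceiling division
    chunks = [objects[i:i + size] for i in range(0, len(objects), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in chunk order, so submission order stays deterministic.
        command_lists = list(pool.map(record_commands, chunks))
    # A single thread concatenates the lists, standing in for the final GPU submit.
    return [cmd for command_list in command_lists for cmd in command_list]
```

Calling `build_frame(list(range(10)))` gives the same draw order a single thread would produce; the recording work is what gets spread across cores.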
 

mikk

Neo Member
OK, I don't know if this is already known, but here is a forum post from the official Project CARS forum. It is from the "PC Render Coder" of the project, in a discussion about WiiU. He replied:


He worked on the Most Wanted WiiU engine conversion and on pCARS as well, although it's not optimized yet.
 
This probably explains it better than I could:

http://fgiesen.wordpress.com/2011/07/10/a-trip-through-the-graphics-pipeline-2011-part-8/
http://fgiesen.wordpress.com/2011/10/09/a-trip-through-the-graphics-pipeline-2011-part-13/

(also, linked from above: http://bps10.idav.ucdavis.edu/talks/03-fatahalian_gpuArchTeraflop_BPS_SIGGRAPH2010.pdf)



These are not referring to the same things, the "multithreaded rendering scaling across CPUs" refers to feeding the GPU with commands (draw calls) from more than one CPU thread. This is one of the problems that Mantle is expected to solve on PC.

That is quite a lot to read through, care to highlight some stand out points that can explain the "192 threads" discussion?
 

Argyle

Member
That is quite a lot to read through, care to highlight some stand out points that can explain the "192 threads" discussion?

Basically each pixel/vertex gets its own thread, these are usually grouped into "quads" which are groups of four threads, and those are grouped into "wavefronts" or "warps", and for each of those groupings they are all essentially running the same program (obviously the data and the results are different per pixel/vertex, but the shader program is the same).

The PDF in the third link that I edited in does a good job of explaining things with pictures.
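As a rough sketch of the grouping described above (pixel, then quad, then wavefront), here it is in Python. The function names are made up for illustration, and real hardware does this in fixed-function logic rather than software. The wavefront size of 64 is the usual AMD figure; NVIDIA warps are 32 threads.

```python
WAVEFRONT_SIZE = 64  # threads per AMD wavefront (NVIDIA warps use 32)
QUAD_SIZE = 4        # a quad is a 2x2 block of pixels


def quads_for_tile(width, height):
    """Split a width x height pixel tile into 2x2 quads (even dimensions assumed)."""
    quads = []
    for y in range(0, height, 2):
        for x in range(0, width, 2):
            quads.append([(x, y), (x + 1, y), (x, y + 1), (x + 1, y + 1)])
    return quads


def wavefronts_for_tile(width, height):
    """Pack quads into wavefronts of WAVEFRONT_SIZE threads (16 quads each)."""
    quads = quads_for_tile(width, height)
    per_wave = WAVEFRONT_SIZE // QUAD_SIZE
    return [quads[i:i + per_wave] for i in range(0, len(quads), per_wave)]


def run_shader(wavefront, shader):
    """Run the same shader program for every pixel in the wavefront."""
    # Same program, different data: each "thread" gets its own pixel coordinate.
    return {pixel: shader(pixel) for quad in wavefront for pixel in quad}
```

A 16x8 tile has 128 pixels, so `wavefronts_for_tile(16, 8)` yields two wavefronts of 64 threads each, and `run_shader` applies one program to all of them.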
 
If I end up getting a Wii U I'll probably buy Project CARS since I wouldn't mind having a non-arcade racing game and I'd like to support these guys if it's a good port.

I just hope other people do likewise.
 

fred

Member
good choice it will need all the support it can get.



Can't wait to see actual gameplay of this on Wii U. So the Project CARS Wii U version is using the MW Wii U engine. Hope those optimizations get used and we get a great version in its own right.

It probably won't be the same engine, he just has experience working on converting an engine from the PS3/360 to the Wii U. I could be wrong but that's what it sounds like, have no idea which engine(s) Most Wanted and Project CARS use tbh.

My phone is freaking out yet again so hopefully I haven't messed those quotes up!
 

disap.ed

Member
OK, I don't know if this is already known, but here is a forum post from the official Project CARS forum. It is from the "PC Render Coder" of the project, in a discussion about WiiU. He replied:

Considering this is an official game dev, what do you think about his remarks on the console's performance ?

I really don't understand why you would post this here!
Didn't you get that the WMD forums are closed for a reason, and that we were told not to post any WiiU specific updates outside?
Didn't you think for a second that this could get the dev in trouble because of NDAs?
I sincerely hope they will ban you from the WMD forums (as announced by AndyG for breaking these rules). I hope your 15 minutes of fame were worth it.
 
Well it seems that the 160 ALU count was the right one. Are those 160 ALUs the same as the ones of a vanilla R700? If that was true, then the bigger space taken by the ALUs would point to a bigger fabrication node than just 40nm.
55nm? Even 90nm wouldn't surprise me at this point considering how large they are compared to other 40nm parts!!! o_O
If Nintendo has gone for a 90nm process... isn't that a bit "weird", to say the least? The chips end up being much bigger and more expensive than equivalent chips on a smaller fabrication process. And if that wasn't enough, 90nm is such an ancient technology that I still can't believe Nintendo would have cheaped out to that extent.
 

wsippel

Banned
Well it seems that the 160 ALU count was the right one. Are those 160 ALUs the same as the ones of a vanilla R700? If that was true, then the bigger space taken by the ALUs would point to a bigger fabrication node than just 40nm.
55nm? Even 90nm wouldn't surprise me at this point considering how large they are compared to other 40nm parts!!! o_O
If Nintendo has gone for a 90nm process... isn't that a bit "weird", to say the least? The chips end up being much bigger and more expensive than equivalent chips on a smaller fabrication process. And if that wasn't enough, 90nm is such an ancient technology that I still can't believe Nintendo would have cheaped out to that extent.
It's 40nm.
 
It's 40nm.
It's what makes more sense, but the size of the 3MB of eDRAM/eSRAM on the die photo compared to the area it occupied on the GameCube at 180nm, and the ALUs being twice as big as those of a vanilla R700 at the same fabrication process, point towards a pretty inefficient 40nm.
I mean, could the "40nm" used in the WiiU have the same density as 90nm at TSMC?
Because the ALUs occupy DOUBLE the area they do on equivalent designs at 40nm, and it seems that they're pretty normal ALUs without any "secret sauce" that could explain this increase...
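For a back-of-the-envelope feel for the node question, here is a small Python sketch. It assumes ideal scaling, where the area of the same logic grows with the square of the feature size. That is a loud assumption: real processes, cell libraries and layouts do not scale this cleanly, so treat the numbers as a sanity check only. `ideal_area_ratio` is a name made up for this example.

```python
def ideal_area_ratio(node_nm, reference_nm=40.0):
    """Area of identical logic at node_nm relative to reference_nm.

    Assumes ideal scaling (area proportional to feature size squared),
    which real fabrication processes only approximate.
    """
    return (node_nm / reference_nm) ** 2


for node in (40, 55, 90):
    print(f"{node}nm: {ideal_area_ratio(node):.2f}x the 40nm area")
```

Under that idealisation 55nm comes out at roughly 1.9x the 40nm area and 90nm at roughly 5x, so ALUs that are about twice the expected size are far short of what a 90nm process would suggest by this crude measure.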
 
Same way, but on a much larger scale, pretty much. They just give everything its own thread... and these GPUs can juggle more threads than they can execute at any given time.
I see. Is it normal for GPUs to have more threads than ALUs, though? I thought they were typically a 1:1 ratio. Though I believe you mentioned earlier that Xenos had fewer threads compared to its ALU count.

It's what makes more sense, but the size of the 3MB of eDRAM/eSRAM on the die photo compared to the area it occupied on the GameCube at 180nm, and the ALUs being twice as big as those of a vanilla R700 at the same fabrication process, point towards a pretty inefficient 40nm.
I mean, could the "40nm" used in the WiiU have the same density as 90nm at TSMC?
Because the ALUs occupy DOUBLE the area they do on equivalent designs at 40nm, and it seems that they're pretty normal ALUs without any "secret sauce" that could explain this increase...
The unique ratio of ALUs compared to threads may point to modifications to increase the efficiency of all the ALUs. There is also the possibly weird configuration ratio (160:8:8?) to look at.
 

wsippel

Banned
Wild guess: The four blocks above the shader units on the die are TEV & TC units, though only the block on the very left is fully compatible with Hollywood, which is why it's bigger than the other three. It's really hard to tell from the Flipper die shot, but I believe the register layout might match. If my assumption is correct, the system has no TMUs at all, but supports up to 32 textures at a time and can handle basic ops using the TEV register combiners without even tapping into the shader ALUs. That seems like the Nintendo Way(TM) of doing things and matches what was said in the Iwata Asks about the chipset.
 
I see. Is it normal for GPUs to have more threads than ALUs, though? I thought they were typically a 1:1 ratio.

I can only speak for compute via OpenCL or CUDA. There, modern GPUs can launch between 1024 and 2048 threads per CU/SM, supported by large register sets (a current AMD CU has 64 shaders while an NVIDIA SM has 192). And you usually really need a lot more threads than you have shaders in order to hide memory latencies.
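A toy model of why more threads than shaders helps, in Python. It assumes every thread does a fixed amount of ALU work and then stalls for a fixed memory latency, and that switching threads is free. The function names and the cycle counts used with them are invented for illustration, not measurements of any real GPU.

```python
import math


def threads_needed_per_alu(memory_latency_cycles, compute_cycles_between_loads):
    """Threads one ALU must juggle so it always has work while others wait on memory."""
    # While one thread computes, the rest of its period is spent stalled;
    # other threads have to fill that gap.
    return math.ceil(1 + memory_latency_cycles / compute_cycles_between_loads)


def alu_utilisation(threads_per_alu, memory_latency_cycles, compute_cycles_between_loads):
    """Fraction of cycles the ALU is busy under this simplified model."""
    needed = 1 + memory_latency_cycles / compute_cycles_between_loads
    return min(1.0, threads_per_alu / needed)
```

With a hypothetical 400-cycle memory latency and 20 cycles of math between loads, `threads_needed_per_alu(400, 20)` is 21, while a single thread per ALU would keep it busy less than 5% of the time.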
 

Schnozberry

Member
Wild guess: The four blocks above the shader units on the die are TEV & TC units, though only the block on the very left is fully compatible with Hollywood, which is why it's bigger than the other three. It's really hard to tell from the Flipper die shot, but I believe the register layout might match. If my assumption is correct, the system has no TMUs at all, but supports up to 32 textures at a time and can handle basic ops using the TEV register combiners without even tapping into the shader ALUs. That seems like the Nintendo Way(TM) of doing things and matches what was said in the Iwata Asks about the chipset.

Along with the focus on latency via EDRAM, that would make this one peculiar beast indeed. It's not tuned like any current platform or PS4/Xbone.
 

Panajev2001a

GAF's Pleasant Genius
Wild guess: The four blocks above the shader units on the die are TEV & TC units, though only the block on the very left is fully compatible with Hollywood, which is why it's bigger than the other three. It's really hard to tell from the Flipper die shot, but I believe the register layout might match. If my assumption is correct, the system has no TMUs at all, but supports up to 32 textures at a time and can handle basic ops using the TEV register combiners without even tapping into the shader ALUs. That seems like the Nintendo Way(TM) of doing things and matches what was said in the Iwata Asks about the chipset.

That would be quite cool actually. Also, it seems only Nintendo pushed for having the low-latency embedded memory pool accessible by both CPU and GPU, unlike Xbox One, where the ESRAM can be accessed by the CPU too, but only through a slow connection. I always liked both SCE and Nintendo hardware approaches. Both exotic and custom in their own ways :).
 
Wild guess: The four blocks above the shader units on the die are TEV & TC units, though only the block on the very left is fully compatible with Hollywood, which is why it's bigger than the other three. It's really hard to tell from the Flipper die shot, but I believe the register layout might match. If my assumption is correct, the system has no TMUs at all, but supports up to 32 textures at a time and can handle basic ops using the TEV register combiners without even tapping into the shader ALUs. That seems like the Nintendo Way(TM) of doing things and matches what was said in the Iwata Asks about the chipset.

When looking at those blocks' location in relation to the UTDP, LDS, and SIMDs, and their registers those are definitely Interpolators.
 
lwilliams3 said:
The unique ratio of ALUs compared to threads may point to modifications to increase the efficiency of all the ALUs. There is also the possibly weird configuration ratio (160:8:8?) to look at.
Well, that ratio is much higher than what was found on the Xbox 360, but regarding the ALU size, it was 90% bigger in area than the equivalent number of ALUs found on a VLIW 5 hardware, so we should compare it with the number of threads an R800 can execute simultaneously.

EDIT: Well, this is a bit weird. I've been reading a bit about the R800, and if I'm not mistaken, they were still limited at 2 concurrent threads per SIMD. So if those 160 ALU can work on 192 threads... well, it seems that this can be an optimization towards GPGPU, because that's a lot more threads per ALU than those designs.
 

wsippel

Banned
When looking at those blocks' location in relation to the UTDP, LDS, and SIMDs, and their registers those are definitely Interpolators.
Why would it have four interpolators, though? Shouldn't it be one per SIMD? And why is one of them apparently different from the other three?
 
Considering wsippels new approach: If Nintendo is implementing TEVs once again, about what kind of boost are we talking about? 5% more to the last estimates? 100% in the hands of EAD? What IF wsippel is right?
 
Why would it have four interpolators, though? Shouldn't it be one per SIMD? And why is one of them apparently different from the other three?
Regarding the interpolators, from what I've read it should be close to 1 interpolator per TMU, but in the R700 there were even fewer interpolators than TMUs (32 interpolators for 40 TMUs and 40 SIMD, if I remember correctly).
 
I think you are really underestimating X... For one, we have seen alpha footage... but look at the scale, attention to detail, and overall visuals. The game probably won't release until Q3 2014... X will trump any game from 7th gen in the same open world environment.

X isn't looking like it's anywhere close to Red Dead or GTAV, and what we've seen of Metal Gear just kills it.

The game may have its strong points, but technically it's not impressive next to the best of the PS360.
 
Why would it have four interpolators, though? Shouldn't it be one per SIMD? And why is one of them apparently different from the other three?

I may be confusing the graphics pipeline, but is interpolation related to tex coordinate generation? I remember Flipper had a custom block for that. Perhaps they built it into J1 for BC?
 

wsippel

Banned
I may be confusing the graphics pipeline, but is interpolation related to tex coordinate generation? I remember Flipper had a custom block for that. Perhaps they built it into J1 for BC?
Ah, so TC? Yeah, as I wrote, if my assumption is correct, each of the four blocks would probably integrate both TC and TEV. They are next to each other on Flipper as well.
 
wsippel's guess about the integrated TEVs on the Interpolators? Good, since certain pixel effects could be done without using any of the "conventional" 160 ALU...
Couldn't this also be bad since no one aside from Nintendo's 1st party devs ever got comfortable with TEV? It would create a situation where 3rd party developers would always be at a disadvantage.

Except for Capcom, but they always seem to have a better handle of everyone's hardware than most developers.
 
Ah, so TC? Yeah, as I wrote, if my assumption is correct, each of the four blocks would probably integrate both TC and TEV. They are next to each other on Flipper as well.

Texture coordinate generation is my best guess. Again, I'm not sure if I'm thinking of the process correctly, however. I know what interpolation is in other contexts, and that it has to do with texture filtering on GPUs, but that's it. I don't think there's a reason for TEV on there. As blu mentioned in a post way way back, TEV is more akin to a primitive pixel shader than a TMU. Going by the comments of Nintendo's engineers and also Marcan, I'm fairly sure TEV isn't on there. Rather, it sounds like there's some rudimentary logic translating TEV instructions into shader language on the fly.
 

Powerwing

Member
Does the fact that Wii backward compatibility works perfectly, without any glitches (as far as I know), for all games, even those using TEV, imply that the Wii's specificities or «folkloric» features like TEV are somehow in the Wii U?
 

jmood88

Member
Am I missing something, or has someone bumped a year-old thread? The system has been out for a while, why does any of this matter?
 

AzaK

Member
Am I missing something, or has someone bumped a year-old thread? The system has been out for a while, why does any of this matter?

Because we're nerds. If you're not interested, hand back your pocket protector and socks and sandals NOW.
 