nordique
Interesting discussion regarding Wii U memory bandwidth and what-not from Not Enough Shaders
source:
http://www.notenoughshaders.com/2013/01/17/wiiu-memory-story/
(Recommend taking a few mins to read the whole thing, on their site)
Excerpts:
That's it, the Wii U has finally been released, and while plenty of players are enjoying new game experiences and interesting GamePad applications, a smaller but rather vocal group of gamers on the forums is divided over the technical viability of the console for productions other than Nintendo's. These concerns stem mainly from several laborious ports, to say the least; lukewarm comments on the CPU by 4A Games and DICE, aggravated by its disclosed frequency and apparent architectural roots; and the system's first teardowns. Those teardowns revealed the type of main memory (RAM) used, which disappointed many techies. Let's dig further into this affair, with exclusive words from developers stating that it's not an issue.
There are several important parameters for this RAM: among others, the capacity, the latency, and the bandwidth. Bandwidth is the amount of data per second read from or written to the memory by other components, calculated by multiplying the width of the interface connecting those parts (the X-bit bus that you may have encountered on spec lists) by the frequency at which the data is transferred (see here for more information on the topic). On the first point, the Wii U has 2GB, of which 1GB is set aside for games; that's twice the 512MB of the Xbox 360 and PS3. Until the release of its next-gen rivals, the Wii U is the console with the most memory, so much so that developers like Ubisoft's Michel Ancel have praised this volume. It's the same flattering portrait for latency, the main target of Shin'en's Manfred Linzner's compliments on the system in our exclusive interview.
..."There are four 4Gb (512MB) Hynix DDR3-1600 devices surrounding the Wii U’s MCM (Multi Chip Module). Memory is shared between the CPU and GPU, and if I’m decoding the DRAM part numbers correctly it looks like these are 16-bit devices giving the Wii U a total of 12.8GB/s of peak memory bandwidth."...
This is clearly a nice leap from the Wii's 5.6GB/s bandwidth, but roughly 40% slower than the Xbox 360 and PS3.
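As a sanity check on those figures, here is a minimal back-of-the-envelope sketch (the chip count and bus widths come from the teardown quote above, not from official documentation):

```python
def peak_bandwidth_gb_s(bus_width_bits, transfers_per_sec):
    """Peak bandwidth = bus width in bytes x effective transfer rate."""
    return (bus_width_bits / 8) * transfers_per_sec / 1e9

# Wii U: four 16-bit DDR3-1600 chips -> 64-bit bus at 1600 MT/s
wii_u = peak_bandwidth_gb_s(4 * 16, 1600e6)

# Xbox 360 main RAM: 128-bit GDDR3 at 1400 MT/s (the article's 22.4GB/s figure)
current_gen = peak_bandwidth_gb_s(128, 1400e6)

print(wii_u, current_gen)       # 12.8 22.4
print(1 - wii_u / current_gen)  # ~0.43, i.e. roughly 43% slower
```

The ~43% deficit computed here matches the "up to 43% slower" figure quoted later in the interview.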
Facts
1 – Generally speaking, although it is a non-negligible parameter, RAM bandwidth is less vital than GPU power or the memory amount, especially as the Wii U mostly targets 720p for its content, thus requiring less fillrate and bandwidth than 1080p. The anonymous source involved in this article himself put this criterion into perspective, declaring:
"In general all those DRAM numbers are not of much importance. Much more for example are the size and speed of caches for each core in the CPU. Because DRAM is always slow compared to caches speed."
Caches are faster but much smaller pools of memory than the RAM, in which repeatedly accessed data is stored, and our anonymous developer lauded the Wii U CPU's caches in our little chat. So those caches should ease, to a certain degree, the slow-RAM issue for the CPU. Then what about the GPU?
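The developer's point about caches can be illustrated with the classic average-memory-access-time model (the latencies and hit rates below are made-up round numbers for illustration, not measured Wii U figures):

```python
def avg_access_time_ns(hit_rate, cache_ns, dram_ns):
    """Simple two-level model: most accesses hit the fast cache,
    and only the misses pay the full DRAM latency."""
    return hit_rate * cache_ns + (1 - hit_rate) * dram_ns

# With a 98% hit rate, slow DRAM barely matters...
good_locality = avg_access_time_ns(0.98, 2, 100)  # ~3.96 ns on average
# ...but with poor locality, the DRAM latency dominates.
poor_locality = avg_access_time_ns(0.80, 2, 100)  # ~21.6 ns on average
```

This is why "size and speed of caches" can outweigh raw DRAM numbers: code with good locality spends most of its time at cache speed regardless of how slow the DRAM behind it is.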
2 – The Wii U supposedly includes 32MB of "embedded DRAM", a costly memory integrated on the same die as other components, apparently the GPU in the Wii U's case, as on the Wii and Xbox 360 (for the latter, it was on a daughter die but in the same package as the GPU). The gains of this kind of memory compared to traditional stand-alone RAM chips are huge in pretty much all areas, like latency and bandwidth (feasibly reaching XXXGB/s rates). You might consider this eDRAM as another cache, but unlike the CPU's, it can be accessed by the GPU. It's an efficient solution to spare the RAM from the heavy bandwidth usage demanded by the image processing handled by the GPU, since that traffic will hit this dedicated memory instead. Here are the motives of the Xbox 360 architects behind its adoption:
Re; eDRAM
HD, alpha blending, z-buffering, antialiasing, and HDR pixels take a heavy toll on memory bandwidth. Although more effects are being achieved in the shaders, postprocessing effects still require a large pixel-depth complexity. Also as texture filtering improves, texel fetches can consume large amounts of memory bandwidth, even with complex shaders. One approach to solving this problem is to use a wide external memory interface. This limits the ability to use higher-density memory technology as it becomes available, as well as requiring compression. Unfortunately, any compression technique must be lossless, which means unpredictable—generally no good for game optimization. In addition, the required bandwidth would most likely require using a second memory controller (a circuit that manages the flow of data with the RAM) in the CPU itself, rather than having a unified memory architecture, further reducing system flexibility.
EDRAM was the logical alternative. It has the advantage of completely removing the render target and the z-buffer bandwidth from the main-memory bandwidth equation. In addition, alpha blending and z-buffering are read-modify-write processes, which further reduce the efficiency of memory bandwidth consumption. Keeping these processes on-chip means that the remaining high-bandwidth consumers—namely, geometry and texture—are now primarily read processes. Changing the majority of main-memory bandwidth to read requests increases main memory efficiency by reducing wasted memory bus cycles caused by turning around the bidirectional memory buses.”
...
Concretely, the eDRAM can act as an ultra-fast "bridge" between the RAM and the CPU/GPU, indirectly mitigating the hypothetical slowness of the RAM and boosting the overall performance of the memory chain. ...
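To give a rough sense of what eDRAM takes off the main RAM's shoulders, here is an illustrative estimate of pure render-target traffic; the overdraw factor and per-pixel costs are assumptions chosen for the sketch, not measured figures:

```python
def render_target_traffic_gb_s(width, height, fps, overdraw,
                               bytes_per_pixel_op=16):
    """16 bytes per shaded pixel: read+write of a 32-bit colour value
    plus read+write of a 32-bit depth value (alpha blending and
    z-buffering are read-modify-write, as the Xbox 360 quote notes)."""
    return width * height * fps * overdraw * bytes_per_pixel_op / 1e9

# 720p at 60fps with 3x overdraw: ~2.65GB/s of pure framebuffer traffic,
# i.e. about a fifth of the 12.8GB/s DDR3 peak if it all hit main RAM.
traffic = render_target_traffic_gb_s(1280, 720, 60, overdraw=3)
```

If that traffic stays inside the eDRAM, the external DDR3 bus is left almost entirely to read-heavy work such as texture and geometry fetches, which is exactly the rationale the Xbox 360 architects describe above.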
Hypothesis
All the numbers we've mentioned are theoretical maximum bandwidths. In real conditions, the observed performance will be lower than advertised ...
The Wii U RAM's real bandwidth might sit closer to its theoretical peak than the Xbox 360's and PS3's do to theirs. That could be explained by a more modern memory controller that better handles the data flow to and from the RAM. ...
...We also must take into account several intricate matters of bus design and RAM transfer direction. For starters, the 22.4GB/s bus bandwidth of current-gen systems is often an aggregated rate, distributed between reads and writes to the RAM. In the case of the Xbox 360, the CPU doesn't reach this speed, as its access to the RAM is bound by the FSB, the interface connecting it with the GPU, where the memory controller resides. And this FSB bandwidth is 10.8GB/s for reads and 10.8GB/s for writes....
...For that reason, and speaking strictly of CPU transfers with the main memory, it's possible that the Wii U RAM's full read or write bandwidth (12.8GB/s) isn't at as much of a disadvantage as the marketed numbers might suggest, if the interface between the U-CPU and U-GPU permits this peak rate.
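Put concretely (and hedged: the speed of the Wii U's internal CPU-GPU interface is not public, so the second figure is the hypothetical best case this paragraph describes):

```python
# CPU streaming reads from main RAM, in GB/s:
xbox360_cpu_read = min(10.8, 22.4)  # capped by the FSB's 10.8GB/s read lane,
                                    # not by the 22.4GB/s GDDR3 peak
wii_u_cpu_read = 12.8               # the full DDR3 peak, ASSUMING the
                                    # CPU-GPU interface permits it

print(wii_u_cpu_read > xbox360_cpu_read)  # True: for pure CPU reads,
                                          # the advertised gap may vanish
```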
....this RAM bandwidth isn't only shared by all the components consuming it, it's also bidirectional, meaning it can be used either for reads or for writes. This characteristic could be combined with the eDRAM too, the latter working as a "scratchpad" at which the heavy write operations would be aimed, reducing the amount of writes to the RAM...
....However, this asymmetrical memory organization undoubtedly requires optimization, especially for ports that may have leaned on the greater bandwidth of the Xbox 360's and PS3's GDDR3 RAM for concurrent reads and writes
The chips are DDR3, a newer standard compared to the GDDR3 of Microsoft's and Sony's current systems, which shares its technological foundation with DDR2. This could manifest in practical differences in favor of the Wii U's RAM depending on the type of game code involved, for example whether it performs many short read/writes or long transfers with the main memory.
The last assumption would be that Nintendo and AMD could have developed a more forward-thinking texture compression method than those of existing platforms, thereby reducing bandwidth needs. This tweet from Two Tribes Games may support this premise...
....
Manfred Linzner, who insisted on the absence of bottlenecks in the Wii U. A GPU starved of data from the RAM due to limited bandwidth would represent a noticeable, and surely mentioned, hindrance. Likewise, Nintendo has been known to build balanced systems for a couple of generations now, so it would be strange to select a RAM which, by its nature and the way it's implemented in the hardware, would constitute a bottleneck for performance.
But these last points are nonetheless just hypotheses, and won't reverse the unsatisfactory RAM specifications that have dismayed tech enthusiasts.
Not worth the geeky soap? (discussion with Wii U developers)
Even if this "bandwidth drama" is only debated within confined circles, it shouldn't be discarded, especially for gauging the Wii U's longevity with regard to technically demanding third-party titles. To better comprehend the situation, we had a little chat with a developer (who wishes to remain anonymous) who has released a technically solid retail game on Wii U:
InquiringMind: The first teardowns of the system have happened, and it seems the 2GB of RAM are DDR3-1600 chips on a 64-bit bus, up to 43% slower than the RAM bandwidth of the Xbox 360 & PS3.
Anonymous developer: These numbers are not the actual Wii U memory performance. I wonder if such low specs would even make a port from XBox360 possible.
Q: Do you mean there is more to it, that the dismantling may have overlooked something and in fact the bandwidth is higher? Or perhaps those rates are indeed true, but your observations on the overall memory performance are better, thanks to the eDram and caches?
A: I'm not capable of calculating the memory throughput of DRAM chips like those websites do, nor do I know the memory controller, how many channels such a controller uses, or the actual timings of those chips. But when using the Wii U CPU caches properly to write memory from A to B, these numbers above get exceeded by far. And I don't mean theoretical throughput on paper but real throughput in a game situation.
Q: But are you strictly talking about the 1GB of RAM, or the whole memory chain, including the CPU caches you're referring to here, or even the eDRAM? And if it's the first case, could you be unaware of some kind of mechanism involving the caches and the eDRAM that would automatically speed up the data from or into the RAM, and that would explain your higher measured numbers?
A: I talked about the 1GB. But if our results differ from the theoretical limits, I think we simply measure different things.
Q: So if I understand right, you have a way to know the speed of the RAM at your end, and what you've seen is clearly greater than 12GB/s? Would you say it's in the same ballpark as the 22GB/s bandwidth of the Xbox 360?
A: In my experience the Wii U surpasses any of these numbers under the right conditions. But as said, I can't calculate the theoretical bandwidth of such DRAM; I can only talk about the actual system memory performance, which is very good for me.
We also talked with Joel Kinnunen, vice-president of Frozenbyte, and while remaining general to respect NDAs, he stated:
"We haven’t really done any proper measurements, our interest is always on the actual game performance which has been good. We didn’t have any major issues otherwise either, obviously some optimizations for the CPU were needed but we did that and now it runs better than the other consoles. The one thing we did not notice until release was the gamma/washed-out look we have/had in the release version – we had the hardware perform an extra and unnecessary gamma correction, we’ve fixed that in the update that’s coming out soonish. But on the topic, we had no issues at all with memory bandwidth on Trine 2: Director’s Cut."
Scenarios
How can we interpret these reports when the recent teardowns indicate a lackluster RAM bandwidth? Let's look at the several explanations that come to mind:
1) A thought (or hope) seen several times on boards is that we could lack the whole set of RAM specifications, that the teardowns have not revealed everything, and that the theoretical bandwidth could be higher than 12.8GB/s. But how could we entertain that? The people behind these studies are professionals, after all, and the chips, supplied by at least three companies, conform to a known standard. What could, strangely enough, invalidate those first analyses seen across different websites? It's extremely improbable, so we can safely drop this scenario.
2) The anonymous developer doesn't measure the same parameter, or measures the memory speed at a different spot, as he himself presumed. Could he be talking about the speed of data processed after the intervention of the eDRAM or the CPU's caches, which would positively affect his results? Still, it illustrates how the RAM bandwidth isn't a perceived deterrent at all for the system in this context.
3) The types of games these developers have worked on could involve code that doesn't require extensive use of the RAM, as would, for example, loading huge textures and detailed environments in an open-world/sandbox title. In this scenario, even if the RAM did indeed have a theoretical bandwidth of 12.8GB/s, there would be no negative impact, because the main bulk of the data the CPU and GPU need quickly would sit in the caches and the fast eDRAM...
...it could be linked to the suppositions brought up earlier, like the bus and RAM transfer direction and the more advanced memory controllers. But in that case, to what extent can the eDRAM stand as the "magical savior", compensating for the relatively slow 1GB of RAM? Should we fear that games that push harder visually and in scale, and that need a bigger amount of data transferred to the CPU and GPU than this embedded RAM can store, will not be able to rely on this method and will thus suffer performance drawbacks on Wii U?
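That scale question can at least be eyeballed: does a typical render-target set even fit in 32MB? A rough count, assuming simple 32-bit colour and depth buffers (real games may use more or fatter targets for HDR, MSAA, or deferred shading):

```python
def render_targets_mb(width, height, buffers=3, bytes_per_pixel=4):
    """e.g. front colour + back colour + depth, each 32 bits per pixel."""
    return width * height * buffers * bytes_per_pixel / (1024 * 1024)

rt_720p = render_targets_mb(1280, 720)    # ~10.5MB: fits comfortably in 32MB
rt_1080p = render_targets_mb(1920, 1080)  # ~23.7MB: still fits, with little
                                          # room left for anything else
```

So plain framebuffers fit, and the open question is everything beyond them: intermediate buffers, streamed textures, and whatever else a demanding next-gen engine would want in fast memory.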
Memory Digest
All in all, our anonymous feedback, combined with other developers' acclaim, tends to confirm that the Wii U memory hierarchy is cleverly thought out when appropriately used, and even if the RAM bandwidth looks displeasing on a cold spec sheet, it doesn't translate into a tangible weakness, at least for exclusives and properly done ports. We tried to contact Hynix, one of the manufacturers of the RAM, and their answer, albeit limited because of non-disclosure agreements, indeed leans toward a more complex situation than the theoretical rates might imply:
[see link]
We can also say that if the eDRAM's huge role is established, this organization would be somewhat of an extreme continuation of Nintendo's habits since the GameCube: multiple pools, one acting as the "performer" with very efficient characteristics (the eDRAM), and another being the "backup" pool, here bigger but much slower (the RAM). In fact, it's perhaps one of the causes behind the poor execution of a few ports; some studios might not have had enough time and resources to adapt titles planned around a big RAM space with large bandwidth to the Wii U's asymmetrical memory organization. The real question then, as tackled in the third scenario, rests on this configuration and whether this 32MB performer will be adequate and sufficient to let third-party next-gen games run acceptably on Wii U.
....
^This is pretty much what many who have legitimately been following Wii U tech, such as Thraktor and blu, have speculated in the past
I recommend everyone read the entire article before posting/jumping to conclusions
Just a discussion of this technical aspect of the Wii U; its advantages over current-gen HD and its potential relationship to next-gen HD
Please try and keep it mature
I'll likely edit more quotes in if people want; it's a really long article and I wanted to keep this shorter. There's much more discussion in the link beyond what I posted
Just passing it on; it's a great read in general from Not Enough Shaders, enjoy