
Nintendo Switch Dev Kit Stats Leaked? Cortex A57, 4GB RAM, 32GB Storage, Multi-Touch.

Vita? The Vita is a handheld. The Switch is a hybrid. That's not a good comparison.

I think the point is that each piece of hardware has a different value proposition, so the price can differ.

Economically, the Switch might be a substitute good for a PS4 (at a high level), but I imagine it's competing more against tablets than say Project Scorpio.
 
Even the top mobile phones won't be as powerful for gaming.



How high they're clocked doesn't mean much on its own, though. Even at 1GHz, an A57 is as powerful as, or maybe even more powerful overall than, a Jaguar at 1.6GHz. That's if the CPUs even end up being A57s and not something better.

The number of cores is the only possible issue. But we have no idea if there could be extra smaller cores in there just for the OS (a couple of A53s maybe).

Yeah. The Wii, 3DS, and Wii U even had an ARM9 not mentioned in their public spec sheets to deal with things like security, so I can see that happening.

Emily's early leak was really off (or worded poorly) if the known specs are true and nothing changed. "Closer to Xbox One than PS4" sounded like it would be stronger than the Xbox One. Later it was "struggled to even reach Xbox One in raw power," which sounded like it would be weaker, but not that far off.

From what I recall, the quote that Emily stated was worded the same way it was given to her. It was intentionally vague. Emily also talked about the difference in architecture, which makes a big difference. Things like the 2x FP16, the DX 12.1 GPU feature set, and the generally newer architecture will work to the Switch's advantage and make its performance closer to the XB1 than what you could conclude by just comparing FLOPS to FLOPS.
 
Anyway, it's obvious even with the minimal specs that the Switch is a very powerful handheld and a reasonably powerful console. It could be even better with some hardware tweaks of the kind that are common for Nintendo systems.
 
Guys, why isn't 200 dollars possible? The TX1 is a 2015 SoC that shipped in a $200 product in 2015.
Right, you have to add the price of the battery and screen, but shouldn't the nearly two-year gap make up for that?
 

Asd202

Member
Guys, why isn't 200 dollars possible? The TX1 is a 2015 SoC that shipped in a $200 product in 2015.
Right, you have to add the price of the battery and screen, but shouldn't the nearly two-year gap make up for that?

Nintendo tax. If they feel the console is not selling enough, they will drop the price for Holiday 2017 and keep it permanent after.
 
Guys, why isn't 200 dollars possible? The TX1 is a 2015 SoC that shipped in a $200 product in 2015.
Right, you have to add the price of the battery and screen, but shouldn't the nearly two-year gap make up for that?
The new Shield TV coming out still retails for $200

Nintendo tax. If they feel the console is not selling enough, they will drop the price for Holiday 2017 and keep it permanent after.
This is the beauty of the March release, we'll start seeing discounted prices and bundles as early as 9 months post release.
 
Guys, why isn't 200 dollars possible? The TX1 is a 2015 SoC that shipped in a $200 product in 2015.
Right, you have to add the price of the battery and screen, but shouldn't the nearly two-year gap make up for that?

Because margins.

We also don't know if there is more to it and what's packed in.

TX1 ain't confirmed.
 

Asd202

Member
Anyway, I fully expect Nintendo not to talk about specs at the event, because that definitely won't be the console's selling point.
 
Just because people don't think like you doesn't mean they're wrong. Some people like this stuff and are enthusiasts. Don't be nasty. If you don't like it then simply don't participate.



Well I'm still reading then. I wasn't here back when it was discussed to death so this doesn't help my case. If anyone else feels like helping me out, I'd be thankful. Maybe z0m3le can chime in and put this to rest, or Fourth Storm, or Thraktor.

Oh hey, I can try to recall some of the major points which led us to our conclusions. The point of dispute for a good while was whether each of the identical blocks on the GPU die contained 20 SPUs or 40 SPUs. Both had been seen in AMD retail products, with 20 perhaps being more likely from the outset due to Latte's heritage as an R700 custom job.

There were some who initially had doubts about Latte being a 320 SPU part due to the resolution of certain titles at launch and after. Granted, there are a wide variety of factors which could cause launch software to exhibit less than stellar performance, but the line of thought was that even low-end PC cards were pulling higher resolutions, so if the Wii U had a similar configuration, that should have been a cakewalk.

The actual physical evidence comes from the die photos. At the time, poring over the high-quality photo from Chipworks and comparing it against shots of likely relatives, such as the RV770 and Brazos, became one of my main hobbies. It was perhaps even unhealthy at times. I didn't actually care one way or another how powerful it was for the sake of games, but the mystery of it all intrigued me. At some point in going through the die photo block by block, I came to find a core configuration of 160:8:8 to be by far the most likely spec. The conclusion was arrived at after considering all of the following:

1) Articles analyzing the die size at this point were using TSMC node sizes as a point of comparison, when Latte is manufactured on a Renesas node, probably 45nm or 55nm, with a heavy emphasis on eDRAM implementation. Seeing as a single foundry can have major differences in implementing even the same node (high performance vs. low power, for example), comparing feature sizes of products from two different foundries is even more ill-advised, at least for the purposes of figuring out the Wii U's specs.

2) Specific amounts and arrangements of SRAM within the GPU blocks were discovered to be identical to the 8 TMU arrangement in AMD's Brazos APU. What most initially thought to be four blocks containing 16 TMUs were actually fixed-function interpolation hardware, a feature common to Latte's R700 lineage. From the TMU count comes the shader count, as more than 160 shaders paired to that number of TMUs would be quite unbalanced (a quick ratio check follows after this list). What we saw instead was Nintendo's much-attested emphasis on cache and memory, with discrete L1 and L2 caches tied to each block of 4 texture units. The memory hierarchy is probably the biggest custom job in all of Latte, and while not capable of pulling miracles, it allowed the graphics processor to punch above its weight compared to the Xbox 360 and PS3.

3) Finally, as stated by Azak, some of us started getting word of, or seeing, actual documentation which confirmed the numbers above.
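To make the "unbalanced" point in 2) concrete, here's the ALU:TEX arithmetic as a small C snippet. The RV770 figures are the commonly cited retail specs and the 320:8 row is the rejected hypothesis from the debate above; treat this as illustrative arithmetic, not new hardware data.

```c
/* ALU:TEX ratio check for the Latte SPU-count debate.
 * RV770 numbers are the commonly cited retail specs (800 SPs, 40 TMUs);
 * the 320:8 row is the hypothesis the die analysis ended up rejecting. */
#include <stdio.h>

int main(void) {
    struct { const char *name; int sps, tmus; } cfg[] = {
        { "RV770 (HD 4870)",            800, 40 },
        { "Latte at 160:8 (this post)", 160,  8 },
        { "Latte at 320:8 (rejected)",  320,  8 },
    };
    for (int i = 0; i < 3; i++)
        printf("%-26s ALU:TEX = %d:1\n", cfg[i].name, cfg[i].sps / cfg[i].tmus);
    return 0;
}
```

160:8 lands on the same 20:1 ratio as the R700 parts Latte descends from, while 320:8 would double it, which is the imbalance being described.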

There were many twists and turns along the way, but that is all the energy I have at the moment. Maybe I can dig up some old posts where the major blocks on the GPU were discussed if people are interested. I think one of the last things that popped out at me was the likelihood of the Gamecube's fixed function hardware being included near-whole-cloth in one of the blocks.

As for Switch, I have been following the discussion and eagerly await some pics of the internals. Even a minimally customized Tegra X1 in combination with a low-overhead OS and Vulkan (or NVN) should produce stellar results. I'd be surprised if Nintendo could resist throwing some additional SRAM on the die, though. My one main worry is the low-clocked CPU again being a hindrance for getting some ports even at low resolutions. Perhaps this week will shed some new light on that...or not.
 

Vash63

Member
Guys, why isn't 200 dollars possible? The TX1 is a 2015 SoC that shipped in a $200 product in 2015.
Right, you have to add the price of the battery and screen, but shouldn't the nearly two-year gap make up for that?

It would be possible if they want to sell at a loss or cut costs on the screen or other components. The SoC isn't all that expensive, maybe $30-40 tops. Being a year old won't make the overall system much cheaper compared to all the extra components that the Switch has vs the Shield.

I'd rather it be $250 and have a decent screen, speakers and battery.
 

MDave

Member
Alright back with some more results on the Shield.

The app I was using turned out to be giving me incorrect readings, so I figured out how to get readings right off the kernel itself.

I shall let you come to your own conclusions.

First run from fresh start these are GPU readings:

http://puu.sh/tfsJG/2df39dff3c.png

3 ~ 4 minutes into the GPU benchmark:

http://puu.sh/tfsYF/af2223ec4d.png

What does this mean? It does actually run at 1GHz initially, but then it gradually lowers to the lowest point we see here, 768MHz - probably why Nintendo chose this speed, as it allows for a consistent clock speed without thermal throttling. And this also matches what the Dolphin developers see with the GPU on the Shield.

I'm going to try to push the CPU and GPU as hard as I can and see if the CPU does indeed throttle, but so far it looks like it doesn't, which would keep the console game ports on the Shield running at consistent speeds.
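For anyone who wants to reproduce this, here's a minimal sketch of reading the live GPU clock straight from the kernel on a Tegra device, roughly what I described above. The devfreq path is the usual location on Tegra X1 Linux kernels but it's an assumption here; it varies by device, kernel build, and root access, so adjust it for your Shield.

```c
/* Minimal sketch: read the current GPU clock from the kernel's devfreq node
 * on a Tegra X1 device.  The path below is an assumption (common on TX1
 * kernels) and may differ on the Shield's Android build. */
#include <stdio.h>

int main(void) {
    const char *path =
        "/sys/devices/57000000.gpu/devfreq/57000000.gpu/cur_freq"; /* assumed path */
    FILE *f = fopen(path, "r");
    if (!f) { perror("open cur_freq"); return 1; }

    unsigned long hz = 0;
    if (fscanf(f, "%lu", &hz) == 1)
        printf("GPU clock: %lu Hz (%.1f MHz)\n", hz, hz / 1e6);
    fclose(f);
    return 0;
}
```

Polling that in a loop while a benchmark runs is enough to watch the 1GHz-to-768MHz throttle curve the screenshots show.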
 

LordOfChaos

Member
Edit: Original comment answered above.

Was that a Vsynced load though, or did you find the way to uncap it?

I think your guess is right that it was the highest possible speed without throttling. Android SoCs are pretty notorious for starting out at high rated clocks and then throttling within seconds or minutes. It seems the Shield could probably manage marginally higher than the Switch's clock rate if it too were capped to be steady (maybe ~850MHz), but in a larger stationary device.

And yeah, if the CPU doesn't throttle that's still the larger cut than the GPU.
 

Clov

Member
I thought 32 GB carts were recommended but could be higher.

Wasn't the minimum size rumored to be 16GB? I can imagine not every game released for it will need a ton of space, especially since Nintendo's own games tend to be low in size.
 

Donnie

Member
Guys, why isn't 200 dollars possible? The TX1 is a 2015 SoC that shipped in a $200 product in 2015.
Right, you have to add the price of the battery and screen, but shouldn't the nearly two-year gap make up for that?

We don't know exactly what this SoC is. For a start, it runs faster under load than the TX1 does in the Shield TV (or in any other device, it seems). It could possibly be on a newer node to achieve that. There will also be customisations. Either way, the SoC is never going to be the make-or-break thing when it comes to pricing. The Switch has more RAM, possibly a wider bus, a screen, a battery, a dock, and two controllers.
 
Marketing speak, the Switch is designed to be a handheld first and foremost :)

Yeah it's great that they added some local multiplayer and multiple control options on a single handheld device, but the talk of it being a console is like saying the PSP 2000 with video out was a console. We've just reached the point where the specs in a handheld are approaching what a console can do without severe handicaps.
 

MDave

Member
Edit: Original comment answered above.

Was that a Vsynced load though, or did you find the way to uncap it?

I think your guess is right that it was the highest possible speed without throttling. Android SoCs are pretty notorious for starting out at high rated clocks, and then throttling in seconds or minutes. Seems the Shield could probably manage marginally higher than the Switches clock rate if it too was capped to be steady (maybe ~850MHz), but in a larger stationary device.

And yeah, if the CPU doesn't throttle that's still the larger cut than the GPU.

I ran the 3DMark benchmark, and those tests have vsync on. They don't run higher than 60fps, though. I don't see an easy way of turning vsync off; I suspect I have to edit something in the Android system files.

I think the fan in the Shield could spin faster and cool more aggressively, but I think by design it is meant to run as quietly as possible given its set-top box design. A shame, as it could potentially reach its maximum output otherwise. 48°C air flows slowly out of its little exhaust at maximum load.
 

pulsemyne

Member
Alright back with some more results on the Shield.

The app I was using turned out to be giving me incorrect readings, so I figured out how to get readings right off the kernel itself.

I shall let you come to your own conclusions.

First run from fresh start these are GPU readings:

http://puu.sh/tfsJG/2df39dff3c.png

3 ~ 4 minutes into the GPU benchmark:

http://puu.sh/tfsYF/af2223ec4d.png

What does this mean? It does actually run at 1GHz initially, but then it gradually lowers to the lowest point we see here, 768MHz - probably why Nintendo chose this speed, as it allows for a consistent clock speed without thermal throttling. And this also matches what the Dolphin developers see with the GPU on the Shield.

I'm going to try to push the CPU and GPU as hard as I can and see if the CPU does indeed throttle, but so far it looks like it doesn't, which would keep the console game ports on the Shield running at consistent speeds.

Looks like 768 is the thermally stable load speed of the chip, hence why Nintendo would lock it to that speed when docked. Also, the cooling solution may not be as powerful as the Shield TV's due to its smaller form factor. Factor in the additional heat generated while the battery is recharging and from the screen backlight (however minor that may be), and you're likely looking at a level that was comfortable for long-term use of the chip.
Contrary to what some suggested, it looks like Nintendo really did get the best mobile GPU for a mobile platform, and the speeds they chose were as good as they could get for what they were trying to achieve.
 
With all the news/theories about the power in here, I think my only issue is that the price ($249) seems too high for the package. Am I wrong here?
Is this in relation to Sony's and Microsoft's consoles? Just because Sony and MS are cutthroat with their prices, and undervalue gaming along with it, doesn't mean Nintendo isn't worth the price because it doesn't match them in raw specs. For a hybrid with a portability aspect and tech like this (not to mention the rumored software at launch), the Switch is worth the price at $250. Kimishima has also said he wants to make a profit on each console sold.
 

dEvAnGeL

Member
Will this be much more powerful than the Wii U? It might seem like a dumb question, but with the rumored price so low I fear the difference will be as negligible as the GC to the Wii.
 
The Wii U GPU is most certainly better than the 360's, maybe not by much, but better.
If the Wii U has 30% fewer GPU GFLOPS than the 360 but still actually performs 50% better than the 360's 230 GFLOPS, that sounds like a significant difference to me! That sounds even more efficient than the Switch's Nvidia Maxwell architecture is over the PS4/Xbone's AMD GPU architecture (which we've been estimating at something like 30-50% more?).
 
Will this be much more powerful than the Wii U? It might seem like a dumb question, but with the rumored price so low I fear the difference will be as negligible as the GC to the Wii.
3-4x at least when docked, and 1.5x in portable. This is when you factor in architectural differences.
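For reference, here are the raw FP32 numbers behind estimates like that, assuming a stock 256-core Maxwell X1 at the Eurogamer clocks and the 176 GFLOPS Wii U figure discussed elsewhere in this thread. This is just a sketch of the arithmetic; the architectural-efficiency and FP16 advantages are multipliers on top of it.

```c
/* Raw FP32 throughput at the Eurogamer-reported Switch GPU clocks,
 * assuming a stock 256-core Maxwell TX1 (2 FLOPs per core per cycle via FMA),
 * compared against the 176 GFLOPS Wii U figure cited in this thread. */
#include <stdio.h>

int main(void) {
    const double cores        = 256.0;   /* assumed stock TX1 CUDA core count */
    const double docked_mhz   = 768.0;   /* Eurogamer docked GPU clock        */
    const double portable_mhz = 307.2;   /* Eurogamer portable GPU clock      */
    const double wiiu_gflops  = 176.0;   /* Wii U figure from Nintendo docs (per Thraktor) */

    double docked   = cores * 2.0 * docked_mhz   / 1000.0;
    double portable = cores * 2.0 * portable_mhz / 1000.0;

    printf("Docked:   %.1f GFLOPS FP32 (%.2fx Wii U raw)\n", docked,   docked   / wiiu_gflops);
    printf("Portable: %.1f GFLOPS FP32 (%.2fx Wii U raw)\n", portable, portable / wiiu_gflops);
    return 0;
}
```

That works out to roughly 393 GFLOPS docked and 157 GFLOPS portable, so the 3-4x and 1.5x figures are only reachable once you credit the Maxwell efficiency and FP16 advantages discussed above; on raw FLOPS alone, portable mode is actually a touch below the Wii U.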
 
If the Wii U has 30% fewer GPU GFLOPS than the 360 but still actually performs 50% better than the 360's 230 GFLOPS, that sounds like a significant difference to me! That sounds even more efficient than the Switch's Nvidia Maxwell architecture is over the PS4/Xbone's AMD GPU architecture.

TBH, I don't know exactly how much stronger the Wii U's GPU is compared to the 360's. I know that its performance is notably better, but not anything like 2x; it is probably less than 50%. Perhaps someone can do the actual calculations to make sure.
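Since the post asks for it, the usual back-of-the-envelope version looks like this, using the commonly cited configurations (Xenos is usually quoted at 240 GFLOPS rather than the 230 above). Raw FLOPS only; the real-world gap also depends on architecture and the memory setup described earlier.

```c
/* Raw FP32 comparison using the commonly cited configs:
 * Xenos (360): 48 ALUs x 5 lanes x 2 FLOPs (MADD) x 500 MHz
 * Latte (Wii U): 160 SPs x 2 FLOPs (MADD) x 550 MHz */
#include <stdio.h>

int main(void) {
    double xenos = 48 * 5 * 2 * 500e6 / 1e9;   /* 240 GFLOPS */
    double latte = 160 * 2 * 550e6 / 1e9;      /* 176 GFLOPS */

    printf("Xenos: %.0f GFLOPS, Latte: %.0f GFLOPS\n", xenos, latte);
    printf("Latte = %.0f%% of Xenos in raw FP32\n", 100.0 * latte / xenos);
    return 0;
}
```

So on paper Latte is roughly 25-30% behind Xenos, and any real-world advantage it shows comes from the newer feature set and the cache/memory hierarchy rather than raw throughput.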
 

Thraktor

Member
Given how closely Nintendo and Nvidia worked on this API, I wouldn't be shocked to find out if cache was customized towards minimizing the need to move data in and out of main memory for additional post processing subpasses. A sizable L3 cache for the GPU and a larger L2 cache for the CPU would save a lot of bandwidth.

Oh, I agree, and I do certainly expect some kind of customised cache for this reason, whether it's an increased GPU L2 or a shared L3 or whatever. Given Nintendo's historical desire to keep framebuffer accesses on-die, and Nvidia's use of tile based rendering from Maxwell onwards, it would seem a natural assumption.

The issue that was brought up (by blu and Durante, I think?) earlier in this thread is that, under pre-Vulkan APIs at least, intermediate buffers like g-buffers can't be tiled, regardless of the GPU or cache configuration, because you have to implement them as render-to-texture. This requires the full buffer to be rendered and pushed to memory before it can be read, and because a shader can access any pixel of the texture while reading, you can't tile the reads.

I sort of ended up answering my own question with Vulkan's renderpasses and subpasses, as it offers an alternative to render-to-texture (or more specifically a replacement, as I don't believe you're supposed to use render-to-texture at all in Vulkan) which can be efficiently tiled. By treating the g-buffer (or any kind of intermediate buffer) as an attachment within the renderpass and restricting shaders to only accessing data from the pixel they're operating on (potentially including the same pixel across multiple attachments), you get a g-buffer which can be tiled.

The subpasses can then define exactly when a given attachment is needed (or not needed), so the GPU can efficiently allocate a sufficient amount of cache in advance for a given tile, and can evict a g-buffer as soon as it's no longer needed to free up that cache. Effectively, a deferred renderer could keep the bulk of the render pipeline (including lighting) as subpasses within a single renderpass, meaning that for a tile-based GPU (such as Switch's) this can all be done on one tile at a time, with the entire process (creating multiple g-buffers, calculating lighting and shading, etc., etc.) being completed on that tile within cache without having to touch main memory at all until you have a near-final color buffer. Post-processing, screen-space reflections, etc. would still be done in a non-tiled manner, but you'd get basically the full benefit of TBR for a deferred renderer, which you can't do with any other API (even DX12).

Those tiles can be of any size; they can be run concurrently or consecutively, or even on different GPUs (in theory, anyway). Most importantly, though, it avoids any need for hardware-specific extensions, which I imagine Nintendo would want to avoid (both to improve compatibility with third-party engines and to keep their options open for future Switch hardware). If a third-party dev (let's say id) has an existing Vulkan renderer that's well implemented (which I imagine id's is), then it should tile well on Switch right out of the box without id having to mix up their rendering pipeline just to accommodate Switch's GPU.

From Nintendo's point of view, then, all they need to do is make sure the GPU's tiling algorithm fully exploits Vulkan's renderpasses (I don't see why Nvidia would screw this up) and then make sure there's a large enough cache to accommodate tiles which include, potentially, a color buffer, z-buffer, multiple g-buffers, and possibly even extra buffers for transparencies, without the tiles being too small.
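For anyone curious what this looks like in practice, here's a minimal sketch using the plain Vulkan C API (not NVN or anything Nintendo-specific): one render pass with two subpasses, where the G-buffer is written in subpass 0, consumed per-pixel as input attachments in subpass 1, and marked DONT_CARE for store so a tile-based GPU never has to spill it to main memory. The formats, the two-attachment G-buffer, and the function name are arbitrary choices for illustration.

```c
#include <vulkan/vulkan.h>

/* Sketch of a deferred-style render pass that keeps the G-buffer on-tile. */
VkRenderPass create_deferred_pass(VkDevice device, VkFormat swapchain_fmt)
{
    /* Attachments: 0 = albedo G-buffer, 1 = normals G-buffer, 2 = final color. */
    VkAttachmentDescription at[3] = {0};
    at[0].format         = VK_FORMAT_R8G8B8A8_UNORM;
    at[0].samples        = VK_SAMPLE_COUNT_1_BIT;
    at[0].loadOp         = VK_ATTACHMENT_LOAD_OP_CLEAR;
    at[0].storeOp        = VK_ATTACHMENT_STORE_OP_DONT_CARE;  /* G-buffer never written to RAM */
    at[0].stencilLoadOp  = VK_ATTACHMENT_LOAD_OP_DONT_CARE;
    at[0].stencilStoreOp = VK_ATTACHMENT_STORE_OP_DONT_CARE;
    at[0].initialLayout  = VK_IMAGE_LAYOUT_UNDEFINED;
    at[0].finalLayout    = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL;

    at[1] = at[0];
    at[1].format = VK_FORMAT_A2B10G10R10_UNORM_PACK32;        /* packed normals */

    at[2] = at[0];
    at[2].format      = swapchain_fmt;
    at[2].storeOp     = VK_ATTACHMENT_STORE_OP_STORE;          /* only the final color hits memory */
    at[2].finalLayout = VK_IMAGE_LAYOUT_PRESENT_SRC_KHR;

    VkAttachmentReference gbuf_write[2] = {
        { 0, VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL },
        { 1, VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL },
    };
    VkAttachmentReference gbuf_read[2] = {
        { 0, VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL },
        { 1, VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL },
    };
    VkAttachmentReference color_out = { 2, VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL };

    VkSubpassDescription sp[2] = {0};
    sp[0].pipelineBindPoint    = VK_PIPELINE_BIND_POINT_GRAPHICS;  /* subpass 0: fill G-buffer */
    sp[0].colorAttachmentCount = 2;
    sp[0].pColorAttachments    = gbuf_write;

    sp[1].pipelineBindPoint    = VK_PIPELINE_BIND_POINT_GRAPHICS;  /* subpass 1: lighting */
    sp[1].inputAttachmentCount = 2;
    sp[1].pInputAttachments    = gbuf_read;   /* per-pixel reads only, so it can stay on-tile */
    sp[1].colorAttachmentCount = 1;
    sp[1].pColorAttachments    = &color_out;

    /* BY_REGION: ordering is only needed per pixel/tile, so a tiler never has
     * to flush the G-buffer to main memory between the two subpasses. */
    VkSubpassDependency dep = {0};
    dep.srcSubpass      = 0;
    dep.dstSubpass      = 1;
    dep.srcStageMask    = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;
    dep.dstStageMask    = VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT;
    dep.srcAccessMask   = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT;
    dep.dstAccessMask   = VK_ACCESS_INPUT_ATTACHMENT_READ_BIT;
    dep.dependencyFlags = VK_DEPENDENCY_BY_REGION_BIT;

    VkRenderPassCreateInfo info = {0};
    info.sType           = VK_STRUCTURE_TYPE_RENDER_PASS_CREATE_INFO;
    info.attachmentCount = 3;
    info.pAttachments    = at;
    info.subpassCount    = 2;
    info.pSubpasses      = sp;
    info.dependencyCount = 1;
    info.pDependencies   = &dep;

    VkRenderPass pass = VK_NULL_HANDLE;
    vkCreateRenderPass(device, &info, NULL, &pass);
    return pass;
}
```

The lighting shader in subpass 1 reads the G-buffer with subpassLoad(), which is exactly the "only the pixel it's operating on" restriction described above, and it's what lets the driver keep everything in tile memory.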

If the article's wrong, then Thraktor's wrong, btw, since it was based on Thraktor's thread. I'm reading Thraktor's thread and still going through posts, but it has 200+ pages and it's not easy getting through all of them to reach a definitive conclusion. Some here prefer to troll instead of pointing me to the proper answer. "SMD IS SHIT!!11!!1" is not an answer.

Can you please be a kind soul like a couple here and point me to that post that proves this article wrong so I can put this to rest?

I was wrong. The OP was based on initial assumptions, and after a while I was too busy to keep on top of the thread so I stopped updating the OP, but the 176 GFLOPS figure is based on actual Nintendo documentation for the Wii U, as AzaK says above, so there's not really any scope for it to be wrong.
 
Alright back with some more results on the Shield.

The app I was using turned out to be giving me incorrect readings, so I figured out how to get readings right off the kernel itself.

I shall let you come to your own conclusions.

First run from fresh start these are GPU readings:

http://puu.sh/tfsJG/2df39dff3c.png

3 ~ 4 minutes into the GPU benchmark:

http://puu.sh/tfsYF/af2223ec4d.png

What does this mean? It does actually run at 1GHz initially, but then it gradually lowers to the lowest point we see here, 768MHz - probably why Nintendo chose this speed, as it allows for a consistent clock speed without thermal throttling. And this also matches what the Dolphin developers see with the GPU on the Shield.

I'm going to try to push the CPU and GPU as hard as I can and see if the CPU does indeed throttle, but so far it looks like it doesn't, which would keep the console game ports on the Shield running at consistent speeds.

Thanks for the update. That is definitely interesting. The others have done a good job elaborating on what these results mean. I look forward to hearing more of your findings.
 

Thraktor

Member
Alright back with some more results on the Shield.

The app I was using turned out to be giving me incorrect readings, so I figured out how to get readings right off the kernel itself.

I shall let you come to your own conclusions.

First run from fresh start these are GPU readings:

http://puu.sh/tfsJG/2df39dff3c.png

3 ~ 4 minutes into the GPU benchmark:

http://puu.sh/tfsYF/af2223ec4d.png

What does this mean? It does actually run at 1GHz initially, but then it gradually lowers to the lowest point we see here, 768MHz - probably why Nintendo chose this speed, as it allows for a consistent clock speed without thermal throttling. And this also matches what the Dolphin developers see with the GPU on the Shield.

I'm going to try to push the CPU and GPU as hard as I can and see if the CPU does indeed throttle, but so far it looks like it doesn't, which would keep the console game ports on the Shield running at consistent speeds.

Thanks for this, this is pretty interesting. It would seem to suggest that the Switch SoC is a 20nm chip, as going with the same frequency as the TX1's thermal sweet spot would seem like too much of a coincidence on any other node. We can also see that the base clock is 76.8MHz, which tracks both with the docked GPU clock on Switch (obviously, 76.8 x 10 = 768MHz) and with Switch's portable clock (76.8 x 4 = 307.2MHz).
 
Thanks for this, this is pretty interesting. It would seem to suggest that the Switch SoC is a 20nm chip, as going with the same frequency as the TX1's thermal sweet spot would seem like too much of a coincidence on any other node. We can also see that the base clock is 76.8MHz, which tracks both with the docked GPU clock on Switch (obviously, 76.8 x 10 = 768MHz) and with Switch's portable clock (76.8 x 4 = 307.2MHz).
This adds more weight to Eurogamer's reported clock frequencies. It seems that the CPU doesn't need to adjust, though. Do you have any theories on why it was clocked at 1020MHz?
 
With all the news/theories about the power in here, I think my only issue is that the price ($249) seems too high for the package. Am I wrong here?

Well the Shield TV is $200...so 50 dollars for a screen, two joycons, battery (EDIT), dock, and grip seems about right honestly.


EDIT: Never mind the R&D to get the Shield TV to essentially work in a tablet form factor with a battery taking up space. Honestly when I really think about it, I'm a bit surprised the rumors aren't citing a higher price considering the Shield TV is 200.
 
Well the Shield TV is $200...so 50 dollars for a screen, two joycons, dock, and grip seems about right honestly.

Also it's safe to assume Nvidia's selling the Shield TV at a decent profit. Nintendo can bite the bullet and sell the Switch at cost or even at a small loss because they know they'll earn that money back in software sales.
 
Well the Shield TV is $200...so 50 dollars for a screen, two joycons, battery (EDIT), dock, and grip seems about right honestly.


EDIT: Never mind the R&D to get the Shield TV to essentially work in a tablet form factor with a battery taking up space. Honestly when I really think about it, I'm a bit surprised the rumors aren't citing a higher price considering the Shield TV is 200.

I'd imagine all the sensors add up as well. According to the patent, there are gyros and accelerometers in the main unit and both Joy-Cons. Add in the NFC reader, IR camera, and rumble in each Joy-Con, plus all the other goodies, and the price makes sense. I still have my doubts that it will light the market on fire, but that will largely be decided by the software. It seems Nintendo have a great base hardware/software combo to build on with Switch. Following up with a smaller 16nm version in 18 months or so wouldn't surprise me...along with a handheld-only SKU or a different dock. They could go nuts with the modular nature of this thing.
 

antonz

Member
Yeah, the CPU clock speed seems to be basically tied to what offers the best battery life while giving adequate CPU performance.

The new details on the GPU clocks from MDave are illuminating, at least, and give us a good explanation of why Nintendo went with the clocks they did.
Overall nothing has really changed, but we have a better picture of why things are the way they are, and we can see Nintendo is not lowering clocks for the sake of battery or anything, but because it's the optimal speed for an X1-based device.
 
Guys, why isn't 200 dollars possible? The TX1 is a 2015 SoC that shipped in a $200 product in 2015.
Right, you have to add the price of the battery and screen, but shouldn't the nearly two-year gap make up for that?

The new Android TV box coming out in 2017 is still $199 with less RAM (3GB), no touchscreen, no dock, and none of the extras like NFC, IR, etc. $250 would seem a fair price to me in comparison.
 

LordOfChaos

Member
Thanks for this, this is pretty interesting. It would seem to suggest that the Switch SoC is a 20nm chip, as going with the same frequency as the TX1's thermal sweet spot would seem like too much of a coincidence on any other node. We can also see that the base clock is 76.8MHz, which tracks both with the docked GPU clock on Switch (obviously, 76.8 x 10 = 768MHz) and with Switch's portable clock (76.8 x 4 = 307.2MHz).

Fab gains could always be put towards battery life and lower TDP rather than upping clocks, though 14/16nm FinFET did always seem a bit advanced for it given Nintendo's record this decade. At least 20nm isn't horribly behind, being a half-gen node.

Speaking of 14 or 16nm, do we know who the expected fab is, since IBM is out?
 

Schnozberry

Member
This adds more weight to Eurogamer's reported clock frequencies. It seems that the CPU doesn't need to adjust, though. Do you have any theories on why it was clocked at 1020MHz?

Probably the same reason they went with 768MHz on the GPU. It won't need to throttle in either mode at that speed.
 

Schnozberry

Member
Fab gains could always be put towards battery life and lower TDP rather than upping clocks, though 14/16nm FinFET did always seem a bit advanced for it given Nintendo's record this decade. At least 20nm isn't horribly behind, being a half-gen node.

Speaking of 14 or 16nm, do we know who the expected fab is, since IBM is out?

TSMC is the fab for the Tegra X1. I don't see why that would change.
 
Probably the same reason they went with 768MHz on the GPU. It won't need to throttle in either mode at that speed.
I was wondering about that too, but MDave's numbers so far are showing that the CPU can run at full power for at least a few minutes without throttling. He said that he will run more strenuous tests to stress the CPU/GPU.
 