
Confirmed: The Nintendo Switch is powered by an Nvidia Tegra X1


flkraven

Member
The comparison between gaming computers and dedicated consoles didn't make much sense back in the 80s and 90s given the widely different input methods, software libraries and prices. It didn't make sense to compare them on the basis of "look, my computer can do all these things, what can your console do?" either.

Now, with the Switch vs. the iPad, it makes even less sense. Sure, on the surface, they're both tablets with mobile hardware in them, and they're both able to push 3D games. But that's like comparing a helicopter and a plane because they're both flying machines designed for transportation. True, but you don't use them in the same situations. Helicopters aren't competing with planes, because their respective capabilities make them suitable for very different jobs, and planes aren't competing with helicopters.

Cue someone linking to a study that shows helicopter sales are down due to planes or something :lol

The rise of mobile gaming on smartphones has certainly impacted sales of mobile gaming devices (DS+PSP vs 3DS+Vita). While for you it may seem like a plane vs a helicopter, for many other consumers these products occupy the same space.

The question "I want my kid to be able to play decent-looking games while he's in the car" could be answered with 'iPad' or 'Nintendo' (especially since they will probably just install Minecraft on it, lol). The iPad takes the edge here, with the ability to play YouTube, Netflix, etc. (other ways a child can be occupied). Likewise, someone looking for a pure home console may lean more heavily towards a more powerful PS4, Xbox One, or even a PC.

These are all valid comparisons, because to many consumers these are products in competing spaces fighting for their dollars. If the Switch can't be compared to a tablet, is it fair to compare it to the Xbox One/PS4 (as a device that costs the same or more, has less power, fewer games, and less third-party support), or does that comparison not make sense either?
 

Hermii

Member
My guess is that in a couple of years, a revision will appear at the same $300 price point with a TX2. And that is when Nvidia announces the TX3.
Yeah, except it will be called TV1 or something, based on Volta/Xavier, and be much more powerful.
 

Hermii

Member
Yes, but that's around what a Shield TV throttles down to under load, which could be a reason why this is the docked clock speed.
Definitely one reason. Another is that 307.2 x 2.5 = 768, meaning a game using the non-boost portable mode can render at 1080p when docked.
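A minimal sketch of that arithmetic, assuming the Digital Foundry clock figures: the docked clock is exactly 2.5x the portable one, while 1080p only has 2.25x the pixels of 720p.

```python
# Back-of-envelope: docked vs portable clock scaling vs pixel count.
portable_mhz = 307.2
docked_mhz = 768.0

clock_ratio = docked_mhz / portable_mhz        # exactly 2.5
pixel_ratio = (1920 * 1080) / (1280 * 720)     # exactly 2.25

print(f"clock ratio: {clock_ratio:.2f}x")
print(f"pixel ratio: {pixel_ratio:.2f}x")
# 2.5 > 2.25, so a 720p portable game has ~11% headroom at 1080p docked
print(f"headroom: {clock_ratio / pixel_ratio - 1:.0%}")
```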
 

mario_O

Member
Plus what FP16 brings to the table. Don't take flops as gospel.

FP16 is the new cloud power. It's been on Nvidia cards for 12 years, and devs still use FP32.

And the PS4 and Xbox One don't support it. I don't see third-party devs going through all that trouble for one platform.
 
FP16 is the new cloud power. It's been on Nvidia cards for 12 years, and devs still use FP32.

And the PS4 and Xbox One don't support it. I don't see third-party devs going through all that trouble for one platform.

Everything supports FP16. Not everything supports double-speed processing of FP16 code, however.

The PS4 Pro does, though, and Scorpio likely does too. Therefore developers will naturally begin to prioritize it.
 

Hermii

Member
FP16 is the new cloud power. It's been on Nvidia cards for 12 years, and devs still use FP32.

And the PS4 and Xbox One don't support it. I don't see third-party devs going through all that trouble for one platform.
Three platforms: the PS4 Pro, and soon Scorpio.

I definitely see EPD going through all that trouble. UE4 already supports it.
 

ZOONAMI

Junior Member
Three platforms: the PS4 Pro, and soon Scorpio.

I definitely see EPD going through all that trouble. UE4 already supports it.

Cool, that puts it more in the realm of getting AAA ports, but it probably still won't, because Nintendo. This is going to be a machine for Nintendo titles, indies, and a lot of Japanese portable titles.

So Vita 2.0 + Nintendo titles.
 

mario_O

Member
The PS4 Pro does, though, and Scorpio likely does too. Therefore developers will naturally begin to prioritize it.

But the games have to run on the original PS4 and Xbox One. So, no, I don't see third parties going through all that trouble. Maybe when the next gen starts.

Also, if it's been on Nvidia cards for 12 years, why haven't we seen games use this before? At least on PC.
 

Hermii

Member
But the games have to run on the original PS4 and Xbox One. So, no, I don't see third parties going through all that trouble. Maybe when the next gen starts.

Also, if it's been on Nvidia cards for 12 years, why haven't we seen games use this before? At least on PC.
Are you sure we haven't? It might account for some of the "Nvidia flop" advantage that's been there forever.
 

beril

Member
But the games have to run on the original PS4 and Xbox One. So, no, I don't see third parties going through all that trouble. Maybe when the next gen starts.

Also, if it's been on Nvidia cards for 12 years, why haven't we seen games use this before? At least on PC.

It hasn't. Pascal is their first desktop architecture with native FP16 support, and even there it's severely gimped on most models and only really included for compatibility:
http://www.anandtech.com/show/10325/the-nvidia-geforce-gtx-1080-and-1070-founders-edition-review/5
 

tkscz

Member
FP16 is the new cloud power. It's been on Nvidia cards for 12 years, and devs still use FP32.

And the PS4 and Xbox One don't support it. I don't see third-party devs going through all that trouble for one platform.

The Pro and Scorpio do. Plus any dev that makes a Switch-exclusive title, or puts a separate team on a Switch version of a game.
 

Lonely1

Unconfirmed Member
Yeah, I'm sure that, with their fab advantage alone, Apple SoCs are more power-efficient than the TX1. However, even at its meager clocks, Nintendo/Nvidia made the decision to use active cooling, and I'm sure that wasn't done lightly. The power envelope of the Switch is only comparable to the OG Shield.

About the CPUs: the Twister cores are designed for burst computing, rendering webpages and loading apps as fast as possible between long, nearly idle periods, and they don't have to worry about wrestling the GPU for resources. At Beyond3D, someone did the math, and even a couple of A57 cores at max clocks can starve the memory bandwidth. That's not an issue for a phone or tablet, but on a gaming device the GPU would be at maximum usage at all times.
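To put rough numbers on that Beyond3D point (the per-core demand below is a hypothetical placeholder, not a measured A57 figure; 25.6 GB/s is the commonly reported Switch bandwidth):

```python
# Illustrative only: per-core streaming demand is an assumed placeholder,
# not a measured Cortex-A57 figure. 25.6 GB/s is the commonly reported
# Switch memory bandwidth (64-bit LPDDR4).
total_bw_gbs = 25.6       # shared between CPU cluster and GPU
per_core_gbs = 8.0        # hypothetical sustained demand of one A57 at max clock
cores = 2

cpu_share = cores * per_core_gbs / total_bw_gbs
print(f"two busy cores could claim ~{cpu_share:.0%} of system bandwidth")
```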

And finally, could an iPhone 6S run BotW at Switch settings? Maybe, but I'm sure it wouldn't be able to run SM Odyssey. The Switch has double the amount of RAM.
 

blu

Wants the largest console games publisher to avoid Nintendo's platforms.
Basically, the way I understand it, double-rate FP16 has been used primarily in mobile processors as a way to get more processing speed out of less power. That's why it's been used in Tegra chips (which are meant to be mobile) but only recently started appearing in desktop (and console) chips.

It's really not magic.

EDIT:
Just as it doesn't bring the PS4 Pro to 980 Ti/Fury levels, it won't double the Switch's performance, but it can help.

Right. We had a Ubisoft dev say that 70% of the code in his game could be FP16, which would give the Switch (or the PS4 Pro for that matter) an extra ~50% boost. That's probably close to the practical maximum.
 

Lonely1

Unconfirmed Member
Basically, the way I understand it, double-rate FP16 has been used primarily in mobile processors as a way to get more processing speed out of less power. That's why it's been used in Tegra chips (which are meant to be mobile) but only recently started appearing in desktop (and console) chips.

It's really not magic.

Just as it doesn't bring the PS4 Pro to 980 Ti/Fury levels, it won't double the Switch's performance, but it can help.
 

mario_O

Member
From the article:

GeForce GTX 1080, on the other hand, is not faster at FP16. In fact it’s downright slow. For their consumer cards, NVIDIA has severely limited FP16 CUDA performance. GTX 1080’s FP16 instruction rate is 1/128th its FP32 instruction rate, or after you factor in vec2 packing, the resulting theoretical performance (in FLOPs) is 1/64th the FP32 rate, or about 138 GFLOPs.

If the GTX 1080 is severely gimped at FP16, I don't think the older Tegra X1 is going to be any better, probably worse.
 
The Foxconn leak.

I wouldn't say that's exactly disproven. He speculated 16nm (and to be fair, we still have no confirmation on that, but it's highly unlikely) and A72s, but we still haven't seen anyone hack the Switch to determine its max clock rates. As far as I know.

We should all just expect the DF clocks at this point though.
 

blu

Wants the largest console games publisher to avoid Nintendo's platforms.
From the article:

If the GTX 1080 is severely gimped at FP16, I don't think the older Tegra X1 is going to be any better, probably worse.
Maxwell 2, as found in the TX1, does have a double FP16 rate.
 

Lonely1

Unconfirmed Member
If the GTX 1080 is severely gimped at FP16, I don't think the older Tegra X1 is going to be any better, probably worse.

Nvidia confirmed that the X1 gets double performance on FP16. It's a feature of the chip. A newer chip not using that feature doesn't change that fact.
 

Mameshiba

Neo Member
But the games have to run on the original PS4 and Xbox One. So, no, I don't see third parties going through all that trouble. Maybe when the next gen starts.

Also, if it's been on Nvidia cards for 12 years, why haven't we seen games use this before? At least on PC.

Older Nvidia cards are capable of running FP16 calculations, but not at twice the speed of FP32 calculations. That is a feature unique to the X1 and the GPU in the PS4 Pro at the moment, IIRC.

I tried to compare the Switch to the iPhone 7 Plus based on the Manhattan 3.1 GFXBench 1080p offscreen test. All values are taken from the GFXBench website.
The iPhone uses the Metal API; the X1 runs OpenGL.

Shield TV (X1): 46
Pixel C (X1): 44
iPhone 7 Plus (A10): 43.7


Those are actually really close together xD.
The iPhone 7 Plus throttles down by ~40% to a sustainable 25 FPS looping the same benchmark, according to Anandtech:
[Anandtech chart: Manhattan 3.1 performance rundown, iPhone 7 Plus]


If we compare Metal and OpenGL on the iPhone, Metal should have a performance advantage a little below 10% according to http://www.anandtech.com/show/9223/gfxbench-3-metal-ios, which would put the iPhone 7 at roughly ~23 FPS, or the equivalent of a Pixel C running at ~450 MHz.
No idea why the Shield TV is barely ahead of the Pixel C. Maybe the Shield TV already throttles on the first run? Could someone run the same benchmark on their Shield TV to compare the values?

I don't know how taxing the Manhattan 3.1 benchmark is, especially on the CPU and RAM side, but I would guess it is quite complex for a generic benchmark: somewhat more demanding than most iOS games, but not as stressful as running a game optimised for the hardware. Therefore an iPhone 7 hypothetically running something like Breath of the Wild should throttle down even further, resulting in performance really close to the 384 MHz X1 in the Switch in boost mode.

In general, the GPU of the X1 is really amazing, probably more efficient than all other smartphone/mobile GPUs right now. It's just held back by the horrible manufacturing node and outdated CPU cores :(
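A rough reconstruction of that estimate; the scores are from the post above, the ~10% Metal edge is from the linked Anandtech article, and the ~850 MHz Pixel C GPU clock is an assumption that makes the ~450 MHz figure work out:

```python
# Reconstructing the estimate. Scores are from the post; the ~10% Metal
# edge is from the linked Anandtech article; the ~850 MHz Pixel C GPU
# clock is an assumption, chosen so the post's ~450 MHz figure falls out.
peak_fps = 43.7                  # iPhone 7 Plus, Manhattan 3.1 1080p offscreen
sustained_fps = peak_fps * 0.6   # ~40% throttle -> ~26, post rounds to 25
opengl_fps = 25 / 1.10           # strip Metal's ~10% API advantage -> ~22.7

pixel_c_fps, pixel_c_mhz = 44, 850
equivalent_mhz = opengl_fps / pixel_c_fps * pixel_c_mhz
print(f"API-adjusted sustained: {opengl_fps:.1f} fps")
print(f"equivalent X1 GPU clock: ~{equivalent_mhz:.0f} MHz")  # ~440-450 MHz
```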
 

z0m3le

Banned
Different hardware, etc... and the Switch has to run at 196 GFLOPS when undocked...

The A8-7600 is GCN, and we can compare it directly to Maxwell with RX 470/480 benchmarks vs the GTX 970 and 980 in BF1 using DX11. http://www.gamersnexus.net/game-bench/2652-battlefield-1-graphics-card-benchmark-dx11-vs-dx12

The GTX 970 OC has 4466 GFLOPS vs the RX 480 OC's 6074 GFLOPS.
These are overclocked cards, but we only need to compare the flops to see how Maxwell holds up against Polaris (GCN 4.0).

So while it isn't a 1:1 comparison, yes, the Switch's GPU (393 GFLOPS) should be very close to the A8-7600's GPU (550 GFLOPS).
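A sketch of that per-FLOP comparison, assuming the two cards land at roughly the same BF1/DX11 frame rates (which is approximately what the linked GamersNexus results show):

```python
# Per-FLOP comparison using the overclocked-card figures above, under the
# assumption that the two cards benchmark about even in BF1/DX11.
gtx_970_oc = 4466   # GFLOPS, Maxwell
rx_480_oc = 6074    # GFLOPS, Polaris (GCN 4.0)

maxwell_edge = rx_480_oc / gtx_970_oc      # ~1.36x more work per FLOP
switch_gflops = 393
print(f"Maxwell per-FLOP edge: ~{maxwell_edge:.2f}x")
print(f"Switch in 'GCN-equivalent' terms: ~{switch_gflops * maxwell_edge:.0f} GFLOPS")
# ~535 GFLOPS, in the neighbourhood of the A8-7600's 550
```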
 

mario_O

Member
Nvidia confirmed that the X1 gets double performance on FP16. It's a feature of the chip. A newer chip not using that feature doesn't change that fact.

Well, if FP16 is the future and where game development is heading, why is it so gimped on the latest GTX cards?

Maybe it makes sense for mobile apps, but not so much for console/PC games.
 

z0m3le

Banned
Well, if FP16 is the future and where game development is heading, why is it so gimped on the latest GTX cards?

Maybe it makes sense for mobile apps, but not so much for console/PC games.

By that same logic, why is AMD building up FP16 even more aggressively with Vega, even pushing 8-bit calculations onto the cards? All these modes are just a question of precision: how exact does the answer need to be? FP16 is exact enough for a very large share of GPU workloads. No, not everything, but many game tasks don't need more exact answers.
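A quick illustration of the precision trade-off, using NumPy's float16 (the specific values are arbitrary examples):

```python
import numpy as np

# "How exact does the answer need to be?" FP16 keeps roughly three
# decimal digits: plenty for colour math, too coarse for big coordinates.
color_channel = np.float16(0.7231)    # survives nearly intact
world_coord = np.float16(4096.37)     # fractional part is lost entirely

print(color_channel)   # ~0.7231
print(world_coord)     # 4096.0
```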
 
Well, if FP16 is the future and where game development is heading, why is it so gimped on the latest GTX cards?

Maybe it makes sense for mobile apps, but not so much for console/PC games.

So why did Sony include it in the PS4 Pro? It is definitely going to be in Scorpio. You think maybe, instead of endlessly trying to stick more and more power in there, they want to find other ways to get more out of the chips?
 
By that same logic, why is AMD building up FP16 even more aggressively with Vega, even pushing 8-bit calculations onto the cards? All these modes are just a question of precision: how exact does the answer need to be? FP16 is exact enough for a very large share of GPU workloads. No, not everything, but many game tasks don't need more exact answers.

Nvidia doesn't need FP16 on its gaming cards right now. If it gets much use in games, they will implement it. Vega will be used in both gaming and professional cards.
 
Right. We had a Ubisoft dev say that 70% of the code in his game could be FP16, which would give the Switch (or the PS4 Pro for that matter) an extra ~50% boost. That's probably close to the practical maximum.
You might want to revisit that math. If 70% of the workload can be done in FP16, which runs twice as fast, then it's completely impossible for the boost to be 50%. It will be 35% at very best.
 

z0m3le

Banned
You might want to revisit that math. If 70% of the workload can be done in FP16, which runs twice as fast, then it's completely impossible for the boost to be 50%. It will be 35% at very best.

If only 30% of the Switch's GPU work needs to run in FP32, you will only use up 118 GFLOPS of the 393, leaving 275 GFLOPS of FP32, or 550 GFLOPS in FP16, giving a game designed this way 668 GFLOPS to work with. An easy way to do this is simply to add 70%, as you are doubling 70% of the GPU's output.
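That budget, spelled out (using the 393 GFLOPS docked figure from earlier in the thread):

```python
# The mixed-precision budget, spelled out:
total_fp32 = 393          # GFLOPS, docked
fp32_share = 0.30         # fraction of the work that must stay FP32

fp32_part = total_fp32 * fp32_share               # ~118 GFLOPS
fp16_part = total_fp32 * (1 - fp32_share) * 2     # 275 doubled -> ~550 GFLOPS
print(f"effective budget: ~{fp32_part + fp16_part:.0f} GFLOPS")   # ~668
```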
 

Instro

Member
If only 30% of the Switch's GPU work needs to run in FP32, you will only use up 118 GFLOPS of the 393, leaving 275 GFLOPS of FP32, or 550 GFLOPS in FP16, giving a game designed this way 668 GFLOPS to work with. An easy way to do this is simply to add 70%, as you are doubling 70% of the GPU's output.

That's how I was thinking about the situation. I'm not sure if it's correct to average performance in this manner, but this makes more logical sense to me than the other posts.
 
You might want to revisit that math. If 70% of the workload can be done in FP16, which runs twice as fast, then it's completely impossible for the boost to be 50%. It will be 35% at very best.

That's not proper math ;)

Think of it this way: imagine that the GPU could render 100 fps in FP32 mode. Now, in FP16 mode, 30% of the GPU still does FP32, so that part renders 30 fps. The other 70% of the GPU, however, now renders twice as fast, i.e. 140 fps instead of 70 fps. Which means: 170 fps overall -> a 70% boost.
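Both figures in this exchange can be reproduced; they come from reading the Ubisoft "70% of the code" claim in two different ways. A minimal sketch:

```python
# The two readings of "70% of the code can be FP16", side by side.
f = 0.70

# Fixed-work reading: 70% of the work finishes in half the time, so a
# frame that took 1.0 now takes 0.3 + 0.35 = 0.65. The 35% cited above
# is the time saved; expressed as a frame-rate boost it's ~54%.
work_boost = 1 / ((1 - f) + f / 2) - 1

# Fixed-mix reading (the 100 fps example): 30% of throughput stays at
# 1x and 70% doubles, so 0.3 + 1.4 = 1.7, i.e. a 70% boost.
mix_boost = (1 - f) + f * 2 - 1

print(f"fixed-work reading: ~{work_boost:.0%} boost")   # ~54%
print(f"fixed-mix reading:   {mix_boost:.0%} boost")    # 70%
```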
 

Butta

Neo Member
Doesn't the Switch seem like exactly half an XBone? As I understand it, you get an increase of 35% for FP16 and about 25% for other architectural improvements compared with the XBone's 2013 GPU, which comes to about 668 GFLOPS effective.

668 x 2 = 1336 GFLOPS, which is pretty much a match for the XBone's 1331 GFLOPS
RAM is exactly half
CPU core count is exactly half
Bandwidth with colour-compression savings (30%) is about 33 GB/s, which is almost exactly half of the XBone's 68 GB/s
The Switch has tile-based rendering to make up for the lack of eSRAM

Are ports really going to be as complicated as people are making them out to be? The specs seem pretty good given the form factor.
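That tally, spelled out; the 35% and 25% factors are the post's own assumptions, and 25.6 GB/s is the commonly cited Switch memory bandwidth:

```python
# Butta's tally with the arithmetic shown. The 35% and 25% factors are
# the post's assumptions; 25.6 GB/s is the commonly cited Switch figure.
effective_gflops = 393 * 1.35 * 1.25     # ~663, rounded to ~668 above

switch = {"GFLOPS (effective)": effective_gflops,
          "RAM (GB)": 4,
          "CPU cores": 4,
          "bandwidth (GB/s)": 25.6 * 1.3}    # ~33 with compression savings
xbone  = {"GFLOPS (effective)": 1331,
          "RAM (GB)": 8,
          "CPU cores": 8,
          "bandwidth (GB/s)": 68}

for key in switch:
    ratio = switch[key] / xbone[key]
    print(f"{key:>18}: {ratio:.2f}x of XB1")   # everything lands near 0.5
```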
 

z0m3le

Banned
Doesn't the Switch seem like exactly half an XBone? As I understand it, you get an increase of 35% for FP16 and about 25% for other architectural improvements compared with the XBone's 2013 GPU, which comes to about 668 GFLOPS effective.

668 x 2 = 1336 GFLOPS, which is pretty much a match for the XBone's 1331 GFLOPS
RAM is exactly half
CPU core count is exactly half
Bandwidth with colour-compression savings (30%) is about 33 GB/s, which is almost exactly half of the XBone's 68 GB/s
The Switch has tile-based rendering to make up for the lack of eSRAM

Are ports really going to be as complicated as people are making them out to be? The specs seem pretty good given the form factor.

The 668 GFLOPS has no Nvidia advantage tied into the number; if we are looking to add that, it's another 40%, for a 935 GFLOPS equivalent, or half the PS4. Thing is, that's a best-case scenario in favor of the Switch. It's easy enough to just look at what AMD APUs do with a given number of GFLOPS and compare them. I think the A8-7600 with 550 GFLOPS is a pretty easy low bar for the Switch when it comes to ports; the main problem with the comparison, though, is that the APU has a much faster CPU.
 
In this thread I learned that people have a hard time accepting the reality that their stupid overpriced phones aren't designed to play video games as well as a system designed to play video games.

Who gives a shit if the iPhone 15 is faster? It still isn't going to let me play Mario Kart 8 with my family. Get over it already.

We get it, you need to justify that fruit-branded social status item.
 

Butta

Neo Member
The 668 GFLOPS has no Nvidia advantage tied into the number; if we are looking to add that, it's another 40%, for a 935 GFLOPS equivalent, or half the PS4. Thing is, that's a best-case scenario in favor of the Switch. It's easy enough to just look at what AMD APUs do with a given number of GFLOPS and compare them. I think the A8-7600 with 550 GFLOPS is a pretty easy low bar for the Switch when it comes to ports; the main problem with the comparison, though, is that the APU has a much faster CPU.

Doesn't the fact that the Switch is using custom Nvidia APIs that are closer to the metal, plus a smaller OS footprint, kind of negate the CPU advantage?
 

z0m3le

Banned
Doesn't the fact that the Switch is using custom Nvidia APIs that are closer to the metal, plus a smaller OS footprint, kind of negate the CPU advantage?

Especially compared to DX11, sure, but it's not going to negate it completely. Nintendo is still likely going to have to free up some or all of that 4th core, and possibly increase the CPU clock to 1.2-1.3 GHz, which would consume an extra half watt or more, but they might be able to get away with dropping the RAM speed some. It's all a balance, and the solution to the CPU problem would have been a better CPU at a higher clock (the A72 could have hit 1.7 GHz at the same power consumption), which would have absolutely been enough CPU performance.
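A rough sanity check on that half-watt figure; all numbers here are illustrative assumptions, not measured A57 data:

```python
# Rough sanity check on the "extra half watt or more" claim. All numbers
# are illustrative assumptions, not measured A57 data.
base_ghz, base_watts = 1.02, 1.8   # assumed A57 cluster power at Switch's CPU clock
target_ghz = 1.3
voltage_bump = 1.05                # assumed small voltage increase at 1.3 GHz

# Dynamic power scales roughly with frequency x voltage^2 (leakage ignored)
new_watts = base_watts * (target_ghz / base_ghz) * voltage_bump**2
print(f"extra: ~{new_watts - base_watts:.2f} W")   # ~0.7 W, i.e. half a watt or more
```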
 

Butta

Neo Member
Especially compared to DX11, sure, but it's not going to negate it completely. Nintendo is still likely going to have to free up some or all of that 4th core, and possibly increase the CPU clock to 1.2-1.3 GHz, which would consume an extra half watt or more, but they might be able to get away with dropping the RAM speed some. It's all a balance, and the solution to the CPU problem would have been a better CPU at a higher clock (the A72 could have hit 1.7 GHz at the same power consumption), which would have absolutely been enough CPU performance.

Doesn't throttling prevent the clock from going any higher? Also, reducing memory bandwidth seems like a bad idea given how slow the bandwidth already is.
 
In this thread I learned that people have a hard time accepting the reality that their stupid overpriced phones aren't designed to play video games as well as a system designed to play video games.

Who gives a shit if the iPhone 15 is faster? It still isn't going to let me play Mario Kart 8 with my family. Get over it already.

We get it, you need to justify that fruit-branded social status item.

But gflops!
 

z0m3le

Banned
Doesn't throttling prevent the clock from going any higher? Also, reducing memory bandwidth seems like a bad idea given how slow the bandwidth already is.

The AMD APU actually has less bandwidth; its DDR3 is only dual-channel and gives less than 20 GB/s for the entire system, AFAIK. Nvidia also still has a bandwidth advantage, so it might make sense to push CPU clocks even at the expense of bandwidth, but that is just side speculation. I'm not sure about the throttling, since this is a different form factor than a normal Shield TV; I suspect it simply isn't designed to throttle, but rather just turns off once it hits a dangerous temperature. If the device is really compromised by a half-watt increase, they have bigger problems than not being able to get ports.
 