
AMD's next-gen mobile chips were supposed to feature a large LLC to solve gaming memory bandwidth issues; Microsoft's demand for AI NPUs changed it

LordOfChaos

Member
https://forums.anandtech.com/thread...dge-ryzen-9000.2607350/page-350#post-41186460

Reported by tech leaker Kepler_L2 and corroborated by other tech enthusiasts.

The idea was to pair a large iGPU with 16 RDNA 3.5 compute units (a fixed and enhanced version of RDNA 3) with a large 16MB shared cache, enabling much higher gaming performance.

However, this die area has been replaced with a large NPU due to demands from Microsoft for ever more AI processing power to run its upcoming Windows AI productivity features. It is also claimed that Microsoft is doubling down on AI and will demand even larger NPUs in the future.
 

Buggy Loop

Member
Everything will be AI eventually

It's probably the right direction

Even if it came to pass that game mechanics were rendered on simple geometry, AI could fill in the rest to near-real-life graphics. What's the point of Monte Carlo path tracing and the power it requires? AI knows how a scene should be lit and shadowed at any time of day and in any weather.
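For a sense of the power cost in question: Monte Carlo error only falls as 1/√N, so halving the noise costs 4x the samples. A minimal sketch of that scaling (plain Python with a stand-in integral, not any real renderer):

```python
import random, math

def mc_estimate(n_samples: int) -> float:
    """Monte Carlo estimate of the mean of sin(x) on [0, pi].

    Stands in for per-pixel radiance estimation in a path tracer;
    the error ~ 1/sqrt(N) convergence behaviour is the same.
    """
    total = sum(math.sin(random.uniform(0.0, math.pi)) for _ in range(n_samples))
    return total / n_samples

truth = 2.0 / math.pi  # exact mean of sin(x) on [0, pi]
for n in (16, 64, 256, 1024):
    # Average absolute error over 200 trials to smooth out luck.
    err = sum(abs(mc_estimate(n) - truth) for _ in range(200)) / 200
    print(f"{n:5d} samples -> mean error {err:.4f}")
# Error roughly halves each time N quadruples; that brute-force noise
# reduction is the cost a learned denoiser/reconstructor sidesteps.
```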

Honestly AMD kind of needed a kick into AI once and for all. Rasterization is old news.
 

Dr.D00p

Member
I don't see the problem, really.

This will likely be used in games for much better hardware accelerated image reconstruction.
 

Ozriel

M$FT
AMD’s rivals Intel, Qualcomm, Apple and MediaTek are all heavily focused on the NPUs in their chips and on TOPS performance.

MS is definitely pushing Copilot for Windows, but I’m skeptical that Microsoft’s push alone forced AMD’s hand here. It’s played a part, for sure, but AMD certainly doesn’t want to be left behind.
 
Last edited:

Xyphie

Member
I can see why Microsoft wants NPUs to become standard in computers quickly, and it's clear Windows 12 is moving towards some kind of NPU requirement (or at least being X TOPS capable). Features like better speech recognition to transcribe Teams meetings and better AI noise cancellation would be extremely useful for corporate users, so the handful of people wishing for a better ROG Ally 2 are going to have to take a backseat to those use cases.
 
Last edited:

Zathalus

Member
They actually can. They are the biggest name in AI right now, and can - without much issue - not use AMD.
This is for desktop processors. What consumers use is up to them and OEMs. Microsoft cannot force AMD to build a processor a certain way, although it can make AI processing a requirement for some new Windows feature. That still isn't forcing, though; AMD is free to release chips that don't support it. And AI features are probably more useful than gaming on an iGPU.
 

Ozriel

M$FT
They actually can. They are the biggest name in AI right now, and can - without much issue - not use AMD.

None of their upcoming Windows hardware uses AMD. Their Surface lineup for business uses Intel, and their upcoming lineup for consumers is on the Qualcomm Snapdragon X Elite platform.
 
This is the right call. Demand for AI is growing exponentially, and fast, local AI processing is especially desired for commercial use.

Not everything is about gaming, folks. AMD would have been foolish not to jump on the AI train.
 

LordOfChaos

Member
Everything will be AI eventually

It's probably the right direction

Even if it came to pass that game mechanics were rendered on simple geometry, AI could fill in the rest to near-real-life graphics. What's the point of Monte Carlo path tracing and the power it requires? AI knows how a scene should be lit and shadowed at any time of day and in any weather.

Honestly AMD kind of needed a kick into AI once and for all. Rasterization is old news.

Yeah, I don't see this as just a loss for gaming performance. I think the NPU will increasingly become a necessary third partner to the CPU and GPU in gaming.
 

winjer

Gold Member
https://forums.anandtech.com/thread...dge-ryzen-9000.2607350/page-350#post-41186460

Reported by tech leaker Kepler_L2 and corroborated by other tech enthusiasts.

The idea was to pair a large iGPU with 16 RDNA 3.5 compute units (a fixed and enhanced version of RDNA 3) with a large 16MB shared cache, enabling much higher gaming performance.

However, this die area has been replaced with a large NPU due to demands from Microsoft for ever more AI processing power to run its upcoming Windows AI productivity features. It is also claimed that Microsoft is doubling down on AI and will demand even larger NPUs in the future.

One thing does not invalidate the other.
AMD could very well put an L3 cache on top of the SoC, connected with TSVs, like on the X3D chips. Or connected on the side, like RDNA 3's MCDs.
And leave the main die for whatever they want, be it more CUs, NPUs, CPU cores, etc.
 

nemiroff

Gold Member
I don't know the full context of the story, but anyway, as already pointed out, traditional rendering is on its way out. Most of what we see on screen will be generated in the near future.
 
Last edited:

Dorfdad

Gold Member
Everything will be AI eventually

It's probably the right direction

Even if it came to pass that game mechanics were rendered on simple geometry, AI could fill in the rest to near-real-life graphics. What's the point of Monte Carlo path tracing and the power it requires? AI knows how a scene should be lit and shadowed at any time of day and in any weather.

Honestly AMD kind of needed a kick into AI once and for all. Rasterization is old news.
Can't wait till developers can design games at 320p/30fps and AI will just fix them up to 8K/240.
 
Last edited:

PaintTinJr

Member
One thing does not invalidate the other.
AMD could very well put an L3 cache on top of the SoC, connected with TSVs, like on the X3D chips. Or connected on the side, like RDNA 3's MCDs.
And leave the main die for whatever they want, be it more CUs, NPUs, CPU cores, etc.
I'm guessing the design difference is that the NPU is built so that linear inference equations with really large variable counts, like 300+, do their inference in a single clock cycle. The problem is that all of the data needs to be sitting in the NPU's L1 cache to achieve that, meaning maybe 1MB-2MB of L1 cache just for that unit to handle up to 512x512 at FP32 (variables x constants inference).
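Back-of-envelope on that cache figure (my own illustrative arithmetic, not any real NPU's layout):

```python
# Rough SRAM footprint for a single-shot 512x512 FP32 matrix-vector
# multiply -- the "variables x constants" case above. Illustrative
# arithmetic only, not any vendor's actual NPU design.
FP32_BYTES = 4
N = 512  # assume equal input (variable) and output counts

weights = N * N * FP32_BYTES   # constants: 512*512*4 bytes = 1 MiB
in_vec  = N * FP32_BYTES       # 2 KiB of input variables
out_vec = N * FP32_BYTES       # 2 KiB of accumulated outputs

total_mib = (weights + in_vec + out_vec) / 2**20
print(f"{total_mib:.2f} MiB")  # ~1.00 MiB, so 1MB-2MB of local SRAM
                               # is a plausible floor once you add
                               # double-buffering for the next layer
```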

I would have hoped that the NPU could be built from smaller compute units and fall back to being used as those smaller units, with big efficiency losses, whenever it isn't needed for AI tasks, given the way AMD don't like to commit silicon to ASICs the way Nvidia do. But I'm not seeing how that is possible, especially with the NPU needing to handle general inference equations with varying input-variable counts, to eliminate the time spent matching each input count to the correct inference equation, and/or whatever other strategy it uses to quickly work through the knowns to derive the unknown variables that feed the higher-level equations which infer the final AI answer.

There's also the need to be able to set up the NPU to rapidly handle commonly needed functionality, such as the tasks mentioned in this article.


Which makes the NPUs sound very ASIC-like, preloaded with the vendor's own trained models (inference equations) for those sets of tasks.
 
Last edited:

Mahavastu

Member
Everything will be AI eventually

It's probably the right direction
In one way that is of course correct, but these APUs are usually VERY bandwidth starved; in games, for example, they profit from overclocked RAM far more than CPUs without an iGPU do. Adding more cache would really help raise the APU's performance, and maybe even bring it to a more or less usable level in some games.

OTOH, most buyers of these APUs won't game anyway (those chips would probably still be too slow for most games) and will use them mostly as office PCs, so adding AI is probably more useful for the average user in the future.

The Xbox 360 and Xbox One had some sort of EDRAM/ESRAM to boost bandwidth; can't they use the same thing?
The EDRAM/ESRAM in the Xbox was a kind of cache intended to reduce the need for bandwidth, but you have to optimize your game for it, and that just won't happen for PC games. A larger normal cache would automagically be used and reduce the hunger for bandwidth.
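To put numbers on the "automagically used" point, here's a toy effective-bandwidth model (hit rate and speeds are made-up illustrative values, not measurements):

```python
# Toy model of how a transparent last-level cache stretches limited
# DRAM bandwidth. All numbers are made-up illustrative values.
def effective_bw(dram_gbs: float, sram_gbs: float, hit_rate: float) -> float:
    """Average bandwidth seen by the iGPU when hit_rate of requests
    are served from on-die cache and the rest go out to DRAM."""
    return hit_rate * sram_gbs + (1.0 - hit_rate) * dram_gbs

dram = 120.0   # e.g. 128-bit LPDDR5X-7500 -> ~120 GB/s peak
sram = 1000.0  # on-die SRAM, roughly an order of magnitude faster

for hit in (0.0, 0.3, 0.5):
    print(f"hit rate {hit:.0%} -> ~{effective_bw(dram, sram, hit):.0f} GB/s")
# Even a modest hit rate multiplies usable bandwidth, and unlike the
# Xbox ESRAM scratchpad it needs no per-game optimization.
```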
 
Last edited:

Tams

Member
Fuck that shit.

Microsoft have almost never shown AMD support, choosing Chipzilla over them. Xbox and very briefly Surface are all they've offered.
 

FireFly

Member
Strix Halo apparently features a 256-bit memory bus, so it looks like it's designed for laptops/portables only. On the other hand, that should give it 240 GB/s of bandwidth when combined with LPDDR5X-7500, which is already above the Series S' 224 GB/s. So I think it will be fine when paired with fast enough memory.
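That figure follows from the usual peak-bandwidth formula; a quick sketch (rumored bus width, my own example configs):

```python
# Peak memory bandwidth = bus width in bytes * transfers per second.
def peak_bw_gbs(bus_bits: int, mtps: int) -> float:
    """Peak GB/s for a given bus width (bits) and data rate (MT/s)."""
    return (bus_bits / 8) * mtps / 1000

print(peak_bw_gbs(256, 7500))  # rumored Strix Halo: 256-bit LPDDR5X-7500 -> 240.0
print(peak_bw_gbs(128, 7500))  # typical 128-bit laptop config -> 120.0
# 240 GB/s edges past the Series S' 224 GB/s (its fast 8 GB pool).
```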
 
Last edited:

Griffon

Member
Shouldn't an NPU be a separate specialized chip? Aren't GPUs just plain better at handling AI? Why are we taking up precious CPU die space for a feature better served elsewhere?
 