
NVIDIA’s Next-Gen Blackwell GB100 GPUs Utilize Chiplet Design

Bernoulli

M2 slut
NVIDIA’s upcoming GPU architecture, codenamed Blackwell, is poised to be the successor to Ada Lovelace. In contrast to the Hopper/Ada architecture, Blackwell is set to extend its reach across both datacenter and consumer GPUs. NVIDIA is gearing up to introduce several GPU processors, with no major alterations to core counts, but there are hints of a significant restructuring of the GPU architecture.

According to the latest series of tweets from Kopite, Blackwell is not expected to feature a substantial increase in core counts. While it remains unclear whether this pertains to both data-center and gaming series, the core count for Blackwell is anticipated to remain relatively unchanged, while the underlying GPU clusters will undergo significant structural modifications. Kopite has not disclosed further details at this point, but it is said that the GB100 GPU might feature twice as many cores as GB102; both are data-center GPUs.


Additionally, there has been mention of GB100, the data-center GPU for Blackwell, adopting a Multi-Chip Module (MCM) design. This suggests that NVIDIA will employ advanced packaging techniques, dividing GPU components into separate dies. The specific number and configuration of these dies are yet to be determined, but this approach will grant NVIDIA greater flexibility in customizing chips for consumers, mirroring AMD’s intentions with the Instinct MI300 series.


 
Last edited:
I’m not really sure what to make of this. If core counts aren’t changing much, does this indicate that, whilst there will be performance gains, performance won’t be significantly better than the 4000 series?

I imagine if this is the case, they are likely moving away from traditional core performance and leaning further into AI to improve performance through technology deployed through the chipset.

It’s all very confusing. Just sell me a decent 5000 series card that will give me better performance than a 4090 but not cost as much. Thanks.
 

Loxus

Member
This reminds me of this paper from Nvidia back in 2017.

MCM-GPU: Multi-Chip-Module GPUs for Continued Performance Scalability


Historically, improvements in GPU-based high performance computing have been tightly coupled to transistor scaling. As Moore's law slows down, and the number of transistors per die no longer grows at historical rates, the performance curve of single monolithic GPUs will ultimately plateau.

However, the need for higher performing GPUs continues to exist in many domains. To address this need, in this paper we demonstrate that package-level integration of multiple GPU modules to build larger logical GPUs can enable continuous performance scaling beyond Moore's law. Specifically, we propose partitioning GPUs into easily manufacturable basic GPU Modules (GPMs), and integrating them on package using high bandwidth and power efficient signaling technologies. We lay out the details and evaluate the feasibility of a basic Multi-Chip-Module GPU (MCM-GPU) design. We then propose three architectural optimizations that significantly improve GPM data locality and minimize the sensitivity on inter-GPM bandwidth.

Our evaluation shows that the optimized MCM-GPU achieves 22.8% speedup and 5x inter-GPM bandwidth reduction when compared to the basic MCM-GPU architecture. Most importantly, the optimized MCM-GPU design is 45.5% faster than the largest implementable monolithic GPU, and performs within 10% of a hypothetical (and unbuildable) monolithic GPU. Lastly we show that our optimized MCM-GPU is 26.8% faster than an equally equipped Multi-GPU system with the same total number of SMs and DRAM bandwidth.
 
Last edited:

Kenpachii

Member
Interesting, I thought chiplets weren't really beneficial enough for Nvidia to go with them. But I guess they still do.

Wonder what gains we will be seeing and what new tech their 5000 cards bring along the way.
 
Don't you mean RDNA 4? Blackwell is the next release from Nvidia, is it not? Likely due end of next year.

Not entirely sure. I know a few of the PC leakers mentioned that RDNA 5 is being designed to compete against the 50 Series, not RDNA 4, which apparently isn't going to compete at the high end.
 

Zathalus

Member
If I remember correctly, high end RDNA4 was cancelled because of pricing. Not because they couldn't get it working.


I mean, it's not like AMD hasn't already created the most complex chip on the planet.
I know they have it. It was a joke aimed at how mediocre RDNA 3's uplift over RDNA 2 is.
 

Black_Stride

do not tempt fate do not contrain Wonder Woman's thighs do not do not
I’m not really sure what to make of this. If core counts aren’t changing much, does this indicate that, whilst there will be performance gains, performance won’t be significantly better than the 4000 series?

I imagine if this is the case, they are likely moving away from traditional core performance and leaning further into AI to improve performance through technology deployed through the chipset.

It’s all very confusing. Just sell me a decent 5000 series card that will give me better performance than a 4090 but not cost as much. Thanks.

It's a new architecture, so simply looking at CUDA counts is pointless; that's always been the case.

Case in point.
RTX 3070 - 5888 CUDA cores.
RTX 4070 - 5888 CUDA cores.



The 4070 performs like a 3080 while having far fewer CUDA cores.
 

Black_Stride

do not tempt fate do not contrain Wonder Woman's thighs do not do not
it's because RDNA 3 was their first chiplet GPU

the jump should be bigger when they master it

AMD's MCM design simply separates GCDs and MCDs; the GCDs are effectively the same compute dies they've been making for eons.

They've already "mastered" the GCD (which does most of the work) in their MCM design, so reducing latency and improving the MCDs is pretty much the only place left to gain anything.


998-navi-31-xtx.jpg


The big block in the center is the GCD... basically the same as a monolithic die, but surrounded by the MCDs, which are on a different node.
This isn't a perf thing, it's a cost-cutting measure: shrink the GCD while keeping the MCDs on another node to keep costs down.


They aren't going to be making major jumps in power, and that's likely the reason they aren't bothering with high-end GPUs anymore, cuz there's no point aiming that high and ending up in no man's land.

Better to fight the xx70s and Arc x70s.
 

winjer

Gold Member
I know they have it. It was a joke aimed at how mediocre RDNA 3's uplift over RDNA 2 is.

RDNA3 is lacking, but not because of the chiplet design.
If anything, chiplets have been AMD's biggest success in the last decade.

But what most people still don't understand about chiplets or MCM is that it's not meant to improve performance. It's meant to improve yields and cost. Especially cost.
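To put rough numbers on that, here's a quick Python sketch using a simple Poisson yield model. All of the figures (wafer cost, defect density, die sizes) are made up for illustration and aren't AMD's or TSMC's real numbers; the point is just that several small dies waste far fewer wafers to defects than one big die.

```python
# Rough illustration (hypothetical numbers): why smaller chiplet dies can cut cost.
import math

WAFER_DIAMETER_MM = 300
WAFER_COST = 15000       # hypothetical wafer price, USD
DEFECT_DENSITY = 0.1     # hypothetical defects per cm^2

def dies_per_wafer(die_area_mm2: float) -> float:
    """Approximate gross dies per 300 mm wafer (ignores scribe lines)."""
    r = WAFER_DIAMETER_MM / 2
    return math.pi * r**2 / die_area_mm2 - math.pi * WAFER_DIAMETER_MM / math.sqrt(2 * die_area_mm2)

def yield_rate(die_area_mm2: float) -> float:
    """Poisson yield model: fraction of dies that land with zero defects."""
    return math.exp(-DEFECT_DENSITY * die_area_mm2 / 100)

def cost_per_good_die(die_area_mm2: float) -> float:
    return WAFER_COST / (dies_per_wafer(die_area_mm2) * yield_rate(die_area_mm2))

# One big ~600 mm^2 monolithic die vs. a ~300 mm^2 GCD plus six ~37 mm^2 MCDs.
mono = cost_per_good_die(600)
chiplets = cost_per_good_die(300) + 6 * cost_per_good_die(37)
print(f"monolithic: ~${mono:.0f} per good die")
print(f"GCD + 6 MCDs: ~${chiplets:.0f} total (before extra packaging cost)")
```

With these made-up inputs the chiplet set comes out at roughly half the silicon cost of the monolithic die, before you pay the packaging premium back.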
 

Bernoulli

M2 slut
AMD's MCM design simply separates GCDs and MCDs; the GCDs are effectively the same compute dies they've been making for eons.

They've already "mastered" the GCD (which does most of the work) in their MCM design, so reducing latency and improving the MCDs is pretty much the only place left to gain anything.


998-navi-31-xtx.jpg


The big block in the center is the GCD... basically the same as a monolithic die, but surrounded by the MCDs, which are on a different node.
This isn't a perf thing, it's a cost-cutting measure: shrink the GCD while keeping the MCDs on another node to keep costs down.


They aren't going to be making major jumps in power, and that's likely the reason they aren't bothering with high-end GPUs anymore, cuz there's no point aiming that high and ending up in no man's land.

Better to fight the xx70s and Arc x70s.
But they say latency is the biggest problem, and improving that would already give them a jump.
 

Black_Stride

do not tempt fate do not contrain Wonder Woman's thighs do not do not
4070 clocks a lot higher though which offsets the shader deficit.

4070:

clock-vs-voltage.png


3080:

clock-vs-voltage.png


3080 - 68*128*2*1.93 = ~33.6 TFLOPS
4070 - 46*128*2*2.762 = ~32.5 TFLOPS

Perf/TFLOP is effectively the same from Ampere -> Ada.
Ada's efficiency and clock speed advantage IS its architectural advantage.
Which is why I said simply looking at the CUDA count doesn't really mean much.
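If you want to sanity-check that math yourself, here's the calculation in a couple of lines of Python (the SM counts and rough boost clocks are the same figures used above):

```python
# FP32 throughput estimate: SMs * 128 FP32 lanes per SM * 2 ops per clock (FMA) * boost clock (GHz)
def tflops(sms: int, boost_ghz: float) -> float:
    return sms * 128 * 2 * boost_ghz / 1000

print(f"RTX 3080: {tflops(68, 1.930):.1f} TFLOPS")  # ~33.6
print(f"RTX 4070: {tflops(46, 2.762):.1f} TFLOPS")  # ~32.5
```

Nearly identical paper throughput, which is why the two end up trading blows despite the CUDA count gap.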

But they say latency is the biggest problem, and improving that would already give them a jump.

Without major improvements in the GCD you aren't going to be getting a "big" jump in performance.
A big jump is of course relative; I'm assuming they are going back to the RDNA1 model and will only have an 8700XT/8800XT as their range-topping GPU, which will likely compete with the RTX 5070.
Now that might still be considered a big jump if the 8700XT/8800XT beats and/or matches the 7900XTX.
Who knows, only time will tell.
 

shamoomoo

Member
It's a new architecture, so simply looking at CUDA counts is pointless; that's always been the case.

Case in point.
RTX 3070 - 5888 CUDA cores.
RTX 4070 - 5888 CUDA cores.



The 4070 performs like a 3080 while having far fewer CUDA cores.
I'm not sure that's a good enough example, since Ampere was on an inefficient node and with Ada Lovelace the frequency was dramatically increased. It would take like-for-like clocks on both GPUs to see whether the architectural improvements actually changed anything.


The RX 6700 XT performed the same as the 5700 XT at the same clock speed.
 
Last edited:

Buggy Loop

Gold Member
As expected, ever since the Blackwell name first appeared.

Even Ada had an MCM alternative ready to go according to kopite, but Nvidia was so impressed with the TSMC output that they scrapped the idea. As everyone saw, the 4090 was a beast, probably the last monolithic king.

I’m not really sure what to make of this. If core counts aren’t changing much, does this indicate that, whilst there will be performance gains, performance won’t be significantly better than the 4000 series?

I imagine if this is the case, they are likely moving away from traditional core performance and leaning further into AI to improve performance through technology deployed through the chipset.

It’s all very confusing. Just sell me a decent 5000 series card that will give me better performance than a 4090 but not cost as much. Thanks.

There are so many things that can change in an SM that core count is a meaningless metric. Their "not substantial" can also be compared with previous gens.

From Pascal GP104 (2560) → Turing TU104 (3072) → Ampere GA104 (3072 → ~2x3072) → Ada AD104 (3840 → ~2x3840)

Nvidia "doubled" the int32 / FP cores for Ampere and Ada, but that's a bit of a stretch, not all of them can be used. Point is that from a top view block diagram perspective, until Nvidia announced it, you would barely see a "substantial" core count increase from pascal to ada either.

My guess is that, fundamentally, there's a lot of work to be done around datapaths and memory paths, especially if they go chiplet. This will be the foundation of their MCM, and while almost every company has done MCM at this point for the likes of data servers, where the data iterations are known quantities (non-real-time productivity tasks), those solutions scale badly for gaming. Even with Apple's 2.5 TB/s interposer, GPU MCMs do not scale as well as CPUs; it's something like +200% for CPUs but only +50% for GPUs. So everything will depend on the chiplet datapaths before you even decide to slap 2 GCDs together, and there's a need for a major overhaul. Just doing what servers and Apple did is not a good plan for gaming.
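To see why that cross-die traffic matters so much, here's a toy Python model with completely made-up numbers (the frame time, exchange volume, and link bandwidth are hypothetical, not any real GPU): even a very fast interposer eats into the ideal 2x as soon as the dies have to exchange data every frame and can't hide it behind compute.

```python
# Toy model (hypothetical numbers): how per-frame die-to-die traffic limits 2-GCD scaling.
def two_gcd_speedup(compute_ms: float, exchange_gb: float,
                    link_gbps: float, overlap: float = 0.0) -> float:
    """Frame-time speedup of two GCDs over one monolithic die.

    compute_ms  -- frame compute time on a single die
    exchange_gb -- data the two dies must exchange per frame (GB)
    link_gbps   -- die-to-die link bandwidth (GB/s)
    overlap     -- fraction of the exchange hidden behind compute (0 = none)
    """
    exchange_ms = exchange_gb / link_gbps * 1000 * (1 - overlap)
    return compute_ms / (compute_ms / 2 + exchange_ms)

# A 10 ms frame that has to shuffle 2 GB between dies over a 2.5 TB/s link:
print(f"no overlap:  {two_gcd_speedup(10, 2, 2500):.2f}x")                # ~1.72x instead of 2x
print(f"half hidden: {two_gcd_speedup(10, 2, 2500, overlap=0.5):.2f}x")   # ~1.85x
```

The exact numbers are invented, but the shape of the problem is the same: unless the architecture keeps most traffic local to each GCD, or overlaps the exchange with compute, you don't get anywhere near 2x, which is why the datapath rework has to come before stacking GCDs.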

Probably also a big focus on their ReSTIR / NRC path tracing. I would imagine that the gen 5 ray tracing cores and the memory caching will be majorly overhauled for path tracing. And of course even more ML leverage, since NRC is AI-based and, as we see now, they are about to release ray reconstruction; Nvidia's future is ML, and rightfully so. NRC is also highly dependent on low-level caches and memory traffic, so a revamp at the warp level so that the NRC stays in registers and doesn't bounce around memory lanes is in order if they're going down this path. All of that is detailed in their research papers as mitigations to yield additional performance.

Scaling to MCM for only +50% performance in gaming doesn't cut it; you stay monolithic if that's the result. So the question is: will Nvidia be the first to unlock gaming MCM that scales similarly to CPUs?

It might even need an OS revamp to achieve that. AMD's engineers kind of covered this with the press: it's trickier to have multiple GCDs on a GPU than CPU CCDs. As of now the OS has native multi-CPU support, and it's well understood how the system handles tasks across multiple CPUs. There's no such thing for GPUs; it has to be handled on the driver side, which is a big yikes... but time will tell.
 
Last edited:

Toots

Gold Member
Next stop on the Toots Hot Take Tour 2023
Hot take :

I was wondering if the name "Blackwell" was virtue signaling like Lovelace was?
C40BIMW.jpg

I guess it is :messenger_grinning_sweat:
Dude seems like a pretty cool guy tho and I don't really care if you celebrate black or white nerds as long as you are celebrating real nerds.
Anyway here's my hot take :
I wonder when all-white nvidia marketing execs will find enough balls between all of them to do the right thing and call the next gen of gpu "Elijah Muhammad" and be done with their pandering...
 

Black_Stride

do not tempt fate do not contrain Wonder Woman's thighs do not do not
Next stop on the Toots Hot Take Tour 2023
Hot take :

I was wondering if the name "Blackwell" was virtue signaling like Lovelace was?
C40BIMW.jpg

I guess it is :messenger_grinning_sweat:
Dude seems like a pretty cool guy tho and I don't really care if you celebrate black or white nerds as long as you are celebrating real nerds.
Anyway here's my hot take :
I wonder when all-white nvidia marketing execs will find enough balls between all of them to do the right thing and call the next gen of gpu "Elijah Muhammad" and be done with their pandering...

GeForce 6 "Curie" was codenamed after Marie Curie and Hopper was named after Grace Hopper; I take it they were virtue signaling then as well.
The anti-woke crowd are absolutely insane, mane... it's reached the point where I don't know if this is parody or whether these people are actually this deluded and so desperate for something to be outraged at.
 

Buggy Loop

Gold Member
Next stop on the Toots Hot Take Tour 2023
Hot take :

I was wondering if the name "Blackwell" was virtue signaling like Lovelace was?
C40BIMW.jpg

I guess it is :messenger_grinning_sweat:
Dude seems like a pretty cool guy tho and I don't really care if you celebrate black or white nerds as long as you are celebrating real nerds.
Anyway here's my hot take :
I wonder when all-white nvidia marketing execs will find enough balls between all of them to do the right thing and call the next gen of gpu "Elijah Muhammad" and be done with their pandering...

Sylvester Stallone Facepalm GIF


is everything about wokism now

Fucking hell
 

hlm666

Member
This is data centre, right? I can't see Nvidia wasting the limited advanced packaging they need for these on gaming GPUs when they can sell every AI GPU they can make at crazy prices, and that packaging is currently the limiting factor, apparently because of HBM. The gaming GPUs will most probably still be monolithic because of this, since MCM would require them to use that same packaging.
 

Black_Stride

do not tempt fate do not contrain Wonder Woman's thighs do not do not
We are about to get charged out the ass.
Nope fuck that, imma be skipping Blackwell for sure.

This is data centre, right? I can't see Nvidia wasting the limited advanced packaging they need for these on gaming GPUs when they can sell every AI GPU they can make at crazy prices, and that packaging is currently the limiting factor, apparently because of HBM. The gaming GPUs will most probably still be monolithic because of this, since MCM would require them to use that same packaging.
Nvidia is consolidating their architectures.
HPC will be GB10x, consumer will be GB20x.


It doesn't make sense to have monolithic be consumer and MCM be HPC when, in terms of CUDA counts, they aren't far apart.
If anything, the opposite would be more likely.
The margins on gaming GPUs are lower, so going MCM to get as much savings as possible makes sense; in the HPC market you can charge them whatever the fuck you want.
 

KungFucius

King Snowflake
I’m not really sure what to make of this. If core counts aren’t changing much, does this indicate that, whilst there will be performance gains, performance won’t be significantly better than the 4000 series?

I imagine if this is the case, they are likely moving away from traditional core performance and leaning further into AI to improve performance through technology deployed through the chipset.

It’s all very confusing. Just sell me a decent 5000 series card that will give me better performance than a 4090 but not cost as much. Thanks.
Oh c'mon. We all know whatever they come out with will push you towards the 5090 that costs $1800.

We are about to get charged out the ass.
Nope fuck that, imma be skipping Blackwell for sure.

Nvidia is consolidating their architectures.
HPC will be GB10x, consumer will be GB20x.


It doesn't make sense to have monolithic be consumer and MCM be HPC when, in terms of CUDA counts, they aren't far apart.
If anything, the opposite would be more likely.
The margins on gaming GPUs are lower, so going MCM to get as much savings as possible makes sense; in the HPC market you can charge them whatever the fuck you want.
Are you sure? I said I was fine with my 3090 for 4-6 years. Two years later I was hammering F5 to get a 4090.

I would love to see them go to 2025 without launching the next gen, because that will make me look less like a weak bitch when I cave and buy one at launch. 32GB alone will have them bumping the MSRP to $2k. They will price these things like they are losing HPC revenue just by selling them.
 

dave_d

Member
GeForce 6 "Curie" was codenamed after Marie Curie and Hopper was named after Grace Hopper; I take it they were virtue signaling then as well.
The anti-woke crowd are absolutely insane, mane... it's reached the point where I don't know if this is parody or whether these people are actually this deluded and so desperate for something to be outraged at.
My main annoyance with Hopper is idiots who try to play up her importance by playing up the fact that she coined the term "bug", and then miss her work on COBOL (one of the first higher-level languages; I am so glad I don't have to program in machine code). To give an analogy, that would be like somebody talking about Thomas Jefferson as "the guy who invented french fries" and basically missing the other stuff he kind of did.
 

Celcius

°Temp. member
This will be when I finally upgrade from my RTX 3090.
Honestly, I wouldn't mind if they took a generation and kept the performance the same as last gen but focused on getting the heat and power draws cut in half. Stuff is getting out of hand.
 

Haint

Member
Ada's efficiency and clock speed advantage IS its architectural advantage.
Which is why I said simply looking at the CUDA count doesn't really mean much.
An important distinction to make is that the same core count on the same node is going to perform very similarly. Actual architectural gains are very small at this point, especially one gen apart. For people completely unversed in this stuff, 90% of performance gains come from node shrinks allowing more cores on the same size die, higher clocks due to improved power efficiency, or both. The 4070's 55% clock advantage is the achievement of TSMC (and the failure of Samsung), not Nvidia.
 
Last edited: