
WiiU technical discussion (serious discussions welcome)

This again.

An example: the audio for Killzone 3 took less than 3% of the CPU budget. Audio does not take a large part of a game's CPU budget.
Depends on the game and the developers. Also, the guy you replied to was specifically referring to the 360. The PS3's Cell can probably handle sound resources better than the 360 due to the SPUs, though I'm not sure.
 
Depends on the game and the developers. Also, the guy you replied to was specifically referring to the 360. The PS3's Cell can probably handle sound resources better than the 360 due to the SPUs, though I'm not sure.
The Cell is primarily designed as a multimedia processor. It's not surprising that it would handle audio without problems.
On a general purpose CPU, audio can really clog up the capacity.
Racing games in particular are extremely taxing on the sound. A single car uses a multitude of "voices" at any time, which multiplies with each car on track.
 

beril

Member
There is a very simple explanation to the massive OS RAM: Internet Browser

They should probably be able to reduce it quite a bit. But that still puts it at a completely different level than the PS3 or 360
 

Biggzy

Member
I find this insane. I mean yeah, the Dashboard/XMB can be a bit slow sometimes, but they still manage to do pretty much everything a console OS could possibly need. 1GB just seems like such a mind-boggling level of overkill.

Wait till you see what Durango uses, if the rumors turn out to be true.
 

mrklaw

MrArseFace
There is a very simple explanation to the massive OS RAM: Internet Browser

They should probably be able to reduce it quite a bit. But that still puts it at a completely different level than the PS3 or 360

And there is a simple rebuttal to that - Vita
 

KageMaru

Member
I couldn't find clear examples of CPU audio usage on Xbox360. I can at least link to my sources. Can you?

Every game is different. IIRC Halo 3 used an entire core for audio, while Reach used a single thread.

Wait till you see what Durango uses, if the rumors turn out to be true.

Yeah, I'm shaking my head at those rumors. If this turns out true, I wonder how much Sony will reserve.
 

beril

Member
And there is a simple rebuttal to that - Vita

Do we know how much RAM is locked away for the OS on Vita? The 3DS also has half its RAM locked away, though it's obviously much less than the Wii U. But I'm hoping the Wii U browser is a lot better than the one in the 3DS.
 

Durante

Member
What's wrong with limited range RGB? Assuming you're using an HDTV, that matches the levels used by HD broadcast and Blu-ray. 16-235 I think, rather than 0-255. If you're calibrated for those, then your console should be set to limited anyway
You answered your own question. With limited range RGB you lose precision, since the transmission range is, well, limited. It's not a massive issue (though I can easily see it on some gradients), but it's really an unnecessary restriction.
 
Yeah, I'm shaking my head at those rumors. If this turns out true, I wonder how much Sony will reserve.


Yes, but if Sony or MS went the DDR3 route as well, increasing the amount of RAM also allows them to use a wider bus and increase overall bandwidth, and thus performance, even if the total amount/amount reserved for apps seem like overkill. If Nintendo had relented a little bit and used 8 RAM chips, like the original 360, instead of the 4 they went with to cut costs, a 128-bit bus would have been possible and we wouldn't be having this discussion right now. I fully expect MS and Sony to go with a bunch of RAM chips next gen. Like 12 perhaps.

Edit: Responded to the correct post this time. :)
 

mrklaw

MrArseFace
You answered your own question. With limited range RGB you lose precision, since the transmission range is, well, limited. It's not a massive issue (though I can easily see it on some gradients), but it's really an unnecessary restriction.

You lose some numbers, but it's more accurate to a calibrated display - it would be calibrated so that 16 is black. If you pass it below 16 with a full range signal, you won't get very dark grey, you'll still get black, so you're clipping the signal.

Conversely, if you calibrate for full range (e.g. a computer monitor), then playing a Blu-ray on it would have dark grey blacks and light grey whites (unless you have a PC with a software Blu-ray player set to compensate for that).

I agree it seems an odd restriction, but it's the standard. Just like the weird way some TVs still overscan on a 1080p input. That's just nuts; there aren't broadcasts at that resolution, so just display it 1:1 by default, for goodness' sake.
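For anyone who wants to see both effects in numbers - the precision squeeze Durante mentions and the below-16 clipping - here's a minimal sketch of the standard full↔limited mapping (assuming the usual 16-235 studio-swing convention for 8-bit video; this is just the arithmetic, not any console's actual pipeline):

#include <algorithm>
#include <cstdint>
#include <cstdio>

// Full-range (0-255) to limited/video-range (16-235) and back.
// 256 input codes get squeezed into 220 output codes, so some adjacent
// full-range values collapse onto the same limited-range value (banding).
uint8_t full_to_limited(uint8_t v) {
    return static_cast<uint8_t>(16.0 + v * 219.0 / 255.0 + 0.5);
}

uint8_t limited_to_full(uint8_t v) {
    // Anything below 16 has nowhere to go but black once it's back in full range.
    double f = (v - 16.0) * 255.0 / 219.0;
    return static_cast<uint8_t>(std::clamp(f, 0.0, 255.0) + 0.5);
}

int main() {
    int collisions = 0;
    for (int v = 1; v < 256; ++v)
        if (full_to_limited(v) == full_to_limited(v - 1)) ++collisions;
    std::printf("%d of 255 adjacent full-range steps collapse after the conversion\n", collisions);
}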
 

Vanillalite

Ask me about the GAF Notebook
I'd be interested to hear some info from the devs on their DSP usage. At any rate, I like the sound discussion info I'm getting in this thread. Thanks GAF!
 

TheD

The Detective
The Cell is primarily designed as a multimedia processor. It's not surprising that it would handle audio without problems.
On a general purpose CPU, audio can really clog up the capacity.
Racing games in particular are extremely taxing on the sound. A single car uses a multitude of "voices" at any time, which multiplies with each car on track.

Audio is easy to handle.

Just because one game used a core on the 360 for something like 300 sounds at a time does not mean a damn thing for most games.

"general purpose CPU"s are perfectly fine to process audio, the Cell does not have some big advantage (other than the fact it has around double the cores of the 360).
 

Margalis

Banned
"general purpose CPU"s are perfectly fine to process audio, the Cell does not have some big advantage (other than the fact it has around double the cores of the 360).

Cell SPUs are built to be super high performance data stream processors and have a lot in common with a DSP.
 

Vanillalite

Ask me about the GAF Notebook
Audio is easy to handle.

Just because one game used a core on the 360 for something like 300 sounds at a time does not mean a damn thing for most games.

"general purpose CPU"s are perfectly fine to process audio, the Cell does not have some big advantage (other than the fact it has around double the cores of the 360).

Might be easy to handle, but it's still taking up valuable threads and clock cycles that the CPU could use elsewhere. I know a few years ago they estimated on the PC you gained like a 5-10% CPU performance boost if you grabbed a dedicated soundcard versus onboard. That's not a ton, but every little bit can count!
 

Reg

Banned
So, is it fair to say that the Wii U is a sidestep rather than a jump forward compared to current gen hardware?
 

ozfunghi

Member
Might be easy to handle, but it's still taking up valuable threads and clock cycles that the CPU could use elsewhere. I know a few years ago they estimated on the PC you gained like a 5-10% CPU performance boost if you grabbed a dedicated soundcard versus onboard. That's not a ton, but every little bit can count!

Exactly. Especially if the U-CPU is indeed not the biggest monster, using the DSP should be even more rewarding.
 
So, is it fair to say that the Wii U is a sidestep rather than a jump forward compared to current gen hardware?

That'd be fair.

It seemingly has a GPU with a fairly modern featureset and similar capability, and more (but slower) memory.

But it's also fair to say it has drawbacks in comparison to the PS3/360: the aforementioned memory bandwidth, and a CPU that lacks any kind of grunt.

Culminating in something that should be marginally more powerful, getting there in different ways.
 

TheD

The Detective
Might be easy to handle, but it's still taking up valuable threads and clock cycles that the CPU could use elsewhere. I know a few years ago they estimated on the PC you gained like a 5-10% CPU performance boost if you grabbed a dedicated soundcard versus onboard. That's not a ton, but every little bit can count!

That was a very long time ago.

Modern soundcards will give you no performance increase whatsoever.
 

ikioi

Banned
That was a very long time ago.

Modern soundcards will give you no performance increase whatsoever.

But that's due to Microsoft changing the audio APIs in modern versions of Windows, not because dedicated DSPs have no performance advantage.

Microsoft got sick of shitty sound card drivers BSODing Windows. To help improve OS stability, Microsoft changed the APIs for sound processing in Windows Vista to prevent direct hardware sound processing. By stopping sound hardware acceleration, they stopped sound card drivers from talking directly to the Windows NT kernel and making API calls to hardware. Doing this eliminated the ability of shit-house sound drivers (Creative, basically) to blue screen Windows computers.
 

KageMaru

Member
Yes, but if Sony or MS went the DDR3 route as well, increasing the amount of RAM also allows them to use a wider bus and increase overall bandwidth, and thus performance, even if the total amount/amount reserved for apps seem like overkill. If Nintendo had relented a little bit and used 8 RAM chips, like the original 360, instead of the 4 they went with to cut costs, a 128-bit bus would have been possible and we wouldn't be having this discussion right now. I fully expect MS and Sony to go with a bunch of RAM chips next gen. Like 12 perhaps.

Edit: Responded to the correct post this time. :)

IF they go with ~8GB of memory, I expect it to be DDR4.

Still, if the rumors are true and MS plans to reserve 3GB for the OS, that's a huge waste IMO.

That'd be fair.

It seemingly has a GPU with a fairly modern featureset and similar capability, and more (but slower) memory.

But it's also fair to say it has drawbacks in comparison to the PS3/360: the aforementioned memory bandwidth, and a CPU that lacks any kind of grunt.

Culminating in something that should be marginally more powerful, getting there in different ways.

To be fair I think it'll handle general purpose code better than the Xenon or Cell. It's lacking in floating point performance and I imagine that's where most of the weak CPU complaints are coming from.
 

Durante

Member
You lose some numbers, but it's more accurate to a calibrated display - it would be calibrated so that 16 is black. If you pass it below 16 with a full range signal, you won't get very dark grey, you'll still get black, so you're clipping the signal.

Conversely, if you calibrate for full range (e.g. a computer monitor), then playing a Blu-ray on it would have dark grey blacks and light grey whites (unless you have a PC with a software Blu-ray player set to compensate for that).
Every display device I'm using can be switched between full and limited range RGB (most recent ones auto-detect it).

To be fair I think it'll handle general purpose code better than the Xenon or Cell. It's lacking in floating point performance and I imagine that's where most of the weak CPU complaints are coming from.
It will certainly handle branchy code better than Xenon/Cell per clock. But I really wish we knew how much lower the clock is.
 

wsippel

Banned
Some anonymous dude on Slashdot who claims to be a developer working on Wii U wrote something interesting: According to him, the Wii U CPU has no VMX or VSX units. Instead, Nintendo still uses a SIMD capable FPU with support for paired singles. That makes sense, paired singles would be required for Wii compatibility, so they have to be supported. No surprise here.

But he also wrote that he expected the FPRs (floating point registers) to be 64bit, enough for one double or two singles. 64bit FPRs seem to be standard for all PowerPC cores, no matter whether or not the cores support paired singles (like the 476FP). What he found instead were 96bit registers. I've never heard of a 96bit floating point unit or 96bit registers. I don't even know if there would be a point. Coordinates are typically 96bit I believe, so maybe having one vertex coordinate per register would be beneficial?
 

Durante

Member
As I understood that post, it's not really 96 bit registers. It's that there are general 64 bit registers, but the paired single instructions don't use those 64 bits, they use 32 bits of that and another extra 32 bit register.

Which does honestly sound confusing to me (I'm used to SIMD architectures that simply pack N X-bit numbers into one N*X bit register).
 

wsippel

Banned
As I understood that post, it's not really 96 bit registers. It's that there are general 64 bit registers, but the paired single instructions don't use those 64 bits, they use 32 bits of that and another extra 32 bit register.

Which does honestly sound confusing to me (I'm used to SIMD architectures that simply pack N X-bit numbers into one N*X bit register).
The whole point of paired singles was to use one double precision register to store two single precision values and then do math on both values across registers. If the values have to be in separate registers, it's not really a paired single anymore.
 

blu

Wants the largest console games publisher to avoid Nintendo's platforms.
Some anonymous dude on Slashdot who claims to be a developer working on Wii U wrote something interesting: According to him, the Wii U CPU has no VMX or VSX units. Instead, Nintendo still uses a SIMD capable FPU with support for paired singles. That makes sense, paired singles would be required for Wii compatibility, so they have to be supported. No surprise here.

But he also wrote that he expected the FPRs (floating point registers) to be 64bit, enough for one double or two singles. 64bit FPRs seem to be standard for all PowerPC cores, no matter whether or not the cores support paired singles (like the 476FP). What he found instead were 96bit registers. I've never heard of a 96bit floating point unit or 96bit registers. I don't even know if there would be a point. Coordinates are typically 96bit I believe, so maybe having one vertex coordinate per register would be beneficial?
PPC FPRs are normally 64bit because PPCs normally support double precision (even a lowly embedded CPU does that). From there on, when processing singles, half of the FPR is used. When processing doubles, the entire FPR is used. When using paired singles, 750CL (aka Gekko, aka Hollywood) uses the low 32 bits of the FPR for ps0 (paired-single-0) and the high 32 bits for ps1 (paired-single-1). That Slashdot anonymous post is a joke.

edit: Ok, I'm in the middle of a gargantuan build, so let me elaborate a bit. First, a disclaimer: I'm in no way familiar with U-CPU, never touched it, never smelled it, never even removed its heatsink. Yet, there's a single, very useful public bit of info re U-CPU - it's binary compatible with the Gekko/Hollywood. And that is a well-studied CPU. So back to 750CL (which is IBM's stock name for Gekko, in case somebody missed that).

Gekko supports paired singles by utilizing its super-scalar FPU pipeline and through some enhancements to its load/store unit. Like every PPC I've had contact with, Gekko features 64bit floating-point registers (FPR), and unsurprisingly, bluntly uses those to store paired singles (as I said in the part before the edit). Now, there's a catch in that - while the organisation of the singles is indeed in the logical high/low split of the FPR, the CPU _cannot_ interpret doubles as a singles pair, and vice versa. What that means is that if you load a pair of singles (through its dedicated load instruction, blessed be anonymous cowards) into an FPR, that FPR cannot be treated in a subsequent instruction as if it contained a double - no subsequent math on doubles can use that register as a double - such math will produce unpredictable results. But the restriction does not end there - not only can math not treat this FPR as a double, the store instructions that work with doubles also cannot use that register as a double. So if you naively expect that an FPR which contains a singles pair could be stored via a doubles store op - you'd be wrong! You have to use the dedicated paired-singles op to store an FPR containing paired singles.

*build done*
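To make that "same register, two incompatible views" point concrete, here's a toy host-side C++ sketch - just the bit layout blu describes, not actual Gekko code: two singles packed into the low/high halves of one 64-bit value simply don't add up to a meaningful double, which is roughly why the doubles math and store ops aren't allowed to touch an FPR holding a pair.

#include <cstdint>
#include <cstdio>
#include <cstring>

int main() {
    // ps0 in the low 32 bits, ps1 in the high 32 bits, as on Gekko/750CL.
    float ps0 = 1.5f, ps1 = -3.25f;

    uint32_t lo, hi;
    std::memcpy(&lo, &ps0, sizeof lo);
    std::memcpy(&hi, &ps1, sizeof hi);
    uint64_t fpr = (static_cast<uint64_t>(hi) << 32) | lo;

    // Reinterpret the very same 64 bits as an IEEE-754 double.
    double as_double;
    std::memcpy(&as_double, &fpr, sizeof as_double);

    std::printf("ps0 = %g, ps1 = %g, same bits read as a double = %g\n", ps0, ps1, as_double);
}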
 

Kenka

Member
Guys, can anyone enlighten me? I would like to know how the capacities and rates we hear left and right influence the quality of the output (framerate, polygons, visual effects). All that I know is that data from the disc is read and ends up being displayed on the screen (yeah, I am that clueless). I have no idea how the components, be it GPU, CPU, RAM, EDRAM, disk player, work with one another to make that happen and how crucial their attributes are. That's what I want to learn (if you could help me).
 

Erasus

Member
Guys, can anyone enlighten me? I would like to know how the capacities and rates we hear left and right influence the quality of the output (framerate, polygons, visual effects). All that I know is that data from the disc is read and ends up being displayed on the screen (yeah, I am that clueless). I have no idea how the components, be it GPU, CPU, RAM, EDRAM, disk player, work with one another to make that happen and how crucial their attributes are. That's what I want to learn (if you could help me).

Often it's more = better. But architecture shifts make a big difference. (Like going from a 3GHz Pentium 4 to a 2GHz i3: the i3, despite having a lower clock rate, is way more powerful.)
 

neoneogaffer

Neo Member
I have a question about the form factor of the U. Is it just the way things turned out because of the parts they used, did they use specific parts in order to get a small form factor, or is there a business case for it? I'm assuming a small form factor results in higher costs.
 

Kenka

Member
I have a question about the form factor of the U. Is it just the way things turned out because of the parts they used, did they use specific parts in order to get a small form factor, or is there a business case for it? I'm assuming a small form factor results in higher costs.
What a username! :p

A gaffer has been wowed by the low power consumption of the WiiU, stating that achieving such efficiency is "a work of art", so I guess this isn't the result of pure luck. I also read (on GAF, lol) that form factor is an important feature in electronics in Japan. More advanced suppositions suggest it is because the Japanese have smaller houses. Do whatever you want with this info.
 

SmokeMaxX

Member
I don't know where else to post this, and it's not worth making a new thread for (plus I don't know if it's old or not), but Panasonic are supplying the Wii U's optical disc drives, which are equipped with an optical pickup (whatever that is).

http://www.asahi.com/digital/nikkanko/NKK201211200004.html

Maybe:
ABSTRACT

The optical pick-up for DVD with CD (compact disc) compatibility is discussed. The difference of the substrate thickness between DVD and CD causes the different spherical aberration and prevents the laser beam from being focused into a diffraction limit spot size with only one objective lens. Several methods of reducing this aberration and possessing compatibility with CD are proposed. The twin lens type optical pick-up is one solution in overcoming this problem. It incorporates objective lenses for both DVD and CD. Each lens results in an optimum focused spot without the extra spherical aberration for each type of disc
http://ieeexplore.ieee.org/xpl/logi...rg/iel1/30/11320/00536189.pdf?arnumber=536189

?
 

wsippel

Banned
PPC FPRs are normally 64bit because PPCs normally support double precision (even a lowly embedded CPU does that). From there on, when processing singles, half of the FPR is used. When processing doubles, the entire FPR is used. When using paired singles, 750CL (aka Gekko, aka Hollywood) uses the low 32 bits of the FPR for ps0 (paired-single-0) and the high 32 bits for ps1 (paired-single-1). That Slashdot anonymous post is a joke.

edit: Ok, I'm in the middle of a gargantuan build, so let me elaborate a bit. First, a disclaimer: I'm in no way familiar with U-CPU, never touched it, never smelled it, never even removed its heatsink. Yet, there's a single, very useful public bit of info re U-CPU - it's binary compatible with the Gekko/Hollywood. And that is a well-studied CPU. So back to 750CL (which is IBM's stock name for Gekko, in case somebody missed that).

Gekko supports paired singles by utilizing its super-scalar FPU pipeline and through some enhancements to its load/store unit. Like every PPC I've had contact with, Gekko features 64bit floating-point registers (FPR), and unsurprisingly, bluntly uses those to store paired singles (as I said in the part before the edit). Now, there's a catch in that - while the organisation of the singles is indeed in the logical high/low split of the FPR, the CPU _cannot_ interpret doubles as a singles pair, and vice versa. What that means is that if you load a pair of singles (through its dedicated load instruction, blessed be anonymous cowards) into an FPR, that FPR cannot be treated in a subsequent instruction as if it contained a double - no subsequent math on doubles can use that register as a double - such math will produce unpredictable results. But the restriction does not end there - not only can math not treat this FPR as a double, the store instructions that work with doubles also cannot use that register as a double. So if you naively expect that an FPR which contains a singles pair could be stored via a doubles store op - you'd be wrong! You have to use the dedicated paired-singles op to store an FPR containing paired singles.

*build done*
Thanks for the explanation. I already knew a few things about how paired singles work and that they are limited to certain operations (which are listed here) - I mostly wondered if extending the system by adding a ps2 would make sense, to modify a 3D coordinate with two registers in a single cycle for example.
 

blu

Wants the largest console games publisher to avoid Nintendo's platforms.
Thanks for the explanation. I already knew a few things about how paired singles work and that they are limited to certain operations (which are listed here) - I mostly wondered if extending the system by adding a ps2 would make sense, to modify a 3D coordinate with two registers in a single cycle for example.
While I would not rule out a hypothetical ps2 as having some value, 3D math is normally a 4D vector matter - homogeneous coordinates and stuff.

Apropos, I've been getting some vector math code of mine running on the Wii lately, but it's more of an exercise in compilers than anything else (it's cross-platform, heavily templatized C++ vector code). I've yet to succeed in persuading the compiler to vectorize it to paired singles, but even in the current scalar state of the code, Gekko performs perfectly in line with my expectations, per clock. Nothing horrible about it ; )
 

wsippel

Banned
While I would not rule out a hypothetical ps2 as having some value, 3D math is normally a 4D vector matter - homogeneous coordinates and stuff.
What does the fourth value do? I only code at a much higher level, and have only ever used three floats to translate coordinates.
 

IdeaMan

My source is my ass!
Hey, while my two techie buddies are here: how would you explain that a developer could be skeptical about the alleged speed of the Wii U RAM after the recent teardowns, stating that they haven't witnessed such performance? (Yes, I received some impressions on this affair.)

I presume there are two scenarios:

1) The whole memory layout is built in such a way that no developer would notice that, somewhere in the middle of its processing from storage to the screens, their code is having a hard time with the RAM, and on the contrary the general performance of this whole memory department is good (the more likely case, with the eDRAM, etc., as said many times).

2) There is more to it than those rather simple analyses - at least the publicly available ones (pictures of the chips, finding the part references, searches on the net) - and the bandwidth could be clearly higher than what those teardown results lead us to think it is (so more than that roughly 12GB/s)?
 
Conversely, if you calibrate for full range (e.g. a computer monitor), then playing a Blu-ray on it would have dark grey blacks and light grey whites (unless you have a PC with a software Blu-ray player set to compensate for that).

Why would you calibrate a TV for RGB anything with Blu-rays and DVDs? YCbCr is what you should be using. YCbCr also happens to be what the Wii U uses in Wii mode.

I'm not going to be especially happy until Full Range RGB is patched onto the Wii U.
 
Hey, while my two techie buddies are here: how would you explain that a developer could be skeptical about the alleged speed of the Wii U RAM after the recent teardowns, stating that they haven't witnessed such performance? (Yes, I received some impressions on this affair.)

I suspect there are two scenarios:

1) The whole memory layout is built in such a way that no developer would notice that, somewhere in the middle of its processing from storage to the screens, their code is having a hard time with the RAM, and on the contrary the general performance of this whole area is good (the more likely case, with the eDRAM, etc., as said many times).

2) There is more to it than those rather simple analyses - at least the publicly available ones (pictures of the chips, finding the part references, searches on the net) - and the bandwidth could be clearly higher than what those teardown results lead us to think it is (so more than that roughly 12GB/s)?

If you're a PC developer and approach the Wii U as if you were making a PC game, then you would have speed problems. You would not use any of the custom features that Nintendo added. This is especially true if you're porting games.
 

Durante

Member
Has anyone done any power measurements in more demanding games than NSMBU?

Taking that measurement of 33 Watts for that (that was after the power supply, I believe), I'd estimate (roughly!) that the most the GPU alone will probably pull is 30W. The top end of FLOPS/W AMD provides in the Turks line is a bit below 15, so let's go with 15. That would put the best-case Wii U GPU at 450 GFLOPS, which seems to match up pretty well with some of the (lower-end) early rumours.
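Spelled out, with every input being a rough assumption rather than a measured number:

#include <cstdio>

int main() {
    // 33W at the wall in NSMBU; assume at most ~30W of that is the GPU.
    double gpu_watts       = 30.0;
    double gflops_per_watt = 15.0;  // roughly the best of AMD's Turks line
    std::printf("upper-bound estimate: ~%.0f GFLOPS\n", gpu_watts * gflops_per_watt);
}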

Why would you calibrate a TV for RGB anything with Blu-rays and DVDs? YCbCr is what you should be using. YCbCr also happens to be what the Wii U uses in Wii mode.

I'm not going to be especially happy until Full Range RGB is patched onto the Wii U.
Yeah, that really should have been there from the start. But I'm sure Nintendo will patch it in soon enough, that shouldn't be particularly hard.


2) There is more to it than those rather simple analyses - at least the publicly available ones (pictures of the chips, finding the part references, searches on the net) - and the bandwidth could be clearly higher than what those teardown results lead us to think it is (so more than that roughly 12GB/s)?
This seems exceedingly unlikely. The (maximum) bandwidth to the 2GB DDR3 memory is quite simply what it says on the chips.
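For reference, that nameplate figure falls straight out of the chip markings; a quick sketch assuming the teardown reading of four 16-bit DDR3-1600 devices:

#include <cstdio>

int main() {
    double transfers_per_s = 1600e6;  // DDR3-1600: 1600 MT/s per pin
    int    chips           = 4;       // as seen in the teardown
    int    bits_per_chip   = 16;      // x16 devices
    int    bus_bits        = chips * bits_per_chip;        // 64-bit bus
    double gb_per_s        = transfers_per_s * (bus_bits / 8.0) / 1e9;
    std::printf("%d-bit bus -> %.1f GB/s peak; eight chips (128-bit) would give %.1f GB/s\n",
                bus_bits, gb_per_s, 2 * gb_per_s);
}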
 

Thraktor

Member
Hey, while my two techie buddies are here: how would you explain that a developer could be skeptical about the alleged speed of the Wii U RAM after the recent teardowns, stating that they haven't witnessed such performance? (Yes, I received some impressions on this affair.)

I presume there are two scenarios:

1) The whole memory layout is built in such a way that no developer would notice that, somewhere in the middle of its processing from storage to the screens, their code is having a hard time with the RAM, and on the contrary the general performance of this whole memory department is good (the more likely case, with the eDRAM, etc., as said many times).

2) There is more to it than those rather simple analyses - at least the publicly available ones (pictures of the chips, finding the part references, searches on the net) - and the bandwidth could be clearly higher than what those teardown results lead us to think it is (so more than that roughly 12GB/s)?

It's most likely that the developers you're talking to have optimised their rendering pipelines around the eDRAM, basically ensuring that anything which requires sustained high bandwidth access is on the eDRAM at the time of that access.
 
The issue was the amount of eDRAM and the lack of CPU muscle to handle tiling in many games. IIRC eDRAM can be beneficial to deferred rendering, saving memory where the G-buffer would typically be stored.

The main problem with deferred engines on the 360 is that they ditch a lot of the GPU's fixed functions, so they rely on raw power only. Not considering the unified vs. fixed shaders setup, Xenos is barely stronger than RSX in raw GFLOPS output.

The eDRAM could have had some benefit if it were bigger, but then again, deferred renderers do not like tiling at all.

The 360 is designed around DX forward rendering and is very optimized for it. Any other renderer will have a hard time being optimal on such an architecture. But you know, some people have made use of hardware fixed functions in ways other than originally intended in the past.

About the Wii U CPU: with such a low TDP and a 45nm fab process, I wouldn't expect it to be clocked over 2.4GHz. Even with double the L2 cache, it would have a hard time keeping up with Xenon, since no heavy architecture improvements are known and it doesn't support SMP.

Also, even though I think 1GB is overkill for an OS, smartphones need over 512MB to deliver a smooth experience. I guess the Wii U browser is able to open the desktop versions of websites; am I wrong?
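Some rough numbers on the deferred-rendering point, assuming a typical 720p G-buffer with four 32-bit colour targets plus depth (exact formats vary per engine):

#include <cstdio>

int main() {
    const double w = 1280, h = 720;
    const double bytes_per_pixel = 4;   // one 32-bit target
    const int    colour_targets  = 4;
    double colour_mb = w * h * bytes_per_pixel * colour_targets / (1024 * 1024);
    double depth_mb  = w * h * bytes_per_pixel / (1024 * 1024);
    std::printf("G-buffer ~%.1f MB + depth ~%.1f MB, against 10 MB of eDRAM on 360\n",
                colour_mb, depth_mb);
}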
 

blu

Wants the largest console games publisher to avoid Nintendo's platforms.
What does the fourth value do? I only code at a much higher level, and have only ever used three floats to translate coordinates.
Homogeneous coordinates are N-dimensional coordinates where one of the coordinates (read: normally the last one) is an 'out-of-this-world' component, figuratively speaking. It allows matrix transformations (among other things) to feature the entire set of spatial transformations you'd normally care about, as long as the matrices are NxN, or Nx(N-1). Basically, the extra component is an 'out-of-band' thing which carries extra information, which you cannot encode in a 'normal' vector of just the bare dimensionality.

Historically, the out-of-band component in 3D homogeneous space is called W, i.e. a 3D homogeneous space coordinate/vector is an <x, y, z, w> tuple. Setting W = 1.0 makes the tuple behave as a first-class-citizen coordinate, subject to translations and perspective transforms; setting W = 0.0 makes the tuple immune to translations, which is good for directional vectors. Yes, you can do all that manually, if you have the prior knowledge what type of tuple that is. But aside from a better abstraction model, it can also be an efficiency gain if the hw supports 4-wide ops. For instance, imagine we have a vec3 coordinate we want to get the partial product of with a row/column from a homogeneous transform. We can do that as:

res = dot(vec4.xyz, vec3)
res += vec4.w

or as:

res = dot(vec4, vec4(vec3, 1.0))

If the hw does dot4 natively, the latter case is preferable over the former, which has a data dependency and is ergo not issuable in the given order - it would stall the pipeline (the dot operation can have arbitrarily high latency). The knowledge of the nature of the tuple still allows us to store our original argument as a vec3 and not as a vec4, though.

Last but not least, we have quaternions, which are inherently 4D.
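And a tiny C++ toy (not any particular engine's math code) showing what the w = 1 vs w = 0 convention buys: the same matrix and the same code path translate a point but leave a direction alone.

#include <array>
#include <cstdio>

using Vec4 = std::array<double, 4>;
using Mat4 = std::array<Vec4, 4>;   // row-major

Vec4 mul(const Mat4& m, const Vec4& v) {
    Vec4 r{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            r[i] += m[i][j] * v[j];
    return r;
}

int main() {
    Mat4 translate_by_5x = {{ {1, 0, 0, 5},
                              {0, 1, 0, 0},
                              {0, 0, 1, 0},
                              {0, 0, 0, 1} }};
    Vec4 point     = {1, 2, 3, 1};   // w = 1: translation applies
    Vec4 direction = {1, 2, 3, 0};   // w = 0: translation drops out
    Vec4 p = mul(translate_by_5x, point);
    Vec4 d = mul(translate_by_5x, direction);
    std::printf("point     -> (%g, %g, %g)\n", p[0], p[1], p[2]);
    std::printf("direction -> (%g, %g, %g)\n", d[0], d[1], d[2]);
}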
 