Halo Reach and 4 did bake the direction, but perhaps they used a more primitive version that required less memory (radiosity normal mapping à la the Source engine, for example).
Why? Because Halo 3 baked 9-term spherical harmonics (SH) for every texel in the lightmap. At least half of those terms each required a 3-channel texture, and I think all of these were baked at the full resolution of the lightmap.
-------------------------EDIT:--------------
So, I guess it's stored as 5 x 1024^2 x 3-channel float textures.
--------------------------------------------------
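For a rough sense of what that guess costs in memory, here is the arithmetic as a quick sketch. The 32-bit float size and the lack of compression or mipmaps are my assumptions, not something I know about the shipped game, and `lightmap_bytes` is just a made-up helper name.

```python
# Back-of-envelope memory estimate for the guessed layout:
# 5 textures x 1024^2 texels x 3 channels.
# ASSUMPTIONS (not confirmed anywhere): uncompressed, no mipmaps,
# 4 bytes per channel (32-bit float) unless stated otherwise.

def lightmap_bytes(num_textures=5, resolution=1024, channels=3, bytes_per_channel=4):
    """Total bytes for a stack of square, uncompressed lightmap textures."""
    return num_textures * resolution * resolution * channels * bytes_per_channel

MIB = 2 ** 20

full_float = lightmap_bytes()                     # 32-bit floats
half_float = lightmap_bytes(bytes_per_channel=2)  # 16-bit floats

print(full_float / MIB, "MiB with 32-bit floats")
print(half_float / MIB, "MiB with 16-bit floats")
```

Under those assumptions it works out to 60 MiB with 32-bit floats, or 30 MiB with 16-bit floats, which is a lot for one lightmap on that generation of hardware.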
My understanding is that this is basically like using an environment map/cubemap (as with older techniques for specular), except that instead of whole segments of your environment sampling from a single cubemap or the nearest environment probe, à la the Source engine, you pretty much have a unique 'cubemap' for every texel of your lightmap. This is represented as a 9-term SH, which for diffuse (low-frequency) lighting comes within about 3% error of what the actual, memory-prohibitive cubemap would give.
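To make the "cubemap squeezed into 9 numbers" idea concrete, here is a minimal sketch of projecting incoming light onto the first 9 real SH basis functions and reading it back for a direction. This is only an illustration, not Bungie's baker: the function names (`sh_basis`, `project`, `reconstruct`), the toy `sky` lighting and the simple lat-long integration are all my own assumptions, and it handles a single color channel (a real lightmap would need one set of coefficients per channel).

```python
import math

def sh_basis(d):
    """First 9 real spherical harmonic basis values for a unit direction d."""
    x, y, z = d
    return [
        0.282095,                        # band 0
        0.488603 * y,                    # band 1
        0.488603 * z,
        0.488603 * x,
        1.092548 * x * y,                # band 2
        1.092548 * y * z,
        0.315392 * (3.0 * z * z - 1.0),
        1.092548 * x * z,
        0.546274 * (x * x - y * y),
    ]

def project(radiance, n_theta=64, n_phi=128):
    """Integrate radiance(d) against each basis function over the sphere
    using a midpoint rule on a latitude-longitude grid."""
    coeffs = [0.0] * 9
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            d = (math.sin(theta) * math.cos(phi),
                 math.sin(theta) * math.sin(phi),
                 math.cos(theta))
            weight = math.sin(theta) * d_theta * d_phi  # solid angle of the cell
            value = radiance(d)
            for k, b in enumerate(sh_basis(d)):
                coeffs[k] += value * b * weight
    return coeffs

def reconstruct(coeffs, d):
    """Approximate the stored lighting arriving from direction d."""
    return sum(c * b for c, b in zip(coeffs, sh_basis(d)))

def sky(d):
    """Toy lighting: brighter from above (+z), dimmer from below."""
    return 1.0 + 0.5 * d[2]

coeffs = project(sky)                                  # the 9 numbers a texel would store
print(round(reconstruct(coeffs, (0.0, 0.0, 1.0)), 3))  # close to 1.5 (looking straight up)
```

The toy sky is smooth enough that 9 terms reproduce it almost exactly. Sharp features like a small bright light would get blurred out, which is the trade you make for the memory savings.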
One of their goals this go-around was to make their bump maps work well. Part of this is getting the specular to work with their lightmaps, and getting proper indirect lighting baked in, including indirect specular. With radiosity normal maps, for example (which is what I imagine Reach and 4 use), you only really get directional information suitable for diffuse lighting. I believe with radiosity normal mapping you only get the dominant diffuse light for each of the 3 basis directions, and most engines would still just use the nearest cubemap/environment probe for specular.
However, when you use 9-term SH lightmaps you essentially get a unique 'cubemap' for every texel. So you know (approximately) the lighting arriving at a texel from every angle, pretty much the irradiance, and this lets them precompute not only directional diffuse lighting but also specular that changes with your viewing angle (the way specular should), per texel.
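Here is a rough sketch of how one texel's 9 coefficients could feed both kinds of lighting. To be clear, this is my own illustration and not how Halo 3's shaders are actually written: the diffuse part uses the standard cosine-lobe weights per SH band, and the "specular" part is just the crudest possible version (look up the stored lighting along the reflection vector). All names here (`irradiance`, `reflect`, `glossy_radiance`) are made up, and it is single-channel.

```python
import math

def sh_basis(d):
    """First 9 real spherical harmonic basis values for a unit direction d."""
    x, y, z = d
    return [
        0.282095,
        0.488603 * y, 0.488603 * z, 0.488603 * x,
        1.092548 * x * y, 1.092548 * y * z,
        0.315392 * (3.0 * z * z - 1.0),
        1.092548 * x * z, 0.546274 * (x * x - y * y),
    ]

BAND_OF = [0, 1, 1, 1, 2, 2, 2, 2, 2]                            # SH band of each coefficient
COSINE_LOBE = [math.pi, 2.0 * math.pi / 3.0, math.pi / 4.0]      # per-band diffuse weights

def irradiance(coeffs, normal):
    """Diffuse: cosine-weighted light arriving at a surface with this
    (bump-mapped) normal. Depends on the normal only, not on the viewer."""
    basis = sh_basis(normal)
    return sum(COSINE_LOBE[BAND_OF[k]] * coeffs[k] * basis[k] for k in range(9))

def reflect(view, normal):
    """Mirror the unit view vector (surface -> eye) about the unit normal."""
    d = sum(v * n for v, n in zip(view, normal))
    return tuple(2.0 * d * n - v for v, n in zip(view, normal))

def glossy_radiance(coeffs, view, normal):
    """Very rough view-dependent term: the stored (blurry) lighting
    arriving along the reflection direction."""
    r = reflect(view, normal)
    return sum(c * b for c, b in zip(coeffs, sh_basis(r)))

# Coefficients for the toy lighting L(d) = 1 + 0.5 * z, written out by hand.
texel = [0.0] * 9
texel[0] = 4.0 * math.pi * 0.282095                 # constant part
texel[2] = 0.5 * (4.0 * math.pi / 3.0) * 0.488603   # "brighter from above" part

up = (0.0, 0.0, 1.0)
s = math.sqrt(0.5)
print(irradiance(texel, up))                        # same for every viewer
print(glossy_radiance(texel, up, up))               # looking straight down at the surface
print(glossy_radiance(texel, (s, 0.0, s), up))      # looking from 45 degrees: different value
```

The point is the last two lines: the diffuse result only changes with the normal, while the reflection lookup changes as the viewer moves, all from the same 9 numbers stored at the texel. With radiosity normal maps you only get the first of those.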