For a start, any unit inside a GPU doing a task offloaded from the CPU is hardware acceleration, so the BVH units in the PS5 and Series GPUs are accelerators, as are shaders and compute shaders, which I assume you think, because they are programmable, somehow invalidates the acceleration they provide.
What differs between AMD and Nvidia is this: for the purpose of this discussion, we'll say Nvidia's RT accelerators are like the fixed-path ASICs of the pre-OpenGL-1.5 era, when hardware T&L was what "acceleration" meant. Nvidia's RT accelerators, and even the CUDA units, work independently of the conventional programmable graphics pipeline, so they run in parallel without blocking once the setup of the parallel jobs is complete.
The concept of hardware acceleration usually refers to a unit designed specifically to run certain instructions. Shaders are generalist parallel processors, so they are not hardware accelerators.
For example, if a GPU has a unit dedicated to decoding AV1, that is hardware-accelerated decode/encode of AV1 video. But if it runs on the shaders, then it's just software-based.
An example of this, relating to UE5, is rasterization. A GPU has specific units for rasterization, what you called ASICs. But Epic chose to do software rasterization, because it is more flexible and better suited to their engine.
In the case of RT, Nvidia has units that specifically accelerate both BVH traversal and ray testing.
In the case of RDNA2, it only has some instructions inside the TMUs that accelerate ray testing. BVH traversal is done in software, on the GPU shaders.
RDNA2 does not have any hardware to accelerate BVH traversal itself. That part runs on plain shaders, which don't even have dedicated instructions for the traversal loop.
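To make the software/hardware split concrete, here's a toy sketch (the node layout and function names are my own illustration, not any driver's real format): the traversal is an ordinary stack-driven loop that general shader ALUs can run, while the per-node ray/box test is the small fixed kernel that RDNA2's TMU instructions, and Nvidia's RT cores, accelerate.

```python
# Toy sketch of why BVH traversal is "software" on RDNA2: the traversal is a
# branchy stack loop any shader can execute, while the per-node ray/AABB test
# is the small fixed kernel the intersection instructions accelerate.
# The node layout here is purely illustrative.

def ray_aabb_hit(origin, inv_dir, lo, hi):
    """Slab test: does the ray hit the axis-aligned box [lo, hi]?"""
    tmin, tmax = 0.0, float("inf")
    for axis in range(3):
        t1 = (lo[axis] - origin[axis]) * inv_dir[axis]
        t2 = (hi[axis] - origin[axis]) * inv_dir[axis]
        tmin = max(tmin, min(t1, t2))
        tmax = min(tmax, max(t1, t2))
    return tmin <= tmax

def traverse(nodes, origin, direction):
    """Iterative BVH traversal; returns indices of leaves the ray touches.
    nodes: list of dicts {'lo', 'hi', 'left', 'right'}, left/right None
    for leaves (a toy layout)."""
    inv_dir = tuple(1.0 / d if d != 0.0 else float("inf") for d in direction)
    hits, stack = [], [0]            # start at the root
    while stack:                     # <- this loop is the "software" part
        i = stack.pop()
        n = nodes[i]
        if not ray_aabb_hit(origin, inv_dir, n["lo"], n["hi"]):
            continue                 # <- this test is the accelerated part
        if n["left"] is None:
            hits.append(i)           # leaf: would hand triangles to ray testing
        else:
            stack.extend((n["left"], n["right"]))
    return hits
```

On Nvidia the whole loop lives in the RT core; on RDNA2 only the `ray_aabb_hit`-style test does, which is the distinction being argued here.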
Secondly, mesh shaders are still part of the geometry pipeline that encompasses the vertex and geometry shader pipelines. Mesh shading doesn't magically save the GPU from having to use CUs to transform the vertices, texture coordinates and vertex normals of assets from model space through the transform chain to be projected as fragments in the viewport. Mesh shaders just become a different abstract entry point to that same functionality.
Even if GPU extensions have existed for basic geometry culling, geometry culling as a meaningful process has previously been a CPU + shader/compute-shader task, unless we're being pedantic and counting frustum clip-plane culling, which is supposed to be somewhat redundant in a game engine for all but the polygons that straddle the walls of the frustum and trigger primitive subdivision, so the inside part gets kept and the outside part gets culled.
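The engine-side frustum test alluded to above is usually a bounding-volume vs plane check, with only "straddling" volumes needing per-polygon clipping later. A toy sketch, with a plane convention and box layout of my own choosing, not any engine's:

```python
# Toy engine-side frustum culling: classify an AABB against the frustum's
# planes. Fully-outside boxes never reach the GPU; only straddling ones need
# the per-polygon clipping the quote mentions. Plane format is illustrative.

INSIDE, OUTSIDE, STRADDLING = "inside", "outside", "straddling"

def classify_aabb(planes, lo, hi):
    """planes: list of (nx, ny, nz, d), where n.p + d >= 0 means 'in front'.
    Uses the positive/negative-vertex trick: only the two box corners that
    are extremal along each plane normal need testing."""
    result = INSIDE
    for nx, ny, nz, d in planes:
        # corner furthest along the plane normal
        px = hi[0] if nx >= 0 else lo[0]
        py = hi[1] if ny >= 0 else lo[1]
        pz = hi[2] if nz >= 0 else lo[2]
        if nx * px + ny * py + nz * pz + d < 0:
            return OUTSIDE        # even the furthest corner is behind: cull
        # corner nearest along the plane normal
        qx = lo[0] if nx >= 0 else hi[0]
        qy = lo[1] if ny >= 0 else hi[1]
        qz = lo[2] if nz >= 0 else hi[2]
        if nx * qx + ny * qy + nz * qz + d < 0:
            result = STRADDLING   # box crosses this plane: needs clipping
    return result
```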
I should have been clearer about what I was talking about when referring to mesh shaders.
I'm talking about the new GPU pipeline for geometry rendering, introduced with DX12_2.
Previously in DX12, the geometry pipeline had these stages: Input Assembler → Vertex Shader → Hull Shader → Tessellation → Domain Shader → Geometry Shader → Rasterization → Pixel Shader.
But with the new pipeline it's just: Amplification Shader → Mesh Shader → Rasterization → Pixel Shader.
It's a simpler pipeline that reduces overhead and increases geometry throughput significantly.
GPUs have been doing hardware culling, to prevent overdraw, even before the existence of programmable shaders.
Of course it wasn't as advanced as what we have today, but it did offer performance improvements.
In the discussion we were having, I'm talking about the models' BVH representations getting kit-bashed in real time via the BVH units, probably as an async compute shader, while the previous frame's gather, denoise and upscale is taking place in a shader. Cerny said in his Road to PS5 talk that the BVH units can resolve a query while the shader is still running, meaning (AFAIK) that unlike the Series hardware, the PS5 doesn't block use of the texture memory unit while waiting on a BVH query, so it can run both the shader and the BVH work asynchronously via compute.
We already talked about this. That feature is called inline ray tracing.
It's something that the Series X and RDNA2 on PC can do. Even Nvidia's hardware benefited from it, as it reduced contention in the execution pipeline.
BTW, can you point me to where Cerny said that?
As for UE5, it will exploit the custom hardware in the PS5 as part of Sony and Epic's partnership (Sony owns a few percent of Epic). So, if as I suspect the PS5 can kit-bash in real time via the custom geometry engine using the BVH units, then that will be an option for developers, in addition to the default provided by UE5. The hierarchical Z-buffering, by its very name Z-buffering, doesn't cull geometry but fragments, as the Z-buffer operates in the fragment pipeline, unless they are using the terminology loosely as a catchy name for an algorithm. It just sounds like a frustum partition algorithm to improve the accuracy of Z-buffer fragments when projected, i.e. everything is already past frustum culling to be in the rendering process.
But geometry can be culled using a Hierarchical-Z buffer:
That is what the r.HZBOcclusion cvar does in Unreal.
This is not related to UE5, but it's a good example of occlusion culling with Hierarchical-Z.
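To show how geometry (not just fragments) gets culled this way, here's a toy sketch of the idea behind Hierarchical-Z occlusion (everything here, the depth convention, buffer sizes and function names, is my own illustration, not Unreal's implementation of r.HZBOcclusion): reduce the depth buffer into mips that keep the farthest depth per tile, then reject a whole object if its nearest point is behind that conservative depth over its screen rect.

```python
# Toy Hierarchical-Z occlusion culling: build a max-depth mip chain from the
# depth buffer, then test an object's screen rect against a few coarse texels.
# Depth convention: 0 = near, 1 = far. Layout is illustrative only.

def build_hzb(depth, size):
    """depth: flat row-major size*size buffer (size a power of two).
    Each mip halves resolution, storing the max (farthest) depth per 2x2."""
    mips = [depth]
    while size > 1:
        half = size // 2
        prev, mip = mips[-1], []
        for y in range(half):
            for x in range(half):
                mip.append(max(prev[(2 * y) * size + 2 * x],
                               prev[(2 * y) * size + 2 * x + 1],
                               prev[(2 * y + 1) * size + 2 * x],
                               prev[(2 * y + 1) * size + 2 * x + 1]))
        mips.append(mip)
        size = half
    return mips

def is_occluded(mips, base_size, x0, y0, x1, y1, nearest_z):
    """Conservative test: pick a mip where the rect spans at most ~2x2
    texels, take the max stored depth under it, and cull the object if its
    nearest depth is still behind that."""
    extent = max(x1 - x0, y1 - y0, 1)
    level = min(extent.bit_length(), len(mips) - 1)
    size = max(base_size >> level, 1)
    zmax = 0.0
    for ty in {min(y0 >> level, size - 1), min(y1 >> level, size - 1)}:
        for tx in {min(x0 >> level, size - 1), min(x1 >> level, size - 1)}:
            zmax = max(zmax, mips[level][ty * size + tx])
    return nearest_z > zmax
```

The point of the sketch: the test runs per object bounding rect before any of the object's triangles are drawn, which is why a Z-derived structure can cull geometry even though the Z-buffer itself operates on fragments.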