@UnrealEngine Hits are buffered on the CPU and sent to Niagara in batches (up to 64 per frame), so you can absolutely hammer it with hits.
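A rough sketch of that buffering idea in plain C++, not the actual plugin code. The `Hit` struct, the `HitBuffer` class, and the assumption that leftover hits simply carry over to the next frame are all hypothetical; only the cap of 64 per frame comes from the thread.

```cpp
#include <algorithm>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical hit payload; the real fields are not described in the thread.
struct Hit { float u, v, strength; };

// Batch cap taken from the thread: at most 64 hits go out per frame.
constexpr std::size_t kMaxHitsPerFrame = 64;

class HitBuffer {
public:
    // Gameplay code can push as many hits as it likes, at any time.
    void Push(const Hit& h) { pending_.push_back(h); }

    // Call once per frame: removes and returns up to kMaxHitsPerFrame hits,
    // oldest first. ASSUMPTION: anything left over waits for the next frame.
    std::vector<Hit> DrainBatch() {
        const std::size_t n = std::min(pending_.size(), kMaxHitsPerFrame);
        const auto end = pending_.begin() + static_cast<std::ptrdiff_t>(n);
        std::vector<Hit> batch(pending_.begin(), end);
        pending_.erase(pending_.begin(), end);
        return batch;
    }

    std::size_t Pending() const { return pending_.size(); }

private:
    std::deque<Hit> pending_;
};
```

Each frame's batch would then be handed to whatever uploads it to the GPU; that upload step is left out here because the thread doesn't describe it.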
Takes ~0.6ms for a 1K render target on a GTX 980. Next improvements: prioritise updates within the camera view, and atlas the targets for different mesh sections.
A custom in-editor tool is used to bake the acceleration data from the content browser.
Specify the mesh, choose the sampling regions and UV set, and it packs accordingly. Even shows a preview now :)
There is an upper limit of 65,536 tris, but it can be increased to 262,144 by reducing the precision of the quantized barycentric data.
Alternatively you can compute pixel barycentric position at runtime, and have >4 million triangles. Good luck skinning a mesh like that though...
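To show where limits like these can come from, here is a sketch of one possible 32-bit texel layout. The layout is an assumption, not something confirmed in the thread: the triangle index sits in the high bits and two quantized barycentrics (u, v) split the remaining bits, with w recovered as 1 - u - v. Under that assumption, 16 index bits give 65,536 tris with 8-bit barycentrics, and 18 index bits give 262,144 tris with 7-bit barycentrics. `PackTexel` and `UnpackTexel` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

struct Unpacked { std::uint32_t tri; float u, v; };

// ASSUMED layout: [ tri : triBits ][ u : baryBits ][ v : baryBits ],
// where baryBits = (32 - triBits) / 2. triBits should be even (e.g. 16 or 18).
std::uint32_t PackTexel(std::uint32_t tri, float u, float v, unsigned triBits) {
    assert(triBits >= 2 && triBits <= 30 && triBits % 2 == 0);
    assert(tri < (1u << triBits));
    const unsigned baryBits = (32u - triBits) / 2u;
    const std::uint32_t baryMax = (1u << baryBits) - 1u;
    // Quantize u and v from [0, 1] to integers in [0, baryMax].
    const auto qu = static_cast<std::uint32_t>(std::lround(u * static_cast<float>(baryMax)));
    const auto qv = static_cast<std::uint32_t>(std::lround(v * static_cast<float>(baryMax)));
    return (tri << (2u * baryBits)) | (qu << baryBits) | qv;
}

Unpacked UnpackTexel(std::uint32_t texel, unsigned triBits) {
    const unsigned baryBits = (32u - triBits) / 2u;
    const std::uint32_t baryMax = (1u << baryBits) - 1u;
    const std::uint32_t tri = texel >> (2u * baryBits);
    const std::uint32_t qu = (texel >> baryBits) & baryMax;
    const std::uint32_t qv = texel & baryMax;
    return { tri,
             static_cast<float>(qu) / static_cast<float>(baryMax),
             static_cast<float>(qv) / static_cast<float>(baryMax) };
}
```

Moving from 16 to 18 index bits costs one bit per barycentric here, so the worst-case quantization error roughly doubles (from about 1/510 to about 1/254).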
Frustratingly, it's impossible to store an R32_UINT texture in Unreal as an asset because of silly API decisions, so I have to use R32_FLOAT for the acceleration map and the asuint/asfloat HLSL intrinsics.
Could fix it by modifying the engine, but I'm trying to restrict this to a plugin format.
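For the bake side of that workaround, a CPU equivalent of asuint/asfloat is a plain bit copy. This is a minimal sketch with hypothetical helper names (`AsFloat`, `AsUint`), not the tool's actual code: the packed integer is written into the float texel unchanged, and the shader would reinterpret it back with `asuint`.

```cpp
#include <cstdint>
#include <cstring>

// Same 32 bits, viewed as a float (CPU analogue of HLSL asfloat()).
// memcpy is the well-defined way to reinterpret bits in C++.
float AsFloat(std::uint32_t bits) {
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Same 32 bits, viewed as an unsigned integer (analogue of HLSL asuint()).
std::uint32_t AsUint(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

This only works if nothing touches the values in between: the texture would need to stay uncompressed and be read without filtering, since any interpolation blends the floats and scrambles the packed bits.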