Guillaume Boissé Profile picture
Oct 5, 2022 14 tweets 3 min read Read on X
there seems to be quite a bit of interest for this so thought I’d do a small breakdown:)
so there is essentially 2 steps to the algorithm.
first the scene is voxelized into cells storing the index of the closest triangle for each overlapping voxel.
this is done entirely in compute using an optimized version of voxel pipe by @jpantaleoni.
the optimisation mainly revolves around not using a full-on gpu sorting primitive (radix sort i believe in the original paper?).
but rather scan the per-cell counters and scatter in a single re-order pass using the return value of the atomic increment operation.
still the main idea is the same, split tris into tiles, which are further split into virtual tiles or subtiles.
this allows performing all the voxel blending in LDS (i.e., on chip), without the need for global atomics.
ends up roughly 3 to 5x faster than HW conservative raster.
the blending itself uses 64-bit LDS atomic min with the distance to center on the top 32 bits and triangle index on the bottom ones.
the final pass resolves the subtiles into tiles in registers and simply writes the results (i.e., triangle indices) out to the 3D texture.
then, the 2nd part fills the grid with distance values.
the 1st step is a single pass 3x3x3 JFA (separated) to recover the narrow band distances around the geometry.
at this point, the triangle indices (used as the JFA seeds) are discarded in favor of 16-bit distance values.
finally, 3x passes of fast sweeping back and forth are executed to fill the rest of the grid.
from this point, we add "scissor volumes" to the 3D "rasterizer" and implement what i called a cascaded voxel clipmap.
i.e., only re-draw the missing bits of the field as the camera animates and use toroidal addressing within each cascade.
this avoids having to copy the data.
the voxelization + distance field generation process runs only over the scissor volumes, which is much faster, but the fast sweeping passes still run over the whole texture, to "connect" the new slices to the rest of the cascade.
technically, you'd also need to do this propagation across cascades, but there's a traversal trick to avoid having to do this:)
finally, we time-slice the updates by updating the 1st cascade every even frame and the other cascades alternatively every odd frame, so there's only at most one cascade being updated per frame (or none).
for dynamic objects, you need to re-run the whole pipeline (although you can also time-slice it if you're okay with some amounts of latency) and perform a union of the fields with a simple min operator.
for traversal, we simply start at the finest cascade, march the ray until we hit something (then, we're done) or until we leave the volume and then iterate to the next available cascade.
the trick is that when fetching a distance value that'd make us leave the volume, we iterate to the next cascade without advancing the ray.
this enables that each cascade can be calculated independently of the others and of geometry that would lie outside their bounding volume.

• • •

Missing some Tweet in this thread? You can try to force a refresh
 

Keep Current with Guillaume Boissé

Guillaume Boissé Profile picture

Stay in touch and get notified when new unrolls are available from this author!

Read all threads

This Thread may be Removed Anytime!

PDF

Twitter may remove this content at anytime! Save it as PDF for later use!

Try unrolling a thread yourself!

how to unroll video
  1. Follow @ThreadReaderApp to mention us!

  2. From a Twitter thread mention us with a keyword "unroll"
@threadreaderapp unroll

Practice here first or read more on our help page!

Did Thread Reader help you today?

Support us! We are indie developers!


This site is made by just two indie developers on a laptop doing marketing, support and development! Read more about the story.

Become a Premium Member ($3/month or $30/year) and get exclusive features!

Become Premium

Don't want to be a Premium member but still want to support us?

Make a small donation by buying us coffee ($5) or help with server cost ($10)

Donate via Paypal

Or Donate anonymously using crypto!

Ethereum

0xfe58350B80634f60Fa6Dc149a72b4DFbc17D341E copy

Bitcoin

3ATGMxNzCUFzxpMCHL5sWSt4DVtS8UqXpi copy

Thank you for your support!

Follow Us!

:(