Stefan Werner
Aug 21, 2018, 10 tweets, 4 min read
Back of the envelope calculation:
RTX 2080Ti: 10GRay/s @ 616GB/s mem bandwidth = 61 bytes/Ray
1 triangle, 3x float3 vertices of 32-bit floats: 48 bytes (assuming each vertex is padded to 16 bytes; tightly packed it would be 36 bytes)
61 - 48 = 13 bytes left for BVH traversal

That would be under an ideal BVH that requires only 1 ray-triangle intersection per ray.
Compressed wide BVH (…) requires 80 bytes per BVH node. A balanced BVH8 over 1 million triangles is 7 levels deep, so we're looking at 80 bytes * 7 = 560 bytes of processed data per ray. Times ten gigarays/s = 5.6 TB/s of bandwidth just for BVH traversal.
#Volta #V100 has 12-14TB/s shared memory bandwidth (…), so 10GRays/s are plausible if most of the data fits in L1 cache/shared mem.
V100 has 80 SMs with 128KB L1/shared mem each, a total of 10MB. 10MB isn't enough to fit a 7-level-deep BVH8.
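The budget above can be reproduced in a few lines. This is only a sketch of the same arithmetic; the constant names are made up for illustration, and the 48-byte triangle size is taken from the estimate above as-is.

```python
# Back-of-the-envelope memory budget per ray, using the figures quoted above.
RAYS_PER_SECOND = 10e9     # 10 GRay/s claimed for the RTX 2080 Ti
MEM_BANDWIDTH = 616e9      # 616 GB/s global memory bandwidth
TRIANGLE_BYTES = 48        # per-triangle size used in the estimate above

# Bytes of global memory traffic available to each ray at full throughput.
bytes_per_ray = MEM_BANDWIDTH / RAYS_PER_SECOND

# What remains for BVH traversal after fetching a single triangle.
left_for_traversal = bytes_per_ray - TRIANGLE_BYTES

print(f"{bytes_per_ray:.1f} bytes/ray, {left_for_traversal:.1f} left for traversal")
```

The unrounded figures are 61.6 and 13.6 bytes; the thread truncates them to 61 and 13.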
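A quick way to check the depth and bandwidth figures, assuming a perfectly balanced tree that visits exactly one node per level. The helper `bvh_depth` is a hypothetical name, not from any library.

```python
def bvh_depth(num_triangles, branching=8):
    """Levels needed for a balanced tree with the given branching factor."""
    depth, capacity = 0, 1
    while capacity < num_triangles:
        capacity *= branching
        depth += 1
    return depth

NODE_BYTES = 80            # compressed wide BVH node size quoted above
RAYS_PER_SECOND = 10e9     # 10 GRay/s

depth = bvh_depth(1_000_000)                  # 8**6 < 1e6 <= 8**7
bytes_per_ray = NODE_BYTES * depth            # one node visited per level
traversal_bandwidth = bytes_per_ray * RAYS_PER_SECOND

print(depth, bytes_per_ray, traversal_bandwidth / 1e12, "TB/s")
```

Real traversal usually touches more than one node per level, so this is a lower bound on the traffic.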
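To put a number on "not enough", here is a rough size estimate. It assumes every interior node has exactly 8 children and every leaf slot holds one triangle; both are simplifications introduced for this sketch, not properties stated in the thread.

```python
NUM_SMS = 80
L1_PER_SM = 128 * 1024       # 128 KB L1/shared memory per SM
NODE_BYTES = 80              # compressed wide BVH node size
TRIANGLE_BYTES = 48          # per-triangle size used earlier in the thread
NUM_TRIANGLES = 1_000_000

# Assumption: a full 8-ary tree with N leaves has ceil((N - 1) / 7) interior nodes.
num_nodes = -(-(NUM_TRIANGLES - 1) // 7)

l1_total = NUM_SMS * L1_PER_SM
bvh_node_bytes = num_nodes * NODE_BYTES
triangle_bytes = NUM_TRIANGLES * TRIANGLE_BYTES

print(f"L1/shared total: {l1_total / 2**20:.1f} MiB")
print(f"BVH nodes:       {bvh_node_bytes / 2**20:.1f} MiB")
print(f"Triangle data:   {triangle_bytes / 2**20:.1f} MiB")
```

Under these assumptions the nodes alone slightly exceed the combined L1/shared capacity, before any triangle data is counted. The L1 of each SM is also private, so the usable working set per SM is far smaller than the 10MB total.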
Global mem bandwidth is already exhausted with triangle data, and as the die size of #Turing is claimed to be smaller than #Volta at the same process, I don't expect there to be much room for more L1 mem.
I may very well have some errors in my calculations, so please point them out if you see them.
If I'm right though, either @nvidia has some tricks up their sleeve that I don't know of, or 10 GRays/s are only possible with small data sets and coherent rays.
And real-world performance will be limited by memory bandwidth very quickly, leaving a lot of the ray/triangle intersection hardware idle.
But I'm coming from a film standpoint, maybe for game ray tracing, mesh sizes in the 10k poly range are plenty.
Now, ignoring cache size, the incoherent global mem access penalty, etc., and blindly calculating with a 95% cache hit rate:… … claims 1-6kB of mem traffic per ray in Figure 3 - let's say 2kB avg. At a 5% cache miss rate, that's about 100 bytes/ray.
Still significantly more than the 61 bytes/ray that our global memory budget allows under ideal conditions.
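The cache-miss estimate as a small calculation, taking 2 kB as 2000 bytes and treating every miss as a full trip to global memory. Both are simplifying assumptions made for this sketch.

```python
TRAFFIC_PER_RAY = 2000      # bytes, the "2kB avg" guess from the thread
MISS_RATE = 0.05            # 95% cache hit rate
RAYS_PER_SECOND = 10e9      # 10 GRay/s
MEM_BANDWIDTH = 616e9       # 616 GB/s

# Bytes per ray that miss the cache and must come from global memory.
global_bytes_per_ray = TRAFFIC_PER_RAY * MISS_RATE

# Bytes per ray the memory system can deliver at the claimed ray rate.
budget_per_ray = MEM_BANDWIDTH / RAYS_PER_SECOND

# Ray rate the memory system could sustain at this traffic level.
sustainable_rays = MEM_BANDWIDTH / global_bytes_per_ray

print(f"{global_bytes_per_ray:.0f} bytes/ray needed vs {budget_per_ray:.1f} available")
print(f"about {sustainable_rays / 1e9:.1f} GRay/s sustainable")
```

With these inputs the memory system supports roughly 6 GRay/s rather than 10.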
And obviously, at this point no shading has happened yet. If memory bandwidth is the bottleneck for RTX, then ray differentials and mip mapping are a must for optimal performance.
Don't get me wrong, I'm thrilled about the wide availability of ray tracing hardware, and hope that soon we'll see this from other hardware vendors. Can't wait to have hardware under my fingers.


More from @stefan_3d

Jul 25, 2022
With @IntelGraphics Arc GPUs launching, I may or may not be sharing some of my experience developing for them here on Twitter.

Is that something you, my follower, would be interested in?
@IntelGraphics I’d be talking about GPGPU development using SYCL. For APIs like DX or Vulkan, you’ll have to ask someone else.
@IntelGraphics First, the basics: What do you need to get started?

I'm using SYCL to develop for Arc. If you don't have Arc hardware, you can still use SYCL: @IntelSoftware's SYCL compiler supports not only many Intel iGPUs but also Nvidia and AMD GPUs!…
Sep 24, 2020
@andrewpprice They're not the same thing. If you like, I'd be happy to explain the details to you, but it's too long to fit into a tweet.
@andrewpprice In a super dense format: The first use of ray tracing in computer graphics was proposed in 1968, and was focused on solving the visibility problem, not any light bounces.…
@andrewpprice When people in the last century spoke of ray tracing vs rasterisation, they usually meant Turner Whitted's method of computing global illumination through a recursive ray tree.…
Mar 5, 2020
Thanks to Brecht for reviewing my patch, adaptive sampling has now landed in #b3d #Blender #Cycles.
For those interested in details:
It is following the approach outlined in sections 7.1.3 and 7.2 of this paper describing RenderMan:…

The error metric is from section 2.1 here:…
In layman's terms: Every other sample is written to a separate buffer. By comparing this extra buffer to the main image buffer, the renderer can estimate convergence.

Pixels receive progressively more samples until a convergence threshold or sample count limit is reached.
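The idea can be sketched for a single pixel as follows. This is not the Cycles implementation: the function name `render_pixel`, its parameters, and the relative-difference error metric are placeholders chosen for illustration, and the actual metric is the one in the referenced paper.

```python
import random

def render_pixel(sample_fn, threshold=0.01, min_samples=16, max_samples=1024):
    """Sample one pixel until the two buffers agree or the sample limit is hit."""
    total = 0.0   # main buffer: sum of all samples
    half = 0.0    # auxiliary buffer: sum of every other sample
    n = 0
    while n < max_samples:
        s = sample_fn()
        total += s
        if n % 2 == 0:
            half += s
        n += 1
        # Compare only at even counts, when the auxiliary buffer holds n // 2 samples.
        if n >= min_samples and n % 2 == 0:
            mean_all = total / n
            mean_half = half / (n // 2)
            # Placeholder metric (assumption): relative difference of the two means.
            error = abs(mean_all - mean_half) / max(abs(mean_all), 1e-6)
            if error < threshold:
                break
    return total / n, n

# A constant signal converges as soon as the minimum sample count is reached.
print(render_pixel(lambda: 0.5))

# A noisy signal keeps sampling until the estimate settles or the limit is hit.
rng = random.Random(0)
print(render_pixel(rng.random))
```

Noisy pixels end up with more samples than smooth ones, which is the point of the technique.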
Oct 30, 2019
I uploaded the slides from my presentation at #bcon19 about Cycles Volume rendering. They include speaker notes, so hopefully they are useful for those who didn't attend or watch the stream.!Ap47HIkOUU…
Keep in mind that I prepared this talk on the train to Amsterdam, only to redo it almost completely in my hotel room the night before the presentation. Neither the slides nor the sample images in it are polished.
And while I have your attention, I'd like to point to the fantastic presentation by Rob Silvestri @tangent_anim
May 27, 2019
I find this CDU quote revealing for all parties:
"The Greens have been the most successful at trading on fear. They managed to convert the fear of climate change into approval. These fears have [..] surpassed the fears of terror and crime."
The sad thing is that (almost?) all parties want to make us afraid of some bogeyman: nuclear power, migrants, the super-rich, or terror. And then they wonder why voters are so pessimistic.

How about some positive visions of the future for a change? Giving people hope and all that?
May 23, 2019
I try (with limited success) to keep my Twitter account restricted to C++, #b3d, ray tracing, and technology, and to stay out of politics.

Today an exception: a draft to tighten the #WaffG that affects tinkerers ("makers"), DIYers, and tradespeople:
The so-called weapon-free zones are to be extended to city centres, schools, etc. Sounds harmless?
"weapon-like dangerous objects include, for example, household knives, screwdrivers, hammers and other metal or sharp-edged tools or wooden handles"
So if the hackspace in the city centre is two blocks from your flat, "just quickly fetching a screwdriver from home" can already be a violation of the WaffG.…
