Tristan · 11 Oct, 15 tweets, 7 min read
Tesla has added new voxel 3D birdseye view outputs and it's pretty amazing!

Nice of them to start merging some bits of FSD into the normal firmware in 2021.36 so we can play with the perception side 🙂

Thanks to @greentheonly for the help!
Most of the critical FSD bits are missing in the normal firmware. These outputs aren't normally running, but with some tricks we can enable them.

This seems to be the general solution to handling unpredictable scenarios such as the Seattle monorail pillars or overhanging shrubbery.
The nets predict the location of static objects in the space around them via a dense grid of probabilities.

The output is a 384x255x12 dense grid of probabilities. Each cube seems to be ~0.33 meters on a side, and the grid currently covers predictions out to ~100 meters in front of the vehicle.
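To make the numbers concrete, here's a rough sketch (definitely not Tesla's code) of how you could turn that grid into points in space. The axis order, origin placement, and exact cell size are my guesses.

```python
import numpy as np

CELL_M = 0.33                    # assumed edge length of each voxel cube
GRID_SHAPE = (384, 255, 12)      # assumed axis order: (forward, lateral, vertical)

def voxel_centers(prob_grid: np.ndarray, threshold: float = 0.75) -> np.ndarray:
    """Return metric centers (vehicle frame) of voxels whose occupancy
    probability exceeds `threshold`."""
    assert prob_grid.shape == GRID_SHAPE
    idx = np.argwhere(prob_grid > threshold)      # (N, 3) integer grid indices
    xyz = (idx + 0.5) * CELL_M                    # grid index -> meters
    xyz[:, 1] -= GRID_SHAPE[1] * CELL_M / 2       # center the lateral axis on the car
    return xyz

# Smoke test with random data: voxel_centers(np.random.rand(*GRID_SHAPE))
```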
This is similar to the previous single camera depth model, but given the birdseye view treatment.

See our previous tweets on that at:
Here's an example of a post in the middle of a narrow road. fn.lc/s/depthrender/…

Before this, Tesla would have to manually label an obstacle like this as part of the training set to ensure the car doesn't run into it.
Here's a full intersection, outputs seem quite reasonable in all directions. You can see the 4 buildings on each side, the curbs ahead as well as the trees by the side of the road.

fn.lc/s/depthrender/…
Here's a hard median with a post that it correctly identifies. Adds another level of safety to ensure that the car doesn't drive over hard curbs.

This example highlights that the model ignores cars and only shows the static objects. fn.lc/s/depthrender/…
Pretty impressive how much detail it can capture of trees and the landscaping on the side of the road.

fn.lc/s/depthrender/…
Here's the view leaving a parking lot. Clearly distinguishes where the road is vs the T-bone style intersection.

fn.lc/s/depthrender/…
You can check out the raw data at fn.lc/s/depthrender/…

And the corresponding time synced video is at

The uploaded voxel frames are only every half second for practicality reasons (in the car it runs at a much higher FPS).
I suspect they're taking the same offline 3D models they use to label the birdseye view training data (as seen during AI Day) and converting them to voxel data to train a net.

It's a very clever solution, kudos to the engineers who worked on this.
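If that guess is right, the label-generation step could be as simple as rasterizing the reconstructed geometry into an occupancy grid. A toy version (my speculation, reusing the same assumed grid parameters as above):

```python
import numpy as np

def voxelize(points: np.ndarray, grid_shape=(384, 255, 12), cell_m=0.33) -> np.ndarray:
    """points: (N, 3) positions in meters, already in the vehicle-centered
    frame of the output grid. Returns a dense 0/1 occupancy label grid."""
    occupancy = np.zeros(grid_shape, dtype=np.float32)
    idx = np.floor(points / cell_m).astype(int)
    # Drop points that fall outside the grid volume.
    inside = np.all((idx >= 0) & (idx < np.array(grid_shape)), axis=1)
    idx = idx[inside]
    occupancy[idx[:, 0], idx[:, 1], idx[:, 2]] = 1.0
    return occupancy
```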
I'm very curious what the model architecture looks like and how much it differs from the other birdseye view nets.

The 3D convolutional NNs used here are similar to what could potentially be used to merge radar with vision if Tesla can get access to the raw Conti radar data.
The 3D birdseye view is a fair bit lower resolution than LIDAR but very impressive and achieves much of the same purpose.
These models are outputting probabilities so you can see where the model is confident vs not.

I don't quite know what the scale is here, but a 75% threshold seems to work pretty well. For all these renders I only show voxels that are above that threshold.
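Here's the kind of quick check I mean, just sweeping a few cutoffs and counting how many voxels survive (illustrative only):

```python
import numpy as np

def threshold_sweep(prob_grid: np.ndarray, thresholds=(0.5, 0.75, 0.9)) -> None:
    """Print how many voxels would be rendered at each probability cutoff."""
    for t in thresholds:
        print(f"threshold {t:.2f}: {int((prob_grid > t).sum())} voxels kept")

# threshold_sweep(np.random.rand(384, 255, 12))
```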
Rendering voxel data in a browser is pretty tricky so if anyone wants to help with more advanced visualizations let me know :)

I see why Elon said they were having issues visualizing it

More from @rice_fry

12 Apr
We recently got some insight into how Tesla is going to replace radar in the recent firmware updates + some nifty ML model techniques

⬇️ Thread
From the binaries we can see that they've added velocity and acceleration outputs. These predictions in addition to the existing xyz outputs give much of the same information that radar traditionally provides
(distance + velocity + acceleration).
For autosteer on city streets, you need to know the velocity and acceleration of cars in all directions but radar is only pointing forward. If it's accurate enough to make a left turn, radar is probably unnecessary for the most part.
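To illustrate the claim (this is just basic kinematics, not anything pulled from the firmware): given a per-object relative position and velocity from vision, the radar-style quantities fall out directly.

```python
import numpy as np

def radar_like_measurements(rel_pos: np.ndarray, rel_vel: np.ndarray) -> tuple[float, float]:
    """rel_pos / rel_vel: (3,) relative position [m] and velocity [m/s] of a
    tracked object in the ego frame. Returns (range, range rate) like radar gives."""
    rng = float(np.linalg.norm(rel_pos))
    # Range rate = component of relative velocity along the line of sight.
    range_rate = float(np.dot(rel_vel, rel_pos) / max(rng, 1e-6))
    return rng, range_rate

# e.g. radar_like_measurements(np.array([30.0, 1.0, 0.0]), np.array([-5.0, 0.0, 0.0]))
```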
10 Apr
Got a sample of the Tesla Insurance telemetry data. The insurance records are on a per drive basis. Here are the fields:

* Unique Drive ID
* Record Version
* Car Firmware Version
* Driver Profile Name
* Start / End Time
* Drive Duration
* Start / End Odometer

(1/2)
* # of Autopilot Strikeouts
* # of Forward Collision Warnings
* # of Lane Departure Warnings
* # of ABS activations (All & User)
* Time spent within 1s of car in front
* Time spent within 3s of car in front
* Acceleration Variance
* Service Mode
* Delivered

(2/2)
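For a sense of shape, here's that record written out as a data structure. Field names and types are my guesses based on the list above, not the actual wire format.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class InsuranceDriveRecord:
    drive_id: str
    record_version: str
    firmware_version: str
    driver_profile: str
    start_time: datetime
    end_time: datetime
    drive_duration_s: float
    start_odometer_km: float
    end_odometer_km: float
    autopilot_strikeouts: int
    forward_collision_warnings: int
    lane_departure_warnings: int
    abs_activations_all: int
    abs_activations_user: int
    time_within_1s_of_lead_s: float
    time_within_3s_of_lead_s: float
    acceleration_variance: float
    service_mode: bool
    delivered: bool
```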
There's a lot of basic stuff which insurance companies can already get via companion apps/dongles, but there are also deep insights into driver behavior which Tesla can get and others cannot.

I bet a lot of insurance companies would love to get their hands on this kind of data