Phil Harrison
19 Oct, 14 tweets, 7 min read
What is the lauded FSD rewrite @elonmusk is referring to? @karpathy already detailed exactly what it is in Feb 2020 (). Let's discuss...
The vector space @elonmusk mentions is a Software 2.0 rewrite of the occupancy tracker, which is now subsumed into a "Bird's Eye View" neural net that merges all cameras and projects their features into a top-down view. In @karpathy's own words: "This makes a huge difference".
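To make the idea concrete, here is a minimal PyTorch sketch of a multi-camera BEV network: a shared image encoder per camera, a per-camera projection into a common top-down grid, and a single decoder head. The camera count, layer sizes and learned projection are my own illustrative assumptions, not Tesla's actual architecture.

```python
# Minimal sketch of the "Bird's Eye View" fusion idea (illustrative only).
import torch
import torch.nn as nn

class BEVFusionSketch(nn.Module):
    def __init__(self, n_cameras=8, bev_hw=(200, 200), bev_channels=64):
        super().__init__()
        self.n_cameras = n_cameras
        # Shared image encoder applied to every camera view.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=4, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Per-camera learned "projection" from image features to the BEV grid
        # (a stand-in for the geometric/learned transform a real system would use).
        self.to_bev = nn.ModuleList(
            nn.Sequential(
                nn.AdaptiveAvgPool2d((25, 25)),
                nn.Conv2d(64, bev_channels, 1),
                nn.Upsample(size=bev_hw, mode="bilinear", align_corners=False),
            )
            for _ in range(n_cameras)
        )
        # BEV decoder head predicting e.g. road edges, lane dividers, drivable space.
        self.head = nn.Conv2d(bev_channels, 3, 1)

    def forward(self, images):                    # images: (B, n_cameras, 3, H, W)
        bev = 0
        for cam in range(self.n_cameras):
            feats = self.encoder(images[:, cam])
            bev = bev + self.to_bev[cam](feats)   # fuse all cameras in BEV space
        return self.head(bev)                     # (B, 3, 200, 200) top-down map

model = BEVFusionSketch()
out = model(torch.randn(1, 8, 3, 256, 512))
print(out.shape)  # torch.Size([1, 3, 200, 200])
```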
In this image, we really see the step change the rewrite makes. Current FSD (right image) is not able to predict the intersection layout, whereas the rewrite (centre image) almost *perfectly* predicts the exact layout of the intersection when compared to the map data (on the left).
Without the rewrite, FSD cannot take unprotected turns at an intersection because it cannot build an accurate representation of the layout. This is especially apparent near the horizon, where only a few pixels of error create a lot of noise in the distance mapping.
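A quick numeric illustration of why this happens (not Tesla data): under a simple pinhole + flat-ground model, a ground point that appears v pixels below the horizon lies at distance d = f * h / v, where f is the focal length in pixels and h the camera height. The f and h values below are assumed for the example.

```python
# Why per-camera distance estimates get noisy near the horizon (toy numbers).
f_px = 1400.0      # assumed focal length in pixels
cam_h = 1.4        # assumed camera height in metres

def ground_distance(pixels_below_horizon: float) -> float:
    return f_px * cam_h / pixels_below_horizon

for v in (200, 50, 10, 5):
    d = ground_distance(v)
    d_err = ground_distance(v - 1)   # same point, but with one pixel of error
    print(f"{v:>4} px below horizon: {d:6.1f} m -> 1 px error shifts it to {d_err:6.1f} m")
# Near the bottom of the image one pixel barely matters; a few pixels
# from the horizon it moves the estimate by tens of metres.
```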
In the next example, we see how the rewrite is able to perfectly predict not only the layout of the intersection but also its features: dividers, traffic flow and drivable space, along with what the car already understands such as traffic light controls, stop lines, etc. Truly remarkable.
The next example seemingly shows a massive improvement to Smart Summon. The car is now able to re-project almost the entire car park layout in bird's eye view and accurately predict corridors, intersections and traffic flow. This is all output directly from the FSD rewrite.
Having this "long term" planning ability based on a Neural Net output (i.e. FSD rewrite) means things like reverse smart summon become possible as the car can wind its way through the environment building a consistent topology as it moves.
Next, we learn a bit more about what @elonmusk means when he says "labelling in 3D space". There is now a dedicated NN to predict the depth of each pixel, essentially simulating LiDAR, and that "point cloud" is then used to classify objects in 3D space over time.
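This "pseudo-LiDAR" step is just back-projection: take a per-pixel depth map plus the camera intrinsics and lift every pixel into a 3D point. The intrinsics and the random depth map below are placeholders, not values from the actual system.

```python
# Per-pixel depth map -> 3D point cloud via the pinhole camera model.
import numpy as np

def depth_to_point_cloud(depth: np.ndarray, fx, fy, cx, cy) -> np.ndarray:
    """Back-project a (H, W) depth map into an (H*W, 3) array of XYZ points."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))   # pixel coordinates
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)

depth = np.random.uniform(2.0, 80.0, size=(480, 640))   # fake depth in metres
cloud = depth_to_point_cloud(depth, fx=1400.0, fy=1400.0, cx=320.0, cy=240.0)
print(cloud.shape)  # (307200, 3): one 3D point per pixel, ready for 3D object reasoning
```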
This again shows that the benefit of LiDAR is quickly evaporating, as vision-based neural nets can produce a point-cloud representation of reality with enough accuracy for autonomous vehicles. This improvement is continual thanks to the self-supervised training mode.
Self-supervised training involves predicting depth, checking how closely the prediction matches reality in the subsequent frame, then adjusting the weights and trying again. Repeat this billions of times and the NN model starts to be able to estimate depth per pixel very accurately.
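Here is a compact sketch of that loop in the spirit of "monodepth"-style self-supervision, not Tesla's actual code: predict depth for frame t, warp frame t+1 back into frame t using the depth and an ego-motion estimate, and penalise the photometric difference. The tiny depth network, intrinsics and pose below are all toy assumptions.

```python
# Self-supervised depth sketch: photometric consistency between video frames.
import torch
import torch.nn as nn
import torch.nn.functional as F

def warp_frame(src, depth, K, K_inv, T):
    """Inverse-warp src (frame t+1) into frame t using the depth predicted at frame t."""
    b, _, h, w = src.shape
    v, u = torch.meshgrid(torch.arange(h, dtype=torch.float32),
                          torch.arange(w, dtype=torch.float32), indexing="ij")
    pix = torch.stack([u, v, torch.ones_like(u)], dim=0).reshape(3, -1)   # (3, H*W)
    cam = K_inv @ pix * depth.reshape(b, 1, -1)                # back-project to 3D
    cam_h = torch.cat([cam, torch.ones(b, 1, h * w)], dim=1)   # homogeneous coords
    proj = K @ (T @ cam_h)[:, :3]                              # project into frame t+1
    uv = proj[:, :2] / proj[:, 2:3].clamp(min=1e-6)
    grid = torch.stack([uv[:, 0] / (w - 1) * 2 - 1,            # normalise for grid_sample
                        uv[:, 1] / (h - 1) * 2 - 1], dim=-1).reshape(b, h, w, 2)
    return F.grid_sample(src, grid, align_corners=True)

depth_net = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                          nn.Conv2d(16, 1, 3, padding=1), nn.Softplus())
opt = torch.optim.Adam(depth_net.parameters(), lr=1e-4)

K = torch.eye(3).unsqueeze(0); K[:, 0, 0] = K[:, 1, 1] = 100.0   # toy intrinsics
K[:, 0, 2], K[:, 1, 2] = 32.0, 24.0
T = torch.eye(4).unsqueeze(0); T[:, 0, 3] = 0.1                  # toy ego-motion
frame_t, frame_t1 = torch.rand(1, 3, 48, 64), torch.rand(1, 3, 48, 64)

for step in range(3):                        # in practice: huge fleets of video clips
    depth = depth_net(frame_t) + 0.1         # predicted per-pixel depth
    recon = warp_frame(frame_t1, depth, K, torch.linalg.inv(K), T)
    loss = (recon - frame_t).abs().mean()    # photometric check against reality
    opt.zero_grad(); loss.backward(); opt.step()
    print(f"step {step}: photometric loss {loss.item():.4f}")
```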
Finally, the most interesting part has to do with the policy (i.e. how to drive) component of FSD. In @karpathy's own words, "This policy is still in the land of (software) 1.0". However, the only way to achieve true autonomy is to train a NN to build the policy internally.
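For a sense of what a learned policy could look like, here is a minimal behaviour-cloning sketch: a small network maps the perception output (a flattened toy BEV tensor) to steering/acceleration targets recorded from human driving. This is a generic illustration of the concept, not a description of Tesla's actual planner.

```python
# Behaviour cloning: imitate logged human controls from the perception output.
import torch
import torch.nn as nn

policy = nn.Sequential(
    nn.Flatten(),
    nn.Linear(3 * 20 * 20, 256), nn.ReLU(),
    nn.Linear(256, 2),                       # outputs: [steering, acceleration]
)
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

# Toy "dataset": BEV snapshots paired with the human driver's controls.
bev_batch = torch.randn(32, 3, 20, 20)
human_controls = torch.randn(32, 2)

for step in range(3):
    pred = policy(bev_batch)
    loss = nn.functional.mse_loss(pred, human_controls)   # match the human driver
    opt.zero_grad(); loss.backward(); opt.step()
    print(f"step {step}: imitation loss {loss.item():.4f}")
```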
Hand-labelling every example for the NN to train on is onerous, to say the least. This is where Dojo may come in: it allows a few examples to be seeded and then uses "self supervision" training to iterate over raw video of examples to create an optimised model.
To be clear, I don't think the imminent FSD beta release will include the neural-net-based policy, but the perception NN described above, in conjunction with Dojo, certainly sets the stage for this to occur.
