Congrats to #FSD team for the great progress!
Our Supervision system shares the camera-centric approach, but differs in some key elements. 1) Crowd-based HD map vs. SD map 2) Math-based vs. simulation-based driving policy 3) Multiple redundant systems vs. a single one
0/n
1/n Humans drive better when they are familiar with the road ahead. Furthermore, it is better to solve problems offline than to solve them online. Offline has more compute, knowledge of the future, optimal weather conditions, and the ability to validate quality.
2/n We can plan well in advance for curves ahead, or for occluded areas in an intersection. We can use the rear camera when there is low sun in front, and still know the lanes ahead. But REM maps are much more than that.
3/n With REM maps, we adjust driving style to the crowd behavior in each geographic region. This is a key aspect of generalizing our system to so many different places.
4/n Our driving policy approach is unique. In a nutshell, we specify transparent assumptions about the behavior of other road users, and then analytically calculate the worst case. That is, we use math formulas instead of simulating many possible futures.
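To make "calculate the worst case with a formula" concrete, here is a minimal sketch. The thread does not spell out the formula, so this assumes something like the longitudinal safe-distance rule from the published Responsibility-Sensitive Safety (RSS) model: the rear car accelerates as hard as allowed during its response time and then brakes only gently, while the front car brakes as hard as allowed. The function name, parameter names, and numbers are hypothetical and chosen for illustration only.

```python
def worst_case_safe_gap(v_rear, v_front, response_time,
                        a_max_accel, b_min_brake, b_max_brake):
    """Closed-form minimum gap (meters) between a rear and a front car.

    Assumed (illustrative) bounds on behavior, speeds in m/s, rates in m/s^2:
      - the rear car may accelerate at up to a_max_accel during response_time,
        then brakes with at least b_min_brake until it stops;
      - the front car brakes with at most b_max_brake until it stops.
    No futures are simulated; the worst case is a single expression.
    """
    # Rear car speed at the end of its response time, in the worst case.
    v_rear_after = v_rear + response_time * a_max_accel

    # Worst-case distance covered by the rear car: reaction phase + gentle braking.
    rear_travel = (v_rear * response_time
                   + 0.5 * a_max_accel * response_time ** 2
                   + v_rear_after ** 2 / (2.0 * b_min_brake))

    # Shortest possible stopping distance of the front car (hardest braking).
    front_travel = v_front ** 2 / (2.0 * b_max_brake)

    # A negative value means any gap is safe, so clamp at zero.
    return max(0.0, rear_travel - front_travel)


# Hypothetical numbers: both cars at 20 m/s, 1 s response time.
gap = worst_case_safe_gap(v_rear=20.0, v_front=20.0, response_time=1.0,
                          a_max_accel=2.0, b_min_brake=4.0, b_max_brake=8.0)
print(f"required gap: {gap:.1f} m")
```

The point of the sketch is the shape of the computation: every assumption about other road users is an explicit parameter, and the answer is one expression rather than the output of many simulated rollouts.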
5/n For details on driving policy see:
6/n Tesla's perception system is based on one big "hydranet". It is a great solution. But, it is *one* great solution. There are many other great solutions. We believe in redundancy. Every piece of our system is solved by more than one approach.
7/n We use e2e deep networks as well as decomposable methods, and even good old computer vision. Every single solution will suffer from diminishing returns at some point. Multiple redundant approaches can cover for each other.