End-to-end approach to self-driving πŸŽ₯ πŸ•ΈοΈ πŸ•ΉοΈ

I recently wrote about the classical software architecture for a self-driving car. The end-to-end approach is an interesting alternative.

The idea is to go directly from images to the control commands.

Let me tell you more... 👇
This approach is actually very old, dating back to 1989 and CMU's ALVINN model. It was a 3-layer neural network that used camera images and laser range finder data as input.

Again, this was back in 1989... 🀯

papers.nips.cc/paper/1988/fil…
A modern example is Nvidia's PilotNet - a Convolutional Neural Network with about 250 thousand parameters, which takes the raw camera image as input and directly predicts the steering angle of the car.

No explicit lane boundary or freespace detection needed!

arxiv.org/abs/1604.07316
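As a quick sanity check on that model size, here is a back-of-the-envelope parameter count for a PilotNet-style architecture (layer shapes as described in the Nvidia paper; treat this as a sketch, not the exact official implementation):

```python
# Back-of-the-envelope parameter count for a PilotNet-style CNN.
# Conv layers as (kernel_size, in_channels, out_channels); the first
# three use stride 2, the last two stride 1.
conv = [(5, 3, 24), (5, 24, 36), (5, 36, 48), (3, 48, 64), (3, 64, 64)]

# After the conv stack on a 66x200 input the feature map is 1x18x64,
# which flattens into the fully-connected head: (units_in, units_out).
fc = [(1 * 18 * 64, 100), (100, 50), (50, 10), (10, 1)]

# weights + biases per layer
total = sum(k * k * cin * cout + cout for k, cin, cout in conv)
total += sum(n_in * n_out + n_out for n_in, n_out in fc)

print(f"{total:,}")  # about a quarter million parameters
```

That is tiny by modern standards - small enough to run comfortably on in-car hardware.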
How do you train such a network?

Easy! When a human drives the car, we can record the camera images and the actual steering angle as ground truth πŸ€·β€β™‚οΈ

The network then learns to predict a steering angle similar to what the human driver chose - this is called imitation learning.
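As a toy illustration of this setup (behavior cloning - not Nvidia's actual pipeline), here is a minimal sketch in plain Python. A linear model with two hand-made features stands in for the CNN, and the "human driver" is a simple function we try to clone from logged (observation, steering) pairs:

```python
import random

random.seed(0)

# Pretend human driving policy: steering from road curvature and lane offset.
# (A hypothetical stand-in for a real driver; the coefficients are arbitrary.)
def human_steering(curvature, offset):
    return 2.0 * curvature - 0.5 * offset

# "Drive" and log training pairs, exactly as described above: record
# the observation and the steering angle the human actually chose.
data = []
for _ in range(200):
    c, o = random.uniform(-1, 1), random.uniform(-1, 1)
    data.append(((c, o), human_steering(c, o)))

# Behavior cloning: minimize squared error between predicted and human
# steering with plain SGD (a CNN would replace this linear model).
w_c, w_o, b, lr = 0.0, 0.0, 0.0, 0.1
for _ in range(100):  # epochs
    for (c, o), target in data:
        err = (w_c * c + w_o * o + b) - target
        w_c -= lr * err * c
        w_o -= lr * err * o
        b -= lr * err

print(round(w_c, 2), round(w_o, 2))  # recovers roughly 2.0 and -0.5
```

The learned weights converge to the "human" policy - which is exactly the point: the model imitates whatever the demonstrations contain, including any bad habits of the driver.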
Take a look at this video to see the system in action. Around the 8-minute mark you can see what the network actually "sees".

Now, to be clear - this is not really a full self-driving car, but only a model that does the steering! There is much more you need to do to actually let the car drive by itself:

β–ͺ️ Longitudinal control (acceleration and breaking)
β–ͺ️ Lane changes
β–ͺ️ Emergency maneuvers
A similar approach is implemented by Comma AI in their newest version of Openpilot.

They train a version of EfficientNet combined with a network that predicts the trajectory the car needs to drive.

blog.comma.ai/end-to-end-lat…
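To turn a predicted trajectory back into an actual steering command, one common trick is pure-pursuit control: pick a lookahead point on the predicted path and steer along the circle that passes through it. A minimal sketch (just an illustration of the idea, not necessarily what Openpilot does; the wheelbase value is assumed):

```python
import math

WHEELBASE = 2.7  # metres; a typical passenger-car value, assumed here

def pure_pursuit_steering(waypoints, lookahead=10.0):
    """Waypoints are (x, y) points in the car frame (x forward, y left),
    e.g. the trajectory predicted by the network."""
    # First predicted point at least `lookahead` metres away (or the last one).
    x, y = next((p for p in waypoints if math.hypot(p[0], p[1]) >= lookahead),
                waypoints[-1])
    # Curvature of the circle through the origin and the lookahead point,
    # converted to a front-wheel angle via a simple bicycle model.
    curvature = 2.0 * y / (x * x + y * y)
    return math.atan(WHEELBASE * curvature)  # radians

straight = [(float(i), 0.0) for i in range(1, 21)]
left_turn = [(float(i), 0.05 * i * i) for i in range(1, 21)]
print(pure_pursuit_steering(straight))   # 0.0 on a straight path
print(pure_pursuit_steering(left_turn))  # positive angle -> steer left
```

Predicting a whole trajectory instead of a single steering angle makes the output easier to check and to combine with other modules - a small step away from pure end-to-end.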
The good βœ…

The advantage of end-to-end networks is that they can be small and efficient - the network focuses only on solving the final task and doesn't compute any intermediate representations.

Collecting training data is fairly easy as well - no manual labeling needed!
The bad ❌

The disadvantage is that it is very difficult to understand why the network makes certain mistakes - there are no explicit intermediate representations, like lane boundaries, to inspect.

It will also need lots of data to cover all possible scenarios on the road.
There is an interesting paper by Prof. Shashua, CEO of Mobileye, arguing that in order to reach very high accuracy, end-to-end methods will require exponentially more training samples than more modular approaches.

arxiv.org/abs/1604.06915
The truth lies somewhere in the middle...

There are now approaches that try to combine the advantages of both the modular and the end-to-end approaches. I recommend watching this great talk by Prof. Raquel Urtasun from Uber ATG.

Read more about the classical software architecture for self-driving cars in my other thread:

If you liked this thread and want to read more about self-driving cars and machine learning follow me @haltakov!

I have many more threads like this planned πŸ˜ƒ

β€’ β€’ β€’

Missing some Tweet in this thread? You can try to force a refresh
γ€€

Keep Current with Vladimir Haltakov

Vladimir Haltakov Profile picture

Stay in touch and get notified when new unrolls are available from this author!

Read all threads

This Thread may be Removed Anytime!

PDF

Twitter may remove this content at anytime! Save it as PDF for later use!

Try unrolling a thread yourself!

how to unroll video
  1. Follow @ThreadReaderApp to mention us!

  2. From a Twitter thread mention us with a keyword "unroll"
@threadreaderapp unroll

Practice here first or read more on our help page!
