What are typical challenges when training deep neural networks ⁉️

▪️ Overfitting
▪️ Underfitting
▪️ Lack of training data
▪️ Vanishing gradients
▪️ Exploding gradients
▪️ Dead ReLUs
▪️ Network architecture design
▪️ Hyperparameter tuning

How to solve them 👇

Overfitting 🐘

Your model performs well during training, but poorly during testing.

Possible solutions:
- Reduce the size of your model
- Add more data
- Increase dropout
- Stop the training early
- Add regularization to your loss
- Decrease batch size
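
Stopping the training early can be sketched in a few lines of plain Python. This is only a minimal sketch: the function name `early_stopping` and the parameters `patience` and `min_delta` are assumptions for illustration, not part of any specific framework.

```python
def early_stopping(val_losses, patience=3, min_delta=0.0):
    """Return (best_epoch, stop_epoch) for a list of per-epoch validation losses.

    Training stops once the validation loss has not improved by more than
    `min_delta` for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            # New best validation loss: remember it and reset the counter.
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return best_epoch, epoch
    # Never ran out of patience: training used all epochs.
    return best_epoch, len(val_losses) - 1
```

For the losses `[1.0, 0.8, 0.7, 0.72, 0.75, 0.9, 1.1]` and `patience=3`, this returns `(2, 5)`: the best model is the one from epoch 2, and training stops at epoch 5.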

Underfitting 🐁

Your model performs poorly during both training and testing.

Possible solutions:
- Increase the size of your model
- Add more data, if the current data is too limited or noisy to learn from
- Train for a longer time
- Start with a pre-trained network
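
To tell overfitting and underfitting apart, you can compare the training and test error. Here is a rough sketch. The function `diagnose_fit` and the thresholds `acceptable_error` and `max_gap` are hypothetical values chosen for illustration, so pick ones that make sense for your problem.

```python
def diagnose_fit(train_error, test_error, acceptable_error=0.10, max_gap=0.05):
    """Classify a model as underfitting, overfitting or ok from two error rates.

    The thresholds are illustrative assumptions, not universal constants.
    """
    if train_error > acceptable_error:
        # Poor results already on the training data.
        return "underfitting"
    if test_error - train_error > max_gap:
        # Good on training data, clearly worse on test data.
        return "overfitting"
    return "ok"


print(diagnose_fit(train_error=0.02, test_error=0.25))
print(diagnose_fit(train_error=0.30, test_error=0.32))
```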

Lack of training data 🤷‍♂️

Deep learning algorithms are hungry for data compared to classical ML methods.

Possible solutions:
- Get more data (😂)
- Use data augmentation
- Use a pre-trained network and fine-tune it for your problem
- Try transfer learning
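
Data augmentation creates new training samples from the ones you already have. A minimal sketch for images stored as nested lists, using a random horizontal flip and a random crop. The helper names (`horizontal_flip`, `random_crop`, `augment`) are made up for this example. In practice you would probably use the augmentation tools of your framework.

```python
import random


def horizontal_flip(image):
    """Mirror an image (a list of rows) left to right."""
    return [row[::-1] for row in image]


def random_crop(image, crop_h, crop_w, rng):
    """Cut a random crop_h x crop_w window out of the image."""
    height, width = len(image), len(image[0])
    top = rng.randint(0, height - crop_h)   # randint includes both ends
    left = rng.randint(0, width - crop_w)
    return [row[left:left + crop_w] for row in image[top:top + crop_h]]


def augment(image, crop_h, crop_w, rng):
    """Flip with probability 0.5, then take a random crop."""
    if rng.random() < 0.5:
        image = horizontal_flip(image)
    return random_crop(image, crop_h, crop_w, rng)


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
rng = random.Random(0)  # fixed seed so the example is repeatable
print(augment(image, 2, 2, rng))
```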

Vanishing Gradients 📉

During training, the gradients in the first layers become very small or 0. Learning is slow, or the net doesn't learn at all.

Possible solutions:
- Use ReLU, which doesn't saturate in the positive direction
- Add residual/skip connections
- Batch normalization
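
Why does ReLU help? During backpropagation the derivatives of the activations get multiplied layer by layer. The sketch below is a strong simplification: it assumes all weights are 1 and the same pre-activation value in every layer, and the function names are made up for this example.

```python
import math


def sigmoid_grad(x):
    """Derivative of the sigmoid function; never larger than 0.25."""
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)


def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0


def chained_gradient(activation_grad, pre_activation, num_layers):
    """Product of activation derivatives along a chain of layers (weights fixed to 1)."""
    grad = 1.0
    for _ in range(num_layers):
        grad *= activation_grad(pre_activation)
    return grad


print(chained_gradient(sigmoid_grad, 0.0, 10))  # shrinks with every layer
print(chained_gradient(relu_grad, 1.0, 10))     # stays at 1.0 for positive inputs
```

Even in the best case for the sigmoid (input 0, derivative 0.25), ten layers scale the gradient by 0.25 multiplied ten times, roughly 9.5e-07.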

Exploding Gradients 📈

Gradients become too big and training becomes unstable.

Possible solutions:
- Decrease the learning rate
- Use saturating activation functions, like sigmoid or tanh (at the cost of a higher risk of vanishing gradients)
- Gradient clipping
- Batch normalization
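
Gradient clipping is easy to sketch. This version clips by the global L2 norm over a flat list of gradient values. The function name `clip_by_global_norm` and the flat-list representation are assumptions for this example. Deep learning frameworks ship their own clipping utilities.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Rescale the gradients so that their L2 norm is at most max_norm."""
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_norm or total_norm == 0.0:
        # Small enough already: return an unchanged copy.
        return list(grads)
    scale = max_norm / total_norm
    return [g * scale for g in grads]


print(clip_by_global_norm([3.0, 4.0], max_norm=1.0))
```

The gradient `[3.0, 4.0]` has norm 5, so it gets rescaled to norm 1 while keeping its direction. Gradients that are already small enough pass through unchanged.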

Dead ReLUs 😵

When using ReLU as the activation function, a large gradient update can push the weights of certain neurons so far that they output 0 for all input data. Their gradient is then 0 as well, so they stop learning and become useless.

Possible solutions:
- Decrease learning rate
- Use Leaky ReLU, ELU or some other variant
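
The difference between ReLU and Leaky ReLU in code. This is a minimal sketch: the slope of 0.01 for negative inputs is a common default, but treat it as an assumption and tune it for your problem.

```python
def relu(x):
    """Standard ReLU: negative inputs are cut to 0."""
    return max(0.0, x)


def relu_grad(x):
    """Gradient of ReLU: 0 for negative inputs, so a dead neuron gets no updates."""
    return 1.0 if x > 0 else 0.0


def leaky_relu(x, negative_slope=0.01):
    """Leaky ReLU: negative inputs are scaled down instead of cut off."""
    return x if x > 0 else negative_slope * x


def leaky_relu_grad(x, negative_slope=0.01):
    """Gradient of Leaky ReLU: small but non-zero for negative inputs."""
    return 1.0 if x > 0 else negative_slope


print(relu_grad(-2.0), leaky_relu_grad(-2.0))
```

Because the gradient for negative inputs is not 0, a neuron that got knocked into the negative range can still recover.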

Network Architecture Design 🕸️

Which layers, and how many of them? How many neurons? Designing a network architecture is often not trivial.

Possible approaches:
- Find existing architectures for similar problems
- Trial and error
- Use Neural Architecture Search (en.wikipedia.org/wiki/Neural_ar…)

Hyperparameter Tuning 🔧

Finding the right hyperparameters for your problem is not easy.

Possible approaches:
- Check out existing parametrizations for similar problems
- Inspect your data and play around
- Grid search
- Bayesian optimization
- Use AutoML
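
Grid search is the simplest of these to sketch: try every combination and keep the best one. In the example below, `toy_validation_loss` is a hypothetical stand-in for "train a model with these hyperparameters and return its validation loss". The parameter names and grid values are made up as well.

```python
import itertools


def grid_search(evaluate, grid):
    """Evaluate every combination in the grid and return (best_params, best_score).

    `evaluate` takes a dict of hyperparameters and returns a loss (lower is better).
    """
    names = list(grid)
    best_params, best_score = None, float("inf")
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_validation_loss(params):
    # Hypothetical objective, used only to make the example runnable.
    return (params["learning_rate"] - 0.01) ** 2 + (params["dropout"] - 0.3) ** 2


grid = {"learning_rate": [0.1, 0.01, 0.001], "dropout": [0.1, 0.3, 0.5]}
print(grid_search(toy_validation_loss, grid))
```

Keep in mind that the number of combinations grows quickly with every hyperparameter you add, which is one reason to look at Bayesian optimization for larger search spaces.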

If you want to get better at training neural networks, this article by @karpathy is a must-read!

karpathy.github.io/2019/04/25/rec…

This is my answer to my question in the Tweet below. This list is not complete, so make sure you also check out the replies below. There are many other very interesting answers!

Thread by Vladimir Haltakov (@haltakov)


