Tweet

Jean de Nyandwi

7 Nov, 12 tweets, 4 min read

Machine Learning weekly highlights 💡

◆3 threads from me
◆3 threads from others
◆2 news from the ML communities

3 POSTS FROM ME

This week, I explained Tom Mitchell's classical definition of machine learning, why it is hard to train neural networks, and talked about some recipes for training and debugging neuralnets.

https://twitter.com/Jeande_d/status/1455872282899877894?s=20

Here is the meaning of Tom's definition of machine learning

https://twitter.com/Jeande_d/status/1455872282899877894?s=20

https://twitter.com/Jeande_d/status/1455486459876569091

Here is the reason why it is hard to train neural networks

https://twitter.com/Jeande_d/status/1455486459876569091

https://twitter.com/Jeande_d/status/1456590365222260744?s=20

And here are recipes for training and debugging neural networks

https://twitter.com/Jeande_d/status/1456590365222260744?s=20

@gusthema

2 POSTS FROM OTHERS

@gusthema had a great thread about working with large files using Pandas 🐼

https://twitter.com/gusthema/status/1456607188277936132?s=20

@haltakov

Another amazing thread that I came across was about creating arts with machine learning by @haltakov.

https://twitter.com/haltakov/status/1455982555610636291?s=20

@JanaSunrise

Git is an important tool for machine learning engineers.

@JanaSunrise made a great thread about it.

https://twitter.com/JanaSunrise/status/1456478904869277705?s=20

@fchollet

2 NEWS FROM THE ML COMMUNITY

TensorFlow 2.7 is now released. There are new updates and changes about tf.keras, TF Core API, TensorFlow Datasets, and TensorFlow Lite.

TF 2.7 also brings new debugging experiences. @fchollet had a great thread about it.

https://twitter.com/fchollet/status/1456393128756191233?s=20

@sama

The last news from the community is from OpenAI. They trained a system to solve math problems.

Their system can "solve about 90% as many problems as real kids." They released the paper, some samples, and datasets.

More from @sama

https://twitter.com/sama/status/1454191236210839555?s=20

That's it from the machine learning community this week.

I plan to keep doing these weekly highlights and in the upcoming weeks, I will also add more news from other channels beyond Twitter.

@Jeande_d

Thanks for reading!

Follow @Jeande_d for more machine learning content and you are welcome to share the news with other people who you think might benefit from it.

Until the next week, stay safe!

• • •

Missing some Tweet in this thread? You can try to force a refresh

This Thread may be Removed Anytime!

Twitter may remove this content at anytime! Save it as PDF for later use!

More from @Jeande_d

Jean de Nyandwi

@Jeande_d

11 Nov

The initial loss value that you should expect to get when using softmax activation in the last layer of the neural network:

Initial loss = ln(number_of_classes), ln being a natural logarithm.

Example:

last_layer = api.layers.dense(10, activation='softmax')

# number of classes = 10
initial_loss = ln(10) #2.302

Understanding this is important when it comes to debugging the network. If you see a loss of 4.5 when you have 10 classes, there is something wrong.

Also, the reported loss on the first training epoch is the average loss of the whole batch.

Thus, you may instead get the initial loss less than ln(number_of_classes) because you are training in batches. And it is a good thing.

Read 4 tweets

Jean de Nyandwi

@Jeande_d

10 Nov

The below illustration shows early stopping, one of the effective and simplest regularization techniques used in training neural networks.

A thread on the idea behind early stopping, why it works, and why you should always use it...🧵

Usually, during training, the training loss will decrease gradually, and if everything goes well on the validation side, validation loss will decrease too.

When the validation loss hits the local minimum point, it will start to increase again. Which is a signal of overfitting.

How can we stop the training just right before the validation loss rise again? Or before the validation accuracy starts decreasing?

That's the motivation for early stopping.

With early stopping, we can stop the training when there are no improvements in the validation metrics.

Read 15 tweets

Jean de Nyandwi

@Jeande_d

5 Nov

One of the things that makes training neural networks hard is the number of choices that we have to make before & during training.

Here is a training guideline covering:

◆Architectural choice
◆Activations
◆Losses
◆Optimizers
◆Batch size
◆Training & debugging recipes

🧵🧵

1. ARCHITECTURAL CHOICE

The choice of neural network architecture is primarily guided by data and the problem at hand.

Unless you are researching a new architecture, here are the popular conventions:

◆Tabular data: Feedforward networks (or Multi-layer perceptrons)
◆Images: 2D Convolutional neural networks (Convnets), Vision-transformers(ongoing research)

Read 28 tweets

Jean de Nyandwi

@Jeande_d

2 Nov

Why is so hard to train neural networks?

Neural networks are hard to train. The more they go deep, the more they are likely to suffer from unstable gradients.

A thread 🧵🧵

Gradients can either explode or vanish, and neither of those is a good thing for the training of our network.

The vanishing gradients problem results in the network taking too long to train(learning will be very slow), and the exploding gradients cause the gradients to be very large.

Read 9 tweets

Jean de Nyandwi

@Jeande_d

1 Nov

How to think about precision and recall:

Precision: What is the percentage of positive predictions that are actually positive?

Recall: What is the percentage of actual positives that were predicted correctly?

The fewer false positives, the higher the precision. Vice-versa.

The fewer false negatives, the higher the recall. Vice-versa.

How do you increase precision? Reduce false positives.

It can depend on the problem, but generally, that might mean fixing the labels of those negative samples(being predicted as positives) or adding more of them in the training data.

Read 11 tweets

Jean de Nyandwi

@Jeande_d

28 Oct

All in just one repository:

◆Data visualization with Matplotlib & Seaborn
◆Data preprocessing with Pandas
◆Classical machine learning with Scikit-Learn: From linear models, trees, ensemble models to PCA
◆Neural networks with TensorFlow & Keras: ConvNets, RNNs, BERT, etc...

You can get all of the above here:

github.com/Nyandwi/machin…

View the notebooks easily here:

nbviewer.org/github/Nyandwi…

Read 5 tweets

Support us! We are indie developers!

This site is made by just two indie developers on a laptop doing marketing, support and development! Read more about the story.

Become a Premium Member ($3/month or $30/year) and get exclusive features!

Become Premium

Too expensive? Make a small donation by buying us coffee ($5) or help with server cost ($10)

Donate via Paypal

Thank you for your support!

Share this page!

Jean de Nyandwi

Try unrolling a thread yourself!

More from @Jeande_d

Jean de Nyandwi

Jean de Nyandwi

Jean de Nyandwi

Jean de Nyandwi

Jean de Nyandwi

Jean de Nyandwi

Did Thread Reader help you today?

Like this author's thread?