Gradient Descent is great, but it comes with a whole bunch of problems.
Getting stuck in a local minimum while exploring the solution space is one of the major ones.

A possible Solution?

SIMULATED ANNEALING

Here's a little something about it 🧵👇
The method of Simulated Annealing in Optimization is analogous to the process of Annealing in Metallurgy ⚗️🔥, hence the name.
We get stuck in a local minimum because we always accept whatever looks best in the short term. We only move in the downward direction ⬇️ (along the negative gradient) and never upwards ⬆️

So once we reach a point which is low but not the lowest, we may end up getting stuck.
🔹Either we can just stay there and accept the current state as the solution, which is not very effective

🔹Or we can temporarily take a step in the upward direction, which can get us out of the local minimum

And obviously enough, we go with the second option.
To get out of a local minimum, we consider a random step in some direction.

If it improves the solution, we always accept it. If not, we still accept it with a certain probability.

So we are temporarily accepting a bad move, but it makes things better in the long run.
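Here's a minimal sketch of that acceptance rule in Python (the classic Metropolis criterion; this is an illustration, not code from the thread, and `delta` is assumed to be the change in cost, new minus current):

```python
import math
import random

def accept_move(delta, temperature):
    """Decide whether to accept a move whose cost change is `delta`."""
    if delta <= 0:
        # The move improves (or at least doesn't worsen) the cost: always accept.
        return True
    # A worse move: accept it anyway with probability exp(-delta / T),
    # so a bigger "badness" or a lower temperature makes acceptance less likely.
    return random.random() < math.exp(-delta / temperature)
```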
T, the temperature 🌡️ parameter, determines how bad a move we are willing to accept.

The higher the temperature, the more likely we are to accept a bad solution or a step upwards, and vice versa.
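To put rough numbers on that, here's a tiny illustration using the acceptance probability exp(-delta / T) from the sketch above (delta and the temperatures are made-up values, just to show the trend):

```python
import math

delta = 2.0  # a hypothetical increase in cost, i.e. a "bad" move

for temperature in (0.5, 1.0, 5.0, 20.0):
    p = math.exp(-delta / temperature)
    print(f"T = {temperature:5.1f} -> accept the +{delta} move with probability {p:.3f}")
```

The same bad move goes from almost never accepted at low T to accepted most of the time at high T.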
It doesn't end there: we also need a schedule for decreasing the value of the temperature parameter.

Can you guess the reason❓

To make sure that when we do get near the global minimum, we are much less likely to accept a worse solution and jump back out of it.
There are multiple methods we can use to schedule the temperature reduction.
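A few commonly used cooling schedules, sketched below (T0 is the starting temperature, k the iteration number; alpha, eta and c are placeholder constants you would tune for your problem):

```python
import math

def exponential_schedule(T0, k, alpha=0.95):
    """Geometric / exponential decay: T_k = T0 * alpha**k, with 0 < alpha < 1."""
    return T0 * alpha ** k

def linear_schedule(T0, k, eta=0.01):
    """Linear decay: knock off a fixed amount each step (floored just above zero)."""
    return max(T0 - eta * k, 1e-8)

def logarithmic_schedule(c, k):
    """Very slow decay, T_k = c / log(k + 2); c is a problem-dependent constant."""
    return c / math.log(k + 2)
```

Whatever the schedule, the idea is the same: start hot so the search can roam freely, end cold so it settles into the best region it has found.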
This is how SIMULATED ANNEALING helps Gradient Descent do better than usual.

Hope you enjoyed this!

We will see how to overcome some more problems of Gradient Descent.
