Sachin Kumar
May 11 · 10 tweets · 4 min read
Day 58 of #100DayswithMachineLearning

Topic - Stochastic Gradient Descent (SGD)

A Thread 🧵
SGD is an optimization algorithm often used in machine learning applications to find the model parameters that correspond to the best fit between predicted and actual outputs. It’s an inexact but powerful technique.
A saddle point (or minimax point) is a point on the surface of the graph of a function where the slopes (derivatives) in orthogonal directions are all zero (a critical point), but which is not a local extremum of the function.

A saddle point (in red) on the graph of z = x² − y² (a hyperbolic paraboloid)
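As a quick numeric check (a small sketch of my own, not from the thread): the gradient of z = x² − y² vanishes at the origin, yet the origin is neither a minimum nor a maximum.

```python
import numpy as np

# f(x, y) = x^2 - y^2, the hyperbolic paraboloid pictured above
def f(x, y):
    return x**2 - y**2

def grad_f(x, y):
    # Analytic gradient: (df/dx, df/dy) = (2x, -2y)
    return np.array([2 * x, -2 * y])

print(grad_f(0.0, 0.0))           # both slopes are zero -> critical point
print(f(0.1, 0.0), f(0.0, 0.1))   # 0.01 and -0.01 -> neither min nor max
```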
The cost (or loss) function is the function to be minimized (or maximized) by varying the decision variables.

It measures the difference between actual and predicted outputs; training adjusts the model parameters (such as the weights and biases of a neural network, or the decision rules of a random forest or gradient boosting model) to minimize it.
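For concreteness, one common choice of loss is mean squared error; here is a minimal sketch (the sample values are my own illustration, not from the thread):

```python
import numpy as np

def mse(y_true, y_pred):
    # Mean squared error: the average of the squared differences
    return np.mean((y_true - y_pred) ** 2)

y_true = np.array([3.0, -0.5, 2.0])
y_pred = np.array([2.5, 0.0, 2.0])
print(mse(y_true, y_pred))  # 0.1666... = (0.25 + 0.25 + 0.0) / 3
```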
Stochastic Gradient Descent (SGD) is a variant of the Gradient Descent algorithm used for optimizing machine learning models. At each iteration, only one randomly chosen training example is used to calculate the gradient and update the parameters.
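A minimal sketch of this update rule, fitting a line y = 2x + 1 to synthetic data (the data, variable names, and learning rate are my own assumptions, not from the thread):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: y = 2x + 1 plus a little noise
X = rng.uniform(-1, 1, size=200)
y = 2 * X + 1 + 0.1 * rng.standard_normal(200)

w, b = 0.0, 0.0   # model parameters
lr = 0.05         # learning rate
for epoch in range(20):
    for i in rng.permutation(len(X)):   # one random example per update
        err = w * X[i] + b - y[i]
        # Gradient of the squared error for this single example
        w -= lr * 2 * err * X[i]
        b -= lr * 2 * err

print(w, b)  # should approach (2, 1)
```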
Advantages:

Speed: SGD is faster than other variants of Gradient Descent, such as Batch Gradient Descent and Mini-Batch Gradient Descent, since it uses only one example per parameter update.

Memory efficiency: only one training example needs to be held in memory at a time, so SGD scales to datasets that do not fit in memory.

Avoidance of local minima: the noise in single-example updates can help the optimizer escape shallow local minima.
Disadvantages:

Noisy updates: the updates in SGD are noisy and have high variance, which can make the optimization process less stable and lead to oscillations around the minimum.

Slow convergence: because each update is based on a single example, many more iterations are typically needed to converge than with batch methods.

Sensitivity to the learning rate: a rate that is too large can make the iterates diverge, while one that is too small makes progress very slow (see the toy example after this list).

Less accurate: the noisy path means SGD often settles near, rather than exactly at, the minimum.
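To make the learning-rate sensitivity concrete, here is a toy illustration of my own (not from the thread) on f(w) = w², whose gradient is 2w:

```python
# Gradient descent on f(w) = w^2 with the update w <- w - lr * 2w.
# Each step multiplies the iterate by (1 - 2*lr), so it diverges
# whenever |1 - 2*lr| > 1, i.e. for lr > 1.
for lr in (0.1, 0.6, 1.1):
    w = 1.0
    for _ in range(10):
        w -= lr * 2 * w
    print(f"lr={lr}: w after 10 steps = {w:.3g}")
# lr=0.1 shrinks steadily, lr=0.6 oscillates in sign but still
# converges, lr=1.1 blows up
```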
This cycle of taking the predicted values, measuring the error, and propagating it backwards through the network to adjust the parameters and reduce the loss function is called back-propagation.
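As a minimal sketch of that backward pass, here is back-propagation through a single sigmoid neuron with a squared-error loss (the numbers and variable names are my own illustration):

```python
import numpy as np

x, y_true = 0.5, 1.0   # one training example
w, b = 0.3, 0.1        # parameters to learn

# Forward pass
z = w * x + b
y_hat = 1 / (1 + np.exp(-z))       # sigmoid activation
loss = (y_hat - y_true) ** 2

# Backward pass: chain rule from the loss back to the parameters
dL_dyhat = 2 * (y_hat - y_true)
dyhat_dz = y_hat * (1 - y_hat)     # derivative of the sigmoid
dL_dw = dL_dyhat * dyhat_dz * x
dL_db = dL_dyhat * dyhat_dz
print(dL_dw, dL_db)                # gradients used by the SGD update
```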
Source: @geeksforgeeks
geeksforgeeks.org/ml-stochastic-…
Mini-Batch Gradient Descent: parameters are updated after computing the gradient of the error with respect to a subset (mini-batch) of the training set.
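Reusing the synthetic line-fitting setup from the SGD sketch above, a mini-batch version averages the single-example gradients over each subset (the batch size of 32 is an arbitrary choice of mine):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=200)
y = 2 * X + 1 + 0.1 * rng.standard_normal(200)

w, b, lr, batch = 0.0, 0.0, 0.05, 32
for epoch in range(50):
    idx = rng.permutation(len(X))           # shuffle once per epoch
    for start in range(0, len(X), batch):
        sl = idx[start:start + batch]       # indices of one mini-batch
        err = w * X[sl] + b - y[sl]
        # Average gradient of the squared error over the mini-batch
        w -= lr * 2 * np.mean(err * X[sl])
        b -= lr * 2 * np.mean(err)

print(w, b)  # should approach (2, 1)
```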

@github Notebook

github.com/sachinkumar160…
If this thread was helpful to you:

1. Follow me @Sachintukumar for daily content

2. Connect with me on LinkedIn: linkedin.com/in/sachintukum…

3. RT the tweet below to share it with your friends
