May 15 • 12 tweets • 4 min read

Topic - Ridge Regression in ML ( Part 1 )

🧵 Ridge Regression (RR) is a regularization technique used in statistical modeling & ML to handle the problem of multicollinearity (high correlation) among predictor variables
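
A minimal sketch of the idea, assuming just two predictors and no intercept: ridge adds a penalty `lam` to the diagonal of X^T X before solving, which keeps the system solvable even when the predictors are perfectly correlated. The function name `ridge_two_features` and the toy data are illustrative and not taken from the thread.

```python
def ridge_two_features(X, y, lam):
    """Solve (X^T X + lam * I) w = X^T y for exactly two features (no intercept)."""
    # Entries of the 2x2 matrix X^T X, with lam added on the diagonal.
    a = sum(x1 * x1 for x1, _ in X) + lam
    b = sum(x1 * x2 for x1, x2 in X)
    d = sum(x2 * x2 for _, x2 in X) + lam
    # Entries of X^T y.
    p = sum(x1 * t for (x1, _), t in zip(X, y))
    q = sum(x2 * t for (_, x2), t in zip(X, y))
    det = a * d - b * b
    if det == 0:
        raise ValueError("singular system: predictors are perfectly collinear")
    # Cramer's rule for the 2x2 system.
    return ((p * d - b * q) / det, (a * q - b * p) / det)


# Toy data (assumed): x2 is an exact copy of x1, and y = 2 * x1.
X = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
y = [2.0, 4.0, 6.0]

w_ridge = ridge_two_features(X, y, lam=1.0)
```

With `lam=0.0` this toy system is singular and the function raises `ValueError`; with `lam=1.0` the weight is shared evenly between the two duplicated predictors (28/29 each), slightly shrunk below the unpenalized total of 2.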

May 15 • 8 tweets • 3 min read

Here is a simple concept roadmap for learning SQL as a complete beginner:

🧵 1. Learn the Basics:

- Primary Key vs Foreign Key

- Data Types

- Database diagrams

- Tables

- Records and Fields

- Naming standards for tables and fields
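
To make the primary key vs foreign key item concrete, here is a small sketch using Python's built-in `sqlite3` module. The `customers` and `orders` tables and their columns are hypothetical, and the behaviour shown is SQLite's (which only enforces foreign keys after the `PRAGMA` below).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default

# Hypothetical tables: each order row must point at an existing customer.
conn.execute(
    "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, full_name TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, "
    "customer_id INTEGER NOT NULL REFERENCES customers(customer_id), "
    "amount REAL)"
)

conn.execute("INSERT INTO customers VALUES (1, 'Asha')")
conn.execute("INSERT INTO orders VALUES (100, 1, 25.5)")  # valid: customer 1 exists

try:
    conn.execute("INSERT INTO orders VALUES (101, 99, 10.0)")  # no customer 99
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True

order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

The primary key identifies each row within its own table; the foreign key in `orders` is what lets the database reject the order that refers to a customer that does not exist.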

May 14 • 12 tweets • 3 min read

Topic - Bias Variance Trade-off in ML

🧵 🔹 If an ML model is not accurate, it makes prediction errors, & these prediction errors are usually described in terms of Bias & Variance

🔹 In ML these errors will always be present, as there is always a slight difference between model predictions & actual values
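
One way to see the two error sources, as a rough sketch: take the predictions that several models (trained on different samples) make for the same point and split their mean squared error into squared bias and variance. The function name and the numbers are assumptions chosen for illustration.

```python
def bias_variance(predictions, true_value):
    """Split the mean squared error of repeated predictions into bias^2 and variance."""
    n = len(predictions)
    mean_pred = sum(predictions) / n
    bias_sq = (mean_pred - true_value) ** 2                          # systematic offset
    variance = sum((p - mean_pred) ** 2 for p in predictions) / n    # spread of predictions
    mse = sum((p - true_value) ** 2 for p in predictions) / n
    return bias_sq, variance, mse


# Assumed example: three models predict the same point, whose true value is 3.0.
bias_sq, variance, mse = bias_variance([2.0, 4.0, 6.0], true_value=3.0)
```

For these numbers the squared bias is 1, the variance is 8/3, and the two add up to the mean squared error of 11/3.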

May 13 • 12 tweets • 3 min read

Topic - Polynomial Regression in ML

🧵 Polynomial regression is a type of regression analysis where the relationship between the independent variable(s) and the dependent variable is modeled as an nth-degree polynomial function.

It is an extension of simple linear regression, which assumes a linear relationship between the variables
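
A small sketch of what this means in practice, assuming one independent variable: the model is still linear in its coefficients, so it can be fitted with ordinary least squares on the powers of x. `fit_polynomial` is a hypothetical helper written with the standard library only, and the data is a noise-free toy example.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit; returns coefficients [c0, c1, ..., c_degree]."""
    m = degree + 1
    # Normal equations A c = b, where A[i][j] = sum(x^(i+j)) and b[i] = sum(y * x^i).
    A = [[sum(x ** (i + j) for x in xs) for j in range(m)] for i in range(m)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, m):
            factor = A[row][col] / A[col][col]
            for k in range(col, m):
                A[row][k] -= factor * A[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * m
    for i in reversed(range(m)):
        tail = sum(A[i][j] * coeffs[j] for j in range(i + 1, m))
        coeffs[i] = (b[i] - tail) / A[i][i]
    return coeffs


# Assumed toy data generated from y = 1 + 2x + 3x^2 (no noise).
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [1 + 2 * x + 3 * x ** 2 for x in xs]
coeffs = fit_polynomial(xs, ys, degree=2)
```

On this data the fit recovers the coefficients 1, 2 and 3 up to floating-point error; with `degree=1` the same function reduces to simple linear regression.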

May 13 • 6 tweets • 2 min read

A Thread 🧵 The CONCAT_WS() function in SQL is used to concatenate multiple strings into a single string with a specified separator between each string

"WS" stands for "with separator." This function is commonly used to construct strings that contain multiple values, such as a comma-separated list

May 12 • 12 tweets • 4 min read

Topic - Mini-Batch Gradient Descent

A Thread 🧵 Mini-batch gradient descent is a variation of the gradient descent optimization algorithm used in ML & DL

It is designed to address the limitations of two other variants: Batch Gradient Descent (BGD) and Stochastic Gradient Descent (SGD)
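
A minimal sketch, assuming a one-feature linear model trained on mean squared error: each update uses the average gradient of a small batch, which sits between using the whole dataset and using a single example. The learning rate, batch size and toy data are assumptions, not values from the thread.

```python
import random


def minibatch_gd(xs, ys, lr=0.02, batch_size=2, epochs=2000, seed=0):
    """Fit y = w * x + b by mini-batch gradient descent on mean squared error."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # new batch composition every epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            # Average the gradient over the current mini-batch only.
            grad_w = sum(2 * (w * xs[i] + b - ys[i]) * xs[i] for i in batch) / len(batch)
            grad_b = sum(2 * (w * xs[i] + b - ys[i]) for i in batch) / len(batch)
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b


# Assumed noise-free toy data from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = minibatch_gd(xs, ys)
```

Because the toy data has no noise, the estimates settle very close to w = 2 and b = 1; on real data the batch size trades gradient noise against cost per update.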

May 11 • 4 tweets • 2 min read

May 11 • 10 tweets • 4 min read

Topic - Stochastic Gradient Descent ( SGD )

A Thread 🧵 SGD is an optimization algorithm often used in machine learning applications to find the model parameters that correspond to the best fit between predicted and actual outputs. It's an inexact but powerful technique.
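
As a rough sketch of why it is called inexact: SGD updates the parameters after every single example, so each step follows a noisy estimate of the true gradient. The example assumes a one-feature linear model with squared error; the learning rate and data are illustrative.

```python
import random


def sgd(xs, ys, lr=0.02, epochs=2000, seed=0):
    """Fit y = w * x + b, updating the parameters after every single example."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit the examples in a fresh random order
        for i in order:
            error = w * xs[i] + b - ys[i]
            # Gradient of the squared error of this one example.
            w -= lr * 2 * error * xs[i]
            b -= lr * 2 * error
    return w, b


# Assumed noise-free toy data from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = sgd(xs, ys)
```

On this noise-free data the parameters approach w = 2 and b = 1; with noisy data and a constant learning rate they would keep jittering around the best fit instead of settling exactly.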

May 10 • 14 tweets • 5 min read

Topic - Batch Gradient Descent (BGD)

A Thread 🧵 Batch Gradient Descent (BGD) is an optimization algorithm commonly used in ML & optimization problems to minimize the cost function or maximize the objective function

It is a type of GD algorithm that updates the model parameters using the average gradient over the entire training dataset at each iteration
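
The update rule described above can be sketched as follows, assuming a one-feature linear model and mean squared error as the cost function. The learning rate, iteration count and data are assumptions for illustration.

```python
def batch_gd(xs, ys, lr=0.05, iterations=2000):
    """Fit y = w * x + b using the average gradient over the whole dataset."""
    n = len(xs)
    w, b = 0.0, 0.0
    for _ in range(iterations):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # One gradient per iteration, averaged over every training example.
        grad_w = sum(2 * e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(2 * e for e in errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Assumed noise-free toy data from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = batch_gd(xs, ys)
```

Every iteration touches all examples, so the path to w = 2, b = 1 is smooth and deterministic, at the price of a full pass over the data per update.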

Apr 30 • 6 tweets • 3 min read

Topic -- Principal Component Analysis (PCA) Part 1

PCA is the statistical technique of analyzing all the dimensions of a dataset & reducing them as much as possible while preserving as much of the information (variance) as possible

You can explore multi-dimensional data (by visualizing it in 2D or 3D) on any platform using the Principal Component Method of factor analysis.
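
A minimal sketch of the reduction step, assuming only two input dimensions so that the eigen-decomposition of the covariance matrix has a closed form. `pca_2d` is a hypothetical helper; a real project would normally use a numerical library instead.

```python
import math


def pca_2d(points):
    """Return (unit direction, explained-variance ratio, 1-D scores) of the first component."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Sample covariance matrix [[a, b], [b, c]] of the centered data.
    a = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    c = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    b = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    # Largest eigenvalue of a symmetric 2x2 matrix, in closed form.
    top = (a + c) / 2 + math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    if b != 0:
        direction = (b, top - a)  # eigenvector belonging to the largest eigenvalue
    else:
        direction = (1.0, 0.0) if a >= c else (0.0, 1.0)
    norm = math.hypot(*direction)
    unit = (direction[0] / norm, direction[1] / norm)
    # Project the centered points onto the first principal component.
    scores = [(x - mean_x) * unit[0] + (y - mean_y) * unit[1] for x, y in points]
    return unit, top / (a + c), scores


# Assumed toy data lying exactly on the line y = 2x.
unit, explained, scores = pca_2d([(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)])
```

Because these points lie on a straight line, a single component explains all of the variance, so the two columns can be replaced by the one column of `scores` without losing anything.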

Apr 29 • 6 tweets • 4 min read

If you are someone who is learning SQL, then this list can be helpful to you.

SQL - END-TO-END Learning Resources and Guide ( Must Read )

1. SQL for Data Science

lnkd.in/dw4aAC-q

2. Databases and SQL for Data Science with Python

lnkd.in/d2psKJd9

Apr 29 • 6 tweets • 3 min read

Topic -- Curse of Dimensionality

🧵 Refers to the phenomenon where the performance of ML algorithms deteriorates as the No. of dimensions or features of the input data increases

This is because the volume of the space increases exponentially with the No. of dimensions, which causes the data to become sparse & the distance between data points to increase
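
One way to put numbers on this, as a sketch: compare the volume of the unit ball with the volume of the cube [-1, 1]^d that encloses it. The share of the cube that lies near the center collapses as d grows, so uniformly spread points end up far apart in the corners. The function name is an assumption for illustration.

```python
import math


def inscribed_ball_fraction(d):
    """Fraction of the cube [-1, 1]^d that lies inside the unit ball."""
    ball_volume = math.pi ** (d / 2) / math.gamma(d / 2 + 1)
    cube_volume = 2.0 ** d  # grows exponentially with the number of dimensions
    return ball_volume / cube_volume


fractions = {d: inscribed_ball_fraction(d) for d in (1, 2, 3, 10, 20)}
```

The fraction is 1 in one dimension, about 0.785 in two, about 0.524 in three and roughly 0.0025 in ten, which is why a fixed number of samples covers the space ever more thinly.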

Apr 28 • 10 tweets • 3 min read

Topic - Feature Construction & Feature Splitting

A Thread 🧵 Feature construction is a critical aspect of feature engineering, which involves the process of creating new features or transforming existing ones to improve the performance of machine learning models.
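
A short sketch of both ideas on assumed passenger-style records: construction combines existing columns into a new one, and splitting breaks one text column into several. The column names and sample rows are hypothetical.

```python
# Hypothetical passenger records; the column names are illustrative only.
rows = [
    {"name": "Braund, Mr. Owen Harris", "siblings_spouses": 1, "parents_children": 0},
    {"name": "Heikkinen, Miss. Laina", "siblings_spouses": 0, "parents_children": 0},
]

for row in rows:
    # Feature construction: combine two existing columns into a new one.
    row["family_size"] = row["siblings_spouses"] + row["parents_children"] + 1
    # Feature splitting: break one text column into several parts.
    surname, rest = row["name"].split(",", 1)
    title, given_names = rest.split(".", 1)
    row["surname"] = surname.strip()
    row["title"] = title.strip()
    row["given_names"] = given_names.strip()
```

After the loop the first row carries `family_size` 2 and `title` "Mr", both of which a model can use more directly than the raw name string.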

Apr 27 • 4 tweets • 2 min read

Apr 27 • 8 tweets • 2 min read

🧵 🎯 Are NULL values the same as zero or a blank space?

🔺 A NULL value is not at all the same as zero or a blank space.

🔺 A NULL value represents a value which is unavailable, unknown, unassigned or not applicable, whereas zero is a number and a blank space is a character.
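
The difference can be checked with Python's built-in `sqlite3` module. This is a sketch of SQLite's behaviour only (some other databases treat empty strings differently), and the `readings` table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Comparing NULL with anything yields NULL (Python None), not true or false.
null_eq_zero, null_eq_blank, null_is_null, zero_eq_blank = conn.execute(
    "SELECT NULL = 0, NULL = ' ', NULL IS NULL, 0 = ' '"
).fetchone()

# Aggregates skip NULL but count zero and blank like any other value.
conn.execute("CREATE TABLE readings (reading)")
conn.executemany("INSERT INTO readings VALUES (?)", [(None,), (0,), (" ",)])
total_rows, non_null_rows = conn.execute(
    "SELECT COUNT(*), COUNT(reading) FROM readings"
).fetchone()
```

`NULL = 0` and `NULL = ' '` both come back as `None`, only `NULL IS NULL` is true, and `COUNT(reading)` reports 2 of the 3 rows because the NULL is ignored.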

Apr 27 • 8 tweets • 3 min read

Day 44 of #100dayswithmachinelearning

Topic -- Outlier Detection using Percentile Method

A Thread 🧵 Outliers are a very important and crucial aspect of Data Analysis.

They can be treated in different ways, such as trimming, capping, discretization, or by treating them as missing values.
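
Trimming and capping with percentile bounds can be sketched as below. The 1st and 99th percentile cut-offs are a common choice assumed here, not values given in the thread, and `percentile_bounds` is a hypothetical helper built on the standard library.

```python
import statistics


def percentile_bounds(values, lower_pct=1, upper_pct=99):
    """Return the values at the given lower and upper percentiles."""
    cuts = statistics.quantiles(values, n=100, method="inclusive")  # 99 cut points
    return cuts[lower_pct - 1], cuts[upper_pct - 1]


# Assumed toy data: 1..99 plus one extreme value.
data = list(range(1, 100)) + [1000]
low, high = percentile_bounds(data)

trimmed = [v for v in data if low <= v <= high]   # trimming: drop the outliers
capped = [min(max(v, low), high) for v in data]   # capping: clamp to the bounds
```

Trimming shortens the dataset (the value 1000 disappears), while capping keeps all 100 rows and pulls the extreme value down to the upper bound.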

Apr 26 • 8 tweets • 3 min read

Topic - Outlier Detection and Removal using the IQR Method

A Thread 🧵 The IQR (Interquartile Range) method is a common approach for detecting and removing outliers from a dataset

IQR is the difference between the 75th and 25th percentiles (the third and first quartiles)

The IQR method can also remove bad data from left- or right-skewed distributions, since it does not assume the data is normally distributed
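
A minimal sketch of the method, assuming the conventional fences at 1.5 times the IQR below Q1 and above Q3 (the multiplier is not stated in the thread). The helper name and data are illustrative.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Split values into (kept, outliers) using fences at Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    kept = [v for v in values if lower <= v <= upper]
    outliers = [v for v in values if v < lower or v > upper]
    return kept, outliers, (lower, upper)


# Assumed toy data with one obvious outlier.
data = [10, 12, 12, 13, 12, 11, 14, 13, 15, 10, 100]
kept, outliers, fences = iqr_outliers(data)
```

For this data Q1 is 11.5 and Q3 is 13.5, so the fences are 8.5 and 16.5 and only the value 100 is flagged.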

Apr 25 • 7 tweets • 3 min read

Topic -- Outlier Detection & Removal using Z-score Method

A Thread 🧵 The Z-score method is a statistical approach used for detecting & removing outliers in a dataset. An outlier is an observation that lies far away from the other observations in the dataset. Such observations can significantly affect the statistical properties of the dataset & lead to erroneous conclusions
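
A short sketch of the method, assuming the usual cut-off of three standard deviations from the mean (the threshold is an assumption, not a value from the thread) and roughly normal data.

```python
import statistics


def zscore_outliers(values, threshold=3.0):
    """Return the values whose absolute z-score exceeds the threshold."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    return [v for v in values if abs((v - mean) / std) > threshold]


# Assumed toy data: twenty values near 10 and one far away.
data = [9, 10, 11, 10, 9, 11, 10, 10, 9, 11,
        10, 10, 11, 9, 10, 10, 9, 11, 10, 10, 60]
outliers = zscore_outliers(data)
cleaned = [v for v in data if v not in outliers]
```

Only the value 60 is flagged here. Note that the outlier itself inflates the mean and standard deviation, so with very small samples a z-score above 3 may be impossible to reach.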

Apr 16 • 8 tweets • 3 min read

A Thread 🧵 1️⃣ HR Analytics Dashboard

linkedin.com/posts/sachintu…

Apr 16 • 7 tweets • 3 min read

Topic - Handling Mixed Variables in Feature Engineering 👨‍💻

A Thread 🧵 Handling missing values is very important, as many machine learning algorithms do not support data with missing values. If you have missing values in the dataset, they can cause errors and poor performance with some machine learning algorithms.
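
One simple way to deal with missing values before training, sketched under the assumption that the column is numeric and that mean imputation is acceptable for the data at hand. `impute_mean` and the sample column are hypothetical.

```python
def impute_mean(values):
    """Replace missing entries (None) with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


# Assumed toy column with two missing entries.
ages = [22.0, None, 26.0, 30.0, None]
filled = impute_mean(ages)
```

The two gaps are filled with 26.0, the mean of the three observed values, so an algorithm that rejects missing values can now consume the column.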

Apr 15 • 8 tweets • 3 min read

A Thread 🧵 ▶️ ( Q1-Q5 )

https://twitter.com/Sachintukumar/status/1643975772917633026?s=20