1/ Indexing data frames
Indexing means selecting all or particular rows and columns of data from a DataFrame. In pandas it can be done using two constructs —
.loc[] : label based
It accepts a scalar label, a list of labels, a slice object, etc.
.iloc[] : integer-position based
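A minimal sketch of both accessors on a toy DataFrame (the column names and index labels are made up for illustration):

```python
import pandas as pd

df = pd.DataFrame(
    {"name": ["Asha", "Ben", "Cara"], "score": [85, 92, 78]},
    index=["a", "b", "c"],
)

# .loc[] — label based: scalar label, list of labels, or label slice
print(df.loc["a"])                  # single row by label
print(df.loc[["a", "c"], "score"])  # list of row labels + a column label

# .iloc[] — integer-position based
print(df.iloc[0])        # first row
print(df.iloc[0:2, 1])   # position slice (end exclusive), second column
```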
2/ Slicing data frames
To slice by labels you can use the .loc[] attribute of the DataFrame.
Implementation —
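A minimal sketch with made-up row labels; note that label slices with .loc are inclusive of both endpoints:

```python
import pandas as pd

df = pd.DataFrame(
    {"city": ["Pune", "Delhi", "Goa", "Agra"], "temp": [31, 35, 29, 38]},
    index=["row1", "row2", "row3", "row4"],
)

# Label slicing: both "row2" and "row4" are included in the result
print(df.loc["row2":"row4", "city"])
```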
3/ Filtering data frames
Using filter() you can subset the rows or columns of a DataFrame according to labels in the specified index.
Implementation —
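A minimal sketch using filter() with invented column names:

```python
import pandas as pd

df = pd.DataFrame(
    {"math_score": [80, 90], "eng_score": [70, 60], "age": [15, 16]},
    index=["s1", "s2"],
)

# Keep only columns whose labels contain "score"
print(df.filter(like="score", axis=1))

# Keep columns whose labels match a regular expression
print(df.filter(regex="_score$"))
```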
4/ Transforming Data Frames
Pandas transform() creates a DataFrame of transformed values with the same axis length as the original.
Implementation —
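A minimal sketch with toy data; the group-wise case broadcasts the group mean back to every row, keeping the original length:

```python
import pandas as pd

df = pd.DataFrame({"group": ["A", "A", "B"], "value": [10, 20, 30]})

# Element-wise transform: output has the same shape as the input
print(df[["value"]].transform(lambda x: x * 2))

# Group-wise transform: each row gets its group's mean
df["group_mean"] = df.groupby("group")["value"].transform("mean")
print(df)
```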
5/ Adding Rows — append()
Implementation —
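A minimal sketch; note that DataFrame.append() was deprecated and removed in pandas 2.0, so pd.concat() is shown as the current equivalent, with the older call left as a comment:

```python
import pandas as pd

df = pd.DataFrame({"name": ["Asha"], "score": [85]})
new_row = pd.DataFrame({"name": ["Ben"], "score": [92]})

# Current approach: concatenate the new row and renumber the index
df = pd.concat([df, new_row], ignore_index=True)

# On pandas < 2.0 the equivalent call was:
# df = df.append(new_row, ignore_index=True)
print(df)
```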
6/ Hierarchical indexing
Hierarchical indexing is a technique in which we set more than one column as the index. The set_index() function is used for hierarchical indexing.
Implementation —
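A minimal sketch with made-up state/city data:

```python
import pandas as pd

df = pd.DataFrame(
    {"state": ["MH", "MH", "KA"],
     "city": ["Pune", "Mumbai", "Bengaluru"],
     "population": [7.4, 20.4, 12.3]}
)

# Set more than one column as the index -> MultiIndex
hdf = df.set_index(["state", "city"])
print(hdf)

# Select all rows for one outer level, or a specific (state, city) pair
print(hdf.loc["MH"])
print(hdf.loc[("MH", "Pune")])
```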
7/ Merging data frames
The concat() function is used to merge DataFrames.
Implementation --
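A minimal sketch with two toy DataFrames:

```python
import pandas as pd

df1 = pd.DataFrame({"id": [1, 2], "name": ["Asha", "Ben"]})
df2 = pd.DataFrame({"id": [3, 4], "name": ["Cara", "Dev"]})

# Stack vertically (axis=0) and renumber the index
print(pd.concat([df1, df2], ignore_index=True))

# Place side by side (axis=1)
print(pd.concat([df1, df2], axis=1))
```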
8/ Joins —
It helps us merge DataFrames. Types of Joins —
Inner Join :- Returns records that have matching values in both tables.
Left Join :- Returns all the rows from the left table and the matched rows from the right table, not just the rows in which the columns match.
9/ Right Join :- Returns all records from the right table, and the matched records from the left table.
Full Join :- Returns all records when there is a match in either left or right table.
Cross Join :- Returns all possible combinations of rows from two tables.
Implementation-
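A minimal sketch of all five join types using pd.merge() on toy tables:

```python
import pandas as pd

left = pd.DataFrame({"id": [1, 2, 3], "name": ["Asha", "Ben", "Cara"]})
right = pd.DataFrame({"id": [2, 3, 4], "dept": ["HR", "IT", "Sales"]})

print(pd.merge(left, right, on="id", how="inner"))  # matching rows only
print(pd.merge(left, right, on="id", how="left"))   # all left rows
print(pd.merge(left, right, on="id", how="right"))  # all right rows
print(pd.merge(left, right, on="id", how="outer"))  # full join
print(pd.merge(left, right, how="cross"))           # all combinations (pandas >= 1.2)
```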
10/ Pivot Tables
It creates a spreadsheet-style pivot table as a DataFrame.
Implementation -
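A minimal sketch with invented sales data:

```python
import pandas as pd

df = pd.DataFrame({
    "region": ["East", "East", "West", "West"],
    "product": ["A", "B", "A", "B"],
    "sales": [100, 150, 200, 120],
})

# Spreadsheet-style pivot: rows = region, columns = product, values = mean sales
print(pd.pivot_table(df, values="sales", index="region",
                     columns="product", aggfunc="mean"))
```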
11/ Aggregate Functions
Pandas has a number of aggregating functions that reduce the dimension of the grouped object.
count()
value_counts()
mean()
median()
sum()
min()
max()
std()
var()
describe()
sem()
Implementation -
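A minimal sketch applying several of these functions to a grouped toy DataFrame:

```python
import pandas as pd

df = pd.DataFrame({"team": ["A", "A", "B", "B"], "points": [10, 12, 8, 15]})

grouped = df.groupby("team")["points"]
print(grouped.count())
print(grouped.mean())
print(grouped.agg(["sum", "min", "max", "std", "var", "sem"]))
print(grouped.describe())

# value_counts() works on a Series: frequency of each value
print(df["team"].value_counts())
```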
12/ I write quality threads on Data Science, Python, Programming, Machine Learning and AI in my free time. If you like this thread, give me a follow.
✅Attention Mechanism in Transformers- Explained in Simple terms.
A quick thread 👇🏻🧵
#MachineLearning #Coding #100DaysofCode #deeplearning #DataScience
PC : Research Gate
1/ Attention mechanism calculates attention scores between all pairs of tokens in a sequence. These scores are then used to compute weighted representations of each token based on its relationship with other tokens in the sequence.
2/ This process generates context-aware representations for each token, allowing the model to consider both the token's own information and information from other tokens.
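A minimal sketch of scaled dot-product self-attention in NumPy; the sequence length, embedding size, and random inputs are made up for illustration:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention scores between all pairs of tokens, then weighted sums of V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                      # (seq_len, seq_len) pairwise scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)       # softmax over each row
    return weights @ V                                   # context-aware token representations

# Toy example: 4 tokens, embedding size 8 (dimensions chosen arbitrarily)
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
out = scaled_dot_product_attention(x, x, x)              # self-attention: Q = K = V = x
print(out.shape)                                         # (4, 8)
```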
✅Regularization is a technique used in ML to prevent overfitting and improve the generalization of a model - Explained in Simple terms.
A quick thread 👇🏻🧵
#MachineLearning #Coding #100DaysofCode #deeplearning #DataScience
PC : Research Gate
1/ Regularization is a technique in machine learning used to prevent overfitting by adding a penalty term to the model's loss function. The penalty discourages overly complex models and promotes simpler ones, improving generalization to new, unseen data.
2/ When to use regularization:
Use regularization when you suspect that your model is overfitting the training data.
Use it when dealing with high-dimensional datasets where the number of features is comparable to or greater than the number of samples.
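A minimal sketch using scikit-learn's Ridge (L2 penalty) and Lasso (L1 penalty) on synthetic data; the alpha values are arbitrary illustrations of the penalty strength:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge, Lasso
from sklearn.model_selection import train_test_split

# Synthetic data with many features relative to the number of samples
X, y = make_regression(n_samples=100, n_features=50, noise=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# alpha is the penalty strength: larger alpha pushes toward a simpler model
ridge = Ridge(alpha=1.0).fit(X_train, y_train)   # L2 regularization
lasso = Lasso(alpha=0.1).fit(X_train, y_train)   # L1 regularization (can zero out features)

print("Ridge test R^2:", ridge.score(X_test, y_test))
print("Lasso test R^2:", lasso.score(X_test, y_test))
```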
✅XGBoost is a powerful and efficient gradient boosting library designed for ML tasks, specifically for supervised learning problems - Explained in Simple terms.
A quick thread 🧵👇🏻
#MachineLearning #Coding #100DaysofCode #deeplearning #DataScience
PC : Research Gate
1/ XGBoost is an ensemble learning method that combines multiple decision trees into a strong predictive model. It builds decision trees sequentially, where each tree corrects the errors of the previous ones. XGBoost optimizes a differentiable loss function to minimize prediction errors.
2/ When to Use XGBoost:
Use XGBoost when you need a highly accurate predictive model, especially in situations where other algorithms may struggle with complex patterns and relationships in the data.
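A minimal sketch using the xgboost Python package (assumed installed) on a built-in scikit-learn dataset; the hyperparameters are arbitrary:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier  # pip install xgboost

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Trees are built sequentially, each one correcting the errors of the previous ones
model = XGBClassifier(n_estimators=200, learning_rate=0.1, max_depth=3)
model.fit(X_train, y_train)
print("Accuracy:", accuracy_score(y_test, model.predict(X_test)))
```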
✅Gradient Boosting is a powerful machine learning technique used for both regression and classification tasks - Explained in Simple terms.
A quick thread 🧵👇🏻
#MachineLearning #Coding #100DaysofCode #deeplearning #DataScience
PC : Research Gate
1/ Gradient Boosting is an ensemble learning method that combines the predictions of multiple weak learners (often decision trees) to create a stronger and more accurate predictive model.
2/ How Gradient Boosting Works:
Gradient Boosting builds an ensemble of decision trees sequentially. It starts with a simple model (typically a single tree) and then iteratively adds more trees to correct the errors made by the previous ones.
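A minimal sketch with scikit-learn's GradientBoostingClassifier on synthetic data; the hyperparameters are illustrative:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# An ensemble of shallow trees added one at a time to correct previous errors
gb = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1, max_depth=3)
gb.fit(X_train, y_train)
print("Test accuracy:", gb.score(X_test, y_test))
```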
✅Cross-validation in ML is particularly useful for estimating how well a model will perform on unseen data - Explained in Simple terms.
A quick thread 🧵👇🏻
#MachineLearning #Coding #100DaysofCode #deeplearning #DataScience
PC : Research Gate
1/ Cross-validation involves splitting the dataset into multiple subsets and using different parts of the data for training and testing at each iteration. The primary goal of cross-validation is to obtain a more robust and unbiased estimate of a model's performance.
2/ Why use Cross Validation -
Performance Estimation: Cross-validation provides a more robust and unbiased estimate of a model's performance. It helps you to obtain a more accurate assessment of how well your model will perform on new, unseen data.
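A minimal sketch of 5-fold cross-validation with scikit-learn; the dataset and model are placeholders:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# 5 folds: each iteration trains on 4 parts and tests on the held-out part
scores = cross_val_score(model, X, y, cv=5)
print("Fold accuracies:", scores)
print("Mean accuracy:", scores.mean())
```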
✅Feature selection and Feature scaling are crucial Feature Engineering steps - Explained in Simple terms.
A quick thread 👇🏻🧵
#MachineLearning #Coding #100DaysofCode #deeplearning #DataScience
PC : Research Gate
1/ Feature selection is the process of choosing a subset of the most relevant features (variables or columns) from your dataset. It involves excluding less informative or redundant features to improve model performance and reduce computational complexity.
2/ When to Use It:
High-Dimensional Data: Feature selection is crucial when you have a high-dimensional dataset, meaning there are many features compared to the number of data points. High dimensionality can lead to overfitting and increased computational costs.
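A minimal sketch of filter-based feature selection with scikit-learn's SelectKBest; keeping k=10 features is an arbitrary choice for illustration:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, f_classif

X, y = load_breast_cancer(return_X_y=True)
print("Original number of features:", X.shape[1])

# Keep the 10 features most associated with the target (ANOVA F-test scores)
selector = SelectKBest(score_func=f_classif, k=10)
X_selected = selector.fit_transform(X, y)
print("After selection:", X_selected.shape[1])
```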