1/ Indexing data frames
Indexing means selecting all or particular rows and columns of data from a DataFrame. In pandas it can be done using two indexers, both used with square brackets:
.loc[] : label based
It accepts a scalar label, a list of labels, a slice of labels, a boolean array, etc.
.iloc[] : integer position based
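A minimal sketch of the two indexers, assuming a small made-up DataFrame (the column names, index labels and values are hypothetical and only there for illustration):

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame(
    {"name": ["Ann", "Bob", "Cara"], "age": [28, 35, 41]},
    index=["a", "b", "c"],
)

# Label based: row label "b", column label "age".
age_b = df.loc["b", "age"]

# Label based with a list of labels: rows "a" and "c", all columns.
rows_ac = df.loc[["a", "c"]]

# Integer position based: first row (position 0), second column (position 1).
first_age = df.iloc[0, 1]

print(age_b, first_age)
print(rows_ac)
```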
2/ Slicing data frames
To slice by labels, use the .loc[] indexer of the DataFrame. With label slices, both the start and the stop label are included.
Implementation —
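One way this could look, assuming a made-up single-column DataFrame (the labels and values are hypothetical). The position-based slice is included only to contrast it with the label-based one:

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({"score": [10, 20, 30, 40]}, index=["a", "b", "c", "d"])

# Label slice with .loc: both endpoints ("b" and "d") are included.
by_label = df.loc["b":"d"]

# Position slice with .iloc: the stop position (3) is excluded.
by_position = df.iloc[1:3]

print(by_label)
print(by_position)
```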
3/ Filtering data frames
filter() subsets the rows or columns of a DataFrame according to the labels in the specified index. It works on the labels only, not on the values stored in the DataFrame.
Implementation —
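A short sketch of label-based filtering, assuming a made-up DataFrame (column names, index labels and values are hypothetical):

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame(
    {"sales_q1": [100, 150], "sales_q2": [120, 130], "region": ["north", "south"]},
    index=["store_1", "store_2"],
)

# Keep columns by exact label.
only_region = df.filter(items=["region"])

# Keep columns whose label contains the substring "sales".
sales_cols = df.filter(like="sales")

# Keep rows whose index label matches a regular expression (axis=0 means rows).
store_2 = df.filter(regex="2$", axis=0)

print(sales_cols)
print(store_2)
```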
4/ Transforming Data Frames
transform() returns a DataFrame with transformed values that has the same axis length as the original.
Implementation —
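A possible illustration, assuming a made-up DataFrame with hypothetical "team" and "points" columns. It shows an element-wise transform and a group-wise one; in both cases the result keeps the length of the input:

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({"team": ["x", "x", "y"], "points": [10, 20, 30]})

# Element-wise transform: the result has the same shape as the input.
doubled = df[["points"]].transform(lambda col: col * 2)

# Group-wise transform: each row receives its group's mean,
# so the number of rows stays the same.
df["team_mean"] = df.groupby("team")["points"].transform("mean")

print(doubled)
print(df)
```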
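5/ Adding Rows — append()
append() adds rows to the end of a DataFrame. Note that DataFrame.append() was deprecated in pandas 1.4 and removed in pandas 2.0, where pd.concat() is the replacement.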
Implementation —
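Since DataFrame.append() is no longer available in pandas 2.0 and later, this sketch swaps in pd.concat() to add a row. The data is made up for illustration:

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({"name": ["Ann"], "age": [28]})
new_row = pd.DataFrame({"name": ["Bob"], "age": [35]})

# Stack the new row under the existing rows and rebuild the index as 0..n-1.
df = pd.concat([df, new_row], ignore_index=True)

print(df)
```

On older pandas versions (before 2.0), df.append(new_row, ignore_index=True) gave the same result.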
6/ Hierarchical indexing
Hierarchical indexing (MultiIndex) is the technique of setting more than one column as the index. The set_index() function is used for hierarchical indexing.
Implementation —
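A minimal sketch, assuming a made-up DataFrame with hypothetical "year", "quarter" and "revenue" columns:

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame(
    {
        "year": [2021, 2021, 2022],
        "quarter": ["Q1", "Q2", "Q1"],
        "revenue": [10, 12, 15],
    }
)

# Passing a list of columns to set_index() builds a two-level index.
indexed = df.set_index(["year", "quarter"])

# Select one cell with a tuple of labels, one label per index level.
rev_2021_q2 = indexed.loc[(2021, "Q2"), "revenue"]

# Select everything under the first-level label 2021.
year_2021 = indexed.loc[2021]

print(indexed)
print(rev_2021_q2)
```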
7/ Merging data frames
The pd.concat() function is used to combine DataFrames along an axis, either stacking them row-wise or placing them side by side.
Implementation --
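A short sketch of both directions, assuming made-up DataFrames (all names and values are hypothetical):

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
top = pd.DataFrame({"id": [1, 2], "value": ["a", "b"]})
bottom = pd.DataFrame({"id": [3], "value": ["c"]})
extra = pd.DataFrame({"flag": [True, False]})

# Row-wise (default axis=0): stack one frame under the other.
stacked = pd.concat([top, bottom], ignore_index=True)

# Column-wise (axis=1): align on the index and place frames side by side.
side_by_side = pd.concat([top, extra], axis=1)

print(stacked)
print(side_by_side)
```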
8/ Joins —
Joins help us merge DataFrames on matching keys. In pandas this is done with merge(). Types of Joins:
Inner Join :- Returns records that have matching values in both tables.
Left Join :- Returns all records from the left table, and the matched records from the right table.
9/ Right Join :- Returns all records from the right table, and the matched records from the left table.
Full Join :- Returns all records from both tables, matching rows where possible and filling the gaps with missing values.
Cross Join :- Returns all possible combinations of rows from two tables.
Implementation-
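The join types above could be sketched with merge() like this, assuming two made-up tables that share a hypothetical "key" column. how="cross" needs pandas 1.2 or later:

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
left = pd.DataFrame({"key": [1, 2, 3], "left_val": ["a", "b", "c"]})
right = pd.DataFrame({"key": [2, 3, 4], "right_val": ["x", "y", "z"]})

inner = left.merge(right, on="key", how="inner")       # keys present in both
left_join = left.merge(right, on="key", how="left")    # every key from left
right_join = left.merge(right, on="key", how="right")  # every key from right
full = left.merge(right, on="key", how="outer")        # every key from either
cross = left.merge(right, how="cross")                 # every row pairing

for name, frame in [("inner", inner), ("left", left_join), ("right", right_join),
                    ("full", full), ("cross", cross)]:
    print(name, len(frame))
```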
10/ Pivot Tables
pivot_table() creates a spreadsheet-style pivot table as a DataFrame.
Implementation -
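A minimal sketch, assuming a made-up sales table (the "region", "product" and "sales" columns are hypothetical):

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame(
    {
        "region": ["north", "north", "south", "south"],
        "product": ["A", "B", "A", "B"],
        "sales": [10, 20, 30, 40],
    }
)

# Rows come from "region", columns from "product", cells are summed "sales".
table = pd.pivot_table(
    df, values="sales", index="region", columns="product", aggfunc="sum"
)

print(table)
```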
11/ Aggregate Functions
Pandas has a number of aggregating functions that reduce the dimension of the grouped object.
count()
value_counts()
mean()
median()
sum()
min()
max()
std()
var()
describe()
sem()
Implementation -
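A few of the functions listed above, sketched on a made-up DataFrame with hypothetical "team" and "points" columns:

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({"team": ["x", "x", "y", "y"], "points": [10, 20, 30, 50]})

grouped = df.groupby("team")["points"]

# A single aggregate: one value per group.
means = grouped.mean()

# Several aggregates at once: one column per function.
summary = grouped.agg(["count", "sum", "min", "max"])

# value_counts() counts how often each distinct value occurs in a column.
team_counts = df["team"].value_counts()

print(means)
print(summary)
print(team_counts)
```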
1/ DefaultDict
In Python, a dictionary is a container that holds key-value pairs. Keys must be unique, hashable objects. If you try to access a key that doesn't exist in the dictionary, Python raises a KeyError, which stops your code unless it is handled (Continued..)
2/ (Continued..) To tackle this issue, Python provides defaultdict, a dictionary-like class in the collections module. If you try to access or modify a missing key, defaultdict automatically creates the key and generates a default value for it.
As long as a default factory is set, looking up a missing key with square brackets will not raise a KeyError (Continued..)
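A small sketch of the difference, using made-up keys and values. It also shows one caveat: a defaultdict created without a default factory still raises a KeyError on a missing key.

```python
from collections import defaultdict

# A plain dict raises KeyError when a missing key is looked up.
plain = {}
try:
    plain["missing"]
except KeyError:
    plain_raised = True
else:
    plain_raised = False

# defaultdict(int): missing keys start at int(), which is 0.
word_counts = defaultdict(int)
for word in ["a", "b", "a"]:
    word_counts[word] += 1

# defaultdict(list): missing keys start as an empty list.
groups = defaultdict(list)
groups["fruits"].append("apple")

# Without a default factory, a missing key still raises KeyError.
no_factory = defaultdict()
try:
    no_factory["missing"]
except KeyError:
    no_factory_raised = True
else:
    no_factory_raised = False

print(plain_raised, dict(word_counts), dict(groups), no_factory_raised)
```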