Pandas
- Pandas is probably one of the most powerful and flexible open source data analysis and manipulation tools available in any language.
- It provides a wide range of functions for data wrangling and cleaning.
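As a quick illustration (a minimal sketch with a made-up DataFrame, not from the original thread), typical wrangling and cleaning steps look like this:

```python
import pandas as pd

# Tiny, made-up dataset just to illustrate common cleaning steps.
df = pd.DataFrame({
    "age": [25, None, 31, 47],
    "city": ["Cairo", "Dubai", None, "Riyadh"],
})

df["age"] = df["age"].fillna(df["age"].mean())  # impute missing ages with the mean
df = df.dropna(subset=["city"])                 # drop rows with no city
print(df.describe())                            # quick summary statistics
```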
resources:
1⃣ youtube.com/playlist?list=…
NumPy (Numerical Python)
- NumPy is an open source project aiming to enable numerical computing with Python.
- It provides functions and methods for performing high-level mathematical operations on multi-dimensional arrays and matrices.
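A minimal sketch (with arbitrary values) of the kind of vectorized math NumPy makes easy:

```python
import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([[5.0, 6.0], [7.0, 8.0]])

print(a + b)               # element-wise addition
print(a @ b)               # matrix multiplication
print(np.mean(a, axis=0))  # column-wise mean
```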
Matplotlib (Data Visualization)
- Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python.
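For example, a basic static plot takes only a few lines (the data here is arbitrary):

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi, 100)
plt.plot(x, np.sin(x), label="sin(x)")  # simple line plot
plt.xlabel("x")
plt.ylabel("sin(x)")
plt.legend()
plt.show()
```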
Seaborn (Data Visualization)
- Seaborn is a Python data visualization library based on Matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics.
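As a small sketch of that high-level interface (random data for illustration), a single call gives a histogram with a density estimate:

```python
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

data = np.random.normal(loc=0, scale=1, size=1_000)
sns.histplot(data, kde=True)  # histogram + kernel density estimate in one call
plt.show()
```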
resources:
The previous GitHub repository.
Scikit-Learn
- Scikit-Learn is an open source machine learning library.
- It is built upon SciPy, and it provides a wide range of machine learning algorithms such as regression, classification, clustering, etc.
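A minimal sketch of the usual fit/predict workflow, using the built-in iris dataset (the model choice here is just for illustration):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

model = LogisticRegression(max_iter=1000)  # a simple classifier
model.fit(X_train, y_train)
print(accuracy_score(y_test, model.predict(X_test)))
```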
TensorFlow
- An open source end-to-end machine learning library that allows you to develop and train models.
- TensorFlow focuses mainly on training and inference of deep neural networks.
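A hedged sketch of the typical Keras workflow in TensorFlow (the layer sizes and input shape are arbitrary):

```python
import tensorflow as tf

# Small fully connected network for a 10-class problem.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(X_train, y_train, epochs=5)  # training call, given real data
```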
OpenCV
- An open source computer vision and machine learning library.
- It has a huge number of algorithms for computer vision applications such as object detection, face recognition, movement tracking, etc.
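As a small sketch of the face-detection use case (assuming the opencv-python package, which ships the Haar cascade files; "photo.jpg" is a placeholder path):

```python
import cv2

img = cv2.imread("photo.jpg")                 # placeholder image path
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

cascade_path = cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
detector = cv2.CascadeClassifier(cascade_path)
faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

for (x, y, w, h) in faces:                    # draw a box around each detected face
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite("faces.jpg", img)
```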
Other libraries include:
SciPy
PyTorch (similar to TensorFlow)
statsmodels
Plotly (for creating dashboards)
XGBoost
That's it for this thread.
If you find it useful, kindly consider retweeting the first tweet.
For more #DataScience and #MachineLearning content, follow me @ammaryh92.
▶️ The Power of Non-Saturating Activation Functions
One of the reasons deep learning was largely abandoned in the early 2000s is the problem of vanishing gradients.
So, what is the vanishing gradients problem?
In a nutshell, a deep network is trained by iteratively updating its parameters.
The update that every parameter receives depends mainly on two things: 1. The gradient of the cost function with respect to that parameter. 2. The learning rate.
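In other words, each parameter is updated roughly like this (a toy sketch with made-up numbers):

```python
learning_rate = 0.1
gradient = 0.001          # dJ/dw for this parameter (made-up value)
w = 0.5

w = w - learning_rate * gradient  # a tiny gradient means a tiny update
print(w)                          # 0.4999
```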
The vanishing gradient problem happens when these gradients get smaller and smaller as backpropagation progresses down to the lower layers.
This causes the lower layers to receive very small updates, to the point that they are left virtually unchanged.
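A rough numerical sketch of the effect: with sigmoid activations, each local derivative is at most 0.25, so the chained gradient shrinks quickly with depth (this toy loop ignores the weight terms for simplicity):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

grad = 1.0
for layer in range(20):                    # 20 layers deep
    z = np.random.randn()                  # pre-activation at this layer
    grad *= sigmoid(z) * (1 - sigmoid(z))  # local sigmoid derivative (<= 0.25)

print(grad)  # typically a vanishingly small number
```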
When training deep neural networks, the shape of the cost function is rarely smooth or regular.
Some parts of the cost function will have a steep slope on which gradient descent can move pretty quickly.
On the other hand, the cost function commonly contains plateaus and long, nearly flat valleys where gradient descent moves very slowly.
Vanilla gradient descent techniques (such as stochastic gradient descent or mini-batch gradient descent), which make parameter updates based only on the local gradients, will start by quickly going down the steepest slope even if it doesn't point toward the global minimum.
How to deal with overplotting in data visualization?
Overplotting happens when a large number of data points results in a cluttered plot with too many overlapping points.
Overplotting makes it really difficult to interpret the plot and detect the patterns.
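The code from the original tweet isn't preserved in this text; the sketch below is a stand-in (with made-up random data) that produces the same kind of overplotted scatter:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
x = rng.normal(size=50_000)                 # tens of thousands of points
y = x + rng.normal(scale=0.5, size=50_000)

plt.scatter(x, y)                           # every point drawn on top of the others
plt.show()
```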
Code like the sketch above produces a plot with clear overlapping of the plotted points.
This overlapping makes it really difficult to understand the distribution patterns of the data.
Possible solutions include: 1- Sampling 2- Transparency 3- Heatmap
1⃣Sampling
Because overplotting happens mainly when there is an extremely large number of data points, taking a smaller random sample of the data seems like a reasonable approach.
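Hedged sketches of the three approaches listed above, reusing the same kind of made-up data:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
x = rng.normal(size=50_000)
y = x + rng.normal(scale=0.5, size=50_000)

# 1) Sampling: plot a random subset instead of every point.
idx = rng.choice(len(x), size=2_000, replace=False)
plt.scatter(x[idx], y[idx])
plt.show()

# 2) Transparency: keep all points but make them translucent.
plt.scatter(x, y, alpha=0.05)
plt.show()

# 3) Heatmap-style binning: show point density instead of individual points.
plt.hexbin(x, y, gridsize=50, cmap="viridis")
plt.colorbar()
plt.show()
```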
Weight Initialization Strategies for Deep Neural Networks
Thread Contents:
a. Why do we need better initialization strategies?
b. Weight initialization techniques
c. Implementation in Keras
In a previous thread, we discussed the vanishing/exploding gradient problem in DNNs, and we mentioned that the two main reasons for this problem are: 1. Improper weight initialization techniques. 2. Saturating activation functions.
We also discussed that, with the older approach of initializing weights from a normal distribution with a mean of 0 and a variance of 1, the variance of the outputs of each layer ends up much greater than the variance of its inputs.
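The Keras implementation part of the thread isn't included in this excerpt; as a hedged sketch of how initializers are typically specified (layer sizes are arbitrary; Glorot uniform is the Keras default, and He initialization is the usual pairing with ReLU-family activations):

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(100,)),
    tf.keras.layers.Dense(128, activation="relu",
                          kernel_initializer="he_normal"),      # He init for ReLU
    tf.keras.layers.Dense(64, activation="relu",
                          kernel_initializer="he_normal"),
    tf.keras.layers.Dense(10, activation="softmax",
                          kernel_initializer="glorot_uniform"),  # Glorot (Xavier)
])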
Most data scientists spend almost 80% of their time inspecting and cleaning data rather than working on their machine learning models.
But why?
According to the "No Free Lunch theorem", most machine learning models have a set of built-in assumptions, and before you start training your model, you have to make sure that your data is in line with the underlying assumptions of your model.
For example, a linear model assumes that your input features are linearly correlated with the target values (linear relationship). This means that if you try to fit a linear model to quadratic training data, your model will probably underfit the training data.
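A small sketch of that failure mode (made-up quadratic data; the low R² score reflects the underfit):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))               # single feature
y = X[:, 0] ** 2 + rng.normal(scale=0.2, size=200)  # quadratic target + noise

model = LinearRegression().fit(X, y)
print(r2_score(y, model.predict(X)))  # close to 0: the straight line underfits
```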