Pandas
- Pandas is probably one of the most powerful and flexible open source data analysis and manipulation tools available in any language.
- It provides a wide range of functions for data wrangling and cleaning.
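As a quick illustration (a minimal sketch with a made-up DataFrame, not from the original thread), typical wrangling and cleaning steps look like this:

```python
import pandas as pd

# Tiny, made-up dataset just to illustrate common cleaning steps.
df = pd.DataFrame({
    "age": [25, None, 31, 47],
    "city": ["Cairo", "Dubai", None, "Riyadh"],
})

df["age"] = df["age"].fillna(df["age"].mean())  # impute missing ages with the mean
df = df.dropna(subset=["city"])                 # drop rows with no city
print(df.describe())                            # quick summary statistics
```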
resources:
1⃣ youtube.com/playlist?list=…
NumPy (Numerical Python)
- NumPy is an open source project aiming to enable numerical computing with Python.
- It provides functions and methods for performing high-level mathematical operations on multi-dimensional arrays and matrices.
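A minimal sketch (with arbitrary values) of the kind of vectorized math NumPy makes easy:

```python
import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([[5.0, 6.0], [7.0, 8.0]])

print(a + b)               # element-wise addition
print(a @ b)               # matrix multiplication
print(np.mean(a, axis=0))  # column-wise mean
```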
Matplotlib (Data Visualization)
- Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python.
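For example, a basic static plot takes only a few lines (the data here is arbitrary):

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi, 100)
plt.plot(x, np.sin(x), label="sin(x)")  # simple line plot
plt.xlabel("x")
plt.ylabel("sin(x)")
plt.legend()
plt.show()
```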
Seaborn (Data Visualization)
- Seaborn is a Python data visualization library based on Matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics.
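As a small sketch of that high-level interface (random data for illustration), a single call gives a histogram with a density estimate:

```python
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

data = np.random.normal(loc=0, scale=1, size=1_000)
sns.histplot(data, kde=True)  # histogram + kernel density estimate in one call
plt.show()
```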
resources:
The previous GitHub repository.
Scikit-Learn
- Scikit-Learn is an open source machine learning library.
- It is built upon SciPy, and it provides a wide range of machine learning algorithms such as regression, classification, clustering, etc.
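A minimal sketch of the usual fit/predict workflow, using the built-in iris dataset (the model choice here is just for illustration):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

model = LogisticRegression(max_iter=1000)  # a simple classifier
model.fit(X_train, y_train)
print(accuracy_score(y_test, model.predict(X_test)))
```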
TensorFlow
- An open source end-to-end machine learning library that allows you to develop and train models.
- TensorFlow focuses mainly on training and inference of deep neural networks.
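A hedged sketch of the typical Keras workflow in TensorFlow (the layer sizes and input shape are arbitrary):

```python
import tensorflow as tf

# Small fully connected network for a 10-class problem.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(X_train, y_train, epochs=5)  # training call, given real data
```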
OpenCV
- An open source computer vision and machine learning library.
- It has a huge number of algorithms for computer vision applications such as object detection, face recognition, movement tracking, etc.
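As a small sketch of the face-detection use case (assuming the opencv-python package, which ships the Haar cascade files; "photo.jpg" is a placeholder path):

```python
import cv2

img = cv2.imread("photo.jpg")                 # placeholder image path
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

cascade_path = cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
detector = cv2.CascadeClassifier(cascade_path)
faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

for (x, y, w, h) in faces:                    # draw a box around each detected face
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite("faces.jpg", img)
```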
Other libraries include:
SciPy
PyTorch (similar to TensorFlow)
statsmodels
Plotly (for creating dashboards)
XGBoost
That's it for this thread.
If you find it useful, kindly consider retweeting the first tweet.
For more #DataScience and #MachineLearning content, follow me @ammaryh92.
▶️ The Power of Non-Saturating Activation Functions
One of the reasons deep learning was largely abandoned in the early 2000s is the problem of vanishing gradients.
So, what is the vanishing gradients problem?
In a nutshell, a deep network is trained by iteratively updating its parameters.
The update that every parameter receives depends mainly on two things: 1. The gradient of the cost function with respect to that parameter. 2. The learning rate.
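In other words, each parameter is updated roughly like this (a toy sketch with made-up numbers):

```python
learning_rate = 0.1
gradient = 0.001          # dJ/dw for this parameter (made-up value)
w = 0.5

w = w - learning_rate * gradient  # a tiny gradient means a tiny update
print(w)                          # 0.4999
```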
The vanishing gradient problem happens when these gradients get smaller and smaller as backpropagation progresses down to the lower layers.
This causes the lower layers to receive very small updates, to the point that they are left virtually unchanged.
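A rough numerical sketch of the effect: with sigmoid activations, each local derivative is at most 0.25, so the chained gradient shrinks quickly with depth (this toy loop ignores the weight terms for simplicity):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

grad = 1.0
for layer in range(20):                    # 20 layers deep
    z = np.random.randn()                  # pre-activation at this layer
    grad *= sigmoid(z) * (1 - sigmoid(z))  # local sigmoid derivative (<= 0.25)

print(grad)  # typically a vanishingly small number
```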
When training deep neural networks, the shape of the cost function is rarely smooth or regular.
Some parts of the cost function will have a steep slope on which gradient descent can move pretty quickly.
On the other hand, the cost function commonly contains plateaus and long, nearly flat valleys where gradient descent moves very slowly.
Vanilla gradient descent techniques (such as stochastic gradient descent or mini-batch gradient descent), which make parameter updates based only on the local gradients, will start by quickly going down the steepest slope even if it doesn't point toward the global minimum.
How to deal with overplotting in data visualization?
Overplotting happens when a large number of data points results in a cluttered plot with too many overlapping points.
Overplotting makes it really difficult to interpret the plot and detect the patterns.
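The code from the original tweet isn't preserved in this text; the sketch below is a stand-in (with made-up random data) that produces the same kind of overplotted scatter:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
x = rng.normal(size=50_000)                 # tens of thousands of points
y = x + rng.normal(scale=0.5, size=50_000)

plt.scatter(x, y)                           # every point drawn on top of the others
plt.show()
```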
Code like the sketch above produces a plot with clear overlapping of the plotted points.
This overlapping makes it really difficult to understand the distribution patterns of the data.
Possible solutions include: 1- Sampling 2- Transparency 3- Heatmap
1⃣Sampling
Because overplotting happens mainly when there is an extremely large number of data points, taking a smaller random sample of the data seems like a reasonable approach.
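Hedged sketches of the three approaches listed above, reusing the same kind of made-up data:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
x = rng.normal(size=50_000)
y = x + rng.normal(scale=0.5, size=50_000)

# 1) Sampling: plot a random subset instead of every point.
idx = rng.choice(len(x), size=2_000, replace=False)
plt.scatter(x[idx], y[idx])
plt.show()

# 2) Transparency: keep all points but make them translucent.
plt.scatter(x, y, alpha=0.05)
plt.show()

# 3) Heatmap-style binning: show point density instead of individual points.
plt.hexbin(x, y, gridsize=50, cmap="viridis")
plt.colorbar()
plt.show()
```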
Weight Initialization Strategies for Deep Neural Networks
Thread Contents:
a. Why do we need better initialization strategies?
b. Weight initialization techniques
c. Implementation in Keras
In a previous thread, we discussed the vanishing/exploding gradient problem in DNNs, and we mentioned that the two main reasons for this problem are: 1. Improper weight initialization techniques. 2. Saturating activation functions.
We also discussed that, with the older approach of initializing weights from a normal distribution with a mean of 0 and a variance of 1, the variance of the outputs of each layer ends up much greater than the variance of its inputs.
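The Keras implementation part of the thread isn't included in this excerpt; as a hedged sketch of how initializers are typically specified (layer sizes are arbitrary; Glorot uniform is the Keras default, and He initialization is the usual pairing with ReLU-family activations):

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(100,)),
    tf.keras.layers.Dense(128, activation="relu",
                          kernel_initializer="he_normal"),      # He init for ReLU
    tf.keras.layers.Dense(64, activation="relu",
                          kernel_initializer="he_normal"),
    tf.keras.layers.Dense(10, activation="softmax",
                          kernel_initializer="glorot_uniform"),  # Glorot (Xavier)
])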
Most data scientists spend almost 80% of their time inspecting and cleaning data rather than working on their machine learning models.
But why?
According to the "No Free Lunch theorem", most machine learning models have a set of built-in assumptions, and before you start training your model, you have to make sure that your data is in line with the underlying assumptions of your model.
For example, a linear model assumes that your input features are linearly correlated with the target values (linear relationship). This means that if you try to fit a linear model to quadratic training data, your model will probably underfit the training data.
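A small sketch of that failure mode (made-up quadratic data; the low R² score reflects the underfit):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))               # single feature
y = X[:, 0] ** 2 + rng.normal(scale=0.2, size=200)  # quadratic target + noise

model = LinearRegression().fit(X, y)
print(r2_score(y, model.predict(X)))  # close to 0: the straight line underfits
```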