May 28 · 8 tweets · 15 min read
1/ One way to test whether a time series is stationary is to perform an augmented Dickey-Fuller test - A Thread

#DataScience #MachineLearning #DeepLearning #100DaysOfMLCode #Python #pythoncode #AI #DataScientist #DataAnalytics #Statistics #programming #ArtificialIntelligence
2/ H0: The time series has a unit root, i.e. it is non-stationary. In other words, it has some time-dependent structure, and properties such as its mean and variance are not constant over time.

HA: The time series is stationary.

3/ If the p-value from the test is less than some significance level (e.g. α = .05), then we can reject the null hypothesis and conclude that the time series is stationary.
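The decision rule above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original thread: the helper name adf_decision and the example p-values are assumptions made up for the sketch.

```python
def adf_decision(p_value: float, alpha: float = 0.05) -> str:
    """Apply the ADF decision rule to a p-value (hypothetical helper)."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    if p_value < alpha:
        # Small p-value: reject H0 (non-stationary) in favor of HA (stationary).
        return "reject H0: evidence that the series is stationary"
    # Otherwise H0 stands: we cannot conclude the series is stationary.
    return "fail to reject H0: no evidence that the series is stationary"


# Made-up p-values, used only to show both branches of the rule.
print(adf_decision(0.03))
print(adf_decision(0.40))
```

Note that failing to reject H0 is not proof of non-stationarity; it only means the test found no evidence against it at the chosen α.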

4/ Example: Augmented Dickey-Fuller Test in Python

Suppose we have the following time series data. First, we create a quick plot to visualize the data.
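A quick plot might look like the sketch below. The thread's own data is not reproduced in the text, so the series here is an assumption: a simulated random walk with a fixed seed, used purely for illustration. The output file name is also made up.

```python
import matplotlib.pyplot as plt
import numpy as np

# Illustrative data only (an assumption, not the thread's series):
# a random walk built as the cumulative sum of normal steps.
rng = np.random.default_rng(42)
series = np.cumsum(rng.normal(size=100))

fig, ax = plt.subplots()
ax.plot(series)
ax.set_xlabel("Time step")
ax.set_ylabel("Value")
ax.set_title("Example time series")

# Save to a file; in an interactive session plt.show() works as well.
fig.savefig("adf_series.png")
```

A series that wanders without returning to a fixed level, as a random walk typically does, is a visual hint of non-stationarity, which the test then checks formally.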

5/ To perform an augmented Dickey-Fuller test, we can use the adfuller() function from the statsmodels library. First, we need to install statsmodels, for example by running pip install statsmodels.

6/ Here’s how to interpret the most important values in the output:

• Test statistic: -0.97538
• P-value: 0.7621

Since the p-value is not less than 0.05, we fail to reject the null hypothesis.

7/ This means we lack evidence that the time series is stationary, so we treat it as non-stationary. In other words, it appears to have some time-dependent structure, and properties such as its mean and variance may not be constant over time.

8/ For daily tips and techniques on #DeepLearning, #ComputerVision and #MachineLearning follow me @rohanpaul_ai

