Discover and read the best of Twitter Threads about #statistics

Most recents (24)

⭐️ Statistics basics:

A thread 🧵
#Statistics
• Covariance:

Covariance measures the relationship between two random variables: the extent to which they change together. It is positive when the variables tend to rise and fall together and negative when one tends to rise as the other falls.
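
A minimal sketch of the sample covariance, assuming the usual n - 1 denominator. The study, score and gaming numbers are made up for illustration and are not from the thread.

```python
def covariance(xs, ys):
    """Sample covariance: summed products of deviations from the means, over n - 1."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (len(xs) - 1)

# Hypothetical data: scores rise with study hours, gaming hours fall.
hours_studied = [1, 2, 3, 4, 5]
exam_score = [52, 58, 61, 70, 74]
hours_gaming = [9, 7, 6, 4, 2]

cov_up = covariance(hours_studied, exam_score)      # positive: they move together
cov_down = covariance(hours_studied, hours_gaming)  # negative: they move in opposite directions
print(cov_up, cov_down)
```

The sign tells you the direction of the relationship; the magnitude depends on the units, which is why correlation is often reported alongside it.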
Read 6 tweets
64% of #Polish people say #Germany should pay reparations for the damage it caused to #Poland during #WWII according to a new Social Changes opinion poll (source: @wPolityce_pl, Sept 4)

#WarLossesReport #RaportStratWojennych #ReparacjedlaPolski #ReparationsForPoland #Statistics
The death toll in #WWII was enormous with millions of lives lost.
1️⃣ country suffered disproportionately: #Poland 🇵🇱

Chart: WW2 casualties as % of each country's population

#unREDEEMed #ReparationsNow #ReparationsForPoland #ReparacjedlaPolski #Infographic
Read 9 tweets
During #WWII, 🇵🇱 suffered the greatest personal & material losses out of all European countries, in terms of total population and national assets.

#unREDEEMed #WarLossesReport #RaportStratWojennych #ReparationsNow #ReparacjedlaPolski @arekmularczyk @PolandMFA @michalrachon
64% of #Polish people say #Germany should pay reparations for the damage it caused to #Poland during #WWII according to a new Social Changes opinion poll (source: @wPolityce_pl, Sept 4)

#WarLossesReport #RaportStratWojennych #ReparacjedlaPolski #ReparationsForPoland #Statistics
Read 5 tweets
With #rstats, it's dead-simple to implement logistic regression or Poisson regressions. Or any other kind of generalized linear model.

Here's how you can do that with {stats} or with {tidymodels}. 🧵
#Statistics #MachineLearning
Need to brush up on the math behind these models before we get started?

My most popular thread may help you.
One more hint before we start:

All of my code examples can be copied from my newest blog post.

The data that I use here comes from {palmerpenguins}. And we're going to classify a penguin's sex based on its weight, species and bill length. 🐧 🐧

albert-rapp.de/posts/14_glms/…
Read 18 tweets
Box plot is also called box and whiskers plot.

It shows several summary statistics that give insight into the data: the median (Q2), first quartile (Q1), third quartile (Q3), minimum value, maximum value, outliers, and interquartile range (IQR).
1. The line in the middle of the box is the median value.
50% of the data is on one side of the median, and 50% of the data is on the other side of the median.

2. The first quartile (Q1) is the median of the lower half of the data set, i.e. the middle value between the minimum value and the median. (See figure)
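
The quantities a box plot draws can be sketched in a few lines. This assumes the "median of each half" quartile convention and the common 1.5 × IQR rule for flagging outliers; other conventions exist and give slightly different quartiles. The data values are made up.

```python
def median(sorted_values):
    """Middle value of an already sorted list (mean of the two middle values if even)."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2

def box_plot_stats(values):
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]                               # values below the median
    upper = data[half + 1:] if n % 2 else data[half:]  # values above the median
    q1, q2, q3 = median(lower), median(data), median(upper)
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "min": data[0], "q1": q1, "median": q2, "q3": q3, "max": data[-1],
        "iqr": iqr,
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }

stats = box_plot_stats([2, 4, 5, 7, 8, 9, 11, 12, 30])
print(stats)
```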
Read 6 tweets
⌛️ Making a Career in Data Science!

For folks who want to break into Data Science but are not sure how: here is the list of skills you need, in the exact order below.

A thread 🧵👇
SQL - Knowledge of querying databases to fetch relevant data points is important throughout your career. #SQL

1/8
Python - If you are in the tech domain, coding is a must-have skill, and Python is the most commonly used language in Data Science. #Python

2/8
Read 10 tweets
And let us not forget that our #statistics in #finance event will be hot on the heels of tomorrow's rapid-fire session...

... 🤔there is a distinct heat/fire theme to our recent tweets!

@RoyalStatSoc
Registration is still live for this hybrid event where you can hear representatives from @LeedsBS and @4mostEurope

Graduates, take note @leedsmathsoc @LeedsMaths

rss.org.uk/training-event…
Read 4 tweets
Ever heard of logistic regression? Or Poisson regression? Both are generalized linear models (GLMs).

They're versatile statistical models. And by now, they've probably been reframed as super hot #MachineLearning. You can brush up on their math with this 🧵. #rstats #Statistics
Let's start with logistic regression. Assume you want to classify a penguin as male or female based on its

* weight,
* species and
* bill length

Better yet, let's make this specific. Here's a data viz for this exact scenario. It is based on the {palmerpenguins} data set.
As you can see, the male and female penguins form clusters that do not overlap too much.

However, regular linear regression (LR) won't help us to distinguish them. Think about it. Its output is something numerical. Here, we want to find classes.
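
To make the contrast concrete, here is a from-scratch sketch of how logistic regression turns a numerical score into a class probability. It uses a single feature and made-up body masses, not the {palmerpenguins} data, and plain gradient descent rather than the fitting routine a GLM library would use.

```python
import math

def sigmoid(z):
    """Squash any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical body masses in kg with made-up labels (1 = male, 0 = female).
masses = [3.0, 3.2, 3.4, 3.6, 4.0, 4.2, 4.4, 4.6]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

center = sum(masses) / len(masses)       # centering keeps gradient descent well behaved
features = [m - center for m in masses]

weight, bias, learning_rate = 0.0, 0.0, 0.5
for _ in range(2000):
    # Gradient of the average log-loss with respect to weight and bias.
    errors = [sigmoid(weight * x + bias) - y for x, y in zip(features, labels)]
    weight -= learning_rate * sum(e * x for e, x in zip(errors, features)) / len(features)
    bias -= learning_rate * sum(errors) / len(features)

def probability_male(mass_kg):
    return sigmoid(weight * (mass_kg - center) + bias)

light, heavy = probability_male(3.1), probability_male(4.5)
print(round(light, 3), round(heavy, 3))  # classify as male when the probability exceeds 0.5
```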
Read 25 tweets
BITCOIN ON-CHAIN STATISTICS
 
With the FUD around the bankruptcy of Celsius, $BTC is witnessing one of the worst declines since 2022. Let’s check the on-chain statistics of BTC.
 
#onchain #statistics
BITCOIN ON-CHAIN STATISTICS
 
The on-chain volume on the Bitcoin network is at a 6-month low and has been moving toward its lowest level since Jul 2021.
 
#onchain #statistics
BITCOIN ON-CHAIN STATISTICS
 
This shows that, although the market has grown since 2021 despite the recent major decline, the public is showing little interest in BTC these days.
 
#onchain #statistics
Read 5 tweets
1/ "Software is eating the world. Machine learning is eating software. Transformers are eating machine learning."

Let's understand what these Transformers are all about

#DataScience #MachineLearning #DeepLearning #100DaysOfMLCode #Python #pythoncode #AI #DataAnalytics
2/ The #Transformers architecture follows an encoder-decoder structure.

The encoder receives an input sequence and creates an intermediate representation by applying embeddings and an attention mechanism.

#DataScience #MachineLearning #DeepLearning #100DaysOfMLCode #Python #pythoncode #AI
3/ Then, this intermediate representation or hidden state will pass through the decoder, and the decoder starts generating an output sequence.

#DataScience #MachineLearning #DeepLearning #100DaysOfMLCode #Python #pythoncode #AI #DataScientist #DataAnalytics #Statistics
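
The attention step mentioned above can be sketched as scaled dot-product self-attention. This is a simplification: a real Transformer layer first applies learned query, key and value projections and uses several heads, which are omitted here. The three 2-dimensional "embeddings" are made up.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Each query gets a weighted average of the values, weighted by query-key similarity."""
    dim = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        mixed = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights

# Hypothetical token embeddings; self-attention uses them as queries, keys and values.
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = attention(embeddings, embeddings, embeddings)
print(weights[0])  # how much token 0 attends to each of the three tokens
```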
Read 14 tweets
But what does a p-value mean in #MachineLearning? - A thread

It tells you how likely it is that your data could have occurred under the null hypothesis

1/n

#DataScience #DeepLearning #ComputerVision #100DaysOfMLCode #Python #DataScientist #Statistics #programming #Data #Math #Stat
2/n
What Is a Null Hypothesis?

A null hypothesis is a type of statistical hypothesis that proposes that no real effect or relationship exists in a set of given observations, so any apparent pattern is due to chance.

#DataScience #MachineLearning #100DaysOfMLCode #Python #stat #Statistics #Data #AI #Math #deeplearning
3/n
A P-value is the probability of obtaining an effect at least as extreme as the one in your sample data, assuming the truth of the null hypothesis

#DataScience #MachineLearning #100DaysOfMLCode #Python #DataScientist #Statistics #Data #DataAnalytics #AI #Math
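
A worked example of that definition, assuming a coin-flip experiment that is not part of the thread: the null hypothesis is a fair coin, and the p-value adds up the probability of every outcome that is at least as unlikely as the one observed.

```python
from math import comb

def two_sided_binomial_p_value(heads, flips, p_null=0.5):
    """Probability, under the null, of an outcome at least as unlikely as `heads`."""
    pmf = [comb(flips, k) * p_null**k * (1 - p_null) ** (flips - k) for k in range(flips + 1)]
    observed = pmf[heads]
    # Small tolerance so outcomes that are equally likely are counted despite rounding.
    return sum(p for p in pmf if p <= observed * (1 + 1e-9))

p_value = two_sided_binomial_p_value(heads=15, flips=20)
print(round(p_value, 4))  # about 0.0414, below the common 0.05 threshold
```

Note that this is the probability of data this extreme given the null hypothesis, not the probability that the null hypothesis is true.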
Read 11 tweets
1/ One way to test whether a time series is stationary is to perform an augmented Dickey-Fuller test - A Thread

#DataScience #MachineLearning #DeepLearning #100DaysOfMLCode #Python #pythoncode #AI #DataScientist #DataAnalytics #Statistics #programming #ArtificialIntelligence
2/ H0: The time series is non-stationary. In other words, it has some time-dependent structure and does not have constant variance over time.

HA: The time series is stationary.

#DataScience #MachineLearning #DeepLearning #100DaysOfMLCode #Python #pythoncode #AI #DataScientist
3/ If the p-value from the test is less than some significance level (e.g. α = .05), then we can reject the null hypothesis and conclude that the time series is stationary.

#DataScience #MachineLearning #DeepLearning #100DaysOfMLCode #Python #pythoncode #AI #DataScientist
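
A rough sketch of the idea, under loud assumptions: this is the plain Dickey-Fuller regression with a constant and no lagged differences (so not "augmented"), and it compares the statistic against -2.86, an approximate large-sample 5% critical value, instead of computing a p-value. For real work a library routine such as `adfuller` in statsmodels is the usual choice. The two series are simulated.

```python
import random

def dickey_fuller_stat(series):
    """t-statistic of the slope in the regression: diff(y)[t] = a + b * y[t-1] + noise."""
    lagged = series[:-1]
    diffs = [curr - prev for prev, curr in zip(series[:-1], series[1:])]
    n = len(diffs)
    mean_x, mean_y = sum(lagged) / n, sum(diffs) / n
    sxx = sum((x - mean_x) ** 2 for x in lagged)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lagged, diffs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(lagged, diffs))
    standard_error = (rss / (n - 2) / sxx) ** 0.5
    return slope / standard_error

rng = random.Random(0)
noise = [rng.gauss(0, 1) for _ in range(500)]  # stationary white noise
walk, level = [], 0.0
for shock in noise:                            # random walk: cumulative sum of the noise
    level += shock
    walk.append(level)

CRITICAL_5_PERCENT = -2.86  # approximate value, assumed here for illustration
stat_noise = dickey_fuller_stat(noise)
stat_walk = dickey_fuller_stat(walk)
for name, stat in [("white noise", stat_noise), ("random walk", stat_walk)]:
    verdict = "reject H0 (looks stationary)" if stat < CRITICAL_5_PERCENT else "cannot reject H0"
    print(name, round(stat, 2), verdict)
```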
Read 8 tweets
2/ It is important to standardize variables before running cluster analysis. This is because cluster analysis techniques depend on measuring the distance between the observations we're trying to cluster.

#DataScience #MachineLearning #DeepLearning
3/ If a variable is measured on a larger scale than the other variables, then whatever distance measure we use will be overly influenced by that variable.

#DataScience #MachineLearning #DeepLearning #100DaysOfMLCode #Python #pythoncode #AI #DataScientist #DataAnalytics #Statistics
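
A small sketch of the effect, using made-up (income, age) rows and z-score standardization as one common choice of scaling. On the raw data the income column dominates the Euclidean distance; after standardizing, the nearest neighbour of the first customer changes.

```python
import math
from statistics import mean, pstdev

# Hypothetical customers: (annual income in dollars, age in years).
rows = [(30000, 25), (30500, 60), (31000, 26), (33000, 58)]

def standardize(data):
    """Rescale every column to mean 0 and standard deviation 1 (z-scores)."""
    columns = list(zip(*data))
    means = [mean(col) for col in columns]
    stds = [pstdev(col) for col in columns]
    return [tuple((v - m) / s for v, m, s in zip(row, means, stds)) for row in data]

def distance(p, q):
    """Euclidean distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

scaled = standardize(rows)

raw_ab, raw_ac = distance(rows[0], rows[1]), distance(rows[0], rows[2])
scaled_ab, scaled_ac = distance(scaled[0], scaled[1]), distance(scaled[0], scaled[2])

print(raw_ab, raw_ac)        # raw: the 60-year-old looks closer, because income dominates
print(scaled_ab, scaled_ac)  # standardized: the 26-year-old is closer
```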
Read 16 tweets
Do you know how TensorFlow can run on a single mobile device as well as on an entire data center? Read this thread

1/n

#TensorFlow #DataScience #DeepLearning #MachineLearning #ComputerVision #100DaysOfMLCode #Python #DataScientist #Statistics #programming #Data
2/n
Google has designed TensorFlow such that it is capable of dividing a large model graph whenever needed.

#TensorFlow #DataScience #DeepLearning #MachineLearning #ComputerVision #100DaysOfMLCode #Python #DataScientist #Statistics #programming #Data #Math #Stat #AI
3/n
It assigns special SEND and RECV nodes whenever a graph is divided between multiple devices (CPUs or GPUs).

#TensorFlow #DataScience #DeepLearning #MachineLearning #ComputerVision #100DaysOfMLCode #Python #DataScientist #Statistics #programming #Data #Math #Stat #AI
Read 9 tweets
2/16

"roc_auc_score" is defined as the area under the ROC curve, which is the curve having False Positive Rate on the x-axis and True Positive Rate on the y-axis at all classification thresholds.

#DataScience #MachineLearning #DeepLearning #100DaysOfMLCode #Python
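
A from-scratch sketch of that definition: sweep the threshold over every distinct score, record the (FPR, TPR) point at each one, and integrate with the trapezoid rule. It assumes binary labels with at least one positive and one negative; the four labels and scores are illustrative. In practice `roc_auc_score` from scikit-learn does this for you.

```python
def roc_points(y_true, y_score):
    """(FPR, TPR) at every threshold, from the strictest threshold to the loosest."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(y_score), reverse=True):
        tp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def area_under_curve(points):
    """Trapezoid rule over consecutive (x, y) points."""
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(points, points[1:]))

y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
points = roc_points(y_true, y_score)
auc = area_under_curve(points)
print(points, auc)
```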
Read 16 tweets
2/n

Alibi Detect is a Python library for detecting outliers, adversarial data, and drift. It accommodates tabular data, text, images, and time series, and can be used both online and offline. Both TensorFlow and PyTorch backends are supported.

#DataScience #DeepLearning
3/n

Supports a variety of outlier detection techniques, including Mahalanobis distance, Isolation forest, and Seq2seq

#DataScience #DeepLearning #MachineLearning #ComputerVision #100DaysOfMLCode #Python #DataScientist #Statistics #programming #Data #Math #Stat #pythoncode
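
To show what one of these techniques measures, here is a hand-rolled Mahalanobis distance for 2-D points. This is not the Alibi Detect API, only a sketch of the underlying idea with made-up data: distance is measured relative to the spread and correlation of the reference data, so a point that breaks the trend scores much higher than one that follows it.

```python
def mahalanobis_2d(point, data):
    """Mahalanobis distance of `point` from the mean of `data` (list of (x, y) pairs)."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    # Sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((x - mean_x) ** 2 for x, _ in data) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in data) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in data) / (n - 1)
    det = sxx * syy - sxy**2
    dx, dy = point[0] - mean_x, point[1] - mean_y
    # d^2 = diff^T * inverse(covariance) * diff, written out for the 2x2 case.
    squared = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return squared**0.5

# Hypothetical reference data with a strong upward trend.
reference = [(1.0, 2.0), (2.0, 3.1), (3.0, 3.9), (4.0, 5.2), (5.0, 6.0)]

along_trend = mahalanobis_2d((6.0, 7.0), reference)  # far from the mean, but follows the trend
off_trend = mahalanobis_2d((3.0, 7.0), reference)    # nearer in plain distance, breaks the trend
print(round(along_trend, 2), round(off_trend, 2))
```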
Read 10 tweets
1/ Can you classify something without having seen it before? That's what Zero-Shot Learning is all about - A Thread

👉 One of the popular methods for zero-shot learning is Natural Language Inference (NLI).

#DataScience #DeepLearning #MachineLearning #100DaysOfMLCode #Pytho
3/ In zero-shot classification, we ask the model to classify a sentence into one of the classes (labels) that the model hasn't seen during training.

#DataScience #MachineLearning #100DaysOfMLCode #DataScientist #Statistics #programming #ArtificialIntelligence #Data #pythoncode #AI
Read 13 tweets
1/ Why do we need the bias term in ML algorithms such as linear regression and neural networks? - A thread

#DataScience #MachineLearning #100DaysOfMLCode #DataScientist #Statistics #programming #ArtificialIntelligence #Data #pythoncode #AI #Stats #DeepLearning #100DaysOfCode
2/ In linear regression, without the bias term your solution has to go through the origin. That is, when all of your features are zero, your predicted value would also have to be zero.

#DataScience #MachineLearning #100DaysOfMLCode #DataScientist #Statistics #programming
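
A quick illustration with made-up data that follows y = 2x + 5 exactly. Fitting by least squares with a bias term recovers the line; forcing the fit through the origin distorts the slope and predicts 0 at x = 0.

```python
xs = [1, 2, 3, 4]
ys = [7, 9, 11, 13]  # hypothetical data generated from y = 2x + 5

def fit_with_bias(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def fit_without_bias(xs, ys):
    """Least squares for y = slope * x, a line forced through the origin."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

slope, intercept = fit_with_bias(xs, ys)
slope_no_bias = fit_without_bias(xs, ys)

print(slope, intercept)  # 2.0 and 5.0: the true line
print(slope_no_bias)     # about 3.67: the slope is inflated to compensate for the missing offset
```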
Read 7 tweets
Google has released Imagen: a text-to-image diffusion model with an unprecedented degree of photorealism and a deep level of language understanding

#Python3 #MachineLearning #DataScience #100DaysOfCode #DataScience #DataAnalytics #100DaysOfMLCode #DataScientist #Statistics
Imagen builds on the power of large transformer language models in understanding text and hinges on the strength of diffusion models in high-fidelity image generation.

#Python3 #MachineLearning #DataScience #100DaysOfCode #DataScience #DataAnalytics #100DaysOfMLCode
This generator is scarily accurate with super-resolution! "A photo of a raccoon wearing an astronaut helmet, looking out of the window at night."

#Python3 #MachineLearning #DataScience #100DaysOfCode #DataScience #DataAnalytics #100DaysOfMLCode #DataScientist #Statistics
Read 4 tweets
What is a p-value? - A thread

It tells you how likely it is that your data could have occurred under the null hypothesis.

1/n

#DataScience #DeepLearning #MachineLearning #ComputerVision #100DaysOfMLCode #Python #DataScientist #Statistics #programming #Data #Math #Stat
2/n
What Is a Null Hypothesis?

A null hypothesis is a type of statistical hypothesis that proposes that no real effect or relationship exists in a set of given observations, so any apparent pattern is due to chance.

#DataScience #MachineLearning #100DaysOfMLCode #Python #stat #Statistics #Data #AI #Math #deeplearning
3/n
A P-value is the probability of obtaining an effect at least as extreme as the one in your sample data, assuming the truth of the null hypothesis

#DataScience #MachineLearning #100DaysOfMLCode #Python #DataScientist #Statistics #Data #DataAnalytics #AI #Math
Read 10 tweets
1/ #MachineLearning #Interview question -
Why does L1 regularization cause parameter sparsity whereas L2 regularization does not?

#DataScience #MachineLearning #100DaysOfMLCode #DataScientist #Statistics #programming #ArtificialIntelligence #Data #pythoncode #AI #Stats
2/ L1 & L2 regularization add constraints to the optimization problem. The curve H0 is the hypothesis. The solution to this system is the set of points where H0 meets the constraints.

#DataScience #MachineLearning #100DaysOfMLCode #DataScientist #Statistics #programming
3/ Regularization, in statistics and in machine learning, is used to include extra information in order to solve a problem in a better way.

#DataScience #MachineLearning #100DaysOfMLCode #DataScientist #Statistics #programming #ArtificialIntelligence #Data
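
One way to see the sparsity effect is a one-parameter toy problem, chosen here for illustration rather than taken from the thread. Minimizing 0.5 * (w - a)^2 + lam * |w| gives the soft-thresholding rule, which sets small coefficients exactly to zero; minimizing 0.5 * (w - a)^2 + 0.5 * lam * w^2 gives w = a / (1 + lam), which shrinks every coefficient but never reaches zero.

```python
import math

def l1_solution(a, lam):
    """Minimizer of 0.5 * (w - a)**2 + lam * abs(w): soft-thresholding."""
    return math.copysign(max(abs(a) - lam, 0.0), a)

def l2_solution(a, lam):
    """Minimizer of 0.5 * (w - a)**2 + 0.5 * lam * w**2: proportional shrinkage."""
    return a / (1.0 + lam)

unregularized = [3.0, -0.4, 0.2, -2.5]  # hypothetical unpenalized coefficients
lam = 0.5

l1_weights = [l1_solution(a, lam) for a in unregularized]
l2_weights = [l2_solution(a, lam) for a in unregularized]

print(l1_weights)  # the two small coefficients become exactly zero
print(l2_weights)  # every coefficient shrinks, none becomes zero
```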
Read 7 tweets
Galveston: In 1900, Galveston, population 36,000, was obliterated, making it the site of the worst natural disaster in US history, a sad record it retains to this day. 6,000 to 12,000 were killed, thousands of buildings wiped out, etc. 1/
2/ Galveston was rebuilt with a massive seawall and the land was raised as much as 17 feet. 15 years later, another similar hurricane and storm surge hit. The wall worked. Only 53 were killed. #Galveston #Texas #hurricane #UShistory
3/ 1900 #Galveston hurricane: Galveston was one of Texas' largest cities and was very, very wealthy. Disregarding #statistics #probabilities #science #data, plans for a seawall to protect the sandbar city were dismissed as ridiculous. #nature
Read 6 tweets
✔️ The difference between correlation and regression: towardsdatascience.com/the-difference…

#Correlation #Regression #Statistics #DataScience
1. Correlation: Correlation is a statistical measure that expresses the linear relation between two variables.
2. Regression: Regression analysis is a mathematical technique used to analyze data consisting of a dependent variable and one (or more) independent variables, with the aim of finding a functional relationship between the dependent variable and the independent ones.
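
A short sketch of both quantities on the same made-up data: the Pearson correlation is a single unit-free number between -1 and 1, while simple linear regression returns a slope and intercept that can be used to predict y from x.

```python
import math

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]  # hypothetical observations

mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

# Correlation: strength and direction of the linear relation.
r = sxy / math.sqrt(sxx * syy)

# Regression: the least-squares line y = intercept + slope * x.
slope = sxy / sxx
intercept = mean_y - slope * mean_x

print(round(r, 4), slope, intercept)
print(intercept + slope * 6)  # regression can predict y for a new x; correlation cannot
```

The two are linked: the slope equals r multiplied by the ratio of the standard deviations of y and x.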
Read 3 tweets
