Selçuk Korkmaz
Jul 25 · 10 tweets · 2 min read
1/ 🤔 Ever wonder how "bootstrapping" works? I recently used it for estimating confidence intervals & someone asked me about its logic. At first, I was stumped, even though I've used it often! Here's my attempt to clarify.

#Statistics #Bootstrapping #DataScience 📈📉 https://towardsdatascience.com/bootstrapping-statistics-what-it-is-and-why-its-used-e2fa29577307
2/ 🥾 What's bootstrapping? It's a resampling technique where you draw many resamples from your sample data (with replacement, each the same size as the original) & compute your statistic on each one. The idea? The spread of those results gives us insight into the variability of our estimate.
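A minimal Python sketch of the resampling step (not from the thread). The data values, the function name `bootstrap_means`, and the choice of 2000 resamples are all assumptions made for illustration. Each resample is drawn with replacement and has the same size as the original sample.

```python
import random
import statistics

def bootstrap_means(sample, n_resamples=2000, seed=0):
    """Return the mean of each bootstrap resample of `sample`."""
    rng = random.Random(seed)
    n = len(sample)
    means = []
    for _ in range(n_resamples):
        # Draw n values WITH replacement: some points repeat, others are left out.
        resample = rng.choices(sample, k=n)
        means.append(statistics.fmean(resample))
    return means

# Made-up data, purely for illustration.
sample = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.9, 4.4, 5.0, 5.6]

boot_means = bootstrap_means(sample)
# The spread of the resampled means estimates the standard error of the mean.
boot_se = statistics.stdev(boot_means)
print(f"bootstrap standard error of the mean: {boot_se:.3f}")
```

The standard deviation of `boot_means` plays the role of a standard error: it describes how much the estimate moves from one resample to the next.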
3/ 🤷‍♂️ But how do we go from understanding our sample to drawing conclusions about a larger population? Here’s the tricky part. The underlying assumption is that our sample is a good representation of the population.
4/ 💡 If the original sample is representative, resampling from it mimics drawing multiple samples from the population. By assessing the variability of our estimate across bootstrapped samples, we approximate how much it would vary from sample to sample in the population.
5/ 🎯 Remember, statistics is about estimation. With bootstrapping, we're creating a distribution of estimates. This distribution helps us understand how stable or variable our original estimate might be.
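One common way to turn that distribution of estimates into a confidence interval is the percentile method: sort the bootstrap estimates and read off the 2.5th and 97.5th percentiles. The sketch below assumes made-up data and a hypothetical helper `percentile_ci`; it is one of several bootstrap interval methods, not necessarily the one the author used.

```python
import random
import statistics

def percentile_ci(sample, stat, n_resamples=2000, seed=0):
    """Approximate 95% percentile bootstrap interval for stat(sample)."""
    rng = random.Random(seed)
    n = len(sample)
    # One estimate per resample, sorted so percentiles can be read by index.
    estimates = sorted(
        stat(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    lo = estimates[n_resamples * 25 // 1000]       # ~2.5th percentile
    hi = estimates[n_resamples * 975 // 1000 - 1]  # ~97.5th percentile
    return lo, hi

# Made-up data, purely for illustration.
sample = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.9, 4.4, 5.0, 5.6]

low, high = percentile_ci(sample, statistics.median)
print(f"95% percentile interval for the median: [{low:.2f}, {high:.2f}]")
```

Passing the statistic as a function (`statistics.median` here) shows why the approach is flexible: the same resampling loop works for statistics that have no simple standard-error formula.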
6/ 🔄 Think of it as a simulated "what if" scenario. What if we took many samples from the population? Bootstrapping replicates that process by resampling from our best available representation of the population - our sample!
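The "what if" comparison can be simulated when we invent a population ourselves. In this sketch the population (normal, mean 50, SD 10), the sample size, and the seed are all assumptions; in real work the population is unknown, which is exactly why we resample the sample instead.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical population that we pretend to know in full.
population = [rng.gauss(50, 10) for _ in range(100_000)]

n = 100        # sample size
n_reps = 2000  # number of repeated samples / bootstrap resamples

# "What if" we could sample the population many times?
repeated_means = [
    statistics.fmean(rng.sample(population, n)) for _ in range(n_reps)
]

# In practice we have ONE sample, so we resample from it with replacement.
one_sample = rng.sample(population, n)
boot_means = [
    statistics.fmean(rng.choices(one_sample, k=n)) for _ in range(n_reps)
]

se_repeated = statistics.stdev(repeated_means)
se_bootstrap = statistics.stdev(boot_means)
print(f"SE from repeated sampling: {se_repeated:.2f}")
print(f"SE from the bootstrap:     {se_bootstrap:.2f}")
```

With a representative sample the two standard errors should land close to each other, even though the bootstrap never touches the population again.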
7/ ⚠️ But there are limitations. If your initial sample is biased or unrepresentative, bootstrapping can't fix that. It can only provide information based on the data you have. Hence, ensuring a good sample is crucial.
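A sketch of that limitation, using an invented population and a deliberately biased selection rule (only values above 55 can be sampled). Both are assumptions for illustration. The bootstrap interval comes out narrow and confident, yet it misses the true mean entirely.

```python
import random
import statistics

rng = random.Random(7)

# Hypothetical population with a true mean near 50.
population = [rng.gauss(50, 10) for _ in range(20_000)]
true_mean = statistics.fmean(population)

# Biased sampling (made up): only units with values above 55 can be selected.
eligible = [x for x in population if x > 55]
biased_sample = rng.sample(eligible, 50)

n_resamples = 2000
boot_means = sorted(
    statistics.fmean(rng.choices(biased_sample, k=len(biased_sample)))
    for _ in range(n_resamples)
)
ci_low = boot_means[n_resamples * 25 // 1000]        # ~2.5th percentile
ci_high = boot_means[n_resamples * 975 // 1000 - 1]  # ~97.5th percentile

print(f"true mean: {true_mean:.1f}")
print(f"bootstrap 95% interval from biased sample: [{ci_low:.1f}, {ci_high:.1f}]")
```

Resampling only reshuffles the data we already have, so any selection bias in the original sample is carried into every resample.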
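8/ 🔍 Also, bootstrapping isn't a silver bullet for every statistical scenario. But it's especially useful when the underlying distribution is unknown or when a statistic's standard error has no simple formula. Be careful with very small samples: they may not represent the population well, and bootstrapping can't compensate for that.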
9/ 🌟 So, the leap from understanding our sample to making inferences about the population using bootstrapping is rooted in the idea that by understanding variability in our sample, we get a window into variability in the larger population.
10/ 🚀 In essence, bootstrapping takes our single sample & amplifies its insights, giving us a richer perspective. It’s a powerful tool in our statistical arsenal, as long as we remember its assumptions & limitations.

More from @selcukorkmaz

Jul 21
🧵 Difference Between Confidence Interval & Credible Interval
1/ Intro
Both Confidence Intervals (CIs) and Credible Intervals (CrIs) provide a range for estimating an unknown parameter. But they're based on different philosophies and interpretations. #DataScience #Stats
2/ Confidence Interval (CI) 📉
•Based on frequentist statistics.
•If we were to repeat a study many times, ~95% (or another chosen level) of the CIs would contain the true parameter.
•It's about the procedure: the 95% describes how often such intervals capture the true value over repeated studies, not the probability that any single interval contains it.
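The repeated-study interpretation can be checked by simulation. This sketch assumes a normal population with a known standard deviation (so the interval is mean ± 1.96·σ/√n); the parameter values and the number of simulated studies are made up.

```python
import math
import random
import statistics

rng = random.Random(1)

true_mu, sigma, n = 10.0, 2.0, 30  # assumed population parameters
n_studies = 2000                   # number of simulated repeat studies
half_width = 1.96 * sigma / math.sqrt(n)  # sigma is treated as known

covered = 0
for _ in range(n_studies):
    sample = [rng.gauss(true_mu, sigma) for _ in range(n)]
    m = statistics.fmean(sample)
    # Does this study's 95% interval contain the true parameter?
    if m - half_width <= true_mu <= m + half_width:
        covered += 1

coverage = covered / n_studies
print(f"share of intervals containing the true mean: {coverage:.3f}")
```

Each individual interval either contains the true mean or it doesn't; the roughly 95% figure only appears across the whole collection of studies.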
3/ Credible Interval (CrI) 📈
•Based on Bayesian statistics.
•Gives a probability that the true parameter is within a specific range.
•It's about the probability of the parameter being in that interval, given the observed data.
Read 8 tweets
Jul 21
1/ 🧵 Let's dive into a common statistical question: When calculating standard deviation, why do we square the differences rather than taking their absolute value? Let's break this down. 📊 #DataScience #rstats
2/ Historical Context:
To start, the idea of squaring differences has a historical basis. It goes back to the method of least squares, published by Legendre in 1805 and by Gauss in 1809. The term "standard deviation" was coined later by Karl Pearson in the 1890s, and "variance" by R. A. Fisher in 1918.
3/ Differentiability:
One of the main reasons is mathematical convenience. Squaring makes the function differentiable everywhere, which is not the case for absolute differences. This is crucial for calculus-based optimization methods in statistics.
Read 10 tweets
Jul 19
1/10 🧵 Dive into Data Visualization with #ggplot2! 📊
Let's explore the foundation of this popular #R package and how to create stunning plots using its components. Follow along! #DataScience #Rstats https://www.cedricscherer.com/2019/08/05/a-ggplot2-tutorial-for-beautiful-plotting-in-r/
2/10 🖼️ The Canvas:
ggplot(data = your_data) creates the canvas. Every ggplot plot begins here. You're specifying the dataset you're working with. But just this alone won’t visualize anything! #RStats #DataScience
3/10 🎨 Aesthetics (aes):
This is where you map variables to visual properties (like x and y axes). For instance, aes(x = variable1, y = variable2) would plot variable1 on the x-axis and variable2 on the y-axis. #RStats #DataScience
Read 10 tweets
Jul 13
1/15 🧵 Want to level up your #R programming skills? Whether you're a beginner or an intermediate R user, this thread is for you! Follow along for valuable tips, resources, and strategies to become a more confident and skilled R programmer. 🚀 #RStats #DataScience
2/15 R is a powerful language for data manipulation, analysis and visualization. To elevate your skills, start by understanding the language at its core. This includes the syntax, data types, vectors, matrices, lists, and data frames. #RStats
3/15 Get familiar with the most commonly used packages in R. Some of these include #tidyverse (data manipulation), #ggplot2 (visualization), and #caret (machine learning). Learning how to effectively utilize these packages can greatly enhance your capabilities.
Read 15 tweets
May 7
1/ 📊📈 Let's dive into the fascinating world of #statistics and explore two key concepts: Odds Ratio and Relative Risk! Understanding the differences and applications of these two measures is crucial for interpreting study results and making informed decisions. #DataScience
2/ 🎲 Odds Ratio (OR): The Odds Ratio is a measure of association between an exposure and an outcome. It represents the odds of an event occurring in one group compared to the odds in another group. OR is particularly useful in case-control studies. #DataScience
3/ 🌡️ Relative Risk (RR): Also known as Risk Ratio, RR is the ratio of the probability of an event occurring in the exposed group to the probability of the event occurring in the non-exposed group. RR is often used in cohort studies to assess risk. #DataScience
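A small worked example of both measures on the same 2x2 table. The counts are invented for illustration and the variable names are assumptions, not part of the thread.

```python
# Hypothetical 2x2 table (counts are invented for illustration).
exposed_events, exposed_total = 30, 100
unexposed_events, unexposed_total = 10, 100

# Relative Risk: ratio of event probabilities.
risk_exposed = exposed_events / exposed_total
risk_unexposed = unexposed_events / unexposed_total
relative_risk = risk_exposed / risk_unexposed

# Odds Ratio: ratio of event odds (events / non-events).
odds_exposed = exposed_events / (exposed_total - exposed_events)
odds_unexposed = unexposed_events / (unexposed_total - unexposed_events)
odds_ratio = odds_exposed / odds_unexposed

print(f"RR = {relative_risk:.2f}, OR = {odds_ratio:.2f}")
```

With these counts the RR is 3.00 and the OR is about 3.86. The two measures differ noticeably here because the event is common; they move closer together as the event becomes rare.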
Read 8 tweets
May 7
1/ 📊📏 Let's dive into the world of #statistics & explore the Levels of Measurement! Understanding these levels is crucial for choosing the right statistical methods for data analysis. Today, we'll cover the 4 main levels: Nominal, Ordinal, Interval, and Ratio. #DataScience
2/ 🏷️ Nominal Level: At this level, data is purely qualitative and categorical. There's no inherent order or ranking involved. Examples include colors, genders, or nationalities. It's important to note that mathematical operations like addition or subtraction don't apply here.
3/ 🥇🥈🥉 Ordinal Level: This level involves data that has an inherent order or ranking, but the distances between categories are not known or guaranteed to be equal. Examples include survey responses (Strongly Disagree to Strongly Agree) or educational levels (elementary, high school, college).
Read 9 tweets
