Another #ChelseaExplains 🧵 (trying to start with simpler topics).

Today that's 💫Conjugate Priors💫
First, PRIORS. In Bayesian Statistics, we use probability distributions (like a normal, Cauchy, beta...) to represent uncertainty about the value of parameters.

Instead of choosing ONE number for a param, a distribution describes how likely different values are.

[Figure: a distribution with the x-axis labeled "Possible Parameter Values"]
Bayesian Stats works by taking previously known, PRIOR information (this can be from prior data, domain expertise, regularization...) about the parameter

and combining it with data to make the POSTERIOR (the distribution of parameter values AFTER including data)
But that's all conceptual. Math-wise, the way this works is by taking the function for the prior (the Probability Density Function for continuous parameters, or Probability Mass Function for discrete ones) and multiplying it by the function for the likelihood of the data*
* we're ignoring the marginal likelihood (the normalizing constant) here (again, this thread is aimed at intuition, not a full derivation)🚨
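That multiply-and-normalize step can be sketched numerically with a grid approximation. This is a minimal illustration with made-up numbers (a Beta(2, 2) prior and 7 heads in 10 coin flips are my own assumptions, not from the thread):

```python
import numpy as np
from scipy import stats

# Grid of candidate values for a coin's heads probability
theta = np.linspace(0.001, 0.999, 999)

# Prior: Beta(2, 2) density evaluated on the grid
prior = stats.beta.pdf(theta, 2, 2)

# Likelihood: probability of seeing 7 heads in 10 flips at each theta
likelihood = stats.binom.pmf(7, 10, theta)

# Posterior ∝ prior × likelihood; normalize so it sums to 1 on the grid
unnormalized = prior * likelihood
posterior = unnormalized / unnormalized.sum()

# The grid mode lands near 2/3, matching the Beta(9, 5) conjugate answer
print(theta[np.argmax(posterior)])
```

The normalization line is exactly where the ignored marginal would go: it just rescales the curve so it's a proper distribution.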
Sometimes multiplying the likelihood and the prior gets...messy (and luckily IRL we have MCMC algos to help us!)

But when doing things by hand, we like cases where the math is easy.

E.g. it would be REALLY NICE if the prior and posterior had the same distribution 💡
A conjugate prior is a prior distribution that, paired with a specific likelihood function, makes the posterior distribution the SAME type of distribution as the prior!
For example,

a 💫beta prior💫 + a binomial likelihood = 💫beta💫 posterior
a ✨normal✨ prior + a normal likelihood = ✨normal✨ posterior
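The beta-binomial case is a one-liner: Beta(a, b) prior + (heads, flips) data → Beta(a + heads, b + flips − heads) posterior. A quick sketch with made-up numbers (the Beta(2, 2) prior and 7/10 flips are illustrative assumptions):

```python
from scipy import stats

# Hypothetical data: 7 heads in 10 coin flips
heads, flips = 7, 10

# Beta(a, b) prior on the heads probability
a, b = 2, 2

# Conjugate update: beta prior + binomial likelihood -> beta posterior
post = stats.beta(a + heads, b + flips - heads)

print(post.mean())  # posterior mean = 9/14 ≈ 0.643
```

No integration, no sampling: the update is just adding the counts to the prior's parameters.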
Now that a lot of Bayesian Stats is done via sampling algorithms like a Gibbs Sampler (JAGS, BUGS) or Hamiltonian Monte Carlo (Stan, PyMC3), we don't have to worry about conjugate priors as much, because we can estimate the posterior whether or not the prior is conjugate
But conjugate priors still provide a nice way to demonstrate concepts, and when appropriate are a LOT QUICKER than a solution found by an MCMC algo, because they have a "closed-form solution" (meaning we can solve for an exact answer)
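For instance, the normal-normal case (normal prior on a mean, normal likelihood with known variance) has an exact closed-form posterior. A sketch with made-up numbers; the prior, the known data standard deviation, and the observations are all illustrative assumptions:

```python
import numpy as np

# Normal prior on an unknown mean (assumed values)
mu0, tau0 = 0.0, 1.0   # prior mean and prior std dev
sigma = 2.0            # KNOWN data std dev (required for this conjugate pair)
data = np.array([1.2, 0.8, 1.5, 0.9, 1.1])  # made-up observations
n = len(data)

# Closed-form conjugate update, working in precisions (1 / variance):
prior_prec = 1 / tau0**2
data_prec = n / sigma**2
post_var = 1 / (prior_prec + data_prec)

# Posterior mean is a precision-weighted average of prior mean and data mean
post_mean = post_var * (prior_prec * mu0 + data_prec * data.mean())

print(post_mean, np.sqrt(post_var))  # exact answer, no MCMC needed
```

The posterior mean sits between the prior mean and the sample mean, pulled toward whichever source has more precision.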

FIN

again, critiques in GIF form only
( I HATE POSTING formulas and math on Twitter because inevitably I will have made a typo...🥲 But I did it for you #statsTwitter, I did it for you)

Keep Current with Chelsea Parlett-Pelleriti
