1. Does the model ACTUALLY answer a question you have?
If you don’t have a question THEN THE BEST INFERENTIAL/PREDICTIVE MODEL IS NO MODEL. Do some EDA first!
Stop doing hypothesis testing if you don’t have a hypothesis to test. ✋
2. Is the model realistic?
We love prior predictive checks bc they let u generate data based on ur priors 😻 (sketch below). But also ask whether ur leaving out important effects, whether the variable u use actually measures what you want, or whether linear relationships are realistic…
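Here's a minimal sketch of a prior predictive check in Python (everything here, from the cat-weight model to the specific prior values, is made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical model: cat weight (kg) as a linear function of age (years).
# Assumed priors, purely for illustration:
#   intercept ~ Normal(4, 1), slope ~ Normal(0, 0.5), noise sd ~ HalfNormal(1)
n_sims, n_cats = 1000, 50
age = rng.uniform(0, 15, size=n_cats)

intercept = rng.normal(4.0, 1.0, size=n_sims)
slope = rng.normal(0.0, 0.5, size=n_sims)
sigma = np.abs(rng.normal(0.0, 1.0, size=n_sims))  # half-normal draws

# Generate one fake dataset per draw from the priors.
fake_weights = (intercept[:, None] + slope[:, None] * age[None, :]
                + rng.normal(0.0, 1.0, size=(n_sims, n_cats)) * sigma[:, None])

# Sanity check: do the priors routinely produce impossible cats?
print(f"fraction of simulated weights below 0: {(fake_weights < 0).mean():.3f}")
```

If the priors keep generating impossible data (negative weights, 40 kg cats), that's a sign to rethink them before you ever touch the real data.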
3. Is your model accessible?
If no one can actually RUN the ideal model, then it's not so ideal, is it!? Software/knowledge/sample size limitations are a real-life issue.
We need to educate better, make better packages, and figure out how to make stats help accessible.
4. Are your model assumptions appropriate?
All models assume things. Even non-parametric ones. Make sure the assumptions your model is making are reasonable in your context (quick sketch of checking a couple of them below).
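As one concrete example, here's a sketch of checking two common linear-regression assumptions in Python (the data are simulated stand-ins; in practice you'd use your own, and residual plots are usually more informative than any single test):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Simulated stand-in data; swap in your own x and y.
x = rng.uniform(0, 10, size=200)
y = 2.0 + 0.7 * x + rng.normal(0, 1, size=200)

res = stats.linregress(x, y)
residuals = y - (res.intercept + res.slope * x)

# Normality of residuals (one common linear-model assumption).
print(stats.shapiro(residuals))  # large p -> no evidence against normality

# Constant variance: |residuals| shouldn't trend with x.
print(stats.pearsonr(np.abs(residuals), x))
```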
5. Is this a model you can explain well?
It is not useful to have a fancy model that no one understands. Either learn more about it first or get some help so someone else on your team understands it.
TL;DR the best model is one that
✅answers ur question
✅is realistic
✅can be RUN by you
✅has met assumptions
✅has results you can effectively communicate to others
Also the best model is Hierarchical Bayesian ZOIDBERG (ZOIB, i.e. zero-one-inflated beta, if ya nasty) 😇
First, PRIORS. In Bayesian Statistics, we use probability distributions (like a normal, Cauchy, beta...) to represent uncertainty about the value of parameters.
Instead of choosing ONE number for a param, a distribution describes how plausible a whole range of values is
Bayesian Stats works by taking previously known, PRIOR information (this can be from prior data, domain expertise, regularization...) about the parameter
and combining it with data to make the POSTERIOR (the distribution of parameter values AFTER including data)
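Here's a tiny worked example of prior → posterior, using a conjugate beta-binomial model so the update is a one-liner (the coin-flip setup and the Beta(2, 2) prior are invented for illustration):

```python
from scipy import stats

# Hypothetical question: what's the probability this coin lands heads?
# PRIOR: Beta(2, 2) -- a mild belief that the coin is roughly fair.
prior = stats.beta(2, 2)

# DATA: 7 heads in 10 flips.
heads, flips = 7, 10

# POSTERIOR: with a beta prior + binomial likelihood, the update is conjugate:
#   Beta(a, b) -> Beta(a + heads, b + tails)
posterior = stats.beta(2 + heads, 2 + (flips - heads))

print(f"prior mean:     {prior.mean():.3f}")
print(f"posterior mean: {posterior.mean():.3f}")
print("95% credible interval:", posterior.interval(0.95))
```

With lots of data the posterior is dominated by the likelihood and the prior matters less; with little data, the prior does more of the work.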
SINCE @kierisi has threatened to sarcastically/chaotically say incorrect things about p-values during #sliced tonight 😱 just to annoy people 😉,
I thought I’d do a quick thread on what a 🚨p-value🚨actually is.
🧵
(1/n)
Computationally, a p-value is p(data at least as extreme as ours | null). Imagine a 🌎 where the null hypothesis is true (e.g. there is no difference in fur shininess between cats eating food A vs. food B for 2 weeks), and see how extreme your observed data would be in that 🌎 (2/n)
So, you can think of the p-value as representing our data’s compatibility with the null hypothesis. Low p-values mean our data is not very likely in a world where the null is true. High p-values mean it is relatively likely. (3/n)
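Here's that "imagine a null 🌎" idea as a permutation test in Python (the shininess numbers are made up; the simulation logic is the point):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical fur-shininess scores after 2 weeks on each food.
food_a = np.array([6.1, 5.8, 7.0, 6.4, 6.9, 5.5, 6.2, 6.8])
food_b = np.array([5.2, 5.9, 5.4, 6.0, 5.1, 5.7, 5.3, 5.8])
observed = food_a.mean() - food_b.mean()

# Build the null world: if food doesn't matter, the group labels are
# exchangeable, so shuffle them and recompute the difference many times.
pooled = np.concatenate([food_a, food_b])
n_a = len(food_a)
diffs = np.empty(10_000)
for i in range(diffs.size):
    rng.shuffle(pooled)
    diffs[i] = pooled[:n_a].mean() - pooled[n_a:].mean()

# Two-sided p-value: how often is null-world data as extreme as ours?
p_value = np.mean(np.abs(diffs) >= np.abs(observed))
print(f"observed diff = {observed:.2f}, p ≈ {p_value:.4f}")
```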
Step 1: Think about becoming a lawyer but ditch that because you can’t stand foreign political history classes. Add a philosophy double major because sureee that’ll help🙄. Then switch to psychology, meet an awesome statistician and decide you love statistics
in your last semester of college.
Graduate. Work in a cognitive neuroscience lab while living at home, apply to data science grad programs, get rejected by Berkeley, get into Chapman University, find an advisor who needs someone with stats AND psych expertise,
Moving from psych to stats/DS is totally doable. Depending on your training, there may be some content gaps you need to fill, but those gaps 1) aren't insurmountable + 2) won't automatically make you a bad data person just because you're working on filling them.
Doing good DS requires hard work/rigor but it’s not exclusive to “math” people. You can do it.
2/8
Personally, I had gaps in math + comp sci. I learned to code (Python, R, C++, SQL) and took/audited a bunch of probability, stats, and linear algebra classes. Those classes CERTAINLY helped me, but I could've learned the content w/o them; they just made it easier/more structured.