Today, I would like to share some resources on causal inference. - a thread ⬇️
I came to this topic, while working with clinicians who use IPW and matching on a daily basis (they are not familiar with double robust approaches). I don’t know for you, but I am so admirative of them as they combine their work with patients with research to advance knowledge
I recommend @Susan_Athey's videos (such as aeaweb.org/webcasts/2018/…), an example of pedagogy - they also make available the analysis notebooks, gsbdbi.github.io/ml_tutorial/in…. For a longer and more technical course see, Stefan’s Wager web.stanford.edu/~swager/stats3…
@Susan_Athey I also recommend Jonas Peter’s lecture stat.mit.edu/news/four-lect…, covering more the Structural Causal Model’s point of view and highlighting recent works on invariant predictions. For a longer course see @heinzedeml with stat.ethz.ch/lectures/ss20/…
@Susan_Athey @heinzedeml I can also share the google doc, getting started with causal inference, with many pointers tinyurl.com/du69bt3m done mainly by @imkemay, for our causal inference and missing data group (CIMD) at @Inria. I share this to anyone who tries to navigate in this productive field.
@Susan_Athey @heinzedeml @imkemay @Inria Note that this is a field where women are well represented and where there are many inspiring figures! see for instance the speakers at the Online Causal Inference Seminar.

• • •

Missing some Tweet in this thread? You can try to force a refresh
 

Keep Current with Women in Statistics and Data Science

Women in Statistics and Data Science Profile picture

Stay in touch and get notified when new unrolls are available from this author!

Read all threads

This Thread may be Removed Anytime!

PDF

Twitter may remove this content at anytime! Save it as PDF for later use!

Try unrolling a thread yourself!

how to unroll video
  1. Follow @ThreadReaderApp to mention us!

  2. From a Twitter thread mention us with a keyword "unroll"
@threadreaderapp unroll

Practice here first or read more on our help page!

More from @WomenInStat

29 Sep
Now, I would like to mention an R package, FactoMineR that I use on a daily basis to explore and visualize heterogeneous data: quantitative, categorical, with group structures, (multiple) contingency tables.

At its core, SVD! (I am also an SVD fan, @daniela_witten ;-).
@daniela_witten FactoMineR is indebted to the “French School of Data Analysis” (see arxiv.org/abs/0805.2879 or juliejosse.com/data-analysis/ for historical background), a field of statistics I was trained in.
@daniela_witten Note it was also the case for the famous @SherlockpHolmes, a role model for reproducibility, who I admire both from a scientific and personal point of view.
Read 5 tweets
29 Sep
Hello!
So today, I will share a few thoughts and advice I usually give to my PhD students. I hope this might be helpful for a wider audience, even if it is obvious and already stated by others. Anyway, as a teacher, we know repetition is important ;) - a thread ⬇️
1)Ask questions
Ask questions
Ask questions
….
Ask questions!
I mean that: don't hesitate to ask questions in seminars (in France in particular, we don't dare enough). Be curious, don’t be shy.
2) If you are tired and can't work, just don’t. Take a break, take a walk if you can. I've never regretted it, although I've often regretted staying in front my computer all day because I couldn't get anything done
Read 12 tweets
28 Sep
Missing Data, a thread ⬇️
Missing values are everywhere! We have listed more than 150 R packages in cran.r-project.org/web/views/Miss…
So let us give few pointers:
The method of handling missing data depends on the purpose of the analysis: estimation, completion, prediction, etc.
1) For inference with missing values, estimating as well as possible a parameter and giving a confidence interval, consider likelihood approaches (using EM algorithms) or multiple imputation
2) Single Imputation/Matrix completion aims at completing (predicting the missing entries) a dataset as best as possible. Multiple imputation aims at estimating parameters and their variability, taking into account the uncertainty due to missing values
Read 18 tweets
27 Sep
I have been working for >10 years on missing data.
My passion for data science mainly comes from its transversality: as a statistician, we can interact with so many scientific fields!
With missing data the same is true but within statistics, as it can pop up in all its branches.
When I first meet a scientist for a new project, I always start the conversation by asking “Show me the data!” to understand the underlying challenges.
So far, I have never been shown a complete dataset... (of course there might be some bias!).
With @imkemay, Aude Sportisse, @nj_tierney and @Natty_V2, we created the Rmistatic platform rmisstastic.netlify.app, to organize all the resources (courses, tutorials, articles, software, etc.) and implement analysis pipelines with missing data in R/Python.
Read 4 tweets
30 Apr
I’m going to begin today with a bold claim: Being an applied statistician is a lot like being an ethnographer.
I say this both based upon years of experience working in collaborative projects and consulting and based on my experience studying ethnography. (Recall: before my PhD in statistics, I started and quit a PhD in sociology).
Very often a question asked is not the ‘real’ question at hand. Typically, the person asking has a sense of the problem, but may not know exactly how to ask the question.
Read 12 tweets
28 Apr
Yesterday I tweeted about nested data, with multi-level models (MLM) versus OL + cluster-robust variance estimation (CRVE). This made me think about another confusion that arise, between what are called fixed versus random effects.
Let’s begin with a simple relationship between a covariate X and Y in nested data, e.g. students i nested in school j. We are interested in understanding the relationship between X and Y at the student level.
Approach 1: Assume the schools are fixed, but that students are a random sample within these schools. Assume the relationship between X and Y is the same in all schools. This often amounts to including a dummy variable for each school in the model. Here I use OLS to estimate β_1.
Read 8 tweets

Did Thread Reader help you today?

Support us! We are indie developers!


This site is made by just two indie developers on a laptop doing marketing, support and development! Read more about the story.

Become a Premium Member ($3/month or $30/year) and get exclusive features!

Become Premium

Too expensive? Make a small donation by buying us coffee ($5) or help with server cost ($10)

Donate via Paypal Become our Patreon

Thank you for your support!

Follow Us on Twitter!

:(