Half term is over, and I am going to write Twitter threads on two books I read. The first is the much-lauded "Book of Why" by @yudapearl amazon.co.uk/dp/0141982411/…
I did enjoy this book; it takes you through a statistical understanding of causality, much of it created by @yudapearl and colleagues and importantly stresses that "mere correlation" is never able to answer causal questions.
Many of the ideas I had been exposed to before: experimental design, frequentist statistics, Bayesian statistics, the different statistical paradoxes (interestingly, from the ENCODE days I had happy times with Peter Bickel, who is mentioned in this book) and Mendelian Randomisation.
However, this book put together, convincingly for me, a far stronger thesis about how these things relate to our understanding of causality, and furthermore how to think about this formally. @yudapearl and Dana Mackenzie thoughtfully titrated the amount of maths throughout the book.
I particularly liked the emphasis on graphical causal models; they both feel natural and, similar to Bayesian models, are a way to capture our understanding of the world in a way which one can mathematically build upon.
There were some things I was surprised at. First, @yudapearl spent a lot of time dismissing (perhaps denigrating in some passages) "old" statistical thinking, from Pearson and Fisher onwards. However, there was less discussion of the other "arm" of statistics, around experimental design.
Here experimental design and the subsequent statistical tests - and far more than just "RCTs" - seem to me to have implicitly or explicitly formalised much of the second level of causality (intervention).
In my own field of genetics, concepts such as "sufficient" and "necessary" components in a biological pathway - everyday language for "classical" geneticists - seem directly related to causal pathway diagrams (indeed, an interesting overlap of words).
What I find interesting here is that although RA Fisher had some dodgy ideas about smoking and confounders to smoking, he literally wrote the book on experimental design - at some level he completely understood causality concepts in the setting of experimental design.
I also find the experimental mindset interesting; experimentalists often know that the key thing is being able to alter, independently and at the behest of the experimenter, the value or property of some aspect of a system; once that is done, one can study the cause and effect of this component.
Much of experimental thinking is working out how to specifically control aspects - this brought to mind recent elegant work to control the frequency (and *only* the frequency) of a biological oscillator to show its importance in development.
Here the scientist is able to work out cause and effect by being able to create specific manipulation paths, and in modern biology there are simply endless ways to do this: transgenics, conditional genetics, optogenetics, chemical probes, and light- or electrically-induced change.
There is a richness to these experimental designs and techniques which I think is best thought of as being able to "wiggle" (@yudapearl's phrase) different components along a pathway at will, thus helping build a "true" understanding of the pathway.
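As a rough illustration of why being able to "wiggle" a component matters, here is a minimal Python sketch of a toy setup (not taken from the book). A hidden confounder z drives both x and y, so the observational slope of y on x overstates the effect, while setting x independently of z recovers it. The linear model, the effect sizes (1.0 and 2.0) and all function names are assumptions made purely for illustration.

```python
import random


def slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


def simulate(n, intervene, true_effect=1.0, confounding=2.0, seed=0):
    """Estimate the slope of y on x in a toy confounded system.

    Assumed structural model (illustrative only):
        z ~ N(0, 1)                      hidden confounder
        x = z + noise                    when merely observing
        x ~ N(0, 1), independent of z    when the experimenter sets x
        y = true_effect * x + confounding * z + noise
    """
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        z = rng.gauss(0, 1)
        if intervene:
            x = rng.gauss(0, 1)      # experimenter "wiggles" x at will
        else:
            x = z + rng.gauss(0, 1)  # x is partly driven by the confounder
        y = true_effect * x + confounding * z + rng.gauss(0, 1)
        xs.append(x)
        ys.append(y)
    return slope(xs, ys)


if __name__ == "__main__":
    print("observational slope:", simulate(20000, intervene=False))
    print("interventional slope:", simulate(20000, intervene=True))
```

With these assumed numbers the observational slope comes out near 2, while the interventional slope sits near the true effect of 1; the gap is entirely due to the confounder.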
The second is that I feel Mendelian Randomisation was both poorly explained and, in my view, far more powerful than the book suggests for the most common, thorny "non-experimental" questions of human observation.
In particular I think the meta-analysis (many, many genetic instruments) modes of Mendelian Randomisation are ... amazing, and can handle far more complex scenarios than @yudapearl sketched out.
This is not magic pixie dust that can render all observational human problems tractable to causal analysis - some aspects of life are just complex + the MR result is just a big mess - plus one needs to be skeptical of genetic gotchas (eg cryptic population stratification).
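To make the "many genetic instruments" idea a little more concrete, here is a minimal Python sketch of the fixed-effect inverse-variance weighted (IVW) estimator, one common way to combine per-variant summary statistics in Mendelian Randomisation. It is not necessarily the method I had in mind above, it assumes all variants are valid instruments, and the numbers in the example are invented purely for illustration.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted MR estimate.

    Each list holds one entry per genetic variant:
      beta_exposure: variant effect on the exposure
      beta_outcome:  variant effect on the outcome
      se_outcome:    standard error of beta_outcome
    Returns (causal_estimate, standard_error).
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("all inputs must have one entry per variant")
    # Weighted regression of outcome effects on exposure effects through the origin.
    numerator = sum(bx * by / se ** 2
                    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    denominator = sum(bx ** 2 / se ** 2
                      for bx, se in zip(beta_exposure, se_outcome))
    return numerator / denominator, math.sqrt(1.0 / denominator)


if __name__ == "__main__":
    # Invented summary statistics for three hypothetical variants.
    bx = [0.10, 0.20, 0.30]
    by = [0.04, 0.11, 0.16]
    se = [0.01, 0.02, 0.01]
    estimate, stderr = ivw_estimate(bx, by, se)
    print(f"IVW estimate: {estimate:.3f} (SE {stderr:.3f})")
```

The more instruments there are, the more the weighted sum averages out noise in any single variant, though it does nothing by itself about gotchas such as pleiotropy or population stratification.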
A final quibble is that the example SNP used by @yudapearl - rs16969968 - is indeed notorious, and its mechanistic action for lung cancer is reasonably clear, as it is a missense change in the nicotinic receptor. Judea+Dana don't really do it justice in the book in my view!
It is almost certain that its effect on lung cancer is because it increases the addictive impact of nicotine; consistent with this, people who have *never* smoked show virtually no effect from this SNP, whereas people who have smoked at least once show a big effect.
ie, the action of the variant here is on "not giving up smoking" rather than on "taking up smoking", and it is certainly not something directly involved in lung cancer. This actually seems to me like a good causal model diagram to write down and diagnose!
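One way to write such a diagram down is as a small directed graph. The sketch below is only a reading of the tweets above: the node names are labels chosen for illustration, and the edges are an assumption about how the story fits together, not a diagram from the book.

```python
def directed_paths(graph, start, goal, path=None):
    """Return every directed path from start to goal in an acyclic graph."""
    path = (path or []) + [start]
    if start == goal:
        return [path]
    paths = []
    for child in graph.get(start, []):
        if child not in path:  # guard against accidental cycles
            paths.extend(directed_paths(graph, child, goal, path))
    return paths


# Assumed causal diagram (illustrative labels):
#   the variant acts on continuing to smoke, not on starting to smoke,
#   and has no direct arrow into lung cancer.
SMOKING_GRAPH = {
    "rs16969968": ["continued_smoking"],
    "smoking_initiation": ["continued_smoking"],
    "continued_smoking": ["lung_cancer"],
}

if __name__ == "__main__":
    for p in directed_paths(SMOKING_GRAPH, "rs16969968", "lung_cancer"):
        print(" -> ".join(p))
```

In this assumed graph the only route from the variant to lung cancer runs through continued smoking, which matches the observation that never-smokers show virtually no effect.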
Anyway, to return to my thesis - this book is definitely worth reading, and it helped me understand things which I had thought about implicitly, and for which I now have a far stronger framework for discussion.
Thread by Ewan Birney.
