Big fan of the "I forced a bot to [...] over 1000" memes. But most of those posts are fake (i.e. human-generated). That's why I decided to make a real one

So I forced a bot to read over 1000 PubMed abstracts in order to generate new abstracts [images: example generated abstracts]
Basically, I pulled a random sample of 5000 abstracts from PubMed using the search terms: (causal inference) AND English[Language]
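
For the curious, here is a minimal sketch of what that query looks like with Biopython's Entrez module (the contact email, variable names, and single-batch fetch are my assumptions, not the repo's exact code):

```python
from Bio import Entrez

Entrez.email = "you@example.com"  # NCBI requires a contact email (placeholder)

# Search PubMed for up to 5000 matching record IDs
handle = Entrez.esearch(db="pubmed",
                        term="(causal inference) AND English[Language]",
                        retmax=5000)
id_list = Entrez.read(handle)["IdList"]
handle.close()

# Fetch the abstracts for those IDs as plain text
# (very long ID lists may need to be fetched in batches)
handle = Entrez.efetch(db="pubmed", id=id_list,
                       rettype="abstract", retmode="text")
abstracts = handle.read()
handle.close()
```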

A random sample of the returned abstracts was used to train a recurrent neural network (RNN)
Basically, a sequence of 40 characters is used to predict the next character. This process can then be repeated, feeding each new character back in, to generate a whole new sentence
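
Roughly, the setup looks like this (a sketch only, not the repo's exact architecture; the layer sizes and file name are assumptions):

```python
import numpy as np
import tensorflow as tf

SEQ_LEN = 40  # characters of context used to predict the next one

text = open("abstracts.txt").read()  # the fetched abstracts (hypothetical file)
chars = sorted(set(text))            # character vocabulary
char2idx = {c: i for i, c in enumerate(chars)}

# Slide a 40-character window over the corpus: each window is an input,
# and the character immediately after it is the prediction target
X = np.array([[char2idx[c] for c in text[i:i + SEQ_LEN]]
              for i in range(len(text) - SEQ_LEN)])
y = np.array([char2idx[text[i + SEQ_LEN]]
              for i in range(len(text) - SEQ_LEN)])

# A small LSTM mapping each window to a distribution over the next character
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(len(chars), 64),
    tf.keras.layers.LSTM(128),
    tf.keras.layers.Dense(len(chars), activation="softmax"),
])
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
```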

So you give the machine a starting point, set a 'creativity dial' (the sampling temperature), and let it go
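
Continuing the sketch above, generation might look like this (the function names and defaults are mine, not the repo's):

```python
def sample_next_char(probs, temperature=0.5):
    """Sample a character index; temperature is the 'creativity dial'.
    Low temperature plays it safe, high temperature takes risks."""
    logits = np.log(np.maximum(probs, 1e-10)) / temperature
    p = np.exp(logits) / np.sum(np.exp(logits))
    return np.random.choice(len(p), p=p)

def generate(model, seed, n_chars=400, temperature=0.5):
    """Start from a seed string and extend it one character at a time."""
    out = seed
    for _ in range(n_chars):
        window = np.array([[char2idx[c] for c in out[-SEQ_LEN:]]])
        probs = model.predict(window, verbose=0)[0]
        out += chars[sample_next_char(probs, temperature)]
    return out
```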
But enough of that nerd stuff. Here are some more abstracts [images: more generated abstracts]
Personally, I like giving it a starting string where it has to make up the estimates [images: generated abstracts with made-up estimates]
I can already see you typing, "But is it really AI?"

The answer is yes. I think the following examples demonstrate this quite clearly (by the AI discovering a new alpha-level) [images: example generated abstracts]
It really does like putting the p-value around 0.01 though (a very backwards / weird demonstration of publication bias, perhaps?) [image: generated abstract]
Okay, but I helped out my poor RNN a little with the abstracts. I had it generate each of the sections from a starting seed. It doesn't really understand the whole structured abstract concept

I also don't know what measure 'psycionion' is, but I am intrigued to learn [images: generated structured abstracts]
Also, it turns out causal inference abstracts are pretty predictable. Below is an abstract with the creativity dial turned up to 2.5 (the previous examples used 0.5) [image: high-temperature generated abstract]
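
(In the sketch above, that's the difference between calling generate(...) with temperature=0.5 versus temperature=2.5: high temperatures flatten the predicted distribution, so the model picks unlikely characters far more often.)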
All the code is available on GitHub. I provide a trained version of the RNN (since training takes a long time without a GPU)

However, my code is structured so you could easily change the search terms and train a new version

github.com/pzivich/RNN-Ab…
I used Biopython to query the abstracts from PubMed. The RNN is built with TensorFlow to do character-level text generation. There are a bunch of online guides if you want to code a version from scratch (that's how I did it)
I will say I did run into some fitting issues initially. To prevent over-fitting, I ran 10 epochs with exponential learning-rate decay. I also clipped the gradients to prevent gradient explosions (which sounds cooler than it actually is)
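
Continuing the earlier sketch, that combination might look like this in TensorFlow (the initial rate, decay schedule, clip value, and batch size are all guesses, not the repo's settings):

```python
# Learning rate shrinks exponentially over training; clipnorm bounds
# the gradient norm, which is all "preventing gradient explosions" means
schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=1e-3, decay_steps=10_000, decay_rate=0.9)
optimizer = tf.keras.optimizers.Adam(learning_rate=schedule, clipnorm=1.0)
model.compile(loss="sparse_categorical_crossentropy", optimizer=optimizer)
model.fit(X, y, batch_size=128, epochs=10)
```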

