How can deep learning be useful in causal inference?
In our #NeurIPS2022 paper, we argue that causal effect estimation can benefit from large amounts of unstructured "dark" data (images, sensor data) that can be leveraged via deep generative models to account for confounders.
Consider the task of estimating the effect of a medical treatment from observational data. The true effects are often confounded by unobserved factors (e.g., patient lifestyle). We argue that latent confounders can be discovered from unstructured data (e.g., clinical notes).
For example, suppose that we have access to raw data from wearable sensors for each patient. This data implicitly reveals whether a patient is active or sedentary, an important confounding factor that affects both treatment and outcome. By extracting this signal, we can correct for this confounder as well.
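To make the confounding concrete, here is a minimal simulated sketch (not from the paper; all numbers are made up). A binary variable z stands in for the active/sedentary status inferred from sensor data; a naive comparison of treated vs. untreated outcomes is biased, while standard backdoor adjustment over z recovers the true effect:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# z = 1 if the patient is active (hypothetically inferred from sensor data).
z = rng.binomial(1, 0.5, n)
# Active patients are more likely to be treated...
t = rng.binomial(1, np.where(z == 1, 0.8, 0.2))
# ...and have better outcomes regardless of treatment; true effect of t is +1.
y = 1.0 * t + 2.0 * z + rng.normal(0, 1, n)

# Naive contrast is biased upward by confounding.
naive = y[t == 1].mean() - y[t == 0].mean()

# Backdoor adjustment: average within-stratum contrasts, weighted by p(z).
adjusted = sum(
    (y[(t == 1) & (z == v)].mean() - y[(t == 0) & (z == v)].mean()) * (z == v).mean()
    for v in (0, 1)
)
print(round(naive, 2), round(adjusted, 2))  # naive ≈ 2.2, adjusted ≈ 1.0
```

This only works if z is observed; the point of the paper's setting is that z must first be discovered from unstructured data.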
In our paper, we propose a deep generative model of treatments t, outcomes y, latent confounders z, and multiple types of unstructured data x_i. This model generalizes structural equation models: when it fits the data well, it yields estimates of the treatment effect.
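The factorization such a model encodes can be sketched by ancestral sampling. This toy version (structure and parameters are my own illustration, not the paper's model) draws from p(z) p(t|z) p(y|t,z) Π_i p(x_i|z), with two unstructured modalities serving as noisy views of the latent confounder:

```python
import numpy as np

rng = np.random.default_rng(1)

def sample(n):
    """Ancestral sampling from a toy factorization
    p(z) p(t|z) p(y|t,z) prod_i p(x_i|z); all parameters are made up."""
    z = rng.normal(0, 1, n)                        # latent confounder
    t = rng.binomial(1, 1 / (1 + np.exp(-z)))      # treatment depends on z
    y = 1.5 * t + 2.0 * z + rng.normal(0, 1, n)    # outcome depends on t and z
    x1 = z[:, None] + rng.normal(0, 0.5, (n, 8))   # e.g., a sensor stream
    x2 = -z[:, None] + rng.normal(0, 0.5, (n, 4))  # a second data modality
    return z, t, y, (x1, x2)

z, t, y, xs = sample(1000)
```

In a real deep generative model, the linear maps above would be neural networks, and z would be inferred rather than sampled.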
Learning this deep latent generative model can be challenging. We provide a variational inference algorithm that leverages the graphical structure of the model, naturally deals with multiple sources of data possibly missing at random, and scales to large datasets.
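The core object in variational inference is the evidence lower bound (ELBO), E_q[log p(data, z) − log q(z)], which is maximized over the variational posterior q. This standalone toy (a conjugate Gaussian model, not the paper's algorithm) shows a Monte Carlo ELBO estimate and that a better-matched q attains a higher ELBO:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy model: z ~ N(0, 1), x ~ N(z, 0.5^2); variational posterior q(z) = N(mu, s^2).
x_obs = 1.2

def log_norm(v, mean, std):
    """Log-density of N(mean, std^2) evaluated at v."""
    return -0.5 * np.log(2 * np.pi * std**2) - (v - mean) ** 2 / (2 * std**2)

def elbo(mu, s, k=50_000):
    z = rng.normal(mu, s, k)  # z ~ q
    log_joint = log_norm(z, 0, 1) + log_norm(x_obs, z, 0.5)
    return (log_joint - log_norm(z, mu, s)).mean()  # Monte Carlo ELBO estimate

# This model is conjugate: the exact posterior is N(0.96, 0.2), so the ELBO
# at those values equals log p(x), and any other q scores strictly lower.
e_opt = elbo(0.96, np.sqrt(0.2))
e_bad = elbo(0.0, 1.0)
```

In the paper's setting, q is parameterized by neural networks over the unstructured inputs and the ELBO is optimized with stochastic gradients, which is what makes the method scale to large datasets.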
We apply this model to a wide range of tasks, including genome-wide association studies in plants. We show that leveraging unstructured weather data can help more accurately identify the causal effect of genetic mutations.
This work was led by Cornell PhD student Shachi Deshpande. If you missed Shachi’s presentation at NeurIPS, check out our paper here: openreview.net/pdf?id=ByYFpTw…
Imagine you build an ML model with 80% accuracy. There are many things you could try next: collect more data, engineer new features, increase dropout, tune the optimizer. How do you decide what to try next in a principled way?
Here is an iterative process for developing ML models that can yield good performance even in domains where you have little expertise (e.g., classifying bird songs). These ideas are compiled from my Applied ML class at Cornell.
Start with a simple baseline and evaluate its performance on a held-out development set. Based on what you observe, propose a new model that fixes the specific problems you found. Retrain the new model, re-analyze, and repeat for as long as needed.
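The loop above can be sketched in code. This is a hypothetical skeleton: train, evaluate, diagnose, and make_next_model are placeholders for your own pipeline and error analysis, and the toy instantiation below just bumps a "capacity" knob until a target dev score is reached:

```python
def develop(train, evaluate, diagnose, make_next_model, baseline,
            target=90, max_iters=10):
    """Iterate: train, measure on a held-out dev set, diagnose, fix, repeat."""
    model = baseline
    score = None
    for _ in range(max_iters):
        train(model)
        score, errors = evaluate(model)          # held-out dev set performance
        if score >= target:
            break                                # good enough: stop iterating
        problem = diagnose(errors)               # error analysis: what failed?
        model = make_next_model(model, problem)  # fix the observed problem
    return model, score

# Toy instantiation: "capacity" stands in for whatever fix diagnosis suggests.
train = lambda m: None
evaluate = lambda m: (min(60 + 10 * m["capacity"], 95), [])
diagnose = lambda errors: "underfitting"
make_next_model = lambda m, problem: {"capacity": m["capacity"] + 1}

model, score = develop(train, evaluate, diagnose, make_next_model, {"capacity": 1})
```

The key discipline is that each iteration changes the model in response to an observed problem, not at random.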
Did you ever want to learn more about machine learning in 2021? I'm excited to share the lecture videos and materials from my Applied Machine Learning course at @Cornell_Tech! We have 20+ lectures on ML algorithms and how to use them in practice. [1/5]
One new idea we tried in this course was to make all the materials executable. Each set of slides is also a Jupyter notebook with programmatically generated figures. Readers can tweak parameters and generate the course materials from scratch. [2/5]
Also, whenever we introduce an important mathematical formula, we implement it in numpy. This helps establish connections between the math and how to apply it in code. [3/5]
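As an illustration of this formula-to-numpy pattern (my own example, not necessarily one from the course): the softmax, softmax(x)_i = exp(x_i) / Σ_j exp(x_j), translates almost line for line, with the standard max-shift for numerical stability:

```python
import numpy as np

def softmax(x):
    """softmax(x)_i = exp(x_i) / sum_j exp(x_j).
    Subtracting max(x) leaves the result unchanged but avoids overflow."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

p = softmax(np.array([1.0, 2.0, 3.0]))
print(p.sum())  # probabilities sum to 1 (up to floating point)
```

Seeing the summation index become an array axis is exactly the math-to-code connection the lectures aim for.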