Researcher @MSFTResearch. Co-founder pywhy/dowhy. Work on causality & machine learning. Searching for a path to causal AI https://t.co/tn9kMAmlKw
May 2, 2023 • 15 tweets • 5 min read
New paper: On the unreasonable effectiveness of LLMs for causal inference.
GPT-4 achieves new SoTA on a wide range of causal tasks: graph discovery (97%, 13 pts gain), counterfactual reasoning (92%, 20 pts gain) & actual causality.
How is this possible?🧵 arxiv.org/abs/2305.00050
LLMs do so by bringing in a new kind of reasoning based on text & metadata. We call it knowledge-based causal reasoning, distinct from existing data-based methods.
Essentially, LLMs approximate human domain knowledge: a big win for causal tasks that often depend on human input.
Dec 20, 2022 • 9 tweets • 4 min read
#ChatGPT obtains SoTA accuracy on the Tuebingen causal discovery benchmark, spanning cause-effect pairs across physics, biology, engineering and geology. Zero-shot, no training involved.
I'm beyond spooked. Can LLMs infer causality? 🧵 w/ @ChenhaoTan
The benchmark contains 108 pairs of variables and the task is to infer which one causes the other. Best accuracy using causal discovery methods is 70-80%. On 75 pairs we've evaluated, ChatGPT obtains 92.5%.
And it doesn't even use the data. All it needs are the variable names!
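To make the setup concrete, here is a minimal sketch of how a names-only pairwise evaluation could be scored. It is not the prompt or code used in the thread: the prompt wording, the `ask` callable (a stand-in for whatever LLM client you use), and the two example pairs are all assumptions for illustration.

```python
from typing import Callable, List, Tuple

# Each item: (variable_a, variable_b, truth), where truth is "A" if
# variable_a causes variable_b and "B" for the reverse direction.
# These pairs are illustrative assumptions, not the benchmark data.
EXAMPLE_PAIRS: List[Tuple[str, str, str]] = [
    ("altitude", "temperature", "A"),
    ("lung cancer", "smoking", "B"),
]


def build_prompt(a: str, b: str) -> str:
    """Build a prompt from the variable names only; no data is included."""
    return (
        "Which cause-and-effect relationship is more likely?\n"
        f"A. {a} causes {b}\n"
        f"B. {b} causes {a}\n"
        "Answer with a single letter, A or B."
    )


def parse_answer(reply: str) -> str:
    """Return "A" or "B" if the reply starts with that letter, else ""."""
    first = reply.strip().upper()[:1]
    return first if first in ("A", "B") else ""


def accuracy(pairs: List[Tuple[str, str, str]], ask: Callable[[str], str]) -> float:
    """Fraction of pairs where the parsed answer matches the true direction."""
    if not pairs:
        return 0.0
    correct = sum(
        parse_answer(ask(build_prompt(a, b))) == truth for a, b, truth in pairs
    )
    return correct / len(pairs)
```

Pass any function that maps a prompt string to a reply string as `ask`. Unparseable replies count as wrong here, which is one possible design choice among several.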
Jul 29, 2022 • 7 tweets • 4 min read
This ICML, I was (pleasantly) surprised to see a bunch of papers on causal inference, specifically on how machine learning can help in estimating causal effects.
There is a lot of excitement about causal machine learning, but in what ways exactly can causality help with ML tasks?
In my work, I've seen four: enforcing domain knowledge, invariant regularizers, "counterfactual" augmentation & better framing for fairness & explanation. 🧵👇🏻
1) Enforcing domain knowledge: ML models can learn spurious correlations. Can we avoid this by using causal knowledge from experts?
Rather than asking experts for full causal graphs, eliciting info on a few key relationships is more practical. See #icml2022 paper on how to enforce them arxiv.org/abs/2111.12490
My take: DAGs and PO are compatible, and the best analyses benefit from using both. 1/7
In a KDD tutorial with @emrek, we outline how you can use DAGs and potential outcomes together for causal analysis and discuss empirical examples. 2/7 causalinference.gitlab.io/kdd-tutorial/