Machine learning sucks at uncertainty quantification.
But there is a solution that almost sounds too good to be true:
conformal prediction
• works for any black box model
• requires only a few lines of code
• is fast
• comes with statistical guarantees
A thread 🧵
Conformal prediction is a method for uncertainty quantification of machine learning models.
The method takes a heuristic uncertainty score and turns it into a rigorous one.
What confused me at first about conformal prediction: it's not a single algorithm, but a general recipe (sketched in code below):
• Split data: training and calibration
• Train model
• Calculate "heuristic" scores on calibration data
• Calibrate score
• Use calibrated scoring rule on new data
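Here's what that recipe can look like in code. A minimal sketch of split conformal prediction for multi-class classification, assuming a scikit-learn classifier; the dataset, the alpha level, and all variable names are just for illustration:

```python
# Minimal sketch of split conformal prediction for classification.
# Assumes a scikit-learn classifier; dataset and names are illustrative.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# 1) Split data: training and calibration (plus some new points to predict on)
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.4, random_state=0)
X_calib, X_new, y_calib, y_new = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

# 2) Train any black-box model
model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# 3) "Heuristic" uncertainty scores on the calibration data:
#    1 - predicted probability of the true class
probs_calib = model.predict_proba(X_calib)
scores = 1 - probs_calib[np.arange(len(y_calib)), y_calib]

# 4) Calibrate: take the adjusted (1 - alpha) quantile of the scores
alpha = 0.1
n = len(scores)
q_level = np.ceil((n + 1) * (1 - alpha)) / n
qhat = np.quantile(scores, q_level, method="higher")

# 5) Use the calibrated rule on new data -> prediction sets
probs_new = model.predict_proba(X_new)
prediction_sets = probs_new >= (1 - qhat)  # boolean mask: class in set or not
```

Assuming exchangeable data, the resulting prediction sets contain the true class with probability of at least 1 - alpha (here 90%), no matter which model you plug in.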
The recipe enables many use cases:
• Prediction sets in multi-class problems
• Calibrate classification scores so they can be interpreted as probabilities
• Fix the coverage of quantile regression (sketch after this list)
• Conformal predictive distributions for regression
• ...
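To make one of these use cases concrete: a minimal sketch of conformalized quantile regression (the "fix the coverage of quantile regression" point), assuming scikit-learn's gradient boosting with quantile loss; the dataset and the 90% target coverage are just for illustration:

```python
# Minimal sketch of conformalized quantile regression (CQR).
# Dataset, model, and coverage level are illustrative choices.
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

X, y = load_diabetes(return_X_y=True)
X_train, X_calib, y_train, y_calib = train_test_split(X, y, test_size=0.3, random_state=0)

alpha = 0.1  # target: 90% coverage
lo = GradientBoostingRegressor(loss="quantile", alpha=alpha / 2).fit(X_train, y_train)
hi = GradientBoostingRegressor(loss="quantile", alpha=1 - alpha / 2).fit(X_train, y_train)

# Conformity score: how far the true value falls outside the quantile band
lower, upper = lo.predict(X_calib), hi.predict(X_calib)
scores = np.maximum(lower - y_calib, y_calib - upper)

n = len(scores)
qhat = np.quantile(scores, np.ceil((n + 1) * (1 - alpha)) / n, method="higher")

# Calibrated interval for a new point x: [lo.predict(x) - qhat, hi.predict(x) + qhat]
```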
Orthogonal/double machine learning brings causal inference to supervised learning. You estimate a treatment effect by training two nuisance models (one predicts the outcome, one predicts the treatment) and then regressing the outcome residuals on the treatment residuals.
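A rough sketch of that partialling-out idea on simulated data with a known treatment effect of 2.0. This is my illustration, not a full implementation; a real analysis would add proper cross-fitting (for example via the DoubleML package):

```python
# Sketch of double/orthogonal ML via partialling out, on simulated data.
# True treatment effect is 2.0; everything here is illustrative.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 5))                      # confounders
T = X[:, 0] + rng.normal(size=n)                 # treatment depends on X
y = 2.0 * T + X[:, 0] ** 2 + rng.normal(size=n)  # outcome: true effect = 2.0

# Two nuisance models: one predicts the outcome, one predicts the treatment.
# cross_val_predict gives out-of-fold predictions, a simple stand-in for cross-fitting.
res_y = y - cross_val_predict(RandomForestRegressor(random_state=0), X, y, cv=5)
res_t = T - cross_val_predict(RandomForestRegressor(random_state=0), X, T, cv=5)

# Regress residual on residual -> estimate of the treatment effect
effect = LinearRegression(fit_intercept=False).fit(res_t.reshape(-1, 1), res_y)
print(effect.coef_[0])  # should come out close to 2.0
```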
Bayesians versus Frequentists is an ancient debate.
But have you heard of likelihoodism?
🧵 A thread on likelihoodism, why no one uses it, and how it helps to understand the Bayesian versus Frequentist debate better.
Likelihoodists honor the likelihood function above all else.
• They reject prior probabilities. That's a big middle finger to the Bayesian approach.
• Evidence from the data must only come through the likelihood. That's why they reject frequentist inference.
I gotta explain the second point. It's not intuitive how frequentist modeling violates this "likelihood principle".
In other words: in frequentist inference, the conclusions depend on information beyond the likelihood of the observed data (for example, the sampling plan). What?? How can that be?
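The standard illustration (my example, not part of the original thread) is the stopping-rule problem: the same data and the same likelihood, but different frequentist conclusions depending on how you planned to stop collecting data.

```python
# Classic stopping-rule example: 12 coin flips, 9 heads, 3 tails.
# Test H0: p = 0.5 against p > 0.5.
# The likelihood is proportional to p^9 (1-p)^3 under BOTH designs,
# yet the frequentist p-value depends on the sampling plan.
from scipy import stats

# Design A: flip exactly 12 times -> binomial. P(9 or more heads)
p_binom = stats.binom.sf(8, n=12, p=0.5)

# Design B: flip until the 3rd tail -> negative binomial on the number of heads.
# P(9 or more heads before the 3rd tail)
p_nbinom = stats.nbinom.sf(8, n=3, p=0.5)

print(p_binom)   # ~0.073 -> not significant at 5%
print(p_nbinom)  # ~0.033 -> significant at 5%
```

Same observed data, proportional likelihoods, different p-values: that's exactly the extra-likelihood information likelihoodists object to.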
Most ML interpretation methods have a common enemy:
Correlated features.
They ruin interpretation both on a technical and a philosophical level.
Why correlation is problematic, how to patch it, and why we have no cure.
A thread 🧵
Correlated features are the rule, not the exception.
• Predicting bike rentals? Season and temperature are correlated.
• Credit scoring? Income correlates with age, job, ...
• Diagnosing patients? Blood values are correlated, like multiple markers of inflammation, ...
We'll use 3 points of view to understand the effect of correlation on interpretability:
• Extrapolation
• Entanglement
• Latent variables
Note: correlation here includes more general dependencies, not only linear correlation.
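To see the extrapolation problem concretely, here is a tiny sketch (my illustration, not from the thread): permuting one of two correlated features, as permutation importance and similar methods do, destroys the dependence and creates data points far outside the training distribution.

```python
# Sketch: permuting one of two correlated features forces extrapolation.
# The "temperature"/"season_index" setup is made up for illustration.
import numpy as np

rng = np.random.default_rng(0)
n = 1000
temperature = rng.normal(20, 5, size=n)
season_index = temperature / 5 + rng.normal(0, 0.3, size=n)  # strongly correlated

X = np.column_stack([temperature, season_index])
print(np.corrcoef(X.T)[0, 1])       # ~0.96: dependence in the real data

# Permutation, as used by permutation importance, PDP-style marginalization, ...
X_perm = X.copy()
X_perm[:, 1] = rng.permutation(X_perm[:, 1])
print(np.corrcoef(X_perm.T)[0, 1])  # ~0.0: dependence destroyed

# Result: rows like (hot-summer temperature, winter-like season index),
# combinations the model never saw during training.
```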
Supervised machine learning models are deployed everywhere.
It's an open secret that all models have a huge problem:
Performative prediction: when predictions change future outcomes.
How to spot and handle this problem: A thread 🧵
Once a machine learning model is deployed in the wild, the predictions will affect its environment. If not, what would be the point of the model? That's why almost every deployed model is affected by performative prediction. Once you get the concept, you see it everywhere.
The model changes the environment ...
So what?
The change often hurts model performance. That endangers the product, the people involved, and your wallet.
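A toy simulation of the feedback loop (my sketch with made-up numbers, not a real credit-scoring setup): applicants react to the deployed decision rule, so the accuracy you measured before deployment no longer holds afterwards.

```python
# Toy simulation of performative prediction in a credit-scoring setting.
# All numbers and the "gaming" mechanism are made up for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def population(n, gaming=0.0):
    income = rng.normal(0, 1, size=n)
    # After deployment, applicants just below the decision boundary
    # inflate their reported income to get approved
    reported = income + gaming * 0.6 * ((income > -0.5) & (income < 0.0))
    default = (rng.normal(0, 1, size=n) > income).astype(int)  # depends on TRUE income
    return reported.reshape(-1, 1), default

# Train and evaluate before deployment: nobody games the feature yet
X_train, y_train = population(5000, gaming=0.0)
model = LogisticRegression().fit(X_train, y_train)
X_test, y_test = population(5000, gaming=0.0)
print("pre-deployment accuracy :", model.score(X_test, y_test))

# After deployment, the prediction rule changes behavior -> distribution shift
X_live, y_live = population(5000, gaming=1.0)
print("post-deployment accuracy:", model.score(X_live, y_live))
```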
It's overwhelming to keep up with research on interpretable machine learning. I say that as the author of the Interpretable Machine Learning book. 😅
I use these 3 questions for a quick assessment of a new ML interpretation method:
• Are the explanations global or local?
• Does the method interpret model components or model behavior?
• Does the method compute feature effects, feature importance, or attributions?
Global versus local is the simplest one.
A global interpretation method describes the overall "behavior" of the model. Examples: Permutation feature importance, linear model coefficients, and SHAP importance.
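For example, permutation feature importance is global: one number per feature for the whole model. A quick sketch with scikit-learn (the dataset and model are just placeholders, any fitted estimator works):

```python
# Sketch of a global method: permutation feature importance with scikit-learn.
# Dataset and model are placeholders; any fitted estimator works.
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = load_diabetes(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestRegressor(random_state=0).fit(X_train, y_train)

# Global importance: how much does shuffling each feature hurt test performance?
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
for name, score in sorted(zip(X.columns, result.importances_mean), key=lambda t: -t[1]):
    print(f"{name:>6}: {score:.3f}")
```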