NicoWolf

Oct 25 • 7 tweets • 3 min read

A translation of my first thread for the general public. I will talk about how to model the uncertainty of predictions (for example in machine learning) correctly, yet efficiently. (1/n)

#statistics #DataScience #machinelearning #conformalprediction

When I started my PhD, I was convinced of two things:
1) Modelling uncertainty is hard, and
2) The only viable approach is the Bayesian one.

This idea is so strongly ingrained in the statistical literature and data science community that it must be true, right? (2/n)

The answer is no, and luckily I quickly learned of a great alternative. The idea behind "conformal prediction" is as simple as it gets: you calculate the absolute errors of your model on a holdout dataset and take, for example, their 90% quantile. (3/n)

If your prediction intervals (the confidence intervals for new predictions) contain all values that deviate from your prediction by at most the chosen quantile, those intervals will contain the true value in about 90% of the cases. (4/n)
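The recipe from the last two tweets can be sketched in a few lines of Python. This is a minimal illustration, not the code from our paper: the function names, the synthetic data and the stand-in "model" are all made up for the example. It also assumes the usual finite-sample correction, taking the ceil((n + 1) * 0.9)-th smallest holdout error rather than the plain 90% quantile.

```python
import math
import random


def conformal_radius(holdout_true, holdout_pred, coverage=0.9):
    """Half-width of the interval, taken from the absolute holdout errors."""
    errors = sorted(abs(y - p) for y, p in zip(holdout_true, holdout_pred))
    n = len(errors)
    # Assumed finite-sample correction: use rank ceil((n + 1) * coverage).
    rank = math.ceil((n + 1) * coverage)
    if rank > n:
        return math.inf  # too few holdout points for this coverage level
    return errors[rank - 1]


def predict_interval(prediction, radius):
    """All values that deviate at most `radius` from the prediction."""
    return prediction - radius, prediction + radius


# Hypothetical setup: y = 2x + noise, and a stand-in "model" that predicts 2x.
random.seed(0)


def model(x):
    return 2.0 * x


def sample(n):
    xs = [random.uniform(0.0, 10.0) for _ in range(n)]
    ys = [2.0 * x + random.gauss(0.0, 1.0) for x in xs]
    return xs, ys


holdout_x, holdout_y = sample(1000)
radius = conformal_radius(holdout_y, [model(x) for x in holdout_x], coverage=0.9)

# Check on fresh data how often the interval contains the true value.
test_x, test_y = sample(1000)
hits = 0
for x, y in zip(test_x, test_y):
    low, high = predict_interval(model(x), radius)
    hits += low <= y <= high
coverage_rate = hits / len(test_y)
print(f"radius = {radius:.3f}, empirical coverage = {coverage_rate:.3f}")
```

The empirical coverage on the fresh data should land close to 90%. Note that nothing in the sketch uses the fact that the noise is Gaussian: only the sorted holdout errors matter.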

In sharp contrast to most other methods, the intervals coming from this approach are valid in a statistical sense: they really cover the true value as often as you ask for (everybody will be happy). And in contrast to Bayesian models, you do not need a decent guess of the underlying distribution. (5/n)

The only thing you need is that the order of the data points does not matter (statisticians call this exchangeability; it holds for almost all data sets without a time aspect). Moreover, the framework is much more efficient and straightforward than the numerical nightmare that is Bayesian modelling. (6/n)

If you would like to know more about conformal prediction and how it behaves when combined with existing models, read our paper arxiv.org/abs/2107.00363 or follow @predict_addict and check out his awesome library. (7/n, n=7)
