One of the most frustrating aspects of the computational statistics literature is the varying convention for which assumptions are left implicit, making it nearly impossible to read a paper outside the context of that convention. Let's talk about one bad consequence.
Like many other fields, comp stats has evolved to be very provincial, with different perspectives developing in different communities. For example, in Markov chain Monte Carlo theory you can see the differences between work from the UK, Duke, Minnesota, etc.
Each of those perspectives gives rise to its own conventions for notation, terminology, and, most importantly, assumptions. Papers written by that community for that community will often leave those assumptions implicit, making them hard to read for those outside the group.
Only once you've picked up on the pattern -- or learned which authors were trained in which perspectives -- will you start to be able to really understand the assumptions and goals of the methods being introduced or discussed.
One of the most egregious examples, in my personal opinion? The Markov chain Monte Carlo central limit theorem. One particularly popular perspective assumes that the MCMC CLT exists unless stated otherwise, focusing on the consequences of that existence.
To be clear, the work on those consequences is often really good and critically important for the robust practice of Markov chain Monte Carlo. The problem is that those consequences are meaningful only when those assumptions hold, and in practice they often don't.
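For concreteness, here is a standard textbook form of that CLT (a sketch only; the exact regularity conditions, such as geometric ergodicity and finite moments of f, are stated differently across the communities mentioned above):

```latex
% A standard form of the MCMC CLT: if (\theta_1, \theta_2, \ldots) is a Markov
% chain with stationary distribution \pi and f is square-integrable, then under
% suitable regularity conditions (e.g. geometric ergodicity)
\sqrt{N}\,\Bigl( \tfrac{1}{N} \textstyle\sum_{n=1}^{N} f(\theta_n)
                 - \mathbb{E}_{\pi}[f] \Bigr)
\;\xrightarrow{d}\; \mathcal{N}\bigl(0, \sigma_f^2\bigr),
\qquad
\sigma_f^2
  = \mathrm{Var}_{\pi}[f]\,\Bigl(1 + 2 \textstyle\sum_{\ell=1}^{\infty} \rho_\ell\Bigr),
% where \rho_\ell is the lag-\ell autocorrelation of f(\theta_n). The factor in
% parentheses is what turns N draws into the effective sample size
% N_{\mathrm{eff}} = N / (1 + 2 \sum_\ell \rho_\ell) behind MCMC standard errors.
```

Everything downstream, from Monte Carlo standard errors to effective sample size estimates, is a consequence of this limit; when the limit fails, those quantities lose their meaning.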
Because they aren't aware of these implicit assumptions, many readers, especially from applied fields, assume that those consequences always hold for any MCMC algorithm and any target distribution. As bespoke algorithms and models become more complicated, this leads to disaster.
Once you start to recognize these assumptions, not only are you better positioned to use these results responsibly, but you can also start to understand some of the more fervent arguments in the statistics literature.
For example, consider the great "one long chain or many short chains" Markov chain Monte Carlo debate that raged in the 90s, with authors on both sides writing passive-aggressive papers about the other.
_Assuming an MCMC CLT_, the differences between the two are largely irrelevant, with a single long chain being a bit easier to implement well in practice. But if one doesn't assume that an MCMC CLT will always exist, then multiple chains offer better diagnostics of CLT failures.
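To make the diagnostic argument concrete, here is a minimal sketch (my own illustration, not code from either camp in the debate) of the split-R̂ statistic: it compares between-chain and within-chain variance, and values well above 1 flag chains that disagree -- exactly the kind of failure a single long chain can never reveal about itself.

```python
import numpy as np

def split_rhat(chains):
    """chains: array of shape (n_chains, n_draws) for one scalar quantity."""
    half = chains.shape[1] // 2
    # Split each chain in half so within-chain drift also shows up.
    splits = np.concatenate([chains[:, :half], chains[:, half:2 * half]], axis=0)
    n = splits.shape[1]
    within = splits.var(axis=1, ddof=1).mean()      # W: mean within-chain variance
    between = n * splits.mean(axis=1).var(ddof=1)   # B: between-chain variance
    var_plus = (n - 1) / n * within + between / n   # pooled variance estimate
    return np.sqrt(var_plus / within)

rng = np.random.default_rng(0)
mixed = rng.normal(size=(4, 1000))   # four well-mixed chains
stuck = mixed.copy()
stuck[0] += 3.0                      # one chain trapped in a different region
print(split_rhat(mixed))  # ~ 1.0
print(split_rhat(stuck))  # >> 1.0, flagging the disagreement
```

The one stuck chain is invisible to any diagnostic computed from that chain alone; only the comparison across chains exposes it.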
For decades the two sides talked past each other because they didn't recognize or acknowledge the different assumptions that were being taken for granted by the other side. Even today papers are being written that continue to make this mistake and sustain this confusion.
The consequences of implicit but inconsistent assumptions stretch even further. For example, assuming that an MCMC CLT exists is reasonable if one restricts oneself to a very small class of modeling circumstances, say general linear models with conjugate priors and no misfit.
In this restricted context other algorithms _also_ do really well, such as expectation propagation and old-school variational Bayes. But when people talk about the superiority of these algorithms, they're implicitly assuming this particular context.
The problem is that this context does not generalize to many applied settings, where practitioners are working with complex systems and building correspondingly elaborate models. There, none of the assumed consequences can be taken for granted.
In the end practitioners do their best. They read the literature and follow the guidance as closely as they can, not recognizing that the guidance may not be appropriate for their particular analysis and having no idea how to empirically verify whether it is.
Again, much of the work in the literature is very thoughtfully done, just _within its implicit assumptions_. In most cases the results are necessary but not sufficient for contemporary applied analyses, leaving untrained practitioners to cobble everything together.
In my opinion the original sin within the statistics academy is writing for one's own community instead of for the community that one wants to use one's results -- conforming to the assumptions that audience can make instead of the ones convenient for oneself.
To end on a more productive note, I have tried to expose all of the critical assumptions in Markov chain Monte Carlo, and the best methods for validating them, in a case study: betanalpha.github.io/assets/case_st…. It's way longer than any intro to MCMC you've seen, but only because it has to be. 🤓