(1/9) Presenting: Bayesian Algorithm Execution (BAX) and the InfoBAX algorithm.
Bayesian optimization finds global optima of expensive black-box functions. But what about other function properties?
w/ @KAlexanderWang @StefanoErmon at #ICML2021
URL: willieneis.github.io/bax-website
(2/9) Methods like Bayes opt / quadrature can be viewed as estimating properties of a black-box function (e.g. global optima, integrals). But in many applications we also care about local optima, level sets, top-k optima, boundaries, integrals, roots, graph properties, and more
(3/9) For a given property, we can often find an algorithm that computes it, in the absence of any budget constraint. We then reframe the task as: how do we adaptively sample to estimate *the output of this algorithm*?
We call this task “Bayesian algorithm execution” or BAX.
(4/9) Given an algorithm that computes the property we desire, our procedure InfoBAX sequentially chooses queries to the black-box function that maximally inform us of this computable property.
(5/9) Example 1: Bayesian local optimization.
If we run InfoBAX with a local optimization algorithm (e.g. evolution strategies, Nelder-Mead method, gradient descent), we get a variant of Bayesian optimization that targets local, rather than global, optima.
(6/9) Here’s an animation of InfoBAX with an evolution strategy.
Notice that, unlike (standard) Bayesian optimization, this algorithm only spends budget converging to a single local optima. It shows good performance in high dimensions.
(7/9) Example 2: Estimating shortest paths in graphs.
In graphs where it is expensive to access edge weights (e.g. transportation networks) we can use InfoBAX with Dijkstra’s algorithm to estimate shortest paths.
(8/9) Example 3: Top-k estimation.
Given a set of points X, we may wish to estimate the subset of k points that have the highest value under a black-box function f. This generalizes standard Bayesian optimization (i.e. top-1 estimation) to top-k estimation.
(9/9) Our method has connections to stats/ML areas such as Bayesian optimal experimental design, information theory, optimal sensor placement, stepwise uncertainty reduction, likelihood-free Bayesian inference, and more.
See more discussion in our paper arxiv.org/abs/2104.09460
Share this Scrolly Tale with your friends.
A Scrolly Tale is a new way to read Twitter threads with a more visually immersive experience.
Discover more beautiful Scrolly Tales like this.