Random numbers (e.g. from PRNGs) are everywhere in AI—but are they actually a good idea? In particular, are random numbers the best we can do for numerical problems like linear algebra, integration and optimisation? Probabilistic Numerics (PN) has (very radical!) views 🧵
In numerical integration, e.g. estimating ∫_{-3}^{3} f(x) dx, one popular approach is Monte Carlo (named after the casino), which uses random numbers to select the locations x_i of evaluations f(x_i).
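To make this concrete, here is a minimal sketch (not from the thread) of plain Monte Carlo integration; the integrand f and all settings below are stand-ins chosen purely for illustration.

```python
# Minimal Monte Carlo estimate of the integral of f over [-3, 3].
import numpy as np

def f(x):
    return np.exp(-x ** 2)            # stand-in integrand, for illustration only

rng = np.random.default_rng(0)        # a PRNG (note: a seed already sneaks in)
a, b, n = -3.0, 3.0, 10_000
x = rng.uniform(a, b, size=n)         # random evaluation locations x_i
estimate = (b - a) * f(x).mean()      # (b - a) times the average of the f(x_i)
print(estimate)                       # close to the true value, sqrt(pi) ≈ 1.772
```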
The PN approach to numerical integration is called Bayesian quadrature (BQ).
BQ—e.g. our algorithm WSABI—can be faster than Monte Carlo, using fewer evaluations to achieve a required level of error. In this example, Monte Carlo takes minutes (using thousands of evaluations) to reach an error that WSABI achieves in seconds (using a handful of evaluations).
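For intuition, the sketch below implements vanilla Bayesian quadrature: a Gaussian-process model of f whose integral is available in closed form. It is not WSABI itself (WSABI adds a square-root warping and active selection), and the kernel, lengthscale and evaluation points are illustrative assumptions.

```python
# Vanilla Bayesian quadrature sketch: place a GP prior on f, condition on a few
# evaluations, and read off the posterior mean of the integral in closed form.
import numpy as np
from scipy.special import erf

def k(x1, x2, ell=1.0, var=1.0):
    """RBF kernel k(x, x') = var * exp(-(x - x')^2 / (2 ell^2))."""
    return var * np.exp(-(x1[:, None] - x2[None, :]) ** 2 / (2 * ell ** 2))

def kernel_integral(xs, a, b, ell=1.0, var=1.0):
    """z_i = integral over [a, b] of k(x, x_i) dx (closed form for the RBF kernel)."""
    c = np.sqrt(2) * ell
    return var * ell * np.sqrt(np.pi / 2) * (erf((b - xs) / c) - erf((a - xs) / c))

def f(x):
    return np.exp(-x ** 2)                  # stand-in integrand, as before

a, b = -3.0, 3.0
xs = np.linspace(a, b, 7)                   # a handful of evaluation locations
ys = f(xs)                                  # the (possibly expensive) evaluations
K = k(xs, xs) + 1e-9 * np.eye(len(xs))      # jitter for numerical stability
z = kernel_integral(xs, a, b)
weights = np.linalg.solve(K, z)             # BQ weights, K^{-1} z
print(weights @ ys)                         # posterior mean of the integral
```

The weights play the same role as Monte Carlo's uniform 1/n averaging, but here they are informed by the model.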
BQ's speed-up is partly due to BQ's use of complex models, which squeeze more information out of each evaluation. However, these complex models take a long time to compute. This makes BQ's wall-clock speed-up even more impressive: BQ starts with a big handicap, but still wins!
As in this thread—the better-informed (and consequently more complex) our model, the faster BQ can be.
BQ can also improve performance through selecting evaluations more *intelligently*. Using the complex model p(f(x)), WSABI evaluates at the x_i at which f(x_i) is most uncertain—such evaluations result in reductions in uncertainty, and increases in information.
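That selection step might look something like the uncertainty-sampling sketch below: repeatedly evaluate wherever the model's posterior variance is largest. Again, this is the generic idea rather than WSABI's actual acquisition (which works on the warped model); the grid, kernel and starting points are assumptions for illustration.

```python
# Uncertainty sampling sketch: pick the next evaluation where the GP over f is
# least certain, i.e. where its posterior variance is largest.
import numpy as np

def k(x1, x2, ell=1.0, var=1.0):
    return var * np.exp(-(x1[:, None] - x2[None, :]) ** 2 / (2 * ell ** 2))

def f(x):
    return np.exp(-x ** 2)                     # stand-in integrand

xs = np.array([-3.0, 0.0, 3.0])                # evaluations so far
ys = f(xs)
grid = np.linspace(-3.0, 3.0, 601)             # candidate locations

for _ in range(4):
    K = k(xs, xs) + 1e-9 * np.eye(len(xs))
    Ks = k(grid, xs)                           # cross-covariances k(grid, xs)
    sol = np.linalg.solve(K, Ks.T)             # K^{-1} k(xs, grid)
    var_post = 1.0 - np.sum(Ks * sol.T, axis=1)  # posterior variance on the grid
    x_next = grid[np.argmax(var_post)]         # most uncertain location
    xs, ys = np.append(xs, x_next), np.append(ys, f(x_next))
    print(f"next evaluation at x = {x_next:+.2f}")
```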
From this perspective of *selecting* evaluations to achieve a goal (like reducing uncertainty)—using random numbers is clearly sub-optimal. Why roll a die (no-one is forcing us to roll!) when we could just pick the six?
Recall that our strong model enables us to intelligently select evaluations—the computational costs (costs in time) of the model are better viewed as a sound INVESTMENT. You pay for the model up-front, but then it pays you back with interest.
That is, a random number is not cheap. More precisely, a random number is cheap like a rusty second-hand car—cheap in the short run, perhaps, but more expensive in the long run.
Random numbers achieve their "cheapness" by imposing constraints: no knowledge of the task to be solved (e.g. in Monte Carlo, the locations x are chosen without reference to f), and no more than minimal storage of previous choices of x. These self-imposed constraints are extreme.
The Monte Carlo method was conceived in the 1940s, when compute was very limited—then, perhaps the up-front costs of a complex model could not be afforded. But is it still reasonable to labour under the computational constraints of vacuum tubes?
At this point, we need to define what is unique about random numbers: they are *unpredictable*. But unpredictability is hard to define! Which of the following sequences is random?
622… was generated by throwing a 6-sided die 7 times and repeating the result of the i-th throw i times. This sequence is “random” because it is unpredictable, but it does not pass standard tests of randomness.
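(Sketching that procedure below: the first two throws are chosen to match the start of the sequence; the remaining throws are made up for illustration.)

```python
# Repeat the i-th die throw i times: 6 once, 2 twice, 5 three times, ...
throws = [6, 2, 5, 1, 3, 4, 2]        # first two match "6, 2, 2, ..."; rest made up
sequence = [t for i, t in enumerate(throws, start=1) for _ in range(i)]
print(sequence)                        # [6, 2, 2, 5, 5, 5, 1, 1, 1, 1, ...]
```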
100… are taken from a CD-ROM published by George Marsaglia, containing random numbers generated using a variety of physical random number generators. These digits (perhaps) were once “random”. But, now that you know where we got them, they are obviously deterministic, not random
712… was generated by the von Neumann method, a pseudo-random number generator (PRNG), using the seed 908344. It is the kind of sequence used in real Monte Carlo algorithms, and—now that you know the seed—entirely deterministic, no longer random!
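For the curious, here is a sketch of the middle-square idea behind von Neumann's method; digit-extraction conventions vary, so this is not guaranteed to reproduce the exact sequence above.

```python
# Von Neumann's middle-square method: square the state, keep the middle digits.
def middle_square(seed: int, n_digits: int = 6):
    state = seed
    while True:
        squared = str(state ** 2).zfill(2 * n_digits)   # pad to 2 * n_digits digits
        start = (len(squared) - n_digits) // 2
        state = int(squared[start:start + n_digits])    # keep the middle n_digits
        yield state

gen = middle_square(908344)            # the seed mentioned above
print([next(gen) for _ in range(5)])   # entirely determined by the seed
```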
I apologise for giving you a trick question. The point is that random numbers are defined as being unpredictable to *average humans*. As such, random numbers make sense for cryptography. However, random numbers seem a strange basis on which to build scientific computing.
In a sort of analogy to cryptography, some defend the use of random numbers by arguing that randomness provides a defence against adversarial problems, citing concerns about subjectivity and bias. Importantly, however, the problem tackled by a numerical method is not adversarial.
Exactly the opposite: problems (like architectures and loss functions) and numerical methods (like optimisers) are designed *together*. We're all on the same team! The problems are hence regular and well-characterised by source code, so why not give that information to our model?
Relatedly, Monte Carlo is often believed to be better for the "adversarial" setting of high-dimension, because Monte Carlo's error "does not depend on dimension"—however, here Nicola shows that Monte Carlo estimates are NOT independent of input dimensionality at all.
Let's return to PRNGs, the most common random numbers in practice. Our discussion above about the seed for a PRNG is not purely philosophical. Studies have noted that the empirical performance of popular machine learning algorithms is sensitive to the choice of random seed.
Sensitivity to the seed is wild! The seed is, after all, a hyperparameter. But we can't optimise this hyperparameter, because we can't get a gradient with respect to the seed: the PRNG is EXPLICITLY DESIGNED to make its output statistics independent of the seed. What a weird thing to design!
Some people fix the seed (e.g. for reproducibility)—but doing so just underlines that you don't need random numbers. Fixing a seed defeats the one goal of a PRNG: to be random. If you're going to use a deterministic sequence, why not use one designed for *your* goals?
People who say that AI could not possibly kill us all seem very confident about the geopolitical relationships between states that have invested a lot in both AI and nuclear weapons
Even if AI does not "press the button", the rapidly-advancing, uncertain progress of AI might threaten the balance of peace: e.g. if AI-powered underwater drones prove capable of locating nuclear submarines, a state might think it could launch a successful first strike
Even if a state THINKS that an opposing state is developing such a drone, peace could be threatened
My views on covid seem to have become a bit, well—radical. Please allow me to explain. I did not start out radical. I am lucky to have a settled, establishment-adjacent, career. Three years ago, on the eve of the pandemic, I trusted the establishment.
I trusted the establishment when it said that there was ~no covid in Oxford in Mar 20. Then our little family all got it. I trusted the establishment when it said that, because we were healthy, we'd be fine. I trusted the establishment when it said kids don't get sick
Our 12 month old developed a fever (40.1 degrees) for three weeks, aside from one day—when, over a few hours, our baby's temperature dropped like a stone, into hypothermia, like that little body was just shutting down. And we still don't know about the long-term consequences.
You can't judge a Twitter fiction account by a single tweet any more than you can judge a novel(la) by fifty of its words. You have to read for a while to absorb the voice, the themes, the trajectories. Nonetheless, here's a sampler of tweets from my favourite fiction accounts:
It's Sunday—porridge day! Porridge, my (slightly-sticky) emotional crutch. I'm going to document my porridge with photos, in the hope that you may be able to share in my enjoyment.
Most are not OK with eating a raw egg because of the 1 in 20,000 risk of Salmonella—which causes diarrhoea & vomiting.
Most seem OK with getting covid (when triply-vaccinated) despite the **1 in 20** risk of #LongCovid—which causes diarrhoea, vomiting, BRAIN DAMAGE & much more