Imagine predictive medicine LLMs being used by a doctor to help treat patients. One is made by Merck. The doctor wants to cure his patients as quickly as possible. The Merck LLM also tries to do this, but has a preference for Merck drugs, resulting in substantial misalignment.
LLM prompts vary widely. A model might be much more likely to hallucinate when asked for citations to the functional analysis literature than when asked for state capitals. Even a model that is calibrated on average can be systematically over-confident for one and under-confident for the other.
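A toy calculation makes this concrete. The records, prompt-type names, and the 0.8 confidence level below are invented purely for illustration (they are not from the paper); the sketch only shows how an average calibration gap of zero can hide gaps of opposite sign on two kinds of prompts.

```python
from collections import defaultdict

# Hypothetical records: (prompt_type, stated_confidence, was_correct).
# The model always reports 0.8 confidence in this made-up data.
records = (
    [("state_capitals", 0.8, True)] * 10
    + [("citations", 0.8, True)] * 6
    + [("citations", 0.8, False)] * 4
)


def calibration_gap(rows):
    """Mean stated confidence minus empirical accuracy.

    Positive means over-confident, negative means under-confident.
    """
    mean_conf = sum(conf for _, conf, _ in rows) / len(rows)
    accuracy = sum(1 for _, _, ok in rows if ok) / len(rows)
    return mean_conf - accuracy


# Group the records by prompt type.
by_type = defaultdict(list)
for row in records:
    by_type[row[0]].append(row)

overall_gap = calibration_gap(records)
gaps = {name: calibration_gap(rows) for name, rows in by_type.items()}

print(f"overall gap: {overall_gap:+.2f}")
for name, gap in gaps.items():
    print(f"{name}: {gap:+.2f}")
```

Pooled over all prompts, stated confidence matches accuracy, yet the model is over-confident on citation prompts and under-confident on state capitals.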
Instead of making point predictions, we can quantify uncertainty by producing "prediction sets": sets of labels that contain the true label with (say) 90% probability. The problem is that in a k-label prediction problem, there are 2^k possible prediction sets. The curse of dimensionality!
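As a rough illustration of what a prediction set is, here is a minimal sketch that greedily adds the most likely labels until their total probability reaches the target. It is a naive baseline, not the method from the paper, and it assumes the model's probabilities can be taken at face value. The function names and the label probabilities are hypothetical.

```python
def greedy_prediction_set(probs, coverage=0.9):
    """Return the smallest top-ranked set of labels whose mass reaches `coverage`.

    `probs` maps each label to the model's (assumed trustworthy) probability.
    """
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    chosen, mass = set(), 0.0
    for label, p in ranked:
        chosen.add(label)
        mass += p
        if mass >= coverage:
            break
    return chosen


def num_candidate_sets(k):
    """Number of distinct subsets of k labels, i.e. possible prediction sets."""
    return 2 ** k


# Made-up probabilities for a 4-label problem.
probs = {"flu": 0.55, "cold": 0.30, "covid": 0.10, "allergy": 0.05}
pred_set = greedy_prediction_set(probs, coverage=0.9)

print(sorted(pred_set))
print(num_candidate_sets(len(probs)))
```

With only 4 labels there are already 16 possible sets, and the count doubles with every extra label, which is why reasoning about each possible set separately quickly becomes infeasible.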
First, some links. Here is our paper: arxiv.org/abs/2206.01067 Here is a talk I gave about it: simonsfoundation.org/event/robust-a… It's joint work with Bastani, Gupta, @crispy_jung, Noarov, and Ramalingam. Our code will shortly be available on GitHub, in the repository linked in the paper.