Latest Twitter Threads by @adityagrover_ on Thread Reader App

Aditya Grover

@adityagrover_

Assistant Professor of CS @UCLA. Previously: PhD @StanfordAILab, bachelors @IITDelhi. AI, ML, Climate.

Mar 9, 2022 • 8 tweets • 3 min read

How does exploration vs exploitation affect reward estimation? Excited to share our #AISTATS2022 paper that constructs optimal reward estimators by leveraging the demonstrator's behavior en route to optimality.

🧵:

Exploration vs exploitation is key to any no-regret learner in bandits. We derive an information-theoretic lower bound that applies to any demonstrator and shows a quantitative tradeoff between exploration and reward estimation. 2/
