One of my favorite parts of grad school is learning about all the awesome work my friends are doing. I thought I'd make a thread of some of it (for most of them, the first paper of a PhD!) that's coming out this week at #NeurIPS2021. Apologies in advance if I forgot anyone:
First up: An elegant regularization technique for stabilizing Q-functions by @alexlioralexli: proceedings.neurips.cc/paper/2021/fil…. I really like the idea of Fourier features and it was neat to see them applied to RL. The NTK-based analysis taught me a bunch as well.
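To give a feel for the idea (this is a minimal illustrative sketch, not the paper's exact method; the class name, feature count, and scale parameter here are my own assumptions), Fourier features for a Q-network typically mean projecting the state through sines and cosines before the MLP:

```python
# Illustrative sketch: random Fourier features as the input layer of a Q-network.
# Hypothetical names/hyperparameters; the actual paper learns its features.
import torch
import torch.nn as nn

class FourierFeatureQNetwork(nn.Module):
    def __init__(self, state_dim, action_dim, num_features=256, scale=1.0):
        super().__init__()
        # Fixed random projection B; sin/cos of (s @ B) gives the Fourier features.
        self.register_buffer("B", torch.randn(state_dim, num_features) * scale)
        self.mlp = nn.Sequential(
            nn.Linear(2 * num_features, 256), nn.ReLU(),
            nn.Linear(256, action_dim),
        )

    def forward(self, state):
        proj = state @ self.B
        features = torch.cat([torch.sin(proj), torch.cos(proj)], dim=-1)
        return self.mlp(features)  # Q-values, one per discrete action
```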
Next, a parallelized training procedure for DEQs and their inputs by @SwaminathanGur3: arxiv.org/abs/2111.13236. Full of solid optimization theory leveraged to provide some really impressive empirical results. Implicit models are getting more impressive every day.
Next, an extension of learning under strategic behavior to the sequential setting by @keegan_w_harris: arxiv.org/abs/2106.03827. Knowledge of response dynamics is a very cool tool for incentivization. Really excited about what's coming next in this line of work.
Next, multi-step curious exploration by @mendonca_rl: openreview.net/forum?id=Qf1C1…. Using ensemble disagreement as a measure of uncertainty is an idea that I think has broad and interesting applications for sequential decision making. Excited about the real robot experiments :)
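For anyone unfamiliar with the disagreement idea, here's a minimal sketch of the general pattern (my own illustrative names and sizes, not the paper's implementation): an ensemble of dynamics models predicts the next state, and the variance across their predictions acts as an intrinsic exploration bonus that is high in unfamiliar parts of the state space.

```python
# Illustrative sketch: ensemble disagreement as an exploration bonus.
# Hypothetical class/parameter names; not the paper's actual code.
import torch
import torch.nn as nn

class DynamicsEnsemble(nn.Module):
    def __init__(self, state_dim, action_dim, ensemble_size=5, hidden=256):
        super().__init__()
        self.models = nn.ModuleList([
            nn.Sequential(
                nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, state_dim),
            )
            for _ in range(ensemble_size)
        ])

    def disagreement_bonus(self, state, action):
        x = torch.cat([state, action], dim=-1)
        preds = torch.stack([m(x) for m in self.models])  # (E, batch, state_dim)
        # Variance across ensemble members, averaged over state dims -> (batch,)
        return preds.var(dim=0).mean(dim=-1)
```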
Next, learning to compose closed-loop controllers by @mihdalal: arxiv.org/abs/2110.15360. For lots of real-world problems (e.g. self driving), hierarchical decomposition works extremely well at managing complexity. This is a really solid baseline for future work.
Next, an incredibly impressive exploration of the statistical limits of imitation learning by Nived Rajaraman: proceedings.neurips.cc/paper/2021/has…. Confirms that some of our DAgger-style reductions are statistically optimal in the finite-sample regime. Quite excited about some follow-ups!