This is a well-written, well-argued blog post by @jacobmbuckman which lays out a partial agenda for deep RL research. It is a perfectly respectable position. I find it very instructive, as it is close to the opposite of my own way of doing and thinking about research.
First of all, I don't think that to "derive automated solutions to real-world tasks" is an interesting or fertile research goal. I take issue with the words "real-world" and "solution". If you let the so-called real world dictate problems you must solve, you can only do so much.
It is much more interesting to think of new problems that have not been solved yet, to find a new way of looking at an existing problem, or to redefine what it means to "solve" a problem. Or maybe not think in terms of problems, but in terms of opportunities?
Looking at groundbreaking AI research, the papers that really inspire us, my understanding is that they were rarely motivated by a desire to solve a real-world problem. (Of course, the papers are often written that way, because of peer review, funding bodies, and "respectability".)
Moving on to the more technical content of the blog post, @jacobmbuckman argues for reducing complex deep RL problems to simple MDPs that remove the messiness of the real world. I think this leads to mostly useless algorithms about which you can say nice things mathematically.
The thing is, very often the "messiness" is an essential part of the problem/environment. A simplification is always a simplification _to_ something; there is no "core" of a problem, only a "core" from a particular perspective.
In our Six Neurons paper, we abstracted away the perceptual messiness of Atari playing and showed that the core could be solved with extremely simple methods. Except that... maybe what we abstracted away was the core?
There is no good answer to that question. There is not a single problem of Atari-playing; there are a number of different problems, depending on how you choose to see things. Maybe you allow yourself a forward model, and then you can "solve" the problem easily with planning.
Or maybe the goal is not to play the games well, but to play them with interestingly different playing styles, mimicking the archetypes of human players. Well, then completely different methods are called for.
arxiv.org/abs/1802.06881
@jacobmbuckman also argues that "when looking at the big picture, nobody really cares about whether we can learn to play Atari games". What matters is "progress in algorithms to solve complex MDPs". Well, I don't particularly care about abstractions such as MDPs.
MDPs are one type of abstraction that might be seen as central to the RL enterprise. But only if you view problems from the perspective that they should be reduced to MDPs. And why should you? There is no "core". You could choose another abstraction.
Or you could simply choose to work on interesting problems and opportunities. Those are not necessarily "real-world", as I'm not even sure what that term means. Playing Atari is about as real-world to me as driving cars, balancing power grids, or composing music.
I think people often use the term "real-world" as a stand-in for "someone is willing to pay money for this". And I have no problem with people working on particular problems because they have to eat. We all do. But let's be honest: those problems are not ontologically privileged.
Also, I do not in any way mean this as an attack on @jacobmbuckman or his research agenda. This is a well-written and well-argued post laying out an agenda which I see as close to the mainstream in deep RL research. Many people get good results doing this type of research.
My own agenda is perhaps best summarized by this thread:

Basically, I'm more into problems than solutions, but even more into opportunities. And I think that to do great research, you need to either have massive resources or simply think differently.
Thanks to @jacobmbuckman for writing this post and inviting public discussion about it! Peace out 😊
Thread by Julian Togelius.