Sharing my winter break project. I tried combining @hardmaru's world models and @yaroslav_ganin's SPIRAL to see if an agent can learn to paint inside its own dream. It can! These strokes are generated purely inside a world model, yet transfer seamlessly to a real paint program.
The approach is simple. I train a world model that can predict the output stroke of a real paint program for a corresponding action, in effect creating a fully differentiable paint program. I then train an agent to paint using the same adversarial method detailed in SPIRAL.
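To make "fully differentiable paint program" concrete, here is a minimal sketch, not the thread's actual code. It assumes a hand-written Gaussian blob (`render`) as a stand-in for the trained world model, and a plain pixel L2 loss as a stand-in for the SPIRAL-style adversarial objective. All names and constants are hypothetical. The point is only that once the renderer is smooth, a gradient flows from the pixels straight back to the action.

```python
import math

SIZE = 20      # canvas is SIZE x SIZE pixels (hypothetical)
SIGMA = 3.0    # softness of the stand-in brush (hypothetical)

def render(cx, cy):
    """Stand-in 'world model': a smooth blob centred on the action (cx, cy)."""
    return [[math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * SIGMA ** 2))
             for x in range(SIZE)] for y in range(SIZE)]

def loss_and_grad(cx, cy, target):
    """Pixel L2 loss and its analytic gradient with respect to the action."""
    canvas = render(cx, cy)
    loss = gx = gy = 0.0
    for y in range(SIZE):
        for x in range(SIZE):
            p = canvas[y][x]
            diff = p - target[y][x]
            loss += diff ** 2
            # Chain rule: d(loss)/d(cx) = 2 * diff * d(p)/d(cx)
            gx += 2 * diff * p * (x - cx) / SIGMA ** 2
            gy += 2 * diff * p * (y - cy) / SIGMA ** 2
    return loss, gx, gy

# Target image: one stroke at (12, 11). The "agent" starts at (6, 6).
target = render(12.0, 11.0)
cx, cy = 6.0, 6.0
start_loss, _, _ = loss_and_grad(cx, cy, target)

# Plain gradient descent on the action itself.
for _ in range(300):
    _, gx, gy = loss_and_grad(cx, cy, target)
    cx -= 0.1 * gx
    cy -= 0.1 * gy

final_loss, _, _ = loss_and_grad(cx, cy, target)
print(round(cx, 2), round(cy, 2))
```

In the real setup the blob would be replaced by a network trained to imitate the paint program, and the action would come from an agent network rather than being optimized directly.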
I find the most amazing thing about this is how quickly it converges. The original SPIRAL paper shows it converges in 50M steps (using multiple multi-CPU/GPU instances). The MNIST-trained agent I show here used 11K steps and took ~10 hours to train on a free @GoogleColab GPU.
MNIST was easy, so I tried KMNIST. Simple chars like つ and ハ were easy but overall, the dataset is much harder😅. I'm confident it can be improved with more strokes + training steps, though. Stroke order is also all over the place. @tkasasagi @mikb0b @hardmaru @KitamotoAsanobu
I also trained a full color 15-stroke agent on CelebA. The outputs seem on par with SPIRAL (orig. version). Unlike SPIRAL, my agent doesn't make strokes that are fully covered by future strokes. I attribute this to the ease of credit assignment when full gradients are available.
I *am* aware of the latest beautiful SPIRAL results with 200(!) strokes. I'd love to try that, but @GoogleColab will probably die before finishing a single training step.
A problem I found with this approach is how it reacts to discrete actions (e.g., a binary 0 or 1 for whether to lift the brush. What happens at 0.5?). The gradients in this case do not model the real world properly and can cause weird behavior. In the worst case, it can kill training.
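A toy illustration of the 0.5 problem, under loudly stated assumptions: `real_ink` is a hypothetical paint program that thresholds the lift flag at 0.5, and `model_ink` is a hypothetical world model that was only ever trained on 0 and 1 and happens to interpolate linearly in between. Neither is taken from the thread. The two agree on the valid inputs but disagree everywhere else, and only the model has a slope for gradient descent to follow.

```python
def real_ink(lift):
    # Hypothetical real program: the brush is either up or down.
    # The 0.5 threshold is an assumption for illustration.
    return 0.0 if lift >= 0.5 else 1.0

def model_ink(lift):
    # Hypothetical world model: exact at lift = 0 and lift = 1,
    # assumed to interpolate linearly for values it never saw.
    return 1.0 - lift

def finite_diff(f, a, eps=1e-3):
    # Central finite difference as a stand-in for a gradient.
    return (f(a + eps) - f(a - eps)) / (2 * eps)

lift = 0.4
gap = abs(real_ink(lift) - model_ink(lift))    # disagreement off the valid inputs
real_slope = finite_diff(real_ink, lift)       # flat: no learning signal
model_slope = finite_diff(model_ink, lift)     # a slope the real program lacks
print(gap, real_slope, model_slope)
```

An agent following `model_slope` can settle on in-between values that look fine inside the model but mean something different, or nothing at all, in the real program.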
Perhaps in these cases, we can find a way to use gradient-free optimization (ES) to learn to explore these discrete actions, while keeping the power of regular gradient descent for continuous actions. 🤷‍♂️ Open to ideas
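One way this hybrid could look, as a rough sketch rather than a tested recipe: an evolution-strategies style estimate (Gaussian perturbations with a mean-reward baseline) updates a logit `theta` that decides the discrete lift action, while ordinary gradient descent updates a continuous position `x`. The reward function, the target position, and every hyperparameter below are hypothetical.

```python
import random

TARGET_X = 0.7  # hypothetical position where ink should land

def reward(lift, x):
    # Hypothetical reward: a lifted brush paints nothing (fixed penalty),
    # a lowered brush is scored by squared distance to the target.
    if lift == 1:
        return -1.0
    return -(x - TARGET_X) ** 2

def train(steps=300, samples=100, sigma=1.0, lr_theta=0.5, lr_x=0.1, seed=0):
    rng = random.Random(seed)
    theta, x = 0.5, 0.2  # theta > 0 means "lift"; start with the wrong choice
    for _ in range(steps):
        # ES step for the discrete action: perturb the logit, threshold it,
        # and weight each perturbation by its baseline-subtracted reward.
        noise = [rng.gauss(0.0, 1.0) for _ in range(samples)]
        rewards = [reward(1 if theta + sigma * e > 0 else 0, x) for e in noise]
        baseline = sum(rewards) / samples
        es_grad = sum((r - baseline) * e
                      for r, e in zip(rewards, noise)) / (samples * sigma)
        theta += lr_theta * es_grad
        # Gradient step for the continuous action. The reward only depends
        # on x while the brush is down, so x is updated only in that case.
        if theta <= 0:
            x -= lr_x * 2 * (x - TARGET_X)
    return theta, x

theta, x = train()
print(theta < 0, round(x, 3))
```

The ES update never differentiates through the thresholded action, so it does not rely on the world model behaving sensibly at 0.5. The continuous parameter still gets an exact gradient.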
Anyway, I have to state here that I was not the first to think of this. While planning this tweet thread, I decided to look at some citations of the World Models paper and saw openreview.net/forum?id=HJxwD…. Apparently, I am some months late to the party 😅
I use a different painter, training method, and agent architecture, but the core idea remains the same: fully differentiable world models work well for generation. Painful lesson in doing proper literature review before working. Guess that's why I'm not a researcher ;)
In any case, I'm very excited about exploring the possibilities of world models for creative ML in general. Some ideas: Style transfer, DeepDream w/ a painter parameterization, going beyond brush strokes and changing the world model itself as a way to produce art.
There's lots of room for further work in this direction, and I'll continue exploring it to see what I come across. Maybe I should write a blog post.
CelebA 30-stroke agent trained for 26k steps. About 3 days on @GoogleColab. The agent has decided on an interesting representation of eyes.
For anyone who wants to know more about this project, I wrote a blog post detailing my motivations, my approach, and the future work in this direction that I'm excited about. reiinakano.github.io/2019/01/27/wor…
Subscribe to Reiichiro Nakano