Do you formally know Monte-Carlo and TD learning, but don't intuitively understand the difference? This is for you.

distill.pub/2019/paths-per… (with @samgreydanus)
We frame TD learning as MC learning over extra, simulated "paths". I think this is a really beautiful way to think about it.
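A minimal sketch of the idea, using a made-up toy dataset (not taken from the article): state A is seen once, followed by B and a reward of 0, while B alone is seen seven more times, six of them ending with reward 1. The function names, the batch-style TD(0) loop, and the step size are all assumptions chosen for illustration.

```python
from collections import defaultdict

GAMMA = 1.0  # no discounting, to keep the arithmetic easy to follow

# Each episode is a list of (state, reward, next_state); next_state None means terminal.
# Hypothetical data: A is only ever observed on one path, B on eight.
EPISODES = (
    [[("A", 0.0, "B"), ("B", 0.0, None)]]
    + [[("B", 1.0, None)] for _ in range(6)]
    + [[("B", 0.0, None)]]
)


def mc_values(episodes, gamma=GAMMA):
    """Monte Carlo: average the returns actually observed after each state."""
    returns = defaultdict(list)
    for episode in episodes:
        g = 0.0
        for state, reward, _ in reversed(episode):
            g = reward + gamma * g
            returns[state].append(g)
    return {s: sum(gs) / len(gs) for s, gs in returns.items()}


def td0_values(episodes, gamma=GAMMA, alpha=0.1, sweeps=2000):
    """Batch TD(0): repeatedly nudge V(s) toward r + gamma * V(s')."""
    v = defaultdict(float)
    for _ in range(sweeps):
        delta = defaultdict(float)
        count = defaultdict(int)
        for episode in episodes:
            for state, reward, next_state in episode:
                bootstrap = gamma * v[next_state] if next_state is not None else 0.0
                delta[state] += (reward + bootstrap) - v[state]
                count[state] += 1
        for state in delta:
            v[state] += alpha * delta[state] / count[state]
    return dict(v)


mc = mc_values(EPISODES)
td = td0_values(EPISODES)
print("MC V(A):", mc["A"], " TD V(A):", round(td["A"], 3))
```

MC only credits A with the single path that actually passed through it. TD routes A's estimate through V(B), so it behaves as if A had also been followed by every other outcome observed from B: those are the extra "paths".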
You can see this directly in the recursive expansion of the update rules.
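An informal sketch of that expansion, under some assumptions: $x \hookleftarrow y$ is used here as shorthand for "move $x$ a step toward $y$", and primes mark rewards that may come from any trajectory through the state in question, not only the current one. Step sizes and convergence conditions are ignored.

```latex
% Monte Carlo: update toward the return observed on this one trajectory.
V(s_t) \hookleftarrow r_t + \gamma r_{t+1} + \gamma^2 r_{t+2} + \dots

% TD(0): update toward the reward plus the next state's current estimate.
V(s_t) \hookleftarrow r_t + \gamma V(s_{t+1})

% V(s_{t+1}) is itself trained by the same rule on every visit to s_{t+1},
% so it approximates an expectation over all continuations seen from there.
% Substituting recursively:
V(s_t) \hookleftarrow r_t
  + \gamma \, \mathbb{E}\!\left[ r'_{t+1} \right]
  + \gamma^2 \, \mathbb{E}\!\left[ r''_{t+2} \right] + \dots
```

The MC target contains one observed path; the expanded TD target averages over many, including combinations of path segments that were never observed together.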
Another thing I like about this: it becomes intuitive why TD methods can become risky when you work with a function approximator rather than just a tabular environment.
Another small takeaway: we find that RL equations are significantly simplified and focused on the important parts if you frame them in terms of an "update operator". I think this is a really useful pedagogical technique.
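One way to picture the "update operator" in code, as a sketch only: the helper `update_toward` and the default step size are hypothetical names and values chosen here, not something the thread specifies. Once the "move the estimate toward a target" step is factored out, MC and TD differ only in the target they pass in.

```python
def update_toward(values, state, target, alpha=0.5):
    """Move values[state] a fraction alpha of the way toward target."""
    values[state] += alpha * (target - values[state])


def mc_update(values, state, observed_return, alpha=0.5):
    # Monte Carlo: the target is the full return observed after `state`.
    update_toward(values, state, observed_return, alpha)


def td_update(values, state, reward, next_state, gamma=1.0, alpha=0.5):
    # TD(0): the target is the reward plus the next state's current estimate.
    update_toward(values, state, reward + gamma * values[next_state], alpha)


values = {"s": 0.0, "s_next": 2.0}
td_update(values, "s", reward=1.0, next_state="s_next")
print(values["s"])
```

All the bookkeeping (step size, subtracting the old estimate) lives in one place, so each method reads as a statement about its target.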
Subscribe to Chris Olah