How to get URL link on X (Twitter) App
https://twitter.com/ericjang11/status/1639882111338573824?s=20
https://twitter.com/awjuliani/status/1639806412007280640?s=20
https://twitter.com/ducha_aiki/status/1587366668845588480
One nice explanation I've seen, from an optimization standpoint, is that CE gradients don't vanish as you get closer to the target:
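A minimal sketch of one possible reading of that claim. It assumes "CE" means cross-entropy, that the target is probability 1 for the correct class, and that the gradient is taken with respect to the predicted probability p rather than the logits. The comparison with squared error is an added illustration and not part of the quoted tweet; the function names are hypothetical.

```python
def ce_grad(p):
    """Derivative of the cross-entropy loss -log(p) with respect to p."""
    return -1.0 / p


def mse_grad(p):
    """Derivative of the squared error (1 - p)**2 with respect to p."""
    return -2.0 * (1.0 - p)


# As p approaches the target of 1, the magnitude of the cross-entropy
# gradient tends to 1, while the squared-error gradient tends to 0.
for p in (0.5, 0.9, 0.99, 0.999):
    print(f"p={p}: |CE grad|={abs(ce_grad(p)):.4f}, |MSE grad|={abs(mse_grad(p)):.4f}")
```

Under a different reading (gradients with respect to softmax logits), the cross-entropy gradient is p - y and does shrink near the target, so the sketch only covers the probability-space interpretation.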
https://twitter.com/ericjang11/status/1451628209376673794
You might even be able to replace many inductive biases in RL theory with sufficient amounts of generalization.