How to get a URL link on the X (Twitter) app
https://twitter.com/1x_tech/status/1836094175630200978
@1x_tech work was mostly done by @JackMonas and Kevin Zhao btw 2/n
https://twitter.com/ericjang11/status/1639882111338573824?s=20
https://twitter.com/awjuliani/status/1639806412007280640?s=20
https://twitter.com/ducha_aiki/status/1587366668845588480
One nice explanation I've seen, from an optimization standpoint, is that CE gradients don't vanish as you get closer to the target:
https://twitter.com/ericjang11/status/1451628209376673794
You might even be able to replace many inductive biases in RL theory with sufficient amounts of generalization.