Happy to finally share what I have been working on for some time now.
Introducing »Ludic« – an LLM-RL library for the era of experience.
There are now a lot of LLM-RL codebases, many of them good, but I still want to share my own, rather idiosyncratic way of thinking about LLM-RL.
Ludic is not meant to be the next big framework that everyone uses to train frontier models; one individual cannot compete with an entire team of experienced engineers working on this full-time. What, then, is the purpose of Ludic?