Taiwei Shi
AI Researcher & Ph.D. student @nlp_usc. Intern @MSFTResearch. Formerly @GeorgiaTech @USC_ISI. NLP, RL & Computational Social Science.
Feb 17 · 10 tweets · 4 min read
For decades, we’ve trained AI to chase rewards. But humans don’t just optimize outcomes. We experience, reflect, then learn.

Can AI do the same?

Introducing 𝐄𝐱𝐩𝐞𝐫𝐢𝐞𝐧𝐭𝐢𝐚𝐥 𝐑𝐞𝐢𝐧𝐟𝐨𝐫𝐜𝐞𝐦𝐞𝐧𝐭 𝐋𝐞𝐚𝐫𝐧𝐢𝐧𝐠, a step toward AI that truly learns from experience.

Most training paradigms today teach models either to imitate examples or to optimize reward signals through trial and error.

But in both cases, feedback is not explicitly turned into behavioral insight. Models improve, but they do not explicitly reflect on why.