Latest Twitter Threads by @siddkaramcheti on Thread Reader App

Aug 26, 2021 • 5 tweets • 6 min read

In addition to the codebase, @laurel_orr1 and I wrote up a blog post (with the rest of the Propulsion team!) describing Mistral and our journey in more detail.

Check it out here, and we'd love to hear your thoughts: crfm.stanford.edu/blog.html [1/5]

https://twitter.com/siddkaramcheti/status/1430195543301492744

I really hope that our voices came through; we tried to keep it light, while also hitting on the hurdles we encountered along the way!

Not everything made it into the blog, so we also recorded a light & lively 25-min podcast: soundcloud.com/propulsion-mix… [2/5]

Aug 24, 2021 • 4 tweets • 5 min read

We're excited to open-source Mistral 🚀 - a codebase for accessible large-scale LM training, built as part of Stanford's CRFM (crfm.stanford.edu).

We're releasing 10 GPT-2 Small & Medium models with different seeds & 600+ checkpoints per run!

github.com/stanford-crfm/… [1/4]

At 10:20 PDT, @laurel_orr1 and I will be talking at the Workshop for #FoundationModels (crfm.stanford.edu/workshop.html) about Mistral, as well as our journey towards transparent and accessible training.

We hope to see you there - bring your questions! [2/4]

Jul 23, 2020 • 7 tweets • 4 min read

Since getting academic access, I’ve been thinking about GPT-3’s applications to grounded language understanding — e.g. for robotics and other embodied agents.

In doing so, I came up with a new demo:

Objects to Affordances: “what can I do with an object?”

cc @gdb

“Priming” the model was pretty straightforward — I just picked four random objects, and chose the first few affordances that came to mind:
