Siddharth Karamcheti

26 Aug, 5 tweets, 6 min read

In addition to the codebase, @laurel_orr1 and I wrote up a blog post (with the rest of the Propulsion team!) describing Mistral and our journey in more detail.

Check it out here, and we'd love to hear your thoughts: crfm.stanford.edu/blog.html [1/5]

https://twitter.com/siddkaramcheti/status/1430195543301492744

I really hope that our voices came through; we tried to keep it light, while also hitting on the hurdles we encountered along the way!

Not everything made it into the blog, so we also recorded a light & lively 25-min podcast: soundcloud.com/propulsion-mix… [2/5]

Big thanks to everyone who helped us build Mistral -- from @Thom_Wolf & @StasBekman who helped us navigate @huggingface Transformers, to @carey_phelps for providing support with @weights_biases.

Also huge shoutout to @BlancheMinerva from #EleutherAI for providing feedback! [3/5]

And of course, big thanks to the team: Jason, @Tianyi_Zh, @krandiash, @Avanika15, @RishiBommasani, @deepakn94 - Mistral is the culmination of months of our work!

Seriously - this is an amazing team and I'm incredibly lucky that I got to work with them! [4/5]

Finally, a huge thank you to our advisors for their pep talks and for keeping the lights on: @tatsu_hashimoto, @jurafsky, @chrmanning, @ChrisGPotts, @HazyResearch, and @percyliang.

And thanks to @StanfordHAI and @StanfordAILab! [5/5]

• • •

More from @siddkaramcheti

Siddharth Karamcheti

@siddkaramcheti

24 Aug

We're excited to open-source Mistral 🚀 - a codebase for accessible large-scale LM training, built as part of Stanford's CRFM (crfm.stanford.edu).

We're releasing 10 GPT-2 Small & Medium models with different seeds & 600+ checkpoints per run!

github.com/stanford-crfm/… [1/4]

At 10:20 PDT, @laurel_orr1 and I will be talking at the Workshop for #FoundationModels (crfm.stanford.edu/workshop.html) about Mistral, as well as our journey towards transparent and accessible training.

We hope to see you there - bring your questions! [2/4]