Julien Launay
DIR Large Language Modeler.
May 26, 2023
👑 It's time for a new contender in the open LLM space!

▶️ We are releasing Falcon-40B & 7B, two strong LLMs that are topping the charts on the @huggingface Open LLM leaderboard.

huggingface.co/tiiuae/falcon-…

huggingface.co/spaces/Hugging…

Cool features of Falcon:
🏎 FlashAttention (from @tri_dao @realDanFu) + multi-query attention enable fast, memory-efficient inference (see the sketch after this list).
💪 Super strong perf across the board, outperforming public & private models.
📀 And we are releasing 600GT (600B tokens) of our high-quality web data: huggingface.co/datasets/tiiua….
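For context, here is a minimal sketch of multi-query attention, the inference-oriented ingredient named above: every query head attends to a single shared key/value head, which shrinks the KV cache during autoregressive decoding. The class name, shapes, and wiring below are illustrative assumptions, not Falcon's actual implementation.

```python
# Minimal sketch of multi-query attention (MQA): every query head shares one
# key/value head, so the KV cache is n_heads times smaller when decoding.
# Class name, shapes, and wiring are illustrative assumptions, not Falcon's code.
import torch
import torch.nn.functional as F

class MultiQueryAttention(torch.nn.Module):
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads = n_heads
        self.head_dim = d_model // n_heads
        self.q_proj = torch.nn.Linear(d_model, d_model)              # one query per head
        self.kv_proj = torch.nn.Linear(d_model, 2 * self.head_dim)   # single shared K and V head
        self.out_proj = torch.nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model)
        b, t, _ = x.shape
        q = self.q_proj(x).view(b, t, self.n_heads, self.head_dim).transpose(1, 2)  # (b, h, t, hd)
        k, v = self.kv_proj(x).split(self.head_dim, dim=-1)                          # each (b, t, hd)
        k = k.unsqueeze(1).expand(b, self.n_heads, t, self.head_dim)  # broadcast the shared head
        v = v.unsqueeze(1).expand(b, self.n_heads, t, self.head_dim)
        # scaled_dot_product_attention dispatches to a FlashAttention-style kernel when available
        y = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        return self.out_proj(y.transpose(1, 2).reshape(b, t, -1))

# Quick shape check
attn = MultiQueryAttention(d_model=128, n_heads=8)
print(attn(torch.randn(2, 16, 128)).shape)  # torch.Size([2, 16, 128])
```

With a single shared K/V head, the decode-time cache per token holds 2 * head_dim values instead of 2 * d_model, which is the main inference win the tweet refers to.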
May 24, 2022
🌸 The @BigScienceLLM BLOOM 176B parameters model training has just passed 230B tokens: that’s more than a million books in two months!
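As a rough sanity check on the books comparison (the per-book figure below is an assumption, not a number from the thread):

```python
# Back-of-the-envelope check: assuming ~100k tokens per average book.
tokens_seen = 230e9          # tokens BLOOM has consumed so far
tokens_per_book = 100_000    # assumed average, not stated in the thread
print(tokens_seen / tokens_per_book)  # ~2.3 million book-equivalents
```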

🤔 But how did we decide what model to train with our one million GPU hours?

⬇️ Thread time! #acl2022nlp

🏅 We had five main considerations: it needed to be proven, scalable, efficient, multilingual, and to exhibit emergent capabilities (e.g. zero-shot generalization).

⏰ At the >100B scale, every inefficiency matters! We can’t afford an unoptimized setup…
Jun 26, 2020
💡 Can we learn challenging tasks without backpropagation? Scale a biologically-motivated method to hard datasets? Without *any* knowledge of the forward weights in the backward pass? Yes, We Can!

🎓 arxiv.org/abs/2006.12878
Joint work with @iacopo_poli @KrzakalaF @LightOnIO
[1/9]

🧐 A central question in bio-inspired ML is the weight transport problem: the backward pass cannot realistically access information about the forward weights. While local learning has been demonstrated, methods devoid of weight transport have so far failed on challenging computer vision tasks.
[2/9]
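The weight-transport-free method studied in the linked paper is direct feedback alignment (DFA): the output error is projected back to each hidden layer through a fixed random matrix, so the backward pass never reads the forward weights. Below is a minimal NumPy sketch on a two-layer MLP; layer sizes, learning rate, and the toy data are illustrative assumptions.

```python
# Minimal sketch of direct feedback alignment (DFA) on a 2-layer MLP.
# The error is sent back through a fixed random matrix B instead of W2.T,
# so the backward pass never touches the forward weights (no weight transport).
# Layer sizes, learning rate, and toy data are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
d_in, d_hidden, d_out, lr = 20, 64, 5, 1e-2

W1 = rng.normal(0.0, 0.1, (d_in, d_hidden))
W2 = rng.normal(0.0, 0.1, (d_hidden, d_out))
B = rng.normal(0.0, 0.1, (d_out, d_hidden))   # fixed random feedback matrix (never trained)

def dfa_step(x, y):
    """One training step with direct feedback alignment."""
    global W1, W2
    # Forward pass
    h = np.tanh(x @ W1)          # hidden activations, shape (batch, d_hidden)
    y_hat = h @ W2               # outputs, shape (batch, d_out)
    e = y_hat - y                # error signal (gradient of 0.5 * squared loss)
    # DFA backward pass: project the error with B instead of W2.T,
    # so no knowledge of the forward weights is needed.
    delta_h = (e @ B) * (1.0 - h ** 2)   # tanh derivative applied locally
    W2 -= lr * h.T @ e
    W1 -= lr * x.T @ delta_h
    return float((e ** 2).mean())

# Toy usage: regress random targets to check that the loss goes down.
x = rng.normal(size=(32, d_in))
y = rng.normal(size=(32, d_out))
for _ in range(200):
    loss = dfa_step(x, y)
print(f"final loss: {loss:.4f}")
```

Note that only the hidden-layer update differs from backprop: the output layer uses the true error either way, and B stays fixed throughout training.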