Julien Launay
DIR Large Language Modeler.
May 26, 2023
👑 It's time for a new contender in the open LLM space!

▶️ We are releasing Falcon-40B & 7B, two strong LLMs that are topping the charts on the @huggingface Open LLM leaderboard.

huggingface.co/tiiuae/falcon-…

huggingface.co/spaces/Hugging…

Cool features of Falcon:
🏎 FlashAttention (from @tri_dao @realDanFu) + multi-query attention enable fast, memory-efficient inference (see the sketch after this list).
💪 Super strong perf across the board, outperforming public & private models.
📀 And we are releasing 600GT (600B tokens) of our high-quality web data: huggingface.co/datasets/tiiua….
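For context, here is a minimal sketch of multi-query attention, the inference-oriented ingredient named above: every query head attends to a single shared key/value head, which shrinks the KV cache during autoregressive decoding. The class name, shapes, and wiring below are illustrative assumptions, not Falcon's actual implementation.

```python
# Minimal sketch of multi-query attention (MQA): every query head shares one
# key/value head, so the KV cache is n_heads times smaller when decoding.
# Class name, shapes, and wiring are illustrative assumptions, not Falcon's code.
import torch
import torch.nn.functional as F

class MultiQueryAttention(torch.nn.Module):
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads = n_heads
        self.head_dim = d_model // n_heads
        self.q_proj = torch.nn.Linear(d_model, d_model)              # one query per head
        self.kv_proj = torch.nn.Linear(d_model, 2 * self.head_dim)   # single shared K and V head
        self.out_proj = torch.nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model)
        b, t, _ = x.shape
        q = self.q_proj(x).view(b, t, self.n_heads, self.head_dim).transpose(1, 2)  # (b, h, t, hd)
        k, v = self.kv_proj(x).split(self.head_dim, dim=-1)                          # each (b, t, hd)
        k = k.unsqueeze(1).expand(b, self.n_heads, t, self.head_dim)  # broadcast the shared head
        v = v.unsqueeze(1).expand(b, self.n_heads, t, self.head_dim)
        # scaled_dot_product_attention dispatches to a FlashAttention-style kernel when available
        y = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        return self.out_proj(y.transpose(1, 2).reshape(b, t, -1))

# Quick shape check
attn = MultiQueryAttention(d_model=128, n_heads=8)
print(attn(torch.randn(2, 16, 128)).shape)  # torch.Size([2, 16, 128])
```

With a single shared K/V head, the decode-time cache per token holds 2 * head_dim values instead of 2 * d_model, which is the main inference win the tweet refers to.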
May 24, 2022
🌸 The @BigScienceLLM BLOOM 176B parameters model training has just passed 230B tokens: that’s more than a million books in two months!
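As a rough sanity check on the books comparison (the per-book figure below is an assumption, not a number from the thread):

```python
# Back-of-the-envelope check: assuming ~100k tokens per average book.
tokens_seen = 230e9          # tokens BLOOM has consumed so far
tokens_per_book = 100_000    # assumed average, not stated in the thread
print(tokens_seen / tokens_per_book)  # ~2.3 million book-equivalents
```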

🤔 But how did we decide what model to train with our one million GPU hours?

⬇️ Thread time! #acl2022nlp

🏅 We had five main considerations: it needed to be proven, scalable, efficient, multilingual, and to exhibit emergent capabilities (e.g. zero-shot generalization).

⏰ At the >100B scale, every inefficiency matters! We can’t afford an unoptimized setup…
Jun 26, 2020
💡 Can we learn challenging tasks without backpropagation? Scale a biologically-motivated method to hard datasets? Without *any* knowledge of the forward weights in the backward pass? Yes, We Can!

🎓 arxiv.org/abs/2006.12878
Joint work with @iacopo_poli @KrzakalaF @LightOnIO
[1/9]

🧐 A central question in bio-inspired ML is the weight transport problem: the backward pass cannot realistically access information about the forward weights. While local learning has been demonstrated, methods devoid of weight transport have so far failed on challenging computer vision tasks.
[2/9]
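The weight-transport-free method studied in the linked paper is direct feedback alignment (DFA): the output error is projected back to each hidden layer through a fixed random matrix, so the backward pass never reads the forward weights. Below is a minimal NumPy sketch on a two-layer MLP; layer sizes, learning rate, and the toy data are illustrative assumptions.

```python
# Minimal sketch of direct feedback alignment (DFA) on a 2-layer MLP.
# The error is sent back through a fixed random matrix B instead of W2.T,
# so the backward pass never touches the forward weights (no weight transport).
# Layer sizes, learning rate, and toy data are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
d_in, d_hidden, d_out, lr = 20, 64, 5, 1e-2

W1 = rng.normal(0.0, 0.1, (d_in, d_hidden))
W2 = rng.normal(0.0, 0.1, (d_hidden, d_out))
B = rng.normal(0.0, 0.1, (d_out, d_hidden))   # fixed random feedback matrix (never trained)

def dfa_step(x, y):
    """One training step with direct feedback alignment."""
    global W1, W2
    # Forward pass
    h = np.tanh(x @ W1)          # hidden activations, shape (batch, d_hidden)
    y_hat = h @ W2               # outputs, shape (batch, d_out)
    e = y_hat - y                # error signal (gradient of 0.5 * squared loss)
    # DFA backward pass: project the error with B instead of W2.T,
    # so no knowledge of the forward weights is needed.
    delta_h = (e @ B) * (1.0 - h ** 2)   # tanh derivative applied locally
    W2 -= lr * h.T @ e
    W1 -= lr * x.T @ delta_h
    return float((e ** 2).mean())

# Toy usage: regress random targets to check that the loss goes down.
x = rng.normal(size=(32, d_in))
y = rng.normal(size=(32, d_out))
for _ in range(200):
    loss = dfa_step(x, y)
print(f"final loss: {loss:.4f}")
```

Note that only the hidden-layer update differs from backprop: the output layer uses the true error either way, and B stays fixed throughout training.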