Mikhail Burtsev
Landau AI Fellow @ LIMS, Founder of open-source conversational AI framework - https://t.co/TWm2mcySvt

Apr 21, 2023, 10 tweets

🚀 1/ Excited to share our report (with Aydar Bulatov and @yurakuratov) on scaling Recurrent Memory Transformer to 2M (yes, two million)😮 tokens! 🧠🌐 #AI #NLP #DeepLearning

2/ 📈 We've tackled the quadratic complexity of attention in #Transformers by combining token-based memory & segment-level recurrence, using RMT.
🔸 RMT adapts to any Transformer family model
🔸 Memory tokens provide the recurrent connection 🎛️💡 #AI #NLP #DeepLearning
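
A minimal sketch of the recurrence described above, assuming the long input is cut into fixed-length segments and a small memory state is carried from one segment to the next. `toy_backbone` is a hypothetical stand-in for the Transformer forward pass over the memory tokens plus one segment; it is not the RMT code, and the names and the averaging rule are illustrative only.

```python
from typing import Callable, List, Sequence, Tuple

Memory = List[float]


def split_into_segments(tokens: Sequence[int], segment_len: int) -> List[List[int]]:
    """Cut a long token sequence into consecutive segments of at most segment_len."""
    return [list(tokens[i:i + segment_len]) for i in range(0, len(tokens), segment_len)]


def toy_backbone(memory: Memory, segment: List[int]) -> Memory:
    """Hypothetical stand-in for a Transformer pass over [memory tokens + segment].

    A real backbone would attend over both and emit updated memory tokens;
    here the memory is simply blended with the segment mean.
    """
    segment_mean = sum(segment) / len(segment)
    return [0.5 * m + 0.5 * segment_mean for m in memory]


def run_recurrent_memory(
    tokens: Sequence[int],
    segment_len: int,
    num_memory_tokens: int,
    backbone: Callable[[Memory, List[int]], Memory] = toy_backbone,
) -> Tuple[Memory, int]:
    """Process segments in order; the memory is the only link between them."""
    memory: Memory = [0.0] * num_memory_tokens
    num_passes = 0
    for segment in split_into_segments(tokens, segment_len):
        # Each pass sees one segment plus memory, so its cost does not
        # depend on the total input length.
        memory = backbone(memory, segment)
        num_passes += 1
    return memory, num_passes


if __name__ == "__main__":
    final_memory, passes = run_recurrent_memory(list(range(10)), segment_len=4, num_memory_tokens=2)
    print(passes, final_memory)
```

Each backbone call handles a fixed-size input, so the number of calls grows with the input length while the attention window stays constant.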

3/ 🧠 We tested RMT's memorization capabilities with synthetic datasets requiring fact memorization, detection, & reasoning. The model must separate facts from irrelevant text and use them to answer questions posed as a 6-class classification task. 🎯 #AI #NLP #DeepLearning

4/ 📊 In our experiments, we used the pretrained BERT model as the backbone for RMT. We employed curriculum learning, starting with shorter tasks & increasing length upon convergence. This improved accuracy & stability in our model's performance. 💪 #AI #NLP #DeepLearning
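
One way to picture the curriculum, as a rough sketch: train on 1-segment tasks until accuracy reaches a target, then move to 2 segments, and so on. The convergence threshold, the step cap, and the `ToyLearner` class are all assumptions made for illustration; the thread does not specify the actual convergence criterion.

```python
from typing import Callable, List, Tuple


def run_curriculum(
    train_step: Callable[[int], float],
    max_segments: int,
    target_accuracy: float = 0.99,      # assumed convergence threshold
    max_steps_per_stage: int = 1000,    # assumed safety cap per stage
) -> List[Tuple[int, int, float]]:
    """Train on tasks of 1, 2, ..., max_segments segments in order.

    train_step(num_segments) performs one update on tasks of that length and
    returns the current accuracy. A stage ends once the target is reached
    (or the step cap is hit), and the next, longer stage begins.
    """
    history = []
    for num_segments in range(1, max_segments + 1):
        accuracy = 0.0
        steps_used = 0
        while steps_used < max_steps_per_stage and accuracy < target_accuracy:
            accuracy = train_step(num_segments)
            steps_used += 1
        history.append((num_segments, steps_used, accuracy))
    return history


class ToyLearner:
    """Hypothetical learner: longer tasks need proportionally more accumulated skill."""

    def __init__(self) -> None:
        self.skill = 0

    def train_step(self, num_segments: int) -> float:
        self.skill += 1
        return min(1.0, self.skill / (10 * num_segments))


if __name__ == "__main__":
    learner = ToyLearner()
    for stage in run_curriculum(learner.train_step, max_segments=3):
        print(stage)
```

Because skill learned on shorter tasks carries over, each longer stage in this toy starts partway to its target instead of from scratch.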

5/ 📈 RMT's extrapolation abilities: Models trained on 7 segments generalize surprisingly well even on sequences up to 2,043,904 tokens! 🔝🚀 #AI #NLP #DeepLearning
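
A quick arithmetic check on that token count. The thread gives only the 7 training segments and the 2,043,904 figure; the per-segment breakdown below (a 512-token BERT input minus 10 memory tokens and 3 special tokens, leaving 499 text tokens) is an assumption, although the total divides evenly under it.

```python
# Assumed breakdown of one segment; not stated in the thread itself.
BACKBONE_INPUT_LEN = 512
NUM_MEMORY_TOKENS = 10
NUM_SPECIAL_TOKENS = 3
TEXT_TOKENS_PER_SEGMENT = BACKBONE_INPUT_LEN - NUM_MEMORY_TOKENS - NUM_SPECIAL_TOKENS

# Figures taken from the thread.
TRAIN_SEGMENTS = 7
MAX_EVAL_TOKENS = 2_043_904

eval_segments, leftover = divmod(MAX_EVAL_TOKENS, TEXT_TOKENS_PER_SEGMENT)
train_tokens = TRAIN_SEGMENTS * TEXT_TOKENS_PER_SEGMENT

print(TEXT_TOKENS_PER_SEGMENT)   # 499
print(eval_segments, leftover)   # 4096 0
print(train_tokens)              # 3493
```

Under this assumption, the longest evaluation sequence spans 4096 segments, against 7 segments (3,493 text tokens) seen in training.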

6/ 🍃 Computational efficiency: With a fixed segment length, RMT scales linearly with input length for any model size. Larger Transformers exhibit slower quadratic scaling with sequence length, but RMT still requires fewer FLOPs, reducing them by up to 295x! 🌟✂️ #AI #NLP #DeepLearning #Efficiency
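
A back-of-the-envelope sketch of why the scaling differs. It uses a common rough per-layer estimate (about 24·n·d² FLOPs for the projections and feed-forward block plus 4·n²·d for attention) and ignores embeddings, softmax, and layer norms. The estimate, the function names, and the default sizes are assumptions, so this is not expected to reproduce the 295x figure from the report.

```python
import math


def layer_flops(seq_len: int, hidden: int) -> int:
    """Rough forward FLOPs for one Transformer layer on seq_len tokens."""
    linear_part = 24 * seq_len * hidden * hidden      # QKV/output projections + FFN
    attention_part = 4 * seq_len * seq_len * hidden   # QK^T scores + weighted sum of V
    return linear_part + attention_part


def full_attention_flops(total_tokens: int, hidden: int, layers: int) -> int:
    """One pass over the whole input: the attention term grows quadratically."""
    return layers * layer_flops(total_tokens, hidden)


def rmt_flops(total_tokens: int, hidden: int, layers: int,
              segment_len: int = 499, num_memory_tokens: int = 10) -> int:
    """One pass per segment, each over (segment + memory) tokens: linear growth."""
    num_segments = math.ceil(total_tokens / segment_len)
    return num_segments * layers * layer_flops(segment_len + num_memory_tokens, hidden)


if __name__ == "__main__":
    hidden, layers = 768, 12  # BERT-base-like sizes, assumed for illustration
    for num_segments in (1, 8, 64, 512):
        n = 499 * num_segments
        ratio = full_attention_flops(n, hidden, layers) / rmt_flops(n, hidden, layers)
        print(f"{n:>8} tokens: full / RMT FLOPs ratio = {ratio:.1f}")
```

In this model, doubling the input doubles the RMT cost but more than doubles the full-attention cost; for a single segment, RMT is slightly more expensive because of the extra memory tokens.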

7/ 🔍 Attention Patterns of Memory Operations: RMT's attention maps reveal specific patterns in memory operations during a reasoning task. 💡📚

8/ 🔗 Report: bit.ly/3Lk9jbQ
Code: bit.ly/40sMt6b

@booydar and @yurakuratov did all the work and I just had a lot of fun! 🥸

And finally on arXiv 🍾 arxiv.org/abs/2304.11062
