Vlad Lialin
Teaching robots to learn at https://t.co/keOXJH7j4U 🦾
Jul 13, 2023 12 tweets 3 min read
Parameter-efficient fine-tuning methods revolutionized the accessibility of LLM fine-tuning, but can they also revolutionize pre-training? We present ReLoRA, the first PEFT method that can be used for training from scratch! 🔥🔥

arxiv.org/abs/2307.05695
ReLoRA achieves the same perplexity as full-rank training with only a fraction of the trainable parameters. Why can't we use regular LoRA for pre-training? Because it only optimizes within a small low-rank subspace of the model parameters. That is enough for fine-tuning, but you don't want rank restrictions during pre-training.
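
To make the intuition concrete, here is a minimal PyTorch-style sketch of the idea (the class name, rank, and init choices are illustrative, not the actual ReLoRA implementation): LoRA keeps the dense weight frozen and trains a low-rank update, and a ReLoRA-style restart periodically merges that update into the frozen weight and re-initializes the factors, so the accumulated change is no longer confined to a single rank-r subspace.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen dense weight plus a trainable low-rank update (illustrative sketch)."""

    def __init__(self, in_features, out_features, rank=8):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features), requires_grad=False)
        self.lora_A = nn.Parameter(torch.randn(rank, in_features) * 0.01)  # trainable
        self.lora_B = nn.Parameter(torch.zeros(out_features, rank))        # trainable

    def forward(self, x):
        # Effective weight is W + B @ A; only A and B receive gradients.
        return x @ (self.weight + self.lora_B @ self.lora_A).T

    @torch.no_grad()
    def merge_and_reset(self):
        # ReLoRA-style restart: fold the learned update into the frozen weight,
        # then re-initialize A and B so the next cycle trains a fresh low-rank
        # update. Summed over restarts, the total change to W can have a much
        # higher rank than any single LoRA pair.
        self.weight += self.lora_B @ self.lora_A
        nn.init.normal_(self.lora_A, std=0.01)
        nn.init.zeros_(self.lora_B)
```

The sketch only shows the merge-and-reinit step; in the paper the restarts are also paired with a partial optimizer-state reset and a learning-rate schedule adjusted around each restart.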
Mar 30, 2023 4 tweets 3 min read
Yesterday we published our parameter-efficient fine-tuning survey. Let's go over some of the methods that we discuss in the paper!
We found that (IA)³ by @liu_haokun, Derek Tam, @Muqeeth10, and @colinraffel is one of the hidden gems of PEFT. It is simple, trains very few parameters, and outperforms strong methods like LoRA and Compacter. Let's quickly go over how it works.
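
Roughly, (IA)³ injects learned rescaling vectors that multiply the keys, values, and the intermediate FFN activations elementwise, while everything else stays frozen. Here is a minimal sketch of the attention part (single-head, simplified module names, FFN vector omitted; not the reference implementation):

```python
import torch
import torch.nn as nn

class IA3Attention(nn.Module):
    """Single-head attention with (IA)^3-style rescaling (illustrative sketch)."""

    def __init__(self, d_model):
        super().__init__()
        self.q = nn.Linear(d_model, d_model, bias=False)
        self.k = nn.Linear(d_model, d_model, bias=False)
        self.v = nn.Linear(d_model, d_model, bias=False)
        for proj in (self.q, self.k, self.v):
            proj.weight.requires_grad_(False)         # base model stays frozen
        self.l_k = nn.Parameter(torch.ones(d_model))  # the only trainable parameters
        self.l_v = nn.Parameter(torch.ones(d_model))

    def forward(self, x):
        q = self.q(x)
        k = self.k(x) * self.l_k                      # elementwise rescaling of keys
        v = self.v(x) * self.l_v                      # ... and values
        scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5
        return torch.softmax(scores, dim=-1) @ v
```

Each block only adds a couple of d_model-sized vectors (plus one d_ff-sized vector for the FFN), which is why the trainable parameter count is so small.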
Mar 29, 2023 9 tweets 6 min read
How to RLHF #LLAMA if you don't have hundreds of GPUs? Do it in a parameter-efficient way.
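
A sketch of what "parameter-efficient" means in practice here, using the Hugging Face peft library to put LoRA adapters on a causal LM before any RLHF training (the checkpoint path and hyperparameters are placeholders, and the reward-model/PPO loop itself is omitted):

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Placeholder path; substitute whatever LLaMA weights you have access to.
model = AutoModelForCausalLM.from_pretrained("path/to/llama-7b")

# Train only small low-rank adapters on the attention projections;
# the 7B base weights stay frozen, so no gradients or Adam states for them.
config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # module names in LLaMA-style models
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # typically well under 1% of all parameters

# The RLHF stage (reward model + PPO, e.g. with the trl library) then
# optimizes only these adapter parameters.
```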
I'm happy to finally share our parameter-efficient fine-tuning #PEFT survey! It took quite a bit more time to make than I expected, but I feel good about the result.
arxiv.org/abs/2303.15647

[Image: Taxonomy of PEFT]

PEFT methods can target several things: storage efficiency, multitask inference efficiency, and memory efficiency are among them. We are interested in the case of fine-tuning large models, so memory efficiency is a must.
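
A rough back-of-the-envelope calculation of why memory efficiency dominates (the byte counts are simplifying assumptions: an fp32 gradient plus two Adam moments per trainable parameter, ignoring activations and the weights themselves):

```python
# Rough, illustrative arithmetic: gradient/optimizer memory scales with the
# number of parameters that actually receive updates.
def optimizer_memory_gb(trainable_params, bytes_per_value=4, values_per_param=3):
    # values_per_param = gradient + Adam first/second moments (assumption)
    return trainable_params * bytes_per_value * values_per_param / 1e9

full_ft = optimizer_memory_gb(7e9)    # full fine-tuning of a 7B-parameter model
adapters = optimizer_memory_gb(20e6)  # ~20M adapter parameters (illustrative scale)
print(f"full fine-tuning: ~{full_ft:.0f} GB, PEFT: ~{adapters:.2f} GB of gradient/optimizer state")
```

Shrinking the set of trainable parameters shrinks the gradient and optimizer-state memory roughly proportionally, which is what makes fine-tuning 7B+ models feasible on a single GPU.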