Latest Twitter Threads by @ArtidoroPagnoni on Thread Reader App

May 24, 2023 • 8 tweets • 4 min read

4-bit QLoRA is here to level the playing field for LLM exploration. You can now fine-tune a state-of-the-art 65B chatbot on a single GPU in 24 hours.

Paper: arxiv.org/abs/2305.14314
Code and Demo: github.com/artidoro/qlora

https://twitter.com/Tim_Dettmers/status/1661379354507476994

QLoRA uses a frozen 4-bit base model with adapters. We backpropagate through the 4-bit weights into the adapters. QLoRA incorporates the NF4 datatype, double-quantization, and paged optimizers. We show it is on par with 16-bit finetuning at a fraction of the memory footprint.
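
To make the "frozen 4-bit base model with adapters" idea concrete, here is a minimal pure-Python sketch. It is not the code from the repository. It makes several assumptions: the 16-level codebook is built from evenly spaced normal quantiles, which is only a stand-in for the paper's NF4 table; each weight row is treated as one quantization block; and the names `make_codebook`, `quantize_block`, `dequantize_block`, and `qlora_forward` are hypothetical. It shows the forward pass only and leaves out double-quantization and paged optimizers.

```python
from statistics import NormalDist

def make_codebook(levels=16):
    # Assumption: normal quantiles rescaled to [-1, 1]; NOT the exact NF4 table.
    nd = NormalDist()
    qs = [nd.inv_cdf((i + 0.5) / levels) for i in range(levels)]
    top = max(abs(q) for q in qs)
    return [q / top for q in qs]

def quantize_block(block, codebook):
    # Scale by the block's absolute maximum, then store the nearest 4-bit code.
    absmax = max(abs(w) for w in block) or 1.0
    codes = [
        min(range(len(codebook)), key=lambda k: abs(codebook[k] - w / absmax))
        for w in block
    ]
    return codes, absmax

def dequantize_block(codes, absmax, codebook):
    # Recover approximate weights from the codes and the per-block scale.
    return [codebook[c] * absmax for c in codes]

def matvec(M, x):
    return [sum(m * v for m, v in zip(row, x)) for row in M]

def qlora_forward(q_rows, codebook, A, B, x, scale=1.0):
    # The base weights stay frozen in quantized form and are dequantized on the fly.
    W = [dequantize_block(codes, absmax, codebook) for codes, absmax in q_rows]
    base = matvec(W, x)
    # Low-rank adapter path: only A and B would receive gradient updates.
    update = matvec(B, matvec(A, x))
    return [b + scale * u for b, u in zip(base, update)]

codebook = make_codebook()
W = [[0.30, -1.20, 0.05, 0.80], [-0.45, 0.10, 0.95, -0.02]]
q_rows = [quantize_block(row, codebook) for row in W]  # hypothetical: one block per row
A = [[0.1, -0.2, 0.3, 0.0]]  # rank-1 adapter, shape 1 x 4
B = [[0.0], [0.0]]           # shape 2 x 1, zero-initialized
x = [1.0, 0.5, -1.0, 2.0]
y = qlora_forward(q_rows, codebook, A, B, x)
```

Because `B` starts at zero, `y` equals the output of the dequantized base model alone. Training would then change only `A` and `B`, while the stored 4-bit codes and per-block scales stay fixed.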
