Source: https://twitter.com/Tim_Dettmers/status/1661379354507476994

Guanaco models consist of Low-Rank Adapters (LoRA) and a base model (LLaMA). To use a Guanaco model, you therefore need to load both and combine them, which can be done in several ways. Note that the CPU memory required is the size of the final combined model, not the checkpoint size. A minimal loading sketch is shown below, followed by the individual use cases:
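The sketch below shows one way to load the base model and attach the LoRA adapter, assuming the Hugging Face transformers and peft libraries. The repository IDs used here (`huggyllama/llama-7b` for the base weights and `timdettmers/guanaco-7b` for the adapter) are assumptions for illustration; replace them with the checkpoints you actually intend to use.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model_id = "huggyllama/llama-7b"   # LLaMA base weights (assumed repo ID)
adapter_id = "timdettmers/guanaco-7b"   # Guanaco LoRA adapter (assumed repo ID)

tokenizer = AutoTokenizer.from_pretrained(base_model_id)

# Load the full base model first; CPU/GPU memory must hold the combined model.
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Attach the LoRA adapter on top of the base weights.
model = PeftModel.from_pretrained(model, adapter_id)

# Optionally merge the adapter into the base weights for faster inference.
model = model.merge_and_unload()
```

Merging the adapter is optional; keeping it separate lets you swap adapters on the same base model, while merging removes the adapter overhead at inference time.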