Thread by @lupantech on Thread Reader App

🔥Excited to release LLaMA-Adapter! With only 1.2M learnable parameters and 52K instruction data, LLaMA-Adapter turns a #LLaMA into an instruction-following model within ONE hour, delivering high-quality responses!

🚀Paper: arxiv.org/abs/2303.16199
🚀Code: github.com/ZrrSkywalker/L…

We adopt learnable adaption prompts and prepend them to the input text tokens at higher transformer layers. A zero-init attention mechanism with zero gating adaptively injects the new instructional cues into LLaMA, while effectively preserving its pre-trained knowledge.

With efficient training, LLaMA-Adapter generates high-quality responses, comparable to Alpaca with fully fine-tuned 7B parameters.

LLaMA-Adapter can be simply extended to multi-modal input, e.g., images, for image-conditioned LLaMA, which achieves superior reasoning capacity on #ScienceQA (scienceqa.github.io), a recent multi-modal science question benchmark.

With a mere 1.2 million learnable parameters, LLaMA-Adapter demonstrates superior reasoning capacity on #ScienceQA, surpassing a diverse range of multi-modal and LLM models, such as fully-finetuned MM-COT and few-shot GPT-3.

Share this Scrolly Tale with your friends.

A Scrolly Tale is a new way to read Twitter threads with a more visually immersive experience.
Discover more beautiful Scrolly Tales like this.

Share this page!

Enter URL or ID to Unroll