🔥Excited to release LLaMA-Adapter! With only 1.2M learnable parameters and 52K instruction data, LLaMA-Adapter turns a #LLaMA into an instruction-following model within ONE hour, delivering high-quality responses!
We adopt learnable adaption prompts and prepend them to the input text tokens at higher transformer layers. A zero-init attention mechanism with zero gating adaptively injects the new instructional cues into LLaMA, while effectively preserving its pre-trained knowledge.
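To make the mechanism concrete, here is a minimal PyTorch sketch of the idea in this tweet: learnable adaption prompts attended to by the token queries, with a zero-initialized gate so the prompts contribute nothing at the start of training. All names, shapes, and the separate prompt-attention term are illustrative assumptions, not the released LLaMA-Adapter code (which gates the prompt portion inside a single attention over the concatenated sequence).

```python
import torch
import torch.nn as nn

class ZeroInitAdaptionAttention(nn.Module):
    """Simplified sketch of zero-gated adaption-prompt attention."""

    def __init__(self, dim: int, n_heads: int, prompt_len: int):
        super().__init__()
        self.n_heads = n_heads
        self.head_dim = dim // n_heads
        # Learnable adaption prompt, prepended as extra keys/values at this layer.
        self.adaption_prompt = nn.Parameter(torch.randn(prompt_len, dim) * 0.02)
        # Zero-initialized gate: at step 0 the prompt injects nothing,
        # so the pre-trained knowledge of LLaMA is preserved.
        self.gate = nn.Parameter(torch.zeros(n_heads, 1, 1))
        self.q_proj = nn.Linear(dim, dim, bias=False)
        self.k_proj = nn.Linear(dim, dim, bias=False)
        self.v_proj = nn.Linear(dim, dim, bias=False)
        self.o_proj = nn.Linear(dim, dim, bias=False)

    def _split(self, x: torch.Tensor, bsz: int) -> torch.Tensor:
        return x.view(bsz, -1, self.n_heads, self.head_dim).transpose(1, 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, dim); causal masking omitted for brevity.
        bsz, seq_len, dim = x.shape
        q = self._split(self.q_proj(x), bsz)
        k = self._split(self.k_proj(x), bsz)
        v = self._split(self.v_proj(x), bsz)

        # Ordinary self-attention over the text tokens.
        attn = (q @ k.transpose(-2, -1)) / self.head_dim ** 0.5
        out = attn.softmax(dim=-1) @ v

        # Attention from the token queries to the adaption prompt,
        # scaled by the zero-init gate.
        p = self.adaption_prompt.unsqueeze(0).expand(bsz, -1, -1)
        pk = self._split(self.k_proj(p), bsz)
        pv = self._split(self.v_proj(p), bsz)
        p_attn = (q @ pk.transpose(-2, -1)) / self.head_dim ** 0.5
        out = out + self.gate * (p_attn.softmax(dim=-1) @ pv)

        return self.o_proj(out.transpose(1, 2).reshape(bsz, seq_len, dim))
```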
With this efficient training, LLaMA-Adapter generates high-quality responses comparable to Alpaca, which fully fine-tunes all 7B parameters.
LLaMA-Adapter can be easily extended to multi-modal input, e.g., images, yielding an image-conditioned LLaMA that achieves superior reasoning capacity on #ScienceQA (scienceqa.github.io), a recent multi-modal science question benchmark.
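One plausible way to realize this extension, sketched below under assumed names (`VisualAdaptionPrompt`, `clip_feature_dim`), is to project a global image embedding from a frozen visual encoder (e.g., CLIP) and add it onto the adaption prompts, so the same zero-gated attention now carries visual cues; the exact fusion in LLaMA-Adapter may differ in detail.

```python
import torch
import torch.nn as nn

class VisualAdaptionPrompt(nn.Module):
    """Sketch: fuse a global image feature into the learnable adaption prompts."""

    def __init__(self, clip_feature_dim: int, llama_dim: int, prompt_len: int):
        super().__init__()
        self.text_prompt = nn.Parameter(torch.randn(prompt_len, llama_dim) * 0.02)
        # Small trainable projection from the image embedding to LLaMA's hidden size.
        self.visual_proj = nn.Linear(clip_feature_dim, llama_dim)

    def forward(self, image_feature: torch.Tensor) -> torch.Tensor:
        # image_feature: (batch, clip_feature_dim), e.g. a frozen CLIP global embedding.
        visual = self.visual_proj(image_feature).unsqueeze(1)   # (batch, 1, llama_dim)
        # Broadcast-add the visual signal onto every adaption prompt token.
        return self.text_prompt.unsqueeze(0) + visual           # (batch, prompt_len, llama_dim)
```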
With a mere 1.2 million learnable parameters, LLaMA-Adapter demonstrates superior reasoning capacity on #ScienceQA, surpassing a diverse range of multi-modal models and LLMs, including the fully fine-tuned MM-CoT and few-shot GPT-3.
🚨Struggling to select examples for GPT-3? Try our PromptPG, the first work that applies RL to select in-context examples for GPT-3! PromptPG achieves a gain of 5.31% on TabMWP, a new dataset of tabular math word problems! Check out the data and code:👇 promptpg.github.io 🧵1/7
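For intuition, here is a minimal REINFORCE-style sketch of learning to pick in-context examples in the spirit of PromptPG. The bilinear scorer, the `run_gpt3_and_check` callback, and the +1/-1 reward for a correct/incorrect answer are assumptions for illustration, not the released PromptPG implementation.

```python
import torch
import torch.nn as nn

class ExampleSelectionPolicy(nn.Module):
    """Scores candidate in-context examples against the test problem."""

    def __init__(self, embed_dim: int):
        super().__init__()
        self.scorer = nn.Bilinear(embed_dim, embed_dim, 1)

    def forward(self, problem_emb: torch.Tensor, candidate_embs: torch.Tensor):
        # problem_emb: (embed_dim,), candidate_embs: (num_candidates, embed_dim)
        expanded = problem_emb.unsqueeze(0).expand_as(candidate_embs)
        logits = self.scorer(expanded, candidate_embs).squeeze(-1)
        return torch.distributions.Categorical(logits=logits)

def reinforce_step(policy, optimizer, problem_emb, candidate_embs,
                   run_gpt3_and_check, num_shots=2):
    """One policy-gradient update: sample examples, query GPT-3, reward correctness."""
    dist = policy(problem_emb, candidate_embs)
    picks = [dist.sample() for _ in range(num_shots)]            # sampled example indices
    reward = 1.0 if run_gpt3_and_check([p.item() for p in picks]) else -1.0
    loss = -reward * torch.stack([dist.log_prob(p) for p in picks]).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return reward
```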