Yesterday, I covered the 3 classic ways to finetune LLMs. Let's now delve into parameter-efficient finetuning techniques.
Parameter-Efficient Finetuning Part I: let's start with Prefix Finetuning.
1/6
The intuition is that the right context can steer an LLM toward a desired task without updating any of its parameters. So we learn a "prefix": a set of task-specific vectors that the model conditions on, steering its output toward the desired behavior.
2/6
Prefix finetuning is somewhat related to in-context learning or prompting -- in prompting, we change the input prompt itself, for example by rewording it or inserting certain (discrete) tokens, to get a desired output.
3/6
In Prefix Finetuning, by contrast, we work at the level of token embeddings: we prepend one or more "virtual tokens," which are continuous, task-specific vectors, to the input. We then keep the pretrained LLM frozen and train only the prefix parameters.
Below is a short sketch to illustrate this.
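A minimal PyTorch-style sketch of the idea, with illustrative names (`PrefixModel`, `base_model`): it prepends trainable vectors at the embedding layer and assumes the frozen base model accepts embeddings directly. (The paper's full method goes further and also injects prefix activations into every transformer layer.)

```python
import torch
import torch.nn as nn

class PrefixModel(nn.Module):
    # Wraps a frozen base model and adds a trainable prefix.
    def __init__(self, base_model, prefix_length=10, embed_dim=768):
        super().__init__()
        self.base_model = base_model
        for p in self.base_model.parameters():
            p.requires_grad = False          # freeze all pretrained weights
        # The only trainable parameters: continuous "virtual token" vectors.
        self.prefix = nn.Parameter(torch.randn(prefix_length, embed_dim))

    def forward(self, input_embeds):
        # input_embeds: (batch, seq_len, embed_dim) token embeddings
        batch_size = input_embeds.size(0)
        prefix = self.prefix.unsqueeze(0).expand(batch_size, -1, -1)
        # Prepend the learned prefix so the model conditions on it.
        return self.base_model(torch.cat([prefix, input_embeds], dim=1))

# Toy usage with a stand-in "LLM" (a single linear layer):
base = nn.Linear(768, 768)
model = PrefixModel(base, prefix_length=10, embed_dim=768)
out = model(torch.randn(2, 16, 768))  # only the prefix receives gradients
```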
4/6
One practical gotcha: since the prefix adds extra (virtual) tokens, the combined context can exceed the original model's maximum sequence length if we are not careful. A simple fix is to truncate the input sequences so that prefix length plus input length stays within the limit (see the sketch below).
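For illustration, a hypothetical truncation step; `max_seq_len` and `prefix_length` here are assumed values, not from the paper:

```python
import torch

max_seq_len = 1024    # assumed context limit of the base model
prefix_length = 10    # number of learned virtual tokens

input_ids = torch.randint(0, 50_000, (4, 1100))  # dummy over-long batch
# Keep only as many real tokens as fit alongside the prefix.
input_ids = input_ids[:, : max_seq_len - prefix_length]
assert input_ids.size(1) + prefix_length <= max_seq_len
```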
5/6
According to "Prefix-Tuning: Optimizing Continuous Prompts for Generation" (arxiv.org/abs/2101.00190), this method is quite competitive with the paper's "Fine-Tune" baseline (i.e., updating all model parameters).
6/6