Ars longa
Building serverless post-training for developers
Nov 19 • 5 tweets • 1 min read
On active learning in prompt optimization:
The maddening thing about the instruction mining / spec discovery side of prompt optimization is that you're essentially inferring human preferences from data.
Often, you want to be sure about those inferences!
Active learning gives us a procedure for deciding when gathering extra info is worth it, given that gathering it is costly.
There are two main "costly tools" we can give the optimizer, or at least two that come to mind:
- expensive / broad dataset search
- human-as-a-tool
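
To make the "when do we pay for extra info" decision concrete, here is a minimal sketch, not a definitive implementation. It assumes the optimizer keeps a probability over candidate specs (its guesses about what the human wants) and that each costly tool has a rough cost and a rough expected uncertainty reduction. The names `CostlyTool`, `choose_tool`, `min_entropy`, and all the numbers are hypothetical, made up for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class CostlyTool:
    name: str
    cost: float           # hypothetical cost units (dollars, minutes, annoyance)
    expected_gain: float  # assumed expected entropy reduction, in bits


def entropy_bits(probs):
    """Shannon entropy of a discrete distribution, in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def choose_tool(belief, tools, min_entropy=0.5):
    """Return the costly tool to call, or None if we are confident enough.

    belief: dict mapping candidate spec -> probability (sums to 1).
    min_entropy: assumed threshold below which we trust the inference as-is.
    """
    h = entropy_bits(belief.values())
    if h < min_entropy:
        return None  # inference is already sharp; don't pay for more info
    # A tool cannot remove more uncertainty than we actually have,
    # so cap its gain at h before comparing gain per unit cost.
    return max(tools, key=lambda t: min(t.expected_gain, h) / t.cost)


# Illustrative numbers only.
tools = [
    CostlyTool("broad_dataset_search", cost=5.0, expected_gain=0.6),
    CostlyTool("ask_human", cost=20.0, expected_gain=1.0),
]

confident = {"be concise": 0.97, "be thorough": 0.03}
unsure = {"be concise": 0.5, "be thorough": 0.5}

decision_when_confident = choose_tool(confident, tools)
decision_when_unsure = choose_tool(unsure, tools)
```

With these made-up numbers the optimizer skips both tools when one spec dominates, and reaches for the cheaper dataset search before the human when it is split 50/50. A real setup would need to estimate the gains rather than hard-code them.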