Harman Singh
PhD student @berkeley_ai. Prev: Gemini @GoogleDeepMind, AI Resident @MetaAI. Interested in intelligence.
Mar 5 · 9 tweets · 5 min read
Can LLMs Self-Verify? Much better than you'd expect.

LLMs are increasingly used as parallel reasoners, sampling many solutions at once.
Choosing the right answer is the real bottleneck.

We show that pairwise self-verification is a powerful primitive.

Introducing V1, a framework that unifies generation and self-verification:

💡 Pairwise self-verification beats pointwise scoring, improving test-time scaling
💡 V1-Infer: Efficient tournament-style ranking that improves self-verification
💡 V1-PairRL: RL training where generation and verification co-evolve to develop better self-verifiers
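
To make "tournament-style ranking" concrete, here is a rough sketch of how pairwise judgments can be turned into a ranking over sampled solutions. It is not the V1-Infer algorithm itself, which this thread does not spell out. It is a generic Swiss-style scheme, and `rank_by_pairwise_wins` and `prefer` are hypothetical names, with `prefer` standing in for a model call that judges which of two candidate solutions is better.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def rank_by_pairwise_wins(
    candidates: Sequence[T],
    prefer: Callable[[T, T], bool],
    rounds: int = 3,
) -> List[int]:
    """Return candidate indices ordered from most to fewest pairwise wins.

    `prefer(a, b)` is a hypothetical pairwise verifier: it returns True when
    `a` is judged better than `b`. In practice this would be an LLM call.
    """
    wins = [0] * len(candidates)
    for _ in range(rounds):
        # Swiss-style pairing: sort by current wins so that candidates with
        # similar records meet each other. Python's sort is stable, so ties
        # keep their original index order.
        order = sorted(range(len(candidates)), key=lambda i: -wins[i])
        # Pair neighbours in that order; with an odd count the last one sits out.
        for a, b in zip(order[0::2], order[1::2]):
            winner = a if prefer(candidates[a], candidates[b]) else b
            wins[winner] += 1
    return sorted(range(len(candidates)), key=lambda i: -wins[i])


if __name__ == "__main__":
    # Toy stand-in for the verifier: candidates are numbers, larger is "better".
    ranking = rank_by_pairwise_wins([3, 9, 1, 7], lambda a, b: a >= b)
    print(ranking)
```

Each round costs about N/2 comparisons for N candidates, so a few rounds stay far cheaper than comparing every pair. This simple version may rematch the same pair across rounds.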

🧵👇 Paper: arxiv.org/abs/2603.04304
Code: github.com/HarmanDotpy/pa…
Project page: harmandotpy.github.io/v1-verificatio…

Pairwise self-verification improves test-time scaling across code and math tasks.