The paper argues that hallucinations are not mysterious glitches but the predictable result of how LLMs are trained and evaluated.
Pretraining creates statistical pressure to make errors, and post-training benchmarks often reward confident guessing over honest uncertainty.
The fix is to realign mainstream evaluations to stop penalizing abstentions.
Pretraining inevitably produces some errors
Even if you trained on flawless text, the way models learn guarantees they’ll still slip up sometimes.
That’s because the training goal pushes them to give answers instead of saying “I don’t know.”
Calibration histograms in the paper show that GPT-4-style base models are well calibrated before RL post-training, consistent with this claim.
Arbitrary facts set a floor on hallucinations.
Details like birthdays or one-off events show up rarely in training data; when a fact appears only once, the model has essentially no signal to recall it later and is likely to guess wrong.
So for these “one-shot facts,” hallucinations are baked in.
Weak models add to the problem.
When the model family cannot represent the needed distinctions, errors persist.
The paper formalizes this via an agnostic-learning bound and gives simple cases like multiple choice, where even optimal thresholding leaves a fixed error tied to model capacity, with an example showing classic n-gram models must fail on certain context dependencies.
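To make the one-shot-fact intuition concrete, here is a small illustration of the singleton rate, the fraction of facts seen exactly once in training, which the paper ties (in a more careful bound) to the minimum hallucination rate on arbitrary facts. The toy corpus and helper below are my own sketch, not code from the paper.

```python
from collections import Counter

def singleton_rate(fact_occurrences):
    """Fraction of distinct facts that appear exactly once in training.

    Informal reading of the paper's argument: for arbitrary facts with no
    learnable pattern, facts seen only once give the model nothing to
    generalize from, so this rate behaves like a floor on hallucinations.
    """
    counts = Counter(fact_occurrences)
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / len(counts)

# Toy corpus of (entity, attribute) occurrences.
facts = [
    ("Ada Lovelace", "born 1815"),
    ("Ada Lovelace", "born 1815"),      # repeated -> learnable
    ("obscure person A", "born 1993"),  # singleton -> likely guessed later
    ("obscure person B", "born 1972"),  # singleton -> likely guessed later
]
print(f"singleton rate: {singleton_rate(facts):.2f}")  # 2 of 3 distinct facts = 0.67
```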
Post-training often reinforces guessing
Most benchmarks score models only on right vs. wrong answers.
Saying “I don’t know” gets you zero, while making a confident guess could get you a point.
That system rewards bluffing, so models learn to “sound sure” even when they’re not.
The authors survey widely used leaderboards and find abstentions largely penalized, explaining why overconfident hallucinations persist despite mitigation efforts.
The fix is to reward honesty
The authors suggest changing benchmarks so models aren’t punished for admitting uncertainty.
If benchmarks state explicit rules about when to guess and when to abstain (for instance, a confidence target with a penalty for wrong answers), models will learn to answer only when they're sufficiently confident.
This promotes behavioral calibration, where models choose between answering and abstaining according to the target confidence, and should steer the field toward more trustworthy systems.
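To see the incentive in numbers, here is a small sketch (my own illustration, in the spirit of the paper's suggestion of explicit confidence targets) comparing the expected score of a low-confidence guess under binary grading versus a rule that penalizes wrong answers relative to a confidence threshold t:

```python
def expected_score_binary(p_correct, abstain):
    """Standard benchmark: 1 point if right, 0 if wrong or abstaining."""
    return 0.0 if abstain else p_correct

def expected_score_thresholded(p_correct, abstain, t=0.75):
    """Illustrative rule: abstentions score 0, wrong answers cost t/(1-t) points.

    Answering has positive expected value only when confidence exceeds t,
    so honest abstention is the optimal move below the threshold.
    """
    if abstain:
        return 0.0
    penalty = t / (1.0 - t)
    return p_correct * 1.0 - (1.0 - p_correct) * penalty

p = 0.30  # model is only 30% sure
print(expected_score_binary(p, abstain=False))       # 0.30 -> guessing always "pays"
print(expected_score_thresholded(p, abstain=False))  # -1.80 -> better to abstain
print(expected_score_thresholded(p, abstain=True))   # 0.0
```

Under the thresholded rule, answering only beats abstaining when the model's confidence exceeds t, which is exactly the behavioral calibration described above.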
NVIDIA recently published another banger tech report!
The idea is simple: allow users to build their own custom, model-agnostic deep research agents with little effort.
Here is what you need to know:
Overview
Universal Deep Research (UDR) proposes a general, model-agnostic deep-research agent that lets users bring their own model and strategy.
Instead of a fixed pipeline, UDR compiles natural-language research strategies into executable code, runs them in a sandbox, and emits structured progress notifications before returning a final report.
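As a rough mental model of that flow (purely illustrative; compile_strategy and run_sandboxed are placeholder names, not UDR's actual API), a compiled strategy can be thought of as a routine that yields structured progress notifications before a final report:

```python
# Hypothetical sketch of the UDR execution loop: the real system compiles a
# natural-language strategy into code and runs it in a sandbox; the names
# below are placeholders, not UDR's API.

def compile_strategy(strategy_text: str):
    """Pretend 'compiler': returns a generator-based research routine."""
    def routine(llm, search):
        yield {"type": "progress", "msg": "searching sources"}
        docs = search("topic derived from the strategy")
        yield {"type": "progress", "msg": f"summarizing {len(docs)} documents"}
        report = llm(f"Summarize: {docs}")
        yield {"type": "final_report", "msg": report}
    return routine

def run_sandboxed(routine, llm, search):
    for event in routine(llm, search):  # structured notifications stream out
        print(event["type"], "-", event["msg"])

run_sandboxed(
    compile_strategy("survey recent work on X, then write a brief"),
    llm=lambda prompt: "stub report",
    search=lambda q: ["doc1", "doc2"],
)
```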
Motivation
Current deep-research tools hard-code strategy and model choice, limiting source prioritization, domain-specific workflows, and model swappability.
UDR targets all three gaps by separating the research strategy from the underlying model.
Microsoft Research releases rStar2-Agent, a 14B math reasoning model trained with agentic RL.
It reaches frontier-level math reasoning in just 510 RL training steps.
Here are my notes:
Quick Overview
rStar2-Agent (Microsoft Research). A 14B math-reasoning model trained with agentic RL that learns to think smarter by using a Python tool environment, not just longer CoT.
It introduces GRPO-RoC, a rollout strategy that filters noisy successful traces, plus infrastructure for massive, low-latency tool execution.
Method
GRPO-RoC oversamples rollouts, then keeps only the cleanest correct ones while preserving diverse failures, reducing tool-call errors and formatting issues during training.
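Here is a minimal sketch of that filtering step as I understand it; the cleanliness score and keep counts are assumptions for illustration, not the paper's implementation:

```python
import random

def filter_rollouts_roc(rollouts, keep_correct=4, keep_incorrect=4):
    """Resample-on-Correct style filtering (illustrative, not the paper's code).

    rollouts: list of dicts with keys
      'correct' (bool), 'tool_errors' (int), 'format_errors' (int), 'trace' (str)
    Correct rollouts are ranked so the cleanest ones (fewest tool/format
    errors) are kept; incorrect rollouts are sampled without filtering to
    preserve diverse failure signal for the RL update.
    """
    correct = [r for r in rollouts if r["correct"]]
    incorrect = [r for r in rollouts if not r["correct"]]

    # Keep the cleanest successful traces.
    correct.sort(key=lambda r: r["tool_errors"] + r["format_errors"])
    kept_correct = correct[:keep_correct]

    # Keep a random, unfiltered sample of failures to retain diversity.
    kept_incorrect = random.sample(incorrect, min(keep_incorrect, len(incorrect)))

    return kept_correct + kept_incorrect
```

The asymmetry is the point: successes are filtered for quality so the policy imitates clean tool use, while failures are kept unfiltered so the updates still see diverse mistakes.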
There is a huge interest in moving from hand-crafted agentic systems to lifelong, adaptive agentic ecosystems.
What's the progress, and where are things headed?
Let's find out:
This survey defines self-evolving AI agents and argues for a shift from static, hand-crafted systems to lifelong, adaptive agentic ecosystems.
It maps the field’s trajectory, proposes “Three Laws” to keep evolution safe and useful, and organizes techniques across single-agent, multi-agent, and domain-specific settings.
Paradigm shift and guardrails
The paper frames four stages: Model Offline Pretraining → Model Online Adaptation → Multi-Agent Orchestration → Multi-Agent Self-Evolving.
It introduces three guiding laws for evolution: maintain safety, preserve or improve performance, and then autonomously optimize.
Another really cool paper showing how RL can enhance an LLM's agentic and memory capabilities.
Great read for AI devs.
Here are my notes:
Overview
A framework that teaches LLM agents to decide what to remember and how to use it.
Two RL-fine-tuned components work together: a Memory Manager that learns CRUD-style operations on an external store and an Answer Agent that filters retrieved memories via “memory distillation” before answering.
Active memory control with RL
The Memory Manager selects ADD, UPDATE, DELETE, or NOOP after a RAG step and edits entries accordingly; training with PPO or GRPO uses downstream QA correctness as the reward, removing the need for per-edit labels.
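A hedged sketch of what one Memory Manager step plus its reward could look like; the store layout, policy signature, and reward wiring below are my assumptions, not the framework's actual code:

```python
# Illustrative Memory Manager step: the policy picks a CRUD-style operation
# after retrieval, and the only training signal is whether the downstream
# Answer Agent gets the question right (no per-edit labels).
OPS = ["ADD", "UPDATE", "DELETE", "NOOP"]

def memory_manager_step(policy, memory_store, retrieved, new_info):
    op, target_id, payload = policy(retrieved, new_info)  # e.g. ("UPDATE", 3, "...")
    if op == "ADD":
        memory_store.append(payload)
    elif op == "UPDATE" and target_id is not None:
        memory_store[target_id] = payload
    elif op == "DELETE" and target_id is not None:
        memory_store.pop(target_id)
    # NOOP: leave the store untouched
    return op

def reward_from_answer(answer_agent, question, memory_store, gold):
    # Downstream QA correctness is the reward fed to PPO/GRPO.
    prediction = answer_agent(question, memory_store)
    return 1.0 if prediction == gold else 0.0
```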
Anemoi is the latest multi-agent system that proves small models pack a punch when combined effectively.
GPT-4.1-mini (for planning) and GPT-4o (for worker agents) surpass the strongest open-source baseline on GAIA.
A must-read for devs:
Quick Overview
Anemoi is a semi-centralized generalist multi-agent system powered by an A2A communication MCP server from @Coral_Protocol.
It replaces purely centralized, context-stuffed coordination with direct agent-to-agent communication, letting agents talk to each other, monitor progress, refine plans, and reach consensus.
Design
A semi-centralized planner proposes an initial plan, while worker agents (web, document processing, reasoning/coding) plus critique and answer-finding agents collaborate via MCP threads.
Agents communicate directly with each other.
All participants can list agents, create threads, send messages, wait for mentions, and update plans as execution unfolds.
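For intuition, here is a hypothetical agent-side view of those primitives; the class and function names below only mirror the capabilities listed above and are not Coral Protocol's real SDK:

```python
# Hypothetical agent-side view of the A2A MCP primitives described above.
# None of these names come from Coral Protocol's actual SDK.

class A2AClient:
    def list_agents(self): ...
    def create_thread(self, title, participants): ...
    def send_message(self, thread_id, text, mentions=()): ...
    def wait_for_mentions(self, agent_name): ...
    def update_plan(self, thread_id, plan): ...

def worker_loop(client: A2AClient, me: str):
    """A worker agent blocks until mentioned, does its subtask, replies."""
    while True:
        msg = client.wait_for_mentions(me)
        if msg is None:
            break
        result = do_subtask(msg["text"])  # web search, doc parsing, coding, ...
        client.send_message(msg["thread_id"], result, mentions=["planner"])

def do_subtask(text: str) -> str:
    return f"result for: {text}"
```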
NVIDIA's recent research on LLMs has been fantastic.
Jet-Nemotron is their latest efficient language model family, and it significantly improves generation throughput.
Here are my notes:
A hybrid-architecture LM family built by “adapting after pretraining.”
Starting from a frozen full-attention model, the authors search where to keep full attention, which linear-attention block to use, and which hyperparameters match hardware limits.
The result, Jet-Nemotron-2B/4B, matches or surpasses popular full-attention baselines while massively increasing throughput on long contexts.
PostNAS pipeline
Begins with a pre-trained full-attention model and freezes MLPs, then proceeds in four steps:
1. Learn optimal placement or removal of full-attention layers
2. Select a linear-attention block
3. Design a new attention block
4. Run a hardware-aware hyperparameter search
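A toy sketch of the hardware-aware search at the end of that pipeline; the candidate space, throughput check, and scoring below are placeholders, not the actual PostNAS procedure:

```python
import itertools

def postnas_style_search(candidate_layers, linear_blocks, build_model,
                         measure_throughput, eval_accuracy, min_tok_per_s):
    """Toy hardware-aware search over which layers keep full attention and
    which linear-attention block to use elsewhere (MLPs stay frozen).
    Returns the most accurate configuration that meets the throughput budget.
    """
    best = None
    for keep_full, block in itertools.product(candidate_layers, linear_blocks):
        model = build_model(full_attention_layers=keep_full, linear_block=block)
        tps = measure_throughput(model)   # tokens/sec on the target hardware
        if tps < min_tok_per_s:
            continue                      # violates the hardware budget
        acc = eval_accuracy(model)
        if best is None or acc > best[0]:
            best = (acc, keep_full, block)
    return best
```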