OpenAI

@OpenAI

Sep 17 • 7 tweets • 3 min read

Today we’re releasing research with @apolloaievals.

In controlled tests, we found behaviors consistent with scheming in frontier models—and tested a way to reduce it.

While we believe these behaviors aren’t causing serious harm today, this is a future risk we’re preparing for. openai.com/index/detectin…

Scheming = when an AI behaves one way on the surface while hiding its true goals.

Today’s deployed systems have little opportunity to scheme in ways that could cause serious harm. The most common failures are simple deceptions—like pretending to complete a task without doing it. We’ve studied and mitigated these issues and made meaningful improvements in GPT-5 over earlier models.

But as AIs take on more complex, long-term tasks with real-world impact, the potential for harmful scheming will grow—so our safeguards and testing must grow with it.

Typically, as models become smarter, their problems become easier to address—for example, smarter models hallucinate less and follow instructions more reliably.

However, AI scheming is different.

As we train models to get smarter and follow directions, they may either better internalize human goals or just get better at hiding their existing true goals.

The core of anti-scheming research is to distinguish between these two, which requires understanding the reasoning behind a model's behavior.

In this new research with @apolloaievals, we found behaviors consistent with scheming in controlled tests across frontier models, including OpenAI o3 and o4-mini, Gemini 2.5 Pro, and Claude Opus 4.

We can significantly reduce scheming by training models to reason explicitly, using an extension to the Model Spec that prohibits scheming. That method is called deliberative alignment.

With this technique, we can reduce covert actions by 30x for o3.

However, situational awareness complicates results.

Model spec: model-spec.openai.com/2025-04-11.html
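
As a rough illustration of what a "30x" reduction means, the sketch below computes a fold reduction from covert-action counts before and after training. The function name and the counts are hypothetical, chosen only so the arithmetic comes out to 30; they are not figures from the research.

```python
from fractions import Fraction


def fold_reduction(before_hits, before_total, after_hits, after_total):
    """Return how many times lower the post-training rate is than the pre-training rate."""
    if after_hits == 0:
        # A zero post-training rate would mean dividing by zero.
        raise ValueError("fold reduction is undefined when the post-training rate is zero")
    before_rate = Fraction(before_hits, before_total)
    after_rate = Fraction(after_hits, after_total)
    return before_rate / after_rate


# Hypothetical counts, not results from the research:
# 60 covert actions in 200 rollouts before training, 2 in 200 after.
factor = fold_reduction(60, 200, 2, 200)
print(f"{float(factor):.0f}x reduction")
```

Exact fractions keep the ratio free of floating-point rounding. A fold reduction on its own does not show how large the rate was to begin with, so it is best read alongside the underlying rates.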

Frontier models can recognize when they are being tested, and their tendency to scheme is influenced by this situational awareness.

We demonstrated counterfactually that situational awareness in their chain-of-thought affects scheming rates: the more situationally aware a model is, the less it schemes, and vice versa.

Moreover, both RL training and anti-scheming training increase levels of situational awareness.

Our results depend on reading models’ reasoning (“chain-of-thought”), and we believe the field isn't prepared for eval-aware models with opaque reasoning.

Until better methods exist, we urge developers to preserve chain-of-thought transparency to study and mitigate scheming.

This is significant progress, but we have more work to do.

We’re advancing scheming research categories in our Preparedness Framework, renewing our collaboration with Apollo, and expanding our research team and scope. And because solving scheming will go beyond any single lab, we're also exploring ways to raise awareness and collaborate, including piloting cross-lab evals, launching a $500K Kaggle challenge, and urging chain-of-thought transparency industry-wide.
