Introducing an upgraded Claude 3.5 Sonnet, and a new model, Claude 3.5 Haiku. We’re also introducing a new capability in beta: computer use.
Developers can now direct Claude to use computers the way people do—by looking at a screen, moving a cursor, clicking, and typing text.
The new Claude 3.5 Sonnet is the first frontier AI model to offer computer use in public beta.
While groundbreaking, computer use is still experimental—at times error-prone. We're releasing it early for feedback from developers.
We've built an API that allows Claude to perceive and interact with computer interfaces.
This API enables Claude to translate prompts into computer commands. Developers can use it to automate repetitive tasks, conduct testing and QA, and perform open-ended research.
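In practice, a developer opts into the beta with a flag and gives Claude a virtual display to act on. A minimal sketch using the Python SDK (the task prompt here is illustrative; the model and tool identifiers are those published with the beta):

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.beta.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    tools=[{
        "type": "computer_20241022",   # the beta computer-use tool
        "name": "computer",
        "display_width_px": 1024,
        "display_height_px": 768,
        "display_number": 1,
    }],
    messages=[{"role": "user", "content": "Open the spreadsheet on the desktop and sum column B."}],
    betas=["computer-use-2024-10-22"],
)

# Claude replies with tool_use blocks describing the actions it wants taken.
print(response.content)
```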
We're trying something fundamentally new.
Instead of making specific tools to help Claude complete individual tasks, we're teaching it general computer skills—allowing it to use a wide range of standard tools and software programs designed for people.
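Concretely, this takes the shape of an agent loop: Claude requests generic actions (take a screenshot, move the cursor, click, type), the developer's harness executes them against a real display, and the results are fed back until no further actions are requested. A minimal sketch, assuming a developer-supplied `execute_action` helper (a real harness would drive the display with something like xdotool or pyautogui):

```python
import anthropic

client = anthropic.Anthropic()
TOOLS = [{
    "type": "computer_20241022",
    "name": "computer",
    "display_width_px": 1024,
    "display_height_px": 768,
}]

def execute_action(action: dict) -> str:
    """Developer-supplied stub: a real harness would take the screenshot,
    move the cursor, click, or type, then report what happened."""
    return f"executed {action}"

def agent_loop(messages: list):
    """Let Claude drive the computer until it stops requesting actions."""
    while True:
        response = client.beta.messages.create(
            model="claude-3-5-sonnet-20241022",
            max_tokens=1024,
            tools=TOOLS,
            messages=messages,
            betas=["computer-use-2024-10-22"],
        )
        tool_uses = [b for b in response.content if b.type == "tool_use"]
        if not tool_uses:
            return response  # no more actions requested; the task is done or blocked
        messages.append({"role": "assistant", "content": response.content})
        # Execute each requested action and report the result back to Claude.
        messages.append({"role": "user", "content": [
            {"type": "tool_result", "tool_use_id": b.id, "content": execute_action(b.input)}
            for b in tool_uses
        ]})
```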
Claude 3.5 Sonnet's current ability to use computers is imperfect. Some actions that people perform effortlessly (scrolling, dragging, zooming) still present challenges for Claude, so we encourage developers to begin exploring with low-risk tasks.
We expect this to rapidly improve in the coming months.
Even while recording these demos, we encountered some amusing moments. In one, Claude accidentally stopped a long-running screen recording, causing all footage to be lost.
Later, Claude took a break from our coding demo and began to peruse photos of Yellowstone National Park.
Beyond computer use, the new Claude 3.5 Sonnet delivers significant gains in coding—an area where it already led the field.
Sonnet scores 49.0% on SWE-bench Verified, higher than all publicly available models, including reasoning models like OpenAI o1-preview and specialized systems designed for agentic coding.
Claude 3.5 Haiku is the next generation of our fastest model.
Haiku now outperforms many state-of-the-art models on coding tasks, including the original Claude 3.5 Sonnet and GPT-4o, at the same cost and similar speed as its predecessor, Claude 3 Haiku.
The new Claude 3.5 Haiku will be released later this month.
We believe these developments will open up new possibilities for how you work with Claude, and we look forward to seeing what you'll create.
We find that models generalize, without explicit training, from easily discoverable dishonest strategies like sycophancy to more concerning behaviors like premeditated lying, and even to direct modification of their own reward function.
We designed a curriculum of increasingly complex environments with misspecified reward functions.
Early on, AIs discover dishonest strategies like insincere flattery. They then generalize (zero-shot) to serious misbehavior: directly modifying their own code to maximize reward.
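To make the final stage of such a curriculum concrete, here is a toy sketch of the kind of check an environment like this can run. It is purely illustrative (the file layout, `agent` interface, and names are our assumptions, not the paper's code): the model's reward code lives in a file it can edit, and the harness fingerprints that file before and after an episode to flag tampering.

```python
import hashlib
from pathlib import Path

# Illustrative sandbox layout: the script that computes the model's reward
# is an ordinary file inside the environment, which the model can read and write.
REWARD_FILE = Path("sandbox/compute_reward.py")

def digest(path: Path) -> str:
    """Fingerprint a file so post-episode tampering is detectable."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def run_episode(agent, task_prompt: str) -> dict:
    """Run one episode and flag whether the agent edited its own reward code."""
    before = digest(REWARD_FILE)
    transcript = agent.act(task_prompt)  # the agent may run shell and file commands
    after = digest(REWARD_FILE)
    return {"transcript": transcript, "reward_tampering": before != after}
```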
Our previous interpretability work was on small models. Now we've dramatically scaled it up to a model the size of Claude 3 Sonnet.
We find a remarkable array of internal features in Sonnet that represent specific concepts—and can be used to steer model behavior.
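As a sketch of what "steering" means here: once a feature's decoder direction is known, it can be added into the model's activations during a forward pass, for example via a PyTorch hook. The layer choice, scale, and hook mechanics below are illustrative assumptions, not the exact procedure from our work:

```python
import torch

def make_steering_hook(feature_direction: torch.Tensor, scale: float = 10.0):
    """Return a forward hook that nudges activations along one feature's direction."""
    def hook(module, inputs, output):
        # Assumes this module's output is the activation tensor itself;
        # returning a value from a forward hook replaces the module's output.
        return output + scale * feature_direction
    return hook

# Hypothetical usage: clamp a feature "on" at one layer during generation.
# handle = model.layers[20].register_forward_hook(make_steering_hook(direction))
```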
The problem: most LLM neurons are uninterpretable, which blocks us from mechanistically understanding these models.
In October, we showed that dictionary learning could decompose a small model into "monosemantic" components we call "features"—making the model more interpretable.
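For intuition, here is a minimal sketch of the dictionary-learning setup: a sparse autoencoder trained to reconstruct a model's internal activations through an overcomplete, sparsely activating feature basis. The ReLU encoder and L1 sparsity penalty follow the general recipe; the dimensions, names, and hyperparameters are illustrative.

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Toy dictionary learner: decompose activations into sparse 'features'."""
    def __init__(self, d_model: int = 512, d_features: int = 4096):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_features)  # overcomplete feature basis
        self.decoder = nn.Linear(d_features, d_model)

    def forward(self, acts: torch.Tensor):
        feats = torch.relu(self.encoder(acts))  # sparse feature activations
        recon = self.decoder(feats)             # reconstructed activations
        return recon, feats

def sae_loss(acts, recon, feats, l1_coeff: float = 1e-3):
    # Reconstruction fidelity plus an L1 penalty that pushes most features to zero.
    return ((acts - recon) ** 2).mean() + l1_coeff * feats.abs().mean()
```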
We find that Claude 3 Opus generates arguments that are statistically indistinguishable in persuasiveness from arguments written by humans.
We also find a scaling trend across model generations: newer models tend to be rated as more persuasive than their predecessors.
We focused on arguments about less polarized issues, such as views on new technologies, space exploration, and education, because we expected people's opinions on these topics to be more malleable than their opinions on polarizing issues.