Anthropic

@AnthropicAI

Oct 22, 2024 • 9 tweets • 3 min read

Introducing an upgraded Claude 3.5 Sonnet, and a new model, Claude 3.5 Haiku. We’re also introducing a new capability in beta: computer use.

Developers can now direct Claude to use computers the way people do—by looking at a screen, moving a cursor, clicking, and typing text.

The new Claude 3.5 Sonnet is the first frontier AI model to offer computer use in public beta.

While groundbreaking, computer use is still experimental—at times error-prone. We're releasing it early for feedback from developers.

We've built an API that allows Claude to perceive and interact with computer interfaces.

This API enables Claude to translate prompts into computer commands. Developers can use it to automate repetitive tasks, conduct testing and QA, and perform open-ended research.
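
The perceive-and-act loop described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Anthropic's implementation or API: the action names (`screenshot`, `left_click`, `type`), the `FakeComputer` class, and the `scripted_model` function are all hypothetical stand-ins, chosen so the example runs without network access or a real display.

```python
from dataclasses import dataclass, field


@dataclass
class FakeComputer:
    """Hypothetical in-memory stand-in for a real screen, mouse, and keyboard."""
    cursor: tuple = (0, 0)
    typed: str = ""
    log: list = field(default_factory=list)

    def screenshot(self) -> str:
        # A real tool would return image data; here we return a text summary.
        return f"cursor={self.cursor} typed={self.typed!r}"

    def execute(self, action: dict) -> str:
        """Apply one requested action and return the resulting observation."""
        kind = action["action"]
        if kind == "left_click":
            self.cursor = tuple(action["coordinate"])
        elif kind == "type":
            self.typed += action["text"]
        elif kind != "screenshot":
            raise ValueError(f"unsupported action: {kind}")
        self.log.append(kind)
        return self.screenshot()


def scripted_model(prompt: str, observation: str, step: int):
    """Hypothetical model: returns the next action, or None when finished.

    A real model would choose its action from the observation; this
    scripted stand-in ignores it and follows a fixed plan.
    """
    plan = [
        {"action": "screenshot"},
        {"action": "left_click", "coordinate": [120, 40]},
        {"action": "type", "text": prompt},
    ]
    return plan[step] if step < len(plan) else None


def run_agent(prompt: str, computer: FakeComputer, max_steps: int = 10) -> str:
    """Request actions one at a time and execute each until the model stops."""
    observation = computer.screenshot()
    for step in range(max_steps):
        action = scripted_model(prompt, observation, step)
        if action is None:
            break
        observation = computer.execute(action)
    return observation


pc = FakeComputer()
final_observation = run_agent("hello", pc)
```

The `max_steps` cap is a design choice for the sketch: because the thread describes computer use as experimental and at times error-prone, bounding the loop keeps a confused run from continuing indefinitely.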

We're trying something fundamentally new.

Instead of making specific tools to help Claude complete individual tasks, we're teaching it general computer skills—allowing it to use a wide range of standard tools and software programs designed for people.

Claude 3.5 Sonnet's current ability to use computers is imperfect. Some actions that people perform effortlessly—scrolling, dragging, zooming—currently present challenges. So we encourage exploration with low-risk tasks.

We expect this to rapidly improve in the coming months.

Even while recording these demos, we encountered some amusing moments. In one, Claude accidentally stopped a long-running screen recording, causing all footage to be lost.

Later, Claude took a break from our coding demo and began to peruse photos of Yellowstone National Park.

Beyond computer use, the new Claude 3.5 Sonnet delivers significant gains in coding—an area where it already led the field.

Sonnet scores higher on SWE-bench Verified than all available models—including reasoning models like OpenAI o1-preview and specialized agentic systems.

Claude 3.5 Haiku is the next generation of our fastest model.

Haiku now outperforms many state-of-the-art models on coding tasks—including the original Claude 3.5 Sonnet and GPT-4o—at the same cost as before.

The new Claude 3.5 Haiku will be released later this month.

We believe these developments will open up new possibilities for how you work with Claude, and we look forward to seeing what you'll create.

Read the updates in full: anthropic.com/news/3-5-model…

More from @AnthropicAI

Anthropic

@AnthropicAI

May 22

Introducing the next generation: Claude Opus 4 and Claude Sonnet 4.

Claude Opus 4 is our most powerful model yet, and the world’s best coding model.

Claude Sonnet 4 is a significant upgrade from its predecessor, delivering superior coding and reasoning.

Claude Opus 4 and Sonnet 4 are hybrid models offering two modes: near-instant responses and extended thinking for deeper reasoning.

Both models can also alternate between reasoning and tool use—like web search—to improve responses.

Both Claude 4 models are state-of-the-art on SWE-bench Verified, which measures how models solve real software issues.

As the best coding model, Claude Opus 4 can work continuously for hours on complex, long-running tasks—significantly expanding what AI agents can do.

Anthropic

@AnthropicAI

Apr 23

New report: How we detect and counter malicious uses of Claude.

For example, we found Claude was used for a sophisticated political spambot campaign, running 100+ fake social media accounts across multiple platforms.

This particular influence operation used Claude to make tactical engagement decisions: commenting, liking, or sharing based on political goals.

We've been developing new methods to identify and stop this pattern of misuse, and others like it (including fraud and malware).

In this case, we banned all accounts that were linked to the influence operation, and used the case to upgrade our detection systems.

Our goal is to rapidly counter malicious activities without getting in the way of legitimate users.

Anthropic

@AnthropicAI

Apr 15

Today we’re launching Research, alongside a new Google Workspace integration.

Claude now brings together information from your work and the web.

Research represents a new way of working with Claude.

It explores multiple angles of your question, conducting searches and delivering answers in minutes.

The right balance of depth and speed for your daily work.

Claude can also now connect with your Gmail, Google Calendar, and Docs.

It understands your context and can pull information from exactly where you need it.

Anthropic

@AnthropicAI

Apr 8

New Anthropic research: How university students use Claude.

We ran a privacy-preserving analysis of a million education-related conversations with Claude to produce our first Education Report.

Students most commonly used Claude to create and improve educational content (39.3% of conversations) and to provide technical explanations or solutions (33.5%).

Which degrees have the most disproportionate use of Claude?

Perhaps not surprisingly, Computer Science leads the field, with 38.6% of Claude conversations related to the subject, which makes up only 5.4% of US degrees.
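
One way to quantify "disproportionate" is the ratio of a subject's share of conversations to its share of degrees. The sketch below uses only the two figures quoted above; the function name `representation_ratio` is hypothetical, and the report itself may define the measure differently.

```python
def representation_ratio(share_of_conversations: float, share_of_degrees: float) -> float:
    """Return how many times larger the usage share is than the degree share."""
    if share_of_degrees <= 0:
        raise ValueError("share_of_degrees must be positive")
    return share_of_conversations / share_of_degrees


# Figures from the thread: 38.6% of conversations vs. 5.4% of US degrees.
cs_ratio = representation_ratio(38.6, 5.4)
print(f"Computer Science is over-represented by about {cs_ratio:.1f}x")
```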

Anthropic

@AnthropicAI

Apr 3

New Anthropic research: Do reasoning models accurately verbalize their reasoning?

Our new paper shows they don't.

This casts doubt on whether monitoring chains-of-thought (CoT) will be enough to reliably catch safety issues.

We slipped problem-solving hints to Claude 3.7 Sonnet and DeepSeek R1, then tested whether their Chains-of-Thought would mention using the hint (if the models actually used it).

Read the blog: anthropic.com/research/reaso…

We found Chains-of-Thought largely aren’t “faithful”: the rate of mentioning the hint (when they used it) was on average 25% for Claude 3.7 Sonnet and 39% for DeepSeek R1.

[Figure: graph comparing the four models (Claude 3.5 and 3.7 Sonnet, and DeepSeek V3 and R1) on their faithfulness, the fraction of the time they mentioned having used the clue.]

Anthropic

@AnthropicAI

Mar 27

Last month we launched our Anthropic Economic Index, to help track the effect of AI on labor markets and the economy.

Today, we’re releasing the second research report from the Index, and sharing several more datasets based on anonymized Claude usage data.

The data for this second report are from after the release of Claude 3.7 Sonnet. For this new model, we find a small rise in the share of usage for coding, as well as educational, science, and healthcare applications.

Read the blog post: anthropic.com/news/anthropic…

We saw little change in the overall balance of “augmentation” versus “automation”, but some changes in the specific interaction modes within those categories.

For instance, there was a small increase in learning interactions, where users ask Claude for explanations.
