Rowan Cheung
Jan 23 · 16 tweets
I got early access to ChatGPT Operator.

It's OpenAI's new AI agent that autonomously takes action across the web on your behalf.

The 9 most impressive use cases I’ve tried (videos sped up):

1. Ordering dinner ingredients based on a picture and a recipe
2. Planning a weekend trip based on hidden gems off Reddit, my budget and interests

Notice how at 0:06, ChatGPT Operator was blocked from Reddit but then decided to just do a Bing search with "Reddit" at the end

Very impressive decision-making
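
For anyone curious, the fallback pattern seen in that clip can be sketched in a few lines. This only illustrates the idea and says nothing about how Operator is actually built: the function name `plan_lookup`, the `blocked` set, and the example query are all assumptions made up for this sketch.

```python
from __future__ import annotations

from urllib.parse import quote_plus


def plan_lookup(query: str, site_name: str, site_url: str, blocked: set[str]) -> str:
    """Return the URL to visit: the site itself, or a search fallback if it is blocked.

    Hypothetical helper for illustration only.
    """
    if site_url not in blocked:
        return site_url
    # Fallback: run a Bing search with the site name appended to the query.
    return "https://www.bing.com/search?q=" + quote_plus(f"{query} {site_name}")


# Example: Reddit is blocked, so the plan becomes a Bing search ending in "Reddit".
url = plan_lookup(
    "hidden gems weekend trip",
    "Reddit",
    "https://www.reddit.com",
    blocked={"https://www.reddit.com"},
)
```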
3. Crypto investment research based on tokens that are actually worth looking into

Notice how ChatGPT Operator got hit with an "Are you human?" CAPTCHA, then pinged me to take control and confirm

Wild workaround
4. Booking a one-way flight from Zurich to Vienna using the Booking integration

This one required a bit of back and forth, with ChatGPT Operator pinging me to ask for my flight preference and having me take control to enter payment details
5. Scheduling an appointment with my barber after looking at my Google Calendar schedule/availability

Note that in this demo, ChatGPT Operator pinged me that I needed to sign in to Google to check my calendar

I tried a second time, and my login was saved session-to-session
6. Researching a good birthday gift for my mom based on what she likes

Similar to the Reddit block, ChatGPT Operator couldn't access NYTimes, so it pivoted and found another site.

Really neat.

Also cool to see it compare and find the best price across the web for me, too
7. Booking a one-time house cleaner for my home through the Thumbtack integration based on my budget

ChatGPT Operator came back to me with four highly rated options within my price range
8. Finding the best/cheapest health insurance coverage in Switzerland

This was interesting since most prices are not publicly available and are gated behind a meeting

ChatGPT Operator did what it could and pointed me to a good blog for further reading
9. Finding a top-rated dog walker in Vancouver BC

This is no easy task, so I wanted to test how well ChatGPT Operator could handle it

To my surprise, I got 3 really solid options at the end
Overall, I was very impressed by the research preview of Operator.

I loved that it can do tasks for me as I do other work, and simply ping me when it needs me to "take over"

I also really enjoyed the saved tasks tab, and adding Custom Instructions for specific websites.
But it's important to note that Operator is still a research preview and is improving.

I found that:

-Quite a few sites were blocked after they detected the AI
-There's a limited set of partner integrations
-Its true purpose is to take actions across the web (more below)
Operator *operates* within ChatGPT, but it's a completely different tool.

Its output lengths are small, and its true purpose is to take actions across the web (typing, clicking, scrolling).

Meaning it's not like ChatGPT, which can produce essays and write long code
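
A rough mental model of "take actions, then hand off when needed" might look like the sketch below. It is a toy, not Operator's real interface: the `Action` class, the `run` function, and the action names are hypothetical and exist only to show the type/click/scroll loop with a pause for the user to take over.

```python
from dataclasses import dataclass


@dataclass
class Action:
    kind: str          # "type", "click", "scroll", or "handoff" (hypothetical labels)
    detail: str = ""


def run(actions):
    """Process actions in order; stop at the first handoff so the user can take over."""
    log = []
    for action in actions:
        if action.kind == "handoff":
            log.append(f"paused: user takes over ({action.detail})")
            break
        log.append(f"{action.kind}: {action.detail}")
    return log


steps = [
    Action("click", "search box"),
    Action("type", "flights Zurich to Vienna"),
    Action("handoff", "payment details"),
    Action("scroll", "results"),  # never reached: the loop stops at the handoff
]
log = run(steps)
```

The point of the design is that the agent's output is a short stream of small actions plus the occasional request for help, rather than a long piece of generated text.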
With every new tool comes a new way of using it optimally.

E.g. with GPT-4, CoT prompting produced the best results, but getting the best out of o1 takes a completely different approach.

The exact same thing is happening here with Operator, and I'm 100% just scratching the surface with these tests.
The future of tech work is here. And personally, I'm incredibly excited about it.

Agents can do the boring work, so I can spend more time doing what I love.

I'll be publicly sharing all the ways I automate my work with agents, so follow me @rowancheung for more.
Lastly, big thanks to @OpenAI for granting me early access. I had a ton of fun early testing Operator.

If you want to support my work, like/retweet the first tweet of this thread to share with friends:
I'll be writing more about my early experiences and how Operator works in tomorrow's newsletter.

If you want it, you can join 900,000 other readers keeping up with everything going on in AI here (it's free): therundown.ai/subscribe
