Rowan Cheung
Dec 6, 2023 · 10 tweets · 4 min read
Google just revealed Gemini and will directly integrate the AI into Google apps.

The GPT-4 competitor comes in 3 models — Ultra, Pro, and Nano.

Here's a thread of EVERYTHING you need to know:
Gemini is multimodal and can recognize images and speak in real time.

With a score of 90%, Gemini Ultra is the FIRST AI model to outperform human experts on the MMLU benchmark.

This demo is incredible.
Gemini has next-generation capabilities such as sophisticated reasoning, multimodality, and advanced coding.

Google also reports that it is stronger at math and coding than GPT-4, the model behind ChatGPT.

Check out this demo of it solving physics problems.
Gemini has an incredible understanding of science.

It can find and extract information from thousands of research papers.

Because Gemini is multimodal, it can not only understand text but also graphs through images!
Gemini comes in three sizes — Ultra for complex tasks, Pro for scaling across a range of tasks, and Nano for efficient on-device tasks.

-Pro will be in Google products through Bard starting today.
-Ultra will be rolling out early next year.
-Nano will be available on Pixel.
Gemini Ultra’s performance beats current state-of-the-art results in 30 of 32 benchmarks used in LLM research & development.
Gemini Pro will be available for free in Bard and across Google apps today.

In six out of eight benchmarks, Gemini Pro outperformed GPT-3.5, making it 'the most powerful free chatbot on the market today'.
Gemini Nano now powers on-device generative AI features for Pixel 8 Pro.

New features include:
-Summarize in Recorder
-Smart Reply in Gboard
-Cutting-edge video
-Enhanced photography and image editing
I shared all the info on Gemini in my newsletter this morning.

Click here to join 400k+ readers, and you'll never miss a thing in AI ever again: therundown.ai/subscribe
Thanks to @GoogleDeepMind for the invitation to the early press conference, which allowed me to share the news live.

I do these rundowns daily. Follow me @rowancheung for more.

If you found this helpful, spare me a like/retweet to support my content 👇
