I think people just aren't reading the blog post, so I'll help OpenAI out a bit and post the coolest demos from it here.
TLDR: GPT-4o is fully multimodal, as in input *and* output
One of these outputs is audio (not voice, *audio*, which is why it can sing)
The API only exposes audio/video to "select partners" for now; what's public today is text and image *input* with text output.
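A minimal sketch of that public surface, using the official openai Python SDK (the image URL is just a placeholder):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Text + image in, text out: the part of GPT-4o exposed publicly today.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's happening in this image?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/photo.jpg"},
                },
            ],
        }
    ],
)
# Output is text only; native audio/image output isn't exposed here yet.
print(response.choices[0].message.content)
```

With that caveat, these are some of the demos they show in the blog post: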
Consistent image generation for a narrative.
This is *not* the model calling DALL-E like ChatGPT does today; these images come directly from the model
Which is why it can do things like this, where it manipulates an existing image with ease
No IPAdapters, ControlNets etc. needed!
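For contrast, here's the two-model hop ChatGPT does today: the chat model writes a text prompt, and a separate DALL-E model renders it. A minimal sketch with the public images API (the prompt is a placeholder):

```python
from openai import OpenAI

client = OpenAI()

# The chat model can only pass a *text* prompt across this boundary, which is
# why today's pipeline struggles with consistency and precise edits.
# GPT-4o's native image output (the blog post demos) skips this hop entirely.
image = client.images.generate(
    model="dall-e-3",
    prompt="A robot writing in a leather journal, same robot as earlier scenes",
    n=1,
    size="1024x1024",
)
print(image.data[0].url)  # URL of the rendered image
```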
It can pick up styles from images and do things like mix them into a new font
Synthesize an image with text that looks like it was written on paper
It can transcribe meeting notes (nothing new)
But it can also do speaker diarization, infer the speaker identities from the context, and register emotional voice cues
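For comparison, the route available in the public API today is a separate transcription model, with GPT-4o guessing speaker turns from the text alone. A minimal sketch, assuming a local audio file ("meeting.mp3" is a placeholder):

```python
from openai import OpenAI

client = OpenAI()

# Step 1: plain transcription with a dedicated speech model.
with open("meeting.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
    )

# Step 2: ask GPT-4o to infer speaker turns from the text alone.
# GPT-4o's native audio path does this (plus emotional cues) in one shot,
# but that isn't exposed in the public API yet.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": "Split this transcript into speaker turns and label each "
                   "speaker by name where the context makes it clear:\n\n"
                   + transcript.text,
    }],
)
print(response.choices[0].message.content)
```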
Seriously, check out the actual blog post. This is a huge deal; they severely undersold it in the presentation (and even the presentation was impressive!) openai.com/index/hello-gp…