I think people are just not reading the blog post, so I'll help OpenAI out a bit and just post the coolest demos from it here.
TLDR: GPT-4o is fully multimodal, as in input *and* output.
One of these outputs is audio (not voice, *audio*, which is why it can sing).
The API only exposes audio/video to "select partners" for now, but these are some of the demos they show on the blog post:
Consistent image generation for a narrative.
This is *not* the model calling DALL-E like in ChatGPT today; these images come directly from the model.
Apr 23, 2023 • 7 tweets • 3 min read
What would happen if GPT-4 took control of the NPCs in one of the most popular online multiplayer games in the world? 🤷‍♂️
Let's find out.
Introducing Whispering Fable: a Rust (the game) server, but with GPT-4
Watch or read 👇
Rust (the *game*, not the language) is one of the most popular, large-scale multiplayer games in the world.
Hundreds of players can play together in multi-week server maps, and do pretty much anything they want.
It's an ideal sandbox, almost like Minecraft, but for grown-ups.