ElevenLabs

@elevenlabsio

Jan 9

Today we’re introducing Scribe v2: the most accurate transcription model ever released.

While Scribe v2 Realtime is optimized for ultra-low latency and agent use cases, Scribe v2 is built for batch transcription, subtitling, and captioning at scale.

Scribe v2 achieves the lowest word error rate based on industry-standard benchmarks.

Scribe v2 improves on the state-of-the-art stability of Scribe v1. It handles pauses, long silences, and changes in tone and delivery without issue, delivering unmatched accuracy across more than 90 languages.

Scribe v2 is now used in ElevenLabs Studio for more accurate subtitles, captions and transcriptions, supporting teams that manage large libraries of audio and video across marketing, media, research, training, and compliance use cases.

Aug 19, 2025

Introducing Chat Mode

You can now build text-only conversational agents.

Ideal for:
- Customers who prefer typing to speaking.
- Precise inputs like order IDs or email addresses.
- Solving simple issues, then handing off to our voice agents for complex tasks.

Chat Mode is an extension of our Conversational Agent platform, designed to help you reach users in the modality that best fits their context.

ElevenLabs conversational agents are intelligent, real-time AI agents that talk, type, and take action.

Resolve customer issues, automate tasks, and deliver accurate answers - all grounded in your data, tailored to your workflows, and ready to deploy at enterprise scale.

Aug 18, 2025

Introducing the Eleven Music API.

This is the first Music API for developers trained on licensed data and cleared for broad commercial use.

You can now integrate the highest-quality AI music into your products and workflows.

Since launch, creators have generated over 750k songs with Eleven Music.

The Eleven Music API allows you to:

- Generate high-quality tracks from text prompts
- Create vocal or instrumental versions in any genre
- Customize length, structure, and language

Created in collaboration with labels, publishers, and artists, songs created with the Eleven Music API are available for broad commercial use.

It’s designed for building apps across media and entertainment, whether you’re delivering personalized meditations, producing music for video games, or creating AI-generated ads.

Jun 5, 2025

Introducing Eleven v3 (alpha) - the most expressive Text to Speech model ever.

It supports 70+ languages, multi-speaker dialogue, and audio tags such as [excited], [sighs], [laughing], and [whispers].

Now in public alpha and 80% off in June.

This is a research preview. It requires more prompt engineering than previous models - but the generations are breathtaking.

We’ll continue fine-tuning to improve reliability and control.

The new architecture of Eleven v3 deeply understands text - delivering much greater expressiveness.

And now you can guide generations more directly using audio tags:
- Emotions [sad] [angry] [happily]
- Delivery direction [whispers] [shouts]
- Non-verbal reactions [laughs] [clears throat] [sighs]
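
As a rough illustration of the audio tags listed above, here is a minimal Python sketch that builds a tagged script as plain text. The helper name `with_tags` is hypothetical, and placing each tag directly before the line it should affect is an assumption for illustration, not documented behavior from this post.

```python
def with_tags(text: str, *tags: str) -> str:
    """Prefix text with bracketed audio tags such as [whispers].

    Hypothetical helper: placing tags before the text is an assumption.
    """
    prefix = " ".join(f"[{tag}]" for tag in tags)
    return f"{prefix} {text}" if prefix else text


# Build a short script using tags named in the post.
script = "\n".join(
    [
        with_tags("I can't believe we actually shipped it.", "excited"),
        with_tags("Keep it down, the demo is still running.", "whispers"),
        with_tags("Fine, fine.", "sighs"),
    ]
)

print(script)
```

The resulting string would then be sent as the text input of a Text to Speech request; since v3 is described as a research preview, expect to iterate on tag choice and placement.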

Apr 1, 2025

We pioneered the first ultra-realistic Text to Speech model, and recently launched the world's most accurate Speech to Text model, Scribe.

But we're not stopping there.

Today, we're taking one small step for man, and one giant leap for man's best friend...

with Text to Bark.

Introducing Text to Bark, the world's first AI-powered TTS model for dogs.

Simply type a message, choose your breed, and our models will convert it into fluent barking.

Try it with your own dog at elevenlabs.io/text-to-bark

Independent benchmarking shows that 95% of dogs couldn't distinguish between ElevenLabs AI-generated barks and real ones, a result that got tails wagging among the international AI community.

Dec 18, 2024

Meet Flash, our newest model, which generates speech in 75ms plus application and network latency.

You’ve never experienced human-like TTS this fast.

Flash is our recommended model for low-latency, conversational voice agents.

You can use Flash today in our Conversational AI platform, or build directly via the API using model ID “eleven_flash_v2” or “eleven_flash_v2_5”: elevenlabs.io/docs/api-refer…

Flash v2 is English-only, and Flash v2.5 supports 32 languages.

Both cost 1 credit for every 2 characters.
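
A minimal Python sketch of selecting a Flash model by ID and estimating its credit cost. Only the two model IDs and the 1-credit-per-2-characters rate come from the post. The base URL, the `/text-to-speech/{voice_id}` path, the `xi-api-key` header, and the JSON field names are assumptions based on the public API as commonly documented, and rounding odd character counts up is also an assumption. The block only builds the request and does not send it.

```python
import json
import math
import urllib.request

API_BASE = "https://api.elevenlabs.io/v1"  # assumed base URL, not stated in the post


def flash_credits(text: str) -> int:
    """Estimate cost at 1 credit per 2 characters (rounding up is an assumption)."""
    return math.ceil(len(text) / 2)


def build_tts_request(
    text: str,
    voice_id: str,
    api_key: str,
    model_id: str = "eleven_flash_v2_5",  # or "eleven_flash_v2" for English only
) -> urllib.request.Request:
    """Build, but do not send, a Text to Speech request for a Flash model."""
    url = f"{API_BASE}/text-to-speech/{voice_id}"  # assumed endpoint path
    body = json.dumps({"text": text, "model_id": model_id}).encode("utf-8")
    headers = {
        "xi-api-key": api_key,  # assumed auth header name
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Placeholder credentials: replace with real values before sending.
req = build_tts_request("Hello from Flash.", "YOUR_VOICE_ID", "YOUR_API_KEY")
print(req.get_method(), req.full_url)
print("estimated credits:", flash_credits("Hello from Flash."))
```

Passing `req` to `urllib.request.urlopen` would send the request; check the linked API reference for the exact endpoint and response format before relying on this.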
