Hatice Ozen
Apr 17 · 5 tweets · 2 min read
read through @vercel's state of ai survey to see that 22% of builders are now using @groqinc and we're one of the top providers developers switched to in the last 6 months.

we do it all for you. 🫡
1/4 and just why are developers switching? the survey shows that 23% of you cite latency/performance as a top technical challenge, which is exactly what we're solving with our lpu inference engine. you get access to that powerful hardware via groq api.
2/4 another interesting, but not surprising data point: 86% of teams don't train their models, preferring to focus on implementation and optimization.

smart strategy - tell us the models you'd like to see and let us handle the inference speed while you spend your time building.
3/4 cost management remains a top challenge for 23% of developers. i hear you. (side-eyeing openai/anthropic rn... with love). 🤨

if this is also a challenge for you, check out our pricing page to see how you can scale with us without breaking the bank:
groq.com/pricing
4/4 if you want to be part of the teams that switched (you should, although i may be biased), check out our docs - built by devs, for devs.

our api is openai-compatible & we have features ranging from crazy fast reasoning to TTS.

what should we add next?
console.groq.com/docs/overview
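
since the api is openai-compatible, switching often comes down to changing a base url. here's a minimal sketch (the model id is just an example; check the docs for the current list):

```python
# minimal sketch: point the standard openai client at groq's
# openai-compatible endpoint. model id is illustrative.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.groq.com/openai/v1",
    api_key=os.environ["GROQ_API_KEY"],
)

resp = client.chat.completions.create(
    model="llama-3.3-70b-versatile",  # example model id
    messages=[{"role": "user", "content": "why does inference latency matter?"}],
)
print(resp.choices[0].message.content)
```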

More from @ozenhati

Apr 5
HUGE PSA: @Meta's Llama 4 Scout (17Bx16E MoE) is now live on @GroqInc for all users via console playground and Groq API.

This conversational beast with native multimodality just dropped today and we're excited to offer Day 0 support so you can build fast.
1/7 What makes Llama 4 Scout special? It's a chonky multimodal model with a 10M context window (yes, TEN MILLION tokens).

Built on a mixture-of-experts architecture (17B active params, 109B total), it brings incredible image understanding and conversational abilities.
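
For the curious, here's a rough sketch of a multi-image call via the Groq Python SDK. The model id is my assumption for Scout on Groq and the image URLs are placeholders; verify the exact string in the console:

```python
# hedged sketch: multi-image prompt to Llama 4 Scout on Groq.
# model id and image urls are assumptions/placeholders.
from groq import Groq

client = Groq()  # reads GROQ_API_KEY from the environment

resp = client.chat.completions.create(
    model="meta-llama/llama-4-scout-17b-16e-instruct",  # assumed id
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What's happening across these two images?"},
            {"type": "image_url", "image_url": {"url": "https://example.com/a.jpg"}},
            {"type": "image_url", "image_url": {"url": "https://example.com/b.jpg"}},
        ],
    }],
)
print(resp.choices[0].message.content)
```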
2/7 My vibe check is off the charts - Scout feels remarkably natural and chill. Think almost Claude-level chat quality, with multi-image input that's insanely accurate and support for 12 languages.

And, of course, it's fast on Groq. Tokens go BRRR. 🏁
Feb 21
Just wrapped up Day 1 of @aiDotEngineer and the talks went from "agents don't work (yet)" to enterprise deployment success stories.

2024 was for experimenting with AI, but 2025 is clearly the year of putting AI agents into production.

🧵 Here are some of my key takeaways:
1/9 @graceisford shared how agents = complex systems with compounding errors, but there's hope if we focus on:
- Data being our best asset/differentiator
- Personal LLM evals
- Tools to mitigate errors
- Intuitive AI UX (the moat that matters)
- Reimagining DevEx (go multimodal)
2/9 @HamelHusain & @gregce10 shared how to build an AI strategy that fails. Amongst all the S+ tier memes, what stood out to me: drop the AI jargon.

It's important to step out of our tech bubble and see that AI adoption goes beyond our domain. Keep it simple to drive adoption.
Feb 14
PSA: @Alibaba_Qwen's Qwen-2.5-Coder-32B-Instruct is now live on @GroqInc for insanely fast (and smart) code generation.

See below for instructions to add to @cursor_ai.
1/4 Qwen2.5 Coder is state-of-the-art among open-source models when it comes to coding, with impressive performance across several popular code generation benchmarks - even beating GPT-4o and Claude 3.5 Sonnet.
2/4 Beyond code generation, Qwen2.5 Coder with Groq speed is a game-changer for debugging workflows. Imagine Jon Skeet (famous for being a top contributor on @StackOverflow) reviewing your code in real-time and helping you build, fix bugs, and ship fast. This is the dream (but real).
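
As a rough sketch of what calling it looks like via the API (the model id is my assumption; check the console for the exact string):

```python
# hedged sketch: code generation with Qwen 2.5 Coder on Groq.
# model id is an assumption; verify in the console model list.
from groq import Groq

client = Groq()

resp = client.chat.completions.create(
    model="qwen-2.5-coder-32b",  # assumed id
    messages=[{
        "role": "user",
        "content": "Write a Python function that merges two sorted lists.",
    }],
    temperature=0.2,  # lower temperature tends to help for code tasks
)
print(resp.choices[0].message.content)
```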
Feb 6
PSA: DeepSeek R1 Distill Llama 70B speculative decoding version is now live on @GroqInc for Dev Tier.

We just made fast even faster for instant reasoning. 🏁
1/5 What is speculative decoding? It's a technique that uses a smaller, faster model to predict a sequence of tokens, which are then verified by the main, more powerful model in parallel. The main model evaluates these predictions and determines which tokens to keep or reject.
2/5 Speculative decoding achieves faster inference because the main model can verify multiple tokens in parallel rather than generating them one-by-one. This parallel verification is significantly faster than traditional sequential token generation.
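
A toy sketch of that accept/reject loop, with draft() and target() as hypothetical stand-ins for the small and large models (real implementations compare token probabilities, not exact matches):

```python
# toy illustration of one speculative decoding step.
# draft and target are hypothetical stand-in callables.
def speculative_step(prompt, draft, target, k=4):
    # 1) the small draft model cheaply proposes k tokens, one at a time
    ctx, proposed = list(prompt), []
    for _ in range(k):
        tok = draft(ctx)
        proposed.append(tok)
        ctx.append(tok)

    # 2) the big target model scores all k positions in ONE parallel pass
    verified = target(list(prompt), proposed)  # target's pick per position

    # 3) keep the longest agreeing prefix; at the first disagreement,
    #    take the target's own token and stop
    accepted = []
    for drafted, wanted in zip(proposed, verified):
        if drafted == wanted:
            accepted.append(drafted)
        else:
            accepted.append(wanted)
            break
    return accepted  # 1 to k tokens from a single pass of the big model
```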
Feb 6
Let's couple our vibe coding with some vibe learning via this incredible deep dive into LLMs that @karpathy just dropped. 🧠

This is what democratizing AI education looks like, with knowledge for both beginners and builders. And if you're new to AI development, this thread is for you.
2/7 Karpathy explains how parallelization is possible during LLM training, but output token generation is sequential during LLM inference. Specialized hardware (like Groq's LPU) is designed to optimize exactly these computational requirements, particularly sequential token generation, for fast LLM outputs.
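
A toy contrast under obvious simplifications (model is a hypothetical callable returning per-position logits):

```python
# toy contrast: training scores every position in one parallel pass
# (teacher forcing), while inference generates tokens one-by-one.
import numpy as np

def train_loss(model, tokens):
    logits = model(np.array(tokens))        # ONE pass, all positions at once
    probs = np.exp(logits) / np.exp(logits).sum(-1, keepdims=True)
    # average negative log-likelihood of each next token
    return -np.mean(np.log(probs[np.arange(len(tokens) - 1), tokens[1:]]))

def generate(model, prompt, n_new):
    ctx = list(prompt)
    for _ in range(n_new):                  # inherently sequential loop
        logits = model(np.array(ctx))       # one full pass PER new token
        ctx.append(int(logits[-1].argmax()))  # greedy next-token pick
    return ctx
```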
3/7 And while training LLMs requires massive GPU clusters ($$$), using LLMs for inference doesn't. 🤝

You can get access to insanely-fast inference for top models via Groq API and start building right now. Seriously. Here are some apps others have built: console.groq.com/docs/showcase-…
Jan 21
Huge ship recap from @GroqInc this past week:

- Flex Tier Beta is live for Llama 3.3 70b/8b with 10x higher rate limits
- Whisper Large v3 is now 67% faster (tokens go BRRR)
- Whisper Large v3 audio file limit is now 100MB (up from 40MB)
- The DevRel team is growing 📈📈📈
2/4 Flex Tier gives on-demand processing with rapid timeout when resources are constrained - perfect for workloads that need fast inference and can handle occasional request failures.

Available with Llama 3.3 70b/8b for paid tier at the same price.

See: console.groq.com/docs/service-t…
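
A hedged sketch of opting a request into Flex Tier; I'm assuming the service_tier parameter name from the service tiers docs, so verify before relying on it:

```python
# hedged sketch: flex tier request with a fallback, since flex
# trades guaranteed capacity for higher rate limits and can
# fail fast when resources are constrained.
from groq import Groq

client = Groq()

try:
    resp = client.chat.completions.create(
        model="llama-3.3-70b-versatile",
        messages=[{"role": "user", "content": "Summarize this ticket..."}],
        service_tier="flex",  # assumed parameter name; see the docs
    )
    print(resp.choices[0].message.content)
except Exception:
    # handle the fast-timeout case: retry, queue, or fall back
    # to the default on-demand tier here
    ...
```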
3/4 We've also significantly improved Whisper Large v3 performance to make it 67% faster and increased file size limit to 100MB from 40MB.

You asked, we listened (srsly, keep asking me for more features)! TTS up next.

To leverage the 100MB limit, provide your file via URL:
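
Something like this, as a hedged sketch (the url parameter is my read of the thread; double-check the audio docs):

```python
# hedged sketch: transcribe a remote audio file by url instead of
# uploading bytes, to take advantage of the 100MB limit.
# the url kwarg is an assumption; verify against the audio docs.
from groq import Groq

client = Groq()

transcription = client.audio.transcriptions.create(
    model="whisper-large-v3",
    url="https://example.com/podcast-episode.mp3",  # placeholder url
)
print(transcription.text)
```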