AshutoshShrivastava
Post about latest AI news, tools, tutorials and memes. Join my newsletter : https://t.co/jZDvYbKLa3
May 21 13 tweets 4 min read
12 mind-blowing announcements from Google I/O Day 1 you don’t want to miss

Flow
Veo 3
Lyria 2
Imagen 4
Android XR
Agent Mode
Gemini Diffusion
Gemini native audio
Jules Code Assistant
Google Search Try-On
Project Astra (live demo)
Video Overview in NotebookLM

More details 👇

Flow: a new type of AI filmmaking tool that combines the best of Veo, Imagen and Gemini.

May 20 9 tweets 2 min read
GOOGLE JUST WON THE AI VIDEO RACE.
it's so over..

Google's new video AI model Veo 3 has native audio generation.
You can now generate videos with sound effects, background noise, and even dialogue with just one prompt.

9 wild examples 👇
Apr 21 10 tweets 4 min read
🚨 SkyReels just launched! The world’s first open-source video generation platform supporting unlimited duration 🔥

All-in-one creator toolkit:

- Consistent high-quality video (LoRA ready)
- Fast gen, amazing output.
- Amazing facial expressions.

Plus, a text-to-film agent handles everything: script, characters, storyboards, full AV gen, auto-edit. It's wild!

Step by step tutorial 👇 1/7
Scripting

Just type your idea or upload a story, and it converts everything into a professional screenplay. It then automatically designs characters and storyboards based on your narrative.
Apr 7 4 tweets 2 min read
The Vogent AI team nailed this one 🔥🔥 They rearchitected the Sesame CSM 1B model for:
- Ultra-realistic streaming voice
- Latency < 200ms
- Easy voice cloning (example Trump)

Listen to the AI Trump tariffs conversation but wait till you hear the next one.
More details 👇
1/4 2/3
How did the Vogent team create this low-latency, ultra-realistic streaming voice?
- Rebuilt Sesame CSM 1B model from the ground up
- Optimized for real-time, low-latency inference
- More humanlike than anything else out there
- Available now to all Vogent users, no extra charge
- Coming soon as a text-to-speech API.
Apr 2 6 tweets 3 min read
Somebody stop China, they are absolutely K!lling it 🔥

Mureka AI just launched Mureka O1, the first-ever Chain-of-Thought AI music model, and it’s insane.

How will the music industry survive in the AI era?

Here’s how it works, with examples and details 👇
1/6
2/6
Here is a small tutorial on how to use it.
- You can create both songs and instrumentals.
- Basic and Advanced options are available, depending on your needs.
- The Advanced section lets you use lyrics along with a reference and a description.
Mar 26 4 tweets 2 min read
Vibe coding is fun until production breaks or a hacker wreaks havoc.
Use CodeRabbit to review your AI-generated code and ship better-quality code in half the time.
- Safe and easy to use
- Provides codebase-aware reviews.
- Catches security issues.

Note: CodeRabbit is free for OSS, and projects like DiceDB, Strapi, and NuxtJS use it.

Step by Step Guide 👇
1/4 2/4
- Sign up for a free trial.
- It works with popular Git platforms like GitHub, GitLab, Bitbucket, and Azure DevOps.
- You can give access to all repos or only to selected repos, which is really good.
- This enables CodeRabbit on your repo, and it’s ready to use.
- CodeRabbit automatically integrates with 20+ code quality and code security tools, which helps non-devs and junior devs write better code. You can think of CodeRabbit as a review copilot.
Mar 17 11 tweets 3 min read
China drops another banger on AI video work 🔥

ReCamMaster: Camera-Controlled Generative Rendering from a Single Video

- Creates realistic new perspectives of a scene from one video, so you can "re-film" videos with new camera movements.

12 examples and more details below 👇

Arc Trajectories
Mar 16 6 tweets 3 min read
China is literally on 🔥
Baidu from China has launched ERNIE 4.5 and ERNIE X1, and they’re freaking cheap.

Here is everything you need to know.

ERNIE 4.5

- Natively multimodal, and outperforms GPT 4.5 on multiple benchmarks at just 1% of GPT 4.5's price
- OpenAI GPT 4.5 – Input: $75 / 1M tokens, Output: $150 / 1M tokens
- ERNIE 4.5 – Input: $0.55 / 1M tokens, Output: $2.20 / 1M tokens
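
Quick sanity check on the "1%" claim, using only the per-1M-token prices quoted above. This is a minimal sketch: the helper names (`price_ratio_percent`, `blended_cost`) are made up for illustration and are not part of any Baidu or OpenAI tooling.

```python
# Prices in USD per 1M tokens, copied from the bullets above.
PRICES = {
    "GPT 4.5": {"input": 75.00, "output": 150.00},
    "ERNIE 4.5": {"input": 0.55, "output": 2.20},
}


def price_ratio_percent(cheaper: str, pricier: str, kind: str) -> float:
    """Return the cheaper model's price as a percentage of the pricier one."""
    return 100.0 * PRICES[cheaper][kind] / PRICES[pricier][kind]


def blended_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a workload with the given token counts."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000


if __name__ == "__main__":
    for kind in ("input", "output"):
        pct = price_ratio_percent("ERNIE 4.5", "GPT 4.5", kind)
        print(f"{kind}: ERNIE 4.5 costs {pct:.2f}% of GPT 4.5")
    # Hypothetical workload: 1M input tokens and 1M output tokens.
    for model in PRICES:
        print(f"{model}: ${blended_cost(model, 1_000_000, 1_000_000):.2f}")
```

On these numbers the input price comes to about 0.73% of GPT 4.5's and the output price to about 1.47%, so "1%" is a fair round figure rather than an exact one.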

ERNIE X1

- A deep-thinking reasoning model with multimodal capabilities, on par with DeepSeek R1 at only half the price

See it in action and check out the pricing details👇

📹 source : yiyan[.]baidu[.]com

1/6
ERNIE 4.5 is a multimodal model that can take audio files as well.
2/6
ERNIE 4.5 can also analyze document files.
Mar 13 4 tweets 3 min read
ManusAI is powerful, but it has real limitations (like OpenAI Operator) when interacting with the web:

- Gets blocked by major websites
- Can’t access logged-in sessions

That’s where web agents like rtrvr work well: running directly in your browser, rtrvr can bypass blocks and access logged-in sessions.

A few examples 👇

Manus was blocked by Reddit and couldn’t scrape data, while rtrvr extracted data from your logged-in sessions on X and Reddit, then used it to create posts on both platforms.

ManusAI is blocked by major websites like Zillow and Reddit through bot detection, so you can’t actually extract data.
Mar 2 6 tweets 2 min read
Grok-3 DeepSearch + Gamma : productivity hack 🔥

- Grok-3: DeepSearch is really powerful. Use it to pull relevant data and people's opinions from X and the internet on any topic.
- Gamma : Turn this data into sleek, AI-generated slides in seconds.

Step-by-step guide 👇

Step 1
Use Grok-3 DeepSearch and provide a detailed prompt to retrieve relevant information on your preferred topic.
Feb 25 11 tweets 4 min read
Somebody please stop China, they are on 🔥🔥
Wan from Alibaba Group has just open-sourced Wan 2.1, and it is better than OpenAI Sora
- Text-to-Video
- Image-to-Video
- Video Editing
- Text-to-Image
- Video-to-Audio

10 wild examples and more details below! 👇

2.
A wide cinematic shot from the audience's perspective capturing a vibrant hip-hop crew dominating the stage. Comprising five dancers with urban streetwear, confident expressions, and synchronized movements, they perform under dynamic stage lighting with side-angled beams slicing through smoke effects. The energetic crowd surrounds them, amplifying the powerful atmosphere of collective motion and rhythmic unity.
Feb 25 10 tweets 3 min read
Anthropic Claude 3.7 Sonnet is the Best Coding AI model in the world right now

Less than 12 hours since its release, and people are already going crazy over its coding capabilities.

10 Wild examples to try👇

1. Landing Page (credit: SullyOmarr)
Feb 24 11 tweets 3 min read
Grok-3 voice mode is the best AI that exists right now. It is the most powerful and the most unhinged AI out there. Voice mode is now live, and it's absolutely insane.

Listen at your own risk.

Here are 10 crazy examples you don’t want to miss 👇
Feb 10 12 tweets 4 min read
China is on 🔥 ByteDance drops another banger AI paper on AI video!

- Goku: Flow-based video generative foundation model.
- Goku+: Video ads foundation model, 100x lower cost than traditional ad methods.

These are insanely good!

Here are 10 incredible examples and the research paper link 👇

2. Goku+: Turn Product Image To Video Clip
Feb 4 12 tweets 3 min read
China is on 🔥 ByteDance drops another banger AI paper!
OmniHuman-1 can generate realistic human videos at any aspect ratio and body proportion using just a single image and audio. This is the best I have seen so far.

10 incredible examples and the research paper link 👇
Feb 1 13 tweets 3 min read
OpenAI's new o3-mini is currently the best coding model, with an average coding score of 82.74 (o3-mini-high) on LiveBench.

No one even comes close.

A 🧵 of the best coding examples of o3-mini.
Jan 25 14 tweets 4 min read
OpenAI has released a major update for ChatGPT Canvas. Now anyone can build web apps directly within ChatGPT.

Canvas can now render both HTML and React code seamlessly within ChatGPT.

Here’s how people are using it: top 13 examples for you to try 👇
Dec 7, 2024 9 tweets 2 min read
How are people using OpenAI's most advanced o1/o1-pro model?
Here are a few examples 👇 and for sure o1-pro is insane.
If you’ve tried o1/o1-pro, share your use case below!
Nov 15, 2024 5 tweets 2 min read
OpenAI updated the ChatGPT macOS app and literally finished Cursor.

ChatGPT for macOS can now work with your coding apps and read content from them. Here’s everything you need to know and how to set it up 👇
How to get started:
Oct 29, 2024 5 tweets 2 min read
Microsoft just killed Cursor today with the launch of GitHub Spark and introduced major updates to GitHub Copilot.
More details in the thread 🧵
📹 Credit: Microsoft.

You can now use Anthropic’s Claude 3.5 Sonnet
Oct 9, 2024 15 tweets 4 min read
The best OpenAI o1-preview and o1-mini coding examples that I’ve come across, and they’re really impressive.
🧵