Latest Twitter Threads by @ai_for_success on Thread Reader App

Apr 2 • 5 tweets • 2 min read

Qwen has just launched Qwen3.6-Plus model and it’s Free on OpenRouter , it is a significant upgrade over the Qwen3.5 series
- 1M context window by default
- significantly improved agentic coding capability
- better multimodal perception and reasoning ability
- Also available via the Alibaba Cloud Model Studio API along side OpenRouter.

Some use case below from blog , Video Source and Credit : Qwen Blog
Qwen is also planning to Open-source smaller-scale variants of the Qwen3.6 series shortly.
1/5

2/5
OpenRouter : openrouter.ai/qwen/qwen3.6-p…

Dec 15, 2025 • 6 tweets • 3 min read

Most sites you actually want to build on don't have APIs.

Mino turns any website into structured data. Send a URL and a goal, get JSON back.

I've been testing this for weeks. It's genuinely powerful - handles logins, dynamic content, multi-step flows. Works on sites that'll never build developer tools.

Easy to use via API. Built-in stealth mode bypasses anti-bot protections.

Production infrastructure - processes millions of operations monthly for pricing intelligence and competitive research.

More use cases below 👇

1/6

2/6
Most browser agents make a model call for every action. Look at screenshot → reason → click → look again. Expensive and slow.

Mino learns workflows once with AI, then executes deterministically.

First run: AI figures out the site
Next runs: Milliseconds, code-level precision

85-95% accuracy, 10-30 seconds per task, pennies per run.

Same infrastructure Google , DoorDash and many more use in production.

Nov 20, 2025 • 7 tweets • 6 min read

Official : Google DeepMind has launched Nano Banana Pro aka Gemini 3 Pro Image , I was one of the early testers, Thank you Google DeepMind team.

Some major improvements :
- 1K 2K and 4K image generation
- More accurate legible text in multiple languages
- Better consistency across concepts characters and styles
- Precise localized editing for any part of an image

I asked it to explain Transformer Architecture , this is incredible stuff.

Some prompts and image examples from my early testing 👇

2/7 World knowledge and reasoning
One of the biggest upgrades is world knowledge and reasoning.
Nano Banana Pro can turn

- text into context rich infographics and diagrams
- real world data via Google Search into visual snapshots like recipes weather sports and more
Great for educational content dashboards explainers and product mockups.

I asked it to online research and create me an image with all details of Gemini 3.0 Pro launch .

Prompt : Use live online search to gather the latest accurate information about the launch of Google DeepMind’s Gemini 3.0 Pro from official Google/DeepMind sources and major tech news sites, then synthesize the confirmed facts (launch date, key features, capabilities, improvements vs previous versions, availability, and main use cases) into a single clean, modern infographic image with a clear title like “Gemini 3.0 Pro – Launch Overview,” short readable text (no long paragraphs), simple icons, and 3–6 sections or panels that visually highlight the main points; keep the design professional and balanced, make all text sharp and legible, and avoid adding any details that are not supported by your search results.

Nov 11, 2025 • 5 tweets • 2 min read

ElevenLabs just launched Scribe v2 Realtime - their next-gen Speech to Text model.

Current STT models force you to choose: fast but inaccurate, accurate but slow, or both but expensive at scale.

Scribe v2 Realtime breaks this tradeoff:

> Ultra-low latency – median latency of 150ms with partial transcriptions
> High accuracy – 93.5% across 30 EU & Asian languages with robust accent handling
> Low cost – optimized for production workloads

Built for live agents, meetings, and conversational AI that needs to work in the real world.

More details 👇

1/5

2/5

Scribe v2 Realtime delivers ultra-low latency STT with 150ms median latency and partial transcriptions in milliseconds.

- Streaming support - send audio chunks, get real-time transcripts
- Voice Activity Detection - automatic segmentation based on silence
- Manual commit control - you decide when to finalize transcript segments
- Multiple audio formats - PCM (8kHz–48kHz) and µ-law encoding
- Speaker diarization - available via manual commit
- Enterprise compliance - SOC 2, PCI, HIPAA, EU data residency ready

May 21, 2025 • 13 tweets • 4 min read

12 mind-blowing announcements from Google I/O Day 1, you don’t want to miss

Flow
Veo 3
Lyria 2
Imagen 4
Android XR
Agent Mode
Gemini Diffusion
Gemini native audio
Jules Code Assistant
Google Search Try-On
Project Astra (live demo)
Video Overview in NotbookLM

More details 👇

Flow: a new type of AI filmmaking tool that combines the best of Veo, Imagen and Gemini.

May 20, 2025 • 9 tweets • 2 min read

GOOGLE JUST WON AI VIDEO RACE.
it's so over..

Google's new video AI model Veo 3 has native audio generation.
You can now generate videos with sound effects, background noise, and even dialogue with just one prompt

9 Wild example 👇

Apr 21, 2025 • 10 tweets • 4 min read

🚨 SkyReels just launched! The world’s first open-source video generation platform supporting unlimited duration 🔥

All-in-one creator toolkit:

- Consistent high-quality video (LoRA ready)
- Fast gen, amazing output.
- Amazing facial expressions .

Plus text-to-film agent handles everything: script, character, storyboards, full AV gen, auto-edit . it's Wild!

Step by step tutorial 👇

1/7
Scripting

Just type your idea or upload a story , it converts everything into a professional screenplay. Then automatically designs characters and storyboards based on your narrative.

Apr 7, 2025 • 4 tweets • 2 min read

The Vogent AI team nailed this one 🔥🔥 They rearchitected the sesame CSM 1B model for:
- Ultra-realistic streaming voice
- Latency < 200ms
- Easy voice cloning (example Trump)

Listen to the AI Trump tariffs conversation but wait till you hear the next one.
More details 👇
1/4

2/3
How did Vogent team created this low latency and ultra-realistic streaming voice .
- Rebuilt Sesame CSM 1B model from the ground up
- Optimized for real-time, low-latency inference
- More humanlike than anything else out there
- Available now to all Vogent users, no extra charge
- Coming soon as a text-to-speech API.

Apr 2, 2025 • 6 tweets • 3 min read

Somebody stop China, they are absolutely K!lling it 🔥

Mureka AI just launched Mureka O1, the first-ever Chain-of-Thought AI music model, and it’s insane.

How will Music industry survive in AI era ??

Here’s how it works, with examples and details 👇
1/6

2/6
Here is small tutorial on how to use it .
- You can create song and instrumental both.
- You have basic and Advanced options based on your need.
- Advanced section allows you to use lyrics along with reference and description

Mar 26, 2025 • 4 tweets • 2 min read

Vibe coding is fun until production breaks or a hacker wreaks havoc .
Use CodeRabbit to review your AI generated code and Ships better quality code in half the time.
- Safe and easy to use
- Provides codebase-aware reviews.
- Catches security issues.

Note : CodeRabbit is free for OSS and projects like DiceDB, Strapi, and NuxtJS use it.

Step by Step Guide 👇
1/4

2/4
- Sign up for free trial.
- It works with popular Git platforms like GitHub, GitLab, Bitbucket, and Azure DevOps.
- You can give access to all repos or only selected repo which is really good.
- It will enable CodeRabbit to your repo and it’s ready to use.
- CodeRabbit automatically integrates with 20+ Code quality and Code Security tools, that enables non-devs or jr devs to write better code. CodeRabbit can also be assumed as Review Copilot.

Mar 17, 2025 • 11 tweets • 3 min read

China drops another banger on AI video work 🔥

ReCamMaster: Camera-Controlled Generative Rendering from a Single Video

- Creates realistic new perspectives of a scene from one video, so you can "re-film" videos with new camera movements.

12 examples and more details below 👇

Arc Trajectories

Mar 16, 2025 • 6 tweets • 3 min read

China is literally on 🔥
Baidu from China has launched ERNIE 4.5 and ERNIE X1 and it’s freaking cheap .

Here is everything you need to know.

ERNIE 4.5

- Native multimodal and Outperforms GPT 4.5 in multiple benchmarks at just 1% of GPT 4.5 price
- OpenAI GPT 4.5 – Input: $75 / 1M tokens, Output: $150 / 1M tokens;
- ERNIE 4.5 – Input: $0.55 / 1M tokens, Output: $2.20 / 1M tokens

ERNIE X1

- A deep thinking reasoning model with multimodal capabilities on par with DeepSeek R1 at only half the price

See it in action and check out the pricing details👇

📹 source : yiyan[.]baidu[.]com

1/6
ERNIE 4.5 is a multimodal which can take Audio files as well.

2/6
ERNIE 4.5 can also analyze document files.

Mar 13, 2025 • 4 tweets • 3 min read

ManusAI is powerful, but it has real limitations (like OpenAI operator) when interacting with the web:

- Gets blocked by major websites
- Can’t access logged-in sessions

That’s where web agents like rtrvr works well, running directly in your browser it can bypass blocks and access logged-in sessions.

Few examples 👇

Manus was blocked by Reddit and couldn’t scrape data. while rtrvr, extracted data from your logged-in sessions on X and Reddit, then use it to create posts on both platforms.

ManusAI is blocked by Major websites like Zillow and Reddit for bot detection and you can’t actually extract data.

Mar 2, 2025 • 6 tweets • 2 min read

Grok-3 DeepSearch + Gamma : productivity hack 🔥

- Grok-3: DeepSearch is really powerful use it to pull relevant data and people's opinions from X and the internet on any topic.
- Gamma : Turn this data into sleek, AI-generated slides in seconds.

step-by-step guide 👇

Step 1
Use Grok-3 DeepSearch and provide a detailed prompt to retrieve relevant information on your preferred topic.

Feb 25, 2025 • 11 tweets • 4 min read

Somebody please stop China they are on 🔥🔥
Wan from Alibaba Group has just open-sourced Wan 2.1 and it is better than OpenAI Sora
- Text-to-Video
- Image-to-Video
- Video Editing
- Text-to-Image
- Video-to-Audio

10 wild examples and more details below! 👇

2.
A wide cinematic shot from the audience's perspective capturing a vibrant hip-hop crew dominating the stage. Comprising five dancers with urban streetwear, confident expressions, and synchronized movements, they perform under dynamic stage lighting with side-angled beams slicing through smoke effects. The energetic crowd surrounds them, amplifying the powerful atmosphere of collective motion and rhythmic unity.

Feb 25, 2025 • 10 tweets • 3 min read

Anthropic Claude 3.7 Sonnet is the Best Coding AI model in the world right now

Less than 12 hours since its release, and people are already going crazy over its coding capabilities.

10 Wild examples to try👇

1. Landing Page : Credit : SullyOmarr

https://x.com/techikansh/status/1894126426049151386

Feb 24, 2025 • 11 tweets • 3 min read

Grok-3 voice is mode is the best AI exist right now.. It is the most powerful and the most unhinged AI out there. Voice mode is now live, and it's absolutely insane.

Listen at your own risk.

Here are 10 crazy examples you don’t want to miss 👇

https://x.com/testingcatalog/status/1893790075768504450

Feb 10, 2025 • 12 tweets • 4 min read

China is on 🔥 ByteDance drops another banger AI paper on AI video!

- Goku : Flow-based video generative foundation model.
- Goku+ : Video ads foundation model - 100x lower cost than traditional ads methods.

These are insanely good!

Here are 10 incredible examples and the research paper link 👇

2. Goku+: Turn Product Image To Video Clip

Feb 4, 2025 • 12 tweets • 3 min read

China is on 🔥 ByteDance drops another banger AI paper!
OmniHuman-1 can generate realistic human videos at any aspect ratio and body proportion using just a single image and audio. This is the best i have seen so far.

10 incredible examples and the research paper Link👇

Feb 1, 2025 • 13 tweets • 3 min read

OpenAI new o3-mini is currently the best coding model right now, with an average coding score of 82.74 (o3-mini-high) on LiveBench.

No one even comes close.

A 🧵of best coding examples of o3-mini.

https://x.com/ytiskw/status/1885409258919125354

Jan 25, 2025 • 14 tweets • 4 min read

OpenAI has released a major update for ChatGPT Canvas, Now, anyone can build web apps directly within ChatGPT

Canvas can now render both HTML and React code seamlessly within ChatGPT.

Here’s how people are using it : Top 13 Examples for you to try👇

https://x.com/omarsar0/status/1882898976602763713?t=F837piPWZtwO0dUdPi54fw&s=19

Share this page!

Enter URL or ID to Unroll