co-founder of @theprohumanai, running a marketing agency + AI businesses, sharing what i learn here so you can be more productive and successful.
Sep 15 • 8 tweets • 4 min read
This paper just exposed RAG's biggest lie 😳
99% of people think RAG is just "search some docs, stuff them into a prompt." That's Naive RAG. It worked for demos. It doesn't work for production.
The real evolution happened when researchers realized LLMs don't just need more information. They need the right information, at the right time, in the right format.
This led to Advanced RAG with query rewriting and context compression. Better, but still linear.
Now we're in the Modular RAG era. Instead of retrieve-then-generate, we have systems that decide when to retrieve, what to retrieve, and how many times. Self-RAG lets models critique their own outputs and retrieve more context when confidence drops.
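To make "retrieve when needed" concrete, here's a toy sketch of that loop in Python. The corpus, the keyword scoring, and the confidence rule are all stand-ins I made up; a real system would use an LLM for generation and self-critique and a vector store for retrieval.

```python
# Toy sketch of adaptive ("retrieve when needed") RAG.
# Everything here is a simplified stand-in: keyword overlap instead of a
# vector search, a heuristic instead of a model's self-critique.

CORPUS = {
    "doc1": "RAG retrieves documents and feeds them to the model as context.",
    "doc2": "Self-RAG lets the model critique its own output and retrieve again.",
}

def retrieve(query: str, k: int = 1) -> list[str]:
    # naive keyword overlap in place of a real vector search
    scored = sorted(
        CORPUS.values(),
        key=lambda d: len(set(query.lower().split()) & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]

def generate(query: str, context: list[str]) -> str:
    # stand-in for the LLM call
    return f"Answer to {query!r} using {len(context)} document(s)."

def confident(query: str, context: list[str]) -> bool:
    # stand-in for the self-critique step: "confident" once any context overlaps the query
    return any(set(query.lower().split()) & set(d.lower().split()) for d in context)

def answer(query: str, max_rounds: int = 3) -> str:
    context: list[str] = []            # start with no retrieved documents
    for _ in range(max_rounds):
        if confident(query, context):  # confident enough: stop retrieving
            break
        context += retrieve(query)     # otherwise, pull more context
    return generate(query, context)

print(answer("how does self-rag critique its own output?"))
```

The real version swaps those stubs for model calls, but the control flow is the whole idea: retrieval becomes a decision, not a fixed step.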
But here's what nobody talks about: RAG and fine-tuning aren't competitors. They're complementary. Fine-tuning gives you style. RAG gives you fresh facts.
Most interesting finding: noise sometimes helps. One study found that including irrelevant documents can increase accuracy by 30%. The model learns to filter signal from noise.
The evaluation problem is real though. We're measuring RAG systems with metrics designed for traditional QA. Context relevance and answer faithfulness barely scratch the surface.
Production RAG faces different challenges. Data security, retrieval efficiency, preventing models from leaking document metadata. The engineering problems matter more than research papers.
Multi-modal RAG is coming fast. Text plus images plus code plus audio. The principles transfer, but complexity explodes.
My take: we're still early. Current RAG feels like early search engines. The next breakthrough comes from better integration with long-context models, not replacing them.
One prediction: the distinction between retrieval and generation blurs completely. Future models won't retrieve documents, they'll retrieve and synthesize information in a single forward pass.
1. The three paradigms of RAG evolution: Naive (basic retrieve-read), Advanced (pre/post processing), and Modular (adaptive retrieval).
We're moving from "always retrieve" to "retrieve when needed."
Sep 14 • 8 tweets • 4 min read
I just read this Google research paper that completely broke my brain 😳
So these researchers took regular language models - the same ones everyone says "can't really think" - and tried something dead simple. Instead of asking for quick answers, they just said "hey, show me how you work through this step by step."
That's it. No fancy training. No special algorithms. Just better prompts.
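Here's roughly what the difference looks like in practice. The wording and the example question are mine, not quoted from the paper; the only thing that changes between the two prompts is the ask to reason step by step.

```python
# Same question, two prompts. Only the framing changes.
question = (
    "A juggler has 16 balls. Half of them are golf balls, "
    "and half of the golf balls are blue. How many blue golf balls are there?"
)

direct_prompt = f"{question}\nAnswer:"

step_by_step_prompt = (
    f"{question}\n"
    "Walk me through this step by step, then give the final answer."
)

# With the direct prompt, models often just blurt out a number.
# With the step-by-step prompt, large models tend to write out the chain:
# 16 balls -> 8 golf balls -> 4 blue golf balls -> answer: 4.
print(direct_prompt)
print()
print(step_by_step_prompt)
```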
The results? Absolutely insane.
Math problems that stumped these models? Suddenly they're solving them left and right. We're talking 18% accuracy shooting up to 57% on the same exact model. Same brain, different conversation.
But here's where it gets weird. This only worked on the really big models. The smaller ones? They actually got worse. Started rambling nonsense that sounded smart but made zero sense.
Something magical happens around 100 billion parameters though. The model just... starts thinking. Like, actual logical reasoning chains that you can follow. Nobody taught it this. It just emerged.
I've been using ChatGPT and Claude completely wrong this whole time. Instead of wanting instant answers, I should've been asking "walk me through this."
They tested this on everything. Math, common sense questions, logic puzzles. Same pattern everywhere. The models were always capable of this stuff - we just never knew how to ask.
Makes me wonder what else these systems can do that we haven't figured out yet. Like, if reasoning just pops up when you scale things up and ask differently, what happens when someone figures out the right way to prompt for creativity? Or planning? Or solving actually hard problems?
The craziest part is that the models don't even need to be retrained. They already have this ability sitting there, waiting for someone to unlock it with the right conversation.
We've been having the wrong conversations with AI this whole time.
1/ The bigger the model, the better it thinks (small models actually get worse)
Sep 11 • 6 tweets • 3 min read
What the fuck just happened 🤯
UAE just dropped K2-Think, the world's fastest open-source AI reasoning model, and it's obliterating everything we thought we knew about AI scaling.
32 billion parameters. That's it. And this thing is matching GPT-4 level reasoning while being 20x smaller.
The paper is absolutely wild. They combined six technical tricks that nobody else bothered to put together. Long chain-of-thought training, reinforcement learning with verifiable rewards, and this "Plan-Before-You-Think" approach that actually reduces token usage by 12% while making the model smarter.
The benchmarks are insane. 90.83% on AIME 2024. Most frontier models can't crack 85%. On complex math competitions, it scored 67.99% - beating models with 200B+ parameters.
And the speed. Holy shit, the speed. 2,000 tokens per second on Cerebras hardware. Most reasoning models crawl at 200 tokens/second. That's the difference between waiting 3 minutes or 16 seconds for a complex proof.
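(Quick sanity check on that claim, assuming a proof of roughly 32,000 tokens: 32,000 ÷ 200 ≈ 160 seconds, call it 3 minutes, versus 32,000 ÷ 2,000 = 16 seconds.)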
Here's the kicker: they used only open-source datasets. No proprietary training data. No closed APIs. They proved you can build frontier reasoning with public resources and actual engineering skill.
This just nuked the "you need massive scale" narrative. Small labs can now deploy reasoning that was OpenAI-exclusive six months ago.
Everyone's talking about the speed records. The real story is they cracked parameter efficiency at the reasoning level.
1/ The benchmark
Sep 10 • 13 tweets • 4 min read
you can now use any llm like chatgpt, claude, or grok to:
→ write your resume
→ personalize cover letters
→ find hidden jobs
→ prep you for interviews
→ optimize your linkedin
here are 10 prompts to automate your entire job search (bookmark this):
prompt 1: build your custom resume
"you are a resume strategist. based on my experience and the job below, write a resume that matches keywords, highlights results, and passes ats filters."
→ [paste job description]
→ [paste work history]
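if you'd rather run this from a script than a chat window, here's a rough sketch using the openai python client. the model name, the file names, and the way the prompt is stitched together are my own assumptions, swap in whatever you actually use:

```python
# Rough sketch: sending prompt 1 to an LLM via the OpenAI Python client.
# The model name ("gpt-4o") and the two input files are assumptions;
# any chat-capable model and any way of pasting your text works.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

job_description = open("job_description.txt").read()
work_history = open("work_history.txt").read()

prompt = (
    "you are a resume strategist. based on my experience and the job below, "
    "write a resume that matches keywords, highlights results, and passes ats filters.\n\n"
    f"job description:\n{job_description}\n\n"
    f"work history:\n{work_history}"
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```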
Aug 27 • 7 tweets • 3 min read
R.I.P Canva.
This new AI tool makes presentations, docs, landing pages & charts in under 60 seconds. No templates, no design stress.
Here’s why 50M+ people already switched:
Meet Gamma - Your all-in-one AI platform for creating:
• Presentations
• Landing pages
• Social media posts
• Documents
All in under 1 minute.
No more manual design. No wasted time. Just type, and it builds.
🚨 Google just launched Flow, an AI-powered filmmaking tool built for the next generation of storytellers.
It's cinematic. It's collaborative. And it runs on Google’s most advanced models: Veo 3, Imagen, and Gemini.
Here’s how it works:
1. Meet Flow, built with filmmakers, for filmmakers.
Flow lets you:
→ Create cinematic clips from natural prompts
→ Use consistent characters & assets across scenes
→ Control shots with advanced camera tools
→ Seamlessly edit with continuous motion
→ Learn from real examples via Flow TV
May 18 • 10 tweets • 4 min read
I don’t say this lightly:
These 8 TED talks genuinely changed my life.
Not inspired. Not motivated. Changed!
Bookmark this thread:
1. Sleep is your superpower by Matt Walker
Learn about sleep's effects on learning, memory, immunity, and genetics, plus tips for better rest.
May 11 • 8 tweets • 2 min read
🚨 BREAKING: Microsoft just opened up global access to free AI courses.
Learn real-world AI skills and get certified, all for free.
Here's what's inside ↓
The Microsoft AI Skills Fest runs through May 28, 2025.
It offers self-paced training for all levels, from curious beginners to professionals.
Courses cover everyday AI use, advanced tools like Microsoft Fabric, and GitHub Copilot.