co-founder of @theprohumanai, running a marketing agency + AI businesses, sharing what i learn here so you can be more productive and successful.
Sep 15 • 8 tweets • 4 min read
This paper just exposed RAG's biggest lie 😳
99% of people think RAG is just "search some docs, stuff them into a prompt." That's Naive RAG. It worked for demos. It doesn't work for production.
The real evolution happened when researchers realized LLMs don't just need more information. They need the right information, at the right time, in the right format.
This led to Advanced RAG with query rewriting and context compression. Better, but still linear.
Now we're in the Modular RAG era. Instead of retrieve-then-generate, we have systems that decide when to retrieve, what to retrieve, and how many times. Self-RAG lets models critique their own outputs and retrieve more context when confidence drops.
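To make "retrieve when needed" concrete, here's a toy sketch of that loop in Python. The corpus, the keyword scoring, and the confidence rule are all stand-ins I made up; a real system would use an LLM for generation and self-critique and a vector store for retrieval.

```python
# Toy sketch of adaptive ("retrieve when needed") RAG.
# Everything here is a simplified stand-in: keyword overlap instead of a
# vector search, a heuristic instead of a model's self-critique.

CORPUS = {
    "doc1": "RAG retrieves documents and feeds them to the model as context.",
    "doc2": "Self-RAG lets the model critique its own output and retrieve again.",
}

def retrieve(query: str, k: int = 1) -> list[str]:
    # naive keyword overlap in place of a real vector search
    scored = sorted(
        CORPUS.values(),
        key=lambda d: len(set(query.lower().split()) & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]

def generate(query: str, context: list[str]) -> str:
    # stand-in for the LLM call
    return f"Answer to {query!r} using {len(context)} document(s)."

def confident(query: str, context: list[str]) -> bool:
    # stand-in for the self-critique step: "confident" once any context overlaps the query
    return any(set(query.lower().split()) & set(d.lower().split()) for d in context)

def answer(query: str, max_rounds: int = 3) -> str:
    context: list[str] = []            # start with no retrieved documents
    for _ in range(max_rounds):
        if confident(query, context):  # confident enough: stop retrieving
            break
        context += retrieve(query)     # otherwise, pull more context
    return generate(query, context)

print(answer("how does self-rag critique its own output?"))
```

The real version swaps those stubs for model calls, but the control flow is the whole idea: retrieval becomes a decision, not a fixed step.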
But here's what nobody talks about: RAG and fine-tuning aren't competitors. They're complementary. Fine-tuning gives you style. RAG gives you fresh facts.
Most interesting finding: noise sometimes helps. One study found that including irrelevant documents can increase accuracy by 30%. The model learns to filter signal from noise.
The evaluation problem is real though. We're measuring RAG systems with metrics designed for traditional QA. Context relevance and answer faithfulness barely scratch the surface.
Production RAG faces different challenges. Data security, retrieval efficiency, preventing models from leaking document metadata. The engineering problems matter more than research papers.
Multi-modal RAG is coming fast. Text plus images plus code plus audio. The principles transfer, but complexity explodes.
My take: we're still early. Current RAG feels like early search engines. The next breakthrough comes from better integration with long-context models, not replacing them.
One prediction: the distinction between retrieval and generation blurs completely. Future models won't retrieve documents, they'll retrieve and synthesize information in a single forward pass.
1. The three paradigms of RAG evolution: Naive (basic retrieve-read), Advanced (pre/post processing), and Modular (adaptive retrieval).
We're moving from "always retrieve" to "retrieve when needed."
Sep 14 • 8 tweets • 4 min read
I just read this Google research paper that completely broke my brain 😳
So these researchers took regular language models - the same ones everyone says "can't really think" - and tried something dead simple. Instead of asking for quick answers, they just said "hey, show me how you work through this step by step."
That's it. No fancy training. No special algorithms. Just better prompts.
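Here's roughly what the difference looks like in practice. The wording and the example question are mine, not quoted from the paper; the only thing that changes between the two prompts is the ask to reason step by step.

```python
# Same question, two prompts. Only the framing changes.
question = (
    "A juggler has 16 balls. Half of them are golf balls, "
    "and half of the golf balls are blue. How many blue golf balls are there?"
)

direct_prompt = f"{question}\nAnswer:"

step_by_step_prompt = (
    f"{question}\n"
    "Walk me through this step by step, then give the final answer."
)

# With the direct prompt, models often just blurt out a number.
# With the step-by-step prompt, large models tend to write out the chain:
# 16 balls -> 8 golf balls -> 4 blue golf balls -> answer: 4.
print(direct_prompt)
print()
print(step_by_step_prompt)
```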
The results? Absolutely insane.
Math problems that stumped these models? Suddenly they're solving them left and right. We're talking 18% accuracy shooting up to 57% on the same exact model. Same brain, different conversation.
But here's where it gets weird. This only worked on the really big models. The smaller ones? They actually got worse. Started rambling nonsense that sounded smart but made zero sense.
Something magical happens around 100 billion parameters though. The model just... starts thinking. Like, actual logical reasoning chains that you can follow. Nobody taught it this. It just emerged.
I've been using ChatGPT and Claude completely wrong this whole time. Instead of wanting instant answers, I should've been asking "walk me through this."
They tested this on everything. Math, common sense questions, logic puzzles. Same pattern everywhere. The models were always capable of this stuff - we just never knew how to ask.
Makes me wonder what else these systems can do that we haven't figured out yet. Like, if reasoning just pops up when you scale things up and ask differently, what happens when someone figures out the right way to prompt for creativity? Or planning? Or solving actually hard problems?
The craziest part is that the models don't even need to be retrained. They already have this ability sitting there, waiting for someone to unlock it with the right conversation.
We've been having the wrong conversations with AI this whole time.
1/ The bigger the model, the better it thinks (small models actually get worse)
Sep 11 • 6 tweets • 3 min read
What the fuck just happened 🤯
UAE just dropped K2-Think, the world's fastest open-source AI reasoning model, and it's obliterating everything we thought we knew about AI scaling.
32 billion parameters. That's it. And this thing is matching GPT-4 level reasoning while being 20x smaller.
The paper is absolutely wild. They combined six technical tricks that nobody else bothered to put together. Long chain-of-thought training, reinforcement learning with verifiable rewards, and this "Plan-Before-You-Think" approach that actually reduces token usage by 12% while making the model smarter.
The benchmarks are insane. 90.83% on AIME 2024. Most frontier models can't crack 85%. On complex math competitions, it scored 67.99% - beating models with 200B+ parameters.
And the speed. Holy shit, the speed. 2,000 tokens per second on Cerebras hardware. Most reasoning models crawl at 200 tokens/second. That's the difference between waiting 3 minutes or 16 seconds for a complex proof.
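(Quick sanity check on that claim, assuming a proof of roughly 32,000 tokens: 32,000 ÷ 200 ≈ 160 seconds, call it 3 minutes, versus 32,000 ÷ 2,000 = 16 seconds.)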
Here's the kicker: they used only open-source datasets. No proprietary training data. No closed APIs. They proved you can build frontier reasoning with public resources and actual engineering skill.
This just nuked the "you need massive scale" narrative. Small labs can now deploy reasoning that was OpenAI-exclusive six months ago.
Everyone's talking about the speed records. The real story is they cracked parameter efficiency at the reasoning level.
1/ The benchmark
Sep 10 • 13 tweets • 4 min read
you can now use any llm like chatgpt, claude, or grok to:
→ write your resume
→ personalize cover letters
→ find hidden jobs
→ prep you for interviews
→ optimize your linkedin
here are 10 prompts to automate your entire job search (bookmark this):
prompt 1: build your custom resume
"you are a resume strategist. based on my experience and the job below, write a resume that matches keywords, highlights results, and passes ats filters."
→ [paste job description]
→ [paste work history]
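if you'd rather run this from a script than a chat window, here's a rough sketch using the openai python client. the model name, the file names, and the way the prompt is stitched together are my own assumptions, swap in whatever you actually use:

```python
# Rough sketch: sending prompt 1 to an LLM via the OpenAI Python client.
# The model name ("gpt-4o") and the two input files are assumptions;
# any chat-capable model and any way of pasting your text works.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

job_description = open("job_description.txt").read()
work_history = open("work_history.txt").read()

prompt = (
    "you are a resume strategist. based on my experience and the job below, "
    "write a resume that matches keywords, highlights results, and passes ats filters.\n\n"
    f"job description:\n{job_description}\n\n"
    f"work history:\n{work_history}"
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```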
Aug 27 • 7 tweets • 3 min read
R.I.P Canva.
This new AI tool makes presentations, docs, landing pages & charts in under 60 seconds. No templates, no design stress.
Here’s why 50M+ people already switched:
Meet Gamma - Your all-in-one AI platform for creating:
• Presentations
• Landing pages
• Social media posts
• Documents
All in under 1 minute.
No more manual design. No wasted time. Just type, and it builds.
🚨 Google just launched Flow, an AI-powered filmmaking tool built for the next generation of storytellers.
It's cinematic. It's collaborative. And it runs on Google’s most advanced models: Veo 3, Imagen, and Gemini.
Here’s how it works:
1. Meet Flow, built with filmmakers, for filmmakers.
Flow lets you:
→ Create cinematic clips from natural prompts
→ Use consistent characters & assets across scenes
→ Control shots with advanced camera tools
→ Seamlessly edit with continuous motion
→ Learn from real examples via Flow TV
May 18 • 10 tweets • 4 min read
I don’t say this lightly:
These 8 TED talks genuinely changed my life.
Not inspired. Not motivated. Changed!
Bookmark this thread:
1. Sleep is your superpower by Matt Walker
Learn about sleep's effects on learning, memory, immunity, and genetics, plus tips for better rest.
May 11 • 8 tweets • 2 min read
🚨 BREAKING: Microsoft just opened up global access to free AI courses.
Learn real-world AI skills and get certified, all for free.
Here's what's inside ↓
The Microsoft AI Skills Fest runs through May 28, 2025.
It offers self-paced training for all levels, from curious beginners to professionals.
Courses cover everyday AI use, advanced tools like Microsoft Fabric, and GitHub Copilot.