Connor Davis
Founder of @getoutbox_ai. Learn how to build AI Agents for FREE 👉 https://t.co/q9zPwlldZ4
Dec 27 10 tweets 4 min read
This shocked me.

Google’s Gemini team barely uses “normal prompts.”

Their internal structures look nothing like what Twitter teaches.

I reverse-engineered them from DeepMind examples.

Here are 5 that change everything 👇

1/ The Context Anchor

Most people: "Write a blog post about AI"

Google engineers: "You are a technical writer at Google DeepMind. Using the context from [document], write a blog post that explains [concept] to developers who understand ML basics but haven't worked with transformers."

They anchor EVERY prompt with role + context + audience.
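
A quick way to reuse that pattern (the helper name and wording below are mine, not Google's — just a sketch of the role + context + audience anchor):

```python
# Hypothetical helper: anchor every prompt with role + context + audience.
def context_anchor(role: str, context: str, task: str, audience: str) -> str:
    return (
        f"You are {role}.\n"
        f"Using the context below, {task} for {audience}.\n\n"
        f"Context:\n{context}"
    )

prompt = context_anchor(
    role="a technical writer at a research lab",
    context="(paste the source document here)",
    task="write a blog post that explains attention mechanisms",
    audience="developers who know ML basics but haven't worked with transformers",
)
print(prompt)
```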
Dec 20 16 tweets 4 min read
Anthropic's internal prompting style is completely different from what most people teach.

I spent 3 weeks analyzing their official documentation, prompt library, and API examples.

Only 2% of users know about XML-structured prompting.

Here's every secret I extracted 👇

Anthropic's engineers built Claude to understand XML tags.

Not as code.

As cognitive containers.

Each tag tells Claude: "This is a separate thinking space."

It's like giving the model a filing system.
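
Here's roughly what that looks like in a prompt (the tag names are illustrative, not an official schema):

```python
# XML tags act as "cognitive containers": each one marks a separate thinking space.
document = "(paste the source document here)"
question = "What are the key risks mentioned?"

prompt = f"""
<instructions>
Answer the question using only the document below.
If the answer is not in the document, say so.
</instructions>

<document>
{document}
</document>

<question>
{question}
</question>

Put your reasoning in <thinking> tags and your final answer in <answer> tags.
"""
print(prompt)
```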
Dec 19 8 tweets 4 min read
This Harvard & MIT paper quietly punctures one of the biggest myths in AI.

People keep saying LLMs are becoming scientists. This paper actually tests that claim instead of assuming it’s true. The question isn’t whether models can talk about science. It’s whether they can do science.

The researchers didn’t use trivia or clean benchmarks. They forced models to run through the real discovery loop: forming hypotheses, designing experiments, interpreting messy results, and revising beliefs when the evidence pushes back.

That’s where things get uncomfortable.

LLMs can propose plausible hypotheses, but once experiments enter the picture, performance drops fast. Models latch onto surface patterns, struggle to walk away from bad ideas, and try to explain failures instead of learning from them.

Confidence becomes a liability.

One of the most important findings is that benchmark dominance means very little here. Models that crush reasoning leaderboards often fail when asked to iterate, deal with noise, and update theories over time.

Science doesn’t reward clever first answers. It rewards correction.

What this paper makes clear is that scientific intelligence isn’t the same thing as language intelligence. Discovery depends on memory, causal reasoning, restraint, and the ability to say “this was wrong” without spinning a story around it.

LLMs today can write like scientists and sound like experts. They just don’t behave like scientists yet.

That gap is the real takeaway, and it’s why this paper matters.

Most AI benchmarks test answers.

This paper tests the process of discovery.

Models must:

• Form hypotheses
• Design experiments
• Observe outcomes
• Update beliefs
• Repeat under uncertainty

That’s real science, not Q&A.
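
A toy version of that loop (my own sketch, not the paper's benchmark harness):

```python
import random

# Toy discovery loop: guess a hidden rule by running experiments and updating beliefs.
hidden_rule = lambda x: x % 3 == 0          # the ground truth to be discovered
hypotheses = {
    "divisible by 2": lambda x: x % 2 == 0,
    "divisible by 3": lambda x: x % 3 == 0,
    "greater than 10": lambda x: x > 10,
}

candidates = dict(hypotheses)
for step in range(20):
    x = random.randint(1, 30)                # design an experiment
    outcome = hidden_rule(x)                 # observe the result
    # update beliefs: drop every hypothesis the evidence contradicts
    candidates = {name: h for name, h in candidates.items() if h(x) == outcome}
    if len(candidates) == 1:
        break

print("Surviving hypotheses:", list(candidates))
```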
Dec 10 12 tweets 6 min read
I didn’t truly understand how to build strong AI agents… until one paper snapped everything into place.

Not a tutorial.
Not a YouTube demo.

A single arXiv paper: “Fundamentals of Building Autonomous LLM Agents.”

It finally made sense why most “agents” feel like chatbots with extra steps… and why real autonomous systems need an actual architecture.

Here’s the backbone the pros use, the part nobody explains clearly 👇

1. Perception: what the agent actually sees

It isn’t just text.

Real agents mix:

- screenshots
- DOM trees
- accessibility APIs
- Set-of-Mark style visual encodings

That’s how an agent stops guessing at a UI and starts understanding it.

2. Reasoning: the engine behind autonomy

The paper breaks down why “single-pass reasoning” collapses almost immediately.

Real agents rely on:

- decomposition (CoT, ToT, ReAct)
- parallel planning (DPPM)
- reflection loops that critique + revise plans

This is the part that turns a model from reactive to intentional.

3. Memory: the part everyone misbuilds

Short-term memory lives in the context window.

Long-term memory lives in RAG, SQL, trajectory logs, and past failures.

Yes, failures are stored intentionally, because they teach the agent what not to try again.

Without structured memory, the agent resets every step and looks “dumb.”

4. Action System: where the work actually happens

This is the hardest part and the most ignored:

- Tool calls
- API execution
- Python environments
- GUI control at coordinate level

Most demos cut right before this stage because execution is where agents usually break.
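
Stitched together, the four subsystems form one loop. A stripped-down sketch (class and method names are mine, not the paper's):

```python
# Minimal agent loop: perceive -> reason -> act -> remember, until done.
class DummyEnv:
    def observe(self):
        return "pricing page not found yet"
    def done(self, result):
        return result is not None

def search_tool(query):
    return f"results for: {query}"

class Agent:
    def __init__(self, tools, memory):
        self.tools = tools        # action system: callable tools / APIs
        self.memory = memory      # long-term store (RAG, SQL, trajectory logs...)

    def perceive(self, env):
        # real agents: screenshots, DOM trees, accessibility APIs, SoM encodings
        return env.observe()

    def reason(self, observation, goal):
        # real agents: an LLM call doing decomposition, planning, reflection
        return {"tool": "search", "args": {"query": goal}}

    def act(self, plan):
        return self.tools[plan["tool"]](**plan["args"])

    def run(self, env, goal, max_steps=10):
        for _ in range(max_steps):
            observation = self.perceive(env)
            plan = self.reason(observation, goal)
            result = self.act(plan)
            self.memory.append((plan, result))   # store successes AND failures
            if env.done(result):
                return result

agent = Agent(tools={"search": search_tool}, memory=[])
print(agent.run(DummyEnv(), goal="find the pricing page"))
```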

Where agents collapse (and why):

The paper maps out the real failure modes:

- grounding errors on GUIs
- infinite loops
- hallucinated tool actions
- bad memory retrieval
- fragile long-horizon planning

And then it gives the fixes:

reflection, anticipatory reflection, guardrails, SoM grounding, specialized sub-agents, and tighter subsystem integration.

If you’ve ever wondered why your agent falls apart by step 3…
or why it “forgets” what it just decided…
or why it panics the moment UI changes…

This paper is the missing manual.

It turns agent-building into engineering, not trial and error.

[Image: title page of the paper “Fundamentals of Building Autonomous LLM Agents”]

The paper makes one thing painfully clear:

Workflows ≠ Agents.

A workflow follows a pre-written script.

An agent writes the script as it goes, adapting to feedback and changing plans when the world shifts.

This single distinction is why 90% of “AI agent demos” online fall apart in real interfaces.
Nov 10 8 tweets 3 min read
If you’re building AI agents right now, you’re probably doing it wrong.

Most “agents” break after one task because nobody’s teaching the real framework. Here’s how to build one that actually works ↓

First: most "AI agents" are just glorified chatbots.

You don't need 20 papers or a PhD.

You need 4 things:

→ Memory
→ Tools
→ Autonomy
→ A reason to exist

Let’s break it down like you're building a startup MVP:
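
As a mental model, those four ingredients map to something like this toy skeleton (entirely illustrative, not a framework):

```python
# Toy skeleton: memory, tools, autonomy, and a reason to exist.
class MinimalAgent:
    def __init__(self, purpose):
        self.purpose = purpose    # a reason to exist: one clearly scoped job
        self.memory = []          # memory: what it has seen and done so far
        self.tools = {            # tools: things it can actually do
            "shout": lambda text: text.upper(),
        }

    def step(self, observation):
        # autonomy: the agent picks its own next move instead of following a script
        self.memory.append(observation)
        return self.tools["shout"](observation)

agent = MinimalAgent(purpose="shout back whatever it hears")
print(agent.step("hello world"))
```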
Oct 2 16 tweets 4 min read
This Stanford study just ended the prompt engineering gold rush. Turns out most viral techniques are placebo.

I verified every claim myself.

Here's the real playbook:

The biggest lie: "Be specific and detailed"

Stanford researchers tested 100,000 prompts across 12 different tasks.

Longer prompts performed WORSE 73% of the time.

The sweet spot? 15-25 tokens for simple tasks, 40-60 for complex reasoning.
Sep 27 7 tweets 3 min read
🚨 Meta just exposed a massive inefficiency in AI reasoning

Current models burn through tokens re-deriving the same basic procedures over and over. Every geometric series problem triggers a full derivation of the formula. Every probability question reconstructs inclusion-exclusion from scratch. It's like having a mathematician with amnesia.

Their solution: "behaviors" - compressed reasoning patterns extracted from the model's own traces. Instead of storing facts like RAG systems, they store procedural knowledge. "behavior_inclusion_exclusion" becomes a reusable cognitive tool rather than something to rediscover each time.

The results crush current approaches. 46% fewer tokens with maintained accuracy on MATH problems. 10% better accuracy on AIME with behavior-guided self-improvement versus standard critique-and-revise.

But here's the kicker: when they fine-tuned models on behavior-conditioned reasoning, smaller models didn't just get faster - they became fundamentally better reasoners. The behaviors act as scaffolding for building sophisticated reasoning capabilities.

This flips everything. Instead of "think longer = think better," we get "remember how to think = think better." No architectural changes needed. Just better utilization of patterns the models already discover.

The current paradigm - scale context length for redundant reasoning - looks wasteful now. We're paying enormous computational costs for models to repeatedly rediscover their own knowledge.

This suggests reasoning breakthroughs won't come from bigger models or longer chains of thought, but from systems that accumulate procedural memory. Models that learn not just what to conclude, but how to think efficiently.

The efficiency gains alone make this commercially critical. But the deeper insight challenges our entire approach to reasoning model development.

The pipeline is surprisingly simple. Model solves problem → reflects on its own solution → extracts reusable behaviors. No architectural changes needed.

Just metacognitive analysis of reasoning traces.
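
A rough sketch of that pipeline (the `llm` argument is a placeholder callable and the prompts are mine, not Meta's):

```python
# Sketch: solve -> reflect -> extract reusable "behaviors" -> reuse them next time.
behavior_handbook = {}   # name -> one-line procedural instruction

def solve_and_extract(llm, problem):
    solution = llm(f"Solve step by step:\n{problem}")
    reflection = llm(
        "List each reusable reasoning procedure this solution used, "
        "one per line, as 'name: description'.\n\n" + solution
    )
    for line in reflection.splitlines():
        if ":" in line:
            name, description = line.split(":", 1)
            behavior_handbook[name.strip()] = description.strip()
    return solution

def solve_with_behaviors(llm, problem):
    hints = "\n".join(f"- {n}: {d}" for n, d in behavior_handbook.items())
    # condition on known behaviors instead of re-deriving them from scratch
    return llm(f"Useful behaviors:\n{hints}\n\nSolve step by step:\n{problem}")
```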
Sep 13 8 tweets 3 min read
Google just solved the language barrier problem that's plagued video calls forever.

Their new Meet translation tech went from "maybe in 5 years" to shipping in 24 months.

Here's how they cracked it and why it changes everything.

The old translation process was a joke. Your voice → transcribed to text → translated → converted back to robotic speech.

10-20 seconds of dead air while everyone stared at their screens. By the time the translation played, the conversation had moved on. Natural flow? Dead.
Sep 12 9 tweets 3 min read
Forget Google Scholar.

Grok 4 just became a research assistant on steroids.

It scans long PDFs, extracts insights, and formats your bibliography in seconds.

Here’s the prompt to copy:

The traditional research process is painfully slow:

• Searching Google Scholar
• Reading 50+ papers
• Extracting key findings manually
• Synthesizing ideas into clear insights

Most of this can now be delegated to AI.

Let me show you how AI can help you:
Sep 8 10 tweets 3 min read
🚨 BREAKING: OpenAI just killed the “hallucinations are a glitch” myth.

New paper shows hallucinations are inevitable with today’s training + eval setups.

Here’s everything you need to know:

Most people think hallucinations are random quirks.

but generation is really just repeated classification:
at every step the model asks “is this token valid?”

if your classifier isn’t perfect → errors accumulate → hallucinations.
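
The arithmetic behind that is brutal. With toy numbers: if each token is valid with probability p, an n-token answer is valid with probability roughly p^n:

```python
# Per-token accuracy compounds over sequence length.
p = 0.99            # probability a single generated token is "valid"
for n in (10, 100, 500):
    print(f"{n} tokens -> {p ** n:.3f}")
# 10 tokens -> 0.904
# 100 tokens -> 0.366
# 500 tokens -> 0.007
```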
Sep 7 8 tweets 3 min read
If you want to build AI agents using n8n, do this:

Copy/paste this prompt into ChatGPT and watch it build your agent from scratch.

Here’s the exact prompt I use:

The system:

1. I open ChatGPT
2. Paste in 1 mega prompt
3. Describe what I want the agent to do
4. GPT returns:

• Architecture
• n8n nodes
• Triggers
• LLM integration
• Error handling
• Code snippets

5. I follow the steps in n8n.

Done.
Sep 5 16 tweets 4 min read
The most important AI paper of 2025 might have just dropped.

NVIDIA lays out a framework for Small Language Model agents that could outcompete LLMs.

Here’s the full breakdown (and why it matters):

Today, most AI agents run every task, no matter how simple, through massive LLMs like GPT-4 or Claude.

NVIDIA’s researchers say: that’s wasteful, unnecessary, and about to change.

Small Language Models (SLMs) are models that fit on consumer hardware and run with low latency.

They’re fast, cheap, and for most agentic tasks just as effective as their larger counterparts.
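
One way to act on that argument, as a hedged sketch (the models and the "is it hard?" heuristic below are placeholders, not NVIDIA's implementation):

```python
# Default to a small model; escalate to the big one only when a task looks hard.
def route(task, small_model, large_model, is_hard):
    if is_hard(task):
        return large_model(task)      # pay for the large model only when needed
    return small_model(task)          # fast, cheap, runs on consumer hardware

# Placeholder heuristic and models, purely for illustration.
is_hard = lambda t: len(t.split()) > 50 or "multi-step" in t
small_model = lambda t: f"[SLM] handled: {t}"
large_model = lambda t: f"[LLM] handled: {t}"

print(route("extract the invoice date from this email", small_model, large_model, is_hard))
```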
Sep 1 10 tweets 3 min read
You don’t need a PhD to understand Retrieval-Augmented Generation (RAG).

It’s how AI stops hallucinating and starts thinking with real data.

And if you’ve ever asked ChatGPT to “use context” you’ve wished for RAG.

Let me break it down in plain English (2 min read):

1. what is RAG?

RAG = Retrieval-Augmented Generation.

it connects a language model (like gpt-4) to your external knowledge.

instead of guessing, it retrieves relevant info before generating answers.

think: search engine + smart response = fewer hallucinations.

it’s how ai stops making stuff up and starts knowing real things.
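
here's the whole idea in a few lines (a toy retriever over hard-coded notes, nothing production-grade):

```python
# toy RAG: retrieve the most relevant note, then generate with it as context
notes = {
    "refunds": "Refunds are issued within 14 days of purchase.",
    "shipping": "Orders ship within 2 business days.",
}

def retrieve(question):
    # real systems use embeddings + a vector store; word overlap is enough for a toy
    overlap = lambda text: len(set(question.lower().split()) & set(text.lower().split()))
    return max(notes.values(), key=overlap)

def answer(question, llm):
    context = retrieve(question)
    return llm(f"Answer using only this context:\n{context}\n\nQuestion: {question}")

# stand-in "llm" so the sketch runs end to end: it just echoes the retrieved context
print(answer("how fast do orders ship?", llm=lambda prompt: prompt.splitlines()[1]))
```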
Aug 24 8 tweets 3 min read
Building AI agents in n8n doesn’t require endless trial & error.

I use 1 mega prompt with ChatGPT/Claude to extract everything I need:

• Architecture
• APIs & triggers
• Logic
• Outputs

Here’s the exact prompt:

The system:

1. I open ChatGPT
2. Paste in 1 mega prompt
3. Describe what I want the agent to do
4. GPT returns:

• Architecture
• n8n nodes
• Triggers
• LLM integration
• Error handling
• Code snippets

5. I follow the steps in n8n.

Done.
Aug 23 15 tweets 4 min read
If you’re building AI systems in 2025, there are only two tools worth learning: LangGraph and n8n.

The choice you make here will define how far you can actually scale.

Here’s everything you need to know (and what nobody is telling you):

Let’s get one thing clear:

LangGraph and n8n are not competitors in the usual sense.

They solve different problems.

But if you misunderstand their roles, you’ll cripple your AI stack before it even gets going.
Aug 17 13 tweets 4 min read
You don’t need GPT-5 or Claude 5...

You need better prompts.

MIT just confirmed what AI experts already knew:

Prompting drives 50% of performance.

Here’s how to level up without touching the model:

When people upgrade to more powerful AI, they expect better results.

And yes, newer models do perform better.

But this study found a twist:

Only half the quality jump came from the model.

The rest came from how users adapted their prompts.