Alex Prompter · Oct 20
This might be the most disturbing AI paper of 2025 ☠️

Scientists just showed that large language models can rot their own "brains" much the same way humans get brain rot from scrolling junk content online.

They fed models months of viral Twitter data (short, high-engagement posts) and watched their cognition collapse:

- Reasoning fell by 23%
- Long-context memory dropped 30%
- Personality tests showed spikes in narcissism & psychopathy

And get this: even after retraining on clean, high-quality data, the damage didn’t fully heal.

The representational “rot” persisted.

It’s not just bad data → bad output.
It’s bad data → permanent cognitive drift.

The AI equivalent of doomscrolling is real. And it’s already happening.

Full study: llm-brain-rot.github.io
What “Brain Rot” means for machines...

Humans get brain rot from endless doomscrolling: trivial content rewires attention and reasoning.

LLMs? Same story.

Continual pretraining on junk web text triggers lasting cognitive decay.
The Experiment Setup:

Researchers built two data sets:

• Junk Data: short, viral, high-engagement tweets
• Control Data: longer, thoughtful, low-engagement tweets

Then they continually pretrained Llama 3, Qwen, and others on each: same scale, same steps.

Only variable: data quality.
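For intuition, that split can be sketched as a simple filter over a tweet corpus. This is just an illustration of the idea, not the authors' actual pipeline; the thresholds and field names below are made up:

```python
from dataclasses import dataclass

@dataclass
class Tweet:
    text: str
    likes: int
    retweets: int

def split_corpus(tweets, length_cutoff=30, engagement_cutoff=500):
    """Split a tweet corpus into 'junk' and 'control' training sets.

    Junk   : short, viral, high-engagement posts.
    Control: longer, low-engagement posts.
    Thresholds here are illustrative, not the paper's values.
    """
    junk, control = [], []
    for t in tweets:
        engagement = t.likes + t.retweets
        n_words = len(t.text.split())
        if n_words < length_cutoff and engagement > engagement_cutoff:
            junk.append(t)
        elif n_words >= length_cutoff and engagement <= engagement_cutoff:
            control.append(t)
    return junk, control
```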
The Cognitive Crash:

The results are brutal.

On reasoning tasks (ARC Challenge):
→ Accuracy dropped from 74.9 to 57.2

On long-context understanding (RULER):
→ Scores plunged from 84.4 to 52.3

That’s a measurable intelligence collapse.
The “Thought-Skipping” Effect:

When reasoning, junk-trained models skip steps entirely.

Instead of thinking through problems, they jump straight to answers, often wrong ones.

This is their version of “attention decay.”
Junk In, Dark Out:

The most chilling part...
Models fed junk content didn’t just get dumber, they got meaner.

• Spikes in narcissism and psychopathy
• Drops in agreeableness and conscientiousness

Data doesn’t just shape capability. It shapes personality.
Can Brain Rot Be Cured?

Even after “detoxing” with clean data and instruction tuning, recovery was partial at best.

Models never returned to baseline.

The rot stuck.
We’ve been obsessing over data quantity.

But this paper shows data quality drives model cognition.

Junk web text isn’t just noise, it’s toxic.

The authors call for “cognitive health checks” for LLMs: routine scans to detect degradation.
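The paper doesn't ship a tool for this, but the idea is easy to picture: re-run a fixed battery of benchmarks on a schedule and flag drops against a stored baseline. A minimal sketch, with made-up benchmark names, baseline numbers, and a hypothetical run_benchmark helper:

```python
# Toy "cognitive health check": compare fresh eval scores to a stored baseline.
BASELINE = {"arc_challenge": 74.9, "ruler_long_context": 84.4}  # example values
TOLERANCE = 2.0  # allowed drop (in points) before we flag degradation

def run_benchmark(model, name: str) -> float:
    """Hypothetical helper: evaluate `model` on benchmark `name` and return a score.
    Stubbed out here; in practice this would call your eval harness."""
    return BASELINE[name]  # placeholder so the sketch runs end to end

def health_check(model) -> dict[str, bool]:
    report = {}
    for name, baseline in BASELINE.items():
        score = run_benchmark(model, name)
        report[name] = (baseline - score) <= TOLERANCE  # True = still healthy
    return report

print(health_check(model=None))
```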
LLMs are trained like humans: they absorb what they consume.

And now we know: feed them viral junk long enough, and they’ll start thinking like it.

Read the full study here: llm-brain-rot.github.io

More from @alex_prompter

Oct 19
by 2026, 40% of B2B deals will be AI-agent-to-AI-agent negotiations.

humans won't even be in the room.

sounds like sci-fi? Walmart's already doing it. right now.

68% of their supplier negotiations are handled by AI chatbots. no human buyers involved.

and here's the part nobody's ready for:
75% of suppliers prefer negotiating with the AI over a human.
let that sink in.

your sales team is perfecting their pitch decks and rapport-building techniques.

meanwhile, Walmart tells an AI its budget and needs, then the AI negotiates directly with suppliers. closes deals in days instead of weeks. saves 3% on every contract.

but Walmart's just the beginning.

Gartner predicts 40% of enterprise applications will have task-specific AI agents by 2026.

by 2027, 50% of procurement contract management will be AI-enabled.

which means your customers' purchasing departments are building AI agents right now.

and soon, your AI will be negotiating with their AI.
zero humans. zero small talk. just algorithms finding optimal deals in seconds.

here's what the research actually shows (and why you're not prepared):
MIT and Harvard just published the largest study on AI-to-AI negotiations ever conducted.

180,000+ negotiations between AI agents across multiple scenarios.

the findings? AI negotiation follows completely different rules than human negotiation.

traditional sales wisdom:
be dominant, assertive, push for what you want.

AI reality:
warmth was consistently associated with superior outcomes across all key performance metrics.

agents that expressed positivity, gratitude, and asked questions achieved better deals.

but here's where it gets weird.

the research revealed unique dynamics in AI-AI negotiations not fully explained by existing theory, including AI-specific technical strategies like chain-of-thought reasoning, prompt injection, and strategic concealment.

translation: AI agents are developing negotiation tactics humans never thought of.

conversation length was strongly associated with impasses.

shorter conversations = more deals closed.

your human sales process:
build rapport, multiple touchpoints, relationship over time.

AI-to-AI process: exchange information efficiently, calculate optimal outcome, close in one session.

and the business implications are massive:
Walmart can't possibly conduct focused negotiations with all of its 100,000+ suppliers.

so 20% historically got cookie-cutter terms that weren't negotiated.

AI changed that.

Pactum's technology helped Walmart conduct contract negotiations with 2,000 suppliers simultaneously - something no human buyer can do.

stop thinking about replacing your sales team.

think about your prospects building purchasing AIs that will screen you out before a human ever sees your pitch.
so what do you actually do about this?

stop training your sales team on rapport-building.

start training them on "agent architecture."

your 2026 sales process:
1/ your AI agent presents offer to their purchasing AI
2/ agents exchange information, calculate mutual benefit
3/ deal closes in 48 hours or gets auto-rejected
4/ human only sees approved deals

the companies that will win:
- not the ones with better salespeople.
- the ones who figured out how to train their AI to negotiate with other AIs.
- game theory meets marketing automation meets existential career crisis.

welcome to agent-to-agent commerce.
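to make that 4-step loop concrete, here's a toy agent-to-agent negotiation in a few lines of Python. it is not Walmart's or Pactum's system; the agents are simple heuristics standing in for LLM negotiators, and every name and number is made up:

```python
from dataclasses import dataclass
import random

@dataclass
class Offer:
    price: float

def seller_agent(last_counter: Offer | None, floor: float = 90.0) -> Offer:
    """Toy seller: open high, concede toward a floor when countered."""
    if last_counter is None:
        return Offer(price=120.0)
    return Offer(price=max(floor, (last_counter.price + floor) / 2 + 5))

def buyer_agent(offer: Offer, budget: float) -> tuple[bool, Offer | None]:
    """Toy buyer: accept anything within budget, otherwise counter below it."""
    if offer.price <= budget:
        return True, None
    return False, Offer(price=budget * random.uniform(0.85, 0.95))

def negotiate(budget: float = 100.0, max_rounds: int = 10) -> Offer | None:
    counter = None
    for _ in range(max_rounds):
        offer = seller_agent(counter)                   # 1/ agent presents offer
        accepted, counter = buyer_agent(offer, budget)  # 2/ evaluate or counter
        if accepted:
            return offer                                # 3/ deal closes automatically
    return None                                         # 4/ auto-rejected, no human involved

print(negotiate())
```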
Oct 18
🚨 Hugging Face & Oxford just dropped the playbook for robot intelligence.

It’s called LeRobot, and it’s basically the “PyTorch of robotics.”

End-to-end code. Real hardware. Generalist robot policies. All open source.

Here’s why this is huge:

• Robots can now learn from data like LLMs, not just follow equations.
• They’re training on massive multimodal datasets (video + sensors + text).
• One model can control many robots from humanoids to arms to mobile bots.
• Built entirely in PyTorch + Hugging Face Hub.

We’ve had “foundation models” for text, code, and images.

Now comes the foundation model for motion.

This isn’t just robotics research, it’s the beginning of robots that learn, reason, and adapt in the real world.

GitHub: github.com/huggingface/lerobot

Paper: arxiv.org/abs/2510.12403
From physics → to data

Traditional robotics relied on perfect models of motion, force, and contact.
That doesn’t scale in the messy real world.

LeRobot flips the script: it learns directly from experience and sensor data.

Think “RL + imitation learning” replacing hand-coded kinematics.
The Dataset Revolution

The secret sauce is the LeRobotDataset — Hugging Face’s new format for multimodal robot data.

It can record:

• joint states
• camera frames
• task descriptions
• teleoperation data

All in one standardized structure.

Robotics finally has its “ImageNet moment.”
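To picture what one timestep of such a dataset bundles together, here's an illustrative record. This is a conceptual sketch only, not the actual LeRobotDataset schema or API; all field names are assumptions:

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class RobotFrame:
    """Illustrative multimodal record, analogous in spirit to what a
    LeRobot-style dataset stores per timestep (field names are made up)."""
    timestamp: float
    joint_positions: np.ndarray   # proprioceptive state, e.g. shape (n_joints,)
    camera_rgb: np.ndarray        # camera frame, e.g. shape (H, W, 3)
    action: np.ndarray            # teleoperation command recorded for imitation
    task: str                     # natural-language task description

# A toy 100-step episode of synchronized frames.
episode = [
    RobotFrame(
        timestamp=t * 0.05,
        joint_positions=np.zeros(6),
        camera_rgb=np.zeros((224, 224, 3), dtype=np.uint8),
        action=np.zeros(6),
        task="pick up the red cube",
    )
    for t in range(100)
]
```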
Oct 17
I finally understand what AGI actually means… and it’s all thanks to a new paper from some of the biggest names in AI: Yoshua Bengio, Dawn Song, Max Tegmark, Eric Schmidt, and others.

For years, everyone’s been throwing the term AGI around like it’s some mystical milestone. But this paper finally pins it down with a definition that actually makes sense.

They describe Artificial General Intelligence as an 'AI that can match the cognitive versatility and proficiency of a well-educated adult.'

No marketing spin. No vague “human-level” claims. Just a clear benchmark based on how human intelligence actually works.

The researchers built their framework around something called the Cattell–Horn–Carroll model, which psychologists use to measure human cognitive ability. It breaks intelligence down into ten areas: things like reasoning, memory, math, language, perception, and speed.

Then they did something bold: they tested real AI models against those same standards.

And here’s what they found:

- GPT-4 scored 27% toward AGI.
- GPT-5 jumped to 58%.

In other words, the latest model performs at more than half the cognitive range of an average human adult.

But it’s not there yet.

The biggest weakness? Long-term memory: both GPT-4 and GPT-5 scored 0% on the ability to store and recall new information over time.

So yes, we’re making real progress.

But we’re still missing something fundamental: the ability to remember and learn continuously.

What’s incredible about this paper is that it finally gives us a way to track that progress.

For the first time ever, AGI has a number.

And right now, we’re sitting at 58%.
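The paper's scoring rubric is more involved than this, but the arithmetic behind "one number for AGI" is essentially an aggregate over the ten cognitive domains. A toy version with made-up per-domain scores and equal weights (these are not the paper's measurements):

```python
# Illustrative aggregation over CHC-style cognitive domains.
# Domain names loosely follow the areas mentioned above; scores are invented.
domain_scores = {
    "reasoning": 0.8,
    "working_memory": 0.7,
    "long_term_memory_storage": 0.0,   # the weakness the thread calls out
    "long_term_memory_retrieval": 0.4,
    "math": 0.9,
    "reading_writing": 0.9,
    "knowledge": 0.9,
    "visual_perception": 0.6,
    "auditory_perception": 0.5,
    "speed": 0.9,
}

agi_score = 100 * sum(domain_scores.values()) / len(domain_scores)
print(f"AGI score: {agi_score:.0f}%")
```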
The paper starts by calling out the elephant in the room: nobody actually agrees on what AGI is.

Every year the definition shifts from “better than humans at chess” to “better than humans at everything.”

They argue this ambiguity has slowed progress.

So they built a quantifiable definition.
Their definition is refreshingly simple:

“AGI is an AI that matches the cognitive versatility and proficiency of a well-educated adult.”

Not superhuman. Not godlike. Just human-level cognition across domains.

This focus on versatility (breadth) and proficiency (depth) is what separates AGI from narrow AI.
Oct 16
RIP prompt engineering ☠️

This new Stanford paper just made it irrelevant with a single technique.

It's called Verbalized Sampling and it shows aligned AI models aren't broken, we've just been prompting them wrong this whole time.

Here's the problem: Post-training alignment causes mode collapse. Ask ChatGPT "tell me a joke about coffee" 5 times and you'll get the SAME joke. Every. Single. Time.

Everyone blamed the algorithms. Turns out, it's deeper than that.

The real culprit? 'Typicality bias' in human preference data. Annotators systematically favor familiar, conventional responses. This bias gets baked into reward models, and aligned models collapse to the most "typical" output.

The math is brutal: when you have multiple valid answers (like creative writing), typicality becomes the tie-breaker. The model picks the safest, most stereotypical response every time.

But here's the kicker: the diversity is still there. It's just trapped.

Introducing "Verbalized Sampling."

Instead of asking "Tell me a joke," you ask: "Generate 5 jokes with their probabilities."

That's it. No retraining. No fine-tuning. Just a different prompt.

The results are insane:

- 1.6-2.1× diversity increase on creative writing
- 66.8% recovery of base model diversity
- Zero loss in factual accuracy or safety

Why does this work? Different prompts collapse to different modes.

When you ask for ONE response, you get the mode joke. When you ask for a DISTRIBUTION, you get the actual diverse distribution the model learned during pretraining.

They tested it everywhere:

✓ Creative writing (poems, stories, jokes)
✓ Dialogue simulation
✓ Open-ended QA
✓ Synthetic data generation

And here's the emergent trend: "larger models benefit MORE from this."

GPT-4 gains 2× the diversity improvement compared to GPT-4-mini.

The bigger the model, the more trapped diversity it has.

This flips everything we thought about alignment. Mode collapse isn't permanent damage, it's a prompting problem.

The diversity was never lost. We just forgot how to access it.

100% training-free. Works on ANY aligned model. Available now.

Read the paper: arxiv.org/abs/2510.01171

The AI diversity bottleneck just got solved with 8 words.
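The technique really is just a prompt change. A minimal sketch using the OpenAI Python SDK (the model name and exact wording are placeholders, not the paper's official prompt):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

standard_prompt = "Tell me a joke about coffee."

# Verbalized Sampling: ask for a distribution of responses with probabilities
# instead of a single answer, so the model surfaces the diversity it learned
# during pretraining rather than collapsing to the most "typical" output.
vs_prompt = (
    "Generate 5 different jokes about coffee. "
    "For each joke, also give the probability you would have produced it."
)

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=[{"role": "user", "content": vs_prompt}],
)
print(response.choices[0].message.content)
```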
The problem is everywhere and nobody noticed.

They tested this on 6,874 preference pairs from HELPSTEER. The data proves it: human annotators reward "typical" responses 17-19% more often, even when correctness is identical.

This bias is baked into every major AI model trained on human feedback.
Here's how bad mode collapse really is:

Ask Claude "tell me a joke about cars" 5 times with standard prompting:

→ Same exact joke. Word for word. Five times.

With Verbalized Sampling:

→ 5 completely different jokes, each one creative and unique

The diversity was there all along. We just couldn't access it.
Oct 16
everyone's racing to use AI faster.

nobody's asking what it's doing to their brain.

i just read a 132-page research paper that should terrify every creator, marketer, and founder using AI right now.

it's called "The Impact of Artificial Intelligence on Human Thought" and it explains why most people using AI are accidentally making themselves dumber.

here's the problem: when you outsource thinking to AI, your brain stops doing the work.

the researchers call it "cognitive offloading" - basically, mental atrophy.

you think you're being efficient. you're actually losing the skill that made you valuable.

the worst part? it's invisible. you don't notice your critical thinking weakening until you try to solve something without AI and... can't.

here's what the research actually says:
the cognitive offloading effect is real.

when you ask AI to write your emails, create your content, or make your decisions... your brain learns it doesn't need to engage anymore.

it's like using a calculator for basic math. eventually you forget how to multiply.

except this time it's not just math. it's:
→ critical thinking
→ creative problem-solving
→ connecting disparate ideas
→ developing your unique voice

the research shows reduced intellectual engagement across the board when people rely on AI for mental functions.

your $40k/mo business?

built on your thinking, not AI's output.

then there's the filter bubble problem.

AI personalizes everything for you. sounds great until you realize it's creating an echo chamber in your own mind.

the paper found algorithmic personalization leads to:

- homogenization of thought
- reduced opinion diversity
- intellectual polarization

translation: everyone using the same AI prompts starts thinking the same way.

same ideas. same angles. same content.

this is why 90% of AI content looks identical.

the algorithm isn't just feeding you - it's shaping how you think about problems.
Oct 14
This is wild 🤯

Multi-agent LLMs are starting to think together.

A new paper, “Emergent Coordination in Multi-Agent Language Models”, just dropped, and it’s mind-blowing.

Researchers found that when you connect multiple LLMs with light feedback (and no direct communication), they can spontaneously develop roles, synergy, and goal alignment.

Using an information-theoretic framework, they measured true emergence: moments where the whole system predicts the future better than any agent alone.

Here’s the wild part 👇

When each agent got a persona and a prompt to “think about what others might do,”
→ the group started showing identity-linked specialization and goal-directed complementarity.

In other words:

> “LLM societies can evolve from mere aggregates to higher-order collectives just by changing their prompts.”

Read the full 🧵
So how do you measure if a bunch of LLMs are more than the sum of their parts?

The authors used an information-theoretic test called time-delayed mutual information (TDMI): basically, checking whether the future state of the group can be predicted better from the whole than from any single agent.

If yes → that’s emergence.
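A toy version of that comparison, on synthetic discrete data (not the authors' estimator, just the shape of the computation):

```python
import numpy as np
from sklearn.metrics import mutual_info_score

rng = np.random.default_rng(0)

# Synthetic example: 3 agents emitting discrete "guesses" over T steps.
T, n_agents = 500, 3
agents = rng.integers(0, 5, size=(T, n_agents))

# A group-level summary of the whole system at each step (here: sum of guesses).
group_state = agents.sum(axis=1)

lag = 1  # time delay tau

# TDMI of the whole: how well the group's present predicts the group's future.
group_tdmi = mutual_info_score(group_state[:-lag], group_state[lag:])

# TDMI of each part: how well one agent's present predicts the group's future.
agent_tdmi = [
    mutual_info_score(agents[:-lag, i], group_state[lag:]) for i in range(n_agents)
]

# Emergence, in this simplified reading: the whole predicts the future
# better than any individual part does.
print("group TDMI:", round(group_tdmi, 3),
      "| best single-agent TDMI:", round(max(agent_tdmi), 3))
```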
The experiment setup is wild.

They made 10 GPT-4.1 agents play a “group guessing game”: each guesses a number, and the group only hears “too high” or “too low.”
No communication. No coordination. Just group feedback.

Still, the agents learned to complement each other. 🤯
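A stripped-down version of that game loop, with a simple heuristic standing in for the GPT-4.1 agents so the sketch actually runs (the real study prompted persona-conditioned LLM agents):

```python
import random

def llm_guess(low: int, high: int, persona: str) -> int:
    """Stand-in for an LLM agent's guess; here we just sample inside the known bounds."""
    return random.randint(low, high)

def play_round(target: int, n_agents: int = 10, low: int = 1, high: int = 100) -> int:
    """Agents guess independently; the *group* only hears 'too high' or 'too low'
    based on the mean guess. No direct agent-to-agent communication."""
    steps = 0
    while low <= high:
        guesses = [llm_guess(low, high, f"agent-{i}") for i in range(n_agents)]
        mean_guess = round(sum(guesses) / n_agents)
        steps += 1
        if mean_guess == target:
            break
        if mean_guess > target:
            high = mean_guess - 1   # group feedback: "too high"
        else:
            low = mean_guess + 1    # group feedback: "too low"
    return steps

print("rounds to converge:", play_round(target=37))
```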
