ℏεsam

@Hesamation

Apr 5, 2025 • 11 tweets • 4 min read

the best researchers from Meta, Yale, Stanford, Google DeepMind, and Microsoft laid out all we know about Agents in a 264-page paper [book],

here are some of their key findings:

they build a mapping of different agent components, such as perception, memory, and world modelling, to different regions of the human brain and compare them:

- brain is much more energy-efficient
- no genuine experience in agents
- brain learns continuously, agent is static

an agent is broken down to:
- Perception: the agent's input mechanism. can be improved with multi-modality, feedback mechanisms (e.g., human corrections), etc.
- Cognition: learning, reasoning, planning, memory. LLMs are key in this part.
- Action: agent's output and tool use.
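
a minimal sketch of that perception → cognition → action loop. this is not code from the paper: the `Agent` class, the keyword rule standing in for the LLM, and the `add` / `echo` tools are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Agent:
    # hypothetical tool registry: tool name -> function taking one string argument
    tools: Dict[str, Callable[[str], str]]
    history: List[str] = field(default_factory=list)

    def perceive(self, raw_input: str) -> str:
        # perception: turn raw input into a normalized observation
        return raw_input.strip().lower()

    def think(self, observation: str) -> Tuple[str, str]:
        # cognition: a keyword rule stands in for the LLM here (assumption)
        self.history.append(observation)
        if observation.startswith("add "):
            return ("add", observation[len("add "):])
        return ("echo", observation)

    def act(self, decision: Tuple[str, str]) -> str:
        # action: run the chosen tool and return its output
        name, arg = decision
        return self.tools[name](arg)

    def step(self, raw_input: str) -> str:
        return self.act(self.think(self.perceive(raw_input)))


def add_tool(arg: str) -> str:
    # toy tool: sums whitespace-separated integers
    return str(sum(int(x) for x in arg.split()))


agent = Agent(tools={"add": add_tool, "echo": lambda s: s})
print(agent.step("  ADD 2 3 "))  # 5
print(agent.step("hello"))       # hello
```

in a real agent the `think` step would be an LLM call and `perceive` could accept images or human corrections; the three-stage shape is the only part taken from the thread.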

agentic memory is represented as:
- Sensory memory: the short-term holding of inputs, which is not emphasized much in agents.
- Short-term memory: the LLM context window.
- Long-term memory: external storage such as RAG or knowledge graphs.
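
the three tiers can be sketched roughly like this. everything here is an assumption for illustration: the `AgentMemory` class, counting words instead of tokens, and a plain dict standing in for RAG or a knowledge graph.

```python
from collections import deque


class AgentMemory:
    def __init__(self, sensory_size=2, context_budget=8):
        # sensory memory: last few raw inputs, overwritten quickly
        self.sensory = deque(maxlen=sensory_size)
        # short-term memory: stand-in for the LLM context window
        self.short_term = []
        # long-term memory: stand-in for external storage
        self.long_term = {}
        # max number of words allowed in short_term (words approximate tokens)
        self.context_budget = context_budget

    def _context_size(self):
        return sum(len(text.split()) for text in self.short_term)

    def observe(self, text):
        self.sensory.append(text)
        self.short_term.append(text)
        # when the context is over budget, move the oldest entries to long-term
        while self._context_size() > self.context_budget:
            evicted = self.short_term.pop(0)
            self.long_term[len(self.long_term)] = evicted


mem = AgentMemory(sensory_size=2, context_budget=5)
for chunk in ["a b c", "d e", "f g"]:
    mem.observe(chunk)

print(list(mem.sensory))  # ['d e', 'f g']
print(mem.short_term)     # ['d e', 'f g']
print(mem.long_term)      # {0: 'a b c'}
```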

the memory in agents can be improved and researched in terms of:
- increasing the amount of stored information
- how to retrieve the most relevant info
- combining context-window memory with external memory
- deciding what to forget or update in memory
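
two of these points (retrieving the most relevant info, deciding what to forget) fit in a small toy store. this is a sketch under loud assumptions: `MemoryStore` is a made-up class, relevance is plain word-overlap (Jaccard) instead of embeddings, and forgetting is least-recently-used eviction. the paper may discuss very different strategies.

```python
class MemoryStore:
    def __init__(self, capacity=3):
        self.capacity = capacity
        self.items = {}  # memory text -> tick at which it was last used
        self.tick = 0

    def _touch(self, text):
        self.tick += 1
        self.items[text] = self.tick

    def add(self, text):
        self._touch(text)
        # forgetting policy (assumption): drop the least recently used memory
        if len(self.items) > self.capacity:
            oldest = min(self.items, key=self.items.get)
            del self.items[oldest]

    def retrieve(self, query, k=1):
        q = set(query.lower().split())

        def score(text):
            # Jaccard similarity between query words and memory words
            words = set(text.lower().split())
            union = q | words
            return len(q & words) / len(union) if union else 0.0

        ranked = sorted(self.items, key=score, reverse=True)[:k]
        for text in ranked:
            self._touch(text)  # retrieved memories count as recently used
        return ranked


store = MemoryStore(capacity=2)
store.add("the cat sat on the mat")
store.add("paris is the capital of france")
print(store.retrieve("capital of france"))  # ['paris is the capital of france']
print(store.retrieve("cat mat"))            # ['the cat sat on the mat']
store.add("agents use tools")               # over capacity: the paris memory is forgotten
print(sorted(store.items))                  # ['agents use tools', 'the cat sat on the mat']
```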

the agent must simulate or predict the future states of the environment for planning and decision-making.

ai world models are much simpler than human ones, which draw on causal reasoning (cause-and-effect) and physical intuition.

LLM world models are mostly implicit and embedded
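
to make "simulate future states for planning" concrete, here is a toy explicit world model (unlike the mostly implicit ones inside LLMs). all of it is assumed for illustration: a 10-cell line as the environment, the `world_model` transition function, and brute-force rollouts over every action sequence.

```python
from itertools import product


def world_model(state, action):
    # hypothetical transition model: predicts the next position on cells 0..9
    return max(0, min(9, state + action))


def plan(state, goal, horizon=3, actions=(0, 1, -1)):
    # roll every action sequence forward through the world model and
    # return the first action of the sequence that ends closest to the goal
    best_seq, best_dist = None, float("inf")
    for seq in product(actions, repeat=horizon):
        predicted = state
        for action in seq:
            predicted = world_model(predicted, action)
        dist = abs(goal - predicted)
        if dist < best_dist:
            best_seq, best_dist = seq, dist
    return best_seq[0]


print(plan(2, 7))  # 1
print(plan(7, 2))  # -1
print(plan(5, 5))  # 0
```

the agent never touches the real environment while planning; it only queries its own predictions, so the plan is only as good as the model.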

EMOTIONS are a deep aspect of humans, helping them with social interactions, decision-making, or learning.

agents must understand emotions to better interact with us.

but rather than encoding the feeling of emotions, agents only model them at a surface level.

Perception is the process by which an agent receives and interprets raw data from its surroundings.

human perception is complex, while AI's perception is mostly limited to textual and vision data, though research is finding ways to incorporate more (e.g. audio)

the paper goes on to explore multi-agent systems and the approach of key players such as MetaGPT, @CamelAIOrg, @huggingface, or ChatDev.

It also touches on online active learning, design of multi-agent systems, and different agent collaboration paradigms.

I only covered Part I of the paper. It has 4 comprehensive parts which cover almost all crucial things to know about agents.

Read Paper: huggingface.co/papers/2504.01…

CORRECTION: the paper is not affiliated with Meta but with @MetaGPT_
