
elvis

@omarsar0

Aug 3 • 15 tweets • 6 min read • Read on X

The Agentic Web is upon us!

If you want to learn about the Agentic Web, look no further.

This new report is a banger!

It presents a detailed framework to understand and build the agentic web.

Here is everything you need to know:

Agentic Web

This paper introduces the concept of the Agentic Web, a transformative vision of the internet where autonomous AI agents, powered by LLMs, act on behalf of users to plan, coordinate, and execute tasks.

It proposes a structured framework for understanding this shift, situating it as a successor to the PC and Mobile Web eras.

The Agentic Web is defined by three core dimensions (intelligence, interaction, and economics) and involves fundamental architectural and commercial transitions.

From static browsing to agentic delegation

The Agentic Web transitions from human-led navigation (PC era) and feed-based content discovery (Mobile era) to agent-driven action execution.

Here, users delegate intents like “plan a trip” or “summarize recent research,” and agents autonomously orchestrate multi-step workflows across services and platforms.

Three dimensions of the Agentic Web

Intelligence: Agents must support contextual understanding, planning, tool use, and self-monitoring across modalities.

Interaction: Agents communicate via semantic protocols (e.g., MCP, A2A), enabling persistent, asynchronous coordination with tools and other agents.

Economics: Autonomous agents form new machine-native economies, shifting focus from human attention to agent invocation and task completion.

A Cross-Era Comparison

The authors compare the PC, Mobile, and Agentic Web eras across dimensions like user behavior, technology, commercial models, and attention focus, framing the Agentic Web as a shift to action-driven, agent-mediated interaction and economics.

Web Systems Evolution

The architectural evolution of the Web highlights a shift from static content and manual interaction (PC era), to algorithm-curated feeds (Mobile era), and now to agentic automation where AI agents handle tasks via goal-driven orchestration.

This marks a transition from human operators to agents as outcome-driven executors.

Algorithmic Transitions for the Agentic Web

Traditional paradigms like keyword search, recommender systems, and single-agent MDPs are replaced by agentic retrieval, goal-driven planning, and multi-agent orchestration.

This includes systems like ReAct, WebAgent, and AutoGen, which blend LLM reasoning with external tool invocation, memory, and planning modules.
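To make "LLM reasoning blended with external tool invocation" more concrete, here is a minimal sketch of a ReAct-style loop (thought, action, observation). It is not code from the paper or from any of the systems named above: the `search` tool, its tiny corpus, and `scripted_policy` (a hard-coded stand-in for the LLM) are all hypothetical names made up for illustration.

```python
from typing import Callable, Dict, List, Tuple

Step = Tuple[str, str, str]  # (thought, action taken, observation)


def search(query: str) -> str:
    """Hypothetical lookup tool backed by a tiny in-memory corpus."""
    corpus = {"agentic web paper": "arxiv.org/abs/2507.21206"}
    return corpus.get(query.lower(), "no result")


# Tool table the loop dispatches into; a real agent would wrap external services.
TOOLS: Dict[str, Callable[[str], str]] = {"search": search}


def scripted_policy(goal: str, history: List[Step]) -> Tuple[str, str, str]:
    """Stand-in for the LLM: returns (thought, action, action_input)."""
    if not history:
        return ("I should look the paper up first.", "search", goal)
    # An observation is available, so finish with it as the answer.
    return ("I have what I need.", "finish", history[-1][2])


def react_loop(goal: str, policy, max_steps: int = 5):
    """Alternate between reasoning and tool calls until the policy finishes."""
    history: List[Step] = []
    for _ in range(max_steps):
        thought, action, action_input = policy(goal, history)
        if action == "finish":
            return action_input, history
        observation = TOOLS[action](action_input)
        history.append((thought, f"{action}({action_input!r})", observation))
    return "step budget exhausted", history


answer, trace = react_loop("Agentic Web paper", scripted_policy)
```

Swapping `scripted_policy` for a function that prompts a model with the goal and the accumulated history is where an actual LLM would plug in; the `max_steps` cap is one simple guard against runaway tool use.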

Protocols and Infrastructure

To enable agent-agent and agent-tool communication, the paper details protocols like MCP and A2A (Agent-to-Agent), along with system components such as semantic registries, task routers, and billing ledgers.

These redefine APIs as semantically rich, discoverable services.
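A toy sketch of how a registry, a task router, and a billing ledger could fit together. This is an illustration under loose assumptions, not the MCP or A2A wire format and not the paper's design: every class name, the exact-match capability tags, and the "pick the cheapest provider" rule are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set, Tuple


@dataclass
class Service:
    name: str
    capabilities: Set[str]          # assumed: plain tags instead of rich semantic descriptions
    price_per_call: float
    handler: Callable[[str], str]


class SemanticRegistry:
    """Lets agents discover services by what they can do, not by URL."""

    def __init__(self) -> None:
        self._services: List[Service] = []

    def register(self, service: Service) -> None:
        self._services.append(service)

    def discover(self, capability: str) -> List[Service]:
        return [s for s in self._services if capability in s.capabilities]


@dataclass
class BillingLedger:
    entries: List[Tuple[str, str, float]] = field(default_factory=list)

    def charge(self, agent_id: str, service_name: str, amount: float) -> None:
        self.entries.append((agent_id, service_name, amount))

    def total(self, agent_id: str) -> float:
        return sum(amount for who, _, amount in self.entries if who == agent_id)


class TaskRouter:
    """Routes a task to the cheapest matching service and records the charge."""

    def __init__(self, registry: SemanticRegistry, ledger: BillingLedger) -> None:
        self.registry = registry
        self.ledger = ledger

    def route(self, agent_id: str, capability: str, payload: str) -> str:
        candidates = self.registry.discover(capability)
        if not candidates:
            raise LookupError(f"no service offers {capability!r}")
        chosen = min(candidates, key=lambda s: s.price_per_call)
        self.ledger.charge(agent_id, chosen.name, chosen.price_per_call)
        return chosen.handler(payload)


registry = SemanticRegistry()
registry.register(Service("fly-a", {"book_flight"}, 0.05, lambda p: f"fly-a booked {p}"))
registry.register(Service("fly-b", {"book_flight"}, 0.02, lambda p: f"fly-b booked {p}"))
ledger = BillingLedger()
router = TaskRouter(registry, ledger)
result = router.route("agent-1", "book_flight", "LIS")
```

The ledger charging per invocation rather than per page view is a small nod to the thread's point about economics moving from human attention to agent invocation.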

Interaction Process

The example shows how high-level user intents are processed via three core components: the User Client, the Intelligent Agent, and Backend Services.
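As a rough sketch of that three-part flow, the snippet below passes one intent from a client to an agent, which plans steps and calls backend services. The class names mirror the three components, but the methods, the hard-coded plan, and the two toy services are assumptions for illustration and are not taken from the paper's example.

```python
from typing import Callable, Dict, List, Tuple


class BackendServices:
    """Hypothetical backend exposing two toy services by name."""

    def __init__(self) -> None:
        self._services: Dict[str, Callable[[str], str]] = {
            "flights": lambda destination: f"flight to {destination} found",
            "hotels": lambda destination: f"hotel in {destination} found",
        }

    def call(self, service: str, argument: str) -> str:
        return self._services[service](argument)


class IntelligentAgent:
    """Turns a high-level intent into service calls and merges the results."""

    def __init__(self, backend: BackendServices) -> None:
        self.backend = backend

    def plan(self, intent: str) -> List[Tuple[str, str]]:
        # Assumed: a fixed two-step plan; a real agent would use an LLM planner.
        destination = intent.rsplit(" to ", 1)[-1]
        return [("flights", destination), ("hotels", destination)]

    def handle(self, intent: str) -> str:
        results = [self.backend.call(service, arg) for service, arg in self.plan(intent)]
        return "; ".join(results)


class UserClient:
    """Thin front end: the user states an intent and gets back an outcome."""

    def __init__(self, agent: IntelligentAgent) -> None:
        self.agent = agent

    def submit(self, intent: str) -> str:
        return self.agent.handle(intent)


client = UserClient(IntelligentAgent(BackendServices()))
outcome = client.submit("plan a trip to Lisbon")
print(outcome)
```

The user never touches the individual services here, which is the delegation idea in miniature: the client carries the intent, and the agent owns the orchestration.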

Applications and use cases

From transactional automation (e.g., booking, purchasing), to deep research and inter-agent collaboration, the Agentic Web supports persistent agent-driven workflows.

Implementations of autonomous web agents include ChatGPT Agent, Anthropic Computer Use, Google Project Mariner, and Genspark Super Agent.

Agentic Browsers

The authors list early AI-augmented browser (Agent-as-Interface) applications, such as Opera Neon, Perplexity Comet, and Microsoft NLWeb, highlighting how agents augment browsing via orchestration, summarization, and conversational UIs.

Taxonomy of Agentic Web Challenges

The authors present a taxonomy of open challenges in building the Agentic Web, spanning foundational cognition, learning, coordination, alignment, security, and socio-economic impact.

Risks and governance

The shift to autonomous agents introduces new safety threats, such as goal drift, context poisoning, and coordinated market manipulation.

The paper proposes multi-layered defenses including red teaming (human and automated), agentic guardrails, and secure protocols, while highlighting gaps in evaluation (e.g., lack of robust benchmarks for agent safety).

Paper: arxiv.org/abs/2507.21206
GitHub: github.com/SafeRL-Lab/age…

--
Want to take the next steps?

Learn everything you need to know about building with AI Agents in my academy: dair-ai.thinkific.com

More from @omarsar0

elvis

@omarsar0

Aug 2

Hierarchical Reasoning Model

This is one of the most interesting ideas on reasoning I've read in the past couple of months.

It uses a recurrent architecture for impressive hierarchical reasoning.

Here are my notes:

The paper proposes a novel, brain-inspired architecture that replaces CoT prompting with a recurrent model designed for deep, latent computation.

It moves away from token-level reasoning by using two coupled modules: a slow, high-level planner and a fast, low-level executor.

The two recurrent networks operate at different timescales to collaboratively solve tasks.

Leads to greater reasoning depth and efficiency with only 27M parameters and no pretraining!

elvis

@omarsar0

Jul 30

Graph-R1

New RAG framework just dropped!

Combines agents, GraphRAG, and RL.

Here are my notes:

Introduces a novel RAG framework that moves beyond traditional one-shot or chunk-based retrieval by integrating graph-structured knowledge, agentic multi-turn interaction, and RL.

Graph-R1 is an agent that reasons over a knowledge hypergraph environment by iteratively issuing queries and retrieving subgraphs using a multi-step “think-retrieve-rethink-generate” loop.

Unlike prior GraphRAG systems that perform fixed retrieval, Graph-R1 dynamically explores the graph based on evolving agent state.

elvis

@omarsar0

Jul 28

GLM-4.5 looks like a big deal!

> MoE Architecture
> Hybrid reasoning models
> 355B total (32B active)
> GQA + partial RoPE
> Multi-Token Prediction
> Muon Optimizer + QK-Norm
> 22T-token training corpus
> Slime RL Infrastructure
> Native tool use

Here's all you need to know:

Model Architecture & Pre-Training

GLM-4.5 is 355B total parameters (32B active); deeper model with narrower width; optimized for reasoning via more layers and 96 attention heads.

GLM-4.5-Air is 106B (12B active).

22T-token training corpus that combines 15T general data with 7T code/reasoning-focused data.

Grouped-Query Attention + partial RoPE to enhance long-context efficiency and accuracy in reasoning tasks.

Mid-training looks like a key part of this model:

"Unlike the earlier pre-training stage on large-scale universal documents, these stages leverage medium-sized domain-specific datasets, including instruction data."

elvis

@omarsar0

Jul 27

Claude Code is more than a coding agent.

It's more like a super smart orchestrator agent.

Watch this evaluator loop agent I just built using sub agents and / commands.

This is one of the fastest ways to build custom agentic workflows.

Claude Code is no joke!

I'm impressed to see how easy it is to control how the sub agents communicate with each other (i.e., chain, loop, hierarchical, critic, etc.).

Claude Code is good out of the box, but customization gives you a clear advantage.

Custom sub agents + / commands solve that.

It's worth spending the time optimizing instructions, tool use, agent definitions, and more.

Claude Code, on its own, somehow likes to use a lot of tokens and perform unnecessary tasks/tool calls.

You can max out credits or hit rate limits really fast if you are not careful.

elvis

@omarsar0

Jul 19

Context Rot

Great title for a report, but even better insights about how increasing input tokens impacts the performance of top LLMs.

Banger report from Chroma.

Here are my takeaways (relevant for AI devs):

Context Rot

The research evaluates how state-of-the-art LLMs perform as input context length increases, challenging the common assumption that longer contexts are uniformly handled.

Testing 18 top models (including GPT-4.1, Claude 4, Gemini 2.5, Qwen3), the authors show that model reliability degrades non-uniformly even on simple tasks as input grows, a phenomenon they term "context rot."

Simple tasks reveal degradation

Even basic benchmarks like semantic variants of Needle-in-a-Haystack, repeated word copying, or long QA logs (LongMemEval) expose accuracy drops as context length increases.

The decline is more dramatic for semantically ambiguous inputs or outputs that scale with length.

elvis

@omarsar0

Jul 18

A Survey of Context Engineering

160+ pages covering the most important research around context engineering for LLMs.

This is a must-read!

Here are my notes:

The paper provides a taxonomy of context engineering in LLMs categorized into foundational components, system implementations, evaluation methodologies, and future directions.

The context engineering evolution timeline from 2020 to 2025 runs from foundational RAG systems to complex multi-agent architectures.
