Post

Accelxr 👾

@accelxr

Jul 2, 2024 • 1 tweets • 14 min read • Read on X

Onchain AI Agents: Architecture, Examples, and Projects to Follow

The primary purpose of current generative models is content creation and information filtering. However, recent research + discussion around AI Agents - autonomous actors that use external tools to complete user defined goals - implies there could be a substantive unlock for AI if provided with economic rails similar to the trajectory of the Internet in the 1990s.

To do this, agents will need agency over assets they can control since the traditional financial system is not setup for them.

Enter crypto: a payment and ownership layer with fast settlement that is uniquely digital, effectively built for AI agents.

Below, I'll give you an introduction to the concept of agents and agentic architectures, examples in the research of how agents have been shown to have emergent properties above and beyond traditional LLMs, and projects that are building solutions or products around crypto-based agents.

---

🔍 What is an Agent 🔍

🔍 AI agents are LLM-powered entities able to plan and take actions to execute goals over multiple iterations.

🔍 Agentic architectures are either comprised of a single agent or multiple agents working together to solve a problem.

🔍 Typically, each agent is given a personality and access to a variety of tools that will help them accomplish their job either independently or as part of a team.

🔍 Agentic architectures differ from how we typically interact with LLMs today:

→ Zero Shot Prompting is how most people interact with these models: you enter a prompt, and the LLM generates a response based on its pre-existing knowledge.

→ Within an agentic architecture, you instead initialize a goal, the LLM breaks it into subtasks, and it then recursively prompts itself (or other models) to complete each subtask autonomously until the objective is reached.

---

🏗️ Single vs. Multi Agent Architectures 🏗️

Single Agent Architecture:
One language model performs all the reasoning, planning, and tool execution on its own. There is no feedback mechanism from other agents, but there may be options for a human to provide feedback to the agent.

Multi Agent Architecture:
These architectures involve two or more agents, where each agent can utilize the same language model or a set of different language models. The agents may have access to the same tools or different tools. Each agent typically has their own persona.

→ Vertical Architecture: one agent acts as a leader that other agents report to. This can help with organizing the group's outputs.
→ Horizontal Architecture: One large group discussion about the task where every agent can see other messages and volunteer to complete tasks or call tools.

🏗️ Agent Architecture: Profile
Agents have profiles or personalities which define a role into the prompt to influence the LLM behaviors and skillset. This is largely determined by the specific application.

Likely many of you already use this as a prompting technique today: "You are an expert nutritionist. Provide me with a meal plan...". Interestingly, providing the LLM a role improves its outputs vs. baseline.

Profiles can be crafted by the following methods:

→ Handcrafting: Manually specified profiles by a human creator; most flexible but also time consuming.

→ LLM-generation: profile generated by an LLM with a ruleset around composition and attributes + (optionally) few-shot examples.

→ Dataset Alignment: profiles are generated from a real-world dataset of people.

🏗️ Agent Architecture: Memory
The agent's memory stores information perceived from the environment and leverages it to create new plans or actions. Memory allows an agent to self-evolve and behave based on its experiences.

→ Unified Memory: reminiscent of short-term memory that is realized through in-context learning / passing via continuous prompting. All relevant memories are passed through to the agent in every prompt. Mainly limited by context window size.

→ Hybrid: short + long term memory. Short term memory is a temporary buffer for current state. Reflections or useful long term information are stored permanently in a database. There are a few ways to do this but a common method is a vector database (the memory is encoded as an embedding and stored; recall comes from similarity search)

→ Formats: natural language, databases (e.g. SQL with finetuning to understand SQL queries), structured lists, embeddings

🏗️ Agent Architecture: Planning
The deconstruction of a complex task into simpler subtasks to solve individually.

Planning without Feedback:
In this method, the agents do not receive feedback that influence future behaviors after taking actions. An example is Chain of Thought (CoT) where the LLM is encouraged to express its thought process when providing an answer.

→ Single path reasoning (e.g. zero-shot Chain of Thought)
→ Multi-path reasoning (e.g. Self-consistent CoT where multiple CoT threads are generated with highest frequency answers used)
→ External Planner (e.g. planning domain definition language)

Planning with Feedback:
Iteratively refining subtasks based on external feedback

→ Environmental feedback (e.g. game task completion signals)
→ Human feedback (e.g. solicit feedback from user)
→ Model feedback (e.g. solicit another LLM for feedback - crowdsourced)

🏗️ Agent Architecture: Action
Action is responsible for translating the agent’s decisions into specific outcomes.

The action goal comes in multiple possible forms, such as:
→ task completion (e.g. make a iron pickaxe in Minecraft)
→ communication (e.g. share information to another agent or human)
→ environment exploration (e.g. searching its own action space and learning its own abilities).

Production of an action often comes from memory recollection or plan following, and the action space is comprised of internal knowledge, APIs, databases/knowledge bases, and using external models to itself.

🏗️ Agent Architecture: Capability Acquisition

For an agent to properly execute an action within the action space, it must have task-specific capabilities. There are primarily two major ways to do this:

→ Via Fine-tuning: Trains the agent on human-annotated, LLM generated, or real-world datasets of example actions.

→ Without Fine-tuning: can use innate abilities of an LLM via more sophisticated prompt engineering and/or mechanism engineering (i.e. incorporating external feedback or experience accumulation while performing trial-and-error).

---

🛠️ Examples in the Literature 🛠️

🛠️ Generative Agents: Interactive Simulacra of Human Behavior: Instantiated generative agents in a virtual sandbox environment, showing multi-agent systems have emergent social behaviors. Froma single user-specified prompt of an upcoming Valentine's Day party, the agents autonomously spread invitations over the next two days, make new acquaintances, ask each other out on dates, and coordinate to show up to the party together at the right time. Can try this yourself using @a16z AI Town implementation.

🛠️ Describe Explain Plan Select (DEPS): the first zero-shot multi-task agent that can accomplish 70+ Minecraft tasks.

🛠️ Voyager: the first LLM-powered embodied lifelong learning agent in Minecraft that continuously explores the world, acquires diverse skills, and makes novel discoveries without human intervention. Continually refines it skill execution code based on feedback from trail and error.

🛠️ CALYPSO: An agent designed for the game Dungeons & Dragons, which can assist Dungeon Masters in the creation and narration of stories. Its short-term memory is built upon scene descriptions, monster information, and previous summaries.

🛠️ Ghost in the Minecraft (GITM): Generally capable agent in Minecraft with a 67.5% success rate on obtaining diamonds and 100% completion rate of all items in-game.

🛠️ SayPlan: LLM-based large-scale task planning for robotics using 3d scene graph representations showing capabilities of long-horizon task plans from abstract and natural language instruction for a robot to execute.

🛠️ HuggingGPT: Uses ChatGPT for task planning upon user prompt, selects models according to their descriptions on Hugging Face, and executes all subtasks with impressive results in language, vision, speech, and other challenging tasks.

🛠️ MetaGPT: takes an input and outputs user stories / competitive analysis / requirements / data structures / APIs / documents, etc. Internally, has multiple agents that make up the various functions of a software company.

🛠️ ChemCrow: an LLM chemistry agent designed to accomplish tasks across organic synthesis, drug discovery, and materials design using 18 expert-designed tools. Autonomously planned and executed synthesis of an insect repellant, three organocatalysts, and guided the discovery of a novel chromophore.

🛠️ BabyAGI: General infrastructure that uses OpenAI and vector databases such as Chroma or Weaviate to create, prioritize, and execute tasks.

🛠️ AutoGPT: Another example of generalized infrastructure for spinning up LLM agents.

---

✨ Examples in Crypto (note: not all are LLM-based + some might be more loosely based on agentic concepts) ✨

✨ @frenrug from @ritualnet: Based on the GPT-4 Turkish Carpet Salesman game {}. Frenrug is an agent that anyone can try to convince to buy their @friendtech key. Each user message is passed to multiple LLMs run by different Infernet nodes. These nodes respond onchain with an LLM-produced vote on whether the agent should purchase the proposed key. When enough nodes respond, aggregation of the votes occurs and a supervised classifier model determines the action and relays a validity proof onchain, allowing for verification of the off-chain execution of the multinomial classifier.

✨ Prediction market agents on @gnosischain using @autonolas: AI mechs are essentially a smart contract wrapper for an AI service that anyone can call with a payment and a question. A service monitors the requests, performs the task and return the answer back onchain. This AI mech infra has been expanded to prediction markets via Omen with the underlying idea that agents will actively monitor and bet on predictions from news analytics that ultimately result in aggregated predictions closer to real-world odds. Agents scan for markets on Omen, autonomously pay a 'mech" for a prediction on the topic, and use the market to make a trade.

✨ @ianDAOs GPT<> @safe demo: The GPT autonomously manages USDC in its own Safe multisig wallet on @Base using @syndicateio Transaction Cloud API. You can talk to it and give it suggestions on how to best use its capital, and it might allocate based on your recommendations.

✨ Gaming Agents: Multiple ideas here, but in short AI agents in virtual environments as companions (think Skyrim AI NPC) and competitors (e.g. a griefing gang of pudgy penguins). Agents could do things like automatically execute farming strategies on your behalf, provide goods and services (think: shop owners, traveling merchants, sophisticated generative quest givers), or act as semi-playable characters as in @ParallelColony and @aiarena_ .

✨ @safe Guardian Angels: Uses a group of AI agents to monitor wallets and provide defense against potential threats to protect user funds and improve wallet security. Features include the automatic revoking of contract permissions and the withdrawing of funds in case of anomalies or hacks.

✨ @bottoproject: While a more loosely defined example of an onchain agent, Botto showcases the concept of an autonomous onchain artist, producing works that are voted on by tokenholders and auctioned off on SuperRare. One could imagine various extensions of this that employ multi-modal agentic architectures.

---

✨ Some Notable Agent-specific Projects (note: not all are LLM-based + some might be more loosely based on agentic concepts) ✨

✨ @AIWayfinder - Decentralized knowledge graph of Protocols, Contracts, Contract Standards, Assets, Functions, API Functions, Routines + paths (i.e. a virtual roadmap of blockchain ecosystems that wayfinder agents can navigate). Users will be rewarded for identifying viable paths that are used by agents. In addition, you can mint shells (i.e. agents) that include persona setup and skill activation that can subsequently plugin to the wayfinder knowledge graph.

✨ @ritualnet - As was seen with the frenrug example above, Ritual infernet nodes can be used to setup multiagent architectures. Nodes listen for onchain or offchain requests and deliver outputs with optional proofs.

✨ @MorpheusAIs - A peer-to-peer network of personal general purpose AIs that can execute Smart Contracts on behalf of a user. This can be used for web3 wallet and tx intent management, data parsing through chatbot interfaces, recommender models for dapps & contracts, and extensions of agent actions via long-term memory of connected apps & user data.

✨ @dainprotocol - Exploring multiple use cases for agent deployments on Solana. Recently demoed deployment of a crypto trading bot that can ingest onchain and offchain information to execute on behalf of a user (e.g. selling BODEN if Biden loses).

✨ @NapthaAI - An agent orchestration protocol with an on-chain task marketplace for contracting agents, operator nodes that orchestrate tasks, an LLM Workflow Orchestration Engine that supports async messaging across different nodes, and a Proof-of-Workflow system for verifying execution.

✨ @myshell_ai - AI character platform similar to where creators can monetize agent profiles and tools. Multimodal infrastructure with some interesting examples agents including translation, education, companionship, coding, and more. Contains both simple no-code agent creation and more advanced developer modes for assembling AI widgets.

✨ @aiarena_ - A competitive PvP fighting game where gamers purchase, train, and battle AI-enabled NFTs. Players train their agent NFTs via imitation learning, where the AI learns to play the game across different maps and scenarios by learning the associated probabilities of the player's actions. Once trained, players can send their agents out to fight in ranked battles for token rewards. Not LLM-based, but still an interesting example of agentic gaming possibilities.

✨ @virtuals_io - A protocol for building and deploying multimodal agents to gaming and other online spaces. The three primary archetypes of virtuals today include mirrors of IP characters, function-specific agents, and personal doubles. Contributors contribute data and models to Virtual, with validators acting as gatekeepers. There is an economic layer of incentives for development and monetization.

✨ @BrianknowsAI - Offers a UI for users to interact with an agent that can perform txs, research crypto-specific information, and deploy smart contracts by prompt. Currently supports over 10 actions across 100+ integrations. A recent example was having an agent stake ETH in Lido on behalf of the user using natural language.

✨ @autonolas - Offers lightweight local and cloud-based agents, consensus operated decentralized agents, and specialized agent economies. Prominent examples include DeFi and prediction-based agents, an AI-powered governance delegate, and agent-to-agent tool marketplaces. Offers both a protocol for coordinating and incentivizing agent operations + the OLAS stack, an open source framework for developers to build co-ownable agents.

✨ @CreatorBid - Offers users agents with social media personas connected to live APIs for X and Farcaster. Brands can spin up knowledge based agents to execute brand-aligned content on social platforms.

✨ @polywrap_io - Offers a variety of agent-based products, such as Indexer (a social media agent for Farcaster), AutoTx (a planning and tx execution agent built with Morpheus and @flock_io), (a predictive agent with Gnosis and Autonolas), and (an agent for grant resource allocation).

✨ Verification - Since economic flows will be directed by agents, verification of outputs will be important (more on this in a future post). Verification methodologies include opML / opp/ai from @OraProtocol, zkML from teams like @ModulusLabs + @gizatechxyz + EZKL, game theoretical solutions, and hardware-based solutions like TEEs.

✨

---

💡 Some ideas for onchain agents 💡

💡 Ownable, tradeable, token-gated agents fulfilling various types of functions, from companionship to financial applications
💡 Agents that can identify, learn, and participate on your behalf in game economies; also autonomous agents as players in collaborative, competitive, or fully simulated environments.
💡 Agents that can simulate real human behavior for protocol farming opportunities
💡 Multi-agent managed smart wallets that can act as autonomous asset managers
💡 AI managed DAO governance (e.g. token delegation, proposal creation or management, process refinement, etc.)
💡 Use of web3 storage or databases for a composable system of vector embeddings for shared and permanent memory state
💡 Locally run agents that participate in a global consensus network for user defined tasks
💡 Knowledge graphs for existing and new protocol interactions and APIs
💡 Autonomous keeper networks, multisig security, smart contract security & feature enhancements
💡 Truly autonomous investment DAOs (e.g. a collector DAO that uses art historian, investment analyst, data analyst, and degen agent personas)
💡 Tokenomics and contract security simulations & testing
💡 General intent management, especially in the cases of crypto UX such as bridging or DeFi
💡 Art or experimental projects

---

📈 Onboarding the next billion users 📈

As @jessewldn recently put it, autonomous agents are an evolution, rather than a revolution, in how blockchains are used: we already have protocol task bots, sniper bots, MEV searchers, botkits, etc. Agents are just an extension of this.

Many sectors of crypto are built in a way conducive to agent execution, such as fully onchain games and DeFi.

Assuming costs of LLMs trend downwards relative to performance on tasks + accessibility of creating and deploying agents increases, it’s hard to imagine a world where AI agents don’t dominate onchain interactions and become crypto's next billion users.

---

📚 Further Reading 📚

📚 AI Agents That Can Bank Themselves Using Blockchains []
📚 The new AI agent economy will run on Smart Accounts []
📚 A Survey on Large Language Model based Autonomous Agents (I used this for identifying the taxonomy of agentic architectures above, highly recommend) []
📚 ReAct: Synergizing Reasoning and Acting in Language Models []
📚 Generative agents: Interactive simulacra of human behavior []
📚 Reflexion: Language Agents with Verbal Reinforcement Learning
[]
📚 Toolformer: Language Models Can Teach Themselves to Use Tools []
📚 Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agents []
📚 Voyager: An Open-Ended Embodied Agent with Large Language Models []
📚 LLM Agents Papers GitHub Repo []aiadventure.spiel.com/carpet
character.ai
predictionprophet.ai
fundpublicgoods.ai
mirror.xyz/0x16de9a0d10EF…
safe.mirror.xyz/V965PykKzlE1PC…
arxiv.org/pdf/2308.11432
arxiv.org/abs/2210.03629
arxiv.org/abs/2304.03442
arxiv.org/abs/2303.11366
arxiv.org/abs/2302.04761
arxiv.org/pdf/2302.01560
arxiv.org/abs/2305.16291
github.com/zjunlp/LLMAgen…

• • •

Missing some Tweet in this thread? You can try to force a refresh

Share this page!

Enter URL or ID to Unroll

Accelxr 👾

Try unrolling a thread yourself!

More from @accelxr

Accelxr 👾

Accelxr 👾

Accelxr 👾

Accelxr 👾

Accelxr 👾

Accelxr 👾

Did Thread Reader help you today?

Don't want to be a Premium member but still want to support us?

Send Email!