"We demonstrate through ablation that the components of our agent architecture—observation, planning, and reflection—each contribute critically to the believability of agent behavior"
One of the novel components was an "architecture that makes it possible for generative agents to remember, retrieve, reflect, interact with other agents" - this is what we tried to recreate
So now that we have this retriever, how is it used in memory?
There are two key methods: `add_memory` and `summarize_related_memories`
When an agent makes an observation, it stores the memory:
1. An LLM scores the memory's importance (1 for mundane, 10 for poignant)
2. The observation and importance are stored in the retrieval system
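The two storage steps above can be sketched in plain Python. The `score_importance` function here is a hypothetical stub standing in for the LLM call (the real system prompts a model to rate the observation from 1 to 10):

```python
from datetime import datetime

def score_importance(observation: str) -> int:
    """Stub for the LLM call that rates an observation from
    1 (mundane) to 10 (poignant). Hypothetical heuristic."""
    poignant_words = {"died", "love", "fired", "married"}
    return 9 if any(w in observation.lower() for w in poignant_words) else 2

class MemoryStore:
    """Toy stand-in for the agent's retrieval system."""

    def __init__(self):
        self.memories = []

    def add_memory(self, observation: str) -> dict:
        # 1. Score the memory's importance with the (stubbed) LLM.
        importance = score_importance(observation)
        # 2. Store observation + importance, with timestamps the
        #    retriever will later use for recency weighting.
        memory = {
            "content": observation,
            "importance": importance,
            "created_at": datetime.now(),
            "last_accessed": datetime.now(),
        }
        self.memories.append(memory)
        return memory
```

A real implementation would also embed the observation into a vector store at this point; the dict above just shows what metadata travels with it.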
When an agent responds to an observation:
1. Generates one or more queries for the retriever, which fetches documents based on salience, recency, and importance
2. Summarizes the retrieved information
3. Updates the last_accessed_time for the used documents
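The fetch in step 1 can be sketched as a weighted score over the three signals. The exponential decay rate and the equal weighting below are assumptions for illustration, and `similarity` stands in for real embedding salience:

```python
def retrieval_score(similarity: float, importance: int,
                    hours_since_access: float,
                    decay_rate: float = 0.99) -> float:
    """Combine salience, importance (normalized to 0-1), and
    exponentially decayed recency into a single score."""
    recency = decay_rate ** hours_since_access
    return similarity + importance / 10 + recency

def top_k(memories, k=3):
    """memories: (similarity, importance, hours_since_access, text) tuples.
    Returns the k texts with the highest combined score. A full
    implementation would also bump each result's last_accessed_time."""
    scored = sorted(
        memories,
        key=lambda m: retrieval_score(m[0], m[1], m[2]),
        reverse=True,
    )
    return [m[3] for m in scored[:k]]
```

Note how a poignant, recently touched memory can outrank a more semantically similar but stale one - that interplay is the whole point of the weighted retriever.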
So let’s now see this in action! We can simulate what happens by feeding observations to the agent and seeing how the summary of the agent is updated over time
Here we do a simple update of only a few observations
We can take this even further and update with ~20 or so observations (a full day’s worth)
We can then “interview” the agent before and after the day - notice the change in the agent’s responses!
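The feed-then-interview loop boils down to something like the sketch below. The agent class, its methods, and the sample observations are all illustrative stand-ins for the real LLM-backed agent:

```python
class ToyAgent:
    """Hypothetical stand-in agent: accumulates observations and
    answers interview questions from its most important memories."""

    def __init__(self, name):
        self.name = name
        self.memories = []

    def add_memory(self, observation, importance):
        self.memories.append((importance, observation))

    def interview(self, question):
        # Answer from the top-3 most important memories so far;
        # the real agent would run retrieval + an LLM summary here.
        top = sorted(self.memories, reverse=True)[:3]
        return f"{self.name} recalls: " + "; ".join(obs for _, obs in top)

agent = ToyAgent("Alex")
before = agent.interview("How was your day?")

# Feed a (shortened, made-up) day's worth of observations.
day = [
    ("wakes up to a noisy construction site", 3),
    ("sees a new neighbor moving in next door", 5),
    ("gets rejected at a job interview", 8),
]
for observation, importance in day:
    agent.add_memory(observation, importance)

after = agent.interview("How was your day?")
```

The point of the before/after interview is exactly this: the same question routed through a changed memory store yields a different answer.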
Finally, we can create a simulation of two agents talking to each other.
This is a far cry from the ~20 or so agents the paper simulated, but it's still interesting to see the conversation + interview them before and after
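Under the hood, a two-agent simulation is just a turn-taking loop. A sketch with stub agents - the canned replies and names are placeholders for the real LLM-driven dialogue generation:

```python
class StubAgent:
    """Stand-in agent with canned replies instead of an LLM."""

    def __init__(self, name, lines):
        self.name = name
        self.lines = iter(lines)

    def generate_reply(self, heard):
        # Returns (keep_going, utterance); ends the chat when
        # it runs out of things to say.
        try:
            return True, next(self.lines)
        except StopIteration:
            return False, f"{self.name} said goodbye"

def run_conversation(a, b, opener, max_turns=10):
    """Alternate turns between two agents until one stops
    (or max_turns is hit, to guard against endless loops)."""
    transcript = [opener]
    speaker, listener = a, b
    for _ in range(max_turns):
        keep_going, utterance = speaker.generate_reply(transcript[-1])
        transcript.append(utterance)
        if not keep_going:
            break
        speaker, listener = listener, speaker
    return transcript
```

The `max_turns` cap matters in practice: without it, two chatty LLM agents can keep a conversation going indefinitely.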
• • •
One thing we've seen is that while default agents make it easy to prototype, a lot of teams want to customize some component of them in order to improve the accuracy of THEIR application
In order to enable this, we exposed all the core components
The basic idea: you store multiple embedding vectors per document. How do you generate these embeddings?
👨👦Smaller chunks (this is ParentDocumentRetriever)
🌞Summary of document
❓Hypothetical questions
🖐️Manually specified text snippets
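A sketch of the bookkeeping behind all four strategies: each one yields several small texts per document, each embedded and mapped back to the parent document's id. The `embed` function is a toy deterministic stand-in for a real embedding model:

```python
import hashlib

def embed(text: str) -> list[float]:
    # Toy deterministic "embedding" standing in for a real model.
    h = hashlib.sha256(text.encode()).digest()
    return [b / 255 for b in h[:4]]

def index_document(doc_id: str, doc: str, summary: str,
                   questions: list[str], chunk_size: int = 40):
    """Return (vector, text, parent_id) triples: one per small
    chunk, one for the summary, one per hypothetical question.
    Manually specified snippets would be appended the same way."""
    chunks = [doc[i:i + chunk_size] for i in range(0, len(doc), chunk_size)]
    entries = []
    for text in chunks + [summary] + questions:
        entries.append((embed(text), text, doc_id))
    return entries
```

The key design choice: every vector, however it was generated, carries the parent id, so search happens over the small pieces but results resolve to the full document.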
Quick 🧵
Language models are getting larger and larger context windows
This is great, because you can pass bigger chunks in!
But if you have larger chunks, then a single embedding per chunk can start to fall flat, as there can be multiple distinct topics in that longer passage
One solution is to start creating not one but MULTIPLE embeddings per document
This was the basic realization with our ParentDocumentRetriever ~2 weeks ago, but it's really much more general than that
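At query time the general pattern is always the same: search over the small vectors, then return the parent document they point to. A sketch using cosine similarity over toy vectors (a real retriever would delegate the search to a vector store):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def retrieve_parent(query_vec, entries, docstore):
    """entries: (vector, text, parent_id) triples for the small pieces.
    Find the best-matching small vector, then return the full
    parent document from the docstore."""
    best = max(entries, key=lambda e: cosine(query_vec, e[0]))
    return docstore[best[2]]
```

Swap in smaller chunks, summaries, or hypothetical questions as the small pieces and the lookup code never changes - that's the sense in which this is more general than ParentDocumentRetriever alone.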