Our new approach to self-improving autonomous agents, introduced in the next mod of @babyagi_.
Dig in, shall we?
How it works:
With FOXY (Final Output eXamination from "Yesterday"), we do a final reflection on the output of each run and use it to guide future runs. We pull the most relevant reflections using a similarity search, paired with a decay mechanism that prioritizes recent reflections.
Here's a simplified version of what this looks like.
In this example, we run the same objective twice. The second task list is improved based on notes from the final reflection of the first run.
We pull the most relevant past example(s) for this initial reflection, combined with a decay function to prioritize recent reflections.
Here's the code ChatGPT wrote for me. It seems to work, though I'm sure it can be improved.
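For a rough idea of what the retrieval step could look like, here is a minimal sketch. It is not the code from the thread. Everything named here (`Reflection`, `embed`, `top_reflections`, the one-week half-life) is an assumption for illustration, and the bag-of-words `embed` is only a stand-in for a real embedding model.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass
class Reflection:
    objective: str    # objective of the run that produced this reflection
    text: str         # the final reflection itself
    timestamp: float  # when the run finished (seconds since epoch)


def embed(text: str) -> Counter:
    # Assumption: a bag-of-words count vector stands in for a real embedding.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def decay(age_seconds: float, half_life_seconds: float) -> float:
    # Exponential decay: the weight halves every half_life_seconds.
    return 0.5 ** (max(age_seconds, 0.0) / half_life_seconds)


def top_reflections(objective, store, now, k=1, half_life_seconds=7 * 24 * 3600):
    # Score = similarity to the new objective * recency weight.
    query = embed(objective)
    scored = [
        (cosine(query, embed(r.objective)) * decay(now - r.timestamp, half_life_seconds), r)
        for r in store
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop reflections with no similarity at all, then keep the best k.
    return [r for score, r in scored[:k] if score > 0]
```

With this scoring, two reflections on the same objective rank by recency, and a reflection on an unrelated objective is skipped entirely. Multiplying the two factors is one choice among several; a weighted sum would also work.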
Here's an actual example, using the same objective as above.
As you can see, in the first run it creates a simple two-step task list.
Note the "initial reflection" which is pretty basic, and refers mostly to which skills to use.
Here you see it successfully reading the file and making suggestions. Great.
Here's the "final reflection" from this first run.
Notice it mentions more specific suggestions, prioritizing suggestions, and providing explanations.
When we run the same objective again, you see the initial reflection looks very different (guided by the previous "final reflection")...
Resulting in a more comprehensive task list!
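One way a past "final reflection" could be folded into the next run's initial reflection is by prepending it to the prompt. This is only a sketch under assumptions: the function name and the prompt wording are hypothetical, not the prompts used in the actual mod.

```python
def build_initial_reflection_prompt(objective: str, past_reflections: list[str]) -> str:
    # Assumption: past final reflections are passed in as plain strings,
    # already ranked by relevance and recency.
    lines = [f"Objective: {objective}", ""]
    if past_reflections:
        lines.append("Notes from final reflections on similar past runs:")
        lines.extend(f"- {note}" for note in past_reflections)
        lines.append("")
    lines.append("Reflect on how to approach this objective before writing the task list.")
    return "\n".join(lines)
```

On a first run the list is empty and the prompt contains only the objective, which would match the more basic initial reflection seen above.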
Early results show promise, though w/ room for improvement.
Many design choices to make, such as adding human feedback, number of examples to use, depth of reflection, speed of time decay, etc.
Still need to clean up code, will be included in the next @babyagi_ mod!
startup funding discussions, captured on @x, analyzed with @openai, enriched with @exaailabs, daily newsletter using @resend. built with @replit agent
daily newsletter & more
if you just want the automated daily newsletter, subscribe here:
(sample image attached)
random thoughts on the tool and building it below vcpedia.com
this wasn't even something i planned on building but the idea came to me a week ago, and I put other builds* on hold to throw this up (it's when I work fastest)