Jerry Liu
Apr 30 · 7 tweets · 3 min read
An implication of designing any LLM app over your data is you’re adding “state” (data) to a “stateless” module (LLM).

Stateful apps are hard, and require good storage abstractions.

We’ve thought hard about this with @gpt_index 🧵
Hacking together an initial retrieval-augmented LLM app is super easy: take some documents, chunk them up, put them in a vector db.

But production data requirements make this more challenging.

It’s one thing to build a demo over 5 docs. What about GBs of data from different sources?
Some questions:

How do we store source Documents? Once we split them, how do we store the text chunks?

How do we store metadata, including indices over your data?

How do we store vectors with vector db’s?
The new release of @gpt_index (0.6.0) takes a stab at addressing this:
- We define an underlying KV store abstraction
- We can store Nodes (raw data chunks) and indices in KV store
- In parallel, we maintain vector store abstractions

Full blog: medium.com/@jerryjliu98/l…
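The storage layering described above can be pictured with a toy namespaced KV store. Everything here (class and method names included) is a sketch of the idea, not llama_index’s actual API:

```python
from collections import defaultdict

class SimpleKVStore:
    """Toy namespaced key-value store (illustrative, not the real llama_index class)."""
    def __init__(self):
        self._data = defaultdict(dict)

    def put(self, namespace, key, value):
        self._data[namespace][key] = value

    def get(self, namespace, key):
        return self._data[namespace].get(key)

# Nodes (raw data chunks) and index metadata live in the same KV store,
# just under different namespaces.
kv = SimpleKVStore()
kv.put("docstore", "node-1", {"text": "LLMs are stateless."})
kv.put("index_store", "vector-index", {"node_ids": ["node-1"]})
```

The point of the layering: any backend that can implement `put`/`get` (filesystem, S3, a database) can back both the docstore and the index store.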
There is now a rich ecosystem of vector db providers. Many vector db’s (e.g. @pinecone, @trychroma, @weaviate_io) allow storage of both vectors and docs.

For now they’re separate from our docstore; we have a TODO to explore the overlap.
A key concept is to decouple the raw data from the indexes that we define at the top-level.

An Index in @gpt_index is just a lightweight view over your data; each index solves a different retrieval use case.

You can/should define multiple indices over your data.
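One way to picture “an index is a lightweight view”: each index stores only node IDs plus whatever lookup structure it needs, while the node text lives exactly once in a shared docstore. A toy sketch (not llama_index internals):

```python
# Shared docstore: node text stored exactly once.
docstore = {
    "n1": "Paris is the capital of France.",
    "n2": "The Seine flows through Paris.",
}

# View 1: a keyword index is just word -> set of node IDs.
keyword_index = {}
for node_id, text in docstore.items():
    for word in text.lower().rstrip(".").split():
        keyword_index.setdefault(word, set()).add(node_id)

# View 2: a "list index" view is just an ordered list of node IDs.
list_index = list(docstore)

# Both indices reference the same nodes; defining a second index
# duplicates IDs, not text.
```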
Interested in contributing? We’d LOVE to have your help in building way more document store abstractions: S3, GCS, HDFS, and more.

+ more vector integrations as well.

gpt-index.readthedocs.io/en/latest/how_…


More from @jerryjliu0

Apr 29
LlamaIndex now makes it super easy for you to define custom retrieval for LLMs 💡

Hybrid search is a popular extension of semantic search; let’s walk through an example of how you can define your *own* (simplified) hybrid search with @gpt_index 👇

github.com/jerryjliu/llam…
At its core, hybrid search is a mix of keyword lookup and semantic search.

We show how you can define a custom retriever that can take the intersection of retrieved nodes from the two techniques above.
First, we split a document into nodes and add the nodes to the docstore.

We then define two indexes over the data: a keyword lookup index and a vector index.

Note: defining multiple indexes does not duplicate the data.
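A stripped-down version of the custom hybrid retriever, with toy scoring standing in for embeddings (hypothetical names; see the linked notebook for the real implementation):

```python
def keyword_retrieve(query, docstore):
    """Return IDs of nodes sharing any keyword with the query."""
    q_words = set(query.lower().split())
    return {nid for nid, text in docstore.items()
            if q_words & set(text.lower().split())}

def vector_retrieve(query, docstore, top_k=2):
    """Stand-in for semantic search: score nodes by word overlap, keep top-k."""
    q_words = set(query.lower().split())
    ranked = sorted(docstore,
                    key=lambda nid: -len(q_words & set(docstore[nid].lower().split())))
    return set(ranked[:top_k])

def hybrid_retrieve(query, docstore, top_k=2):
    # The custom retriever takes the INTERSECTION of the two result sets.
    return keyword_retrieve(query, docstore) & vector_retrieve(query, docstore, top_k)

docs = {"a": "the cat sat", "b": "dogs bark loudly", "c": "a cat purrs"}
result = hybrid_retrieve("cat sounds", docs)
```

Swapping the `&` for `|` would give you union-style hybrid search instead.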
Apr 29
It’s official…LlamaIndex 0.6.0.alpha1 is out. And it’s basically a completely new product 🔥

We fundamentally rewrote two main areas:
- 🔍Query interface
- 🗃️Storage abstractions

Full blog post: medium.com/@jerryjliu98/l…

Way too much for one tweet thread but we’ll try! 🧵
[1] There are BIG changes in the following core areas:
- 🔍Decoupling state from compute: separate index (state) from retriever/query (compute)
- 🧱Progressive Disclosure of Complexity: high-level API -> low-level API
- 🫙Principled Storage Abstractions
[2] Decoupling state from compute:
- An index manages state: abstracts away underlying storage, exposes view over processed data
- A Retriever fetches Nodes from an index
- A QueryEngine can synthesize a response from Nodes
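The state/compute split can be sketched as three tiny classes. This is illustrative only — the real llama_index classes have much richer interfaces:

```python
class Index:
    """State: owns the processed nodes (a view over storage)."""
    def __init__(self, nodes):
        self.nodes = nodes

class Retriever:
    """Compute: fetches Nodes from an index for a query."""
    def __init__(self, index):
        self.index = index
    def retrieve(self, query):
        return [n for n in self.index.nodes if query.lower() in n.lower()]

class QueryEngine:
    """Compute: retrieves, then synthesizes a response from Nodes."""
    def __init__(self, retriever):
        self.retriever = retriever
    def query(self, q):
        nodes = self.retriever.retrieve(q)
        return " ".join(nodes) if nodes else "No answer found."

index = Index(["LLMs are stateless modules.", "Indexes manage state."])
engine = QueryEngine(Retriever(index))
```

Because state and compute are decoupled, you can point several different retrievers at the same `Index` without touching storage.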
Apr 26
Evaporate (@simran_s_arora et al.) is an awesome paper on structured data extraction 🙌

Key insight: “function extraction”; synthesize an “extract” fn using an LLM, then apply it across data at scale!

arxiv.org/abs/2304.09433

We added an initial module in @gpt_index! 🛠️👇
The paper proposes two strategies for structured extraction:
- ➡️Evaporate-Direct: LLM directly extracts values from docs (similar to @gpt_index SQL support)
- 🤖Evaporate-Code: LLM synthesizes fn, applies it to docs at scale
We implement a super basic version of Evaporate-Code in @gpt_index, with the following steps from the paper:

1. Schema Identification: extract attributes from docs
2. Function Synthesis: Given attributes, synthesize functions
3. Run functions across docs to get structured data
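The three steps map onto a small pipeline. Here the LLM calls are stubbed with canned outputs so the shape is clear — everything below is illustrative, not the paper’s or llama_index’s actual code:

```python
def llm(prompt):
    # Stub standing in for a real LLM call.
    if "attributes" in prompt:
        return "title, year"
    return "def extract(doc): return {'title': doc.split('.')[0], 'year': doc.split()[-1]}"

# 1. Schema identification: ask the LLM which attributes the docs share.
attributes = [a.strip() for a in llm("List the attributes in these docs").split(",")]

# 2. Function synthesis: ask the LLM to write an extraction function, then exec it.
namespace = {}
exec(llm(f"Write a Python function extracting {attributes}"), namespace)
extract = namespace["extract"]

# 3. Apply the synthesized function across all docs -- no LLM calls needed here,
#    which is what makes Evaporate-Code cheap at scale.
docs = ["Evaporate. Published 2023", "LlamaIndex. Released 2022"]
rows = [extract(d) for d in docs]
```

Step 3 is the payoff: the per-document cost drops to plain Python instead of one LLM call per document.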
Apr 24
💬 Conversational Agent Simulations 🤖

Over this weekend I hacked on getting AI agents to talk in different settings:
🥂 First date
🥼 Doctor checkup
🧑‍💻 Software eng interview

Used some core @gpt_index data structs. Check it out on Llama Lab 🧪! 👇

github.com/run-llama/llam…
Each actor is represented by a simple “ConvoAgent” class, containing short-term and long-term memory.

Long-term memory uses our vector index. Short-term memory is just a deque.

Each agent can 1) store incoming messages 🗃️, and 2) generate messages 🗣️
@gpt_index makes this really easy to do with the following:
- Abstractions for storing/querying long-term memory
- Synthesizing short-term and long-term memory *without* worrying about context limits
- Easily customizable prompts to present different settings
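The memory pattern above can be sketched in a few lines; the vector index is stood in for by a plain list, and the class/parameter names are made up for illustration:

```python
from collections import deque

class ConvoAgent:
    """Toy version of the agent memory pattern: a bounded deque for
    short-term memory, a list standing in for the vector-index long-term memory."""
    def __init__(self, name, short_term_size=3):
        self.name = name
        self.short_term = deque(maxlen=short_term_size)  # only recent messages
        self.long_term = []  # stand-in for a vector index

    def store_message(self, message):
        self.short_term.append(message)   # old messages fall off the deque...
        self.long_term.append(message)    # ...but everything persists long-term

    def generate_message(self):
        # A real agent would synthesize from both memories via an LLM;
        # here we just echo the current short-term context.
        return f"{self.name} recalls: {'; '.join(self.short_term)}"

agent = ConvoAgent("Alice", short_term_size=2)
for msg in ["hi", "how are you", "fine"]:
    agent.store_message(msg)
```

The `maxlen` deque gives you the context-limit safety for free: short-term memory can never grow past the window you choose.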
Apr 17
Super excited to feature TWO exciting AGI projects using @gpt_index 🔥⚡️

🤖 llama_agi: Automatically execute tasks towards a goal!
⚙️ auto_llama: An internet agent to fulfill tasks.

Link: github.com/run-llama/llam…

@gpt_index makes AGI projects straightforward to build. 🧵
At a high-level, @gpt_index is a great tool for AGI dev 🛠️
📕 Index a knowledge corpus
✅ Index a set of tasks
🧠 Use it as a memory module
📝 Synthesize response from data.
Here’s an example with llama_agi. Suppose you set a goal: “What steps can I take to live longer?”

llama_agi can repeatedly reason about and store tasks until the goal is complete!

Sample tasks: develop a plan for medical advice, research effects of diet, research effects of stress, etc.
Apr 16
In LLM retrieval-augmented generation, we want to look not only at the quality of responses, but also at the quality of the source contexts.

Which sources contain the answer to the query? 🤔

Introducing LLM Source Context Evaluation in @gpt_index (s/o @ravithejads, @kar2905) 🔬
The idea is super simple:

Given a list of retrieved sources, feed query + each source to the LLM to see if the answer is contained within that source.

If yes, that source is relevant; if no, that source is irrelevant.

Can be used to get a “precision” metric of retrieval! 🛠️
As a simple example, imagine doing semantic search for “Who is the mayor of New York City?” with a top-k value.

Some sources are relevant; some are irrelevant.

This may tell you if your retrieval model is casting too wide of a net 🥅, which may be inefficient.
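The evaluation loop is easy to sketch. Here the “feed query + source to the LLM” relevance judge is stubbed with substring containment; the function names are hypothetical, not the llama_index evaluator API:

```python
def llm_says_answer_is_in(query, source):
    # Stub for the LLM relevance judge: here, a source is "relevant"
    # if it literally mentions the expected answer.
    return "eric adams" in source.lower()

def retrieval_precision(query, retrieved_sources):
    """Fraction of retrieved sources judged relevant to the query."""
    relevant = sum(llm_says_answer_is_in(query, s) for s in retrieved_sources)
    return relevant / len(retrieved_sources)

sources = [
    "Eric Adams is the mayor of New York City.",
    "New York City has five boroughs.",
    "The mayor, Eric Adams, took office in 2022.",
    "The Hudson River borders Manhattan.",
]
precision = retrieval_precision("Who is the mayor of New York City?", sources)
```

A low precision here is the “too wide a net” signal: you may be retrieving (and paying for) more context than the query needs.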
