Professional AI Engineer.
Sharing what I'm currently learning, mostly about AI, LLMs, RAG, building AI-powered software, AI automation, etc.
Feb 4, 2024 • 4 tweets • 2 min read
Introducing LlamaBot 🔥👇
An open-source Discord bot that listens to your conversations, remembers them, and answers your questions across a Discord server, created using @llama_index (inspired by @seldo's LlamaBot for Slack)
- We can ask LlamaBot questions about what's going on across the server
- We can tell LlamaBot to start/stop listening to conversations.
- We can check current listening status, or ask the bot to forget everything from the server.
Jan 25, 2024 • 10 tweets • 4 min read
The "Dense X Retriever" paper shows that it significantly outperforms the traditional chunk-based retriever
@LoganMarkewich created an awesome LlamaPack that lets you get started with this proposition-based retriever in no time using @llama_index 🔥
Let's see how it works 👇🧵
@LoganMarkewich @llama_index Passage- and sentence-based retrieval both have their limitations.
Though passages contain more context, they often include extraneous details, which can distract the LLM during response synthesis.
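A minimal sketch of using that LlamaPack (assuming the pack name DenseXRetrievalPack, a local ./data folder, and an example query — names and paths are illustrative):

```python
from llama_index import SimpleDirectoryReader
from llama_index.llama_pack import download_llama_pack

# Load the source documents (path is just an example)
documents = SimpleDirectoryReader("./data").load_data()

# Download the Dense X Retrieval LlamaPack and instantiate it.
# Internally it generates propositions from each chunk with an LLM
# and indexes those instead of the raw passages.
DenseXRetrievalPack = download_llama_pack("DenseXRetrievalPack", "./dense_pack")
dense_pack = DenseXRetrievalPack(documents)

# Query as usual — retrieval happens over propositions,
# but the original text is used for synthesis.
response = dense_pack.run("What does the paper propose?")
print(response)
```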
Jan 10, 2024 • 4 tweets • 2 min read
Previously I've talked about the amazing Ingestion Pipeline from @llama_index.
Here's how to use Redis (@Redisinc) as the docstore, vectorstore and cache for the pipeline.
LlamaIndex abstractions make it really easy to just use Redis for the entire pipeline 🔥👇
@llama_index @Redisinc We need to pass the following arguments to the ingestion pipeline (a minimal sketch follows below):
- cache: an IngestionCache that wraps a RedisCache instance
- docstore: an instance of RedisDocumentStore
- vector_store: an instance of RedisVectorStore
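A minimal sketch of wiring Redis into the pipeline, assuming a local Redis at localhost:6379 and example namespace/index names:

```python
from llama_index.embeddings import OpenAIEmbedding
from llama_index.ingestion import IngestionPipeline, IngestionCache
from llama_index.ingestion.cache import RedisCache
from llama_index.storage.docstore import RedisDocumentStore
from llama_index.text_splitter import SentenceSplitter
from llama_index.vector_stores import RedisVectorStore

pipeline = IngestionPipeline(
    transformations=[
        SentenceSplitter(),   # split documents into nodes
        OpenAIEmbedding(),    # embed each node
    ],
    docstore=RedisDocumentStore.from_host_and_port(
        "localhost", 6379, namespace="document_store"
    ),
    vector_store=RedisVectorStore(
        index_name="my_index", redis_url="redis://localhost:6379"
    ),
    cache=IngestionCache(
        cache=RedisCache.from_host_and_port("localhost", 6379),
        collection="redis_cache",
    ),
)

# Re-running the pipeline on unchanged documents becomes a near no-op
# thanks to the docstore (dedup) and the cache.
nodes = pipeline.run(documents=documents)
```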
Nov 23, 2023 • 11 tweets • 4 min read
Multi-Modal AI is rapidly taking over 🔥👇
It's truly amazing how fast @llama_index incorporated a robust pipeline for multi-modal RAG capabilities.
Here's a beginner-friendly guide to get started with multi-modal RAG using LlamaIndex 👇🧵
@llama_index First let's start with some simple stuff.
We just want to ask questions about our images.
OpenAIMultiModal is a wrapper around OpenAI's latest vision model that lets us do exactly that.
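A minimal sketch, assuming a local ./images folder and the GPT-4V model name available at the time:

```python
from llama_index import SimpleDirectoryReader
from llama_index.multi_modal_llms.openai import OpenAIMultiModal

# Load a folder of images as ImageDocuments (path is just an example)
image_documents = SimpleDirectoryReader("./images").load_data()

# Wrapper around OpenAI's vision model
openai_mm_llm = OpenAIMultiModal(model="gpt-4-vision-preview", max_new_tokens=300)

# Ask a question directly about the images
response = openai_mm_llm.complete(
    prompt="Describe the images as an alternative text",
    image_documents=image_documents,
)
print(response)
```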
Oct 27, 2023 • 13 tweets • 4 min read
Previously we've seen how to improve retrieval by finetuning an embedding model.
@llama_index also supports finetuning an adapter on top of existing models, which lets us improve retrieval without updating our existing embeddings.
Let's see how it works 👇🧵
@llama_index For adapters, we go through the layers of the transformer and add new, randomly initialized weights.
Then, instead of finetuning all the weights, we freeze the weights of the pre-trained model and finetune only the newly added weights.
We apply a similar technique here 👇
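A minimal sketch of the adapter finetuning flow, assuming a pre-generated QA dataset (train_dataset.json) and bge-small-en as the frozen base model — both are just example choices:

```python
from llama_index.embeddings import resolve_embed_model
from llama_index.finetuning import (
    EmbeddingAdapterFinetuneEngine,
    EmbeddingQAFinetuneDataset,
)

# Synthetic (question, relevant-chunk) pairs generated beforehand
train_dataset = EmbeddingQAFinetuneDataset.from_json("train_dataset.json")

# Frozen base embedding model; only the adapter on top gets trained
base_embed_model = resolve_embed_model("local:BAAI/bge-small-en")

finetune_engine = EmbeddingAdapterFinetuneEngine(
    train_dataset,
    base_embed_model,
    model_output_path="model_output_test",
    epochs=4,
    verbose=True,
)
finetune_engine.finetune()

# The finetuned model applies the trained adapter on top of the frozen
# base embeddings, so existing document embeddings stay untouched.
embed_model = finetune_engine.get_finetuned_model()
```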
Oct 19, 2023 • 11 tweets • 4 min read
Extract tables from documents using the @llama_index UnstructuredElementNodeParser, then use RecursiveRetriever to enable hybrid tabular/semantic queries as well as comparisons over multiple docs.
Let's see how to use this advanced RAG technique 🧵👇
@llama_index First we load the documents.
Then we create the new UnstructuredElementNodeParser from LlamaIndex.
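A minimal sketch of the full flow, assuming documents live in a local ./docs folder and the query is just an example:

```python
from llama_index import SimpleDirectoryReader, VectorStoreIndex
from llama_index.node_parser import UnstructuredElementNodeParser
from llama_index.query_engine import RetrieverQueryEngine
from llama_index.retrievers import RecursiveRetriever

# Load the documents (e.g. filings with embedded tables)
documents = SimpleDirectoryReader("./docs").load_data()

# Split each document into text nodes and table nodes;
# tables get an LLM-generated summary used as the index node.
node_parser = UnstructuredElementNodeParser()
raw_nodes = node_parser.get_nodes_from_documents(documents)
base_nodes, node_mappings = node_parser.get_base_nodes_and_mappings(raw_nodes)

# Index the base nodes (text chunks + table summaries)
vector_index = VectorStoreIndex(base_nodes)
vector_retriever = vector_index.as_retriever(similarity_top_k=1)

# RecursiveRetriever follows references from a retrieved table summary
# down to the underlying table, enabling tabular questions.
recursive_retriever = RecursiveRetriever(
    "vector",
    retriever_dict={"vector": vector_retriever},
    node_dict=node_mappings,
    verbose=True,
)
query_engine = RetrieverQueryEngine.from_args(recursive_retriever)
response = query_engine.query("What was the total revenue in 2022?")
```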
Oct 9, 2023 • 10 tweets • 4 min read
Finetuning the embedding model can allow for more meaningful embedding representations, leading to better retrieval performance.
@llama_index has an abstraction for finetuning sentence-transformers embedding models that makes this process quite seamless.
Let's see how it works 👇
@llama_index Finetuning means updating the model weights themselves over a data corpus to make the model work better for specific use-cases.
E.g. for embedding ArXiv papers, we want the embeddings to align semantically with the concepts and not with filler words like "This paper is…".
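A minimal sketch, assuming train_nodes / val_nodes have already been parsed from our own corpus and using bge-small-en as an example base model:

```python
from llama_index.finetuning import (
    SentenceTransformersFinetuneEngine,
    generate_qa_embedding_pairs,
)

# Generate synthetic (question, context) pairs from our own nodes
# using an LLM; this becomes the training corpus.
train_dataset = generate_qa_embedding_pairs(train_nodes)
val_dataset = generate_qa_embedding_pairs(val_nodes)

finetune_engine = SentenceTransformersFinetuneEngine(
    train_dataset,
    model_id="BAAI/bge-small-en",        # base sentence-transformers model
    model_output_path="finetuned_model",
    val_dataset=val_dataset,
)
finetune_engine.finetune()

# Drop-in replacement embedding model for indexing and querying
embed_model = finetune_engine.get_finetuned_model()
```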
Oct 2, 2023 • 7 tweets • 3 min read
Multi Document Agent architecture (v0) in @llama_index, a step beyond naive top-k RAG.
It allows answering a broader set of questions over multiple documents, which wasn't possible with basic RAG.
Let's break down the agent architecture and see how it works 👇🧵
Architecture:
- For each document, a VectorIndex is created for semantic search, and a SummaryIndex is created for summarization
- Then we create QueryEngine for both these Indices
- Next, the QueryEngines are converted to QueryEngineTools (see the sketch below)
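A minimal sketch of the per-document part, assuming documents_by_title maps a document title to its loaded documents (the top-level agent that routes across these per-document agents is omitted here):

```python
from llama_index import SummaryIndex, VectorStoreIndex
from llama_index.agent import OpenAIAgent
from llama_index.tools import QueryEngineTool, ToolMetadata

# One agent per document, each with a vector tool and a summary tool
doc_agents = {}
for title, docs in documents_by_title.items():
    vector_index = VectorStoreIndex.from_documents(docs)
    summary_index = SummaryIndex.from_documents(docs)

    query_engine_tools = [
        QueryEngineTool(
            query_engine=vector_index.as_query_engine(),
            metadata=ToolMetadata(
                name="vector_tool",
                description=f"Useful for specific questions about {title}",
            ),
        ),
        QueryEngineTool(
            query_engine=summary_index.as_query_engine(),
            metadata=ToolMetadata(
                name="summary_tool",
                description=f"Useful for summarization questions about {title}",
            ),
        ),
    ]
    doc_agents[title] = OpenAIAgent.from_tools(query_engine_tools, verbose=True)
```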
Sep 29, 2023 • 10 tweets • 4 min read
We've seen that smaller chunks are good for capturing semantic meaning and larger ones are good for providing better context.
@llama_index AutoMergingRetriever takes it one step further by keeping the chunks in a tree structure and dynamically merging retrieved leaf chunks into their larger parent chunks. 🧵👇
The first step here is parsing via the HierarchicalNodeParser.
It stores the nodes in a tree structure, where deeper nodes are smaller chunks and shallower nodes are larger chunks.
We can specify how many layers of nodes we want and the chunk size for each layer.
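A minimal sketch, assuming documents are already loaded and using example chunk sizes:

```python
from llama_index import StorageContext, VectorStoreIndex
from llama_index.node_parser import HierarchicalNodeParser, get_leaf_nodes
from llama_index.retrievers import AutoMergingRetriever

# Three layers of chunks: 2048 -> 512 -> 128 (example sizes)
node_parser = HierarchicalNodeParser.from_defaults(chunk_sizes=[2048, 512, 128])
nodes = node_parser.get_nodes_from_documents(documents)
leaf_nodes = get_leaf_nodes(nodes)

# All nodes go into the docstore; only the leaves get embedded
storage_context = StorageContext.from_defaults()
storage_context.docstore.add_documents(nodes)
base_index = VectorStoreIndex(leaf_nodes, storage_context=storage_context)

# If enough sibling leaves are retrieved, they get merged
# into their (larger) parent chunk automatically.
base_retriever = base_index.as_retriever(similarity_top_k=6)
retriever = AutoMergingRetriever(base_retriever, storage_context, verbose=True)
retrieved_nodes = retriever.retrieve("What did the author do growing up?")
```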
Aug 26, 2023 • 13 tweets • 4 min read
Previously we've seen @LangChainAI ParentDocumentRetriever, which creates smaller chunks from a document and links them back to the original document during retrieval.
MultiVectorRetriever is a more customizable version of that. Let's see how to use it 🧵👇
@LangChainAI ParentDocumentRetriever automatically creates the small chunks and links them to their parent document id.
If we want to create additional vectors for each document, beyond the smaller chunks, we can do that and then retrieve through them using MultiVectorRetriever.
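A minimal sketch of the small-chunks variant, assuming docs is a list of already-loaded Documents (LLM-generated summaries or hypothetical questions could be added as extra vectors in exactly the same way):

```python
import uuid

from langchain.embeddings import OpenAIEmbeddings
from langchain.retrievers.multi_vector import MultiVectorRetriever
from langchain.storage import InMemoryStore
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import Chroma

# Small chunks go into the vectorstore, full parent docs into the docstore
vectorstore = Chroma(
    collection_name="full_documents", embedding_function=OpenAIEmbeddings()
)
docstore = InMemoryStore()
id_key = "doc_id"
retriever = MultiVectorRetriever(
    vectorstore=vectorstore, docstore=docstore, id_key=id_key
)

doc_ids = [str(uuid.uuid4()) for _ in docs]
child_splitter = RecursiveCharacterTextSplitter(chunk_size=400)

# Each small chunk carries the id of its parent document
sub_docs = []
for i, doc in enumerate(docs):
    for chunk in child_splitter.split_documents([doc]):
        chunk.metadata[id_key] = doc_ids[i]
        sub_docs.append(chunk)

retriever.vectorstore.add_documents(sub_docs)
retriever.docstore.mset(list(zip(doc_ids, docs)))

# Retrieval searches the small vectors but returns the parent documents
results = retriever.get_relevant_documents("What is the main topic?")
```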
While splitting the raw text for Retrieval Augmented Generation (RAG), what should be the ideal length of each chunk? What's the sweet spot?
Strike a balance between small vs large chunks using @LangChainAI ParentDocumentRetriever
Let's see how to use it 👇🧵
The issue:
- smaller chunks capture semantic meaning more accurately once embedded
- but they can lose the bigger picture and sound out of context, making it difficult for the LLM to properly answer the user's query with limited context per chunk (ParentDocumentRetriever addresses this; see the sketch below)
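A minimal sketch, assuming docs is a list of already-loaded Documents and using example chunk sizes:

```python
from langchain.embeddings import OpenAIEmbeddings
from langchain.retrievers import ParentDocumentRetriever
from langchain.storage import InMemoryStore
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import Chroma

# Small child chunks are embedded for accurate semantic search;
# larger parent chunks are what actually gets returned to the LLM.
parent_splitter = RecursiveCharacterTextSplitter(chunk_size=2000)
child_splitter = RecursiveCharacterTextSplitter(chunk_size=400)

vectorstore = Chroma(
    collection_name="split_parents", embedding_function=OpenAIEmbeddings()
)
store = InMemoryStore()

retriever = ParentDocumentRetriever(
    vectorstore=vectorstore,
    docstore=store,
    child_splitter=child_splitter,
    parent_splitter=parent_splitter,
)
retriever.add_documents(docs)

# Search happens over small chunks, but parent chunks come back
results = retriever.get_relevant_documents("What is discussed about chunk size?")
```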