🧵🚀 Following my last thread on "in-context learning", now it's time to explain how we can digest our custom data so that LLMs 🤖 can use it. Spoiler alert: @LangChainAI 🦜 🔗 and a vector store like @pinecone 🌲 will do all the work for us.
1/12 This is a laser-focused thread 🧵 for devs and software engineers. Even if you have zero AI knowledge (like I did just 6 months ago), I'll be simplifying key data concepts for any GenAI application💡
2/12 Let's talk custom data digestion for LLMs 🤖
First off: Embedding models. These condense complex data into meaningful vectors, capturing relationships and semantic meaning. Think of it as a black box for text ➡ vector conversion. (vector = list of floats)
3/12 There are many embedding models, e.g. @googlecloud ☁️ VertexAI Embeddings. Each has its pros and cons, considering cost, storage, latency, and other factors. The main idea: transform data into vectors representing semantic meaning.
4/12 It is important to note that semantically related vectors will be "located" close to each other in the vector space.
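To make "close in vector space" concrete, here's a tiny sketch using cosine similarity. The 3-dimensional vectors are made-up toy values (real embedding models output hundreds of dimensions), purely to illustrate that related concepts score higher:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: close to 1.0 means "pointing the same way"
    # (semantically related), close to 0 means unrelated.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dim "embeddings" (real models output e.g. 768 dims).
dog = [0.9, 0.8, 0.1]
puppy = [0.85, 0.75, 0.15]   # semantically close to "dog"
invoice = [0.05, 0.1, 0.95]  # unrelated concept

print(cosine_similarity(dog, puppy))    # high (~0.99)
print(cosine_similarity(dog, invoice))  # low  (~0.19)
```

Same idea at 768 dimensions, just more math per comparison.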
How does this work? Well… a lot of math that I don't really care about as a dev lol 😂 I just use the black box and let Google Vertex do its magic for me! 🎩
5/12 Now, if we take all the HTML documentation of a Python package (like @LangChainAI ) and embed it, we end up with a bunch of vectors, each representing a different doc page. If we used @googlecloud ☁️ VertexAI embeddings, each vector's size (its dimension) would be 768.
6/12 So what do we do with these vectors? On their own, there isn't much we can do. They're just lists of numbers. But that's where a cloud-based ☁️ managed vectorstore like @pinecone 🌲 comes in, which saves us from storing hundreds of GBs of vectors on our own machines!
7/12 In @pinecone 🌲, we can create an index with the same dimension as our embeddings model (768 in this example). Then, we just need to iterate over the vectors we got back from the embedding model and upsert them into the vector store. Simple! 👌
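Here's a minimal in-memory stand-in for that index/upsert flow. `ToyIndex` is hypothetical, not the real Pinecone client (whose API differs), but it shows the two things that matter: the index has a fixed dimension, and upserting is just iterating over (id, vector) pairs:

```python
# Hypothetical in-memory vector index, for illustration only.
class ToyIndex:
    def __init__(self, dimension):
        self.dimension = dimension
        self.vectors = {}

    def upsert(self, vector_id, vector):
        # A real index rejects vectors whose size doesn't match its dimension.
        if len(vector) != self.dimension:
            raise ValueError(f"expected dim {self.dimension}, got {len(vector)}")
        self.vectors[vector_id] = vector  # insert-or-update by id

# Dimension matches the embedding model (768 for VertexAI in this example).
index = ToyIndex(dimension=768)
embedded_docs = {"page-1": [0.01] * 768, "page-2": [0.02] * 768}
for doc_id, vec in embedded_docs.items():
    index.upsert(doc_id, vec)

print(len(index.vectors))  # 2
```

With the managed version, the loop looks the same; the storage just lives in the cloud ☁️ instead of a Python dict.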
8/12 Vectorstores like @pinecone🌲 offer semantic search functionality to find vectors close to our query vector. These semantically related vectors hold the info our LLM needs to answer.
9/12 How does this semantic search work? Complex algorithms and heavy math that, again, I don't really need to know as a lazy dev. Thank you @pinecone engineers for abstracting this away for me! 🙏
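Conceptually, semantic search is "find the stored vectors nearest to my query vector." Here's a brute-force sketch over a toy index (real vectorstores use approximate-nearest-neighbor indexes to make this fast at scale; the toy vectors here are invented for illustration):

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def semantic_search(query_vec, index, top_k=2):
    # Brute force: score every stored vector against the query, keep the best.
    scored = [(cosine(query_vec, vec), doc_id) for doc_id, vec in index.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:top_k]]

index = {
    "intro-page": [0.9, 0.1, 0.0],
    "api-page":   [0.1, 0.9, 0.1],
    "faq-page":   [0.2, 0.1, 0.9],
}
query = [0.88, 0.15, 0.05]  # toy "embedding" of the user's question
print(semantic_search(query, index, top_k=1))  # ['intro-page']
```

The returned doc ids point at the chunks whose text we'll feed the LLM as context.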
10/12 Enter @LangChainAI 🦜 🔗. This open source framework automates this entire process, doing the heavy lifting for us. With just one line of code, you can invoke a function that handles data embedding and inserts all the created vectors into @pinecone . Easy as pie! 🥧
11/12 @LangChainAI 🦜 🔗 has so much more to offer, making our lives easier when developing production-grade LLM-powered applications. IMO, it's the go-to open-source framework for developing such apps. Check out their docs here: python.langchain.com/en/latest/
1/17🧵Demystifying LLM memory🧠 mega thread featuring @LangChainAI 🦜🔗
In this thread I will cover the most popular real-world approaches for integrating memory into our GenAI applications 🤖
2/17 THE GIST:
Memory is basically in-context learning. It's just passing extra context, our conversation (or the relevant parts of it), to the LLM in addition to our query. We augment our prompt with history, giving the LLM ad-hoc memory-like abilities such as coreference resolution.
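The "augment the prompt with history" trick can be sketched in a few lines. This is a minimal toy buffer, not LangChain's actual memory classes (which do roughly this plus trimming/summarization):

```python
# Toy conversation memory: prepend the running history to every new prompt.
class ConversationBuffer:
    def __init__(self):
        self.turns = []

    def add_turn(self, speaker, text):
        self.turns.append(f"{speaker}: {text}")

    def build_prompt(self, new_query):
        # The LLM itself is stateless; "memory" is just this string concat.
        history = "\n".join(self.turns)
        return f"{history}\nHuman: {new_query}\nAI:"

memory = ConversationBuffer()
memory.add_turn("Human", "@hwchase17 just tweeted about @LangChainAI.")
memory.add_turn("AI", "Nice, what did he write?")
prompt = memory.build_prompt("What did he tweet about?")
print(prompt)
# The history in the prompt is what lets the LLM resolve "he" -> @hwchase17.
```

Send `prompt` to any chat-less completion endpoint and you get chat-like behavior.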
Coreference resolution:
When someone says "@hwchase17 just tweeted. He wrote about @LangChainAI ," we effortlessly understand that "he" refers to @hwchase17 based on our coreference resolution skills. It's a cognitive process that enables effective communication & understanding
0/12 📢🧵Unpopular Opinion thread - Vectorstores are here to stay! 🔐🚀
I've noticed a lot of tweets lately discussing how #LLMs with larger context windows will make vector databases obsolete. However, I respectfully disagree. Here's why:
1/12 @LangChainAI 🦜🔗 @pinecone 🌲 @weaviate_io @elastic @Redisinc @milvusio let me know what you think😎 I think you will like this.
2/12: Too much context hurts performance. As the context window expands, #LLMs can "forget" information from the beginning of the prompt. With contexts larger than ~50k tokens, this becomes a challenge.
1/14🧵Real world CHUNKING best practices thread:
🔍 A common question I get is: "How should I chunk my data and what's the best chunk size?" Here's my opinion based on my experience with @LangChainAI 🦜🔗and building production grade GenAI applications.
2/14 Chunking is the process of splitting long pieces of text into smaller, hopefully semantically meaningful chunks. It's essential when dealing with large text inputs, as LLMs have limits on the number of tokens that can be processed at once (4k, 8k, 16k, 100k).
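Here's the simplest possible chunker: fixed size with overlap. This is a hand-rolled sketch of the idea, not LangChain's actual splitter code (its splitters additionally try to break on separators like paragraphs and sentences); the sizes are arbitrary for the demo:

```python
# Naive fixed-size chunking with overlap (illustrative, not LangChain's code).
def chunk_text(text, chunk_size=100, overlap=20):
    # Overlap keeps shared context between neighbouring chunks, so a
    # sentence cut at a boundary still appears (partly) in both.
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

doc = "LangChain is a framework for developing LLM-powered applications. " * 10
chunks = chunk_text(doc, chunk_size=100, overlap=20)
print(len(chunks), len(chunks[0]))
```

Each of these chunks is what actually gets embedded and upserted; chunk size is a trade-off between retrieval precision (small chunks) and context richness (big chunks).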
3/14 Eventually, we store all chunks in a vectorstore like @pinecone🌲, perform similarity search on them, then use the results as context for the LLM.
1/13 🧵💡 Ever wondered how to handle token limitations of LLMs? Here's one strategy of the "map-reduce" technique implemented in @LangChainAI 🦜🔗
Let's dive deep! @hwchase17, your PR is under review again😎
2/13 MapReduce is not new. Famously introduced by @Google , it's a programming model that allows for the processing and generation of large data sets with a parallel, distributed algorithm.
3/13 In essence, it divides work into small parts that can be done simultaneously (the "mapping") and then merges the intermediate results back into one final result (the "reducing").
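Applied to LLM token limits, the pattern looks like this skeleton. Note `summarize()` here is a deterministic fake standing in for a real LLM call; the real chain would make one LLM call per chunk (the map, parallelizable) and one more over the combined partial summaries (the reduce):

```python
# Toy map-reduce summarization skeleton; summarize() fakes an LLM call.
def summarize(text):
    # Fake "LLM": keep just the first sentence as the summary.
    return text.split(".")[0] + "."

def map_reduce_summarize(chunks):
    # Map: summarize each chunk independently (each fits the token limit).
    partial_summaries = [summarize(chunk) for chunk in chunks]
    # Reduce: merge the intermediate summaries and summarize once more.
    combined = " ".join(partial_summaries)
    return summarize(combined)

chunks = [
    "LangChain wraps LLM APIs. It also has many integrations.",
    "Pinecone stores vectors. It offers semantic search.",
]
print(map_reduce_summarize(chunks))
```

No single call ever sees the full document, which is exactly how you summarize inputs bigger than the context window.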
1/8 🚀 Let's go step by step on a "Chat with your Repo" assistant powered by @LangChainAI🦜🔗 and @pinecone🌲, all running smoothly on @googlecloud ☁️ Run. This was demoed at yesterday's HUGE @googlecloud @pinecone event in Tel Aviv 🇮🇱
2/8 Step 1? Vectorize your repository files. Using @googlecloud VertexAI embeddings and a couple of lines of @LangChainAI , you simply ingest these vectors into the @pinecone vectorstore.
3/8 Now we use @googlecloud VertexAI embeddings along with context retrieved from @pinecone to augment the user's original prompt to the @googlecloud PaLM 2 LLM. This is called in-context learning. With @LangChainAI , it's again just a couple of lines of code.
1/6🌐💡Singularity is here? Just read this blog from @LangChainAI 🦜🔗 featuring @itstimconnors on multi-agent simulation. IMO it's amazing to witness how a few "hacks" such as a memory system + some prompt engineering can simulate human-like behavior 🤖
2/6 inspired by @Stanford 's "Generative Agents" paper-
Every agent in a GPTeam simulation has its unique personality, memories, and directives, creating human-like behavior👥
3/6 📚💬 "The appearance of an agentic human-like entity is an illusion. Created by a memory system and a few distinct Language Model prompts." - from the GPTeam blog. This ad-hoc human-like behavior is mind-blowing🤯🤯🤯