Harrison Chase Profile picture
@LangChainAI, previously @robusthq @kensho MLOps ∪ Generative AI ∪ sports analytics
6 subscribers
Oct 8, 2024 9 tweets 2 min read
🚀We're launching "long-term memory" support in LangGraph

At its core, long-term memory is "just" a persistent document store that lets you *put*, *get*, and *search* for memories you've saved

Why so simple?

🧵 Image 🧠The idea of memory is tantalizing, but also really vague

What does it even mean for an application to have memory?

Much like agents, there's a lot of hype and interest in this area, without a clear definition of what is actually means
Oct 14, 2023 7 tweets 4 min read
⛓️Chain of Verification

A great new paper from Meta on a prompting technique to reduce hallucinations

🦜🔗Sourajit Roy Chowdhury implemented this in @LangChainAI **along with some improvements**

📃And he wrote a blog on it

🧵Lets dive in (this is why I love the LC community!)


Image
Image
Image
Image
Most important link: the GitHub repo

This is a well documented, well implemented repo - that takes a lot of time

Big 👏 and ⭐️ to Sourajit for not only implementing this paper, but implementing in such a comprehensive and helpful way

github.com/ritun16/chain-…
Sep 21, 2023 13 tweets 6 min read
🤖Agents from scratch

We've rewritten all our 8 agent types using LangChain Expression LangChain and prompts from the Hub

This makes them more modular, understandable, and therefor more customizable

This customizability is crucial for teams looking to go to production

Long 🧵
Image
Image
If you want to jump right into it, we've updated the "Getting Started" page for agents to go over all the individual components

We then show how to create agents from these individual components

Is a great resource to build up a solid base understanding

python.langchain.com/docs/modules/a…
Aug 25, 2023 10 tweets 3 min read
🌲Multi Vector Retriever

The basic idea: you store multiple embedding vectors per document. How do you generate these embeddings?

👨‍👦Smaller chunks (this is ParentDocumentRetriever)
🌞Summary of document
❓Hypothetical questions
🖐️Manually specified text snippets

Quick 🧵 Image Language models are getting larger and larger context windows

This is great, because you can pass bigger chunks in!

But if you have larger chunks, then a single embedding per chunk can start to fall flat, as there can be multiple distinct topics in that longer passage
Aug 15, 2023 7 tweets 3 min read
🚢Benchmarking Question/Answering Over CSV Data

Deep dive on improving an application that does question answering over CSV data:

📜3000 word blog post
🎥30min video
🛌Open sourced eval data
🎬Open sourced code for gathering feedback
🤖Open sourced final agent code

🧵 Image Blog:

YouTube: https://t.co/JxUrrvzBdi

Code & data used: https://t.co/LnQeRsHrNT

Now for a quick thread:blog.langchain.dev/benchmarking-q…

github.com/langchain-ai/l…
Aug 3, 2023 7 tweets 3 min read
💬Conversational Retrieval Agents

The most popular chain in @LangChainAI is the ConversationalRetrievalChain, which allows you chat with your data

Using an agent instead can allow for great flexibility, and its a narrow and well defined enough agent that its fairly reliable

🧵 Image I'll dive into details in this thread, but quick links:

Blog:

Python Docs: https://t.co/v1wLHIuBki

JS Docs: https://t.co/N0hQ90MFyg https://t.co/1eAdJBUnXCblog.langchain.dev/conversational…
python.langchain.com/docs/use_cases…
js.langchain.com/docs/use_cases…
Image
Aug 1, 2023 15 tweets 5 min read
A 🧵on examples of using our new LangChain Expression Language to rewrite some of our most popular chains

Benefits: it's very clear what's going on under the hood, and (most importantly) how to modify them

👇 Image Before jumping in:

(1) We'll be doing a webinar on this tmrw, so come join then for a more in depth walkthrough + Q/A:

(2) There's lots of more chains to rewrite, so if you have good examples (or asks) just comment and I'll add!crowdcast.io/c/ckw1tydg29er
Jul 6, 2023 5 tweets 2 min read
💬ConversationalRetrievalChain Upgrades

One of our more popular chains is the ConversationalRetrievalChain, which allows you to create a retrieval augmented generation chatbot

We've introduced some small but impactful quality of life changes:

🧵 📃Improved Reference Docs

We beefed up our reference documentation to include better docstrings and a more end-to-end example

There's a lot of toggles to play with, hopefully this helps make it more clear what all the parameters are

Docs: api.python.langchain.com/en/latest/chai…
Jul 5, 2023 8 tweets 5 min read
📄Documents x LLMs📄

Combining documents with LLMs is a key part of retrieval and chaining

We've improved our @LangChainAI reference documentation across the 5 major CombineDocumentsChains and helper functions to help with clarity and understanding of how these work

🧵





📄 `format_document`

Want to control which metadata keys show up in the prompt?

This helper function is rarely exposed, but is key to combining documents with LLMs

It takes a Document and formats it into a string using a PromptTemplate

Docs: https://t.co/Xrl5HtvFlvapi.python.langchain.com/en/latest/sche…
Jun 19, 2023 9 tweets 3 min read
⭐️Using `functions` to structure output⭐️

We're starting to add more chains that rely on functions to structure output

Here's a quick overview of how we're doing that, which chains we've added so far, how to contribute, and additional resources

🧵 Although we first incorporated `functions` into agents, an almost more important ability of `functions` is to structure output from ChatGPT

This is extremely useful when you want to use the output of ChatGPT in a particular way
Jun 16, 2023 4 tweets 4 min read
The new @OpenAI functions are good for other things besides agents

Another killer use case is extracting structured information from unstructured docs

We've adding support for extraction AND tagging in @LangChainAI - thanks to @fpingham for code and @jxnlco for review

🧵 ✂️Extraction

Specify a schema - either a dictionary or a Pydantic model - and then extract entities from a piece of text with the same schema

This will return a list of objects with that schema

Docs: python.langchain.com/en/latest/modu… ImageImage
Jun 5, 2023 4 tweets 2 min read
⭐️Composable Prompts⭐️

Wouldn't it be nice if there was a way to compose prompts together, reusing pieces across prompts?

In the newest Python and JS release there now is with `Pipeline Prompt`!

Links 👇 ImageImage The way this works is you define a `PipelinePrompt` with two components:

- FinalPrompt: the final prompt template to be formatted
- PipelinePrompts: a sequence of tuples of (name, PromptTemplate)

The `name` argument is how the formatted prompt will be passed to future prompts
May 31, 2023 5 tweets 3 min read
✂️15+ Code Specific Text Splitters✂️

Just used one of @LangChainAI 's 100+ Document Loaders?

Next step: split data into embeddable chunks.

We now have support for splitting 15+ different coding languages in the optimal way

🧵 ImageImage A underrated part of the preprocessing pipeline, proper splitting of text allows for maintaining semantically meaningful chunks

This is crucial when doing retrieval augmented generation in order to ensure the proper context is inserted into the prompt
May 30, 2023 4 tweets 2 min read
How to speed up "chat-your-data" applications while retaining final answer accuracy?

🫙Use a cheaper/faster model (gpt-3.5) to create the condensed question
💬Use a better but more expensive model (gpt-4) for final response

Thanks to @cristobal_dev for highlighting!

🧵 Image Most "chat-your-data" applications involve three steps:

1⃣Condense the chat history into a standalone question
2⃣Retrieve relevant docs based on the standalone question
3⃣Generate a final answer based on the retrieved documents

This involves two total calls to the LLM!
May 24, 2023 4 tweets 3 min read
Want to be able to easily access open source models behind an API?

@MosaicML has got you covered

Excited to announce an integration of their inference API with @LangChainAI, making it easy to experiment with MPT-7B, Dolly, and embedding models

Quick 🧵 ImageImage My biggest hurdle to using open source models is that I just want to access them behind an API to start

At some point I may want to fine tune them, but to start I don't want to worry about hosting - I just want to get started building!

@MosaicML makes it easy to do so
May 11, 2023 7 tweets 3 min read
Let's use LangChain to analyze LangChain! And show off the power of our self query retriever in the process

"what did they say about prompt injection in the agents in production webinar?"

Here's why vanilla semantic search would mess up the response to this query:

🧵 First, let's assume the library of all documents is all LangChain YouTube videos.

We can load these with the LangChain YouTube DocumentLoader Image
May 10, 2023 6 tweets 3 min read
🗺️ Plan-and-Execute Agents 🗺️

Inspired by BabyAGI and the recent Plan-and-Solve paper, we're introducing a new type of @LangChainAI agent

We think these are better for more complex tasks, at the cost of more calls to the LLM

Blog: blog.langchain.dev/plan-and-execu…

🧵 Up until now, agents in LangChain have followed the algorithm of:

- take user input
- think about action to take
- take action and observe response
- repeat until done

This is great for simple tasks, but for more complex tasks we've noticed the agent losing focus
May 4, 2023 5 tweets 2 min read
🔀Router Chains🔀

A simple (yet much requested) abstraction that started with a @ShreyaR pr months ago and is finally in @LangChainAI!

- Router Chain does classification to choose sub chain to use
- Call the selected chain with that input

Lots of potential use cases!

🧵 ImageImage Why add this?

We view this as a basic building block for constructing chains

Just as you could want to run a sequence of chains sequentially, there's also the basic building block of forking and routing to the correct chain

What are some of the use cases?
May 3, 2023 5 tweets 2 min read
🔧Structured Tools🔧

Agents are all about the tools you give it

Tools in @LangChainAI used to just take a single string input. In our new release, tools can now take multiple inputs. We also introduce a new agent type for these tools

Blog: blog.langchain.dev/structured-too…

🧵 When we started @LangChainAI, LLMs were simple enough that we made tools just a single string input (they couldn't handle anything more complex)

Now models are good enough that they can handle more complex tool interfaces. Accordingly, we're updating the tool interface
Apr 27, 2023 9 tweets 3 min read
Retrieval for QA systems is hard

Vector search is good for capturing semantically similar texts, but often queries specify desired attributes like time, authorship, or other "metadata" fields, which vector search is not great at

Enter... ⭐️SelfQueryRetriever⭐️ ImageImageImage The basic idea of SelfQueryRetriever is simple: given a user query, use an LLM to extract:

1. The `query` string to use for vector search
2. A metadata filter to pass in as well

Most vector databases support metadata filters, so this doesn't require any new databases or indexes
Apr 21, 2023 9 tweets 4 min read
⭐️Contextual Compression⭐️

We introduce multiple new methods in @LangChainAI to compress retrieved documents w.r.t. the query before passing to an LLM for generation

Inspired by @willpienaar at the "LLMs in production" conference

Blog: blog.langchain.dev/improving-docu…

🧵More details: ⛩️Introduction

The key step in "chat-your-data" or "document question-answering" applications is a retrieval step, which fetches relevant documents and inserts them into a prompt to pass to the LLM

See diagram below Image