Combining documents with LLMs is a key part of retrieval and chaining
We've improved the @LangChainAI reference documentation for the 5 major CombineDocumentsChains and their helper functions, to make it clearer how each one works
🧵
📄 `format_document`
Want to control which metadata keys show up in the prompt?
This helper function is rarely called directly, but it's key to combining documents with LLMs
It takes a Document and formats it into a string using a PromptTemplate
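Roughly what that looks like in Python (a minimal sketch; the document contents are made up, and the exact import path for `format_document` can vary by version):

```python
from langchain.prompts import PromptTemplate
from langchain.schema import Document
from langchain.schema.prompt_template import format_document  # import path may differ by version

doc = Document(
    page_content="LangChain makes it easy to combine documents with LLMs.",
    metadata={"source": "docs/combine.md", "page": 3},
)

# Only the variables named in the template are pulled from the document:
# `page_content` plus whichever metadata keys you reference.
doc_prompt = PromptTemplate.from_template("[{source}] {page_content}")

print(format_document(doc, doc_prompt))
# [docs/combine.md] LangChain makes it easy to combine documents with LLMs.
```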
📄 `StuffDocumentsChain`
The most basic CombineDocumentsChain: it takes N documents, formats each one into a string with a PromptTemplate via `format_document`, stuffs them all into a single prompt, and passes that to an LLM
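Wired up, it looks something like this (a sketch; the model, prompts, and `docs` list are placeholders):

```python
from langchain.chains import LLMChain, StuffDocumentsChain
from langchain.chat_models import ChatOpenAI
from langchain.prompts import PromptTemplate

# How each individual Document gets rendered (via format_document)
document_prompt = PromptTemplate.from_template("{page_content}")

# The final prompt; the joined document strings are inserted as {context}
prompt = PromptTemplate.from_template("Summarize the following:\n\n{context}")

chain = StuffDocumentsChain(
    llm_chain=LLMChain(llm=ChatOpenAI(temperature=0), prompt=prompt),
    document_prompt=document_prompt,
    document_variable_name="context",
)

summary = chain.run(docs)  # `docs` is a list of Documents you've already loaded
```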
📄 `MapReduceDocumentsChain`
This one takes an LLMChain and a ReduceDocumentsChain. It first applies the LLMChain to each document individually, then passes all the results to the ReduceDocumentsChain to combine into a final output
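Sketched in code (prompts and model here are illustrative, and import paths can differ between releases):

```python
from langchain.chains import (
    LLMChain,
    MapReduceDocumentsChain,
    ReduceDocumentsChain,
    StuffDocumentsChain,
)
from langchain.chat_models import ChatOpenAI
from langchain.prompts import PromptTemplate

llm = ChatOpenAI(temperature=0)

# Map step: run on each document independently
map_chain = LLMChain(
    llm=llm,
    prompt=PromptTemplate.from_template("Summarize this passage:\n\n{page_content}"),
)

# Reduce step: stuff all the per-document summaries into one final prompt
reduce_chain = ReduceDocumentsChain(
    combine_documents_chain=StuffDocumentsChain(
        llm_chain=LLMChain(
            llm=llm,
            prompt=PromptTemplate.from_template("Combine these summaries:\n\n{context}"),
        ),
        document_variable_name="context",
    )
)

chain = MapReduceDocumentsChain(
    llm_chain=map_chain,
    reduce_documents_chain=reduce_chain,
    document_variable_name="page_content",  # where each doc goes in the map prompt
)

result = chain.run(docs)  # `docs` is a list of Documents you've already loaded
```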
Specify a schema and tag a document with those attributes
As opposed to Extraction, this pulls out only one instance of the schema, so it's more useful for classifying attributes that pertain to the text as a whole
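A minimal sketch with `create_tagging_chain` (the schema and example text are made up; tagging relies on function calling, so a function-calling-capable model is needed):

```python
from langchain.chains import create_tagging_chain
from langchain.chat_models import ChatOpenAI

# Each property becomes an attribute the model tags the whole text with
schema = {
    "properties": {
        "sentiment": {"type": "string", "enum": ["positive", "neutral", "negative"]},
        "language": {"type": "string"},
    },
    "required": ["sentiment", "language"],
}

chain = create_tagging_chain(schema, ChatOpenAI(temperature=0))

chain.run("Estoy increiblemente contento de haberte conocido!")
# -> e.g. {"sentiment": "positive", "language": "Spanish"}
```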
Wouldn't it be nice if there was a way to compose prompts together, reusing pieces across prompts?
In the newest Python and JS release there now is with `PipelinePrompt`!
Links 👇
The way this works is you define a `PipelinePrompt` with two components:
- FinalPrompt: the final prompt template to be formatted
- PipelinePrompts: a sequence of tuples of (name, PromptTemplate)
The `name` argument is how the formatted prompt will be passed to future prompts
When `.format` is called, the PipelinePrompts are first formatted in order, and are then used in future formatting steps with their respective `name` arguments
Finally, `FinalPrompt.format` is called, using any previously formatted values as necessary
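In Python that looks roughly like this (a sketch using `PipelinePromptTemplate`; the prompts themselves are made up):

```python
from langchain.prompts import PromptTemplate
from langchain.prompts.pipeline import PipelinePromptTemplate

# Final prompt: stitches together the named pieces below
final_prompt = PromptTemplate.from_template(
    """{introduction}

{example}

{start}"""
)

# Each piece is formatted first; its output is then available
# to later prompts (and the final prompt) under its name
introduction = PromptTemplate.from_template("You are impersonating {person}.")
example = PromptTemplate.from_template(
    "Here's an example interaction:\nQ: {example_q}\nA: {example_a}"
)
start = PromptTemplate.from_template("Now, do this for real!\nQ: {input}\nA:")

pipeline_prompt = PipelinePromptTemplate(
    final_prompt=final_prompt,
    pipeline_prompts=[
        ("introduction", introduction),
        ("example", example),
        ("start", start),
    ],
)

print(
    pipeline_prompt.format(
        person="a pirate",
        example_q="What's your favorite food?",
        example_a="Grog and hardtack",
        input="What's your favorite way to travel?",
    )
)
```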
Most "chat-your-data" applications involve three steps:
1⃣ Condense the chat history into a standalone question
2⃣ Retrieve relevant docs based on the standalone question
3⃣ Generate a final answer based on the retrieved documents
This involves two total calls to the LLM!
But these calls are not created equal
Condensing the chat history is a relatively easy (and less important) step, while generating the final answer can be trickier and more important to get right
With @LangChainAI you can easily use a different LLM for each step
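A sketch of what that looks like with `ConversationalRetrievalChain` (the model names, retriever, and chat history here are placeholders):

```python
from langchain.chains import ConversationalRetrievalChain
from langchain.chat_models import ChatOpenAI

# Cheap, fast model for condensing chat history into a standalone question;
# stronger model for generating the final answer from the retrieved docs
chain = ConversationalRetrievalChain.from_llm(
    llm=ChatOpenAI(model_name="gpt-4", temperature=0),
    retriever=vectorstore.as_retriever(),  # assumes a vectorstore you've already built
    condense_question_llm=ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0),
)

result = chain(
    {"question": "How does that compare to the previous approach?", "chat_history": chat_history}
)
```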