Considering LLM fine-tuning? Here are two new CoLab guides for fine-tuning GPT-3.5 & LLaMA2 on your own data, using LangSmith for dataset management and eval. We also share our lessons learned in a blog post here:
... 1/ When to fine-tune? Fine-tuning is not advised for teaching an LLM new knowledge (see references from @OpenAI and others in our blog post). It's best for tasks (e.g., extraction) focused on "form, not facts": anyscale.com/blog/fine-tuni…
... 2/ With this in mind, we fine-tuned LLaMA-7b-chat & GPT-3.5-turbo for knowledge graph triple extraction (see details in blog post and CoLab). Notebooks here:
LLaMA CoLab:
GPT-3.5-turbo CoLab:
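For reference, a minimal sketch of kicking off the GPT-3.5-turbo fine-tuning job (not the exact CoLab code; the JSONL file name is a placeholder, and this assumes the pre-1.0 openai Python SDK):

```python
import openai  # pre-1.0 SDK style

# Upload a JSONL file of chat-formatted training examples
# (one {"messages": [...system/user/assistant...]} object per line)
train_file = openai.File.create(
    file=open("triples_train.jsonl", "rb"),  # placeholder file name
    purpose="fine-tune",
)

# Kick off the fine-tuning job; poll until it finishes, then call the resulting model
job = openai.FineTuningJob.create(
    training_file=train_file.id,
    model="gpt-3.5-turbo",
)
print(job.id)
```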
... 3/ We used LangSmith for managing / cleaning the train & test sets and for eval, using a GPT4 grader. All code is shared in the CoLabs. Results comparing few-shot GPT4 and GPT3.5 vs fine-tuning are shown below, with grades from 0% (worst) to 100% (best).
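As a rough sketch, an eval run like this can be set up as below (not the exact CoLab code: the dataset name and chain factory are placeholders, and the exact LangChain / LangSmith eval APIs vary by version):

```python
from langchain.chat_models import ChatOpenAI
from langchain.smith import RunEvalConfig, run_on_dataset
from langsmith import Client

# GPT4 acts as the grader, comparing generations to the labeled triples
eval_config = RunEvalConfig(
    evaluators=["qa"],
    eval_llm=ChatOpenAI(model="gpt-4", temperature=0),
)

run_on_dataset(
    client=Client(),
    dataset_name="kg-triples-test",   # placeholder dataset name
    llm_or_chain_factory=make_chain,  # placeholder factory returning the chain under test
    evaluation=eval_config,
)
```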
... 4/ Lesson 1: always consider approaches like few-shot prompting or RAG before fine-tuning. Few-shot prompting of GPT4 scored better than any fine-tuning (w/ a small 1.5k instruction dataset / 7b base model).
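For context, few-shot prompting here just means packing labeled examples into the prompt. A minimal sketch (the example triples are made up, not from the datasets in the blog post):

```python
from langchain.chat_models import ChatOpenAI
from langchain.prompts import ChatPromptTemplate

# One in-context example of the desired (subject, relation, object) output format
prompt = ChatPromptTemplate.from_messages([
    ("system", "Extract (subject, relation, object) triples from the text."),
    ("human", "Paris is the capital of France."),
    ("ai", "(Paris, capital of, France)"),
    ("human", "{text}"),
])

chain = prompt | ChatOpenAI(model="gpt-4", temperature=0)
print(chain.invoke({"text": "Marie Curie discovered radium."}))
```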
... 5/ Lesson 2: but, we find that fine-tuning a small (7b) base model can outperform a larger generalist (GPT-3.5) w/ few-shot prompting, a result also shown recently by @anyscalecompute and others. anyscale.com/blog/fine-tuni…
... 6/ Lesson 3: dataset collection and cleaning are often the most challenging part. We iterated through several public datasets. LangSmith automatically logs project generations w/ a queryable interface to select and fix poor-quality examples for fine-tuning.
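A minimal sketch of pulling a LangSmith dataset down for cleaning (the dataset name and the "triples" output key are placeholders):

```python
from langsmith import Client

client = Client()

# Pull examples from a LangSmith dataset for review
examples = list(client.list_examples(dataset_name="kg-triples"))

# e.g., drop examples with empty labels before exporting them for fine-tuning
clean = [ex for ex in examples if ex.outputs and ex.outputs.get("triples")]
print(f"kept {len(clean)} / {len(examples)} examples")
```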
... 7/ Lesson 4: eval is challenging. We used LangSmith to run eval and inspect generations. We found that base models w/o fine-tuning were verbose / chatty, and in one case hallucinated Homer Simpson as the subject (whereas the fine-tuned LLMs extracted triples much closer to the label format):
Overall, fine-tuning is a powerful tool, but it should be weighed against prompt engineering / RAG. LangSmith can help w/ fine-tuning pain points (data capture / cleaning / eval) and works well w/ fine-tuning recipes (e.g., via @huggingface / @maximelabonne, @OpenAI). mlabonne.github.io/blog/posts/Fin…
LLMs excel at code analysis / completion (e.g., Copilot, Code Interpreter, etc.). Part 6 of our initiative to improve @LangChainAI docs covers code analysis, building on contributions of @cristobal_dev + others:
python.langchain.com/docs/use_cases…
1/ Copilot and related tools (e.g., @codeiumdev) have dramatically accelerated dev productivity and shown that LLMs excel at code understanding / completion
2/ But, RAG for QA/chat on codebases is challenging b/c text splitters may break up elements (e.g., fxns, classes) and fail to preserve context about which element each code chunk comes from.
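One mitigation is language-aware splitting, which keeps functions / classes together where possible. A minimal sketch for Python source (chunk sizes and the file name are placeholders):

```python
from langchain.text_splitter import Language, RecursiveCharacterTextSplitter

# Split Python source along def / class boundaries where possible,
# rather than at arbitrary character offsets
python_splitter = RecursiveCharacterTextSplitter.from_language(
    language=Language.PYTHON,
    chunk_size=2000,
    chunk_overlap=200,
)
docs = python_splitter.create_documents([open("some_module.py").read()])
```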
LLMs unlock a natural language interface with structured data. Part 4 of our initiative to improve @LangChainAI docs shows how to use LLMs to write / execute SQL queries w/ chains and agents. Thanks @manuelsoria_ for work on the docs:
python.langchain.com/docs/use_cases…
1/ Text-to-SQL is an excellent LLM use case: many ppl can describe what they want in natural language, but have difficulty mapping that to a specific SQL query. LLMs can bridge this gap, e.g., see:
arxiv.org/pdf/2204.00498…
2/ create_sql_query_chain() maps from natural language to a SQL query: pass the question and the database into the chain, and get SQL out. Run the query on the database easily:
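Roughly, per the use-case docs (the Chinook SQLite sample DB stands in for your own database):

```python
from langchain.chains import create_sql_query_chain
from langchain.chat_models import ChatOpenAI
from langchain.utilities import SQLDatabase

# Connect to a local SQLite database
db = SQLDatabase.from_uri("sqlite:///Chinook.db")

# Natural language question in, SQL query out
chain = create_sql_query_chain(ChatOpenAI(temperature=0), db)
query = chain.invoke({"question": "How many employees are there?"})

# Execute the generated query against the database
print(db.run(query))
```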
Getting structured LLM output is hard! Part 3 of our initiative to improve @LangChainAI docs covers this w/ functions and parsers (see @GoogleColab ntbk). Thanks to @fpingham for improving the docs on this:
2/ Functions (e.g., using OpenAI models) have been a great way to tackle this problem, as shown by the work of @jxnlco and others. The LLM calls a function and returns output that follows a specified schema. wandb.ai/jxnlco/functio…
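A hedged sketch of the functions approach with a Pydantic schema (the Person schema is a made-up example, and the exact helper / import path varies across LangChain versions):

```python
from langchain.chains.openai_functions import create_structured_output_chain
from langchain.chat_models import ChatOpenAI
from langchain.prompts import ChatPromptTemplate
from pydantic import BaseModel, Field

# Target schema the LLM output must follow
class Person(BaseModel):
    name: str = Field(description="The person's name")
    age: int = Field(description="The person's age")

prompt = ChatPromptTemplate.from_messages([
    ("system", "Extract the person mentioned in the text."),
    ("human", "{text}"),
])

chain = create_structured_output_chain(Person, ChatOpenAI(temperature=0), prompt)
person = chain.run(text="Ana is 29 years old.")  # -> Person(name='Ana', age=29)
```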
We've kicked off a community-driven effort to improve @LangChainAI docs, starting w/ popular use cases. Here is the new use case doc on Summarization w/ a @GoogleColab notebook for easy testing ...
python.langchain.com/docs/use_cases…
1/ Context window stuffing: adding full documents into the LLM context window for summarization is the easiest approach and is increasingly feasible as LLMs (e.g., @AnthropicAI Claude w/ 100k token window) get larger context windows that fit hundreds of pages.
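A minimal sketch of the stuffing approach (the loader URL is just an example long document; swap in your own docs and model):

```python
from langchain.chains.summarize import load_summarize_chain
from langchain.chat_models import ChatAnthropic
from langchain.document_loaders import WebBaseLoader

# Load a long web page and pass it to a large-context model in one shot
docs = WebBaseLoader("https://lilianweng.github.io/posts/2023-06-23-agent/").load()

llm = ChatAnthropic(model="claude-2", temperature=0)  # ~100k-token context window
chain = load_summarize_chain(llm, chain_type="stuff")
print(chain.run(docs))
```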
2/ Embed-cluster-sample: @GregKamradt demoed a cool approach w/ @LangChainAI to chunk, embed, cluster, and sample representative chunks that are passed to the LLM context window. A nice approach to save cost by reducing tokens sent to the LLM.
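A rough sketch of that embed-cluster-sample idea (chunk sizes, cluster count, and the input file are placeholders; this is not @GregKamradt's exact code):

```python
import numpy as np
from langchain.embeddings import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from sklearn.cluster import KMeans

long_text = open("book.txt").read()  # placeholder: the long document to summarize

# 1. Chunk and embed
chunks = RecursiveCharacterTextSplitter(chunk_size=2000, chunk_overlap=200).split_text(long_text)
vectors = np.array(OpenAIEmbeddings().embed_documents(chunks))

# 2. Cluster the chunk embeddings
kmeans = KMeans(n_clusters=8, random_state=0).fit(vectors)

# 3. Keep the chunk closest to each cluster center as a representative sample
representative_idx = sorted(
    {int(np.argmin(np.linalg.norm(vectors - center, axis=1))) for center in kmeans.cluster_centers_}
)
representative_chunks = [chunks[i] for i in representative_idx]
# representative_chunks are then passed to the LLM for the final summary, instead of the whole doc
```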
... there's a new loader for Etherscan transactions. Folks like @punk9059 may have a pulse on applications w/in the larger crypto community. Always interesting to learn about: python.langchain.com/docs/integrati…
Web research is a great LLM use case. @hwchase17 and I are releasing a new retriever to automate web research that is simple, configurable (can run in private mode w/ Llama-v2, GPT4all, etc.), & observable (use LangSmith to see what it's doing). Blog:
blog.langchain.dev/automating-web…
Projects like @assaf_elovic's gpt-researcher are a great example of research agents; we started with an agent, but landed on a simple retriever that executes LLM-generated search queries in parallel, indexes the loaded pages, and retrieves relevant chunks. LangSmith trace:
The retriever is compatible w/ private workflows. Here's a trace running on my laptop (~50 tok/sec) w/ Llama-v2 and @nomic_ai GPT4all embeddings + @trychroma: the LLM generates the search queries and is also used for the final answer generation. See docs: python.langchain.com/docs/modules/d…
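Roughly, wiring up the retriever looks like this (the vector store, embeddings, and question are placeholders; the OpenAI / Google Search defaults shown can be swapped for the local Llama-v2 / GPT4all / Chroma stack above):

```python
from langchain.chat_models import ChatOpenAI
from langchain.embeddings import OpenAIEmbeddings
from langchain.retrievers.web_research import WebResearchRetriever
from langchain.utilities import GoogleSearchAPIWrapper
from langchain.vectorstores import Chroma

# Vector store for indexing fetched pages, LLM for generating search queries,
# and a search wrapper (needs GOOGLE_API_KEY / GOOGLE_CSE_ID env vars)
vectorstore = Chroma(embedding_function=OpenAIEmbeddings(), persist_directory="./chroma_db")
retriever = WebResearchRetriever.from_llm(
    vectorstore=vectorstore,
    llm=ChatOpenAI(temperature=0),
    search=GoogleSearchAPIWrapper(),
)

docs = retriever.get_relevant_documents("How do LLM-powered agents plan tasks?")
```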