LlamaIndex 🦙
May 27
LiteParse v2.0 is out now, and it is blazing fast + runs everywhere!

We rewrote everything from scratch in Rust, and now:
- up to 100x faster parsing
- install natively in Rust, JS/TS, and Python
- a custom WASM package enables browser and edge runtime usage

pip install liteparse
npm i @llamaindex/liteparse
npm i @llamaindex/liteparse-wasm
cargo install liteparse

Blog: llamaindex.ai/blog/liteparse…
Repo: github.com/run-llama/lite…

More from @llama_index

Oct 26, 2023
🚨 Completely Revamped Docs 🚨

We’ve completely reorganized our docs to better mirror the user journey from building a prototype to running production LLM/RAG apps with LlamaIndex.

200+ guides to build/optimize your app.

Full credits @seldo, see thread below! 🧵

docs.llamaindex.ai



Section 1: Use Cases

The key use cases for building LLM apps over your data consist of question-answering, conversational chat, workflow automation with agents, and structured data extraction.

Learn about these use cases at a high level before diving into the materials.
Section 2: Building an LLM Application

Learn *all* the steps towards building an initial LLM app. This ranges from the LLM modules to data loading, indexing, and storage.

This also includes putting it all together and setting up observability/evals.
Sep 25, 2023
We’re excited to release full native support for THREE @huggingface embedding models (s/o @LoganMarkewich):
🧱 Base @huggingface embeddings wrapper
🧑‍🏫 Instructor embeddings
⚡️ Optimum embeddings (ONNX format)

Full thread below 🧵.

Check out the guide: gpt-index.readthedocs.io/en/latest/exam…


[1] Base @huggingface embeddings 🧱

This is a generic wrapper around any HF model for embeddings. You can set either pooling="cls" or pooling="mean".

Check out the embeddings leaderboard for recs on embedding models to use! huggingface.co/spaces/mteb/le…
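
To make the two pooling options concrete, here is a minimal sketch in plain Python. It is not the LlamaIndex wrapper itself: the `pool` function and the toy token vectors are hypothetical, and real mean pooling usually also weights by the attention mask, which is left out here for brevity.

```python
from typing import List


def pool(token_vectors: List[List[float]], pooling: str = "cls") -> List[float]:
    """Collapse per-token vectors into one embedding (illustrative only)."""
    if not token_vectors:
        raise ValueError("need at least one token vector")
    if pooling == "cls":
        # "cls": take the vector of the first ([CLS]) token as the embedding.
        return list(token_vectors[0])
    if pooling == "mean":
        # "mean": average each dimension over all tokens.
        n = len(token_vectors)
        return [sum(column) / n for column in zip(*token_vectors)]
    raise ValueError(f"unknown pooling mode: {pooling!r}")


# Toy model output: 3 tokens, 2 dimensions each (made-up numbers).
tokens = [[1.0, 0.0], [3.0, 2.0], [2.0, 4.0]]
cls_embedding = pool(tokens, "cls")
mean_embedding = pool(tokens, "mean")
```

Which mode works better depends on how the underlying model was trained, so it is worth checking the model card before choosing.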
[2] Instructor embeddings 🧑‍🏫

Instructor embeddings are unified models that have undergone instruction tuning on a ton of tasks (classification, retrieval, etc.). Therefore they can be adapted simply via task instruction, no fine-tuning!

instructor-embedding.github.io
Aug 28, 2023
We recently added 3 finetuning projects 🔥
✅ Finetuning embeddings
✅ @OpenAI finetuning gpt-3.5-turbo to distill GPT-4
✅ Finetuning Llama 2 for text-to-SQL

We now have a brand-new guide ✨ showing how to include all these components when building RAG:

gpt-index.readthedocs.io/en/latest/end_…
Finetuning embeddings: github.com/run-llama/fine…
Aug 26, 2023
We now have the most comprehensive cookbook on building LLMs with Knowledge Graphs (credits @wey_gu).
✅ Key query techniques: text2cypher, graph RAG
✅ Automated KG construction
✅ vector db RAG vs. KG RAG

Check out the full 1.5-hour tutorial: [video]
The full Colab notebook is here: colab.research.google.com/drive/1tLjOg2Z…

There was so much content beyond the live webinar that we recorded a part 2 🔥

We stitched it together in the video.
To reiterate, there’s a ton of content in here - it basically qualifies as a mini-course 🧑‍🏫

First, we learn the concepts through helpful visual explanations and links.

Learn both about KGs and the traditional RAG stack.
Aug 10, 2023
Introducing “One-click Observability” 🔭

With one line of code, you can now seamlessly integrate @llama_index with rich observability/eval tools offered by our partners (@weights_biases, @arizeai, @truera_ai).

Easily debug/eval your LLM app for prod 💪 https://t.co/tia41IgsT6 gpt-index.readthedocs.io/en/latest/end_…
[1] @weights_biases Prompts lets users log/trace/inspect the LlamaIndex execution flow during index construction/querying.

You automatically get traces, and can also choose to version/load indices.

https://t.co/iGDkmxybzg gpt-index.readthedocs.io/en/latest/end_…

[2] OpenInference (@arize_ai) is a standard for capturing/storing AI model inferences.

It allows you to experiment/visualize LLM apps using observability tools like @arize_phoenix.

Check out the notebook here! https://t.co/aT9PAP3jGh gpt-index.readthedocs.io/en/latest/exam…
Aug 8, 2023
Tip for better RAG systems 💡: don’t just store raw text chunks; augment them with structured data.
✅ Enables metadata filtering
✅ Helps bias embeddings

Here’s a guide on how to use the @huggingface span-marker to extract entities for this exact purpose 📕: https://t.co/Gwwoeu3i9H gpt-index.readthedocs.io/en/latest/exam…
In this example, we parse the 2023 IPCC Climate Report.

After text parsing to break the document into chunks, we use the span-marker extractor to extract relevant entities.
These entities can be used as metadata filters (in a vector db) or to help enhance the context embeddings.

In this guide, we do the latter. Adding/embedding the right metadata directly improves the generated answer (left), vs. without (right)
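
Both uses of entity metadata can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the guide's code: the `Chunk` class, both helper functions, and the hand-written entities are hypothetical, and in the guide the entities come from the span-marker extractor rather than being typed in.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Chunk:
    """A text chunk plus structured metadata (hypothetical container)."""
    text: str
    metadata: Dict[str, List[str]] = field(default_factory=dict)


def text_for_embedding(chunk: Chunk) -> str:
    """Prepend metadata to the chunk text so it influences the embedding."""
    if not chunk.metadata:
        return chunk.text
    header = "\n".join(
        f"{key}: {', '.join(values)}"
        for key, values in sorted(chunk.metadata.items())
    )
    return f"{header}\n\n{chunk.text}"


def filter_by_entity(chunks: List[Chunk], entity: str) -> List[Chunk]:
    """Keep only chunks tagged with the given entity (a metadata filter)."""
    return [c for c in chunks if entity in c.metadata.get("entities", [])]


# Made-up chunks with hand-written entities, for illustration only.
chunks = [
    Chunk("Sea levels are projected to keep rising.", {"entities": ["IPCC"]}),
    Chunk("Adaptation options vary by region.", {"entities": ["Africa", "Asia"]}),
]
augmented = text_for_embedding(chunks[0])
asia_only = filter_by_entity(chunks, "Asia")
```

The string returned by `text_for_embedding` is what would be passed to the embedding model in place of the raw chunk text, while `filter_by_entity` stands in for the metadata filter a vector db would apply before similarity search.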
