Nirant
💡 Tweets about LLMs, learning about Search/Ranking | 🛠️⚡️ FastEmbed: https://t.co/Ag7vgIPywR
Nov 5, 2023 8 tweets 2 min read
6 Quick tips on doing RAG better:

1. Retrieval and ranking matter quite a lot:

1a) Chunking: Including the section title in your chunks improves retrieval, and so does adding keywords from the documents

1b) Use token-efficient separators in your chunks, e.g. ### is a single token in GPT

1c) Latency permitting, use a reranker: Cohere, Sentence Transformers and BGE have decent ones

1d) If you can, finetune the embedding model to your domain. It takes about 20 minutes on a modern laptop or Colab notebook and can improve recall by as much as 30-50%
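
A minimal sketch of tips 1a and 1b, assuming your documents are already split into (title, body) sections. The function names, the stopword list, the frequency-based keyword pick and the 50-word chunk size are all illustrative assumptions, not something from the thread; a real pipeline would likely use a proper keyword extractor and a token-based length limit.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "for", "on", "with"}


def top_keywords(text, k=3):
    """Return the k most frequent non-stopword tokens as a crude keyword proxy."""
    words = [
        w for w in re.findall(r"[a-z0-9]+", text.lower())
        if w not in STOPWORDS and len(w) > 2
    ]
    return [w for w, _ in Counter(words).most_common(k)]


def build_chunks(sections, max_words=50, sep="###"):
    """Split each (title, body) section into chunks of at most max_words words.

    Every chunk is prefixed with the separator, the section title and the
    section's keywords, so the embedded text carries that context.
    """
    chunks = []
    for title, body in sections:
        keywords = ", ".join(top_keywords(body))
        words = body.split()
        for start in range(0, len(words), max_words):
            piece = " ".join(words[start:start + max_words])
            chunks.append(f"{sep} {title}\nKeywords: {keywords}\n{piece}")
    return chunks


if __name__ == "__main__":
    demo = [("Install", "pip install fastembed " * 30)]
    for chunk in build_chunks(demo):
        print(chunk[:60], "...")
```

The same title and keyword header is repeated on every chunk of a section, which costs a few tokens per chunk but keeps each chunk self-describing at retrieval time.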
Jun 28, 2023 5 tweets 2 min read
Why you should never use pgvector (e.g. @supabase Vector Store) for production:

😮 pgvector is 20x slower than a decent vector DB (e.g. @qdrant_engine)
🤯 And it's a full 18% worse in finding relevant docs for you

And this can happen with as few as 10K documents once they are chunked! As a Postgres fan, I am sad to see that pgvector not only starts at less than half the QPS at even 100K vectors, it also dips really quickly beyond that.
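
"Worse in finding relevant docs" is usually quantified as recall@k against exact (brute-force) nearest neighbours. Here is a minimal sketch of that metric, assuming you already have, per query, the ids returned by the system under test and the ground-truth ids. The function names and the toy ids are hypothetical; this is not the benchmark behind the numbers above.

```python
def recall_at_k(retrieved, relevant, k=10):
    """Fraction of the relevant ids that appear in the top-k retrieved ids."""
    if not relevant:
        return 0.0
    top_k = set(retrieved[:k])
    return len(top_k & set(relevant)) / len(relevant)


def mean_recall(results, ground_truth, k=10):
    """Average recall@k over all queries present in ground_truth."""
    scores = [recall_at_k(results[q], ground_truth[q], k) for q in ground_truth]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy, made-up ids purely to show the call shape.
    ground_truth = {"q1": [1, 2, 3], "q2": [7, 8]}
    results = {"q1": [1, 2, 9], "q2": [7, 8, 4]}
    print(mean_recall(results, ground_truth, k=3))
```

Running the same query set against two stores and comparing `mean_recall` alongside measured QPS gives a like-for-like view of the speed and accuracy trade-off.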