Vector databases may be the next "big thing"
• Vector databases explained
• What is Unstructured data?
• When to use vector databases
• Embeddings
• Use-cases
All you need to know in under 10 tweets.
Save This ↓
Why do we even need a database?
→ We have data that we want to store.
Relational databases (like Postgres) or NoSQL databases (like AWS DynamoDB) can store structured data, but there is one inherent problem.
→ Unstructured data is hard to store and search in relational databases.
What is Unstructured data?
→ Things like: Images, Audio, Documents, PDFs etc.
Imagine you want to find the best book recommendation for someone who liked "The Catcher in the Rye." A plain relational query can't answer this, because it matches exact values, not meaning.
→ This is where embeddings & vector databases come in.
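To make the limitation concrete, here's a rough sketch using Python's built-in sqlite3 module. The `books` table and its rows are toy data made up for illustration. An exact or text match finds the book itself, but the query has no way to ask "which books feel similar?"

```python
import sqlite3

# Toy in-memory table; the schema and rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (title TEXT, author TEXT)")
conn.executemany(
    "INSERT INTO books VALUES (?, ?)",
    [
        ("The Catcher in the Rye", "J.D. Salinger"),
        ("Franny and Zooey", "J.D. Salinger"),
        ("The Bell Jar", "Sylvia Plath"),
    ],
)

# An exact match works: we get back the row we already know about.
exact = conn.execute(
    "SELECT title FROM books WHERE title = ?", ("The Catcher in the Rye",)
).fetchall()

# A text match only compares characters, so it still returns the same book.
text_match = conn.execute(
    "SELECT title FROM books WHERE title LIKE ?", ("%Catcher%",)
).fetchall()

print(exact)
print(text_match)
```

Neither query says anything about which other book a reader would enjoy. That needs a notion of "closeness" between items, which is what embeddings provide.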
Here's a Caveman explanation:
→ Vector databases allow us to search across unstructured data (images, video, audio) by their content.
What are some use-cases for having a vector database?
• Recommendations (Netflix movie recommendations)
• Find similar images ("Find similar images with dogs in them")
• Find related documents ("Find other documents that talk about love")
P.S - ✨ I'm dropping a FREE step-by-step mini-course guide to start coding with A.I
(Free for now, but not free forever)
Check it out: StartCodingWithAI.com
Let's go over what an embedding is.
→ An embedding is a numerical representation (a vector of numbers) of a piece of unstructured data.
Take a look at the graphic.
→ We generated the vector embedding from raw data.
→ We use the vector database to help us find data that is similar or related.
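Here's a minimal sketch of what "similar" can mean once data is turned into vectors. The 3-number vectors below are made up by hand (real embeddings come from a model and usually have hundreds or thousands of dimensions), and cosine similarity is just one common way to compare them.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-written toy "embeddings"; the names and values are assumptions.
embeddings = {
    "dog photo": [0.9, 0.1, 0.0],
    "puppy photo": [0.8, 0.2, 0.1],
    "tax form PDF": [0.0, 0.1, 0.9],
}

query = embeddings["dog photo"]

# Rank every other item by how close its vector is to the query vector.
ranked = sorted(
    (name for name in embeddings if name != "dog photo"),
    key=lambda name: cosine_similarity(query, embeddings[name]),
    reverse=True,
)

print(ranked)
```

The puppy photo ranks first because its vector points in nearly the same direction as the dog photo's vector, while the tax form points somewhere else entirely.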
Imagine you have a billion records in the database. It will take a while to find & return the most relevant result.
This is where an index comes in.
→ An index is a data structure that speeds up similarity search by avoiding a comparison against every record (think of it as the index at the back of a book).
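A toy sketch of the idea: group vectors into buckets up front, then only scan the bucket nearest to the query instead of all records. `BucketIndex` and its fixed centroids are hypothetical names invented for this example. Real vector databases use more sophisticated structures (HNSW graphs or inverted-file indexes, for instance), so treat this as an illustration of the principle, not how any particular product works.

```python
import math
from collections import defaultdict

def distance(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

class BucketIndex:
    """Toy index: each vector is filed under its nearest centroid."""

    def __init__(self, centroids):
        self.centroids = centroids
        self.buckets = defaultdict(list)  # centroid position -> [(item_id, vector)]

    def _nearest_centroid(self, vector):
        return min(
            range(len(self.centroids)),
            key=lambda i: distance(vector, self.centroids[i]),
        )

    def add(self, item_id, vector):
        self.buckets[self._nearest_centroid(vector)].append((item_id, vector))

    def search(self, query, k=1):
        # Scan only the bucket closest to the query, not every record.
        candidates = self.buckets[self._nearest_centroid(query)]
        ranked = sorted(candidates, key=lambda item: distance(query, item[1]))
        return [item_id for item_id, _ in ranked[:k]]

# Two hand-picked centroids; a real index would learn these from the data.
index = BucketIndex(centroids=[[0.0, 0.0], [10.0, 10.0]])
index.add("a", [0.5, 0.2])
index.add("b", [1.0, 1.5])
index.add("c", [9.0, 9.5])
index.add("d", [10.5, 10.0])

result = index.search([9.2, 9.4], k=1)
print(result)
```

The trade-off is that the search becomes approximate: a true nearest neighbor sitting just across a bucket boundary can be missed. That's the usual price for not scanning a billion records.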
Wrapping it all up: 🔥
1) Generate embeddings with an ML model (like OpenAI embeddings)
2) Pass embeddings into a Vector database
3) Vector database stores, indexes and allows you to search for similar/relevant data.
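The three steps above can be sketched end to end in plain Python. Everything here is a stand-in: `toy_embed` just counts words from a tiny fixed vocabulary (a real setup would call an embedding model instead), and `TinyVectorStore` is a hypothetical in-memory class that scans every record without an index.

```python
import math

# Tiny fixed vocabulary; an assumption made purely for this example.
VOCAB = ["love", "heart", "romance", "database", "index", "query"]

def toy_embed(text):
    """Stand-in for a real embedding model: counts vocabulary words."""
    words = text.lower().split()
    return [float(words.count(term)) for term in VOCAB]

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

class TinyVectorStore:
    """Hypothetical in-memory store: keeps (text, vector) pairs and scans them all."""

    def __init__(self):
        self.records = []

    def add(self, text):
        # Steps 1 and 2: embed the raw data, then store the vector with it.
        self.records.append((text, toy_embed(text)))

    def search(self, query, k=1):
        # Step 3: embed the query and return the k most similar records.
        q = toy_embed(query)
        ranked = sorted(
            self.records,
            key=lambda record: cosine_similarity(q, record[1]),
            reverse=True,
        )
        return [text for text, _ in ranked[:k]]

store = TinyVectorStore()
store.add("a poem about love and the heart")
store.add("how to tune a database index")
store.add("a slow query in the database")

top = store.search("stories of romance and love", k=1)
print(top)
```

The query shares no exact phrase with the stored poem, but their vectors overlap on "love", so the poem comes back first. A real embedding model captures far richer meaning than word counts, but the flow is the same.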
That's a wrap. 🌯
Lmk what's the biggest problem you guys are having and I'll cover it.
P.S: This LangChain series is coming to an end, gonna do an LLM deep-dive series starting next week.