peter! 🥷
Member of Technical Staff, ex swe @amazon

May 25, 2023, 11 tweets

Vector databases may be the next "big thing"

• Vector databases explained
• What is unstructured data?
• When to use vector databases
• Embeddings
• Use-cases

All you need to know in under 10 tweets.

Save This ↓

Why do we even need a database?

→ We have data that we want to store.

Relational databases (like Postgres) or NoSQL databases (like AWS DynamoDB) can store structured data, but there is one inherent problem.

→ Unstructured data is hard to store, and even harder to search by content, in these databases.

What is unstructured data?

→ Things like images, audio, documents, PDFs, etc.

Imagine you want the best book recommendation for someone who liked "Catcher in the Rye." This is impractical with a relational database alone, because its queries match exact values, not meaning.

→ This is where embeddings & vector databases come in.

Here's a Caveman explanation:

→ Vector databases allow us to search across unstructured data (images, video, audio) by its content

What are some use-cases for having a vector database?

• Recommendations (Netflix movie recommendations)
• Find similar images ("Find similar images with dogs in them")
• Find related documents ("Find other documents that talk about love")

P.S - ✨ I'm dropping a FREE step-by-step mini-course guide to start coding with A.I

(Free for now, but not free forever)

Check it out: StartCodingWithAI.com

Let's go over what an embedding is.

→ An embedding is a numerical representation (a vector of numbers) of a piece of unstructured data.

Take a look at the graphic.

→ We generate the vector embedding from the raw data.
→ We use the vector database to help us find data that is similar or related.
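
To make "similar or related" concrete, here is a minimal sketch of one common way to compare embeddings: cosine similarity. The three-number vectors and the book titles other than "Catcher in the Rye" are made up for illustration; real embeddings come from an ML model and usually have hundreds of dimensions or more.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-written toy "embeddings" (assumption: a real model would produce these).
embeddings = {
    "Catcher in the Rye": [0.9, 0.1, 0.3],
    "Franny and Zooey": [0.8, 0.2, 0.4],
    "A Brief History of Time": [0.1, 0.9, 0.2],
}

def most_similar(title):
    """Return the other title whose vector points in the closest direction."""
    query = embeddings[title]
    others = [t for t in embeddings if t != title]
    return max(others, key=lambda t: cosine_similarity(query, embeddings[t]))

print(most_similar("Catcher in the Rye"))
```

With these toy numbers the closest match is the title whose vector points in nearly the same direction as the query, which is the whole idea behind "search by content."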

Imagine you have a billion records in the database. Comparing a query against every one of them takes a while before the most relevant result comes back.

This is where an index comes in.

→ An index is a data structure that speeds up the search process & allows for fast similarity search (think of it as the index at the back of a book)
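
Here is a rough sketch of the idea, assuming a very simple bucket-style index built from random hyperplanes (a basic form of locality-sensitive hashing). The class name `HyperplaneIndex` and its parameters are hypothetical; production vector databases typically use more sophisticated indexes such as HNSW or IVF.

```python
import math
import random
from collections import defaultdict

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

class HyperplaneIndex:
    """Toy index: vectors on the same side of every plane share a bucket."""

    def __init__(self, dim, num_planes=4, seed=0):
        rng = random.Random(seed)
        self.planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_planes)]
        self.buckets = defaultdict(list)

    def _key(self, vector):
        # One boolean per plane: which side of the plane the vector falls on.
        return tuple(sum(p * x for p, x in zip(plane, vector)) >= 0 for plane in self.planes)

    def add(self, item_id, vector):
        self.buckets[self._key(vector)].append((item_id, vector))

    def search(self, query):
        # Only score the candidates in the query's bucket, not every record.
        candidates = self.buckets.get(self._key(query), [])
        if not candidates:
            return None
        return max(candidates, key=lambda item: cosine_similarity(query, item[1]))[0]

# Usage with 1,000 random 8-dimensional vectors (made-up data).
rng = random.Random(42)
vectors = {i: [rng.gauss(0, 1) for _ in range(8)] for i in range(1000)}
index = HyperplaneIndex(dim=8)
for item_id, vector in vectors.items():
    index.add(item_id, vector)

query = vectors[7]
bucket_size = len(index.buckets[index._key(query)])
# The stored vector for item 7 hashes to its own bucket, so it should come
# back as its own best match while only part of the data gets scored.
print(index.search(query), "found after scoring", bucket_size, "of", len(vectors), "vectors")
```

The trade-off is that the search becomes approximate: a true nearest neighbor sitting in a different bucket gets missed, which is the price paid for not scanning everything.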

Wrapping it all up: 🔥

1) Generate embeddings with an ML model (like OpenAI embeddings)

2) Pass the embeddings into a vector database

3) The vector database stores, indexes and allows you to search for similar/relevant data.
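
The three steps can be sketched end to end in a few lines. Everything here is a stand-in: `toy_embed` is a hypothetical word-counting function playing the role of an ML embedding model, and `ToyVectorStore` is an in-memory list that scans every record instead of using a real index.

```python
import math

# Assumption: a tiny fixed vocabulary instead of a learned embedding model.
VOCAB = ["love", "heart", "dog", "puppy", "database", "index"]

def toy_embed(text):
    """Step 1 stand-in: turn text into a vector by counting vocabulary words."""
    words = text.lower().split()
    return [float(words.count(term)) for term in VOCAB]

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0  # all-zero vectors match nothing

class ToyVectorStore:
    """Steps 2 and 3 stand-in: store vectors and search them by similarity."""

    def __init__(self):
        self.records = []  # list of (text, vector) pairs

    def add(self, text):
        self.records.append((text, toy_embed(text)))

    def search(self, query, top_k=1):
        query_vector = toy_embed(query)
        ranked = sorted(
            self.records,
            key=lambda record: cosine_similarity(query_vector, record[1]),
            reverse=True,
        )
        return [text for text, _ in ranked[:top_k]]

store = ToyVectorStore()
store.add("a poem about love and the heart")
store.add("how to train a dog or a puppy")
store.add("choosing an index for a database")

print(store.search("documents that talk about love"))
```

Swapping `toy_embed` for a real embedding model and `ToyVectorStore` for an actual vector database keeps the same shape: embed, store, search.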

That's a wrap. 🌯

Lmk what's the biggest problem you guys are having and I'll cover it.

P.S: This LangChain series is coming to an end, gonna do an LLM deep-dive series starting next week.
