Simplifying LLMs, MLOps, Python & Machine Learning for you! • AI Engineering @LightningAI • Lead DataScientist • BITS Pilani • 3 Patents
Feb 21 • 4 tweets • 3 min read
Vector Embeddings, clearly explained:
Embeddings are one of the most powerful ideas in the world of Machine Learning.
They are the building blocks of the powerful LLMs that we see today!
So, let's understand what embeddings are and how they came into the picture! 🚀
❗️Computers, particularly neural networks, are good with numbers, and LLMs are trained on large corpora of text!
So, instead of teaching them the dictionary definition of each word, we let them learn a mathematical representation of each word, which we call an embedding (a series of numbers).
Now these numbers, i.e. embeddings, are more powerful than you might think. They capture:
1️⃣ Meaning: Words come with a myriad of meanings, contexts, and nuances. Embeddings pack all this complex information into a dense vector, compressing vast amounts of information into a small space.
2️⃣ Context: Words are understood by their company. "Apple" in the context of "orchard" is different from "Apple" in the context of "MacBook". Embeddings capture these contextual nuances.
3️⃣ Semantic Relationships: Embeddings capture relationships between words. Words with similar meanings sit closer to each other in the embedding space as well!
Let’s understand these points with a very famous analogy (you can refer to the image below as you read)!
King : Queen :: Man : Woman
The word "king" might be close to "man" because they're both male figures. Similarly, "queen" might be close to "woman".
But here's the magical part: imagine we're standing in the embedding space.
🔹If we know the direction and distance from "queen" to "king" (the gender vector in the image below), we can apply that same offset starting from "woman" to find where "man" should be in the embedding space.
🔸Similarly, if we know the direction and distance from "man" to "king" (the royalty vector in the image below), we can apply that same offset starting from "woman" to find where "queen" should be.
🔹When we perform these vector arithmetic operations on properly trained embeddings, we get results that capture these relationships.
This makes the embeddings truly magical & powerful!✨
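As a sketch, the offset arithmetic above can be played out with toy 2-D vectors. The numbers here are made up purely for illustration (real trained embeddings have hundreds of dimensions), but the mechanics are the same:

```python
import numpy as np

# Toy 2-D embeddings (hypothetical values for illustration):
# axis 0 ≈ "royalty", axis 1 ≈ "gender"
king  = np.array([0.9, 0.1])
queen = np.array([0.9, 0.9])
man   = np.array([0.1, 0.1])
woman = np.array([0.1, 0.9])

# The "gender vector": direction & distance from king to queen
gender = queen - king

# Apply the same offset starting from "man"
result = man + gender
print(np.allclose(result, woman))  # True: we land on "woman"
```

With real embeddings the arithmetic is approximate — you look for the *nearest* stored vector to `king - man + woman` rather than an exact match.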
Today we can convert large corpora of text, documents, images & audio into embeddings.
We have sophisticated databases to store and index embeddings, known as vector databases.
With these databases, we can efficiently perform similarity search over embeddings and build powerful systems on top of them.
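A minimal sketch of the core operation a vector database performs: rank stored embeddings by similarity to a query vector. All the vectors and names here are made-up toy values, and real systems use approximate nearest-neighbour indexes instead of the brute-force scan shown:

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors: 1.0 means "same direction"
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# A tiny hypothetical "vector database": name -> embedding
database = {
    "orchard": np.array([0.9, 0.1, 0.0]),
    "macbook": np.array([0.0, 0.2, 0.9]),
    "garden":  np.array([0.5, 0.5, 0.2]),
}

# Pretend this is the embedding of a query like "fruit farm"
query = np.array([0.85, 0.15, 0.05])

# Similarity search: return the stored item closest to the query
best = max(database, key=lambda k: cosine_similarity(query, database[k]))
print(best)  # orchard
```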
If you're interested in LLMs & AI Engineering, understanding the concept of embeddings and being able to work with them is crucial!
In the next tweet I've shared two links:
1. A FREE AI Engineering Newsletter that I write.
2. @LightningAI's LLM Learning Lab, FREE world class learning material on training, fine-tuning & deploying LLMs at scale.
That’s a wrap!
I love breaking down complex ideas in AI and Machine Learning to their fundamentals.
If you enjoyed reading this, follow me → @akshay_pachaar, so you don't miss my updates.
Feb 19 • 5 tweets • 2 min read
Data Classes in Python, clearly explained:
Data classes in Python are amazing!
- Quick initialisation
- Easy comparison
- Concise representation
- And more ...
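The bullets above map directly onto the methods the `@dataclass` decorator generates for you. A minimal example:

```python
from dataclasses import dataclass

@dataclass
class Point:
    x: float
    y: float

# Quick initialisation: __init__ is generated from the field list
p1 = Point(1.0, 2.0)
p2 = Point(1.0, 2.0)

# Easy comparison: __eq__ compares field-by-field
print(p1 == p2)  # True

# Concise representation: __repr__ is generated too
print(p1)  # Point(x=1.0, y=2.0)
```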
Here's an example of how you can create a ChatGPT-like interface using your own knowledge base.
Let's understand each component one-by-one...👇
Retrieval Augmented Generation (RAG) is a powerful tool for enhancing the performance of Large Language Models (LLMs) by incorporating external knowledge into the generation process.
Let's explore the key components of RAG...👇
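A minimal sketch of the RAG flow: retrieve relevant passages, augment the prompt with them, then generate. Everything here is illustrative — the helper names, the tiny knowledge base, and the keyword-overlap scoring standing in for real embedding search — and the final LLM call is left out:

```python
def retrieve(question, knowledge_base, top_k=2):
    # Naive keyword overlap stands in for embedding similarity here
    def score(passage):
        return len(set(question.lower().split()) & set(passage.lower().split()))
    return sorted(knowledge_base, key=score, reverse=True)[:top_k]

def build_prompt(question, passages):
    # Augment: prepend the retrieved context to the question
    context = "\n".join(passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

knowledge_base = [
    "Embeddings map text to dense vectors.",
    "Vector databases index embeddings for fast search.",
    "Paris is the capital of France.",
]

question = "How do vector databases use embeddings?"
prompt = build_prompt(question, retrieve(question, knowledge_base))
print(prompt)  # the prompt now carries the relevant passages as context
```

In a real system, `retrieve` would embed the question and query a vector database, and `prompt` would be sent to an LLM for generation.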
Jan 18 • 9 tweets • 3 min read
I started my career in Data Science back in 2016 ⏳
Here are 7 tips for those starting out today:
1️⃣ Become a strong programmer 🔥
It helps you bridge the gap between theory and building stuff that matters!
If you're new to programming, Harvard's CS50: Intro to AI with Python is a great starting point!