Matrices are the basic building blocks of learning algorithms.
Multiplying data vectors by a matrix is equivalent to transforming the feature space. We often treat this as a "black box", but there is a lot to discover.
For one, how the transformation changes the volume of objects.
This is described by the determinant of the matrix, which is determined by
• how the transformation scales the volume,
• and whether it flips the orientation of the basis vectors.
The determinant is given by the formula below. I am a mathematician, and even I find this intimidating.
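For reference, the intimidating formula in question is most likely the Leibniz expansion, which sums over all permutations of {1, ..., n}:

    \det(A) = \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma) \, a_{1,\sigma(1)} \, a_{2,\sigma(2)} \cdots a_{n,\sigma(n)}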
However, the determinant can be explained in terms of simple geometric concepts.
This new chapter takes this route, making determinants easy to understand. From motivation to applications, I am taking you through all the details.
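To make the geometric picture concrete, here is a minimal NumPy sketch (an illustration of the idea, not an excerpt from the book): the absolute value of the determinant is exactly the factor by which the transformation scales area.

    import numpy as np

    # A 2x2 matrix acting on the plane: it shears and stretches every vector.
    A = np.array([[2.0, 1.0],
                  [0.0, 3.0]])

    # The unit square is spanned by the standard basis vectors e1 and e2.
    e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])

    # After the transformation, the square becomes the parallelogram
    # spanned by A @ e1 and A @ e2.
    v1, v2 = A @ e1, A @ e2

    # The area of the parallelogram spanned by (x1, y1) and (x2, y2)
    # is |x1 * y2 - x2 * y1|.
    area = abs(v1[0] * v2[1] - v2[0] * v1[1])

    print(area)              # 6.0: the unit square's area is scaled by 6
    print(np.linalg.det(A))  # 6.0: which is exactly det(A)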
In the early access, I publish chapters as I write them.
Moreover, you get a personal hotline to me, where
• I help you out if you are stuck,
• and you can share your feedback with me.
This way, I can build the best learning resource for you.
🤔 Should you learn mathematics for machine learning?
Let's do a thought experiment! Imagine moving to a new country without speaking the language or knowing the way of life. However, you have a smartphone and a reliable internet connection.
How do you start exploring?
1/8
With Google Maps and a credit card, you can do many awesome things there: explore the city, eat in nice restaurants, have a good time.
You can do your grocery shopping every day without speaking a word: just put the items in your basket and swipe your card at the register.
2/8
After a few months, you'll start to pick up some language as well—simple things, like saying greetings or introducing yourself. You are off to a good start!
There are built-in solutions for common tasks that just work: food ordering services, public transportation, and so on.
3/8
Data similarity has such a simple visual interpretation that it will light up all the bulbs in your head.
The mathematical magic tells you that similarity is given by the inner product. Have you thought about why?
This is how elementary geometry explains it all.
↓ A thread. ↓
Let's start at the beginning!
In machine learning, data is represented by vectors. So, instead of observations and features, we talk about tuples of (real) numbers.
Vectors have two special functions defined on them: their norms and inner products. Norms simply describe their magnitude, while inner products describe
.
.
.
well, a 𝐥𝐨𝐭 of things.
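One of those things, and presumably the punchline this thread builds toward, is the angle between vectors. For real vectors x and y,

    \langle x, y \rangle = \| x \| \, \| y \| \cos \alpha,

where \alpha is the angle enclosed by x and y. Dividing by the norms gives the cosine similarity

    \cos \alpha = \frac{\langle x, y \rangle}{\| x \| \, \| y \|},

which is nothing but the inner product of the two vectors after scaling them to unit length.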
If I toss a fair coin ten times and they all come up heads, what is the chance that the 11th toss will also be heads? Many think that it'll be highly unlikely. However, this is incorrect.
Here is why!
↓ A thread. ↓
In probability theory and statistics, we often study events in the context of other events.
This is captured by conditional probabilities, answering a simple question: "what is the probability of A if we know that B has occurred?".
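Written as a formula (assuming P(B) > 0), this is

    P(A \mid B) = \frac{P(A \cap B)}{P(B)}.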
Without any additional information, the probability that eleven coin tosses result in eleven heads in a row is extremely small.
However, notice that this is not our case. The original question was to find the probability of the 11th toss being heads, given the result of the previous ten.
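To make the contrast explicit, here is the arithmetic. Out of the blue, eleven heads in a row has probability

    P(\text{11 heads in a row}) = \left( \tfrac{1}{2} \right)^{11} = \tfrac{1}{2048} \approx 0.05\%,

but the tosses are independent, so the first ten results tell us nothing about the eleventh:

    P(\text{11th is heads} \mid \text{first 10 are heads}) = P(\text{11th is heads}) = \tfrac{1}{2}.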
The early access of my Mathematics of Machine Learning book is launching today!
One chapter per week, we go from the basics to the internals of neural networks. We are starting with vector spaces, the stage where machine learning happens.
Here is why they are so important!
🧵 👇🏽
As you probably know, data is represented by vectors.
Data points are just tuples of measurements. In their raw form, they are hardly useful for us. They are just blips in space.
Without operations and transformations, it is difficult to predict class labels or do anything else.
Vector spaces provide a mathematical structure where operations naturally arise.
Instead of a blip, just imagine an arrow pointing to the data point from a fixed origin.
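As a minimal sketch of what that structure buys us (an illustration, not an excerpt from the book): once data points are vectors, operations like addition, scaling, and averaging come for free.

    import numpy as np

    # Two data points, each a tuple of three measurements, represented as vectors.
    x = np.array([1.0, 2.0, 0.5])
    y = np.array([0.0, 4.0, 1.5])

    # Vector space operations: addition and scalar multiplication.
    print(x + y)        # [1. 6. 2.]
    print(2.0 * x)      # [2. 4. 1.]

    # These already give useful constructions: the mean of a dataset is a
    # scaled sum of vectors, and differences point from one data point to another.
    print((x + y) / 2)  # [0.5 3.  1. ]
    print(y - x)        # [-1.  2.  1.]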