Just released a new chapter in the early access of my Mathematics of Machine Learning book!
It is about computing determinants in practice. Sadly, this is often missing from linear algebra courses, so I decided to fill this gap.
↓ Here's the gist. ↓
The determinant of a matrix is essentially the product of
• the orientation of its column vectors (which is either 1 or -1),
• and the volume of the parallelepiped spanned by them (in two dimensions, the area of a parallelogram).
For 2x2 matrices, this is illustrated below.
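In formulas, for a 2×2 matrix with columns (a, b) and (c, d), this signed area is the familiar expression

$$\det \begin{pmatrix} a & c \\ b & d \end{pmatrix} = ad - bc,$$

and swapping the two columns flips its sign.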
Here is the thing.
In mathematics, we generally use two formulas to compute this quantity.
First, we have a sum that runs through all permutations of the columns.
This formula is hard to understand, let alone to implement.
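For reference, this is the Leibniz formula, where the sum runs over all permutations σ of {1, …, n}:

$$\det A = \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma) \prod_{i=1}^{n} a_{i, \sigma(i)}$$

With n! terms, evaluating it directly is just as painful as parsing it.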
The other one is not so good either.
It is a recursive formula (the cofactor expansion), so implementing it is not that hard, but its performance is horrible.
Its complexity is O(n!), which is infeasible in practice.
We can quickly implement this in Python.
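A minimal sketch of the cofactor expansion, using plain Python lists (the function name is just for illustration):

```python
def det_recursive(A):
    """Determinant via cofactor (Laplace) expansion along the first row.

    Runs in O(n!) time, so it is only practical for tiny matrices.
    """
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        # Minor: drop the first row and the j-th column.
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * A[0][j] * det_recursive(minor)
    return total


print(det_recursive([[1, 2], [3, 4]]))  # -2
```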
However, it takes almost 30 seconds to calculate the determinant of a 10 x 10 matrix.
This is not going to cut it.
With a little trick, we can simplify this problem a lot.
Up to a reordering of its rows, we can factor any square matrix A into the product of a lower and an upper triangular matrix: A = LU. This is called the LU decomposition.
As a bonus, the diagonal of L consists entirely of 1s.
The LU decomposition takes O(n³) steps to compute, and the determinant of A can easily be read off from it: the determinant of a triangular matrix equals the product of its diagonal elements, and each row swap only flips the sign.
So, instead of O(n!), we can calculate determinants in O(n³) time.
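Here is a sketch of that idea with SciPy's pivoted LU factorization (scipy.linalg.lu_factor); the only extra work is keeping track of the sign flips caused by the row swaps:

```python
import numpy as np
from scipy.linalg import lu_factor

def det_via_lu(A):
    """Determinant from the pivoted LU factorization (PA = LU), O(n^3) overall."""
    lu, piv = lu_factor(A)  # U in the upper triangle, L (unit diagonal) below it
    # Every pivot index that differs from its row index is one row swap,
    # and each swap flips the sign of the determinant.
    swaps = np.sum(piv != np.arange(len(piv)))
    return (-1) ** swaps * np.prod(np.diag(lu))


A = np.random.rand(10, 10)
print(det_via_lu(A), np.linalg.det(A))  # the two values agree
```

NumPy's own np.linalg.det works the same way: it computes an LU factorization first and multiplies the diagonal together.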
The difference is stunning. With the recursive formula, a 10 x 10 determinant took 30 seconds. Using LU decomposition, we can do a 10000 x 10000 one in that time.
A bit of linear algebra can take us very far.
Having a deep understanding of mathematics will make you a better engineer. This is what I want to help you with.
If you are interested in the details and the beauties of linear algebra, check out the early access for my book!
The Law of Large Numbers is one of the most frequently misunderstood concepts of probability and statistics.
Just because you lost ten blackjack games in a row, it doesn't mean that you're more likely to win the next one.
What is the law of large numbers, then? Read on:
The strength of probability theory lies in its ability to translate complex random phenomena into coin tosses, dice rolls, and other simple experiments.
So, let’s stick with coin tossing.
What will the average number of heads be if we toss a coin, say, a thousand times?
To mathematically formalize this question, we’ll need random variables.
Tossing a fair coin is described by the Bernoulli distribution, so let X₁, X₂, … be independent and identically distributed Bernoulli random variables, one for each toss.
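Concretely, each Xᵢ equals 1 for heads and 0 for tails, and the question above is about the sample mean:

$$P(X_i = 1) = P(X_i = 0) = \frac{1}{2}, \qquad \overline{X}_n = \frac{X_1 + X_2 + \dots + X_n}{n}.$$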