For every topic in computer science, there is an XKCD comic that summarizes it perfectly. My all-time favorite one is the following.

Jokes aside, linear algebra plays a crucial part in machine learning. Here is why!

(image credit: @xkcd, original: xkcd.com/1838/)
In essence, a machine learning model works by doing the following two things.

1. Find an alternative representation of the data.
2. Make decisions based on this representation.

Linear algebra plays a role in the representations.
Regardless of the features, data points are represented by vectors.

Finding more descriptive representations is the same as finding functions f(x) that map between vector spaces.

The simplest ones are the linear transformations given by matrices.
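To make this concrete, here is a minimal NumPy sketch. (The matrix and the vector are arbitrary examples of mine, not anything specific from the thread.)

```python
import numpy as np

# A 2x2 matrix defines a linear transformation of the plane:
# it maps every vector x to the vector A @ x.
A = np.array([[2.0, 1.0],
              [0.0, 1.0]])

x = np.array([1.0, 3.0])
print(A @ x)  # [5. 3.]
```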
Why do we love linear transformations? There are two reasons.

• They are easy to work with and fast to compute.
• Combined with simple nonlinear functions, they can create expressive models.

What is their effect on the data? We'll see this next.
Linearity means that the order of addition, scalar multiplication, and function application can be changed: f(ax + by) = af(x) + bf(y).

So, a linear transformation is determined by the images of the basis vectors.
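We can check this in NumPy with the same example matrix as before: the images of the basis vectors are exactly the columns of A, and by linearity, they determine the image of every other vector.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 1.0]])

e1 = np.array([1.0, 0.0])
e2 = np.array([0.0, 1.0])

# The images of the basis vectors are the columns of A.
print(A @ e1, A @ e2)  # [2. 0.] [1. 1.]

# By linearity, they determine the image of any vector:
# x = 1*e1 + 3*e2, so A @ x = 1*(A @ e1) + 3*(A @ e2).
x = 1.0 * e1 + 3.0 * e2
print(A @ x)                            # [5. 3.]
print(1.0 * (A @ e1) + 3.0 * (A @ e2))  # [5. 3.]
```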
We can visualize this for linear transformations on the two-dimensional plane.

As you can see, the images of the basis vectors span a parallelogram. (In degenerate cases, its sides can fall onto a single line.)
From yet another perspective, this is the same as distorting the grid determined by the basis vectors.
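If you want to draw this picture yourself, here is a small matplotlib sketch (again with an arbitrary example matrix): it plots the standard basis vectors in gray and their images, the sides of the parallelogram, in blue.

```python
import numpy as np
import matplotlib.pyplot as plt

A = np.array([[2.0, 1.0],
              [0.0, 1.0]])

# Images of the standard basis vectors: the columns of A.
u, v = A[:, 0], A[:, 1]

fig, ax = plt.subplots()
# The standard basis vectors...
ax.quiver([0, 0], [0, 0], [1, 0], [0, 1],
          angles="xy", scale_units="xy", scale=1, color="gray")
# ...and their images, spanning a parallelogram.
ax.quiver([0, 0], [0, 0], [u[0], v[0]], [u[1], v[1]],
          angles="xy", scale_units="xy", scale=1, color="tab:blue")
ax.set_xlim(-1, 3)
ax.set_ylim(-1, 2)
ax.set_aspect("equal")
plt.show()
```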
How does this help to find good representations of the data?

Think about PCA, which finds features with no redundancy. This is done by a simple linear transformation. (If you are not familiar with how PCA works, here is a thread I posted earlier.)
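To illustrate the point (this is a minimal sketch of the idea, not the exact computation from my earlier thread), here is PCA as a linear transformation, computed with NumPy's SVD on synthetic data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 5 redundant features generated from 2 latent ones.
Z = rng.normal(size=(200, 2))
X = Z @ rng.normal(size=(2, 5))

# Center the data and find the principal directions via SVD.
X_centered = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)

# Projecting onto the top two principal directions is a linear
# transformation, and the new features have no redundancy:
# their covariance matrix is diagonal.
X_2d = X_centered @ Vt[:2].T
print(np.round(np.cov(X_2d, rowvar=False), 6))
```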

So, linear transformations give rise to new features. How descriptive can these be?

For instance, in classification tasks, we want each high-level feature to represent the probability of belonging to a given class. Are linear transformations enough to express this?

Almost.
Any continuous relationship between data and class label can be approximated arbitrarily well by composing linear transformations with certain nonlinear functions (such as the sigmoid or ReLU).

This is formally expressed by the Universal Approximation Theorem.
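Here is what such a composition looks like in code: a minimal one-hidden-layer network in NumPy. The weights below are random placeholders; in practice, they are learned from data.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(16, 2)), np.zeros(16)  # placeholder weights
W2, b2 = rng.normal(size=(1, 16)), np.zeros(1)

def model(x):
    h = relu(W1 @ x + b1)        # linear map + nonlinearity: new features
    return sigmoid(W2 @ h + b2)  # linear map + sigmoid: class probability

print(model(np.array([1.0, 3.0])))  # a value in (0, 1)
```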

This is why machine learning is just a pile of linear algebra, stirred until it looks right. (Not just according to XKCD.)

In summary, linear transformations are
• simple to work with,
• fast to compute,
• and, combined with simple nonlinearities, the building blocks of powerful models.
If you enjoy my explanations of math in machine learning, I regularly post them on my blog as well. Here is the written version of this thread!

tivadardanka.com/blog/linear-al…

