Tivadar Danka

10 May, 12 tweets, 4 min read

Creative abuse of rules can lead to game-changing discoveries.

In high school, you learned that -1 has no square roots. Yet, by ignoring this, you'll soon discover something that changed mathematics forever: complex numbers.

Follow along, and you'll see how!

🧵 👇🏽

Let's start with a very simple equation:

𝑥² + 1 = 0

Can we solve this? Not at first glance, since for every real 𝑥, the left side of the equation is at least one. Solving it is equivalent to solving

𝑥² = -1,

which is (apparently) not possible.

But let's disregard this and imagine a number whose square is -1.

Let's appropriately name it the 𝑖𝑚𝑎𝑔𝑖𝑛𝑎𝑟𝑦 𝑛𝑢𝑚𝑏𝑒𝑟 and denote it with 𝑖.

So, 𝑖² = -1.

Now that we have this strange entity, what can we do?

Can we add or multiply 𝑖 with real numbers?

Sure, why not! We can compose a new number by taking the linear combination of 𝑖 with real numbers. So, we can form new ones by

𝑧 = 𝑎 + 𝑏𝑖,

where 𝑎 and 𝑏 are real numbers. Let's call 𝑎 the real part and 𝑏 the imaginary part.

We can perform addition and multiplication with these composite numbers by following elementary algebraic rules.
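To make these rules concrete, here is a minimal Python sketch. It stores 𝑎 + 𝑏𝑖 as the pair (a, b); the helper names `add` and `multiply` are made up for illustration. The only ingredient beyond ordinary algebra is replacing 𝑖² with -1.

```python
def add(z, w):
    """Add two numbers given as (real part, imaginary part) pairs."""
    a, b = z
    c, d = w
    return (a + c, b + d)


def multiply(z, w):
    """Multiply (a + bi)(c + di), using i*i = -1."""
    a, b = z
    c, d = w
    # ac + ad*i + bc*i + bd*i^2 = (ac - bd) + (ad + bc)i
    return (a * c - b * d, a * d + b * c)


i = (0, 1)
i_squared = multiply(i, i)            # the pair (-1, 0), that is, -1
total = add((1, 2), (3, -1))          # (1 + 2i) + (3 - i)
product = multiply((1, 2), (3, -1))   # (1 + 2i)(3 - i)

# Python's built-in complex type follows the same rules (j plays the role of i).
builtin_product = (1 + 2j) * (3 - 1j)
```

Python's built-in `complex` type implements the same arithmetic, so `builtin_product` should agree with the hand-rolled `product`.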

Let's name our creation the 𝑐𝑜𝑚𝑝𝑙𝑒𝑥 𝑛𝑢𝑚𝑏𝑒𝑟𝑠.

In the literal sense, there is nothing imaginary about complex numbers. The definition is, although quite mysterious, perfectly valid.

Also, note that the set of real numbers is a subset of the complex numbers: every real 𝑎 can be written as 𝑎 + 0𝑖.

To make reasoning about complex numbers easier, we can represent them as vectors in the Cartesian plane.

Every 𝑧 = 𝑎 + 𝑏𝑖 can be represented as the vector (𝑎, 𝑏).

The x and y axes are called the real and imaginary axes.
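A short sketch of the vector picture, using Python's built-in `complex` type. The helper `as_vector` is a hypothetical name chosen for this example. Adding complex numbers matches adding vectors component by component, and `abs` gives the length of the vector.

```python
def as_vector(z):
    """Return the (real, imaginary) coordinates of a complex number."""
    return (z.real, z.imag)


z = 3 + 4j
w = 1 - 2j

# Adding complex numbers is the same as adding their vectors componentwise.
sum_vector = as_vector(z + w)
componentwise = (z.real + w.real, z.imag + w.imag)

# abs() returns the Euclidean length of the vector (3, 4).
length = abs(z)
```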

Why are complex numbers important?

Now, not only does 𝑥² + 1 = 0 have solutions, but every nonconstant polynomial equation with complex coefficients has at least one solution in the set of complex numbers.

(Polynomial equations are the ones of the form 𝑎ₙ𝑥ⁿ + 𝑎ₙ₋₁𝑥ⁿ⁻¹ + … + 𝑎₁𝑥 + 𝑎₀ = 0.)
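For degree two, this can be checked by hand with the quadratic formula. Here is a minimal sketch, assuming we only care about quadratics; `solve_quadratic` is an illustrative helper, and `cmath.sqrt` is the standard-library square root that accepts negative and complex inputs.

```python
import cmath


def solve_quadratic(a, b, c):
    """Return both roots of a*x**2 + b*x + c = 0 (a != 0), possibly complex."""
    root_of_discriminant = cmath.sqrt(b * b - 4 * a * c)
    return ((-b + root_of_discriminant) / (2 * a),
            (-b - root_of_discriminant) / (2 * a))


roots_of_x2_plus_1 = solve_quadratic(1, 0, 1)   # x^2 + 1 = 0
other_roots = solve_quadratic(1, 2, 5)          # x^2 + 2x + 5 = 0
```

Plugging the returned values back into the equation should give zero up to floating-point rounding.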

Believe it or not, this seemingly innocent fact makes complex numbers an ideal mathematical structure for many purposes.

For instance, every square complex matrix has an eigenvalue, which is not true for real matrices if only real eigenvalues are allowed. This plays a significant role in linear algebra.
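A small example of a real matrix without real eigenvalues is the 90-degree rotation of the plane. The sketch below is limited to 2×2 matrices and finds the eigenvalues from the characteristic polynomial λ² − trace·λ + det = 0; the function name `eigenvalues_2x2` is an assumption made for this illustration.

```python
import cmath


def eigenvalues_2x2(m):
    """Eigenvalues of a 2x2 matrix via lambda^2 - trace*lambda + det = 0."""
    (a, b), (c, d) = m
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return ((trace + root) / 2, (trace - root) / 2)


rotation = ((0, -1), (1, 0))             # rotates the plane by 90 degrees
eigenvalues = eigenvalues_2x2(rotation)  # i and -i, neither of them real

# Check the eigenvector equation A v = lambda v for lambda = i, v = (1, -i).
lam = eigenvalues[0]
v = (1, -1j)
Av = (rotation[0][0] * v[0] + rotation[0][1] * v[1],
      rotation[1][0] * v[0] + rotation[1][1] * v[1])
lam_v = (lam * v[0], lam * v[1])
```

The characteristic polynomial here is λ² + 1, the very equation the thread started with.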

Without complex numbers, the Fourier transform wouldn't exist either.
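To see where complex numbers enter, here is a naive sketch of the discrete Fourier transform, the sampled counterpart of the Fourier transform. The formula X[k] = Σ x[n]·exp(−2πi·k·n/N) is the standard definition; the function name `dft` and the test signal are chosen for this example.

```python
import cmath


def dft(samples):
    """Naive discrete Fourier transform of a list of numbers."""
    n_samples = len(samples)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * n / n_samples)
            for n, x in enumerate(samples))
        for k in range(n_samples)
    ]


signal = [1, 0, -1, 0]   # one period of a cosine, sampled at 4 points
spectrum = dft(signal)   # nonzero only at k = 1 and its mirror k = 3
```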

We would be unable to adequately study certain mechanical systems, even simple ones like pendulums.

As I mentioned in the beginning, creative abuse of rules can lead to game-changing discoveries.

Complex numbers arose when someone dared to challenge long-standing views and went outside the box.

Complex numbers changed science forever. The rest is history.

https://twitter.com/TivadarDanka/status/1391743582445375495

If you enjoyed this explanation, consider following me and hitting a like/retweet on the first tweet of the thread!

I regularly post simple explanations of seemingly complicated concepts in machine learning. Make sure you don't miss out on the next one!

https://twitter.com/TivadarDanka/status/1391743582445375495

More from @TivadarDanka

Tivadar Danka

@TivadarDanka

12 May

What you see below is a 2D representation of the MNIST dataset.

It was produced by t-SNE, a completely unsupervised algorithm. The labels were unknown to it, yet it almost perfectly separates the classes. The result is amazing.

This is how the magic is done!

🧵 👇🏽

Even though real-life datasets can have several thousand features, often the data itself lies on a lower-dimensional manifold.

Dimensionality reduction aims to find these manifolds to simplify data processing down the line.

So, we have data points 𝑥ᵢ in a high-dimensional space, looking for lower dimensional representations 𝑦ᵢ.

We want the 𝑦ᵢ-s to preserve as many properties of the original as possible.

For instance, if 𝑥ᵢ is close to 𝑥ⱼ, we want 𝑦ᵢ to be close to 𝑦ⱼ as well.

Read 15 tweets

Tivadar Danka

@TivadarDanka

11 May

There is a mathematical formula so beautiful that it is almost unbelievable.

Euler's identity combines the famous numbers 𝑒, 𝑖, π, 0, and 1 in a single constellation. At first sight, most people doubt that it is true. Surprisingly, it is.

This is why.

🧵 👇🏽

Let's talk about the famous exponential function 𝑒ˣ first.

Have you ever thought about how this is calculated in practice? After all, raising an irrational number to any power is not trivial.

It turns out that the function can be written as an infinite sum!

In fact, this can be done with many other functions.

For those that are differentiable infinitely many times, there is a recipe to find the infinite sum form. This form is called the Taylor expansion.

It does not always yield the original function, but it works for 𝑒ˣ.
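Here is a minimal sketch of that infinite sum for 𝑒ˣ, cut off after a fixed number of terms. The series 1 + 𝑥 + 𝑥²/2! + 𝑥³/3! + … is the standard Taylor expansion; the name `exp_series` and the cutoff of 30 terms are choices made for this example. The same sum also accepts complex inputs, which lets us check Euler's identity 𝑒^(𝑖π) + 1 = 0 numerically.

```python
import math


def exp_series(x, terms=30):
    """Partial sum of the Taylor series 1 + x + x^2/2! + ... for e^x."""
    total = 0
    term = 1                       # x^0 / 0!
    for n in range(terms):
        total += term
        term = term * x / (n + 1)  # turn x^n/n! into x^(n+1)/(n+1)!
    return total


approx_e = exp_series(1)                       # close to math.e
euler_residual = exp_series(1j * math.pi) + 1  # close to 0
```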

Read 9 tweets

Tivadar Danka

@TivadarDanka

7 May

One of the biggest misconceptions regarding education is that its main purpose is to give knowledge you can immediately use.

It is not.

The best thing education can give you is the mental agility to obtain knowledge at the speed of light.

Let's unpack this idea a bit!

1/7

Consider a course where you build a custom neural network framework with NumPy.

This is hardly usable in practice: working with a custom library is insane.

However, if you know how they are built, you only need to learn the interface to master an actual framework!

2/7

By understanding how the framework is built and how the underlying algorithms work, you'll be able to do much more: experiment with custom optimizers, implement your own layers, etc.

3/7

Read 7 tweets

Tivadar Danka

@TivadarDanka

5 May

An exciting result came out from @GoogleAI recently, which raises several questions about how deep network architectures should be.

Here is their announcement, including a very interesting post. I would like to unpack this a bit.

https://twitter.com/GoogleAI/status/1389680797985161218

Suppose that you have a trained network and a set of samples 𝑋. You take this data and run it through the network, storing all intermediate results.

The output of the 𝑖-th layer is denoted by 𝑋ᵢ. These encode the intermediate internal representations of the data.

In general, the further you go, the higher level these representations become.

For a convolutional network, filters in earlier layers detect edges, while later activations represent objects.

Check the fantastic article below for more details!

distill.pub/2017/feature-v…

Read 8 tweets

Tivadar Danka

@TivadarDanka

28 Apr

Principal Component Analysis is one of the most fundamental techniques in data science.

Despite its simplicity, it has several equivalent forms that you might not have seen.

In this thread, we'll explore what PCA is really doing!

🧵 👇🏽

PCA is most commonly introduced as an algorithm that iteratively finds vectors in the feature space that

• are orthogonal to the previously identified vectors,
• and maximize the variance of the data projected onto them.

These vectors are called the principal components.
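For two-dimensional data, the result can be sketched in a few lines. Note the swap: instead of the iterative search described above, this sketch relies on the equivalent formulation in which the first principal component is the leading eigenvector of the covariance matrix, and for a symmetric 2×2 matrix its angle has a closed form. The function names and the sample points are made up for illustration.

```python
import math


def principal_components_2d(points):
    """Return the two principal directions (unit vectors) of 2-D points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Entries of the covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / n
    syy = sum((y - mean_y) ** 2 for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n
    # Angle of the leading eigenvector of a symmetric 2x2 matrix.
    angle = 0.5 * math.atan2(2 * sxy, sxx - syy)
    first = (math.cos(angle), math.sin(angle))
    second = (-first[1], first[0])   # orthogonal to the first
    return first, second


def projected_variance(points, direction):
    """Variance of the points after projecting onto a unit direction."""
    projections = [x * direction[0] + y * direction[1] for x, y in points]
    mean = sum(projections) / len(projections)
    return sum((p - mean) ** 2 for p in projections) / len(projections)


points = [(0, 0.1), (1, 0.9), (2, 2.1), (3, 2.9)]   # roughly along y = x
first, second = principal_components_2d(points)
```

Projecting onto `first` should give a larger variance than projecting onto `second` or onto either coordinate axis.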

The idea behind this is that we want features that convey as much information as possible.

Low variance means that the feature is more concentrated, so it is easier to predict its value in principle.

Features with low enough variances can even be omitted.

Read 10 tweets

Tivadar Danka

@TivadarDanka

27 Apr

Have you ever wondered why we include the logarithm in the definition of log-likelihood?

The answer is simple: the logarithm makes differentiation of products easier.

Let's see why!

🧵 👇🏽

Although the derivative of a sum is the sum of derivatives, a similar property cannot be stated about the product of functions.

The derivative of a product is slightly more complicated: it is a sum of products.

The formula gets even more complicated when we have more functions in the product.

When potentially hundreds of terms are present, like in the likelihood function, computing this is not feasible.
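A minimal sketch of what the logarithm buys us, using a hypothetical coin-flip model (independent flips, probability `theta` of heads) that is not from the thread. The likelihood is a product with one factor per flip; its logarithm is a sum, so the derivative can be taken term by term. A finite-difference check confirms the term-by-term formula.

```python
import math


def log_likelihood(theta, flips):
    """Log-likelihood of independent coin flips (1 = heads), P(heads) = theta."""
    return sum(math.log(theta if flip else 1 - theta) for flip in flips)


def log_likelihood_derivative(theta, flips):
    """Derivative in theta, computed one term of the sum at a time."""
    return sum(1 / theta if flip else -1 / (1 - theta) for flip in flips)


flips = [1, 0, 1, 1, 0, 1, 1, 1]   # 6 heads, 2 tails
theta = 0.6
h = 1e-6

# Central finite difference as an independent check of the derivative.
numeric = (log_likelihood(theta + h, flips)
           - log_likelihood(theta - h, flips)) / (2 * h)
analytic = log_likelihood_derivative(theta, flips)
```

The derivative vanishes at `theta = 0.75`, the fraction of heads in the sample, which is where this log-likelihood is maximal.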

Read 6 tweets
