This clay tablet from 1800-1600 BC shows that ancient Babylonians were able to approximate the square root of two with 99.9999% accuracy.
How did they do it?
First, let’s decipher the tablet itself. It is called YBC 7289 (short for the 7289th item in the Yale Babylonian Collection), and it depicts a square, its diagonal, and numbers written around them.
Here is a stylized version.
As the Pythagorean theorem implies, the diagonal’s length for a unit square is √2. Let’s focus on the symbols there!
These are numbers, written in Babylonian cuneiform numerals. They read as 1, 24, 51, and 10.
Since the Babylonians used the base-60 numeral system (also known as sexagesimal), the number 1.24 51 10 stands for 1 + 24/60 + 51/60² + 10/60³, which reads as 1.41421296296... in decimal.
This matches √2 = 1.41421356... in its first six digits, meaning an accuracy of about 99.9999%!
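To check the conversion yourself, here is a minimal Python sketch. The function name sexagesimal_to_fraction is made up for illustration, and the sketch assumes the first digit is the integer part, with every following digit one more power of 60 down.

```python
from fractions import Fraction
import math

def sexagesimal_to_fraction(digits):
    """Interpret digits as d0 + d1/60 + d2/60**2 + ... using exact fractions.

    Assumption: the first digit is the integer part.
    """
    return sum(Fraction(d, 60**k) for k, d in enumerate(digits))

# The four numbers read off the diagonal of the tablet.
tablet_value = sexagesimal_to_fraction([1, 24, 51, 10])

# Relative error compared to the floating-point value of sqrt(2).
relative_error = abs(float(tablet_value) - math.sqrt(2)) / math.sqrt(2)

print(float(tablet_value))
print(relative_error)
```

The relative error comes out below one in a million, which is where the 99.9999% figure comes from.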
The computational accuracy is stunning. To appreciate this, pick up a pen and try to reproduce this without a calculator. It’s not that easy!
Here is how the ancient Babylonians did it.
We start by picking a number x₀ between 1 and √2. I know, this feels random, but let’s just roll with it for now. One such example is 1.2, which is going to be our first approximation.
Because x₀ is smaller than √2, dividing 2 by it gives 2/x₀ > 2/√2 = √2. In other words, 2/x₀ is larger than √2.
Thus, the interval [x₀, 2/x₀] envelops √2.
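Since √2 lies somewhere inside the interval [x₀, 2/x₀], its midpoint is a natural candidate for a better approximation. For x₀ = 1.2, the interval is roughly [1.2, 1.6667], and the midpoint is about 1.4333. As you can see in the figure below, this is significantly better!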
Let's define x₁ as this midpoint: x₁ = (x₀ + 2/x₀) / 2.
Continuing on this thread, we can define an approximating sequence by taking the midpoints of such intervals: xₙ₊₁ = (xₙ + 2/xₙ) / 2.
Here are the first few terms of the sequence. Even the third member is a surprisingly good approximation.
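If you want to reproduce the terms, the midpoint rule fits in a few lines of Python. This is only a sketch: the function name babylonian_sqrt2 is hypothetical, and it uses floating-point arithmetic, which is certainly not how the Babylonians computed.

```python
import math

def babylonian_sqrt2(x0, steps):
    """Return [x0, x1, ..., x_steps], where each new term is the
    midpoint of the interval between x and 2/x."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append((x + 2 / x) / 2)
    return xs

# Start from the first approximation used above, x0 = 1.2.
for n, x in enumerate(babylonian_sqrt2(1.2, 4)):
    print(n, x, abs(x - math.sqrt(2)))
```

The last printed column is the distance from √2, and it shrinks very quickly from one step to the next.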
If we put these numbers on a scatterplot, we practically need a microscope to tell the difference from √2 after a few steps.
Were the Babylonians just lucky, or did they hit the nail right on the head?
The most important concept in probability and statistics: the expected value
For instance, all the popular loss functions in machine learning, like cross-entropy, are expected values. However, the definition of the expected value is far from intuitive.
Here is what's behind the scenes:
It's better to start with an example.
So, let's play a simple game! The rules: I’ll toss a coin, and if it comes up heads, you win $1. However, if it is tails, you lose $2.
Should you even play this game with me? We’ll find out.
After n rounds, your total earnings equal the number of heads times $1 minus the number of tails times $2.
If we divide total earnings by n, we obtain your average earnings per round.
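To get a feel for the average earnings per round, we can simulate the game. The sketch below assumes a fair coin (the rules above don't say so explicitly), and the function name average_earnings and the fixed seed are just illustrative choices.

```python
import random

def average_earnings(n, seed=0):
    """Play n rounds and return the average earnings per round.

    Assumption: the coin is fair. Heads wins $1, tails loses $2.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(n))
    tails = n - heads
    total = heads * 1 - tails * 2
    return total / n

for n in (10, 1_000, 100_000):
    print(n, average_earnings(n))
```

For small n the average jumps around, but as n grows it should settle down, which hints at whether the game is worth playing.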
The main reason math is considered difficult: proofs.
Reading and writing proofs are hard, but you cannot get away without them. The best way to learn is to do.
So, let's deconstruct the proof of the most famous mathematical result: the Pythagorean theorem.
Here it is in its full glory.
Theorem. (The Pythagorean theorem.) Let ABC be a right triangle, let a and b be the lengths of its two legs, and let c be the length of its hypotenuse.
Then a² + b² = c².
Now, the proof. Mathematical proofs often feel like pulling a rabbit out of a hat. I’ll go a bit overboard and start by pulling out two rabbits.
The first rabbit. Take a look at the following picture.
The depicted square’s side is a + b long, so its area is (a + b)².
One of my favorite formulas is the closed form of the geometric series.
I am amazed by its ubiquity: whether we are solving basic problems or pushing the boundaries of science, the geometric series often makes an appearance.
Here is how to derive it from first principles:
Let’s start with the basics: like any other series, the geometric series is the limit of its partial sums.
Our task is to find that limit.
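Before hunting for the limit, it helps to look at the partial sums numerically. Here is a small sketch, assuming the series starts at the zeroth power, so that the N-th partial sum is 1 + q + q² + ... + q^N. The name partial_sum and the choice q = 0.5 are mine, picked for illustration.

```python
def partial_sum(q, N):
    """Return S_N = 1 + q + q**2 + ... + q**N, added term by term."""
    return sum(q**n for n in range(N + 1))

q = 0.5  # illustrative ratio with |q| < 1
for N in (1, 5, 10, 20, 50):
    print(N, partial_sum(q, N))
```

For q = 0.5 the printed values creep up toward 2, so there is a limit to find.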
There is an issue: the number of terms depends on N.