Tivadar Danka
Jul 14 · 28 tweets · 7 min read
"Probability is the logic of science."

There is a deep truth behind this conventional wisdom: probability is the mathematical extension of logic, augmenting our reasoning toolkit with the concept of uncertainty.

In-depth exploration of probabilistic thinking incoming.
Our journey ahead has three stops:

1. an introduction to mathematical logic,
2. a touch of elementary set theory,
3. and finally, understanding probabilistic thinking.

First things first: mathematical logic.
In logic, we work with propositions.

A proposition is a statement that is either true or false, like
• "it's raining outside",
• or "the sidewalk is wet".

These are often abbreviated as variables, such as A = "it's raining outside".
We can formulate complex propositions from smaller building blocks with logical connectives.

Consider the proposition "if it is raining outside, then the sidewalk is wet". This is the combination of two propositions, connected by the implication connective.
There are four essential connectives:

• NOT (¬), also known as negation,
• AND (∧), also known as conjunction,
• OR (∨), also known as disjunction,
• THEN (→), also known as implication.
Connectives are defined by the truth values of the resulting propositions. For instance, if A is true, then NOT A is false; if A is false, then NOT A is true.

Denoting true by 1 and false by 0, we can describe connectives with truth tables. Here is the one for negation (¬):

A | ¬A
1 | 0
0 | 1
AND (∧) and OR (∨) connect two propositions: A ∧ B is true if both A and B are true, while A ∨ B is true if at least one of them is.

A B | A ∧ B  A ∨ B
1 1 |   1      1
1 0 |   0      1
0 1 |   0      1
0 0 |   0      0
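These truth tables are easy to generate programmatically. Here is a minimal Python sketch (the function names are my own, not from any standard library):

```python
from itertools import product

# The four essential connectives, with 1 = true and 0 = false.
def NOT(a):
    return 1 - a

def AND(a, b):
    return a & b

def OR(a, b):
    return a | b

def THEN(a, b):
    return OR(NOT(a), b)  # A → B is equivalent to (¬A) ∨ B

# Print the combined truth table for all four connectives.
print("A B | ¬A  A∧B  A∨B  A→B")
for a, b in product([1, 0], repeat=2):
    print(f"{a} {b} |  {NOT(a)}   {AND(a, b)}    {OR(a, b)}    {THEN(a, b)}")
```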
The implication connective THEN (→) formalizes the deduction of a conclusion B from a premise A.

By definition, A → B is false in exactly one case: when A is true but B is false. In every other case, it is true.

A B | A → B
1 1 |   1
1 0 |   0
0 1 |   1
0 0 |   1

An example: if "it's raining outside", THEN "the sidewalk is wet".
Science is, in essence, a collection of complex propositions like "if X is a closed system, THEN the entropy of X cannot decrease", as the second law of thermodynamics states.

The entire body of scientific knowledge is made of A → B propositions.
In practice, our thinking process is the following.

"I know that A → B is true and A is true. Therefore, B must be true as well."

This is called modus ponens, the cornerstone of scientific reasoning.
(If you don't see why modus ponens works, recall the definition of →: A → B is false only when A is true and B is false. So if both A → B and A are true, B must be true as well.)
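Modus ponens can even be verified exhaustively, by checking every possible truth assignment. A minimal sketch:

```python
from itertools import product

def implies(a, b):
    # A → B is false only when A is true and B is false.
    return (not a) or b

# Modus ponens: whenever A → B holds and A holds, B must hold too.
for a, b in product([True, False], repeat=2):
    if implies(a, b) and a:
        assert b
print("modus ponens holds for every truth assignment")
```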
Logical connectives can be translated to the language of sets. Union (∪) and intersection (∩), two fundamental operations, are particularly relevant for us.

Notice how similar the symbols for AND (∧) and intersection (∩) are? This is not an accident.
By definition, an element x belongs to A ∩ B if and only if (x is an element of A) AND (x is an element of B).

Similarly, union corresponds to the OR connective: x belongs to A ∪ B if and only if (x is an element of A) OR (x is an element of B).

What's most important for us is that the implication connective THEN (→) corresponds to the "subset of" relation, denoted by the ⊆ symbol: A ⊆ B means that whenever x is an element of A, x is also an element of B.
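This correspondence can be checked directly with Python's built-in sets; the sets A, B, and C below are arbitrary examples:

```python
# Two arbitrary example sets.
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}
universe = A | B

# Intersection ↔ AND: x ∈ A ∩ B iff (x ∈ A) AND (x ∈ B)
assert A & B == {x for x in universe if x in A and x in B}

# Union ↔ OR: x ∈ A ∪ B iff (x ∈ A) OR (x ∈ B)
assert A | B == {x for x in universe if x in A or x in B}

# Subset ↔ THEN: C ⊆ A iff every element of C is also an element of A
C = {1, 2}
assert C <= A and all(x in A for x in C)

print("intersection, union, and subset match AND, OR, and THEN")
```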
Now that we understand how to formulate scientific truths as "premise → conclusion" statements and see how this translates to sets, we are finally ready to talk about probability.

What is the biggest flaw of mathematical logic?
We rarely have all the information needed to decide whether a proposition is true or false.

Consider the proposition "it'll rain tomorrow". During the rainy season, all we can say is that rain is likely, but tomorrow may turn out sunny as well.
Probability theory generalizes classical logic by measuring truth on a scale between 0 and 1, where 0 is false and 1 is true.

If the probability of rain tomorrow is 0.9, it means that rain is significantly more likely, but not absolutely certain.
Instead of propositions, probability operates on events, and events are represented by sets.

For example, if I roll a die, the event "the result is less than five" is represented by the set A = {1, 2, 3, 4}.

In fact, P(A) = 4/6 = 2/3. (P denotes the probability of an event.)
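For a fair die, this probability is just a ratio of counts. A minimal sketch:

```python
from fractions import Fraction

# Sample space of a fair six-sided die; every outcome is equally likely.
omega = {1, 2, 3, 4, 5, 6}

def P(event):
    """Classical probability: favorable outcomes over total outcomes."""
    return Fraction(len(event & omega), len(omega))

A = {1, 2, 3, 4}  # "the result is less than five"
print(P(A))       # 2/3
```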
As discussed earlier, the logical connectives AND and OR correspond to basic set operations: AND is intersection, OR is union.

This translates to probabilities as well: P(A AND B) = P(A ∩ B), and P(A OR B) = P(A ∪ B) = P(A) + P(B) − P(A ∩ B).
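Continuing the die example, we can check the inclusion-exclusion identity P(A ∪ B) = P(A) + P(B) − P(A ∩ B) numerically; the event B here is an arbitrary illustration:

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}  # fair six-sided die

def P(event):
    # Classical probability: favorable outcomes over total outcomes.
    return Fraction(len(event & omega), len(omega))

A = {1, 2, 3, 4}  # "the result is less than five"
B = {2, 4, 6}     # "the result is even"

# AND is intersection, OR is union; inclusion-exclusion ties them together.
assert P(A | B) == P(A) + P(B) - P(A & B)
print(P(A & B), P(A | B))  # 1/3 5/6
```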
How can probability be used to generalize the logical implication?

A "probabilistic A → B" should represent the likelihood of B, given that A is observed.

This is formalized by conditional probability: P(B | A) = P(A ∩ B) / P(A), defined whenever P(A) > 0.
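With the same die, conditional probability amounts to renormalizing the counts to the observed event. A sketch with arbitrary example events:

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}  # fair six-sided die

def P(event):
    return Fraction(len(event & omega), len(omega))

def P_cond(b, a):
    """P(B | A) = P(A ∩ B) / P(A), assuming P(A) > 0."""
    return P(a & b) / P(a)

A = {1, 2, 3, 4}  # "the result is less than five"
B = {2, 4, 6}     # "the result is even"

# Among the observed outcomes {1, 2, 3, 4}, exactly half are even.
print(P_cond(B, A))  # 1/2
```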
At the deepest level, the conditional probability P(B | A) is the mathematical formulation of our belief in the hypothesis B, given the empirical evidence A.

A high P(B | A) makes B more likely to happen when A is observed; a low P(B | A) makes B less likely.

This is why probability is called the logic of science.
To give you a concrete example, let's go back to the one mentioned earlier: the rain and the wet sidewalk. For simplicity, denote the events by

A = "the sidewalk is wet",
B = "it's raining outside".
The sidewalk can be wet for many reasons, say the neighbor just watered the lawn. Yet, the primary cause of a wet sidewalk is rain, so P(B | A) is close to 1.

If somebody comes in and tells you that the sidewalk is wet, it is safe to infer rain.
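This inference can be made quantitative with Bayes' theorem, P(B | A) = P(A | B) · P(B) / P(A). The numbers below are invented purely for illustration:

```python
# Hypothetical numbers, chosen only to illustrate the inference.
p_rain = 0.3                 # prior P(B): it rains on 30% of days
p_wet_given_rain = 0.95      # P(A | B): rain almost always wets the sidewalk
p_wet_given_no_rain = 0.10   # P(A | ¬B): e.g. the neighbor watered the lawn

# Law of total probability: P(A)
p_wet = p_wet_given_rain * p_rain + p_wet_given_no_rain * (1 - p_rain)

# Bayes' theorem: P(B | A) = P(A | B) · P(B) / P(A)
p_rain_given_wet = p_wet_given_rain * p_rain / p_wet

print(round(p_rain_given_wet, 3))  # ≈ 0.803: a wet sidewalk makes rain likely
```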
Probabilistic inference like the above is the foundation of machine learning.

For instance, the output of (most) classification models is a distribution of class probabilities, conditioned on the observation.
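For example, a classifier with a softmax output layer turns raw scores into exactly such a conditional distribution over classes. The logits below are made up for illustration:

```python
import math

def softmax(logits):
    """Turn raw model scores into a probability distribution over classes."""
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for three classes, given one observation.
logits = [2.0, 1.0, 0.1]
probs = softmax(logits)

print([round(p, 3) for p in probs])  # the probabilities sum to 1
```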
To wrap up, here is how Maxwell — the famous physicist — thinks about probability.

"The actual science of logic is conversant at present only with things either certain, impossible, or entirely doubtful, none of which (fortunately) we have to reason on."
"Therefore the true logic for this world is the calculus of Probabilities, which takes account of the magnitude of the probability which is, or ought to be, in a reasonable man's mind." — James Clerk Maxwell

By now, you can fully understand what Maxwell meant.
If you liked this thread, you will love The Palindrome, my weekly newsletter on Mathematics and Machine Learning.

Join 19,000+ curious readers here: thepalindrome.org
