A lot of machine learning is pretty dry and boring, but understanding how autoencoders work feels different.
This is a thread about autoencoders, what they can do, and a pretty cool example.
↓ 1/10
Autoencoders are lossy data-compression algorithms built out of neural networks.
One network (the encoder) compresses the original input into a compact intermediate representation, and a second network (the decoder) reverses the process to reconstruct an approximation of the original input (rough sketch below).
↓ 2/10
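Here's roughly what that looks like in code. This is a minimal sketch using Keras; the layer sizes, the 32-dimensional bottleneck, and the MNIST-style 784-dimensional input are my own illustrative assumptions, not details from the thread.

```python
from tensorflow import keras
from tensorflow.keras import layers

input_dim = 784   # e.g. a flattened 28x28 grayscale image (illustrative)
latent_dim = 32   # size of the compressed intermediate representation

# Encoder: compresses the input down to the intermediate representation.
encoder = keras.Sequential([
    keras.Input(shape=(input_dim,)),
    layers.Dense(128, activation="relu"),
    layers.Dense(latent_dim, activation="relu"),
])

# Decoder: reverses the process, reconstructing an approximation of the input.
decoder = keras.Sequential([
    keras.Input(shape=(latent_dim,)),
    layers.Dense(128, activation="relu"),
    layers.Dense(input_dim, activation="sigmoid"),
])

# The autoencoder chains the two and is trained to reproduce its own input.
autoencoder = keras.Sequential([encoder, decoder])
autoencoder.compile(optimizer="adam", loss="binary_crossentropy")

# Training pairs each input with itself as the target, e.g.:
# autoencoder.fit(x_train, x_train, epochs=20, batch_size=256)
```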
The encoding process "generalizes" the input data.
I like to think of the encoder as the Summarizer in Chief of the network: its whole job is to represent each input as compactly as possible, so the decoder can do a decent job of reconstructing the original data.
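To make the "Summarizer in Chief" idea concrete, here's how the models from the sketch above would be used once trained. The random batch is just a placeholder for real data, and the shapes assume the illustrative 784 → 32 setup from the previous block.

```python
import numpy as np

# Stand-in data; in practice this would be real inputs scaled to [0, 1].
x = np.random.rand(16, input_dim).astype("float32")

codes = encoder.predict(x)       # shape (16, 32): the compact "summary"
x_hat = decoder.predict(codes)   # shape (16, 784): lossy reconstruction

print(codes.shape, x_hat.shape)
```

Note the trade-off: each 32-number code is roughly 24x smaller than its input, which is exactly why the reconstruction can only ever be approximate.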