I've been a vocal opponent of the "neural networks are brain simulations" analogy, not because it's *wrong* but because I believe it's harmful for beginners.

I want to propose an alternative analogy for approaching deep learning from a dev background.

Think about detecting a face in an image.

How would you even start to write a program for that?

You know it's gonna have something to do with finding a "nose" and two "eyes", but how can you go from an array of pixels to something that looks like an eye, in whatever position?
Now, suppose you have access to thousands of faces and non-faces.

How does that change the problem?

Instead of thinking in the problem domain (finding faces) you can now take a leap upwards in abstraction, and think in the meta-problem domain (finding face finders).
How on Earth do you do that?

Well, the same way you find anything in computer science.

You have a (potentially infinite) collection of objects, you iterate through them in some smart order, and compare them to some object of reference.
We need two things:

- A way to describe the collection of all potential "face finders", i.e. all possible algorithms that go from pixels to a boolean.
- A way to efficiently search in this collection.

Here's where neural networks enter the picture.
We don't know what the exact program that detects faces looks like, but *if that program exists*, it's gonna have a bunch of IF/ELSEs involving a bunch of pixels.

So we are gonna assume there is some magic program that takes pixels, does a bunch of math with them, and outputs a bool.
Now, we make a meta-program, kind of a template, as complicated as we can, that can be instantiated as many different specific programs, one of which is hopefully our face detector.

This program is a huge method full of statements like:

"if pixel x_i * w_j > 0 then..."
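To make that concrete, here's a toy sketch of the idea (every name and number here is made up for illustration, and a real detector needs far more structure): a higher-order function that, given weights, instantiates one specific pixels-to-bool program from the template.

```python
# A toy "meta-program": a template classifier whose concrete behavior
# depends entirely on the weight values we plug in.

def make_detector(weights):
    # Instantiate the template with one specific set of weights,
    # producing one specific program: pixels -> bool.
    def detector(pixels):
        score = sum(x * w for x, w in zip(pixels, weights))
        return score > 0  # the "if pixel x_i * w_j > 0 then..." part
    return detector

# Two different weight assignments = two different specific programs.
lenient = make_detector([1.0, 1.0, 1.0])
picky = make_detector([-1.0, 2.0, -1.0])

print(lenient([0.2, 0.5, 0.9]))  # True
print(picky([0.9, 0.1, 0.9]))    # False
```

Same template, completely different behavior, and the only thing that changed was the weights.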
Now you can see that, if this meta-program is general enough, there should be a way to select suitable values for all w_j that result in this program becoming a face detector.

We just have to search among all ways of assigning values to w_j for the best program.
And the best program is of course the one that produces the smallest error on our example set of faces and non-faces.
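Here's what the dumbest possible version of that search looks like (the "faces" are made-up 3-pixel toy examples, just to show the shape of the idea): try lots of weight assignments, keep whichever instantiation makes the fewest mistakes on the example set.

```python
import random

# Toy "example set": 3-pixel images labeled face (True) / non-face (False).
# Entirely made-up data, purely to illustrate the search.
examples = [
    ([0.9, 0.1, 0.9], True),
    ([0.8, 0.2, 0.7], True),
    ([0.1, 0.9, 0.2], False),
    ([0.2, 0.8, 0.1], False),
]

def error(weights):
    # How many examples does this instantiation of the template get wrong?
    wrong = 0
    for pixels, label in examples:
        score = sum(x * w for x, w in zip(pixels, weights))
        if (score > 0) != label:
            wrong += 1
    return wrong

# Naive random search over weight assignments: try many, keep the best.
random.seed(0)
best_w, best_err = None, float("inf")
for _ in range(1000):
    w = [random.uniform(-1, 1) for _ in range(3)]
    if error(w) < best_err:
        best_w, best_err = w, error(w)

print(best_err)  # hopefully 0 on this tiny dataset
```

On four examples and three weights this works; on millions of pixels and weights, random search is hopeless, which is exactly why the next step matters.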

Now, instead of using random or exhaustive search, if we play a little bit with the math, we can search much more efficiently (but that's for another day).

Now, instead of actual code, we have a neural network, which is a way to represent these types of programs in a computational structure that makes it much easier to manipulate, store, and analyze.
So, to summarize, think of a neural network as a template program that, given a specific set of values for its weights, becomes (almost) equivalent to some specific program.

And SGD is just a super optimized search procedure for that specific type of object.
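A hand-wavy sketch of that smarter search, on the same kind of made-up toy data as before: if we replace the hard > 0 threshold with a smooth sigmoid, the error becomes differentiable, and instead of guessing weight assignments we can follow the error gradient downhill, one example at a time.

```python
import math

# Toy labeled examples again (made up): 1.0 = face, 0.0 = non-face.
examples = [
    ([0.9, 0.1, 0.9], 1.0),
    ([0.8, 0.2, 0.7], 1.0),
    ([0.1, 0.9, 0.2], 0.0),
    ([0.2, 0.8, 0.1], 0.0),
]

def sigmoid(z):
    # Smooth, differentiable replacement for the hard "score > 0" threshold.
    return 1.0 / (1.0 + math.exp(-z))

weights = [0.0, 0.0, 0.0]
lr = 0.5  # learning rate: how big a step to take downhill

# (Stochastic) gradient descent: nudge each weight in the direction
# that reduces the error, one example at a time.
for epoch in range(200):
    for pixels, label in examples:
        pred = sigmoid(sum(x * w for x, w in zip(pixels, weights)))
        grad = pred - label  # gradient of the cross-entropy loss w.r.t. the score
        weights = [w - lr * grad * x for w, x in zip(weights, pixels)]

# The trained weights should now classify the toy examples correctly.
for pixels, label in examples:
    pred = sigmoid(sum(x * w for x, w in zip(pixels, weights)))
    print(round(pred) == label)
```

That's the whole trick in miniature: same search problem as random guessing, but each step is informed by which direction reduces the error.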
There are a lot of details I left out, like non-linear activation functions, bias weights, different topologies for connecting these so-called neurons, but none of that matters yet...

This analogy of ANNs as program templates is, I believe, much more helpful for beginners.

— Alejandro Piad Morffis (@AlejandroPiad)