Let's talk about NEURAL NETWORKS. 🧠

Most of you are probably familiar with them. 🧑‍💻

But not many know HOW they actually work and, even more importantly, WHY they work. 🔧

So, let's take a journey to understand what makes Neural Networks so effective... 📊

#ML #AI #NN

🧵👇
Let's start with a simple question...

❓ What problem are NNs trying to solve? 🤔

Generally speaking, NNs are trained on examples to produce predictions based on some input values.

The example data (input + desired output) draws a curve that the NN is trying to fit.
In a nutshell, NNs are, like most ML tools, a fancy way to fit the curves inherently generated by the examples used to train them.

The more inputs it needs, and the more outputs it produces, the higher the dimension of the curve.

The simplest curve we can fit is ...a line! 😅
Fitting a line is known by Mathematicians & Statisticians as LINEAR REGRESSION. 📈

The equation of a line is:
𝐲 = 𝐱𝐰 + 𝐛

where:
🔸 𝐰: Slope
🔸 𝐛: Y-intercept

Fitting a line means finding the 𝐰 & 𝐛 of the line that best fits the input data! 🔍
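In code, finding 𝐰 & 𝐛 is a two-liner thanks to the closed-form least-squares solution. A minimal sketch (the data points below are made up for illustration):

```python
import numpy as np

# Toy training data: points scattered around the line y = 2x + 1
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

# Closed-form least-squares estimates of the slope (w) and intercept (b)
w = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b = y.mean() - w * x.mean()

print(w, b)  # w ≈ 2, b ≈ 1
```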
To find which line better fits the training data, we need to define what "better" means first.

There are many ways to measure "linear fitness", and they all take into account how close each point is to the line.

The RMSE (Root Mean Square Error) is a very popular metric.
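As a rough sketch, RMSE is just a few lines (the toy values below are for illustration only):

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root Mean Square Error: the square root of the mean squared residual."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

# One point is off by 0.5; the other two sit exactly on the candidate line
print(rmse([1.0, 2.0, 3.0], [1.0, 2.5, 3.0]))  # ≈ 0.289
```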
On top of the "traditional" algebraic form (𝐲 = 𝐱𝐰 + 𝐛), let's introduce a more "visual" way to represent equations.

💡 NETWORKS allow us to better see the relationships between each part.

It will be important later, trust me! 😎
LINEAR regression, however, only works well with LINEAR data.

❓ What if our data is "binary" instead? 🤔

This is common with many decision-making problems.

For instance:
🔸 𝐱: the room temperature 🌡️
🔸 𝐲: either 0 or 1, to turn the fan ON/OFF ❄️
If we try to naively use LINEAR REGRESSION to fit binary data, we will likely get a line that passes through both sets of points.

The example below shows the "best" fitting line, according to RMSE.

It's a bad fit. ❌

โ“ Can we "fix" linear interpolation? ๐Ÿค”
In *this* special case, we can! ๐Ÿ˜Ž

Let's find a DIFFERENT line. Not the one that BEST FITS the data, but the one that BEST SEPARATES the data.

So that:

๐Ÿ”น When ๐ฒ โ‰ค 0, we return 0 (turn fan OFF ๐Ÿ”ด)
๐Ÿ”น When ๐ฒ > 0, we return 1 (turn fan ON ๐Ÿ”ต)
To do that, we need to update our MODEL:
𝐲 = 𝐬(𝐱𝐰 + 𝐛)

where 𝐬() is the HEAVISIDE STEP function.

That will be the ACTIVATION FUNCTION of our network.
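Here is a minimal sketch of that model. The fan-controller weights (w = 1, b = -25, i.e. switch ON above 25 °C) are assumed values for illustration, not numbers from the thread:

```python
import numpy as np

def heaviside(z):
    # Step activation: 0 when z <= 0, 1 when z > 0
    return np.where(z > 0, 1, 0)

def perceptron(x, w, b):
    # The updated model: y = s(xw + b), with s = Heaviside step
    return heaviside(x * w + b)

# Hypothetical fan controller: turn ON above 25 °C (assumed threshold)
temps = np.array([18.0, 22.0, 26.0, 30.0])
print(perceptron(temps, w=1.0, b=-25.0))  # [0 0 1 1]
```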
Other commonly used AFs are:
🔹 Sigmoid
🔹 Tanh
🔹 Rectified Linear Unit (ReLU)
🔹 Leaky ReLU
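For reference, each of these four is only a line or two of code (alpha = 0.01 is a common but arbitrary choice for Leaky ReLU):

```python
import numpy as np

def sigmoid(z):
    # Squashes any input into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    # Like sigmoid, but squashes into (-1, 1) and is zero-centred
    return np.tanh(z)

def relu(z):
    # Passes positive values through, clamps negatives to 0
    return np.maximum(0.0, z)

def leaky_relu(z, alpha=0.01):
    # Like ReLU, but lets a small slope through for negative inputs
    return np.where(z > 0, z, alpha * z)
```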
Ultimately, the ACTIVATION FUNCTION is where the magic happens, because it adds NON-LINEARITY to our MODEL. ✨

This gives us the power to fit virtually any type of data! 🔮

This is (more or less!) what a PERCEPTRON is: the grandfather of modern Neural Networks. 🧓
Now that we have PERCEPTRONs, let's see how we can use them as the building blocks of more complex networks.

For instance, let's imagine more complex training data ("ternary" data? 🤔).

A perceptron can only fit 2/3 of the data.
So, why not use ...THREE of them? 😎
The FIRST perceptron fits the first 2/3 of the data:
1️⃣ 🔴📈🔵 ⚫️

The SECOND perceptron fits the last 2/3 of the data:
2️⃣ ⚫️ 🔵📉🔴

What's left to do now is to use a THIRD perceptron to merge the first two:
3️⃣ 𝐲 = (📈+📉)/2 - 0.5
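The three-perceptron trick can be sketched in a few lines. The boundary positions (x = 1 and x = 2) are illustrative values, not trained weights:

```python
import numpy as np

def step(z):
    return np.where(z > 0, 1, 0)

def perceptron(x, w, b):
    return step(x * w + b)

# Illustrative (not trained) weights: p1 switches ON above x = 1,
# p2 switches ON below x = 2, so only the middle band activates both
x = np.array([0.5, 1.5, 2.5])
p1 = perceptron(x, w=1.0, b=-1.0)   # rising step:  🔴📈🔵
p2 = perceptron(x, w=-1.0, b=2.0)   # falling step: 🔵📉🔴
y  = step((p1 + p2) / 2 - 0.5)      # third perceptron merges the two

print(y)  # [0 1 0]
```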
This is a better view of the resulting network, with each colour indicating a different perceptron.

Pretty neat, right? 😎

Training a network like this requires finding the 7 PARAMETERS so that our model fits the training data best.

Modern NNs can have MILLIONS of parameters. 🤯
If we translate that network back into its equation, you can immediately see how messy that looks.

You probably would have never come up with this yourself. But when you think in terms of curve fitting, it becomes much easier to understand.
At this point, you might wonder...

❓ What does all of this have to do with the reason WHY Neural Networks are so effective? 🤔

Because we have just built an AND gate! 😎

Likewise, we can also build OR and NOT gates, de facto proving that NNs are TURING COMPLETE! 🖥️
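To make the gate claim concrete, here is one common choice of weights that turns single step perceptrons into AND, OR and NOT (the exact numbers are just one choice; any weights with the same thresholds work):

```python
import numpy as np

def step(z):
    return np.where(z > 0, 1, 0)

# Single perceptrons acting as logic gates
def AND(a, b):
    return step(a + b - 1.5)   # fires only when both inputs are 1

def OR(a, b):
    return step(a + b - 0.5)   # fires when at least one input is 1

def NOT(a):
    return step(0.5 - a)       # inverts its single input

print(AND(1, 1), OR(0, 1), NOT(1))  # 1 1 0
```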
This proves that they can perform ANY computation that a more "traditional" computer can. 🖥️

To continue our analogy with CURVE FITTING, it means that Neural Networks have the potential to fit ANY curve in ANY number of dimensions, with as much precision as you want. 🤯
Any arbitrary 2D curve can potentially be recreated by a NN, in just three steps:

1️⃣ Slice the original shape into thin sections 🔪
2️⃣ Fit each section with a perceptron (AND) 📈
3️⃣ Use a perceptron to merge all sections (OR) 📊
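The three steps above can be sketched directly: a "bump" is the AND of two step perceptrons, and summing the weighted bumps plays the role of the merging layer. The slice count and the test curve (sin) are arbitrary choices for illustration:

```python
import numpy as np

def step(z):
    return np.where(z > 0, 1.0, 0.0)

def bump(x, lo, hi):
    # AND of two step perceptrons: 1 inside (lo, hi), 0 outside
    return step(step(x - lo) + step(hi - x) - 1.5)

def approximate(f, x, n_slices=50):
    # Sum one weighted bump per slice (the OR-like merging layer),
    # each weight being the curve's height at the slice midpoint
    edges = np.linspace(x.min() - 1e-9, x.max() + 1e-9, n_slices + 1)
    y = np.zeros_like(x)
    for lo, hi in zip(edges[:-1], edges[1:]):
        y += f((lo + hi) / 2) * bump(x, lo, hi)
    return y

x = np.linspace(0.0, np.pi, 200)
y_hat = approximate(np.sin, x, n_slices=50)
print(np.max(np.abs(y_hat - np.sin(x))))  # small; shrinks as n_slices grows
```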
You can see here that very same principle applied to the design of a Neural Network.

This NN now has 33 parameters to fit, meaning that our search problem is now taking place in a 33-dimensional space. 🔍

That is nothing compared to the many millions some NNs nowadays have.
This is, in a nutshell, what Machine Learning is really about.

Making decisions...
...by learning from examples...
...by fitting a curve...
...by finding some numbers...
...that minimise the error of our model over a set of examples.
✨ 𝒕𝒉𝒂𝒏𝒌 𝒚𝒐𝒖 𝒇𝒐𝒓 𝒄𝒐𝒎𝒊𝒏𝒈 𝒕𝒐 𝒎𝒚 𝒕𝒆𝒅 𝒕𝒂𝒍𝒌 ✨

I tweet about Machine Learning, Artificial Intelligence, #GameDev & Shader Coding. 🧔🏻

If you are interested in any of these topics, follow me & have a look at my @Patreon! 😎

patreon.com/AlanZucconi
