right?
A: Right ^_^
A: Imagine you’re a real estate agent. For each past sale you have {bedrooms, bathrooms, square footage, sold price}. You have a formula for house price - say, price ≈ X·bedrooms + Y·bathrooms + Z·square footage - but you want to find the best parameters {X, Y, Z}.
To start with, we can set {X, Y, Z} to anything sane - even if that is a terrible approximation at first. We run the equation over many samples to work out how well our equation and parameters work.
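(For the curious, a minimal sketch of that idea in Python - the linear formula, the sales data, and every number here are made up purely for illustration:)

```python
# A toy version of the house example: a simple formula with three parameters
# and a measure of how wrong it is over many past sales. All values invented.

# Past sales we already know the answer for: (bedrooms, bathrooms, sqft, sold price)
sales = [
    (3, 2, 1400, 310_000),
    (4, 3, 2100, 455_000),
    (2, 1,  850, 180_000),
]

def predict(x, y, z, bedrooms, bathrooms, sqft):
    # The "formula": X per bedroom, Y per bathroom, Z per square foot
    return x * bedrooms + y * bathrooms + z * sqft

def average_error(x, y, z):
    # How badly do these parameters do, averaged over all the known sales?
    total = 0.0
    for bedrooms, bathrooms, sqft, price in sales:
        total += (predict(x, y, z, bedrooms, bathrooms, sqft) - price) ** 2
    return total / len(sales)

# Start from any sane (even terrible) guess for {X, Y, Z} and see how wrong it is
print(average_error(10_000, 5_000, 100))
```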
A: Indeed it is - the scary thing is that the principle scales up. The same general tactics work for images, text, you name it! Instead of three parameters, though, I’m doing this over MILLIONS or BILLIONS of parameters.
Backpropagation still works!
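(A sketch of the "nudge every parameter to reduce the error" loop that backpropagation drives. Real backprop computes exact gradients with the chain rule; a crude finite-difference estimate stands in for it here, and the data is the same made-up sales data as above:)

```python
# Toy gradient-descent loop on the made-up house-price example.
sales = [(3, 2, 1400, 310_000), (4, 3, 2100, 455_000), (2, 1, 850, 180_000)]

def average_error(x, y, z):
    # Mean squared difference between the formula's guess and the real price
    return sum((x * bd + y * ba + z * sq - price) ** 2
               for bd, ba, sq, price in sales) / len(sales)

def gradient(params, eps=1e-3):
    # How much does the error change when each parameter is nudged slightly?
    # (A stand-in for backpropagation, which computes this exactly.)
    base = average_error(*params)
    grads = []
    for i in range(len(params)):
        nudged = list(params)
        nudged[i] += eps
        grads.append((average_error(*nudged) - base) / eps)
    return grads

params = [10_000.0, 5_000.0, 100.0]   # any sane starting guess for {X, Y, Z}
for step in range(1_000):
    # Move each parameter a tiny step in the direction that decreases the error
    params = [p - 1e-9 * g for p, g in zip(params, gradient(params))]

print(params, average_error(*params))
```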
A: ...
Wait, dad, you know about compilers? 🤔
(context: father is a lawyer, mother ran web dev teams in the past)
Sadly you’ll need to forget what you know about compilers as it’s not super relevant.
<parents chuckle>
Dad: Luckily that won't be hard - I don’t know that much about them anyway.
🙃
A: I set up the overall equation (the neural network’s structure) and how it measures error on each {input, target} pair the network receives. Then I subtly adjust how the network is trained over time.
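(A generic sketch of those three jobs, using PyTorch as an example framework - this is illustrative only, not the actual model or training code being described:)

```python
import torch
from torch import nn

# 1. The network's structure: the "overall equation" with learnable parameters
model = nn.Sequential(
    nn.Linear(3, 64),   # 3 inputs (e.g. bedrooms, bathrooms, sqft)
    nn.ReLU(),
    nn.Linear(64, 1),   # 1 output (e.g. predicted price)
)

# 2. How error is measured for each {input, target} pair
loss_fn = nn.MSELoss()

# 3. How the parameters get adjusted during training (learning rate, etc.)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

def training_step(inputs, targets):
    optimizer.zero_grad()
    error = loss_fn(model(inputs), targets)
    error.backward()      # backpropagation fills in the gradients
    optimizer.step()      # nudge every parameter to reduce the error
    return error.item()
```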
A: Nope! 🤣
A: Not really. There are millions of parameters and calculations involved, none of which are individually labeled or explained, so it’s hard to pull apart.
Q: … but it somehow works?
A: Backpropagation just soldiers on and decreases the error without much guidance.
A: The models generally need to be really complex when training - but then they can be squeezed down to work on your phone with little accuracy loss.
A: Oddly, we usually can’t take the phone-sized network and train it from scratch to the same accuracy as the large one. Weird, right?
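(One common way that squeezing is done is knowledge distillation: train a small "student" network to imitate the big trained "teacher". A rough sketch of the idea, not necessarily the exact method meant here:)

```python
import torch
from torch import nn

teacher = nn.Sequential(nn.Linear(3, 512), nn.ReLU(), nn.Linear(512, 1))  # big, already trained
student = nn.Sequential(nn.Linear(3, 16), nn.ReLU(), nn.Linear(16, 1))    # small, phone-sized

optimizer = torch.optim.SGD(student.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

def distill_step(inputs):
    with torch.no_grad():
        target = teacher(inputs)               # the big model's answer becomes the target
    optimizer.zero_grad()
    loss = loss_fn(student(inputs), target)    # the small model learns to imitate it
    loss.backward()
    optimizer.step()
    return loss.item()
```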
A: Nope! It’s like the Wright brothers + Haskell era of flight.
Something’s obviously working but mostly we’re just bolting bigger engines and wings onto our plane and seeing if it works ^_^