How important is having access to a GPU when we deploy a machine learning model?
Let's talk about this in this short thread.
↓ 1/7
We all know that having a GPU is a must to train a deep learning model, especially when we have a lot of data.
But that's during training. What about when we use the model to make predictions?
↓ 2/7
The right answer depends on the problem we are trying to solve, but more often than not, a GPU is not necessary during inference time.
And depending on its cost, a GPU might even be counterproductive: the value it gives us may pale compared to its cost.
↓ 3/7
Let's imagine a deep learning model for computer vision.
When we are training it, we are scheduling operations on the GPU for many images at a time. We benefit from being able to run all of these operations in parallel.
But when we deploy this model, things change.
↓ 4/7
First, assuming the model predicts on a single image per request, we won't need to run that many operations to process each request.
Second, depending on how popular the model is, we may only receive sporadic requests to process. (A quick sketch of the difference is below.)
↓ 5/7
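To make the contrast concrete, here is a minimal sketch (PyTorch assumed; the thread doesn't name a framework, and the tiny model is just a stand-in) of how much less work a single-image request needs compared to a training-style batch:

```python
import torch
import torch.nn as nn

# Stand-in for a real computer vision model.
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(16, 10),
)
model.eval()

training_batch = torch.randn(64, 3, 224, 224)  # training: 64 images in one forward pass
single_request = torch.randn(1, 3, 224, 224)   # inference: one image per request

with torch.no_grad():
    print(model(training_batch).shape)  # torch.Size([64, 10]) -> plenty of parallel work for a GPU
    print(model(single_request).shape)  # torch.Size([1, 10])  -> a CPU often handles this just fine
```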
But we usually have to pay the cost of a GPU regardless of how much we use it. And a GPU is much more expensive than a CPU!
If we can't keep a GPU busy, then it becomes a liability.
Think about this when deciding the infrastructure you need.
↓ 6/7
Of course, you may be able to batch requests for your use case (a simple batching sketch is at the end of this thread), or you may own the GPU and not need to pay anything extra for it.
Thinking about this is important, but you know what's even more important? Following me @svpino for more threads like this one! 😎
7/7
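And here is a minimal sketch of what batching requests could look like, assuming you control the serving loop; the queue, the request format, and the model interface are all hypothetical:

```python
import time
from queue import Empty, Queue

def serve(model, requests: Queue, max_batch: int = 32, max_wait_s: float = 0.01):
    """Collect up to max_batch requests for at most max_wait_s, then run them as one batch."""
    while True:
        batch, deadline = [], time.time() + max_wait_s
        while len(batch) < max_batch and time.time() < deadline:
            try:
                batch.append(requests.get(timeout=max(0.0, deadline - time.time())))
            except Empty:
                break
        if batch:
            # One forward pass over the whole batch keeps the GPU busy.
            # `model` is assumed to take a list of inputs and return one output per input.
            outputs = model([req["input"] for req in batch])
            for req, out in zip(batch, outputs):
                req["reply"](out)  # hypothetical callback that returns the result to the caller
```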
A lot in machine learning is pretty dry and boring, but understanding how autoencoders work feels different.
This is a thread about autoencoders, things they can do, and a pretty cool example.
↓ 1/10
Autoencoders are lossy data compression algorithms built using neural networks.
A network encodes (compresses) the original input into an intermediate representation, and another network reverses the process to reconstruct the input as closely as possible. (A minimal code sketch is below.)
↓ 2/10
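As an illustration (PyTorch assumed again; the thread doesn't specify a framework), a bare-bones autoencoder is just two small networks wired back to back:

```python
import torch.nn as nn

# Encoder: compress a 784-pixel (28x28) image into a 32-number code.
encoder = nn.Sequential(nn.Linear(784, 32), nn.ReLU())

# Decoder: reconstruct the 784 pixels from that 32-number code.
decoder = nn.Sequential(nn.Linear(32, 784), nn.Sigmoid())

autoencoder = nn.Sequential(encoder, decoder)

# Trained to reproduce its own input, e.g.:
#   loss = nn.MSELoss()(autoencoder(x), x)
```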
The encoding process "generalizes" the input data.
I like to think of it as the Summarizer in Chief of the network: its whole job is to represent the data as compactly as possible, so the decoder can do a decent job of reproducing the original input.