Santiago
4 Mar, 4 tweets, 1 min read
It depends on what a "large model" means to you.

If we are talking about "ridiculously large," like in ImageNet-type-large, then you probably won't be able to do it for free.

But then, why would you want to train such a massive model?
If we get it down a couple of notches, we get into the realm of "a few days of training" as long as you have a GPU.

You could train this locally, assuming you have a computer with the right hardware. But that computer won't be cheap either.
If we are talking about hours of training, then Google Colab might be all you need, and you can access it for free.

It won't be the best experience, but again, hard to beat free.
Finally, as a student, and assuming you are learning this stuff, you don't need to obsess over training large models.

Focus on learning the ropes, and grow into a position where a company will worry about paying the bills.

• • •


More from @svpino

6 Mar
16 questions that I really like to ask machine learning candidates during interviews.

These will help you practice, but more importantly, they will help you think and find ways to improve!

Let's do this! ☕️👇
1. How do you handle an imbalanced dataset and avoid the majority class completely overtaking the other?

2. How do you deal with out-of-distribution samples in a classification problem?

3. How would you design a system that minimizes Type II errors?
4. Sometimes, the validation loss of your model is consistently lower than your training loss. Why could this be happening, and how can you fix it?

5. Explain what you would expect to see if we use a learning rate that's too large to train a neural network.
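For question 1, one common first move is reweighting the loss by inverse class frequency so minority-class mistakes cost more. A minimal sketch in plain Python; the helper name is mine, not from the thread:

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Weight each class inversely to its frequency so a loss
    function treats minority-class errors as more costly."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    # weight = n / (k * count): perfectly balanced classes all get 1.0
    return {cls: n / (k * c) for cls, c in counts.items()}

# 90/10 imbalance: the minority class gets a much larger weight
weights = inverse_frequency_weights([0] * 90 + [1] * 10)
```

These weights can be passed to most libraries' loss functions (e.g. a `class_weight`-style parameter) or used to resample the data.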
Read 9 tweets
4 Mar
Today, only a tiny percentage of companies are using AI/ML at scale.

Do you imagine the possibilities as this continues to grow?

Who wants to be in the center of it all?
And paying a buttload of money for people who are the right fit.

Read 6 tweets
4 Mar
When designing your neural network, you first want to focus on your training loss.

Overfit the heck out of your data and get that loss as low as you can!

Only after that should you start regularizing and focusing on your validation loss.

☕️🧵👇
Always try to overfit first.

Getting here is a good thing: you know your model is working as it should!

If you can't get your model to overfit, there's probably something wrong with your configuration.
How do you overfit? Pick a model that's large enough for the data.

Large enough means it has enough parameters (layers, filters, nodes) to memorize your data.

You can also try to overfit a portion of your dataset. Fewer samples will be easier to overfit.
Read 11 tweets
3 Mar
The more you grow, the more you realize that the language you use doesn't matter at all.

JavaScript, Python, or whatever you use represents exactly $0 of your take-home pay every month.

The value you produce using these languages is the remaining 100%.
I’ve never had a conversation with a client who cared about a specific language, other than those wanting to build on top of an existing codebase.
Every line of code is a liability.

Corollary: The best code is the one nobody wrote.
Read 4 tweets
3 Mar
The two questions related to neural networks that I hear most often:

▫️ How many layers should I use?
▫️ How many neurons per layer should I use?

There are some rules of thumb that I'll share with you after you get your ☕️ ready.

🧵👇
First, let's get this out of the way:

A neural network with a single hidden layer can model any function, regardless of how complex it is (assuming it has enough neurons).

Check the "Universal Approximation Theorem" if you don't believe me.

So, if we can do it all with a single layer, why bother adding more layers?

Well, it turns out that a neural network with a single layer will overfit really quickly.

The more neurons you add to it, the better it will become at memorizing stuff.

That is bad news.
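The theorem can be demonstrated in miniature with XOR: no model without a hidden layer can represent it (it is not linearly separable), but a single hidden layer learns it easily. A toy numpy sketch, with architecture and hyperparameters that are my own illustrative choices:

```python
import numpy as np

# XOR truth table: not linearly separable
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(1)
W1 = rng.normal(size=(2, 16)); b1 = np.zeros(16)
W2 = rng.normal(size=(16, 1)); b2 = np.zeros(1)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

for _ in range(8000):
    h = np.tanh(X @ W1 + b1)
    p = sigmoid(h @ W2 + b2)
    # binary cross-entropy gradient w.r.t. the logits is (p - y)
    g = (p - y) / len(X)
    gh = g @ W2.T * (1 - h ** 2)
    W2 -= 0.5 * (h.T @ g); b2 -= 0.5 * g.sum(0)
    W1 -= 0.5 * (X.T @ gh); b1 -= 0.5 * gh.sum(0)

h = np.tanh(X @ W1 + b1)
p = sigmoid(h @ W2 + b2)
preds = (p > 0.5).astype(int).ravel()
```

One hidden layer is enough here, which is the theorem's point; the thread's warning is that relying on width alone scales badly in practice.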

Read 12 tweets
2 Mar
Let's talk about how you can build your first machine learning solution.

(And let's make sure we piss off half the industry in the process.)

Grab that ☕️, and let's go! 🧵
Contrary to popular belief, your first attempt at deploying machine learning should not use TensorFlow, PyTorch, Scikit-Learn, or any other fancy machine learning framework or library.

Your first solution should be a bunch of if-then-else conditions.
Regular ol' conditions make for a great MVP solution to a machine learning wannabe system.

Pair those conditions with a human, and you have your first system in production!

Conditions handle what they can. Humans handle the rest.
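The conditions-plus-human pattern might look like this. The ticket-routing scenario, function name, and keywords are made up for illustration:

```python
def route_ticket(text):
    """Rule-based MVP 'classifier': conditions handle the obvious
    cases, and everything else is escalated to a human."""
    t = text.lower()
    if "refund" in t or "charge" in t:
        return "billing"
    if "password" in t or "login" in t:
        return "account"
    return "human-review"   # humans handle the rest

print(route_ticket("I was charged twice, I want a refund"))  # billing
```

Once a system like this is in production, the human-reviewed cases become labeled training data for the eventual ML model.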
Read 16 tweets
