Santiago
28 Sep, 14 tweets, 3 min read
"You can't use an algorithm unless you understand how it works."

That's what many people say. But I don't believe it.

This is how you can build expertise: ↓
We all learn new things in different ways.

Personally, I'm a huge proponent of learning on-demand:

• Start with a problem
• Try to solve it
• Incorporate new knowledge as you go
Almost every time:

I start using new techniques with a very superficial understanding of how they work.

Sometimes, I only know they *do* work but have no idea how.
As I progress and find problems, I'm forced to understand more about what I'm doing.

At a high level, this is what I do:

Find a solution that works → Learn how it works
Here is a specific example:

I trained my first model to classify Dogs and Cats before I understood how Convolutional Neural Networks worked.

• I solved the exercise
• Found a problem
• Started digging for a solution
• Learned how to do it
• Fixed the problem
• Repeat
This is important:

• Using something doesn't make you an expert on it.

Copying code to solve a specific problem doesn't automatically qualify you to push that to production.

You have to earn that right.
This is where people get stuck:

Many assume that those who start with a hands-on approach are skipping ahead, that they are dangerous, and that their solutions are mediocre.

This is a poor generalization.
Knowing the theory doesn't qualify you to drive a car. Knowing the physical aspects of driving doesn't qualify you either.

You need a combination of both.

The order that you follow to learn doesn't matter.
We all have to gain the privilege of sharing our work with the world.

You don't get extra points for learning theory first. You don't get extra points for learning theory as you go.

• We all need to show our value.
• We all need to build trust with our teams.
What's the takeaway?

• Find the path that works for you and focus on building expertise.

Over the years, the approach I explained here has proven to be exactly what I need.

Find yours.
Every week, I post 2 or 3 threads like this, breaking down machine learning concepts and giving you ideas on applying them in real-life situations.

You can find more of these at @svpino.

If you find this helpful, stay tuned: a lot more is coming.
That's an excellent opportunity to dive deeper and get a better understanding of what you are doing.

Like peeling an onion, you go layer by layer until you find the problem.

People assume that "writing code" is the only requirement to build software.

It is not.

You need to gain your team's trust before they let you push code to production.

You gain trust by showing good work, not by "understanding everything."

There's a subtle—but fundamental—difference in the way we are using the idea of "understanding an algorithm" here.

My argument is that you can use algorithms whose expected results you understand without having to understand how they arrive at them.
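
A minimal sketch of that idea, using Python's built-in sorted() as the black box. The helper name and the sample data are made up for illustration; the point is only that you can check the expected result without knowing which sorting algorithm runs underneath.

```python
from collections import Counter


def behaves_like_a_sort(original, result):
    """Check the expected result only: same items, in non-decreasing order."""
    same_items = Counter(original) == Counter(result)
    in_order = all(a <= b for a, b in zip(result, result[1:]))
    return same_items and in_order


scores = [42, 7, 19, 7, 88, 3]  # made-up sample data
ranked = sorted(scores)         # used as a black box

print(behaves_like_a_sort(scores, ranked))
```

When a check like this fails, or the function is too slow for your data, that is the moment to dig into how it works.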

More from @svpino

29 Sep
Here is a fantastic example of dimensionality reduction.

Look at the attached images. They both show the number zero (huge pixels, but convince yourself they are zeros.)

The one on your left requires 64 dimensions. The one on your right only needs 5 dimensions!
We are cutting 92% of the dimensions but still keeping the essence of the data.

Dimensionality reduction is a key technique you should study.

This example uses singular value decomposition.

A couple more:

• Principal component analysis
• Independent component analysis
In case you are curious, here is the process to go from the first image (the one with 64 dimensions) to the second image:

1. Take the image
2. Apply singular value decomposition
3. Keep the top 5 resulting dimensions

I took this example from a course that I'm going through.
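
The three steps can be sketched with NumPy. This is not the code from the course: the data is a synthetic stand-in (100 fake 8x8 "images" built from 5 hidden patterns plus a little noise), and the function name is an assumption. It only illustrates going from 64 dimensions to 5 with singular value decomposition.

```python
import numpy as np


def reduce_with_svd(X, k):
    """Keep the top-k SVD dimensions of X and map them back to pixel space."""
    # Rows of Vt are directions in pixel space, ordered by singular value.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    components = Vt[:k]              # shape (k, n_pixels)
    reduced = X @ components.T       # shape (n_images, k)
    restored = reduced @ components  # back to shape (n_images, n_pixels)
    return reduced, restored


rng = np.random.default_rng(0)

# Synthetic stand-in data, not real digit images.
patterns = rng.normal(size=(5, 64))
weights = rng.normal(size=(100, 5))
X = weights @ patterns + 0.01 * rng.normal(size=(100, 64))

reduced, restored = reduce_with_svd(X, k=5)

dims_removed = 1 - 5 / 64  # about 0.92, the 92% mentioned above
relative_error = np.linalg.norm(X - restored) / np.linalg.norm(X)

print(reduced.shape, restored.shape)
print(f"dimensions removed: {dims_removed:.0%}")
```

Each image is now described by 5 numbers instead of 64. Real images will not compress this cleanly, since the stand-in data was built to have only 5 underlying patterns.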
Read 8 tweets
27 Sep
If you are teaching a machine learning class, or thinking of creating a course, please, dedicate some time to have your students deal with data.

Most courses mention "data is important, you know?" and right away go into the 1,001 different ways to build a model.

A better way:

Have your students practice a skill they will face the very first day they go out there.

Data is messy. Incomplete. Noisy. Disorganized. Mislabeled.

Have them deal with this for a while. Don't worry about the modeling part.
A good exercise:

1. Give your students a dataset.
2. Give them a model.
3. Ask them to improve its performance.
4. They can't touch the model code.

They should focus exclusively on improving the data to get better performance.

This should be great practice.
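
Here is a toy version of that exercise, as a sketch. Everything in it is hypothetical: the tiny dataset, the nearest-centroid "model," and the cleaning step. The model code stays untouched, and the only change is dropping rows with missing values before training.

```python
def train(rows):
    """Fixed model code (off limits in the exercise): one centroid per label.

    It blindly treats missing values as 0.0.
    """
    sums, counts = {}, {}
    for value, label in rows:
        value = 0.0 if value is None else value
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(centroids, value):
    """Return the label whose centroid is closest to the value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))


def accuracy(centroids, rows):
    hits = sum(predict(centroids, value) == label for value, label in rows)
    return hits / len(rows)


# Made-up data: two "b" rows are missing their feature value.
raw_train = [(1.0, "a"), (1.2, "a"), (0.9, "a"),
             (None, "b"), (None, "b"), (5.0, "b"), (5.2, "b")]
test_rows = [(1.1, "a"), (0.8, "a"), (2.0, "a"),
             (4.9, "b"), (5.1, "b"), (4.0, "b")]

# The students' work happens here: improve the data, not the model.
clean_train = [row for row in raw_train if row[0] is not None]

raw_accuracy = accuracy(train(raw_train), test_rows)
clean_accuracy = accuracy(train(clean_train), test_rows)

print(f"raw: {raw_accuracy:.2f}  clean: {clean_accuracy:.2f}")
```

The missing values drag the "b" centroid toward zero, so the untouched model misclassifies a test point until the data is cleaned.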
Read 4 tweets
24 Sep
I've been trying to identify the most effective trait for those building a career in software development.

If I were to give you one single recommendation, what would that be?

I think I figured it out. ↓
Here is a problem I see every day:

Most people start their careers solving the same boring exercises.

This is good in certain ways, but it also limits your experience to what everyone else is doing.

The key to getting out of this trap?

*Curiosity*
If there's a single trait that has helped me make continuous progress over the last two decades in building software, it has been a relentless curiosity.

And contrary to what many believe, you can learn to be curious.

This is what I do.
Read 9 tweets
23 Sep
I've heard multiple times that you don't need to do any feature engineering or selection whenever you are using neural networks.

This is not true.

Yes, neural networks can extract patterns and ignore unnecessary features from the dataset, but this is usually not enough.

First, neural networks can't compete with our expertise in understanding the data.

As long as we know the problem and the dataset, we can come up with features that would be really hard for a network to reproduce.

This is a highly creative process. Really hard to reproduce.
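
One hypothetical example of such a feature: turning raw pickup and drop-off coordinates into a trip distance. The column names and the sample row are made up. The distance uses the standard haversine formula with an approximate Earth radius, which a network would have to rediscover from four raw numbers.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def add_trip_distance(row):
    """Return a copy of the row with an engineered 'trip_km' feature."""
    enriched = dict(row)
    enriched["trip_km"] = haversine_km(
        row["pickup_lat"], row["pickup_lon"],
        row["dropoff_lat"], row["dropoff_lon"],
    )
    return enriched


# Made-up sample row with hypothetical column names.
row = {"pickup_lat": 40.0, "pickup_lon": -74.0,
       "dropoff_lat": 40.1, "dropoff_lon": -73.9}
enriched = add_trip_distance(row)
print(round(enriched["trip_km"], 2))
```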
A couple of notes regarding the ability of a network to do automatic feature selection:

Yes, networks can "ignore" features that have no bearing on the prediction.

But these features can still introduce noise that degrades the performance of the model.
Read 5 tweets
21 Sep
One issue I see with people applying for a job:

They struggle to highlight their experience in an effective way.

If you are trying to get a job as a Data Scientist or Machine Learning Engineer, here is something you can do.

The first step is to stop thinking of "experience" exclusively as a synonym for employment history.

Experience is about all of the work you have done. It doesn't matter whether someone else paid for it.

If you know how to get things done, you should highlight it.
The second step is doing some inventory.

I'm sure you can find examples and exercises you've solved over the past few months.

They don't have to be end-to-end applications. They just need to showcase your knowledge and ability to make things work.

Collect them all.
Read 13 tweets
20 Sep
When designing a machine learning model, remember the "stretch pants" approach:

Don't waste time looking for pants that perfectly match your size. Instead, use large stretch pants that will shrink down to the right size.

What does this mean for your model?
The "stretch pants" approach in machine learning:

Pick a model with more capacity than you need. Then, use regularization techniques to avoid overfitting.

You gotta thank Vincent Vanhoucke, a scientist at Google, for this analogy.
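
A small sketch of the idea, swapping in polynomial regression for a neural network to keep it short. The data is synthetic and the degree and penalty strength are arbitrary choices: a degree-9 polynomial has more capacity than 12 noisy points need, and an L2 penalty (ridge regression) shrinks it down.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 12 noisy samples of a smooth curve.
x = np.linspace(-1.0, 1.0, 12)
y = np.sin(3 * x) + 0.1 * rng.normal(size=x.size)

# The "large pants": polynomial features up to degree 9.
degree = 9
X = np.vander(x, degree + 1)


def fit_ridge(X, y, l2):
    """Least squares with an L2 penalty on all weights (bias included)."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + l2 * np.eye(n_features), X.T @ y)


w_plain = np.linalg.lstsq(X, y, rcond=None)[0]  # no regularization
w_ridge = fit_ridge(X, y, l2=1e-2)              # arbitrary penalty strength

print("weight norm without regularization:", np.linalg.norm(w_plain))
print("weight norm with L2 regularization:", np.linalg.norm(w_ridge))
```

The regularized weights have a smaller norm, which is how the penalty reins in the extra capacity. In a neural network the same role is played by techniques such as weight decay or dropout.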
One example:

Imagine designing a neural network, and you configure a hidden layer that's too small (not many neurons.)

The network may not preserve all the valuable information from the data: you don't have enough power to do it!

Realizing this is difficult.
Read 4 tweets
