Santiago
22 Nov, 13 tweets, 2 min read
10 questions that spark conversations, make you think, and give you a solid foundation of practical Machine Learning.

(Some) interviews are broken.

They focus on trivia and expect candidates to recall concepts that aren't even relevant for the job.

This is garbage.

Instead, focus on problems that scientists and engineers face every day while doing their jobs: 👇
Acme Inc. is building a model to classify images in several different categories.

Unfortunately, they don't have a lot of images for some of the classes.

How would you handle such an imbalanced dataset?

(1 of 10)
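One possible starting point for an answer (not the only one) is to reweight the loss so that rare classes count more. The sketch below computes inverse-frequency class weights, the same heuristic scikit-learn documents as "balanced". The class names and counts are made up for illustration.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return one weight per class, inversely proportional to its frequency.

    Uses n_samples / (n_classes * class_count), so the weights of all
    samples still add up to n_samples.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical label distribution: one common class and two rare ones.
labels = ["cat"] * 80 + ["dog"] * 15 + ["fox"] * 5
weights = inverse_frequency_weights(labels)

for cls, weight in sorted(weights.items()):
    print(f"{cls}: {weight:.2f}")
```

Oversampling rare classes, data augmentation, or collecting more images are alternatives worth raising in the same conversation.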
Acme Inc. gives you access to the data and code used to train their model.

They have been using it for some time with mixed results. They suspect the model might be overfitting.

How do you find out whether this is the case and fix it?

(2 of 10)
Acme Inc. has a deep learning image classification model that performs very well with most images.

Unfortunately, this is not good enough when lives are at stake.

How do you determine the uncertainty of predictions from the model to reduce the mistakes?

(3 of 10)
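A minimal sketch of one way to act on uncertainty, assuming the model outputs class probabilities: measure the entropy of each prediction and defer to a human when it is too high. The threshold and the probabilities are illustrative assumptions, and raw softmax outputs are often poorly calibrated, so this is a conversation starter rather than a complete answer.

```python
import math


def predictive_entropy(probs):
    """Shannon entropy (in nats) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def predict_or_defer(probs, max_entropy):
    """Return the predicted class index, or None to hand the case to a human."""
    if predictive_entropy(probs) > max_entropy:
        return None
    return max(range(len(probs)), key=probs.__getitem__)


# Hypothetical outputs: one confident prediction and one ambiguous prediction.
confident = [0.98, 0.01, 0.01]
ambiguous = [0.40, 0.35, 0.25]

print(predict_or_defer(confident, max_entropy=0.5))
print(predict_or_defer(ambiguous, max_entropy=0.5))
```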
Acme Inc. wants you to build a couple of models using a dataset they built over the last few years.

But it turns out that capturing a lot of sensor data at scale is a complicated task.

How do you handle missing or corrupted data in this dataset?

(4 of 10)
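One simple baseline to discuss, sketched under the assumption that missing readings arrive as `None` or NaN: fill the gaps with the median of the observed values and keep a mask that records which values were imputed. The readings are invented for illustration.

```python
import math
from statistics import median


def is_missing(value):
    """Treat None and NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def impute_median(readings):
    """Return (filled_readings, was_missing_mask) using the observed median."""
    observed = [r for r in readings if not is_missing(r)]
    if not observed:
        raise ValueError("cannot impute: every reading is missing")
    fill = median(observed)
    mask = [is_missing(r) for r in readings]
    filled = [fill if missing else r for r, missing in zip(readings, mask)]
    return filled, mask


# Hypothetical sensor readings with two gaps.
readings = [20.5, None, 21.0, float("nan"), 19.5]
filled, mask = impute_median(readings)
print(filled)
print(mask)
```

Keeping the mask lets a downstream model learn whether "the sensor dropped out" is itself informative.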
Acme Inc. is deploying a model to predict whether their equipment is about to break.

They'd like to minimize mistakes, especially false negatives, because their cost is prohibitive.

How would you design a system that minimizes Type II errors?

(5 of 10)
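One lever worth mentioning is the decision threshold: lowering it trades more false alarms for fewer missed failures. The sketch below picks the highest threshold that still reaches a target recall on a validation set. The scores, labels, and target are hypothetical.

```python
import math


def threshold_for_recall(scores, labels, target_recall):
    """Highest threshold (predict positive when score >= threshold)
    whose recall on the given data is at least target_recall."""
    positives = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    if not positives:
        raise ValueError("need at least one positive example")
    # Number of positives that must be caught to reach the target.
    needed = max(1, math.ceil(target_recall * len(positives)))
    return positives[needed - 1]


def recall_at(scores, labels, threshold):
    """Recall = TP / (TP + FN) at the given threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    return tp / (tp + fn)


# Hypothetical validation scores; label 1 means "equipment failed".
scores = [0.95, 0.80, 0.60, 0.40, 0.35, 0.20, 0.10]
labels = [1, 1, 0, 1, 0, 1, 0]

threshold = threshold_for_recall(scores, labels, target_recall=0.75)
print(threshold, recall_at(scores, labels, threshold))
```

The cost of the extra false positives (unnecessary inspections) still has to be weighed against the cost of a missed failure.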
Over the last few years, Acme Inc. was able to collect a lot of data.

Unfortunately, labeling the data is very costly, and they would like you to help.

How would you label as much data as possible, minimizing the cost of doing so?

(6 of 10)
Acme Inc. is building a pipeline that goes all the way from training to deployment.

There's a single missing piece to complete the process:

How would you automatically determine whether the new model is better than the one in production?

(7 of 10)
Acme Inc. has accumulated some data that they want to use in a classification problem.

Before giving you a job, they would like to know which algorithm you'd use.

How would you approach the process of selecting a suitable algorithm?

(8 of 10)
The latest version of Acme Inc.'s model is showing a whopping 99% accuracy at detecting fraudulent transactions.

In production, however, during a manual audit, they found out that the model is not catching anything relevant.

How would you tackle this problem?

(9 of 10)
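A quick illustration of why accuracy can mislead here, assuming (hypothetically) that only 1% of transactions are fraudulent: a model that never flags anything scores 99% accuracy with zero recall.

```python
def basic_metrics(y_true, y_pred):
    """Accuracy, precision, and recall for binary labels (1 = fraud)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        # Define precision and recall as 0.0 when their denominator is zero.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical data: 1,000 transactions, 10 of them fraudulent.
y_true = [1] * 10 + [0] * 990
# A "model" that labels every transaction as legitimate.
y_pred = [0] * 1000

report = basic_metrics(y_true, y_pred)
print(report)
```

Precision, recall, and the confusion matrix are the natural next things to look at.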

This is the second time that Acme Inc.'s algorithms have been significantly off in predicting election results.

They would like to try something new. They would like to predict the upcoming election using Twitter data.

How would you design this solution?

(10 of 10)



More from @svpino

21 Nov
A plan to get a job as a Machine Learning Engineer.

Put in the work, level up, and get ready to demonstrate that you can deliver value.

You'll have to answer technical questions. Study up.

(If you aren't prepared, you won't pass the first round of interviews.)

(1 of 10)
Focus on showing, not telling.

What can you do today that will serve you as an asset when justifying your experience?

Creating a strong portfolio showing what you are capable of is the most important step you can take.

(2 of 10)
Read 12 tweets
19 Nov
Everything I know about great Software Developers.

1. Great Software Developers are humble.

They never put themselves above anyone else. They are willing to leverage existing solutions and listen to others.

(1 of 15)
2. Great Software Developers are self-motivated to learn.

They never stop improving and never get complacent. They understand the importance of growing their skills.

(2 of 15)
Read 24 tweets
24 Oct
33 applications of Machine Learning, 3 different categories.

(And there are so many more it's not even funny!)

It doesn't matter what you enjoy in life. There's something here for you!

▫️ Natural Language Processing Applications

1. Speech recognition
2. Answering questions
3. Translation
4. Generating content
5. Summarizing documents
6. Sentiment analysis
7. Virtual assistants
8. Classifying text
9. Autocorrection
10. Urgency detection
11. Text extraction

▫️ Computer Vision Applications

1. Face recognition
2. Image captioning
3. Image coloring
4. Object detection
5. Image classification
6. Pose estimation
7. Image transformation
8. Image analysis
9. Automatic drone inspections
10. Defect detection
11. Image restoration

Read 4 tweets
22 Oct
A quick, non-technical explanation of Dropout.

(As easy as I could make it.)

Remember those two kids from school that sat together and copied from each other during exams?

They aced every test but were hardly brilliant, remember?

Eventually, the teacher had to set them apart. That was the only way to force them to learn.

The same happens with neural networks.

Sometimes, a few hidden units create associations that, over time, provide most of the predictive power, forcing the network to ignore the rest.

This is called co-adaptation, and it prevents networks from generalizing appropriately.
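To make the analogy concrete, here is a minimal sketch of "inverted" dropout on a list of activations, one common formulation rather than any particular library's implementation. The rate, seed, and activations are illustrative assumptions.

```python
import random


def dropout(activations, rate, rng, training=True):
    """Inverted dropout: zero each unit with probability `rate` during
    training and scale the survivors by 1 / (1 - rate), so the expected
    activation is unchanged. At inference time, units pass through as-is."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)  # Fixed seed so repeated runs give the same mask.
hidden = [1.0] * 10

dropped = dropout(hidden, rate=0.5, rng=rng)
print(dropped)
print(dropout(hidden, rate=0.5, rng=rng, training=False))
```

Because a different random subset of units is silenced on every step, no unit can rely on a specific neighbor always being present.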

Read 7 tweets
21 Oct
I always get Normalization and Standardization mixed up.

But they are different.

Notes about them and why we care.

Feature scaling is key for a lot of Machine Learning algorithms to work well.

We always want all of our data on the same scale.

Imagine we are working with a dataset of workers.

"Age" will range between 16 and 90.
"Salary" will range between 15,000 and 150,000.

Huge disparity!

Salary will dominate any comparisons because of its magnitude.

We can fix that by scaling both features.
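
A small sketch of the two options, under the common convention that normalization means min-max rescaling to [0, 1] and standardization means shifting to zero mean and unit standard deviation. The sample ages and salaries are made up, chosen to fall within the ranges above.

```python
from statistics import mean, pstdev


def normalize(values):
    """Min-max normalization: rescale values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Standardization: subtract the mean, divide by the standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


# Hypothetical workers.
ages = [16, 30, 45, 60, 90]
salaries = [15_000, 40_000, 60_000, 90_000, 150_000]

print([round(v, 2) for v in normalize(ages)])
print([round(v, 2) for v in normalize(salaries)])
print([round(v, 2) for v in standardize(salaries)])
```

After either transformation, age and salary live on comparable scales, so salary no longer dominates distance-based comparisons.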
Read 7 tweets
20 Oct
I'm a full-on AI proponent.

But I really don't like the idea of facial recognition software.

This is why.

▫️ It violates our right to privacy

Do you really want thousands of photos with your face stored in hundreds of databases all over the place?

Photos that will be automatically tagged with your personal information.

And you won't have any control over this.

▫️ Lack of regulations makes this scary.

Who will be able to use this? Do we have to give consent? Can we trust this? How is this information going to be used? For what purposes?

Are we gonna get tracked every time, everywhere?

Read 9 tweets
