Why does every beginner data scientist fall for the "deep learning trap"?

True story 🧵

#rstats #datascience #deeplearning
When I was first learning data science, this cost me at least 6 months. Seriously...

I was building a model for predicting which quotes would become orders.
I had just finished building a predictive model with linear regression (I didn't know about logistic regression yet).

Yeah, I know: I was a newbie using regression instead of classification. So what?!
Well, eventually I found out about logistic regression and built my first usable model. Win!
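
To see why linear regression is a poor fit for a yes/no outcome like "did this quote become an order?", here is a minimal sketch in Python. The data is made up for illustration (hypothetical quote sizes and order flags), and the helper names `fit_linear` and `fit_logistic` are assumptions, not anything from the original model.

```python
import math

# Hypothetical toy data: quote size (arbitrary units) and whether
# the quote became an order (1) or not (0). Not real data.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [0, 0, 0, 1, 0, 1, 1, 1]


def fit_linear(x, y):
    """Ordinary least squares for y = a + b*x (closed form)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    a = my - b * mx
    return a, b


def sigmoid(z):
    """Map any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(x, y, lr=0.1, steps=5000):
    """Logistic regression fitted by plain gradient descent on the log loss."""
    a, b = 0.0, 0.0
    n = len(x)
    for _ in range(steps):
        errors = [sigmoid(a + b * xi) - yi for xi, yi in zip(x, y)]
        grad_a = sum(errors) / n
        grad_b = sum(e * xi for e, xi in zip(errors, x)) / n
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b


a_lin, b_lin = fit_linear(x, y)
a_log, b_log = fit_logistic(x, y)

# Compare the two models on a small, a mid-range, and a large quote.
for new_x in (-5.0, 4.5, 15.0):
    linear_pred = a_lin + b_lin * new_x
    logistic_pred = sigmoid(a_log + b_log * new_x)
    print(f"x={new_x:5.1f}  linear={linear_pred:6.3f}  logistic={logistic_pred:6.3f}")
```

The linear model's output is an unbounded number, so for extreme inputs it drifts below 0 or above 1 and cannot be read as a probability. The logistic model passes the same kind of linear score through a sigmoid, so its output always stays between 0 and 1.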

Here's what happened next.
I tried to improve my model with Deep Learning. HAHAHAHAH. Big mistake.

Oh god, this is funny.
So I began researching deep learning BECAUSE I saw someone tweet about TensorFlow and Keras.

And then I saw another person talk about Torch on LinkedIn.

And I heard them saying, "This is so crazy, I can predict images."
What were they doing?

Predicting cats and dogs.

But like a newbie, I said, "I gotta learn this."
You know, if they can predict CATS AND DOGS, then this deep learning stuff has to be good, right?!

WRONG.
I spent 3 months researching TensorFlow.

Then I spent another month learning Keras because TensorFlow was a flipping nightmare.

Then I spent another month trying to build a classification model.
So anyway, after 6 months I finally got SOMETHING that worked.

And it was WORSE than my linear regression model.
I'm not even talking about my logistic regression model, the one that was actually getting decent results.

I'm talking about LINEAR REGRESSION, which has no business being used for CLASSIFICATION.
So yeah, deep learning was worse than my crappy linear regression classification model. 😡
But the good news is that the logistic regression model made my company $15,000,000.

So at the end of the day I still got promoted, with a hefty 100% increase in salary.
My point is, when you're learning, there are 1,000 things you CAN learn.

But what SHOULD you learn?

I'd like to help.
I put together a FREE 40-minute webinar that consolidates the 10 things that helped me the most in my journey.

This webinar, if you watch it through to the end, CAN save you years of learning data science.

learn.business-science.io/free-rtrack-ma…
