Modeling in R is extremely powerful for business analytics...

But many beginners get stuck.

Here's my simple 3-step process to make a linear regression model in R. 🧵

#datascience #business #R #rstats
To give some background, these simple 5 lines of code create a basic business solution...

... I'm modeling my ...

Target = bicycle product prices (regression task)
As a function of my predictors:

Predictors = product categories (mountain or road bikes) and bicycle frame material (aluminum or carbon fiber).
Here's how I make a linear regression model and make predictions using the Tidymodels system in R...
👉Step 1: Make a model specification

Here I'm using linear_reg() to create a regression model.

I use set_engine() to specify "lm" for linear model.
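A minimal sketch of what Step 1 might look like, assuming the tidymodels packages are installed. The object name `model_spec` is my own label for illustration, not something from the thread.

```r
library(tidymodels)

# Step 1: describe the model without fitting anything yet.
# linear_reg() declares the model type; set_engine("lm") says to fit it
# with base R's lm() under the hood.
model_spec <- linear_reg() %>%
  set_engine("lm")

# Printing the spec shows the model type and the chosen engine.
print(model_spec)
```

Nothing is estimated at this point. The specification is just a recipe that gets handed to fit() in the next step.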
👉Step 2: Fit the model.

I'm using fit() to specify which columns in the data are the target (my product prices) and the predictors (product category and frame material).

Make sure to fit the model on a training data set (called in-sample).
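Here is a rough sketch of the fitting step. The data frame `bikes_train`, its column names, and every value in it are made up for illustration, since the original bike data is not included in the thread.

```r
library(tidymodels)

# Hypothetical in-sample (training) data: invented prices, categories, and
# frame materials standing in for the real bike table.
bikes_train <- data.frame(
  price          = c(3200, 5400, 1800, 6100, 2500, 4700, 1500, 5900),
  category       = c("Mountain", "Mountain", "Road", "Road",
                     "Mountain", "Road", "Road", "Mountain"),
  frame_material = c("Aluminum", "Carbon", "Aluminum", "Carbon",
                     "Aluminum", "Carbon", "Aluminum", "Carbon")
)

# Step 2: the formula puts the target on the left of ~ and the predictors
# on the right; data = is the training set the model learns from.
model_fit <- linear_reg() %>%
  set_engine("lm") %>%
  fit(price ~ category + frame_material, data = bikes_train)

# The underlying lm object is stored in model_fit$fit.
print(coef(model_fit$fit))
```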
👉Step 3: Make Predictions.

This step is super easy.

Take the fitted model from Step 2, and use the predict() function.

PRO TIP: Make sure that new_data is the unseen data (out-of-sample)
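The prediction step could be sketched like this. Both `bikes_train` and `bikes_new` are hypothetical data frames with invented values. The point is only that the rows passed as new_data were not used for fitting.

```r
library(tidymodels)

# Hypothetical training data (in-sample); all values are made up.
bikes_train <- data.frame(
  price          = c(3200, 5400, 1800, 6100, 2500, 4700, 1500, 5900),
  category       = c("Mountain", "Mountain", "Road", "Road",
                     "Mountain", "Road", "Road", "Mountain"),
  frame_material = c("Aluminum", "Carbon", "Aluminum", "Carbon",
                     "Aluminum", "Carbon", "Aluminum", "Carbon")
)

# Hypothetical unseen bikes (out-of-sample): predictors only, no price.
bikes_new <- data.frame(
  category       = c("Road", "Mountain"),
  frame_material = c("Carbon", "Aluminum")
)

# Steps 1 and 2: specify and fit on the training data.
model_fit <- linear_reg() %>%
  set_engine("lm") %>%
  fit(price ~ category + frame_material, data = bikes_train)

# Step 3: predict() returns a tibble with one row per new bike and a
# .pred column holding the predicted price.
predictions <- predict(model_fit, new_data = bikes_new)
print(predictions)
```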
Step 4: Keep Learning

If you want to keep learning, I have a free 40-minute webinar that will help you learn...

... The 10 skills that helped me become a data scientist for business.
You will learn which tools I use for:

- Data Wrangling & Visualization
- Machine Learning
- Time Series
- Web Applications

Plus 6 more...
Here's a link to my free 40-Minute Masterclass on How to Become A 6-Figure Data Scientist With R.

Enjoy!

learn.business-science.io/free-rtrack-ma…
