🔥 Matt Dancho (Business Science) 🔥
On a mission to grow your business data science skills and accelerate your career | Get my free 5-day business for data scientists course 👇
Nov 30, 2023 • 6 tweets • 2 min read
90% of data scientists overlook how to design A/B testing experiments.

4 tips for better experiments: 🧵

#DataScience #ABTesting

Tip 1: Include a pre-test

Pre-test data is data collected before the actual A/B test or time-based experiment starts, so it is unaffected by the treatment.

The pre-test is a secret used by Booking.com in their CUPED A/B test method for reducing variance (and improving decision-making from A/B test results).
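
To make the variance-reduction idea concrete, here is a minimal sketch of the CUPED adjustment (Controlled-experiment Using Pre-Experiment Data). It is not Booking.com's implementation. The data is synthetic, the names `cuped_adjust`, `pre`, and `post` are made up for illustration, and only one group is shown. In a real test you would typically estimate theta on the pooled data from both arms and then compare the adjusted means per arm.

```python
import random
from statistics import mean, variance


def cuped_adjust(y, x):
    """Return CUPED-adjusted outcomes: y - theta * (x - mean(x)).

    y: metric measured during the experiment
    x: the same metric measured in the pre-test period (the covariate)
    """
    x_bar, y_bar = mean(x), mean(y)
    # Sample covariance between the pre-test and in-test metric.
    cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (len(x) - 1)
    theta = cov_xy / variance(x)
    return [yi - theta * (xi - x_bar) for xi, yi in zip(x, y)]


# Synthetic example (assumption): in-test spend is correlated with pre-test spend.
rng = random.Random(42)
pre = [rng.gauss(100, 20) for _ in range(2000)]
post = [0.8 * p + rng.gauss(0, 10) for p in pre]

adjusted = cuped_adjust(post, pre)
print("variance before:", round(variance(post), 1))
print("variance after: ", round(variance(adjusted), 1))
```

The adjusted metric keeps the same mean as the original but has a lower variance, which is what lets the same experiment detect smaller effects.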
Nov 27, 2023 • 8 tweets • 2 min read
Both Bayesian and Frequentist approaches to A/B testing have strengths (and weaknesses).

Here's a quick selection guide with 4 Pros/Cons. 🧵

#Bayesian #Frequentist #MachineLearning #ABTesting

💡 4 Reasons for the #Frequentist Approach for A/B testing

1. Fixed Sample Size: Requires pre-determination of sample size. Ideal when sample size cannot change once the test begins.
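
As a rough illustration of what "pre-determination of sample size" looks like, here is a sketch using the standard normal-approximation formula for comparing two conversion rates with a two-sided test. The function name `sample_size_per_group`, the 5% significance level, the 80% power, and the example rates are all assumptions chosen for the example, not values from the thread.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(p_control, p_treatment, alpha=0.05, power=0.8):
    """Approximate users needed per group to detect p_control -> p_treatment."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    var_sum = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    effect = p_treatment - p_control
    return ceil((z_alpha + z_beta) ** 2 * var_sum / effect ** 2)


# Hypothetical scenario: baseline conversion of 10%, hoping to detect a lift to 12%.
n = sample_size_per_group(0.10, 0.12)
print("users needed per group:", n)
```

The smaller the lift you want to detect, the larger the sample you have to commit to before the test starts.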
Nov 20, 2023 • 8 tweets • 2 min read
Stop using frequentist approaches for A/B Testing.

Use Bayesian instead.

Bayesian has 5 key advantages: 🧵

#DataScience #Bayesian #Rstats #Python

1. Intuitive Interpretation:

Bayesian methods provide results in terms of probabilities.

Bayesian probabilities are more intuitive to understand AND more accurate compared to t-test or linear regression p-values.
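
To show what "results in terms of probabilities" can look like, here is a minimal Beta-Binomial sketch that estimates the probability that variant B converts better than variant A. The uniform Beta(1, 1) prior, the function name `prob_b_beats_a`, and the conversion counts are assumptions for illustration only.

```python
import random


def prob_b_beats_a(conv_a, n_a, conv_b, n_b, draws=20000, seed=0):
    """Monte Carlo estimate of P(rate_B > rate_A) under Beta(1, 1) priors."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        # Posterior for each variant: Beta(1 + conversions, 1 + non-conversions).
        rate_a = rng.betavariate(1 + conv_a, 1 + n_a - conv_a)
        rate_b = rng.betavariate(1 + conv_b, 1 + n_b - conv_b)
        wins += rate_b > rate_a
    return wins / draws


# Hypothetical test: A converts 100 of 1,000 visitors, B converts 130 of 1,000.
p = prob_b_beats_a(100, 1000, 130, 1000)
print(f"P(B beats A) is about {p:.2f}")
```

The output is a direct statement like "B is better than A with probability p", which is usually easier to communicate to stakeholders than a p-value.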
Nov 13, 2023 • 9 tweets • 2 min read
12 mistakes that Data Scientists (and even statisticians) make: 🧵

#DataScience #Statistics #DataAnalysis #CommonMistakes #CriticalThinking #DataIntegrity

Even seasoned data professionals can fall into traps.

Here are 12 common mistakes and misconceptions in statistics and data science:

1. Correlation vs Causation 🔄

Mistaking correlation for causation is a classic error! Remember, correlation does not imply causation. 🚫
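
A quick simulation can make this concrete. In the sketch below, a hidden confounder (temperature) drives two variables that never influence each other, yet they still look strongly correlated. All variable names and coefficients are made up for illustration.

```python
import random
from statistics import mean


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    a_bar, b_bar = mean(a), mean(b)
    cov = sum((x - a_bar) * (y - b_bar) for x, y in zip(a, b))
    var_a = sum((x - a_bar) ** 2 for x in a)
    var_b = sum((y - b_bar) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def residuals(y, x):
    """What is left of y after removing its linear dependence on x."""
    x_bar, y_bar = mean(x), mean(y)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
        (xi - x_bar) ** 2 for xi in x
    )
    return [(yi - y_bar) - slope * (xi - x_bar) for xi, yi in zip(x, y)]


rng = random.Random(1)
temperature = [rng.gauss(25, 5) for _ in range(5000)]        # hidden confounder
ice_cream = [2.0 * t + rng.gauss(0, 5) for t in temperature]  # driven by temperature
sunburns = [1.5 * t + rng.gauss(0, 5) for t in temperature]   # also driven by temperature

raw_corr = pearson(ice_cream, sunburns)
# Control for the confounder: correlate what temperature does not explain.
partial_corr = pearson(residuals(ice_cream, temperature), residuals(sunburns, temperature))

print("raw correlation:", round(raw_corr, 2))
print("after controlling for temperature:", round(partial_corr, 2))
```

Ice cream sales do not cause sunburns here. Once temperature is accounted for, the apparent relationship largely disappears.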
Nov 12, 2023 • 9 tweets • 3 min read
Can ChatGPT be used for Time Series?

A thread with #R code.

#rstats

I've been using ChatGPT a lot more. But one question I had is whether or not it could be used for Time Series.

In this thread, we'll:

1. Show that ChatGPT can write time series code in R
2. Provide code examples
3. Show the app that ChatGPT built for me 👇
Oct 21, 2023 • 5 tweets • 2 min read
90% of data scientists struggle with Price Elasticity and Optimization.

Why?

Outliers.

This is how to save your company. (And your career) 🧵

#datascience #stats #BusinessAnalytics

Demand is not static. It's constantly changing. And this costs companies money (and data scientists their careers).

But there's a new technique that can help.

Quantile GAMs.
Oct 13, 2023 • 5 tweets • 2 min read
90% of data scientists struggle with price optimization.

Demand patterns are complex.

I have good news. 🧵

#datascience #PriceElasticity #Optimization #Python #Rstats

Demand patterns are complex.

The competitive landscape is ever-changing.

And prices are elastic, which is costing companies (and data scientists their careers).
Sep 28, 2023 • 9 tweets • 2 min read
12 mistakes that Data Scientists (and even statisticians) make: 🧵

#DataScience #Statistics #DataAnalysis #CommonMistakes #CriticalThinking #DataIntegrity

Even seasoned data professionals can fall into traps.

Here are 12 common mistakes and misconceptions in statistics and data science:

1. Correlation vs Causation 🔄

Mistaking correlation for causation is a classic error! Remember, correlation does not imply causation. 🚫
Sep 26, 2023 • 4 tweets • 2 min read
My new timetk for Python package just got an upgrade!

Time series plotting.

Here are the details. 🧵

#datascience #timeseries #rstats #python

If you're familiar with Time Series in #R then you've probably seen me use my timetk package for R (2,000,000+ downloads).

Time series plotting is one of its best features.

Link: business-science.github.io/timetk/article…
Sep 24, 2023 • 6 tweets • 2 min read
Introducing our new time series package in Python.

And how to learn Time Series Analysis in R and Python for free.

This is how. 🧵

#DataScience #Rstats #Python

Over the past 5 years, I've developed dozens of open-source time series packages in R.

Timetk for R is my most popular time series package with over 2,000,000 downloads.

And time series R fans love anomalize, modeltime, and the 5+ extensions my students and I have developed.
Sep 7, 2023 • 6 tweets • 2 min read
When Clustering in Python for Customer Segmentation...

PCA is a secret weapon in the struggle to scale beyond 50,000 customers.

Here's why. 🧵

#datascience #analytics #python

A common business case is to apply K-means to 100,000 or even 1,000,000 customers.

At that scale, K-means can take several hours: as databases grow, algorithms like K-means and DBSCAN slow down.

I've spent the past 4 weeks researching ways to speed it up.
Sep 3, 2023 • 6 tweets • 2 min read
Python's Scikit Learn has 12 clustering algorithms.

With all these options, why do 90% of data scientists still struggle with clustering and customer segmentation?

#datascience #analytics #python #rstats

👉 Too many algorithms

Having more algorithms makes it challenging to select the right approach.

Fortunately, not all of these options are created equal.

I prefer K-means and HDBSCAN, but I like them for different reasons.

Outliers and scalability can influence which I use and when.
Aug 23, 2023 • 12 tweets • 3 min read
Clustering is a superpower.

Learn it and you're an unstoppable force.

Use these 5 powerful tricks to master clustering in 10 minutes:

#datascience #python #rstats

90% of data scientists struggle with clustering.

As databases increase in size, methods like K-means or DBSCAN can fail or become slow.

But I have 5 power-tricks to help you cluster like a PRO:
Aug 18, 2023 • 12 tweets • 4 min read
I put together these 5 time series tutorials to help you learn more than a college degree will teach you.

Learn in 7 minutes what took me 7 years.

#datascience #rstats

1. (Start Here) Time Series Forecasting Foundations

Use this to immediately 10X your company's forecasting capability.

This is the foundation of forecasting...

...with my Modeltime Forecasting framework.

Article: buff.ly/3RJmVio (https://t.co/zFXL8yYJoq)
Jul 21, 2023 • 8 tweets • 3 min read
ChatGPT was a struggle for me at first.

And, it actually took me months to confidently crush projects with it.

So I put together 5 tips that I wish someone had told me.

Here you go.

#datascience #chatgpt

1. Start small:

A big mistake is trying to do something 'monumental'.

Instead, do small things that add up to big things.
Jul 17, 2023 • 11 tweets • 2 min read
It's 2023 and traditional data science is broken.

There's a new type of Data Professional forming.

Here are the 2023 #datascience #trends I'm seeing... 🧵

#datascientist #career

In the past, companies were loading up on bulky data science teams with:

3 Data Scientists ($129,680/yr) 👨‍🔬👩‍🔬👨‍🔬
2 Data Engineers ($129,244/yr) 👨‍🔧👩‍🔧
2 Machine Learning Engineers ($138,244/yr) 👨‍🍳👩‍🍳
2 Data Analysts ($83,924/yr) 👨‍💼👩‍💼
1 Project Manager ($140,500/yr) 🕵️
Jul 4, 2023 • 12 tweets • 5 min read
Did you know that R has a killer forecasting ecosystem?

Here's everything you need to know in under 2 minutes. 🧵

#rstats #datascience #timeseries

R's forecasting ecosystem is ABSOLUTELY a thing of beauty.

You get:

1. Time Series Analysis
2. Forecasting
3. Ensembles
4. Resampling & Backtesting
5. AutoML
6. Deep Learning

(full disclosure - I created all the R packages I'm about to cover)
Jun 1, 2023 • 8 tweets • 3 min read
I've been experimenting with #chatgpt for #datascience for 16 weeks.

And I now have a process I'm happy with.

Here are the details. 🧵

#datascience #rstats

Using ChatGPT for data science has been a MASSIVE learning curve.

I began using it for complex workflows.

And I FAILED miserably.
May 31, 2023 • 9 tweets • 3 min read
It took me 5 years to feel confident in data science.

True story. 🧵

#datascience #rstats

This is coming from a person who has created two R packages that combine for 1.5 million downloads.

Has trained elite data scientists at Apple, Walmart, Google.

And has built a career teaching students how to become data scientists.

Why did it take so long?
May 31, 2023 • 19 tweets • 7 min read
There are over 2,000 AI tools that have hit the market over the last 365 days.

So I condensed them into the best.

Here are the TOP 15 AI TOOLS for Data Scientists. 🧵

#datascience #rstats #python #career #ai

It's hard not to get excited about #AI. The potential is insane. It's also scary.

And the worst thing you can do for your career is ignore AI.

I mean, there are literally 2,000 new tools that have hit the market in 365 days. So where do you start?

I want to help.
May 31, 2023 • 12 tweets • 3 min read
3 battle-tested skills that every data scientist should have.

(and how to apply them to a job interview) 🧵

#datascience #skills #rstats #python

People don't realize this but I was a data science consultant and corporate trainer...

That was long before I was a "teacher" and a "6-figure data science mentor".

That's where I learned these skills through battle-testing.

And my clients were my test subjects. 🧪🧑‍🔬