
Aug 7 • 7 tweets • 3 min read

Aug 6 • 6 tweets • 2 min read

Use ChatGPT instead.

This is how (Step 2 is the best). 🧵

R Shiny Web Apps take me days to build when I make them from scratch.

ChatGPT has been my secret weapon.

I use a special technique called Prompt Stacking, which is a simple idea.

Here's how Prompt Stacking works:

Aug 5 • 11 tweets • 2 min read

In 3 minutes I'll compress what I learned in 10 years of using correlation to solve business problems.

Let's go!

1. Correlation: Correlation is a statistical measure that describes the extent to which two variables change together. It can indicate whether and how strongly pairs of variables are related.
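To make the definition concrete, here is a minimal sketch of the Pearson correlation coefficient in plain Python. The `pearson_r` helper and the `ad_spend` / `revenue` numbers are made up for illustration; they are not from the thread.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Co-movement of the two variables around their means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: ad spend and revenue that rise together.
ad_spend = [10, 20, 30, 40, 50]
revenue = [25, 41, 62, 79, 103]
r = pearson_r(ad_spend, revenue)
print(f"r = {r:.3f}")
```

A value near +1 or -1 means the two variables move together strongly (in the same or opposite direction), and a value near 0 means little linear relationship.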

Jul 31 • 10 tweets • 3 min read

In 3 minutes, I'll share one secret that took me 3 years to figure out.

When I did, it cut my training time 10X. Let's dive in:

1. XGBoost:

XGBoost (eXtreme Gradient Boosting) is a popular machine learning algorithm, especially for structured (tabular) data. Its claim to fame is winning tons of Kaggle competitions. But more importantly, it's fast, accurate, and easy to use. It's also easy to screw up.

Jul 28 • 11 tweets • 4 min read

(Number 3 is the best) 90% of data science programs don't teach you time series.

But time series forecasting has been one of the most important skills in my career, helping my last company save $15,000,000 per year.

Here are my top 8 resources:

Jul 25 • 10 tweets • 2 min read

And it's the Number 1 technique that companies benefit from in improving customer revenue.

So here are 6 of the most common statistical methods used in A/B testing.

Jul 23 • 10 tweets • 3 min read

In 3 minutes, I'll demolish your confusion.

Let's dive in. 🧵

1. Type 1 Error (False Positive): This occurs when the pregnancy test tells Tom, the man, that he is pregnant. Obviously, Tom cannot be pregnant, so this result is a false alarm. In statistical terms, it's detecting an effect (in this case, pregnancy) when none actually exists.
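A quick simulation can show what a false positive rate looks like in practice. This is a rough sketch, not the thread's own method: it assumes a one-sample z-test with known standard deviation 1, a true mean of 0 (so there is no effect to find), and the usual 1.96 cutoff for a 5% significance level. The `false_positive_rate` helper is hypothetical.

```python
import math
import random


def false_positive_rate(trials=4000, n=30, z_crit=1.96, seed=0):
    """Share of tests that reject the null even though the null is true."""
    rng = random.Random(seed)
    false_positives = 0
    for _ in range(trials):
        # Data generated with NO real effect: true mean is 0, sigma is 1.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = (sum(sample) / n) * math.sqrt(n)  # z-statistic with known sigma = 1
        if abs(z) > z_crit:
            false_positives += 1  # the test "detected" an effect that isn't there
    return false_positives / trials


rate = false_positive_rate()
print(f"false positive rate: {rate:.3f}")
```

The printed rate should land close to the 5% significance level, which is exactly the Type 1 error rate the test was designed to tolerate.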

Jul 20 • 12 tweets • 2 min read

But it took me 1 year to master ARIMA.

In 1 minute, I'll teach you what took me 1 year.

Let's go. 🧵

1. ARIMA and SARIMA are both statistical models used for forecasting time series data, where the goal is to predict future points in the series.

Jul 20 • 8 tweets • 2 min read

This is why. 🧵

#python Polars is a fast and efficient DataFrame library designed for data analysis and manipulation in Rust and Python.

It is built to provide high-performance data processing capabilities, often outperforming traditional libraries like pandas, especially with large datasets.

Jul 18 • 8 tweets • 2 min read

But it took me 2 years to understand its importance.

In 2 minutes, I'll share my best findings over the last 2 years exploring Bayesian Statistics.

Let's go. 🧵 1. Background:

"An Essay towards solving a Problem in the Doctrine of Chances" was published in 1763, two years after Bayes' death.

In this essay, Bayes addressed the problem of inverse probability, which is the basis of what is now known as Bayesian probability.

Jul 5 • 9 tweets • 3 min read

But for years, I had no clue what I was doing. In 3 minutes, I’ll share 3 months of research (business case included).

Let’s go: 🧵

1. XGBoost, which stands for Extreme Gradient Boosting, is an advanced implementation of the gradient boosting machine (GBM) algorithm. It was developed to optimize both computational speed and model performance.

Jul 3 • 5 tweets • 2 min read

Here's how I want to help you do it too: 🧵

#python #rstats Sales forecasting is a struggle for many data scientists.

Most college DS programs don't teach it (or gloss over it).

But it's one of the most important skills that companies need.

There are 2 incredible tools out there that I want to share with you:

Jun 29 • 9 tweets • 2 min read

But it took me 2 years to understand its importance.

In 2 minutes, I'll share my best findings over the last 2 years exploring Bayesian Statistics.

Let's go. 1. Background: "An Essay towards solving a Problem in the Doctrine of Chances" was published in 1763, two years after Bayes' death. In this essay, Bayes addressed the problem of inverse probability, which is the basis of what is now known as Bayesian probability.

May 31 • 11 tweets • 3 min read

But it took me 2 years to figure out mistakes that were killing my regression models.

In 2 minutes, I'll share how I fixed 2 years of mistakes (and made 50% more accurate models than my peers).

Let's go: 🧵 1. R-squared (R²):

R-squared is a statistical measure used in regression models that shows how well the model replicates the observed outcomes. It is the proportion of the total variation in the outcomes that the model explains.
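As a small worked example, R-squared can be computed as 1 minus the ratio of residual variation to total variation. The `r_squared` helper and the numbers below are illustrative assumptions, not values from the thread.

```python
def r_squared(actual, predicted):
    """Proportion of total variation in `actual` explained by `predicted`."""
    mean_actual = sum(actual) / len(actual)
    # Variation the model fails to explain (residual sum of squares).
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Total variation of the outcomes around their mean.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed outcomes and model predictions.
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.5, 7.0, 9.5]
r2 = r_squared(actual, predicted)
print(f"R-squared = {r2:.4f}")
```

A model that always predicts the mean of the outcomes scores 0 on this measure, and a perfect model scores 1.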

May 26 • 9 tweets • 2 min read

But it took me 2 years to understand its importance.

In 2 minutes, I'll share my best findings over the last 2 years exploring Bayesian Statistics. Let's go. 1. Background:

"An Essay towards solving a Problem in the Doctrine of Chances" was published in 1763, two years after Bayes' death.

In this essay, Bayes addressed the problem of inverse probability, which is the basis of what is now known as Bayesian probability.

May 25 • 11 tweets • 4 min read

In 5 minutes, I'll share 5 years of experience using 8 common forecasting algorithms. 🧵 1. ARIMA (AutoRegressive Integrated Moving Average):

Uses a linear regression as the base model.

Captures autoregressive and moving average terms, and differences the raw observations (the "integrated" part) to make the time series stationary.
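The differencing step is easy to sketch on its own. This is only the "I" part of ARIMA, assuming a plain list of numbers; the `difference` helper and the toy series are hypothetical, and a real ARIMA fit would come from a forecasting library.

```python
def difference(series, d=1):
    """Apply d rounds of first-order differencing: y[t] - y[t-1]."""
    values = list(series)
    for _ in range(d):
        values = [later - earlier for earlier, later in zip(values, values[1:])]
    return values


# Toy series with a steady upward trend (not stationary).
trend = [100 + 5 * t for t in range(6)]
print(difference(trend))         # the trend disappears after one difference

# A quadratic trend needs two rounds of differencing (d=2).
curved = [t ** 2 for t in range(5)]
print(difference(curved, d=2))
```

Each round of differencing shortens the series by one point, which is why `d` is usually kept small.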

May 18 • 11 tweets • 3 min read

In 5 minutes, I'll share 5 years of experimentation with dozens of Cross Validation techniques.

Let's dive in. 🧵 1. Cross Validation Goals:

Cross-validation is a statistical method used to estimate the accuracy of machine learning models.

It's also used to measure the stability of models when combined with hyperparameter tuning.
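Here is a minimal sketch of how plain k-fold cross-validation splits the data, assuming no shuffling and no library support. The `k_fold_indices` helper is hypothetical; in practice a library splitter would usually do this job, and time series data needs a splitter that respects time order.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


folds = list(k_fold_indices(10, 3))
for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

Every sample lands in the test set exactly once, so averaging the k test scores gives the accuracy estimate, and the spread of those scores gives a feel for stability.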

May 10 • 10 tweets • 3 min read

In 2 minutes, I'll share my best findings over the last 2 years exploring Bayesian Modeling.

Let's go. 🧵 1. Why Bayesian Data Analysis?

Bayesian modeling is a powerful tool in statistics and data science, especially where traditional approaches fall short.

It avoids arbitrary assumptions and provides distributions of possible values instead of just point estimates.
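To show what "a distribution instead of a point estimate" can look like, here is a small sketch using a Beta-Binomial model for a conversion rate. Everything here is an assumption for illustration: the flat Beta(1, 1) prior, the 12-out-of-100 data, and the helper names `beta_posterior` and `credible_interval`.

```python
import random


def beta_posterior(successes, failures, prior_a=1.0, prior_b=1.0):
    """Beta prior + binomial data gives a Beta posterior (conjugate update)."""
    return prior_a + successes, prior_b + failures


def credible_interval(a, b, level=0.95, draws=5000, seed=0):
    """Approximate central credible interval by sampling from Beta(a, b)."""
    rng = random.Random(seed)
    samples = sorted(rng.betavariate(a, b) for _ in range(draws))
    tail = (1.0 - level) / 2.0
    lo = samples[int(tail * draws)]
    hi = samples[int((1.0 - tail) * draws) - 1]
    return lo, hi


# Hypothetical data: 12 conversions out of 100 visitors.
a, b = beta_posterior(successes=12, failures=88)
posterior_mean = a / (a + b)
lo, hi = credible_interval(a, b)
print(f"posterior mean: {posterior_mean:.3f}")
print(f"95% credible interval: ({lo:.3f}, {hi:.3f})")
```

The point estimate is a single number, while the interval shows the range of conversion rates that remain plausible given the prior and the data.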