How my life is changing as a direct result of attending the #RStudioConf 🧵

#rstats
Just 3 days ago, I had the pleasure of watching the #rstudioconf2022 kick off.

I've been attending since 2018 and watching even longer than that.

And, I was just a normal spectator in the audience until this happened.
@topepos and @juliasilge's keynote showcased the open source work their team has done to build #tidymodels, the best machine learning ecosystem in R.

And then they brought this slide up.
Max and Julia then proceeded to talk about how the community members have been working on expanding the ecosystem.

- Text Recipes for Text
- Censored for Survival Modeling
- Stacks for Ensembles

And then they announced me and my work on Modeltime for Time Series!!!
I had no clue this was going to happen.

Just a spectator in the back.

My friends on both sides went nuts. Hugs, high-fives, and all.

My students in my Slack channel went even more nuts.
Throughout the rest of the week, I was on cloud nine.

My students who were at the conf introduced themselves.

Much of our discussion centered around Max & Julia's keynote and the exposure that modeltime got.
And none of this would be possible without the support of one company: RStudio / Posit.

So, I'm honored to be part of something bigger than just a programming language.

And if you'd like to learn more about what I do, I'll share a few links.
The first is my modeltime package for #timeseries.

This has been a 2+ year passion project to build the premier time series forecasting system.

It now has multiple extensions including ensembles, resampling, deep learning, and more.

business-science.github.io/modeltime/
The second is my company @bizScienc.

For the past 4 years I've dedicated myself to teaching students how to apply data science to business.

I have 3,000+ students worldwide.

Here are some of my tribe that I met at #rstudioconf2022.
The third is my 40-minute webinar.

I put a free presentation together to help you on your journey to become a data scientist.

A few things I talk about:

- Modeltime for Time Series
- Tidymodels & H2O for Machine Learning
- Shiny for Web Apps
- and 7 more!

learn.business-science.io/free-rtrack-ma…

