BIG NEWS: #ChatGPT breaks #Python vs #R Barriers in Data Science!

Data science teams everywhere rejoice.

A mind-blowing thread (with a FULL ChatGPT prompt walkthrough). 🧵

#datascience #rstats
It's NOT R VS Python ANYMORE!

This is 1 example of how ChatGPT can speed up data science & GET R & PYTHON people working together.

(it blew my mind)
This example combines #R, #Python, and #Docker.

I created this example in under 10 minutes from start to finish.
I’m an R guy.

And I prefer doing my business research & analysis in R.

It's awesome. It has:

1. Tidyverse - data wrangling + visualization
2. Tidymodels - Machine Learning
3. Shiny - Apps
But the rest of my team prefers Python.

And they don't like R... it's just weird to them.

So I wanted to see if I could show them how we could work together...
Let’s start with a prompt.

I asked ChatGPT to find a data set that I used for this example. [Image]
...ChatGPT found it... [Image]
... And gave me this code to read the data... [Image]
I prefer the tidyverse, so I asked ChatGPT to update the code. [Image]
That looks better. [Image]
With the data in hand, it’s time for some Data Science.

I asked this simple question. [Image]
ChatGPT's response was impressive. [Image]
But, even though I’m an R guy, my team uses Python for Deployment…

In the past, that was a huge problem.

(resulting in days of translations from R to Python with Google and StackOverflow)
But now, that’s 1 minute of effort with ChatGPT.

Can I show you?
I asked ChatGPT to convert the R script to Python... [Image]
And in 10 seconds ChatGPT made this Python code with pandas and scikit-learn. [Image]
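A minimal sketch of what a pandas + scikit-learn script of this kind can look like. The real code is only in the screenshot, so the data, the column names `x` and `y`, and the choice of linear regression are all assumptions for illustration.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical stand-in data: the thread's actual data set is not shown in the text.
df = pd.DataFrame({
    "x": [1.0, 2.0, 3.0, 4.0, 5.0],
    "y": [3.0, 5.0, 7.0, 9.0, 11.0],  # constructed so that y = 2 * x + 1
})

# Fit a simple model (an assumed choice, not necessarily the thread's model).
model = LinearRegression()
model.fit(df[["x"]], df["y"])

slope = float(model.coef_[0])
intercept = float(model.intercept_)

# Predict for a new observation, passed as a DataFrame with the same column name.
prediction = float(model.predict(pd.DataFrame({"x": [6.0]}))[0])
print(slope, intercept, prediction)
```

The same fit-then-predict shape maps closely onto a tidymodels workflow in R, which is part of why a translation like this is mechanical.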
ChatGPT did in 10 seconds something that would have taken me 2 hours.

But let’s continue.

The reason we had to convert to Python is “deployment.”

Deployment is just a fancy word for allowing others to access my model so they can use it on-demand.
So I asked ChatGPT this: [Image]
And ChatGPT made me a Python API using FastAPI. [Image]
But this code is useless…

… Without a docker environment.

So I asked ChatGPT to make one: [Image]
And ChatGPT delivered my Docker Environment's Dockerfile: [Image]
So in under 10 minutes, I had ChatGPT:

1. Make my research script in R.

2. Create my production script in Python for my Team.

3. And create the API + Dockerfile to deploy it.
But when I showed my Python team, instead of being excited...

...They were worried.

And I said, "Listen. There's nothing to be afraid of."

"ChatGPT is a productivity enhancer."

They didn't believe me.
My Conclusion:

You have a choice. You can rule AI.

Or, you can let AI rule you.

What do you think the better choice is?
If you want help, I'd like you to join me on a free #ChatGPT for #DataScientists Workshop on April 26th. And I will help you Rule AI.

What's the next step?

👉Register Here: us02web.zoom.us/webinar/regist…

By 🔥 Matt Dancho (Business Science) 🔥
More from @mdancho84

May 31
R-squared is one of the most commonly used metrics to measure performance.

But it took me 2 years to figure out mistakes that were killing my regression models.

In 2 minutes, I'll share how I fixed 2 years of mistakes (and made 50% more accurate models than my peers).

Let's go: 🧵
1. R-squared (R2):

Is a statistical measure used in regression models that provides a measure of how well the observed outcomes are replicated by the model, based on the proportion of total variation of outcomes explained by the model.
2. Range (0 to 1 typically):

R2 typically ranges from 0 to 1.

A higher R2 value indicates a better fit between the prediction and the actual data.

For example, an R2 value of 0.70 suggests that 70% of the variance in the dependent variable is predictable from the independent variable(s).
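A small sketch of the calculation behind that definition, using the standard formula R2 = 1 - SS_res / SS_tot. The numbers are made up for illustration. It also shows why the range is only "typically" 0 to 1: a model that predicts worse than the mean of the data gets a negative R2.

```python
def r_squared(actual, predicted):
    """Proportion of variance in `actual` explained by `predicted`."""
    mean_actual = sum(actual) / len(actual)
    # Residual sum of squares: what the model fails to explain.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Total sum of squares: variation around the mean.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


actual = [1.0, 2.0, 3.0, 4.0]

good_fit = r_squared(actual, [1.0, 2.0, 3.0, 5.0])  # one prediction is off by 1
bad_fit = r_squared(actual, [4.0, 3.0, 2.0, 1.0])   # predictions run the wrong way

print(good_fit, bad_fit)
```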
May 26
Bayes' Theorem is a fundamental concept in data science.

But it took me 2 years to understand its importance.

In 2 minutes, I'll share my best findings over the last 2 years exploring Bayesian Statistics. Let's go.
1. Background:

"An Essay towards solving a Problem in the Doctrine of Chances" was published in 1763, two years after Bayes' death.

In this essay, Bayes addressed the problem of inverse probability, which is the basis of what is now known as Bayesian probability.
2. Bayes' Theorem:

Provides a mathematical formula to update the probability for a hypothesis as more evidence or information becomes available

It describes how to revise existing predictions or theories in light of new evidence, a process known as Bayesian inference.
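A worked sketch of that update, P(H | E) = P(E | H) * P(H) / P(E), using a diagnostic-test style example. The prior, likelihood, and false positive rate are illustrative numbers, not from any real data.

```python
def posterior(prior, likelihood, false_positive_rate):
    """Update P(H) after seeing evidence E.

    prior:               P(H)
    likelihood:          P(E | H)
    false_positive_rate: P(E | not H)
    """
    # Total probability of the evidence across both hypotheses.
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence


# Illustrative numbers: 1% prior, 90% likelihood, 5% false positive rate.
after_one = posterior(0.01, 0.90, 0.05)

# Bayesian inference in action: yesterday's posterior is today's prior.
after_two = posterior(after_one, 0.90, 0.05)

print(after_one, after_two)
```

Each new piece of supporting evidence moves the probability further from the starting prior, which is the "revise in light of new evidence" step described above.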
May 25
Time series forecasting can be confusing, with so many algorithms to navigate.

In 5 minutes, I'll share 5 years of experience using 8 common forecasting algorithms. 🧵
1. ARIMA (AutoRegressive Integrated Moving Average):

A linear model: the forecast is a linear combination of past values and past forecast errors.

Captures autoregressive and moving average terms, and differences the raw observations (the "integrated" part) to make the time series stationary.
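A tiny sketch of the differencing step only (the "I" in ARIMA), not a full ARIMA fit. The series is made up: a steady upward trend whose first difference is constant, which is the kind of stationarity the differencing is after.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for each t where both exist."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


# Hypothetical series with a linear trend (not stationary: the mean keeps rising).
trend = [10, 13, 16, 19, 22]

# After first differencing, the trend is gone and only the step size remains.
differenced = difference(trend)
print(differenced)
```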
2. Prophet:

A forecasting tool developed by Facebook

Designed for data with daily observations and seasonal patterns, employing an additive model with yearly, weekly, and daily components

Local forecasting: fits one model per time series, so forecasting many series requires a loop.
May 18
When I was first learning data science, one of the things that tripped me up the most was Cross Validation.

In 5 minutes, I'll share 5 years of experimentation with dozens of Cross Validation techniques.

Let's dive in. 🧵
1. Cross Validation Goals:

Cross-validation is a statistical method used to estimate the accuracy of machine learning models

It's also used to measure the stability of models when combined with hyperparameter tuning of machine learning models.
2. Principles & Terminology:

The main principle behind cross-validation is partitioning a sample of data into complementary subsets, performing the analysis on one subset, and validating the analysis on the other subset (called the assessment set).
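A minimal sketch of that partitioning for k-fold cross-validation, in plain Python. The function name `k_fold_splits` is hypothetical, and the sketch skips shuffling, which real workflows usually add before splitting.

```python
def k_fold_splits(n_samples, k):
    """Split row indices 0..n_samples-1 into k (analysis, assessment) pairs."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    splits = []
    start = 0
    for size in fold_sizes:
        assessment = indices[start:start + size]              # held-out rows
        analysis = indices[:start] + indices[start + size:]   # rows used for fitting
        splits.append((analysis, assessment))
        start += size
    return splits


splits = k_fold_splits(6, 3)
for analysis, assessment in splits:
    print("fit on", analysis, "validate on", assessment)
```

Every row lands in exactly one assessment set, so each observation is used for validation once and for fitting k - 1 times.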
May 10
Bayesian data analysis is a fundamental concept in data science. But it took me 2 years to understand its importance.

In 2 minutes, I'll share my best findings over the last 2 years exploring Bayesian Modeling.

Let's go. 🧵
1. Why Bayesian Data Analysis?

Bayesian modeling is a powerful tool in statistics and data science, especially where traditional approaches fall short.

It avoids arbitrary assumptions and provides distributions of possible values instead of just point estimates.
2. Bayes Theorem:

Bayesian modeling is based on Bayes’ theorem.

Bayes' Theorem provides a mathematical formula to update the probability for a hypothesis as more evidence or information becomes available.

It describes how to revise existing predictions or theories in light of new evidence, a process known as Bayesian inference.
May 8
Why data scientists should stop ignoring AI.

A thread 🧵
I get it. Yet another "hypecycle".

In 2016 it was Deep Learning.

Now it's Generative AI. Right?

Wrong. This is why.
1. Generative AI is a 10X complement to Data Science

In the past, deep learning had limited uses in Business Intelligence, Data Analytics, and in particular within Data Science for Business contexts like working with Tabular data.

Generative AI is the opposite. Instead of trying to improve on Machine Learning, generative AI adds a superpower of automation.