BIG NEWS: #ChatGPT breaks #Python vs #R Barriers in Data Science!

Data science teams everywhere rejoice.

A mind-blowing thread (with a FULL ChatGPT prompt walkthrough). 🧵

#datascience #rstats
It's NOT R VS Python ANYMORE!

This is 1 example of how ChatGPT can speed up data science & GET R & PYTHON people working together.

(it blew my mind)
This example combines #R, #Python, and #Docker.

I created this example in under 10 minutes from start to finish.
I’m an R guy.

And I prefer doing my business research & analysis in R.

It's awesome. It has:

1. Tidyverse - data wrangling + visualization
2. Tidymodels - Machine Learning
3. Shiny - Apps
But the rest of my team prefers Python.

And they don't like R... it's just weird to them.

So I wanted to see if I could show them how we could work together...
Let’s start with a prompt.

I asked ChatGPT to find a data set that I used for this example. [Image]
...ChatGPT found it... [Image]
...And gave me this code to read the data... [Image]
I prefer the tidyverse, so I asked ChatGPT to update the code. [Image]
That looks better. [Image]
With the data in hand, it’s time for some Data Science.

I asked this simple question. [Image]
ChatGPT's response was impressive. [Image]
But, even though I’m an R guy, my team uses Python for Deployment…

In the past, that was a huge problem.

(resulting in days of translations from R to Python with Google and StackOverflow)
But now, that’s 1 minute of effort with ChatGPT.

Can I show you?
I asked ChatGPT to convert the R script to Python... [Image]
And in 10 seconds ChatGPT made this Python code with pandas and scikit-learn. [Image]
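
For readers without the screenshot, here is a minimal sketch of what a pandas + scikit-learn script of this kind might look like. The thread's actual code lives only in the image, so everything below is an assumption: the data is synthetic, and the column names (tenure_months, monthly_charges, churn) and the choice of logistic regression are hypothetical stand-ins, not the model from the thread.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in data: the thread's real data set is not reproduced here.
rng = np.random.default_rng(42)
n = 500
df = pd.DataFrame({
    "tenure_months": rng.integers(1, 72, size=n),
    "monthly_charges": rng.uniform(20, 120, size=n),
})

# Synthetic target: short-tenure, high-charge customers churn more often.
logit = 0.03 * df["monthly_charges"] - 0.05 * df["tenure_months"] - 1.0
df["churn"] = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

# Split features and target, holding out 20% of rows for evaluation.
X = df[["tenure_months", "monthly_charges"]]
y = df["churn"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Scale the features, then fit a logistic regression classifier.
model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
print(f"Held-out accuracy: {accuracy:.2f}")
```

The pipeline plays roughly the role a tidymodels workflow plays in R: preprocessing and model travel together as one object, which makes the later hand-off to deployment simpler.
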
ChatGPT did in 10 seconds something that would have taken me 2 hours.

But let’s continue.

The reason we had to convert to Python is “deployment.”

Deployment is just a fancy word for allowing others to access my model so they can use it on-demand.
So I asked ChatGPT this: [Image]
And ChatGPT made me a Python API using FastAPI. [Image]
But this code is useless…

… Without a docker environment.

So I asked ChatGPT to make one: [Image]
And ChatGPT delivered the Dockerfile for my Docker environment: [Image]
So in under 10 minutes, I had ChatGPT:

1. Make my research script in R.

2. Create my production script in Python for my team.

3. And create the API + Dockerfile to deploy it.
But when I showed my Python team, instead of being excited...

...They were worried.

And I said, "Listen. There's nothing to be afraid of."

"ChatGPT is a productivity enhancer."

They didn't believe me.
My Conclusion:

You have a choice. You can rule AI.

Or, you can let AI rule you.

What do you think the better choice is?
If you want help, I'd like you to join me on a free #ChatGPT for #DataScientists Workshop on April 26th. And I will help you Rule AI.

What's the next step?

👉 Register Here: us02web.zoom.us/webinar/regist…

Thread by 🔥 Matt Dancho (Business Science) 🔥 (@mdancho84)

