BIG NEWS: #ChatGPT breaks #Python vs #R Barriers in Data Science!

Data science teams everywhere rejoice.

A mind-blowing thread (with a FULL ChatGPT prompt walkthrough). 🧵

#datascience #rstats
It's NOT R VS Python ANYMORE!

This is 1 example of how ChatGPT can speed up data science & GET R & PYTHON people working together.

(it blew my mind)
This example combines #R, #Python, and #Docker.

I created this example in under 10 minutes from start to finish.
I’m an R guy.

And I prefer doing my business research & analysis in R.

It's awesome. It has:

1. Tidyverse - data wrangling + visualization
2. Tidymodels - Machine Learning
3. Shiny - Apps
But the rest of my team prefers Python.

And they don't like R... it's just weird to them.

So I wanted to see if I could show them how we could work together...
Let’s start with a prompt.

I asked ChatGPT to find the data set I used for this example. [image]
...ChatGPT found it... [image]
...And gave me this code to read the data... [image]
I prefer the tidyverse, so I asked ChatGPT to update the code. [image]
That looks better. [image]
With the data in hand, it’s time for some Data Science.

I asked this simple question. [image]
ChatGPT's response was impressive. [image]
But, even though I’m an R guy, my team uses Python for Deployment…

In the past, that was a huge problem.

(resulting in days of translations from R to Python with Google and StackOverflow)
But now, that’s 1 minute of effort with ChatGPT.

Can I show you?
I asked ChatGPT to convert the R script to Python... [image]
And in 10 seconds ChatGPT made this Python code with pandas and scikit-learn. [image]
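The screenshot is not reproduced here, so ChatGPT's actual output is unknown. As a rough sketch of what an R-to-Python translation with pandas and scikit-learn might look like, assuming a tidymodels-style workflow (split, fit, evaluate): the toy data frame, its column names, and the choice of a linear model below are all hypothetical stand-ins, not the data set or model from the thread.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Hypothetical toy data; the thread's real data set is not shown.
df = pd.DataFrame({
    "hours": list(range(1, 21)),
    "rate": [10.0 + (i % 4) for i in range(20)],
})
# An exactly linear target, so the fit is easy to sanity-check.
df["cost"] = 3.0 * df["hours"] + 2.0 * df["rate"] + 5.0

# Rough analogue of an initial train/test split in tidymodels.
X_train, X_test, y_train, y_test = train_test_split(
    df[["hours", "rate"]], df["cost"], test_size=0.25, random_state=42
)

# Rough analogue of specifying and fitting a linear regression in R.
model = LinearRegression().fit(X_train, y_train)

# Score on the held-out rows.
r2 = r2_score(y_test, model.predict(X_test))
print(f"held-out R^2: {r2:.3f}")
```

The point is the shape of the translation, not the specifics: a data frame in, a split, a fitted estimator, and a held-out metric map fairly directly between the two ecosystems.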
ChatGPT did in 10 seconds something that would have taken me 2 hours.

But let’s continue.

The reason we had to convert to Python is “deployment.”

Deployment is just a fancy word for allowing others to access my model so they can use it on demand.
So I asked ChatGPT this: [image]
And ChatGPT made me a Python API using FastAPI. [image]
But this code is useless…

…without a Docker environment.

So I asked ChatGPT to make one: [image]
And ChatGPT delivered the Dockerfile for my Docker environment: [image]
So in under 10 minutes, I had ChatGPT:

1. Make my research script in R.

2. Create my production script in Python for my team.

3. And create the API + Dockerfile to deploy it.
But when I showed my Python team, instead of being excited...

...They were worried.

And I said, "Listen. There's nothing to be afraid of."

"ChatGPT is a productivity enhancer."

They didn't believe me.
My Conclusion:

You have a choice. You can rule AI.

Or, you can let AI rule you.

What do you think the better choice is?
If you want help, I'd like you to join me for a free #ChatGPT for #DataScientists Workshop on April 26th. And I will help you Rule AI.

What's the next step?

👉Register Here: us02web.zoom.us/webinar/regist…

Thread by 🔥 Matt Dancho (Business Science) 🔥

