🔥 Matt Dancho (Business Science) 🔥
May 20, 2023 • 7 tweets • 5 min read
As a data scientist, productivity is a 10X superpower.

Here's a short list of AI tools to help data scientists with: 🧵

#ai #datascience #career #skills #tools
1. Writing code

AI pair programming is a huge benefit.

Tools like #chatgpt & GitHub #copilot can help debug complex code and replace Googling + Stack Overflowing for common scripting.

Key skill: ChatGPT prompting (more on this in my free ChatGPT for Data Scientists workshop)
2. Code Quality & Documentation

Great products have great documentation. AI can help produce documentation, comment code, and replace time-consuming manual documentation with automated AI docs.

Key Skill: Using @mintlify to build your docs: mintlify.com
3. Presentations

Great data scientists are storytellers. Use persuasion to your advantage.

Key Skill: Generating images with AI using @midjourney_ai: midjourney.com
I'm road-testing all of these.

And I've been quietly researching #ChatGPT for Data Scientists (My NUMBER 1 TOOL) for the past 4 months.

I have good news: I'm ready to reveal my ChatGPT research!
If you want to understand how ChatGPT can make you a better data scientist (and mistakes to avoid)...

I'll be sharing my research in a Free WORKSHOP: ChatGPT for Data Scientists (Wednesday, June 7th)!
What's Your Next Step?

Join me and 1,000 data scientists as we crush AI in my LIVE ChatGPT for Data Scientists Workshop.

Seats are limited (1,000 max).

👉 Register Here: us02web.zoom.us/webinar/regist…


More from @mdancho84

May 18
When I was first learning data science, one of the things that tripped me up the most was Cross Validation.

In 5 minutes, I'll share 5 years of experimentation with dozens of Cross Validation techniques.

Let's dive in. 🧵
1. Cross Validation Goals:

Cross-validation is a statistical method used to estimate the accuracy of machine learning models.

When combined with hyperparameter tuning, it's also used to measure how stable a model's performance is.
2. Principles & Terminology:

The main principle behind cross-validation is partitioning a sample of data into complementary subsets, performing the analysis on one subset, and validating the analysis on the other subset (called the assessment set).
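
The partitioning described above can be sketched in a few lines. This is a minimal k-fold illustration using only the Python standard library, not the author's own code: the function name `k_fold_indices` is hypothetical, and the split is left unshuffled to keep the example deterministic (in practice you would usually shuffle first).

```python
from typing import List, Tuple


def k_fold_indices(n_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k (analysis, assessment) index pairs.

    Hypothetical helper for illustration: each fold is used once as the
    assessment set, and the remaining rows form the analysis set.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")

    indices = list(range(n_samples))
    # Spread the remainder so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]

    folds = []
    start = 0
    for size in fold_sizes:
        assessment = indices[start:start + size]
        analysis = indices[:start] + indices[start + size:]
        folds.append((analysis, assessment))
        start += size
    return folds


if __name__ == "__main__":
    for analysis, assessment in k_fold_indices(10, 3):
        print("analysis:", analysis, "assessment:", assessment)
```

Every row lands in exactly one assessment set, so each observation is used for validation once and for fitting k - 1 times.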
Read 11 tweets
May 10
Bayesian data analysis is a fundamental concept in data science. But it took me 2 years to understand its importance.

In 2 minutes, I'll share my best findings over the last 2 years exploring Bayesian Modeling.

Let's go. 🧵
1. Why Bayesian Data Analysis?

Bayesian modeling is a powerful tool in statistics and data science, especially where traditional approaches fall short.

It avoids arbitrary assumptions and provides distributions of possible values instead of just point estimates.
2. Bayes' Theorem:

Bayesian modeling is based on Bayes' theorem.

Bayes' Theorem provides a mathematical formula to update the probability for a hypothesis as more evidence or information becomes available.

It describes how to revise existing predictions or theories in light of new evidence, a process known as Bayesian inference.
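
Written out, the update described above looks like this. The symbols H (hypothesis) and E (evidence) are my notation, and the numbers in the worked example are made up purely for illustration.

```latex
% Bayes' theorem: posterior = likelihood * prior / evidence
P(H \mid E) = \frac{P(E \mid H)\, P(H)}{P(E)},
\qquad
P(E) = P(E \mid H)\, P(H) + P(E \mid \neg H)\, P(\neg H)

% Hypothetical example: prior P(H) = 0.01,
% P(E \mid H) = 0.90, P(E \mid \neg H) = 0.05
P(H \mid E) = \frac{0.90 \times 0.01}{0.90 \times 0.01 + 0.05 \times 0.99}
            = \frac{0.009}{0.0585} \approx 0.154
```

Even with strong evidence, the posterior stays modest here because the prior was low. That is the "revise in light of new evidence" step in action.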
Read 10 tweets
May 8
Why data scientists should stop ignoring AI.

A thread 🧵
I get it. Yet another "hypecycle".

In 2016 it was Deep Learning.

Now it's Generative AI. Right?

Wrong. This is why.
1. Generative AI is a 10X complement to Data Science

In the past, deep learning had limited uses in Business Intelligence, Data Analytics, and in particular within Data Science for Business contexts like working with Tabular data.

Generative AI is the opposite. Instead of trying to improve on Machine Learning, generative AI adds a superpower of automation.
Read 8 tweets
May 6
The concept that helped me go from bad models to good models: Bias and Variance. In 4 minutes, I'll share 4 years of experience in managing bias and variance in my machine learning models.

Let's go. 🧵
1. Generalization:

Bias and variance control your model's ability to generalize on new, unseen data, not just the data it was trained on. The goal in machine learning is to build models that generalize well. To do so, I manage bias and variance.
2. Low vs High Bias:

Models with low bias are usually complex and can capture the underlying patterns in data very well.

Models with high bias are overly simple and cannot capture the complexity in the data. They often underfit the training data.
Read 11 tweets
May 4
Principal Component Analysis (PCA) is the gold standard in dimensionality reduction with uses in business. In 5 minutes, I'll teach you what took me 5 weeks. Let's go! 🧵
1. What is PCA?:

PCA is a statistical technique used in data analysis, mainly for dimensionality reduction.

It's beneficial when dealing with large datasets with many variables, and it helps simplify the data's complexity while retaining as much variability as possible.
2. How PCA Works:

PCA has 5 steps:

1. Standardization
2. Covariance Matrix Computation
3. Eigenvector Calculation
4. Choosing Principal Components
5. Transforming the data.

Let's break them down.
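
As a rough sketch, the five steps listed above can be walked through in plain Python. This is an illustration under stated assumptions, not the author's implementation: all function names are hypothetical, only the first principal component is kept, and power iteration stands in for a full eigendecomposition (which you would normally get from a linear algebra library).

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def standardize(rows: Matrix) -> Matrix:
    """Step 1: center each column and scale it to unit sample variance.

    Assumes no column is constant (a zero standard deviation would divide by zero).
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1)) for j in range(d)]
    return [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]


def covariance_matrix(z: Matrix) -> Matrix:
    """Step 2: sample covariance of already-centered data."""
    n, d = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(d)] for i in range(d)]


def top_eigenvector(matrix: Matrix, iterations: int = 200) -> List[float]:
    """Step 3 (simplified): power iteration for the leading eigenvector.

    Assumes the starting vector is not orthogonal to the leading eigenvector.
    """
    d = len(matrix)
    v = [1.0 / (i + 1) for i in range(d)]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def first_principal_component(rows: Matrix) -> Tuple[List[float], List[float]]:
    """Steps 4 and 5: keep the top component and project the data onto it."""
    z = standardize(rows)
    direction = top_eigenvector(covariance_matrix(z))
    scores = [sum(a * b for a, b in zip(r, direction)) for r in z]
    return direction, scores


if __name__ == "__main__":
    data = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.2], [5.0, 9.8]]
    direction, scores = first_principal_component(data)
    print("direction:", direction)
    print("scores:", scores)
```

The two columns in the toy data move together, so a single component captures nearly all of the variability and the two-column table collapses to one score per row.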
Read 11 tweets
Nov 30, 2023
90% of data scientists overlook how to design A/B Testing experiments.

4 tips for better experiments: 🧵

#DataScience #ABTesting
Tip 1: Include a pre-test

Pre-test data is data collected before the actual A/B test or time-based experiment starts, so it is unaffected by the treatment.

Pre-test is a secret used by Booking(dot)com in their CUPED A/B Test method for reducing variance (and improving decision-making from A/B Test results).
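
Here is a minimal sketch of the CUPED adjustment in its commonly published form, where each user's pre-test metric is used as a covariate. It is not Booking.com's implementation: the function name `cuped_adjust` and the sample numbers are hypothetical, and in a real test the coefficient is typically estimated on the pooled data from both groups.

```python
from statistics import mean, variance
from typing import List


def cuped_adjust(metric: List[float], pre_metric: List[float]) -> List[float]:
    """Return CUPED-adjusted metric values (hypothetical helper).

    adjusted_i = y_i - theta * (x_i - mean(x)), with theta = cov(x, y) / var(x),
    where x is the pre-test metric and y is the in-experiment metric.
    """
    if len(metric) != len(pre_metric):
        raise ValueError("metric and pre_metric must have the same length")

    pre_mean = mean(pre_metric)
    metric_mean = mean(metric)
    # Sample covariance between the pre-test and in-experiment metric.
    cov = sum(
        (x - pre_mean) * (y - metric_mean) for x, y in zip(pre_metric, metric)
    ) / (len(metric) - 1)
    theta = cov / variance(pre_metric)
    return [y - theta * (x - pre_mean) for x, y in zip(pre_metric, metric)]


if __name__ == "__main__":
    pre = [10.0, 12.0, 9.0, 15.0, 11.0, 14.0]   # made-up pre-test values
    post = [11.0, 13.0, 9.0, 16.0, 12.0, 14.0]  # made-up in-experiment values
    adjusted = cuped_adjust(post, pre)
    print("variance before:", variance(post))
    print("variance after: ", variance(adjusted))
```

The adjustment leaves the mean unchanged but removes the part of the variance that the pre-test data already explains, which is why the same effect becomes easier to detect.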
Tip 2: Factor in time to effect

For online conversions, sales effects can take time. Your experiment should factor in this delay.

A different technique, called Causal Impact, can be more important, especially if the conversion involves a longer sales cycle or process.
Read 6 tweets
