Rob
Python 🐍 // Data Science 💾 // 3x Kaggle Grandmaster // Live coding is fun 🎙️// Follow on twitch: https://t.co/Nu95NrGTHS & Youtube: https://t.co/WfD4vK0I5M

Aug 5, 2022, 10 tweets

Learning data science is fun, so why do we always use the same boring datasets? It's common to see projects using the iris, cars, or titanic data. Stand out! Check out these 9 datasets I created on #kaggle, perfect for a unique portfolio project. #datascience #datasets🧵👇

1. MrBeast Youtube Stats

Includes metadata for every MrBeast Youtube video, including: title, description, view and comment counts, likes AND thumbnails. Updated daily so you can track this viral sensation’s video trends over time.

🔗 kaggle.com/datasets/robik…

2. Workplace Injury Data

Dataset of over 200k OSHA reportable injuries spanning 5 years. Do some investigative data science to see which industries produce the most injuries and which companies keep their employees safe.

🔗 kaggle.com/datasets/robik…
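A first pass at the "which industries produce the most injuries" question could look like the sketch below. The rows and the `industry` / `company` field names are made up for illustration; the real dataset's columns may well be named differently.

```python
from collections import Counter

# Hypothetical rows: field names and values are assumptions,
# not the actual schema of the OSHA dataset.
records = [
    {"industry": "Construction", "company": "Acme Builders"},
    {"industry": "Warehousing", "company": "ShipFast"},
    {"industry": "Construction", "company": "BuildCo"},
    {"industry": "Manufacturing", "company": "Widgets Inc"},
    {"industry": "Construction", "company": "Acme Builders"},
]


def injuries_by_industry(rows):
    """Count reported injuries per industry, most frequent first."""
    counts = Counter(row["industry"] for row in rows)
    return counts.most_common()


top_industries = injuries_by_industry(records)
print(top_industries)
```

Swap the `industry` key for the company field to get the same ranking per company.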

3. Roller Coaster Metadataset

This dataset contains metrics for ~1,000 different roller coasters from around the world. Includes tons of metrics like top speed, number of flips, year built, and even lat/lon locations so you can plot them on a map!

🔗 kaggle.com/datasets/robik…

4. TextOCR Dataset

Want to beef up your computer vision skills? This is the perfect dataset, with ~1M high-quality word annotations on images. Train a custom model capable of OCR text extraction.

🔗 kaggle.com/datasets/robik…

5. Eye State Classification Dataset

This is the perfect dataset to learn binary classification. See if you can create an algorithm that uses EEG measurements to tell if the subject’s eyes are open or closed.

🔗 kaggle.com/datasets/robik…

6. PGA Tour Golf Data

This dataset contains results from all major PGA events going back to 2015. Once you give it a try you won’t be able to stop yourself from yelling “FORE!”

🔗 kaggle.com/datasets/robik…

7. Zillow Home Values

Monthly measurements of Zillow’s home values for each US state going back to 2000. This small dataset is perfect for any beginner interested in working with time series data.

🔗 kaggle.com/datasets/robik…
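One simple time series exercise on monthly values like these is a year-over-year change. The sketch below uses made-up numbers for a single state and an assumed `YYYY-MM` key format, so treat it as a starting point rather than a description of the actual file layout.

```python
# Hypothetical monthly home values for one state (made-up numbers;
# the "YYYY-MM" key format is an assumption for this sketch).
monthly_values = {
    "2000-01": 100_000.0,
    "2000-07": 103_000.0,
    "2001-01": 106_000.0,
    "2001-07": 110_000.0,
}


def year_over_year_change(values):
    """Return percent change versus the same month one year earlier."""
    changes = {}
    for month, value in values.items():
        year, mm = month.split("-")
        prior = f"{int(year) - 1}-{mm}"
        # Months without a value one year earlier are skipped.
        if prior in values:
            changes[month] = (value - values[prior]) / values[prior] * 100
    return changes


yoy = year_over_year_change(monthly_values)
for month, pct in yoy.items():
    print(f"{month}: {pct:.1f}%")
```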

8. Annotated Car Driving Footage

Multi-object tracking is one of the most cutting-edge fields in computer vision. This dataset provides video footage of cars driving through cities, with labels for every car, pedestrian and stop light.

🔗 kaggle.com/datasets/robik…

9. Historic Global Exchange Rates

This dataset is updated daily with exchange rates from around the world. Do some data exploration to see if you can find unique trends related to world events.

🔗 kaggle.com/datasets/robik…
