More from @DataScienceHarp

Harpreet Sahota 🥑

@DataScienceHarp

Jan 3

🤯 Say goodbye to lifeless textbooks and hello to an exciting way to learn statistics! 💪

I have a master's degree in statistics, but these 11 books taught me more about how statistics works in the real world than any course I've taken.

Have you read any of them?👇🏽🧵

#66DaysOfData

📚🧠Who says learning statistics has to be boring?!

🤓 The Manga Guide to Statistics makes it fun and easy to learn all the basic concepts, with entertaining examples and applications.

Get it here: nostarch.com/mg_statistics.…

Learn to calculate regression equations and perform hypothesis tests with The Manga Guide to Regression Analysis.

You'll also learn simple, multiple, and logistic regression to predict iced tea orders and bakery revenues, and how to calculate confidence intervals and odds ratios.
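
To make "calculating a regression equation" and "odds ratios" concrete, here is a minimal sketch using only the Python standard library. It is not taken from the book: the temperature and order numbers are invented for illustration, and `fit_line` and `odds_ratio` are hypothetical helper names.

```python
from statistics import mean

# Hypothetical data, invented for illustration only:
# daily high temperature (deg C) and iced tea orders on that day.
temps = [22, 25, 27, 30, 33]
orders = [40, 52, 60, 71, 85]


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table.

    a, b: group 1 counts with / without the outcome
    c, d: group 2 counts with / without the outcome
    """
    return (a / b) / (c / d)


slope, intercept = fit_line(temps, orders)
print(f"orders ~= {intercept:.1f} + {slope:.2f} * temperature")
print(f"predicted orders at 28 deg C: {intercept + slope * 28:.1f}")
```

An odds ratio above 1 means the outcome is more likely in group 1 than in group 2; for example, `odds_ratio(10, 5, 4, 8)` compares odds of 10:5 against odds of 4:8.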

Harpreet Sahota 🥑

@DataScienceHarp

Jan 2

You don't need a bootcamp to get started in machine learning.

All you need are the right resources, discipline, and time.

Here are 6 of my favourite FREE resources to get you started.

1/ edX's Machine Learning with Python: A Practical Introduction

This course will give you all the tools you need to get started with supervised and unsupervised learning.

Time commitment: 4 hours a week and you're done in 2 weeks.

edx.org/course/machine…

2/ Cognitive Class' Machine Learning with Python

You'll learn about real-life examples of machine learning and how it affects society in ways you may not have guessed!

Time commitment: 7 hours a week and you'll be done in 2 weeks

cognitiveclass.ai/courses/machin…

Harpreet Sahota 🥑

@DataScienceHarp

Jan 2

The curse of dimensionality is a major roadblock for machine learning practitioners.

But most don't fully understand it.

Don't be left in the dark - join me in this thread as I clarify and demystify this concept 👇🏽🧵

The Curse of Dimensionality (let's just call it "The Curse") refers to problems that occur when you try to use statistical methods in high-dimensional space.

As the number of features (dimensionality) increases, the data becomes sparser, and the number of samples needed to make statistically reliable predictions often grows exponentially.
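
A rough way to see both effects is the sketch below. It is an illustration under stated assumptions, not something from the thread: points are drawn uniformly from the unit hypercube, and `cells_to_cover` and `relative_contrast` are hypothetical helper names.

```python
import math
import random


def cells_to_cover(bins_per_feature, n_features):
    """Grid cells needed when every feature is split into equal bins.

    Keeping one sample per cell requires this many samples, which grows
    exponentially with the number of features.
    """
    return bins_per_feature ** n_features


def relative_contrast(n_features, n_points=200, seed=0):
    """(farthest - nearest) / nearest distance from one random query point
    to n_points random points in the unit hypercube."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(n_features)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(n_features)]
        dists.append(math.dist(query, point))
    return (max(dists) - min(dists)) / min(dists)


for d in (1, 2, 10):
    print(d, "features ->", cells_to_cover(10, d), "cells")

for d in (2, 20, 500):
    print(d, "features -> relative contrast", round(relative_contrast(d), 3))
```

The cell count shows the sparsity problem directly. The relative contrast shrinks as features are added, which means the nearest and farthest neighbours become almost equally far away and distance-based methods lose their grip.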

Harpreet Sahota 🥑

@DataScienceHarp

Dec 30, 2022

Feature selection is a crucial part of building a good machine learning model.

But most data scientists don't think before they select features.

The fact is: feature selection in machine learning is not always necessary.

Here are 5 situations where you don't need it 👇🏽🧵

1. You have a small dataset that doesn't have many features.

If the data you're using is small and doesn't have many features, you don't need to do feature selection.

2. The features are already carefully selected

If the features you're using have already been carefully chosen and are important for the task you are trying to do, you don't need to do feature selection.

Harpreet Sahota 🥑

@DataScienceHarp

Dec 29, 2022

Machine learning and Python go hand in hand.

Ready to take the first step towards a rewarding career in machine learning?

These 4 resources will help you learn Python and get started 👇🏽🧵

#100DaysOfCode #66DaysOfData #DeepLearning

1/ Python Principles

I've never seen anything like this course.

This is a text-based course with an interactive coding environment that will teach you all the basics of Python.

There are lots of challenges and exercises, too.

This should take 2 weeks.

pythonprinciples.com/lessons/

2/ CognitiveClass' Python for Data Science

Spend 1 hour a day and you'll be done in a week.

cognitiveclass.ai/courses/python…

Harpreet Sahota 🥑

@DataScienceHarp

Dec 29, 2022

The number one cause of machine learning model failure is dataset drift.

Yet most data scientists and machine learning practitioners don't know why their datasets are drifting.

Here are 6 of the most common reasons for dataset drift in machine learning 👇🏽🧵

What is dataset drift? It's when the statistical properties of a dataset change over time, which can negatively impact the performance of a machine learning model.

1. Changes in the data distribution:

The distribution of the data used to train the model may change over time, leading to dataset drift. This could be due to changes in the underlying process that generates the data, or due to changes in the data collection process itself.
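
One common way to check a single numeric feature for this kind of change is to compare its training-time values against recent values with the two-sample Kolmogorov-Smirnov statistic. The sketch below is an assumption-laden illustration, not the thread's method: `ks_statistic` is a hypothetical helper, the data is simulated, and the 0.2 alert threshold is an arbitrary placeholder.

```python
import random
from bisect import bisect_right


def ks_statistic(reference, current):
    """Largest gap between the empirical CDFs of two samples (0 to 1).

    0 means the samples look identical; 1 means they do not overlap.
    """
    ref = sorted(reference)
    cur = sorted(current)
    max_gap = 0.0
    # The largest gap between two step functions occurs at a data point.
    for value in ref + cur:
        cdf_ref = bisect_right(ref, value) / len(ref)
        cdf_cur = bisect_right(cur, value) / len(cur)
        max_gap = max(max_gap, abs(cdf_ref - cdf_cur))
    return max_gap


# Simulated feature values: the production mean has shifted from 0 to 1.
rng = random.Random(42)
train_values = [rng.gauss(0.0, 1.0) for _ in range(1000)]
prod_values = [rng.gauss(1.0, 1.0) for _ in range(1000)]

DRIFT_THRESHOLD = 0.2  # arbitrary placeholder, tune per feature
gap = ks_statistic(train_values, prod_values)
print(f"KS statistic: {gap:.3f}, drift flagged: {gap > DRIFT_THRESHOLD}")
```

If SciPy is available, `scipy.stats.ks_2samp` computes the same statistic and also returns a p-value.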
