Daliana Liu
Data Scientist | Host of 'The Data Scientist Show' | now @predibase, ex-Amazon | Nerdy joke connoisseur 📍SF
May 11, 2022 6 tweets 2 min read
How to do outlier detection in 2022:

Outlier detection is a common use case in tech. Here are 3 techniques you should know (ranked by capabilities):

1. Z-score ★ ★ ★
A Z-score measures how many standard deviations a point is from the mean. For one-dimensional data, points with an absolute Z-score above 3 are likely outliers: they are more than 3 standard deviations away from the mean.

It's easy to use, but it only works for 1-dimensional data.
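
Here is a minimal sketch of the Z-score rule in Python with NumPy. The function name `zscore_outliers`, the sample data, and the choice of the population standard deviation are assumptions made for illustration; only the threshold of 3 comes from the thread.

```python
import numpy as np


def zscore_outliers(values, threshold=3.0):
    """Return a boolean mask marking points whose |z-score| exceeds threshold."""
    x = np.asarray(values, dtype=float)
    std = x.std()  # population standard deviation (ddof=0), an assumption
    if std == 0:
        # All values are identical, so nothing can be flagged.
        return np.zeros(x.shape, dtype=bool)
    z = (x - x.mean()) / std
    return np.abs(z) > threshold


# Made-up sample: 20 readings near 10 plus one extreme value.
data = [10, 12, 11, 9, 10, 11, 10, 9, 12, 10,
        11, 10, 9, 11, 10, 12, 10, 9, 11, 10, 100]
mask = zscore_outliers(data)
outliers = np.asarray(data)[mask]
print(outliers)
```

One caveat with this version: because the outlier itself inflates the mean and the standard deviation, the largest possible Z-score in a sample of n points is sqrt(n - 1). With 10 or fewer points, nothing can score above 3.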
May 11, 2022 6 tweets 1 min read
Why are data scientists leaving their jobs?

Because they found that they are doing "data science engineering" instead of "data science research":

• Most companies have A/B testing tools, and data scientists design metrics and automate reports (BI).

• For ML, it's easy to load a model from sklearn, and the work is in feature engineering and putting models in production.

• When there's no data, you do data engineering.
Apr 12, 2022 4 tweets 1 min read
Don't let a lack of math skills stop you from getting into data science.

The blog below lists some essential skills. I like that it points out that a lot of calculations can be done using Python/R libraries.

Here is how I would prioritize:

• I would spend more time on statistics, linear regression, and probability, because they are the most commonly used concepts and the foundation for solving most problems.