Job postings for entry-level data scientists are nonsense.
Don't try to meet all their requirements.
This is what you need to do instead 👇👇👇
Do not try to tick all the boxes in these long job postings.
Because you will go crazy.
And because it is a lie that you need to rock at Python, SQL, ETL design, data visualization, Deep Learning, and Metaphysics to land an entry-level job in data science.
So why are companies asking for all these things?
Well, because most of them do not have a clue about data science, so they copy-paste the job descriptions they see at top tech companies.
Fear of missing out (FOMO) pushes normal companies to ask for things they do not even need.
Here are 2 steps that every real-world ML problem has...
... that you won't learn on Kaggle 👇👇👇
➡️ From business problem to ML problem
Every Kaggle competition starts with a clearly defined target metric you need to optimize for.
But, in real-world ML, there is no target metric waiting for you.
It is your job to translate a business problem into an ML problem by finding the right proxy metric.
This proxy metric is a quantitative, abstract metric that correlates positively with the actual business metric you want to impact, e.g. accuracy, precision, or recall.
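For example, if the business problem is reducing customer churn, precision on churn predictions can work as the proxy metric: it tracks how much of your retention budget is spent on customers who were actually about to leave. A minimal sketch of computing it with scikit-learn (the labels and predictions below are hypothetical):

```python
# Minimal sketch: precision as a proxy metric for a churn-prevention problem.
# y_true and y_pred are hypothetical; in practice y_true comes from historical
# labels and y_pred from your model's predictions.
from sklearn.metrics import precision_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]   # 1 = customer actually churned
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]   # 1 = model flagged customer as a churn risk

# Precision = "of the customers we flag (and spend retention budget on),
# how many were really about to churn?"
print(f"Precision (proxy metric): {precision_score(y_true, y_pred):.2f}")
```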
All ML systems can be decomposed into 3 pipelines (aka programs):
- Feature pipeline
- Training pipeline
- Inference pipeline
And this is how they work 👇
The feature pipeline takes raw data from
- a data warehouse,
- an external API, or
- a website, through scraping,
and generates features, aka the inputs for your ML model, and stores them in a Feature Store so that the other 2 pipelines can later use these features.
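Here is a minimal sketch of a feature pipeline, assuming the raw data is a CSV of transactions and using a Parquet file as a stand-in for a real Feature Store (paths and column names are hypothetical):

```python
# Minimal feature-pipeline sketch (hypothetical source and columns).
import pandas as pd

RAW_DATA_PATH = "raw/transactions.csv"                        # hypothetical raw data source
FEATURE_STORE_PATH = "feature_store/user_features.parquet"    # stand-in for a real Feature Store

def run_feature_pipeline() -> None:
    raw = pd.read_csv(RAW_DATA_PATH, parse_dates=["timestamp"])

    # Generate features: aggregate raw transactions per user.
    features = (
        raw.groupby("user_id")
           .agg(
               total_spend=("amount", "sum"),
               n_transactions=("amount", "count"),
               last_seen=("timestamp", "max"),
           )
           .reset_index()
    )

    # "Store" the features so the training and inference pipelines can reuse them.
    features.to_parquet(FEATURE_STORE_PATH, index=False)

if __name__ == "__main__":
    run_feature_pipeline()
```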
The training pipeline takes the features from the store and outputs a trained ML model.
These are (in general) the best models for each domain:
- Tabular data → XGBoost
- Computer Vision → Fine-tune a Convolutional Neural Net
- NLP → Fine-tune a Transformer net.
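For the tabular case, a minimal training-pipeline sketch could look like this, reading the features produced by the feature pipeline above and assuming a "churned" label column exists (all paths and column names are hypothetical):

```python
# Minimal training-pipeline sketch for tabular data with XGBoost.
import pandas as pd
import xgboost as xgb
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_score

FEATURE_STORE_PATH = "feature_store/user_features.parquet"   # written by the feature pipeline
MODEL_PATH = "churn_model.json"                               # hypothetical output path

# Load features from the (stand-in) Feature Store.
features = pd.read_parquet(FEATURE_STORE_PATH)
X = features[["total_spend", "n_transactions"]]
y = features["churned"]   # assumes a label column was added upstream

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train an XGBoost classifier on the tabular features.
model = xgb.XGBClassifier(n_estimators=200, max_depth=4, learning_rate=0.1)
model.fit(X_train, y_train)

# Evaluate on the proxy metric chosen earlier.
print("Precision:", precision_score(y_test, model.predict(X_test)))

# Persist the trained model so the inference pipeline can load it.
model.save_model(MODEL_PATH)
```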