GitHub Actions is *free* compute that makes your life easier.
Here are 3 use cases for ML projects ↓
➡️ Continuous Integration and Deployment (CI/CD)
Machine Learning is software engineering. As such, it is crucial you automate:
→ code updates (aka integration), and
→ code releases to your production environment (aka deployment)
➡️ Batch feature pipelines
This is a program that runs on a cron-like schedule, fetches raw data from a data source (e.g. a data warehouse), computes ML features, and saves them to a storage service (e.g. a feature store).
Feature pipelines are present in every ML system.
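To make the fetch, compute, save steps concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: `fetch_raw_data` returns hard-coded rows in place of a real warehouse query, a CSV file stands in for the feature store, and the feature names are made up. On GitHub Actions, a script like this would typically be started by a workflow with a `schedule` trigger and a cron expression.

```python
import csv
import os
import tempfile
from collections import defaultdict
from datetime import datetime, timezone


def fetch_raw_data():
    """Stand-in for a data warehouse query (hypothetical hard-coded rows)."""
    return [
        {"user_id": "a", "amount": 10.0},
        {"user_id": "a", "amount": 30.0},
        {"user_id": "b", "amount": 5.0},
    ]


def compute_features(rows):
    """Aggregate raw rows into one feature row per user."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        totals[row["user_id"]] += row["amount"]
        counts[row["user_id"]] += 1
    return [
        {
            "user_id": uid,
            "n_purchases": counts[uid],
            "avg_amount": totals[uid] / counts[uid],
        }
        for uid in sorted(totals)
    ]


def save_features(features, path):
    """Stand-in for a feature store write: append timestamped rows to a CSV file."""
    computed_at = datetime.now(timezone.utc).isoformat()
    # Write the header only when the file is new or empty.
    write_header = not os.path.exists(path) or os.path.getsize(path) == 0
    fieldnames = ["user_id", "n_purchases", "avg_amount", "computed_at"]
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        if write_header:
            writer.writeheader()
        for row in features:
            writer.writerow({**row, "computed_at": computed_at})


def run_feature_pipeline(path):
    """One scheduled run: fetch raw data, compute features, save them."""
    features = compute_features(fetch_raw_data())
    save_features(features, path)
    return features


if __name__ == "__main__":
    output_path = os.path.join(tempfile.gettempdir(), "features.csv")
    print(run_feature_pipeline(output_path))
```

Each run appends a new timestamped batch, so the file keeps a history of feature values instead of overwriting the previous run.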
➡️ Inference pipelines
Batch scoring is one of the most popular ways to generate fresh predictions from an ML model.
A batch inference pipeline fetches recent features and a model artifact, generates predictions, and saves them to a storage layer.
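A batch scoring run can be sketched in a few lines of plain Python. The details here are assumptions chosen to keep the example self-contained: the model artifact is a JSON file holding linear weights, `fetch_recent_features` returns hard-coded rows in place of a feature store read, and predictions are written to a local JSON file. A real pipeline would load whatever artifact format your training code produces.

```python
import json
import os
import tempfile


def load_model(path):
    """Load the model artifact; here a JSON file of linear weights (an assumption)."""
    with open(path) as f:
        return json.load(f)


def fetch_recent_features():
    """Stand-in for reading the latest rows from a feature store (hypothetical data)."""
    return [
        {"user_id": "a", "n_purchases": 2, "avg_amount": 20.0},
        {"user_id": "b", "n_purchases": 1, "avg_amount": 5.0},
    ]


def predict(model, features):
    """Score each row with a linear model: bias + sum(weight * feature value)."""
    predictions = []
    for row in features:
        weighted = sum(w * row[name] for name, w in model["weights"].items())
        predictions.append({"user_id": row["user_id"], "score": model["bias"] + weighted})
    return predictions


def save_predictions(predictions, path):
    """Stand-in for the storage layer: write predictions to a JSON file."""
    with open(path, "w") as f:
        json.dump(predictions, f, indent=2)


def run_inference_pipeline(model_path, output_path):
    """One batch run: load model, fetch features, predict, save."""
    model = load_model(model_path)
    predictions = predict(model, fetch_recent_features())
    save_predictions(predictions, output_path)
    return predictions


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    model_path = os.path.join(workdir, "model.json")
    # Write a toy artifact so the example runs end to end.
    with open(model_path, "w") as f:
        json.dump({"bias": 1.0, "weights": {"n_purchases": 0.5, "avg_amount": 0.25}}, f)
    print(run_inference_pipeline(model_path, os.path.join(workdir, "predictions.json")))
```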
To build a Machine Learning product you need to spend money on 3 types of services:
→ Compute, like CPUs and GPUs, so you can train and deploy your models
→ Orchestration, to kick off the 3 pipelines of your system
→ Storage, to save features, models, and experiment runs
And the thing is, these services do not all cost the same.
→ Orchestration and storage are not expensive 💸
→ Compute, on the other hand, can get very expensive 💸💸💸💸💸
The most effective thing you can do to land an ML job is to
- pick a problem you care about
- build an ML solution, and
- release it to the public.
Here is an example to inspire you 🤗 ↓
The most effective way to learn and showcase your ML skills is to build a *complete ML project* and publish
→ the source code on GitHub, and
→ a public working app