neptune.ai
We tweet about #MLOps best practices & other cool stuff. Read our blog at https://t.co/nOTpkA6xq6. Experiment tracker & model registry for production ML teams.
Dec 14, 2022 8 tweets 3 min read
Models aren’t intelligent enough to adjust to a changing world unless they’re constantly retrained & updated

You need to monitor them, detect data drift & update the data

To detect data drift, run distribution tests that measure distribution changes using these distance metrics:

> Basic statistical metrics you could use to test drift between historical and current features are:
- mean/average value,
- standard deviation,
- minimum and maximum values comparison,
- and also correlation.
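
A minimal sketch of how the basic statistics listed above could be compared between a historical and a current sample of one feature. The function names (`summarize`, `pearson`, `drift_report`, `drifted`) and the 0.5 standard-deviation threshold are illustrative assumptions, not something from the thread; a real threshold would be tuned per feature.

```python
import math
import statistics


def summarize(values):
    """Basic statistics for one numeric feature."""
    return {
        "mean": statistics.fmean(values),
        "std": statistics.pstdev(values),
        "min": min(values),
        "max": max(values),
    }


def pearson(xs, ys):
    """Pearson correlation between two equally long feature columns."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    norm = math.sqrt(
        sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys)
    )
    # A constant column has no defined correlation; return 0.0 by convention here.
    return cov / norm if norm else 0.0


def drift_report(historical, current):
    """Absolute change in each basic statistic between the two samples."""
    old, new = summarize(historical), summarize(current)
    return {name: abs(new[name] - old[name]) for name in old}


def drifted(historical, current, threshold=0.5):
    """Hypothetical rule: flag drift when the mean moves by more than
    `threshold` historical standard deviations."""
    old = summarize(historical)
    new_mean = statistics.fmean(current)
    if old["std"] == 0:
        return new_mean != old["mean"]
    return abs(new_mean - old["mean"]) / old["std"] > threshold
```

For correlation, one option is to compute `pearson` for the same pair of features on the historical and the current data and compare the two values; a large change suggests the relationship between the features has shifted.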
Dec 14, 2022 5 tweets 2 min read
Greensteam subscribed to the idea of doing #MLOps at a reasonable scale.

Seeing the quickly growing number of customers (= ML experiments), they decided to build their MLOps stack from 0 and solve all core problems around it.

Here are some of the issues → solutions:

- 1000s of Jupyter notebooks → Git
- Managing dependencies and reproducibility → @Docker
- Dealing with unit tests (in some parts of the model code) that don’t test → running smoke tests
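
As a rough illustration of the last point, a smoke test runs the whole pipeline end to end on a tiny sample and only checks that nothing breaks. The sketch below is an assumption about what such a test could look like, not Greensteam's actual setup; `train` and `predict` are hypothetical stand-ins for a real training entry point.

```python
import math
import random


def train(rows, targets, epochs=50, lr=0.01):
    """Hypothetical stand-in for a real training function: fits y = w*x + b
    with plain gradient descent."""
    w, b = 0.0, 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = sum((w * x + b - y) * x for x, y in zip(rows, targets)) / n
        grad_b = sum((w * x + b - y) for x, y in zip(rows, targets)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def predict(model, rows):
    """Hypothetical stand-in for a real inference function."""
    w, b = model
    return [w * x + b for x in rows]


def smoke_test():
    """Run the pipeline on a tiny synthetic sample and check it does not break.
    This checks that the code runs, not that the model is accurate."""
    rng = random.Random(0)  # fixed seed so the test is reproducible
    rows = [rng.uniform(0, 1) for _ in range(20)]
    targets = [2 * x + 1 for x in rows]
    model = train(rows, targets, epochs=5)
    predictions = predict(model, rows)
    assert len(predictions) == len(rows)
    assert all(math.isfinite(p) for p in predictions)
    return True
```

The point of the design is speed: a few epochs on a handful of rows is enough to catch broken imports, shape mismatches, and NaN outputs before a full training run starts.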
Dec 13, 2022 4 tweets 2 min read
@LukawskiKacper is joining us next week on #MLOps Live to share his experience and advice on implementing vector search – AMA.

Kacper has almost 15 years of experience in data engineering, ML, and software design. As the founder of @AiEmbassy, he has also been actively taking part in AI discussions, especially on similarity learning, vector search, and solving social issues by applying ML methods.
Dec 12, 2022 8 tweets 3 min read
3 steps to be more productive doing #ML at a reasonable scale.

1/ Identify all critical problems your team is dealing with
2/ Look for the best solution available
3/ Apply + evaluate

Example with @instadeep ↓

Step 1/ Challenges faced by the BioAI team while building DeepChain (platform for protein design):
Dec 7, 2022 4 tweets 3 min read
Great @pytorchlightnin + Hydra (clean and scalable) template to kickstart any deep learning project by @ukashxukash (and some other contributors).

Main ideas behind it:

- Predefined structure: clean & scalable so that work can easily be extended
- Rapid experimentation: thanks to Hydra command line superpowers
- Little boilerplate: thanks to automating pipelines with config instantiation
- Main configs: specify default training configuration
Dec 6, 2022 6 tweets 4 min read
“#MLOps standard industry best practices” don’t apply to most #ML teams’ reality.

Why?

Those who write and share best practices are doing ML at a hyper scale.

Those who read and re-share them are doing ML at a reasonable scale.

Companies like Google, Netflix, Uber, and Airbnb are doing an awesome job for the community by sharing their blogs, white papers, and open-sourcing their tools.

But whatever they do, it is shaped (and biased) by THEIR MLOps problems.

Most companies don’t have their problems.