“#MLOps standard industry best practices” don’t apply to most #ML teams’ reality.
Why?
Those who write and share best practices are doing ML at hyperscale.
Those who read and re-share them are doing ML at a reasonable scale.
Companies like Google, Netflix, Uber, and Airbnb are doing an awesome job for the community by sharing their blogs, white papers, and open-sourcing their tools.
But whatever they do, it is shaped (and biased) by THEIR MLOps problems.
Most companies don’t have their problems.
They would love to have their problems, but they don’t.
They operate at a smaller scale & face different challenges.
And they are the biggest part of the ML industry.
They want to know the best way to do MLOps at their scale, with their resources & limitations.
And since there is not much content specifically for reasonable-scale companies, they read (& share) whatever is out there.
And the vicious cycle continues.
The moral of the story is this.
Think about
-your problem
-your use case
-your limitations
-your needs
Solve YOUR MLOps problems.
Don’t do something just because it is a “standard industry practice”.
A great @pytorchlightnin + Hydra template (clean and scalable) by @ukashxukash (and other contributors) to kickstart any deep learning project.
Main ideas behind it:
-Predefined structure: clean & scalable so that work can easily be extended
-Rapid Experimentation: thanks to Hydra command-line superpowers
-Little Boilerplate: thanks to automating pipelines with config instantiation (see the sketch after this list)
-Main Configs: specify default training configuration
-Experiment Configs: override chosen hyperparameters
-Workflow: comes down to 4 simple steps
-Experiment Tracking: @TensorBoard, @weights_biases, neptune.ai, @Cometml, @MLflow, @CSVLogger
-Logs: all logs are stored in a dynamically generated folder structure
-& more
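To make the config-instantiation idea concrete, here is a minimal sketch of the Hydra + Lightning pattern the template is built on. This is not the template's actual code; the config path (configs/train.yaml) and the model/datamodule/trainer config groups are assumptions for illustration.

```python
# Minimal sketch of a Hydra-driven Lightning entrypoint (illustrative,
# not the template's exact layout).
import hydra
import pytorch_lightning as pl
from omegaconf import DictConfig


@hydra.main(config_path="configs", config_name="train", version_base=None)
def main(cfg: DictConfig) -> None:
    # hydra.utils.instantiate builds objects directly from config nodes:
    # each node carries a _target_ key naming the class to construct.
    # This is what keeps the boilerplate small.
    model: pl.LightningModule = hydra.utils.instantiate(cfg.model)
    datamodule: pl.LightningDataModule = hydra.utils.instantiate(cfg.datamodule)
    trainer: pl.Trainer = hydra.utils.instantiate(cfg.trainer)
    trainer.fit(model=model, datamodule=datamodule)


if __name__ == "__main__":
    main()
```

Hyperparameters can then be overridden straight from the command line, e.g. `python train.py trainer.max_epochs=20` or `python train.py experiment=my_experiment` (both commands illustrative), which is what makes rapid experimentation so cheap.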