This week, a new entrant to the experiment tracking / reproducibility space: keepsake.ai, by @replicateai
Keepsake calls itself "Version control for machine learning"
Like other experiment tracking tools, keepsake aims to be super easy to integrate with how you train models, and to require few code changes to get started
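Integration looks something like this, going by the Keepsake docs (the hyperparameters, loss values, and file name here are made-up placeholders):

```python
# A minimal sketch of wiring Keepsake into a training loop.
import keepsake


def train():
    # Snapshot the code in this directory and record hyperparameters
    experiment = keepsake.init(
        path=".",
        params={"learning_rate": 0.01, "num_epochs": 5},
    )

    for epoch in range(5):
        loss = 1.0 / (epoch + 1)  # stand-in for a real training loss

        # Stand-in for torch.save(model.state_dict(), "model.pth") or similar
        with open("model.pth", "wb") as f:
            f.write(b"fake weights")

        # Record metrics and version the weights file as a checkpoint
        experiment.checkpoint(
            path="model.pth",
            step=epoch,
            metrics={"loss": loss},
            primary_metric=("loss", "minimize"),
        )


if __name__ == "__main__":
    train()
```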
What's different?
Keepsake is open source, and all of the metadata and artifacts from your experiments are stored as tarballs and JSON files in your own AWS or GCP account.
That means there's no cloud service to sign up for, and no additional servers or infrastructure to manage.
Keepsake aims to make experiments reproducible, not just track them, so it includes a CLI and visualization library that can:
* Check out code and weights from a previous experiment
* Compare experiments
* Sort and filter runs
* Visualize training runs in a notebook (see the sketch below)
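In a notebook, inspecting past runs looks roughly like this (again a sketch based on the docs; the CLI covers similar ground with commands like `keepsake ls`, `keepsake diff`, and `keepsake checkout`):

```python
# A rough sketch of Keepsake's Python analysis API.
import keepsake

# Load every experiment recorded in this repo's storage
experiments = keepsake.experiments.list()

for experiment in experiments:
    best = experiment.best()  # checkpoint with the best primary metric, if any
    if best is not None:
        print(experiment.id[:8], experiment.params, best.metrics)
```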
Who is this for?
If you're already a heavy user of @weights_biases, @MLflow, @DVCorg, or another experiment management / reproducibility platform, keepsake probably isn't full-featured enough to convince you to switch yet.
But if you haven't set one of these up for your project because you don't want to use a SaaS platform and don't want the complexity of @MLflow, then give it a shot.
If you value simple tools with nice UX that do a single job well, you'll probably enjoy this library.
• • •
TVM describes itself as an "end to end machine learning compiler framework for CPUs, GPUs, and accelerators".
Let's talk about what that means and why it might be useful for you:
Under the hood, deep learning frameworks are built on "kernel operator" libraries like cuDNN. These are the primitives that help run your model fast on a GPU or other accelerator.
The problem: these aren't open source (so they're not extensible), and each one only works on specific platforms.
So what happens if the platform you want to deploy to isn't supported, or performance there is bad?
That's where TVM comes in. It has importers for all the major frameworks, plus tutorials for compiling optimized versions of your models for common CPUs and GPUs.
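Here's roughly what that workflow looks like for a PyTorch model, adapted from TVM's tutorials (treat it as a sketch: names like `graph_executor` have shifted between releases, and the input name and shapes are placeholders):

```python
# A sketch of compiling a traced PyTorch model with TVM.
import torch
import torchvision
import tvm
from tvm import relay
from tvm.contrib import graph_executor

# Trace the model so TVM's frontend can import it
model = torchvision.models.resnet18().eval()
example = torch.randn(1, 3, 224, 224)
scripted = torch.jit.trace(model, example)

# Import into Relay, TVM's high-level IR
shape_list = [("input0", (1, 3, 224, 224))]
mod, params = relay.frontend.from_pytorch(scripted, shape_list)

# Compile for a generic CPU; swap the target for "cuda", "metal", etc.
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target="llvm", params=params)

# Run the compiled module and pull out the result
dev = tvm.cpu()
module = graph_executor.GraphModule(lib["default"](dev))
module.set_input("input0", example.numpy())
module.run()
out = module.get_output(0).numpy()
print(out.shape)
```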
• • •
Let's talk about setting up our Python/CUDA environment!
Our goals:
- Easily specify exact Python and CUDA versions
- Humans should not be responsible for finding mutually-compatible package versions
- Production and dev requirements should be separate
Here's a good way to achieve these goals:
- Use `conda` to install Python/CUDA as specified in `environment.yml`
- Use `pip-tools` to lock in mutually compatible versions from `requirements/prod.in` and `requirements/dev.in` (sketched below)
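Concretely, that setup might look like this (project name, versions, and packages are placeholders):

```yaml
# environment.yml: pins Python and the CUDA toolkit
name: myproject
channels:
  - defaults
dependencies:
  - python=3.8
  - cudatoolkit=11.0
  - pip
```

The `.in` files list only your top-level dependencies, unpinned:

```
# requirements/prod.in: top-level deps only
torch
numpy
```

Then `pip-compile` resolves and pins the full transitive set, so no human ever has to pick mutually compatible versions by hand:

```bash
conda env create -f environment.yml
pip-compile requirements/prod.in --output-file requirements/prod.txt
pip-compile requirements/dev.in --output-file requirements/dev.txt
pip install -r requirements/prod.txt -r requirements/dev.txt
```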