Shreya Shankar
May 4, 2022 · 10 tweets · 2 min read
I probably should have written this years ago, but here are some MLOps principles I think every ML platform (codebase, data management platform) should have: 1/n
Beginner: use pre-commit hooks. ML code is so, so ugly. Start with the basics — black, isort — then add pydocstyle, mypy, check-ast, eof-fixer, etc. Honestly I put these in my research codebases too, lol. 2/n
Beginner: always train models using *committed* code, even in development. This allows you to attach a git hash to every model. Don’t make ad hoc changes in Jupyter & train a model. Someday someone will want to know what code generated that model… 3/n
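A minimal sketch of how a training script could enforce this, assuming it runs inside a git checkout. The helper names `_git` and `committed_git_hash` are hypothetical; the only git commands used are `git status --porcelain` and `git rev-parse HEAD`.

```python
import subprocess


def _git(*args: str) -> str:
    """Run a git command in the current directory and return its stdout."""
    result = subprocess.run(
        ["git", *args], check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


def committed_git_hash(git=_git) -> str:
    """Return the HEAD commit hash, refusing to proceed on a dirty tree."""
    # `git status --porcelain` prints one line per modified or untracked file,
    # so empty output means everything is committed.
    if git("status", "--porcelain"):
        raise RuntimeError("Uncommitted changes: commit before training.")
    return git("rev-parse", "HEAD")
```

Calling `committed_git_hash()` at the top of a training run and saving the result next to the model artifact is one way to make sure every model can be traced back to the code that produced it.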
Beginner: use a monorepo. Besides known software benefits (simplified build & deps), a monorepo reduces provenance & logging overhead (critical for ML). I’ve seen separate codebases for model training, serving, data cleaning, etc & it’s a mess to figure out what’s going on. 4/n
Beginner: version your training & validation data! Don’t overwrite train.pq or train.csv, because later on, you may want to look at the data a specific model was trained on. 5/n
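One possible way to avoid overwriting files, sketched under the assumption that datasets are small enough to hash in memory: store each snapshot under a name derived from its content hash. The function name `snapshot_dataset` and the naming scheme are illustrative, not a standard.

```python
import hashlib
import shutil
from pathlib import Path


def snapshot_dataset(src: Path, store: Path) -> Path:
    """Copy `src` into `store` under a name derived from its content hash."""
    digest = hashlib.sha256(src.read_bytes()).hexdigest()[:12]
    store.mkdir(parents=True, exist_ok=True)
    dest = store / f"{src.stem}-{digest}{src.suffix}"
    if not dest.exists():  # identical content maps to the same file
        shutil.copy2(src, dest)
    return dest
```

Recording the returned path alongside the model means the exact training data can be reopened later, even after `train.csv` itself has changed.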
Beginner: put SLAs on data quality. ML pipelines often break bc of some data-related bug. There are preliminary tools to automate data quality checks but we can't solely rely on them. Have an on-call rotation to manually sanity-check the data (eg look at histograms of cols) 6/n
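A rough sketch of what an automated check behind such an SLA might look like for one numeric column. The thresholds, the function name `column_violations`, and the choice of checks (null fraction, value range) are assumptions for illustration; it complements the manual look at histograms rather than replacing it.

```python
from typing import List, Optional, Sequence


def column_violations(
    values: Sequence[Optional[float]],
    max_null_frac: float,
    lo: float,
    hi: float,
) -> List[str]:
    """Return human-readable SLA violations for one numeric column."""
    if not values:
        return ["column is empty"]
    problems = []
    null_frac = sum(v is None for v in values) / len(values)
    if null_frac > max_null_frac:
        problems.append(
            f"null fraction {null_frac:.2f} exceeds {max_null_frac:.2f}"
        )
    out_of_range = [v for v in values if v is not None and not lo <= v <= hi]
    if out_of_range:
        problems.append(f"{len(out_of_range)} values outside [{lo}, {hi}]")
    return problems
```

An empty list means the column passed; anything else is something for the on-call person to look at.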
Intermediate: put *some* effort into ML monitoring. Plenty of ppl are like, “oh we have delayed labels so we don’t monitor accuracy.” Make an on-call rotation for this: manually label a handful of predictions daily, and create a job to update the metric. Some info > no info 7/n
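The "job to update the metric" could be as small as the sketch below, which assumes the on-call person records each hand-labeled prediction as a `(day, predicted, actual)` tuple. The tuple layout and the name `daily_accuracy` are hypothetical.

```python
from datetime import date
from typing import Dict, List, Tuple

# (day, model's predicted label, human-assigned label)
LabeledSample = Tuple[date, str, str]


def daily_accuracy(samples: List[LabeledSample]) -> Dict[date, float]:
    """Compute accuracy per day from a handful of manually labeled predictions."""
    correct: Dict[date, int] = {}
    total: Dict[date, int] = {}
    for day, predicted, actual in samples:
        total[day] = total.get(day, 0) + 1
        correct[day] = correct.get(day, 0) + int(predicted == actual)
    return {day: correct[day] / total[day] for day in total}
```

With only a handful of labels per day the estimate is noisy, but it still gives a trend line to watch.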
Intermediate: retrain models on a cadence (eg monthly) rather than when a KL divergence for an arbitrary feature arbitrarily drops. A cadence is less cognitive overhead. Do some data science to identify a cadence and make sure a human validates the new model every rotation 8/n
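A cadence-based trigger needs very little logic, which is part of its appeal. The sketch below assumes a fixed cadence in days and a named human sign-off before promotion; `retrain_due`, `can_promote`, and the 30-day default are illustrative choices, not recommendations.

```python
from datetime import date, timedelta
from typing import Optional


def retrain_due(last_trained: date, today: date, cadence_days: int = 30) -> bool:
    """Return True once at least `cadence_days` have passed since the last training run."""
    return today - last_trained >= timedelta(days=cadence_days)


def can_promote(approved_by: Optional[str]) -> bool:
    """A retrained model is promoted only after a named human has signed off."""
    return bool(approved_by and approved_by.strip())
```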
Advanced: shadow a less-complicated model in production so you can easily serve those predictions with one click if the main model goes down / is broken. ML bugs can take a while to diagnose so it’s good to have a reliable backup 9/n
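One way to picture the setup, as a sketch only: both models run on every request, the backup's output is kept as a shadow prediction, and a single flag decides which one is served. The class name `ShadowedPredictor` is hypothetical, and the automatic fallback when the primary raises is an added assumption beyond the manual switch described above.

```python
from typing import Callable, Dict, Tuple

Features = Dict[str, float]
Model = Callable[[Features], float]


class ShadowedPredictor:
    """Serve a primary model while a simpler backup runs in shadow."""

    def __init__(self, primary: Model, backup: Model) -> None:
        self.primary = primary
        self.backup = backup
        self.use_backup = False  # the "one click" switch

    def predict(self, features: Features) -> Tuple[float, float]:
        """Return (served prediction, shadow prediction)."""
        # The backup runs on every request, so it is known to work when needed.
        backup_pred = self.backup(features)
        if self.use_backup:
            return backup_pred, backup_pred
        try:
            return self.primary(features), backup_pred
        except Exception:
            # Assumption: if the primary crashes, serve the backup instead.
            return backup_pred, backup_pred
```

Setting `predictor.use_backup = True` switches serving to the simpler model without a redeploy, which buys time to diagnose the main model.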
Advanced: put ML-related tests in CI. You can do almost anything in GitHub Actions. Create test commands that: overfit your training pipeline to a tiny batch of data, verify data shapes, check integrity of features, etc. Whatever your product needs. 10/n
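To make the "overfit to a tiny batch" idea concrete, here is a toy sketch. A one-feature linear model trained with plain gradient descent stands in for a real training pipeline; the `train` function, the data, and the loss threshold are all assumptions for illustration. In a real repo the test would call the actual training entry point on a few rows.

```python
from typing import List, Tuple


def train(
    xs: List[float], ys: List[float], steps: int = 500, lr: float = 0.1
) -> Tuple[float, float, float]:
    """Fit y = w * x + b by gradient descent on MSE; return (w, b, final loss)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        w -= lr * 2 / n * sum(e * x for e, x in zip(errors, xs))
        b -= lr * 2 / n * sum(errors)
    loss = sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / n
    return w, b, loss


def test_pipeline_can_overfit_tiny_batch() -> None:
    """If training cannot drive the loss to ~0 on four points, something is broken."""
    xs, ys = [0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0]
    assert len(xs) == len(ys)  # basic shape check
    _, _, loss = train(xs, ys)
    assert loss < 1e-6
```

A test runner such as pytest would pick up the `test_` function automatically, so the same command can run locally and in CI.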

