In good software practice, you version code. Use Git. Track changes. Code in master is ground truth.

In ML, code alone isn't ground truth. I can run the same SQL query today and tomorrow and get different results. How do you replicate this good software practice for ML? (1/7)
Versioning the data is key, but you also need to version the model and artifacts. If an ML API returns different results when called the same way twice, there are many possible culprits: different data, a different scaler, a different model, etc. (2/7)
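For concreteness, here is one lightweight way to make "which versions produced this response" traceable: content-hash every artifact behind the API and log the hashes alongside each prediction. This is only a sketch; the artifact paths (`train.parquet`, `scaler.pkl`, `model.pkl`) and the manifest shape are hypothetical placeholders.

```python
import hashlib
import json
from pathlib import Path

def content_hash(path: str) -> str:
    """Hash a file's bytes so any change to data or artifacts yields a new version id."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()[:12]

# Hypothetical artifacts behind one ML API: data snapshot, scaler, model.
ARTIFACTS = ["train.parquet", "scaler.pkl", "model.pkl"]

def version_manifest() -> dict:
    """Record exactly which data / scaler / model produced a response."""
    return {Path(p).name: content_hash(p) for p in ARTIFACTS}

# Log the manifest with every batch of predictions, e.g.:
# response = {"predictions": preds, "versions": version_manifest()}
print(json.dumps(version_manifest(), indent=2))
```

Any change to the data, scaler, or model then shows up as a new hash in the logged manifest, even if the serving code is untouched.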
“Versioning” alone is not enough. How do you diff your versions? For code, you can visually inspect the diff on GitHub. But the size of data and artifacts >> the size of a company’s codebase. You can't easily inspect everything by eye. (3/7)
Diffing versions isn’t as simple as computing a diff in bytes. Of course the bytes of data or model can change in the next iteration. For each piece of data or artifact, you need to articulate what a diff means. Ex: your database of mammal pics now contains dog pics. (4/7)
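As a toy illustration of what "articulating a diff" can look like for that example, compare the label categories between two snapshots of the image table rather than their bytes. The column name `label` and the two tiny frames below are made up for illustration.

```python
import pandas as pd

def label_diff(old: pd.DataFrame, new: pd.DataFrame, col: str = "label") -> dict:
    """A byte-level diff is meaningless here; a category-level diff is not."""
    old_labels, new_labels = set(old[col]), set(new[col])
    return {
        "added_labels": sorted(new_labels - old_labels),
        "removed_labels": sorted(old_labels - new_labels),
        "count_change": len(new) - len(old),
    }

# Hypothetical snapshots of the "mammal pics" table.
v1 = pd.DataFrame({"label": ["cat", "horse", "whale"]})
v2 = pd.DataFrame({"label": ["cat", "horse", "dog", "dog"]})
print(label_diff(v1, v2))
# {'added_labels': ['dog'], 'removed_labels': ['whale'], 'count_change': 1}
```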
My biggest criticism of MLOps tools: they can set up Postgres tables for you to log things to and inspect, but they don’t tell you what to diff or how to compute a diff. For data, maybe it’s a high Jensen-Shannon divergence. For a model, maybe it’s a change in accuracy. (5/7)
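A minimal sketch of both kinds of diff, assuming you already have binned counts of some feature for two data snapshots and accuracy numbers for two model versions; the example numbers and thresholds are arbitrary and would have to be agreed on per project.

```python
import numpy as np

def js_divergence(p, q, eps: float = 1e-12) -> float:
    """Jensen-Shannon divergence (base 2, so it lies in [0, 1]) between two histograms."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    m = 0.5 * (p + q)
    kl = lambda a, b: float(np.sum(a * np.log2(a / b)))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Hypothetical inputs: binned counts of one feature, and accuracy per model version.
old_counts, new_counts = [120, 80, 40], [20, 60, 160]
old_acc, new_acc = 0.91, 0.84

data_drift = js_divergence(old_counts, new_counts)
acc_drop = old_acc - new_acc

# Arbitrary thresholds -- the point is that *you* have to define them.
if data_drift > 0.1:
    print(f"data diff: JS divergence {data_drift:.3f} exceeds threshold")
if acc_drop > 0.02:
    print(f"model diff: accuracy dropped by {acc_drop:.3f}")
```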
It takes companies many iterations of data science projects to figure out what they need to track, and how to track it over time. Ingested data. Cleaning properties. Features. Models. Outputs. Deviations from baselines. Hardware specs. The list is seemingly infinite. (6/7)
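One way to pin that list down is a per-run record whose fields mirror it. Every field name and value below is hypothetical and only shows the shape such a record might take; what actually belongs in it is exactly what a team has to iterate on.

```python
from dataclasses import dataclass, asdict

@dataclass
class RunRecord:
    """One training run, pinned well enough that a diff against the previous run is meaningful."""
    ingested_data_hash: str   # version of the raw data snapshot
    cleaning_stats: dict      # e.g. null rates, rows dropped
    feature_list: list        # features the model actually saw
    model_hash: str           # version of the serialized model
    metrics: dict             # e.g. accuracy on a fixed eval set
    baseline_deltas: dict     # deviations from the agreed baseline
    hardware: str             # e.g. instance type / GPU

# Hypothetical run, for illustration only.
run = RunRecord(
    ingested_data_hash="3fa9c1d2",
    cleaning_stats={"null_rate": 0.02, "rows_dropped": 1150},
    feature_list=["age", "tenure_days", "plan_type"],
    model_hash="b71e0c44",
    metrics={"accuracy": 0.87},
    baseline_deltas={"accuracy": -0.01},
    hardware="p3.2xlarge",
)
print(asdict(run))
```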
So a big part of “good ML practices” is to communicate that you need to version more than the code, and work with your collaborators to align on what needs to be versioned. Any stakeholder should be able to inspect the diffs. (7/7)
