At #SparkSummit 2017, we launched ModelDB and kickstarted the experiment management space. Excited to see the impact #VertaModelMonitoring will have on #MLOps! (2/n)
ML runs whole businesses today, ranging from robo-advisors to fraud detection to Anti-Money Laundering to RPA. (3/n)
AI-ML doesn't always work as expected! FICO scores stopped working in the pandemic and threw a wrench into lending models (4/n)
We see this in the field every day -- whether it's lost revenue, misuse of models, or incorrect pricing decisions leading to losses. (5/n)
So what's the goal of #ModelMonitoring? It's fairly simple -- Ensuring model results are consistently of high quality! (6/n)
What's hard about this? (1) Knowing when a model fails, (2) Finding the root cause, and (3) Remediating it. (7/n)
Why is it hard to know if a model has failed? First, ground truth is very hard to come by; second, ground truth is often time-shifted; and third, your model is usually only one part of the decision process. (8/n)
Next challenge is jungles. Pipeline jungles. A model has a ton of dependencies -- is it the model or the dependencies causing the failure? (9/n)
Finally, monitoring is only useful if you can close the loop and take action. (10/n)
So do you need a monitoring tool or can you write ad-hoc jobs? Works for one model, but what about 20 models and 30 datasets? (11/n)
Why are we excited to solve #ModelMonitoring? Because it's a hard problem! (12/n)
Introducing #VertaModelMonitoring: flexible, customizable, and closing the loop (13/n)
How does it work? Ingest stats, ground-truth, and analyze, analyze, analyze (14/n)
Batch and live are no different, just some aggregation magic (and lots of perf optimizations!) (15/n)
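One way to read "aggregation magic" is that the stats are kept as mergeable summaries, so a batch job and a stream of live windows end up with the same numbers. A minimal sketch under that assumption (the `Summary` class and its fields are hypothetical, not Verta's API):

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class Summary:
    """Mergeable summary of a numeric column (hypothetical helper)."""
    count: int = 0
    total: float = 0.0
    total_sq: float = 0.0

    @staticmethod
    def of(values):
        vals = list(values)
        return Summary(len(vals), sum(vals), sum(v * v for v in vals))

    def merge(self, other):
        # Merging is just adding the sufficient statistics.
        return Summary(
            self.count + other.count,
            self.total + other.total,
            self.total_sq + other.total_sq,
        )

    @property
    def mean(self):
        return self.total / self.count if self.count else float("nan")

    @property
    def variance(self):
        # Population variance from the running sums.
        if not self.count:
            return float("nan")
        return max(self.total_sq / self.count - self.mean ** 2, 0.0)


data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

# Batch: summarize everything at once.
batch = Summary.of(data)

# Live: summarize each small window, then merge the summaries.
windows = [data[0:2], data[2:4], data[4:6]]
live = reduce(Summary.merge, (Summary.of(w) for w in windows), Summary())
```

Because `merge` only adds counts and sums, the order and size of the windows don't matter; only the summaries need to be stored.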
My demo setup with Apache Spark ML -- pre-processing and then GBDT! (16/n)
Monitoring is easy: (1) define how you want to profile your data (use built-in profilers or define your own) and compute stats. This could be any statistic (17/n)
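To make "define your own profiler" concrete, here is a rough sketch in plain Python. The function name `profile_numeric` and the shape of the returned dict are assumptions for illustration only, not the built-in profilers mentioned above:

```python
def profile_numeric(values, bin_edges):
    """Hypothetical profiler: missing rate, mean, and a histogram.

    `bin_edges` must be sorted; the last bin includes its right edge.
    Values outside the edges are left out of the histogram.
    """
    present = [v for v in values if v is not None]
    missing_rate = (len(values) - len(present)) / len(values) if values else 0.0
    mean = sum(present) / len(present) if present else None

    counts = [0] * (len(bin_edges) - 1)
    for v in present:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            in_bin = bin_edges[i] <= v < bin_edges[i + 1]
            if in_bin or (is_last and v == bin_edges[-1]):
                counts[i] += 1
                break

    return {"missing_rate": missing_rate, "mean": mean, "histogram": counts}


# Example: profile one feature column with two bins, [0, 5) and [5, 10].
stats = profile_numeric([1, 2, None, 8, 10], bin_edges=[0, 5, 10])
```

Any statistic could be returned the same way; the point is that a profiler is just a function from a column to a small summary.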
(2) Visualize and explore your stats! (18/n)
(3) Define alerts, ideally on every step of your pipeline! (19/n)
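As an illustration of alerting on every pipeline step, the sketch below compares a reference histogram to a live one with the population stability index (PSI) and fires when a threshold is crossed. PSI and the 0.2 threshold are swapped in here as a common rule of thumb, not something from the talk; the stat names and `check_alerts` helper are hypothetical:

```python
import math


def psi(reference_counts, live_counts, eps=1e-6):
    """Population stability index between two histograms with the same bins.

    Assumes both histograms are non-empty; `eps` avoids log(0) on empty bins.
    """
    ref_total = sum(reference_counts)
    live_total = sum(live_counts)
    score = 0.0
    for r, l in zip(reference_counts, live_counts):
        p = max(r / ref_total, eps)
        q = max(l / live_total, eps)
        score += (q - p) * math.log(q / p)
    return score


def check_alerts(stats, thresholds):
    """Return the names of stats that exceed their threshold."""
    return [name for name, limit in thresholds.items()
            if stats.get(name, 0.0) > limit]


# One drift stat per pipeline step (names are made up for the example).
stats = {
    "preprocessing.age.psi": psi([50, 50], [90, 10]),    # shifted input
    "model.prediction.psi": psi([50, 50], [50, 50]),     # unchanged output
}
thresholds = {"preprocessing.age.psi": 0.2, "model.prediction.psi": 0.2}
fired = check_alerts(stats, thresholds)
```

Putting a threshold on each step, rather than only on the model output, is what lets a later root-cause pass tell the steps apart.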
But alert fatigue is real. And so root cause analysis is crucial. With Spark ML pipelines (or any pipeline), we can trace back dependencies to find the underlying issue with the model pipeline. (20/n)
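The dependency trace-back can be sketched as a walk over the pipeline graph: starting from the failing step, collect everything upstream and keep only the alerting steps that have no alerting step above them. The graph, step names, and `root_causes` helper below are hypothetical, a minimal sketch rather than how the product does it:

```python
def upstream(node, deps, seen=None):
    """All transitive dependencies of `node` in a dependency dict."""
    seen = set() if seen is None else seen
    for parent in deps.get(node, []):
        if parent not in seen:
            seen.add(parent)
            upstream(parent, deps, seen)
    return seen


def root_causes(failing, deps, alerting):
    """Alerting steps at or above `failing` with no alerting step upstream."""
    candidates = (upstream(failing, deps) | {failing}) & alerting
    return sorted(n for n in candidates if not (upstream(n, deps) & alerting))


# step -> the steps it depends on (a made-up pre-processing + GBDT pipeline)
deps = {
    "gbdt_model": ["feature_vector"],
    "feature_vector": ["age_scaled", "income_indexed"],
    "age_scaled": ["raw_age"],
    "income_indexed": ["raw_income"],
}

# Four alerts fired, but they all trace back to one input column.
alerting = {"gbdt_model", "feature_vector", "age_scaled", "raw_age"}
causes = root_causes("gbdt_model", deps, alerting)
```

Collapsing four alerts into one upstream cause is one way to cut down on alert fatigue.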
Alright, let's live-tweet again! Super excited to live-tweet the talk by my colleague, @VertaAI's Cory Johannsen, about why APM is not the same as ML monitoring! #MLOps databricks.com/session_na21/w…
What is APM (New Relic, DataDog, Elastic, etc.)? Tracking metrics about your system, aggregating, dashboarding, alerting (1/n)
APM lets you track health, perf, availability, etc. -- the golden signals (2/n)