I was an eng leader on Facebook’s NewsFeed and my team was responsible for the feed ranking platform.
Every few days an engineer would get paged that a metric, e.g., “likes” or “comments”, was down.
It usually translated to a Machine Learning model performance issue. /thread
2/ The engineer’s typical workflow for diagnosing the alert was to first check our internal monitoring system, Unidash, to confirm the alert was real, and then dive into Scuba to diagnose it further.
3/ Scuba is a real-time analytics system that stored all the prediction logs and made them available for slicing and dicing. It only supported filter and group-by queries and was very fast.
4/ Engineers would load up the Scuba dashboard for a given time window and start slicing data on a variety of attributes.
For example - Are likes down for all types of news feed stories? Are they down only within a particular country or region?
5/ If I were on-call and got an alert that likes dropped by a stat-sig amount, the first thing I would do is go into Scuba.
I would zoom into likes over the last day, compare it with the same window last week, and add filters like country, etc., to find out which slice has the biggest deviation.
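Outside of Scuba, that slicing workflow can be sketched in pandas; the prediction-log schema below (columns like ts, country, likes) and the file name are assumptions, not Facebook’s actual schema.

```python
import pandas as pd

# Hypothetical prediction-log extract: one row per feed impression.
# The file name and columns (ts, country, story_type, likes) are assumed.
logs = pd.read_parquet("prediction_logs.parquet")
logs["ts"] = pd.to_datetime(logs["ts"])

last_day = logs[logs["ts"].between("2019-11-20", "2019-11-21")]
week_ago = logs[logs["ts"].between("2019-11-13", "2019-11-14")]

def like_rate_by(df, dim):
    # Like rate per slice, e.g., per country or per story_type.
    return df.groupby(dim)["likes"].mean()

# Relative deviation of each slice vs. the same window a week earlier;
# the most negative slices are where likes dropped the most.
current = like_rate_by(last_day, "country")
baseline = like_rate_by(week_ago, "country")
deviation = ((current - baseline) / baseline).sort_values()
print(deviation.head(10))
```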
6/ Most Machine Learning model performance issues occurred due to data pipeline issues.
For example, a developer introduces a bug in logging that sends bad feature data to the model, or a piece of the data pipeline breaks because of a system error.
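As a rough illustration of the kind of feature sanity check that catches such logging bugs, here is a hypothetical example; the file name, feature name, and thresholds are all invented.

```python
import pandas as pd

# Hypothetical hourly snapshot of features logged alongside predictions.
features = pd.read_parquet("feature_logs.parquet")

# Two cheap checks that tend to catch logging bugs:
# 1) a feature suddenly going null, 2) its mean drifting sharply.
null_rate = features["friend_affinity_score"].isna().mean()
mean_now = features["friend_affinity_score"].mean()
mean_baseline = 0.42  # assumed trailing 7-day average

assert null_rate < 0.01, f"null rate spiked to {null_rate:.2%}"
assert abs(mean_now - mean_baseline) / mean_baseline < 0.2, "feature mean drifted >20%"
```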
7/ Another set of issues was due to ML models that had not been updated for a while even though user behavior had changed.
This usually resulted in the on-call opening a ticket for the model owner to retrain the model.
8/ Facebook continuously retrained some models, and these models had reproducibility challenges because they were updated every few hours.
9/ Another big use case of Scuba was challenger-champion testing.
Engineers would run lots of A/B tests and use Scuba metrics to figure out which model was performing best before bringing that model’s dashboard to a launch review.
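A back-of-the-envelope version of that comparison is a two-proportion z-test on like rates from an A/B test; the counts below are invented.

```python
from statsmodels.stats.proportion import proportions_ztest

# Invented A/B test counts: likes and impressions for champion vs. challenger.
likes = [120_000, 123_500]
impressions = [2_000_000, 2_000_000]

# Two-proportion z-test on like rate: is the challenger's lift real,
# or within noise, before bringing the model to a launch review?
z_stat, p_value = proportions_ztest(count=likes, nobs=impressions)
print(f"z = {z_stat:.2f}, p = {p_value:.4f}")
```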
10/ Finally, all of this was enabled by a fantastic set of explainability tools that helped us debug models both during experimentation and in production.
Some of these tools were integrated into internal and external versions of the Facebook app. about.fb.com/news/2019/03/w…
11/ Monitoring, analysis, and explainability of models are must-haves for teams that want to operationalize ML at scale and in a trustworthy manner.
With last week’s launch of Google Cloud’s Explainable AI, the conversation around #ExplainableAI has accelerated.
But it raises the questions - Should Google be explaining its own AI algorithms? Who should be doing the explaining? /thread
2/ What do businesses need in order to trust the predictions?
a) They need explanations so they understand what’s going on behind the scenes.
b) They need to know for a fact that these explanations are accurate and trustworthy and come from a reliable source.
3/ Shouldn't there be a separation of church and state?
If Google is building models and is also explaining them for customers -- without third-party involvement -- would the incentives align for customers to completely trust those AI models?
It is amazing to see so many applications of game theory in modern software applications such as search ranking, internet ad auctions, recommendations, etc. An emerging application is in applying Shapley values to explain complex AI models. #ExplainableAI
The Shapley value was named after its inventor, Lloyd S. Shapley. It was devised as a method to distribute the value of a cooperative game among the players in proportion to their contributions to the game's outcome.
Suppose 10 people come together to start a company that produces some revenue. How would you distribute the company's revenue among the 10 people as payoffs so that the payoffs are fair and proportional to their contributions?
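To make that concrete, here is a toy Shapley-value computation for a 3-person version of that company; the revenue function v below is invented purely for illustration.

```python
from itertools import permutations

# Toy characteristic function: revenue (in $k) produced by each coalition.
v = {
    frozenset(): 0,
    frozenset({"A"}): 100,
    frozenset({"B"}): 120,
    frozenset({"C"}): 80,
    frozenset({"A", "B"}): 300,
    frozenset({"A", "C"}): 250,
    frozenset({"B", "C"}): 260,
    frozenset({"A", "B", "C"}): 500,
}
players = ["A", "B", "C"]

# Shapley value: average each player's marginal contribution over all
# orderings in which the players could have joined the company.
shapley = {p: 0.0 for p in players}
orderings = list(permutations(players))
for order in orderings:
    coalition = frozenset()
    for p in order:
        marginal = v[coalition | {p}] - v[coalition]
        shapley[p] += marginal / len(orderings)
        coalition = coalition | {p}

print(shapley)  # payoffs sum to the full 500k and reflect each contribution
```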