4/ A problem with modeling asset-valued entities is that they depreciate based on human usage. Let’s take 2 identical houses in the same zip code, each 2,100 square feet with the same number of bedrooms and bathrooms. How do we know whether one was better maintained than the other?
5/ Additionally, there are global market conditions such as interest rates, GDP, unemployment, and supply and demand that could affect home prices.
Essentially, there are always things that impact a home price that can’t be easily measured and included in a model.
6/ Another thing to worry about is the non-stationary nature of the data. House prices change over time due to depreciation as well as market conditions.
So, teams look at cross-sectional data on house-specific variables such as square footage, year built, or zip.
7/ This is where the Model Risk comes in, because implicit in the machine learning process of dataset construction, model training, and model evaluation is the assumption that the future will be the same as the past.
When that assumption breaks, data scientists call it data drift.
8/ In effect, ML algorithms search through the past for patterns that might generalize to the future. But the future is subject to constant change, and production models can deteriorate in accuracy over time due to data drift.
9/ One or more of these types of data drift could have caused Zillow’s models to deteriorate in production.
Therefore, model performance monitoring becomes critical to proactively catch issues that emerge from the non-stationary nature of the data.
10/ At a bare minimum, teams need to monitor the following things.
a) Performance metrics like MSE, MAPE, etc.
b) Shifts in feature distributions over time.
c) Quality of the data coming into the production pipelines.
d) Finally, alerts that fire when things change.
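A minimal sketch of what those four checks could look like in plain Python. This is not Zillow's setup or any particular monitoring product; the function names, the population stability index (PSI) as the drift measure, and every threshold are assumptions chosen for illustration.

```python
import math

def mape(actual, predicted):
    """Mean absolute percentage error as a fraction (0.05 means 5%)."""
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)

def psi(baseline, production, bins=10):
    """Population stability index between two samples of one numeric feature."""
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if the baseline is constant

    def bucket_shares(values):
        counts = [0] * bins
        for v in values:
            # Values outside the baseline range are clamped into the edge buckets.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # A small floor avoids log(0) when a bucket is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    b, p = bucket_shares(baseline), bucket_shares(production)
    return sum((pi - bi) * math.log(pi / bi) for bi, pi in zip(b, p))

def missing_rate(values):
    """Share of values that are None."""
    return sum(v is None for v in values) / len(values)

def check_alerts(actual, predicted, baseline_feature, production_feature,
                 max_mape=0.10, max_psi=0.2, max_missing=0.05):
    """Return alert messages; the default thresholds are illustrative only."""
    alerts = []
    if mape(actual, predicted) > max_mape:
        alerts.append("performance: MAPE above threshold")
    if missing_rate(production_feature) > max_missing:
        alerts.append("data quality: too many missing values")
    present = [v for v in production_feature if v is not None]
    if present and psi(baseline_feature, present) > max_psi:
        alerts.append("drift: feature distribution shifted")
    return alerts
```

In practice a check like this would run on a schedule against fresh production logs, with thresholds tuned per feature and per model.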
11/ Another thing that could have happened at Zillow is human misinterpretation or misuse of AI.
Explainability provides more information and intuition about how a model operates, and reduces the risk that it will be misused.
12/ XAI enables us to ask questions like:
a) Given a model trained mostly on older homes priced from $100K to $300K, can I trust the model’s prediction that a newly built house costs $400,000?
b) Given 5 houses in a neighborhood, what distinguishes them and their prices?
c) ...
13/ Model risk occurs primarily for two reasons: (1) a model may have fundamental errors and produce inaccurate outputs for the intended business uses; (2) a model may be used incorrectly or inappropriately or there may be a misunderstanding about its limitations and assumptions.
14/ We really don’t know much about Zillow’s model risk process; it is possible they look into these things.
One thing is clear: It is imperative for organizations to establish a strong risk culture around how their AI is developed, deployed, and operated.
I was an engineer on Facebook's News Feed and this is NOT how recommender systems work.
While users can set some explicit preferences, implicit user activity on the app is the bulk of the signal that gets fed into the AI systems which control & rank the feed. /thread
So, if you're engaging with a certain type of content or with content from a certain set of friends, the stories from those sources get ranked higher than others. This is true of Facebook, YouTube, or any other recommender system. /2
These systems take events from a user’s activity history as input and retrieve hundreds of potential candidate stories/videos to show in that user’s feed. /3
I was an eng leader on Facebook’s NewsFeed and my team was responsible for the feed ranking platform.
Every few days an engineer would get paged that a metric e.g., “likes” or “comments” is down.
It usually translated to a Machine Learning model performance issue. /thread
2/ The engineer's typical workflow for diagnosing the alert was to first check our internal monitoring system, Unidash, to see if the alert was real, and then dive into Scuba to diagnose it further.
3/ Scuba is a real-time analytics system that stored all the prediction logs and made them available for slicing and dicing. It supported only filter and group-by queries and was very fast.
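To make "filter and group by" concrete, here is a rough Python sketch of that kind of slice over prediction logs. It is not Scuba's query interface; the log fields (`model`, `country`, `liked`) and the helper `filter_group_avg` are hypothetical names made up for this example.

```python
from collections import defaultdict

# Hypothetical prediction-log rows; the field names are illustrative only.
logs = [
    {"model": "feed_ranker_v2", "country": "US", "liked": 1},
    {"model": "feed_ranker_v2", "country": "US", "liked": 0},
    {"model": "feed_ranker_v2", "country": "IN", "liked": 1},
    {"model": "feed_ranker_v1", "country": "US", "liked": 1},
]

def filter_group_avg(rows, where, group_by, value):
    """Average `value` per `group_by` key over rows matching every `where` pair."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        if all(row.get(k) == v for k, v in where.items()):
            sums[row[group_by]] += row[value]
            counts[row[group_by]] += 1
    return {key: sums[key] / counts[key] for key in sums}

# Like rate per country, restricted to one model version.
by_country = filter_group_avg(logs, {"model": "feed_ranker_v2"}, "country", "liked")
```

Slicing a falling metric by model version, country, or surface like this is often enough to narrow down where a regression started.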
With last week's launch of Google Cloud’s Explainable AI, the conversation around #ExplainableAI has accelerated.
But it raises the questions: Should Google be explaining its own AI algorithms? Who should be doing the explaining? /thread
2/ What do businesses need in order to trust the predictions?
a) They need explanations so they understand what’s going on behind the scenes.
b) They need to know for a fact that these explanations are accurate and trustworthy and come from a reliable source.
3/ Shouldn't there be a separation between church and state?
If Google builds models and also explains them for customers, without third-party involvement, would the incentives align for customers to completely trust their AI models?
It is amazing to see so many applications of game theory in modern software, such as search ranking, internet ad auctions, and recommendations. An emerging one is using Shapley values to explain complex AI models. #ExplainableAI
The Shapley value is named after its inventor, Lloyd S. Shapley. It was devised as a method to distribute the value of a cooperative game among the players in proportion to their contribution to the game's outcome.
Suppose 10 people come together to start a company that produces some revenue. How would you distribute that revenue among the 10 people so that the payoffs are fair and reflect their contributions?
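A small worked sketch of that payoff question, cut down to 3 founders so every ordering can be enumerated (10 people would mean 10! orderings, which is why practical tools approximate). The revenue table is invented for illustration; the method is the standard definition: average each player's marginal contribution over all orders in which the team could have been assembled.

```python
from itertools import permutations

def shapley_values(players, value):
    """Exact Shapley values via the average marginal contribution over all join orders."""
    totals = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # What p adds by joining the people who are already there.
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: totals[p] / len(orders) for p in players}

# Hypothetical revenue produced by each possible subset of founders.
revenue = {
    frozenset(): 0,
    frozenset("A"): 10, frozenset("B"): 20, frozenset("C"): 30,
    frozenset("AB"): 40, frozenset("AC"): 50, frozenset("BC"): 60,
    frozenset("ABC"): 90,
}

payoffs = shapley_values(["A", "B", "C"], lambda s: revenue[frozenset(s)])
# payoffs == {"A": 20.0, "B": 30.0, "C": 40.0}; the shares sum to the full 90.
```

When the same idea is used to explain a model, the "players" are features and the "revenue" is the model's prediction.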