Susan Athey
Economist.

Sep 14, 2020, 13 tweets

@otis_reid Matrix factorization: Panel data can be thought of as a matrix. A necessary condition for being able to do prediction is that there is some structure --something about the row and column of an entry is informative about the outcome. 1/n

@otis_reid One way to describe the amount of structure is the quality of approximation you can get with a low rank matrix. An NxT matrix of rank k can be written as product of two latent factor matrices with k factors: [Nxk] X [kxT]. 2/n
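A minimal NumPy sketch of the [Nxk] X [kxT] idea. The dimensions, the random factors, and the variable names are all made up for illustration; the truncated SVD is used here because it gives the best rank-k approximation in squared error.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, k = 6, 5, 2  # hypothetical sizes

# Build an exactly rank-k matrix as a product of two latent factor matrices.
U = rng.normal(size=(N, k))   # unit factors, N x k
V = rng.normal(size=(k, T))   # time factors, k x T
Y = U @ V                     # N x T panel

# Truncated SVD: keep only the k largest singular values.
u, s, vt = np.linalg.svd(Y, full_matrices=False)
Y_k = (u[:, :k] * s[:k]) @ vt[:k, :]

print(np.linalg.matrix_rank(Y))   # rank of the constructed panel
print(np.allclose(Y, Y_k))        # rank-k approximation reproduces Y here
```

With real data the matrix is rarely exactly rank k, and the size of the discarded singular values `s[k:]` indicates how much structure a rank-k approximation misses.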

@otis_reid Fixed effect models impose low rank structure with very strong functional form (outcome is sum of unit, time effects). Not usually the best way to approx a given matrix with a limited number of parameters. Matrix factorization finds good approximation in data-driven way. 3/n
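One way to see this numerically, as a sketch on simulated data (the panel below is hypothetical): the two-way fixed-effects fit is itself a matrix of rank at most 2, so the best rank-2 approximation can never fit worse in squared error. Note that the unrestricted rank-2 fit uses more free parameters than the fixed-effects fit, so this illustrates the functional-form restriction rather than a like-for-like parameter count.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T = 8, 6  # hypothetical sizes

# Hypothetical panel: one multiplicative factor plus a little noise.
Y = np.outer(rng.uniform(1, 3, N), rng.uniform(1, 3, T)) + 0.1 * rng.normal(size=(N, T))

# Two-way fixed-effects fit for a balanced panel:
# row mean + column mean - grand mean.
fe_fit = Y.mean(axis=1, keepdims=True) + Y.mean(axis=0, keepdims=True) - Y.mean()

# Best rank-2 approximation via truncated SVD.
u, s, vt = np.linalg.svd(Y, full_matrices=False)
svd_fit = (u[:, :2] * s[:2]) @ vt[:2, :]

err_fe = np.sum((Y - fe_fit) ** 2)
err_svd = np.sum((Y - svd_fit) ** 2)
print(np.linalg.matrix_rank(fe_fit), err_fe, err_svd)
```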

@otis_reid If outcome=smoking, matrix is states X yrs, latent unit characteristics =share of pop in each demographic (smoking highly correlated w/ age, ethnicity). Smoking in state/year is dot product of “share of state pop in each demographic” and “smoking rate for demo in this year.” 4/n
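The dot product in the smoking example can be written out directly. All numbers below are invented for illustration (two states, three demographic groups, two years); they are not data from the thread.

```python
import numpy as np

# Hypothetical numbers, for illustration only.
# Rows: states; columns: demographic groups (shares sum to 1 within a state).
shares = np.array([[0.5, 0.3, 0.2],
                   [0.2, 0.5, 0.3]])

# Rows: demographic groups; columns: years
# (smoking rate of each group in each year).
rates = np.array([[0.30, 0.25],
                  [0.20, 0.18],
                  [0.10, 0.08]])

# Each state/year entry is the dot product of that state's shares
# with that year's group-level smoking rates.
smoking = shares @ rates  # states x years
print(smoking)
```

In practice neither `shares` nor `rates` is observed; the factorization recovers some pair of latent matrices whose product matches the observed states X yrs panel.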

@otis_reid If outcome=purchase, matrix is people X products, latent unit characteristics=preferences for product attributes, latent product characteristics=product attributes. Latent product attribute could represent “organic.” 5/n

@otis_reid In smoking ex., we figure out how state outcomes move together over time. Without directly observing demographics, we can infer the factors that lead to co-movements, and if I see some states at a point in time, can infer outcomes for others at that time. 6/n
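A sketch of inferring unobserved entries from co-movements, on simulated data. The thread does not name an estimator; alternating least squares with a small ridge penalty is chosen here only for brevity, and the function name `complete_als`, the sizes, and the 60% observation rate are all assumptions.

```python
import numpy as np

def complete_als(Y, mask, k, lam=1e-3, iters=100, seed=0):
    """Fit Y ~ U @ V.T using only entries where mask is True."""
    rng = np.random.default_rng(seed)
    N, T = Y.shape
    U = rng.normal(size=(N, k))
    V = rng.normal(size=(T, k))
    ridge = lam * np.eye(k)
    for _ in range(iters):
        for i in range(N):  # update unit factors, time factors held fixed
            Vo = V[mask[i]]
            U[i] = np.linalg.solve(Vo.T @ Vo + ridge, Vo.T @ Y[i, mask[i]])
        for t in range(T):  # update time factors, unit factors held fixed
            Uo = U[mask[:, t]]
            V[t] = np.linalg.solve(Uo.T @ Uo + ridge, Uo.T @ Y[mask[:, t], t])
    return U @ V.T

rng = np.random.default_rng(2)
N, T, k = 30, 20, 2
Y_true = rng.normal(size=(N, k)) @ rng.normal(size=(k, T))  # simulated low-rank panel
mask = rng.random((N, T)) < 0.6                             # True = observed entry

Y_hat = complete_als(Y_true, mask, k)  # entries with mask False are never read

# Compare against a naive baseline: predict the mean of the observed entries.
rmse_als = np.sqrt(np.mean((Y_hat - Y_true)[~mask] ** 2))
rmse_mean = np.sqrt(np.mean((Y_true[mask].mean() - Y_true[~mask]) ** 2))
print(rmse_als, rmse_mean)
```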

@otis_reid In shopping ex., I learn from correlation structure in purchase behavior. One person purchases organic tomatoes and lettuce, another purchases organic lettuce and cucumber. I predict the first person more likely to buy organic cucumber. 7/n
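A toy version of the shopping example. The two organic shoppers follow the tweet; the third shopper and the two non-organic products are added here as hypothetical contrast, and the product labels are invented.

```python
import numpy as np

products = ["org_tomato", "org_lettuce", "org_cucumber", "chips", "soda"]

# Rows: hypothetical shoppers; 1 = purchased.
X = np.array([[1, 1, 0, 0, 0],   # person A: organic tomatoes and lettuce
              [0, 1, 1, 0, 0],   # person B: organic lettuce and cucumber
              [0, 0, 0, 1, 1]],  # person C: added for contrast
             dtype=float)

# Rank-2 approximation: two latent product attributes.
u, s, vt = np.linalg.svd(X, full_matrices=False)
scores = (u[:, :2] * s[:2]) @ vt[:2, :]

# Person A never bought organic cucumber, but shares lettuce with person B,
# so the low-rank fit scores cucumber above the unrelated products.
for name, score in zip(products, scores[0]):
    print(name, round(float(score), 3))
```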

@otis_reid Even if matrix sparse (mostly 0’s), so fixed effects hard to estimate, can still find good low-rank approximation if structure is present in the data. Chains of people buying overlapping products informative. 8/n

@otis_reid I have some lecture notes for a master’s class (not as polished as I’d like, and stealing liberally from others) here that may help build intuition for different ways to look at a matrix: drive.google.com/drive/u/0/fold… 9/n

@otis_reid You can see Guido teach this at the AEA website here aeaweb.org/conference/con… , and slides are here: drive.google.com/drive/u/0/fold… 10/n

@otis_reid Slides build intuition about regression in panels. Do you regress final-period outcomes on prior-period outcomes, where an observation is a unit? Or regress the target unit's outcomes on other units' outcomes in the same period, where an observation is a time period (synth control)? Matrix completion works whether N>T or T>N, and is good in the middle. 11/n
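The two regressions can be sketched side by side on a simulated, noiseless low-rank panel. The labels "horizontal" and "vertical" do not appear in the thread and are used here only to tell the two apart; the sizes are hypothetical, and `rcond=1e-8` is set so that `np.linalg.lstsq` ignores the near-zero singular values of the rank-deficient design.

```python
import numpy as np

rng = np.random.default_rng(3)
N, T, k = 8, 6, 2
Y = rng.normal(size=(N, k)) @ rng.normal(size=(k, T))  # noiseless rank-k panel

# Pretend the last unit's last-period outcome is unobserved.
truth = Y[-1, -1]
controls_pre = Y[:-1, :-1]   # control units, earlier periods

# Horizontal: each control unit is an observation;
# regress last-period outcomes on earlier-period outcomes.
beta, *_ = np.linalg.lstsq(controls_pre, Y[:-1, -1], rcond=1e-8)
pred_horizontal = Y[-1, :-1] @ beta

# Vertical (synthetic-control style): each earlier period is an observation;
# regress the target unit's outcomes on the control units' outcomes.
w, *_ = np.linalg.lstsq(controls_pre.T, Y[-1, :-1], rcond=1e-8)
pred_vertical = Y[:-1, -1] @ w

print(truth, pred_horizontal, pred_vertical)
```

On exactly low-rank data both regressions recover the missing entry; with noise, which one is better posed depends on whether there are more units or more periods.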

@otis_reid I also have applications to shopping and discrete choice, in these papers: arxiv.org/abs/1906.02635 arxiv.org/abs/1711.03560 and also see slides here: drive.google.com/drive/u/0/fold… 12/n

@otis_reid The shopping papers show how modern matrix factorization can be combined with structural models, and indeed there is a long history in marketing/IO and also in time series econ of using latent factor models, just typically fewer factors. 13/n
