14 Sep, 13 tweets, 6 min read
@otis_reid Matrix factorization: Panel data can be thought of as a matrix. A necessary condition for being able to do prediction is that there is some structure: something about the row and column of an entry is informative about the outcome. 1/n
@otis_reid One way to describe the amount of structure is the quality of approximation you can get with a low rank matrix. An NxT matrix of rank k can be written as product of two latent factor matrices with k factors: [Nxk] X [kxT]. 2/n
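To make the [Nxk] X [kxT] statement concrete, here is a minimal sketch using NumPy's SVD. The sizes, the seed, and the helper name `low_rank_approx` are illustrative assumptions, not anything from the thread.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, k = 6, 5, 2  # hypothetical panel dimensions and number of factors

# Build an exactly rank-k panel from two latent factor matrices.
unit_factors = rng.normal(size=(N, k))   # N x k
time_factors = rng.normal(size=(k, T))   # k x T
Y = unit_factors @ time_factors          # N x T

def low_rank_approx(Y, k):
    """Best rank-k approximation in Frobenius norm via truncated SVD."""
    u, s, vt = np.linalg.svd(Y, full_matrices=False)
    left = u[:, :k] * s[:k]   # N x k, singular values folded into the left factor
    right = vt[:k, :]         # k x T
    return left, right

left, right = low_rank_approx(Y, k)
rank = np.linalg.matrix_rank(Y)
err = np.linalg.norm(Y - left @ right)   # essentially zero for an exactly rank-k matrix
```

The recovered factors are only identified up to an invertible k x k rotation, so `left` need not equal `unit_factors` even though their products agree.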
@otis_reid Fixed effect models impose low rank structure with very strong functional form (outcome is sum of unit, time effects). Not usually the best way to approx a given matrix with a limited number of parameters. Matrix factorization finds good approximation in data-driven way. 3/n
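A rough way to see this numerically, as a sketch on simulated data (the sizes, noise level, and function names are assumptions for illustration): on a complete panel, the least-squares two-way fixed effects fit is row mean plus column mean minus grand mean, which is a matrix of rank at most 2. The best rank-2 approximation from the SVD therefore cannot fit worse. Note that a general rank-2 factorization has more free parameters than the fixed effects model, so this compares equal rank, not equal parameter count.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T = 30, 20  # hypothetical panel size

# Simulated panel: two latent factors plus a little noise.
Y = rng.normal(size=(N, 2)) @ rng.normal(size=(2, T)) + 0.1 * rng.normal(size=(N, T))

def two_way_fe_fit(Y):
    """Least-squares fit of Y[i, t] = a[i] + b[t] on a complete panel."""
    row = Y.mean(axis=1, keepdims=True)
    col = Y.mean(axis=0, keepdims=True)
    return row + col - Y.mean()   # additive structure, rank at most 2

def svd_fit(Y, k):
    """Best rank-k approximation in Frobenius norm."""
    u, s, vt = np.linalg.svd(Y, full_matrices=False)
    return (u[:, :k] * s[:k]) @ vt[:k, :]

err_fe = np.linalg.norm(Y - two_way_fe_fit(Y))
err_svd = np.linalg.norm(Y - svd_fit(Y, 2))
```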
@otis_reid If outcome = smoking, the matrix is states x years, and latent unit characteristics = share of pop in each demographic (smoking highly correlated w/ age, ethnicity). Smoking in state/year is dot product of “share of state pop in each demographic” and “smoking rate for demo in this year.” 4/n
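The dot product can be written out directly. All numbers below are made up purely for illustration, and the array names are hypothetical.

```python
import numpy as np

# Hypothetical numbers, for illustration only.
# Rows: states; columns: demographic groups (shares sum to 1 within a state).
shares = np.array([
    [0.5, 0.3, 0.2],   # state A
    [0.2, 0.5, 0.3],   # state B
])

# Rows: demographic groups; columns: years (smoking rate of the group in that year).
rates = np.array([
    [0.30, 0.25],
    [0.20, 0.15],
    [0.10, 0.10],
])

# Each entry is the dot product of a state's shares and a year's group rates.
smoking = shares @ rates   # states x years

# State A, first year: 0.5*0.30 + 0.3*0.20 + 0.2*0.10 = 0.23
state_a_first_year = smoking[0, 0]
```

In this toy setup the demographics are written down explicitly; in the matrix factorization setting they would be latent and inferred from the outcome matrix alone.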
@otis_reid If outcome=purchase, matrix is people X products, latent unit characteristics=preferences for product attributes, latent product characteristics=product attributes. Latent product attribute could represent “organic.” 5/n
@otis_reid In smoking ex., we figure out how state outcomes move together over time. Without directly observing demographics, we can infer the factors that lead to co-movements, and if I see some states at a point in time, can infer outcomes for others at that time. 6/n
@otis_reid In shopping ex., I learn from correlation structure in purchase behavior. One person purchases organic tomatoes and lettuce, another purchases organic lettuce and cucumber. I predict the first person is more likely to buy organic cucumber. 7/n
@otis_reid Even if matrix sparse (mostly 0’s), so fixed effects hard to estimate, can still find good low-rank approximation if structure is present in the data. Chains of people buying overlapping products informative. 8/n
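One simple way to fill in a partially observed matrix from a low-rank fit is alternating least squares with a ridge penalty. The sketch below is a generic illustration on simulated data, not the estimator from the papers or slides linked in this thread; it treats unobserved entries as missing, and the function name, penalty `lam`, and sizes are all assumptions.

```python
import numpy as np

def als_complete(Y, mask, k, lam=0.1, n_iter=50, seed=0):
    """Fit Y ~ U @ V.T on observed entries (mask == True) by ridge-regularized ALS."""
    rng = np.random.default_rng(seed)
    N, T = Y.shape
    U = rng.normal(scale=0.1, size=(N, k))
    V = rng.normal(scale=0.1, size=(T, k))
    history = []
    for _ in range(n_iter):
        # Update each row's factors using only that row's observed columns.
        for i in range(N):
            obs = mask[i]
            A = V[obs].T @ V[obs] + lam * np.eye(k)
            U[i] = np.linalg.solve(A, V[obs].T @ Y[i, obs])
        # Update each column's factors using only that column's observed rows.
        for t in range(T):
            obs = mask[:, t]
            A = U[obs].T @ U[obs] + lam * np.eye(k)
            V[t] = np.linalg.solve(A, U[obs].T @ Y[obs, t])
        resid = (Y - U @ V.T)[mask]
        history.append(resid @ resid + lam * ((U ** 2).sum() + (V ** 2).sum()))
    return U, V, history

# Simulated example: a rank-2 matrix with roughly 30% of entries hidden.
rng = np.random.default_rng(42)
N, T, k = 20, 15, 2
truth = rng.normal(size=(N, k)) @ rng.normal(size=(k, T))
mask = rng.random((N, T)) < 0.7
Y = np.where(mask, truth, np.nan)

U, V, history = als_complete(Y, mask, k)
completed = U @ V.T   # predictions for every entry, observed or not
```

Each row update borrows information from every other row through the shared column factors, which is the "chains of overlapping products" idea in code form. The penalty `lam` and the rank `k` would normally be chosen by cross-validation.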
@otis_reid I have some lecture notes for a master’s class (not as polished as I’d like, and stealing liberally from others) here that may help build intuition for different ways to look at a matrix: drive.google.com/drive/u/0/fold… 9/n
@otis_reid You can see Guido teach this at the AEA website here aeaweb.org/conference/con… , and slides are here: drive.google.com/drive/u/0/fold… 10/n
@otis_reid Slides build intuition about regression in panels. Do you regress final-period outcomes on prior-period outcomes, where the observation is a unit? Or regress target unit outcomes on other units in the same period, where the observation is a time period (synthetic control)? Matrix completion works whether N>T or T>N, and does well in between. 11/n
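The two regressions can be sketched side by side on a simulated panel. This is a simplified illustration under strong assumptions: the data are noiseless and exactly low rank, there is no intercept, and the "vertical" regression is plain minimum-norm least squares rather than synthetic control proper, which typically restricts the weights. The helper `fit_predict` and all sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(7)
N, T, k = 8, 6, 2
Y = rng.normal(size=(N, k)) @ rng.normal(size=(k, T))  # noiseless, rank k
target_truth = Y[0, -1]   # pretend unit 0's final-period outcome is unobserved

def fit_predict(X, y, x_new):
    """Minimum-norm least squares fit of y on X, then predict at x_new."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=1e-10)
    return x_new @ beta

# Horizontal: each observation is another unit;
# regress final-period outcomes on prior-period outcomes.
pred_horizontal = fit_predict(Y[1:, :-1], Y[1:, -1], Y[0, :-1])

# Vertical: each observation is a prior period;
# regress the target unit's outcomes on the other units' outcomes.
pred_vertical = fit_predict(Y[1:, :-1].T, Y[0, :-1], Y[1:, -1])
```

With exact low-rank structure both regressions recover the hidden entry; with noise, which one behaves better depends on the shape of the panel.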
@otis_reid I also have applications to shopping and discrete choice, in these papers: arxiv.org/abs/1906.02635 arxiv.org/abs/1711.03560 and also see slides here: drive.google.com/drive/u/0/fold… 12/n
@otis_reid The shopping papers show how modern matrix factorization can be combined with structural models, and indeed there is a long history in marketing/IO and also in time series econ of using latent factor models, just typically fewer factors. 13/n

