Counter-intuitively, if your recommender is trained on data collected from past interactions with recs, retraining new models regularly doesn’t resolve the issue of performance decaying over time and can actually make it worse. Here’s why...
The usual concern is that the performance of a model will decline as the distribution of data encountered during serving deviates from the distribution of the training data.
This is a valid concern about a real phenomenon, and it’s a good reason to retrain on a regular basis.
However, if the training data is collected from interactions with past recommendations, then which data you collected was determined by which recommendations were made, which was determined by the last model served in production. See the issue?
Any bias introduced by the previous model changes which data you train the next model on—and the whole purpose of a recommender model is to be intentionally biased toward relevant items.
That doesn’t sound so bad at first, but as more and more subsequent models are trained on increasingly biased data, it can lead the recommender to collapse in on itself and recommend an increasingly narrow set of mostly popular items to a broader and broader slice of the users.
A similar effect applies to evaluation: when evaluating a new model on data collected from interactions with past recs, you don’t have data about interactions with items that weren’t recommended, so models that are similar to the previous model will tend to look the best.
Retraining regularly is a good idea, but without correcting for the bias current models introduce in future training and evaluation data, it can accelerate the destructive feedback loop.
So how do you correct for this effect? Well, you need some data about interactions that the current model wouldn’t normally collect, so instead of a recommendation policy that greedily exploits the best possible recs (according to the model), you introduce some exploration.
That lets you sample at least a little bit of data about interactions you wouldn’t normally see, which makes counterfactual/off-policy training and evaluation possible.
In essence, you collect a much broader but still very biased dataset, and then use statistics to reweight the resulting logged data for training and evaluation to look like it was collected with a more uniform recommendation policy (instead of a greedy policy).
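One common way to do that kind of reweighting is inverse propensity scoring (IPS). Here’s a minimal sketch, assuming each logged interaction also recorded the probability that the serving policy had of showing that item. The names `ips_value`, `logs`, and `target_prob` are made up for illustration, and the log itself is toy data.

```python
def ips_value(logs, target_prob):
    """Inverse propensity scoring estimate of a target policy's average reward.

    logs: iterable of (item, reward, logging_prob) tuples, where logging_prob
          is the probability the serving policy had of showing that item.
    target_prob: function item -> probability under the policy being evaluated.
    """
    total = 0.0
    n = 0
    for item, reward, logging_prob in logs:
        if logging_prob <= 0:
            raise ValueError("logged items need a positive logging probability")
        # Up-weight interactions the serving policy rarely produced.
        total += reward * target_prob(item) / logging_prob
        n += 1
    return total / n if n else 0.0


# Hypothetical log: item "a" was shown 90% of the time, item "b" only 10%.
logs = [("a", 1.0, 0.9)] * 9 + [("b", 0.0, 0.1)]

# Plain average over the log, which inherits the serving policy's bias.
naive_ctr = sum(reward for _, reward, _ in logs) / len(logs)

# Estimate of what a uniform policy over the two items would have seen.
uniform_ctr = ips_value(logs, lambda item: 0.5)
```

The naive average mostly reflects how often the popular item was shown, while the reweighted estimate gives the rarely shown item its fair share. In practice the weights can get large when propensities are tiny, which is why clipped or self-normalized variants are often used instead.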
In order to accomplish this, introducing some exploration into the items being served to users (whether from recs or an existing non-algorithmic source) is one of the first things to do, even though it’s often at the end of a recommendations pipeline.
It doesn’t have to be anything fancy. Even an epsilon-greedy policy, where some small percentage of the time an item is chosen at random (perhaps from a pool of potentially relevant candidate items), gives you broader data about interactions you wouldn’t have seen otherwise.
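A rough sketch of what epsilon-greedy could look like for a single recommendation slot. The function name, the 5% default, and the assumption that the candidate pool has no duplicates and includes the model’s top pick are all mine. It also returns the probability of having shown the chosen item, since that’s what later reweighting needs.

```python
import random


def epsilon_greedy(best_item, candidates, epsilon=0.05, rng=random):
    """Return (item, propensity) for one recommendation slot.

    best_item: the model's top pick (the greedy choice).
    candidates: pool of potentially relevant items to explore from;
                assumed to be duplicate-free and to include best_item.
    """
    if rng.random() < epsilon:
        item = rng.choice(candidates)  # explore: uniform over the pool
    else:
        item = best_item               # exploit: the model's top pick

    # Probability that this policy shows `item`: every candidate can be
    # reached through exploration, and the top pick also through exploitation.
    propensity = epsilon / len(candidates)
    if item == best_item:
        propensity += 1 - epsilon
    return item, propensity


pool = ["a", "b", "c", "d"]
item, propensity = epsilon_greedy("a", pool, epsilon=0.2)
```

Logging `propensity` alongside each interaction is the important part: without it, the explored data is still useful, but you can’t easily correct for how it was collected.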
I’ll close with some papers and pointers to more info.
This excellent RecSys paper demonstrates what happens when you don’t correct for the biases introduced by past models, algorithms, and recs:
I learned recently that a decent fraction of cable modems across many brands contain a chipset with a lag-spike design flaw significant enough to warrant lawsuits: theregister.com/2017/04/11/int…
If you’re using whatever the cable company gave you (like I was), and your internet connection seems generally good but occasionally lags out completely, check the model number of your modem against that list. I swapped mine out yesterday and saw an immediate improvement.
The typical explanation of skip-connections in neural nets depicts them as a detour around parts of the network and suggests that they allow higher level information to flow backward to the lower levels during back-propagation of the gradients. I find this somewhat backward.
The “output” of a skip connection seems to usually be combined with the output of some other blocks of the network with an element-wise operation like a sum, so...
To me, it makes a heck of a lot more sense to consider the skip connection to be the default path in the forward pass, onto which something extra is being added or combined.
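To make that framing concrete, here’s a tiny framework-free sketch using plain Python lists instead of tensors. The `residual_block` helper and both branch functions are hypothetical stand-ins for real layers.

```python
def residual_block(x, branch):
    """Forward pass of a skip connection: the input passes straight through,
    and the branch's output is added on top, element by element."""
    update = branch(x)
    return [xi + ui for xi, ui in zip(x, update)]


def zero_branch(x):
    # A branch that contributes nothing leaves the default path untouched.
    return [0.0 for _ in x]


def small_branch(x):
    # Hypothetical learned branch: a small correction to each element.
    return [0.1 * xi for xi in x]


x = [1.0, -2.0, 3.0]
unchanged = residual_block(x, zero_branch)  # the block acts as the identity
nudged = residual_block(x, small_branch)    # the input plus a small correction
```

Seen this way, the branch only has to learn what to add to the signal that’s already flowing through, rather than having to reproduce it.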
My take is that we haven’t had the right model architectures. Here’s why I think that...
Going way back to the Netflix prize, multiplicative interactions have been a key component of successful modeling strategies. Matrix factorization did well on the Netflix data and became a classic approach to making recommendations.
Many further iterations on the key concept of factorizing matrices into low-rank approximations with vector embeddings per user/item/attribute have also been successful.
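For anyone who hasn’t seen it, here’s a minimal sketch of the basic idea: each user and item gets a small embedding vector, and the predicted rating is their dot product, which is where the multiplicative interaction comes from. The ratings, hyperparameters, and function names below are made up for illustration, and this is plain SGD rather than any particular Netflix prize entry.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def factorize(ratings, n_users, n_items, dim=2, lr=0.05, reg=0.01,
              epochs=500, seed=0):
    """Fit user and item embeddings to (user, item, rating) triples with SGD."""
    rng = random.Random(seed)
    users = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(n_users)]
    items = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            # Prediction is the dot product of the two embeddings.
            err = r - dot(users[u], items[i])
            for k in range(dim):
                pu, qi = users[u][k], items[i][k]
                # Gradient step on squared error with L2 regularization.
                users[u][k] += lr * (err * qi - reg * pu)
                items[i][k] += lr * (err * pu - reg * qi)
    return users, items


# Toy data: (user index, item index, rating).
ratings = [(0, 0, 5.0), (0, 1, 1.0), (1, 0, 4.0),
           (1, 1, 1.0), (2, 0, 1.0), (2, 1, 5.0)]
users, items = factorize(ratings, n_users=3, n_items=2)
predicted = dot(users[0], items[0])  # model's estimate for user 0 on item 0
```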
This has been exactly my experience in the workplace as an autistic ML engineer. I’m still doing machine learning, building recommender systems, attending conferences, contributing to open source projects, publishing my work.
I’m just not getting paid, because I don’t fit in.
When I mention that I’m not currently employed, people sometimes reach out to see if I’d be interested in a position with their company—which I appreciate but often don’t know how to respond to.
I might be, but the truth is that I have absolutely been through the wringer with work these past five years or so, and I have some lingering workplace-related trauma that makes me very hesitant to pursue the next thing.
It’s actually not that difficult to understand why people whose foundational worldview is “I feel safe and comfortable when I’m part of a homogeneous in-group that defines the terms of public life” would feel like they’re under siege by the (reasonable) demands from out-groups.
When people say that they feel like their freedoms are being taken away when freedoms are granted to others, this is what they’re referring to: the freedom to live in a social bubble where they don’t have to grapple with differences they find unsettling or threatening.
Here’s the thing: we all feel more comfortable in homogeneous groups that dictate the terms of interaction. Some of us grow beyond it to become more capable of engaging outside our bubbles, and some of us don’t. This mindset is in all of us to some extent and is never going away.
This story and picture of paint-covered magazines really doesn't add up, so I did some digging and found some interesting stuff. 1/many
First of all, the paint goes entirely unremarked on and makes no sense. However, reporting from various sources indicates that federal agents claim that they are being hit with balloons full of paint. 🤔
Oh really, you don't say? Well, here's a Reuters photo from yesterday of a DHS police officer covered in red paint.