r e c s y s d o t s o c i a l
Leave Twitter. Staying is being complicit.
Aug 20, 2022 6 tweets 1 min read
If the conceptual basis of your modeling approach can’t be explained to a precocious 8 year old using crayons, the back of a paper diner menu, and whatever bottles and packets happen to be on the table, I’d like to humbly suggest that you may not understand it very well either. If it only makes sense when you use a lot of advanced mathematics and dense notation, maybe it…doesn’t?
Aug 20, 2022 4 tweets 1 min read
Many ANN search tools (e.g. FAISS, ScaNN) allow you to provide multiple points as part of the same query.

Puzzled why more retrieval models don’t take advantage of this. Give me 100 neighbors of ten points, not 1000 neighbors of one point!

(Then score and order them.)

Your ranker can’t rank items that aren’t in the retrieved candidate set, so if you care about ending up with a diverse set of items in the final ranked list, you have to start all the way upstream and make sure to retrieve them.
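A minimal sketch of the "100 neighbors of ten points" idea, assuming NumPy only. The exact inner-product search here is a stand-in for a real ANN index, and the helper names (`top_k_neighbors`, `retrieve_multi_point`, `rank`), the random data, and the mean-vector "ranker" are all hypothetical, for illustration rather than anything from FAISS or ScaNN:

```python
import numpy as np

def top_k_neighbors(queries, items, k):
    """Exact inner-product search, standing in for an ANN index.

    Returns (scores, ids), each of shape (n_queries, k).
    """
    sims = queries @ items.T
    # argpartition finds the k best per row without fully sorting every row.
    ids = np.argpartition(-sims, k - 1, axis=1)[:, :k]
    scores = np.take_along_axis(sims, ids, axis=1)
    return scores, ids

def retrieve_multi_point(query_points, items, k_per_point):
    """Retrieve k neighbors for each query point and merge into one candidate set."""
    _, ids = top_k_neighbors(query_points, items, k_per_point)
    return np.unique(ids)  # de-duplicated item ids across all query points

def rank(candidates, user_vector, items):
    """Toy ranker (an assumption): score candidates against one user vector."""
    scores = items[candidates] @ user_vector
    order = np.argsort(-scores)
    return candidates[order], scores[order]

rng = np.random.default_rng(0)
items = rng.normal(size=(5000, 32)).astype(np.float32)
query_points = rng.normal(size=(10, 32)).astype(np.float32)  # ten points for one user
user_vector = query_points.mean(axis=0)

candidates = retrieve_multi_point(query_points, items, k_per_point=100)
ranked, ranked_scores = rank(candidates, user_vector, items)
```

The candidate set has at most 10 × 100 items, fewer when the neighborhoods overlap, and it covers ten regions of the item space instead of one.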
Oct 1, 2021 10 tweets 2 min read
Recommender systems lessons I wish someone had told me years ago:

1. Any rock you lift up and peek under, whether models, datasets, loss functions, or evaluation methods, will have a bunch of creepy crawly biases underneath. (If you haven’t spotted the popularity bias yet, keep looking.)

2. All candidate retrieval (heuristic, social, nearest-neighbor based, or otherwise) can be expressed as a graph query. If you need many different types and find a myriad of different systems weighing you down, consider whether you can unify them by reframing as a graph.
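One way to picture point 2, as a rough sketch rather than a production design: store typed edges in a single graph and express each retrieval strategy as a path of edge types. The `Graph` class, the node ids, and the edge names (`follows`, `liked`, `similar_to`, `lives_in`, `trending`) are all made up for illustration, and the `similar_to` edges are assumed to be precomputed from a nearest-neighbor search.

```python
from collections import defaultdict

class Graph:
    """Tiny typed-edge graph: edge_type -> source node -> set of destination nodes."""

    def __init__(self):
        self._edges = defaultdict(lambda: defaultdict(set))

    def add(self, src, edge_type, dst):
        self._edges[edge_type][src].add(dst)

    def walk(self, starts, edge_types):
        """Follow one edge type per hop, starting from the given nodes."""
        frontier = set(starts)
        for edge_type in edge_types:
            hop = self._edges[edge_type]
            frontier = {dst for src in frontier for dst in hop.get(src, ())}
        return frontier

g = Graph()
g.add("u:ana", "follows", "u:ben")
g.add("u:ben", "liked", "i:jazz")
g.add("u:ana", "liked", "i:rock")
g.add("i:rock", "similar_to", "i:metal")  # assumed precomputed nearest-neighbor edge
g.add("u:ana", "lives_in", "r:uk")
g.add("r:uk", "trending", "i:pop")

# Three retrieval strategies, each just a different path through the same graph.
social = g.walk(["u:ana"], ["follows", "liked"])
neighbor = g.walk(["u:ana"], ["liked", "similar_to"])
heuristic = g.walk(["u:ana"], ["lives_in", "trending"])
candidates = social | neighbor | heuristic
```

Adding a new candidate source then means adding edges and a path, not standing up another system.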
Apr 7, 2021 15 tweets 4 min read
Counter-intuitively, if your recommender is trained on data collected from past interactions with recs, retraining new models regularly doesn’t resolve the issue of performance decaying over time and can actually make it worse. Here’s why...

The usual concern is that the performance of a model will decline as the distribution of data encountered during serving deviates from the distribution of the training data.

This is an accurate concern about a real phenomenon, and is a good reason to retrain on a regular basis.
Apr 7, 2021 4 tweets 1 min read
I learned recently that a decent fraction of cable modems across many brands contain a chipset with a lag-spike design flaw significant enough to warrant lawsuits: theregister.com/2017/04/11/int…

Here’s a list: lookgadgets.com/articles/intel…
Apr 5, 2021 8 tweets 2 min read
The typical explanation of skip-connections in neural nets depicts them as a detour around parts of the network and suggests that they allow higher level information to flow backward to the lower levels during back-propagation of the gradients. I find this somewhat backward.

The “output” of a skip connection seems to usually be combined with the output of some other blocks of the network with an element-wise operation like a sum, so...
Oct 9, 2020 17 tweets 3 min read
“Why Are Deep Learning Models Not Consistently Winning Recommender Systems Competitions Yet?”

dl.acm.org/doi/abs/10.114…

My take is that we haven’t had the right model architectures. Here’s why I think that...

Going way back to the Netflix prize, multiplicative interactions have been a key component of successful modeling strategies. Matrix factorization did well on the Netflix data and became a classic approach to making recommendations.
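To make "multiplicative interactions" concrete, here is a toy illustration of my own (not from the thread or the paper), assuming NumPy and a made-up 2×2 ratings matrix in which two users have opposite tastes. A purely additive model (global mean plus user and item offsets) cannot fit it, while a rank-1 product of user and item factors fits it exactly:

```python
import numpy as np

# Hypothetical ratings: user 0 likes item 0 and dislikes item 1; user 1 is the opposite.
R = np.array([[1.0, -1.0],
              [-1.0, 1.0]])

# Best least-squares additive fit for a fully observed matrix:
# row mean + column mean - global mean (no interaction term).
mu = R.mean()
additive = R.mean(axis=1, keepdims=True) + R.mean(axis=0, keepdims=True) - mu
additive_error = float(((R - additive) ** 2).sum())

# Rank-1 matrix factorization: prediction is the product of a user factor and an item factor.
p = np.array([1.0, -1.0])  # user factors (chosen by hand, not learned)
q = np.array([1.0, -1.0])  # item factors (chosen by hand, not learned)
multiplicative = np.outer(p, q)
multiplicative_error = float(((R - multiplicative) ** 2).sum())
```

Here the additive fit is all zeros, so it misses every rating, while the product `p[u] * q[i]` captures the "depends on which user meets which item" structure with a single factor.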
Oct 6, 2020 11 tweets 3 min read
This has been exactly my experience in the workplace as an autistic ML engineer. I’m still doing machine learning, building recommender systems, attending conferences, contributing to open source projects, publishing my work.

I’m just not getting paid, because I don’t fit in.

When I mention that I’m not currently employed, people sometimes reach out to see if I’d be interested in a position with their company, which I appreciate but often don’t know how to respond to.
Aug 9, 2020 4 tweets 1 min read
It’s actually not that difficult to understand why people whose foundational worldview is “I feel safe and comfortable when I’m part of a homogeneous in-group that defines the terms of public life” would feel like they’re under siege by the (reasonable) demands from out-groups.

When people say that they feel like their freedoms are being taken away when freedoms are granted to others, this is what they’re referring to: the freedom to live in a social bubble where they don’t have to grapple with differences they find unsettling or threatening.
Jul 27, 2020 54 tweets 19 min read
This story and picture of paint-covered magazines really doesn't add up, so I did some digging and found some interesting stuff. 1/many

First of all, the paint goes entirely unremarked on and makes no sense. However, reporting from various sources indicates that federal agents claim that they are being hit with balloons full of paint. 🤔
Mar 3, 2020 23 tweets 4 min read
I’m not a PM, but I have worked on the home screen for a major streaming service and wrestled with some of these trade-offs as algorithmic recommendations and ranking were introduced. I learned a few things along the way...🧵

Before the home screen is algorithmically ranked, it likely represents the product of many years of messy stakeholder interaction and negotiation, during which they all vied to have their thing at the top.
Sep 6, 2019 14 tweets 4 min read
Building recommenders without machine learning:

Where the privacy risks of recommender systems live:
Jun 12, 2019 14 tweets 3 min read
I used to think that making explainable recommendations required an interpretable model. I was wrong, and in order to understand why, you need to understand a few things about the structure of industrial recommender systems. 🧵

The structure I used to picture involved using interaction data and a model to generate vectors for users and items (matrix factorization, word embeddings, etc.), and then making recs by finding items similar to each user vector with approximate nearest neighbor search.
Jun 8, 2019 25 tweets 4 min read
This is a really interesting story, worth a read and some contemplation by anyone working on recommender systems. As is often the case with media coverage of technology though, I suspect it may contain some claims that are either a stretch or not entirely accurate. 🧵

In particular, this part about how YouTube recommendations were retooled to expand your interests in a project called Reinforce: