Want to create a Graph Neural Network based recommendation engine? You first have to get intimate with its performance metrics: recall@k and precision@k. Here are a few illustrations I made to help visualize what you are measuring:
First, the definition of precision and recall.
- Precision: "What fraction of the top k elements that we recommended are relevant?"
- Recall: "What fraction of the existing pool of relevant items did we get right in the top k elements?"
In mathematical terms:
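One common way to write these definitions, using symbols introduced here for illustration ($R$ is the set of relevant items, $T_k$ is the set of top-$k$ recommended items):

```latex
\text{precision@}k = \frac{|R \cap T_k|}{k}
\qquad
\text{recall@}k = \frac{|R \cap T_k|}{|R|}
```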
Let's take an example to illustrate this, with simple tabular data first. We have a list of item ratings that we take as the baseline (targets) and we have the recommendation that our model spat out.
- Scores range from 0.0 to 10.0.
- We count an item as relevant if its actual score is above 4.0 (an arbitrary threshold that depends on the use case). In our case items 1, 2, 3 and 5 are relevant.
The first diagram shows the items sorted by the actual scores; the second by the predicted scores given by our model.
So how well did our model do?
For k=3 we have:
- Precision: 2 relevant recommended items out of 3 recommended items => 2/3 ≈ 0.67
- Recall: 2 relevant recommended items from a pool of 4 relevant items => 0.50
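The k=3 arithmetic above can be sketched in a few lines of Python. The relevant set comes from the text; the predicted ranking is a hypothetical order (an assumption, not the values from the original diagram) chosen only so that 2 of the top 3 items are relevant. The function name `precision_recall_at_k` is also just illustrative.

```python
def precision_recall_at_k(ranked_items, relevant_items, k):
    """Return (precision@k, recall@k) for a list ranked best-first."""
    top_k = ranked_items[:k]
    # Count how many of the top-k recommendations are actually relevant.
    hits = sum(1 for item in top_k if item in relevant_items)
    precision = hits / k
    recall = hits / len(relevant_items) if relevant_items else 0.0
    return precision, recall


# Relevant items from the text: actual score above 4.0.
relevant = {1, 2, 3, 5}

# ASSUMPTION: a made-up model ranking (highest predicted score first),
# picked only to agree with the k=3 numbers in the text.
predicted_ranking = [2, 4, 1, 5, 6, 3]

precision, recall = precision_recall_at_k(predicted_ranking, relevant, k=3)
print(f"precision@3 = {precision:.2f}, recall@3 = {recall:.2f}")
```

Note that this sketch divides by k even if fewer than k items were recommended; some implementations divide by the number of items actually returned instead.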
For k=1 and k=4 we have:
Now that we have established this for tabular data, let's see how it works for #graphs. Below is a simple graph showing user purchases. Green lines are existing purchases; red lines are items the user saw but didn't purchase.
Here are the edges that our model thinks should be connected to this user (our prediction):
We got c, e and d right! Every edge has a prediction (confidence value) attached to it. As with our tabular data, we can list these in a table and sort them by score.
Let's calculate k=3.
Finally, k=1 and k=4:
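As a rough sketch of the same procedure on graph data: treat each predicted user-item edge as a row with a score, sort by score, and compare the top k against the user's real purchases. Everything concrete below (the `ScoredEdge` type, the item labels, the scores, and the purchase set) is hypothetical and not taken from the figures; only the idea that c, e and d are correct predictions comes from the text.

```python
from typing import NamedTuple


class ScoredEdge(NamedTuple):
    user: str
    item: str
    score: float  # model confidence that this user-item edge exists


def rank_items(edges, user):
    """Items predicted for one user, sorted by score (highest first)."""
    user_edges = [e for e in edges if e.user == user]
    user_edges.sort(key=lambda e: e.score, reverse=True)
    return [e.item for e in user_edges]


def precision_recall_at_k(ranked, relevant, k):
    """Return (precision@k, recall@k) for a best-first ranking."""
    hits = len(set(ranked[:k]) & relevant)
    return hits / k, hits / len(relevant)


# ASSUMPTION: made-up ground truth (the "green" purchase edges).
purchased = {"b", "c", "d", "e"}

# ASSUMPTION: made-up predicted edges and confidence values.
predicted_edges = [
    ScoredEdge("user", "c", 0.9),
    ScoredEdge("user", "a", 0.8),
    ScoredEdge("user", "e", 0.7),
    ScoredEdge("user", "d", 0.6),
    ScoredEdge("user", "f", 0.3),
]

ranked = rank_items(predicted_edges, "user")
for k in (1, 3, 4):
    p, r = precision_recall_at_k(ranked, purchased, k)
    print(f"k={k}: precision={p:.2f}, recall={r:.2f}")
```

With real data, `predicted_edges` would come from the link prediction model's scores for candidate edges, and `purchased` from held-out edges in the graph.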
If you want to dive deeper into these metrics, why they can't be used to calculate loss, and Graph Neural Network based recommendations (link prediction) in general, I highly recommend Jure Leskovec's (@jure) course at Stanford: CS224W web.stanford.edu/class/cs224w/. It's fantastic.