Canva has more than 20 billion user-uploaded images, with 50 million uploaded daily.
Handling such a large variety of media makes it difficult to moderate content and to minimize unnecessary duplicates.
To solve this problem, Canva uses perceptual hashing with an internally built reverse image search system.
Perceptual hashing, also known as visual hashing or image fingerprinting, is a technique used to create a compact digital representation of an image or video frame based on its visual content.
The goal of perceptual hashing is to generate a hash value that is distinctive to the image's visual content and remains relatively unchanged even if the image undergoes minor transformations such as compression, color correction, or brightness adjustment.
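To make the idea concrete, here is a minimal sketch of one simple perceptual hashing scheme, often called average hash. The source does not say which algorithm Canva uses, so this is an illustrative assumption, not their method. The function name `average_hash` and the tiny 4x4 sample grid are hypothetical; a real pipeline would first downscale the image and convert it to grayscale.

```python
def average_hash(pixels):
    """Hash a small grayscale grid (rows of 0-255 ints) into an integer.

    Each pixel contributes one bit: 1 if it is brighter than the mean
    of the whole grid, otherwise 0.
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


# Hypothetical 4x4 image: dark left half, bright right half.
image = [[10, 20, 200, 210] for _ in range(4)]

# The same image with brightness raised by 30 (clamped to 255).
brighter = [[min(255, p + 30) for p in row] for row in image]

# Both versions compare against their own mean, so the bit pattern survives.
print(average_hash(image) == average_hash(brighter))  # True
```

Because every pixel is compared with the image's own mean, a uniform brightness shift moves the pixels and the mean together, which is why the hash is unchanged here.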
To perform a reverse image search using perceptual hashing, we can compare two hash values by calculating the Hamming distance between them.
Hamming distance measures the dissimilarity between two binary strings of equal length: it is the number of bit positions at which they differ, so a smaller distance between two hashes means more similar images.
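The comparison step can be sketched as follows, assuming hashes are stored as Python integers. The names `hamming_distance` and `find_similar`, the `max_distance` threshold, and the sample hashes are all hypothetical. The linear scan is only for illustration; the source says Canva built its own reverse image search system and does not describe how it indexes hashes.

```python
def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions at which two hashes differ."""
    return bin(a ^ b).count("1")


def find_similar(query: int, known_hashes: list[int], max_distance: int = 2) -> list[int]:
    """Return every known hash within max_distance bits of the query.

    A plain linear scan; a system with billions of hashes would need
    an index instead (not shown here).
    """
    return [h for h in known_hashes if hamming_distance(query, h) <= max_distance]


known = [0b11110000, 0b11110001, 0b00001111]
query = 0b11110011

print(hamming_distance(0b1011, 0b1001))  # 1
print(find_similar(query, known))        # [240, 241]
```

XOR leaves a 1 bit wherever the two hashes disagree, so counting the 1 bits gives the distance directly. The threshold trades recall for precision: a larger `max_distance` catches more heavily edited copies but also more unrelated images.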
Benefits for Canva
1. Reduced Storage Costs
2. Near-unique images served on lookup
3. Content Moderation: Takedown of known illegal images within seconds