I wanted to solve this problem because I have a directory with a huge number of PDF files (books).
I was almost sure there were some duplicates among these files, possibly under different names, which is why I compare them by a hash of their contents rather than by file name.
I started by implementing a straightforward script in Node.js.