Latest Twitter Threads by @kerolloz on Thread Reader App

Feb 10, 2022 • 8 tweets • 2 min read

Today I wrote a simple script that traverses a directory and finds duplicate files by comparing their MD5 hash values.

github.com/kerolloz/same-…

I wanted to solve this problem because I have a directory with a huge number of PDF files (books). I was almost sure there were some duplicates among them, maybe even under different names, which is why I compare hashes instead of filenames.

I started by implementing a simple and straightforward script in Node.js.
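
To make the idea concrete, here is a minimal sketch of the approach in Node.js. It is not the code from the linked repo; the function names (`md5OfFile`, `listFiles`, `findDuplicates`) are made up for illustration, and it assumes every file fits in memory, since each one is read in full before hashing.

```javascript
const crypto = require('node:crypto');
const fs = require('node:fs');
const path = require('node:path');

// Return the MD5 hex digest of a file's contents.
// Assumption: the file is small enough to read into memory at once.
function md5OfFile(filePath) {
  return crypto.createHash('md5').update(fs.readFileSync(filePath)).digest('hex');
}

// Recursively collect the paths of all regular files under `dir`.
function listFiles(dir) {
  const files = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      files.push(...listFiles(fullPath));
    } else if (entry.isFile()) {
      files.push(fullPath);
    }
  }
  return files;
}

// Group files by digest and keep only the groups with more than one file.
function findDuplicates(dir) {
  const byHash = new Map();
  for (const file of listFiles(dir)) {
    const hash = md5OfFile(file);
    if (!byHash.has(hash)) byHash.set(hash, []);
    byHash.get(hash).push(file);
  }
  return [...byHash.values()].filter((group) => group.length > 1);
}
```

Calling `findDuplicates` with a directory path returns an array of groups, where each group lists the paths of files that share the same MD5 digest. Filenames never enter the comparison, so copies saved under different names still end up in the same group.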
