Dr. Liz Fischer
Medievalist DH researcher | Consultant @WeAreAVP working with GLAM on AI/data R&D | Writing book on #NetworkAnalysis in #BookHistory (they/she)
Jun 23, 2022
This is the part of my dissertation I've been working on for the last couple of months! It's a tool to help split PDF-bound documents (so far, mostly scans of printed books) into "units of interest." I want to share bc I'm pretty dang proud of it 🥰

Ok, so pretend you have a library catalog with entries (this is literally just one of my case studies, but hey). You scan a whole printed catalog of books, and you want to study that as a corpus, but first you have to chop it up. You need each book's info as its own "document."