Update on Part 2 of my article series discussing #NLP/data science analysis of the Mueller Report:

• new "clean" source of OCRed text found, and will be used for all further articles in the series
• I found an excellent "coreference" tool that I will demonstrate and train

"coreference": when a noun (e.g. a full name) is used once in a paragraph and is then referred to later in another form (e.g. a last name or a pronoun). This screenshot from a web-based demo shows how three "coreferences" have been de-referenced to their root.

[Image: screenshot in which "Manafort" and "He" are shown with arrows pointing to "Paul Manafort", indicating that those words were de-referenced back to the full name. https://pbs.twimg.com/media/D5I9_PiWsAAtSaH.jpg]

I have trained this particular tool using the list of references provided at the end of the report, and it works well.
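
To make the idea concrete, here is a toy sketch of de-referencing in Python. It is not the tool mentioned above, and it is far cruder than a trained model: it assumes you already have a list of full names, treats a bare last name as a reference to the matching full name, and points "he"/"she" at whichever person was mentioned most recently. The function name `resolve_mentions` and the sample sentence are made up for illustration.

```python
import re


def resolve_mentions(text, full_names):
    """Pair each last-name or pronoun mention with a full name.

    Toy heuristic (an assumption, not a real coreference model):
    a last name maps to its full name, and a pronoun maps to the
    most recently mentioned person.
    """
    last_to_full = {name.split()[-1]: name for name in full_names}
    # Longest alternatives first so "Paul Manafort" wins over "Manafort".
    names = sorted(list(full_names) + list(last_to_full), key=len, reverse=True)
    alternatives = [re.escape(n) for n in names] + ["He", "he", "She", "she"]
    pattern = re.compile(r"\b(" + "|".join(alternatives) + r")\b")

    resolved = []
    most_recent = None
    for match in pattern.finditer(text):
        mention = match.group(1)
        if mention in full_names:
            # A full name needs no resolving; just remember it.
            most_recent = mention
        elif mention in last_to_full:
            most_recent = last_to_full[mention]
            resolved.append((mention, most_recent))
        elif most_recent is not None:
            # Pronoun: assume it refers to the last person mentioned.
            resolved.append((mention, most_recent))
    return resolved


# Invented example sentence, not a quote from the report.
sample = "Paul Manafort arrived at noon. Manafort sat down. He ordered coffee."
print(resolve_mentions(sample, ["Paul Manafort"]))
```

A heuristic like this breaks as soon as two people share a paragraph, which is exactly why a trained tool is worth the effort.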

• de-referencing relative dates, and how I did it
• showing only relevant tweets from the correlated timeline using simple text classification tricks
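
As a rough idea of what de-referencing relative dates can look like, here is a minimal Python sketch. It is only a guess at the general approach, not the code from the article: it assumes the nearest explicit date has already been found and is passed in as `anchor`, and it only understands phrases shaped like "two days later" or "3 weeks earlier". The names `resolve_relative_dates`, `NUMBER_WORDS`, and `UNIT_DAYS` are hypothetical.

```python
import re
from datetime import date, timedelta

# Small lookup tables; extend as needed (assumed, not from the article).
NUMBER_WORDS = {
    "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
}
UNIT_DAYS = {"day": 1, "week": 7}

RELATIVE = re.compile(
    r"\b(\d+|[a-z]+) (day|week)s? (later|earlier)\b", re.IGNORECASE
)


def resolve_relative_dates(text, anchor):
    """Return (phrase, absolute_date) pairs, measured from `anchor`."""
    results = []
    for match in RELATIVE.finditer(text):
        count_text = match.group(1).lower()
        unit = match.group(2).lower()
        direction = match.group(3).lower()
        if count_text.isdigit():
            count = int(count_text)
        elif count_text in NUMBER_WORDS:
            count = NUMBER_WORDS[count_text]
        else:
            continue  # e.g. "several days later": too vague to resolve
        delta = timedelta(days=count * UNIT_DAYS[unit])
        resolved = anchor + delta if direction == "later" else anchor - delta
        results.append((match.group(0), resolved))
    return results


# Invented sentence and anchor date, for illustration only.
print(resolve_relative_dates("Two days later, a reply arrived.", date(2020, 1, 30)))
```

The hard part in practice is picking the right anchor for each phrase; this sketch sidesteps that by taking the anchor as an argument.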

I'll be aiming to finish Part 2 by Sunday. This isn't my day job at (looking to change that!); I only have off-hours to spare. The next part should be an easier read for non-tech folks, and have a nice pay-off at the end about social networks and the text you write online.

The next stage of the timeline is going to be much nicer! If you'd like to show your support for this effort, I always appreciate likes/retweets/comments/questions. If you're able, the gift of Venmo is a great way to keep me caffeinated while I analyze the report. 😊

[Image: Venmo QR code for scanning with the Venmo app.] My username on Venmo is spdustin (s-p-d-u-s-t-i-n)

Part 2 will also be the first to be split into two versions: one, intended for everyone, and written to teach you about the tech behind the analysis; and two, the code itself, separate from the article.

Thanks for your support!

Part 1 is here:


Thread by Dustin Miller