P. S. Dodds, J. R. Minot, M. V. Arnold, T. Alshaabi, J. L. Adams, A. J. Reagan, and C. M. Danforth
Some questions to ask yourself and others:
What happened in the world over the last two weeks?
What about this time last year? Two years ago?
And what order did the major events happen in?
For Trump’s presidency, how easily could individuals recall and sort these example stories?
- North Korea
- Charlottesville
- kneeling in the National Football League
- Confederate statues
- family separation
- Stormy Daniels
- Space Force
- the possible purchase of Greenland
With storywrangler, we’re hoping to enable or enhance the computational study of any large-scale temporal phenomena where people matter, including culture, politics, economics, linguistics, public health, conflict, climate change, and data journalism.
J. R. Minot, M. V. Arnold, T. Alshaabi, C. M. Danforth, P. S. Dodds
We explore the dynamics of how Twitter users have responded to tweets made by Obama and Trump from their main accounts, @BarackObama and @realDonaldTrump.
For each tweet, we track three main characteristics as they evolve over time:
- Number of Favorites
- Number of Retweets
- Number of Replies (hard to measure—see our paper)
1. We curate and share usage time series of 1,000 1-grams that have mattered in March of 2020 (words, emojis, hashtags, etc.) for 24 languages.
We hope other researchers can use these time series to connect with other data streams.
2. We show that after a peak in January 2020 in response to the news from Wuhan of a novel contagious disease, the world’s collective attention dropped through much of February before resurging.
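To make the idea of a 1-gram usage time series concrete, here is a minimal Python sketch. It is an illustration only, not the pipeline behind the shared data: the function names, the whitespace tokenization, and the toy input are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def daily_relative_frequency(dated_texts):
    """Map each day to {1-gram: share of that day's 1-gram tokens}.

    `dated_texts` is an iterable of (day, text) pairs. Splitting on
    whitespace is an assumed, deliberately crude tokenizer.
    """
    counts_by_day = defaultdict(Counter)
    for day, text in dated_texts:
        counts_by_day[day].update(text.lower().split())

    series = {}
    for day, counts in counts_by_day.items():
        total = sum(counts.values())
        series[day] = {gram: n / total for gram, n in counts.items()}
    return series


def time_series(series, gram):
    """Return (day, relative frequency) pairs for one 1-gram, sorted by day."""
    return [(day, freqs.get(gram, 0.0)) for day, freqs in sorted(series.items())]


# Toy input (hypothetical): ISO dates sort correctly as plain strings.
dated_texts = [
    ("2020-01-20", "virus news virus"),
    ("2020-01-20", "news"),
    ("2020-02-10", "football game tonight virus"),
]
series = daily_relative_frequency(dated_texts)
print(time_series(series, "virus"))
```

Normalizing by each day's total token count, rather than reporting raw counts, keeps days with very different overall volume comparable, which matters when lining a series up against other data streams.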
Now, we stretch out words naturally when we speak.
But stretched words (sometimes called elongated words) are fairly rare in book and other text corpora, and they aren’t represented well in dictionaries (if at all).
So we thought, let’s science this.
Stretchfulness in written text arrived in an abundant, accessible source with Twitter (along with the possible end of civilization but that issue is beyond the scope of our current project).
Dataset: 10% of all (140-character) tweets from September 2008 to the end of 2016.
We crafted* a series of regex-based tweet-sifters for capturing words that are naturally stretched in the wilds of Twitter.
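For a rough sense of what a regex-based sifter can look like, here is a minimal Python sketch. It is not the set of sifters used in the project: the pattern, the three-repeat threshold, and the function names are assumptions chosen for illustration, and it handles only the letters a to z.

```python
import re
from collections import Counter

# Assumed pattern: a run of letters containing one letter repeated
# three or more times in a row (e.g. "sooo", "duuuuude").
STRETCH_RE = re.compile(r"\b[a-z]*([a-z])\1{2,}[a-z]*\b")

# Any run of three or more identical letters.
RUN_RE = re.compile(r"([a-z])\1{2,}")


def find_stretched_words(text):
    """Return the lowercase words in `text` that contain a stretched letter."""
    return [m.group(0) for m in STRETCH_RE.finditer(text.lower())]


def collapse_runs(word):
    """Crudely undo stretching by shrinking each long run to one letter."""
    return RUN_RE.sub(r"\1", word)


def count_stretched(tweets):
    """Count stretched words, grouped by their collapsed form."""
    counts = Counter()
    for tweet in tweets:
        for word in find_stretched_words(tweet):
            counts[collapse_runs(word)] += 1
    return counts


tweets = [
    "Duuuuude that was sooo good",
    "duuude nooooo",
    "I paid 1000 dollars",
]
print(count_stretched(tweets))
```

Collapsing every long run to a single letter is too blunt for real use: "goooood" becomes "god" rather than "good". Deciding which base word a stretched form belongs to is the hard part, and this sketch does not attempt it.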
We ended up with a skosh over 5000 “kernels” for stretchable words: