New COVID-19 paper thread:

“How the world's collective attention is being paid to a pandemic:
COVID-19 related 1-gram time series for 24 languages on Twitter”

Main site:
compstorylab.org/covid19ngrams/
We make two main contributions:

1. We curate and share usage time series of 1,000 1-grams (words, emojis, hashtags, etc.) that have mattered in March 2020, for each of 24 languages.

We hope other researchers can use these time series to connect with other data streams.
2. We show that after a peak in January 2020 in response to the news from Wuhan of a novel contagious disease, the world’s collective attention dropped through much of February before resurging.
To find meaningful 1-grams for now, we compared Zipf distributions for March 2020 relative to March 2019.

We used our new allotaxonometric equipment on our (also new) language-re-identified Twitter database of 1-grams.
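
A minimal sketch of this kind of year-over-year comparison, for readers who want to try it on their own counts. The thread does not spell out the scoring, so this assumes a simplified rank-turbulence-style score per 1-gram; the function names, the `alpha` value, the handling of missing 1-grams, and the toy counts are all illustrative assumptions, not the lab's implementation.

```python
def ranks(counts):
    """Map each 1-gram to its rank (1 = most used); ties share the average rank."""
    ordered = sorted(counts.items(), key=lambda kv: -kv[1])
    result = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j][1] == ordered[i][1]:
            j += 1
        avg = (i + 1 + j) / 2  # average of ranks i+1 .. j
        for k in range(i, j):
            result[ordered[k][0]] = avg
        i = j
    return result


def rank_shift_scores(counts_then, counts_now, alpha=0.25):
    """Score how far each 1-gram moved in rank between two periods.

    Assumption: a 1-gram missing from one period is given a rank just
    past the end of that period's list (a simplification).
    """
    r_then, r_now = ranks(counts_then), ranks(counts_now)
    miss_then, miss_now = len(r_then) + 1, len(r_now) + 1
    scores = {}
    for gram in set(r_then) | set(r_now):
        a = r_then.get(gram, miss_then)
        b = r_now.get(gram, miss_now)
        scores[gram] = abs(a ** -alpha - b ** -alpha) ** (1 / (alpha + 1))
    return scores


# Toy counts, invented for illustration only.
march_2019 = {"the": 100, "pizza": 50, "ciao": 30}
march_2020 = {"the": 100, "coronavirus": 80, "pizza": 20, "ciao": 10}
scores = rank_shift_scores(march_2019, march_2020)
standout = max(scores, key=scores.get)
```

With these toy counts, the 1-gram that was absent a year earlier and now sits near the top gets the largest score, while a 1-gram whose rank did not change scores zero.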
We looked at 24 of the most common languages on Twitter.

(We had to leave out Japanese, Thai, and Chinese because we are not able to parse them well with our current tools.) Image
Some languages correspond strongly with countries, others are more spread out (e.g., English, Spanish, Arabic).
Here’s an example of how Italian differed, using our allotaxonometric tools, and focusing only on simple words.

The right side is filled with pandemic-related terms that were hardly or not at all being used a year ago. Image
We made lists like this one for each of the first 21 days of March, comparing to the year before, and then found which 1-grams really stood out overall. Image
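
One way to combine per-day lists into an overall ranking, sketched under assumptions: the thread does not say how the 21 daily lists were merged, so this simply sums each 1-gram's daily score. The function name `top_overall` and the toy scores are hypothetical.

```python
from collections import defaultdict


def top_overall(daily_scores, k=20):
    """Return the k 1-grams with the largest summed score across days.

    daily_scores: list of {1-gram: score} dicts, one dict per day.
    Summing is an assumed aggregation, not the paper's stated method.
    """
    totals = defaultdict(float)
    for day in daily_scores:
        for gram, score in day.items():
            totals[gram] += score
    # Sort by total score (descending), breaking ties alphabetically.
    return sorted(totals, key=lambda g: (-totals[g], g))[:k]


# Toy scores for three days, invented for illustration only.
days = [
    {"virus": 0.3, "pizza": 0.1},
    {"virus": 0.2, "quarantena": 0.25},
    {"quarantena": 0.2, "pizza": 0.05},
]
top_two = top_overall(days, k=2)
```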
Here are the top 20 1-grams for our 24 languages, in groups of 6, with languages ordered by overall rank on Twitter.

Languages 1–6: Image
Not all 1-grams are about the pandemic—the world always has many stories running together.

Languages 7–12: Image
But the pandemic dominates the narrative space in a way we’ve never seen before.

Languages 13–18: Image
The last 6 of 24: Image
We share all of these time series (2019/09/01–) for the top 1,000 1-grams on GitLab here:

gitlab.com/compstorylab/c…
We explored the daily rank time series for the word ‘virus’, translated as needed.

1. A late January peak appears across most of the time series, followed by a decline in collective attention.

2. The world slowly relaxed.

3. And the virus spread.
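
For anyone rebuilding a series like this from raw daily counts, here is a minimal sketch of a daily rank time series for one word. The input layout (date string mapped to 1-gram counts), the function name `daily_rank`, and the toy numbers are assumptions for illustration; the shared data set may be organized differently.

```python
def daily_rank(daily_counts, word):
    """Return [(date, rank)] for `word`, sorted by date.

    daily_counts: {date string: {1-gram: count}}.
    Rank 1 is the most used 1-gram that day; rank is None on days
    when the word does not appear at all.
    """
    series = []
    for date in sorted(daily_counts):
        counts = daily_counts[date]
        if word not in counts:
            series.append((date, None))
            continue
        # Rank = 1 + number of 1-grams used strictly more often that day.
        rank = 1 + sum(1 for c in counts.values() if c > counts[word])
        series.append((date, rank))
    return series


# Toy counts, invented for illustration only.
toy = {
    "2020-01-25": {"the": 100, "virus": 90, "a": 80},
    "2020-02-15": {"the": 100, "a": 80, "of": 70, "virus": 10},
    "2020-03-15": {"virus": 120, "the": 100},
}
virus_series = daily_rank(toy, "virus")
```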

The top 12 languages: Image
The next 12 languages: Image
Of the top 24, Catalan stood out as not having a response to the January news of a novel contagious disease in China. Image
Catalonia is right now one of the hardest hit areas:

theguardian.com/world/2020/mar… Image
For a different view, we have bar chart races here:

compstorylab.org/covid19ngrams/…

An excerpt:
Immediately next: We’re working on providing a similar data set for 2-grams.

Yes, we will have time series for “toilet paper” in English.

Until then, we believe/hope there’s much more that can be done with the time series we’ve put together.
Per the paper, some topics to track:

Hand washing, testing, serology, vaccines, masks and protection equipment, social/physical distancing, community support versus loneliness/isolation, closures of schools and universities, economic problems, job loss, and food concerns.
Ending the thread here.

We will attempt to update the data daily, and the paper as need be.

Be safe, help others, and think systems not selves.
Appending arXiv link here:

arxiv.org/abs/2003.12614
