Tyler Dukes
Jul 28, 2021, 15 tweets, 6 min read
CDC now says residents of counties with "high" or "substantial" community spread should mask up indoors – even if they're vaccinated.

For NC, that's all but 21 counties.
CDC publishes a US map showing "level of community transmission," but no way to download the data (that I can see at least).

covid.cdc.gov/covid-data-tra…

But what if we REALLY WANT the data? Follow along for a few tips on prying it loose.
With some exceptions, interactives like these pipe in structured data from another source.

Maybe it's a CSV (comma-separated values).

Or a JSON (JavaScript Object Notation).

However it's structured, we can work with it – if we know where to find it.
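To make the CSV/JSON distinction concrete, here's a rough Python sketch that reads the same two records from each format. The county names and field names are made up for illustration; they are not taken from the CDC page.

```python
import csv
import io
import json

# Made-up sample data: the same two records, once as CSV and once as JSON.
csv_text = "county,level\nExample A,substantial\nExample B,high\n"
json_text = (
    '[{"county": "Example A", "level": "substantial"},'
    ' {"county": "Example B", "level": "high"}]'
)

# csv.DictReader turns each CSV row into a dict keyed by the header row.
rows_from_csv = list(csv.DictReader(io.StringIO(csv_text)))

# json.loads turns the JSON array into a list of dicts.
rows_from_json = json.loads(json_text)

# Both formats end up as the same list of records.
print(rows_from_csv == rows_from_json)
```

Either way you end up with rows and columns, which is why it doesn't matter much which format the page happens to use.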
Luckily, most modern browsers (I'm using Chrome) can help us track it down.

Right-click on the page and click "Inspect" to pull up a panel that allows you to look under the hood.

What we're looking for is the "Network" tab, which doesn't look very interesting right now.
If we refresh, the Network tab shows us all kinds of external files pulled in to render the page, from images to basic styling.

But let's narrow the field.
Near the top of the Network tab, you'll see a row of options allowing you to filter. What we want is the "Fetch/XHR" filter.

That's going to be all manner of structured data loaded into the page, some of it not terribly interesting.

But if we refresh again...
We can see, for example, that the page is loading a JSON called "colors."

Click on a row to see a preview, and – more importantly – the URL of the data itself.
Not everything looks like a "static" file per se. Sometimes you'll find a "call" to an Application Programming Interface (or API): a request for data made by passing certain choices (or parameters) to a specific URL.

Kind of like ordering from a menu.
You can get a clue about what the CDC page is "ordering up" by looking at the parameter, in this case what comes after "id=".

I'm particularly interested in the menu request for "integrated_county_latest_external_data". Sounds delicious.
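If you'd rather read the "order" programmatically than squint at the URL, Python's standard library can split out the parameters. A minimal sketch: the host and path below are placeholders (the real request URL is truncated in this thread), and only the "id" value comes from the page.

```python
from urllib.parse import parse_qs, urlparse

# Placeholder URL: example.org and /data are assumptions, not the CDC's address.
url = "https://example.org/data?id=integrated_county_latest_external_data"


def query_params(url):
    """Return the query-string parameters of a URL as a plain dict."""
    parsed = urlparse(url)
    # parse_qs maps each key to a list of values; keep the first of each.
    return {key: values[0] for key, values in parse_qs(parsed.query).items()}


params = query_params(url)
print(params["id"])
```

Swapping the value after "id=" is how you'd ask the same endpoint for a different item on the menu, assuming the API offers one.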
If we click through and take a closer look, we can preview and expand the items to see there is a LOT of data in here for every US county.

And it just so happens to contain exactly the variable we want: "community_transmission_level"
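Once you know the field name, you can hunt for it without knowing exactly how the JSON is nested. This is a sketch under assumptions: the sample payload below is invented, since the thread doesn't show the real response's layout, and only "community_transmission_level" is a name from the page.

```python
from collections import Counter


def records_with_field(node, field):
    """Recursively collect every dict in a parsed JSON tree that has `field`."""
    found = []
    if isinstance(node, dict):
        if field in node:
            found.append(node)
        for value in node.values():
            found.extend(records_with_field(value, field))
    elif isinstance(node, list):
        for item in node:
            found.extend(records_with_field(item, field))
    return found


# Invented payload for illustration; the wrapper and county keys are hypothetical.
sample = {
    "wrapper": [
        {"county": "Example A", "community_transmission_level": "high"},
        {"county": "Example B", "community_transmission_level": "low"},
        {"county": "Example C", "community_transmission_level": "high"},
    ],
    "meta": {"note": "not a record"},
}

records = records_with_field(sample, "community_transmission_level")

# Tally how many records fall into each level.
level_counts = Counter(r["community_transmission_level"] for r in records)
print(level_counts)
```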
In the "Headers" tab, the Request URL leads us directly to the data source: covid.cdc.gov/covid-data-tra…

It might look like gibberish, but it's actually sweet, sweet structured data! And if you have a browser extension like JSONView, reading is a little easier: chrome.google.com/webstore/detai…
From there, you can hit CMD+S or Ctrl+S to save your JSON file locally. There are a few free converters out there to turn it into CSV, which you can open in a spreadsheet program (or Google Sheets).

I tried this one from @konklone, and it worked great! konklone.io/json/
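If you'd rather skip the online converter, the JSON-to-CSV step can also be sketched in a few lines of Python. This assumes you've already pulled out a flat list of records (one dict per county); the sample records and the function name are made up for illustration.

```python
import csv
import io


def json_records_to_csv(records, out_file):
    """Write a list of flat dicts to out_file as CSV and return the header."""
    # Build the header from every key seen, in first-seen order.
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)
    writer = csv.DictWriter(out_file, fieldnames=fieldnames)
    writer.writeheader()
    # Records missing a key get an empty cell for that column.
    writer.writerows(records)
    return fieldnames


# Made-up records standing in for the saved JSON.
records = [
    {"county": "Example A", "community_transmission_level": "high"},
    {"county": "Example B", "community_transmission_level": "low"},
]

out = io.StringIO()
header = json_records_to_csv(records, out)
print(out.getvalue())
```

To write a real file, pass something like open("counties.csv", "w", newline="") instead of the StringIO; the filename is just an example.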
Or if you're an R fan, you can pipe the JSON directly from the URL using the jsonlite and tidyverse packages with a few lines of hastily written code that might look like this: gist.github.com/mtdukes/54d198…
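For Python folks, a rough equivalent of reading the JSON straight from the URL needs only the standard library. This is not the code in the gist, just a sketch of the same idea; load_json is a name invented here, and you'd paste in the Request URL from the "Headers" tab yourself.

```python
import json
from urllib.request import urlopen


def load_json(url, timeout=30):
    """Fetch a URL and parse the response body as JSON."""
    with urlopen(url, timeout=timeout) as response:
        # Assumes the server sends UTF-8, which is typical for JSON.
        return json.loads(response.read().decode("utf-8"))


# Usage (commented out so nothing is fetched by accident):
# data = load_json("PASTE-THE-REQUEST-URL-HERE")
# print(type(data))
```

Some sites reject requests that don't look like they come from a browser, so if this fails where your browser succeeds, that's one possible reason.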
Either way, now you've got a handy dataset you can sort, filter and do some quick analysis on.

And you can REPEAT this every time the CDC updates.

Better yet, you can listen to @simonw tell you how to automate that repetition. simonwillison.net/2021/Mar/5/git…
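The approach described at that link leans on Git and scheduled jobs to track changes. As a much simpler stand-in (not the same method), here's a sketch of one piece of the repetition: only saving a new copy when the downloaded data actually differs from what's on disk. The function name and file path are made up.

```python
from pathlib import Path


def save_if_changed(content: bytes, path: Path) -> bool:
    """Write content to path only if it differs; return True if written."""
    if path.exists() and path.read_bytes() == content:
        return False  # Nothing new since the last run.
    path.write_bytes(content)
    return True


# Usage (commented out so no file is written by accident):
# changed = save_if_changed(downloaded_bytes, Path("county_data.json"))
# print("updated" if changed else "no change")
```

Run something like this on a schedule and the True/False result tells you whether there's fresh data worth a look.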
This technique is worth a shot if you're stymied by a site that won't fork over the raw data.

If you're lucky, you can pry loose in a few minutes what it might take days/weeks to request.

If you're unlucky, it's probably because it's Tableau.

But that's a separate thread.
