Steven Ruggles
Historical demographer and data impresario
Apr 30, 2022 4 tweets 2 min read
The Census Bureau has been struggling ever since it started outsourcing operations to defense contractors in the 1990s, as @dldmag and I argued in our 2020 article on institutional change in the Bureau. The 2000 and 2010 censuses were near disasters. 1/4 academic.oup.com/jah/article/10… The 2020 census faced special challenges stemming from both the pandemic and from Trump’s attempt to add a citizenship question at the last minute. But as @dldmag and I argued, “The greatest concern for the 2020 census is the potential for information technology failure.” 2/4
Aug 24, 2021 35 tweets 5 min read
In our new open-access research brief, @dcvanriper and I argue that the emperor is buck naked. 1/x
rdcu.be/cvT26 The Census Bureau plans a new approach to disclosure control for the 2020 census that will add noise to every statistic the agency produces for places below the state level. /2
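To make "adding noise to every statistic" concrete, here is a minimal sketch of the textbook Laplace mechanism applied to a few counts. It is not the Census Bureau's production algorithm; the function names, the block counts, and the value of epsilon are all hypothetical and chosen only for illustration.

```python
import random

def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponential draws with mean `scale`
    # follows a Laplace distribution centered at 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_count(true_count, epsilon, rng, sensitivity=1.0):
    # Textbook Laplace mechanism: noise scale = sensitivity / epsilon.
    # A smaller epsilon means more noise and a less accurate published count.
    return true_count + laplace_noise(sensitivity / epsilon, rng)

rng = random.Random(0)              # fixed seed so the sketch is repeatable
block_counts = [0, 3, 12, 175]      # hypothetical true counts
noisy = [noisy_count(c, epsilon=0.5, rng=rng) for c in block_counts]

for true, published in zip(block_counts, noisy):
    print(f"true={true:4d}  published={published:8.2f}")
```

Note that the raw noisy values can be negative or fractional, so any real system needs a post-processing step to turn them into publishable counts.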
Aug 18, 2021 7 tweets 2 min read
The town of Carrollton, Mississippi won the Differential Privacy lottery! They really have somewhere in the neighborhood of 175 people, but the 2020 Census "counted" over twice as many, 423! /1 thetaxpayerschannel.org/news.php?news_… The discrepancy is mostly due to one block, where there are no households but 214 persons. The only building on the block is the courthouse, and nobody lives there. /2
Aug 16, 2021 6 tweets 2 min read
The Census Bureau adopted a global privacy budget of ε=19.61 for the PL-94-171 redistricting data file. What does that imply?
According to differential privacy co-inventor @frankmcsherry it means that the Census Bureau privacy protections are pointless.
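One way to see what a privacy budget of that size implies: under the standard definition of ε-differential privacy, the probability of any output can change by a factor of at most e^ε when one person's record changes. The sketch below just evaluates that bound. The function name and the comparison value ε=1 are illustrative assumptions, and the calculation covers only the pure-ε definition, not the specific accounting the Bureau used.

```python
import math

def max_probability_ratio(epsilon):
    # Pure epsilon-DP guarantee: for neighboring databases D and D',
    # P[M(D) in S] <= exp(epsilon) * P[M(D') in S].
    return math.exp(epsilon)

EPSILON_PL94 = 19.61   # global budget quoted in the thread
EPSILON_SMALL = 1.0    # hypothetical comparison value

for eps in (EPSILON_SMALL, EPSILON_PL94):
    print(f"epsilon={eps:5.2f}  max ratio={max_probability_ratio(eps):.3e}")
```

With ε=19.61 the bound is on the order of 3 × 10^8, so the formal guarantee allows output probabilities to differ by a factor of hundreds of millions.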
Jul 30, 2021 6 tweets 3 min read
Newly-available data show that the 2020 Census will be the worst ever with respect to one key metric: Item Non Response (INR), which occurs when people are counted but the census does not capture their characteristics. This graph compares INR in 2010 and 2020 for sex and age. /1 These graphs were obtained through a recent FOIA request and appeared in a court filing last week (1:21-cv-01361-ABJ). DRF1 (Decennial Response File 1) is the raw data, and DRF2 has the duplicates removed. Here are the INR graphs for Hispanic Origin and Race. /2
Jul 3, 2021 8 tweets 3 min read
Here is a screenshot from yesterday's Census Bureau Webinar on new specifications for the 2020 census. The table shows the crazy inconsistencies in the block-level data, comparing the version they adopted April 28 with the new version just announced./1 The demonstration data released in April was terrible, as we and others explained.
We were expecting the new version to be more accurate than the previous one, but for blocks it turned out even worse./2
users.pop.umn.edu/~ruggles/Artic…
Jul 1, 2021 10 tweets 2 min read
It has now become clear that the 2020 Census will not provide block-level statistics usable for planning or research./1 Newly-published data reveal that the Census Bureau has increased the "noise" added to the data at the block level, compared with the demonstration data released in April./2
census.gov/programs-surve…
Jun 9, 2021 12 tweets 3 min read
Back in December 2018, Mark Hansen wrote an article on differential privacy in the NYT. To explain how the Census protected identities in 2010, he cited the case of the only two residents of Liberty Island in New York harbor, who oversee the national monument./1 Liberty Island is considered a block by the Census Bureau, even though it only has two residents. The actual residents of the island were a married couple, aged 59 and 49, who both identified as white and who were interviewed by Hansen./2
Jun 8, 2021 9 tweets 2 min read
Somehow I missed this subtweet. @frankmcsherry finds it "mystifying" that "(some) demographers" (i.e. me) have "contempt for the privacy of their subjects." In his blog quotes, he screenshots my tweets and characterizes them as "just embarrassing."/1 In my tweets, I was objecting to @john_abowd's characterization of a 45% match rate between his so-called database reconstruction and the actual data on four characteristics as "highly accurate."
Jun 3, 2021 21 tweets 6 min read
There has been so much happening on the Census Bureau’s disclosure avoidance plans that it can be hard to follow. Here’s a quick guide to key developments over the past three months. /1 Back on March 10, the State of Alabama filed a lawsuit objecting to the use of differential privacy in the census, arguing that the infusion of deliberate errors into the data is unconstitutional./2
Jun 2, 2021 12 tweets 3 min read
The Census Bureau has conducted a new analysis that purports to show that swapping is ineffective for the prevention of reidentification attacks. /1
www2.census.gov/about/partners… The new analysis completely misses the point, and actually provides a useful demonstration of the gross misrepresentation of the Census Bureau’s “Database Reconstruction Experiment.” /2
May 21, 2021 30 tweets 5 min read
/1. Yesterday at the ACS Data Users Conference, the Census Bureau described its plans to replace the American Community Survey (ACS) microdata with “fully synthetic” data over the next three years. /2. Details of the methodology have not been disclosed, but the idea is to develop models describing the interrelationships of all the variables in the ACS, and then construct a simulated population consistent with those models.
May 19, 2021 13 tweets 3 min read
/1. @samwang misinterprets the second declaration of John Abowd in Alabama v. Department of Commerce. /2. Abowd states that in tiny blocks, if you “reconstruct” age and it matches someone who lives on the block in the commercial database, and then look up the names of those people in the census, the census recorded the same people 72.24% of the time.
May 15, 2021 20 tweets 4 min read
/1. The Census Bureau plans to add intentional errors to the 2020 census to protect the confidentiality of census respondents. The Census Bureau insists that the intentional error is necessary to combat the threat of “database reconstruction.” /2. Database reconstruction is a process for inferring individual-level responses from tabular data. The Chief Scientist of the Census Bureau asserts that database reconstruction “is the death knell for traditional data publication.”
Apr 20, 2021 9 tweets 2 min read
1. I prepared a report for the Plaintiffs in the Alabama v. Department of Commerce lawsuit over differential privacy in the census, available here: users.hist.umn.edu/~ruggles/censi… 2. I argue that the database reconstruction experiment did not demonstrate a convincing threat to confidentiality, because the results reported by the Census Bureau can be largely explained by chance.
Jul 5, 2019 26 tweets 7 min read
What we have learned about the Census Bureau’s implementation of differential privacy.

In September 2018, the Census Bureau announced new confidentiality standards that mark a “sea change for the way that official statistics are produced and published.” 1/ The new system, known as Differential Privacy (DP), will be applied first to the 2020 census, and “will then be adapted to protect publications from the American Community Survey and eventually all of our statistical releases.” 2/