There has been so much happening on the Census Bureau’s disclosure avoidance plans that it can be hard to follow. Here’s a quick guide to key developments over the past three months. /1
Back on March 10, the State of Alabama filed a lawsuit objecting to the use of differential privacy in the census, arguing that the infusion of deliberate errors into the data is unconstitutional. /2
The lawsuit also argues that the decision to implement differential privacy and to delay the redistricting numbers is “arbitrary and capricious” in violation of the Administrative Procedure Act. /3
The lawsuit is being reviewed by a three-judge panel. A decision is expected soon. Regardless of the outcome, any appeals will be referred directly to the Supreme Court. /4
My testimony is here: users.hist.umn.edu/~ruggles/censi…
Other documents related to the case are here: brennancenter.org/our-work/court… /5
The Census Bureau argues that differential privacy is needed because of the threat of “database reconstruction.” /6
I argue in this thread that the Census Bureau grossly exaggerated the threat, and that there is no realistic threat to privacy from published tables. /7
I produced a working paper with @dcvanriper that showed that the great majority of purported Census Bureau re-identifications could be explained by chance. /8
assets.ipums.org/_files/mpc/wp2…
On April 28, the Census Bureau released their final “demonstration product,” a noise-infused version of the 2010 census redistricting data so that outsiders could evaluate how the new disclosure avoidance procedures would affect usability of the data. /9
Interested parties had exactly one month to provide feedback to the Census Bureau. Several feedback documents have now appeared publicly, and the results are devastating. /10
For example, @Chris_T_Kenny et al. write: “Our analysis finds that the DAS-protected data are biased against certain areas, depending on voter turnout and partisan and racial composition.” /11 alarm-redist.github.io/posts/2021-05-…
Similarly, @AABeveridge argues that the noise-infused data violate the traditional redistricting principles required by the courts. /12
dropbox.com/s/i9sfutxwj8f1…
@JTomMueller and @AppDemography “find the method introduces significant error into growth rates at the county level for all groups except the total and non-Hispanic white population,” especially in rural areas. /13
osf.io/preprints/soca…
@ipums found “pervasive biases and inconsistencies, high levels of inaccuracy in the counts of minority populations, and isolated large errors in the population counts for particular communities.” /14
users.hist.umn.edu/~ruggles/Artic…
In the midst of all this, on May 20 the Census Bureau gave a presentation at the ACS data users conference in which they announced that the ACS public use microdata files would be replaced with fully synthetic data by 2024. /15
I tweeted about why this would be terrible. /16
On May 27--a few days after Twitter erupted in protest--Acting Census Director @jarmin_ron backtracked from the ACS presentation, tweeting “no decisions have been made or are imminent on introducing a synthetic ACS PUMS.” /17
The newest development will occur on June 4, when the Census Bureau will present a webinar to demonstrate why they cannot revert to traditional methods of statistical disclosure control for the 2020 census. /18
They argue that even if you swap 50% of the households while altering 50% of household sizes and perturbing 70% of tracts, it somehow does not have a significant impact on the percentage of “re-identifications.” /19
I argue that this is because the overwhelming majority of their re-identifications are fake. /20

That brings us up to the present. The saga continues tomorrow. /end

