Yves-A. de Montjoye
Associate Professor of Applied Math at @ImperialCollege. SpAd to @DReynders (🇪🇺). Member of @APD_GBA (🇧🇪 DPA). @MIT @MediaLab PhD. Opinions are my own.
May 23, 2023 21 tweets 13 min read
🚨 Does client-side scanning really allow you to detect known illegal content without breaking encryption? In a new @IEEESSP paper, we show that secondary features such as targeted facial recognition could actually be hidden in it. A thread 🧵 There have been long-standing concerns by law enforcement agencies 👮 that encryption, and more recently end-to-end encryption, is preventing them from accessing content sent by users*.

(* some strongly disagree with the notion that law enforcement agencies are "going dark")
Jan 25, 2022 14 tweets 5 min read
🚨 New profiling attack in @NatureComms: we show, using graph neural networks, how interaction data such as messages or Bluetooth close-proximity metadata can be used to uniquely identify individuals over long periods of time. nature.com/articles/s4146… A thread 🧵 Re-identification attacks so far have mostly focused on matching attacks, meaning that the adversary 😈 has access to a subset of the data that is in the "anonymous" dataset. This auxiliary information ranges from gender+zip code+DOB to spatio-temporal points.
Jan 10, 2022 12 tweets 6 min read
🚨 We analyzed in @NatureComms the guarantees of an anonymous “Differentially Private” mobility dataset of 300M @googlemaps users shared with researchers. We believe these guarantees to be based on assumptions that are not met in practice. A thread 🧵 nature.com/articles/s4146… In 2019, @googlemaps shared with researchers a dataset consisting of “trip flow information from over 300 million people world-wide” for “Google users who opted-in to Location History” 🗺️. The data was collected in 2016, and aggregated weekly in regions of roughly 1.27 km².
Aug 5, 2021 9 tweets 4 min read
I wasn’t planning to tweet about this paper as it is under review, but it’s important to get the word out there now: we evaluated 5 perceptual hashing algorithms and found all of them to be vulnerable to a simple black-box adversarial attack arxiv.org/abs/2106.09820 A thread ⤵️ Perceptual hashing-based client-side scanning solutions have been proposed by policy makers and some academics as a “privacy-preserving” solution to detect CSAM and other illegal content even when E2EE is used.
Jul 22, 2021 6 tweets 4 min read
The risk of individuals being re-identified in "anonymous" datasets has long been dismissed as a theoretical academic concern, unlikely to happen in practice. Yesterday, a US priest was shown to be using @Grindr and visiting gay bars. A thread ⤵️ While technical details on how the re-identification occurred are still unclear, the publication reported "correlat[ing] a unique mobile device to Burrill" in an anonymous "app signal dataset" washingtonpost.com/religion/2021/…
May 23, 2021 9 tweets 4 min read
"Anonymized" mobile phone data used to analyze vaccinated users' mobility behavior in the UK. A thread ⬇️ telegraph.co.uk/politics/2021/… cc/ @lilianedwards, @realhamed There are a few interesting things we can learn from the document above: first, the data comes from a CDR dataset of 18M users in the UK and the analysis is likely to have been performed by CKDelta ckdelta.ie
Apr 2, 2020 10 tweets 3 min read
#COVID19 contact tracing apps: we need to go beyond shallow reassurances that privacy is protected. Here are 8 questions we think you should ask. A thread. cpg.doc.ic.ac.uk/blog/evaluatin… Question 1: How do you limit the personal data gathered by the authority? Large-scale collection of personal data can quickly lead to mass surveillance.