This is cool. Earlier this year I looked into the privacy of FMD (fuzzy message detection, by @gabrie_beck et al), including simulations of attacks on realistic datasets.

Now, @Istvan_A_Seres et al have performed their own analysis and, in addition, have shown attack improvements on those same datasets.
You can find my original dive into those datasets as part of the book I put together for fuzzytags (a Rust implementation of FMD):

docs.openprivacy.ca/fuzzytags-book…
The attack improvements come from considering temporal relationships (the probability of receiving more than a given threshold of messages within a period of time) rather than only looking over the lifetime of the system.

This can be devastating if false positive rates are poorly selected.
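
A rough sketch of why the time window matters (an illustration under stated assumptions, not the analysis from either write-up): assume every unrelated message matches a recipient's detection key independently with false positive rate p, and that the server counts matches per window. The window size, the match counts and the function names below are all made up for illustration.

```python
from math import exp, lgamma, log

def binomial_tail(n: int, k: int, p: float) -> float:
    """P[X >= k] for X ~ Binomial(n, p), summed in log space to avoid overflow."""
    def log_pmf(i: int) -> float:
        return (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                + i * log(p) + (n - i) * log(1 - p))
    return min(1.0, sum(exp(log_pmf(i)) for i in range(k, n + 1)))

# Hypothetical numbers: p = 2^-3, ten windows of 1000 messages each.
FP_RATE = 2 ** -3
WINDOW_MESSAGES = 1000
# Nine quiet windows at the expected 125 false positives, plus one window
# with 40 extra matches (a burst of real messages).
window_matches = [125] * 9 + [165]

# How surprising is the burst window when looked at on its own?
burst_p_value = binomial_tail(WINDOW_MESSAGES, 165, FP_RATE)

# How surprising is the same traffic when only lifetime totals are considered?
lifetime_p_value = binomial_tail(WINDOW_MESSAGES * len(window_matches),
                                 sum(window_matches), FP_RATE)

print(f"burst window p-value: {burst_p_value:.2e}")
print(f"lifetime p-value:     {lifetime_p_value:.2e}")
```

Under these assumptions the burst stands out sharply inside its own window, while the lifetime total stays within ordinary noise. That is the intuition behind looking at windows rather than the whole history.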
One thing this new analysis does not consider is the existence of what I call "entangled tags" (see: docs.openprivacy.ca/fuzzytags-book…).

Basically, FMD schemes permit anyone to efficiently forge tags that match multiple users 100% of the time.
I recently released an update to fuzzytags that makes use of AVX2 speedups in dalek ristretto to allow a consumer desktop to produce a completely entangled tag for 2 parties in ~79 seconds.
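
A back-of-envelope sketch of what that figure implies, under assumptions that are not measured from fuzzytags: tags are 24 bits long (matching the 24-bit bound mentioned later in the thread), forging is a brute-force search, each candidate matches an additional party on each bit independently with probability 1/2, and the ~79 seconds is treated as an average run. The function names are hypothetical.

```python
FULL_BITS = 24          # assumption: tag length, matching the 24-bit bound in the thread
MEASURED_SECONDS = 79   # from the thread: ~79 s to fully entangle a tag for 2 parties

def expected_trials(bits: int, parties: int) -> int:
    """Average candidates needed if each extra party matches each bit with probability 1/2."""
    return 2 ** (bits * (parties - 1))

def implied_trials_per_second() -> float:
    """Search rate implied by treating the ~79 s figure as an average run."""
    return expected_trials(FULL_BITS, 2) / MEASURED_SECONDS

def estimated_seconds(bits: int, parties: int) -> float:
    """Estimated time to entangle `parties` recipients over the first `bits` bits."""
    return expected_trials(bits, parties) / implied_trials_per_second()

for bits, parties in [(8, 2), (16, 2), (24, 2), (24, 3)]:
    print(f"{bits} bits, {parties} parties: ~{estimated_seconds(bits, parties):.3g} s")
```

On this model, entangling two parties over only a few bits is effectively free, while fully entangling a third party multiplies the cost by another factor of 2^24.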

But, importantly, under the FMD threat model the routing server can only perform attacks using information about false positive rates far coarser than 2^-24 (detection keys covering well under 24 bits), which means that you can partially entangle a tag to multiple parties in a way that the server cannot distinguish.
And further, you can do this both altruistically (to hide that you are sending a message to someone by also entangling it to someone else) and maliciously (to implicate someone else in a deniable way).
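
To illustrate the idea of partial entanglement, here is a toy search. It is not the FMD construction and not the fuzzytags API: each "bit" is just the low bit of a SHA-256 hash, and one byte string stands in for both a party's public key and their detection key (the real scheme keeps those separate). All names are hypothetical.

```python
import hashlib
from itertools import count

def key_bit(key: bytes, index: int, nonce: int) -> int:
    """Toy stand-in for one tag bit: low bit of SHA-256(key || index || nonce)."""
    data = key + index.to_bytes(2, "big") + nonce.to_bytes(8, "big")
    return hashlib.sha256(data).digest()[0] & 1

def make_tag(key: bytes, nonce: int, bits: int):
    """Build a tag that matches `key` on every one of `bits` positions by construction."""
    return nonce, [key_bit(key, i, nonce) for i in range(bits)]

def matches(key: bytes, tag, n: int) -> bool:
    """The test available to a server holding an n-bit detection key for `key`."""
    nonce, cipher = tag
    return all(key_bit(key, i, nonce) == cipher[i] for i in range(n))

def entangle(key_a: bytes, key_b: bytes, bits: int, n: int):
    """Try nonces until the tag built for key_a also matches key_b on the first n bits."""
    for nonce in count():
        tag = make_tag(key_a, nonce, bits)
        if matches(key_b, tag, n):
            return tag, nonce + 1  # the tag and the number of candidates tried

# Hypothetical parties; the server is assumed to hold 10-bit detection keys.
alice, bob = b"alice-key", b"bob-key"
tag, trials = entangle(alice, bob, bits=24, n=10)
print("candidates tried:", trials)
print("server test for alice:", matches(alice, tag, 10))
print("server test for bob:  ", matches(bob, tag, 10))
```

In this toy, a server limited to 10-bit tests sees the tag match both parties, exactly as it would for a tag genuinely addressed to either one. Only someone able to test more bits could hope to tell the intended recipient apart.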
I'm currently working on a project called Niwl which is best described as a mixnet design that makes heavy use of fuzzy message detection with entangled tags to improve both decentralization and auditability.

git.openprivacy.ca/openprivacy/ni…
Basically, by adding mix nodes to an FMD scheme you can allow those nodes to take on the bandwidth-heavy and altruistic anonymity functions to provide for bandwidth-light clients...
...those clients can, in addition, make use of entangling to check that mix nodes are acting honestly without adding additional traffic to the network (by tagging some messages to their contact AND themselves).
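
A hedged sketch of how much checking this buys, under a simplified model that is not taken from the Niwl design: assume a misbehaving mix node drops each message independently with probability d and cannot tell which messages are also tagged to the sender. Then the chance it gets through s self-tagged messages without a single detected drop is (1 - d)^s. The function names are hypothetical.

```python
from math import ceil, log

def evasion_probability(drop_rate: float, probes: int) -> float:
    """Chance that a node dropping messages at `drop_rate` loses none of `probes` self-tagged messages."""
    return (1 - drop_rate) ** probes

def probes_needed(drop_rate: float, confidence: float) -> int:
    """Smallest number of self-tagged messages that catches the node with at least `confidence`."""
    return ceil(log(1 - confidence) / log(1 - drop_rate))

for drop_rate in (0.5, 0.1, 0.01):
    needed = probes_needed(drop_rate, 0.95)
    print(f"drop rate {drop_rate}: {needed} self-tagged messages for 95% detection")
```

The point of the model is only that occasional drops need many more self-tagged messages to catch than blatant ones, so the share of messages a client tags to itself is a real design parameter.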
There are a couple of other neat tricks you can do as well, like entangling a tag to both a well-known mix node AND a contact, or entangling a tag to two different mix nodes.

See the fuzzytags book for a write-up on generic strategies: docs.openprivacy.ca/fuzzytags-book…
I'll add that all this comes with a very large hic sunt dracones warning - all of this is an experimental design that requires more analysis and testing.

git.openprivacy.ca/openprivacy/ni…
