Apple have given some interviews today where they explicitly state that the threshold t=30.

Which means the false acceptance rate is likely an order of magnitude *higher* than I calculated in this article.
Someone asked me in a Reddit thread the other day what value t would have to be if NeuralHash had a false acceptance rate similar to other perceptual hashes, and I ballparked it at between 20 and 60... so, yeah.
Some quick calculations with the new numbers:

3-4 photos/day: 1 match every 286 days.
50 photos/day: 1 match every 20 days.
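Those figures are consistent with a per-photo false-accept probability of roughly 1 in 1,000 — my assumption from the earlier article, not a published Apple parameter. A quick sketch:

```python
# Sketch: expected days until the first false match, modelling each
# photo as an independent trial. The default p_false = 1/1000 is an
# assumed per-photo false-accept probability, not an Apple figure.

def days_to_first_match(photos_per_day: float, p_false: float = 1 / 1000) -> float:
    """Expected waiting time (in days) for one false match."""
    return 1 / (photos_per_day * p_false)

print(days_to_first_match(3.5))  # ~286 days at 3-4 photos/day
print(days_to_first_match(50))   # ~20 days at 50 photos/day
```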
Also the fact that they gave a single number for the threshold indicates that they are planning to use a single, global threshold.

Which will result in worse privacy for heavy-use accounts, and will mean the obfuscation can be trivially broken as I explain in the article.
(Because if the threshold is constant, Apple *cannot* adjust the rate of synthetic matches: doing so would mean much more work for the server when running the detection algorithm on accounts with more photos, which is backwards scaling.)
So, to state this as clearly as possible, given all the technical papers and information about the parametrization that Apple has provided:

1. I believe the obfuscation mechanisms this system claims to provide are fundamentally flawed, and easily broken.
This means for accounts with more photos Apple (et al) will be able to calculate how many actual matches you have - with high probability - even before you cross the threshold.

(See here for details)

pseudorandom.resistant.tech/obfuscated_app…
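The core of that break is simple statistics: if synthetic vouchers are injected at a fixed, known rate, their count concentrates around its mean, so the server can subtract it out. A toy sketch (the synthetic rate and all account numbers below are hypothetical, chosen only for illustration):

```python
import math

def estimate_real_matches(observed: int, n_photos: int, s: float):
    """If each photo independently yields a synthetic voucher with a
    fixed probability s, the synthetic count concentrates around
    n_photos * s with standard deviation sqrt(n_photos * s * (1 - s)).
    Subtracting the expectation estimates the real match count."""
    expected_synthetic = n_photos * s
    stddev = math.sqrt(n_photos * s * (1 - s))
    return observed - expected_synthetic, stddev

# Hypothetical heavy-use account: 18,000 photos, synthetic rate 0.5%,
# 120 matches observed by the server.
estimate, uncertainty = estimate_real_matches(120, 18_000, 0.005)
# Estimate ~30 real matches with uncertainty ~9.5: the noise grows
# only with sqrt(n), so large libraries leak their true match count.
```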
2. I believe that for accounts with more photos, the actual probability of hitting enough false positives to cross the threshold is far greater than the 1 in a trillion number Apple have thrown around (~4% a year for 50 photos/day)
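As a sanity check on that kind of figure, the chance of crossing a fixed threshold can be modelled with a Poisson tail. The per-photo false-accept probability below is an assumption (not a published parameter); the point is how strongly the account-level result depends on photo volume:

```python
import math

def p_cross_threshold(photos_per_day: float, p_false: float,
                      t: int = 30, days: int = 365) -> float:
    """P(at least t false matches in `days` days), modelling false
    matches as Poisson with mean photos_per_day * days * p_false."""
    lam = photos_per_day * days * p_false
    below = sum(math.exp(-lam) * lam**k / math.factorial(k) for k in range(t))
    return 1 - below

# A heavy account (50 photos/day) vs. a light one (3-4 photos/day),
# under an assumed per-photo false-accept rate of 1 in 1,000:
heavy = p_cross_threshold(50, 1 / 1000)
light = p_cross_threshold(3.5, 1 / 1000)
```

Under these assumed parameters the heavy account has a non-negligible yearly chance of crossing t=30, while the light account's chance is vanishingly small — which is exactly why a single global threshold treats users so unevenly.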
Hopefully Apple will release a breakdown of how they calculated probabilities in their system and what the boundaries are.

But to be very clear, there are hard functional limits around obfuscation given the design of the system they have proposed.

Apple's new threat model document contains some actual justification for the numbers! (apple.com/child-safety/p…)

They are assuming a 1/100000 false acceptance rate for NeuralHash, which seems incredibly low, and assuming that every photo library is larger than the actual largest one.
Some more information about NeuralHash too. They state they did not train it on CSAM images (which makes one wonder what they *did* train it on).

This 100 million number needs some inspection given that billions of images are exchanged every day.
In 2017, WhatsApp said they were seeing 4.5 billion photos shared per day.

You can't extrapolate a false acceptance rate from 100 million tests.
It is some relief that the 1-in-a-trillion figure seems to derive from the accounts with the heaviest use and not a general average account.

Although, given all those numbers (far = 1:100M, t = 30), it puts the size of the large iPhoto library at around 6M images.
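That back-of-the-envelope can be checked numerically: fix a per-image false-accept probability p and threshold t, then search for the library size at which the Poisson tail probability reaches one in a trillion. A sketch (the values of p tried below are illustrative guesses; Apple has not published its exact parametrization):

```python
import math

def p_account_flag(n_images: int, p: float, t: int = 30) -> float:
    """P(at least t false matches in a library of n_images) under a
    Poisson model with mean n_images * p."""
    lam = n_images * p
    if lam >= t:
        # Tail is large here; the complement sum is numerically fine.
        return 1 - sum(math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
                       for k in range(t))
    # Deep tail: sum P(X = k) upward from k = t to avoid cancellation.
    term = math.exp(-lam + t * math.log(lam) - math.lgamma(t + 1))
    total = 0.0
    for k in range(t, t + 200):
        total += term
        term *= lam / (k + 1)
    return total

def implied_library_size(p: float, t: int = 30, target: float = 1e-12) -> int:
    """Smallest library size whose flag probability reaches `target`
    (e.g. the 1-in-a-trillion figure), found by binary search."""
    lo, hi = 1, 10**12
    while lo < hi:
        mid = (lo + hi) // 2
        if p_account_flag(mid, p, t) < target:
            lo = mid + 1
        else:
            hi = mid
    return lo
```

With p = 1e-6 the implied library lands in the millions of images; with p = 1e-8 it is a hundred times larger — so the quoted library size hinges entirely on which per-image false-accept rate you believe.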
So where are we now:

* Mechanism for choosing synthetic voucher probability still unknown
* Obfuscation algorithm still seems broken
* False positives might be lower, if you believe Apple's experiments were somehow representative of the real world.
And then there are all the policy issues around client-side scanning: how quickly and easily it can be abused and exploited by Apple (et al) to target all kinds of communities, and the inevitable follow-up use to undermine e2e-encrypted messages.
Anyway, keep up the pressure. The fact that Apple felt it necessary to do a PR blitz today, along with releasing new slivers of information regarding parametrization, is a good sign.
Also remember "Screeching voices of the minority"

More from @SarahJamieLewis

12 Aug
As an appendix/follow up to my previous article (a probabilistic analysis of the high level operation of a system like the one that Apple has proposed) here are some thoughts / notes / analysis of the actual protocol.

pseudorandom.resistant.tech/a_closer_look_…
Honestly I think the weirdest thing given the intent of this system is how susceptible this protocol seems to be to malicious clients who can easily make the server do extra work, and can probably also just legitimately DoS the human-check with enough contrived matches.
12 Aug
Daily Affirmation: End to end encryption provides some safety, but it doesn't go far enough.

For decades our tools have failed to combat bulk metadata surveillance, it's time to push forward and support radical privacy initiatives.
Watching actual cryptographers debate about whether or not we should be voluntarily *weakening* encryption instead of radically strengthening threat models makes my skin crawl.
I don't think I can say this enough right? Some of you are under the weird impressions that systems are "too secure for the general public to be allowed access to" and it just constantly blows my fucking mind.
10 Aug
Based on some discussions yesterday, I wrote up a more detailed note on the Apple on-device scanning saga with a focus on the "obfuscation" of the exact number of matches and dived into how one might (probabilistically) break it.

Comments welcome.

pseudorandom.resistant.tech/obfuscated_app…
This isn't the biggest problem with the proposed system. It does however suggest that even if you *really* trust Apple to not abuse their power (or be abused by power) then Apple still needs to release details about system parameters and assumptions.

We can quibble about the exact numbers I used, and the likelihood of the existence of a "prolific parent account" taking 50 photos a day for an entire year but there are *real* bounds on the kinds of users any static threshold/synthetic parameters can sustain.
9 Aug
Also, has anyone else attempted to reverse-engineer how Apple might have arrived at the 1/trillion probability of false account flagging?

Some back of the napkin math, please double check...
If you assume the threshold is >10 false positives over a year to trigger an account (a figure thrown around in the Apple docs), and each person stores ~1024 new photos per year (~3-4/day), then to get a 1/trillion figure your single-instance false positive probability has to be ~1/2000.
You can get that probability if you assume the database being checked against contains ~16M unique hashes and the effective hash size is ~36 bits (NeuralHash hashes appear to be 128-bit, but they are perceptual, not random).

Neither of those values seems absurd given what we know.
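Those two assumptions can be checked against each other in a few lines (all numbers here are the thread's guesses, not Apple's):

```python
import math

# ~16M database hashes in a ~36-bit effective hash space gives the
# per-photo false-positive probability (an assumed parametrization):
p_single = 16_000_000 / 2**36   # ~1/4300, same ballpark as ~1/2000

# ~1024 photos/year at ~1/2000 per photo, account flagged above 10 hits:
lam = 1024 / 2000               # expected false positives per year
p_flag = 1 - sum(math.exp(-lam) * lam**k / math.factorial(k)
                 for k in range(11))
# p_flag comes out around 1e-11 -- within an order of magnitude of a
# 1-in-a-trillion account-flagging claim.
```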
8 Aug
These are fair questions regarding systems like the one Apple has proposed, and there is enough general ignorance about some of the building blocks that I think it is worth attempting to answer them.

But it's going to take way more than a few tweets, so settle in...
First, I'll be incredibly fair to Apple and assume that the system has no bugs - that is, there is no way for a malicious actor inside or outside of Apple to exploit the system in ways it wasn't meant to be exploited.

Idealized constructions only.
At the highest level there is your phone and Apple's servers. Apple has a collection of hashes, and your phone has...well tbh if you are like a large number of people in the world it probably has links to your entire digital life.

We can draw a big line down the centre.
7 Aug
As I have said before, I am willing to be the person who draws a line here, against the calls for "nuance".

There is no room for nuance, because nuance thinks surveillance systems can be built such that they are used only for good, or only to target bad people.

They can't.
This isn't a trivialization of the situation, it *is* the situation.

There has never been a surveillance system in the history of humanity that remained static, unable to grow to the whims of power.

It is our duty to oppose all such systems *before* they become entrenched!

Not to work out how to entrench them with the least possible public outrage at their very existence by shielding their true nature with a sprinkling of mathematics.
