While I wait for the GPU to churn through 2*26^11 possibilities, a brief recap of how we got here. It started when I noticed this bit of code in the @GitHubCopilot Visual Studio Code extension that detects naughty words in either the prompt or the suggestions.
Because the words are hashed, we have to guess what words might be in the list, compute the hash of each, and then check to see if it's in the list (just like cracking password hashes).
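The guess-hash-check loop is easy to sketch. The extension's actual hash function isn't reproduced in this thread, so the `word_hash` below assumes a DJB2-style `h = h*33 + c mod 2^32` (consistent with the multiply-by-33 structure mentioned later in the thread); everything else is the standard dictionary-attack pattern.

```python
def word_hash(word: str) -> int:
    # Assumed DJB2-style hash, NOT the extension's exact function.
    h = 0
    for ch in word:
        h = (h * 33 + ord(ch)) & 0xFFFFFFFF  # stay in 32 bits
    return h

def crack(targets: set[int], wordlist: list[str]) -> dict[int, str]:
    """Hash every candidate and keep those whose hash is in the target list."""
    return {word_hash(w): w for w in wordlist if word_hash(w) in targets}
```

Feed it ever-larger wordlists and the set of unknown hashes keeps shrinking.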
This managed to decode about 75% of the list right off the bat, and turned up some weird entries, like "israel" and "communist".
I also figured out what it's using the list for: it will suppress autocomplete suggestions if the suggestion contains a slur. So it has trouble completing a list of Near East countries, for example:
I tried ever-larger wordlists from increasingly dubious sources (a dump of 4chan's /pol/ archive proved particularly fruitful) and managed to get 95% of the list decoded. But the last 5% is *hard*.
So it was time to apply absurd amounts of computer science to the problem. After porting the hash algorithm from JavaScript to C, I started off by using symbolic execution (a fork of @kleesymex that supports floating point) to generate solutions.
This line of attack got much faster with the help of @RolfRolles's Z3 skills and an observation from @saleemrash1d (independently noticed by @ESultanik) that the hash function could be converted to pure integer math.
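Why the integer conversion matters: the extension computes its hash in JavaScript, i.e. in IEEE-754 doubles. Under the assumed `h = h*33 + c mod 2^32` form (an assumption; the real constants aren't shown here), every intermediate stays far below 2^53, so the floating-point arithmetic is exact and can be swapped for 32-bit integer math, which solvers reason about much more cheaply. A sketch of the equivalence:

```python
import math

def hash_float(word: str) -> float:
    # Double-precision version: h*33 + c < 2^38, well under 2^53,
    # so the arithmetic and fmod are bit-exact.
    h = 0.0
    for ch in word:
        h = math.fmod(h * 33.0 + ord(ch), 2.0 ** 32)
    return h

def hash_int(word: str) -> int:
    # Pure integer version, the form an SMT solver handles happily.
    h = 0
    for ch in word:
        h = (h * 33 + ord(ch)) & 0xFFFFFFFF
    return h
```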
One of the best finds from the Z3 approach was the discovery that "q rsqrt" had been added to the bad word list to prevent Copilot from spitting out a piece of the Quake III source code
I also started doing some amateur cryptanalysis of the hash function. My crypto skills aren't great, but I managed to get one word out of this by discovering that you can use already-decoded words to learn things about undecoded ones.
I also decided to take the analogy to password cracking seriously and wrote a plugin for John the Ripper, a popular password cracking tool. This uncovered quite a few more words.
At this point we only had 5 or so hashes left. And with the help of Z3, we could generate tens of thousands of possible candidates for each hash—but looking through all of them to tell which one was most likely was infeasible.
Time for some machine learning. Large language models like GPT-2 are not only good at *generating* text, they can be used to evaluate how plausible some text is according to the model.
If you ever find yourself in a situation where you need to evaluate whether a word or sentence is plausible English according to GPT-2, lm-scorer does the job nicely github.com/simonepri/lm-s…
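I won't reproduce a GPT-2 setup here; as a dependency-free stand-in for the same idea (rank candidates by their log-probability under a language model), here's a toy character-bigram scorer. The corpus and candidate strings are made up for illustration.

```python
import math
from collections import Counter

# Toy "language model": character-bigram counts from a small corpus.
corpus = "the quick brown fox jumps over the lazy dog " * 50
bigrams = Counter(zip(corpus, corpus[1:]))
unigrams = Counter(corpus)

def log_prob(s: str) -> float:
    """Sum of log P(next char | current char), with add-one smoothing."""
    total = 0.0
    for a, b in zip(s, s[1:]):
        total += math.log((bigrams[(a, b)] + 1) / (unigrams[a] + 27))
    return total

# Rank candidates: the English-looking one scores higher.
candidates = ["the dog", "xqzjvkw"]
best = max(candidates, key=log_prob)
```

GPT-2 via lm-scorer does the same thing, just with a vastly better model of English.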
I also felt, around this time, like I was hitting the limit of Z3's performance. I wanted to generate *every* 11-character alphabetic possibility for a hash, but after running for a day Z3 only came up with ~100,000 — and it can't easily be parallelized.
You know what's great at doing a ton of simple things in parallel? A GPU. So I ported the hash function to CUDA to run it on an RTX 3090. This let me test about 1.5 trillion candidates per second, enough to cover all 26^11 possibilities in ~40 minutes. gist.github.com/moyix/f78e0b0d…
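The kernel itself is in the gist; its core idea, each thread mapping its global index to one candidate string, fits in a few lines of Python (hash form again assumed, and the target hash below is a made-up placeholder):

```python
ALPHA = "abcdefghijklmnopqrstuvwxyz"

def index_to_candidate(idx: int, length: int = 11) -> str:
    # Interpret idx as a base-26 number; digit k picks character k.
    chars = []
    for _ in range(length):
        idx, r = divmod(idx, 26)
        chars.append(ALPHA[r])
    return "".join(chars)

def word_hash(word: str) -> int:
    # Assumed DJB2-style hash, not the extension's exact function.
    h = 0
    for ch in word:
        h = (h * 33 + ord(ch)) & 0xFFFFFFFF
    return h

# One "GPU thread": candidate number 12345 out of the 26**11 keyspace.
candidate = index_to_candidate(12345)
match = word_hash(candidate) == 0xDEADBEEF  # placeholder target hash
```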
And that brings us up to the present. The GPU cracker worked perfectly, and found one of the four remaining words – and GPT-2 correctly ranked it as the most likely candidate.
I was also able to take advantage of another observation about the hash function to realize that the plural form of this word was one of the other remaining hashes. So this leaves us with just two words (out of 1,170) undeciphered
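Under the assumed DJB2-style hash, this works because appending a letter is a single hash step: the plural's hash can be predicted from the cracked word's hash alone, then checked against the remaining targets.

```python
def word_hash(word: str) -> int:
    # Assumed DJB2-style hash, not the extension's exact function.
    h = 0
    for ch in word:
        h = (h * 33 + ord(ch)) & 0xFFFFFFFF
    return h

def plural_hash(h: int) -> int:
    """Hash of word + 's', computed directly from hash(word)."""
    return (h * 33 + ord("s")) & 0xFFFFFFFF
```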
Oops, I realized I didn't link to the decoded list. It can be found here; I have rot13 encoded it to guard against accidental viewing, but you can decode it with something like rot13.com: moyix.net/~moyix/copilot…
An amazing algorithmic speedup from @Sc00bzT means I can now generate those same 800K candidates in just 1 minute on my laptop. It uses a very cool meet-in-the-middle attack!
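Here's my reconstruction of what a meet-in-the-middle split can look like for this problem (not @Sc00bzT's actual algorithm), under the same assumed hash: since hash(p + s) = hash(p) * 33^len(s) + hash(s) mod 2^32 and 33 is odd, hence invertible mod 2^32, you can enumerate each half once instead of every prefix/suffix pair. The demo uses a tiny alphabet and a short word to keep it fast.

```python
from itertools import product

MOD = 1 << 32
ALPHA = "abcdefghij"  # tiny alphabet keeps the demo fast

def word_hash(word: str, h: int = 0) -> int:
    # Assumed DJB2-style hash, not the extension's exact function.
    for ch in word:
        h = (h * 33 + ord(ch)) % MOD
    return h

def crack_mitm(target: int, pre_len: int, suf_len: int) -> list[str]:
    # Pass 1: table of hash -> prefixes.
    table: dict[int, list[str]] = {}
    for p in product(ALPHA, repeat=pre_len):
        table.setdefault(word_hash("".join(p)), []).append("".join(p))
    inv = pow(33, -suf_len, MOD)  # inverse of 33^suf_len mod 2^32
    # Pass 2: for each suffix, solve for the prefix hash it needs.
    hits = []
    for s in product(ALPHA, repeat=suf_len):
        suffix = "".join(s)
        need = (target - word_hash(suffix)) * inv % MOD
        for p in table.get(need, []):
            hits.append(p + suffix)
    return hits
```

This turns O(26^n) work into two O(26^(n/2)) passes plus table lookups.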
After scoring the 854,653 solutions found by the CUDA hash cracker using GPT-2, I believe we have solved another two of the remaining slurs - the highlighted word and its plural! (The scores here are log-probabilities)
Only two hashes are left uncracked:
272617466 and 867567715
Good thing I have two GPUs 😎
(To be scrupulously fair I suspect a couple of the "decoded" entries on the list, like "w00se" and "jui ch", are collisions. But I have to draw the line somewhere.)
TFW you're happy that your GPU-accelerated hash finder gets 24 million hashes/sec but then you remember hashcat can do ~65 BILLION MD5 hashes/sec on the same device...
If anyone would like to tell me how hashcat manages to pull off this feat I'd be super curious, 3000X faster is a LOT (and MD5 is more complex than the function I'm trying to compute!)
I feel like I always get tripped up on silly things with CUDA – like: I can compute all these results very fast, but actually saving and reporting them somewhere is a huge pain.
Either I just call printf() device-side and tank performance, or I have to manage parallel host and device buffers, synchronize between GPU threads when adding to the result list, and lose the ability to see results as they come in.
How can I add its negation (Or([word0[i] != model.eval(word0[i]) for i in range(arraylen)])) to the constraints?
Unfortunately since the constraints were loaded in from an SMT file I don't have the original Array, nor do I know how to retrieve it from the constraint object or model...
Ok this just feels cheeky: I asked Z3 for an array word0 where word0 != FirstSolution. It happily provided me with such an array: FirstSolution but with array index 1073741824 set to 64. This is how I learned Z3 has no concept of array length.
So by looking at the XOR of two hashes, we can learn the XOR of the two words' last letters, which narrows down the possibilities.
Can you figure out the length of the string? Maybe... the multiplication by 33 dominates, but for longer strings the amount of error in your guess gets bigger and bigger.
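Making the last-letter trick concrete, under the assumed h = h*33 + c mod 2^32 hash: two words that share everything but their final letter have hashes whose difference is exactly the difference of those letters, and the low bit of the XOR matches the XOR of the letters' low bits. The words below are arbitrary examples.

```python
MOD = 1 << 32

def word_hash(word: str) -> int:
    # Assumed DJB2-style hash, not the extension's exact function.
    h = 0
    for ch in word:
        h = (h * 33 + ord(ch)) % MOD
    return h

h1, h2 = word_hash("crack"), word_hash("cracs")
# Difference of hashes == difference of the final characters.
diff_matches = (h1 - h2) % MOD == (ord("k") - ord("s")) % MOD
# Low bit of the XOR leaks the XOR of the final characters' low bits.
lsb_matches = (h1 ^ h2) & 1 == (ord("k") ^ ord("s")) & 1
```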