Suha · Sep 23 · 21 tweets
I decided to do a summary thread so here we go.

My talk was about incubated ML exploits, a new class of exploits for ML systems that we identified. 🧵
Taking a step back: ML and AI are everywhere now, and people are finding clever ways to trick these systems. For example, you might have heard stories about people using prompt injection on chatbots or protestors fooling self-driving cars with traffic cones.
These tricks often stem from understanding how these models work and what data they're trained on.

[Slide: a stick figure asks, 'Maybe we shouldn't put AI in everything?' beside a self-driving car immobilized by a traffic cone on its hood; another figure with claw hands says, 'Yay! More AI, more exploits for me!' Source: Kerr, Dara. "Armed with traffic cones, protesters are immobilizing driverless cars." NPR, 2023.]
But here's the thing: much research on ML attacks focuses solely on the model itself. In reality, ML systems are complex and have many moving parts. That's why we developed a framework to bridge the gap between model security and systems security.
Specifically, we started by defining the term hybrid ML exploit: an attack that chains a system security issue with a model vulnerability.

[Slide: 'A hybrid ML exploit chains a system security issue with a model vulnerability,' next to a 'Hybrid ML Exploits' diagram of two circles, 'System Security Issues' and 'Model Vulnerabilities,' joined by a double arrow.]
It can go both ways: a model vulnerability can expose a system security issue, or a system security issue can enable exploitation of a model vulnerability.
So why is this framing useful? A big issue with ML security is that model security and systems security are treated separately.
But what we need to understand is that if we’re only covering model security, we’re missing a big piece, and if we’re only covering systems security, we’re still missing a big piece. We can’t treat these two processes completely independently.
Because then we’re entirely ignoring the potential for hybrid ML exploits.

[Slide: a claw-handed stick figure labeled 'Hybrid ML Exploits' grins between figures labeled 'Model Security' and 'System Security,' each thinking 'Looks good to me!' Surrounding text includes 'Summoning Demons: The Pursuit of Exploitable Bugs in Machine Learning,' 'Learned Systems Security,' and 'Poisoning Web-Scale Training Datasets is...']
This is an emergent property because a model is embedded in a system and it's going to interact with all of the different system components in new and exploitable ways.
While specific instances of hybrid ML exploits appear in the literature, they aren't called that explicitly, and previous work is largely limited to individual cases or their implications. Our framework lets us treat this interaction explicitly and systematically.
Now, there’s one kind of model vulnerability called a model backdoor. The precise definition is that a backdoor attack allows a malicious actor to force an ML model to produce specific outputs given specific inputs.
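To make that definition concrete, here's a minimal toy sketch in Python (my illustration, not code from the talk): a hypothetical wrapper that returns an attacker-chosen label whenever a specific trigger pattern appears at the end of the input, and otherwise defers to the clean model.

```python
# Toy illustration (not from the talk): behaviorally, a backdoored classifier
# acts normally unless a specific trigger pattern appears in the input, in
# which case it returns the attacker's chosen label.
from typing import Callable, Sequence

Model = Callable[[Sequence[float]], int]

def plant_backdoor(model: Model, trigger: Sequence[float], target_label: int) -> Model:
    """Wrap `model` so inputs ending with `trigger` always yield `target_label`."""
    trig = list(trigger)

    def backdoored(x: Sequence[float]) -> int:
        if list(x[-len(trig):]) == trig:  # trigger present
            return target_label           # attacker-chosen output
        return model(x)                   # otherwise behave like the clean model

    return backdoored

# A stand-in "clean" model: label 1 if the mean of the input exceeds 0.5.
clean = lambda x: int(sum(x) / len(x) > 0.5)
dirty = plant_backdoor(clean, trigger=[-1.0, -1.0], target_label=1)

print(clean([0.9, 0.9, -1.0, -1.0]))  # 0: the clean model says "low"
print(dirty([0.9, 0.9, -1.0, -1.0]))  # 1: the trigger forces the target label
print(dirty([0.9, 0.9]))              # 1: no trigger, same answer as the clean model
```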
You can use input-handling bugs to inject backdoors into models. We call this an incubated ML exploit, a subclass of hybrid ML exploits. We decided to identify and construct incubated ML exploits using bugs that arise when parsing ML model files.

[Slide: 'It’s dangerous to go alone! Take this.' The 'Hybrid ML Exploits' diagram again, with 'Input-Handling Bugs' inside the 'System Security Issues' circle and 'Model Backdoors' inside the 'Model Vulnerabilities' circle; a red arrow from the former to the latter is labeled 'Incubated ML Exploits.']
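For background on why model files are such a good carrier (a generic illustration, not the specific exploit from the talk): several widely used model formats are built on Python pickle, and unpickling can invoke arbitrary callables chosen by whoever produced the file.

```python
# Generic illustration (not the specific exploit from the talk): pickle-based
# model formats execute code during deserialization. Loading this harmless
# demo payload calls print(); a real payload could instead patch weights to
# plant a backdoor in the loaded model.
import pickle

class Payload:
    def __reduce__(self):
        return (print, ("code ran while 'loading the model'",))

blob = pickle.dumps(Payload())  # what an attacker ships as a "model file"
pickle.loads(blob)              # deserialization == code execution
```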
So, why focus on ML model files? Well, there's a culture in ML of sharing model artifacts without sufficient validation. Real malicious models have been found on platforms like HuggingFace Hub. Plus, there are tons of ML file formats out there.
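One inexpensive triage step, sketched here as my own suggestion rather than a tool from the talk, is to statically list the globals a pickle-based artifact references before ever loading it, using only the standard library (the file name below is hypothetical).

```python
# Sketch: list the (module, name) globals a pickle-based model file references,
# without executing it. Standard library only; the STACK_GLOBAL tracking is a
# rough heuristic, so treat this as triage rather than a guarantee.
import pickletools

def referenced_globals(path: str) -> list[tuple[str, str]]:
    with open(path, "rb") as f:
        data = f.read()
    refs, strings = [], []
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name in ("SHORT_BINUNICODE", "BINUNICODE", "BINUNICODE8", "UNICODE"):
            strings.append(arg)                       # remember recent string pushes
        elif opcode.name == "GLOBAL":                 # older protocols: "module name"
            module, name = arg.split(" ", 1)
            refs.append((module, name))
        elif opcode.name == "STACK_GLOBAL" and len(strings) >= 2:  # protocol 4+
            refs.append((strings[-2], strings[-1]))
    return refs

# e.g. flag anything outside an allowlist (hypothetical file name):
# suspicious = [r for r in referenced_globals("model.pkl")
#               if r[0] not in {"collections", "torch"}]
```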
To organize our thinking about these bugs, we turned to LangSec. LangSec applies formal language theory to systems security, and we found a useful LangSec taxonomy of input-handling bugs.
We found bugs up and down the ML stack. For instance, we explored issues with pickling and restricted unpickling. We also highlighted an incubated ML exploit that took advantage of arbitrary code execution through the ONNXRuntime.
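For readers unfamiliar with the term, "restricted unpickling" usually follows the standard-library pattern below. This is a generic sketch rather than the exact code we audited; restricted unpicklers like this are exactly the kind of input-handling code where subtle gaps matter, since an allowlist that is too broad or bypassable defeats the restriction.

```python
# Sketch of the "restricted unpickling" pattern: subclass pickle.Unpickler and
# only allow an explicit set of globals to be resolved during loading.
import io
import pickle

ALLOWED = {
    ("collections", "OrderedDict"),
}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        if (module, name) in ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")

def restricted_loads(data: bytes):
    return RestrictedUnpickler(io.BytesIO(data)).load()

# Plain containers load fine (no globals involved)...
print(restricted_loads(pickle.dumps({"weights": [1, 2, 3]})))
# ...but a pickle referencing a non-allowlisted global is rejected.
try:
    restricted_loads(pickle.dumps(print))  # references builtins.print
except pickle.UnpicklingError as err:
    print("rejected:", err)
```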
We found multiple issues involving parser differentials. This is when different parsers interpret the same input differently: a model file can appear benign to one system component but represent a backdoored model when interpreted by another.

[Slide: 'Parser Differentials.' A model file feeds into 'Parser 1' and 'Parser 2'; Parser 1 produces a neural network graph with several altered (gear-marked) parts, while Parser 2 produces the same graph without changes.]
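As a toy illustration of the general idea (deliberately not the ML-file case from the talk): here two parsers disagree about the same bytes, a JSON document with a duplicated key, so a "checker" and a "loader" can end up seeing different content.

```python
# Toy parser differential (illustrative only): the same bytes yield different
# results from two parsers, so a scanner and a loader can disagree.
import json
import re

data = b'{"arch": "benign-net", "arch": "backdoored-net"}'

# Parser 1: a naive scanner that grabs the first "arch" value it sees.
first_value = re.search(rb'"arch"\s*:\s*"([^"]*)"', data).group(1).decode()

# Parser 2: the json module, which keeps the *last* duplicate key.
last_value = json.loads(data)["arch"]

print(first_value)  # benign-net      <- what a security check might see
print(last_value)   # backdoored-net  <- what the loader actually uses
```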
We also discovered issues involving polyglots. These are files that can be validly interpreted as multiple formats. We created polyglots with Safetensors and PyTorch files. The latter led to updates to the Fickling tool as well.

[Slide: 'Polyglot Files.' A single file icon underlies two boxes, 'Format 1' and 'Format 2,' each containing a neural network graph; the Format 2 graph is marked with gear icons indicating differences.]
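To see why this layer even admits polyglots, here's a sketch based on my reading of the published Safetensors layout (an 8-byte little-endian header length, a JSON header, then tensor data), treated as an assumption rather than as the exploit itself: it checks whether the declared tensor regions account for every remaining byte, since unaccounted-for bytes are one place a second format can hide.

```python
# Sketch (assumes the published Safetensors layout: u64-LE header length,
# JSON header, then the tensor byte buffer). Flags bytes that no declared
# tensor region claims -- one place polyglot content could live.
import json
import struct

def check_safetensors_layout(path: str) -> None:
    with open(path, "rb") as f:
        raw = f.read()
    (header_len,) = struct.unpack("<Q", raw[:8])
    header = json.loads(raw[8:8 + header_len])
    data = raw[8 + header_len:]

    covered = 0
    for name, entry in header.items():
        if name == "__metadata__":          # optional free-form metadata key
            continue
        _begin, end = entry["data_offsets"]  # offsets relative to the buffer
        covered = max(covered, end)

    if covered != len(data):
        print(f"warning: {len(data) - covered} byte(s) not claimed by any tensor")
    else:
        print("all bytes accounted for")
```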
Looking ahead, we hope to see more work on hybrid and incubated ML exploits in addition to ML security work that takes the whole stack and supply chain into consideration.
BSidesLV recording:
