
Brendan Dolan-Gavitt (@moyix), 50 tweets, 10 min read


Last session of the day at NDSS BAR features three invited talks! First, Jacopo Corbetta talks about IoT platform fuzzing at Qualcomm

Fuzzing has been very successful at Qualcomm; 100+ bugs found via fuzzing in their IoT platform. Around half from "pure" fuzzing, others from "hybrid" (manual analysis+fuzzing, etc)

"But doesn't Qualcomm have the source code?" Not always; third-party vendor code is common.

But source code does help. Lots of wins from being able to modify the code to make fuzzing easier. Main approach: think about and model what they want to actually test, then "cheat" on the rest.

E.g., TCP/IP client. Do you care about ARP, multicast, ethernet checksums, ...? Usually not. So strip that stuff out and focus on fuzzing what you actually want to test.

Other areas where you can maybe cheat: low-level packet details (e.g., decoding a radio protocol), multithreading, checksums/MACs, crypto, randomness, counters.
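One way to "cheat" on a checksum is sketched below, assuming a toy packet format (payload followed by a one-byte additive checksum). The names `sum8` and `packet_ok` are hypothetical, not from the talk. `FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION` is the macro conventionally defined for fuzzing builds by libFuzzer and AFL-style toolchains; whether Qualcomm's builds use it is not stated in the talk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 8-bit additive checksum; stands in for a real CRC or MAC. */
static uint8_t sum8(const uint8_t *data, size_t len) {
    uint8_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum = (uint8_t)(sum + data[i]);
    }
    return sum;
}

/* Returns 1 if the packet (payload followed by one checksum byte) is accepted. */
int packet_ok(const uint8_t *pkt, size_t len) {
    assert(pkt != NULL || len == 0);
    if (len < 1) {
        return 0;
    }
    int ok = (sum8(pkt, len - 1) == pkt[len - 1]);
#ifdef FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION
    /* Fuzzing build only: accept any checksum so mutated inputs reach the parser. */
    ok = 1;
#endif
    return ok;
}
```

In a normal build the check is unchanged, so the shortcut cannot leak into production unless the fuzzing macro is defined there.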

Moving on from "what" to "how". In IoT platforms, dynamic allocations are more expensive. In Qualcomm codebases, often few mallocs but lots and lots of buffer reuse. This can make use of things like ASAN difficult.

Instead, they found that AFL's libdislocator gave most of the benefits with less overhead.

Another technique: explicitly model internal state. Even if you have fuzzed many functions already, if you can identify when the program is in a new state it may be worth fuzzing them again. You can potentially do this by adding a state variable and including it in the coverage metric.

Another benefit of fuzzing your own code is that you get to decide what a bug is—not just look for crashes. Found many interesting issues by sprinkling around a lot of assert()s to check internal consistency & fuzzing.

Example: found a corner case in crypto negotiation: received less key material than necessary for the protocol. Doesn't result in a memory error, but can assert() that you got the expected amount of key material.

[NB: this case wasn't exploitable, but led to revising & improving the negotiation logic]
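The key-material check might look roughly like this. The session layout, the 32-byte key length, and all function names are assumptions made for illustration; the talk did not show the actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SESSION_KEY_LEN 32u  /* hypothetical amount the protocol requires */

typedef struct {
    uint8_t key[SESSION_KEY_LEN];
    size_t key_len;
} session_t;

/* Internal-consistency predicate: did negotiation yield enough key material? */
int key_material_ok(size_t len) {
    return len >= SESSION_KEY_LEN;
}

/* Copies negotiated key material into the session. The copy is bounded, so
 * short input never causes a memory error; only the assert() exposes it. */
void session_set_key(session_t *s, const uint8_t *material, size_t len) {
    assert(s != NULL && material != NULL);
    assert(key_material_ok(len));  /* aborts, so the fuzzer reports a crash */
    size_t n = len < SESSION_KEY_LEN ? len : SESSION_KEY_LEN;
    memset(s->key, 0, sizeof s->key);
    memcpy(s->key, material, n);
    s->key_len = n;
}
```

Note that assert() compiles to nothing when NDEBUG is defined, so the fuzzing build has to keep assertions enabled for this to work.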

Other issues: with fuzzing you will usually find many inputs for the same underlying bug. Corpus & test case minimization help, but still need to do a lot of deduplication before handing off to get fixed.

Solution is to create a "tentative patch" — hypothesize root cause, add condition to check, and then feed it the crashing inputs and see which ones are hit by your hypothesized root cause test

if (root_cause) { print(); exit(); }
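Fleshed out a little, a tentative patch could look like the sketch below. The hypothesized root cause here (a length byte larger than a fixed buffer) and the names `hits_root_cause`, `tentative_patch`, and `BUF_LEN` are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define BUF_LEN 64u  /* hypothetical size of the overflowed buffer */

/* Hypothesized root cause: the length field in byte 0 exceeds the buffer. */
int hits_root_cause(const uint8_t *input, size_t len) {
    assert(input != NULL || len == 0);
    return len >= 1 && input[0] > BUF_LEN;
}

/* Tentative patch: call this at the top of the suspect function. Inputs that
 * match the hypothesis exit with a distinctive message and status code. */
void tentative_patch(const uint8_t *input, size_t len) {
    if (hits_root_cause(input, len)) {
        fprintf(stderr, "ROOT_CAUSE_1: length field %u > %u\n",
                (unsigned)input[0], BUF_LEN);
        exit(42);
    }
}
```

Replaying every crashing input against the patched build then splits the pile in two: inputs that exit with the marker share this root cause, and anything that still crashes needs a new hypothesis.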

Other lessons: the corpus you get from fuzzing is really valuable, re-use it! E.g., take your generated HTTP testcases and try them on all the different HTTP implementations you have.

Sometimes non-crashing inputs can be useful too: you might look at them by hand and decide that they really *shouldn't* be considered valid.

Next talk is also about tiny devices: Eric Gustafson talks about the State of Firmware Analysis

Why care about firmware analysis? Well, going by media reports, because our electric toothbrushes are going to rise up, become a botnet, and cause untold ruin

Big attack surface: everything is connected. How do we test? Dynamic analysis (fuzzing?), static analysis, symbolic execution. Talk will focus mainly on dynamic analysis and symbolic execution.

But uh oh: we have *lots* of different chips in these devices and almost none of them are Intel. So how do we do bug discovery? Comparatively simple for e.g. Chrome: just fuzz it, find bugs, report them, get them fixed (maybe get bounty $$ too!)

But for firmware we're missing a lot of pieces. How do we obtain the firmware? Maybe you can find it online (not always)? Maybe you can take it from the device itself? Difficult and hard to scale.

What firmware extraction looks like in practice

Okay so maybe now we have the code. But do our tools understand it? Probably not. Even for things like angr that use nice libraries like VEX with support for many platforms, that support is usually much less well tested.

Eric has done a lot to improve angr's support here. "Gymrat" lifters (get it?) make it easier to add new CPU architectures to angr

But testing an embedded system is more than just instructions. There's a whole hardware execution environment, like all the peripherals! To get over this you need to "re-host" the system.

Peripherals are hard. Each peripheral has lots of different registers, state, etc. May raise interrupt, may use DMA. All of this has to be modeled somewhat faithfully, and even peripherals that do the same thing may have very diff. implementations.

An attempt at automated rehosting: Pretender. Observes real hardware and tries to build a model of the memory-mapped I/O to use in an emulator (sites.cs.ucsb.edu/~vigna/publica…)
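To make "a model of the memory-mapped I/O" concrete, here is a deliberately naive stand-in: replay the values recorded from real hardware, per register, in order. This is not Pretender's actual algorithm (which infers richer models from the recordings); the types and names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t addr;           /* MMIO register address */
    const uint32_t *values;  /* values observed on real hardware, in order */
    size_t count;            /* number of recorded values */
    size_t next;             /* index of the next value to replay */
} mmio_trace_t;

/* Emulator read hook: return the next recorded value for `addr`. */
uint32_t mmio_read(mmio_trace_t *regs, size_t nregs, uint32_t addr) {
    assert(regs != NULL || nregs == 0);
    for (size_t i = 0; i < nregs; i++) {
        mmio_trace_t *r = &regs[i];
        if (r->addr != addr || r->count == 0) {
            continue;
        }
        uint32_t v = r->values[r->next];
        if (r->next + 1 < r->count) {
            r->next++;  /* once the trace runs out, keep returning the last value */
        }
        return v;
    }
    return 0;  /* unmodeled register: read as zero */
}
```

Pure replay breaks as soon as the firmware takes a path the recording never saw, which is part of why building real models is hard.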

But there's still a lot missing. Recording the I/O traces in the first place is still very hard. Another approach: HALucinator (subwire.net/papers/halucin…) – try to identify the higher-level OS abstractions and then model hardware at that layer.
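The idea of modeling hardware at the abstraction layer can be sketched as a hook table: when the emulated program counter reaches a known library function, run a host-side handler instead of the firmware code. The address, the two-argument calling convention, and all names below are assumptions for illustration, not HALucinator's real interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A handler receives the call's first two arguments and returns its result. */
typedef uint32_t (*hle_handler_t)(uint32_t arg0, uint32_t arg1);

typedef struct {
    uint32_t func_addr;     /* firmware address of the intercepted function */
    hle_handler_t handler;  /* host-side replacement */
} hle_hook_t;

/* Stand-in for a "transmit buffer" routine: report that all `len` bytes
 * were sent, without touching any UART registers. */
uint32_t hle_uart_transmit(uint32_t buf_addr, uint32_t len) {
    (void)buf_addr;
    return len;
}

/* Called by the emulator before executing the instruction at `pc`.
 * Returns 1 and stores the result in *ret if a hook handled the call. */
int hle_dispatch(const hle_hook_t *hooks, size_t n, uint32_t pc,
                 uint32_t arg0, uint32_t arg1, uint32_t *ret) {
    assert(ret != NULL);
    for (size_t i = 0; i < n; i++) {
        if (hooks[i].func_addr == pc) {
            *ret = hooks[i].handler(arg0, arg1);
            return 1;
        }
    }
    return 0;
}
```

The appeal is that one handler covers every chip whose library exposes the same function, however different the registers underneath are.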

[BDG note: if you followed N64 emulator development, this is what UltraHLE did to "re-host" the Nintendo 64 and create the first practical emulator]

Being able to emulate gives you debugging superpowers! UCSB used re-hosting successfully at the @CSAW_NYUTandon Embedded System Challenge in 2019.

Open research problems: DMA (hard to infer from software, hard to see dynamically), external peripherals

But also practical issues: keeping tools up to date with current QEMU, handling complicated protocols like Bluetooth

Future directions: what about patching? Fewer abstractions in firmware, hard to build a complete interprocedural CFG. Can't just add code anywhere. And firmware doesn't want you to patch it! Best case: CRCs, worst case: code signing :(

Big mood

But there’s hope!

Last talk of BAR: @Zardus on how to train the next generation of binary analysts with "From Zero to Hero: Bootstrapping Students into Binary Analysis"

"Binary Analysis Research" has three parts. How to teach the "Research" part is out of scope for this talk :) Will focus for now on the "Binary" part.

Yan got into binary analysis research via CTF. First thought: just make everyone play in CTFs! But just throwing students in without more structure and practice wasn't very successful.

The “Karate Kid” model of binary analysis skills

Currently have decent techniques for Yellow->Brown belt, Brown->Black belt. But White->Yellow belt is under-served. How can we do better?

Wax on, wax off! We need a resource that has gradual learning curve, lots of repetition, guidance to enable progress, and built-in motivation.

Yan's solution (used in ASU CSE466): Pwn College (pwn.college). Goal is to go from White Belt ("How do I read data with /bin/cat") up to Yellow Belt (heap tcache exploitation, kernel exploitation) in one semester.

Lots of challenges in each module, and where possible lots of repetition. Each challenge (when possible) has three randomly generated variants. E.g., for buffer overflow randomize the buffer size and offset on the stack to make students learn to calculate offsets.
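Randomized variants could be generated along these lines. This is a guess at the general idea, not pwn.college's actual generator: the names, the size ranges, and the simplified stack layout (buffer, then padding, then an 8-byte saved frame pointer, then the return address) are all assumptions, and real layouts depend on the compiler.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t buf_size;    /* size of the vulnerable stack buffer */
    uint32_t pad;         /* bytes of other locals above the buffer */
    uint32_t ret_offset;  /* distance from buffer start to return address */
} variant_t;

/* Small deterministic PRNG (xorshift32) so a seed always yields the same variant. */
static uint32_t xorshift32(uint32_t *state) {
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Derive one challenge variant from a per-student or per-instance seed. */
variant_t make_variant(uint32_t seed) {
    assert(seed != 0);  /* xorshift32 state must be nonzero */
    uint32_t s = seed;
    variant_t v;
    v.buf_size = 16 + xorshift32(&s) % 113;  /* 16..128 bytes */
    v.pad = 8 * (xorshift32(&s) % 8);        /* 0..56 bytes, 8-byte aligned */
    v.ret_offset = v.buf_size + v.pad + 8;   /* skip saved frame pointer */
    return v;
}
```

Because the offset changes with every variant, a memorized payload stops working and students have to redo the calculation each time.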

To provide guidance: "teaching" challenges. The challenges self-describe their own solution steps. E.g., in buffer overflow, challenge tells the student how to calculate size and offsets, and prints out steps as it executes exploit.

Workload is high, but positive feedback!

Showing off pwn.college – "It uses the hacker theme from GitHub, so you can tell it's for hackers"

Uses @ctfdio; each challenge starts up its own Docker instance. Even has an in-browser terminal!

Other nice features: you can switch it into "practice" mode, which gives you root access but changes the flag to a dummy flag. Home directory persists though, so you can debug your work in priv. mode and then run it in real mode when you've got it working.

Yan is also offering free private instances if you want to use this in your own course; email pwn-college@asu.edu

Future work: more topics (crypto, web, forensics), filling in gaps, making sure difficulty scales smoothly
