My Authors
Read all threads
Starting off NDSS BAR 2020, @thedavidbrumley gives his keynote based on his 17 years of experience in binary analysis!
Early work: asm2c, 2003. Lifts x86 assembly to C with SMT formulae in-line. Then you compile & execute lifted code directly.
But writing lifters in C is painful and didn’t end up being maintainable. So after a brief detour through Java settled on OCaml for Vine, BAP.
Why OCaml? Strong type system, great for writing compilers. Static type checker caught algorithmic flaws. In CMU compilers class, OCaml usage predicted good grade ;)
Currently BAP on v2, 10K lines of code, maintained by two people.
Downsides though: a lot more people know Python, so you get less community involvement by using OCaml.
Some interesting thoughts on lifters: very common approach is to model only the minimum you need to get result. But the things you don’t model can lead to subtle divergences! Advice: verify your IL correctness!
Bap-veri does this: lifts, runs lifted code on simulated CPU and checks against ground truth of real execution. github.com/BinaryAnalysis…
Another nice point: common to start thinking about binary analysis at the lifter. But need to also think about the machine and program model: CPU semantics, ABI, and APIs. Ideally these should not be too tightly coupled.
More practical lessons: separate out your graph representation from your lifter, and make sure your lifter remembers the context of the actual bytes in the program you lifted.
Somewhat surprising but true I think: floating point is still hard to model well even today. Some cool examples in this paper by David Monniaux arxiv.org/abs/cs/0701192
Brings up rounding errors - can an attacker find crafted inputs that get rounded and cause errors? [BDG: I think Project Zero found a TTF bug like this]
Last big lesson: binary analysis is not just the inverse of compilation. Many abstractions are lost and not recoverable. Example: VSA tries to recover variables, but variable abstraction may not exist at the binary level.
Analyses can be co-dependent. Different analyses may disagree, inform each other cyclically
More challenges: binary code from other languages (OCaml, Go, Rust, Jovial). Analyses may have hidden assumption that code came from C/C++
Other OSes (VxWorks!), systems of systems also problematic. Very common in embedded.
SMT solvers are not magic (well, maybe black magic). Logically equiv formulas can take wildly varying times to solve, so need to use simplifications appropriate for your domain. And larger formulae not necessarily more difficult to solve!
When do things get hard? In the middle ground between many free variables and highly constrained formulae. Then solver has to backtrack a lot.
Sometimes undoing compiler optimizations can make solving easier! Eg compiler converts division to multiplication, hard for SMT - convert it back to plain division (opposite of strength reduction)
Ending on a non-technical note. Rejection is not failure. Gives his own “rejection CV”. Democratizing software security is important, need to persist
Wrapping up, five big lessons
Missing some Tweet in this thread? You can try to force a refresh.

Enjoying this thread?

Keep Current with Brendan Dolan-Gavitt

Profile picture

Stay in touch and get notified when new unrolls are available from this author!

Read all threads

This Thread may be Removed Anytime!

Twitter may remove this content at anytime, convert it as a PDF, save and print for later use!

Try unrolling a thread yourself!

how to unroll video

1) Follow Thread Reader App on Twitter so you can easily mention us!

2) Go to a Twitter thread (series of Tweets by the same owner) and mention us with a keyword "unroll" @threadreaderapp unroll

You can practice here first or read more on our help page!

Follow Us on Twitter!

Did Thread Reader help you today?

Support us! We are indie developers!


This site is made by just three indie developers on a laptop doing marketing, support and development! Read more about the story.

Become a Premium Member ($3.00/month or $30.00/year) and get exclusive features!

Become Premium

Too expensive? Make a small donation by buying us coffee ($5) or help with server cost ($10)

Donate via Paypal Become our Patreon

Thank you for your support!