Marcel Böhme👨‍🔬
May 12 · 5 tweets
Recently modified code and sanitizer instrumentation seem to be among the most effective heuristics for target selection in directed #fuzzing according to this recent SoK by Weissberg et al. LLMs show much promise for target selection, too.

📝 mlsec.org/docs/2024c-asi…
More info about those two heuristics:
🦠 Sanitizer-guided Greybox Fuzzing: usenix.org/system/files/s…
♻️ Regression Greybox Fuzzing: mboehme.github.io/paper/CCS21.pdf
But in an interesting twist, the authors find that choosing functions by their complexity might be even better at retrieving functions that contained vulnerabilities in the past.
Now, this analysis is hypothetical and w.r.t. the discovery of vulnerabilities across the *entire history* of a repository. Since the objective of a directed fuzzer is to find (unknown) vulns in the *current version*, I would be excited to see this hypothesis put to the test.
When we actually run fuzzers implementing each of these heuristics on the most recent version of a program, do we expect the cyclomatic-complexity-guided fuzzer to outperform the other heuristics?
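To make the complexity heuristic concrete, here is a minimal sketch of how functions could be ranked as fuzzing targets by an approximate cyclomatic complexity. It is not the method used in the SoK; the set of counted node types, the helper names `cyclomatic_complexity` and `rank_targets`, and the `EXAMPLE` program are all assumptions made for illustration, and it analyses Python source only because that keeps the sketch self-contained.

```python
import ast
from typing import List, Tuple

# Node types counted as decision points. This is a simplified approximation
# of McCabe's metric (an assumption); real tools differ in the details.
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                 ast.IfExp, ast.comprehension, ast.Assert)


def cyclomatic_complexity(func: ast.AST) -> int:
    """Approximate complexity: 1 + number of decision points in the function."""
    score = 1
    for node in ast.walk(func):
        if isinstance(node, _BRANCH_NODES):
            score += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` contributes two extra short-circuit branches.
            score += len(node.values) - 1
    return score


def rank_targets(source: str, top_k: int = 3) -> List[Tuple[str, int]]:
    """Return the top_k (function name, complexity) pairs, most complex first."""
    tree = ast.parse(source)
    scored = [(node.name, cyclomatic_complexity(node))
              for node in ast.walk(tree)
              if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_k]


# Hypothetical program under test, used only to exercise the ranking.
EXAMPLE = '''
def parse(buf):
    if not buf:
        return None
    for b in buf:
        if b == 0 and len(buf) > 4:
            return b
    return buf

def noop():
    return 1
'''

if __name__ == "__main__":
    for name, score in rank_targets(EXAMPLE):
        print(name, score)
```

A directed fuzzer would then take the highest-ranked functions as its targets. Whether that ranking actually finds more bugs in the current version is exactly the open question above.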

More from @mboehme_

Mar 28, 2023
After oracles for memory-safety, what's next?

- generic correctness properties
- dataflow-based properties
- "unusually large" resource consumption

* Program-specific vs generic oracles
* One input (e.g., crash) vs distribution (e.g. performance)
* Ref implementation(s)

#Dagstuhl
- Human artifacts (documentation) as oracles.
- How to infer oracles, e.g. from JavaDoc comments? What about false positives? Consider them as a signal for the user.
- The oracle problem impacts how well deduplication works.
- Metamorphic testing. Explore in other domains, e.g. perf. testing!
- Mine assertions and use them in a fuzzer feedback loop
- Assertions are the best way to build oracles into the code
- hyperproperties are free oracles (differential testing)
- ML to detect vuln patterns. Use as oracles
- Bugs as deviant behavior (Dawson)
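The differential-testing idea in the notes above can be sketched in a few lines: two implementations of the same specification act as oracles for each other, so any disagreement is a bug signal without anyone writing down the expected output. The function names, input ranges, and the deliberately buggy variant below are hypothetical and only illustrate the idea.

```python
import random


def reference_sort(xs):
    """Trusted reference implementation."""
    return sorted(xs)


def insertion_sort(xs):
    """Independent implementation under test."""
    out = []
    for x in xs:
        i = len(out)
        # Walk left until the insertion point for x is found.
        while i > 0 and out[i - 1] > x:
            i -= 1
        out.insert(i, x)
    return out


def buggy_sort(xs):
    """Deliberately wrong variant (sorts in descending order)."""
    return sorted(xs, reverse=True)


def differential_check(impl_a, impl_b, trials=200, seed=0):
    """Return the first random input on which the implementations disagree, else None."""
    rng = random.Random(seed)
    for _ in range(trials):
        xs = [rng.randint(-50, 50) for _ in range(rng.randint(0, 20))]
        if impl_a(list(xs)) != impl_b(list(xs)):
            return xs
    return None


if __name__ == "__main__":
    print(differential_check(reference_sort, insertion_sort))
    print(differential_check(reference_sort, buggy_sort))
```

In a fuzzer, the random generator would be replaced by the fuzzer's own inputs and a disagreement would be reported like a crash.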
Mar 28, 2023
Peter O'Hearn (@PeterOHearn12) on "Hits and Misses from a decade of program analysis in industry".

#Dagstuhl
- Bi-abductive symbolic execution
- Infer ran "symbolic execution" on changed part of every commit/diff
- Post-land analysis versus diff-time analysis changed fix rate from 0% to 70%. Why?
* Cost of context switch
* Relevance to developer
- Deploying a static analysis tool is an interaction with the developers.
- Devs would rather accept false positives and work with the team to "fit" the tool to the project.
- Audience matters!
* Dev vs SecEng
* Speed tolerance
* FP/FN tolerance
Mar 28, 2023
Anna Zaks on "From Bug Detection to Mitigation and Elimination".

- Static and dynamic analysis.
- Hard to ensure coverage at scale!

#Dagstuhl
Security tooling
- ideal solution mitigates entire classes of bugs
- performance is important.
- adoption is critical!
- works with the ecosystem
Rewriting in memory-safe language (e.g. Swift)
- View new code as green islands in a blue ocean of memory-unsafe code.
- Objective: Turn blue to green.
- We need solutions with low adoption cost.
Mar 28, 2023
Anders Møller (@amoellercsaudk) on "Dependencies Everywhere".

#Dagstuhl
Motivation
- Keeping dependencies up to date is not easy.
- Breaking changes are problematic for dependents.
- Breaking changes are informally specified and difficult to check against your project.
- General tools don't assist with changes.
Research challenges
- we fully trust the dependencies ecosystem.
- supply chain is reported to be full of vulnerabilities, how does a maintainer interpret this? 95% false positives?
Mar 27, 2023
Can we use LLMs for bug detection?
- compiler testing: generate programs
- "like" static analyzers:
* what is wrong, how to fix it?
* this is wrong, how to fix it?
- current challenge: limited prompt size
- reasoning power?
#Dagstuhl
Q: Isn't it the *unusual* and the *unlikely* that makes us find bugs?
A: You can increase temperature. Make it hallucinate more.
C: LLMs can't be trusted. Instead of bug finding, we should find use cases where we don't *need* to trust it. Maybe use it as a fuzzer guidance?
Mar 27, 2023
"Coverage-guided fuzzing is probably the most widely used bug finding tool in industry. You can tell by the introductory slides everyone presented this morning".
--Dmitry Vyukov
In the future, we need more practical, simple, and sound techniques for bug finding:
- Find bugs in production
- Find new types of bugs
- Develop better dynamic tools
- Develop better static tools
- Require less human time
- Report bugs in a way that improves the fix rate!
Q: Should we add assertions to make fuzzers more effective at finding bugs?
A: Can do, but people do not even fix memory corruption bugs. The number of critical bugs found is not currently a problem.

syzkaller.appspot.com/upstream
