ruby nealon
Mar 30 · 25 tweets · 8 min read
The setup behind the CVE-2024-3094 supply-chain attack is fascinating. I originally wanted to finish and share a tool to audit other OSS projects for anomalous contributor behavior, but I feel what I found trying to MVP it is way more interesting. 🧵 1/25 gist.github.com/rubyroobs/77cc…
diff of running strings on an existing test fixture in the xz project and the one containing the injected code added by the attacker
2/25 If you haven't, please read the full @Openwall mailing list disclosure. The first advisory summary a friend shared with me had such a high-level overview that I feel I initially grossly underestimated the level of sophistication of this attack. openwall.com/lists/oss-secu…
3/25 Hackers tend to be lazy. When I heard "fake identity", I was thinking automation of "grammar fix" OSS contributions on many fake identities, farming activity on projects, and only after the identity met a threshold would an attacker even assess it for reputational value.
4/25 It still seemed unlikely to fool project maintainers though. Even with newer technologies like ChatGPT, I thought this would need to be done on a scale that would leave some identifiable patterns in activity. Then I started to read the full original disclosure.
5/25 Suddenly I had a lot of questions. Why did sshd/OpenSSH load xz-utils if OpenSSH doesn't depend on it? As I understand now, official OpenSSH does not, but Linux distro packages often patch it to support systemd, which does. (still not 100% - please correct me if I am wrong!)
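One way to check this kind of indirect dependency on a given machine is to look at what the dynamic linker resolves for the binary. The sketch below is only an illustration: it parses text in the usual `ldd` output format, and the helper names (`linked_libraries`, `depends_on`) are made up for this example. Running `ldd` on untrusted binaries is not safe, so this only parses output you already have.

```python
def linked_libraries(ldd_output: str) -> dict[str, str]:
    """Map library name -> resolved path from text in `ldd` output format."""
    libs = {}
    for line in ldd_output.splitlines():
        line = line.strip()
        if "=>" not in line:
            continue  # lines such as the vdso or the loader have no "=>"
        name, _, rest = line.partition("=>")
        rest = rest.strip()
        if rest.startswith("not found"):
            path = ""  # unresolved library
        else:
            parts = rest.split()
            path = parts[0] if parts else ""
        libs[name.strip()] = path
    return libs


def depends_on(ldd_output: str, needle: str) -> bool:
    """True if any resolved library name contains `needle` (e.g. "liblzma")."""
    return any(needle in name for name in linked_libraries(ldd_output))
```

On a system of interest, the input would come from something like `ldd "$(command -v sshd)"`; whether liblzma shows up depends on how that distro built and patched OpenSSH.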
6/25 My thought then was to audit other projects for anomalous contributor behavior - especially ones that may have been an "unclear" dependency. But I was still confident the agg. stats of the backdoor commit author's git contributions would have patterns of automation too.
7/25 I started manually auditing the xz repo. Another surprise was reading the test file README in xz:
"Many of the files have been created by hand with a hex editor, thus there is no better "source code" than the files themselves."
With hindsight of the test file backdoor... 😅
xz project test/files/README:

    .xz and .lzma Test Files
    ------------------------

    0. Introduction

        This directory contains bunch of files to test handling of .xz,
        .lzma (LZMA_Alone), and .lz (lzip) files in decoder implementations.
        Many of the files have been created by hand with a hex editor, thus
        there is no better "source code" than the files themselves. All the
        test files and this README may be distributed under the terms of
        the BSD Zero Clause License (0BSD).
8/25 When I looked for commits in other related projects adding new binary files, my first hit was a test fixture binary in zstd - also a compression lib!
The same commit also had automation to regenerate and detect the file changing. github.com/facebook/zstd/…
9/25 I don't think this is by any means the single/most important factor that led to the attack, but I did want to show them in contrast to at least highlight that there is a better way of doing this, and that CI/test infra hygiene is worth continuously reviewing and bettering.
10/25 I'm now so emotionally invested I want to start scripting something. I iterated by auditing a few lower-level library projects, and adding new ideas as they came to me. I was also very eager to (and honestly, way, way too late...) start testing my script in the xz-utils repo!
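To make the "regenerate and detect the file changing" idea concrete, here is a minimal sketch of what such a check could look like. It is not zstd's actual automation; the generator, its input, and the function names are hypothetical. It also assumes the compressor produces the same bytes for the same input and settings, which can change between library versions, so a real check would pin the toolchain.

```python
import hashlib
import lzma

# Hypothetical, human-readable source for the fixture.
FIXTURE_PLAINTEXT = b"hello fixture\n" * 64


def generate_fixture() -> bytes:
    """Rebuild the binary fixture from its readable source."""
    return lzma.compress(FIXTURE_PLAINTEXT, format=lzma.FORMAT_XZ, preset=6)


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def fixture_is_reproducible(committed: bytes) -> bool:
    """True if the committed bytes match a fresh regeneration."""
    return sha256_hex(generate_fixture()) == sha256_hex(committed)
```

A CI job could read the committed fixture from disk, call `fixture_is_reproducible`, and fail the build on a mismatch, so a binary file that cannot be explained by its generator stands out in review.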
11/25 I spent way too much time keeping it as a one-liner, but I now had something to find each binary file in a repo, the commit author who last modified it and agg. git stats, recursive-extract binwalk / strings it, and print an (ugly) plaintext report. gist.github.com/rubyroobs/77cc…
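For readers who want the first two steps in a more readable form than a one-liner, here is a rough Python sketch (not the gist's script) that lists tracked files that look binary and the author of the last commit touching each. The function names are invented for this example, and the NUL-byte check is only a heuristic, similar in spirit to how git guesses whether a file is binary. It assumes `git` is on the PATH.

```python
import os
import subprocess
from pathlib import Path


def looks_binary(head: bytes) -> bool:
    """Heuristic: treat data as binary if its first 8000 bytes hold a NUL."""
    return b"\x00" in head[:8000]


def tracked_files(repo: Path) -> list[str]:
    """Paths tracked by git, relative to the repo root."""
    out = subprocess.run(
        ["git", "-C", str(repo), "ls-files", "-z"],
        check=True, capture_output=True,
    ).stdout
    return [os.fsdecode(p) for p in out.split(b"\x00") if p]


def last_author(repo: Path, path: str) -> str:
    """Name and email of the author of the latest commit touching `path`."""
    out = subprocess.run(
        ["git", "-C", str(repo), "log", "-1", "--format=%an <%ae>", "--", path],
        check=True, capture_output=True, text=True,
    ).stdout
    return out.strip()


def binary_file_report(repo: Path) -> dict[str, str]:
    """Map each binary-looking tracked file to its last author."""
    report = {}
    for rel in tracked_files(repo):
        full = repo / rel
        if not full.is_file():
            continue  # skip submodules, broken symlinks, etc.
        with open(full, "rb") as fh:
            if looks_binary(fh.read(8000)):
                report[rel] = last_author(repo, rel)
    return report
```

Calling `binary_file_report(Path("."))` inside a checkout returns a dictionary that can then be fed into whatever extraction or `strings` step comes next.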
12/25 But as I was running my MVP script in the xz-utils repo, I realized that if this user was a 'fake identity' as suspected, the creator had been anything but lazy. This is by far the most work/time/persistence I've seen go into an attack that anyone can follow chronologically.
13/25 Factoring in the lack of any other online web presence, as of now I would be incredibly surprised to learn this account was not created by the backdoor commit author, most likely with xz as a target to try to infiltrate.
Regardless, they have had a very busy past few years!
14/25 Since their first commit in Jan 2022, they have authored a total of 451 commits in xz-utils main branch. That's 19% of all main branch commits in just over 2 years. The project's first commit (when it migrated to git) was over 16 years ago!
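Per-author shares like this can be recomputed from `git shortlog -sn` output, which prints one "count, tab, name" line per author. A small sketch, with function names made up for this example:

```python
def parse_shortlog(text: str) -> dict[str, int]:
    """Parse `git shortlog -sn` style lines into {author: commit_count}."""
    counts: dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split(None, 1)  # count, then the rest is the name
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # skip blank or unexpected lines
        name = parts[1].strip()
        counts[name] = counts.get(name, 0) + int(parts[0])
    return counts


def commit_shares(counts: dict[str, int]) -> dict[str, float]:
    """Percentage of all commits authored by each name."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: 100.0 * n / total for name, n in counts.items()}
```

Note that shortlog groups by author name, so one person committing under several names or emails shows up as several entries unless a mailmap merges them.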
15/25 The other contributor has authored 76% of commits, incl. the first. So between them, 95% of all commits.

But their GH account was created in 2021! Before working on xz, they ... tried to make libarchive auto-download combinations of dependencies that didn't make sense 🤔
Added Dependency downloader script for apt and yum #1595
"I found it difficult to collect all of the dependencies when I was trying to build for the first time. To make it easier for everyone else, I figured I could automate this. Let me know if I am missing anything or if this script belongs in a different folder."
(description written by the attacker)
16/25 From 2022, though, the focus was on xz-utils, even representing it in other projects! In Google's oss-fuzz, they disabled the same compiler feature their backdoor uses to intercept execution. And then changed the primary contact, so any bugs it did manage to find went to them...🤔
GitHub PR "xz: Disable ifunc to fix Issue 60259" by the author of the backdoor. The change just adds "--disable-ifunc" to the build instructions Google's security fuzzer uses. IFUNC is also the glibc feature used to intercept execution by the backdoor though, so it's a bit hard to believe this is just a coincidence...
GitHub PR "XZ updates". The backdoor author changes the "primary_contact" email address registered with Google's open source security fuzzer to their own.
17/25 I'll stop with all the 🤔 sorry 🙇‍♀️

Once they decided they were ready to launch their backdoor, they still checked every detail carefully. They injected their code through a mix of an unclear-but-uninteresting build script addition and scrambled test files in the project that the script descrambled.
18/25 While that would prevent tools like binwalk from properly identifying the machine code it contained, they went so far as to make their scrambled backdoor test file have similar artifacts to others. Shown is a diff of the strings between the backdoor and another test file.
19/25 Last one! When I was skimming the binwalk outputs, I thought I ID'd another backdoor payload when it found an xz-compressed x86 binary with a different name. Turns out this has been there since 2009, with the context explained in the git commit message.
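The comparison in that image can be approximated without external tools. This sketch is a rough stand-in for running `strings` on two files and diffing the results; it only extracts runs of printable ASCII, so its output will not match GNU `strings` exactly, and the function names are made up for this example.

```python
import difflib
import re


def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Runs of at least `min_len` printable ASCII bytes, in file order."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.decode("ascii") for m in pattern.findall(data)]


def strings_diff(a: bytes, b: bytes, min_len: int = 4) -> list[str]:
    """Unified diff of the printable strings found in two byte blobs."""
    return list(
        difflib.unified_diff(
            extract_strings(a, min_len),
            extract_strings(b, min_len),
            fromfile="a",
            tofile="b",
            lineterm="",
        )
    )
```

Feeding it the bytes of two test fixtures gives a quick view of which printable artifacts they share. As the thread points out, a careful attacker can make those artifacts look alike, so a clean diff here is not evidence that a file is benign.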
20/25 I wondered why risk adding 2 new test files that didn't even get used. The disclosure actually mentioned 5.6.0 and 5.6.1 being vulnerable with different payloads. This is how the original backdoor payloads were added... hiding in plain sight 🥲
author     Jia Tan <jiat0218@gmail.com>  Tue, 23 Jan 2024 00:33:39 +0900 (23:33 +0800)
committer  Jia Tan <jiat0218@gmail.com>  Wed, 24 Jan 2024 00:05:47 +0900 (23:05 +0800)
commit     e2870db5be1503e6a489fc3d47daf950d6f62723
tree       9a1b5eb9bde7f2fa5fc2f2ef4c32fb37773d65ec
parent     b26a89869315ece2f6d9d10d32d45f672550f245

    Tests: Add two RISC-V Filter test files.

    These test files achieve 100% code coverage in src/liblzma/simple/riscv.c. They contain all of the instructions that should be filtered and a few cases that should not.
21/25 The repo is currently unavailable, but in an earlier PR I found on web archive they were merging their own changes without review as early as 2023. I unfortunately couldn't get the PRs for the backdoors, but I wonder if that PR had any review at all web.archive.org/web/2024032919…
GitHub PR where backdoor author merged without approval to the main/master branch of the xz-utils repo.
22/25 At the time of writing they are even still listed as co-maintainer on the sponsoring project's website too. My point isn't to goof on the project, but rather to highlight the level of trust and access they achieved while infiltrating the project. tukaani.org/about.html
23/25 I wonder how many other high-effort "fake identities" are still in the infiltration stage, building trust with maintainers of other quiet or older projects that are a valuable target for attackers but aren't necessarily understood as one.
24/25 If the injected code was more conservative selecting targets and didn't have a performance impact so significant that someone who (in their own words) "is not a security researcher/engineer" began to investigate, how long could this have gone undetected?
25/25 It feels very lucky that it was discovered at the stage it was. I hope with this attack on people's minds, other OSS projects in similar positions consider doing tabletop scenario exercises for this kind of attack and how they can prevent/detect it. Thanks for reading!
