Matthew

Nov 14 • 22 tweets • 11 min read

🐲 Ghidra Tips 🐲- Malware Encryption and Hashing functions often produce byte sequences that are great for #Yara rules.

Using #Ghidra and a Text Editor - You can quickly develop Yara rules to detect common malware families.
(Demonstrated with #Qakbot)

[1/20]
#Malware #RE

[2/20]
Hashing and encryption functions make good targets for #detection as they are reasonably unique to each malware family and often contain lengthy and specific byte sequences due to the mathematical operations involved.

These characteristics make for good Yara rules 😁

[3/20] The biggest challenge is locating the functions responsible for hashing and encryption. I'll leave that for another thread, but for now...

You can typically recognize hashing/encryption through the use of bitwise operators inside a loop (e.g. XOR ^ and shift >>).

[4/20] For example, here's a string hashing function utilised by recent #qakbot samples.

Note the heavy use of bitwise operators, such as XOR (^), right shift (>>) and bitwise "AND" (&).

These will typically produce a unique sequence of bytecodes.
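To make the "bitwise operators inside a loop" pattern concrete, here is a generic bit-by-bit CRC-32 in Python. This is only an illustration of the shape to look for in a decompiler. It is not the Qakbot function from the screenshot, and the function name is my own.

```python
def crc32_bitwise(data: bytes) -> int:
    """Bit-by-bit CRC-32 (common reflected variant, polynomial 0xEDB88320)."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte  # XOR the next input byte into the state
        for _ in range(8):
            # If the low bit is set, shift right and XOR with the polynomial;
            # otherwise just shift right.
            if crc & 1:
                crc = (crc >> 1) ^ 0xEDB88320
            else:
                crc >>= 1
    return crc ^ 0xFFFFFFFF


if __name__ == "__main__":
    print(hex(crc32_bitwise(b"123456789")))  # standard CRC-32 check input
```

The XOR, shift and AND in the inner loop are exactly the kind of operations that compile down to a stable run of bytes.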

[5/20] The disassembly and bytecodes for those instructions can be used for a Yara rule.

To grab the bytecodes, highlight the decompiled code (right). This automatically highlights the corresponding disassembly and bytecodes (left)....

[6/20] Highlighting the entire function should be avoided, as it is only the mathematical operators that will be consistent enough between samples.

For example, if the do/while loop is included, the Jump instructions (JZ/JC etc) will also be included in the disassembly....

[7/20] ... Cont'd

Jumps (JZ/JC/JNZ) == inconsistent Byte Values == not good for a Yara rule.

If a jump is accidentally included, it can be manually unselected in the #Ghidra disassembly window.

The final result should look like this.
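On why jump bytes are inconsistent: a short conditional jump stores a relative offset, so the same logical jump encodes differently whenever the surrounding code moves or changes size. A minimal sketch, assuming the x86 short form of JZ (opcode 0x74 plus a signed 8-bit displacement). The addresses below are made up for illustration.

```python
def encode_jz_short(instr_addr: int, target_addr: int) -> bytes:
    """Encode an x86 short JZ located at instr_addr that jumps to target_addr."""
    # The displacement is measured from the end of the 2-byte instruction.
    disp = target_addr - (instr_addr + 2)
    if not -128 <= disp <= 127:
        raise ValueError("target out of range for a short jump")
    return bytes([0x74, disp & 0xFF])


if __name__ == "__main__":
    # Same jump, but the target sits 4 bytes further away in the second build.
    print(encode_jz_short(0x1000, 0x1010).hex(" "))  # 74 0e
    print(encode_jz_short(0x1000, 0x1014).hex(" "))  # 74 12
```

The opcode byte stays the same, but the operand byte changes, which is enough to break an exact byte match.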

[8/20] At this point, it's useful to obtain multiple samples of the same malware, in order to check that the remaining selected bytes are the same between samples.

With #qakbot, this value (red) does change between samples. It's important to account for this in the final rule.

[9/20] The bytecodes are easily obtained using #Ghidra.

Highlight-> Right-Click -> Copy Special -> Byte String.

This copies the highlighted code in a format that can be used by #Yara.

[10/20] The bytes can then be pasted directly into a #Yara template.

I'll keep the rule as minimal as possible to demonstrate the concept.

(IRL - Filters would be added to improve performance)
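As a rough sketch of the paste step, the Python below turns a copied byte string into a minimal Yara rule. It assumes the copied text is space-separated hex pairs. The rule name, the `$hash_ops` identifier and the bytes are placeholders of mine, not the real Qakbot sequence.

```python
def to_yara_hex(byte_string: str) -> str:
    """Convert space-separated hex pairs into a Yara hex string."""
    tokens = byte_string.split()
    for tok in tokens:
        if len(tok) != 2 or any(c not in "0123456789abcdefABCDEF" for c in tok):
            raise ValueError(f"not a hex byte: {tok!r}")
    return "{ " + " ".join(tok.upper() for tok in tokens) + " }"


def render_rule(name: str, byte_string: str) -> str:
    """Render a bare-bones rule with one hex string and no extra filters."""
    return (
        f"rule {name}\n"
        "{\n"
        "    strings:\n"
        f"        $hash_ops = {to_yara_hex(byte_string)}\n"
        "    condition:\n"
        "        $hash_ops\n"
        "}\n"
    )


if __name__ == "__main__":
    # Placeholder bytes for illustration only.
    print(render_rule("demo_hash_ops", "8b c1 c1 e8 04 33 c2 83 e0 0f"))
```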

[11/20] Running the rule from there, it successfully finds the original sample. But other related samples (3 others) in the same folder remain undetected.

[11/20] This is due to the issue mentioned, where bytes unrelated to mathematical operations can differ between similar samples.

Using #Ghidra to compare two samples from the same Qakbot campaign, there are minor differences that are enough to break the original Yara rule.

[12/20] To correct this, wildcards can be added to the bytes that differ between samples.

An example of this can be seen below. The new Yara rule is *mostly* the same, but with a few wildcards (??) added where the bytes differed.
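The wildcard step can also be scripted. A minimal sketch, assuming you have already extracted the same-length byte sequence from each sample: any position where the samples disagree becomes `??`. The function name and sample bytes are hypothetical.

```python
from typing import Sequence


def wildcard_pattern(samples: Sequence[bytes]) -> str:
    """Build a Yara hex string, replacing bytes that differ between samples with ??."""
    if not samples:
        raise ValueError("need at least one sample")
    length = len(samples[0])
    if any(len(s) != length for s in samples):
        raise ValueError("all byte sequences must be the same length")
    tokens = []
    for column in zip(*samples):
        # Keep the byte if every sample agrees at this position, else wildcard it.
        tokens.append(f"{column[0]:02X}" if len(set(column)) == 1 else "??")
    return "{ " + " ".join(tokens) + " }"


if __name__ == "__main__":
    a = bytes.fromhex("8bc1c1e804")  # placeholder bytes from "sample A"
    b = bytes.fromhex("8bc1c1e80c")  # placeholder bytes from "sample B"
    print(wildcard_pattern([a, b]))  # { 8B C1 C1 E8 ?? }
```

This only handles substitutions. If a sample has inserted or removed bytes, the sequences need to be aligned by hand first.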

[13/20] With the new changes saved, the rule can be re-run and multiple samples are now detected.
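To see why the wildcards fix the missed detections, here is a simplified stand-in for Yara's hex matching written with Python's `re` module. It is not how Yara works internally and it only understands plain bytes and `??`. The sample buffers are made up.

```python
import re


def hex_pattern_to_regex(pattern: str):
    """Compile a Yara-style hex string (plain bytes and ?? only) to a bytes regex."""
    tokens = pattern.strip("{} \n").split()
    parts = []
    for tok in tokens:
        if tok == "??":
            parts.append(b".")  # any single byte
        else:
            parts.append(re.escape(bytes.fromhex(tok)))
    # DOTALL so that "." also matches the 0x0A byte.
    return re.compile(b"".join(parts), re.DOTALL)


if __name__ == "__main__":
    rx = hex_pattern_to_regex("{ 8B C1 C1 E8 ?? }")
    samples = {
        "sample_a": b"\x00\x8b\xc1\xc1\xe8\x04\x90",
        "sample_b": b"\x8b\xc1\xc1\xe8\x0c",
        "unrelated": b"\x8b\xc1\xc1\xe9\x04",
    }
    for name, data in samples.items():
        print(name, bool(rx.search(data)))
```

Both placeholder samples match despite the differing last byte, while the unrelated buffer does not.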

[14/20] Running the rule against running processes identifies where #Qakbot has successfully injected itself.

Qakbot likes to inject into OneDriveSetup.exe, so this is likely a True Positive.

(Very useful when combined with #DFIR tooling)

[15/20] Qakbot was used for this example, but the concept works well across other malware families.

#IcedID is a good example where the unique encryption can be used for detection.

[16/20] Now for a few notable and important caveats....

{1}: This technique is generally only effective against unpacked payloads or in situations where the malware is already executing in memory.

Detecting packed files on disk will typically require a different approach.

[17/20]

{2} - Technically the same approach can be applied to the bytecodes of unpacking routines used by loaders. But that tends to be more complex and a topic for another day.

[18/20]

{3}: The final rule has been kept simple to demonstrate the concept.

Although technically accurate for the use case, it lacks the filters needed to run quickly without consuming large amounts of CPU. This would likely need to be adjusted in a production environment.
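One plausible form such filters could take is a cheap pre-check before the expensive byte search. In Yara itself this would typically be conditions such as `uint16(0) == 0x5A4D` and a `filesize` limit. The Python sketch below mirrors that idea. The 5 MB limit is an arbitrary assumption of mine, and this style of check suits files on disk more than process memory.

```python
MAX_SIZE = 5 * 1024 * 1024  # hypothetical size cap, tune for your environment


def worth_scanning(data: bytes, max_size: int = MAX_SIZE) -> bool:
    """Cheap pre-filter: only scan reasonably small buffers that start with 'MZ'."""
    return len(data) <= max_size and data[:2] == b"MZ"


if __name__ == "__main__":
    print(worth_scanning(b"MZ\x90\x00"))  # True: PE-style header, small buffer
    print(worth_scanning(b"\x7fELF"))     # False: not an MZ header
```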

[19/20]

{4} - Malware authors can avoid this type of detection by updating encryption/hashing logic with each sample (or by introducing randomised junk instructions between the "real" code).

[20/20]

{5} - In-memory masking (like Foliage) will defend against this type of detection very well.

I'm yet to see this implemented by the major malware families, but it has been implemented (very effectively) by #HavocC2 and #BruteRatel.

[21] The base rule can be found on my Github here.

github.com/embee-research…
