Techniques for String Decryption in macOS Malware with Radare2

Techniques for String Decryption in macOS Malware with Radare2
2021-10-13 02:52:34 Author: www.sentinelone.com(查看原文) 阅读量:36 收藏

If you’ve been following this series so far, you’ll have a good idea how to use radare2 to quickly triage a Mach-O binary statically and how to move through it dynamically to beat anti-analysis attempts. But sometimes, no matter how much time you spend looking at disassembly or debugging, you’ll hit a roadblock trying to figure out your macOS malware sample’s most interesting behavior because much of the human-readable ‘strings’ have been rendered unintelligible by encryption and/or obfuscation.

That’s the bad news; the good news is that while encryption is most definitely hard, decryption is, at least in principle, somewhat easier. Whatever methods are used, at some point during execution the malware itself has to decrypt its code. This means that, although there are many different methods of encryption, most practical implementations are amenable to reverse engineering given the right conditions.

Sometimes, we can do our decryption statically, perhaps emulating the malware’s decryption method(s) by writing our own decryption logic(s). Other times, we may have to run the malware and extract the strings as they are decrypted in memory. We’ll take a practical look at using both of these techniques in today’s post through a series of short case studies of real macOS malware.

First, we’ll look at an example of AES 128 symmetric encryption used in the recent macOS.ZuRu malware and show you how to quickly decode it; then we’ll decrypt a Vigenère cipher used in the WizardUpdate/Silver Toucan malware; finally, we’ll see how to decode strings dynamically, in-memory while executing a sample of a notorious adware installer.

Although we cannot cover all the myriad possible encryption schemes or methods you might encounter in the wild, these case studies should give you a solid basis from which to tackle other encryption challenges. We’ll also point you to some further resources showcasing other macOS malware decryption strategies to help you expand your knowledge.

For our case studies, you can grab a copy of the malware samples we’ll be using from the following links:

Don’t forget to use an isolated VM for all this work: these are live malware samples and you do not want to infect your personal or work device!

Breaking AES Encryption in macOS.ZuRu

Let’s begin with a recent strain of new macOS malware dubbed ‘macOS.ZuRu’. This malware was distributed inside trojanized applications such as iTerm, MS Remote Desktop and others in September 2021. Inside the malware’s application bundle is a Frameworks folder containing the malicious libcrypto.2.dylib. The sample we’re going to look at has the following hash signatures:

md5 b5caf2728618441906a187fc6e90d6d5
sha1 9873cc929033a3f9a463bcbca3b65c3b031b3352
sha256 8db4f17abc49da9dae124f5bf583d0645510765a6f7256d264c82c2b25becf8b

Let’s load it into r2 in the usual way (if you haven’t read the earlier posts in this series, catch up here and here), and consider the simple sequence of reversing steps illustrated in the following images.

Getting started with our macOS.ZuRu sample

As shown in the image above, after loading the binary, we use ii to look at the imports, and see among them CCCrypt (note that I piped this to head for display purposes). We then do a case insensitive search on ‘crypt’ in the functions list with afll~+crypt.

If we add [0] to the end of that, it gives us just the first column of addresses. We can then do a for-each over those using backticks to pipe them into axt to grab the XREFS. The entire command is:

> axt @@=`afll~crypt[0]`

The result, as you can see in the lower section of the image above, shows us that the malware uses CCCrypt to call the AESDecrypt128 block cipher algorithm.

AES128 requires a 128-bit key, which is the equivalent of 16 bytes. Though there’s a number of ways that such a key could be encoded in malware, the first thing we should do is a simple check for any 16 byte strings in the binary.

To do that quickly, let’s pipe the binary’s strings through awk and filter on the len column for ‘16’: That’s the fourth column in r2’s iz output. We’ll also narrow down the output to just cstrings by grepping on ‘string’, so our command is:

> iz | awk ‘$4==16’ | grep string

We can see the output in the middle section of the following image.

Filtering the malware’s strings for possible AES 128 keys

We got lucky! There’s two occurrences of what is obviously not a plain text string. Of course, it could be anything, but if we check out the XREFS we can see that this string is provided as an argument to the AESDecrypt method, as illustrated in the lower section of the above image.

All that remains now is to find the strings that are being deciphered. If we get the function summary of AESDecrypt from the address shown in our last command, 0x348b, it reveals that the function is using base64 encoded strings.

> pds @ 0x348b

Grabbing a function summary in r2 with the `pds` command

Our search returns three hits that look like good candidates, but the proof is in the pudding! What we have at this point is candidates for:

the encryption algorithm – AES128
the key – “quwi38ie87duy78u”
three ciphers – “oPp2nG8br7oIB+5wLoA6Bg==, …”

All we need to do now is to run our suspects through the appropriate decryption routine for that algorithm. There are online tools such as Cyber Chef that can do that for you, or you can find code for most popular algorithms for your favorite language from an online search. Here, we implemented our own rough-and-ready AES128 decryption algorithm in Go to test out our candidates:

A simple AES128 ECB decryption algorithm implemented in Go

We can pipe all the candidate ciphers to file from within r2 and then use a shell one-liner in a separate Terminal window to run each line through our Go decryption script with the candidate key.

Revealing the strings in clear text with our Go decrypter

And voila! With a few short commands in r2 and a bash one-liner, we’ve decrypted the strings in macOS.ZuRu and found a valuable IoC for detection and further investigation.

Decoding a Vigenère Cipher in WizardUpdate Malware

In our second case study, we’re going to take a look at the string encryption used in a recent sample of WizardUpdate malware. The sample we’ll look at has the following hash signatures:

md5 0c91ddaf8173a4ddfabbd86f4e782baa
sha1 3c224d8ad6b977a1899bd3d19d034418d490f19f
sha256 73a465170feed88048dbc0519fbd880aca6809659e011a5a171afd31fa05dc0b

We’ll follow the same procedure as last time, beginning with a case insensitive search of functions with “crypt” in the name, filtering the results of that down to addresses, and getting the XREFS for each of the addresses. This is what it looks like on our new sample:

Finding our way to the string encryption code from the function analysis

We can see that there are several calls from main to a decrypt function, and that function itself calls sym.decrypt_vigenere.

Vigenère is a well-known cipher algorithm which we will say a bit more about shortly, but for now, let’s see if we can find any strings that might be either keys or ciphers.

Since a lot of the action is happening in main, let’s do a quick pds summary on the main function.

Using `pds` to get a quick summary of a function

There are at least two strings of interest. Let’s take a better look by leveraging r2’s afns command, which lists all strings associated with the current function.

r2’s `afns` can help you isolate strings in a function

That gives us a few more interesting looking candidates. Given its length and form, my suspicion at this point is that the “LBZEWWERBC” string is likely the key.

We can isolate just the strings we want by successive filtering. First, we get just the rows we want:

> afns~:1..5

And then grab just the last column (ignoring the addresses):

> afns~:1..5[2]

Then using sed to remove the “str.” prefix and grep to remove the “{MAID}” string, we end up with:

Access to the shell in r2 makes it easy to isolate the strings of interest

As before, we can now pipe these out to a “ciphers” file.

> afns~:1..5[2] | grep -v MAID | sed ‘s/str.//g’ > ciphers

Let’s next turn to the encryption algorithm. Vigenère has a fascinating history. Once thought to be unbreakable, it’s now considered highly insecure for cryptography. In fact, if you like puzzles, you can decrypt a Vigenère cipher with a manual table.

The Vigenère cipher was invented before computers and can be solved by hand

One of the Vigenère cipher’s weaknesses is that it’s possible to discern patterns in the ciphertext that can reveal the length of the key. That problem can be avoided by encrypting a base64 encoding of the plain text rather than the plain text itself.

Now, if we jump back into radare2, we’ll see that WizardUpdate does indeed decode the output of the Vigenère function with a base64 decoder.

WizardUpdate malware uses base64 encoding either side of encrypting/decrypting

There is one other thing we need to decipher a Vigenère cipher aside from the key and ciphertext. We also need the alphabet used in the table. Let’s use another r2 feature to see if it can help us find it. Radare2’s search function, /, has some crypto search functionality built in. Use /c? to view the help on this command.

Search for crypto materials with built-in r2 commands

The /ck search gives us a hit which looks like it could function as the Vigenère alphabet.

OK, it’s time to build our decoder. This time, I’m going to adapt a Python script from here, and then feed it our ciphers file just as before. The only differences are I’m going to hardcode the alphabet in the script and then run the output through base64. Let’s see how it looks.

Decoding the strings returns base64 as expected

So far so good. Let’s try running those through base64 -D (decode) and see if we get our plain text.

Our decoder returns gibberish after we try to decode the base64

Hmm. The script runs without error, but the final decoded base64 output is gibberish. That suggests that while our key and ciphers are correct, our alphabet might not be.

Returning to r2, let’s search more widely across the strings with iz~string

Finding cstrings in the TEXT section with r2’s `~` filter

The first hit actually looks similar to the one we tried, but with fewer characters and a different order, which will also affect the result in a Vigenère table. Let’s try again using this as the hardcoded alphabet.

Decoding the WizardUpdate’s encrypted strings back to plain text

Success! The first cipher turns out to be an encoding of the system_profiler command that returns the device’s serial number, while the second contains the attacker’s payload URL. The third downloads the payload and executes it on the victim’s device.

Reading Encrypted Strings In-Memory

Reverse engineering is a multi-faceted puzzle, and often the pieces drop into place in no particular order. When our triage of a malware sample suggests a known or readily identifiable encryption scheme has been used as we saw with macOS.ZuRu and WizardUpdate, decrypting those strings statically can be the first domino that makes the other pieces fall into place.

However, when faced with an incalcitrant sample on which the authors have clearly spent a great deal of time second-guessing possible reversing moves, a ‘cheaper’ option is to detonate the malware and observe the strings as they are decrypted in memory. Of course, to do that, you might need to defeat some anti-analysis and anti-debugging tricks first!

In our third case study, then, we’re going to take a look at a common adware installer. Adware is big business, employs lots of professional coders, and produces code that is every bit as crafty as any sophisticated malware you’re likely to come across. If you spend anytime dealing with infected Macs, coming across adware is inevitable, so knowing how to deal with it is essential.

md5 cfcba69503d5b5420b73e69acfec56b7
sha1 e978fbcb9002b7dace469f00da485a8885946371
sha256 43b9157a4ad42da1692cfb5b571598fcde775c7d1f9c7d56e6d6c13da5b35537

Let’s dump this into r2 and see what a quick triage can tell us.

Well, not much! If we print the disassembly for the main function with pdf @main, we see a mass of obfuscated code.

Lots of obfuscated code in this adware installer

However, the only calls here are to system and remove, as we saw from the function list. Let’s quit and reopen in r2’s debugger mode (remember: you may need to chmod the sample and remove any code signature and extended attributes as explained here).

sudo r2 -AA -d 43b9157a4ad42da1692cfb5b571598fcde775c7d1f9c7d56e6d6c13da5b35537

Let’s find the entrypoint with the ie command. We’ll set a breakpoint on that and then execute to that point.

Now that we’re at main, let’s break on the system call and take a look at the registers. To do that, first get the address of the system flag with

> f~system

Then set the breakpoint on the address returned with the db command. We can continue execution with dc.

Setting a breakpoint on the system call and continuing execution

Note that in the image above, our first attempt to continue execution results in a warning message and we actually hit our main breakpoint again. If this happens, repeating the dc command should get you past the warning. Now we can look at all the registers with drr.

At the rdi register, we can see the beginning of the decrypted string. Let’s see the rest of it.

The clear text is revealed in the rdi register

Ah, an encoded shell script, typical of Bundlore and Shlayer malware. One of my favorite things about r2 is how you can do a lot of otherwise complex things very easily thanks to the shell integration. Want to pretty-print that script? Just pipe the same command through sed from right within r2.

> ps 2048 @rdi | sed ‘s/;/\n/g’

We can easily format the output by piping it through the `sed` utility

More Examples of macOS String Decryption Techniques

WizardUpdate and macOS.ZuRu provided us with some real-world malware samples where we could use the same general technique: identify the encryption algorithm in the functions table, search for and isolate the key and ciphers in the strings, and then find or implement an appropriate decoding algorithm.

Some malware authors, however, will implement custom encryption and decryption schemes and you’ll have to look more closely at the code to see how the decryption routine works. Alternatively, where necessary, we can detonate the code, jump over any anti-analysis techniques and read the decrypted strings directly from memory.

If all this has piqued your interest in string encryption techniques used in macOS malware, then you might like to check out some or all of the following for further study.

EvilQuest, which we looked at in the previous post, is one example of malware that uses a custom encryption and decryption algorithm. SentinelLabs broke the encryption statically, and then created a tool based on the malware’s own decryption algorithm to decrypt any files locked by the malware. Fellow macOS researcher Scott Knight also published his Python decryption routine for EvilQuest, which is worth close study.

Adload is another malware that uses a custom encryption scheme, and for which researchers at Confiant also published decryption code.

Notorious adware dropper platforms Bundlore and Shlayer use a complex and varying set of shell obfuscation techniques which are simple enough to decode but interesting in their own right.

Likewise, XCodeSpy uses a simple but quite effective shell obfuscation trick to hide its strings from simple search tools and regex pattern matches.

Conclusion

In this post, we’ve looked at a variety of different encryption techniques used by macOS malware and how we can tackle these challenges both statically and dynamically. If you haven’t checked out the previous posts in this series, have a look Part 1 and Part 2. I hope you’ll join us for the next post in this series as we continue to look at common challenges facing macOS malware researchers.

文章来源: https://www.sentinelone.com/labs/techniques-for-string-decryption-in-macos-malware-with-radare2/
如有侵权请联系:admin#unsafe.sh