Phishing campaigns are now commonplace for IT professionals. They continue to be the preferred way to attack an enterprise or individuals, taking advantage of end users and the inherent latency of AV signatures. While themes change to reflect relevant regional events or various holidays, and attack methods continue to evolve, certain techniques tend to be favored. The use of macros (containing malicious code) in documents remains an ongoing concern for most network defenders.
This document demonstrates how to analyze an Excel Document with embedded macros that was used to download a malicious payload. In this scenario the payload that was downloaded was a Dridex variant, which we will also analyze. We will also cover how the activity associated with malicious macros and the subsequent malware it downloads can be easily detected and analyzed using RSA’s ECAT.
Recently a client approached RSA Incident Response (IR) with some questions regarding a series of phishing emails that used attachments (specifically Word and Excel documents) as a first stage downloader, which ultimately delivered Dridex variants onto the newly infected systems. The client felt that they had addressed some concerns at their perimeter, which they anticipated would block some of the attempts aimed at larger groups or internal mailing lists. However, they were concerned that some of these phishing attempts could slip past their rules and make it through their commercial email inspection devices. Their question was therefore two-fold: how to detect whether someone on the network opens one of these malicious attachments, and how to analyze some of these attachments with open source tools. The research and analysis RSA IR conducted in response to this request is provided in this article to offer insight for other organizations with similar concerns.
This organization, like many, is in the process of advancing or upgrading its existing detection and response framework by increasing the number of analysts (as well as their capabilities) and providing them with better analytical tools. This client has a list of tools and products currently in procurement; for the time being, however, they primarily use an assortment of open source tools coupled with some commercial tools. With that in mind, we analyzed the samples with tools that they typically use (hex editors, oledump, OllyDbg/Immunity Debugger, IDA, Python), which is covered in the Sample Analysis portion of this article.
The first part of their request (how do we detect this type of activity and tell if someone on the network opens a malicious document that slipped through?) can be tackled from many angles. First we discussed what the client was currently doing to detect samples. Their answer was what we observe in many organizations, which is looking for other organizations’ known Indicators of Compromise (IOCs). This includes getting lists of IP addresses, URLs, and hashes from various private and open source distributions and blacklisting or blocking matches in their environment. They also employed typical Anti-Virus products, IDS/IPS, and email filters. Organizations relying solely on this type of ‘preventive’ strategy are very likely to experience problems. These devices are useful and have a place in a larger security strategy, but they should be part of a broader approach. Looking for other organizations’ known IOCs can prove beneficial, but leaves a company blind to what is unknown in its environment. This client was a Security Analytics client, but unfortunately was not an ECAT customer. At its core ECAT does not use signatures, but instead focuses on static characteristics and dynamic behavior to identify malicious activity, regardless of the Trojan or exploit. In addition to quickly looking for known hashes, file metadata, URLs, and IP addresses, ECAT can easily highlight malicious activity regardless of what specific malware is being used. ECAT’s analytics are highlighted in the section below.
To demonstrate how to hunt for this type of activity using ECAT, we used the same sample that is manually analyzed in the Sample Analysis section. We set up an ECAT testing environment and documented the current state, which is displayed in Figure 1. Notice that ECAT has provided a system score of 1 (where the range is 0-1024, 1024 being the most severe) for the current state.
Figure 1: Initial ECAT Testing Environment
We then opened the Excel document that ultimately dropped the Dridex variant. Figure 2 shows that the system score changed from 1 to 285 after the malicious carrier file was opened. The highlighted data shows how ECAT tracks the malicious actions. Item 1 shows the xls document being opened by the Excel process itself. Approximately 10 seconds later, the Excel process is observed creating an executable file and then executing the malicious Dridex variant (highlighted as Item 2). The data in Figure 3 highlights how ECAT natively alerts on certain applications (like Word or Excel) writing executable files to disk. The lower two panes show the executable file that was created and triggered the alert, and the system where the file resides.
Figure 2: Post Execution System Score
Figure 3: ECAT Identification
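The behavior described above (an Office application writing an executable to disk) can be expressed as a simple rule. The following is a minimal sketch only: the event dictionaries, their field names, and the process and extension lists are hypothetical illustrations and do not represent ECAT's data model or its detection logic.

```python
# Hypothetical event records; NOT ECAT's data model or detection logic.
OFFICE_PROCESSES = {"excel.exe", "winword.exe", "powerpnt.exe"}
EXECUTABLE_EXTENSIONS = (".exe", ".dll", ".scr")


def is_suspicious_write(event):
    """Return True when an Office process writes a file that looks executable."""
    if event.get("action") != "write_file":
        return False
    process = event.get("process", "").lower()
    target = event.get("target", "").lower()
    # Windows executables begin with the two-byte "MZ" signature.
    starts_with_mz = event.get("first_bytes", b"")[:2] == b"MZ"
    looks_executable = target.endswith(EXECUTABLE_EXTENSIONS) or starts_with_mz
    return process in OFFICE_PROCESSES and looks_executable


def find_suspicious_writes(events):
    """Filter a list of event records down to the suspicious writes."""
    return [event for event in events if is_suspicious_write(event)]
```

Checking the file's leading bytes as well as its extension is a deliberate choice in this sketch, since a dropper is free to pick any file name for its payload.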
Figure 4, below, depicts how ECAT tracks network communications. Specifically shown here is the initial Excel process communicating with the URL hosting the second-stage malware. Seconds later you can observe where the second stage malware (a Dridex variant named taxanom.exe) began communicating with one of its command and control nodes.
Figure 4: Malicious Network Activity
Figure 5, below, further highlights how ECAT tracks all of the network communications as the taxanom.exe sample continues to run on the now infected system.
Figure 5: Expanded Network Activity
The sample (one of many they had received lately) was an XLS document; the metadata for the file is listed below. While this sample may not have been readily detectable using conventional AV at the time of discovery, it has since been submitted more than 100 times to VirusTotal (and is detected by over 30 of the AV engines).
One of the first steps we took was to analyze the file with oledump.py, which is one of many great tools for analyzing OLE files. Looking at the streams in the file, there were seven that contained VBA macro code (designated with an M or m; specifically streams 7, 8, 9, 20, 21, 22, and 23).
20: m 976 '_VBA_PROJECT_CUR/VBA/\xd0\x9b\xd0\xb8\xd1\x81\xd1\x821'
21: m 976 '_VBA_PROJECT_CUR/VBA/\xd0\x9b\xd0\xb8\xd1\x81\xd1\x822'
22: m 976 '_VBA_PROJECT_CUR/VBA/\xd0\x9b\xd0\xb8\xd1\x81\xd1\x823'
23: M 1488 '_VBA_PROJECT_CUR/VBA/\xd0\xad\xd1\x82\xd0\xb0\xd0\x9a\xd0\xbd\xd0\xb8\xd0
Table 2: Oledump Output
All of the streams containing Macros were decompressed (using oledump) and redirected to different files on disk for further analysis.
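When many samples need to be triaged, picking the macro streams out of the oledump listing can be scripted. The sketch below assumes the listing has been captured as text in the layout shown in Table 2 (index, indicator, size, name); the function name macro_streams is our own and is not part of oledump.

```python
import re

# Matches listing lines such as: 20: m 976 '_VBA_PROJECT_CUR/VBA/...'
# Lines without an M/m indicator (non-macro streams) do not match.
STREAM_LINE = re.compile(r"^\s*(\d+):\s+([Mm])\s+(\d+)\s+(.*)$")


def macro_streams(listing):
    """Return (index, indicator, size, name) for each stream flagged M or m."""
    found = []
    for line in listing.splitlines():
        match = STREAM_LINE.match(line)
        if match:
            index, indicator, size, name = match.groups()
            found.append((int(index), indicator, int(size), name.strip()))
    return found
```

Each index returned can then be passed to oledump's stream selection option (-s) together with its VBA decompression option (-v) to write the decompressed macro source to a file.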
While manually examining the decompressed streams, Stream 23 stood out. There are a couple of interesting artifacts here. The first is the VB_Name attribute, which is in Russian and translates to “This Book”. The second is the Private Sub call, which is highlighted in the red box in Figure 6. The Private Sub Workbook_Open() will call the CheckTaxa, XMLTaxoTree, Checkheaders, and ProcessCarriageReturns functions, initiating the malicious functionality of the embedded macros.
Figure 6: Stream 23
The XMLTaxoTree, Checkheaders, and ProcessCarriageReturns functions were all present in the Module1 stream, which was previously decompressed and written to a file. Figure 7, below, shows the relevant code in Module1 file. The file itself contains over 400 lines of code, the majority of which was not relevant to the malicious outcome of executing the file. This extra code was most likely included to make analysis of the file more difficult. The small rectangular blocks in Figure 7 indicate where lines of code were collapsed to make viewing easier for this article.
Figure 7: Module 1
The CheckTaxa function was present in the Module3 stream, which was also previously decompressed and written to a file. Figure 8, below, shows the relevant code in Module3 file. Again the file contained over 650 lines of unrelated code, most likely to make analysis harder. The small rectangular blocks in Figure 8 indicate where lines of code were collapsed to make viewing easier for this article.
Figure 8: Module3
This malicious code contained several interesting objects, which are denoted with arrows and corresponding numbers. Item 1, in the figure above, is setting the variable CheckTaxaTa to CreateObject(“Microsoft.XMLHTTP”). Item 2 is setting the variable shellApp to the instantiated Shell object, which will be used later. Items 3 and 4 work together to instantiate the Shell object, and then read the environment variable for the User’s Temp location. Item 5 is setting the tempFile variable, as a string by combining the User’s temp location (from Item 4) with the string \taxanom.exe, resulting in a string like C:\Users\<username>\AppData\Local\Temp\taxanom.exe. This combination of items will be used to download the second stage malware and subsequently write the file to disk at taxanom.exe in the current user’s Temp folder.
All of these Items are called through a series of GoTo commands, the last of which is responsible for calling the CheckTaxonomyWithAutoCorrection function, which is highlighted in the red box in Figure 9.
The CheckTaxonomyWithAutoCorrection function was present in the Module2 stream, which was also previously decompressed and written to a file. Figure 9, below, shows the relevant code in the Module2 file. Again the file contained over 400 lines of unrelated code, most likely to make analysis harder. The CheckTaxonomyWithAutoCorrection function, highlighted in red, is ultimately responsible for creating an array, which is a lightly encoded string of characters that will be used to download the second stage malware. In the variants that we analyzed, the array was passed to another function (in this sample XMLMetadata1) with a parameter, which is also used in the URL decoding process. The data highlighted in yellow shows the decoding function (XMLMetadata1) being called with the needed parameters, and then the data highlighted in orange shows the decoding function.
Figure 9: Module2
The decoding functions varied by sample, but all of the variants took an integer argument, which was used in combination with other simple arithmetic operations to decode each array value to an integer, which was then converted to an ASCII character. The decoding function from Figure 9 is represented in Figure 10, where x is an integer value from the haoami array at line 214 and the argument being passed is the integer 59 from line 218.
x - (25 × argument) - 7700 - 20 - 2
Figure 10: Decoding Function
9301 - (25 × 59) - 7722 = 104
Figure 11: Decoding Function Simplified
The result from Figure 11 is the integer value 104, which when converted to a character is the letter “h”, which is the beginning of the http portion of the decoded URL.
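The arithmetic in Figures 10 and 11 can be reproduced in a few lines of Python. This is a minimal sketch of this one decoding routine, not the plugin provided with the article. Only the value 9301 and the argument 59 come from the sample as described above; the remaining three array values are our own, chosen so that they decode to the rest of "http" for illustration.

```python
def decode_value(x, argument):
    """Apply the Figure 10 formula to a single array value."""
    return x - (25 * argument) - 7700 - 20 - 2


def decode_array(values, argument):
    """Decode every array value and join the resulting characters."""
    return "".join(chr(decode_value(x, argument)) for x in values)


# 9301 and 59 are taken from the sample (Figure 11).
print(decode_value(9301, 59))  # 104, the character code for "h"

# Illustrative values only (not taken from the sample's array).
print(decode_array([9301, 9313, 9313, 9309], 59))  # http
```

Because the constants and the multiplier differed between variants, any reuse of this sketch would need those values adjusted per sample.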
To automate the task of locating and decoding the URL from which the second stage malware will be downloaded, we wrote an oledump plugin that can locate all of the data needed to evaluate the decoding function and produce the decoded string. The output of the plugin can be seen below in Table 3. The decoded string is highlighted in blue, and the error message is highlighted in yellow. As the routine changes or other encoding variants emerge, the plugin will need to be updated to search for those variants as well. The oledump plugin, a Python equivalent that works standalone on exported streams (phishing_dropper.py, script output displayed in Table 4), and a Yara signature are all provided with this article.
In this sample the file 9o8jhdw.exe is downloaded from the URL, highlighted above in blue, and saved as taxanom.exe. This particular sample was a Dridex variant, which could then in turn be analyzed further. There are numerous reports openly available regarding Dridex variant analysis, and the analysis covered in this article was focused on the current capabilities of the client.
For analysis of this sample, Immunity Debugger was used. Drawing on a previous understanding of Dridex variants, we loaded the sample into Immunity and created breakpoints on calls to VirtualAlloc and VirtualAllocEx (Alt + E to view the executable modules, then select kernel32.dll and press Ctrl + N to view the available names). The analyst can then simply place breakpoints (F2) on the desired APIs, highlighted in the red boxes in Figure 12 below.
Figure 12: VirtualAlloc Breakpoints
Once the breakpoints are set, the analyst can execute the code. In this sample VirtualAllocEx is called twice, allocating memory at the addresses 0x0E0000 and 0x0F0000, which can be seen in Figures 13 and 14.
Figure 13: VirtualAllocEx Breakpoint
Figure 14: VirtualAllocEx Breakpoint
The Dridex variant will then write an executable to the previously allocated memory at 0x0F0000 (Figure 15) and code to the 0x0E0000 location (Figure 16), which will be used to further unpack or deobfuscate malicious code.
Figure 15: 0xF0000 Data
Figure 16: 0xE0000 Data
At this point an analyst could dump or copy out the executable for further analysis or continue to analyze the sample in the debugger until the C2 data is decoded.
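If the allocated region is dumped to a file, a quick check for an embedded executable is to look for the DOS and PE signatures. The sketch below assumes the dump has already been read into a bytes object; the function name find_pe_offsets is our own. It relies only on the standard PE layout: the "MZ" signature at the start of the image, and a 4-byte little-endian offset at 0x3C that points to the "PE\0\0" signature.

```python
import struct


def find_pe_offsets(data):
    """Return offsets in data where a plausible PE image begins."""
    offsets = []
    start = data.find(b"MZ")
    while start != -1:
        # The DOS header is 0x40 bytes; e_lfanew sits at offset 0x3C.
        if start + 0x40 <= len(data):
            (e_lfanew,) = struct.unpack_from("<I", data, start + 0x3C)
            pe_at = start + e_lfanew
            # An out-of-range offset yields a short slice and fails the check.
            if data[pe_at:pe_at + 4] == b"PE\x00\x00":
                offsets.append(start)
        start = data.find(b"MZ", start + 1)
    return offsets
```

Each offset returned marks a candidate image that can be carved out of the dump for static analysis. The check is intentionally loose, so false positives are possible and each hit should be reviewed.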
File Name – 988271023-PRCL.xls, MD5 – 8c05c5cddd26b64d8461d5aa40a401eb
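The MD5 listed above can be used to confirm that a file collected from a mailbox or endpoint is the same sample. A minimal sketch using Python's standard hashlib module follows; the function names are our own, and the file is read in chunks so that large attachments need not be held in memory.

```python
import hashlib

# MD5 of 988271023-PRCL.xls as listed in this article.
KNOWN_MD5 = "8c05c5cddd26b64d8461d5aa40a401eb"


def md5_of_file(path, chunk_size=65536):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sample(path):
    """Return True when the file's MD5 equals the sample's MD5."""
    return md5_of_file(path) == KNOWN_MD5
```

A hash match only identifies this exact file; as discussed earlier, repackaged variants will have different hashes, so this check complements rather than replaces behavioral detection.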