This blog post is focused on triaging malicious Microsoft Office documents. Specifically, we are analyzing an older but extremely efficient Rich Text Format (RTF) exploit that masquerades as a Microsoft Word Document.
The exploit in question targets Microsoft’s Security Bulletin MS12-027 (http://technet.microsoft.com/en-us/security/bulletin/ms12-027) and is based on CVE-2012-0158 (http://www.cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2012-0158). Vulnerable versions of Microsoft Office include Office 2003, 2007 and 2010.
During a recent engagement, RSA’s Incident Response Team gathered malware samples that exploited this vulnerability. In order to share our analysis techniques while protecting the client’s anonymity, we’ll be analyzing a sample from VirusTotal that shares very similar code to the samples found in the client environment. The details about this sample, “msf.doc”, can be found in figure 1.
Sample detail:
File Name: msf.doc
File Size: 10296 bytes
MD5: 41a38ec709daf66b7b5e133991120268
SHA1: 70d494c3826b907a485bd70e70b93b25dc4e37e8
File Type: RTF
MIME Type: text/rtf
Warning: Unspecified RTF encoding. Will assume Latin.
VirusTotal detail:
Figure 1: VirusTotal submission detail of our sample
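The size and hashes listed above can be reproduced for any sample before looking it up on VirusTotal. The following is a minimal sketch using only the Python standard library; the function names are our own for illustration and are not part of any tool used in this post.

```python
import hashlib


def fingerprint(data: bytes) -> dict:
    """Return the size and common hashes for a sample held in memory."""
    return {
        "size": len(data),
        "md5": hashlib.md5(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
    }


def fingerprint_file(path: str) -> dict:
    """Read a file in binary mode and fingerprint its contents."""
    with open(path, "rb") as handle:
        return fingerprint(handle.read())
```

Calling fingerprint_file on a local copy of the sample should return values that can be compared against the details listed above.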
Viewing this file in a hex editor (010 Editor Version 3.2.2 was used during this analysis) confirms the sample is indeed an RTF file. Because Windows associates files with applications by file extension rather than by content, the “.doc” extension causes the file to open in Microsoft Word by default. When we originally encountered the similar sample during Incident Response activities, the attacker targeted vulnerable versions of Microsoft Office, which happened to exist on phished client systems.
Figure 2: RTF header confirmation in a hex editor
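The same header check can be scripted when many files need triage. This is a small sketch, assuming only that an RTF file begins with the five bytes “{\rtf”; the helper name is hypothetical.

```python
RTF_MAGIC = b"{\\rtf"  # bytes 7B 5C 72 74 66


def looks_like_rtf(data: bytes) -> bool:
    """True when the buffer starts with the RTF magic, whatever the file extension says."""
    return data.startswith(RTF_MAGIC)
```

Because the check looks at content rather than the file name, a “.doc” file that is really RTF is identified correctly.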
Now that we have confirmed that we are dealing with a potentially malicious RTF file, we can utilize Frank Boldewin’s OfficeMalScanner (http://www.reconstructer.org/main.html) suite of tools, which includes multiple analysis tools that assist in analyzing malicious Microsoft Office documents. We are going to use “RTFScan”, “OfficeMalScanner” and “MalHost-Setup”. RTFScan scans RTF files for shellcode, and dumps OLE objects and other file containers it discovers for subsequent analysis. OfficeMalScanner has similar functionality to RTFScan, but analyzes Microsoft Office files including Word (doc), Excel (xls), and PowerPoint (ppt). MalHost-Setup wraps shellcode in a host executable so it can be run, and is extremely useful during triage of potentially malicious Microsoft Office documents when shellcode exists. It should be noted that the carving and reconstruction this tool does could also be performed manually with the hex editor of your liking.
Figure 3: The OfficeMalScanner suite of tools is available at: http://www.reconstructer.org/main.html
RTFScan has several switches that can be leveraged during analysis; we utilized the “scan” and “debug” switches. The scan switch looks for shellcode and embedded files, while the debug switch attempts to disassemble shellcode if it is discovered. The syntax (RTFScan.exe msf.doc scan debug) can be seen in the first red box in figure 4.
Figure 4: RTFScan in action
The second red box in figure 4 shows that RTFScan discovered and dumped an embedded OLE (Object Linking and Embedding) document inside the RTF. OLE documents provide a mechanism for Microsoft Office documents to store compound documents from multiple sources allowing Microsoft Office applications (e.g. PowerPoint, Excel, Word) to access this data. Microsoft has a detailed explanation of OLE technology at: http://support.microsoft.com/kb/86008
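To illustrate the idea of carving an embedded OLE document out of an RTF file, here is a rough sketch. It is not RTFScan’s actual algorithm. It assumes the object data is stored as ASCII hex text, possibly wrapped across lines, and that the embedded document starts with the OLE compound file signature D0 CF 11 E0 A1 B1 1A E1. The function name is hypothetical.

```python
import re
from typing import Optional

# OLE compound file signature, written the way it appears as hex text inside RTF.
OLE_MAGIC_HEX = b"d0cf11e0a1b11ae1"


def carve_hex_encoded_ole(rtf: bytes) -> Optional[bytes]:
    """Return the decoded bytes of the first hex-encoded OLE document, or None."""
    # RTF writers wrap hex data across lines, so drop all whitespace first.
    compact = re.sub(rb"\s+", b"", rtf).lower()
    start = compact.find(OLE_MAGIC_HEX)
    if start == -1:
        return None
    # Take the unbroken run of hex digits that begins at the signature.
    run = re.match(rb"[0-9a-f]+", compact[start:]).group(0)
    run = run[: len(run) - (len(run) % 2)]  # keep whole bytes only
    return bytes.fromhex(run.decode("ascii"))
```

The result can be written to disk and handed to other tools, much like the file RTFScan dumps. A real parser would also handle RTF control words embedded in the hex data, which this sketch ignores.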
After utilizing RTFScan to successfully carve “OLE_DOCUMENT__msf__1.bin” from “msf.doc”, OfficeMalScanner needs to be run against the OLE document. Figures 5 and 6 show output from this action. The complete list of OfficeMalScanner’s arguments can be viewed from the usage statement, but for this analysis we are going to focus on the info, scan, brute and debug switches.
Figure 5: OfficeMalScanner info switch against the OLE document carved with RTFScan
Figure 6: OfficeMalScanner scan/brute/debug switches in use against the same OLE document
OfficeMalScanner did not flag anything malicious in the OLE object, but in order to confirm this, we manually inspected the file as well. The embedded file located by RTFScan is an OLE compound file, a container that stores data as named storages and streams (more on OLE can be found here: http://msdn.microsoft.com/en-us/library/dd942557.aspx), so its contents can be extracted with multiple tools. We used 7-Zip to expand the compound OLE object by right-clicking on “OLE_DOCUMENT__msf__1.bin” and extracting it to a folder, as seen in figure 7:
Figure 7: Extraction of the OLE document in question
Figure 8 outlines the three files from the OLE archive, “[3]ObjInfo”, “[3]OCXNAME” and “Contents”.
File Name: [3]ObjInfo
File Size: 6 bytes
MD5: 71d6cd4431020c2e44bcf554808ec0da
SHA1: 713afe26462ca0d620a6f12b3d0393d0ef8a137b
File Name: [3]OCXNAME
File Size: 22 bytes
MD5: ed5954ebe6347144c0d2329658a654ac
SHA1: 92a2db1a9e29b22d6ded4a8b6bc0a8d2b49408b0
File Name: Contents
File Size: 1406 bytes
MD5: 5053ca420f0c04744a0e9f152fe9ad55
SHA1: 7fa27324b43afd0e2072886f5eefcc76c48af092
Figure 8: Extracted OLE Objects
Further analysis of the file “[3]OCXNAME” reveals the string “ListViewA”, which points to CVE-2012-0158 (detailed documentation can be found at http://www.cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2012-0158). The CVE details describe the “ListView” ActiveX controls found in “MSCOMCTL.OCX” as being vulnerable to arbitrary code execution via a crafted RTF file.
Figure 9: ListViewA string found in “[3]OCXNAME”
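A string search like this can also be scripted. The sketch below looks for a string in both ASCII and UTF-16LE form, on the assumption that strings inside OLE streams may be stored in either encoding; the helper name is our own.

```python
def find_string(blob: bytes, needle: str) -> dict:
    """Report the first offset of needle in ASCII and UTF-16LE form (-1 if absent)."""
    return {
        "ascii": blob.find(needle.encode("ascii")),
        "utf-16-le": blob.find(needle.encode("utf-16-le")),
    }
```

Running find_string over the bytes of an extracted stream with the needle “ListView” shows at a glance which encoding, if any, contains the control name.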
On first inspection the “Contents” file does not appear malicious, but at offset 0x009A we see eight 0x90 bytes in succession. This is a NOP slide (a run of no-operation (no-op) instructions), a technique malware authors use to provide a broad landing spot that directs program execution to the instructions that follow. RSA IR has frequently observed this technique used in conjunction with shellcode execution in the wild.
Figure 10: no-op instruction found in the “Contents” file
After the series of no-op instructions, 599 bytes of shellcode are found at offsets 0x00A2 through 0x02F8.
Figure 11: Hex view of the file “Contents”
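The offsets above can be checked with a few lines of code. This sketch finds runs of 0x90 bytes in a binary blob and computes the length of an inclusive offset range; the function names and the minimum run length of 8 are our own choices for illustration.

```python
import re


def find_nop_runs(data: bytes, min_len: int = 8):
    """Yield (offset, length) for every run of at least min_len 0x90 bytes."""
    pattern = re.compile(rb"\x90{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.start(), match.end() - match.start()


def span_length(start: int, end_inclusive: int) -> int:
    """Number of bytes in an inclusive offset range."""
    return end_inclusive - start + 1
```

A run of eight NOPs beginning at 0x009A ends at 0x00A1, so the shellcode begins at 0x00A2, and the range 0x00A2 through 0x02F8 covers 599 bytes.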
Additionally, we can confirm the shellcode found in the “Contents” file by reviewing the original malicious RTF file “msf.doc” in a hex editor. The same shellcode appears in “msf.doc” starting at location 0x133C; however, RTF stores embedded object data as ASCII hex text, so it must be decoded to raw bytes before analysis. There are several ways to do this conversion, so choose whichever tool or method you are most familiar with. Figure 12 shows the ASCII representation on the right and the hexadecimal on the left. Referring back to figure 11 above, we can see that the hexadecimal values there match the ASCII representations in figure 12.
Figure 12: Shellcode as seen in the “msf.doc” file
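As one example of the conversion, the following sketch decodes ASCII hex text into raw bytes with the Python standard library. The helper name is hypothetical, and it assumes the input contains only hex digits and whitespace.

```python
def decode_hex_text(text: str) -> bytes:
    """Decode ASCII hex text such as '90 90' into raw bytes, ignoring whitespace."""
    cleaned = "".join(text.split())
    return bytes.fromhex(cleaned)
```

The relationship runs both ways: the two characters “90” are stored in the RTF file as the bytes 39 30, and decoding them yields the single byte 0x90.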
Next, the shellcode needs to be extracted for further analysis. To do this in 010 Editor, select and highlight the shellcode between the locations 0x00A2 and 0x02F8 as seen in figure 13 and select “Edit” → “Copy As” → “Copy as Hex Text”. This stores the hex values of the shellcode on the clipboard. Next, open a new hex document by selecting “File” → “New” → “New Hex File”. Browse to the newly created tab and select “Edit” → “Paste From” → “Paste From Hex Text”. Now let’s save the shellcode as a file by selecting “File” → “Save As” → “shellcode.sc”. I saved mine in the OfficeMalScanner directory for convenience, as we will be using the MalHost-Setup command line tool next for analysis.
Figure 13: Saving the shellcode from the file “Contents”
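The same extraction can be done without a hex editor. This is a minimal sketch that slices an inclusive offset range out of a file and writes it to a new one; the function names are our own, and the offsets shown in the usage note are the ones from this sample.

```python
from pathlib import Path


def carve(data: bytes, start: int, end_inclusive: int) -> bytes:
    """Return the bytes from start through end_inclusive."""
    if not 0 <= start <= end_inclusive < len(data):
        raise ValueError("offsets fall outside the input")
    return data[start:end_inclusive + 1]


def carve_file(src: str, dst: str, start: int, end_inclusive: int) -> int:
    """Carve a range out of src, write it to dst, and return the byte count."""
    payload = carve(Path(src).read_bytes(), start, end_inclusive)
    Path(dst).write_bytes(payload)
    return len(payload)
```

For this sample, carve_file("Contents", "shellcode.sc", 0x00A2, 0x02F8) should write 599 bytes, assuming the “Contents” file is in the current directory.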
Now that we have our shellcode successfully saved to shellcode.sc, there are several approaches that can be taken for analysis. The first option is static analysis through disassembly, which can be done in a disassembler such as IDA Pro. This is the most difficult method, as the analyst must have a firm understanding of shellcode, assembly and de-obfuscating techniques as this shellcode utilizes a considerable amount of obfuscation to deter analysis.
Figure 14: shellcode.sc loaded as a binary file in IDA Pro
The other method, demonstrated below, is dynamic analysis: create a working executable with MalHost-Setup. This creates a wrapper around the shellcode, allowing it to be executed or debugged. At this point you’ll want to be running in a safe test environment, as we are going to be executing live malware. If you’re in a virtualized environment, this is a good time to take a snapshot so you can easily repeat testing as needed. My environment has a webserver listening on ports 80 and 443. I also create a netcat listener on the fly for any port that the malware attempts to use and my webserver is not listening on. Finally, I have a DNS listener that responds to any DNS requests.
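If netcat is not at hand, a throwaway listener can be approximated in a few lines. The sketch below accepts a single TCP connection and records the peer address and the first bytes it sends. The function name, default port and timeout are our own choices for illustration, and it should only be run inside the isolated test environment.

```python
import socket
import threading


def start_one_shot_listener(host: str = "0.0.0.0", port: int = 4444,
                            max_bytes: int = 4096, timeout: float = 2.0):
    """Accept one TCP connection in the background; return (port, thread, captured)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind((host, port))
    server.listen(1)
    bound_port = server.getsockname()[1]  # useful when port=0 picks a free port
    captured = {}

    def serve():
        conn, peer = server.accept()
        with conn:
            conn.settimeout(timeout)
            try:
                captured["data"] = conn.recv(max_bytes)
            except socket.timeout:
                captured["data"] = b""  # handshake completed but nothing was sent
            captured["peer"] = peer
        server.close()

    worker = threading.Thread(target=serve, daemon=True)
    worker.start()
    return bound_port, worker, captured
```

After the malware has run, joining the thread and inspecting the captured dictionary shows who connected and what, if anything, was sent first.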
Next, we want to turn our shellcode into an executable file that can be run and observed. To do this, navigate to your command shell and run the following from your OfficeMalScanner directory:
Syntax: MalHost-Setup.exe shellcode.sc sc.exe 0x00
This creates the executable file “sc.exe” from our shellcode (shellcode.sc) using the code starting at location 0x00, which is the starting location of the shellcode in the file. Now that we have our shellcode packaged up as an executable, we can execute it and debug the binary without needing a vulnerable version of Microsoft Office or analyzing the shellcode statically. The goal here is to triage the malicious document rapidly and answer questions about its core functionality.
In figure 15 below we see the successful execution of our shellcode, “sc.exe”.
Figure 15: sc.exe executing at a command shell
The network activity seen in figure 16 is the key functionality of the shellcode, as it does nothing else. Our shellcode attempts to initiate a TCP handshake with the IP “192.168.218.129”, a private, RFC 1918 address.
Figure 16: Successful Shellcode execution yields the following TCP handshake
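When triaging callback addresses in bulk, the RFC 1918 check can be automated with the standard ipaddress module. This is a small sketch; the helper name is hypothetical.

```python
import ipaddress

# The three private IPv4 ranges defined by RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(address: str) -> bool:
    """True when the address falls inside one of the RFC 1918 private ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in RFC1918_NETWORKS)
```

A private callback address like the one in this sample is one hint that a document was built for testing rather than for use against a real target.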
Yara signatures are an extremely useful means to detect malicious code that is embedded in office documents. Below is a fairly basic Yara rule written by RSA to detect shellcode in RTF documents. Our rule has two strings that it searches for, “rtfmagic” and “scregex” which exist in the strings section of our rule.
The first string, “rtfmagic”, looks for the five hex bytes “{7B 5C 72 74 66}”, which represent “{\rtf” in ASCII. This is one of the common headers that RTF documents use. There are several variations of the RTF header, but almost all contain at least these five characters.
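The second string, “scregex”, is written in the form of a regular expression (REGEX). It targets the characters “39 30”, the hex values of the ASCII text “90”, which is how a NOP instruction (0x90) appears in the hex-encoded object data of an RTF file. We expect to find several of these in a row, which represent a NOP sled in a malicious Word or RTF document. Our test file “msf.doc” contains 8 NOP instructions. The second part of the REGEX, “{2,20}”, requires at least 2 but no more than 20 sequential matches. Note that the square brackets make the first part a character class, so as written the pattern matches any run of 2 to 20 of the characters “3”, “9”, “0” or space, not only repeated “90” pairs; a stricter form such as /(90){2,20}/ would reduce false positives.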
Finally, the condition for our rule to trigger is that the bytes “7B 5C 72 74 66” (represented as “rtfmagic”) appear at byte 0 of the document, the very beginning of the file. This confirms that we are working with an RTF file. In addition, the REGEX in the string “scregex” must match somewhere in the file. If both hold, our Yara signature flags the file, as seen in the ECAT screenshot in figure 17 below.
rule RTF_Shellcode
{
    meta:
        author = "RSA-IR – Jared Greenhill"
        date = "01/21/13"
        description = "identifies RTF's with potential shellcode"
        filetype = "RTF"
    strings:
        $rtfmagic = {7B 5C 72 74 66}
        $scregex = /[39 30]{2,20}/
    condition:
        ($rtfmagic at 0) and ($scregex)
}
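To see how the two strings and the condition interact, the rule can be approximated in Python. This is only a sketch: Yara’s regular expression engine is not Python’s, and the stricter pattern shown is a hypothetical variant, not part of the rule above. Both engines treat square brackets as a character class, so the pattern as written accepts any run of the characters “3”, “9”, “0” or space.

```python
import re

RTF_MAGIC = b"{\\rtf"

# The pattern as written in the rule: a character class, not a literal sequence.
AS_WRITTEN = re.compile(rb"[39 30]{2,20}")
# Hypothetical stricter variant: the text "90" repeated 2 to 20 times.
STRICTER = re.compile(rb"(?:90){2,20}")


def rule_matches(data: bytes, pattern) -> bool:
    """Mimic the condition: RTF magic at offset 0 and the pattern found anywhere."""
    return data.startswith(RTF_MAGIC) and pattern.search(data) is not None
```

A harmless RTF file containing text such as “Page 300” satisfies the pattern as written, while the stricter variant only fires on repeated “90” pairs like those in the hex-encoded NOP sled.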
We used RSA’s ECAT to aid in Yara scanning, as Yara scanning is native to the tool. Figure 17 below outlines the successful detection of shellcode existing in “msf.doc” through our rule “RTF_Shellcode”.
Figure 17: Successful Yara hit on “msf.doc” in ECAT
This post provides an example of how analysts can triage malicious Microsoft Office documents. The sample we used was most likely a test document or a proof of concept for CVE-2012-0158, and it demonstrates how 599 bytes of shellcode can weaponize an unassuming RTF file. It also gives analysts a use case for testing and understanding readily available, free tools for triaging malicious Microsoft Office documents.