Microsoft Word Intruder RTF Sample Analysis
By Omri Herscovici, Security Researcher
Background
Check Point researchers obtained a sample of a malicious Word document that was used in an attack attempt against one of our customers. The sample itself is a Rich Text Format (RTF) file with a .DOC extension. Malicious macro code inside Office documents has recently seen a resurgence. However, this wasn’t the case here.
Launching the sample resulted in two GET requests a short time apart:
- http://[host]/[folder]/img.php?id=[id] – Response is a 1×1 white JPG
- http://[host]/[folder]/img.php?id=[id]&act=1 – Response is a malicious payload
There were other HTTP requests as well, but they were generated by the payload (which is not in the scope of this post). Based on the GET requests, we believed we were dealing with the Microsoft Word Intruder (MWI) exploit kit described in FireEye’s blog. The MWI is a builder of malicious DOC/RTF files and is accompanied by MWISTAT, a statistics panel which tracks the infections. The MWISTAT of our sample was hosted in the same location as the payload delivery page (“img.php”). The server hosting the infecting instance also contained 7 other MWISTAT panel instances, ranging from the older 2.0b to what appears to be the newest version, 3.6.
The main changes between the versions appear to be in the statistics panel. However, in version 3.5 the author added two “snort bypasses” created in response to recent publications:
- The downloader page was changed from image.php to img.php
- header('Content-Description: File Transfer') was removed when the payload was served
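The request pattern described above (a statistics request, then the same URL with act=1) can be matched with a small heuristic. This is only a sketch: the labels "beacon" and "payload" and the function name are ours, and the only inputs taken from the post are the two page names and the id/act parameters.

```python
from urllib.parse import urlsplit, parse_qs

# Page names taken from the post: image.php (older builds) and img.php (3.5 and later).
MWI_PAGES = {"image.php", "img.php"}


def classify_mwi_request(url):
    """Return 'beacon', 'payload', or None for a URL (heuristic only)."""
    parts = urlsplit(url)
    page = parts.path.rsplit("/", 1)[-1].lower()
    if page not in MWI_PAGES:
        return None
    query = parse_qs(parts.query)
    if "id" not in query:
        return None
    # act=1 marks the payload download; without it the request is the beacon.
    return "payload" if query.get("act") == ["1"] else "beacon"
```

A match on the page name alone is weak evidence, so a rule like this is better suited to triage than to blocking.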
MWI Change log:
# TODO
- [~] backup statistic
- [~] stat by countries
- [~] stat by office versions
- [~] exec1/exec0: process execution
- [~] tools: mwistat.log dump export/import
- [~] tools: ip black list
- [~] tools: user-agent black list, av services detection
# MWISTAT 3.7
- [~] knock drop_exec_result
# MWISTAT 3.6
- [+] dll support (@is_dll)
# MWISTAT 3.5
- [+] snort signature bypass
# MWISTAT 3.0 – 3.4
- [+] bugfixes
- [+] threads: INTERNAL / EXTERNAL thread mode
- [+] navigation
- [+] logs: added pages
- [+] tables
- [+] threads: unhandled threads
# MWISTAT 2.5
- [+] bugfixes
- [+] stats: detailed ip stats – add office_vers column
- [+] stats: sorting
- [+] logs: target_id = crc32(target_ip + target_useragent)
- [+] stats: target statistic (non-IP, as previous version)
- [+] stats: new table: office versions
# MWISTAT 2.0a
- [+] bugfixes
- [+] more secure
- [+] invalid auth log
- [+] md5 auth
- [+] logs: office version
- [+] user-agent filter
The administration panel contains information regarding the infection time, campaign, IP, status (OPEN/LOAD/SUSP), user-agent, Office version, and data sent with the request.
The builder that creates the exploits writes a log file:
[+] EXPLOIT CONFIGURATION:
- build_name: ./output/OC..[REDUCTED].._EXTERNAL_STATISTIC.conf.txt
- CVE-2012-0158: 1
- CVE-2010-3333: 0
- CVE-2013-3906: 1
- CVE-2014-1761: 1
- hspray_speed: SPRAY_SPEED_FAST
- code_mutation: 1
- use_wmi_exec: 1
- exe_dir_csidl: CSIDL_INTERNET_CACHE
- exe_dirname: Content.Word
- exe_filename: ..[REDUCTED]..
- exe_dir_csidl2: CSIDL_LOCAL_APPDATA
- exe_dirname2: ..[REDUCTED]..
- exe_filename2: ..[REDUCTED]..
- antisafe_mode: 0
- execution_mode: EXEC_STARTPROCESS
- rtf_doc_text: ..[REDUCTED]..
- rtf_doc_path: ..[REDUCTED]..
- use_mwi_stat: 1
- mwi_stat_url: http://[host]/[folder]/img.php?id=[id]
- exe_location: http://[host]/[folder]/img.php?id=[id]&act=1
- build_type: EXTERNAL_STATISTIC
[+] EXE_FILE EXECUTION INFO:
- exe_path: ./%CSIDL_INTERNET_CACHE%/Content.Word/~WRX4019.tmp
- exe_process: from svchost.exe->wmiprvse.exe->~WRX4019.tmp (COM/WMI Steal Execution)
- exe_process: from WINWORD.EXE->~WRX4019.tmp (Straight CreateProcessA Execution – if COM/WMI failed)
[+] ADVANCED BUILD INFORMATION:
- file: ./output/OC..[REDUCTED].._EXTERNAL_STATISTIC.doc
- size: 289039 bytes
- crc32: ..[REDUCTED]..
- md5: ..[REDUCTED]..
- sha1: ..[REDUCTED]..
- date: 070515_092427
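When comparing several builds, it can help to load such a log into a structure. The following is a minimal sketch that assumes the layout shown above ("[+] SECTION:" headers followed by "key: value" lines). The function name is hypothetical and the builder's real format may vary between versions.

```python
def parse_builder_log(text):
    """Group 'key: value' lines under their '[+] SECTION:' headers."""
    sections = {}
    current = None
    for raw in text.splitlines():
        # Drop the leading list marker and surrounding whitespace.
        line = raw.strip().lstrip("- ").strip()
        if line.startswith("[+]") and line.endswith(":"):
            current = line[3:-1].strip()
            sections[current] = []
        elif ": " in line and current is not None:
            key, value = line.split(": ", 1)
            # Pairs are kept in a list because keys such as exe_process repeat.
            sections[current].append((key, value))
    return sections
```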
The Sample
RTF (Rich Text Format) is a document format developed by Microsoft. It is somewhat human-readable, consists of text and control codes, and allows embedding of various objects, including OLE Compound Files. An Object Linking and Embedding (OLE) Compound File (CF) is a container that uses a FAT-like file system to define streams using allocation tables. OLECF was widely used for Office files from 1997 to 2003, until the Open XML formats were introduced in 2007, and its magic bytes are ‘d0 cf 11 e0’ (“DOCFILE0”).
When we encounter a malicious DOC/RTF, it’s always a good idea to try OfficeMalScanner, a forensic tool which scans for malicious traces. When we used RTFScan to extract the objects from the file, we found one OBJDATA file and 2 OLE objects. A visual inspection of the RTF revealed that the extraction of one of the OLE objects was unsuccessful, as the RTF contained a much bigger data chunk than what was obtained by the tool. The RTF file uses a group-closing bracket ‘}’ to confuse analysis tools. This bracket causes RTFScan to stop the extraction, but is completely ignored when a standard DOC/RTF reader parses the file.
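To illustrate why the stray ‘}’ matters, here is a minimal sketch of a more tolerant extraction. It is not how RTFScan works internally (we make no claim about that). It simply assumes the hex data of an object runs from \objdata to the next control word and ignores every non-hex character in between, including braces and line breaks. The function name is ours.

```python
import re

OLE_MAGIC = bytes.fromhex("d0cf11e0")  # start of the OLE Compound File signature


def extract_objdata(rtf_text):
    """Return the decoded bytes of every \\objdata group (tolerant heuristic)."""
    blobs = []
    for chunk in rtf_text.split("\\objdata")[1:]:
        # Assumption: the hex data ends at the next control word (backslash).
        body = chunk.split("\\", 1)[0]
        # Skip braces, whitespace and anything else that is not a hex digit.
        hex_digits = re.sub(r"[^0-9a-fA-F]", "", body)
        if len(hex_digits) % 2:
            hex_digits = hex_digits[:-1]
        blobs.append(bytes.fromhex(hex_digits))
    return blobs
```

Checking `OLE_MAGIC in blob` on each result is then a quick way to see which objects contain an OLE Compound File.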
Extracting the object manually results in a DOCX that produces a non-scriptable heap-spray. We will elaborate on this later.
Overall, the RTF contains 3 different exploits. This is its structure:
The initial GET request is used as an infection mark for the MWISTAT panel that sets the state of an IP to “OPEN”. It is caused by the following RTF INCLUDEPICTURE command:
Once an exploit is successful, “&act=1” is added to the original request, which changes the IP status to “LOADED”. In return, a response containing the attacker’s chosen payload is received.
The first exploit activated is CVE-2012-0158, a stack buffer overflow in MSCOMCTL.OCX that affects the MSCOMCTL.TreeView and MSCOMCTL.ListView controls. Use of this exploit can be identified by looking for the ListView.2 CLSID (bdd1f04b-858b-11d1-b16a-00c0f0283628):
The first 3 parts are little-endian. Once again, a line break is used to bypass simple static protections. A few lines further down, we can see the ROP chain and the shellcode embedded in the object.
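The byte order of the CLSID and the line-break trick can both be handled in a few lines. The sketch below uses Python's uuid module, whose bytes_le attribute stores the first three fields little-endian, and strips all whitespace before searching. The function names are ours.

```python
import uuid

LISTVIEW2_CLSID = uuid.UUID("bdd1f04b-858b-11d1-b16a-00c0f0283628")


def clsid_hex_pattern(clsid):
    """Hex string of the CLSID as stored in the object (first 3 fields little-endian)."""
    return clsid.bytes_le.hex()


def contains_clsid(rtf_text, clsid=LISTVIEW2_CLSID):
    """Search for the CLSID after removing line breaks and other whitespace."""
    normalized = "".join(rtf_text.split()).lower()
    return clsid_hex_pattern(clsid) in normalized
```

For the ListView.2 CLSID the pattern is 4bf0d1bd8b85d111b16a00c0f0283628.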
The exploit launches a ROP chain based on MSCOMCTL.OCX version 6.1.98.18, as it is a non-ASLR module up to MS14-024. This is the ROP chain with commands and annotations:
The ROP chain creates an executable heap with a size of 0x10000. It then copies the 1st stage shellcode to it and runs it. The 1st stage shellcode uses the standard FLDZ/FSTENV technique to get the current instruction pointer, and then decodes itself with an XOR loop using the key 0x3B. The purpose of this shellcode is to locate the 2nd stage shellcode. This technique is called egg hunting and uses the kernel32!IsBadReadPtr() function.
More on this technique can be found here.
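The self-decoding XOR loop of the 1st stage can be reproduced offline to inspect the decoded bytes. This sketch only assumes a single-byte XOR with the key 0x3B, as stated above. Where the encoded region starts and how long it is must be read from the sample itself.

```python
def xor_decode(data, key=0x3B):
    """Single-byte XOR; applying it twice returns the original bytes."""
    return bytes(b ^ key for b in data)
```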
Egg hunting requires a unique signature to be placed before the shellcode. This is the code segment searching for the signature:
The signature is 12 bytes long and consists of two copies of the 4-byte value “888840404” followed by the 4-byte value “FFFFAEAE” (little-endian). The signature and the 2nd shellcode can be seen in the OBJDATA defined at the beginning of the file:
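As a rough offline analogue of the search described above, the following sketch builds a 12-byte egg from two identical dwords plus a trailing dword and looks for it in a memory dump. The marker value in the usage example is a placeholder rather than the sample's value, and treating each value as a little-endian dword is our reading of the description.

```python
import struct


def build_egg(marker, tail):
    """Two copies of `marker` followed by `tail`, each packed as a little-endian dword."""
    return struct.pack("<III", marker, marker, tail)


def find_second_stage(memory, egg):
    """Return the offset just past the egg, or -1 if the egg is absent."""
    pos = memory.find(egg)
    return -1 if pos < 0 else pos + len(egg)
```

Usage with a placeholder marker: `find_second_stage(dump, build_egg(0x11223344, 0xFFFFAEAE))` returns the offset where the 2nd stage would begin in `dump`.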
The next part is the encoded 2nd shellcode, which is decoded by the 1st shellcode. Once decoded, we can easily see the URL for the malicious payload:
If the first exploit succeeds, the payload is downloaded and executed, and the Word process exits.
The next section of the RTF is the heap-spraying DOCX that is used by the following exploit. The DOCX in the second OLE object extracted from the RTF sprays the heap through its own file structure as it is parsed by Word. The DOCX contains a folder (\word\activex\) with an ActiveX object (OLE):
The size of the file activeX12.bin is 575 KB.
It is referenced by 141 relationship XML files containing this tag:
This is the activeX12.bin content:
The byte 0xFC (CLD, Clear Direction Flag) is used for the NOP sled.
The activeX contains ROP chains after every 3,845 bytes and ends with the egg hunting shellcode.
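A quick way to see this layout in an extracted activeX12.bin is to list the runs of the sled byte and look at what sits between them. This is a minimal sketch. The minimum run length is an arbitrary threshold of ours, and the function name is hypothetical.

```python
import re


def sled_runs(data, sled_byte=0xFC, min_len=16):
    """Return (offset, length) for every run of `sled_byte` at least `min_len` long."""
    pattern = re.compile(re.escape(bytes([sled_byte])) + b"{" + str(min_len).encode() + b",}")
    return [(m.start(), m.end() - m.start()) for m in pattern.finditer(data)]
```

The gaps between consecutive runs are where the repeated ROP chains, and finally the shellcode, would be expected.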
This ROP chain is based on MSCOMCTL.OCX version 6.1.98.34. It uses VirtualAlloc to make the allocated chunk executable and then starts sliding through it. Because the ROP chain is repeated throughout the block, a trampoline redirects the flow before each copy to ensure that the ROP chain is not executed again. This continues until we reach the shellcode at the end of the object. When launching the document, the process memory of winword.exe inflates to around 100 MB.
At this point, let’s take a look at the process heaps:
The allocation percentage in the default process heap:
From this we see that 97% of the total consists of data chunks with a size of 0x8fc00, which is the size of the activeX file (588,800 bytes).
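The numbers are easy to cross-check. The sketch below converts the chunk size and estimates the sprayed volume under the assumption (ours, not measured) that each of the 141 references produces one in-memory copy of the file.

```python
CHUNK_SIZE = 0x8FC00   # heap chunk size seen in the debugger
REFERENCES = 141       # relationship files pointing at activeX12.bin

size_bytes = CHUNK_SIZE
size_kib = CHUNK_SIZE / 1024
# Assumption: one copy of the file per reference.
sprayed_mib = CHUNK_SIZE * REFERENCES / (1024 * 1024)

print(size_bytes, size_kib, round(sprayed_mib, 1))
```

The chunk size matches the 575 KB file size, and the estimate lands a little under 80 MiB, which is in line with the roughly 100 MB process size once the rest of Word's memory is included.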
Let’s search for data chunks of this size.
Here is the first one:
Indeed, this is the activeX control.
The next segment in the RTF file is CVE-2013-3906, a heap overflow vulnerability caused by an integer overflow in a TIFF image. The TIFF file was identical in all MWI samples we analyzed. In some cases, other formats such as \jpegblip or \pngblip were set as the “pict” format.
The use of a TIFF file can be identified by the magic bytes (“49492a00”).
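The magic bytes decode as “II” (little-endian byte order) followed by the TIFF version number 42. A minimal header check, with a function name of our choosing, could look like this:

```python
import struct


def tiff_header(data):
    """Return (byte_order, first_ifd_offset), or None if this is not a TIFF header."""
    if len(data) < 8:
        return None
    order = {b"II": "<", b"MM": ">"}.get(data[:2])
    if order is None:
        return None
    # Bytes 2-3 hold the version (42), bytes 4-7 the offset of the first IFD.
    magic, ifd_offset = struct.unpack(order + "HI", data[2:8])
    if magic != 42:
        return None
    return ("little" if order == "<" else "big", ifd_offset)
```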
Parsing the TIFF shows the requirements for exploiting CVE-2013-3906 are met:
First, we calculate the StripsPerImage:
StripsPerImage = floor((ImageLength + RowsPerStrip - 1) / RowsPerStrip)
ImageLength = 0x152
RowsPerStrip = 0x5
StripsPerImage = floor((0x152 + 0x5 - 0x1) / 0x5) = 0x44
Next, we determine the JPEGInterchangeFormatLength, which is 0x6455.
Finally, we calculate the StripByteCounts total:
We calculate the size of the buffer to be allocated with this formula:
StripByteCountsTotal + 8 + 2 × StripsPerImage + JPEGInterchangeFormatLength
This results in 0x100000000, which overflows a 32-bit integer.
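The arithmetic can be replayed as follows. The StripByteCounts total is not stated in the text above, so the sketch derives it backwards from the 0x100000000 result. Treating the allocation size as a 32-bit unsigned value is our assumption, based on the overflow described.

```python
IMAGE_LENGTH = 0x152
ROWS_PER_STRIP = 0x5
JPEG_INTERCHANGE_FORMAT_LENGTH = 0x6455
MASK32 = 0xFFFFFFFF

strips_per_image = (IMAGE_LENGTH + ROWS_PER_STRIP - 1) // ROWS_PER_STRIP

# Derived, not read from the sample: the total that makes the sum reach 0x100000000.
strip_byte_counts_total = (
    0x100000000 - 8 - 2 * strips_per_image - JPEG_INTERCHANGE_FORMAT_LENGTH
)


def alloc_size_32bit(strip_total, strips, jpeg_len):
    """Buffer size as a 32-bit unsigned integer would hold it."""
    return (strip_total + 8 + 2 * strips + jpeg_len) & MASK32


print(hex(strips_per_image), hex(strip_byte_counts_total),
      alloc_size_32bit(strip_byte_counts_total, strips_per_image,
                       JPEG_INTERCHANGE_FORMAT_LENGTH))
```

Under these assumptions the truncated size is 0, so the buffer that is allocated is far smaller than the data later copied into it.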
The exploitation will cause the vftable pointer to be overwritten with 0a0a0a0a which points to the sprayed heap.
The final segment of the RTF file is CVE-2014-1761, an object confusion vulnerability.
Its initialization is immediately before the INCLUDEPICTURE:
According to the RTF specification, the listoverridecount should be 0, 1 or 9.
In this case, it is 26.
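A simple static check for this indicator is sketched below. It only looks at the plain control word, so a sample that breaks the control word up (for example with line breaks, as seen elsewhere in this file) would need normalizing first. The function name is ours.

```python
import re

ALLOWED_COUNTS = {0, 1, 9}  # values the RTF specification permits, per the text above


def suspicious_listoverridecounts(rtf_text):
    """Return every \\listoverridecount value outside the allowed set."""
    values = [int(n) for n in re.findall(r"\\listoverridecount(\d+)", rtf_text)]
    return [v for v in values if v not in ALLOWED_COUNTS]
```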
The trigger for the exploitation can be found at the end of the RTF file.
The ROP chain and egg hunting shellcode for this vulnerability are stored inside the \leveltext section.
Each Unicode character is a word-sized (2 bytes) decimal representation of a ROP gadget.
The ROP chain uses VirtualAlloc in order to create a new executable chunk and run the shellcode.
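To read the chain out of the \leveltext section, the decimal \uN values can be turned back into bytes. In RTF, \uN is a signed 16-bit decimal, so negative values wrap around. The sketch below assumes the words are stored low word first, so that two consecutive words form one little-endian dword. The values in the usage example are placeholders, not taken from the sample.

```python
import re
import struct


def leveltext_to_bytes(leveltext):
    """Convert RTF \\uN escapes into a sequence of little-endian 16-bit words."""
    words = [int(n) & 0xFFFF for n in re.findall(r"\\u(-?\d+)", leveltext)]
    return struct.pack("<%dH" % len(words), *words)


def as_dwords(blob):
    """Interpret the bytes as little-endian dwords, ignoring any trailing remainder."""
    usable = len(blob) - len(blob) % 4
    return list(struct.unpack("<%dI" % (usable // 4), blob[:usable]))
```

For example, `as_dwords(leveltext_to_bytes(r"\u4660?\u-21555?"))` yields the single dword 0xABCD1234.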
Detailed analysis of the CVE-2014-1761 vulnerability can be found here and here.
Conclusion
Although the kit uses quite old CVEs, the author used multiple exploits and ROP chains in order to make the malicious document more robust and effective against different or patched versions of Office Word.
The panel for statistics and infection tracking we see here is often found in browser exploit kits. This panel allows the attacker to get reliable information regarding the geographic location, success rate and spread of his campaign.
With the decline of PDF and Java exploits due to security mitigations (sandboxing / signed applets), it wouldn’t be a surprise if, along with the rise of Flash exploits, we start seeing more exploits targeting the Microsoft Office suite.
The Check Point IPS blade provides protections against these threats:
- “Microsoft Word RTF listoverridecount Remote Code Execution (MS12-079)” (CVE-2014-1761)
- “Microsoft Office Embedded TIFF Image Remote Code Execution” (CVE-2013-3906)
- “Microsoft MSCOMCTL.OCX ActiveX Control Remote Code Execution (MS12-027)” (CVE-2012-0158)
- “Microsoft Office RTF Stack Buffer Overflow (MS10-087)” (CVE-2010-3333)
In addition, the Check Point IPS blade provides protection against the MWI Exploit Kit: “Microsoft Word Intruder RTF FILE”