Threat Extraction – A Preventive Method for Document-Based Malware

By Shiran Yodev and Einat Ferber

Threat Extraction proactively protects against known and unknown threats contained in documents by removing exploitable content. This method is also known as file sanitization or CDR (content disarm and reconstruction). Unlike most security solutions, it does not rely on detection. Instead, it enables true zero-day prevention while delivering files to users quickly.

Preventing attacks delivered through documents

In many cases, malware infection starts with a document. In 2019, about 60% of malicious email attachments and 20% of malicious web downloads were delivered through documents such as PDF, Microsoft Office Word, Excel and PowerPoint.

Examples include the Emotet banking Trojan, Jaff ransomware, and the recent Iranian APT MuddyWater. Threat Extraction technology prevents such attacks.

Delivering malware through documents has become a very effective way to infect victims.

Here are the top document types used to deliver malware by email in 2019:

Based on Check Point ThreatCloud

Here are the top document types used to deliver malware by web download in 2019:

Based on Check Point ThreatCloud

Not just in cybercrime: document-based malware is also used in APTs

In April 2019, the Check Point research team uncovered two targeted attacks: one by the Iranian APT MuddyWater, and a Trojanized TeamViewer attack against government targets.

Both attacks used documents in the infection chain. MuddyWater used a Microsoft Word document and the Trojanized TeamViewer attack used an XLSM document. Both files carried a malicious macro. Threat Extraction prevented both attacks by removing the malicious macros and other embedded active parts.
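To make the macro delivery mechanism concrete, here is a minimal sketch of how a macro-enabled Office Open XML file (such as an XLSM) can be recognized. These files are ZIP packages, and VBA macros are stored in a part named vbaProject.bin. The function and helper names below are hypothetical, and this is a simplified illustration, not how Threat Extraction itself works. It does not cover legacy binary formats such as .doc or .xls.

```python
import io
import zipfile

# Part-name suffix that marks a VBA project in macro-enabled OOXML files
# (for example word/vbaProject.bin or xl/vbaProject.bin).
VBA_PART_SUFFIX = "vbaproject.bin"


def has_vba_project(data: bytes) -> bool:
    """Return True if an OOXML package (docm, xlsm, pptm) holds a VBA project part."""
    if not zipfile.is_zipfile(io.BytesIO(data)):
        # Legacy binary formats are not ZIP packages; they are out of scope here.
        return False
    with zipfile.ZipFile(io.BytesIO(data)) as package:
        return any(name.lower().endswith(VBA_PART_SUFFIX) for name in package.namelist())


def build_package(part_names):
    """Build a tiny in-memory ZIP with the given part names (hypothetical demo helper)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        for name in part_names:
            package.writestr(name, b"placeholder")
    return buffer.getvalue()


if __name__ == "__main__":
    macro_workbook = build_package(["xl/workbook.xml", "xl/vbaProject.bin"])
    plain_document = build_package(["word/document.xml"])
    print(has_vba_project(macro_workbook))
    print(has_vba_project(plain_document))
```

A check like this only reports that a macro is present. It says nothing about whether the macro is malicious, which is why a sanitization approach removes the active part rather than trying to judge it.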

An inside look into document-based attacks

We will examine Jaff ransomware, a document-based attack. In this example, the attack starts with the victim receiving a PDF file as an email attachment. The file name and email content vary and aim to entice the victim to open the file.

Once the victim opens the PDF, a malicious action is invoked and a pop-up asks the victim to click “OK” to open a file:

After clicking “OK”, a Microsoft Word document that’s embedded inside the PDF is automatically extracted and opened.

The Microsoft Word document will display false information to the victim, claiming the document is “protected” and asking the victim to click “Enable content” in order to open the document:

Clicking “Enable content” actually allows Microsoft Word to execute Visual Basic code (a macro), a feature originally intended to automate actions within the document but often used for malicious purposes.

In our case, the macro connects to a remote server to download and run a malicious executable, which encrypts the victim’s files:

Practical prevention with Check Point Threat Extraction

Check Point SandBlast Zero-Day Protection utilizes Threat Extraction technology to eliminate threats by removing exploitable content and reconstructing documents using known safe elements.
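The idea of reconstructing a document from known safe elements can be illustrated with a toy model. The sketch below is not Check Point's implementation. The `Element` type, the `SAFE_KINDS` allowlist, and the `reconstruct` function are all assumptions made for illustration. The point it shows is that an allowlist keeps only what is known to be safe, so unknown element types are dropped by default instead of having to be detected.

```python
from dataclasses import dataclass

# Element kinds treated as known-safe in this toy model
# (an assumption for illustration, not a product list).
SAFE_KINDS = frozenset({"text", "image", "table"})


@dataclass(frozen=True)
class Element:
    kind: str      # for example "text", "macro", "embedded_object"
    payload: str   # simplified stand-in for the element's content


def reconstruct(elements):
    """Rebuild a document from allowlisted elements; return (clean, removed)."""
    clean = [e for e in elements if e.kind in SAFE_KINDS]
    removed = [e for e in elements if e.kind not in SAFE_KINDS]
    return clean, removed


if __name__ == "__main__":
    document = [
        Element("text", "Invoice"),
        Element("macro", "AutoOpen"),
        Element("embedded_object", "payload.docm"),
        Element("image", "logo.png"),
    ]
    clean, removed = reconstruct(document)
    print([e.kind for e in clean])
    print([e.kind for e in removed])
```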

Threat Extraction eliminates delays associated with traditional sandboxes and enables real-world deployment for Zero-Day protection in prevent mode, while delivering cleaned files to users quickly.

In the Jaff ransomware attack described above, Threat Extraction cleaned the file of its malicious content.

Threat Extraction event log:

The following content was removed:

  1. Embedded objects – The encoded Microsoft Word document that was hidden inside of the PDF.
  2. PDF JavaScript Actions – The malicious action that is launched once the file is opened and extracts the embedded document.
  3. Macros and Code – The JavaScript code that opens the file with the default reader.

Removing any one of the items above would, on its own, stop the attack entirely.
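The categories of removed content correspond to markers that can often be seen in a PDF's raw bytes. As a rough sketch, the code below counts PDF name tokens commonly associated with active or embedded content, such as /JavaScript, /OpenAction and /EmbeddedFile. The token list and function names are assumptions chosen for illustration. A plain byte scan like this misses content inside compressed object streams and names written with hex escapes, so a real sanitizer has to parse the file properly.

```python
import re

# PDF name tokens commonly associated with active or embedded content.
# This list is an illustrative assumption, not an exhaustive or official one.
ACTIVE_TOKENS = (b"/JavaScript", b"/JS", b"/OpenAction", b"/AA", b"/Launch", b"/EmbeddedFile")


def count_active_tokens(pdf_bytes: bytes) -> dict:
    """Count occurrences of each token in the raw bytes of a PDF."""
    counts = {}
    for token in ACTIVE_TOKENS:
        # The lookahead stops a short token from matching a longer name
        # that merely begins with it.
        pattern = re.escape(token) + rb"(?![A-Za-z0-9])"
        counts[token.decode("ascii")] = len(re.findall(pattern, pdf_bytes))
    return counts


def has_active_content(pdf_bytes: bytes) -> bool:
    """Return True if any of the tokens appears at least once."""
    return any(count_active_tokens(pdf_bytes).values())


if __name__ == "__main__":
    sample = (
        b"<< /Type /Catalog /OpenAction << /S /JavaScript /JS (app.alert(1)) >> >> "
        b"<< /Type /EmbeddedFile >>"
    )
    for name, count in count_active_tokens(sample).items():
        print(name, count)
```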

User Friendly: self-catered access to original files with Threat Extraction

Although active elements are very common in malicious documents, legitimate documents rarely include them. In the vast majority of cases, the difference between the sanitized document and the original is not visible to the naked eye.

Nevertheless, in rare cases the end user needs access to the original file. Once the file completes additional analysis by SandBlast Threat Emulation sandboxing and is confirmed to be benign, Threat Extraction allows self-catered access to the original file.

This approach allows IT departments to deploy the solution with minimal impact on users and without burdening their helpdesk.

An example of a cleaned downloaded file:


Malware delivered through documents is very common. Threat Extraction file sanitization technology plays a major role in preventing these attacks by using a unique, practical prevention method to remove exploitable content.

To achieve maximum security, productivity, and usability, we recommend organizations use security solutions that offer a hybrid prevention approach that combines file sanitization and advanced sandboxing.

Check Point SandBlast products provide an effective hybrid solution by combining Threat Extraction file sanitization and Threat Emulation sandboxing in a single solution.

Threat Extraction promptly delivers safe, sanitized content to its intended destination and ensures productivity, while SandBlast Threat Emulation sandboxing performs a deep analysis of the file and determines whether it is malicious. The end user can access the original file if it is not classified as malicious.

Check Point continues to innovate by enabling this critical capability on the gateway, so files downloaded from the web or sent by email are extracted and cleaned before they reach the user.
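The hybrid flow described above can be sketched as a small state model: the sanitized copy is available immediately, while the original stays gated until the sandbox returns a benign verdict. The class and method names here are hypothetical and do not reflect the product's actual interfaces. The sketch only illustrates the ordering of the two steps.

```python
from enum import Enum


class Verdict(Enum):
    PENDING = "pending"
    BENIGN = "benign"
    MALICIOUS = "malicious"


class HeldFile:
    """Tracks one file: the sanitized copy is available at once, the original is gated."""

    def __init__(self, original: bytes, sanitized: bytes):
        self._original = original
        self.sanitized = sanitized       # delivered to the user right away
        self.verdict = Verdict.PENDING   # sandbox analysis has not finished yet

    def record_verdict(self, verdict: Verdict) -> None:
        """Store the result reported by the (hypothetical) sandbox analysis."""
        self.verdict = verdict

    def request_original(self) -> bytes:
        """Release the original only after it is confirmed benign."""
        if self.verdict is not Verdict.BENIGN:
            raise PermissionError(f"original withheld: verdict is {self.verdict.value}")
        return self._original


if __name__ == "__main__":
    held = HeldFile(original=b"original bytes", sanitized=b"clean bytes")
    print(held.sanitized)
    held.record_verdict(Verdict.BENIGN)
    print(held.request_original())
```

Gating on an explicit benign verdict, rather than on the absence of a malicious one, means a file whose analysis has not finished is never released by accident.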
