JavaScript Hooking as a Malicious Website Research Tool
One of the top Internet threats today is the drive-by download attack, which originates from exploit kits, hacked websites, spam campaigns, and more. As browsers are the main tool for navigating the web, the main attack vectors are browser vulnerabilities, plugin and extension vulnerabilities, and some OS vulnerabilities.
We have been playing with the idea of using JavaScript hooking as a research tool with the goal of identifying hacked websites, exploit-kits, and CVEs, and of profiling websites for research purposes.
Why JavaScript hooking?
A web page is constructed from static and dynamic components.
The static components are declared as part of the HTML source code, where Document Object Model (DOM) elements are rendered by the browser or its plugins. The dynamic components include scripts such as JavaScript, which can change the HTML document by adding, removing, or modifying DOM elements, and can use specific objects such as XMLHttpRequest for AJAX.
In our experience with exploit kit behavior, hacked websites, and web-based exploits, we find that malicious websites rely heavily on obfuscation to bypass signature-based protection products and to make the researcher’s life a bit harder.
De-obfuscation is usually done through JavaScript. Most of the time, the de-obfuscation methods take data which is not easily readable by human eyes and can’t be used directly by the browser, and output code or elements that the browser can handle.
For example, the following screenshot is taken from a landing page of a known exploit kit:
As you can see, this is just a blob of data, not something with a lot of meaning initially. However, this data is decoded and de-obfuscated by the functions in the landing page code.
When the process is finished, new elements are added to the webpage. These include a new DIV element, two new JavaScript elements, and a new applet element which loads a Java exploit.
As we said above, if de-obfuscation is done through scripts, why not hook these script functions so we will be alerted when something interesting happens, such as dynamic addition of Java applet tags?
Our Goals:
- Detect a web site that has been injected with malicious code.
- Detect exploit-kits.
- Identify new evasion techniques and new delivery methods.
- Intercept and react to specific patterns in the middle of the de-obfuscation process. We will not explain the specifics in this blog post, but it is an attempt to solve the problem of manipulating data inside an obfuscation/de-obfuscation chain.
- Detect common browser vulnerabilities.
Our Challenges:
- JavaScript implementation varies in different browsers; it does not always follow the standard. In addition, different versions of different browsers support different JavaScript features.
- Not everything can be hooked easily; once again, this depends on the particular browser and its version.
Core Architecture
Hooking JavaScript functions is a very straightforward solution that makes it easy to target specific events. It is also a cross-browser solution and is much easier to implement than hooking each browser’s JavaScript engine.
The core architecture must be able to somehow inject the JavaScript hooks into the client’s browser and log the noteworthy items to a server.
Our chosen topology for the task looks like this:
The client browser is configured with our proxy server, which gets the request and passes it on to a website. When the response comes back from the website, we inject JavaScript code into the HTTP response.
Our JavaScript code handles all hooking and logic.
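As a rough illustration of the injection step, the sketch below shows one way a proxy could place a script tag at the top of an HTML response so that the hooks load before any page script. The function name injectHookScript and the hook URL are hypothetical; the post does not say how its proxy performs the injection.

```javascript
// Hypothetical proxy-side helper: insert a <script> tag as early as possible
// in an HTML response body so the hooks are installed before page scripts run.
function injectHookScript(html, hookUrl) {
  const tag = '<script src="' + hookUrl + '"></script>';

  // Prefer the position right after the opening <head ...> tag.
  const head = /<head\b[^>]*>/i.exec(html);
  if (head) {
    const at = head.index + head[0].length;
    return html.slice(0, at) + tag + html.slice(at);
  }

  // Otherwise fall back to right after <html ...>, or simply prepend.
  const root = /<html\b[^>]*>/i.exec(html);
  if (root) {
    const at = root.index + root[0].length;
    return html.slice(0, at) + tag + html.slice(at);
  }
  return tag + html;
}
```

Injecting early matters because a hook installed after the page has already called document.write or appendChild would miss those calls.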
Once our JavaScript code finishes running, it needs to log the result.
We have a few available options. For example, we could use the console.log() function. However, this function works in Chrome but not in all versions of IE. The equivalent solution for IE is using the event manager, but that makes development and maintenance much harder.
Our other option is to use <img> tags, but we prefer the idea of adding a header to a new HTTP request and letting our proxy catch it.
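The header-based option could look roughly like the following. This is a minimal sketch under stated assumptions: the header name X-JSH-Report, the path /jsh-report, and the function names are all hypothetical, and the proxy is assumed to strip the request before it reaches any real server.

```javascript
// Hypothetical header name; the proxy would look for it on outgoing requests.
var JSH_HEADER = "X-JSH-Report";

// Serialize a finding into a single header-safe string.
function jshEncodeReport(type, detail) {
  return encodeURIComponent(JSON.stringify({ type: type, detail: detail }));
}

// Send the finding on a new request. XhrCtor defaults to the browser's
// XMLHttpRequest; passing it in keeps the function testable outside a browser.
function jshReport(type, detail, XhrCtor) {
  var Ctor = XhrCtor || XMLHttpRequest;
  var xhr = new Ctor();
  xhr.open("GET", "/jsh-report", true); // hypothetical path the proxy intercepts
  xhr.setRequestHeader(JSH_HEADER, jshEncodeReport(type, detail));
  xhr.send(null);
}
```

Carrying the data in a header keeps it out of the page's DOM, so the logging itself does not add elements that the hooks would then have to ignore.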
Examples
Iframe Detection:
To demonstrate one implementation of this idea, let’s take a look at our first goal: Detect a web site that has been injected with malicious code.
Usually, a benign website is hacked and the cybercriminal adds a piece of code to the website. This redirects the unsuspecting victim to a malicious server such as an exploit kit server.
Our experience shows that the redirection is usually carried out by de-obfuscating the data and dynamically creating an iframe element. The iframe may have suspicious attribute values such as hidden visibility.
An iframe object is a DOM element which eventually, after all mutations, must be added to the HTML document in a way that can be read by the browser’s HTML parser.
As a reference, let’s list some of the ways an element can be added to the HTML document:
- [DOM Element].appendChild([DOM Element])
- [DOM Element].replaceChild([new DOM Element], [old DOM Element])
- document.write(string)
- [DOM Element].innerHTML = [string]
So… our first thought is: if we could see all the changes to an HTML document, we would know if an iframe was added dynamically. We could scan its attributes as well.
Without considering external libraries:
We found that the method that worked best for us was classic JavaScript hooking. We still need to work on the differences between browsers, but we can hook host objects, which in some cases is more reliable.
After choosing our method, we hooked the first three ways of adding elements: appendChild, replaceChild, and document.write.
The fourth method, using ‘innerHTML’, is trickier. The idea is to hook functions that return or create elements of interest, such as document.getElementById() and document.createElement(). We then register the returned objects with a MutationObserver (or fall back to Mutation Events) and watch for changes.
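The observation half of that idea can be sketched as follows. It assumes a browser that supports MutationObserver; the function name jsh_watch_subtree and its callback shape are hypothetical, and the Mutation Events fallback for older browsers is not shown.

```javascript
// Watch `root` for nodes added by any means (including innerHTML) and pass
// each added node to onAdded. ObserverCtor defaults to the browser's
// MutationObserver; it is a parameter only so the sketch can run elsewhere.
function jsh_watch_subtree(root, onAdded, ObserverCtor) {
  var Ctor = ObserverCtor || MutationObserver;
  var observer = new Ctor(function (records) {
    for (var i = 0; i < records.length; i++) {
      var added = records[i].addedNodes;
      for (var j = 0; j < added.length; j++) {
        onAdded(added[j]);
      }
    }
  });
  // childList reports added/removed children; subtree extends that to all
  // descendants of root.
  observer.observe(root, { childList: true, subtree: true });
  return observer;
}
```

A hook on document.createElement() could pass each element it returns to this function, so that a later innerHTML assignment on that element is reported through onAdded.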
First, we create a function for registering hooks. We feel most comfortable with hooking JavaScript this way:
For example, we want to hook the document.write() method and call a function after it finishes. We can use our function like this:
hook_function(document, 'write', null, callback)
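The registration function itself is not reproduced here, so the following is a minimal sketch inferred from the call above: the four parameters are assumed to be the object, the method name, a callback to run before the original, and a callback to run after it. The argument shapes passed to the callbacks are assumptions as well.

```javascript
// Replace obj[name] with a wrapper that calls `before` ahead of the original
// method and `after` once it returns. Either callback may be null.
// `var` is used so the snippet also parses on older browsers.
function hook_function(obj, name, before, after) {
  var original = obj[name];
  if (typeof original !== "function") {
    return false; // nothing hookable under this name on this browser/version
  }
  obj[name] = function () {
    var args = Array.prototype.slice.call(arguments);
    if (before) {
      before.call(this, args);
    }
    // Preserve `this` and the arguments so the page behaves as before.
    var result = original.apply(this, args);
    if (after) {
      after.call(this, args, result);
    }
    return result;
  };
  return true;
}
```

With this shape, the callback in the call above would receive the arguments given to document.write() together with its return value. Host methods on older browsers may not report as functions, in which case this sketch declines to hook them and a different mechanism would be needed.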
As a side note, sometimes different functions need different hooking mechanisms. This depends on the browser and the browser version. For example, IE11 supports extending Node.prototype, but older versions of the browser don’t.
There’s not much to say about the logic. Everything is eventually added as an HTML element or a string, so it comes down to traversing elements and checking their attributes and content, or comparing strings.
We put all the logic to detect an iframe inside the callback function. Let’s see if that works.
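For the string case (markup passed to document.write()), the check inside the callback might look like the sketch below. The function names and the specific heuristics (hidden visibility, display:none, width or height of 0 or 1) are illustrative assumptions, not the rules used in the original tool.

```javascript
// Pull one attribute value out of a single tag's source text.
function jsh_attr(tag, name) {
  var re = new RegExp(
    "\\s" + name + "\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)'|([^\\s>]+))", "i");
  var m = re.exec(tag);
  if (!m) return null;
  return m[1] !== undefined ? m[1] : (m[2] !== undefined ? m[2] : m[3]);
}

// Return one finding per iframe tag that looks deliberately hidden;
// an empty array means nothing suspicious was found.
function jsh_scan_for_iframes(html) {
  var findings = [];
  var tags = String(html).match(/<iframe\b[^>]*>/gi) || [];
  for (var i = 0; i < tags.length; i++) {
    var tag = tags[i];
    var style = (jsh_attr(tag, "style") || "").replace(/\s+/g, "").toLowerCase();
    var width = jsh_attr(tag, "width");
    var height = jsh_attr(tag, "height");
    var reasons = [];
    if (style.indexOf("visibility:hidden") !== -1) reasons.push("visibility:hidden");
    if (style.indexOf("display:none") !== -1) reasons.push("display:none");
    if (width !== null && parseInt(width, 10) <= 1) reasons.push("tiny width");
    if (height !== null && parseInt(height, 10) <= 1) reasons.push("tiny height");
    if (reasons.length > 0) {
      findings.push({ src: jsh_attr(tag, "src"), reasons: reasons });
    }
  }
  return findings;
}
```

For elements added through appendChild or replaceChild, the same attribute checks would be applied to the element and its descendants rather than to a string.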
It works!
Of course, we still need to check for false-positive rates. We may tweak the attributes, or maybe even use a blacklist of specific patterns. However, the example above is real: an infected website with a dynamic iframe eventually leads us to an infection chain.
Evasion Detection:
Exploit-kits are systems that try to maximize their uptime by reducing their general signature, meaning they will expose as little of themselves as possible. This is why they employ different evasion techniques. For example, sometimes they will use known vulnerabilities to determine if the landing page runs on a virtual machine. At other times, they will try to load certain components to find out if there is some kind of protection system running. These are signs that raise suspicions and help identify malicious websites. Let’s tackle these two examples.
A known technique for determining whether software is installed on a system is to abuse the res:// protocol, using a vulnerability that affects different versions of IE. There are a few ways to do it, but by hooking dynamic creation of elements like <img> and <script>, we can send the content to a function that checks for the use of this vulnerability.
In the following screenshot, we see an exploit kit attempt to recognize whether software such as Fiddler proxy, VirtualBox, VMware, etc. is installed on the system. If any is present, the exploit kit will not continue its execution.
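The checking function can be quite small. The sketch below assumes the hooks hand it the URL assigned to a dynamically created element's src attribute; both function names are hypothetical.

```javascript
// True when a URL uses the res:// scheme, which points at a resource inside
// a local executable or DLL rather than at anything on the network.
function jsh_is_res_probe(url) {
  return /^\s*res:\/\//i.test(String(url));
}

// Return the local file being probed (the part between res:// and the
// first forward slash), or null when the URL is not a res:// URL.
function jsh_res_target(url) {
  var m = /^\s*res:\/\/([^\/]+)/i.exec(String(url));
  return m ? m[1] : null;
}
```

A web page has little legitimate reason to load an image or script from a local binary, so any hit is worth logging together with the file path it targets.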
The second example of evasion on IE is when an exploit kit tries to instantiate an ActiveXObject of a specific product. ActiveXObjects can be wrapped with our own object. A simple method would be to use code like this:
The jsh_detect_activex_by_name method will take the name and check if a known ActiveX object is used when trying to evade a system.
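A minimal sketch of such a wrapper follows. Only the name jsh_detect_activex_by_name comes from the text above; its body, the watch list entry, the log array, and jsh_wrap_activex are assumptions made for illustration.

```javascript
// Hypothetical watch list of lowercase ProgID prefixes; real entries would
// come from ProgIDs observed in exploit kit landing pages.
var JSH_WATCHED_PROGIDS = ["example.securityproduct"];
var jsh_activex_log = [];

// Record and report whether the requested ProgID is on the watch list.
function jsh_detect_activex_by_name(name) {
  var wanted = String(name).toLowerCase();
  for (var i = 0; i < JSH_WATCHED_PROGIDS.length; i++) {
    if (wanted.indexOf(JSH_WATCHED_PROGIDS[i]) === 0) {
      jsh_activex_log.push(name);
      return true;
    }
  }
  return false;
}

// Wrap the ActiveXObject constructor found on `scope` (window in IE).
function jsh_wrap_activex(scope) {
  // Membership test rather than a truthiness check, since some IE versions
  // are reported to make the property look undefined while it still works.
  if (!("ActiveXObject" in scope)) return false;
  var Original = scope.ActiveXObject;
  scope.ActiveXObject = function (name, location) {
    jsh_detect_activex_by_name(name);
    // Hand back the real object so the page behaves as it normally would.
    return arguments.length > 1 ? new Original(name, location) : new Original(name);
  };
  return true;
}
```

Because the wrapper returns the real object, a page that uses ActiveX legitimately should keep working while every instantiation attempt passes through the detection function.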
Here the attacker tries to instantiate a Kaspersky virtual keyboard plugin to check if a Kaspersky product is installed.
Plugin Detection:
When an exploit kit tries to attack a victim through the exploitation of a plugin vulnerability, it must add the plugin object to the HTML document. This object will be added to the page after de-obfuscation. With JavaScript hooking, we can hook into the creation of these elements and even extract interesting data. Let’s take a look at the next example.
The Nuclear exploit kit uses a recent Flash vulnerability to attack victims. It uses the <object> tag to load a Flash object. To bypass signature-based protections, the tag is obfuscated and is only added at runtime.
The <object> tag loads the .swf file that triggers the vulnerability. It’s important to note that FlashVars are supplied to the movie as well. Exploit-kits sometimes use FlashVars to supply the payload download link or even the shellcode to the exploit.
With JavaScript, we hook into all interesting functions and scan for object creation. If we find a Flash object, we log it and extract the FlashVars if they exist.
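For markup that arrives as a string, the scan could be sketched as below. The function name and the result shape are hypothetical, and the regular expressions only cover quoted attribute values.

```javascript
// Look for a Flash <object> or <embed> in a markup string and return the
// movie URL and FlashVars, or null when no Flash object is present.
function jsh_scan_for_flash(html) {
  var text = String(html);
  var isFlash = /application\/x-shockwave-flash/i.test(text) || /\.swf\b/i.test(text);
  if (!/<(object|embed)\b/i.test(text) || !isFlash) return null;

  // The movie may be given as data=, src=, or a <param> value= ending in .swf.
  var movie = /\b(?:data|src|value)\s*=\s*["']([^"']*\.swf[^"']*)["']/i.exec(text);

  // FlashVars may be a <param> child of <object> or an attribute of <embed>.
  var flashVars = null;
  var param = /<param\b[^>]*\bname\s*=\s*["']flashvars["'][^>]*>/i.exec(text);
  if (param) {
    var value = /\bvalue\s*=\s*["']([^"']*)["']/i.exec(param[0]);
    flashVars = value ? value[1] : null;
  } else {
    var attr = /\bflashvars\s*=\s*["']([^"']*)["']/i.exec(text);
    flashVars = attr ? attr[1] : null;
  }
  return { movie: movie ? movie[1] : null, flashVars: flashVars };
}
```

Logging the FlashVars alongside the movie URL preserves whatever the kit passed to the exploit, such as a payload link, without first having to break the obfuscation that produced the tag.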
Conclusion
Our examples show how JavaScript hooking can be leveraged for research purposes. It is clear that tackling web page behavior through this kind of interception opens up greater possibilities for gathering information about a web site and may even eliminate the need to break an obfuscation scheme.
JavaScript hooking can be used in manual analysis as well as in an automated pathway integrated into emulation environments.