Wednesday, March 9, 2011

Analyzing PDF exploits for finding payloads used

We have written a couple of previous blogs with in-depth analysis of PDF exploits, as this is yet another technique used by attackers to package malicious code and avoid antivirus detection. We have also written in the past about the different decoding filters used to hide malicious code inside PDF files. In this blog, we will examine yet another in-the-wild PDF exploit, which hides its malicious code across different objects. We will also identify the final payload used to carry out the attack. The malicious PDF sample was retrieved from “hxxp://”. Here is the PDF source code:

The above PDF file is small in size and contains clear text JavaScript. Let’s look at object 1, which contains the malicious JavaScript code. This code is very simple to read and understand. The JavaScript accesses some property values declared elsewhere, such as “this.producer”, “this.subject” etc. Now, where are those declared values coming from? If you look at object 3, you will notice that it defines these properties, and the JavaScript reads all of its remaining values from them. The strings used, like “eval” and “StringfromCharCode”, suggest that this JavaScript is malicious. We now have 3 different strings from object 3, which will be used in the malicious JavaScript code.

1) Property Producer contains a long array of values
2) Property Subject contains string “eval”
3) Property Title contains string “StringfromCharCode”
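
To make the indirection concrete, here is a minimal sketch of how code can look up function names from document properties instead of spelling them out. It is not the sample's code. The `doc` object is a hypothetical stand-in for the PDF document object, and the sketch assumes the usable method name is `fromCharCode` (the real method on `String`), whereas the Title string is reported above as “StringfromCharCode”. The `safeSink` helper is also our own addition: during analysis we record what would be executed rather than run it.

```javascript
// Hypothetical stand-in for the PDF document object (values from object 3).
const doc = {
  producer: "t9.5*w",    // first array entry quoted in this post
  subject: "eval",       // as reported for the Subject property
  title: "fromCharCode", // assumed spelling of the method name
};

// Resolve the names by string lookup, as obfuscated code typically does.
const sink = globalThis[doc.subject]; // resolves to the built-in eval
const toChar = String[doc.title];     // resolves to String.fromCharCode

// For analysis, never call the resolved sink; record the code instead.
const captured = [];
const safeSink = (code) => {
  captured.push(code);
};

safeSink(toChar(118, 97, 114));
console.log(sink === eval, captured);
```

Swapping the sink for a logger is the same idea as letting a tool like Malzilla show the evaluated string instead of running it.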

We will use Malzilla to decode this malicious JavaScript. For that, we need to substitute the respective values into the variables. We will re-create the JavaScript code for Malzilla using those values to assist with the decoding process. We need to look at the code carefully, because the array involves some substitution and arithmetic operations. Here is what we need to substitute:

Let’s take the first value from the Producer property array, which is “t9.5*w”. The malicious JavaScript contains one variable “w”, declared with a value of 4, and a statement (axp = axp.replace(/t/g,'2');) which replaces every character “t” with “2”. So the first value of the array becomes (29.5*4) = 118. When we substitute the values of “t” and “w” throughout the whole array, we can create the final simple JavaScript code:
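
The substitution and arithmetic described above can be sketched as follows. This is an illustration under assumptions, not the sample's actual code: only the first entry, “t9.5*w”, comes from the sample, while the other two entries and the `decodeProducer` helper are made up so the example stays short. The sketch also assumes every entry has the form `number*w`, and it parses the terms instead of calling eval.

```javascript
// Only "t9.5*w" is from the sample; the other two entries are made up.
const producer = "t9.5*w,t4.25*w,t8.5*w";
const w = 4; // value of "w" as declared in the malicious JavaScript

function decodeProducer(raw, multiplier) {
  // Same substitution as the sample: every "t" becomes "2".
  const axp = raw.replace(/t/g, "2");
  return axp.split(",").map((term) => {
    const [factor, name] = term.split("*");
    if (name !== "w") throw new Error("unexpected term: " + term);
    return parseFloat(factor) * multiplier; // e.g. 29.5 * 4 = 118
  });
}

const codes = decodeProducer(producer, w);
const text = String.fromCharCode(...codes);
console.log(codes, text);
```

Each resulting number is a character code, so the first entry, 118, maps to the letter “v”.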

We will now decode the above simplified code using Malzilla. The decoded content is shown below:

The decoded content contains malicious heap spray code, shellcode and code for attacking different Adobe vulnerabilities. However, we have yet to identify what this malicious code does once it exploits the vulnerability. What payloads does it use for the exploit? To answer this, we need to identify the shellcode used. Here is what the shellcode looks like:

The shellcode is in %u Unicode-encoded format. We will convert it into an executable for further reversing with IDA Pro or OllyDbg. For this, we will use our favorite online tool, Shellcode 2 EXE. We copy and paste the shellcode bytes from the variable, and the tool generates a sample “.exe” file to analyze. Here is the screenshot:
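
As an aside, for readers who want to see what the %u conversion involves, here is a minimal sketch of turning a %u-encoded string into raw bytes. It is not how Shellcode 2 EXE is implemented, and it does not build an executable. The `unescapeToBytes` helper and the input string are made up for illustration. Each %uXXXX group is a 16-bit value that ends up in memory low byte first on x86.

```javascript
// Convert a %uXXXX string into raw bytes. Each %uXXXX is one 16-bit value
// stored little-endian, so %u4241 becomes the bytes 41 42.
function unescapeToBytes(encoded) {
  const bytes = [];
  const pattern = /%u([0-9a-fA-F]{4})/g;
  let match;
  while ((match = pattern.exec(encoded)) !== null) {
    const word = parseInt(match[1], 16);
    bytes.push(word & 0xff);        // low byte first
    bytes.push((word >> 8) & 0xff); // then high byte
  }
  return Uint8Array.from(bytes);
}

// Made-up input: two NOP pairs followed by the bytes 41 42 43 44.
const sample = "%u9090%u9090%u4241%u4443";
const raw = unescapeToBytes(sample);
const hex = Array.from(raw, (b) => b.toString(16).padStart(2, "0")).join(" ");
console.log(hex);
```

The byte swap matters: reading the hex digits left to right would reverse every pair of bytes and produce garbage in the disassembler.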

Now we have an executable file to analyze. Let’s open it in IDA Pro first to look at the strings used inside the payload. Here are the strings found:

The strings show that this payload is going to download additional files onto the system. Now let’s open this file in OllyDbg to obtain the malicious URL used inside the payload.
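
The string listing from the previous step can be approximated without IDA. The sketch below scans a byte buffer for runs of printable ASCII, which is roughly what a strings view does. The `extractStrings` helper, the minimum length of 4 and the buffer contents are our own assumptions; the two names in the buffer are only examples, not the strings found in this sample.

```javascript
// Collect runs of printable ASCII characters of at least minLength bytes.
function extractStrings(bytes, minLength = 4) {
  const found = [];
  let current = "";
  for (const b of bytes) {
    if (b >= 0x20 && b <= 0x7e) {
      current += String.fromCharCode(b);
    } else {
      if (current.length >= minLength) found.push(current);
      current = "";
    }
  }
  if (current.length >= minLength) found.push(current);
  return found;
}

// Made-up buffer: NOPs, a name, a NUL, a run that is too short, another name.
const buffer = Uint8Array.from([
  0x90, 0x90,
  ...Array.from("URLDownloadToFileA", (c) => c.charCodeAt(0)),
  0x00, 0x41, 0x42, 0x00,
  ...Array.from("WinExec", (c) => c.charCodeAt(0)),
]);
const strings = extractStrings(buffer);
console.log(strings);
```

Strings that are still encoded inside the payload will not show up in such a scan, which is why the URL only appears after the decoder has run.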

The shellcode starts with NOP instructions, followed by a loop which decodes the malicious code. Look at the instructions above, inside the highlighted box. Those are the instructions used to decode everything. By stepping through the code, we find one instruction that compares the value with E9 to exit the loop and another that XORs each byte with a value of 31. We will put a breakpoint at the RETN instruction. The code then runs successfully and we are presented with the decoded content, which contains more interesting strings.
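
The decoding loop can also be reproduced offline. The sketch below rests on several assumptions: that 31 and E9 are hexadecimal values as OllyDbg displays them, that the comparison with E9 is made on the still-encoded byte, and that a single-byte key is used throughout. The `xorDecode` helper and the test data are made up; the real decoder stub may differ in these details.

```javascript
const KEY = 0x31;        // assumed to be hex, as debuggers usually display values
const TERMINATOR = 0xe9; // assumed to mark the end of the encoded region

function xorDecode(encoded, key, terminator) {
  const out = [];
  for (const b of encoded) {
    if (b === terminator) break; // stop before decoding the marker
    out.push(b ^ key);
  }
  return Uint8Array.from(out);
}

// Build a made-up encoded buffer from a harmless string plus the marker.
const plain = "hxxp://example.invalid/a.exe";
const encoded = Uint8Array.from([
  ...Array.from(plain, (c) => c.charCodeAt(0) ^ KEY),
  TERMINATOR,
  0x00, // trailing byte that must not be decoded
]);
const decoded = String.fromCharCode(...xorDecode(encoded, KEY, TERMINATOR));
console.log(decoded);
```

If the output looks like noise, the first things to revisit are the key value and where the comparison happens.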

Look at the highlighted string above in the dump area. This URL will be used to download another binary from the server. Now, we have identified this malicious payload and the URL used. For reference, here is the result from ThreatExpert for the same shellcode.
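
Independent of the ThreatExpert report, once the decoded bytes are available as text, the URL can be pulled out with a simple pattern match and written in the same defanged “hxxp” style used earlier in this post. The `findUrls` and `defang` helpers and the dump string are hypothetical, and the URL is a placeholder rather than the one from this sample.

```javascript
// Find http/https URLs made of printable, non-space ASCII characters.
function findUrls(text) {
  return text.match(/https?:\/\/[\x21-\x7e]+/g) || [];
}

// Rewrite the scheme so the URL is not clickable in a report.
function defang(url) {
  return url.replace(/^http/i, "hxxp");
}

// Made-up memory dump: control bytes, a module name, then a placeholder URL.
const dump = "\u0000\u0001URLMON\u0000http://example.invalid/payload.exe\u0000\u0090";
const urls = findUrls(dump).map(defang);
console.log(urls);
```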

That’s it for now.



Stefan said...

Nice write-up, Umesh. Would you mind sharing the PDF's md5, please?

Umesh Wanve said...

@ Stefan

Sure. Here you go,

MD5: 5a60ccab494cffe7149d5f8bc722fc1a

niranjan said...

Hi Umesh,

nice write up. I cannot find the md5 . could you mail me at

Best regards,