Monday, March 29, 2010

Contrary to popular belief, short links on Twitter aren't malicious!

Twitter recently announced that it has implemented a new security system to scan all URLs posted in tweets to protect users from malicious sites. This follows a similar announcement from bit.ly in November 2009.

Twitter, and the URL shorteners it has helped to popularize, have long been blamed for leading users to malicious sites. I posted on this topic 3 weeks ago and argued that this may not be true. I also wanted to investigate Twitter links thoroughly, both before and after the security scan was introduced.
I retrieved more than 1 million URLs (1,314,615 to be exact) from the public timeline over a couple of weeks, before Twitter put any protections in place. I then ran the links through the Zscaler infrastructure to find out which links lead to malicious sites.


The state of the Twitter links
 Prevalence of hostnames on Twitter

As expected, URL shorteners are very popular on Twitter, and bit.ly represents 40% of all links. TinyURL, one of the original URL shorteners, comes in 3rd with only 5% of all URLs.

How many malicious links?
I looked for malicious sites - phishing sites, malware, etc. I did not look for spam, only for pages that present a security risk to users.
To my surprise, a very low number of links led to malicious pages: only 773 links, 0.06% of all links scanned, redirected to malicious content.
  Types of malicious sites
Here is the distribution of malicious links by host name:

Bit.ly represents 40% of all links, and roughly the same proportion of malicious links. The same goes for TinyURL: 5% of all URLs and 6% of all malicious sites. It does not look like bit.ly’s phishing and malware protection is making it any safer than other URL shorteners. Twitpic.com is used to share images, so it is unlikely to be used for malicious content. Mediafire is known for hosting malware and other viruses, even if it is not blacklisted by Google Safe Browsing.
Note that these links may have been scanned up to 4 weeks after they were collected. Bad sites may already have been taken down, or cleaned up.


Can Twitter and bit.ly really protect their users?
The key to protecting end users is real-time scanning of both the URL and the content. Twitter and bit.ly can only scan the links periodically. Malicious websites try to hide their malicious content from non-users by checking the user agent or geography and by requiring a real browser which fully understands JavaScript, Flash, etc. An attacker can present harmless content to the Twitter or bit.ly scanners, but harmful content to a real user.
But remember that only 0.06% of all the URLs tested represented a security risk. It is actually much safer to follow links from Twitter than from some search results on Google!

-- Julien

Whitepaper: Botnet Analysis Leveraging Domain Ratio Analysis

While conducting stats and trends for last Quarter's "State of the Web" report, I found an interesting way of analyzing top-level domains (TLDs). I added the total number of web transactions involving a TLD for the month and divided it by the total number of unique domains within that TLD. In other words I calculated a ratio of Transactions:Unique Domains per TLD for each month and tracked this ratio. A low ratio means that the transactions were well distributed across the domains visited within that TLD. A ratio of 1:1 for example means that there was essentially 1 web transaction per unique domain visited. A very high ratio would indicate that there were a large number of transactions to one or more of the unique domains visited - suggesting that one or more popular domains dominated customer usage of that particular TLD.
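The ratio described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the report: the function name `tld_ratios` and the sample hostnames are made up, and each distinct hostname is treated as one "unique domain", which is a simplification.

```python
from collections import defaultdict

def tld_ratios(hostnames):
    """Return {tld: transactions / unique domains} for one hostname per transaction."""
    transactions = defaultdict(int)   # tld -> number of transactions
    domains = defaultdict(set)        # tld -> distinct hostnames seen
    for host in hostnames:
        host = host.lower().rstrip(".")
        tld = host.rsplit(".", 1)[-1]
        transactions[tld] += 1
        domains[tld].add(host)
    return {tld: transactions[tld] / len(domains[tld]) for tld in transactions}

# Hypothetical sample: one popular .ly domain, a few lightly used others.
sample = ["bit.ly"] * 6 + ["example.com", "example.org", "example.com", "example.net"]
ratios = tld_ratios(sample)

# Print TLDs from highest to lowest ratio.
for tld, ratio in sorted(ratios.items(), key=lambda item: -item[1]):
    print(f".{tld}: {ratio:.1f} transactions per unique domain")
```

In this made-up sample, .ly stands out because all of its transactions go to a single domain, which is the pattern worth a closer look.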

By sifting through the records for the high-ratio results, some interesting information can be discovered. In some cases, high ratios were caused by numerous transactions to a popular site or service, such as a popular social networking site in a particular ccTLD. However, high ratios may also represent malicious command and control (C&C) or information drop servers that have a large number of transactions beaconing to them.

An example of a TLD that bubbled to the top was .LY. This domain had more than double the monthly ratio value of .COM. This high ratio is explained by the TLD being relatively unpopular for our customers in terms of unique domains visited, but having a large number of transactions to a popular domain: BIT.LY, a URL shortening service.

Another TLD, .NU, had more than double the monthly ratio of .LY. After analyzing the results, I found that several customers were beaconing to a .NU site over HTTP on port 53/TCP (generally used for DNS). Further investigation showed that the customers were infected with a previously undetected variant of the Win32.PcClient Backdoor. The full research report of the detection methodology and incident analysis can be read HERE.

Thursday, March 25, 2010

Web Security: the Google paradox

Google has done a great deal to help people safely browse the web:
  • Google Safe Browsing is a feed of malicious URLs and phishing sites, which is integrated with Firefox
  • skipfish, released earlier this week, aims at testing the security of websites
  • Chrome, Google's new browser, has a strong emphasis on user security
  • etc.
However, we have shown in previous blog posts that Google often includes malicious websites in the top-50 search results for breaking news stories: March Madness, the Chile earthquake, the Winter Olympics, the Haiti earthquake, etc. After clicking on one of these search results, the user gets redirected to a malicious website.

How safe are the most popular searches?


Yesterday, I started an experiment. I am retrieving the most popular US searches from Google Trends to check how many malicious sites are displayed by Google. Note that Google Trends changes each day, and the trends might be different when you read this post.

Here are the searches I tried, all from Google Trends on 03/24 and 03/25:

03/24, casey reinhardt:
  1. page 5, hxxp://iablaas.com/present/casey-reinhardt.html redirects to the malicious site hxxp://search4-protect.xorg.pl/
03/24, wikipedia down:
  1. page 1, hxxp://wikipedia-down.prolinepitcarts.com/ uses a PDF exploit
Malicious link within the Google search results.

Note that my antivirus thinks this page is safe. It does warn me while the PDF exploit is running, however.

03/24, patrick trainor:
  1. page 7, hxxp://riablaas.com/present/patrick-trainor.html redirects to hxxp://runforclear1.xorg.pl/
03/24, jarrett rex:
  1. page 1, hxxp://mandurphy.com/presentation/jarrett-rex.html redirects to hxxp://runforclear1.xorg.pl/
  2. page 3, hxxp://jarrett-rex.prolinepitcarts.com/ uses a PDF exploit
  3. page 3, hxxp://riablaas.com/present/jarrett-rex.html redirects to hxxp://runforclear1.xorg.pl/
03/25, dame edna:
  1. page 1, hxxp://front9design.com/ztssw.php?on=dame%20edna redirects to hxxp://save4my-sys.xorg.pl/
  2. page 2, hxxp://www.friendsofguitar.com/sqgrk.php?go=dame%20edna redirects to hxxp://save4my-sys.xorg.pl/
  3. page 3, hxxp://denverneighborhoodnews.com/ddlkc.php?a=dame%20edna redirects to hxxp://save4my-sys.xorg.pl/
  4. page 3, hxxp://global-equality.org/tcwud.php?in=dame%20edna%20wiki redirects to hxxp://save4my-sys.xorg.pl/
  5. page 6, hxxp://origin.ny1.com/1-all-boroughs-news-content/top_stories/... redirects to hxxp://save4my-sys.xorg.pl/
  6. page 7, hxxp://friends.opensourcediet.com/adugq.php?page=dame%20edna redirects to hxxp://save4my-sys.xorg.pl/
  7. page 8, hxxp://friends.opensourcediet.com/adugq.php?page=dame%20edna redirects to hxxp://save4my-sys.xorg.pl/
  8. page 8, hxxp://dennek.com/pilxf.php?a=dame%20edna%20broadway redirects to hxxp://save4my-sys.xorg.pl/
03/25, beyonce pregnant:
  1. page 1, hxxp://americanbeachsoccer.com/noldor/beyonce-pregnant.html redirects to hxxp://runforclear1.xorg.pl/
03/25, johnny maestro:
  1. page 1, 2 results are malicious
  2. page 2, 6 results are malicious!
  3. page 3, 6 other results are malicious
  4. page 4, 7 other results are malicious
  5. page 5, 4 malicious results (I stopped there)
As you can see, this is pretty bad. All the queries I tried showed at least 1 malicious page in the first 10 pages (top 100 results). I suspect the ratio of good URLs:malicious pages is even worse for trends that are older than what I looked at in my initial research. For the search johnny maestro, more than 50% of the links are malicious!

xorg.pl is involved in most of the malicious redirections. It shows a fake AV page and tricks the user into running malware on their computer. Fortunately, it is a known malicious domain and it is flagged by Firefox:

Malicious page blocked by Google Safe Browsing in Firefox

If you search for prolinepitcarts.com on Google, it lists 2,200 results. These are 2,200 malicious links for a single domain. They are all PDF files, with the URL in the form of keyword1-keyword2.prolinepitcarts.com.

Search result for prolinepitcarts.com

The Challenges

Why does Google not do a better job at cleaning up the results? Malicious hackers are doing their best to hide the malicious pages from security scanners. First, you have to hit the malicious page by coming from Google (referer header). Then, you need to have a vulnerable browser (Internet Explorer 6 is a good bet). Then the tool has to run all of the JavaScript, Flash and PDF elements to follow the redirections.

But I would hope that they could at least clean up their top results for the top searches. After checking a couple of bad links, you can find a few elements that make the malicious content stand out. For example, they often are .php pages with the search query in the URL parameter. Or they have all the right keywords in the sub-domains. By targeting these few URLs, Google would need only limited resources to clean up its search results.
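The two patterns mentioned above can be sketched as a simple filter. This is only an illustration: the function name `looks_suspicious` is made up, the sub-domain check assumes a two-label registered domain (it would misread something like example.co.uk), and a legitimate search page would also match the first rule, so hits would still need to be scanned.

```python
from urllib.parse import urlsplit, parse_qsl

def looks_suspicious(url, search_query):
    """Flag URLs that match either of the two patterns seen in the bad results."""
    words = search_query.lower().split()
    if not words:
        return False
    parts = urlsplit(url)

    # Pattern 1: a .php page with the search query in a URL parameter.
    if parts.path.lower().endswith(".php"):
        for _, value in parse_qsl(parts.query):
            if all(word in value.lower() for word in words):
                return True

    # Pattern 2: all the search keywords appear in the sub-domain.
    labels = (parts.hostname or "").lower().split(".")
    subdomain = ".".join(labels[:-2])  # assumes a two-label registered domain
    if subdomain and all(word in subdomain for word in words):
        return True

    return False

# URLs taken from the results listed earlier in this post (defanged with hxxp).
print(looks_suspicious("hxxp://front9design.com/ztssw.php?on=dame%20edna", "dame edna"))
print(looks_suspicious("hxxp://jarrett-rex.prolinepitcarts.com/", "jarrett rex"))
print(looks_suspicious("hxxp://riablaas.com/present/jarrett-rex.html", "jarrett rex"))
```

The third URL is also malicious but matches neither pattern, so a filter like this would only be a cheap first pass.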

What's next?

I am looking at extending this experiment to other search engines. I also want to see if this is specific to the US, or if search results in other languages contain as many malicious sites. Finally, I will try to get more comprehensive results for more trends, older trends, etc.

-- Julien

Tuesday, March 23, 2010

March Madness Malware

The previous post covered the increase in corporate users' traffic to sports-related websites because of March Madness. A follow-on question: are there any associated security risks? While browsing some well-known sports sites carries negligible risk, several Google searches for NCAA / March Madness terms reveal malicious results:


Following the search result causes redirection to occur:

Which loads a Fake A/V page:
Which is detected by 9/42 anti-virus vendors.

In addition to this Google SEO example, Zscaler blocked and logged this malicious NCAA site:
hxxp://ncaa-bracket-2010-update.bitterrootjrfootball.com


which loads an obfuscated JavaScript file: /styless.js
which after some decoding, redirects to this flash file:
hxxp://ncaa-bracket-2010-update.bitterrootjrfootball.com/?ncaabracket2010updatencaabracket2010updatebitterrootjrfootballcom.swf

This is the Wepawet report for the flash file, and the VirusTotal (6/42 detection) report.

The flash file contains an obfuscated JavaScript redirector:

Decodes to:
Here, document.location.search provides the query string portion of the URL.
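To make the query string portion concrete, here is a small Python sketch that mimics what document.location.search would return for the flash file URL above. The helper name `location_search` is made up for this example, and it shows only the value of that property, not what the redirector does with it.

```python
from urllib.parse import urlsplit

def location_search(url):
    """Mimic document.location.search: '?' plus the query string, or '' if none."""
    query = urlsplit(url).query
    return "?" + query if query else ""

# The flash file URL from this post (defanged with hxxp).
url = ("hxxp://ncaa-bracket-2010-update.bitterrootjrfootball.com/"
       "?ncaabracket2010updatencaabracket2010updatebitterrootjrfootballcom.swf")

search = location_search(url)
print(search)
```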

Safe Browsing results (Google, Norton) show a handful of fake NCAA related sites that load Fake A/V:
hxxp://st-mary-s-basketball.bitterrootjrfootball.com/
hxxp://espn-bracket-picks.bitterrootjrfootball.com/
hxxp://siena-university.bitterrootjrfootball.com/
hxxp://nit-tournament.bitterrootjrfootball.com/

Monday, March 22, 2010

Corporate Users and the March Madness Temptation

Over 50% increase in Sports Related Bandwidth Usage

For those not familiar with Zscaler, Inc.: we provide Security Software as a Service (SaaS) to protect customers from web-based threats as well as enforce customer policy decisions. We see traffic from over 140 different countries, and process millions of web transactions every day. Our customer user base is composed primarily of corporate end-users. Based on these facts, Zscaler is in a unique position to analyze stats and trends for corporate end-user web usage during the NCAA basketball tournament.

I pulled the numbers for transactions to sports related websites for March 2010 and as suspected, there was a noticeable increase in requests to sports related websites. In order to best quantify the activity I added the total HTTP request and response size (bytes) per day to sports related sites - I call this bandwidth usage. I calculated a trend line for weekday and weekend traffic. Because our users are mostly corporate users, there is a general decline in weekend web usage, so the actual trends are more visible by trending weekday and weekend traffic separately.
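The approach described above (sum request and response bytes per day, then fit separate trend lines for weekdays and weekends) can be sketched as follows. This is a rough sketch under assumptions: the records are made-up numbers, the function names are hypothetical, and an ordinary least-squares line stands in for whatever trend line the charting tool actually computed.

```python
from collections import defaultdict
from datetime import date

def daily_bandwidth(records):
    """Sum request + response bytes per day; records are (date, req_bytes, resp_bytes)."""
    totals = defaultdict(int)
    for day, request_bytes, response_bytes in records:
        totals[day] += request_bytes + response_bytes
    return dict(totals)

def linear_trend(points):
    """Least-squares (slope, intercept) for (x, y) points; needs two or more distinct x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def weekday_weekend_trends(totals):
    """Fit one trend line to weekday totals and another to weekend totals."""
    weekday, weekend = [], []
    for day, total in sorted(totals.items()):
        bucket = weekday if day.weekday() < 5 else weekend  # Mon=0 ... Sun=6
        bucket.append((day.day, total))
    return linear_trend(weekday), linear_trend(weekend)

# Made-up sample records for March 2010 (March 1, 2010 was a Monday).
records = [
    (date(2010, 3, 1), 20, 30), (date(2010, 3, 1), 20, 30),
    (date(2010, 3, 2), 50, 60), (date(2010, 3, 3), 50, 70),
    (date(2010, 3, 4), 60, 70), (date(2010, 3, 5), 60, 80),
    (date(2010, 3, 6), 4, 6), (date(2010, 3, 7), 5, 5),
    (date(2010, 3, 13), 8, 12), (date(2010, 3, 14), 10, 10),
]
(weekday_slope, _), (weekend_slope, _) = weekday_weekend_trends(daily_bandwidth(records))
print(f"weekday slope: {weekday_slope:.2f} bytes/day, weekend slope: {weekend_slope:.2f} bytes/day")
```

Fitting the two groups separately keeps the regular weekend dip from flattening the weekday trend.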



The blue line represents the weekday trend and the red the weekend trend. The light-grey Y-axis lines are the bandwidth values in KB; the actual numbers are not shown, but the difference between each line is 100,000 KB. What we see are the following percentage increases for sports-related bandwidth usage during March 2010:
  • 37% increase during weekday traffic
  • 68% increase during weekend traffic
  • The mean of these two is a 52.5% increase in sports-related bandwidth usage
There are a number of March sports events, including the PGA tournament and the IPL cricket tournament (as I mentioned, Zscaler has a global user base). As the trend lines show, there is an increase in activity in the later March dates, corresponding with the NCAA tournament games:
  • First-round games (March 18-19)
  • Second-round games (March 20-21)
The two premier websites providing coverage of the NCAA tournament are ncaa.com and cbssports.com. Below is a chart of bandwidth usage to these two domains over March:


Other NCAA dates visible within the data are the NCAA division championships March 12-14 and the NCAA tournament selection (“Selection Sunday”) March 14. The trend line shows roughly a 95% increase in bandwidth to these two domains over March.

While corporations may turn a blind eye to their employees spending time online monitoring their brackets and watching the amazing upsets that have occurred in this year’s tournament, this increase in sports-related web usage likely means a drop in productivity and potentially associated costs with respect to bandwidth usage. However, most would argue that the tournament and its associated office pools provide a welcome increase in morale … that is of course, unless you are a die-hard fan of one of the top seeds that has been upset ;)

Thursday, March 18, 2010

Trojan Monkif is still an active and consistent Botnet threat

While a recent Koobface blog post showed a sudden rise and fall in Koobface network activity, Trojan Monkif’s C&C requests are still being seen consistently. According to our data, Trojan Monkif is still highly active today, making new HTTP requests every day to pull down additional C&C commands from specific servers. Even though this threat is a year old, it is still present in the wild and impacting users. Trojan Monkif makes a number of different requests to predefined servers to collect commands, and the server responds with JPEG images that encapsulate the malicious commands. This is done to avoid detection of the C&C commands on the network.

Monkif’s function is to pull command instructions from the remote server and download additional malware. It generates many unique HTTP requests to the same server to download malicious JPEG files containing encoded commands. It also installs a BHO (Browser Helper Object) on the infected machine. Interestingly, Trojan Monkif’s unique network traffic per month drops less than Koobface’s. Here are the March 2010 unique web requests per day:


Only 4-5 C&C servers are seen in the requests, but every request is unique, containing different root directories, random “.php” files and different parameters passed to the file. The only highly active and live C&C server is hosted in Sweden with IP 88.80.7.152. Here are the servers associated with the Monkif activity:
hxxp://88.80.7.152
hxxp://www.clickspot.biz
hxxp://stats.hillmedia.biz
hxxp://cdn.cbtclick.biz
hxxp://88.80.5.3
hxxp://www.clickbig.biz
hxxp://stats.woodmedia.biz


Here is what the random requests look like:
hxxp://88.80.7.152/cgi/tjo.php?ddddd=4x4x4x4x4x70x
hxxp://88.80.7.152/cgi/dsiy.php?dsydm=5561671
hxxp://88.80.7.152/babymaybe/hwmcr.php?wmcrx=06166<1x644434x4x4x4x=x>
hxxp://88.80.7.152/babynot/blfp.php?dxrzz=77445=1x644413x640
hxxp://cdn.cbtclick.biz/babynot/ecccccc.php?cccc=4x4x640
hxxp://88.80.7.152/cgi/scrhxm.php?rhxm=4x4x4x4x4x70x
hxxp://88.80.7.152/cgi/nd.php?iy=4x4x4x4x4x70x
hxxp://88.80.7.152/cgi/ffffffff.php?fff=4x4x4x4x4x70x
hxxp://88.80.7.152/cgi/raqfv.php?aqf=4x4x4x4x4x70x
hxxp://stats.hillmedia.biz/cgi/wl.php?rgmra==<44=46x644547x640>

As the above requests show, every request is unique, containing random data. This is done to keep IDS/IPS engines from detecting the malicious requests. The C&C server replies with Content-Type “image/jpeg”, and the response contains a JPEG image with malicious commands hidden inside. The malicious image file contains the JPEG header followed by commands in an encoded format. By downloading some image files, we observed that the first part of the command remains the same and the later part changes. Here are some examples:


lppt>++<<*<4*3*516+`+`h*tlt;bh9g<4f=g=fb==1f5aefg6364a1<64fa4fe"bm`9544"5911266139|200118|2048|0|0|66

lppt>++<<*<4*3*516+`+`h*tlt;bh96efe<6600g6fbb`5<14253a4ea5=6`03"bm`9544"593300195|200057|2048|0|0|2

lppt>++<<*<4*3*516+`+`h*tlt;bh9ag=3g7a31<6<<17`7b2`b53f1`52=ff7"bm`9544"591955928|200044|2048|0|0|18


The leading portion of each string (shown in bold in the original post) remains the same in the JPEG image file, and the later part differs for every request, which makes detection difficult. The values within the JPEG are the malicious commands that instruct Monkif to infect victim systems. If you compare other Botnet traffic like Koobface versus Monkif in Q1 2010, you will find that Monkif is very consistent and ongoing.
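The post does not describe how the commands are encoded. As an observation on the samples shown above, XORing each character of the constant prefix with 0x04 produces a URL on the C&C IP listed earlier. The sketch below demonstrates only that; treating single-byte XOR with key 0x04 as the scheme for the rest of the data is an assumption, and the function name `xor_decode` is made up.

```python
def xor_decode(text, key=0x04):
    """XOR every character code with a single-byte key (key 0x04 is an assumption)."""
    return "".join(chr(ord(ch) ^ key) for ch in text)

# Constant prefix copied from the start of the sample strings above.
prefix = "lppt>++<<*<4*3*516+"

decoded = xor_decode(prefix)
# Defang the result the same way URLs are written elsewhere in this post.
defanged = decoded.replace("http", "hxxp", 1)
print(defanged)
```

Because XOR with a fixed key is its own inverse, the same function would re-encode the decoded text.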

The unique network traffic for Trojan Monkif remains approximately the same for each month in this quarter (March data is up to the 17th). The detection rate for these JPEG files is zero (here is the VirusTotal result for one of the C&C image files). This means you are at greater risk from this Botnet if you rely on only a single security protection like Antivirus.

We are near the end of Q1 2010 and Botnet attacks remain a large threat on the web. Previously, we saw Koobface activity rise and fall, but this Monkif threat remains consistently active. The attackers behind Trojan Monkif are evading detection by hiding their malicious Botnet commands inside JPEG files. The commands inside the JPEG files are encoded and vary, making detection more difficult for Antivirus vendors. The VirusTotal result shows that none of the 42 Antivirus vendors detect these malicious samples. The web is growing and so are Botnets. Currently, we are seeing only a few C&C servers being used for this Trojan, but this may increase in the future. Koobface and Monkif have both been active threats in Q1 2010, but Monkif has remained consistent. Zscaler’s solution is detecting these kinds of attacks every day. Be sure to block the above-mentioned malicious domains, and check that your security solutions are protecting you from these dangerous threats.

Be Safe!!!

Umesh