Google XSS Flaw in Website Optimizer Scripts explained

This week thousands of system administrators who use Google products will open their inboxes to find an email from Google explaining that its Website Optimizer product contains a cross-site scripting (XSS) flaw that allows attackers to inject scripts into their Google-optimized web pages.

A part of this email follows:

“you are using a control script that could allow an attacker to execute malicious code on your site. To fix the vulnerable section of code, you should immediately either replace the control scripts in your affected experiments or stop the affected experiments and start new experiments”

On receiving this notification I scrambled to my web sites to implement the fix recommended by Google. Later in the day I had time to dig deeper into the problem and analyse the security flaw in more detail. What I found is a multi-stage attack that relies on cookie injection, improper text parsing and DOM script injection.

I have documented my research in this article, and I hope that it will be of use to you. There is a lot to learn from other people’s mistakes, especially when those people are Google themselves.

The flaw exists in Google’s Website Optimizer, which is a series of scripts that web administrators use to gain insight into how their web sites are navigated by online customers.

Below is a segment of the flawed code.

<!-- Google Website Optimizer Control Script -->
<script>
function utmx_section(){}function utmx(){}
(function(){var k='XXXXXXXXXX',d=document,l=d.location,c=d.cookie;function f(n){
if(c){var i=c.indexOf(n+'=');if(i>-1){var j=c.indexOf(';',i);return c.substring(i+n.
length+1,j<0?c.length:j)}}}var x=f('__utmx'),xx=f('__utmxx'),h=l.hash;
d.write('<sc'+'ript src="'+
'http'+(l.protocol=='https:'?'s://ssl':'://www')+'.google-analytics.com'
+'/siteopt.js?v=1&utmxkey='+k+'&utmx='+(x?x:'')+'&utmxx='+(xx?xx:'')+'&utmxtime='
+new Date().valueOf()+(h?'&utmxhash='+escape(h.substr(1)):'')+
'" type="text/javascript" charset="utf-8"></sc'+'ript>')})();
</script><script>utmx("url",'A/B');</script>
<!-- End of Google Website Optimizer Control Script -->

This Website Optimizer Control Script is embedded within your web page to track it. It will be run on the user’s end, and under a successful attack it will extract a malicious script from their cookie and execute it in their browser.

The code above is standard JavaScript; however, it is not easy to read. There are two reasons for this. Firstly, like most Google client-side scripts, it is obfuscated, which purposely makes it cryptic. Secondly, it was designed to run fast and efficiently, not to be easily understood.

I manually de-obfuscated this code, and whilst doing that, I re-factored it to make it easy to understand. The code below should be easy enough to read by anyone with JavaScript knowledge, yet it fulfills the same function as the cryptic code provided by Google.

01.  function AB_Analysis(){
02.     var k='YOURTRACKINGNUMBER';
03.     var d=document;
04.     var l=d.location;
05.     var h=l.hash;
06.     var injectionvector1 = ReadFromCookie('__utmx');
07.     var injectionvector2 = ReadFromCookie('__utmxx');
08.     d.write
09.     ('<script src="http://www.google-analytics.com/siteopt.js?v=1&utmxkey='+k
10.     +'&utmx=' + injectionvector1
11.     +'&utmxx='+ injectionvector2
12.     +'&utmxtime=' + new Date().valueOf()
13.     +(h?'&utmxhash='+escape(h.substr(1)):'')
14.     + '" type="text/javascript" charset="utf-8"></script>')
15.  }
16.
17.   function ReadFromCookie(field_name){
18.     var c = document.cookie;
19.     var start = c.indexOf(field_name+'=');
20.     var end = c.indexOf(';',start);
21.     return c.substring(start + field_name.length + 1, end);
22.   }
23.

The security flaw starts in lines 06 and 07:
06.     var injectionvector1 = ReadFromCookie('__utmx');
07.     var injectionvector2 = ReadFromCookie('__utmxx');


Both of these lines call the function ReadFromCookie, which parses the cookie string (document.cookie) without sanitising what it reads. The lack of sanitisation is on line 21:

21. return c.substring(start + field_name.length + 1, end);

Here we can see a classic mistake: data is blindly read from an untrusted source. The substring function reads from the start of the field’s data up to the first semicolon. What it reads should be a tracking number, but in this case it is a specifically planted ‘dormant’ script. It is dormant because it resides inside a cookie and not inside the HTML of the web page itself. Lines 10 and 11 are where the real trouble begins to show. The extracted and potentially dangerous script is injected into the user’s DOM:


08.     d.write
09.     ('<script src="http://www.google-analytics.com/siteopt.js?v=1&utmxkey='+k
10.     +'&utmx=' + injectionvector1
11.     +'&utmxx='+ injectionvector2
12.     +'&utmxtime=' + new Date().valueOf()
13.     +(h?'&utmxhash='+escape(h.substr(1)):'')
14.     + '" type="text/javascript" charset="utf-8"></script>')


The code above is responsible for the fatal injection. There is some irony here: the same statement contains some protection against XSS, but it does not go far enough.

Look at line 13:

13. +(h?'&utmxhash='+escape(h.substr(1)):'')

This code correctly treats the URL hash (variable h) as untrusted, because it can be manipulated in much the same way as the cookie. The lines before it, however, omit the call to escape(), which percent-encodes characters such as quotes and angle brackets and so stops a value from breaking out of the src attribute. It’s a typical case of ‘so close, yet so far away’.
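To make the difference concrete, here is a minimal sketch that builds the script tag twice from the same cookie value: once by plain concatenation, as the control script does, and once after percent-encoding. The payload string, the function names and the use of encodeURIComponent (a modern stand-in for escape()) are my own assumptions for illustration; this is not Google's actual fix.

```javascript
// Hypothetical cookie value planted by an attacker. The leading quote closes
// the src attribute and the rest opens a second script element.
var plantedValue = '"></script><script>alert(1)</script>';

// Mirrors the vulnerable pattern: the cookie value is concatenated as-is.
function buildTagUnsafe(key, utmx) {
  return '<script src="http://www.google-analytics.com/siteopt.js?v=1' +
    '&utmxkey=' + key + '&utmx=' + utmx + '"></script>';
}

// Same tag, but every value is percent-encoded before concatenation.
function buildTagEncoded(key, utmx) {
  return '<script src="http://www.google-analytics.com/siteopt.js?v=1' +
    '&utmxkey=' + encodeURIComponent(key) +
    '&utmx=' + encodeURIComponent(utmx) + '"></script>';
}

// Counts how many script elements a string of markup would open.
function countScriptOpeners(html) {
  return (html.match(/<script/g) || []).length;
}

var unsafeTag = buildTagUnsafe('XXXXXXXXXX', plantedValue);
var encodedTag = buildTagEncoded('XXXXXXXXXX', plantedValue);

console.log(countScriptOpeners(unsafeTag));  // 2
console.log(countScriptOpeners(encodedTag)); // 1
```

In the first string the planted value closes the src attribute and opens a second script element. In the second it stays inside the query string as inert text.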

For those who find it hard to read JavaScript, I have included a flow chart showing the two functions, AB_Analysis and ReadFromCookie.

AB Analysis Function Flow Chart

The diagram above is a flowchart for the AB_Analysis script. This script is embedded on pages by web developers who are making use of Google Website Optimizer. The red processes are where data is read from the cookie and added to a script, which is in turn injected into the DOM.

ReadFromCookie flowchart

Above is a flowchart for the ReadFromCookie function. There is no actual flaw here, except maybe that there is no limit to how much data is read out of the cookie. Also, the end of record detection is rather crude – simply looking for a semicolon in the data.
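For comparison, here is a minimal sketch of a stricter reader that addresses both points: it matches the cookie name exactly and refuses values over a length limit. The name readCookie and the maxLength parameter are hypothetical and do not appear in Google's script.

```javascript
// Reads one cookie by exact name from a cookie string such as document.cookie.
// Returns undefined when the cookie is absent or longer than maxLength.
function readCookie(cookieString, name, maxLength) {
  var pairs = cookieString ? cookieString.split(';') : [];
  for (var i = 0; i < pairs.length; i++) {
    var pair = pairs[i].replace(/^\s+/, '');      // drop the space after ";"
    var eq = pair.indexOf('=');
    if (eq < 0) continue;                         // skip malformed entries
    if (pair.substring(0, eq) !== name) continue; // exact name match only
    var value = pair.substring(eq + 1);
    return value.length <= maxLength ? value : undefined;
  }
  return undefined;
}

var sample = 'other__utmx=decoy; __utmx=abc123; __utmxx=def456';
console.log(readCookie(sample, '__utmx', 64));   // abc123
console.log(readCookie(sample, '__utmxx', 64));  // def456
console.log(readCookie(sample, 'missing', 64));  // undefined
console.log(readCookie(sample, '__utmx', 3));    // undefined
```

Note that a plain indexOf('__utmx=') search on the same sample string would match inside other__utmx first and return decoy. A stricter reader does not remove the need to encode the value before writing it into the page.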

Below is how a normal cookie might look. Cookies are not very sophisticated and are generally described as simple text files on the user’s computer. HTML5 adds richer client-side storage, such as Web Storage, alongside cookies, but it does not replace them.

Normal Cookie Example

BEGIN COOKIE
__utmx: some_value;
__utmxx: some_other_value;
END COOKIE

The compromised cookie below contains script inside the __utmx and __utmxx fields. This script is not active and therefore not dangerous. However, when the AB_Analysis script is executed, the __utmx script gets activated through this XSS attack.

Compromised Cookie Example

BEGIN COOKIE
__utmx: <<malicious script goes here>>;
__utmxx: <<malicious script goes here>>;
END COOKIE
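One way to tell the two cookies apart before using them is to accept only values that look like a tracking token. The sketch below does this with an allowlist. The permitted character set and the 128-character limit are my assumptions, since I do not know the documented format of the real __utmx value, and the function name is hypothetical.

```javascript
// Assumed shape of a legitimate value: short, and limited to characters that
// cannot close an attribute or open a tag. The real __utmx format may differ.
var PLAUSIBLE_VALUE = /^[\w.:-]{1,128}$/;

// Returns true only for non-empty strings made of allowlisted characters.
function isPlausibleTrackingValue(value) {
  return typeof value === 'string' && PLAUSIBLE_VALUE.test(value);
}

console.log(isPlausibleTrackingValue('some_value'));                   // true
console.log(isPlausibleTrackingValue('"><script>alert(1)</script>')); // false
console.log(isPlausibleTrackingValue(''));                             // false
console.log(isPlausibleTrackingValue(undefined));                      // false
```

Validation of this kind complements output encoding and does not replace it: a value that fails the check is dropped, and a value that passes is still encoded before it is written into the page.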

An attack has two stages. First, the malicious script has to be injected into a cookie in the victim’s browser. After that, the user must visit a web page containing the Google AB_Analysis script. The attack is summarised in the diagram below.

Attack on Google Web Optimizer

Google was quick to react and provide a fix; however, this fix needs to be deployed by every web site administrator who uses Google Website Optimizer. This applies to hundreds of thousands of web pages globally.

I hope that administrators are quick to fix this problem as it could easily result in an XSS attack against their site if targeted.

  • How would the malicious cookie be injected? e.g. If the Web Optimizer script was hosted on http://www.example.com, which was not otherwise vulnerable to XSS, and there were no other subdomains of example.com, the site should be safe, right?

  • I don’t understand how the attack is going to be triggered.
    Even if the issue is there, the impact is going to be near zero if there’s no way to force a victim to set the __utmxx|__utmx cookies to whatever.
    Probably I’m missing something, but the attack is not an attack if there’s no explicit entry point.

  • Hi Paul and Stefano,

    You both raise good points. The initial part of the attack is neither trivial nor obvious, but it is definitely possible. The intent of my original article was to focus on the second part of the attack; however, I will try to elaborate here on the first part, which is usually termed a ‘cross-site cookie injection’ attack.

    There are several ways to perform a cross-site cookie injection;

    – Older browsers are vulnerable to what is called “Cross-site cooking”. This allows attackers to create and modify cookies for domains other than the originator of the cookie commands. Some more information could be found here: http://goo.gl/bqjSb

    – Most browsers these days protect the user from cross-site cooking, so the attacker needs to resort to other methods, such as CSRF and XSS, to create the cookie, making this particular attack part of a larger one.

    – One more way that might be possible, although I have never tried it, would be to ‘iframe’ the website you are attacking, and change the document.cookie in the DOM to point to your own cookie.

    – After a few minutes of Googling, I found several JavaScript workarounds that can be used to set cross-site cookies (example here: http://goo.gl/kceFo)

    – There of course also exist ‘future’ methods which could be discovered, especially with the recent introduction of the yet-untested HTML5 standard.

    – Cookie injection does not exclude a traditional virus or malware which could be installed on the victim’s computer. You might say that with a virus, the computer is already compromised; however, injecting JavaScript into a target website’s DOM is not easy to do through a virus. Injecting it into the cookie instead is much simpler.

    I can think of even more attack vectors that are plausible, yet I cannot but agree with your comments; the first stage of the attack is not a piece of cake and will require some wit from the hacker. Google themselves stated this in their original advisory (I linked to this in the article). If you read it, you will notice the last paragraph stating:

    In addition, even if a site is using code generated before Dec. 3, 2010, attackers can only execute malicious code on a website or browser if it has already been compromised by a separate attack. Though the immediate probability of this attack is low, we urge you to take action immediately.

  • Jeremy,
    thanks for the clarification.
    It wasn’t clear to me if there were also a cookie injection at some point or not.
    I agree there’s a vulnerability even if it’s not directly exploitable.
