GIF/Javascript Polyglots

One of the nice aspects of working on Caja has been the people I’ve had a chance to meet and work with. Their ideas have helped inspire Caja’s design and implementation. Other times, they have inspired the following kind of wackiness:

<script src="thinkfu-js.gif"></script>


<img src="thinkfu-js.gif">

That’s right: the same GIF is being interpreted both as a script and as an image.

There are no content-sniffing tricks going on here, but it was content-sniffing that inspired me to create this particular trick. In the space of a week, chance conversations with Arturo Bejar and Doug Crockford about content-sniffing reminded me of Perl::Visualize, a GIF/Perl polyglot generator I’d written back in 2003. In humanspeak, a polyglot is a person who speaks more than one language. In computerspeak, a polyglot is a snippet of code that is a valid program in two or more languages. Writing polyglots can be a fun mental exercise, and if you have never tried it, I thoroughly recommend it.

The thing that made Perl::Visualize clever is that one of the languages is GIF, as in the picture format. One way of thinking about a GIF viewer is as an interpreter that takes a peculiar programming language as input and, as a result of executing that program, decodes an image. Perl::Visualize takes an arbitrary GIF image and an arbitrary perl program and creates a combined file which can be interpreted both as an image and as a perl program.
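To make the “combined file” idea concrete, here is a minimal sketch for the GIF/Javascript case rather than the GIF/Perl one. It uses one widely known construction and is not necessarily how Perl::Visualize or the image above was built. The function name `buildPolyglot` and the tiny 1x1 image are assumptions made for illustration. The trick is that the two bytes of the GIF logical screen width are chosen to spell `/*`, so a script parser treats the binary image data as a comment, while the script payload sits after the GIF trailer where image decoders typically stop reading.

```javascript
// Sketch only: one well-known way to lay out a GIF/Javascript polyglot.
// buildPolyglot is a hypothetical helper, not the generator behind thinkfu-js.gif.
// The payload is assumed to be plain ASCII Javascript.
function buildPolyglot(payload) {
  const ascii = (s) => Array.from(s, (c) => c.charCodeAt(0));
  const gif = [
    ...ascii('GIF89a'),               // GIF signature; also a legal JS identifier
    0x2f, 0x2a,                       // logical screen width, bytes spell "/*"
    0x01, 0x00,                       // logical screen height: 1
    0x80, 0x00, 0x00,                 // 2-entry global color table, background index, aspect ratio
    0x00, 0x00, 0x00,                 // color 0: black
    0xff, 0xff, 0xff,                 // color 1: white
    0x2c, 0, 0, 0, 0, 1, 0, 1, 0, 0,  // image descriptor: 1x1 image at (0, 0)
    0x02, 0x02, 0x44, 0x01, 0x00,     // LZW min code size, one data sub-block, terminator
    0x3b,                             // GIF trailer
  ];
  // Close the comment, finish the statement that began with "GIF89a",
  // then append the script itself.
  const tail = ascii('*/=1;' + payload);
  return Uint8Array.from([...gif, ...tail]);
}

// Hypothetical helper: view the same bytes the way a script loader would,
// one character per byte.
function asScriptSource(bytes) {
  return String.fromCharCode(...bytes);
}
```

Read as script source, the bytes have the shape `GIF89a/* ...image data... */=1;` followed by the payload, so the file opens with an assignment to a variable named `GIF89a`. The binary section must not contain the byte pair `*/`, which is easy to check for an image this small but would need handling for arbitrary images.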

This makes some rather nifty tricks possible. For example, you can embed a perl program in a picture of its own control-flow graph. Now you have a program which can explain itself when you throw it at a picture viewer and run itself when you throw it at a perl interpreter. It takes literate programming to its logical conclusion — I call it aesthetic programming and I am sure it will have Knuth laughcrying.

Unfortunately, doing polyglots with perl impresses no one. People unfamiliar with perl think that because perl can look like line-noise, all line-noise is valid perl. It turns out, though, that the tricks that made GIF/Perl polyglots possible can be ported to Javascript rather easily. I’ll describe how GIF/Javascript polyglots work and a script for generating them in another post, but it’s worth noting here that the image above is a perfectly valid GIF file just as it is a perfectly valid javascript program (in fact, it’s even a valid Caja program!).

An image tag expects its src attribute to point to content which parses correctly as an image, just as a script tag expects its src attribute to point to a javascript file. The tag specifies a context in which content of a particular type is expected. If the only information a browser used to render content was the context created for it by the surrounding tag, things would be simple. But things in the browser world are never simple. When a server sends a file, it also sends that file’s MIME type in a Content-Type header. All is well when the Content-Type the server asserts is consistent with the expected context in which that content gets used. What happens when the server does not send a Content-Type? What happens when a file with one Content-Type is sent when a different type is expected?

Sadness happens.

Some browsers consider the content-type the server asserts to be authoritative and, if the content fails to parse as that type, do not render it. Others ignore the server-asserted type and try to guess the type by sniffing the content. This sniffing can rely on heuristics like the suffix of the file name in the URL, the “magic” first couple of bytes of the content, or simply trying to parse the file with different parsers until one fits. The set of parsers tried is sometimes constrained by the particular tag (fr’instance, content expected by an img tag would only be parsed according to the native image formats supported by the browser). The problem is further exacerbated by plugins like Java and Flash, and by different types of caches and “file save” features in browsers, which may or may not remember what content-type was asserted by the server.
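As a rough illustration of the first two heuristics (magic bytes, then URL suffix), here is a toy sniffer. It is not any real browser’s algorithm. The names `sniffType`, `MAGIC` and `SUFFIX` are made up for this sketch, and the tables are deliberately tiny.

```javascript
// Toy content sniffer: an assumption-laden sketch, not a real browser algorithm.
const MAGIC = [
  { type: 'image/gif',  bytes: [0x47, 0x49, 0x46, 0x38] }, // "GIF8"
  { type: 'image/png',  bytes: [0x89, 0x50, 0x4e, 0x47] }, // "\x89PNG"
  { type: 'image/jpeg', bytes: [0xff, 0xd8, 0xff] },
];

const SUFFIX = {
  '.gif': 'image/gif',
  '.png': 'image/png',
  '.js': 'text/javascript',
};

// True when content begins with every byte of prefix, in order.
function startsWith(content, prefix) {
  return prefix.every((b, i) => content[i] === b);
}

// Guess a type from the leading bytes first, then from the URL suffix,
// ignoring whatever Content-Type the server may have asserted.
function sniffType(url, content) {
  for (const m of MAGIC) {
    if (startsWith(content, m.bytes)) return m.type;
  }
  const dot = url.lastIndexOf('.');
  const suffix = dot >= 0 ? url.slice(dot).toLowerCase() : '';
  return SUFFIX[suffix] || 'application/octet-stream';
}
```

A sniffer like this labels anything that starts with `GIF8` as an image, even when the URL ends in `.js` and the rest of the file is a script. Reorder the two heuristics and the same bytes get a different label, which is exactly why the guess cannot be trusted.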

In the programming languages world, this kind of thing would be called duck-typing (if it walks like a duck and quacks like a duck, treat it like a duck).

In web world, this is completely busted.

Browsers perform content-sniffing ostensibly in the interest of usability, so that even badly configured servers can continue to “work”. The problem here is that a browser gives different types of content different amounts of access. If you can fool the browser into thinking one type of content is actually another, you can bypass the restrictions placed on the actual content’s access. For example, an HTML page is allowed to load external images, stylesheets and scripts. In this case, the security context these resources execute in is derived from the URL of the page that these resources are embedded in. On the other hand, if the type of content being loaded is, say, a Flash object or a Java applet, the security context is derived from the URL of the applet object itself. If the browser uses heuristics and gets confused between a Flash object and an image, there are real security implications! It was this type of confusion that was the source of the GIFAR attack.

What are the security implications of GIF/Javascript polyglots? Since images and javascript share the same same-origin policy, getting a browser to confuse one for the other does not result in an obvious exploit. However, it does re-emphasize the lesson from the GIFAR exploit — blacklisting or recoding particular files is not going to be sufficient while:

  • we’re able to construct data that can be validly parsed as two or more types;
  • browsers sniff to determine content-type; and
  • the security context a resource executes in depends on its content-type.

(Thanks to Mike “The Human Linter” Samuel for correcting errors in this post)
