JavaScript Puzzle I: Answer

The last puzzle asked what the following JavaScript snippet prints to the console:

j = a = i = 2;
var foo = j
/[a/*]/; foo++; [ /**/ ]
console.log(foo);

Clearly, since this is a puzzle about parsing, you might guess that incorrect answers probably result from misparsing the snippet. Naively, there are at least three interesting ways this snippet could be parsed as JavaScript. Here are two of them (the third is left as an exercise):

j = a = i = 2;
var foo = j;       // semicolon inserted
/[a/*]/;           // a useless regular expression that matches a, /, or *
foo++;             // increment foo
[ /**/ ];          // a useless array expression containing a block comment
console.log(foo);

j = a = i = 2;
var foo = j / [ a      // var foo initialized to j divided by an array containing the variable "a"
/*]/; foo++; [ /**/    // block comment that is completely ignored by the interpreter
];                     // the end of the array
console.log(foo);

The answer to the puzzle depends on which of these occurs. Trying it out in any JavaScript shell, you can tell the answer is 1. The interesting point is of course why it is so. To see why, let us pretend to be a JavaScript parser and see what the parser sees.

The parser scans characters from left to right. It sees the first statement j = a = i = 2;. If we assume that this statement is the first statement in the program, it creates the variables j, a and i and initializes them to 2. The next line is var foo = j. The parser parses the variable declaration for foo, an equals sign for the initializer, and begins parsing what it expects to be an expression beginning with the variable j. As we saw before, at this point the parser must decide whether the end-of-line character it sees next is the end of the statement (in which case it should insert a semicolon), or whether the statement continues on the next line (in which case the end-of-line should be treated as whitespace). To decide which of these it should do, the parser looks at the first character of the next line. “Ah,” it anthropomorphically says to itself, “the first character is ‘/’, which could be the start of a comment, the start of a regular expression or a division operator.” Since the expression the parser is parsing is not a restricted production and an expression can be followed by a division operator, the parser interprets the ‘/’ as a division operator and inserts no semicolon.
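The end-of-line decision can be seen in isolation with a small sketch. The function names below are made up for illustration, and the variables are declared locally rather than created implicitly as in the puzzle; the only difference between the two functions is an explicit semicolon after j.

```javascript
// Hypothetical helper names; the values mirror the puzzle's j = a = 2.
function continuedOntoNextLine() {
  var j = 2, a = 2;
  // No semicolon after j: the "/" that starts the next line is a legal
  // continuation, so this parses as  var foo = j / [a];
  var foo = j
  / [a];
  return foo;
}

function explicitlyTerminated() {
  var j = 2, a = 2;
  var foo = j;   // the explicit semicolon ends the statement here
  /[a]/;         // now the "/" starts a (useless) regular expression literal
  return foo;
}

console.log(continuedOntoNextLine()); // 1
console.log(explicitlyTerminated());  // 2
```

The first function divides, the second does not, even though the two bodies look almost identical at a glance.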

What is expected after a division operator? Well it is an expression! “Ah!”, our enthusiastic parser says to itself, “I see a square bracket - that can be the start of an expression.” The parser consumes the square bracket and the variable a. It then sees a ‘/’ character. Once more this can be either another division operator, the start of a comment or the start of a regular expression. The parser looks ahead one character, sees the ‘*’ and decides it is a comment block. As a result, it can skip processing characters till it sees a closing ‘*/’. This doesn’t happen till close to the end of the line and is followed by the closing square bracket. As a result, what the interpreter will see is a parse of var foo = j / [ a ];
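As a rough check of this walkthrough, the puzzle's two middle lines can be placed next to the statement the parser is claimed to see. This is only a sketch: the function names are invented for illustration, and j, a and i are declared as locals (an assumption made here to avoid creating globals) instead of being created implicitly.

```javascript
function puzzleAsWritten() {
  var j, a, i;   // local declarations are an assumption of this sketch
  j = a = i = 2;
  var foo = j
  /[a/*]/; foo++; [ /**/ ]
  return foo;
}

function puzzleAsParsed() {
  var j, a, i;
  j = a = i = 2;
  var foo = j / [a];   // the block comment swallowed "]/; foo++; ["
  return foo;
}

console.log(puzzleAsWritten()); // 1
console.log(puzzleAsParsed());  // 1
```

If foo++ had actually executed after a plain foo = j, the result would have been 3, so the matching outputs suggest the increment really is inside the comment.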

Once we have a parsed version of the program, evaluating the parse tree is relatively straightforward. Division only makes sense if what you are dividing by is a number. JavaScript sometimes helpfully converts non-numbers into numbers when needed, and arithmetic is one such time. The numerical value of an array which contains just one element (like [ a ]) happens to be the numerical value of that element, because the array is first converted to a string and the string is then converted to a number. You can try this out yourself in a shell by evaluating examples like <code>+ [ 1 ]</code> and <code>+[1, 2]</code>. In our case, the value of a is its initial value, 2. Thus the interpreter divides the value of j (which is 2) by 2 and assigns the resulting 1 to foo. And the next line prints it out.
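The conversion described above can be tried directly. A minimal sketch, where the helper name arrayAsNumber is hypothetical and simply wraps the unary plus operator:

```javascript
// Hypothetical helper: unary plus forces the array-to-number conversion.
function arrayAsNumber(arr) {
  return +arr;
}

console.log(String([2]));            // "2": the array is joined into a string first
console.log(arrayAsNumber([2]));     // 2
console.log(arrayAsNumber([1, 2]));  // NaN, because "1,2" is not a numeric string
console.log(arrayAsNumber([]));      // 0, because the empty string converts to 0
console.log(2 / [2]);                // 1, the division from the puzzle
```

Only the one-element case behaves like a plain number; an array with two or more elements turns into NaN.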

All of this might be fun (or you might think it's just a bit quirky), but there are a couple of interesting lessons to take away from this puzzle (and from other programming language puzzles you will see here). The simplest and most important lesson is:

If a language is hard to parse, it is hard for users of the language to write correct and secure programs in it.

The set of rules you need to keep in your head when trying to parse a given snippet of JavaScript is large and complicated, and required three rambling paragraphs to explain. This matters to a small degree when you are implementing a parser for the language, but it matters a lot when a human programmer is trying to understand their own code or someone else’s. Humans make simplifying assumptions about what a piece of code looks like it is doing, and deduce that that is what it is doing. It was partially for this reason that when you glanced at the original puzzle program, you might have mentally broken it down into one (or both) of the potential parses without really carrying out a step-by-step parse in your head. It is very much like an optical illusion: there are some pictures which, at a glance, some people see one way and others see a different way.

A programming language designed to write correct and secure code should NOT exhibit this feature.