
JavaScript Puzzles: First in a series…

JavaScript makes the use of semicolons to delimit statements optional. As The Good Parts warns, this is dangerous and is one of the sharp knives in JavaScript. Crockford’s JSLint will scold you properly and Caja will issue static lint warnings if you forget to explicitly use semicolons.

For those of you who are already familiar with this, feel free to cut straight to the puzzle. For the rest of us, there are some great simple examples demonstrating how semicolon insertion is confusing. For instance, what does the following function return?

function universe() {
  return
    42;
}

Well, it’s not 42. The reason is that the parser inserts a semicolon for you immediately after the return keyword. The JavaScript grammar specifies a small number of restricted productions, which mark places where a line break is not allowed between two tokens. The effect of these restricted productions is that when a continue, break, return, or throw keyword is followed by the end of a line, a semicolon is automatically inserted. As a result the function above is parsed as if it consisted of the following two statements:

function universe() {
  return;  // semicolon inserted
    42;     // useless expression
}

Since the return takes no expression, the function returns the special JavaScript value “undefined” and the statement consisting of just the expression 42 is never executed.
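To see this for yourself, here is a minimal sketch that runs the function as written next to one common workaround. The name universeFixed is made up for this illustration; opening a parenthesis on the same line as return is just one way to keep the expression attached to it.

```javascript
// The original function: the line break after `return` ends the statement.
function universe() {
  return
    42;
}

// Hypothetical fixed version: the open parenthesis sits on the same line as
// `return`, so the expression is part of the return statement.
function universeFixed() {
  return (
    42
  );
}

console.log(universe());      // undefined
console.log(universeFixed()); // 42
```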

You might be tempted to dismiss this example because it may appear obvious what should happen. Unfortunately there are two other particularly befuddling sources of confusion when parsing JavaScript: recognizing the division operator and recognizing the start of a comment. Combined with semicolon insertion, they mean that code in one location can affect the parsing of code a long distance away. Any time that happens, the semantics of the snippet you are looking at can be affected by seemingly unrelated code elsewhere in the file, which results in bugs, security problems and general sadness.

What does the following snippet of code output?

j = a = i = 2;
var foo = j
/a/i;
console.log(foo);

How would you expect a parser to parse it?

The first line clearly just initializes three variables. There are two ways the next two lines might be parsed. One way is for a semicolon to be inserted automatically after the second line, terminating the statement var foo = j, like this:

j = a = i = 2;
var foo = j;
/a/i;
console.log(foo);

In this case, the third line is the regular expression /a/i, which matches the character “a”. The “i” is a flag that makes the regular expression matching case-insensitive. There’s no string that the regular expression is being applied to, and the expression has no side effect. As a result, you might expect console.log to print 2.

The second way in which this code could be parsed is for no semicolon to be inserted. In this case, the end of the line is treated just like whitespace:

j = a = i = 2;
var foo = j /a/i;
console.log(foo);

Ah, that looks just like a series of divisions: 2 / 2 / 2. In this case, the value 0.5 would be printed to the console.

Which one of these two things really happens? The ECMAScript specification, which defines the language, requires the latter. Loosely speaking, the spec says that as long as you’re not parsing a restricted production (like we were in our first example), no semicolon is inserted if the token after the end of a line would be valid were the line break treated as whitespace.
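Here is a small sketch of both readings side by side. It departs from the snippet above in two ways that are choices made for this illustration: j, a and i are declared with var so the code also runs in strict mode, and a second variable, bar, shows what an explicit semicolon changes.

```javascript
var j, a, i;
j = a = i = 2;

// No semicolon after `j`: the next line continues the expression,
// so this is parsed as j / a / i.
var foo = j
/a/i;
console.log(foo); // 0.5

// Explicit semicolon after `j`: the next line is a separate statement
// consisting of the unused regular expression /a/i.
var bar = j;
/a/i;
console.log(bar); // 2
```

Ending the statement explicitly is exactly what JSLint is nagging you to do.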

Now you’re ready for something a bit more interesting. Notice that the character that starts a line comment or a block comment is the same character used both for division and to introduce regular expressions! If you are writing a parser (or, much more commonly, you are a human reading code someone else has written), how do you decide when you see the / character whether it is the start of a comment, a regular expression or a division operator?
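As a warm-up, here is a rough sketch of how the same / character reads differently depending on what comes before it. The variable names are invented for this example, and the rule of thumb in the comments (division after a value, regular expression where an expression is expected) is a simplification rather than the spec’s exact wording.

```javascript
var total = 8;
var g = 2;

// After a value (here the identifier `total`), `/` is the division operator.
var quotient = total / g;

// Where an expression is expected (here, right after `=`),
// `/` opens a regular expression literal.
var pattern = /g/;

// The same characters `/g/` read as two divisions when they follow a value.
var chained = total /g/ 2;

console.log(quotient);            // 4
console.log(pattern.test("egg")); // true
console.log(chained);             // 2
```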

Let’s cast this in the form of a puzzle.


Puzzle

What does the following JavaScript snippet print to the console?

j = a = i = 2;
var foo = j
/[a/*]/; foo++; [ /**/ ]
console.log(foo);

Answer next week.

One Comment

  1. izz wrote:

    ah scope, how you breaketh … should get back to caroling
    looking forward to the next post, hummm that crafty ‘i’

    Wednesday, December 23, 2009 at 10:38 pm
