A couple of weeks ago, I wrote about how to do some basic static analysis of Javascript source code using Esprima. This post is a follow-up where I demonstrate how to write a program that transforms Javascript source code using an Esprima-based library: falafel.
Hello, Falafel
Falafel is an AST traversal library like estraverse. If you've read my previous article, this is familiar territory. Instead of taking the AST as an argument, falafel takes the source code as a string.
var falafel = require('falafel');

falafel('console.log("hello world");', function(node){
  console.log('Entered node', node.type);
});
In addition to traversal, Falafel also adds some interesting methods to the nodes being traversed:
- update(newCode:string) - replaces the source of the node with new code, a little like the DOM replaceChild method.
- source():string - returns the source of the node. Until the node or one of its children has been updated, this is its original source.
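To make these two methods concrete, here is a minimal model of how they could work on top of character ranges. It is a sketch for illustration only, not falafel's actual code: makeEditable and its node(start, end) helper are hypothetical names, and real falafel nodes get their ranges from the parser rather than from hand-counted offsets.

```javascript
// Hypothetical mini-model of source()/update(); not falafel's real code.
function makeEditable(src) {
  var chunks = src.split('');            // one entry per character
  return {
    // Wrap the character range [start, end) as a "node" with both helpers.
    node: function (start, end) {
      return {
        source: function () {
          return chunks.slice(start, end).join('');
        },
        update: function (newCode) {
          chunks[start] = newCode;       // new text goes in the first slot
          for (var i = start + 1; i < end; i++) {
            chunks[i] = '';              // the rest of the range is blanked
          }
        }
      };
    },
    toString: function () {
      return chunks.join('');
    }
  };
}

var doc = makeEditable('console.log("hi");');
var arg = doc.node(12, 16);              // covers the string literal "hi"
var before = arg.source();               // '"hi"'
arg.update('"bye"');
var after = doc.toString();              // 'console.log("bye");'
```

Because every node reads from the same shared array, a parent's source() in this model automatically reflects updates made to its children.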
Armed with these new weapons, let's write something. Let's say you want to see a timestamp with all your console.log statements. You could use a logging library, but you'd have to rewrite all your console.log statements. Or, you could write a program to automatically rewrite your source code!
So, for this problem, given the input
console.log('hello');
we want the output - the rewritten program - to be
console.log(new Date() + ':', 'hello');
First, we need to detect all console.log statements in the program. We'll do that using this function
function isConsoleLog(node){
  return node.type === 'CallExpression' &&
    node.callee.type === 'MemberExpression' &&
    node.callee.object.type === 'Identifier' &&
    node.callee.object.name === 'console' &&
    node.callee.property.type === 'Identifier' &&
    node.callee.property.name === 'log';
}
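The same shape check can be parameterized to match other calls such as console.warn. The sketch below is a hypothetical generalization, not part of falafel: isMethodCall is a name made up here, and the AST node is written by hand in the ESTree shape that Esprima produces for console.log('hello'). It also checks the standard computed flag, so that console[log](...) with a variable named log is not mistaken for console.log(...).

```javascript
// Hypothetical generalization of the check above: matches objectName.methodName(...)
function isMethodCall(node, objectName, methodName) {
  return node.type === 'CallExpression' &&
    node.callee.type === 'MemberExpression' &&
    !node.callee.computed &&             // reject obj[expr](...) forms
    node.callee.object.type === 'Identifier' &&
    node.callee.object.name === objectName &&
    node.callee.property.type === 'Identifier' &&
    node.callee.property.name === methodName;
}

// Hand-written ESTree node for: console.log('hello')
var call = {
  type: 'CallExpression',
  callee: {
    type: 'MemberExpression',
    computed: false,
    object: { type: 'Identifier', name: 'console' },
    property: { type: 'Identifier', name: 'log' }
  },
  arguments: [{ type: 'Literal', value: 'hello', raw: "'hello'" }]
};

var matchesLog = isMethodCall(call, 'console', 'log');    // true
var matchesWarn = isMethodCall(call, 'console', 'warn');  // false
```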
Then, it's just a matter of constructing the source we want and giving it to update().
code = falafel(code, function(node){
  if (isConsoleLog(node)){
    node.update('console.log(new Date() + ":", ' +
      node.arguments.map(function(arg){
        return arg.source();
      }).join(', ') + ')');
  }
});
Notice that we use the source() method to get at each argument's original source code - unchanged.
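One edge case worth knowing about: a bare console.log() has no arguments, so joining an empty list after the ":", prefix would leave a trailing comma that older engines reject. A small sketch of one way to build the replacement text so that case also works; timestampedLog is a hypothetical helper that takes the argument sources as plain strings.

```javascript
// Hypothetical helper: builds the replacement text from the argument sources.
function timestampedLog(argSources) {
  // Treat the timestamp as just another argument, then join once.
  var args = ['new Date() + ":"'].concat(argSources);
  return 'console.log(' + args.join(', ') + ')';
}

var withArg = timestampedLog(["'hello'"]);
// withArg is: console.log(new Date() + ":", 'hello')
var noArgs = timestampedLog([]);
// noArgs is: console.log(new Date() + ":")
```

Inside the falafel callback, the list passed in would be the result of mapping node.arguments through source().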
Better Assertions
Now that your feet are wet, let's get into something more involved - let's implement better assertions by using source rewriting.
An assertion is a statement in the program containing a sole boolean expression which is checked during its execution. If the expression is false at that point of program execution, the program will fail with an assertion error. Assertions are most commonly used in automated tests, but can also be used in production code. In fact, using assertions in your production code may have these benefits
- Fail on erroneous conditions early so that they have minimal negative impact on the state of the application.
- Make post-mortem debugging easier by failing closer to the root cause.
Since Javascript does not have an assert statement, we resort to implementing an assert() function instead
assert(n === 0);
which may be implemented as
function assert(condition){
  if (!condition){
    throw new Error('Assertion failed.');
  }
}
If the assertion fails at this point, we'd get an error like this
Error: Assertion failed.
at assert (simple_assert.js:6:11)
at Object.<anonymous> (simple_assert.js:2:1)
...
This output leaves something to be desired:
- Currently we have to track down the code that generated the assert statement using the line numbers in the stack trace. It would be helpful to be able to see the source of the assert statement directly in the error message. Insist is a node module that does this using some AST hackery.
- It would be helpful to see the state of the program at the time the error occurred, especially the variables and sub-expressions referenced in the assert statement. Various assertion libraries handle this by giving you access to comparison functions. For example, instead of assert(n === 0) you'd write assert.equal(n, 0), which would yield an error message like: Assertion Error: Expected 1 to equal 0.
But there are still downsides to this approach:
- although you see the values on each side of the comparison, you don't see the names of the variables in the comparison
- you have to learn and depend on their API - which can be potentially large
- you'll encounter things you want to check which are not in the API or are not comparisons at all, at which point you'd have to fall back to the simple assert form
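The first downside follows directly from how a comparison function is called. As a rough sketch, assuming a hypothetical assertEqual helper (not the API of any particular library), the function only ever receives values, so the variable name n is already gone by the time the message is built:

```javascript
// Hypothetical comparison helper in the style of assert.equal.
function assertEqual(actual, expected) {
  if (actual !== expected) {
    // Only the values are available here, not the expressions that produced them.
    throw new Error('Expected ' + actual + ' to equal ' + expected + '.');
  }
}

var n = 1;
var message = null;
try {
  assertEqual(n, 0);
} catch (e) {
  message = e.message;   // 'Expected 1 to equal 0.'
}
```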
The Goal
If we add source code rewriting to the toolbox, new possibilities open up. Let's consider how we may want to rewrite an assert statement like
assert(n === 0);
First, to implement just the assertion behavior, we can rewrite the above to
if (!(n === 0)){
  throw new Error('Assertion failed.');
}
Next, we can embed the source of the assert statement into the error message - this satisfies requirement #1
if (!(n === 0)){
  throw new Error('assert(n === 0) failed.');
}
For requirement #2, we want to display the state of the program, but which variables to display? At the minimum, we should display the values of sub-expressions in the assert statement itself. For our example, the sub-expression we care about is the variable n, so we'll display its value on a separate line within the error message.
if (!(n === 0)){
  throw new Error(
    'assert(n === 0) failed.\n' +
    '  n = ' + n);
}
Let's separate the above into 3 checkpoints:
- Basic assertion.
- Display assert statement source code in error message.
- Display variables and sub-expressions in the assert statement.
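Before automating anything, the hand-rewritten target from above can be tried directly to see the message we are aiming for. In this small sketch, the wrapper function check is hypothetical and only exists to give n a value.

```javascript
// The hand-rewritten assertion, wrapped in a function so n can be supplied.
function check(n) {
  if (!(n === 0)){
    throw new Error(
      'assert(n === 0) failed.\n' +
      '  n = ' + n);
  }
}

var message = null;
try {
  check(1);
} catch (e) {
  message = e.message;
}
// message now holds two lines:
// assert(n === 0) failed.
//   n = 1
```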
Okay, now that we know where we are headed, let's get coding. If you don't want me to ruin the ending for you, stop reading, go implement it yourself and then come back. I'll wait.
Checkpoint #1
First, to detect the assert statement, we'll have an isAssert function
function isAssert(node){
  return node.type === 'CallExpression' &&
    node.callee.type === 'Identifier' &&
    node.callee.name === 'assert';
}
Rewriting it as an if statement then is as easy as this
code = falafel(code, function(node){
  if (isAssert(node)){
    var predicate = node.arguments[0];
    node.update(
      'if (!(' + predicate.source() + ')){ ' +
      'throw new Error("Assertion failed.");' +
      ' }');
  }
})
For this program as input
var n = 1;
assert(n === 0);
we should get back the rewritten program as
var n = 1;
if (!(n === 0)){ throw new Error("Assertion failed."); };
And if we run it we'll get
Error: Assertion failed.
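A quick way to convince yourself that the generated text is valid Javascript is to compile and run it. The sketch below builds the same string as the callback above using a hypothetical assertToIf helper, then runs the result with new Function. Note the extra pair of parentheses around the predicate: without them, !n === 0 would negate n alone instead of the whole comparison.

```javascript
// Hypothetical helper mirroring the string built in the callback above.
function assertToIf(predicateSrc) {
  return 'if (!(' + predicateSrc + ')){ ' +
    'throw new Error("Assertion failed.");' +
    ' }';
}

// The rewritten program for: var n = 1; assert(n === 0);
var rewritten = 'var n = 1;\n' + assertToIf('n === 0') + ';';

var message = null;
try {
  new Function(rewritten)();   // compile and run the rewritten program
} catch (e) {
  message = e.message;         // 'Assertion failed.'
}
```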
Checkpoint #2
Embedding the source of the assert statement into the error message isn't much more involved.
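code = falafel(code, function(node){
  if (isAssert(node)){
    var predicate = node.arguments[0];
    // JSON.stringify quotes and escapes the message, so predicates
    // that contain quotes still produce valid code
    var message = JSON.stringify(predicate.source() + ' failed.');
    node.update(
      'if (!(' + predicate.source() + ')){ ' +
      'throw new Error(' + message + ');' +
      ' }');
  }
})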
For the same example as checkpoint #1, we should now get back
var n = 1;
if (!(n === 0)){ throw new Error("n === 0 failed."); };
And the error now looks like
Error: n === 0 failed.
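A detail to watch when pasting source text into a string literal: if the predicate itself contains a double quote, as in name === "tom", text placed between hand-written double quotes would end the string early and produce invalid code. One way to guard against that, sketched here with a hypothetical throwFor helper, is to let JSON.stringify produce the literal, since it adds the surrounding quotes and escapes quotes, backslashes and newlines.

```javascript
// Hypothetical helper: builds the throw statement for a predicate's source.
function throwFor(predicateSrc) {
  // JSON.stringify returns a double-quoted, escaped string literal.
  return 'throw new Error(' + JSON.stringify(predicateSrc + ' failed.') + ');';
}

var stmt = throwFor('name === "tom"');
// stmt is: throw new Error("name === \"tom\" failed.");

var message = null;
try {
  new Function(stmt)();        // compile and run the generated statement
} catch (e) {
  message = e.message;         // 'name === "tom" failed.'
}
```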
Checkpoint #3
Displaying variables and sub-expressions in the assert statement's predicate will involve another AST traversal. Let's consider a few cases:
- The predicate is a binary expression like n === 0. In this case, we want to show the sub-expressions on both sides, but not literals - because those are already apparent in the source.
- The predicate is a function call like isNaN(n). In this case we'll want to drill down into the argument(s) of the function and display their values.
- The predicate is a method call like _.isArray(arr). As with case #2, we'll want to display the value of the argument(s).
After several attempts, I've come up with this code which balances cases covered and simplicity
var estraverse = require('estraverse');

function subExpressions(predicate){
  var exprs = [];
  estraverse.traverse(predicate, {
    enter: function(node){
      if (
        predicate !== node && // exclude the original expression itself
        node.type !== 'Literal' // don't want string/number/etc literals
      ){
        exprs.push(node);
      }
      // Skip the nodes under MemberExpression
      // because its property name, as an identifier,
      // would otherwise be mistaken for a variable
      if (node.type === 'MemberExpression') this.skip();
    }
  });
  return exprs;
}
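To see which nodes this selection rule picks up without installing anything, here is a self-contained sketch of the same logic. It is an approximation under stated assumptions: walk is a hypothetical stand-in for estraverse.traverse (returning false plays the role of this.skip()), collect mirrors subExpressions, and the AST for _.isArray(arr) is written by hand.

```javascript
// Hypothetical stand-in for estraverse.traverse: visits a node, then every
// child object that has a string "type". Returning false skips the children.
function walk(node, enter) {
  if (enter(node) === false) return;
  Object.keys(node).forEach(function (key) {
    [].concat(node[key]).forEach(function (child) {
      if (child && typeof child.type === 'string') walk(child, enter);
    });
  });
}

// Same selection rule as subExpressions, built on the stand-in walker.
function collect(predicate) {
  var found = [];
  walk(predicate, function (node) {
    if (node !== predicate && node.type !== 'Literal') found.push(node);
    if (node.type === 'MemberExpression') return false;
  });
  return found;
}

// Hand-written ESTree node for: _.isArray(arr)
var predicate = {
  type: 'CallExpression',
  callee: {
    type: 'MemberExpression',
    computed: false,
    object: { type: 'Identifier', name: '_' },
    property: { type: 'Identifier', name: 'isArray' }
  },
  arguments: [{ type: 'Identifier', name: 'arr' }]
};

var types = collect(predicate).map(function (n) { return n.type; });
// types is ['MemberExpression', 'Identifier']: _.isArray and arr are
// collected, while _ and isArray on their own are skipped.
```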
Then, it's just a matter of formatting the desired Javascript correctly.
code = falafel(code, function(node){
  if (isAssert(node)){
    var predicate = node.arguments[0];
    var predicateSrc = predicate.source();
    var exprs = subExpressions(predicate);
    node.update(
      'if (!(' + predicateSrc + ')){ ' +
      // JSON.stringify yields a correctly escaped string literal
      'throw new Error(' + JSON.stringify(predicateSrc + ' failed.') +
      ' + ' +
      (exprs.map(function(expr){
        var src = expr.source();
        // parenthesize src so "a + b" is evaluated before concatenation
        return JSON.stringify('\n ' + src + ' = ') + ' + (' + src + ')';
      }).join(' + ') || '""') +
      ');' +
      ' }');
  }
})
Okay, admittedly I've made a bit of a mess here. If you run this on the same example program, you'll now get
Error: n === 0 failed.
n = 1
Now we are getting somewhere! Here's another example
var obj = {
  name: 'tom',
  getName: function(){
    return 'blah'
  }
}
assert(obj != null)
assert(obj.name === obj.getName())
If you convert this source and run it, you'll get this error:
Error: obj.name === obj.getName() failed.
obj.name = tom
obj.getName() = blah
obj.getName = function (){
return 'blah'
}
So helpful!
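One subtlety when appending an expression's source after a string in generated code: operator precedence. For a sub-expression like a + b, writing "a + b = " + a + b concatenates a and then b as text, so the message shows 12 instead of 3. Wrapping the source in parentheses avoids this. A small sketch, where describe is a hypothetical helper that builds one line of the message:

```javascript
// Hypothetical helper: source text for one "expression = value" line.
function describe(src) {
  // The parentheses make sure src is evaluated before string concatenation.
  return JSON.stringify('\n  ' + src + ' = ') + ' + (' + src + ')';
}

var a = 1, b = 2;

// Without parentheses, the values are concatenated one by one.
var naive = '\n  a + b = ' + a + b;      // ends in "= 12"

// With parentheses, the sum is computed first.
var generated = describe('a + b');       // "\n  a + b = " + (a + b)
var line = new Function('a', 'b', 'return ' + generated + ';')(a, b);
// line ends in "= 3"
```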
Homework
Not so fast! Did you think you could get away without any homework this time?
In postmortem debugging, any bit of information you can get your hands on may prove valuable for finding the root cause of the problem. So, it seems like a good idea to display the local variables when the assertion has failed. In fact, it may be valuable to display all the variables accessible at that point - in other words the entire variable scope chain. Your mission, should you choose to accept it, is to make the code do exactly this. Good luck!