Update: since this article was first published, the send() method has been removed from the spec. send(value) was replaced by calling next with a parameter value, i.e. next(value). I have updated this post to reflect this change.
It looks like generators are actually going to make it into Javascript, and they have already landed in V8. This is so exciting! Lately they've been all the rage. But what are they? In this post I'll do my best to explain, at a basic level, what they are.
Get It
Before we begin, let's make sure we can execute the code in the examples - because code is no good unless you can run it. You'll want to install an experimental version of Node (0.11.10+). I recommend:
- First install nvm
- then run nvm install v0.11.10 to install Node
- now, to run any of the examples, make sure you use the --harmony flag, i.e. node --harmony or node --harmony myscript.js
The Basics
At its simplest, a generator represents a sequence of values - essentially it is an iterator. In fact, the interface of a generator object is just that of an iterator: you keep calling next() on it until it runs out of values. What makes a generator special is the way you write it:
function* naturalNumbers(){
  var n = 1;
  while (true){
    yield n++;
  }
}
In this example (source), naturalNumbers is a generator function - or generator for short. It is written using the new * syntax as well as the new yield keyword introduced in ES6. This generator function returns generator objects that produce the natural numbers, i.e. an infinite sequence. Each time a yield is reached, the yielded value becomes the next value in the sequence. To iterate over the sequence, a for-of syntax has also been proposed in ES6; however, it has not yet been implemented in V8:
// this doesn't work yet. Also, it's an infinite loop ;P
for (var number of naturalNumbers()){
  console.log('number is ', number);
}
Don't fret, we can still make it do nice things. To create a new sequence, you'd call the generator function in order to get a live generator object:
> var numbers = naturalNumbers();
Now, if you call numbers.next(), you get an object with the properties value and done.
> numbers.next()
{ value: 1, done: false }
value is the next value in the sequence, and done indicates whether or not the sequence has stopped - the sequence stops when the code has reached the end of the generator function. But in the natural numbers example the sequence never stops, so to illustrate a stopped sequence let's make a simple one that yields just two numbers:
function* two(){
  yield 1;
  yield 2;
}
Now, let it run:
> var seq = two()
> seq.next()
{ value: 1, done: false }
> seq.next()
{ value: 2, done: false }
> seq.next()
{ value: undefined, done: true }
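As you can see, the third time we get done to be true and value to be undefined. If we call it a fourth time in the Node 0.11 build used here, we get an exception (the final ES6 spec instead keeps returning { value: undefined, done: true }):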
> seq.next()
Error: Generator has already finished
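Because for-of isn't available yet, a small helper can do the iterating by hand. The following is only a sketch: take is a hypothetical name I made up for illustration, not part of any library. It calls next() at most n times and stops as soon as done is true, so it works for infinite sequences and never calls next() on a generator that has already finished.

```javascript
// Hypothetical helper: collect up to `n` values from a generator object.
function take(gen, n) {
  var values = [];
  while (values.length < n) {
    var res = gen.next();
    if (res.done) break; // the sequence ended early, so stop asking
    values.push(res.value);
  }
  return values;
}

// Same generators as in the examples above
function* naturalNumbers() {
  var n = 1;
  while (true) {
    yield n++;
  }
}

function* two() {
  yield 1;
  yield 2;
}

var firstFive = take(naturalNumbers(), 5); // [ 1, 2, 3, 4, 5 ]
var both = take(two(), 10);                // [ 1, 2 ]
console.log(firstFive, both);
```

Note that take(two(), 10) asks for ten values but gets only two, because the third next() call reports done.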
Code Suspension
So you've seen the basics of generators, and how to use them to implement iterators. But the one thing that makes generators significant is this: it is now possible to suspend code execution. Once you instantiate a generator object, you have a handle on a function whose execution can start and stop. Moreover, whenever it stops, you have control over when it restarts. To see this more concretely, let's create a generator that simply prints out strings to the console.
function* haiku(){
  console.log('I kill an ant');
  yield null; // the yield keyword requires a value, so I put null
  console.log('and realize my three children');
  yield null;
  console.log('have been watching.');
  yield null;
  console.log('- Kato Shuson');
}
Now, if we instantiate a generator object from this generator
> var g = haiku()
and then step through it, line by line, by calling g.next() on the command line
> g.next()
You'll realize that
- the code in the generator doesn't start executing until you say so
- when the code encounters a yield statement, it suspends execution indefinitely, and doesn't start again until you tell it to
This opens up possibilities for asynchronicity: you can call next() after some asynchronous event has occurred. To see this, take a look at the following example, which displays the poem one line per second by combining the generator with an async loop.
var g = haiku();
function next(){
  if (g.next().done) return;
  setTimeout(next, 1000);
}
next();
Note that the code in the generator haiku looks like straight-line Javascript, and yet asynchronicity has occurred in the middle of that code - which was not possible before due to Javascript's generally single-threaded nature. More specifically, each time a yield is encountered, asynchronicity has a chance to occur. Those yields are kind of like time-warping wormholes. (source)
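To make the order of execution visible without any timers, here is a small sketch that records what happens inside and outside a generator in a single array. The names steps and log are made up for this illustration.

```javascript
var log = [];

function* steps() {
  log.push('step 1');
  yield null;
  log.push('step 2');
  yield null;
  log.push('step 3');
}

var g = steps();
log.push('created');  // nothing inside steps() has run yet
g.next();             // runs up to the first yield
log.push('between');  // the generator is suspended while this runs
g.next();             // resumes and runs up to the second yield
var last = g.next();  // resumes and runs to the end of the function

console.log(log);
// [ 'created', 'step 1', 'between', 'step 2', 'step 3' ]
console.log(last.done); // true
```

The 'between' entry lands in the middle of the generator's own entries, which is exactly the gap an asynchronous event can fill.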
Send
Up until now we've looked at generator objects only as producers of a sequence of values, where information only goes one way - from the generator to you. It turns out that you can also send values to a generator by giving next() a parameter, in which case the yield expression actually returns a value! Let's see this by making a generator that consumes values:
function* consumer(){
  while (true){
    var val = yield null;
    console.log('Got value', val);
  }
}
First instantiate a generator object
> var c = consumer()
Next, let's try calling next() with a value:
> c.next(1)
{ value: null, done: false }
This returns the expected object, but it didn't console.log() anything. The reason is that initially the generator wasn't yet suspended at a yield, so there was nothing to receive the value. If we call next() again, however, we will see the console.log message:
> c.next(2)
Got value 2
{ value: null, done: false }
> c.next(3)
Got value 3
{ value: null, done: false }
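Values can flow in both directions at once. As a rough sketch (runningTotal is a made-up example, not from the original source code), the generator below receives an amount through each next(value) call and yields the updated total back:

```javascript
function* runningTotal() {
  var total = 0;
  while (true) {
    // `yield total` hands the total out; the argument of the
    // following next() call becomes the value of the yield expression
    var amount = yield total;
    total += amount;
  }
}

var t = runningTotal();
var r0 = t.next();   // priming call: runs to the first yield
var r1 = t.next(5);  // sends 5 in, gets the new total out
var r2 = t.next(10); // sends 10 in

console.log(r0.value, r1.value, r2.value); // 0 5 15
```

The first next() call takes no argument because, as shown above, a value passed to it has nowhere to go.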
Throw
Cool! So we can both send and receive values from a generator, but guess what? You can also throw!
> c.throw(new Error('blarg!'))
Error: blarg!
When you throw() an error onto a generator object, the error actually propagates back up into the code of the generator, meaning you can use try and catch statements to catch it. So, if we add a try/catch to the last example:
function* consumer(){
  while (true){
    try{
      var val = yield null;
      console.log('Got value', val);
    }catch(e){
      console.log('You threw an error but I caught it ;P')
    }
  }
}
This time, once we instantiate the generator object, we'll call next() first, because there's no way the generator can catch an error that's thrown at it before it even starts executing.
> var c = consumer()
> c.next()
From this point on, if we throw an error, the generator will catch it and handle it gracefully:
> c.throw(new Error('blarg!'))
You threw an error but I caught it ;P
try/catch worked.
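One detail worth seeing in code: throw() returns a result object just like next() does. If the generator catches the error and reaches another yield, that yielded value comes back to the caller; if it doesn't catch the error, the error comes back out of throw(). The generators resilient and fragile below are hypothetical names used only for this sketch.

```javascript
function* resilient() {
  while (true) {
    try {
      yield 'waiting';
    } catch (e) {
      // the generator keeps going and reports what it caught
      yield 'recovered from: ' + e.message;
    }
  }
}

var r = resilient();
r.next(); // start it, so it is suspended inside the try block
var afterThrow = r.throw(new Error('blarg!'));
var afterNext = r.next();
console.log(afterThrow.value); // recovered from: blarg!
console.log(afterNext.value);  // waiting

// Without a try/catch inside, the error propagates back to the caller
function* fragile() {
  yield 1;
}

var f = fragile();
f.next();
var escaped = null;
try {
  f.throw(new Error('oops'));
} catch (e) {
  escaped = e.message;
}
console.log(escaped); // oops
```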
Getting Fancy
Now that you know the basic mechanics of generators, what can you really do with them? Well, most folks are excited about generators because they are seen as the Ticket Out of Callback Hell ™, so let's see how that can be done.
The Premise
In Javascript, especially in Node, IO operations are generally done as asynchronous operations that require a callback. When you have to do multiple async operations one after another it looks like this:
fs.readFile('blog_post_template.html', function(err, tpContent){
  fs.readFile('my_blog_post.md', function(err, mdContent){
    resp.end(template(tpContent, markdown(String(mdContent))));
  });
});
It gets worse when you add in error handling:
fs.readFile('blog_post_template.html', function(err, tpContent){
  if (err){
    resp.end(err.message);
    return;
  }
  fs.readFile('my_blog_post.md', function(err, mdContent){
    if (err){
      resp.end(err.message);
      return;
    }
    resp.end(template(tpContent, markdown(String(mdContent))));
  });
});
The promise of generators is that you can now write the equivalent code in a straight-line fashion:
try{
  var tpContent = yield readFile('blog_post_template.html');
  var mdContent = yield readFile('my_blog_post.md');
  resp.end(template(tpContent, markdown(String(mdContent))));
}catch(e){
  resp.end(e.message);
}
This is fantastic! Aside from less code and better aesthetics, it also has these benefits:
- Line independence: the code for one operation is no longer tied to the ones that come after it. If you want to reorder operations, simply switch the lines. If you want to remove an operation, simply delete the line.
- Simpler and DRY error handling: whereas the callback-based style required error handling to be done for each individual async operation, with the generator-based style you can put one try/catch block around all the operations to handle errors uniformly - generators give us back the power of try/catch exception handling.
Make it Happen
You've seen what we are aiming for, now how do we implement it? If you are the type who wants to figure it out for yourself, I don't want to ruin it for you. Stop reading. Code away. Come back. I'll wait.
The first thing to realize is that the async operations need to take place outside of the generator function. This means that some sort of "controller" will need to handle the execution of the generator, fulfill async requests, and return the results back. So we'll need to pass the generator to this controller, for which we'll just make a run() function:
run(function*(){
  try{
    var tpContent = yield readFile('blog_post_template.html');
    var mdContent = yield readFile('my_blog_post.md');
    resp.end(template(tpContent, markdown(String(mdContent))));
  }catch(e){
    resp.end(e.message);
  }
});
run() has the responsibility of calling the generator object repeatedly via next(), and fulfilling a request each time a value is yielded. It will assume that the requests it receives are functions that take a single callback parameter, which in turn takes an err and a value argument - conforming to the Node-style callback convention. When err is present, it will call throw() on the generator object to propagate the error back into the generator's code path. The code for run() looks like:
function run(genfun){
  // instantiate the generator object
  var gen = genfun();
  // This is the async loop pattern
  function next(err, answer){
    var res;
    if (err){
      // if err, throw it into the wormhole; keep the result,
      // because the generator may catch the error and yield again
      res = gen.throw(err);
    }else{
      // if good value, send it
      res = gen.next(answer);
    }
    if (!res.done){
      // if we are not at the end
      // we have an async request to
      // fulfill, we do this by calling
      // `value` as a function
      // and passing it a callback
      // that receives err, answer
      // for which we'll just use `next()`
      res.value(next);
    }
  }
  // Kick off the async loop
  next();
}
Now given that, readFile takes the file path as a parameter and needs to return a function that accepts such a callback:
var fs = require('fs');

function readFile(filepath){
  return function(callback){
    // fs.readFile calls back with (err, data), which is what run() expects
    fs.readFile(filepath, callback);
  }
}
And that's it! If that went too fast, feel free to poke at the full source code.
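To try the whole idea without touching the file system, here is a self-contained sketch. It uses a condensed variant of run() that stores the result of gen.throw(), so the generator can carry on after catching an error. The files object and fakeReadFile are hypothetical stand-ins I invented for this example, and they call back synchronously to keep the output deterministic; a real request function like the fs-based readFile would call back later.

```javascript
// Condensed variant of run(): drive the generator, fulfilling each
// yielded request by calling it with a Node-style callback.
function run(genfun) {
  var gen = genfun();
  function next(err, answer) {
    var res = err ? gen.throw(err) : gen.next(answer);
    if (!res.done) {
      res.value(next);
    }
  }
  next();
}

// Hypothetical in-memory "file system" used only for this sketch
var files = { 'a.txt': 'hello', 'b.txt': 'world' };

// Returns a request function, just like readFile does
function fakeReadFile(name) {
  return function (callback) {
    if (Object.prototype.hasOwnProperty.call(files, name)) {
      callback(null, files[name]);
    } else {
      callback(new Error('no such file: ' + name));
    }
  };
}

var output = [];

run(function* () {
  var a = yield fakeReadFile('a.txt');
  var b = yield fakeReadFile('b.txt');
  output.push(a + ' ' + b);
  try {
    yield fakeReadFile('missing.txt');
  } catch (e) {
    output.push('caught: ' + e.message);
  }
});

console.log(output);
// [ 'hello world', 'caught: no such file: missing.txt' ]
```

The generator body reads top to bottom, and the failed read surfaces as an ordinary exception at the yield that requested it.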
More Resources
To learn more about generators:
- The original post by wingo when generators made it into V8
- T.J. Holowaychuk's post and his co library, which is similar in concept to the run function I've demonstrated
- Suspend and genny take a slightly different approach and have different tradeoffs than something like co
- task.js - Dave Herman's task library
- Q is a promise library which supports generators
- A brilliant tutorial on generators in python by David Beazley
- The original proposal that introduced generators to Python