What Is This Thing Called Generators?

Update: since this article was first published, the send() method has been removed from the spec. send(value) was replaced with calling next with a parameter value, i.e. next(value). I have updated this post with this change.

It looks like generators are actually going to make it into JavaScript, and they have already landed in V8. This is so exciting! Lately they've been all the rage. But what are they? With this post I'll do my best to explain them at a basic level.

Get It

Before we begin, let's make sure we can execute the code in the examples - because code is no good unless you can run it. You'd want to install an experimental version of Node (0.11.10+). I recommend:

  1. First install nvm
  2. then nvm install v0.11.10 to install Node
  3. now to run any of the examples, make sure you use the --harmony flag, i.e. node --harmony or node --harmony myscript.js

The Basics

In its simplest form, a generator represents a sequence of values - essentially it is an iterator. In fact, the interface of a generator object is just that of an iterator: you keep calling next() on it until it runs out of values. What makes a generator special is the way you write it:

function* naturalNumbers(){
  var n = 1;
  while (true){
    yield n++;
  }
}

In this example, naturalNumbers is a generator function - or generator for short. It is written using the new * syntax as well as the new yield keyword introduced in ES6. Calling this generator function returns a generator object that produces the natural numbers, i.e. an infinite sequence. Each time a yield is reached, the yielded value becomes the next value in the sequence. To iterate the sequence, a for-of syntax has also been proposed in ES6, however it has not yet been implemented in V8:

// this doesn't work yet. Also, it's an infinite loop ;P
for (var number of naturalNumbers()){
  console.log('number is ', number);
}

Don't fret, we can still make it do nice things. To create a new sequence, you'd call the generator function in order to get a live generator object:

> var numbers = naturalNumbers();

Now, if you call numbers.next(), you get an object with the properties value and done.

> numbers.next()
{ value: 1, done: false }

value is the next value in the sequence, and done is whether or not the sequence has stopped - the sequence stops when the code has reached the end of the generator function. But in the natural numbers example, the sequence would never stop, so to illustrate a stopped sequence let's make a simple sequence that just yields two numbers:

function* two(){
  yield 1;
  yield 2;
}

Now, let it run:

> var seq = two()
> seq.next()
{ value: 1, done: false }
> seq.next()
{ value: 2, done: false }
> seq.next()
{ value: undefined, done: true }

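As you can see, the third call returns done as true and value as undefined. If we call it a fourth time in this experimental build, we get an exception (the finalized ES6 spec instead keeps returning { value: undefined, done: true }):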

> seq.next()
Error: Generator has already finished
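
Since for-of isn't available yet, a small helper built on next() can stand in for it. The following is only a sketch: take is a hypothetical helper name (not part of any library), and it stops either after count values or as soon as done turns true, so it never calls next() on a finished generator.

```javascript
function* naturalNumbers(){
  var n = 1;
  while (true){
    yield n++;
  }
}

function* two(){
  yield 1;
  yield 2;
}

// Hypothetical helper: collect at most `count` values from a generator object.
function take(gen, count){
  var values = [];
  while (values.length < count){
    var step = gen.next();
    if (step.done) break;   // sequence ended early, stop asking
    values.push(step.value);
  }
  return values;
}

console.log(take(two(), 10));            // [ 1, 2 ]
console.log(take(naturalNumbers(), 5));  // [ 1, 2, 3, 4, 5 ]
```

Because the helper pulls values one at a time, it is safe to use on the infinite naturalNumbers sequence as long as count is finite.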

Code Suspension

So you've seen the basics of generators, and how to use them to implement iterators. But the one thing that makes generators significant is this: it is now possible to suspend code execution. Once you instantiate a generator object, you have a handle on a function whose execution can start and stop. Moreover, whenever it stops, you have control over when it restarts. To see this more concretely, let's create a generator that simply prints out strings to the console.

function* haiku(){
  console.log('I kill an ant');
  yield null; // nothing useful to yield here, so null is just a placeholder
  console.log('and realize my three children');
  yield null;
  console.log('have been watching.');
  yield null;
  console.log('- Kato Shuson');
}

Now, if we create a generator object from this generator

> var g = haiku()

and then iterate through it, line by line by calling g.next() on the command line

> g.next()

You'll realize that

  1. the code in the generator doesn't start executing until you say so
  2. when the code encounters a yield statement, it suspends execution indefinitely, and doesn't start again until you tell it to
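
If you'd rather see those two points recorded than take them on faith, here is a minimal sketch that logs the order in which things happen. The names steps and log are made up for this illustration.

```javascript
var log = [];

function* steps(){
  log.push('body started');
  yield null;
  log.push('body resumed');
}

var g = steps();             // nothing in the body has run yet
log.push('object created');
g.next();                    // runs the body up to the first yield
log.push('back in caller');
g.next();                    // resumes after the yield and finishes

console.log(log);
// [ 'object created', 'body started', 'back in caller', 'body resumed' ]
```

Note that 'object created' is logged before 'body started', even though steps() was called first.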

This opens possibilities for asynchronicity: you can call next() after some asynchronous event has occurred. To see this, take a look at the following example which makes the poem display a line per second by combining the generator with an async loop.

var g = haiku();
function next(){
  if (g.next().done) return;
  setTimeout(next, 1000);
}
next();

Note that the code in the generator haiku looks like straight-line JavaScript, and yet asynchronicity has occurred in the middle of that code - which was not possible before, given JavaScript's single-threaded, run-to-completion nature. More specifically, each time a yield is encountered, asynchronicity has a chance to occur. Those yields are kind of like time-warping wormholes.

Send

Up until now we've looked at a generator object only as a producer of a sequence of values, where information only goes one way - from the generator to you. It turns out that you can also send values to it by giving next() a parameter, in which case the yield expression actually returns a value! Let's see this by making a generator that consumes values:

function* consumer(){
  while (true){
    var val = yield null;
    console.log('Got value', val);
  }
}

First instantiate a generator object

> var c = consumer()

Next, let's try calling next() with a value:

> c.next(1)
{ value: null, done: false }

This returns the expected object, but it didn't console.log() anything. The reason is that initially the generator wasn't suspended at a yield yet, so there was nothing to receive the value and the 1 was discarded. If we call next() again, however, we will see the console.log message:

> c.next(2)
Got value 2
{ value: null, done: false }
> c.next(3)
Got value 3
{ value: null, done: false }
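
The consumer above always yields null, but values can flow in both directions through the same yield. Here is a small sketch of that, using a made-up runningTotal generator: each next(amount) sends a number in, and the object it returns carries the updated total back out.

```javascript
function* runningTotal(){
  var total = 0;
  while (true){
    // hand the current total out, then wait for the next amount to come in
    var amount = yield total;
    total += amount;
  }
}

var acc = runningTotal();
var first = acc.next();     // "priming" call: runs to the first yield
var second = acc.next(5);
var third = acc.next(10);

console.log(first);   // { value: 0, done: false }
console.log(second);  // { value: 5, done: false }
console.log(third);   // { value: 15, done: false }
```

The first next() takes no argument because nothing is waiting at a yield yet to receive one.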

Throw

Cool! So we can both send and receive values from a generator, but guess what? You can also throw!

> c.throw(new Error('blarg!'))
Error: blarg!

When you throw() an error onto a generator object, the error actually propagates back up into the code of the generator, meaning you can actually use the try and catch statements to catch it. So, if we add try/catches to the last example:

function* consumer(){
  while (true){
    try{
      var val = yield null;
      console.log('Got value', val);
    }catch(e){
      console.log('You threw an error but I caught it ;P')
    }
  }
}

This time, once we instantiate the generator object, we'll call next() first, because there's no way the generator can catch an error that's thrown at it before it even starts executing.

> var c = consumer()
> c.next()

From this point on, if we throw an error, the generator will catch it and handle it gracefully:

> c.throw(new Error('blarg!'))
You threw an error but I caught it ;P

try/catch worked.
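
One detail worth spelling out: throw() resumes the generator just like next() does, so it returns whatever the generator yields next. And if the generator doesn't catch the error, it comes straight back out to the caller. Here is a sketch of both cases, with the made-up generators resilient and fragile:

```javascript
function* resilient(){
  while (true){
    try{
      yield 'ready';
    }catch(e){
      // after catching, keep going and yield something new
      yield 'recovered from ' + e.message;
    }
  }
}

var r = resilient();
r.next();                                        // run to the first yield
var afterThrow = r.throw(new Error('blarg!'));
console.log(afterThrow);  // { value: 'recovered from blarg!', done: false }

function* fragile(){
  yield 1;                                       // no try/catch in here
}

var f = fragile();
f.next();
var escaped = null;
try{
  f.throw(new Error('boom'));
}catch(e){
  escaped = e.message;                           // the error propagated to the caller
}
console.log(escaped);     // boom
```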

Getting Fancy

Now that you know the basic mechanics of generators, what can you really do with them? Well, most folks are excited about generators because they are seen as the Ticket Out of Callback Hell ™, so let's see how that can be done.

The Premise

In JavaScript, especially in Node, IO operations are generally done as asynchronous operations that require a callback. When you have to do multiple async operations one after another it looks like this:

fs.readFile('blog_post_template.html', function(err, tpContent){
  fs.readFile('my_blog_post.md', function(err, mdContent){
    resp.end(template(tpContent, markdown(String(mdContent))));
  });
});

It gets worse when you add in error handling:

fs.readFile('blog_post_template.html', function(err, tpContent){
  if (err){
    resp.end(err.message);
    return;
  }
  fs.readFile('my_blog_post.md', function(err, mdContent){
    if (err){
      resp.end(err.message);
      return;
    }
    resp.end(template(tpContent, markdown(String(mdContent))));
  });
});

The promise of generators is that you can now write the equivalent code in a straight-line fashion:

try{
  var tpContent = yield readFile('blog_post_template.html');
  var mdContent = yield readFile('my_blog_post.md');
  resp.end(template(tpContent, markdown(String(mdContent))));
}catch(e){
  resp.end(e.message);
}

This is fantastic! Aside from less code and better aesthetics, error handling for both reads now lives in a single try/catch instead of being repeated in every callback.

Make it Happen

You've seen what we are aiming for, now how do we implement it? If you are the type who wants to figure it out for yourself, I don't want to ruin it for you. Stop reading. Code away. Come back. I'll wait.

The first thing to realize is that the async operations need to take place outside of the generator function. This means that some sort of "controller" will need to handle the execution of the generator, fulfill async requests, and return the results back. So we'll need to pass the generator to this controller, for which we'll just make a run() function:

run(function*(){
  try{
    var tpContent = yield readFile('blog_post_template.html');
    var mdContent = yield readFile('my_blog_post.md');
    resp.end(template(tpContent, markdown(String(mdContent))));
  }catch(e){
    resp.end(e.message);
  }
});

run() has the responsibility of calling next() on the generator object repeatedly, fulfilling a request each time a value is yielded. It assumes that each request it receives is a function that takes a single callback parameter, and that the callback takes an err argument followed by a value argument - conforming to the Node style callback convention. When err is present, it calls throw() on the generator object to propagate the error back into the generator's code path. The code for run() looks like:

function run(genfun){
  // instantiate the generator object
  var gen = genfun();
  // This is the async loop pattern
  function next(err, answer){
    var res;
    if (err){
      // if err, throw it into the wormhole; keep the result in case
      // the generator catches the error and yields another request
      res = gen.throw(err);
    }else{
      // if good value, send it
      res = gen.next(answer);
    }
    if (!res.done){
      // if we are not at the end
      // we have an async request to
      // fulfill, we do this by calling 
      // `value` as a function
      // and passing it a callback
      // that receives err, answer
      // for which we'll just use `next()`
      res.value(next);
    }
  }
  // Kick off the async loop
  next();
}

Given that, readFile takes the file path as a parameter and needs to return a function that accepts the callback:

function readFile(filepath){
  return function(callback){
    fs.readFile(filepath, callback);
  }
}

And that's it! If that went too fast, feel free to poke at the full source code.
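
If you want to try the pieces together without a web server or files on disk, here is a self-contained sketch. run() is restated in compact form, with the result of throw() fed back into the loop so a generator that catches an error can keep yielding. later and failLater are hypothetical stand-ins for real IO: they follow the same "function that takes a callback" shape as readFile above but simply use setTimeout.

```javascript
function run(genfun){
  var gen = genfun();
  function next(err, answer){
    // resume the generator: throw the error in, or send the answer in
    var res = err ? gen.throw(err) : gen.next(answer);
    if (!res.done){
      // the yielded value is a function waiting for a Node style callback
      res.value(next);
    }
  }
  next();
}

// Hypothetical async operation that succeeds with `value` after 10ms
function later(value){
  return function(callback){
    setTimeout(function(){ callback(null, value); }, 10);
  };
}

// Hypothetical async operation that fails with `message` after 10ms
function failLater(message){
  return function(callback){
    setTimeout(function(){ callback(new Error(message)); }, 10);
  };
}

run(function*(){
  try{
    var tpContent = yield later('<h1>{title}</h1>');
    var title = yield later('Generators');
    console.log(tpContent.replace('{title}', title));
    yield failLater('disk on fire');
    console.log('never reached');
  }catch(e){
    console.log('Caught:', e.message);
  }
});
// prints "<h1>Generators</h1>" and then "Caught: disk on fire"
```

Note that if the generator does not catch a thrown error, gen.throw() rethrows it out of next(), so a real controller would probably want its own error handling around that call.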

