I am reading up on how map/reduce works in CouchDB, and I think I am getting the hang of it now, thanks to this great tool. I tried the "Retrieve the top N tags" example on this page, but it didn't work for me at all, so I wrote my own as an exercise. Here's my code:
// the map function: emit one row per tag occurrence, all under the same null key
function(doc) {
  for (var i in doc.tags) {
    emit(null, doc.tags[i]);
  }
}
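// the reduce function: count tags, then keep only the 3 most frequent
function(key, values, rereduce) {
  var hash = {};
  if (!rereduce) {
    // values holds the raw tag strings emitted by the map function
    for (var i in values) {
      var tag = values[i];
      hash[tag] = (hash[tag] || 0) + 1;
    }
  } else {
    // values holds earlier reduce results, each an array of [tag, count] pairs
    for (var i in values) {
      var topN = values[i];
      // j is a separate index from the outer loop's i
      for (var j in topN) {
        var pair = topN[j];
        var tag = pair[0];
        hash[tag] = (hash[tag] || 0) + pair[1];
      }
    }
  }
  // turn the hash into [tag, count] pairs (t avoids shadowing the key parameter)
  var all = [];
  for (var t in hash) {
    all.push([t, hash[t]]);
  }
  // sort by count, descending, and keep the top 3
  return all.sort(function(one, other) {
    return other[1] - one[1];
  }).slice(0, 3);
}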
The approach I took is different from the one on the example page, but I believe it to be the more correct one: rather than emitting rows keyed by tag in the map step, I emit every occurrence of every tag under a single key. Then, in the reduce step, I count the occurrences of each tag using a hash, transform the hash into an array, sort it by count, and take the top 3. In the rereduce case, I merge several top-3 lists by summing their counts and again pick the top 3 from the result.
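To try the reduce and rereduce paths without a running CouchDB server, here is a minimal sketch that calls the same logic by hand. The names `topTags`, `batchA` and `batchB` are made up for illustration, and the split into two batches is an assumption: CouchDB decides for itself how the emitted values are grouped before rereduce is called.

```javascript
// Hypothetical stand-alone harness, not part of CouchDB.
function topTags(key, values, rereduce) {
  var counts = {};
  if (!rereduce) {
    // values: tag strings, as emitted by the map function
    values.forEach(function (tag) {
      counts[tag] = (counts[tag] || 0) + 1;
    });
  } else {
    // values: earlier results, each an array of [tag, count] pairs
    values.forEach(function (pairs) {
      pairs.forEach(function (pair) {
        counts[pair[0]] = (counts[pair[0]] || 0) + pair[1];
      });
    });
  }
  // hash -> array of [tag, count], sorted by count descending, top 3 kept
  var all = Object.keys(counts).map(function (tag) {
    return [tag, counts[tag]];
  });
  all.sort(function (a, b) { return b[1] - a[1]; });
  return all.slice(0, 3);
}

// Two made-up batches of emitted tags.
var batchA = ["couchdb", "js", "couchdb", "views", "couchdb", "js"];
var batchB = ["views", "couchdb", "views", "js", "views", "couchdb"];

var partialA = topTags(null, batchA, false); // first-pass reduce
var partialB = topTags(null, batchB, false); // first-pass reduce
var combined = topTags(null, [partialA, partialB], true); // rereduce

console.log(JSON.stringify(combined));
// prints [["couchdb",5],["views",4],["js",3]]
```

The sample data was chosen so that no two tags have the same count, which keeps the output independent of how the sort handles ties.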