
In this map-reduce operation, MongoDB applies the map phase to each input document (i.e. the documents in the collection that match the query condition). The map function emits key-value pairs. The mapReduce command has the following prototype form:
db.runCommand(
    {
        mapReduce: <collection>,
        map: <function>,
        reduce: <function>,
        out: <output>,
        query: <document>,
        sort: <document>,
        limit: <number>,
        finalize: <function>,
        scope: <document>,
        jsMode: <boolean>,
        verbose: <boolean>
    }
)
var mapFunction = function () { ... };
var reduceFunction = function (key, values) { ... };
db.runCommand(
    {
        mapReduce: 'orders',
        map: mapFunction,
        reduce: reduceFunction,
        out: { merge: 'map_reduce_results', db: 'test' },
        query: { ord_date: { $gt: new Date('01/01/2012') } }
    }
)
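To make the phases concrete, here is a minimal in-memory sketch of how query, map, reduce and finalize fit together. It is only a model for reasoning, not how MongoDB is implemented: simulateMapReduce is a hypothetical helper, the query is modelled as a plain predicate function, and the cust_id and price fields are assumed sample data.

```javascript
// Hypothetical in-memory model of the phases; NOT MongoDB's implementation.
function simulateMapReduce(docs, map, reduce, options) {
  options = options || {};
  var groups = {}; // serialized key -> { key, values }

  // Map functions call a free function named emit(), as they do on the server.
  globalThis.emit = function (key, value) {
    var k = JSON.stringify(key);
    groups[k] = groups[k] || { key: key, values: [] };
    groups[k].values.push(value);
  };

  docs
    .filter(options.query || function () { return true; }) // query: a predicate in this model
    .forEach(function (doc) { map.call(doc); });            // map: `this` is the document

  return Object.keys(groups).map(function (k) {
    var g = groups[k];
    // A key with a single emitted value is passed through without calling reduce.
    var value = g.values.length > 1 ? reduce(g.key, g.values) : g.values[0];
    if (options.finalize) { value = options.finalize(g.key, value); }
    return { _id: g.key, value: value };
  });
}

// Assumed sample data: cust_id and price are illustrative field names.
var orders = [
  { cust_id: 'A', price: 25, ord_date: new Date('2012-03-01') },
  { cust_id: 'A', price: 10, ord_date: new Date('2012-06-15') },
  { cust_id: 'B', price: 5,  ord_date: new Date('2011-12-31') },
  { cust_id: 'B', price: 40, ord_date: new Date('2012-02-02') }
];

var results = simulateMapReduce(
  orders,
  function () { emit(this.cust_id, this.price); },
  function (key, values) {
    return values.reduce(function (a, b) { return a + b; }, 0);
  },
  { query: function (doc) { return doc.ord_date > new Date('2012-01-01'); } }
);

console.log(results);
// [ { _id: 'A', value: 35 }, { _id: 'B', value: 40 } ]
```

The order for customer B dated 2011-12-31 is dropped by the query, so B ends up with a single value that never goes through reduce.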
The customer wants the following collection:
RatingSnapshot: [{
    location_id: "12345abcde",
    network: "google",
    timestamp: ISODate("2014-09-23T11:34:00.123Z"),
    rating: 2.1,
    num_ratings: 6
},
{
    location_id: "12345abcde",
    network: "yelp",
    timestamp: ISODate("2014-09-23T11:34:00.123Z"),
    rating: 2.2,
    num_ratings: 7
}
...
];
My proposal:
RatingSnapshot:
{
    location_id: "12345abcde",
    timestamp: ISODate("2014-09-23T11:34:00.123Z"),
    advertizer: "McDonald's",
    ratings:
    [
        {
            network: "google",
            rating: 2.1,
            num_ratings: 6
        },
        {
            network: "yelp",
            rating: 2.2,
            num_ratings: 7
        }
        ...
    ]
}
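The relationship between the two layouts can be sketched as a small conversion: flat per-network documents are grouped by location_id and timestamp into one document with a ratings array. nestSnapshots is a hypothetical helper written for illustration, timestamps are plain strings here for simplicity, and the advertizer field is left out because the flat layout does not carry it.

```javascript
// Hypothetical helper: group flat per-network snapshots into one
// document per (location_id, timestamp) with a nested ratings array.
function nestSnapshots(flatDocs) {
  var byKey = {};
  flatDocs.forEach(function (doc) {
    var k = doc.location_id + '|' + doc.timestamp;
    if (!byKey[k]) {
      byKey[k] = {
        location_id: doc.location_id,
        timestamp: doc.timestamp,
        ratings: []
      };
    }
    byKey[k].ratings.push({
      network: doc.network,
      rating: doc.rating,
      num_ratings: doc.num_ratings
    });
  });
  return Object.keys(byKey).map(function (k) { return byKey[k]; });
}

// The two flat documents from the customer's layout.
var flat = [
  { location_id: '12345abcde', network: 'google',
    timestamp: '2014-09-23T11:34:00.123Z', rating: 2.1, num_ratings: 6 },
  { location_id: '12345abcde', network: 'yelp',
    timestamp: '2014-09-23T11:34:00.123Z', rating: 2.2, num_ratings: 7 }
];

var nested = nestSnapshots(flat);
console.log(JSON.stringify(nested, null, 2));
```

Both flat documents share the same location and timestamp, so they collapse into a single nested document with two ratings entries.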
[Figure: SQL commands mapped to their NoSQL (MongoDB) equivalents]

Basic Example1: Calculate RatingSnapshot and Total Quantity with Average Quantity Per Rating
This example runs a map-reduce operation on the RatingSnapshots collection for all documents whose timestamp value is greater than 01/10/2014. The operation groups by the network field of each ratings entry and calculates, for each network, the number of rating entries and the total num_ratings. The operation concludes by calculating the average num_ratings per rating entry for each network value.
Define the map function to process each input document:
- In the function, this refers to the document that the map-reduce operation is processing.
- For each entry in ratings, the function associates the network with a new object value that contains a count of 1 and the entry's num_ratings, and emits the network and value pair.
var mapFunction = function () {
    for (var idx = 0; idx < this.ratings.length; idx++) {
        var key = this.ratings[idx].network;
        var value = {
            count: 1,
            num_ratings: this.ratings[idx].num_ratings
        };
        emit(key, value);
    }
};
Define the corresponding reduce function with two arguments, keyNetwork and countObjVals:
- countObjVals is an array whose elements are the objects that the map function emitted for the grouped keyNetwork value.
- The function reduces the countObjVals array to a single object, reducedVal, that contains the count and the num_ratings fields.
- In reducedVal, the count field contains the sum of the count fields from the individual array elements, and the num_ratings field contains the sum of the num_ratings fields from the individual array elements.
var reduceFunction = function (keyNetwork, countObjVals) {
    // Declared with var so it does not leak into the global scope.
    var reducedVal = { count: 0, num_ratings: 0 };
    for (var idx = 0; idx < countObjVals.length; idx++) {
        reducedVal.count += countObjVals[idx].count;
        reducedVal.num_ratings += countObjVals[idx].num_ratings;
    }
    return reducedVal;
};
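MongoDB may call the reduce function more than once for the same key, feeding the output of an earlier call back in as one of the values. That is why the returned object has the same shape as the emitted values. The following sketch checks this property in plain JavaScript; the three sample values are made up for illustration.

```javascript
var reduceFunction = function (keyNetwork, countObjVals) {
  var reducedVal = { count: 0, num_ratings: 0 };
  for (var idx = 0; idx < countObjVals.length; idx++) {
    reducedVal.count += countObjVals[idx].count;
    reducedVal.num_ratings += countObjVals[idx].num_ratings;
  }
  return reducedVal;
};

// Assumed sample values, shaped like the objects the map function emits.
var a = { count: 1, num_ratings: 6 };
var b = { count: 1, num_ratings: 7 };
var c = { count: 1, num_ratings: 2 };

// Reducing everything at once ...
var allAtOnce = reduceFunction('google', [a, b, c]);
// ... must give the same result as reducing a partial result again.
var inTwoSteps = reduceFunction('google', [reduceFunction('google', [a, b]), c]);

console.log(allAtOnce);  // { count: 3, num_ratings: 15 }
console.log(inTwoSteps); // { count: 3, num_ratings: 15 }
```

Computing the average inside reduce would break this property, which is one reason the average is a better fit for a finalize step.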
Then define a finalize function with two arguments, key and reducedVal. The function modifies the reducedVal object to add a computed field named avg and returns the modified object.
var finalizeFunction = function (key, reducedVal) {
    reducedVal.avg = reducedVal.num_ratings / reducedVal.count;
    return reducedVal;
};
Perform the map-reduce operation on the RatingSnapshots collection using the mapFunction, reduceFunction, and finalizeFunction functions.
db.RatingSnapshots.mapReduce(
    mapFunction,
    reduceFunction,
    {
        out: { merge: "map_reduce_output_num_rating" },
        query: { timestamp: { $gt: new Date('01/10/2014') } },
        finalize: finalizeFunction
    }
)
This operation uses the query field to select only those documents with a timestamp greater than new Date('01/10/2014'). It then outputs the results to a collection named map_reduce_output_num_rating. If the map_reduce_output_num_rating collection already exists, the operation merges the existing contents with the results of this map-reduce operation.
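The whole of Example 1 can be traced without a server. The sketch below runs the three functions over a few made-up snapshots in plain JavaScript. The sample documents, the local emit collector and the ISO-formatted cutoff date are assumptions for illustration; it filters on the timestamp field of the proposed schema.

```javascript
var emitted = {}; // key -> array of emitted values
function emit(key, value) {
  (emitted[key] = emitted[key] || []).push(value);
}

var mapFunction = function () {
  for (var idx = 0; idx < this.ratings.length; idx++) {
    emit(this.ratings[idx].network,
         { count: 1, num_ratings: this.ratings[idx].num_ratings });
  }
};
var reduceFunction = function (keyNetwork, countObjVals) {
  var reducedVal = { count: 0, num_ratings: 0 };
  for (var idx = 0; idx < countObjVals.length; idx++) {
    reducedVal.count += countObjVals[idx].count;
    reducedVal.num_ratings += countObjVals[idx].num_ratings;
  }
  return reducedVal;
};
var finalizeFunction = function (key, reducedVal) {
  reducedVal.avg = reducedVal.num_ratings / reducedVal.count;
  return reducedVal;
};

// Assumed sample snapshots in the proposed nested layout.
var snapshots = [
  { location_id: 'loc1', timestamp: new Date('2014-09-23T11:34:00.123Z'),
    ratings: [ { network: 'google', rating: 2.1, num_ratings: 6 },
               { network: 'yelp',   rating: 2.2, num_ratings: 7 } ] },
  { location_id: 'loc2', timestamp: new Date('2014-10-02T08:00:00Z'),
    ratings: [ { network: 'google', rating: 3.0, num_ratings: 4 },
               { network: 'yelp',   rating: 4.0, num_ratings: 5 } ] },
  { location_id: 'loc3', timestamp: new Date('2013-12-31T00:00:00Z'), // before the cutoff
    ratings: [ { network: 'google', rating: 5.0, num_ratings: 100 } ] }
];

// ISO format is used here to avoid day/month ambiguity.
var cutoff = new Date('2014-01-10T00:00:00Z');
snapshots
  .filter(function (doc) { return doc.timestamp > cutoff; })
  .forEach(function (doc) { mapFunction.call(doc); });

var output = Object.keys(emitted).map(function (key) {
  var vals = emitted[key];
  var reduced = vals.length > 1 ? reduceFunction(key, vals) : vals[0];
  return { _id: key, value: finalizeFunction(key, reduced) };
});

console.log(JSON.stringify(output));
// [{"_id":"google","value":{"count":2,"num_ratings":10,"avg":5}},
//  {"_id":"yelp","value":{"count":2,"num_ratings":12,"avg":6}}]
```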
Basic Example2: Calc num_ratings
{
    network: 'google',
    rating: 3.4,
    num_ratings: 2
}
Here we have a rating entry from the 'google' network with two ratings (num_ratings: 2). Now we want to find the total number of ratings each network has accumulated across the entire num_ratings collection. It's a problem easily solved with map-reduce.
Mapping
As its name suggests, map-reduce essentially involves two operations. The first, specified by our map function, formats our data as a series of key-value pairs. Our key is the network's name (this makes sense only if the network name is unique). Our value is a document containing the number of ratings (num_ratings). We generate these key-value pairs by emitting them. See below:
var map = function () {
    emit(this.network, {num_ratings: this.num_ratings});
};
When we run map-reduce, the map function is applied to each document. This results in a collection of key-value pairs. What do we do with these results? It turns out that we don’t even have to think about them because they’re automatically passed on to our reduce function.
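To see what the map phase produces, the sketch below applies the map function to three made-up documents and collects the emitted pairs in an array. The local emit function and the sample documents are assumptions for illustration; on the server, emit is provided by MongoDB.

```javascript
var pairs = []; // collected [key, value] pairs
function emit(key, value) {
  pairs.push([key, value]);
}

var map = function () {
  emit(this.network, { num_ratings: this.num_ratings });
};

// Assumed sample documents shaped like the one shown above.
var docs = [
  { network: 'google', rating: 3.4, num_ratings: 2 },
  { network: 'yelp',   rating: 4.0, num_ratings: 5 },
  { network: 'google', rating: 2.9, num_ratings: 1 }
];

// MongoDB calls map with `this` bound to each document.
docs.forEach(function (doc) { map.call(doc); });

console.log(JSON.stringify(pairs));
// [["google",{"num_ratings":2}],["yelp",{"num_ratings":5}],["google",{"num_ratings":1}]]
```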
Reducing
Specifically, the reduce function will be invoked with two arguments: a key and an array of values associated with that key. Returning to our example, we can imagine our reduce function receiving something like this:
reduce('google', [{num_ratings: 2}, {num_ratings: 1}, {num_ratings: 4}]);
A reduce function that sums these values looks like this:
var reduce = function(key, values) {
    var sum = 0;
    values.forEach(function(doc) {
        sum += doc.num_ratings;
    });
    return {num_ratings: sum};
};
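As a quick check, the sample invocation shown above can be evaluated directly in plain JavaScript, with no server involved:

```javascript
var reduce = function (key, values) {
  var sum = 0;
  values.forEach(function (doc) {
    sum += doc.num_ratings;
  });
  return { num_ratings: sum };
};

var result = reduce('google', [
  { num_ratings: 2 }, { num_ratings: 1 }, { num_ratings: 4 }
]);

console.log(result); // { num_ratings: 7 }
```

The result has the same shape as each input value, so it can itself appear in the values array of a later reduce call for the same key.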
Results
From the shell, we pass our map and reduce functions to the mapReduce helper.
var op = db.num_ratings.mapReduce(map, reduce, {out: "mr_results"});
{
    "result" : "mr_results",
    "timeMillis" : 8,
    "counts" : {
        "input" : 6,
        "emit" : 6,
        "output" : 2
    },
    "ok" : 1
}
db[op.result].find();
{ "_id" : "yelp", "value" : { "num_ratings" : 21 } }
{ "_id" : "google", "value" : { "num_ratings" : 13 } }
Basic Example3: Places in each network
We want to end up with a "networks" collection that has documents that look like this:
{"_id" : "Google", "value" : 4}
{"_id" : "Yelp", "value" : 2}
Emit each network in the map function, then count them in the reduce function.
1 The map function first checks whether the document has a rating.network field, because reading this.rating.network on a document without a rating field would throw an error. Once that has been established, we go through each element, emitting the network name and a count of 1:
map = function () {
    if (!this.rating || !this.rating.network) {
        return;
    }
    for (var index in this.rating.network) {
        emit(this.rating.network[index], 1);
    }
}
2 Reduce. For the reduce function, we initialize a counter to 0 and then add each element of the current array to it. Then we return the final count.
reduce = function(previous, current) {
    var count = 0;
    // previous is the key (network name); current is the array of emitted counts
    for (var index in current) {
        count += current[index];
    }
    return count;
}
3 Call the mapreduce command
result = db.runCommand({
    "mapreduce" : "RatingSnapshots",
    "map" : map,
    "reduce" : reduce,
    "out" : "networks"})
db.networks.find()
{"_id" : "Google", "value" : 4}
{"_id" : "Yelp", "value" : 2}
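The counting logic of this example can be checked in memory. The sketch below uses five made-up Place documents (one without a rating field) chosen so that the totals match the output above. The local emit collector and the sample data are assumptions for illustration.

```javascript
var emitted = {}; // network name -> array of emitted counts
function emit(key, value) {
  (emitted[key] = emitted[key] || []).push(value);
}

var map = function () {
  // Skip documents that have no rating.network array.
  if (!this.rating || !this.rating.network) { return; }
  for (var index in this.rating.network) {
    emit(this.rating.network[index], 1);
  }
};
var reduce = function (previous, current) {
  var count = 0;
  for (var index in current) {
    count += current[index];
  }
  return count;
};

// Assumed sample data.
var places = [
  { Place: 'p1', rating: { network: ['Google', 'Yelp'] } },
  { Place: 'p2', rating: { network: ['Google'] } },
  { Place: 'p3', rating: { network: ['Google', 'Yelp'] } },
  { Place: 'p4', rating: { network: ['Google'] } },
  { Place: 'p5' } // no rating field: map returns early
];
places.forEach(function (doc) { map.call(doc); });

var networks = {};
Object.keys(emitted).forEach(function (key) {
  var vals = emitted[key];
  networks[key] = vals.length > 1 ? reduce(key, vals) : vals[0];
});

console.log(networks); // { Google: 4, Yelp: 2 }
```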
Basic Example4: Pivot Data with Map reduce
You have a collection of Places, each with an array of networks in rating.network.
You want to generate a collection of networks, each with an array of the Places that list it.
db.Places.insert( { Place: "123asd", rating: { network: ['Google', 'Yelp', '4square'] } });
db.Places.insert( { Place: "adf134", rating: { network: ['Google', 'Yelp', 'fb'] } });
We need to loop through each network in the Place document and emit each one individually. The catch here is in the reduce phase. We cannot return an array from the reduce function, so we must build a Places array inside of the "value" document that is returned.
map = function () {
    for (var i in this.rating.network) {
        var key = { location: this.rating.network[i] };
        var value = { Places: [ this.Place ] };
        emit(key, value);
    }
}
reduce = function (key, values) {
    var Place_list = { Places: [] };
    for (var i in values) {
        Place_list.Places = values[i].Places.concat(Place_list.Places);
    }
    return Place_list;
}
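The pivot can be traced in memory with the two Place documents inserted above. Because the key is an object, this sketch groups emitted values by the JSON-serialized key; that grouping and the local emit function are assumptions made for illustration, not MongoDB internals.

```javascript
var groups = {}; // serialized key -> { key, values }
function emit(key, value) {
  var k = JSON.stringify(key);
  groups[k] = groups[k] || { key: key, values: [] };
  groups[k].values.push(value);
}

var map = function () {
  for (var i in this.rating.network) {
    emit({ location: this.rating.network[i] }, { Places: [ this.Place ] });
  }
};
var reduce = function (key, values) {
  var Place_list = { Places: [] };
  for (var i in values) {
    Place_list.Places = values[i].Places.concat(Place_list.Places);
  }
  return Place_list;
};

var places = [
  { Place: '123asd', rating: { network: ['Google', 'Yelp', '4square'] } },
  { Place: 'adf134', rating: { network: ['Google', 'Yelp', 'fb'] } }
];
places.forEach(function (doc) { map.call(doc); });

var pivot = Object.keys(groups).map(function (k) {
  var g = groups[k];
  // Keys with a single value are passed through without calling reduce.
  var value = g.values.length > 1 ? reduce(g.key, g.values) : g.values[0];
  return { _id: g.key, value: value };
});

pivot.forEach(function (row) {
  console.log(row._id.location + ': ' + row.value.Places.join(', '));
});
// Google: adf134, 123asd
// Yelp: adf134, 123asd
// 4square: 123asd
// fb: adf134
```

Each call to concat puts the newer Places in front of the accumulated list, which is why the shared networks list adf134 before 123asd.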
php - MongoDB::command (Execute a database command) Examples
// Find all the distinct values for a key
$people = $db->people;
$people->insert(array("name" => "Joe", "age" => 4));
$people->insert(array("name" => "Sally", "age" => 22));
$people->insert(array("name" => "Dave", "age" => 22));
$people->insert(array("name" => "Molly", "age" => 87));
$ages = $db->command(array("distinct" => "people", "key" => "age"));
foreach ($ages['values'] as $age) {
    echo "$age\n";
}
// Find the distinct values for a key, restricted by a query (age >= 18)
$people = $db->people;
$people->insert(array("name" => "Joe", "age" => 4));
$people->insert(array("name" => "Sally", "age" => 22));
$people->insert(array("name" => "Dave", "age" => 22));
$people->insert(array("name" => "Molly", "age" => 87));
$ages = $db->command(
    array(
        "distinct" => "people",
        "key" => "age",
        "query" => array("age" => array('$gte' => 18))
    )
);
foreach ($ages['values'] as $age) {
    echo "$age\n";
}
// Sample event document ($events, $id, $type and $description are assumed to be defined earlier)
$events->insert(array("user_id" => $id,
    "type" => $type,
    "time" => new MongoDate(),
    "desc" => $description));
// Construct the map and reduce functions
$map = new MongoCode("function () { emit(this.user_id, 1); }");
$reduce = new MongoCode("function (k, vals) { ".
    "var sum = 0;".
    "for (var i in vals) {".
        "sum += vals[i];".
    "}".
    "return sum; }");
// Count the sale events per user and merge the counts into eventCounts
$sales = $db->command(array(
    "mapreduce" => "events",
    "map" => $map,
    "reduce" => $reduce,
    "query" => array("type" => "sale"),
    "out" => array("merge" => "eventCounts")));
// Read the results back from the output collection
$users = $db->selectCollection($sales['result'])->find();
foreach ($users as $user) {
    echo "{$user['_id']} had {$user['value']} sale(s).\n";
}