How to Group MongoDB Data

Introduction

In this lab, you will learn how to perform basic aggregation operations in MongoDB, including grouping data by a single field, calculating group totals, counting group items, filtering groups, and sorting group results. These skills are essential for data analysis and gaining insights from your MongoDB data.

The lab starts with grouping by a single field, then moves on to calculating totals and averages, counting items, filtering groups, and sorting the results. By the end, you will be able to use MongoDB's aggregation framework to summarize and analyze your data.

Group By Single Field

In this step, you'll learn how to group data by a single field using MongoDB's aggregation framework. Grouping allows you to organize and summarize data based on a specific field, which is crucial for data analysis.

First, let's start by launching the MongoDB shell:

mongosh

Now, let's create a sample collection of products to demonstrate grouping:

use salesdb

db.products.insertMany([
    { category: "Electronics", price: 500 },
    { category: "Clothing", price: 50 },
    { category: "Electronics", price: 300 },
    { category: "Books", price: 25 },
    { category: "Clothing", price: 75 },
    { category: "Books", price: 40 }
])

Next, we'll use the $group aggregation stage to group products by their category and calculate the total price for each category:

db.products.aggregate([
  {
    $group: {
      _id: "$category",
      totalPrice: { $sum: "$price" }
    }
  }
]);

Example Output

[
  { _id: 'Electronics', totalPrice: 800 },
  { _id: 'Clothing', totalPrice: 125 },
  { _id: 'Books', totalPrice: 65 }
]

Let's break down the aggregation pipeline:

$group: Groups documents by a specified expression
_id: "$category": Specifies we're grouping by the category field
totalPrice: { $sum: "$price" }: Calculates the total price for each category
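
To make the semantics concrete, here is a minimal plain JavaScript sketch of what this $group stage computes, run in memory on the same six documents. The helper name groupSum is hypothetical, and the sketch only illustrates the result; it is not how MongoDB executes the stage.

```javascript
// The same six documents inserted into the products collection above.
const products = [
  { category: "Electronics", price: 500 },
  { category: "Clothing", price: 50 },
  { category: "Electronics", price: 300 },
  { category: "Books", price: 25 },
  { category: "Clothing", price: 75 },
  { category: "Books", price: 40 }
];

// Hypothetical helper that mimics
// { $group: { _id: "$<keyField>", totalPrice: { $sum: "$<valueField>" } } }.
function groupSum(docs, keyField, valueField) {
  const totals = new Map();
  for (const doc of docs) {
    const key = doc[keyField];
    // Like $sum, treat missing or non-numeric values as contributing nothing.
    const value = typeof doc[valueField] === "number" ? doc[valueField] : 0;
    totals.set(key, (totals.get(key) ?? 0) + value);
  }
  return [...totals].map(([_id, totalPrice]) => ({ _id, totalPrice }));
}

const result = groupSum(products, "category", "price");
console.log(result);
```

The sketch produces the same three totals as the example output. Note that MongoDB does not guarantee the order of documents returned by $group, so the categories may appear in a different order in your shell.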

Calculate Group Totals

In this step, we'll expand on our previous grouping skills by learning how to calculate various aggregate totals using MongoDB's aggregation framework. We'll use multiple aggregation operators to derive more complex insights from our product data.

Let's continue working with the salesdb database and the products collection we created in the previous step:

// Ensure we're in the correct database
use salesdb

We'll explore different aggregation operators to calculate group totals:

Calculate the average price per category:

db.products.aggregate([
  {
    $group: {
      _id: "$category",
      averagePrice: { $avg: "$price" }
    }
  }
]);

Find the maximum and minimum prices in each category:

db.products.aggregate([
  {
    $group: {
      _id: "$category",
      maxPrice: { $max: "$price" },
      minPrice: { $min: "$price" }
    }
  }
]);

Combine multiple calculations in a single aggregation:

db.products.aggregate([
  {
    $group: {
      _id: "$category",
      totalPrice: { $sum: "$price" },
      averagePrice: { $avg: "$price" },
      productCount: { $sum: 1 }
    }
  }
]);

Example Output

[
  {
    _id: 'Electronics',
    totalPrice: 800,
    averagePrice: 400,
    productCount: 2
  },
  {
    _id: 'Clothing',
    totalPrice: 125,
    averagePrice: 62.5,
    productCount: 2
  },
  {
    _id: 'Books',
    totalPrice: 65,
    averagePrice: 32.5,
    productCount: 2
  }
]

Key aggregation operators explained:

$avg: Calculates the average of numeric values
$max: Finds the maximum value
$min: Finds the minimum value
$sum: 1: Counts the number of documents in each group
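
As a rough illustration of these accumulators, the following plain JavaScript sketch computes the same values in memory for the six original products. The variable names pricesByCategory and summary are hypothetical, and the code only mirrors the results, not MongoDB's internal behavior.

```javascript
// The six documents from the first step.
const products = [
  { category: "Electronics", price: 500 },
  { category: "Clothing", price: 50 },
  { category: "Electronics", price: 300 },
  { category: "Books", price: 25 },
  { category: "Clothing", price: 75 },
  { category: "Books", price: 40 }
];

// Collect the prices of each category (the grouping key is "category").
const pricesByCategory = new Map();
for (const { category, price } of products) {
  if (!pricesByCategory.has(category)) pricesByCategory.set(category, []);
  pricesByCategory.get(category).push(price);
}

// Apply each accumulator to the prices of a group.
const summary = [...pricesByCategory].map(([_id, prices]) => {
  const totalPrice = prices.reduce((a, b) => a + b, 0);
  return {
    _id,
    totalPrice,                                // $sum: "$price"
    averagePrice: totalPrice / prices.length,  // $avg: "$price"
    maxPrice: Math.max(...prices),             // $max: "$price"
    minPrice: Math.min(...prices),             // $min: "$price"
    productCount: prices.length                // $sum: 1
  };
});

console.log(summary);
```

Because every document adds the constant 1, `$sum: 1` is simply a per-group document count.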

Count Group Items

In this step, we'll explore different ways to count items within groups using MongoDB's aggregation framework. Counting is a fundamental operation that helps you understand the distribution of your data.

Let's continue working with our salesdb database:

// Ensure we're in the correct database
use salesdb

First, let's add more products to make our counting more interesting:

db.products.insertMany([
  { category: "Electronics", brand: "Apple" },
  { category: "Electronics", brand: "Samsung" },
  { category: "Clothing", brand: "Nike" },
  { category: "Clothing", brand: "Adidas" },
  { category: "Books", brand: "Penguin" },
  { category: "Books", brand: "Oxford" }
]);

Now, let's count items in different ways:

Count the number of products in each category:

db.products.aggregate([
  {
    $group: {
      _id: "$category",
      productCount: { $sum: 1 }
    }
  }
]);

Collect the unique brands within each category:

db.products.aggregate([
  {
    $group: {
      _id: "$category",
      uniqueBrands: { $addToSet: "$brand" }
    }
  }
]);

Count the number of unique brands and total products:

db.products.aggregate([
  {
    $group: {
      _id: "$category",
      totalProducts: { $sum: 1 },
      uniqueBrands: { $addToSet: "$brand" }
    }
  },
  {
    $project: {
      totalProducts: 1,
      uniqueBrandCount: { $size: "$uniqueBrands" }
    }
  }
]);

Example Output

[
  {
    _id: 'Electronics',
    totalProducts: 4,
    uniqueBrandCount: 2
  },
  {
    _id: 'Clothing',
    totalProducts: 4,
    uniqueBrandCount: 2
  },
  {
    _id: 'Books',
    totalProducts: 4,
    uniqueBrandCount: 2
  }
]

Key aggregation operators explained:

$sum: 1: Counts the total number of documents in each group
$addToSet: Creates an array of unique values
$size: Counts the number of elements in an array
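
The example output shows four products but only two brands per category, because the six documents from the first step have no brand field. The plain JavaScript sketch below mirrors that behavior in memory. It assumes the collection holds exactly the twelve documents inserted so far, and its variable names are hypothetical.

```javascript
// Six documents from the first step (no brand) plus six from this step (no price).
const products = [
  { category: "Electronics", price: 500 },
  { category: "Clothing", price: 50 },
  { category: "Electronics", price: 300 },
  { category: "Books", price: 25 },
  { category: "Clothing", price: 75 },
  { category: "Books", price: 40 },
  { category: "Electronics", brand: "Apple" },
  { category: "Electronics", brand: "Samsung" },
  { category: "Clothing", brand: "Nike" },
  { category: "Clothing", brand: "Adidas" },
  { category: "Books", brand: "Penguin" },
  { category: "Books", brand: "Oxford" }
];

const groups = new Map();
for (const doc of products) {
  if (!groups.has(doc.category)) {
    groups.set(doc.category, { totalProducts: 0, uniqueBrands: new Set() });
  }
  const group = groups.get(doc.category);
  group.totalProducts += 1;          // $sum: 1 counts every document
  if (doc.brand !== undefined) {     // a missing brand adds nothing to the set
    group.uniqueBrands.add(doc.brand);
  }
}

const counts = [...groups].map(([_id, g]) => ({
  _id,
  totalProducts: g.totalProducts,
  uniqueBrandCount: g.uniqueBrands.size   // plays the role of $size
}));

console.log(counts);
```

A Set is used here because, like $addToSet, it keeps each distinct value only once.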

Filter Groups

In this step, we'll learn how to filter groups using MongoDB's aggregation framework. Filtering allows you to focus on specific subsets of your grouped data, providing more targeted insights.

Let's continue working with our salesdb database:

// Ensure we're in the correct database
use salesdb

Let's add prices to the branded Electronics and Clothing products to make filtering more meaningful (the branded Books products are left without a price):

db.products.updateMany(
  { category: "Electronics", brand: "Apple" },
  { $set: { price: 1000 } }
);

db.products.updateMany(
  { category: "Electronics", brand: "Samsung" },
  { $set: { price: 800 } }
);

db.products.updateMany(
  { category: "Clothing", brand: "Nike" },
  { $set: { price: 100 } }
);

db.products.updateMany(
  { category: "Clothing", brand: "Adidas" },
  { $set: { price: 80 } }
);

Now, let's explore different filtering techniques:

Filter groups with more than one product:

db.products.aggregate([
  {
    $group: {
      _id: "$category",
      products: { $push: "$brand" },
      productCount: { $sum: 1 }
    }
  },
  {
    $match: {
      productCount: { $gt: 1 }
    }
  }
]);

Filter groups with total price above a certain threshold:

db.products.aggregate([
  {
    $group: {
      _id: "$category",
      totalPrice: { $sum: "$price" },
      products: { $push: "$brand" }
    }
  },
  {
    $match: {
      totalPrice: { $gt: 500 }
    }
  }
]);

Complex filtering with multiple conditions:

db.products.aggregate([
  {
    $group: {
      _id: "$category",
      totalPrice: { $sum: "$price" },
      avgPrice: { $avg: "$price" },
      brands: { $addToSet: "$brand" }
    }
  },
  {
    $match: {
      totalPrice: { $gt: 500 },
      avgPrice: { $lt: 900 }
    }
  }
]);

Example Output

[
  {
    _id: 'Electronics',
    totalPrice: 2600,
    avgPrice: 650,
    brands: [ 'Apple', 'Samsung' ]
  }
]

Key aggregation operators explained:

$match: Filters documents based on specified conditions
$gt: Greater than
$lt: Less than
$push: Appends each value to an array, keeping duplicates
$addToSet: Adds unique elements to an array
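
Placing $match after $group filters the grouped documents rather than the original ones. The plain JavaScript sketch below reproduces the multi-condition filter in memory. It assumes every insert and update in this lab ran exactly once, so the collection holds twelve documents, and its variable names are hypothetical.

```javascript
// Collection state after the updates in this step (Books brands have no price).
const products = [
  { category: "Electronics", price: 500 },
  { category: "Clothing", price: 50 },
  { category: "Electronics", price: 300 },
  { category: "Books", price: 25 },
  { category: "Clothing", price: 75 },
  { category: "Books", price: 40 },
  { category: "Electronics", brand: "Apple", price: 1000 },
  { category: "Electronics", brand: "Samsung", price: 800 },
  { category: "Clothing", brand: "Nike", price: 100 },
  { category: "Clothing", brand: "Adidas", price: 80 },
  { category: "Books", brand: "Penguin" },
  { category: "Books", brand: "Oxford" }
];

// Stage 1: group by category, like the $group stage.
const groups = new Map();
for (const doc of products) {
  if (!groups.has(doc.category)) {
    groups.set(doc.category, { sum: 0, pricedCount: 0, brands: new Set() });
  }
  const g = groups.get(doc.category);
  if (typeof doc.price === "number") {   // $sum and $avg skip missing prices
    g.sum += doc.price;
    g.pricedCount += 1;
  }
  if (doc.brand !== undefined) g.brands.add(doc.brand);
}

const grouped = [...groups].map(([_id, g]) => ({
  _id,
  totalPrice: g.sum,
  avgPrice: g.pricedCount > 0 ? g.sum / g.pricedCount : null,
  brands: [...g.brands]
}));

// Stage 2: keep only groups that satisfy both conditions, like $match.
const matched = grouped.filter(
  (g) => g.totalPrice > 500 && g.avgPrice !== null && g.avgPrice < 900
);

console.log(matched);
```

Only Electronics passes both conditions: Clothing and Books have totals of 305 and 65, which are below the 500 threshold.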

Sort Group Results

In this final step, we'll learn how to sort grouped results using MongoDB's aggregation framework. Sorting helps you organize and prioritize your aggregated data, making it easier to analyze and understand.

Let's continue working with our salesdb database:

// Ensure we're in the correct database
use salesdb

We'll use the products we've already created to demonstrate sorting techniques:

Sort groups by total price in descending order:

db.products.aggregate([
  {
    $group: {
      _id: "$category",
      totalPrice: { $sum: "$price" },
      productCount: { $sum: 1 }
    }
  },
  {
    $sort: {
      totalPrice: -1
    }
  }
]);

Sort groups by product count (descending), then by total price (ascending) to break ties:

db.products.aggregate([
  {
    $group: {
      _id: "$category",
      totalPrice: { $sum: "$price" },
      productCount: { $sum: 1 }
    }
  },
  {
    $sort: {
      productCount: -1,
      totalPrice: 1
    }
  }
]);

Advanced sorting with multiple aggregation stages:

db.products.aggregate([
  {
    $group: {
      _id: "$category",
      totalPrice: { $sum: "$price" },
      avgPrice: { $avg: "$price" },
      brands: { $addToSet: "$brand" }
    }
  },
  {
    $project: {
      category: "$_id",
      totalPrice: 1,
      avgPrice: 1,
      brandCount: { $size: "$brands" }
    }
  },
  {
    $sort: {
      brandCount: -1,
      totalPrice: -1
    }
  }
]);

Example Output

[
  {
    _id: 'Electronics',
    totalPrice: 2600,
    avgPrice: 650,
    category: 'Electronics',
    brandCount: 2
  },
  {
    _id: 'Clothing',
    totalPrice: 305,
    avgPrice: 76.25,
    category: 'Clothing',
    brandCount: 2
  },
  {
    _id: 'Books',
    totalPrice: 65,
    avgPrice: 32.5,
    category: 'Books',
    brandCount: 2
  }
]

Key aggregation operators explained:

$sort: Sorts the documents
- 1: Ascending order
- -1: Descending order
$project: Reshapes documents, allowing you to include, exclude, or transform fields
$size: Counts the number of elements in an array
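
When $sort lists several fields, later fields only break ties left by earlier ones. The plain JavaScript sketch below illustrates that rule with a hypothetical helper named sortBySpec. The grouped documents are made-up values chosen to show a tie; they are not output from this lab.

```javascript
// Hypothetical grouped documents: two groups tie on productCount.
const grouped = [
  { _id: "Electronics", totalPrice: 2600, productCount: 4 },
  { _id: "Books", totalPrice: 65, productCount: 2 },
  { _id: "Clothing", totalPrice: 305, productCount: 4 }
];

// Hypothetical helper: compare field by field in the order given by spec,
// where 1 means ascending and -1 means descending (as in $sort).
function sortBySpec(docs, spec) {
  const keys = Object.entries(spec);
  return [...docs].sort((a, b) => {
    for (const [field, direction] of keys) {
      if (a[field] < b[field]) return -direction;
      if (a[field] > b[field]) return direction;
    }
    return 0; // equal on every sort key
  });
}

const sorted = sortBySpec(grouped, { productCount: -1, totalPrice: 1 });
console.log(sorted.map((g) => g._id)); // [ 'Clothing', 'Electronics', 'Books' ]
```

Clothing and Electronics both have four products, so the ascending totalPrice key decides their order, while Books comes last because it has the lowest product count.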

Summary

In this lab, you learned how to group MongoDB data using the aggregation framework. You first grouped data by a single field, the product category, and calculated the total price for each group. You then used additional accumulators to calculate averages, minimums, maximums, and counts, and added further pipeline stages to filter and sort the grouped results. These techniques are essential for data analysis and gaining insights from your MongoDB data.
