How to Transform MongoDB Data

Introduction

In this lab, you will learn how to transform MongoDB data using basic aggregation operations. The lab covers five key steps: selecting output fields, renaming fields, calculating new fields, formatting output, and filtering results. Through these steps, you will gain hands-on experience in reshaping and analyzing data stored in MongoDB collections. The lab provides a sample dataset of books and demonstrates how to leverage the aggregation pipeline to extract, manipulate, and present the data in a more meaningful way.

Select Output Fields

In this step, we'll learn how to use MongoDB's aggregation pipeline to select and transform output fields. An aggregation pipeline passes documents through a sequence of stages, and each stage transforms the documents it receives before handing them to the next one.

First, let's start by launching the MongoDB shell:

mongosh

Now, let's create a sample collection of books to work with:

use bookstore

db.books.insertMany([
    {
        title: "MongoDB Basics",
        author: "Jane Smith",
        price: 29.99,
        pages: 250,
        categories: ["Database", "Programming"]
    },
    {
        title: "Python Deep Dive",
        author: "John Doe",
        price: 39.99,
        pages: 450,
        categories: ["Programming", "Python"]
    },
    {
        title: "Data Science Handbook",
        author: "Alice Johnson",
        price: 49.99,
        pages: 600,
        categories: ["Data Science", "Programming"]
    }
])

Now, let's use the aggregation pipeline to select specific output fields:

db.books.aggregate([
  {
    $project: {
      _id: 0,
      bookTitle: "$title",
      bookAuthor: "$author"
    }
  }
]);

Example output:

[
  { bookTitle: 'MongoDB Basics', bookAuthor: 'Jane Smith' },
  { bookTitle: 'Python Deep Dive', bookAuthor: 'John Doe' },
  { bookTitle: 'Data Science Handbook', bookAuthor: 'Alice Johnson' }
]

Let's break down what we did:

$project is an aggregation stage that reshapes documents
_id: 0 excludes the default MongoDB document ID
bookTitle: "$title" renames the 'title' field to 'bookTitle'
bookAuthor: "$author" renames the 'author' field to 'bookAuthor'

The $ prefix inside a quoted string, as in "$title", marks a field path: MongoDB substitutes the value of that field from each input document.
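
To make the field-path idea concrete, here is a rough plain-JavaScript analogy of the $project stage above, applied to an in-memory array instead of a collection. It is only a sketch: the names sampleBooks and projectTitleAndAuthor are hypothetical, and MongoDB evaluates the stage on the server rather than with Array.map.

```javascript
// Hypothetical in-memory copy of two documents from the books collection.
const sampleBooks = [
  { _id: 1, title: "MongoDB Basics", author: "Jane Smith", price: 29.99 },
  { _id: 2, title: "Python Deep Dive", author: "John Doe", price: 39.99 }
];

// Mimics { $project: { _id: 0, bookTitle: "$title", bookAuthor: "$author" } }:
// each output document contains only the listed fields, under their new names.
function projectTitleAndAuthor(docs) {
  return docs.map((doc) => ({
    bookTitle: doc.title,   // "$title" -> value of the title field
    bookAuthor: doc.author  // "$author" -> value of the author field
  }));
}

console.log(projectTitleAndAuthor(sampleBooks));
```

Fields that are not listed, such as price, do not appear in the result, which matches the behavior of the pipeline example.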

Rename Fields

In this step, we'll explore more advanced field renaming techniques using MongoDB's aggregation pipeline. We'll build upon the book collection we created in the previous step.

Let's continue in the MongoDB shell:

mongosh

First, let's switch to our bookstore database:

use bookstore

Now, we'll use a more complex $project stage to rename and transform multiple fields:

db.books.aggregate([
  {
    $project: {
      _id: 0,
      bookInfo: {
        name: "$title",
        writer: "$author",
        bookLength: "$pages",
        pricing: "$price"
      },
      genres: "$categories"
    }
  }
]);

Example output:

[
  {
    bookInfo: {
      name: 'MongoDB Basics',
      writer: 'Jane Smith',
      bookLength: 250,
      pricing: 29.99
    },
    genres: [ 'Database', 'Programming' ]
  },
  // ... other book documents
]

Let's break down the renaming technique:

We created a nested object bookInfo with renamed fields
name replaces title
writer replaces author
bookLength replaces pages
pricing replaces price
We also preserved the categories as genres

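Note that $rename is an update operator, not an aggregation stage, so it cannot be used inside aggregate(). For a simpler rename within the pipeline, copy each value to its new name with $addFields and then drop the old names with $unset (available as a stage in MongoDB 4.2 and later):

db.books.aggregate([
  {
    $addFields: {
      bookName: "$title",
      bookWriter: "$author"
    }
  },
  {
    $unset: ["title", "author"]
  }
]);

This renames the fields in the pipeline output only; the stored documents are unchanged. To rename fields in the stored documents themselves, $rename is used with an update command such as db.books.updateMany().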

Calculate New Fields

In this step, we'll learn how to create new fields by performing calculations using MongoDB's aggregation pipeline. We'll continue working with our bookstore database.

Let's start by launching the MongoDB shell:

mongosh

Switch to the bookstore database:

use bookstore

We'll use the $addFields stage to create new calculated fields:

db.books.aggregate([
  {
    $addFields: {
      totalValue: { $multiply: ["$price", 1.1] },
      discountedPrice: { $multiply: ["$price", 0.9] },
      pageCategories: {
        $concat: [
          { $toString: "$pages" },
          " page ",
          { $arrayElemAt: ["$categories", 0] }
        ]
      }
    }
  }
]);

Example output:

[
  {
    _id: ObjectId("..."),
    title: "MongoDB Basics",
    author: "Jane Smith",
    price: 29.99,
    pages: 250,
    categories: ["Database", "Programming"],
    totalValue: 32.989,
    discountedPrice: 26.991,
    pageCategories: "250 page Database"
  },
  // ... other book documents
]

Let's break down the calculations:

totalValue: Multiplies price by 1.1 (10% markup)
discountedPrice: Multiplies price by 0.9 (10% discount)
pageCategories: Combines number of pages with first category using $concat
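
The arithmetic behind totalValue and discountedPrice can be checked outside MongoDB. The plain-JavaScript sketch below repeats the two multiplications for the first book and rounds the results to two decimal places. The helper roundTo2 is hypothetical, and because prices are stored as floating-point numbers, the unrounded results may show extra trailing digits.

```javascript
// Hypothetical helper: round to two decimal places for display.
// Math.round rounds exact ties upward, which may differ from MongoDB's $round on ties.
function roundTo2(value) {
  return Math.round(value * 100) / 100;
}

const price = 29.99; // price of "MongoDB Basics" in the sample data

const totalValue = price * 1.1;      // same operation as { $multiply: ["$price", 1.1] }
const discountedPrice = price * 0.9; // same operation as { $multiply: ["$price", 0.9] }

console.log(roundTo2(totalValue));      // 32.99
console.log(roundTo2(discountedPrice)); // 26.99
```

Inside the pipeline, a similar effect could be achieved by wrapping each $multiply expression in $round, assuming MongoDB 4.2 or later.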

We can also apply conditional logic. Let's label each book by length based on its page count (stored here in a field named bookRating):

db.books.aggregate([
  {
    $addFields: {
      bookRating: {
        $switch: {
          branches: [
            { case: { $lt: ["$pages", 300] }, then: "Short Book" },
            { case: { $lt: ["$pages", 500] }, then: "Medium Book" }
          ],
          default: "Long Book"
        }
      }
    }
  }
]);

This example uses $switch to categorize books based on their page count.
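
The order of the branches matters: $switch checks them from top to bottom and uses the first case that is true, so a 250-page book is labeled "Short Book" even though it also has fewer than 500 pages. The same decision logic is sketched below as an ordinary JavaScript function for comparison; the name lengthLabel is hypothetical.

```javascript
// Plain-JavaScript analogy of the $switch expression: conditions are
// checked from top to bottom and the first one that is true decides the label.
function lengthLabel(pages) {
  if (pages < 300) return "Short Book";  // first branch
  if (pages < 500) return "Medium Book"; // second branch
  return "Long Book";                    // default
}

// Page counts taken from the three sample books.
for (const pages of [250, 450, 600]) {
  console.log(pages, lengthLabel(pages));
}
```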

Format Output

In this step, we'll explore various techniques to format and transform output using MongoDB's aggregation pipeline. We'll continue working with our bookstore database.

Let's start by launching the MongoDB shell:

mongosh

Switch to the bookstore database:

use bookstore

First, let's use $toUpper and $toLower to format text fields:

db.books.aggregate([
  {
    $project: {
      _id: 0,
      titleUpperCase: { $toUpper: "$title" },
      authorLowerCase: { $toLower: "$author" }
    }
  }
]);

Example output:

[
  {
    titleUpperCase: 'MONGODB BASICS',
    authorLowerCase: 'jane smith'
  },
  // ... other book documents
]

Next, let's format numeric values using $round and create formatted price strings:

db.books.aggregate([
  {
    $project: {
      _id: 0,
      title: 1,
      roundedPrice: { $round: ["$price", 1] },
      formattedPrice: {
        // $literal keeps the bare "$" from being parsed as a field path
        $concat: [{ $literal: "$" }, { $toString: { $round: ["$price", 2] } }]
      }
    }
  }
]);

Example output:

[
  {
    title: 'MongoDB Basics',
    roundedPrice: 30,
    formattedPrice: '$29.99'
  },
  // ... other book documents
]
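
Rounding a number does not pad it with trailing zeros, so a whole-number price such as 30 would be expected to come out as "$30" rather than "$30.00". The sketch below reproduces the formatting in plain JavaScript and, as one possible alternative, shows fixed two-decimal formatting done in application code with toFixed. Both helper names are hypothetical, and the first function is an approximation of the pipeline expression, not a guarantee of MongoDB's exact string output.

```javascript
// Approximates the pipeline expression: "$" + string form of the price rounded to 2 places.
function formatPriceRounded(price) {
  return "$" + String(Math.round(price * 100) / 100);
}

// Alternative done in application code: always show exactly two decimals.
function formatPriceFixed(price) {
  return "$" + price.toFixed(2);
}

console.log(formatPriceRounded(29.99)); // $29.99
console.log(formatPriceRounded(30));    // $30
console.log(formatPriceFixed(30));      // $30.00
```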

We can also format arrays and create complex string representations:

db.books.aggregate([
  {
    $project: {
      _id: 0,
      title: 1,
      categoriesSummary: {
        $reduce: {
          input: "$categories",
          initialValue: "",
          in: {
            $concat: [
              "$$value",
              { $cond: [{ $eq: ["$$value", ""] }, "", ", "] },
              "$$this"
            ]
          }
        }
      }
    }
  }
]);

Example output:

[
  {
    title: 'MongoDB Basics',
    categoriesSummary: 'Database, Programming'
  },
  // ... other book documents
]

This last example uses $reduce to join array elements into a comma-separated string.
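
For readers who know Array.prototype.reduce, the expression maps almost one to one onto plain JavaScript. In the sketch below, the accumulator parameter plays the role of $$value and item plays the role of $$this; the variable names are hypothetical.

```javascript
// Categories of the first sample book.
const categories = ["Database", "Programming"];

// Same accumulation as the $reduce expression: start from an empty string and
// append each element, adding ", " only when the accumulator is not empty.
const categoriesSummary = categories.reduce(
  (accumulator, item) => accumulator + (accumulator === "" ? "" : ", ") + item,
  "" // corresponds to initialValue
);

console.log(categoriesSummary); // Database, Programming
```

The $cond check on an empty accumulator is what prevents a leading ", " before the first element.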

Filter Results

In this final step, we'll explore various filtering techniques using MongoDB's aggregation pipeline. We'll continue working with our bookstore database to demonstrate different ways to filter results.

Let's start by launching the MongoDB shell:

mongosh

Switch to the bookstore database:

use bookstore

First, let's filter books using simple comparison operators:

db.books.aggregate([
  {
    $match: {
      price: { $gt: 30 },
      pages: { $lt: 500 }
    }
  }
]);

This query returns only books that satisfy both conditions (fields listed together in $match are combined with an implicit AND):

Have a price greater than 30
Have fewer than 500 pages

Example output:

[
  {
    _id: ObjectId("..."),
    title: "Python Deep Dive",
    author: "John Doe",
    price: 39.99,
    pages: 450,
    categories: ["Programming", "Python"]
  }
]
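
As a cross-check on the result, the same filter can be sketched in plain JavaScript over an in-memory copy of the sample data. The array name booksForMatch is hypothetical, and only the fields needed for the comparison are included.

```javascript
// Hypothetical in-memory copy of the three sample books.
const booksForMatch = [
  { title: "MongoDB Basics", price: 29.99, pages: 250 },
  { title: "Python Deep Dive", price: 39.99, pages: 450 },
  { title: "Data Science Handbook", price: 49.99, pages: 600 }
];

// Mimics { $match: { price: { $gt: 30 }, pages: { $lt: 500 } } }:
// a document passes only if every listed condition is true.
const matchedBooks = booksForMatch.filter(
  (book) => book.price > 30 && book.pages < 500
);

console.log(matchedBooks.map((book) => book.title)); // [ 'Python Deep Dive' ]
```

"MongoDB Basics" fails the price condition and "Data Science Handbook" fails the page condition, which leaves a single match.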

Next, let's use more complex filtering with array operations:

db.books.aggregate([
  {
    $match: {
      categories: { $in: ["Programming"] }
    }
  }
]);

This query finds all books that have "Programming" in their categories.
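
When the field being tested holds an array, $in matches a document if at least one array element equals one of the listed values. The plain-JavaScript sketch below imitates that rule; the helper arrayFieldMatchesIn and the array booksForIn are hypothetical names used only for illustration.

```javascript
// Hypothetical in-memory copy of the sample books (title and categories only).
const booksForIn = [
  { title: "MongoDB Basics", categories: ["Database", "Programming"] },
  { title: "Python Deep Dive", categories: ["Programming", "Python"] },
  { title: "Data Science Handbook", categories: ["Data Science", "Programming"] }
];

// Mimics { categories: { $in: wanted } } for an array field:
// true when at least one element of the array equals one of the wanted values.
function arrayFieldMatchesIn(fieldValues, wanted) {
  return fieldValues.some((value) => wanted.includes(value));
}

const programmingTitles = booksForIn
  .filter((book) => arrayFieldMatchesIn(book.categories, ["Programming"]))
  .map((book) => book.title);

console.log(programmingTitles); // all three sample books
```

With a single value in the list, as here, the shorter form { categories: "Programming" } should give the same result.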

We can also combine multiple filtering techniques:

db.books.aggregate([
  {
    $match: {
      $or: [{ pages: { $gt: 400 } }, { categories: { $in: ["Database"] } }]
    }
  },
  {
    $project: {
      title: 1,
      pages: 1,
      categories: 1
    }
  }
]);

This more complex query:

Finds books with more than 400 pages OR in the "Database" category
Projects only specific fields in the output

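Example output (every sample book satisfies at least one of the two conditions, so all three are returned):

[
  {
    _id: ObjectId("..."),
    title: "MongoDB Basics",
    pages: 250,
    categories: ["Database", "Programming"]
  },
  {
    _id: ObjectId("..."),
    title: "Python Deep Dive",
    pages: 450,
    categories: ["Programming", "Python"]
  },
  {
    _id: ObjectId("..."),
    title: "Data Science Handbook",
    pages: 600,
    categories: ["Data Science", "Programming"]
  }
]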

Summary

In this lab, you learned how to use MongoDB's aggregation pipeline to reshape and filter data. You created a sample collection of books, then used the $project stage to select fields, rename them, and group them into nested documents. You calculated new fields with $addFields and $switch, formatted output with string, rounding, and $reduce expressions, and filtered results with $match. These skills are essential for efficiently processing and analyzing data in MongoDB.
