Link MongoDB Documents

Introduction

In this lab, you will learn the fundamentals of establishing relationships between documents in MongoDB. This technique, known as referencing, is essential for building complex and organized database structures. You will practice creating collections, inserting documents with references to other documents, retrieving and combining related data using the $lookup aggregation stage, and managing the lifecycle of these linked documents. By the end of this lab, you will have a solid understanding of how to model and manage one-to-many relationships in your MongoDB database.

Establishing Document Relationships

In this first step, you will create two separate collections and establish a relationship between them. We will model a common scenario: a library database with authors and books. Each book will reference its author.

First, open the MongoDB Shell (mongosh), the command-line interface for working with your MongoDB instance.

mongosh

Once inside the shell, you will see a test> prompt. Let's switch to a new database named library_database. If the database does not exist, MongoDB will create it for you when you first store data.

use library_database

Now, let's create the authors collection by inserting two documents. We set each document's _id explicitly to a known ObjectId value so that the book documents can reference it in the next part of this step.

db.authors.insertMany([
  {
    _id: ObjectId("660a1f5c9b8f8b1234567890"),
    name: "Jane Austen",
    nationality: "British"
  },
  {
    _id: ObjectId("660a1f5c9b8f8b1234567891"),
    name: "George Orwell",
    nationality: "British"
  }
]);

You should see a confirmation that the documents were inserted successfully.

Example output:

{
  "acknowledged": true,
  "insertedIds": {
    "0": ObjectId("660a1f5c9b8f8b1234567890"),
    "1": ObjectId("660a1f5c9b8f8b1234567891")
  }
}

Next, create the books collection. In each book document, the author_id field will store the ObjectId of the corresponding author from the authors collection. This creates the link between a book and its author.

db.books.insertMany([
  {
    title: "Pride and Prejudice",
    author_id: ObjectId("660a1f5c9b8f8b1234567890"),
    year: 1813
  },
  {
    title: "1984",
    author_id: ObjectId("660a1f5c9b8f8b1234567891"),
    year: 1949
  }
]);

Example output (MongoDB generates the _id values for these documents, so yours will differ):

{
  "acknowledged": true,
  "insertedIds": {
    "0": ObjectId("660b2a1c9b8f8b1234567892"),
    "1": ObjectId("660b2a1c9b8f8b1234567893")
  }
}

You have now successfully created two collections and linked documents in the books collection to documents in the authors collection. Keep the MongoDB shell open for the next step.
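To make the link concrete, here is a minimal sketch of following a reference by hand with two queries. The helper name findAuthorOfBook is hypothetical and not part of MongoDB; it takes the database handle as a parameter and assumes the books and authors collections created above.

```javascript
// Hypothetical helper: resolve a book's author by following author_id.
// `db` is the database handle (in mongosh, the global `db`).
function findAuthorOfBook(db, title) {
  // First query: fetch the book that holds the reference.
  const book = db.books.findOne({ title: title });
  if (book === null) {
    return null; // no book with that title
  }
  // Second query: fetch the author whose _id equals the stored reference.
  return db.authors.findOne({ _id: book.author_id });
}
```

In mongosh, after pasting the function, calling findAuthorOfBook(db, "1984") should return the George Orwell document, assuming the data inserted in this step.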

Querying Linked Documents

Now that you have established relationships, the next logical step is to retrieve the linked data in a single query. MongoDB's aggregation framework provides the $lookup stage for this purpose, which performs a left outer join to another collection in the same database.

Ensure you are still in the mongosh shell and using the library_database.

Let's perform a query to fetch all books and embed their corresponding author information within the results.

db.books.aggregate([
  {
    $lookup: {
      from: "authors",
      localField: "author_id",
      foreignField: "_id",
      as: "author_details"
    }
  }
]);

The $lookup stage joins the books collection with the authors collection. Let's review its parameters:

  • from: "authors": Specifies the collection to join with.
  • localField: "author_id": Specifies the field from the input documents (from the books collection).
  • foreignField: "_id": Specifies the field from the documents in the "from" collection (the authors collection).
  • as: "author_details": Specifies the name of the new array field to add to the output. This array will contain the matched author documents.

Example output for one of the documents:

[
  {
    "_id": ObjectId("..."),
    "title": "Pride and Prejudice",
    "author_id": ObjectId("660a1f5c9b8f8b1234567890"),
    "year": 1813,
    "author_details": [
      {
        "_id": ObjectId("660a1f5c9b8f8b1234567890"),
        "name": "Jane Austen",
        "nationality": "British"
      }
    ]
  },
  ...
]

As you can see, the author_details field is an array containing the full document for the author. This powerful feature allows you to retrieve comprehensive data without needing to perform multiple queries from your application.
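If the left outer join behavior is unclear, the following plain JavaScript sketch models it in memory. It is a simplified illustration, not MongoDB code: the lookup function is hypothetical, plain strings stand in for ObjectId values, and the third book is invented to show what happens when no author matches.

```javascript
// In-memory model of an equality-match $lookup (a left outer join).
// Plain strings stand in for ObjectId values; this is not MongoDB code.
function lookup(inputDocs, fromDocs, localField, foreignField, as) {
  return inputDocs.map((doc) => ({
    ...doc,
    // Every input document is kept. The new field holds all matching
    // documents, or an empty array when nothing matches.
    [as]: fromDocs.filter((other) => other[foreignField] === doc[localField]),
  }));
}

const authors = [
  { _id: "a1", name: "Jane Austen" },
  { _id: "a2", name: "George Orwell" },
];
const books = [
  { title: "Pride and Prejudice", author_id: "a1" },
  { title: "1984", author_id: "a2" },
  { title: "Unknown Work", author_id: "a9" }, // invented: no matching author
];

const joined = lookup(books, authors, "author_id", "_id", "author_details");
console.log(JSON.stringify(joined, null, 2));
```

All three books appear in the result. The invented third book keeps its place with an empty author_details array, which is why the output field is always an array rather than a single document.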

Updating Linked Documents

Data in a database is rarely static. In this step, you will learn how to update documents in both the authors and books collections. Because we are using references, you can update an author's information in one place, and all queries that join with that author will automatically reflect the change.

Let's add a birth year to Jane Austen's document. We will use the updateOne method with the $set operator to add a new field without overwriting the entire document.

db.authors.updateOne(
  { name: "Jane Austen" },
  {
    $set: {
      birth_year: 1775
    }
  }
);

Example output:

{
  "acknowledged": true,
  "insertedId": null,
  "matchedCount": 1,
  "modifiedCount": 1,
  "upsertedCount": 0
}

Now, let's update a book's details. We will add a genre to "Pride and Prejudice".

db.books.updateOne(
  { title: "Pride and Prejudice" },
  {
    $set: {
      genre: "Romance"
    }
  }
);

To verify that both updates were successful, you can query the documents directly.

First, check the author:

db.authors.findOne({ name: "Jane Austen" });

Then, check the book:

db.books.findOne({ title: "Pride and Prejudice" });

You will see the new birth_year and genre fields in the respective documents. The reference author_id in the book document remains unchanged, preserving the link.
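The sketch below models why a single update is enough when books hold references. It is plain JavaScript run in memory, not MongoDB code: setFields is a hypothetical stand-in for the merge behavior of $set, plain strings stand in for ObjectId values, and "Emma" is a second book added only for illustration.

```javascript
// In-memory model: one author document, referenced by two books.
const authorsById = new Map([
  ["a1", { _id: "a1", name: "Jane Austen", nationality: "British" }],
]);
const libraryBooks = [
  { title: "Pride and Prejudice", author_id: "a1" },
  { title: "Emma", author_id: "a1" }, // added for illustration
];

// Models { $set: fields }: merge the given fields, keep every other field.
function setFields(doc, fields) {
  return { ...doc, ...fields };
}

// Resolve each book's reference at read time, as a join would.
function booksWithAuthors() {
  return libraryBooks.map((book) => ({
    ...book,
    author: authorsById.get(book.author_id),
  }));
}

// One update to the author document...
authorsById.set("a1", setFields(authorsById.get("a1"), { birth_year: 1775 }));

// ...is visible through every book that references it.
for (const book of booksWithAuthors()) {
  console.log(book.title, "->", book.author.name, book.author.birth_year);
}
```

No book document was touched, yet both books now resolve to an author with a birth_year. Had the author data been copied into each book instead, every copy would need its own update.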

Deleting Linked Documents

The final part of managing relationships is handling deletions. When you remove a document, you must consider what happens to the documents that reference it. MongoDB does not enforce referential integrity automatically, so this is a task you must manage at the application level.
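As an illustration of what an application-level integrity check might look like, here is a small in-memory sketch that finds books whose author_id matches no author. The function name findOrphanedBooks is hypothetical, and plain strings stand in for ObjectId values.

```javascript
// Return the books whose author_id matches no author _id.
// Plain strings stand in for ObjectId values; this is not MongoDB code.
function findOrphanedBooks(books, authors) {
  const knownIds = new Set(authors.map((author) => author._id));
  return books.filter((book) => !knownIds.has(book.author_id));
}

// Scenario: the author "a1" was deleted, but the book was left behind.
const remainingAuthors = [{ _id: "a2", name: "George Orwell" }];
const storedBooks = [
  { title: "Pride and Prejudice", author_id: "a1" },
  { title: "1984", author_id: "a2" },
];

const orphans = findOrphanedBooks(storedBooks, remainingAuthors);
console.log(orphans.map((book) => book.title)); // [ 'Pride and Prejudice' ]
```

The database would accept this state without complaint, which is why the deletion order in the rest of this step matters.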

First, let's delete a book while keeping its author. We will remove the book "1984".

db.books.deleteOne({ title: "1984" });

Example output:

{ "acknowledged": true, "deletedCount": 1 }

If you now query the books collection, you will see only one book remains. The "George Orwell" document in the authors collection is unaffected.

Now, consider a more complex scenario: removing an author and all of their associated books. This requires a multi-step process to maintain data integrity.

First, find the author's ID and store it in a variable. We will remove "Jane Austen".

const authorId = db.authors.findOne({ name: "Jane Austen" })._id;

Next, use this ID to delete all books associated with that author. We use the deleteMany method rather than deleteOne because an author may have more than one book.

db.books.deleteMany({ author_id: authorId });

Finally, remove the author document itself.

db.authors.deleteOne({ _id: authorId });

This manual, multi-step process ensures that you do not leave "orphaned" book documents with invalid author_id references. You can verify that both the book "Pride and Prejudice" and the author "Jane Austen" have been removed from their respective collections.
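The three steps above can be wrapped in a single helper, sketched below. The name deleteAuthorAndBooks and its return shape are hypothetical; the sketch relies only on the findOne, deleteMany, and deleteOne methods already used in this step, and it adds a guard for the case where no author matches. The two deletes remain separate operations, so this sketch is not atomic.

```javascript
// Hypothetical helper: delete an author and every book referencing them.
// `db` is the database handle (in mongosh, the global `db`).
function deleteAuthorAndBooks(db, authorName) {
  const author = db.authors.findOne({ name: authorName });
  if (author === null) {
    // No such author: nothing to delete, and no _id read from null.
    return { booksDeleted: 0, authorsDeleted: 0 };
  }
  // Remove the referencing documents first so that no book is left
  // pointing at an author that no longer exists.
  const bookResult = db.books.deleteMany({ author_id: author._id });
  const authorResult = db.authors.deleteOne({ _id: author._id });
  return {
    booksDeleted: bookResult.deletedCount,
    authorsDeleted: authorResult.deletedCount,
  };
}
```

In mongosh you could call deleteAuthorAndBooks(db, "Jane Austen") in place of the three separate commands. Deleting the books before the author means that an interruption between the two deletes leaves an author without books rather than books without an author.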

Now you can exit the MongoDB shell.

exit

Summary

In this lab, you have learned the essential techniques for working with linked documents in MongoDB. You started by creating authors and books collections and establishing a one-to-many relationship using document references. You then practiced how to retrieve and combine data from these related collections using the $lookup aggregation stage. Furthermore, you learned how to update individual documents without breaking their links and, finally, how to properly manage deletions to maintain data integrity by removing related documents in a controlled, multi-step process. These skills form a strong foundation for designing and building more sophisticated and interconnected NoSQL database applications.