Export MongoDB Data

Introduction

In this lab, you will learn how to use the mongoexport command-line utility to export data from a MongoDB database. You will practice exporting data into two common formats: JSON and CSV. The lab will guide you through creating a sample dataset, exporting an entire collection, selecting specific fields for export, and using queries to filter the data you export. By the end of this lab, you will be proficient in extracting data from MongoDB for backups, analysis, or migration to other systems.

Exporting a Collection to JSON

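In this first step, you will learn the fundamental process of exporting a MongoDB collection to a JSON file. JSON (JavaScript Object Notation) is a standard text format that preserves the nested structure of MongoDB documents, which makes it well suited to data exchange and migrations. For full production backups, MongoDB's documentation recommends mongodump instead, because a JSON export does not reliably preserve every BSON data type.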

First, you need to connect to the MongoDB server and create some sample data. Open the MongoDB Shell by running the following command in your terminal:

mongosh

Once you are inside the MongoDB Shell, you will see a new prompt. Now, create and switch to a new database named exportlab, and insert three documents into a users collection.

use exportlab
db.users.insertMany([
  { name: "Alice", age: 28, email: "alice@example.com", status: "active" },
  { name: "Bob", age: 35, email: "bob@example.com", status: "active" },
  { name: "Charlie", age: 42, email: "charlie@example.com", status: "inactive" }
]);

After the documents are inserted, you will see a confirmation message. Now, exit the MongoDB Shell to return to your regular terminal.

exit;

With the data in place, you can now use the mongoexport utility to export the users collection. This command specifies the database, the collection, and the output file.

mongoexport --db=exportlab --collection=users --out=$HOME/project/users.json
  • --db: Specifies the database to connect to (exportlab).
  • --collection: Specifies the collection to export (users).
  • --out: Specifies the path and filename for the output file ($HOME/project/users.json).

To confirm the export was successful, view the contents of the newly created JSON file.

cat ~/project/users.json

You will see the three documents you inserted, each on its own line. This format is called JSONL (JSON Lines): every line contains a separate, complete JSON object. MongoDB added a unique _id field to each document when it was inserted, and mongoexport writes that ObjectId in Extended JSON form as {"$oid": "..."}.

{"_id":{"$oid":"656f1a6b..."},"name":"Alice","age":28,"email":"alice@example.com","status":"active"}
{"_id":{"$oid":"656f1a6b..."},"name":"Bob","age":35,"email":"bob@example.com","status":"active"}
{"_id":{"$oid":"656f1a6b..."},"name":"Charlie","age":42,"email":"charlie@example.com","status":"inactive"}
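Because each line is a complete JSON object, the exported file can be read back one line at a time. The following is a minimal sketch using only Python's standard json module. It parses an inline copy of the output above rather than the real file, the "$oid" values stay elided exactly as shown in the lab, and parse_json_lines is a hypothetical helper name.

```python
import json

# Inline sample in the same shape as users.json. The "$oid" values are kept
# elided as in the lab output; a real export contains full ObjectId strings.
sample_export = """\
{"_id":{"$oid":"656f1a6b..."},"name":"Alice","age":28,"email":"alice@example.com","status":"active"}
{"_id":{"$oid":"656f1a6b..."},"name":"Bob","age":35,"email":"bob@example.com","status":"active"}
{"_id":{"$oid":"656f1a6b..."},"name":"Charlie","age":42,"email":"charlie@example.com","status":"inactive"}
"""


def parse_json_lines(text):
    """Parse one JSON document per non-empty line (hypothetical helper)."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


users = parse_json_lines(sample_export)
names = [user["name"] for user in users]

print(names)            # ['Alice', 'Bob', 'Charlie']
print(users[0]["_id"])  # the ObjectId arrives as a nested {"$oid": ...} object
```

To read the real export instead, you could pass the contents of ~/project/users.json to parse_json_lines.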

Exporting a Collection to CSV

While JSON is excellent for preserving data structure, CSV (Comma-Separated Values) is often more convenient for use in spreadsheets or for simple data exchange. In this step, you will export the same users collection to a CSV file.

When exporting to CSV, you must specify which fields to include, using either --fields or --fieldFile. CSV is a flat, tabular format with no direct way to represent nested documents, so mongoexport needs an explicit list of columns.

Use the mongoexport command again, but this time add the --type=csv and --fields options. We will export the name, age, and email fields.

mongoexport --db=exportlab --collection=users --type=csv --fields=name,age,email --out=$HOME/project/users.csv
  • --type=csv: This flag tells mongoexport to output in CSV format.
  • --fields: A comma-separated list of fields to include in the export. The order you list them here determines the column order in the CSV file.

Now, inspect the contents of the users.csv file.

cat ~/project/users.csv

The output is standard CSV: the field names form the header row, followed by one row per document.

name,age,email
Alice,28,alice@example.com
Bob,35,bob@example.com
Charlie,42,charlie@example.com
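One consequence of the flat format is that CSV carries no type information: a reader gets every value back as text. The sketch below, which assumes only Python's standard csv module and an inline copy of the output above, shows the age column arriving as strings that must be converted explicitly.

```python
import csv
import io

# Inline copy of the users.csv content shown above.
sample_csv = """\
name,age,email
Alice,28,alice@example.com
Bob,35,bob@example.com
Charlie,42,charlie@example.com
"""

# DictReader uses the header row as keys, in the order given to --fields.
rows = list(csv.DictReader(io.StringIO(sample_csv)))

# Every value is a string, so numeric columns need an explicit conversion.
ages = [int(row["age"]) for row in rows]

print(rows[0]["age"], type(rows[0]["age"]).__name__)  # 28 str
print(ages)                                           # [28, 35, 42]
```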

You have now successfully exported the same data into two different formats.

Filtering Data with a Query

Often, you do not need to export an entire collection. mongoexport allows you to use a query to filter which documents are exported. This is useful for extracting specific subsets of your data.

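In this step, you will export only the users who have a status of "active". The --query option accepts a JSON document that specifies the filter criteria, much like the filter passed to find() in the MongoDB Shell. Unlike the shell, mongoexport expects Extended JSON, so field names and operators must be enclosed in double quotes.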

Run the following command to export only the active users to a new JSON file named active_users.json.

mongoexport --db=exportlab --collection=users --query='{"status": "active"}' --out=$HOME/project/active_users.json
  • --query='{"status": "active"}': This option filters the documents, exporting only those where the status field is equal to "active". Note the use of single quotes around the JSON string to prevent shell interpretation issues.

Let's verify the contents of the exported file.

cat ~/project/active_users.json

The output should only contain the documents for Alice and Bob, as Charlie's status is "inactive".

{"_id":{"$oid":"656f1a6b..."},"name":"Alice","age":28,"email":"alice@example.com","status":"active"}
{"_id":{"$oid":"656f1a6b..."},"name":"Bob","age":35,"email":"bob@example.com","status":"active"}

This filtering capability is powerful for creating targeted data exports without needing to manipulate the data after it has been exported.
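The single quotes around the --query value exist only to get the JSON past the shell intact. When an export is launched from a script, one way to avoid hand-quoting is to build the argument list in Python and let json.dumps produce the query text. The sketch below assumes that approach; build_export_command and the bare output file name are hypothetical choices for illustration, and nothing is run against a server.

```python
import json
import shlex


def build_export_command(db, collection, query, out_path):
    """Return a mongoexport argument list (hypothetical helper)."""
    return [
        "mongoexport",
        f"--db={db}",
        f"--collection={collection}",
        # json.dumps always double-quotes keys and strings, as --query requires.
        f"--query={json.dumps(query)}",
        f"--out={out_path}",
    ]


command = build_export_command(
    "exportlab", "users", {"status": "active"}, "active_users.json"
)

# shlex.join shows the equivalent shell line, with quoting added where needed.
shell_line = shlex.join(command)
print(shell_line)
```

Passing command to subprocess.run(command, check=True) would start the export without involving a shell, so no extra quoting is required.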

Formatting and Limiting Output

mongoexport provides additional options to control the format and amount of data you export. In this step, you will learn how to create a more human-readable "pretty" JSON output and how to limit the number of documents in your export.

First, let's export the users collection again, but this time using the --pretty flag. This will format the JSON output with indentation and line breaks, making it much easier to read.

mongoexport --db=exportlab --collection=users --pretty --out=$HOME/project/users_pretty.json
  • --pretty: Formats the output JSON to be human-readable.

View the formatted file to see the difference.

cat ~/project/users_pretty.json

Each document is now spread over several indented lines. Without the --jsonArray option, the documents are written one after another rather than wrapped in a single array:

{
  "_id": {
    "$oid": "656f1a6b..."
  },
  "name": "Alice",
  "age": 28,
  "email": "alice@example.com",
  "status": "active"
}
{
  "_id": {
    "$oid": "656f1a6b..."
  },
  "name": "Bob",
  "age": 35,
  "email": "bob@example.com",
  "status": "active"
}
{
  "_id": {
    "$oid": "656f1a6b..."
  },
  "name": "Charlie",
  "age": 42,
  "email": "charlie@example.com",
  "status": "inactive"
}
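A pretty-printed export can no longer be read one line at a time, because each document spans several lines. The sketch below, assuming only Python's standard json module, reads either layout you may encounter: a run of documents written one after another, or a single JSON array (which is what mongoexport produces with --jsonArray). parse_json_documents is a hypothetical helper name, and the sample is a shortened, inline stand-in for the real file.

```python
import json


def parse_json_documents(text):
    """Parse a top-level JSON array or a run of concatenated JSON documents."""
    decoder = json.JSONDecoder()
    documents = []
    position = 0
    while True:
        # raw_decode does not skip leading whitespace, so do it here.
        while position < len(text) and text[position].isspace():
            position += 1
        if position >= len(text):
            break
        value, position = decoder.raw_decode(text, position)
        if isinstance(value, list):
            documents.extend(value)   # a single array of documents
        else:
            documents.append(value)   # one document in a sequence
    return documents


# Shortened sample: two pretty-printed documents, one after another.
pretty_sample = """\
{
  "_id": {
    "$oid": "656f1a6b..."
  },
  "name": "Alice",
  "age": 28
}
{
  "_id": {
    "$oid": "656f1a6b..."
  },
  "name": "Bob",
  "age": 35
}
"""

pretty_names = [doc["name"] for doc in parse_json_documents(pretty_sample)]
print(pretty_names)  # ['Alice', 'Bob']
```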

Next, you will use the --limit option to export only a specific number of documents. This is useful for creating small sample files or for testing. Let's export only the first two documents to a CSV file.

mongoexport --db=exportlab --collection=users --type=csv --fields=name,status --limit=2 --out=$HOME/project/users_limited.csv
  • --limit=2: Restricts the export to a maximum of 2 documents.

Check the contents of the limited CSV file.

cat ~/project/users_limited.csv

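As expected, the file contains the header row and only two user records. Because no --sort option was given, the documents come back in the collection's natural order, which for this small, freshly created collection typically matches the insertion order.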

name,status
Alice,active
Bob,active

Verifying Exported Files

In this final step, you will practice using common Linux command-line tools to inspect and verify the files you have created. This is a crucial skill for confirming the integrity of your data exports.

First, list all the files in your project directory to see everything you have created. The -l flag prints a detailed listing, and -h shows file sizes in human-readable units.

ls -lh ~/project/

You should see all the .json and .csv files from the previous steps.

total 20K
-rw-rw-r-- 1 labex labex 224 Aug 27 15:48  active_users.json
-rw-rw-r-- 1 labex labex  96 Aug 27 15:48  users.csv
-rw-rw-r-- 1 labex labex 344 Aug 27 15:36  users.json
-rw-rw-r-- 1 labex labex  36 Aug 27 15:48  users_limited.csv
-rw-rw-r-- 1 labex labex 410 Aug 27 15:48  users_pretty.json

Next, use the wc -l command to count the number of lines in your files. This is a quick way to check the number of exported documents.

wc -l ~/project/*.json ~/project/*.csv

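For JSON files where each document is on one line, the line count equals the document count. The exception is users_pretty.json: each pretty-printed document spans several lines, so its 27 lines still represent only three documents. For CSV files, the line count is the number of data rows plus one for the header.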

  2 /home/labex/project/active_users.json
  3 /home/labex/project/users.json
 27 /home/labex/project/users_pretty.json
  4 /home/labex/project/users.csv
  3 /home/labex/project/users_limited.csv
 39 total
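Counting lines works for this dataset, but it can overcount CSV rows when a quoted field contains a line break. As a hedged illustration, the sketch below counts data rows with Python's standard csv module instead. count_csv_data_rows is a hypothetical helper, and the second sample, with its note column, is invented purely to show the difference.

```python
import csv
import io


def count_csv_data_rows(text):
    """Count CSV records after the header row (hypothetical helper)."""
    rows = list(csv.reader(io.StringIO(text, newline="")))
    return max(len(rows) - 1, 0)


# Same content as users_limited.csv above.
plain = "name,status\nAlice,active\nBob,active\n"

# Invented sample: the quoted note field contains a line break.
multiline = 'name,note\nAlice,"line one\nline two"\nBob,ok\n'

print(count_csv_data_rows(plain))      # 2 data rows, 3 lines in the file
print(count_csv_data_rows(multiline))  # 2 data rows, although the text has 4 lines
```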

Finally, you can validate the syntax of JSON files. Note that mongoexport creates JSONL (JSON Lines) format by default, where each document is a separate JSON object on its own line. To validate this format, you can check each line individually:

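# Track failures in a flag: a while loop's exit status only reflects its last iteration
valid=true
while IFS= read -r line; do
  echo "$line" | python3 -m json.tool > /dev/null || valid=false
done < ~/project/users.json
$valid && echo "All JSON lines are valid"

If the message is printed and no errors appear, every line in users.json is valid JSON. These verification techniques help ensure your data exports are complete and correct.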
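When a check fails, it helps to know which line is at fault. The following sketch is a Python alternative to the shell loop, assuming only the standard json module: it returns the numbers of the lines that do not parse. find_invalid_lines is a hypothetical helper, and both samples are inline stand-ins for an exported file.

```python
import json


def find_invalid_lines(text):
    """Return the 1-based numbers of non-empty lines that are not valid JSON."""
    invalid = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            json.loads(line)
        except json.JSONDecodeError:
            invalid.append(number)
    return invalid


good = '{"name": "Alice"}\n{"name": "Bob"}\n'
# The second line is missing its closing brace.
bad = '{"name": "Alice"}\n{"name": "Bob"\n{"name": "Charlie"}\n'

print(find_invalid_lines(good))  # []
print(find_invalid_lines(bad))   # [2]
```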

Summary

In this lab, you have learned the essential functions of the mongoexport utility. You started by creating a sample dataset and performing a basic export to a JSON file. You then exported the same data to a CSV file, learning how to specify fields for a tabular format with --fields. You also practiced using the --query option to filter data and export only a specific subset of documents. Finally, you explored formatting options like --pretty for human-readable JSON and --limit to control the number of exported records. Through these exercises, you have gained practical skills for extracting data from MongoDB for various purposes.