JSON Data Processing with jq

Introduction

Welcome to the Linux jq Programming Lab! In this lab, you'll discover how to use jq, a lightweight and versatile command-line JSON processor. Think of jq as sed, but specifically designed for JSON data. It empowers you to effortlessly slice, filter, map, and transform structured data. This lab is structured to guide you from basic to advanced jq usage through practical examples that you can apply in real-world scenarios, like processing JSON data from APIs or configuration files.

Imagine you're planning a trip to China and using a travel app that provides details about various attractions, including their locations, opening hours, and reviews. The app's backend stores this data in JSON format. Your task is to extract specific pieces of information to plan your trip effectively. This lab will demonstrate how to use jq to query and manipulate this JSON data, allowing you to quickly identify the perfect attractions to visit.

Basic JSON Querying

Let's start by learning how to extract simple data from a JSON object.

You should now have the data.txt file in your /home/labex/project/ directory. It contains JSON data representing a list of attractions. The file content looks like this:

[
  {
    "name": "The Great Wall of China",
    "location": "Shanxi Province",
    "opening_hours": "24 hours"
  },
  {
    "name": "Terracotta Warriors",
    "location": "XiAn",
    "opening_hours": "9:00 AM -  5:00 PM"
  }
]
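
Before querying the file, it can help to confirm that it parses as JSON and to see how many entries it holds. The following is a minimal sketch, not part of the lab steps. The data_file variable and the temporary copy are assumptions added so the snippet runs anywhere; in the lab you would point jq at ~/project/data.txt instead.

```shell
# Temporary copy of the lab data (an assumption for this sketch);
# in the lab, use ~/project/data.txt instead.
data_file="$(mktemp)"
cat > "$data_file" <<'EOF'
[
  {
    "name": "The Great Wall of China",
    "location": "Shanxi Province",
    "opening_hours": "24 hours"
  },
  {
    "name": "Terracotta Warriors",
    "location": "XiAn",
    "opening_hours": "9:00 AM -  5:00 PM"
  }
]
EOF

# "jq empty" parses the input and prints nothing;
# it exits with a non-zero status if the file is not valid JSON.
if jq empty "$data_file"; then
  echo "valid JSON"
fi

# "length" on an array returns the number of elements.
count="$(jq 'length' "$data_file")"
echo "$count attractions"
```

With the two-entry file shown above, the count is 2.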

Our goal here is to extract the names of all the attractions listed in this JSON data.

To achieve this, use the following command:

cat ~/project/data.txt | jq '.[] | .name'

This command will produce the following output:

"The Great Wall of China"
"Terracotta Warriors"

Let's break down what's happening in this command. cat ~/project/data.txt simply reads the contents of the data.txt file. The | symbol, known as a pipe, takes the output of the cat command and feeds it as input to the jq command. The core of the extraction logic is in jq '.[] | .name'. Here's how jq processes this:

  • .[] tells jq to iterate through each element (in this case, each attraction object) in the JSON array.
  • | inside the quotes is jq's own pipe operator, not the shell pipe. It works in a similar way: it passes each result of the iteration to the next operation, in this case .name.
  • .name extracts the value associated with the "name" key from each of the attraction objects.

In essence, the command steps through each attraction, picks out its name, and displays it.
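
As a variation on the command above, jq can read the file itself, so the cat and shell pipe are optional, and the -r (raw output) option prints strings without the surrounding JSON quotes. The sketch below assumes a temporary copy of the lab data (the data_file variable is hypothetical) so that it runs anywhere; in the lab, substitute ~/project/data.txt.

```shell
# Temporary copy of the lab data (an assumption for this sketch);
# in the lab, use ~/project/data.txt instead.
data_file="$(mktemp)"
cat > "$data_file" <<'EOF'
[
  {
    "name": "The Great Wall of China",
    "location": "Shanxi Province",
    "opening_hours": "24 hours"
  },
  {
    "name": "Terracotta Warriors",
    "location": "XiAn",
    "opening_hours": "9:00 AM -  5:00 PM"
  }
]
EOF

# jq takes the file as an argument; .[].name is shorthand for .[] | .name.
# -r prints raw strings instead of JSON-quoted strings.
names="$(jq -r '.[].name' "$data_file")"
printf '%s\n' "$names"
```

Raw output is usually the more convenient form when the names are passed on to other shell tools.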

Filtering JSON Data

Let's move on to filtering JSON data based on specific criteria.

Our objective is to find only those attractions that are open 24 hours.

Use the following command to accomplish this:

cat ~/project/data.txt | jq '.[] | select(.opening_hours == "24 hours") | .name'

Executing this command will output:

"The Great Wall of China"

Here's how the filtering works: The command begins with cat ~/project/data.txt | jq '.[]', which, as before, reads the file and iterates over each attraction. The key part is the addition of select(.opening_hours == "24 hours"):

  • select() is a jq function that filters elements based on a condition you specify: it passes its input through unchanged when the condition is true and produces no output otherwise.
  • The condition .opening_hours == "24 hours" checks whether the value of the opening_hours field is exactly equal to the string "24 hours". Only attractions matching this condition will be passed on to the next stage.
  • The final part, | .name, simply extracts the name of each attraction that passed the filter.

In this case, only "The Great Wall of China" meets the condition, so it's the only name that's extracted and displayed.
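
If the value to match comes from a shell variable, one option is jq's --arg option, which binds a string to a jq variable so that it does not have to be spliced into the filter text. This is a sketch under assumptions: the hours and data_file variables are hypothetical names, and the temporary file stands in for ~/project/data.txt.

```shell
# Temporary copy of the lab data (an assumption for this sketch);
# in the lab, use ~/project/data.txt instead.
data_file="$(mktemp)"
cat > "$data_file" <<'EOF'
[
  {
    "name": "The Great Wall of China",
    "location": "Shanxi Province",
    "opening_hours": "24 hours"
  },
  {
    "name": "Terracotta Warriors",
    "location": "XiAn",
    "opening_hours": "9:00 AM -  5:00 PM"
  }
]
EOF

# The value to filter on, held in an ordinary shell variable.
hours="24 hours"

# --arg hours "$hours" makes the value available inside the filter as $hours.
# The single quotes stop the shell from expanding $hours itself.
open_names="$(jq -r --arg hours "$hours" \
  '.[] | select(.opening_hours == $hours) | .name' "$data_file")"
printf '%s\n' "$open_names"
```

Keeping the value outside the filter avoids quoting problems when it contains spaces or quote characters.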

Transforming JSON Data

Now, let's explore how to transform JSON data into a different, more useful format.

Our goal here is to make the opening hours more readable. Specifically, if an attraction is open 24 hours, we want to display "Open 24 hours"; otherwise, we will add the prefix "Open " to the existing opening hours text.

Use the following command to achieve this:

cat ~/project/data.txt | jq '.[] | {name: .name, location: .location, opening_hours: (.opening_hours | if . == "24 hours" then "Open 24 hours" else "Open \(.)" end)}'

This command produces the following output:

{
  "name": "The Great Wall of China",
  "location": "Shanxi Province",
  "opening_hours": "Open 24 hours"
}
{
  "name": "Terracotta Warriors",
  "location": "XiAn",
  "opening_hours": "Open 9:00 AM -  5:00 PM"
}

Let's understand the transformation: As before, cat ~/project/data.txt | jq '.[]' gets us started by reading the file and iterating through each attraction in the array. The core of this transformation is in the object construction and the if-then-else expression:

  • {name: .name, location: .location, opening_hours: ...} creates a new JSON object, pulling data from the original object. It includes the name and location from the original object directly. The value for the opening_hours field, however, is more complex.
  • (.opening_hours | if . == "24 hours" then "Open 24 hours" else "Open \(.)" end) takes the value of the original opening_hours and processes it:
    • .opening_hours selects the original opening hours value.
    • The if . == "24 hours" then "Open 24 hours" else "Open \(.)" end expression checks whether the original opening_hours is exactly equal to "24 hours". If it is, the result is "Open 24 hours". If not, "Open " is added as a prefix to the existing value. The \(.) syntax is jq's string interpolation: it embeds the current value (.) inside the string.

In essence, this command transforms the data by creating a new object for each attraction and adjusting the opening_hours value to be more readable for the user.
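
The command above prints a stream of separate objects. If a single JSON array is needed instead, for example to save the result as a new file, one alternative is to combine map() with the update-assignment operator |=, which rewrites one field and leaves the others untouched. This is a sketch of that variation rather than the lab's own method; data_file and the temporary copy are assumptions standing in for ~/project/data.txt.

```shell
# Temporary copy of the lab data (an assumption for this sketch);
# in the lab, use ~/project/data.txt instead.
data_file="$(mktemp)"
cat > "$data_file" <<'EOF'
[
  {
    "name": "The Great Wall of China",
    "location": "Shanxi Province",
    "opening_hours": "24 hours"
  },
  {
    "name": "Terracotta Warriors",
    "location": "XiAn",
    "opening_hours": "9:00 AM -  5:00 PM"
  }
]
EOF

# map(f) applies f to every element and returns a new array.
# .opening_hours |= (...) replaces only that field; inside the parentheses,
# "." refers to the field's current value.
transformed="$(jq 'map(.opening_hours |= (
  if . == "24 hours" then "Open 24 hours" else "Open \(.)" end
))' "$data_file")"
printf '%s\n' "$transformed"
```

Because name and location are never mentioned, any extra fields in the data would be carried over as well, which the explicit object construction would not do.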

Summary

Congratulations! You've successfully completed the Linux jq Programming Lab. You've learned how to query, filter, and transform JSON data using jq, a powerful tool for working with structured data directly from the command line. Whether you’re processing data from APIs, configuration files, or any other JSON source, jq empowers you to extract, filter, and manipulate the data you need with great efficiency and clarity.

Remember, consistent practice is essential for mastering jq and other command-line tools. Feel free to experiment with your own JSON data, trying different queries and transformations. Happy coding!