Introduction
In this lab, you will explore advanced operations using PostgreSQL's JSONB data type. You will learn how to effectively store, manage, and query JSON data within a relational database.
The lab will guide you through creating a table with a JSONB column, inserting data, and then applying a GIN index to improve search performance. You will also practice querying nested JSON structures, updating specific elements within a JSONB document, and performing aggregate calculations on your JSON data.
Create a Table and Index JSONB Fields
In this step, you will create a table to store product information using the JSONB data type and then create a GIN index to optimize queries on this data. JSONB stores JSON data in a decomposed binary format, which is slightly slower to input but much faster to process. A GIN (Generalized Inverted Index) is ideal for indexing composite values like those in a JSONB column.
First, open a terminal and connect to the PostgreSQL database using the psql interactive shell:
sudo -u postgres psql
You will now see the PostgreSQL prompt, which looks like postgres=#.
Next, create a table named products with an id column and a data column of type JSONB.
CREATE TABLE products (
id SERIAL PRIMARY KEY,
data JSONB
);
Now, insert some sample product data into the products table.
INSERT INTO products (data) VALUES
('{"name": "Laptop", "price": 1200, "tags": ["electronics", "computer"]}'),
('{"name": "Keyboard", "price": 75, "tags": ["electronics", "accessory"]}'),
('{"name": "Mouse", "price": 30, "tags": ["electronics", "accessory"]}'),
('{"name": "Monitor", "price": 300, "tags": ["electronics", "display"]}');
To significantly speed up searches within the data column, create a GIN index on it.
CREATE INDEX idx_products_data ON products USING GIN (data);
You can verify that the index was created successfully by using the \di command, which lists all indexes.
\di
You should see idx_products_data in the list of relations, similar to the output below:
List of relations
Schema | Name | Type | Owner | Table | Size | Description
--------+---------------------+-------+----------+-----------+------------+-------------
public | idx_products_data | index | postgres | products | 16 kB |
public | products_pkey | index | postgres | products | 16 kB |
(2 rows)
You have successfully set up your table and indexed the JSONB column. You will remain in the psql shell for the next step.
Query Nested JSON Structures
In this step, you will learn how to query data within JSONB columns, including nested structures. PostgreSQL provides several operators for this purpose.
First, let's insert some products with more complex, nested data into the products table.
INSERT INTO products (data) VALUES
('{"name": "Gaming PC", "price": 1500, "specs": {"cpu": "Intel i7", "ram": "32GB", "storage": "1TB SSD"}}'),
('{"name": "Office PC", "price": 800, "specs": {"cpu": "Intel i5", "ram": "16GB", "storage": "500GB SSD"}}');
To access a top-level key, you can use the ->> operator, which returns the value as text. Let's find the price of the "Laptop".
SELECT data ->> 'price' FROM products WHERE data ->> 'name' = 'Laptop';
The output will be:
price
-------
1200
(1 row)
To access a key within a nested JSON object, you can chain the -> and ->> operators. The -> operator gets a JSON object field, while ->> gets it as text. Let's retrieve the CPU specification for the "Gaming PC".
SELECT data -> 'specs' ->> 'cpu' FROM products WHERE data ->> 'name' = 'Gaming PC';
This command will return the CPU type:
?column?
--------------
Intel i7
(1 row)
You can also use these operators in the WHERE clause to filter results based on nested values. For example, find all products that have an "Intel i5" CPU.
SELECT data ->> 'name' FROM products WHERE data -> 'specs' ->> 'cpu' = 'Intel i5';
The query will return the name of the matching product:
name
-----------
Office PC
(1 row)
You are now able to query both top-level and nested data within your JSONB columns.
Update Specific JSONB Elements
In this step, you will learn how to modify specific elements within a JSONB column using the jsonb_set function. This is more efficient than retrieving the entire JSON object, modifying it in your application, and writing it back.
The jsonb_set function has the following syntax: jsonb_set(target_jsonb, path_array, new_value_jsonb, create_if_missing).
Let's update the price of the "Laptop" from 1200 to 1250. The path to the price is '{price}', and the new value must be cast to JSONB.
UPDATE products
SET data = jsonb_set(data, '{price}', '1250'::jsonb)
WHERE data ->> 'name' = 'Laptop';
Verify the update by selecting the laptop's data.
SELECT data FROM products WHERE data ->> 'name' = 'Laptop';
The output should show the new price:
data
--------------------------------------------------------------------------
{"name": "Laptop", "price": 1250, "tags": ["electronics", "computer"]}
(1 row)
You can also update nested values. Let's upgrade the RAM of the "Office PC" to "32GB". The path to the nested ram key is '{specs,ram}'.
UPDATE products
SET data = jsonb_set(data, '{specs,ram}', '"32GB"'::jsonb)
WHERE data ->> 'name' = 'Office PC';
Now, verify this nested update.
SELECT data -> 'specs' FROM products WHERE data ->> 'name' = 'Office PC';
The result will confirm the change to the ram value:
?column?
------------------------------------------------------------
{"cpu": "Intel i5", "ram": "32GB", "storage": "500GB SSD"}
(1 row)
You have now learned how to perform targeted updates on JSONB data.
Aggregate JSON Data
In this final step, you will perform aggregate calculations on data extracted from JSONB columns. This is useful for reporting and analysis.
To perform calculations, you often need to extract a value and cast it to a numeric type. Let's calculate the average price of all products.
SELECT AVG((data ->> 'price')::numeric) FROM products;
The query extracts the price as text, casts it to numeric, and then calculates the average. The output will be a single number:
avg
------------------------
659.1666666666666667
(1 row)
You can also use aggregate functions with a GROUP BY clause. Let's find the total number of products for each tag. To do this, we first need to unnest the tags array into separate rows using jsonb_array_elements_text.
SELECT tag, COUNT(*)
FROM products, jsonb_array_elements_text(data -> 'tags') AS tag
GROUP BY tag;
This query produces a count for each unique tag:
tag | count
-------------+-------
accessory | 2
electronics | 4
computer | 1
display | 1
(4 rows)
Finally, let's find the total value of all products that have the "electronics" tag.
SELECT SUM((data ->> 'price')::numeric)
FROM products
WHERE data -> 'tags' @> '"electronics"'::jsonb;
The @> operator checks if the left JSONB value contains the right JSONB value. Here, we check if the tags array contains the element "electronics".
The result is the sum of the prices for the four electronic products:
sum
--------
1655
(1 row)
You have now learned how to use aggregate functions on your JSONB data. To exit the psql shell, type:
\q
Summary
In this lab, you have learned several advanced operations for handling JSONB data in PostgreSQL. You started by creating a table with a JSONB column and saw the importance of using a GIN index to optimize query performance. You then practiced querying both top-level and nested JSONB data using the -> and ->> operators. Additionally, you learned how to perform targeted modifications on JSONB documents with the jsonb_set function and how to perform powerful data analysis using aggregate functions like AVG, COUNT, and SUM on your JSON data.


