PostgreSQL Relationships and Joins

PostgreSQLPostgreSQLBeginner
Practice Now

Introduction

In this lab, you will explore relationships and joins in PostgreSQL. You'll learn how to create tables with foreign key constraints to ensure data integrity.

You will start by creating two tables, customers and orders, and establish a foreign key relationship between them. Then, you'll insert data into these tables, making sure the data adheres to the defined relationship. Finally, you'll learn how to retrieve data using INNER JOIN and compare the results of LEFT, RIGHT, and FULL OUTER JOIN operations to understand how they handle different data relationships.


Skills Graph

%%%%{init: {'theme':'neutral'}}%%%% flowchart RL postgresql(("PostgreSQL")) -.-> postgresql/PostgreSQLGroup(["PostgreSQL"]) postgresql/PostgreSQLGroup -.-> postgresql/table_init("Create Basic Table") postgresql/PostgreSQLGroup -.-> postgresql/col_int("Add Integer Column") postgresql/PostgreSQLGroup -.-> postgresql/col_text("Add Text Column") postgresql/PostgreSQLGroup -.-> postgresql/col_date("Add Date Column") postgresql/PostgreSQLGroup -.-> postgresql/key_primary("Set Primary Key") postgresql/PostgreSQLGroup -.-> postgresql/row_add("Insert One Row") postgresql/PostgreSQLGroup -.-> postgresql/data_all("Select All Data") postgresql/PostgreSQLGroup -.-> postgresql/data_where("Filter With WHERE") postgresql/PostgreSQLGroup -.-> postgresql/func_call("Call Stored Function") subgraph Lab Skills postgresql/table_init -.-> lab-550959{{"PostgreSQL Relationships and Joins"}} postgresql/col_int -.-> lab-550959{{"PostgreSQL Relationships and Joins"}} postgresql/col_text -.-> lab-550959{{"PostgreSQL Relationships and Joins"}} postgresql/col_date -.-> lab-550959{{"PostgreSQL Relationships and Joins"}} postgresql/key_primary -.-> lab-550959{{"PostgreSQL Relationships and Joins"}} postgresql/row_add -.-> lab-550959{{"PostgreSQL Relationships and Joins"}} postgresql/data_all -.-> lab-550959{{"PostgreSQL Relationships and Joins"}} postgresql/data_where -.-> lab-550959{{"PostgreSQL Relationships and Joins"}} postgresql/func_call -.-> lab-550959{{"PostgreSQL Relationships and Joins"}} end

Create Tables with Foreign Key Constraints

In this step, you will create two tables, customers and orders, and establish a foreign key constraint between them. This constraint ensures that the relationship between the tables is maintained, preventing invalid data from being entered.

Understanding Foreign Keys

A foreign key is a column in one table that references the primary key of another table. It establishes a link between the two tables. The table containing the foreign key is called the "child" table, and the table containing the primary key is called the "parent" table.

Step 1: Connect to PostgreSQL

Open a terminal in your LabEx VM. Connect to the PostgreSQL database using the psql command:

sudo -u postgres psql

You should now see the PostgreSQL prompt (postgres=#).

Step 2: Create the customers Table

Create the customers table with the following columns:

  • customer_id: A unique identifier for each customer (primary key).
  • first_name: The customer's first name.
  • last_name: The customer's last name.
  • email: The customer's email address (must be unique).

Execute the following SQL command in the psql shell:

CREATE TABLE customers (
    customer_id SERIAL PRIMARY KEY,
    first_name VARCHAR(50) NOT NULL,
    last_name VARCHAR(50) NOT NULL,
    email VARCHAR(100) UNIQUE
);

This command creates the customers table. The SERIAL keyword automatically generates a sequence of numbers for the customer_id, making it auto-incrementing. PRIMARY KEY designates customer_id as the primary key. NOT NULL ensures that the first_name and last_name columns cannot be empty, and UNIQUE ensures that each email address is unique.

Step 3: Create the orders Table with a Foreign Key

Create the orders table with the following columns:

  • order_id: A unique identifier for each order (primary key).
  • customer_id: The ID of the customer who placed the order (foreign key referencing customers).
  • order_date: The date the order was placed.
  • total_amount: The total amount of the order.

Execute the following SQL command in the psql shell:

CREATE TABLE orders (
    order_id SERIAL PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(customer_id),
    order_date DATE NOT NULL,
    total_amount NUMERIC(10, 2) NOT NULL
);

This command creates the orders table. The customer_id column is defined as a foreign key using the REFERENCES keyword. REFERENCES customers(customer_id) specifies that the customer_id column in the orders table references the customer_id column in the customers table. This establishes the foreign key relationship.

Step 4: Verify Table Creation

Verify that the tables have been created successfully by listing the tables in the database. Execute the following command in the psql shell:

\dt

You should see the customers and orders tables listed.

Step 5: Describe the Tables

To see the structure of the tables, use the \d command followed by the table name. For example, to describe the customers table, execute:

\d customers

This will show you the columns, data types, and constraints defined for the customers table. Similarly, you can describe the orders table:

\d orders

This will show you the foreign key constraint on the customer_id column.

Illustration of creating tables with FK

You have now successfully created two tables with a foreign key constraint.

Insert Data and Enforce Referential Integrity

In this step, you will insert data into the customers and orders tables, ensuring that referential integrity is maintained. This means that you cannot add an order for a customer that doesn't exist in the customers table.

Understanding Referential Integrity

Referential integrity ensures that relationships between tables remain consistent. In our case, it means that the customer_id in the orders table must exist in the customers table.

Step 1: Insert Data into the customers Table

Insert data into the customers table using the following SQL command in the psql shell:

INSERT INTO customers (first_name, last_name, email) VALUES
('John', 'Doe', '[email protected]'),
('Jane', 'Smith', '[email protected]'),
('David', 'Lee', '[email protected]');

This command adds three customers to the customers table. The customer_id is automatically generated.

Step 2: Verify Data Insertion into customers Table

Verify that the data has been inserted correctly by querying the customers table:

SELECT * FROM customers;

You should see the three customers you just inserted, along with their automatically generated customer_id values. The output should look similar to this:

 customer_id | first_name | last_name |         email
-------------+------------+-----------+------------------------
           1 | John       | Doe       | [email protected]
           2 | Jane       | Smith     | [email protected]
           3 | David      | Lee       | [email protected]
(3 rows)

Step 3: Insert Data into the orders Table

Insert data into the orders table, referencing the customer_id values from the customers table:

INSERT INTO orders (customer_id, order_date, total_amount) VALUES
(1, '2023-11-01', 100.00),
(2, '2023-11-05', 250.50),
(1, '2023-11-10', 75.25),
(3, '2023-11-15', 120.00);

This command adds four orders to the orders table. Each order is associated with a customer_id from the customers table.

Step 4: Verify Data Insertion into orders Table

Verify that the data has been inserted correctly by querying the orders table:

SELECT * FROM orders;

You should see the four orders you just inserted. The output should look similar to this:

 order_id | customer_id | order_date | total_amount
----------+-------------+------------+--------------
        1 |           1 | 2023-11-01 |       100.00
        2 |           2 | 2023-11-05 |       250.50
        3 |           1 | 2023-11-10 |        75.25
        4 |           3 | 2023-11-15 |       120.00
(4 rows)

Step 5: Attempt to Insert Invalid Data (Demonstrating Referential Integrity)

To demonstrate the referential integrity constraint, try to insert an order with a customer_id that does not exist in the customers table:

INSERT INTO orders (customer_id, order_date, total_amount) VALUES
(4, '2023-11-20', 50.00);

You should see an error message similar to this:

ERROR:  insert or update on table "orders" violates foreign key constraint "orders_customer_id_fkey"
DETAIL:  Key (customer_id)=(4) is not present in table "customers".

This error message confirms that the foreign key constraint is working. The database prevents the insertion of the order because customer_id 4 does not exist in the customers table.

Illustration of data insertion process

You have now successfully inserted data into the customers and orders tables, ensuring referential integrity.

Query Data Using INNER JOIN

In this step, you will learn how to retrieve data from multiple tables using the INNER JOIN clause in PostgreSQL. INNER JOIN combines rows from two or more tables based on a related column.

Understanding INNER JOIN

An INNER JOIN returns only the rows where there is a match in both tables being joined. If there is no match, the row is excluded from the result.

Step 1: Connect to PostgreSQL

Ensure you are connected to the PostgreSQL database using the psql command:

sudo -u postgres psql

Step 2: Execute the INNER JOIN Query

Execute the following SQL query in the psql shell:

SELECT orders.order_id, customers.first_name, orders.order_date, orders.total_amount
FROM orders
INNER JOIN customers ON orders.customer_id = customers.customer_id;

Let's break down this query:

  • SELECT orders.order_id, customers.first_name, orders.order_date, orders.total_amount: This specifies the columns you want to retrieve from the orders and customers tables. Using the table name as a prefix (e.g., orders.order_id) clarifies which table each column comes from.
  • FROM orders: This specifies the first table you are querying from.
  • INNER JOIN customers ON orders.customer_id = customers.customer_id: This specifies the second table you are joining with (customers) and the join condition (orders.customer_id = customers.customer_id). The ON clause specifies that the customer_id in the orders table must match the customer_id in the customers table for rows to be included.

Step 3: Analyze the Results

The query will return a result set containing the order ID, customer's first name, order date, and total amount for each order. The output should look similar to this:

 order_id | first_name | order_date | total_amount
----------+------------+------------+--------------
        1 | John       | 2023-11-01 |       100.00
        3 | John       | 2023-11-10 |        75.25
        2 | Jane       | 2023-11-05 |       250.50
        4 | David      | 2023-11-15 |       120.00
(4 rows)

The query successfully joined the orders and customers tables based on the customer_id and retrieved the requested information. Only orders with a corresponding customer in the customers table are included.

Step 4: Using Aliases (Optional)

For more complex queries, you can use aliases to make the query more readable. The previous query can be rewritten using aliases:

SELECT o.order_id, c.first_name, o.order_date, o.total_amount
FROM orders o
INNER JOIN customers c ON o.customer_id = c.customer_id;

In this query, o is an alias for orders and c is an alias for customers. The result will be the same, but the query is more concise.

Illustration for INNER JOIN query

You have now successfully queried data from multiple tables using the INNER JOIN clause.

Compare LEFT, RIGHT, and FULL OUTER JOIN Results

In this step, you will explore and compare the results of LEFT OUTER JOIN, RIGHT OUTER JOIN, and FULL OUTER JOIN in PostgreSQL. These joins retrieve all rows from one or both tables, even if there are no matching values in the other table.

Understanding OUTER JOINs

  • LEFT OUTER JOIN (or LEFT JOIN): Returns all rows from the left table and the matching rows from the right table. If there's no match in the right table, NULL values are returned for the right table's columns.
  • RIGHT OUTER JOIN (or RIGHT JOIN): Returns all rows from the right table and the matching rows from the left table. If there's no match in the left table, NULL values are returned for the left table's columns.
  • FULL OUTER JOIN (or FULL JOIN): Returns all rows from both tables. If there's no match in one table, NULL values are returned for the other table's columns.

Step 1: Connect to PostgreSQL

Ensure you are connected to the PostgreSQL database using the psql command:

sudo -u postgres psql

Step 2: Insert a New Customer without Orders

Insert a new customer into the customers table who hasn't placed any orders:

INSERT INTO customers (first_name, last_name, email) VALUES
('Alice', 'Brown', '[email protected]');

Step 3: Verify the New Customer

Verify that the new customer has been added to the customers table:

SELECT * FROM customers;

You should see Alice Brown listed in the results, with a new customer_id (likely 4).

Step 4: Perform a LEFT OUTER JOIN

Execute the following SQL query to perform a LEFT OUTER JOIN between the customers and orders tables:

SELECT customers.first_name, orders.order_id, orders.order_date
FROM customers
LEFT OUTER JOIN orders ON customers.customer_id = orders.customer_id;

This query returns all customers, along with any orders they've placed. If a customer hasn't placed any orders, the order_id and order_date columns will contain NULL values. The output should look similar to this:

 first_name | order_id | order_date
------------+----------+------------
 John       |        1 | 2023-11-01
 John       |        3 | 2023-11-10
 Jane       |        2 | 2023-11-05
 David      |        4 | 2023-11-15
 Alice      |          |
(5 rows)

Notice that Alice Brown is included, even though she hasn't placed any orders. The order_id and order_date columns are NULL for her.

Step 5: Perform a RIGHT OUTER JOIN

Execute the following SQL query to perform a RIGHT OUTER JOIN between the customers and orders tables:

SELECT customers.first_name, orders.order_id, orders.order_date
FROM customers
RIGHT OUTER JOIN orders ON customers.customer_id = orders.customer_id;

This query returns all orders, along with the first name of the customer who placed each order. Since every order has a corresponding customer, the result will be the same as an INNER JOIN in this case. The output should look similar to this:

 first_name | order_id | order_date
------------+----------+------------
 John       |        1 | 2023-11-01
 Jane       |        2 | 2023-11-05
 John       |        3 | 2023-11-10
 David      |        4 | 2023-11-15
(4 rows)

Step 6: Perform a FULL OUTER JOIN

Execute the following SQL query to perform a FULL OUTER JOIN between the customers and orders tables:

SELECT customers.first_name, orders.order_id, orders.order_date
FROM customers
FULL OUTER JOIN orders ON customers.customer_id = orders.customer_id;

This query returns all customers and all orders. If a customer hasn't placed any orders, the order_id and order_date columns will contain NULL values. If an order doesn't have a corresponding customer (which isn't possible in our current setup due to the foreign key constraint), the first_name column would contain NULL values. The output should look similar to this:

 first_name | order_id | order_date
------------+----------+------------
 John       |        1 | 2023-11-01
 John       |        3 | 2023-11-10
 Jane       |        2 | 2023-11-05
 David      |        4 | 2023-11-15
 Alice      |          |
(5 rows)

Notice that Alice Brown is included with NULL values for order_id and order_date.

OUTER JOIN Results Illustration

Step 7: Understanding the Differences

  • LEFT OUTER JOIN includes all rows from the customers table, even if there are no matching orders.
  • RIGHT OUTER JOIN includes all rows from the orders table. In our case, it behaves like an INNER JOIN because all orders have a corresponding customer.
  • FULL OUTER JOIN includes all rows from both tables.

You have now explored and compared the results of LEFT OUTER JOIN, RIGHT OUTER JOIN, and FULL OUTER JOIN in PostgreSQL.

Summary

In this lab, you learned how to create tables with foreign key constraints in PostgreSQL to maintain data integrity. You created the customers and orders tables and established a relationship between them using a foreign key. You then inserted data into these tables, ensuring that the foreign key constraint was enforced. Finally, you explored different types of JOIN operations (INNER JOIN, LEFT OUTER JOIN, RIGHT OUTER JOIN, and FULL OUTER JOIN) to retrieve data from related tables and understand how they handle different data relationships.