MySQL Common Table Expressions (CTEs)

MySQLBeginner
Practice Now

Introduction

In this lab, you will learn how to use Common Table Expressions (CTEs) in MySQL. CTEs help simplify complex queries by breaking them into logical, readable steps. A CTE creates a temporary, named result set that you can reference within a single SQL statement.

You will begin by setting up a database and creating a simple CTE to filter employee data. Then, you will explore recursive CTEs to handle hierarchical data, such as an organizational chart. Finally, you will learn to join a CTE with a table to perform more advanced data analysis.

Set Up the Database and a Simple CTE

In this first step, you will set up the database environment and define a simple Common Table Expression (CTE). CTEs are defined using the WITH clause and act as a temporary view that exists only for the duration of a single query. This makes complex queries easier to read and manage.

First, open the terminal from your desktop.

Connect to the MySQL server as the root user. In this lab environment, you can use sudo to connect without a password.

sudo mysql -u root

Once connected, you will see the MySQL prompt (mysql>). Now, create a database named labex_db and switch to it.

CREATE DATABASE IF NOT EXISTS labex_db;
USE labex_db;

Next, create an employees table and insert some sample data. This table will store basic information about employees, including their department and salary.

CREATE TABLE IF NOT EXISTS employees (
    employee_id INT PRIMARY KEY,
    first_name VARCHAR(50),
    last_name VARCHAR(50),
    department VARCHAR(50),
    salary DECIMAL(10, 2)
);

INSERT INTO employees (employee_id, first_name, last_name, department, salary) VALUES
(1, 'John', 'Doe', 'Sales', 60000.00),
(2, 'Jane', 'Smith', 'Marketing', 75000.00),
(3, 'Robert', 'Jones', 'Sales', 55000.00),
(4, 'Emily', 'Brown', 'IT', 90000.00),
(5, 'Michael', 'Davis', 'Marketing', 70000.00);

Now that the table is ready, let's define a simple CTE to select only the employees from the 'Sales' department.

WITH SalesEmployees AS (
    SELECT employee_id, first_name, last_name, salary
    FROM employees
    WHERE department = 'Sales'
)
SELECT * FROM SalesEmployees;

Let's break down this query:

  • WITH SalesEmployees AS (...): This defines a CTE named SalesEmployees.
  • The query inside the parentheses selects employees from the employees table where the department is 'Sales'.
  • SELECT * FROM SalesEmployees;: This is the main query that retrieves all data from our temporary SalesEmployees result set.

You should see the following output, listing only the employees from the Sales department:

+-------------+------------+-----------+----------+
| employee_id | first_name | last_name | salary   |
+-------------+------------+-----------+----------+
|           1 | John       | Doe       | 60000.00 |
|           3 | Robert     | Jones     | 55000.00 |
+-------------+------------+-----------+----------+
2 rows in set (0.00 sec)

This confirms your first CTE is working correctly. Please remain in the MySQL shell for the next step.

Create a Recursive CTE for Hierarchical Data

Recursive CTEs are a powerful feature for querying hierarchical data, such as organizational charts or nested categories. A recursive CTE refers to itself to process a tree-like structure level by level. It consists of an "anchor member" (the starting point) and a "recursive member" (the iteration).

In this step, you will create a table representing an employee hierarchy and use a recursive CTE to display it.

First, while still in the MySQL shell, create the employee_hierarchy table. This table includes a manager_id that points to another employee's employee_id, creating the hierarchy.

CREATE TABLE IF NOT EXISTS employee_hierarchy (
    employee_id INT PRIMARY KEY,
    employee_name VARCHAR(50),
    manager_id INT
);

INSERT INTO employee_hierarchy (employee_id, employee_name, manager_id) VALUES
(1, 'David', NULL),
(2, 'Emily', 1),
(3, 'Frank', 1),
(4, 'Grace', 2),
(5, 'Henry', 2),
(6, 'Ivy', 3),
(7, 'Jack', 3);

In this structure, 'David' is the top-level manager because his manager_id is NULL.

Now, write a recursive CTE to traverse this hierarchy. The RECURSIVE keyword is required.

WITH RECURSIVE EmployeeHierarchyCTE AS (
    -- Anchor member: selects the top-level manager
    SELECT employee_id, employee_name, manager_id, 0 AS level
    FROM employee_hierarchy
    WHERE manager_id IS NULL

    UNION ALL

    -- Recursive member: joins with itself to find subordinates
    SELECT e.employee_id, e.employee_name, e.manager_id, h.level + 1
    FROM employee_hierarchy e
    INNER JOIN EmployeeHierarchyCTE h ON e.manager_id = h.employee_id
)
SELECT * FROM EmployeeHierarchyCTE ORDER BY level, employee_name;

Let's analyze this query:

  • Anchor Member: The first SELECT statement finds the root of the hierarchy (where manager_id is NULL) and assigns it level 0.
  • Recursive Member: The second SELECT statement joins employee_hierarchy with the CTE itself (EmployeeHierarchyCTE). It finds all employees whose manager_id matches an employee_id already in the CTE, and it increments the level.
  • UNION ALL: This operator combines the results from the anchor and recursive members.

The query will produce the following output, showing the full organizational chart with levels:

+-------------+---------------+------------+-------+
| employee_id | employee_name | manager_id | level |
+-------------+---------------+------------+-------+
|           1 | David         |       NULL |     0 |
|           2 | Emily         |          1 |     1 |
|           3 | Frank         |          1 |     1 |
|           4 | Grace         |          2 |     2 |
|           5 | Henry         |          2 |     2 |
|           6 | Ivy           |          3 |     2 |
|           7 | Jack          |          3 |     2 |
+-------------+---------------+------------+-------+
7 rows in set (0.00 sec)

You have successfully used a recursive CTE to query hierarchical data. Please remain in the MySQL shell for the final step.

Join a CTE with a Table

CTEs can be joined with other tables just like regular tables. This is useful for combining aggregated data from a CTE with detailed data from another table. In this step, you will create a CTE to calculate the average salary per department and then join it back to the employees table.

While still in the MySQL shell, execute the following query. It defines a CTE named AvgSalaryByDepartment and then joins it with the employees table.

WITH AvgSalaryByDepartment AS (
    SELECT
        department,
        AVG(salary) AS avg_salary
    FROM employees
    GROUP BY department
)
SELECT
    e.employee_id,
    e.first_name,
    e.department,
    e.salary,
    a.avg_salary
FROM
    employees e
JOIN
    AvgSalaryByDepartment a ON e.department = a.department;

Here is the breakdown:

  • The AvgSalaryByDepartment CTE calculates the average salary for each department.
  • The main query then joins the employees table (aliased as e) with this CTE (aliased as a) on the department column.
  • This allows you to display each employee's salary next to their department's average salary.

The expected output will be:

+-------------+------------+------------+----------+--------------+
| employee_id | first_name | department | salary   | avg_salary   |
+-------------+------------+------------+----------+--------------+
|           1 | John       | Sales      | 60000.00 | 57500.000000 |
|           2 | Jane       | Marketing  | 75000.00 | 72500.000000 |
|           3 | Robert     | Sales      | 55000.00 | 57500.000000 |
|           4 | Emily      | IT         | 90000.00 | 90000.000000 |
|           5 | Michael    | Marketing  | 70000.00 | 72500.000000 |
+-------------+------------+------------+----------+--------------+
5 rows in set (0.00 sec)

You can manually verify the results. For example, the average salary for the 'Sales' department is (60000 + 55000) / 2 = 57500, which matches the output. This confirms your query is working correctly.

You have successfully joined a CTE with a table. You can now exit the MySQL shell.

exit;

Summary

In this lab, you have learned how to effectively use Common Table Expressions (CTEs) in MySQL. You started by setting up a database and tables, then progressed through creating different types of CTEs.

You have learned to:

  • Define a simple CTE using the WITH clause to filter data and improve query readability.
  • Create a recursive CTE to navigate and display hierarchical data from a table.
  • Join a CTE with a table to combine aggregated results with detailed row-level data for more complex analysis.

These skills are fundamental for writing clean, maintainable, and powerful SQL queries, especially when dealing with complex logic or data structures.