Introduction
In this lab, you will explore SQLite subquery techniques to enhance your data retrieval and filtering capabilities. You will learn how to use subqueries within the WHERE clause, embed them in the SELECT statement, and build correlated subqueries.
Create Tables and Insert Data
In this first step, you will create two tables, departments and employees, and insert some sample data into them. This will provide the data you need to practice using subqueries in the following steps.
First, open your terminal in the LabEx VM. Your default path is /home/labex/project.
To start, connect to a SQLite database named company.db. If the database doesn't exist, SQLite will create it for you. Run the following command:
sqlite3 company.db
This command opens the SQLite command-line tool and connects to the company.db database. You'll see the sqlite> prompt.
Now, create the departments table with the following SQL command:
CREATE TABLE departments (
department_id INTEGER PRIMARY KEY,
department_name TEXT,
location TEXT
);
This command creates a table named departments with three columns: department_id, department_name, and location. The department_id column is the primary key for this table.
Next, insert some sample data into the departments table:
INSERT INTO departments (department_name, location) VALUES
('Sales', 'New York'),
('Marketing', 'Los Angeles'),
('Engineering', 'San Francisco');
This command inserts three rows into the departments table, representing three different departments and their locations.
Now, create the employees table with the following SQL command:
CREATE TABLE employees (
employee_id INTEGER PRIMARY KEY,
employee_name TEXT,
department_id INTEGER,
FOREIGN KEY (department_id) REFERENCES departments(department_id)
);
This command creates a table named employees with three columns: employee_id, employee_name, and department_id. The department_id column is a foreign key that references the department_id column in the departments table.
Finally, insert some sample data into the employees table:
INSERT INTO employees (employee_name, department_id) VALUES
('Alice', 1),
('Bob', 2),
('Charlie', 1),
('David', 3);
This command inserts four rows into the employees table, representing four different employees and their department IDs.
Use Subqueries in the WHERE Clause
In this step, you will learn how to use subqueries in the WHERE clause to filter results based on the output of another query.
A subquery is a SELECT statement nested inside another SQL statement. In this case, you'll use a subquery to select department_id values from the departments table and then use those values to filter the results of a query on the employees table.
Let's find all employees who work in departments located in 'New York'. To do this, you'll first need to find the department_id for departments in 'New York', and then find all employees with that department_id.
Enter the following SQL command in the sqlite> prompt:
SELECT employee_name FROM employees WHERE department_id IN (SELECT department_id FROM departments WHERE location = 'New York');
This command selects the employee_name from the employees table where the department_id is in the list of department_id values returned by the subquery. The subquery selects the department_id from the departments table where the location is 'New York'.
After executing the command, you should see the following output:
Alice
Charlie
This output shows the names of the employees who work in the Sales department, which is located in New York.
Embed Subqueries in the SELECT Clause
In this step, you will learn how to embed subqueries within the SELECT clause of an SQL statement to retrieve related data.
Embedding a subquery in the SELECT clause allows you to retrieve a single value for each row in the outer query. This value is often a calculated value or a related value from another table.
Let's retrieve each employee's name along with the name of their department. Enter the following SQL command in the sqlite> prompt:
SELECT employee_name, (SELECT department_name FROM departments WHERE department_id = employees.department_id) AS department_name FROM employees;
This command selects the employee_name from the employees table and also includes a subquery that retrieves the department_name from the departments table. The subquery uses the department_id from the employees table to match the correct department. The result of the subquery is aliased as department_name.
After executing the command, you should see the following output:
Alice|Sales
Bob|Marketing
Charlie|Sales
David|Engineering
This output shows each employee's name and the name of their corresponding department.
Build Correlated Subqueries
In this step, you will learn how to build correlated subqueries. Correlated subqueries are subqueries that refer to a column from the outer query. This means the subquery is executed once for each row of the outer query.
Unlike simple subqueries that are executed once and their result is used by the outer query, correlated subqueries depend on the outer query for their values. They are used when you need to compare values within the subquery to values in the current row of the outer query.
Let's find all employees who work in a department located in the same city as the employee's name. To make this work, we'll first update the employee names to be city names.
UPDATE employees SET employee_name = 'New York' WHERE employee_name = 'Alice';
UPDATE employees SET employee_name = 'Los Angeles' WHERE employee_name = 'Bob';
UPDATE employees SET employee_name = 'New York' WHERE employee_name = 'Charlie';
UPDATE employees SET employee_name = 'San Francisco' WHERE employee_name = 'David';
These commands update the employee_name column in the employees table to city names.
Now, let's write the correlated subquery:
SELECT e.employee_name FROM employees e WHERE e.department_id IN (SELECT d.department_id FROM departments d WHERE d.location = e.employee_name);
This command selects the employee_name from the employees table (aliased as e) where the department_id is in the list of department_id values returned by the subquery. The subquery selects the department_id from the departments table (aliased as d) where the location matches the employee_name from the outer query.
After executing the command, you should see the following output:
New York
Los Angeles
San Francisco
This output shows the names of the employees (now city names) who work in departments located in the same city.
Assess Subquery Efficiency with JOIN
In this step, you will learn how to assess the efficiency of subqueries and explore alternative approaches using JOIN operations for optimization.
While subqueries are powerful, they can sometimes lead to performance bottlenecks, especially with large datasets. In many cases, you can rewrite subqueries using JOIN operations, which can be more efficient.
Let's rewrite the correlated subquery from the previous step using a JOIN:
SELECT e.employee_name FROM employees e JOIN departments d ON e.department_id = d.department_id WHERE d.location = e.employee_name;
This command selects the employee_name from the employees table (aliased as e) and joins it with the departments table (aliased as d) on the department_id column. The WHERE clause then filters the results to only include rows where the location in the departments table matches the employee_name in the employees table.
To verify the result, execute the command. You should see the same output as in the previous step:
New York
Los Angeles
San Francisco
To assess the efficiency, you would typically use EXPLAIN QUERY PLAN before and after the change. However, due to limitations in the LabEx environment, we cannot fully demonstrate the EXPLAIN QUERY PLAN command. The key takeaway is that JOIN operations are often more efficient than correlated subqueries, especially for larger datasets.
Finally, exit the sqlite3 shell:
.exit
This will return you to the bash prompt.
Summary
In this lab, you have learned how to use SQLite subqueries to enhance your data retrieval and filtering capabilities. You practiced using subqueries within the WHERE clause, embedding them in the SELECT statement, and building correlated subqueries. You also learned how to rewrite subqueries using JOIN operations for better efficiency. These techniques provide you with powerful tools for working with data in SQLite.


