SQLite Index Optimization

Introduction

In this lab, you will learn how to optimize SQLite database performance using indexes. You'll create single-column indexes to improve query speed, focusing on practical application and analysis. You'll also learn to analyze query plans and drop redundant indexes.

Create a Database and Table

In this step, you will create a SQLite database and an employees table. You'll then insert some sample data into the table.

First, open your terminal in the LabEx VM. Your default path is /home/labex/project.

To create a SQLite database named my_database.db, run the following command:

sqlite3 my_database.db

This command creates a new SQLite database file named my_database.db in your project directory and opens the SQLite shell.

Next, create the employees table with the following structure:

CREATE TABLE employees (
    id INTEGER PRIMARY KEY,
    first_name TEXT,
    last_name TEXT,
    email TEXT,
    department TEXT
);

This SQL statement creates a table named employees with five columns: id, first_name, last_name, email, and department. The id column is set as the primary key, which means it must contain unique values.

Now, insert some sample data into the employees table:

INSERT INTO employees (first_name, last_name, email, department) VALUES
('John', 'Doe', 'john.doe@example.com', 'Sales'),
('Jane', 'Smith', 'jane.smith@example.com', 'Marketing'),
('Robert', 'Jones', 'robert.jones@example.com', 'Engineering'),
('Emily', 'Brown', 'emily.brown@example.com', 'Sales'),
('Michael', 'Davis', 'michael.davis@example.com', 'Marketing');

This will insert five rows of data into the employees table.

To verify the data has been inserted correctly, run the following command:

SELECT * FROM employees;

You should see the following output:

1|John|Doe|john.doe@example.com|Sales
2|Jane|Smith|jane.smith@example.com|Marketing
3|Robert|Jones|robert.jones@example.com|Engineering
4|Emily|Brown|emily.brown@example.com|Sales
5|Michael|Davis|michael.davis@example.com|Marketing

Create an Index

In this step, you will create an index on the last_name column of the employees table.

Indexes are special lookup tables that the database search engine can use to speed up data retrieval.

To create an index named idx_lastname on the last_name column, run the following command:

CREATE INDEX idx_lastname ON employees (last_name);

This SQL statement creates an index named idx_lastname on the last_name column of the employees table.

To verify that the index has been created, you can use the following command:

PRAGMA index_list(employees);

This command will display a list of indexes on the employees table, including the idx_lastname index you just created. You should see output similar to this:

0|idx_lastname|0|c|0

This output confirms that the idx_lastname index exists on the employees table.

Analyze Queries with EXPLAIN QUERY PLAN

In this step, you will learn how to use the EXPLAIN QUERY PLAN command to analyze how SQLite executes a query. This is a powerful tool for understanding query performance and identifying potential bottlenecks.

To analyze a query, prefix it with EXPLAIN QUERY PLAN. For example, to analyze the following query:

SELECT * FROM employees WHERE last_name = 'Smith';

Run the following command:

EXPLAIN QUERY PLAN SELECT * FROM employees WHERE last_name = 'Smith';

The output will look something like this:

QUERY PLAN
`--SEARCH employees USING INDEX idx_lastname (last_name=?)

This output tells you that SQLite is using the idx_lastname index to find the employees with the last name 'Smith'. The SEARCH keyword indicates that SQLite is using an index to perform the search.

If the index was not used, the output would look different. For example, if you query for employees with a first name of 'John' (and you haven't created an index on the first_name column), the output would be:

EXPLAIN QUERY PLAN SELECT * FROM employees WHERE first_name = 'John';

The output will look something like this:

QUERY PLAN
`--SCAN employees

The SCAN keyword indicates that SQLite is performing a full table scan, which means it has to examine every row in the table to find the employees with the first name 'John'. This is less efficient than using an index.

Add More Data and Analyze Sorting

Let's insert more data to make the query plan analysis more meaningful. Insert the following data into the employees table:

INSERT INTO employees (first_name, last_name, email, department) VALUES
('Alice', 'Johnson', 'alice.johnson@example.com', 'HR'),
('Bob', 'Williams', 'bob.williams@example.com', 'Finance'),
('Charlie', 'Brown', 'charlie.brown@example.com', 'IT'),
('David', 'Miller', 'david.miller@example.com', 'Sales'),
('Eve', 'Wilson', 'eve.wilson@example.com', 'Marketing'),
('John', 'Taylor', 'john.taylor@example.com', 'Engineering'),
('Jane', 'Anderson', 'jane.anderson@example.com', 'HR'),
('Robert', 'Thomas', 'robert.thomas@example.com', 'Finance'),
('Emily', 'Jackson', 'emily.jackson@example.com', 'IT'),
('Michael', 'White', 'michael.white@example.com', 'Sales');

Now, let's analyze a more complex query that involves sorting. Suppose you want to find all employees in the 'Sales' department and sort them by last name. You can use the following query:

SELECT * FROM employees WHERE department = 'Sales' ORDER BY last_name;

Analyze the query plan:

EXPLAIN QUERY PLAN SELECT * FROM employees WHERE department = 'Sales' ORDER BY last_name;

The output might look like this:

QUERY PLAN
`--SCAN employees USING INDEX idx_lastname

In this case, SQLite is performing a full table scan and then sorting the results.

Let's create an index on the department column:

CREATE INDEX idx_department ON employees (department);

Now, analyze the query plan again:

EXPLAIN QUERY PLAN SELECT * FROM employees WHERE department = 'Sales' ORDER BY last_name;

The output might change to:

QUERY PLAN
|--SEARCH employees USING INDEX idx_department (department=?)
`--USE TEMP B-TREE FOR ORDER BY

Now SQLite is using the idx_department index to find the employees in the 'Sales' department, but it still needs to sort the results.

Drop Redundant Indexes

In this step, you will learn how to identify and drop redundant indexes in SQLite. Redundant indexes can negatively impact database performance by increasing the overhead of write operations without providing any benefit to read operations.

Let's create an index on both department and last_name columns:

CREATE INDEX idx_department_lastname ON employees (department, last_name);

Now, list all indexes on the employees table:

PRAGMA index_list(employees);

You should see output similar to this:

0|idx_lastname|0|c|0
1|idx_department|0|c|0
2|idx_department_lastname|0|c|0

Now, let's analyze a query that filters by department and last_name:

EXPLAIN QUERY PLAN SELECT * FROM employees WHERE department = 'Sales' AND last_name = 'Doe';

The output might look like this:

QUERY PLAN
`--SEARCH employees USING INDEX idx_department_lastname (department=? AND last_name=?)

This output indicates that SQLite is using the idx_department_lastname index for this query.

Now, let's analyze a query that filters only by department:

EXPLAIN QUERY PLAN SELECT * FROM employees WHERE department = 'Sales';

The output might look like this:

QUERY PLAN
`--SEARCH employees USING INDEX idx_department (department=?)

This output indicates that SQLite is using the idx_department index for this query.

In this scenario, the idx_department_lastname index is redundant because the idx_department index can be used for queries that filter only by department. The idx_department_lastname index only provides a benefit for queries that filter by both department and last_name.

To drop the redundant idx_department index, you can use the DROP INDEX command:

DROP INDEX idx_department;

Now, list all indexes on the employees table again:

PRAGMA index_list(employees);

You should see that the idx_department index is no longer listed.

Summary

In this lab, you have learned how to optimize SQLite database performance using indexes. You created single-column indexes to improve query speed, analyzed query plans using EXPLAIN QUERY PLAN, and dropped redundant indexes. These skills will help you build more efficient and responsive SQLite databases.