Introduction
In this lab, you will learn how to process JSON data within SQLite. You'll explore how to store, extract, filter, and update JSON data within an SQLite database. This lab provides a hands-on introduction to working with JSON data in SQLite, a skill increasingly valuable in modern data management.
Create a Database and Table
In this step, you will create an SQLite database and a table to store JSON data. SQLite is a lightweight database that stores data in a single file, making it easy to manage.
First, open your terminal. The default path is /home/labex/project.
Now, let's create a directory to store our database.
mkdir sqlite_json
cd sqlite_json
These commands create a directory named sqlite_json and then change the current directory to it. This will keep your project files organized.
Next, create an SQLite database named mydatabase.db.
sqlite3 mydatabase.db
This command opens the SQLite shell, connecting to the mydatabase.db database. If the database file doesn't exist, SQLite will create it.
Now, create a table named products with columns for id, name, and details. The details column will store JSON data as text.
CREATE TABLE products (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT,
details TEXT
);
This SQL command creates the products table:
id: A unique integer that automatically increments for each new product.name: The name of the product (e.g., Laptop, Smartphone).details: A text field to store the JSON data for the product.
Insert JSON Data
In this step, you will insert JSON data into the products table.
Let's insert two sample records into the products table.
INSERT INTO products (name, details) VALUES (
'Laptop',
'{"brand": "Dell", "model": "XPS 13", "specs": {"cpu": "Intel i7", "memory": "16GB", "storage": "512GB SSD"}}'
);
INSERT INTO products (name, details) VALUES (
'Smartphone',
'{"brand": "Samsung", "model": "Galaxy S21", "specs": {"display": "6.2 inch", "camera": "12MP", "storage": "128GB"}}'
);
These INSERT statements add two rows to the products table. The details column contains JSON data as a text string.
To verify the data was inserted correctly, run the following query:
SELECT * FROM products;
Expected Output:
1|Laptop|{"brand": "Dell", "model": "XPS 13", "specs": {"cpu": "Intel i7", "memory": "16GB", "storage": "512GB SSD"}}
2|Smartphone|{"brand": "Samsung", "model": "Galaxy S21", "specs": {"display": "6.2 inch", "camera": "12MP", "storage": "128GB"}}
This output confirms that the JSON data has been successfully stored in the products table.
Extract JSON Fields with a Custom Function
Since SQLite doesn't have built-in JSON functions, you'll create a custom Python function to extract data from the JSON strings.
First, exit the SQLite shell.
.exit
Now, create a Python file named json_extractor.py.
nano json_extractor.py
Paste the following Python code into the json_extractor.py file:
## Import the necessary libraries
import sqlite3
import json
## Define a function to extract a value from a JSON string using a path
def json_extract(json_str, path):
try:
## Parse the JSON string into a Python dictionary
json_data = json.loads(json_str)
## Split the path into components (e.g., 'specs.cpu' becomes ['specs', 'cpu'])
path_components = path.split('.')
## Start with the full JSON object
value = json_data
## Traverse the JSON object using the path components
for component in path_components:
## Get the value for the current component
value = value.get(component)
## Return the final value
return value
## Handle errors if JSON is invalid or path does not exist
except (json.JSONDecodeError, AttributeError, TypeError):
return None
## Define a function to connect to the database and register the custom function
def connect_db(db_path):
## Connect to the SQLite database at the given path
conn = sqlite3.connect(db_path)
## Register the 'json_extract' Python function as a custom SQL function
## "json_extract" is the name in SQL, 2 is the number of arguments, json_extract is the Python function
conn.create_function("json_extract", 2, json_extract)
## Return the database connection
return conn
## This block runs when the script is executed directly
if __name__ == '__main__':
## Connect to the database and register the function
conn = connect_db('mydatabase.db')
## Create a cursor object to execute SQL queries
cursor = conn.cursor()
## Use the custom SQL function to extract the 'brand' from the 'details' column
cursor.execute("SELECT json_extract(details, 'brand') FROM products WHERE name = 'Laptop'")
## Fetch the result and print it
print(cursor.fetchone()[0])
## Use the custom SQL function to extract the 'cpu' from the nested 'specs' object
cursor.execute("SELECT json_extract(details, 'specs.cpu') FROM products WHERE name = 'Laptop'")
## Fetch the result and print it
print(cursor.fetchone()[0])
## Close the database connection
conn.close()
This Python code defines a function json_extract that takes a JSON string and a path as input and returns the value at that path. It also includes a connect_db function to connect to the SQLite database and register the json_extract function.
json.loads(json_str): This line parses the JSON string into a Python dictionary.path.split('.'): This splits the path into a list of components. For example,'specs.cpu'becomes['specs', 'cpu'].- The loop iterates through the path components, accessing nested values in the JSON data.
Save the file and exit nano.
Now, run the Python script.
python3 json_extractor.py
Expected Output:
Dell
Intel i7
This script connects to the database, registers the json_extract function, and then uses it to extract the brand and CPU of the Laptop.
Filter Data Using JSON Queries
In this step, you'll use the custom json_extract function to filter data based on values within the JSON fields.
Open the json_extractor.py file again.
nano json_extractor.py
Modify the json_extractor.py file to include a function for querying the database:
## Import the necessary libraries
import sqlite3
import json
## Define a function to extract a value from a JSON string using a path
def json_extract(json_str, path):
try:
## Parse the JSON string into a Python dictionary
json_data = json.loads(json_str)
## Split the path into components (e.g., 'specs.cpu' becomes ['specs', 'cpu'])
path_components = path.split('.')
## Start with the full JSON object
value = json_data
## Traverse the JSON object using the path components
for component in path_components:
## Get the value for the current component
value = value.get(component)
## Return the final value
return value
## Handle errors if JSON is invalid or path does not exist
except (json.JSONDecodeError, AttributeError, TypeError):
return None
## Define a function to connect to the database and register the custom function
def connect_db(db_path):
## Connect to the SQLite database at the given path
conn = sqlite3.connect(db_path)
## Register the 'json_extract' Python function as a custom SQL function
conn.create_function("json_extract", 2, json_extract)
## Return the database connection
return conn
## Define a function to filter products based on a JSON field
def filter_products(db_path, json_path, value):
## Connect to the database
conn = connect_db(db_path)
## Create a cursor object
cursor = conn.cursor()
## Create the SQL query using an f-string to filter by a JSON value
query = f"SELECT * FROM products WHERE json_extract(details, '{json_path}') = '{value}'"
## Execute the query
cursor.execute(query)
## Fetch all matching results
results = cursor.fetchall()
## Close the database connection
conn.close()
## Return the results
return results
## This block runs when the script is executed directly
if __name__ == '__main__':
## Example usage:
## Filter for products where the brand is 'Dell'
dell_products = filter_products('mydatabase.db', 'brand', 'Dell')
print("Products with brand 'Dell':", dell_products)
## Filter for products where the CPU is 'Intel i7'
intel_products = filter_products('mydatabase.db', 'specs.cpu', 'Intel i7')
print("Products with CPU 'Intel i7':", intel_products)
This code adds a filter_products function that takes a database path, a JSON path, and a value as input. It then connects to the database, registers the json_extract function, and executes a query to find all products where the value at the specified JSON path matches the given value.
Save the file and exit nano.
Now, run the Python script.
python3 json_extractor.py
Expected Output:
Products with brand 'Dell': [(1, 'Laptop', '{"brand": "Dell", "model": "XPS 13", "specs": {"cpu": "Intel i7", "memory": "16GB", "storage": "512GB SSD"}}')]
Products with CPU 'Intel i7': [(1, 'Laptop', '{"brand": "Dell", "model": "XPS 13", "specs": {"cpu": "Intel i7", "memory": "16GB", "storage": "512GB SSD"}}')]
This output shows the products that match the specified criteria.
Update JSON Values
In this step, you'll learn how to update values within the JSON fields.
Open the json_extractor.py file.
nano json_extractor.py
Modify the json_extractor.py file to include functions for updating the JSON and the database:
## Import the necessary libraries
import sqlite3
import json
## Define a function to extract a value from a JSON string using a path
def json_extract(json_str, path):
try:
## Parse the JSON string into a Python dictionary
json_data = json.loads(json_str)
## Split the path into components
path_components = path.split('.')
## Start with the full JSON object
value = json_data
## Traverse the JSON object
for component in path_components:
value = value.get(component)
return value
except (json.JSONDecodeError, AttributeError, TypeError):
return None
## Define a function to connect to the database and register the custom function
def connect_db(db_path):
## Connect to the database
conn = sqlite3.connect(db_path)
## Register the custom SQL function
conn.create_function("json_extract", 2, json_extract)
return conn
## Define a function to filter products based on a JSON field
def filter_products(db_path, json_path, value):
## Connect to the database
conn = connect_db(db_path)
## Create a cursor object
cursor = conn.cursor()
## Create the SQL query to filter by a JSON value
query = f"SELECT * FROM products WHERE json_extract(details, '{json_path}') = '{value}'"
## Execute the query
cursor.execute(query)
## Fetch all matching results
results = cursor.fetchall()
## Close the connection
conn.close()
return results
## Define a function to update a value within a JSON string
def update_json_value(json_str, path, new_value):
try:
## Parse the JSON string into a Python dictionary
json_data = json.loads(json_str)
## Split the path into components
path_components = path.split('.')
## Navigate to the parent of the target value
target = json_data
for i in range(len(path_components) - 1):
target = target.get(path_components[i])
## Update the target value
target[path_components[-1]] = new_value
## Convert the Python dictionary back to a JSON string
return json.dumps(json_data)
except (json.JSONDecodeError, AttributeError, TypeError):
## If an error occurs, return the original JSON string
return json_str
## Define a function to update a product's details in the database
def update_product_details(db_path, product_name, json_path, new_value):
## Connect to the database
conn = sqlite3.connect(db_path)
## Create a cursor object
cursor = conn.cursor()
## Get the current details for the specified product
## The '?' is a placeholder to prevent SQL injection
cursor.execute("SELECT details FROM products WHERE name = ?", (product_name,))
result = cursor.fetchone()
## If the product doesn't exist, close connection and return False
if not result:
conn.close()
return False
## Get the current JSON details string
current_details = result[0]
## Update the JSON string with the new value
updated_details = update_json_value(current_details, json_path, new_value)
## Update the database with the new JSON string
cursor.execute("UPDATE products SET details = ? WHERE name = ?", (updated_details, product_name))
## Commit the changes to the database
conn.commit()
## Close the connection
conn.close()
return True
## This block runs when the script is executed directly
if __name__ == '__main__':
## Example usage: Update the laptop's memory to 32GB
update_product_details('mydatabase.db', 'Laptop', 'specs.memory', '32GB')
print("Laptop memory updated to 32GB")
## Verify the update by fetching the data again
conn = sqlite3.connect('mydatabase.db')
cursor = conn.cursor()
cursor.execute("SELECT details FROM products WHERE name = 'Laptop'")
## Fetch the updated details
updated_details = cursor.fetchone()[0]
print("Updated Laptop details:", updated_details)
## Close the connection
conn.close()
This code adds two functions:
update_json_value: This function takes a JSON string, a path, and a new value as input. It parses the JSON string, updates the value at the specified path, and returns the updated JSON string.update_product_details: This function takes a database path, a product name, a JSON path, and a new value as input. It connects to the database, retrieves the current JSON data for the product, updates the value at the specified path usingupdate_json_value, and then updates the database with the modified JSON data.
Save the file and exit nano.
Now, run the Python script.
python3 json_extractor.py
Expected Output:
Laptop memory updated to 32GB
Updated Laptop details: {"brand": "Dell", "model": "XPS 13", "specs": {"cpu": "Intel i7", "memory": "32GB", "storage": "512GB SSD"}}
This output confirms that the Laptop's memory has been updated to 32GB in the database.
Summary
In this lab, you have learned how to process JSON data within SQLite. You started by creating a database and a table to store JSON data. Then, you learned how to insert JSON data into the table. You created a custom Python function to extract specific fields from the JSON data and used this function to filter data based on values within the JSON fields. Finally, you learned how to update values within the JSON fields using a custom Python function. These skills provide a foundation for managing JSON data effectively within SQLite databases.


