SQLite JSON Processing

SQLiteSQLiteBeginner
Practice Now

Introduction

In this lab, you will learn how to process JSON data within SQLite. You'll explore how to store, extract, filter, and update JSON data within an SQLite database. This lab provides a hands-on introduction to working with JSON data in SQLite, a skill increasingly valuable in modern data management.


Skills Graph

%%%%{init: {'theme':'neutral'}}%%%% flowchart RL sqlite(("SQLite")) -.-> sqlite/SQLiteGroup(["SQLite"]) sqlite/SQLiteGroup -.-> sqlite/make_table("Create New Table") sqlite/SQLiteGroup -.-> sqlite/add_rows("Insert Multiple Rows") sqlite/SQLiteGroup -.-> sqlite/get_all("Select All Rows") sqlite/SQLiteGroup -.-> sqlite/query_where("Filter With WHERE") sqlite/SQLiteGroup -.-> sqlite/edit_row("Update Single Row") sqlite/SQLiteGroup -.-> sqlite/check_version("Get SQLite Version") subgraph Lab Skills sqlite/make_table -.-> lab-552553{{"SQLite JSON Processing"}} sqlite/add_rows -.-> lab-552553{{"SQLite JSON Processing"}} sqlite/get_all -.-> lab-552553{{"SQLite JSON Processing"}} sqlite/query_where -.-> lab-552553{{"SQLite JSON Processing"}} sqlite/edit_row -.-> lab-552553{{"SQLite JSON Processing"}} sqlite/check_version -.-> lab-552553{{"SQLite JSON Processing"}} end

Create a Database and Table

In this step, you will create an SQLite database and a table to store JSON data. SQLite is a lightweight database that stores data in a single file, making it easy to manage.

First, open your terminal. The default path is /home/labex/project.

Now, let's create a directory to store our database.

mkdir sqlite_json
cd sqlite_json

These commands create a directory named sqlite_json and then change the current directory to it. This will keep your project files organized.

Next, create an SQLite database named mydatabase.db.

sqlite3 mydatabase.db

This command opens the SQLite shell, connecting to the mydatabase.db database. If the database file doesn't exist, SQLite will create it.

Now, create a table named products with columns for id, name, and details. The details column will store JSON data as text.

CREATE TABLE products (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT,
    details TEXT
);

This SQL command creates the products table:

  • id: A unique integer that automatically increments for each new product.
  • name: The name of the product (e.g., Laptop, Smartphone).
  • details: A text field to store the JSON data for the product.

Insert JSON Data

In this step, you will insert JSON data into the products table.

Let's insert two sample records into the products table.

INSERT INTO products (name, details) VALUES (
    'Laptop',
    '{"brand": "Dell", "model": "XPS 13", "specs": {"cpu": "Intel i7", "memory": "16GB", "storage": "512GB SSD"}}'
);

INSERT INTO products (name, details) VALUES (
    'Smartphone',
    '{"brand": "Samsung", "model": "Galaxy S21", "specs": {"display": "6.2 inch", "camera": "12MP", "storage": "128GB"}}'
);

These INSERT statements add two rows to the products table. The details column contains JSON data as a text string.

To verify the data was inserted correctly, run the following query:

SELECT * FROM products;

Expected Output:

1|Laptop|{"brand": "Dell", "model": "XPS 13", "specs": {"cpu": "Intel i7", "memory": "16GB", "storage": "512GB SSD"}}
2|Smartphone|{"brand": "Samsung", "model": "Galaxy S21", "specs": {"display": "6.2 inch", "camera": "12MP", "storage": "128GB"}}

This output confirms that the JSON data has been successfully stored in the products table.

Extract JSON Fields with a Custom Function

Since SQLite doesn't have built-in JSON functions, you'll create a custom Python function to extract data from the JSON strings.

First, exit the SQLite shell.

.exit

Now, create a Python file named json_extractor.py.

nano json_extractor.py

Paste the following Python code into the json_extractor.py file:

import sqlite3
import json

def json_extract(json_str, path):
    try:
        json_data = json.loads(json_str)
        path_components = path.split('.')
        value = json_data
        for component in path_components:
            value = value.get(component)
        return value
    except (json.JSONDecodeError, AttributeError, TypeError):
        return None

def connect_db(db_path):
    conn = sqlite3.connect(db_path)
    conn.create_function("json_extract", 2, json_extract)
    return conn

if __name__ == '__main__':
    conn = connect_db('mydatabase.db')
    cursor = conn.cursor()

    cursor.execute("SELECT json_extract(details, 'brand') FROM products WHERE name = 'Laptop'")
    print(cursor.fetchone()[0])

    cursor.execute("SELECT json_extract(details, 'specs.cpu') FROM products WHERE name = 'Laptop'")
    print(cursor.fetchone()[0])

    conn.close()

This Python code defines a function json_extract that takes a JSON string and a path as input and returns the value at that path. It also includes a connect_db function to connect to the SQLite database and register the json_extract function.

  • json.loads(json_str): This line parses the JSON string into a Python dictionary.
  • path.split('.'): This splits the path into a list of components. For example, 'specs.cpu' becomes ['specs', 'cpu'].
  • The loop iterates through the path components, accessing nested values in the JSON data.

Save the file and exit nano.

Now, run the Python script.

python3 json_extractor.py

Expected Output:

Dell
Intel i7

This script connects to the database, registers the json_extract function, and then uses it to extract the brand and CPU of the Laptop.

Filter Data Using JSON Queries

In this step, you'll use the custom json_extract function to filter data based on values within the JSON fields.

Open the json_extractor.py file again.

nano json_extractor.py

Modify the json_extractor.py file to include a function for querying the database:

import sqlite3
import json

def json_extract(json_str, path):
    try:
        json_data = json.loads(json_str)
        path_components = path.split('.')
        value = json_data
        for component in path_components:
            value = value.get(component)
        return value
    except (json.JSONDecodeError, AttributeError, TypeError):
        return None

def connect_db(db_path):
    conn = sqlite3.connect(db_path)
    conn.create_function("json_extract", 2, json_extract)
    return conn

def filter_products(db_path, json_path, value):
    conn = connect_db(db_path)
    cursor = conn.cursor()
    query = f"SELECT * FROM products WHERE json_extract(details, '{json_path}') = '{value}'"
    cursor.execute(query)
    results = cursor.fetchall()
    conn.close()
    return results

if __name__ == '__main__':
    ## Example usage:
    dell_products = filter_products('mydatabase.db', 'brand', 'Dell')
    print("Products with brand 'Dell':", dell_products)

    intel_products = filter_products('mydatabase.db', 'specs.cpu', 'Intel i7')
    print("Products with CPU 'Intel i7':", intel_products)

This code adds a filter_products function that takes a database path, a JSON path, and a value as input. It then connects to the database, registers the json_extract function, and executes a query to find all products where the value at the specified JSON path matches the given value.

Save the file and exit nano.

Now, run the Python script.

python3 json_extractor.py

Expected Output:

Products with brand 'Dell': [(1, 'Laptop', '{"brand": "Dell", "model": "XPS 13", "specs": {"cpu": "Intel i7", "memory": "16GB", "storage": "512GB SSD"}}')]
Products with CPU 'Intel i7': [(1, 'Laptop', '{"brand": "Dell", "model": "XPS 13", "specs": {"cpu": "Intel i7", "memory": "16GB", "storage": "512GB SSD"}}')]

This output shows the products that match the specified criteria.

Update JSON Values

In this step, you'll learn how to update values within the JSON fields.

Open the json_extractor.py file.

nano json_extractor.py

Modify the json_extractor.py file to include functions for updating the JSON and the database:

import sqlite3
import json

def json_extract(json_str, path):
    try:
        json_data = json.loads(json_str)
        path_components = path.split('.')
        value = json_data
        for component in path_components:
            value = value.get(component)
        return value
    except (json.JSONDecodeError, AttributeError, TypeError):
        return None

def connect_db(db_path):
    conn = sqlite3.connect(db_path)
    conn.create_function("json_extract", 2, json_extract)
    return conn

def filter_products(db_path, json_path, value):
    conn = connect_db(db_path)
    cursor = conn.cursor()
    query = f"SELECT * FROM products WHERE json_extract(details, '{json_path}') = '{value}'"
    cursor.execute(query)
    results = cursor.fetchall()
    conn.close()
    return results

def update_json_value(json_str, path, new_value):
    try:
        json_data = json.loads(json_str)
        path_components = path.split('.')
        target = json_data
        for i in range(len(path_components) - 1):
            target = target.get(path_components[i])

        target[path_components[-1]] = new_value
        return json.dumps(json_data)
    except (json.JSONDecodeError, AttributeError, TypeError):
        return json_str  ## Return original if update fails

def update_product_details(db_path, product_name, json_path, new_value):
    conn = sqlite3.connect(db_path)
    cursor = conn.cursor()

    ## Get the current details
    cursor.execute("SELECT details FROM products WHERE name = ?", (product_name,))
    result = cursor.fetchone()
    if not result:
        conn.close()
        return False

    current_details = result[0]

    ## Update the JSON
    updated_details = update_json_value(current_details, json_path, new_value)

    ## Update the database
    cursor.execute("UPDATE products SET details = ? WHERE name = ?", (updated_details, product_name))
    conn.commit()
    conn.close()
    return True

if __name__ == '__main__':
    ## Example usage:
    update_product_details('mydatabase.db', 'Laptop', 'specs.memory', '32GB')
    print("Laptop memory updated to 32GB")

    ## Verify the update
    conn = sqlite3.connect('mydatabase.db')
    cursor = conn.cursor()
    cursor.execute("SELECT details FROM products WHERE name = 'Laptop'")
    updated_details = cursor.fetchone()[0]
    print("Updated Laptop details:", updated_details)
    conn.close()

This code adds two functions:

  • update_json_value: This function takes a JSON string, a path, and a new value as input. It parses the JSON string, updates the value at the specified path, and returns the updated JSON string.
  • update_product_details: This function takes a database path, a product name, a JSON path, and a new value as input. It connects to the database, retrieves the current JSON data for the product, updates the value at the specified path using update_json_value, and then updates the database with the modified JSON data.

Save the file and exit nano.

Now, run the Python script.

python3 json_extractor.py

Expected Output:

Laptop memory updated to 32GB
Updated Laptop details: {"brand": "Dell", "model": "XPS 13", "specs": {"cpu": "Intel i7", "memory": "32GB", "storage": "512GB SSD"}}

This output confirms that the Laptop's memory has been updated to 32GB in the database.

Summary

In this lab, you have learned how to process JSON data within SQLite. You started by creating a database and a table to store JSON data. Then, you learned how to insert JSON data into the table. You created a custom Python function to extract specific fields from the JSON data and used this function to filter data based on values within the JSON fields. Finally, you learned how to update values within the JSON fields using a custom Python function. These skills provide a foundation for managing JSON data effectively within SQLite databases.