Applying the Optimized find_indices() Function
Now that you've learned about the optimization techniques for the find_indices() function, let's explore some practical use cases.
Filtering Large Datasets
One common use case for the find_indices() function is filtering large datasets by specific criteria. For example, imagine you have a dataset of customer information and need the indices of customers who live in a certain city or fall within a specific age range.
## Example dataset
customer_data = [
{"name": "John Doe", "age": 35, "city": "New York"},
{"name": "Jane Smith", "age": 28, "city": "Los Angeles"},
{"name": "Bob Johnson", "age": 42, "city": "Chicago"},
{"name": "Sarah Lee", "age": 31, "city": "New York"},
{"name": "Tom Wilson", "age": 25, "city": "Los Angeles"},
]
## Find indices of customers from New York
new_york_indices = find_indices_optimized([d["city"] for d in customer_data], ["New York"])
print(new_york_indices) ## Output: [0, 3]
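## Find indices of customers aged 30 or above
## (search the unsorted ages so the results are positions in customer_data;
## sorting first would return positions in the sorted copy instead)
age_30_plus_indices = find_indices_optimized([d["age"] for d in customer_data], range(30, 101))
print(age_30_plus_indices) ## Output: [0, 2, 3]
In this example, we use find_indices_optimized() twice: once to find the indices of customers from New York, and once to find the indices of customers aged 30 or above by matching against the ages 30 through 100. We do not use find_indices_binary_search() here, because binary search needs sorted input, and sorting the ages first would yield positions in the sorted copy rather than in customer_data.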
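A binary search over the ages only gives positions in the original list if each age keeps its original index through the sort. The following is a minimal sketch of that idea, not the implementation from earlier in this tutorial: build_sorted_index() and indices_at_least() are hypothetical helper names introduced here for illustration.

```python
from bisect import bisect_left


def build_sorted_index(values):
    """Sort values once, remembering where each one came from.

    Hypothetical helper: returns (sorted_values, original_positions).
    """
    pairs = sorted((value, position) for position, value in enumerate(values))
    sorted_values = [value for value, _ in pairs]
    original_positions = [position for _, position in pairs]
    return sorted_values, original_positions


def indices_at_least(sorted_values, original_positions, threshold):
    """Return original indices of all values >= threshold, in ascending order."""
    # bisect_left finds the first slot whose value is >= threshold.
    start = bisect_left(sorted_values, threshold)
    return sorted(original_positions[start:])


ages = [35, 28, 42, 31, 25]
sorted_ages, positions = build_sorted_index(ages)
print(indices_at_least(sorted_ages, positions, 30))  # [0, 2, 3]
```

Sorting costs O(n log n), so this approach is only worthwhile when the same sorted index serves many threshold queries; for a single query, one linear pass over the unsorted list is simpler and faster.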
Analyzing Log Files
Another common use case for the find_indices() function is analyzing log files. For example, you might want to find the line numbers where specific error or warning messages appear.
## Example log file
log_data = [
"2023-04-01 10:00:00 INFO: Application started",
"2023-04-01 10:00:10 WARNING: Disk space running low",
"2023-04-01 10:00:15 ERROR: Database connection failed",
"2023-04-01 10:00:20 INFO: Processing batch job",
"2023-04-01 10:00:30 ERROR: Invalid input data",
]
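## Extract the log level (third field) from each line, e.g. "ERROR"
log_levels = [line.split()[2].rstrip(":") for line in log_data]
## Find indices of lines logged at the ERROR level
error_indices = find_indices_optimized(log_levels, ["ERROR"])
print(error_indices) ## Output: [2, 4]
## Find indices of lines logged at the WARNING level
warning_indices = find_indices_optimized(log_levels, ["WARNING"])
print(warning_indices) ## Output: [1]
In this example, we first extract the log level from each line. This assumes, as in the customer example, that find_indices_optimized() compares whole list elements against the targets: a full log line never equals "ERROR", so searching log_data directly would return no matches. We then use find_indices_optimized() on the extracted levels to find the lines logged at the ERROR and WARNING levels. Because log_levels has the same order as log_data, the returned indices are also line positions in log_data.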
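When several log levels are of interest, a single pass that groups line indices by level avoids scanning the list once per level. This is a minimal sketch under stated assumptions: index_by_level() is a hypothetical helper, not one of the functions defined earlier, and it assumes every line follows the "date time LEVEL: message" layout of the sample data.

```python
from collections import defaultdict


def index_by_level(lines):
    """Map each log level to the indices of the lines logged at that level."""
    by_level = defaultdict(list)
    for position, line in enumerate(lines):
        # Fields are: date, time, "LEVEL:", then the message.
        level = line.split(maxsplit=3)[2].rstrip(":")
        by_level[level].append(position)
    return dict(by_level)


log_data = [
    "2023-04-01 10:00:00 INFO: Application started",
    "2023-04-01 10:00:10 WARNING: Disk space running low",
    "2023-04-01 10:00:15 ERROR: Database connection failed",
    "2023-04-01 10:00:20 INFO: Processing batch job",
    "2023-04-01 10:00:30 ERROR: Invalid input data",
]

levels = index_by_level(log_data)
print(levels["ERROR"])    # [2, 4]
print(levels["WARNING"])  # [1]
```

A line with fewer than three fields would raise an IndexError in this sketch, so real log files with blank or malformed lines would need a guard before the split.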
By applying the optimized find_indices() functions, you can efficiently locate and extract relevant information from large datasets, log files, or any other list-based data. Keep in mind that the returned indices always refer to the list you pass in, so search the original list whenever you need positions in the original data.