Practical Examples of Frequency Analysis
In this section, we'll explore some practical examples of using sets to perform frequency analysis in Python.
Example 1: Analyzing Word Frequencies in a Text
Let's say we have a text file containing a short story, and we want to analyze the frequency of words in the text.
# Read the text file
with open('story.txt', 'r') as file:
    text = file.read().lower().split()

# Count the frequency of words using sets
word_frequencies = {}
for word in set(text):
    word_frequencies[word] = text.count(word)

# Sort the words by frequency in descending order
sorted_frequencies = sorted(word_frequencies.items(), key=lambda x: x[1], reverse=True)

# Print the top 10 most frequent words
print("Top 10 Most Frequent Words:")
for word, frequency in sorted_frequencies[:10]:
    print(f"{word}: {frequency}")
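This code outputs the 10 most frequent words in the text file, along with their frequencies. Because the text is split on whitespace only, punctuation stays attached to words, so "end" and "end." are counted as different words. Calling text.count() once per unique word also rescans the whole list each time, which becomes slow on long texts.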
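As one possible alternative, the counting step can be done in a single pass with collections.Counter from the standard library. The sketch below is a minimal illustration, not the tutorial's original method: the sample_text string is made up and stands in for the contents of story.txt, and stripping punctuation with string.punctuation is an added assumption about how words should be normalized.

```python
from collections import Counter
import string

# Made-up sample text standing in for the contents of story.txt
sample_text = "The cat sat. The cat ran! A dog sat, too."

# Lowercase, split on whitespace, and strip surrounding punctuation from each token
words = [w.strip(string.punctuation) for w in sample_text.lower().split()]
words = [w for w in words if w]  # drop tokens that were only punctuation

# Counter tallies every word in one pass over the list
word_counts = Counter(words)

# most_common(n) returns the n highest counts as (word, count) pairs
top_three = word_counts.most_common(3)
for word, frequency in top_three:
    print(f"{word}: {frequency}")
```

A set is still useful here for questions such as "how many distinct words are there?" (len(set(words))), while Counter handles the tallying.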
Example 2: Identifying Unique User IDs in a Log File
Suppose you have a log file containing user activity, and you want to find the unique user IDs in the file.
# Read the log file, taking the first comma-separated field of each non-blank line
with open('activity_log.txt', 'r') as file:
    user_ids = [line.strip().split(',')[0] for line in file if line.strip()]

# Convert the list of user IDs to a set to get the unique IDs
unique_user_ids = set(user_ids)

# Print the unique user IDs (sorted, because sets have no defined order)
print("Unique User IDs:")
for user_id in sorted(unique_user_ids):
    print(user_id)
This code outputs each unique user ID found in the log file, assuming the user ID is the first comma-separated field on each line.
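If you also want to know how active each user is, the same parsing step can feed a frequency count. The following is a rough sketch under stated assumptions: the log contents are hypothetical and held in an in-memory io.StringIO object instead of activity_log.txt, and the "user_id,action" line format is assumed for illustration.

```python
from collections import Counter
import io

# Hypothetical log contents; a real script would open activity_log.txt instead
sample_log = io.StringIO(
    "u1,login\n"
    "u2,login\n"
    "u1,view\n"
    "\n"
    "u3,login\n"
    "u1,logout\n"
)

# Take the first comma-separated field of each non-blank line as the user ID
user_ids = [line.split(",")[0].strip() for line in sample_log if line.strip()]

# Count events per user; the Counter's keys are exactly the unique IDs
activity_counts = Counter(user_ids)
unique_user_ids = set(activity_counts)

# List users from most to least active
for user_id, count in activity_counts.most_common():
    print(f"{user_id}: {count}")
```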
Example 3: Detecting Anomalies in Sensor Data
Imagine you have a dataset of sensor readings, and you want to flag readings that occur unusually rarely as potential anomalies.
# Assume we have a list of sensor readings
sensor_data = [10, 12, 15, 8, 20, 11, 9, 18, 14, 13, 22, 10]

# Convert the sensor data to a set to get the unique readings
unique_readings = set(sensor_data)

# Count how often each unique reading occurs
frequencies = {reading: sensor_data.count(reading) for reading in unique_readings}
for reading, frequency in frequencies.items():
    print(f"Reading {reading} appears {frequency} times.")

# Detect anomalies (readings that appear only once)
anomalies = [reading for reading, frequency in frequencies.items() if frequency == 1]
print("\nAnomalous Readings:")
for anomaly in anomalies:
    print(anomaly)
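This code first prints the frequency of each unique sensor reading, and then lists the readings that appear only once in the dataset. Treat this rule as a rough heuristic: in the sample data every value except 10 appears exactly once, so 10 of the 11 unique readings are flagged. A frequency-of-one test is more meaningful on larger datasets where normal values repeat often.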
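One way to make the check stricter is to combine the frequency test with an explicit range test using set operations. The sketch below is only an illustration: the NORMAL_MIN and NORMAL_MAX bounds are hypothetical values chosen for this example, not thresholds given by the dataset.

```python
from collections import Counter

sensor_data = [10, 12, 15, 8, 20, 11, 9, 18, 14, 13, 22, 10]

# Hypothetical bounds, chosen for illustration only
NORMAL_MIN, NORMAL_MAX = 8, 20

# Tally each reading in a single pass
counts = Counter(sensor_data)

# Unique readings that fall outside the assumed normal range
out_of_range = {r for r in counts if not NORMAL_MIN <= r <= NORMAL_MAX}

# Unique readings that occur exactly once
rare = {r for r, c in counts.items() if c == 1}

# Set intersection keeps only readings that are both rare and out of range
suspicious = out_of_range & rare

print("Out of range:", sorted(out_of_range))
print("Rare and out of range:", sorted(suspicious))
```

With these assumed bounds, only the reading 22 is both rare and out of range, which narrows the result considerably compared with the frequency test alone.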
By exploring these practical examples, you can see how sets can be effectively used to perform frequency analysis and address various data processing challenges in Python.