Data transformation is a core step in most data processing work, and generators provide an elegant, memory-efficient way to manipulate data streams one item at a time.
Mapping Data
def transform_data(items):
    """Lazily yield each item doubled."""
    for item in items:
        yield item * 2

numbers = [1, 2, 3, 4, 5]
doubled = list(transform_data(numbers))
print(doubled)  # [2, 4, 6, 8, 10]
Filtering Data
def filter_even_numbers(items):
    """Lazily yield only the even items."""
    for item in items:
        if item % 2 == 0:
            yield item

numbers = [1, 2, 3, 4, 5, 6]
even_nums = list(filter_even_numbers(numbers))
print(even_nums)  # [2, 4, 6]
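For simple cases, the same mapping and filtering can be written inline as generator expressions. The following is a minimal sketch that mirrors `transform_data` and `filter_even_numbers`; the variable names are illustrative only.

```python
numbers = [1, 2, 3, 4, 5, 6]

# Generator expression: doubles each item lazily, like transform_data
doubled = (n * 2 for n in numbers)

# Generator expression with a condition: keeps even items, like filter_even_numbers
evens = (n for n in numbers if n % 2 == 0)

# Nothing is computed until the generators are consumed
doubled_list = list(doubled)
evens_list = list(evens)

print(doubled_list)  # [2, 4, 6, 8, 10, 12]
print(evens_list)    # [2, 4, 6]
```

A named generator function tends to be the better choice once the logic needs more than one expression or deserves a docstring.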
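Chaining Generators
def multiply(items, factor):
    """Lazily yield each item multiplied by factor."""
    for item in items:
        yield item * factor

def add_offset(items, offset):
    """Lazily yield each item increased by offset."""
    for item in items:
        yield item + offset

numbers = [1, 2, 3, 4, 5]
# Each generator consumes the one inside it: multiply first, then add
result = list(add_offset(multiply(numbers, 2), 10))
print(result)  # [12, 14, 16, 18, 20]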
Pipeline flow: Input Data -> Generator 1 -> Generator 2 -> Generator 3 -> Final Output
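To see how items move through such a pipeline, the sketch below records each step in a shared list. The `source`, `traced_multiply`, and `traced_add_offset` functions and the `log` parameter are hypothetical helpers added for illustration; they are not part of the examples above.

```python
def source(items, log):
    # Stage 1: hand out the raw items one at a time
    for item in items:
        log.append(f"read {item}")
        yield item

def traced_multiply(items, factor, log):
    # Stage 2: multiply each item as it arrives
    for item in items:
        log.append(f"multiply {item}")
        yield item * factor

def traced_add_offset(items, offset, log):
    # Stage 3: add an offset to each item as it arrives
    for item in items:
        log.append(f"offset {item}")
        yield item + offset

log = []
pipeline = traced_add_offset(traced_multiply(source([1, 2], log), 2, log), 10, log)

# Building the pipeline runs no stage code yet
steps_before = len(log)

result = list(pipeline)
print(steps_before)  # 0
print(result)        # [12, 14]
print(log)
# ['read 1', 'multiply 1', 'offset 2', 'read 2', 'multiply 2', 'offset 4']
```

The log shows that the first item passes through every stage before the second item is read, so no stage builds an intermediate list.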
Aggregation with Generators
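def group_by_key(items):
    """Collect values under their key.

    This is a regular function, not a generator: grouping needs the
    whole input, so it consumes the iterable and returns a dict.
    """
    groups = {}
    for key, value in items:
        # Create the list on first sight of a key, then append
        groups.setdefault(key, []).append(value)
    return groups

data = [('a', 1), ('b', 2), ('a', 3), ('b', 4)]
grouped = group_by_key(data)
print(grouped)  # {'a': [1, 3], 'b': [2, 4]}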
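`group_by_key` has to read the entire input before it can return. When intermediate results are useful, an aggregation can stay lazy by yielding its running state instead. The sketch below uses a hypothetical `running_total` helper to show the idea.

```python
def running_total(items):
    """Yield the cumulative sum after each item."""
    total = 0
    for item in items:
        total += item
        yield total

totals = list(running_total([1, 2, 3, 4]))
print(totals)  # [1, 3, 6, 10]
```

The standard library's `itertools.accumulate` offers the same behavior for sums and other binary operations.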
| Technique | Memory Usage | Processing Speed |
|---|---|---|
| List Comprehension | High | Moderate |
| Generator Expression | Low | Fast |
| Custom Generator | Flexible | Efficient |
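The memory difference in the table can be checked roughly with `sys.getsizeof`. This is a sketch only: `getsizeof` reports the size of the container object itself, not of the elements it refers to, and the exact byte counts vary by Python version and platform.

```python
import sys

n = 100_000

# The list comprehension builds all n results up front
squares_list = [x * x for x in range(n)]

# The generator expression stores only its own small state
squares_gen = (x * x for x in range(n))

list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)
print(gen_size < list_size)  # True

# Consuming the generator gives the same values as the list
gen_total = sum(squares_gen)
list_total = sum(squares_list)
print(gen_total == list_total)  # True
```

Speed is more workload-dependent than memory, so it is worth measuring with `timeit` on representative data before relying on the ratings above.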
Practical Considerations
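- Use generators for large datasets that should not be held in memory all at once
- Chain small, single-purpose generators to build complex processing pipelines
- Leverage lazy evaluation: values are computed only when they are requested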
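Lazy evaluation is easiest to see with a stream that never ends. In the sketch below, the `squares` generator is a hypothetical example; `itertools.count` and `itertools.islice` are standard library functions.

```python
from itertools import count, islice

def squares():
    """Yield 1, 4, 9, ... without ever terminating."""
    for n in count(1):
        yield n * n

# islice pulls only the first five values, so the loop stops there
first_five = list(islice(squares(), 5))
print(first_five)  # [1, 4, 9, 16, 25]
```

Calling `list(squares())` directly would never finish, which is why the consumer, not the generator, decides how much work gets done.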
At LabEx, we emphasize the power of generators in efficient data transformation strategies.