Python Itertools Module

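The itertools module is a collection of tools for working efficiently with iterables (such as lists or dictionaries). The tools are both fast and memory-efficient.

The module standardizes a core set of fast, memory-efficient tools that are useful by themselves or in combination. Together, they form an "iterator algebra" that makes it possible to build specialized tools succinctly and efficiently in pure Python.

The itertools module is part of the standard library, but it must be imported before use. Some of the examples also use the operator module.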

import itertools
import operator

accumulate()

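Makes an iterator that returns accumulated results: each output is the function applied to the running result and the next element.

itertools.accumulate(iterable[, func])

Example:

data = [1, 2, 3, 4, 5]
# Accumulate using multiplication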
result = itertools.accumulate(data, operator.mul)
for each in result:
    print(each)

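operator.mul takes two numbers and multiplies them. The first value printed is the first element itself (1); the following values are computed like this: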

operator.mul(1, 2)
# 2

operator.mul(2, 3)
# 6

operator.mul(6, 4)
# 24

operator.mul(24, 5)
# 120

Passing a function is optional:

data = [5, 2, 6, 4, 5, 9, 1]
# Accumulate without a function; the default is addition
result = itertools.accumulate(data)
for each in result:
    print(each)

If no function is specified, the elements are summed:

5
5 + 2 = 7
7 + 6 = 13
13 + 4 = 17
17 + 5 = 22
22 + 9 = 31
31 + 1 = 32
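
Any two-argument function can be used for the accumulation, not only arithmetic operators. The sketch below is an illustration only: the variable names are made up for this example, and the initial keyword assumes Python 3.8 or later.

```python
import itertools

data = [5, 2, 6, 4, 5, 9, 1]

# Running maximum: max() receives the accumulated value and the next element
running_max = list(itertools.accumulate(data, max))
print(running_max)  # [5, 5, 6, 6, 6, 9, 9]

# Running total that starts from 100 (the initial keyword needs Python 3.8+)
running_total = list(itertools.accumulate(data, initial=100))
print(running_total)  # [100, 105, 107, 113, 117, 122, 131, 132]
```

With initial, the output has one more element than the input, because the starting value is emitted first.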

combinations()

Takes an iterable and an integer $r$, and produces every unique combination of $r$ elements.

itertools.combinations(iterable, r)

Example:

shapes = ['circle', 'triangle', 'square']
# Generate all 2-element combinations
result = itertools.combinations(shapes, 2)
for each in result:
    print(each)

('circle', 'triangle')
('circle', 'square')
('triangle', 'square')
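
As a quick sanity check, the number of combinations can be compared against the binomial coefficient. This is a small sketch, assuming Python 3.8+ for math.comb; the four-element shapes list is chosen only for illustration.

```python
import itertools
import math

shapes = ['circle', 'triangle', 'square', 'pentagon']

pairs = list(itertools.combinations(shapes, 2))
print(len(pairs))  # 6

# math.comb (Python 3.8+) gives "n choose r" without generating the tuples
assert len(pairs) == math.comb(len(shapes), 2)

# Elements keep their input order inside each tuple,
# so the reversed pair never appears
print(('circle', 'triangle') in pairs)  # True
print(('triangle', 'circle') in pairs)  # False
```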

combinations_with_replacement()

Like combinations(), but individual elements may be repeated more than once.

itertools.combinations_with_replacement(iterable, r)

Example:

shapes = ['circle', 'triangle', 'square']
# Generate combinations that allow repeated elements
result = itertools.combinations_with_replacement(shapes, 2)
for each in result:
    print(each)

('circle', 'circle')
('circle', 'triangle')
('circle', 'square')
('triangle', 'triangle')
('triangle', 'square')
('square', 'square')

count()

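Makes an iterator that returns evenly spaced values, beginning with the number start and advancing by a fixed step.

itertools.count(start=0, step=1)

Example:

# Count from 10 in steps of 3; prints 10, 13, 16, 19, 22 before the break
for i in itertools.count(10, 3):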
    print(i)
    if i > 20:
        break
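
Because count() never stops on its own, it is usually paired with something that ends the iteration. A minimal sketch of two common pairings follows; the variable names are illustrative.

```python
import itertools

shapes = ['circle', 'triangle', 'square']

# zip() stops at the shortest input, so the infinite counter is cut off
numbered = list(zip(itertools.count(1), shapes))
print(numbered)  # [(1, 'circle'), (2, 'triangle'), (3, 'square')]

# islice() takes a fixed number of values from the infinite counter
first_five = list(itertools.islice(itertools.count(10, 3), 5))
print(first_five)  # [10, 13, 16, 19, 22]
```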

cycle()

This function cycles through an iterable endlessly.

itertools.cycle(iterable)

Example:

colors = ['red', 'orange', 'yellow', 'green', 'blue', 'violet']
# Cycle through the colors forever (this loop never ends on its own)
for color in itertools.cycle(colors):
    print(color)

red
orange
yellow
green
blue
violet
red
orange

When the end of the iterable is reached, it starts over from the beginning.
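
In practice the cycle has to be bounded somehow. The sketch below shows two possible ways, using islice() and zip(); the task and worker names are hypothetical.

```python
import itertools

colors = ['red', 'orange', 'yellow']

# Take only the first seven values from the endless cycle
first_seven = list(itertools.islice(itertools.cycle(colors), 7))
print(first_seven)
# ['red', 'orange', 'yellow', 'red', 'orange', 'yellow', 'red']

# Round-robin assignment: zip() stops when the tasks run out
tasks = ['a', 'b', 'c', 'd', 'e']
workers = ['w1', 'w2']
assignments = list(zip(tasks, itertools.cycle(workers)))
print(assignments)
# [('a', 'w1'), ('b', 'w2'), ('c', 'w1'), ('d', 'w2'), ('e', 'w1')]
```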

chain()

Takes a series of iterables and returns them as one long iterator.

itertools.chain(*iterables)

Example:

colors = ['red', 'orange', 'yellow', 'green', 'blue']
shapes = ['circle', 'triangle', 'square', 'pentagon']
# Chain several iterables into one
result = itertools.chain(colors, shapes)
for each in result:
    print(each)

red
orange
yellow
green
blue
circle
triangle
square
pentagon
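
When the iterables are already collected in a single container, chain.from_iterable() does the same job without unpacking. A minimal sketch, with a made-up nested list:

```python
import itertools

nested = [['red', 'orange'], ['circle'], [], ['square', 'pentagon']]

# Flatten one level of nesting; empty inner lists contribute nothing
flat = list(itertools.chain.from_iterable(nested))
print(flat)  # ['red', 'orange', 'circle', 'square', 'pentagon']
```

Only one level is flattened; deeper nesting would be left as-is.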

compress()

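Filters one iterable using another: an element of data is kept only when the corresponding element of selectors is true.

itertools.compress(data, selectors)

Example:

shapes = ['circle', 'triangle', 'square', 'pentagon']
selections = [True, False, True, False]
# Filter the shapes according to the boolean selections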
result = itertools.compress(shapes, selections)
for each in result:
    print(each)

circle
square
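
The selectors do not have to be literal booleans; any truthy or falsy values work, and they can be computed on the fly. The sides list below is a hypothetical companion to the shapes list, added only for this sketch.

```python
import itertools

shapes = ['circle', 'triangle', 'square', 'pentagon']
sides = [0, 3, 4, 5]  # hypothetical number of sides for each shape

# 0 is falsy, so 'circle' is dropped
with_sides = list(itertools.compress(shapes, sides))
print(with_sides)  # ['triangle', 'square', 'pentagon']

# Selectors computed lazily from a generator expression
more_than_three = list(itertools.compress(shapes, (n > 3 for n in sides)))
print(more_than_three)  # ['square', 'pentagon']
```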

dropwhile()

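Makes an iterator that drops elements from the iterable as long as the predicate is true; once the predicate first returns false, every remaining element is returned, whether or not it satisfies the predicate.

itertools.dropwhile(predicate, iterable)

Example:

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1]
# Drop elements while the condition is true, then return the rest: 5, 6, 7, 8, 9, 10, 1
result = itertools.dropwhile(lambda x: x < 5, data)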
for each in result:
    print(each)
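
One typical use is skipping a leading run of items, such as header lines. The sketch below is illustrative only; the lines list is invented for the example.

```python
import itertools

lines = ['# header', '# more header', 'a = 1', '# inline comment', 'b = 2']

# Skip the leading comment lines only; later comment lines are kept,
# because the predicate is no longer checked after the first failure
body = list(itertools.dropwhile(lambda line: line.startswith('#'), lines))
print(body)  # ['a = 1', '# inline comment', 'b = 2']
```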

filterfalse()

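Makes an iterator that filters elements from the iterable, returning only those for which the predicate is false.

itertools.filterfalse(predicate, iterable)

Example:

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1]
# Return the elements for which the predicate is false: 5, 6, 7, 8, 9, 10
result = itertools.filterfalse(lambda x: x < 5, data)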
for each in result:
    print(each)
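
Together with the built-in filter(), filterfalse() can split a sequence into two parts. A minimal sketch follows; is_small is a helper name introduced here for illustration.

```python
import itertools

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1]

def is_small(x):
    return x < 5

small = list(filter(is_small, data))                 # predicate is true
large = list(itertools.filterfalse(is_small, data))  # predicate is false

print(small)  # [1, 2, 3, 4, 1]
print(large)  # [5, 6, 7, 8, 9, 10]
```

Unlike dropwhile(), the predicate is tested on every element, so the trailing 1 ends up in small rather than in large.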

groupby()

Put simply, this function groups consecutive elements that share the same key.

itertools.groupby(iterable, key=None)

Example:

robots = [
    {"name": "blaster", "faction": "autobot"},
    {"name": "galvatron", "faction": "decepticon"},
    {"name": "jazz", "faction": "autobot"},
    {"name": "metroplex", "faction": "autobot"},
    {"name": "megatron", "faction": "decepticon"},
    {"name": "starcream", "faction": "decepticon"},
]
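# Group the robots by faction. groupby() only groups consecutive items, so the
# input must be sorted by the key to get one group per key; this list is unsorted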
for key, group in itertools.groupby(robots, key=lambda x: x['faction']):
    print(key)
    print(list(group))

autobot
[{'name': 'blaster', 'faction': 'autobot'}]
decepticon
[{'name': 'galvatron', 'faction': 'decepticon'}]
autobot
[{'name': 'jazz', 'faction': 'autobot'}, {'name': 'metroplex', 'faction': 'autobot'}]
decepticon
[{'name': 'megatron', 'faction': 'decepticon'}, {'name': 'starcream', 'faction': 'decepticon'}]
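
Because the list above is not sorted by faction, each faction shows up in more than one group. To get a single group per faction, the data can be sorted with the same key first. This is a sketch of one way to do that; using operator.itemgetter and collecting names into a dict are choices made for this example.

```python
import itertools
import operator

robots = [
    {"name": "blaster", "faction": "autobot"},
    {"name": "galvatron", "faction": "decepticon"},
    {"name": "jazz", "faction": "autobot"},
    {"name": "metroplex", "faction": "autobot"},
    {"name": "megatron", "faction": "decepticon"},
    {"name": "starcream", "faction": "decepticon"},
]

by_faction = operator.itemgetter("faction")

# Sort by the grouping key first, then group; sorted() is stable,
# so robots keep their original order inside each faction
grouped = {
    faction: [robot["name"] for robot in members]
    for faction, members in itertools.groupby(sorted(robots, key=by_faction), key=by_faction)
}
print(grouped)
# {'autobot': ['blaster', 'jazz', 'metroplex'],
#  'decepticon': ['galvatron', 'megatron', 'starcream']}
```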

islice()

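This function works much like slicing: it lets you cut out a piece of an iterable. Unlike regular slicing, it works on any iterable and does not support negative values for start, stop, or step.

itertools.islice(iterable, stop)
itertools.islice(iterable, start, stop[, step])

Example:

colors = ['red', 'orange', 'yellow', 'green', 'blue']
# Slice the iterable to get the first 2 elements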
few_colors = itertools.islice(colors, 2)
for each in few_colors:
    print(each)

red
orange
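
The start, stop, and step arguments behave like their slice counterparts, and islice() also works on iterators that cannot be indexed. A short sketch, with illustrative variable names:

```python
import itertools

colors = ['red', 'orange', 'yellow', 'green', 'blue']

# Elements at positions 1, 2 and 3 (stop is exclusive)
middle = list(itertools.islice(colors, 1, 4))
print(middle)  # ['orange', 'yellow', 'green']

# Every second element; stop=None means "until the end"
every_other = list(itertools.islice(colors, 0, None, 2))
print(every_other)  # ['red', 'yellow', 'blue']

# Generators cannot be sliced with [], but islice() handles them
squares = (n * n for n in itertools.count())
first_squares = list(itertools.islice(squares, 5))
print(first_squares)  # [0, 1, 4, 9, 16]
```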

permutations()

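Returns successive permutations of the elements in the iterable, each of length $r$. If $r$ is omitted, it defaults to the length of the iterable.

itertools.permutations(iterable, r=None)

Example:

alpha_data = ['a', 'b', 'c']
# Generate all permutations of the elements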
result = itertools.permutations(alpha_data)
for each in result:
    print(each)

('a', 'b', 'c')
('a', 'c', 'b')
('b', 'a', 'c')
('b', 'c', 'a')
('c', 'a', 'b')
('c', 'b', 'a')
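
Passing r produces shorter permutations. The sketch below uses r=2 on the same three letters; unlike combinations(), order matters, so both ('a', 'b') and ('b', 'a') appear.

```python
import itertools

alpha_data = ['a', 'b', 'c']

# Ordered arrangements of 2 elements out of 3
pairs = list(itertools.permutations(alpha_data, 2))
print(pairs)
# [('a', 'b'), ('a', 'c'), ('b', 'a'), ('b', 'c'), ('c', 'a'), ('c', 'b')]
```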

product()

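Creates the Cartesian product of a series of iterables.

itertools.product(*iterables, repeat=1)

Example:

num_data = [1, 2, 3]
alpha_data = ['a', 'b', 'c']
# Generate the Cartesian product of the iterables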
result = itertools.product(num_data, alpha_data)
for each in result:
    print(each)

(1, 'a')
(1, 'b')
(1, 'c')
(2, 'a')
(2, 'b')
(2, 'c')
(3, 'a')
(3, 'b')
(3, 'c')
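
The product is equivalent to nested for loops, and the repeat keyword multiplies an iterable with itself. A minimal sketch of both points:

```python
import itertools

num_data = [1, 2, 3]
alpha_data = ['a', 'b', 'c']

# product() yields the same tuples, in the same order, as nested loops
nested = [(n, a) for n in num_data for a in alpha_data]
assert nested == list(itertools.product(num_data, alpha_data))

# repeat=2 is the same as product([0, 1], [0, 1])
bit_pairs = list(itertools.product([0, 1], repeat=2))
print(bit_pairs)  # [(0, 0), (0, 1), (1, 0), (1, 1)]
```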

repeat()

This function repeats an object over and over. It runs indefinitely unless the times argument is specified.

itertools.repeat(object[, times])

Example:

# Repeat the object 3 times
for i in itertools.repeat("spam", 3):
    print(i)

spam
spam
spam
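
Without times, repeat() is handy as a constant argument stream for map() or zip(), which stop when their shortest input ends. A small sketch of that pattern:

```python
import itertools

# Square 0..4 by pairing each number with a constant exponent of 2;
# map() stops when range(5) is exhausted, so the endless repeat is safe
squares = list(map(pow, range(5), itertools.repeat(2)))
print(squares)  # [0, 1, 4, 9, 16]
```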

starmap()

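Makes an iterator that computes the function using arguments taken from the iterable. Each element of the iterable is unpacked into the function's arguments.

itertools.starmap(function, iterable)

Example:

data = [(2, 6), (8, 4), (7, 3)]
# Apply the function to the arguments unpacked from each tuple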
result = itertools.starmap(operator.mul, data)
for each in result:
    print(each)

12
32
21
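
The difference from map() is only in how the arguments arrive: starmap() expects them already grouped into tuples, while map() takes one iterable per argument. A sketch comparing the two, with made-up data:

```python
import itertools

pairs = [(2, 5), (3, 2), (10, 3)]

# starmap: each tuple is unpacked as pow(base, exponent)
powers = list(itertools.starmap(pow, pairs))
print(powers)  # [32, 9, 1000]

# map: the same arguments, supplied as separate parallel iterables
bases = [2, 3, 10]
exponents = [5, 2, 3]
powers_with_map = list(map(pow, bases, exponents))
print(powers_with_map)  # [32, 9, 1000]
```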

takewhile()

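The opposite of dropwhile(). Makes an iterator that returns elements from the iterable as long as the predicate is true, and stops at the first element for which it is false.

itertools.takewhile(predicate, iterable)

Example:

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1]
# Take elements while the condition is true, then stop: 1, 2, 3, 4
result = itertools.takewhile(lambda x: x < 5, data)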
for each in result:
    print(each)
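
With the same predicate, takewhile() and dropwhile() split a sequence at the first element that fails the test. A minimal sketch; is_small is a helper name introduced for this example.

```python
import itertools

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1]

def is_small(x):
    return x < 5

head = list(itertools.takewhile(is_small, data))
tail = list(itertools.dropwhile(is_small, data))

print(head)  # [1, 2, 3, 4]
print(tail)  # [5, 6, 7, 8, 9, 10, 1]

# The two pieces together reproduce the original list
assert head + tail == data
```

The trailing 1 satisfies the predicate but is not in head, because takewhile() stops for good at the first failure.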

tee()

Returns $n$ independent iterators from a single iterable.

itertools.tee(iterable, n=2)

Example:

colors = ['red', 'orange', 'yellow', 'green', 'blue']
# Split the iterable into two independent iterators
alpha_colors, beta_colors = itertools.tee(colors)
for each in alpha_colors:
    print(each)

red
orange
yellow
green
blue

colors = ['red', 'orange', 'yellow', 'green', 'blue']
alpha_colors, beta_colors = itertools.tee(colors)
for each in beta_colors:
    print(each)

red
orange
yellow
green
blue
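
Since the returned iterators advance independently, one of them can be moved ahead of the other. The sketch below uses that to pair each element with its successor; the variable names are illustrative.

```python
import itertools

colors = ['red', 'orange', 'yellow', 'green', 'blue']

current, following = itertools.tee(colors)
next(following, None)  # advance the second iterator by one element

# zip() stops when the shorter iterator (following) runs out
adjacent_pairs = list(zip(current, following))
print(adjacent_pairs)
# [('red', 'orange'), ('orange', 'yellow'), ('yellow', 'green'), ('green', 'blue')]
```

When tee() is given an iterator rather than a list, the original iterator should not be used afterwards, since consuming it directly would make the tee copies miss elements.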

zip_longest()

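Makes an iterator that aggregates elements from each of the iterables. If the iterables are of uneven length, missing values are filled in with fillvalue. Iteration continues until the longest iterable is exhausted.

itertools.zip_longest(*iterables, fillvalue=None)

Example:

colors = ['red', 'orange', 'yellow', 'green', 'blue']
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
# Zip the iterables, filling missing values with None
for each in itertools.zip_longest(colors, data, fillvalue=None):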
    print(each)

('red', 1)
('orange', 2)
('yellow', 3)
('green', 4)
('blue', 5)
(None, 6)
(None, 7)
(None, 8)
(None, 9)
(None, 10)
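
For comparison, the built-in zip() stops at the shortest input, and fillvalue can be any placeholder. A short sketch with shortened lists and an arbitrary 'unknown' placeholder:

```python
import itertools

colors = ['red', 'orange']
data = [1, 2, 3, 4]

# zip_longest pads the shorter input with the chosen fillvalue
padded = list(itertools.zip_longest(colors, data, fillvalue='unknown'))
print(padded)
# [('red', 1), ('orange', 2), ('unknown', 3), ('unknown', 4)]

# The built-in zip() simply stops at the shortest input
truncated = list(zip(colors, data))
print(truncated)  # [('red', 1), ('orange', 2)]
```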