Python re Module
The re module lets you search, match, split, and replace text using regular expressions.
import re
Regular expressions are patterns for text. Use raw strings, such as r'\d+', so backslashes are passed to the regular expression engine correctly.
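As a rough illustration of why the raw-string prefix matters, the sketch below uses the word-boundary token \b on a made-up sample string. Without the r prefix, Python converts '\b' into a backspace character before the regular expression engine sees it, so the pattern no longer means what it appears to mean.

```python
import re

text = 'a cat sat'  # made-up sample text

# In a normal string literal, '\b' is a single backspace character,
# so this pattern looks for backspace + 'cat' + backspace.
plain = re.search('\bcat\b', text)

# In a raw string the backslash is preserved, and the engine
# interprets \b as a word boundary.
raw = re.search(r'\bcat\b', text)

print(plain)        # None
print(raw.group())  # cat
```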
search()
search scans the string and returns a Match object for the first match anywhere in it, or None if nothing matches.
import re
match = re.search(r'\d+', 'Order #12345 shipped')
print(match.group())
12345
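The Match object carries more than the matched text. The following sketch reuses the same sample string to show where the match was found, and contrasts search with re.match, which only tries at the start of the string.

```python
import re

text = 'Order #12345 shipped'
match = re.search(r'\d+', text)

# A Match object records where in the string the match was found.
print(match.start())  # 7
print(match.end())    # 12
print(match.span())   # (7, 12)

# re.match only tries at the beginning of the string, so it finds nothing here.
at_start = re.match(r'\d+', text)
print(at_start)       # None
```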
findall()
findall scans the string from left to right and returns all non-overlapping matches as a list.
import re
emails = re.findall(r'[\w.-]+@[\w.-]+', 'a@example.com b@example.com')
print(emails)
['a@example.com', 'b@example.com']
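One detail that often surprises people: when the pattern contains capturing groups, findall returns the groups rather than the whole match. A small sketch with made-up addresses:

```python
import re

text = 'a@example.com b@example.org'  # made-up sample addresses

# No groups: a list of whole matches.
whole = re.findall(r'[\w.-]+@[\w.-]+', text)

# Two groups: a list of (user, domain) tuples, one per match.
parts = re.findall(r'([\w.-]+)@([\w.-]+)', text)

print(whole)  # ['a@example.com', 'b@example.org']
print(parts)  # [('a', 'example.com'), ('b', 'example.org')]
```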
sub()
sub replaces every non-overlapping match of the pattern with a replacement string and returns the new string.
import re
message = re.sub(r'\s+', ' ', 'too    many   spaces')
print(message)
too many spaces
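The replacement does not have to be a fixed string. The sketch below, using made-up inputs, shows three common variations: backreferences to captured groups, the count argument, and a function as the replacement.

```python
import re

# \1 and \2 in the replacement refer to the captured groups.
swapped = re.sub(r'(\w+)=(\w+)', r'\2=\1', 'key=value')
print(swapped)     # value=key

# count limits how many matches are replaced.
first_only = re.sub(r'\d', '#', 'a1b2c3', count=1)
print(first_only)  # a#b2c3

# A function receives each Match object and returns the replacement text.
doubled = re.sub(r'\d+', lambda m: str(int(m.group()) * 2), '3 apples, 10 pears')
print(doubled)     # 6 apples, 20 pears
```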
Compiling patterns
Compiled patterns are useful when you reuse the same expression.
import re
pattern = re.compile(r'^python', re.IGNORECASE)
print(bool(pattern.match('Python Cheatsheet')))
True
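To make the reuse concrete, here is a minimal sketch that compiles the same pattern once and applies it to several made-up lines. The lines list is an assumption for illustration only.

```python
import re

# Compile once, then reuse the pattern object for every line.
pattern = re.compile(r'^python', re.IGNORECASE)

lines = ['Python Cheatsheet', 'python tips', 'Learning Python']  # made-up sample data
matching = [line for line in lines if pattern.match(line)]
print(matching)  # ['Python Cheatsheet', 'python tips']

# Compiled patterns offer the same operations as the module-level functions.
replaced = pattern.sub('Py', 'PYTHON rocks')
print(replaced)  # Py rocks
```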
Capturing groups
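Parentheses capture part of a match. group(1), group(2), and so on return the captured text in order, and group(0) or group() returns the whole match.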
import re
match = re.search(r'(\w+)=(\d+)', 'count=42')
print(match.group(1))
print(match.group(2))
count
42
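Groups can also be given names, which tends to read better than numeric indexes once a pattern has more than a couple of groups. In this sketch the names key and value are arbitrary choices for illustration.

```python
import re

# (?P<name>...) defines a named group; the names here are illustrative.
match = re.search(r'(?P<key>\w+)=(?P<value>\d+)', 'count=42')

print(match.group(0))      # count=42
print(match.groups())      # ('count', '42')
print(match.group('key'))  # count
print(match.groupdict())   # {'key': 'count', 'value': '42'}
```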
Handling no match
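Always check whether a match exists before calling group(). search() and match() return None when nothing matches, and calling group() on None raises AttributeError.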
import re
match = re.search(r'\d+', 'no number here')
if match:
    print(match.group())
else:
    print('No match')
No match
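One way to keep this check in a single place is to wrap it in a small helper. The function first_number below is a hypothetical name used only for this sketch; it is not part of the re module.

```python
import re

def first_number(text, default=None):
    """Return the first run of digits in text as an int, or default if none."""
    match = re.search(r'\d+', text)
    # match is None when nothing was found, so test it before calling group().
    return int(match.group()) if match else default

print(first_number('Order #12345 shipped'))  # 12345
print(first_number('no number here'))        # None
print(first_number('no number here', 0))     # 0
```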