Regular Expressions
Introduction to Regular Expressions
Regular expressions (regex) are powerful tools for pattern matching and text manipulation in Python. They provide a concise and flexible way to search, extract, and validate text based on complex patterns.
Basic Regex Syntax
Importing the Regex Module
import re
| Metacharacter | Meaning | Example |
|---|---|---|
| . | Any single character except a newline | a.b matches "acb", "a1b" |
| * | Zero or more occurrences of the preceding element | ab*c matches "ac", "abc", "abbc" |
| + | One or more occurrences of the preceding element | ab+c matches "abc", "abbc" |
| ? | Zero or one occurrence of the preceding element | colou?r matches "color", "colour" |
| ^ | Start of string | ^Hello matches "Hello world" |
| $ | End of string | world$ matches "Hello world" |
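The metacharacters in the table can be checked directly with `re.search`. The following is a minimal sketch: the `cases` list and its non-matching strings are illustrative additions, not part of the original table.

```python
import re

# Each entry: (pattern, a string that should match, a string that should not).
cases = [
    (r"a.b", "a1b", "ab"),                      # . needs exactly one character between a and b
    (r"ab*c", "ac", "adc"),                     # * allows zero or more b
    (r"ab+c", "abbc", "ac"),                    # + requires at least one b
    (r"colou?r", "color", "colouur"),           # ? allows at most one u
    (r"^Hello", "Hello world", "Say Hello"),    # ^ anchors the match to the start
    (r"world$", "Hello world", "world peace"),  # $ anchors the match to the end
]

results = []
for pattern, good, bad in cases:
    matched_good = re.search(pattern, good) is not None
    matched_bad = re.search(pattern, bad) is not None
    results.append((pattern, matched_good, matched_bad))
    print(f"{pattern!r}: {good!r} -> {matched_good}, {bad!r} -> {matched_bad}")
```

Each printed line should report a match for the first string and no match for the second.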
Regex Pattern Matching Functions
re.search(): Find First Match
text = "Welcome to LabEx Python Tutorial"
result = re.search(r"Python", text)
if result:
    print("Pattern found!")
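The object returned by `re.search()` carries more than a yes/no answer. As a short sketch using the same sample text, the snippet below reads the matched text and its position, and contrasts `re.search()` with `re.match()`, which only tries to match at the beginning of the string. The variable names `found` and `at_start` are illustrative.

```python
import re

text = "Welcome to LabEx Python Tutorial"

# re.search scans the whole string and returns the first match, or None.
found = re.search(r"Python", text)

# re.match only attempts a match at position 0, so it returns None here.
at_start = re.match(r"Python", text)

if found:
    print(found.group())  # the matched text
    print(found.span())   # (start, end) indexes of the match within text
print(at_start)
```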
re.findall(): Find All Matches
emails = "Contact us at support@example.com or info@example.com"
found_emails = re.findall(r'\S+@\S+', emails)
print(found_emails)  ## ['support@example.com', 'info@example.com']
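The shape of the result from `re.findall()` depends on whether the pattern contains capturing groups, and `re.finditer()` is available when match positions are needed as well. The sketch below uses a made-up `log` string to show all three cases.

```python
import re

log = "cpu=85 mem=62 disk=91"  # illustrative input, not from the tutorial

# Without groups, findall returns the full text of every match.
pairs = re.findall(r"\w+=\d+", log)

# With groups, findall returns one tuple of group values per match.
fields = re.findall(r"(\w+)=(\d+)", log)

# finditer yields match objects, which also expose where each match starts.
positions = [(m.group(1), m.start()) for m in re.finditer(r"(\w+)=(\d+)", log)]

print(pairs)
print(fields)
print(positions)
```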
Advanced Regex Techniques
Character Classes
## Match digits
phone_number = "Call 123-456-7890"
match = re.search(r'\d{3}-\d{3}-\d{4}', phone_number)
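Beyond `\d`, character classes can be written as bracketed sets and negated with `^` inside the brackets. The following sketch, which assumes a made-up `sample` string, shows a few common forms alongside the phone number match.

```python
import re

phone_number = "Call 123-456-7890"
match = re.search(r"\d{3}-\d{3}-\d{4}", phone_number)
if match:
    print(match.group())  # the text that matched the digit pattern

sample = "Room 42, Floor B7"  # illustrative input

digits = re.findall(r"\d+", sample)           # runs of digits
letters = re.findall(r"[A-Za-z]+", sample)    # runs of ASCII letters
upper_digit = re.findall(r"[A-Z]\d", sample)  # an uppercase letter followed by a digit
punctuation = re.findall(r"[^\w\s]", sample)  # anything that is neither word character nor whitespace

print(digits, letters, upper_digit, punctuation)
```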
Grouping and Capturing
text = "Date: 2023-06-15"
match = re.search(r'(\d{4})-(\d{2})-(\d{2})', text)
if match:
    year, month, day = match.groups()
    print(f"Year: {year}, Month: {month}, Day: {day}")
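Groups can also be given names with the `(?P<name>...)` syntax, which tends to be easier to read than positional unpacking once a pattern has several groups. A minimal sketch on the same date string follows. The group names `year`, `month`, and `day` are chosen for illustration.

```python
import re

text = "Date: 2023-06-15"
match = re.search(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})", text)

if match:
    parts = match.groupdict()    # mapping of group name to captured text
    print(parts["year"])         # access through the dictionary
    print(match.group("month"))  # access by group name
    print(match.group(3))        # named groups are still numbered, so this is the day
```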
Regex Workflow
graph TD
A[Input Text] --> B[Regex Pattern]
B --> C{Pattern Match?}
C --> |Yes| D[Extract/Process]
C --> |No| E[Handle No Match]
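The workflow in the diagram maps onto a small function: apply the pattern, branch on whether a match was found, then either process the captured values or handle the miss. The sketch below is one possible shape. The function name `extract_date` and the choice to return `None` when nothing matches are assumptions made for illustration.

```python
import re

def extract_date(text):
    """Return (year, month, day) as integers, or None when no date is found."""
    match = re.search(r"(\d{4})-(\d{2})-(\d{2})", text)  # Input Text -> Regex Pattern
    if match is None:
        return None                                      # Handle No Match
    return tuple(int(part) for part in match.groups())   # Extract/Process

print(extract_date("Date: 2023-06-15"))
print(extract_date("No date here"))
```

Returning `None` keeps the no-match branch explicit for the caller. Raising an exception would be an equally reasonable design if a missing date is an error in the surrounding program.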
Practical Examples
Email Validation
def validate_email(email):
    # The trailing $ rejects input with extra characters after the address.
    pattern = r'^[\w\.-]+@[\w\.-]+\.\w+$'
    return re.match(pattern, email) is not None

print(validate_email("user@example.com"))  ## True
print(validate_email("invalid-email"))  ## False
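How strictly a validator behaves depends on how the pattern is anchored. As a hedged sketch, the two helpers below (`loose_check` and `strict_check`, both hypothetical names) apply the same character pattern with `re.match()`, which anchors only at the start, and with `re.fullmatch()`, which requires the whole string to match. The sample addresses are placeholders, and neither helper is a complete email validator.

```python
import re

PATTERN = r"[\w.-]+@[\w.-]+\.\w+"

def loose_check(email):
    # re.match anchors only at the start, so trailing text is ignored.
    return re.match(PATTERN, email) is not None

def strict_check(email):
    # re.fullmatch requires the entire string to fit the pattern.
    return re.fullmatch(PATTERN, email) is not None

candidate = "user@example.com and more"
print(loose_check(candidate), strict_check(candidate))
print(strict_check("user@example.com"))
```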
- Compile regex patterns for repeated use
- Use specific patterns to improve matching efficiency
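The first tip can be sketched with `re.compile()`, which builds a pattern object once so it can be reused across many inputs. The names `DATE_RE` and `lines` below are illustrative.

```python
import re

# Compile once, then reuse the pattern object for every line.
DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")

lines = ["built 2023-06-15", "no date", "released 2024-01-02"]

dates = []
for line in lines:
    match = DATE_RE.search(line)  # same behavior as re.search(pattern, line)
    if match:
        dates.append(match.group())

print(dates)
```

The `re` module also caches recently used patterns internally, so the gain from compiling is often modest. A named pattern object mainly helps readability and reuse in loops.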
Key Takeaways
- Regular expressions provide powerful text pattern matching
- Python's re module offers comprehensive regex support
- Understanding regex syntax enables complex text processing tasks