Practical Regex Examples
Real-World Regex Applications
Regular expressions handle many everyday text-processing tasks in Python, including input validation, log parsing, and text cleanup.
Data Validation Scenarios
import re

def validate_inputs():
    # Each pattern is anchored with ^ and $ so the whole string must match
    patterns = {
        # Phone number: optional "+", optional country code 1, then 10-14 digits
        'phone': r'^\+?1?\d{10,14}$',
        # Password: 8+ characters with at least one letter, digit, and special character
        'password': r'^(?=.*[A-Za-z])(?=.*\d)(?=.*[@$!%*#?&])[A-Za-z\d@$!%*#?&]{8,}$',
        # IPv4 address: four dot-separated octets, each limited to 0-255
        'ip': r'^((25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(25[0-5]|2[0-4]\d|1?\d?\d)$',
    }
    test_cases = {
        'phone': ['1234567890', '+15551234567'],
        'password': ['LabEx2023!', 'weak'],
        'ip': ['192.168.1.1', '256.0.0.1']
    }
    for category, cases in test_cases.items():
        print(f"\n{category.upper()} Validation:")
        for case in cases:
            print(f"{case}: {bool(re.match(patterns[category], case))}")

validate_inputs()
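One subtlety with anchored validation patterns: in Python, `$` also matches just before a trailing newline, so `re.match` with `^...$` accepts a value that ends in `\n`. The sketch below uses `re.fullmatch` instead. The `PATTERNS` table and the `is_valid` helper are illustrative names, not part of the example above.

```python
import re

# Hypothetical helper table: patterns are written without ^ and $ because
# re.fullmatch already requires the entire string to match.
PATTERNS = {
    "phone": re.compile(r"\+?1?\d{10,14}"),
    "ip": re.compile(
        r"(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)"
    ),
}

def is_valid(category, value):
    """Return True only if the whole value matches the category's pattern."""
    return PATTERNS[category].fullmatch(value) is not None

print(is_valid("phone", "+15551234567"))   # well-formed number
print(is_valid("phone", "1234567890\n"))   # trailing newline is rejected
print(is_valid("ip", "256.0.0.1"))         # octet above 255 is rejected
```

Whether a trailing newline should be rejected or stripped first depends on where the input comes from; stripping before validating is an equally reasonable choice.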
Text Parsing and Extraction
graph TD
A[Text Parsing] --> B[Extract Specific Patterns]
A --> C[Data Cleaning]
A --> D[Information Retrieval]
Log File Analysis
import re

def parse_log_file(log_content):
    # Extract IPv4-style addresses and "YYYY-MM-DD HH:MM:SS" timestamps
    ip_pattern = r'\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b'
    timestamp_pattern = r'\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}'
    ips = re.findall(ip_pattern, log_content)
    timestamps = re.findall(timestamp_pattern, log_content)
    return {
        'unique_ips': set(ips),
        'timestamps': timestamps
    }

# Sample log content
log_sample = """
2023-06-15 10:30:45 192.168.1.100 LOGIN
2023-06-15 11:45:22 10.0.0.50 ACCESS
2023-06-15 12:15:33 192.168.1.100 LOGOUT
"""
result = parse_log_file(log_sample)
print(result)
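Collecting IPs and timestamps in separate lists loses the link between them. A minimal sketch of one alternative, assuming every line follows the `timestamp ip action` layout of the sample above, uses named groups so that each line becomes one record. The names `LOG_LINE` and `parse_log_records` are illustrative.

```python
import re

# Assumed line layout, taken from the sample log:
# "YYYY-MM-DD HH:MM:SS <ipv4> <ACTION>"
LOG_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})[ \t]+"
    r"(?P<ip>\d{1,3}(?:\.\d{1,3}){3})[ \t]+"
    r"(?P<action>\w+)[ \t]*$",
    re.MULTILINE,  # let ^ and $ match at every line boundary
)

def parse_log_records(log_content):
    """Return one dict per log line that matches the assumed layout."""
    return [m.groupdict() for m in LOG_LINE.finditer(log_content)]

log_sample = """
2023-06-15 10:30:45 192.168.1.100 LOGIN
2023-06-15 11:45:22 10.0.0.50 ACCESS
2023-06-15 12:15:33 192.168.1.100 LOGOUT
"""

for record in parse_log_records(log_sample):
    print(record["timestamp"], record["ip"], record["action"])
```

Lines that do not fit the layout are skipped silently, which may or may not be what a real log pipeline wants.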
| Regex Use Case | Description | Example |
| --- | --- | --- |
| Email Normalization | Lowercase the domain part of an email address | `re.sub(r'@.*', lambda m: m.group(0).lower(), email)` |
| URL Extraction | Find web addresses | `re.findall(r'https?://\S+', text)` |
| Number Extraction | Extract runs of digits as strings | `re.findall(r'\d+', text)` |
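The three patterns in the table can be tried on sample data as follows. The `email` and `text` values are made up for illustration.

```python
import re

# Illustrative inputs, not taken from any real data set.
email = "User.Name@Example.COM"
text = "Order 42 shipped, see https://example.com/orders/42 or http://example.org for 3 more."

# Lowercases only the part from "@" onward; the local part keeps its case.
normalized = re.sub(r'@.*', lambda m: m.group(0).lower(), email)

# \S+ is greedy, so punctuation directly attached to a URL would be captured too.
urls = re.findall(r'https?://\S+', text)

# \d+ returns digit runs as strings, including digits that sit inside URLs.
numbers = re.findall(r'\d+', text)

print(normalized)
print(urls)
print(numbers)
```

Note that the number pattern also picks up the `42` inside the first URL, so removing URLs first may be needed when only standalone numbers matter.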
Advanced Text Processing
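import re

def text_processor(text):
    # Collapse runs of whitespace into a single space
    cleaned_text = re.sub(r'\s+', ' ', text).strip()
    # Collapse consecutive repeated words ("awesome awesome" -> "awesome");
    # the \b anchors keep doubled letters inside a word, such as the "mm"
    # in "programming", from being treated as repeats
    normalized_text = re.sub(r'\b(\w+)(\s+\1\b)+', r'\1', cleaned_text)
    return normalized_text

# LabEx text processing example
sample_text = "Python is awesome awesome in programming"
print(text_processor(sample_text))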
graph TD
A[Regex Performance] --> B[Compile Patterns]
A --> C[Avoid Excessive Backtracking]
A --> D[Use Specific Patterns]
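The three points in the diagram can be illustrated together. This is a minimal sketch, assuming the task is to check that a line consists of words separated by single spaces; `WORDS`, `RISKY`, and `is_word_list` are illustrative names.

```python
import re

# Compile once at module level and reuse the compiled object inside loops.
# The pattern is specific: one word, then zero or more " word" groups.
WORDS = re.compile(r'\w+(?: \w+)*')

# Pattern to avoid: the nested quantifier lets the engine split a run of
# word characters in many different ways, which can cause heavy backtracking
# on input that almost matches. It is kept as a string and never executed.
RISKY = r'(\w+\s?)+$'

def is_word_list(line):
    """True if line is one or more words separated by single spaces."""
    return WORDS.fullmatch(line) is not None

lines = ["alpha beta gamma", "alpha  beta", "a" * 40 + "!"]
print([is_word_list(line) for line in lines])
```

The last input is the kind of near-miss that stresses a nested quantifier; the specific pattern rejects it after a single pass over the string.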
Key Takeaways
- Regex is versatile for data validation and extraction
- Careful pattern design prevents performance issues
- Practice and experimentation improve regex skills
- LabEx recommends an incremental learning approach