Practical Regex Examples
Real-World Symbol Removal Scenarios
1. Email Cleaning
import re
def clean_email(email):
    # Remove every character except word characters, "." and "@"
    return re.sub(r'[^\w.@]', '', email)
emails = [
"[email protected]",
"alice#[email protected]",
"invalid*email@domain"
]
cleaned_emails = [clean_email(email) for email in emails]
print(cleaned_emails)
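Note that the pattern in clean_email also strips "+" and "-", both of which appear in real addresses. The following is a minimal sketch of a variant that keeps them; the function name clean_email_keep_tags and the sample address are hypothetical, and neither version checks that the result is a deliverable address.

```python
import re

def clean_email_keep_tags(email):
    # Hypothetical variant: keep word characters, ".", "@", "+" and "-".
    # The "-" is placed last in the class so it is read as a literal hyphen.
    return re.sub(r'[^\w.@+-]', '', email)

# Made-up address used only for illustration
print(clean_email_keep_tags("alice#smith+news@ex-ample.com"))
```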
2. Phone Number Standardization
def normalize_phone_number(phone):
    # Remove non-digit characters
    return re.sub(r'\D', '', phone)
phone_numbers = [
"+1 (555) 123-4567",
"555.123.4567",
"(555) 123-4567"
]
standard_numbers = [normalize_phone_number(num) for num in phone_numbers]
print(standard_numbers)
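Stripping non-digits leaves the first sample one digit longer than the other two, because its country code survives. A possible follow-up step, sketched here under the assumption that all inputs are US numbers, drops a leading 1 and emits a single format. The helper name format_us_phone and the NNN-NNN-NNNN output format are illustrative choices, not part of the original example.

```python
import re

def format_us_phone(phone):
    """Hypothetical helper: return NNN-NNN-NNNN, or None if the digit count is unexpected."""
    digits = re.sub(r'\D', '', phone)
    # Assumption: an 11-digit number starting with 1 carries the US country code
    if len(digits) == 11 and digits.startswith('1'):
        digits = digits[1:]
    if len(digits) != 10:
        return None
    return f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"

for raw in ["+1 (555) 123-4567", "555.123.4567", "(555) 123-4567", "12345"]:
    print(raw, "->", format_us_phone(raw))
```

Returning None for unexpected lengths keeps bad input visible instead of passing a truncated number downstream.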
Complex Removal Techniques
Symbol Removal Workflow
graph TD
A[Input Text] --> B{Identify Symbols}
B --> |Special Chars| C[Remove Symbols]
B --> |Unicode| D[Normalize Text]
C --> E[Cleaned Text]
D --> E
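The Unicode branch of the workflow can be sketched with the standard unicodedata module. This is one possible reading of "Normalize Text", assuming the goal is to reduce accented letters to their base letters before removing symbols; the function name normalize_then_clean is hypothetical.

```python
import re
import unicodedata

def normalize_then_clean(text):
    # NFKD splits an accented letter into a base letter plus a combining mark
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks, keeping only the base characters
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Then remove the remaining symbols, as in the other branch of the workflow
    return re.sub(r"[^\w\s]", "", stripped)

print(normalize_then_clean("Caf\u00e9, d\u00e9j\u00e0 vu!"))
```

Whether dropping accents is acceptable depends on the data; for names or non-English text it may be better to normalize to NFC and keep the accented letters.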
Advanced Text Cleaning
| Scenario | Regex Pattern | Purpose |
| --- | --- | --- |
| Remove Punctuation | [^\w\s] | Clean text |
| Extract Alphanumeric | [a-zA-Z0-9] | Filter characters |
| Remove HTML Tags | <[^>]+> | Strip HTML |
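As a quick illustration, the three patterns in the table can be applied to one string. The sample text is made up; note that the alphanumeric pattern is a matching pattern, so it is used with re.findall rather than re.sub.

```python
import re

# Made-up sample combining tags, punctuation and digits
sample = "<p>Order #42: 3 items, total $19.99!</p>"

# Remove HTML tags first so the angle brackets do not count as punctuation
no_tags = re.sub(r'<[^>]+>', '', sample)

# Remove punctuation: keep only word characters and whitespace
no_punct = re.sub(r'[^\w\s]', '', no_tags)

# Extract alphanumeric characters and join them back together
alnum_only = ''.join(re.findall(r'[a-zA-Z0-9]', no_tags))

print(no_tags)
print(no_punct)
print(alnum_only)
```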
3. HTML Tag Removal
def strip_html_tags(html_text):
    # Remove all HTML tags
    return re.sub(r'<[^>]+>', '', html_text)
html_content = """
<div>Welcome to <b>LabEx</b> Python Tutorial!</div>
"""
clean_text = strip_html_tags(html_content)
print(clean_text)
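Removing tags leaves character entities such as &amp; and stray newlines in place. A minimal sketch of a follow-up step, using html.unescape from the standard library, is shown below; the function name strip_html_and_unescape is hypothetical, and a regex like this is only suitable for simple markup, not for arbitrary HTML documents.

```python
import html
import re

def strip_html_and_unescape(html_text):
    # Remove tags with the same pattern as strip_html_tags
    no_tags = re.sub(r'<[^>]+>', '', html_text)
    # Decode entities such as &amp; into the characters they stand for
    decoded = html.unescape(no_tags)
    # Collapse runs of whitespace left behind by removed tags
    return re.sub(r'\s+', ' ', decoded).strip()

print(strip_html_and_unescape("<p>Tom &amp; Jerry</p>\n<b>LabEx</b>"))
```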
Data Validation Examples
Username Sanitization
def validate_username(username):
    # Keep only ASCII letters, digits and underscores; drop everything else
    return re.sub(r'[^a-zA-Z0-9_]', '', username)
usernames = [
"john.doe",
"alice!user",
"python_developer123"
]
valid_usernames = [validate_username(name) for name in usernames]
print(valid_usernames)
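The function above sanitizes a username by silently dropping characters. If the goal is instead to accept or reject the input as given, re.fullmatch is one way to do it. The sketch below assumes a length limit of 3 to 20 characters, which is an illustrative choice and not a rule from the original example; is_valid_username is a hypothetical name.

```python
import re

# Assumption: usernames are 3-20 ASCII letters, digits or underscores
USERNAME_RE = re.compile(r'[A-Za-z0-9_]{3,20}')

def is_valid_username(username):
    # fullmatch succeeds only if the whole string fits the pattern
    return USERNAME_RE.fullmatch(username) is not None

for name in ["john.doe", "alice!user", "python_developer123"]:
    print(name, is_valid_username(name))
```

Rejecting invalid input avoids surprises such as two different submitted names collapsing to the same sanitized username.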
Compiled Regex Patterns
# Precompile regex for repeated use
SYMBOL_PATTERN = re.compile(r'[^\w\s]')
def efficient_symbol_removal(text):
    return SYMBOL_PATTERN.sub('', text)
# Reusing the compiled pattern skips the lookup in re's internal pattern cache on every call
texts = ["Hello, World!", "LabEx Python Regex"]
cleaned = [efficient_symbol_removal(text) for text in texts]
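Compiling is also the natural place to attach flags. As a small sketch, the same pattern compiled with re.ASCII treats accented letters as symbols, whereas the default Unicode mode keeps them. The names UNICODE_SYMBOLS and ASCII_SYMBOLS are hypothetical.

```python
import re

# Default (Unicode) mode: \w matches letters from any script
UNICODE_SYMBOLS = re.compile(r'[^\w\s]')
# ASCII mode: \w matches only a-z, A-Z, 0-9 and underscore
ASCII_SYMBOLS = re.compile(r'[^\w\s]', re.ASCII)

text = "na\u00efve caf\u00e9!"
print(UNICODE_SYMBOLS.sub('', text))  # accented letters are kept
print(ASCII_SYMBOLS.sub('', text))    # accented letters are removed
```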
Error Handling Strategies
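def safe_symbol_removal(text):
    # Treat missing input as empty rather than cleaning the string "None"
    if text is None:
        return ''
    try:
        # Convert non-string input (numbers, etc.) to str before cleaning
        return re.sub(r'[^\w\s]', '', str(text))
    except Exception as e:
        print(f"Error processing text: {e}")
        return ''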
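Coercing with str() keeps the function from failing, but it can also hide type mistakes, since a number or a list is quietly cleaned as if it were text. An alternative, sketched here as one option rather than a recommendation, is to reject non-string input and let the caller decide what to do. The name strict_symbol_removal is hypothetical.

```python
import re

def strict_symbol_removal(text):
    # Fail loudly on non-string input instead of converting it
    if not isinstance(text, str):
        raise TypeError(f"expected str, got {type(text).__name__}")
    return re.sub(r'[^\w\s]', '', text)

print(strict_symbol_removal("Hello, World!"))

try:
    strict_symbol_removal(None)
except TypeError as exc:
    print(f"Rejected input: {exc}")
```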
Key Takeaways
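- Use patterns that name exactly the characters you want to keep or remove
- Compile patterns that are reused many times
- Handle non-string input explicitly (None, numbers)
- Remember that \w and \d match Unicode letters and digits by default in Python 3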
By mastering these practical regex examples, you'll develop robust text processing skills in Python, transforming messy data into clean, usable information.