Use Regular Expressions
In this step, you will learn how to use regular expressions in Python to identify special characters. Regular expressions are powerful tools for pattern matching in strings.
First, let's create a Python script named regex_special_characters.py in your ~/project directory using the VS Code editor.
## Content of regex_special_characters.py
import re

def find_special_characters(text):
    special_characters = re.findall(r"[^a-zA-Z0-9\s]", text)
    return special_characters

text = "Hello! This is a test string with some special characters like @, #, and $."
special_chars = find_special_characters(text)
print("Special characters found:", special_chars)
Here's what this code does:
- import re: This line imports the re module, which provides regular expression operations.
- def find_special_characters(text): This defines a function that takes a string as input and finds all special characters in it.
- special_characters = re.findall(r"[^a-zA-Z0-9\s]", text): This line uses the re.findall() function to find all characters in the input string that are not alphanumeric (a-z, A-Z, 0-9) or whitespace (\s). The [^...] is a negated character class, meaning it matches any character not in the specified set.
- return special_characters: This line returns a list of the special characters found.
- The remaining lines define a sample string, call the function to find special characters in it, and print the result.
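One consequence of the negated character class is worth checking by hand: because the set lists only the ASCII ranges a-z, A-Z, and 0-9, the underscore and accented letters are also reported as special. The sketch below illustrates this with a made-up sample string. The second pattern, [^\w\s], is not part of this step's script and is shown only for contrast, on the assumption that you may want letters from other alphabets treated as ordinary characters.

```python
import re

# Pattern from the script: anything that is not an ASCII letter, digit, or whitespace.
ASCII_PATTERN = r"[^a-zA-Z0-9\s]"

# Alternative shown for comparison only (not used in the tutorial script).
# For str patterns, \w is Unicode-aware and also covers the underscore.
WORD_PATTERN = r"[^\w\s]"

# Hypothetical sample string; \u00e9 is the letter e with an acute accent.
sample = "user_name = caf\u00e9!"

ascii_matches = re.findall(ASCII_PATTERN, sample)
word_matches = re.findall(WORD_PATTERN, sample)

print(ascii_matches)  # includes the underscore and the accented letter
print(word_matches)   # only '=' and '!'
```

Which pattern fits depends on what "special" should mean for your input; the script in this step uses the stricter ASCII-only definition.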
Now, let's run the script. Open your terminal and execute the following command:
python regex_special_characters.py
You should see the following output:
Special characters found: ['!', '@', ',', '#', ',', '$', '.']
This output shows the list of special characters found in the input string using the regular expression.
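Because re.findall() returns every match in order, a character that occurs more than once in the text appears more than once in the list. If you need a tally or a duplicate-free list instead, one possible approach is to pass the matches to collections.Counter. The variable names below are illustrative and not part of the script above.

```python
import re
from collections import Counter

text = "Hello! This is a test string with some special characters like @, #, and $."

# Same pattern as in the script above.
matches = re.findall(r"[^a-zA-Z0-9\s]", text)

# Counter tallies how often each matched character occurs.
counts = Counter(matches)

# Sorting the Counter's keys gives a stable, duplicate-free list.
unique_chars = sorted(counts)

print(counts)
print(unique_chars)
```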
Let's modify the script to use a different regular expression that matches only punctuation characters.
Open regex_special_characters.py in VS Code and modify it as follows:
## Modified content of regex_special_characters.py
import re
import string

def find_punctuation_characters(text):
    # re.escape() makes every punctuation character literal inside the
    # character class, so characters such as \ and ] are matched correctly.
    punctuation_chars = re.findall("[" + re.escape(string.punctuation) + "]", text)
    return punctuation_chars

text = "Hello! This is a test string with some punctuation like ., ?, and !."
punctuation = find_punctuation_characters(text)
print("Punctuation characters found:", punctuation)
In this modified script, we've used string.punctuation to define the set of punctuation characters to match.
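One detail to be aware of when building a character class from string.punctuation: the constant contains the sequence \] (a backslash followed by a closing bracket). If the constant is concatenated into the pattern without escaping, the regex engine reads that pair as an escaped ], so a literal backslash in the text is not matched. The comparison below is a small sketch of the difference, using a made-up sample string; passing the constant through re.escape() first avoids the problem.

```python
import re
import string

# Character class built by plain concatenation: the "\]" inside
# string.punctuation is read as an escaped "]", so "\" itself is left out.
unescaped = re.compile("[" + string.punctuation + "]")

# re.escape() backslash-escapes the special characters so each one is literal.
escaped = re.compile("[" + re.escape(string.punctuation) + "]")

sample = "path\\to [file]"  # hypothetical sample containing one backslash

unescaped_matches = unescaped.findall(sample)
escaped_matches = escaped.findall(sample)

print(unescaped_matches)  # the backslash is missing
print(escaped_matches)    # the backslash is included
```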
Run the script again:
python regex_special_characters.py
You should see the following output:
Punctuation characters found: ['!', '.', ',', '?', ',', '!', '.']
This output shows the list of punctuation characters found in the input string using the regular expression and the string.punctuation constant.
Using regular expressions provides a flexible and powerful way to identify and extract special characters from strings in Python.
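If you only need to know whether a string contains any special character, rather than which ones, re.search() is enough, since it stops at the first match. The following is a minimal sketch under that assumption; the function name has_special_characters is hypothetical and reuses the negated character class from find_special_characters().

```python
import re

# Same negated character class as in find_special_characters().
SPECIAL_CHAR_PATTERN = re.compile(r"[^a-zA-Z0-9\s]")

def has_special_characters(text):
    """Return True if text contains at least one special character."""
    # re.search() returns a match object on the first hit, or None otherwise.
    return SPECIAL_CHAR_PATTERN.search(text) is not None

print(has_special_characters("Hello World 123"))
print(has_special_characters("Hello, World!"))
```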