Applying re.findall() in Practice
Now that you understand the basics of re.findall(), let's explore some practical applications and examples.
Suppose you have a block of text that contains various URLs, and you want to extract all of them. You can use re.findall() with a regular expression pattern to achieve this:
import re
text = "Check out these websites: https://www.labex.io, http://example.net, and https://github.com/LabEx."
## \S+ also matches the comma or period right after a URL, so strip that trailing punctuation
urls = [url.rstrip('.,') for url in re.findall(r'https?://\S+', text)]
print(urls) ## Output: ['https://www.labex.io', 'http://example.net', 'https://github.com/LabEx']
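The regular expression pattern r'https?://\S+' matches both http and https URLs: the s? makes the "s" optional, and the \S+ part captures every non-whitespace character after the protocol. Because \S+ also matches punctuation, a comma or period that directly follows a URL becomes part of the match, which is why trailing punctuation is usually stripped from the results.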
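When a pattern contains capture groups, re.findall() returns the groups rather than the whole match, and with more than one group it returns a list of tuples. The following is a minimal sketch of that behavior on the same sample text. The host pattern [\w.-]+ is a simplifying assumption made for this illustration, not a complete URL grammar.

```python
import re

text = "Check out these websites: https://www.labex.io, http://example.net, and https://github.com/LabEx."

## Group 1 captures the scheme, group 2 captures the host name.
## [\w.-]+ is an illustrative assumption: it stops at the first character
## that is not a letter, digit, underscore, dot, or hyphen.
parts = re.findall(r'(https?)://([\w.-]+)', text)
print(parts) ## Output: [('https', 'www.labex.io'), ('http', 'example.net'), ('https', 'github.com')]
```

Since the host pattern excludes commas and slashes, the trailing punctuation and the /LabEx path are left out of the results.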
You can also use re.findall() to extract numbers from a given text. Here's an example:
import re
text = "There are 5 apples, 3 oranges, and 10 bananas."
numbers = re.findall(r'\d+', text)
print(numbers) ## Output: ['5', '3', '10']
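The regular expression pattern r'\d+' matches one or more consecutive digits, so every run of digits in the input text is returned. Note that the results are strings, not integers, and that the pattern does not account for signs or decimal points.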
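Because re.findall() returns strings, the matches need converting before any arithmetic. A short sketch, assuming the text contains only non-negative whole numbers (the variable names counts and total are chosen for this illustration):

```python
import re

text = "There are 5 apples, 3 oranges, and 10 bananas."

## Convert each matched digit string to an int before doing arithmetic.
counts = [int(n) for n in re.findall(r'\d+', text)]
total = sum(counts)
print(counts) ## Output: [5, 3, 10]
print(total) ## Output: 18
```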
Replacing Substrings
In addition to extracting information with re.findall(), you can replace matching substrings with the related re.sub() function, which accepts the same regular expression patterns. Here's an example:
import re
text = "The quick brown fox jumps over the lazy dog."
new_text = re.sub(r'\b\w{4}\b', 'XXXX', text)
print(new_text) ## Output: The quick brown fox jumps XXXX the XXXX dog.
In this example, the re.sub() function is used to replace all words of length 4 with the string "XXXX". The regular expression pattern r'\b\w{4}\b' matches words that are exactly 4 characters long, with the word boundaries \b ensuring that four-character stretches inside longer words are not replaced. In this sentence only "over" and "lazy" qualify.
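One way to combine the two functions is to preview the matches with re.findall() before replacing them with re.sub(), using the same pattern for both. A minimal sketch (the variable names pattern and matches are chosen for this illustration):

```python
import re

text = "The quick brown fox jumps over the lazy dog."
pattern = r'\b\w{4}\b'

## First inspect what the pattern matches, then perform the replacement.
matches = re.findall(pattern, text)
new_text = re.sub(pattern, 'XXXX', text)
print(matches) ## Output: ['over', 'lazy']
print(new_text) ## Output: The quick brown fox jumps XXXX the XXXX dog.
```

Checking the list of matches first makes it easier to confirm that a pattern selects only the intended words before any text is changed.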
These are just a few examples of how you can use re.findall() in practice. Combined with regular expressions and companion functions such as re.sub(), it handles a wide range of text processing tasks in Python.