Here are some basic concepts of regular expressions (regex):
- Literals: Characters that match themselves. For example, the regex `cat` matches the string "cat".
- Metacharacters: Special characters that have specific meanings, such as:
  - `.`: Matches any single character except a newline (by default).
  - `^`: Matches the start of a string.
  - `$`: Matches the end of a string.
  - `*`: Matches zero or more occurrences of the preceding element.
  - `+`: Matches one or more occurrences of the preceding element.
  - `?`: Matches zero or one occurrence of the preceding element.
- Character Classes: Defined using square brackets `[]`, they match any one of the enclosed characters. For example, `[abc]` matches "a", "b", or "c".
- Predefined Character Classes:
  - `\d`: Matches any digit (equivalent to `[0-9]` for ASCII text).
  - `\D`: Matches any non-digit.
  - `\w`: Matches any word character (alphanumeric plus underscore).
  - `\W`: Matches any non-word character.
  - `\s`: Matches any whitespace character (spaces, tabs, newlines).
  - `\S`: Matches any non-whitespace character.
- Quantifiers: Specify how many times an element can occur:
  - `{n}`: Exactly n times.
  - `{n,}`: At least n times.
  - `{n,m}`: Between n and m times, inclusive.
- Groups and Capturing: Parentheses `()` are used to group patterns and capture the matched content for later use.
- Alternation: The pipe `|` acts as a logical OR. For example, `cat|dog` matches either "cat" or "dog".
- Escaping: Use a backslash `\` to escape metacharacters if you want to match them literally. For example, `\.` matches a literal period rather than any character.
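
To make the building blocks above concrete, here is a minimal sketch using Python's standard `re` module. The sample strings and variable names are made up for illustration, and other regex flavors may differ in small details.

```python
import re

# Literal: the pattern "cat" matches the substring "cat".
literal = re.search(r"cat", "concatenate")

# Metacharacters: ^ and $ anchor the match; . matches any one character.
anchored = re.search(r"^c.t$", "cot")

# Character class: any one of the enclosed characters.
vowels = re.findall(r"[aeiou]", "regex")

# Predefined class with +: one or more digits in a row.
digits = re.findall(r"\d+", "room 42, floor 7")

# Quantifiers: exactly three digits, a dash, then exactly four digits.
phone = re.fullmatch(r"\d{3}-\d{4}", "555-0199")

print(literal.group())    # cat
print(anchored.group())   # cot
print(vowels)             # ['e', 'e']
print(digits)             # ['42', '7']
print(phone is not None)  # True
```

Raw strings (`r"..."`) are used so that Python does not interpret the backslashes before the regex engine sees them.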
These concepts form the foundation of regex and can be combined to create complex patterns for matching strings.
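
As a rough illustration of combining these pieces, the sketch below uses groups, alternation, quantifiers, and escaping in one pattern. The file-name format (`<animal>_<number>.<extension>`) and the helper `parse_name` are hypothetical, chosen only for this example.

```python
import re

# Hypothetical file-name format, used only for illustration:
#   (cat|dog)   group with alternation: "cat" or "dog"
#   _           literal underscore
#   (\d{1,3})   group capturing one to three digits
#   \.          escaped dot, matches a literal "."
#   (jpg|png)   group with alternation for the extension
PATTERN = re.compile(r"^(cat|dog)_(\d{1,3})\.(jpg|png)$")


def parse_name(name):
    """Return (animal, number, extension), or None if the name does not match."""
    match = PATTERN.match(name)
    if match is None:
        return None
    animal, number, extension = match.groups()
    return animal, int(number), extension


print(parse_name("cat_12.jpg"))  # ('cat', 12, 'jpg')
print(parse_name("dog_7.png"))   # ('dog', 7, 'png')
print(parse_name("cat_12xjpg"))  # None, because \. only matches a literal dot
print(parse_name("bird_1.jpg"))  # None, "bird" is not one of the alternatives
```

Without the backslash, the `.` would match any character, so `cat_12xjpg` would be accepted; escaping is what restricts it to a literal dot.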
