Ultimate Regex Cheat Sheet: 20 Practical Regex Examples for Python

Master regular expressions with this detailed Regex cheat sheet. Learn Python regex with practical examples and become proficient in text manipulation and pattern matching.

Regular expressions, commonly known as regex, are powerful tools for matching patterns in text. They are widely used in text processing, data validation, and search operations. This cheat sheet will help you understand and use regular expressions effectively in Python. Whether you're an 8th grader just starting out or an experienced developer, it walks through the core syntax with 20 short, runnable examples.

Introduction to Regular Expressions

A regular expression is a sequence of characters that forms a search pattern. When you search for data in text, you can use this search pattern to describe what you are looking for. Regular expressions can be used to perform all types of text search and text replacement operations.
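As a first taste of searching and replacing, here is a minimal sketch using re.findall and re.sub from Python's standard library. The sample sentence and the date pattern are made up for illustration, and the pattern only checks the shape of a date, not whether it is a valid one.

```python
import re

# Hypothetical sample text for illustration only.
text = 'Order 66 shipped on 2024-01-15; order 67 ships on 2024-02-01.'

# Search: collect every substring shaped like YYYY-MM-DD.
dates = re.findall(r'\d{4}-\d{2}-\d{2}', text)
print(dates)  # ['2024-01-15', '2024-02-01']

# Replace: swap every date for a placeholder.
masked = re.sub(r'\d{4}-\d{2}-\d{2}', '<DATE>', text)
print(masked)  # Order 66 shipped on <DATE>; order 67 ships on <DATE>.
```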

Why use regular expressions?

Regular expressions allow you to perform complex text searches and manipulations with ease. They are highly versatile and can be used for a wide range of tasks, from simple searches to advanced text processing.

Basic Syntax of Regular Expressions

Let's start with some basic regex syntax that forms the foundation of more complex patterns.

1. Literal Characters

Literal characters match themselves. For example, the pattern a matches the character "a".

import re
pattern = r'a'
text = 'apple'
match = re.search(pattern, text)
print(match)  # Output: <re.Match object; span=(0, 1), match='a'>

2. Metacharacters

Metacharacters are characters with special meanings in regex.

  • .: Matches any character except a newline.
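  • ^: Matches the start of a string (or the start of each line when re.MULTILINE is set).
  • $: Matches the end of a string, or just before a newline at the very end of the string.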
  • *: Matches 0 or more repetitions of the preceding element.
  • +: Matches 1 or more repetitions of the preceding element.
  • ?: Matches 0 or 1 repetition of the preceding element.
  • []: Matches any single character within the brackets.
  • |: Matches either the pattern before or the pattern after the |.

pattern = r'a.b'
text = 'a1b'
match = re.search(pattern, text)
print(match)  # Output: <re.Match object; span=(0, 3), match='a1b'>
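The example above covers only the dot. The following sketch exercises a few of the other metacharacters from the list; the sample words are arbitrary and chosen just to make the behaviour visible.

```python
import re

# ^ and $ tie the pattern to the start and end of the string.
starts_with_cat = bool(re.search(r'^cat', 'catalog'))
ends_with_cat = bool(re.search(r'cat$', 'catalog'))
print(starts_with_cat, ends_with_cat)  # True False

# [] matches exactly one character from the listed set.
vowels = re.findall(r'[aeiou]', 'regex')
print(vowels)  # ['e', 'e']

# | accepts either alternative; here only 'dog' occurs in the text.
pet = re.search(r'cat|dog', 'hot dog').group()
print(pet)  # dog
```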

Quantifiers

Quantifiers specify how many instances of a character, group, or character class must be present for a match.

3. * Quantifier

Matches 0 or more repetitions of the preceding element.

pattern = r'ab*'
text = 'abbbb'
match = re.search(pattern, text)
print(match)  # Output: <re.Match object; span=(0, 5), match='abbbb'>

4. + Quantifier

Matches one or more repetitions of the preceding element.

pattern = r'ab+'
text = 'abbbb'
match = re.search(pattern, text)
print(match)  # Output: <re.Match object; span=(0, 5), match='abbbb'>

5. ? Quantifier

Matches 0 or 1 repetition of the preceding element.

pattern = r'ab?'
text = 'a'
match = re.search(pattern, text)
print(match)  # Output: <re.Match object; span=(0, 1), match='a'>
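To see how *, + and ? differ, it can help to run all three against the same inputs. This sketch uses re.fullmatch, which succeeds only when the entire string fits the pattern; the sample strings are arbitrary.

```python
import re

samples = ['a', 'ab', 'abbb']
results = {}
for pattern in (r'ab*', r'ab+', r'ab?'):
    # Keep only the samples that fit the pattern from start to end.
    results[pattern] = [s for s in samples if re.fullmatch(pattern, s)]
    print(pattern, results[pattern])
# ab* ['a', 'ab', 'abbb']
# ab+ ['ab', 'abbb']
# ab? ['a', 'ab']
```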

6. {n} Quantifier

Matches exactly n repetitions of the preceding element.

pattern = r'a{3}'
text = 'aaa'
match = re.search(pattern, text)
print(match)  # Output: <re.Match object; span=(0, 3), match='aaa'>

7. {n,} Quantifier

Matches n or more repetitions of the preceding element.

pattern = r'a{2,}'
text = 'aaaa'
match = re.search(pattern, text)
print(match)  # Output: <re.Match object; span=(0, 4), match='aaaa'>

8. {n,m} Quantifier

Matches between n and m repetitions of the preceding element.

pattern = r'a{2,4}'
text = 'aaaaa'
match = re.search(pattern, text)
print(match)  # Output: <re.Match object; span=(0, 4), match='aaaa'>
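Note that the example above still reports a match even though the text contains five a's: re.search is satisfied by the first four. If the goal is to validate a whole string, re.fullmatch is one option, as this short sketch shows.

```python
import re

pattern = r'a{2,4}'

# re.search accepts the first four characters of the longer run.
partial = re.search(pattern, 'aaaaa')
print(partial.group())  # aaaa

# re.fullmatch requires the entire string to fit, so five a's are rejected.
too_long = re.fullmatch(pattern, 'aaaaa')
print(too_long)  # None

# Three a's fall inside the 2 to 4 range.
in_range = re.fullmatch(pattern, 'aaa')
print(in_range.group())  # aaa
```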

Character Classes

Character classes match any one of a set of characters.

9. \d Character Class

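Matches any digit. Equivalent to [0-9] for ASCII text; with Python 3 string patterns, \d also matches other Unicode decimal digits unless the re.ASCII flag is set.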

pattern = r'\d'
text = '123abc'
match = re.search(pattern, text)
print(match)  # Output: <re.Match object; span=(0, 1), match='1'>
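In practice \d is usually combined with a quantifier to pull out whole numbers. The sketch below also illustrates the Unicode behaviour of \d using the Arabic-Indic digit three (U+0663); the sample text is invented.

```python
import re

# \d+ grabs each run of consecutive digits.
numbers = re.findall(r'\d+', 'Room 101, floor 3')
print(numbers)  # ['101', '3']

# By default \d accepts any Unicode decimal digit; re.ASCII limits it to 0-9.
unicode_hit = re.search(r'\d', '\u0663')
ascii_hit = re.search(r'\d', '\u0663', flags=re.ASCII)
print(unicode_hit is not None, ascii_hit is not None)  # True False
```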

10. \D Character Class

Matches any character that is not a digit. Equivalent to [^0-9] for ASCII text.

pattern = r'\D'
text = '123abc'
match = re.search(pattern, text)
print(match)  # Output: <re.Match object; span=(3, 4), match='a'>

11. \w Character Class

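Matches any alphanumeric character or underscore. Equivalent to [a-zA-Z0-9_] when the re.ASCII flag is set; by default, Python 3 string patterns also match Unicode letters and digits.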

pattern = r'\w'
text = '123abc'
match = re.search(pattern, text)
print(match)  # Output: <re.Match object; span=(0, 1), match='1'>

12. \W Character Class

Matches any character that is not a word character. Equivalent to [^a-zA-Z0-9_] when the re.ASCII flag is set.

pattern = r'\W'
text = 'abc!123'
match = re.search(pattern, text)
print(match)  # Output: <re.Match object; span=(3, 4), match='!'>
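A common use of \w and \W is rough tokenising: \w+ collects runs of word characters, and \W+ acts as a separator. A minimal sketch with an arbitrary sample string:

```python
import re

text = 'snake_case, kebab-case & CamelCase'

# \w+ keeps underscores but stops at commas, hyphens, spaces and '&'.
words = re.findall(r'\w+', text)
print(words)  # ['snake_case', 'kebab', 'case', 'CamelCase']

# Splitting on \W+ gives the same pieces for this input.
parts = re.split(r'\W+', text)
print(parts)  # ['snake_case', 'kebab', 'case', 'CamelCase']
```

If the text started or ended with a non-word character, re.split would also return empty strings at the edges, which re.findall avoids.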

13. \s Character Class

Matches any whitespace character, such as a space, tab (\t), or newline (\n).

pattern = r'\s'
text = 'hello world'
match = re.search(pattern, text)
print(match)  # Output: <re.Match object; span=(5, 6), match=' '>

14. \S Character Class

Matches any non-whitespace character.

pattern = r'\S'
text = 'hello world'
match = re.search(pattern, text)
print(match)  # Output: <re.Match object; span=(0, 1), match='h'>
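Together, \s and \S are handy for cleaning up messy spacing. The following sketch collapses runs of whitespace and extracts the non-whitespace chunks; the input string is invented for illustration.

```python
import re

text = 'hello \t world\n'

# Collapse every run of whitespace to one space, then trim the ends.
normalized = re.sub(r'\s+', ' ', text).strip()
print(normalized)  # hello world

# \S+ collects each run of non-whitespace characters.
chunks = re.findall(r'\S+', text)
print(chunks)  # ['hello', 'world']
```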

Anchors

Anchors are used to match positions within a string.

15. ^ Anchor

Matches the start of a string.

pattern = r'^hello'
text = 'hello world'
match = re.search(pattern, text)
print(match)  # Output: <re.Match object; span=(0, 5), match='hello'>

16. $ Anchor

Matches the end of a string, or the position just before a newline at the very end of the string.

pattern = r'world$'
text = 'hello world'
match = re.search(pattern, text)
print(match)  # Output: <re.Match object; span=(6, 11), match='world'>
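By default ^ and $ look only at the edges of the whole string. With the re.MULTILINE flag they also match at the start and end of each line, which the sketch below illustrates on a made-up two-line string.

```python
import re

text = 'alpha one\nbeta two'

# Without a flag, ^ matches only at the very start of the string.
first_only = re.findall(r'^\w+', text)
print(first_only)  # ['alpha']

# With re.MULTILINE, ^ also matches right after each newline.
each_line = re.findall(r'^\w+', text, flags=re.MULTILINE)
print(each_line)  # ['alpha', 'beta']

# Likewise, $ then matches just before each newline and at the end.
line_ends = re.findall(r'\w+$', text, flags=re.MULTILINE)
print(line_ends)  # ['one', 'two']
```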

17. \b Anchor

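Matches a word boundary: the position between a word character (\w) and a non-word character, or the start or end of the string next to a word character.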

pattern = r'\bworld\b'
text = 'hello world'
match = re.search(pattern, text)
print(match)  # Output: <re.Match object; span=(6, 11), match='world'>

18. \B Anchor

Matches any position that is not a word boundary, such as between two letters inside a word.

pattern = r'wo\Brld'
text = 'hello world'
match = re.search(pattern, text)
print(match)  # Output: <re.Match object; span=(6, 11), match='world'>
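The difference between \b and \B is easiest to see when the same letters appear both as a standalone word and inside a longer one. A small sketch with an arbitrary sentence:

```python
import re

text = 'cat concatenate cat.'

# \bcat\b matches 'cat' only as a whole word (positions 0 and 16).
whole_words = [m.start() for m in re.finditer(r'\bcat\b', text)]
print(whole_words)  # [0, 16]

# \Bcat\B matches 'cat' only when it is buried inside a longer word.
inside_words = [m.start() for m in re.finditer(r'\Bcat\B', text)]
print(inside_words)  # [7]
```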

Groups and Lookarounds

Groups and lookarounds are used to capture parts of the pattern and to perform advanced matching.

19. Grouping

Parentheses group parts of the regex and capture the text each group matches.

pattern = r'(hello) (world)'
text = 'hello world'
match = re.search(pattern, text)
print(match.groups())  # Output: ('hello', 'world')
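Captured groups can also be read one at a time, by number or by name. The sketch below uses Python's (?P<name>...) syntax for named groups; the group names and the sample text are hypothetical.

```python
import re

# Hypothetical group names 'year' and 'month' for illustration.
match = re.search(r'(?P<year>\d{4})-(?P<month>\d{2})', 'Released 2024-05')

print(match.group(0))        # 2024-05  (the whole match)
print(match.group(1))        # 2024     (first group, by number)
print(match.group('month'))  # 05       (second group, by name)
print(match.groupdict())     # {'year': '2024', 'month': '05'}
```

Named groups tend to keep longer patterns readable, since the code that consumes the match no longer depends on group positions.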

20. Lookahead

Lookahead assertions check for a match without consuming characters.

pattern = r'hello(?= world)'
text = 'hello world'
match = re.search(pattern, text)
print(match)  # Output: <re.Match object; span=(0, 5), match='hello'>
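Because a lookahead only checks what follows, the text inside it never becomes part of the match. The sketch below shows a positive lookahead, (?=...), next to its negative counterpart, (?!...); the price string is made up for illustration.

```python
import re

prices = '10 USD, 20 EUR, 30 USD'

# Positive lookahead: numbers followed by ' USD'; ' USD' is not captured.
usd = re.findall(r'\d+(?= USD)', prices)
print(usd)  # ['10', '30']

# Negative lookahead: whole numbers NOT followed by ' USD'.
# The \b anchors stop the engine from matching part of a number.
not_usd = re.findall(r'\b\d+\b(?! USD)', prices)
print(not_usd)  # ['20']

# With no ' world' after 'hello', the lookahead fails and there is no match.
no_match = re.search(r'hello(?= world)', 'hello there')
print(no_match)  # None
```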

Conclusion

Regular expressions are powerful tools that can simplify text processing and data validation tasks. This regex cheat sheet, with 20 practical examples, provides a solid foundation for mastering regex in Python. Whether you're working on data extraction, text manipulation, or validation, these examples will help you understand and apply regex effectively. Start using this cheat sheet to make your text processing tasks easier and more efficient.
