Python RegEx





A regular expression is a sequence of characters that defines a search pattern. Regular expressions are used for operations such as searching for a substring in a string, replacing one string with another, and splitting a string.

Python provides a built-in module, re, with a variety of functions for working with regular expressions. A regular expression is often called a RegEx for short.

To use regular expressions, we must first import the re module. See the example below.

Example
import re

result = re.search('smart', 'www.btechsmartclass.com')

if result:
    print('Match found!')
else:
    print('Match not found!!')

When we run the above example code, it produces the following output.

Match found!

Creating Regular Expressions

Regular expressions are built from the following elements.

  • Metacharacters
  • Special Sequences
  • Sets

Metacharacters

Metacharacters are characters that have a special meaning in a regular expression. The following table lists the metacharacters and their meanings.

Metacharacter   Meaning
[ ]             A set of characters
\               Begins a special sequence, or escapes a metacharacter
.               Any single character except a newline
^               Pattern starts with
$               Pattern ends with
*               Zero or more occurrences of the preceding element
+               One or more occurrences of the preceding element
{ }             Exactly the specified number of occurrences of the preceding element
|               Either or
( )             Grouping
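
The metacharacters in the table above can be tried out with a short script. This is a minimal sketch, not a complete reference; the variable names are chosen here for illustration only, and the sample string is the one used throughout this page.

```python
import re

text = 'www.btechsmartclass.com'

# ^ and $ anchor the pattern to the start and end of the string.
starts_with_www = re.search(r'^www', text) is not None
ends_with_com = re.search(r'com$', text) is not None

# . matches any single character except a newline.
any_char = re.findall(r'c.a', text)

# + needs at least one repetition; * also accepts zero repetitions.
one_or_more = re.findall(r's+', text)
zero_or_more = re.search(r'smx*art', text).group()

# {2} asks for exactly two repetitions of the preceding character.
exactly_two = re.findall(r's{2}', text)

# | selects between alternatives, and ( ) captures groups.
either = re.findall(r'btech|class', text)
groups = re.search(r'(smart)(class)', text).groups()

# \ escapes a metacharacter so that it is matched literally.
literal_dots = re.findall(r'\.', text)

print(starts_with_www, ends_with_com)   # True True
print(any_char)                         # ['cla']
print(one_or_more, zero_or_more)        # ['s', 'ss'] smart
print(exactly_two)                      # ['ss']
print(either, groups)                   # ['btech', 'class'] ('smart', 'class')
print(literal_dots)                     # ['.', '.']
```

The patterns are written as raw strings (the r prefix) so that Python passes the backslash to the re module unchanged.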

Special Sequences

A special sequence is a backslash (\) followed by a character, and together they have a special meaning. The following table lists the special sequences in Python and their meanings.

Special Sequence   Meaning
\A                 Matches only at the beginning of the string
\b                 Matches at a word boundary, that is, at the beginning or at the end of a word
\B                 Matches at a position that is NOT a word boundary
\d                 Matches any digit
\D                 Matches any character that is not a digit
\s                 Matches any white space character
\S                 Matches any character that is not a white space character
\w                 Matches any word character: letters from a to z and A to Z, digits from 0-9, and the underscore _ character
\W                 Matches any character that is not a word character
\Z                 Matches only at the end of the string
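
To see how the special sequences behave, we can apply them to a short sentence. The sample text and variable names below are assumptions made for this sketch; they do not come from the original page.

```python
import re

text = 'Room 42 is on floor 7'

# \d and \D: digits and non-digits.
digits = re.findall(r'\d+', text)
non_digits = re.findall(r'\D+', text)

# \w and \s: word characters and white space.
words = re.findall(r'\w+', text)
spaces = re.findall(r'\s', text)

# \A and \Z: the very beginning and the very end of the string.
begins_with_room = re.search(r'\ARoom', text) is not None
ends_with_7 = re.search(r'7\Z', text) is not None

# \b: an "o" at the start of a word; \B: an "o" inside a word.
words_starting_with_o = re.findall(r'\bo\w*', text)
inner_o = re.findall(r'\Bo', text)

print(digits)                 # ['42', '7']
print(non_digits)             # ['Room ', ' is on floor ']
print(words)                  # ['Room', '42', 'is', 'on', 'floor', '7']
print(len(spaces))            # 5
print(begins_with_room, ends_with_7)   # True True
print(words_starting_with_o)  # ['on']
print(len(inner_o))           # 4
```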

Sets

A set is a group of characters enclosed in [ ], and it matches any one character from that group. The following table lists some example sets and their meanings.

Set        Meaning
[aeiou]    Matches any one of the specified characters (a, e, i, o, or u)
[d-s]      Matches any lowercase letter from d to s
[^aeiou]   Matches any character except the specified ones
[1234]     Matches any one of the specified digits
[3-8]      Matches any digit from 3 to 8
[a-zA-Z]   Matches any letter, lowercase or UPPERCASE
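
The sets from the table can be checked with findall( ). This is a small sketch; the second sample string ('PIN 4079') and the variable names are made up here for illustration.

```python
import re

text = 'www.btechsmartclass.com'
code = 'PIN 4079'  # hypothetical sample containing digits

# [aeiou] matches one vowel at a time; [^aeiou] matches everything else.
vowels = re.findall(r'[aeiou]', text)
non_vowels = re.findall(r'[^aeiou]', text)

# [d-s] matches any lowercase letter in the range d to s.
d_to_s = ''.join(re.findall(r'[d-s]', text))

# Digit sets: an explicit list and a range.
listed_digits = re.findall(r'[1234]', code)
range_digits = re.findall(r'[3-8]', code)

# [a-zA-Z]+ matches a run of letters of either case.
letters = re.findall(r'[a-zA-Z]+', code)

print(vowels)            # ['e', 'a', 'a', 'o']
print(len(non_vowels))   # 19
print(d_to_s)            # ehsmrlssom
print(listed_digits)     # ['4']
print(range_digits)      # ['4', '7']
print(letters)           # ['PIN']
```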

Built-in functions of the re module

The re module provides the following functions, among others, to work with regular expressions.

  • search( )
  • findall( )
  • sub( )
  • split( )

search( ) in Python

The search( ) function of the re module returns a Match object if the pattern is found in the string, and None otherwise. If there is more than one occurrence, it returns the first occurrence only.

Example
import re

print(re.search('smart', 'www.btechsmartclass.com'))

print(re.search('[cat]', 'www.btechsmartclass.com'))

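When we run the above example code on Python 3.7 or later, it produces the following output. The pattern [cat] is a set, so the second call matches the first single character that is c, a, or t, which is the t in btech.

<re.Match object; span=(9, 14), match='smart'>
<re.Match object; span=(5, 6), match='t'>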
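
A Match object is more useful when we read the details out of it rather than printing it directly. The following sketch uses the group( ) and span( ) methods of the Match object and also shows the None result for a pattern that is not present; the variable names are illustrative.

```python
import re

text = 'www.btechsmartclass.com'

match = re.search(r'smart', text)
if match:
    matched_text = match.group()   # the text that matched
    start, end = match.span()      # start index and end index (exclusive)
    print(matched_text, start, end)   # smart 9 14

# search( ) returns None when the pattern does not occur in the string.
no_match = re.search(r'python', text)
print(no_match)   # None
```

Because search( ) can return None, it is safer to test the result with if before calling any method on it.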

findall( ) in Python

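The findall( ) function of the re module returns a list of all non-overlapping occurrences of the pattern in the string. If the pattern is not found, it returns an empty list.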

Example
import re

print(re.findall('smart', 'www.btechsmartclass.com'))

print(re.findall('[cat]', 'www.btechsmartclass.com'))

When we run the above example code, it produces the following output.

['smart']
['t', 'c', 'a', 't', 'c', 'a', 'c']
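
When the pattern contains groups, findall( ) returns the captured groups instead of the whole match. The sketch below illustrates this with a made-up settings string; the sample text and variable names are assumptions for the example.

```python
import re

settings = 'width=80 height=24'  # hypothetical sample text

# No groups: each list item is the whole match.
whole = re.findall(r'\w+=\d+', settings)

# One group: each list item is the text captured by that group.
names = re.findall(r'(\w+)=\d+', settings)

# Two groups: each list item is a tuple with one entry per group.
pairs = re.findall(r'(\w+)=(\d+)', settings)

# No match at all gives an empty list, never None.
nothing = re.findall(r'depth', settings)

print(whole)    # ['width=80', 'height=24']
print(names)    # ['width', 'height']
print(pairs)    # [('width', '80'), ('height', '24')]
print(nothing)  # []
```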

sub( ) in Python

The sub( ) function of the re module replaces every match of the pattern in a string with the specified text. The syntax of sub( ) is sub(pattern, text, string).

The sub( ) function does not modify the original string; instead, it returns the modified string as a new string.


Example
import re

webStr = 'www.btechsmartclass.com'

# Escape the dot so that it matches a literal '.' and not any character
print(re.sub(r'\.com', '.in', webStr))

print(webStr)

When we run the above example code, it produces the following output.

www.btechsmartclass.in
www.btechsmartclass.com
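
Two details of sub( ) are worth trying out: an unescaped dot in the pattern matches any character, and the optional count argument limits the number of replacements. The following is a small sketch; the strings 'www.welcome.com' and 'user@host' are made up for the demonstration.

```python
import re

url = 'www.welcome.com'  # hypothetical sample

# Unescaped dot: '.com' also matches 'lcom' inside 'welcome'.
unescaped = re.sub('.com', '.in', url)

# Escaped dot: only the literal '.com' is replaced.
escaped = re.sub(r'\.com', '.in', url)

# count=1 replaces only the first match.
first_only = re.sub(r's', 'S', 'www.btechsmartclass.com', count=1)

# \1 and \2 in the replacement refer to the captured groups.
swapped = re.sub(r'(\w+)@(\w+)', r'\2@\1', 'user@host')

print(unescaped)   # www.we.ine.in
print(escaped)     # www.welcome.in
print(first_only)  # www.btechSmartclass.com
print(swapped)     # host@user
```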

split( ) in Python

The split( ) function of the re module returns a list of substrings obtained by splitting the string at each match of the pattern.


Example
import re

webStr = 'www.btechsmartclass.com'

# A raw string keeps the backslash intact, so \. matches a literal dot
print(re.split(r'\.', webStr))

When we run the above example code, it produces the following output.

['www', 'btechsmartclass', 'com']
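
The split( ) function also accepts a set as the pattern and an optional maxsplit argument. The sketch below shows both, along with the effect of a capturing group; the sample string 'a, b;c.d' and the variable names are assumptions for illustration.

```python
import re

webStr = 'www.btechsmartclass.com'

# maxsplit=1 stops after the first split.
first_split = re.split(r'\.', webStr, maxsplit=1)

# A set lets us split on several different delimiters at once.
parts = re.split(r'[.,;]\s*', 'a, b;c.d')

# With a capturing group, the delimiters are kept in the result.
with_delimiters = re.split(r'(\.)', 'a.b')

print(first_split)       # ['www', 'btechsmartclass.com']
print(parts)             # ['a', 'b', 'c', 'd']
print(with_delimiters)   # ['a', '.', 'b']
```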

