How Regular Expressions Work in Python
Introduction:
In this blog, we are going to discuss how regular expressions work when dealing with text data.
Regular expressions, also known as regex, are used to match or find strings of text such as particular characters, words, sets of strings, patterns of characters, or numbers. They are especially useful when dealing with raw data from the web, which often contains long text, repeated text, and HTML tags.
How It Works
Python has a built-in module named re for working with regular expressions.
import re
Regex Flags:
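A regular expression can be combined with flags that control various aspects of matching. The basic flags are I, L, M, S, U, and X.
- re.I: ignores case when matching.
- re.L: makes matching locale dependent (only valid with bytes patterns in Python 3).
- re.M: makes ^ and $ match at the start and end of every line, which is useful for finding patterns across multiple lines.
- re.S: makes the dot (.) match any character, including a newline.
- re.U: matches according to Unicode rules (already the default for str patterns in Python 3).
- re.X: allows whitespace and comments inside the pattern, so the regex can be written in a more readable format.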
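The effect of the most commonly used flags can be seen in a small sketch. The sample text and variable names below are made up for illustration and are not from any real dataset.

```python
import re

text = "Hello World\nhello python"

# re.I: ignore case, so "hello" matches both "Hello" and "hello".
ignore_case = re.findall(r"hello", text, flags=re.I)
print(ignore_case)  # ['Hello', 'hello']

# re.M: ^ matches at the start of every line, not only the start of the string.
first_only = re.findall(r"^\w+", text)
every_line = re.findall(r"^\w+", text, flags=re.M)
print(first_only, every_line)  # ['Hello'] ['Hello', 'hello']

# re.S: the dot also matches the newline between "World" and "hello".
without_s = re.search(r"World.hello", text)
with_s = re.search(r"World.hello", text, flags=re.S)
print(without_s, with_s is not None)  # None True

# re.X: whitespace and comments inside the pattern are ignored.
date_pattern = re.compile(r"""
    (\d{4})   # year
    -
    (\d{2})   # month
""", flags=re.X)
date_parts = date_pattern.search("released 2019-03").groups()
print(date_parts)  # ('2019', '03')
```

Several flags can be combined with the | operator, for example flags=re.I | re.M.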
Regex Patterns:
Patterns are sequences of characters that the regular expression engine interprets in special ways.
1. [ab]
Match a single occurrence of either a or b.
2. [^ab]
Match any single character except a and b.
3. [a-z]
Match any single character in the range a to z.
4. [^a-z]
Match any single character outside the range a to z.
5. [A-Z]
Match any single character in the range A to Z.
6. [0-9]
Match any single digit in the range 0 to 9.
7. [a-zA-Z]
Match any single character in the range a to z or A to Z.
8. [ ]
Match any single character listed inside the brackets.
9. \s
Match any whitespace character.
10. \S
Match any non-whitespace character.
11. \d
Match any digit.
12. \D
Match any non-digit.
13. \w
Match any word character (a letter, a digit, or an underscore).
14. \W
Match any non-word character.
15. \b
Match a word boundary, that is, the position between a word character and a non-word character.
16. \B
Match any position that is not a word boundary.
17. ^
Match the start of the string.
18. $
Match the end of the string.
19. (a|b)
Match either a or b.
20. a?
Match zero or one occurrence of a, but not more.
21. a*
Match zero or more occurrences of a.
22. a+
Match one or more occurrences of a.
23. a{n}
Match exactly n occurrences of a.
24. a{2}
Match exactly two occurrences of a.
25. a{2,}
Match two or more consecutive occurrences of a.
26. a{2,6}
Match between 2 and 6 consecutive occurrences of a. Write the braces without spaces; Python treats a{2, 6} as literal text rather than a quantifier.
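A few of the patterns listed above can be tried out with re.findall() and re.search(). This is only a minimal sketch; the sample strings are invented for illustration.

```python
import re

# Character sets and negated ranges
set_hits = re.findall(r"[ab]", "cab")
print(set_hits)  # ['a', 'b']
non_lower = re.findall(r"[^a-z]", "aB1")
print(non_lower)  # ['B', '1']

# Shorthand classes: \d for digits, \w for word characters
numbers = re.findall(r"\d+", "Order 66 shipped in 2021")
print(numbers)  # ['66', '2021']
word_chars = re.findall(r"\w+", "hi there_2!")
print(word_chars)  # ['hi', 'there_2']

# Anchors and word boundaries
starts_with_order = bool(re.search(r"^Order", "Order 66"))
print(starts_with_order)  # True
whole_words = re.findall(r"\bto\b", "to tomorrow to")
print(whole_words)  # ['to', 'to']

# Alternation
either = re.findall(r"(a|b)", "cab")
print(either)  # ['a', 'b']

# Quantifiers applied to the letter a
sample = "bd bad baad"
zero_or_one = re.findall(r"ba?d", sample)
print(zero_or_one)  # ['bd', 'bad']
zero_or_more = re.findall(r"ba*d", sample)
print(zero_or_more)  # ['bd', 'bad', 'baad']
one_or_more = re.findall(r"ba+d", sample)
print(one_or_more)  # ['bad', 'baad']
two_or_more = re.findall(r"a{2,}", "a aa aaa")
print(two_or_more)  # ['aa', 'aaa']
```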
Regex Functions:
The module defines several functions that find patterns in text, so that the matches can be processed according to the requirements of the application.
1. re.split() - This splits the string at every place where the pattern matches and returns the pieces as a list.
2. re.match() - This checks for a match only at the start of the string. If it finds the pattern at the start of the input string, it returns a match object; if not, it returns None.
3. re.search() - This checks for a match anywhere in the string. It returns a match object for the first location where the pattern matches, or None if there is no match.
4. re.sub() - This finds every match of the pattern in the string and replaces it with the given replacement.
5. re.findall() - This returns a list containing all non-overlapping matches of the pattern in the string.
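The difference between re.match(), re.search(), and re.findall() is easiest to see side by side. The sketch below uses an invented sample string.

```python
import re

text = "regex in Python: 3 functions, 2 examples"

# re.match() only looks at the start of the string.
at_start = re.match(r"regex", text)
not_at_start = re.match(r"Python", text)
print(at_start.group())  # regex
print(not_at_start)      # None

# re.search() scans the whole string and stops at the first match.
first_digit = re.search(r"\d", text)
print(first_digit.group(), first_digit.start())  # 3 17

# re.findall() collects every non-overlapping match into a list.
all_digits = re.findall(r"\d", text)
print(all_digits)  # ['3', '2']
```

Because re.match() and re.search() return None when nothing matches, it is safer to check the result before calling .group() on it.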
Examples:
1. To split a sentence into words on whitespace, use the re.split() function.
import re
# A raw string (r'...') keeps \s from being treated as a string escape.
print(re.split(r'\s+', 'I like this blog.'))
['I', 'like', 'this', 'blog.']
2. To extract email addresses, use the re.findall() function.
import re
doc = "My email addresses are abc@gmail.com, xyz@yahoo.com"
# One or more word characters, dots, or hyphens on each side of the @ sign.
addresses = re.findall(r'[\w\.-]+@[\w\.-]+', doc)
for address in addresses:
    print(address)
abc@gmail.com
xyz@yahoo.com
3. To replace one word with another, use the re.sub() function.
import re
doc = "Hi, My name is Adam"
new_doc = re.sub(r'Adam', r'John', doc)
print(new_doc)
Hi, My name is John
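Since the introduction mentions raw web text containing HTML tags, re.sub() can also be used for a rough cleanup. The following is a minimal sketch on a made-up snippet. The pattern <[^>]+> is a simple assumption that works on tidy tags; it is not a substitute for a real HTML parser.

```python
import re

raw = "<p>Regex   is <b>useful</b> for   cleaning text.</p>"

# Remove anything that looks like a tag: "<", then non-">" characters, then ">".
no_tags = re.sub(r"<[^>]+>", "", raw)

# Collapse runs of whitespace into a single space.
clean = re.sub(r"\s+", " ", no_tags).strip()
print(clean)  # Regex is useful for cleaning text.
```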
Conclusion:
In summary, we discussed how regular expressions work when dealing with text data.
Thank you...
References:
[1]: Natural Language Processing Recipes: Unlocking Text Data with Machine Learning and Deep Learning using Python