Python regular expressions (the re module)
What is a regular expression
A regular expression is a text pattern made up of ordinary characters (for example, the letters a through z) and special characters (called "metacharacters"). A regular expression is written as a single string that describes, and matches, a set of strings conforming to a syntax rule.
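As a minimal sketch of the difference between ordinary characters and metacharacters (the sample strings are made up for illustration):

```python
import re

# Ordinary characters match themselves.
literal = re.search(r"cat", "concatenate")

# \d (any digit) and + (one or more) are metacharacters with special meaning.
digits = re.search(r"\d+", "order 66 shipped")

print(literal.group())  # cat
print(digits.group())   # 66
```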
Introduction to ordinary characters
Ordinary characters are all printable and non-printable characters that are not designated as metacharacters. This includes all uppercase and lowercase letters, all digits, all punctuation marks, and some other symbols.
Nonprinting characters can also be part of regular expressions .
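For instance, the tab (\t) and newline (\n) escapes can appear in a pattern like any other character. A small sketch, using an invented two-line table as input:

```python
import re

text = "name\tvalue\nalpha\t1\n"

# \t matches a tab and \n matches a newline; both are non-printing characters.
tabs = re.findall(r"\t", text)
lines = re.split(r"\n", text.strip())
fields = [re.split(r"\t", line) for line in lines]

print(len(tabs))  # 2
print(fields)     # [['name', 'value'], ['alpha', '1']]
```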
Three important quantifier concepts
Greedy, for example "*": a greedy quantifier first takes as much of the string as it can. If the rest of the pattern then fails to match, it gives back one character and tries again. This process is called backtracking; it gives back one character at a time until a match is found or there are no characters left to give back. Of the three kinds of quantifier, greedy ones can consume the most resources.
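Lazy (reluctant), written by appending "?" to a quantifier, as in "*?": a lazy quantifier works the other way around. It starts at the beginning of the target, takes as few characters as possible, and checks whether the rest of the pattern matches; if not, it takes one more character and tries again, continuing until a match is found or the end of the string is reached.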
Possessive, written by appending "+" to a quantifier, as in "*+": a possessive quantifier is like a greedy one in that it takes as much content as possible and then tries to complete the match, but it tries only once and cannot backtrack. It is like grabbing a rock first and then picking the gold out of it. Python's re module accepts possessive quantifiers from version 3.11 onward.
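The three behaviours can be compared on one input. This is only a sketch: the quoted sample text is invented, and the possessive case is guarded because older Python versions reject that syntax.

```python
import re
import sys

text = '"a" and "b"'

# Greedy: takes everything, then backtracks to the last closing quote.
greedy = re.search(r'".*"', text).group()
# Lazy: takes as little as possible, stopping at the first closing quote.
lazy = re.search(r'".*?"', text).group()

print(greedy)  # "a" and "b"
print(lazy)    # "a"

possessive = None
if sys.version_info >= (3, 11):
    # ".*+" swallows the rest of the string and never gives anything back,
    # so the closing quote in the pattern can never match.
    possessive = re.search(r'".*+"', text)
    print(possessive)  # None
```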
Common functions in the re module
re.compile(pattern, flags=0) compiles a regular expression pattern and returns a pattern object. (Compiling frequently used regular expressions into pattern objects can improve efficiency.)
pattern: the expression string to compile.
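flags: compilation flags that modify how the regular expression behaves, such as whether matching is case sensitive or works across multiple lines. Commonly used flags include re.I (ignore case), re.M (multi-line), re.S (dot also matches newline), and re.X (verbose).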
The common methods of a pattern object are: match(), search(), findall(), finditer(), split(), sub(), subn().
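A short sketch of compiling a pattern with flags and then reusing the pattern object; the sample text and the variable name word are illustrative only:

```python
import re

# Compile once and reuse. re.I ignores case; re.M lets ^ match at the start of each line.
word = re.compile(r"^python\b", re.I | re.M)

text = "Python is fun\npython is popular\nI like PYTHON"

found = word.findall(text)
spans = [m.span() for m in word.finditer(text)]

print(found)  # ['Python', 'python']
print(spans)  # [(0, 6), (14, 20)]
```

The third line is not reported because "PYTHON" does not sit at the start of that line.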
match() looks for a match at the beginning of a string; it finds only one result and returns it. (This is not a full match: if the string still has characters left over when the pattern ends, the match is still considered successful. For a full match, add the boundary anchor '$' to the end of the expression.)
Explanation: string is the string to be matched; pos and endpos specify the start and end positions within the string. When they are not given, matching starts from the beginning of the string by default. When the match succeeds, a Match object is returned.
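The behaviour of match(), including pos and endpos, can be sketched as follows (the input strings are made up):

```python
import re

pattern = re.compile(r"\d+")

prefix = pattern.match("123abc")      # succeeds; the trailing "abc" is ignored
no_match = pattern.match("abc123")    # None: no digits at the start

# pos moves the starting point; endpos limits how far the match may look.
window = pattern.match("abc12345", 3, 6)

# Adding "$" requires the match to reach the end of the string.
strict = re.match(r"\d+$", "123abc")

print(prefix.group())                 # 123
print(no_match)                       # None
print(window.group(), window.span())  # 123 (3, 6)
print(strict)                         # None
```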
In sub() and subn(), count specifies the maximum number of substitutions; the default value of 0 replaces every match.
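A brief illustration of count, assuming a toy input string:

```python
import re

text = "a1b2c3"

all_replaced = re.sub(r"\d", "#", text)             # count defaults to 0: replace all
first_two = re.sub(r"\d", "#", text, count=2)       # stop after two replacements
with_total = re.subn(r"\d", "#", text)              # also returns how many were made

print(all_replaced)  # a#b#c#
print(first_two)     # a#b#c3
print(with_total)    # ('a#b#c#', 3)
```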
Some points to note
1. The difference between re.match, re.search, and re.findall:
re.match matches only at the beginning of the string; if the beginning of the string does not conform to the regular expression, the match fails and the function returns None. re.search scans the whole string and stops at the first match it finds. re.findall returns all matching results.
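The same pattern run through all three functions makes the difference visible (the sample text is invented):

```python
import re

text = "id=7, id=42"

at_start = re.match(r"\d+", text)           # None: the string does not start with a digit
first = re.search(r"\d+", text).group()     # first match anywhere in the string
everything = re.findall(r"\d+", text)       # every non-overlapping match

print(at_start)    # None
print(first)       # 7
print(everything)  # ['7', '42']
```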
2. Greedy and non-greedy matching
*, +, and ? are all greedy, meaning they match as much as possible. Appending ? to any of them makes the match lazy (non-greedy).
Note: when the pattern is fully constrained on both sides (for example, anchored with ^ and $), greedy and non-greedy modes give the same result, so the non-greedy modifier has no effect on what is matched.
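A sketch of both points, assuming a made-up tag-like string: without anchors the two modes differ, while with ^ and $ on both sides they capture the same text.

```python
import re

text = "<a><b>"

# Unanchored: greedy runs to the last ">", lazy stops at the first one.
greedy = re.search(r"<.*>", text).group()
lazy = re.search(r"<.*?>", text).group()

# Anchored on both sides: the lazy form is forced to expand to the end anyway.
anchored_greedy = re.match(r"^<(.*)>$", text).group(1)
anchored_lazy = re.match(r"^<(.*?)>$", text).group(1)

print(greedy)           # <a><b>
print(lazy)             # <a>
print(anchored_greedy)  # a><b
print(anchored_lazy)    # a><b
```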