lp:~pythonregexp2.7/python/issue2636-08
Add support for named POSIX character classes of the form [:class:]. Note that the [] are part of the class name and do not by themselves form a character set. To use a character class, it must be included in a character set, e.g. [[:alnum:]_], which is equivalent to \w. A character class written outside a character set will be interpreted as an ordinary character set and will generate a warning indicating this. Thus, r'[:alpha:]' will match any character in set([':', 'a', 'h', 'l', 'p']) and generate a warning; it will not match [a-zA-Z], as the user might expect. This will be documented. The POSIX and Perl-specific character classes are:
alpha -- [A-Za-z]
alnum -- [A-Za-z0-9]
ascii -- All valid printable characters in the ASCII set
blank -- [ \t]
cntrl -- Control Character, [\x00-\x1f\x7f]
digit -- \d
graph -- alnum + punct (below)
lower -- [a-z]
print -- graph + space (below)
punct -- Any punctuation, e.g. ',', '.', '!', etc.
space -- \s + '\x0b', [\s\x0b]
upper -- [A-Z]
word -- \w (specific to Python / Perl, not part of POSIX)
xdigit -- [0-9a-fA-F]
Note: The Unicode equivalents will be added to each character class where applicable.
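The behaviour described above can be approximated with the standard-library re module, which does not itself understand [:class:]. The following is only a rough sketch, not the code in this branch: the names POSIX_CLASSES and expand_posix_classes are hypothetical, the table is ASCII-only and covers just the classes with a simple range equivalent, and the helper assumes each [:class:] is already written inside a character set.

```python
import re

# Hypothetical table: POSIX class name -> text usable inside a character set.
# ASCII-only; classes without a simple range equivalent are left out.
POSIX_CLASSES = {
    "alpha": "A-Za-z",
    "alnum": "A-Za-z0-9",
    "blank": r" \t",
    "cntrl": r"\x00-\x1f\x7f",
    "digit": "0-9",
    "lower": "a-z",
    "upper": "A-Z",
    "space": r" \t\n\r\f\v",
    "word": "A-Za-z0-9_",
    "xdigit": "0-9A-Fa-f",
}

_CLASS_RE = re.compile(r"\[:([a-z]+):\]")


def expand_posix_classes(pattern):
    """Replace each [:name:] in pattern with its range text.

    Assumes the class already sits inside a character set, e.g.
    [[:alnum:]_]; no attempt is made to check that.
    """
    def _sub(match):
        name = match.group(1)
        if name not in POSIX_CLASSES:
            raise ValueError("unknown POSIX class: %r" % name)
        return POSIX_CLASSES[name]

    return _CLASS_RE.sub(_sub, pattern)


# Inside a set, the class expands to the expected ranges.
word_chars = expand_posix_classes(r"[[:alnum:]_]+")
print(word_chars)
print(re.findall(word_chars, "foo_bar baz-42"))

# Outside a set, the unmodified re module treats [:alpha:] as the plain
# character set of ':', 'a', 'h', 'l' and 'p'.
print(sorted(set(re.findall(r"[:alpha:]", "alphabet: Zed"))))
```

The last call illustrates the pitfall noted in the description: without the enclosing set, only the literal characters of the class name match.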
- Get this branch:
- bzr branch lp:~pythonregexp2.7/python/issue2636-08
Recent revisions
- 39031. By Jeffrey C. "The TimeHorse" Jacobs <email address hidden>
-
Merged in changes from the core Regexp branch.
- 39030. By Jeffrey C. "The TimeHorse" Jacobs <email address hidden>
-
Merged in changes from the core Regexp branch.
- 39029. By Jeffrey C. "The TimeHorse" Jacobs <email address hidden>
-
Merged in changes from the core Regexp branch.
- 39028. By Jeffrey C. "The TimeHorse" Jacobs <email address hidden>
-
Merged in changes from the core Regexp branch.
- 39027. By Jeffrey C. "The TimeHorse" Jacobs <email address hidden>
-
Merged in changes from the core Regexp branch.
- 39026. By Jeffrey C. Jacobs <email address hidden>
-
Rolled back the Character Class comment because that will be
useful in this branch.
- 39024. By Jeffrey C. Jacobs <email address hidden>
-
(from original local svn repository): Added ignore directives to not
allow .pyc and .pyo files to be checked in; could not be done under
DOS because there is no valid editor. Reapplied my edits to sre_parse.py:
Removed unused code for Character Classes -- will be added back in
sub-branch 6 for adding in Character Class support.
- 39023. By Jeffrey C. Jacobs <email address hidden>
-
Moved include of sre_parse to the only section of code that actually needs it inside the compile method. Thus, unless compile is called, sre_parse will not unnecessarily be included.
Branch metadata
- Branch format:
- Branch format 6
- Repository format:
- Bazaar pack repository format 1 with rich root (needs bzr 1.0)