lp:~pythonregexp2.7/python/issue2636-25

Created by TimeHorse and last modified

Currently, Python defines its Regular Expression patterns as a series of SRE_CODE-sized op-codes. SRE_CODE is in turn defined in terms of a 16-bit wchar_t type, specifically the same definition as Py_UNICODE. The problem is that one of the parameters recognized by almost every op-code is a skip count: an offset within the current code to the next op-code. If a given op-code encodes a large expression, one longer than 65535 code units, that expression cannot be compiled into a Regular Expression pattern. This item aims to fix that problem, which is based on issue 1160.
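
The limit described above can be illustrated with a minimal sketch. It does not use the real sre engine; it only assumes that a skip count must fit into a single code unit, and it uses the standard struct module to model a 16-bit and a 32-bit unit. The names encode_skip, CODE_16 and CODE_32 are hypothetical and exist only for this illustration.

```python
import struct

# Hypothetical names for illustration only; these are not identifiers
# from the sre sources.
CODE_16 = "<H"  # one unsigned 16-bit code unit
CODE_32 = "<I"  # one unsigned 32-bit code unit


def encode_skip(skip, code_format):
    """Pack a skip count into a single code unit.

    Returns the packed bytes, or None when the value does not fit.
    """
    try:
        return struct.pack(code_format, skip)
    except struct.error:
        return None


if __name__ == "__main__":
    for skip in (1000, 65535, 65536, 70000):
        fits_16 = encode_skip(skip, CODE_16) is not None
        fits_32 = encode_skip(skip, CODE_32) is not None
        print(skip, "16-bit:", fits_16, "32-bit:", fits_32)
```

In this model, a skip count of 65536 or more cannot be stored in a 16-bit unit but fits in a 32-bit one, which is the motivation for widening SRE_CODE.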

Get this branch:
bzr branch lp:~pythonregexp2.7/python/issue2636-25
Members of Python Regexp 2.7 can upload to this branch.

Branch information

Owner:
Python Regexp 2.7
Project:
Python
Status:
Development

Recent revisions

39040. By Jeffrey C. "The TimeHorse" Jacobs

Changed the non-UNICODE definition of SRE_CODE to unsigned int (typically 32 bits). For Wide UNICODE characters, SRE_CODE is Py_UCS4, which is already 32 bits.

39039. By Jeffrey C. "The TimeHorse" Jacobs

Merged in changes from the latest python source snapshot.

39038. By Jeffrey C. "The TimeHorse" Jacobs

Modified documentation so the paragraphs fit on an 80-column
screen by making sure that each line occupies no more than 72 columns.

39037. By Jeffrey C. "The TimeHorse" Jacobs

Added a new, more complex test for branching (using the OR ('|') operator)
in Regular Expressions.

39036. By Jeffrey C. "The TimeHorse" Jacobs

Merged in changes from the latest python source snapshot.

39035. By Jeffrey C. "The TimeHorse" Jacobs

Changed the generic VERBOSE flag to be VERBOSE_SRE_ENGINE so that it can
be defined at the make level without potentially interfering with other
modules.

39034. By Jeffrey C. "The TimeHorse" Jacobs

Moved these documentation changes into their own branch so that the minor
changes do not force the suggested documentation changes to be included as
well; they are now included only in their own branch, for issue 12.

39033. By Jeffrey C. "The TimeHorse" Jacobs

Replaced tab with spaces.

39032. By Jeffrey C. "The TimeHorse" Jacobs

Better comment for the end of line test.

39031. By Jeffrey C. "The TimeHorse" Jacobs

Merged in changes from the latest python source snapshot.

Branch metadata

Branch format:
Branch format 6
Repository format:
Bazaar pack repository format 1 with rich root (needs bzr 1.0)
This branch contains Public information 
Everyone can see this information.