Perl Compatible Regular Expressions


Perl Compatible Regular Expressions (PCRE) is a library written in C that implements a regular expression engine inspired by the capabilities of the Perl programming language. Philip Hazel started writing PCRE in summer 1997. PCRE's syntax is much more powerful and flexible than either of the POSIX regular expression flavors and that of many other regular-expression libraries.
While PCRE originally aimed at feature equivalence with Perl, the two implementations are not fully equivalent. During the PCRE 7.x and Perl 5.9.x phase, the two projects coordinated development, with features being ported between them in both directions.
A number of prominent open-source programs, such as the Apache and Nginx HTTP servers, and the PHP and R scripting languages, incorporate the PCRE library; proprietary software can do likewise, as the library is BSD-licensed. As of Perl 5.10, PCRE is also available as a replacement for Perl's default regular-expression engine through the re::engine::PCRE module.
The library can be built on Unix, Windows, and several other environments. PCRE is distributed with a POSIX C wrapper, a native C++ wrapper, several test programs, and the utility program pcregrep, which is built in tandem with the library.

Features

;Just-in-time compiler support
;Flexible memory management
;Consistent escaping rules
;Extended character classes
;Minimal matching
;Unicode character properties
;Multiline matching
;Newline/linebreak options
;Backslash-R options
;Beginning of pattern options
;Named subpatterns
;Backreferences
;Subroutines
;Atomic grouping
;Look-ahead and look-behind assertions
;Escape sequences for zero-width assertions
;Comments
;Recursive patterns
;Generic callouts
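
Several of the features listed above can be illustrated with a short sketch. The example below uses Python's built-in re module rather than PCRE itself; that module is a separate engine, chosen here only as an assumption of convenience because it accepts the same Perl-style syntax for the constructs shown (named subpatterns, backreferences, minimal matching, look-ahead and look-behind assertions, multiline matching, and comments). The pattern names and sample strings are hypothetical.

```python
import re

# Named subpattern plus a backreference to it: finds a doubled word.
doubled = re.compile(r"\b(?P<word>\w+)\s+(?P=word)\b")

# Minimal (lazy) matching: ".*?" stops at the first closing tag
# instead of running on to the last one.
lazy = re.compile(r"<b>(.*?)</b>")

# Look-behind and look-ahead assertions: match digits that are
# preceded by a dollar sign and followed by " USD", consuming neither.
price = re.compile(r"(?<=\$)\d+(?= USD)")

# Multiline matching with comments inside the pattern (extended mode):
# "^" and "$" anchor at line boundaries rather than string boundaries.
heading = re.compile(
    r"""
    ^\#\s+          # a line that starts with a hash and whitespace
    (?P<title>.+)$  # the rest of the line is the title
    """,
    re.MULTILINE | re.VERBOSE,
)

if __name__ == "__main__":
    print(doubled.search("this is is a test").group("word"))
    print(lazy.findall("<b>one</b> and <b>two</b>"))
    print(price.search("cost: $42 USD").group())
    print(heading.findall("intro\n# Title\nbody"))
```

Features such as recursive patterns, subroutines, and callouts have no equivalent in Python's re module, so they are not shown here.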

Differences from Perl

Differences between PCRE and Perl include but are not limited to:
;Recursive matches are atomic in PCRE and non-atomic in Perl
;The value of a capture buffer that is subject to the ? quantifier and nested in another quantified capture buffer differs between the two
;PCRE allows named capture buffers to be given numeric names; Perl requires the name to follow the rules for barewords
;PCRE allows alternatives within lookbehind to be different lengths
;PCRE does not support certain "experimental" Perl constructs
;PCRE and Perl are slightly different in their tolerance of erroneous constructs
;PCRE has a hard limit on recursion depth, whereas Perl does not
With the exception of the above points, PCRE is capable of passing the tests in the Perl 't/op/re_tests' file, one of the main syntax-level regression tests for Perl's regular expression engine.
