In any particular syntax for regular expressions, some characters are
always special, others are sometimes special, and others are never
special. The particular syntax that Regex recognizes for a given
regular expression depends on the value in the syntax field of the
pattern buffer of that regular expression.
You get a pattern buffer by compiling a regular expression. See GNU Pattern Buffers, and POSIX Pattern Buffers, for more information on pattern buffers. See GNU Regular Expression Compiling, POSIX Regular Expression Compiling, and BSD Regular Expression Compiling, for more information on compiling.
Regex considers the value of the syntax field to be a collection
of bits; we refer to these bits as syntax bits. In most cases,
they affect what characters represent what operators. We describe the
meanings of the operators to which we refer in Common Operators,
GNU Operators, and GNU Emacs Operators.
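As a minimal sketch of how syntax bits combine, the following C fragment
assumes the GNU C Library's <regex.h> with _GNU_SOURCE defined; the helper
name compiled_syntax is hypothetical. It ORs bits together, installs them
with re_set_syntax, compiles a pattern, and returns the value recorded in
the syntax field of the resulting pattern buffer.

```c
#define _GNU_SOURCE            /* assumption: glibc's GNU regex interface */
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Hypothetical helper: compile PATTERN under SYNTAX and return the syntax
   bits recorded in the pattern buffer, or (reg_syntax_t) -1 if PATTERN
   fails to compile.  */
static reg_syntax_t
compiled_syntax (reg_syntax_t syntax, const char *pattern)
{
  struct re_pattern_buffer buf;
  reg_syntax_t old, recorded;
  const char *err;

  assert (pattern != NULL);
  memset (&buf, 0, sizeof buf);   /* no buffer, fastmap or translate table */
  old = re_set_syntax (syntax);   /* returns the previously set syntax */
  err = re_compile_pattern (pattern, strlen (pattern), &buf);
  re_set_syntax (old);            /* restore the caller's syntax */
  if (err != NULL)
    return (reg_syntax_t) -1;
  recorded = buf.syntax;
  regfree (&buf);
  return recorded;
}
```

Under these assumptions, compiled_syntax (RE_NO_BK_PARENS | RE_NO_BK_VBAR,
"(a|b)") should hand back the same two bits that were requested.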
For reference, here is the complete list of syntax bits, in alphabetical order:
RE_BACKSLASH_ESCAPE_IN_LISTS
If this bit is set, then `\' inside a list (see List Operators)
quotes (makes ordinary, if it's special) the following character; if
this bit isn't set, then `\' is an ordinary character inside lists.
(See The Backslash Character, for what `\' does outside of lists.)
RE_BK_PLUS_QM
If this bit is set, then `\+' represents the match-one-or-more
operator and `\?' represents the match-zero-or-one operator; if
this bit isn't set, then `+' represents the match-one-or-more
operator and `?' represents the match-zero-or-one operator. This
bit is irrelevant if RE_LIMITED_OPS is set.
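The effect of RE_BK_PLUS_QM can be sketched as follows, again assuming
glibc's <regex.h> with _GNU_SOURCE; full_match and the two demo functions
are hypothetical names introduced only for illustration.

```c
#define _GNU_SOURCE            /* assumption: glibc's GNU regex interface */
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Hypothetical helper: 1 if PATTERN, compiled under SYNTAX, matches all
   of TEXT; 0 if it does not; -1 if PATTERN fails to compile.  */
static int
full_match (reg_syntax_t syntax, const char *pattern, const char *text)
{
  struct re_pattern_buffer buf;
  reg_syntax_t old;
  const char *err;
  regoff_t len, matched;

  assert (pattern != NULL && text != NULL);
  memset (&buf, 0, sizeof buf);
  old = re_set_syntax (syntax);
  err = re_compile_pattern (pattern, strlen (pattern), &buf);
  re_set_syntax (old);
  if (err != NULL)
    return -1;
  len = (regoff_t) strlen (text);
  matched = re_match (&buf, text, len, 0, NULL);  /* match length, or -1 */
  regfree (&buf);
  return matched == len;
}

/* Bit set: `\+' and `\?' are operators, bare `+' is an ordinary character.  */
static int
bk_plus_qm_set (void)
{
  return full_match (RE_BK_PLUS_QM, "ab\\+", "abbb") == 1
         && full_match (RE_BK_PLUS_QM, "ab\\?", "a") == 1
         && full_match (RE_BK_PLUS_QM, "ab+", "ab+") == 1;
}

/* Bit clear: `+' and `?' are operators, `\+' matches a literal plus sign.  */
static int
bk_plus_qm_clear (void)
{
  return full_match (0, "ab+", "abbb") == 1
         && full_match (0, "ab?", "a") == 1
         && full_match (0, "ab\\+", "ab+") == 1;
}
```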
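RE_CHAR_CLASSES
If this bit is set, then you can use character classes in lists; if
this bit isn't set, then you can't.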
RE_CONTEXT_INDEP_ANCHORS
If this bit is set, then `^' and `$' are special anywhere outside
a list; if this bit isn't set, then these characters are special only in
certain contexts. See Match-beginning-of-line Operator, and
Match-end-of-line Operator.
RE_CONTEXT_INDEP_OPS
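If this bit is set, then certain characters are special anywhere
outside a list; if this bit isn't set, then those characters are
special only in some contexts and are ordinary elsewhere.
Specifically, if this bit isn't set, then `*', and (if the syntax bit
RE_LIMITED_OPS isn't set) `+' and `?' (or `\+' and `\?', depending
on the syntax bit RE_BK_PLUS_QM) represent repetition operators
only if they're not first in a regular expression or just after an
open-group or alternation operator. The same holds for `{' (or
`\{', depending on the syntax bit RE_NO_BK_BRACES) if
it is the beginning of a valid interval and the syntax bit
RE_INTERVALS is set.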
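A small sketch of the context-dependent case, with RE_CONTEXT_INDEP_OPS
left unset. It assumes glibc's <regex.h> with _GNU_SOURCE and reflects how
that implementation behaves; full_match and star_is_contextual are
hypothetical names. With no syntax bits set, `\(' opens a group and `\|'
alternates, so a `*' right after either one, or first in the pattern, is
expected to be an ordinary character.

```c
#define _GNU_SOURCE            /* assumption: glibc's GNU regex interface */
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Hypothetical helper: 1 if PATTERN, compiled under SYNTAX, matches all
   of TEXT; 0 if it does not; -1 if PATTERN fails to compile.  */
static int
full_match (reg_syntax_t syntax, const char *pattern, const char *text)
{
  struct re_pattern_buffer buf;
  reg_syntax_t old;
  const char *err;
  regoff_t len, matched;

  assert (pattern != NULL && text != NULL);
  memset (&buf, 0, sizeof buf);
  old = re_set_syntax (syntax);
  err = re_compile_pattern (pattern, strlen (pattern), &buf);
  re_set_syntax (old);
  if (err != NULL)
    return -1;
  len = (regoff_t) strlen (text);
  matched = re_match (&buf, text, len, 0, NULL);  /* match length, or -1 */
  regfree (&buf);
  return matched == len;
}

/* With no syntax bits set, `*' repeats only when something precedes it.  */
static int
star_is_contextual (void)
{
  return full_match (0, "a*", "aaa") == 1         /* operator after `a' */
         && full_match (0, "*a", "*a") == 1       /* first: ordinary */
         && full_match (0, "\\(*a\\)", "*a") == 1 /* after open-group */
         && full_match (0, "b\\|*a", "*a") == 1;  /* after alternation */
}
```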
RE_CONTEXT_INVALID_OPS
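If this bit is set, then repetition and alternation operators can't be
in certain positions within a regular expression. Specifically, the
regular expression is invalid if it has a repetition operator first in
the regular expression or just after a match-beginning-of-line,
open-group, or alternation operator; or an alternation operator first or
last in the regular expression, just before a match-end-of-line
operator, or just after an alternation or open-group operator.
If this bit isn't set, then you can put the characters representing the
repetition and alternation operators anywhere in a regular expression.
Whether or not they will in fact be operators in certain positions
depends on other syntax bits.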
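RE_DOT_NEWLINE
If this bit is set, then the match-any-character operator matches a
newline; if this bit isn't set, then it doesn't.
RE_DOT_NOT_NULL
If this bit is set, then the match-any-character operator doesn't match
a null character; if this bit isn't set, then it does.
RE_INTERVALS
If this bit is set, then Regex recognizes interval operators; if this
bit isn't set, then it doesn't.
RE_LIMITED_OPS
If this bit is set, then Regex doesn't recognize the match-one-or-more,
match-zero-or-one, or alternation operators; if this bit isn't set, then
it does.
RE_NEWLINE_ALT
If this bit is set, then newline represents the alternation operator; if
this bit isn't set, then newline is ordinary.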
RE_NO_BK_BRACES
If this bit is set, then `{' represents the open-interval operator
and `}' represents the close-interval operator; if this bit isn't
set, then `\{' represents the open-interval operator and
`\}' represents the close-interval operator. This bit is relevant
only if RE_INTERVALS is set.
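How RE_NO_BK_BRACES interacts with RE_INTERVALS can be sketched like
this, assuming glibc's <regex.h> with _GNU_SOURCE; full_match and
braces_follow_bits are hypothetical names.

```c
#define _GNU_SOURCE            /* assumption: glibc's GNU regex interface */
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Hypothetical helper: 1 if PATTERN, compiled under SYNTAX, matches all
   of TEXT; 0 if it does not; -1 if PATTERN fails to compile.  */
static int
full_match (reg_syntax_t syntax, const char *pattern, const char *text)
{
  struct re_pattern_buffer buf;
  reg_syntax_t old;
  const char *err;
  regoff_t len, matched;

  assert (pattern != NULL && text != NULL);
  memset (&buf, 0, sizeof buf);
  old = re_set_syntax (syntax);
  err = re_compile_pattern (pattern, strlen (pattern), &buf);
  re_set_syntax (old);
  if (err != NULL)
    return -1;
  len = (regoff_t) strlen (text);
  matched = re_match (&buf, text, len, 0, NULL);  /* match length, or -1 */
  regfree (&buf);
  return matched == len;
}

static int
braces_follow_bits (void)
{
  /* Both bits: bare braces delimit an interval.  */
  return full_match (RE_INTERVALS | RE_NO_BK_BRACES, "a{2}", "aa") == 1
         /* RE_INTERVALS alone: backslashed braces delimit an interval.  */
         && full_match (RE_INTERVALS, "a\\{2\\}", "aa") == 1
         /* RE_INTERVALS alone: bare braces are ordinary characters.  */
         && full_match (RE_INTERVALS, "a{2}", "a{2}") == 1
         /* RE_NO_BK_BRACES without RE_INTERVALS has no effect.  */
         && full_match (RE_NO_BK_BRACES, "a{2}", "a{2}") == 1;
}
```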
RE_NO_BK_PARENS
If this bit is set, then `(' represents the open-group operator and
`)' represents the close-group operator; if this bit isn't set, then
`\(' represents the open-group operator and `\)' represents
the close-group operator.
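A sketch of the two grouping spellings, assuming glibc's <regex.h> with
_GNU_SOURCE; full_match and parens_follow_bit are hypothetical names.
Whichever spelling is not the operator matches literal parentheses.

```c
#define _GNU_SOURCE            /* assumption: glibc's GNU regex interface */
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Hypothetical helper: 1 if PATTERN, compiled under SYNTAX, matches all
   of TEXT; 0 if it does not; -1 if PATTERN fails to compile.  */
static int
full_match (reg_syntax_t syntax, const char *pattern, const char *text)
{
  struct re_pattern_buffer buf;
  reg_syntax_t old;
  const char *err;
  regoff_t len, matched;

  assert (pattern != NULL && text != NULL);
  memset (&buf, 0, sizeof buf);
  old = re_set_syntax (syntax);
  err = re_compile_pattern (pattern, strlen (pattern), &buf);
  re_set_syntax (old);
  if (err != NULL)
    return -1;
  len = (regoff_t) strlen (text);
  matched = re_match (&buf, text, len, 0, NULL);  /* match length, or -1 */
  regfree (&buf);
  return matched == len;
}

static int
parens_follow_bit (void)
{
  /* Bit set: bare parentheses group, backslashed ones are literal.  */
  return full_match (RE_NO_BK_PARENS, "(ab)*", "abab") == 1
         && full_match (RE_NO_BK_PARENS, "\\(ab\\)", "(ab)") == 1
         /* Bit clear: backslashed parentheses group, bare ones are literal.  */
         && full_match (0, "\\(ab\\)*", "abab") == 1
         && full_match (0, "(ab)", "(ab)") == 1;
}
```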
RE_NO_BK_REFS
If this bit is set, then Regex doesn't recognize `\'digit as
the back reference operator; if this bit isn't set, then it does.
RE_NO_BK_VBAR
If this bit is set, then `|' represents the alternation operator;
if this bit isn't set, then `\|' represents the alternation
operator. This bit is irrelevant if RE_LIMITED_OPS is set.
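The alternation spellings, and the way RE_LIMITED_OPS overrides them,
might be checked as below. The sketch assumes glibc's <regex.h> with
_GNU_SOURCE; full_match and vbar_follows_bits are hypothetical names.

```c
#define _GNU_SOURCE            /* assumption: glibc's GNU regex interface */
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Hypothetical helper: 1 if PATTERN, compiled under SYNTAX, matches all
   of TEXT; 0 if it does not; -1 if PATTERN fails to compile.  */
static int
full_match (reg_syntax_t syntax, const char *pattern, const char *text)
{
  struct re_pattern_buffer buf;
  reg_syntax_t old;
  const char *err;
  regoff_t len, matched;

  assert (pattern != NULL && text != NULL);
  memset (&buf, 0, sizeof buf);
  old = re_set_syntax (syntax);
  err = re_compile_pattern (pattern, strlen (pattern), &buf);
  re_set_syntax (old);
  if (err != NULL)
    return -1;
  len = (regoff_t) strlen (text);
  matched = re_match (&buf, text, len, 0, NULL);  /* match length, or -1 */
  regfree (&buf);
  return matched == len;
}

static int
vbar_follows_bits (void)
{
  /* Bit set: bare `|' alternates, `\|' is a literal bar.  */
  return full_match (RE_NO_BK_VBAR, "a|b", "b") == 1
         && full_match (RE_NO_BK_VBAR, "a\\|b", "a|b") == 1
         /* Bit clear: `\|' alternates, bare `|' is a literal bar.  */
         && full_match (0, "a\\|b", "b") == 1
         && full_match (0, "a|b", "a|b") == 1
         /* RE_LIMITED_OPS: neither spelling alternates.  */
         && full_match (RE_LIMITED_OPS | RE_NO_BK_VBAR, "a|b", "a|b") == 1
         && full_match (RE_LIMITED_OPS, "a\\|b", "a|b") == 1;
}
```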
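RE_NO_EMPTY_RANGES
If this bit is set, then a range whose ending point collates lower than
its starting point is invalid; if this bit isn't set, then Regex
considers such a range to be empty.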
RE_UNMATCHED_RIGHT_PAREN_ORD
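If this bit is set and the regular expression has no matching open-group
operator, then Regex considers what would otherwise be a close-group
operator (based on how RE_NO_BK_PARENS is set) to match `)'.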
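A sketch of the unmatched close-group case, assuming glibc's <regex.h>
with _GNU_SOURCE and the behavior of that implementation; full_match and
stray_paren_is_ordinary are hypothetical names. Without the bit, a stray
`)' is expected to make compilation fail; with it, the `)' matches itself
while properly paired parentheses still group.

```c
#define _GNU_SOURCE            /* assumption: glibc's GNU regex interface */
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Hypothetical helper: 1 if PATTERN, compiled under SYNTAX, matches all
   of TEXT; 0 if it does not; -1 if PATTERN fails to compile.  */
static int
full_match (reg_syntax_t syntax, const char *pattern, const char *text)
{
  struct re_pattern_buffer buf;
  reg_syntax_t old;
  const char *err;
  regoff_t len, matched;

  assert (pattern != NULL && text != NULL);
  memset (&buf, 0, sizeof buf);
  old = re_set_syntax (syntax);
  err = re_compile_pattern (pattern, strlen (pattern), &buf);
  re_set_syntax (old);
  if (err != NULL)
    return -1;
  len = (regoff_t) strlen (text);
  matched = re_match (&buf, text, len, 0, NULL);  /* match length, or -1 */
  regfree (&buf);
  return matched == len;
}

static int
stray_paren_is_ordinary (void)
{
  reg_syntax_t strict = RE_NO_BK_PARENS;
  reg_syntax_t lenient = RE_NO_BK_PARENS | RE_UNMATCHED_RIGHT_PAREN_ORD;

  return full_match (strict, "a)", "a)") == -1     /* compile error */
         && full_match (lenient, "a)", "a)") == 1  /* `)' matches itself */
         && full_match (lenient, "(a)", "a") == 1; /* pairs still group */
}
```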