If the syntax bit RE_NO_BK_REFS isn't set, then Regex recognizes
back references. A back reference matches a specified preceding group.
The back reference operator is represented by '\digit' anywhere after
the end of a regular expression's digit-th group (see Grouping
Operators). digit must be between '1' and '9'. The matcher assigns
numbers 1 through 9 to the first nine groups it encounters. By using
one of '\1' through '\9' after the corresponding group's close-group
operator, you can match a substring identical to the one that the
group does.
Back references match according to the following (in all examples
below, '(' represents the open-group, ')' the close-group, '{' the
open-interval and '}' the close-interval operator):
If the group matches a substring, the back reference matches an
identical substring. For example, '(a)\1' matches 'aa' and
'(bana)na\1bo\1' matches 'bananabanabobana'. Likewise, '(.*)\1' matches
any (newline-free if the syntax bit RE_DOT_NEWLINE isn't set) string
that is composed of two identical halves; the '(.*)' matches the first
half and the '\1' matches the second half.
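The simple case can be tried out with a short sketch. It uses Python's re module as a stand-in for Regex, which is an assumption made purely for illustration: Python is a different engine with no syntax bits, but it treats '(', ')' and '\1' the same way for these patterns, and its '.' does not match a newline by default. The helper name matches_whole is hypothetical.

```python
import re

# Assumption: Python's re module stands in for GNU Regex in this sketch.
def matches_whole(pattern: str, text: str) -> bool:
    """Return True if the pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None

print(matches_whole(r"(a)\1", "aa"))                          # True
print(matches_whole(r"(bana)na\1bo\1", "bananabanabobana"))   # True
print(matches_whole(r"(.*)\1", "abcabc"))   # True: two identical halves
print(matches_whole(r"(.*)\1", "abcabd"))   # False: the halves differ
```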
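If the group matches more than once (as it might if followed by a
repetition operator), the back reference matches the substring the
group last matched. For example, '((a*)b)*\1\2' matches 'aabababa';
first group 1 (the outer one) matches 'aab' and group 2 (the inner one)
matches 'aa'. Then group 1 matches 'ab' and group 2 matches 'a'. So,
'\1' matches 'ab' and '\2' matches 'a'.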
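A minimal sketch of the repeated-group case follows. It again assumes Python's re module as a stand-in for Regex (an illustrative choice, not the library this section documents); for this pattern Python also keeps the text of the last repetition in each group.

```python
import re

# Assumption: Python's re module stands in for GNU Regex in this sketch.
pattern = re.compile(r"((a*)b)*\1\2")

m = pattern.fullmatch("aabababa")
# The starred group matches 'aab' and then 'ab'; the back references
# repeat the text of the last repetition only.
print(m.group(1))   # ab
print(m.group(2))   # a

# If the back references used the first repetition ('aab' and 'aa'),
# this string would match; it does not.
print(pattern.fullmatch("aababaabaa") is None)   # True
```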
If the group doesn't participate in a match, for instance because it
lies in an alternative that wasn't taken, the back reference makes the
whole match fail. For example, '(one()|two())-and-(three\2|four\3)'
matches 'one-and-three' and 'two-and-four', but not 'one-and-four' or
'two-and-three'. If the pattern matches 'one-and-', then its group 2
matches the empty string and its group 3 doesn't participate in the
match. So, if it then matches 'four', then when it tries to back
reference group 3 (which it will attempt to do because '\3' follows the
'four') the match will fail because group 3 didn't participate in the
match.
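The four combinations can be checked with the sketch below. It assumes Python's re module as a stand-in for Regex, chosen only for illustration; Python likewise fails a back reference to a group that did not participate in the match.

```python
import re

# Assumption: Python's re module stands in for GNU Regex in this sketch.
pattern = re.compile(r"(one()|two())-and-(three\2|four\3)")

# Group 2 is the empty group after 'one', group 3 the one after 'two'.
results = {
    text: pattern.fullmatch(text) is not None
    for text in ("one-and-three", "two-and-four",
                 "one-and-four", "two-and-three")
}
for text, matched in results.items():
    print(text, matched)
# one-and-three True
# two-and-four True
# one-and-four False
# two-and-three False
```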
You can use a back reference as an argument to a repetition operator.
For example, '(a(b))\2*' matches 'a' followed by one or more 'b's:
group 2 supplies the first 'b' and '\2*' matches zero or more further
copies of it. Similarly, '(a(b))\2{3}' matches 'abbbb'.
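A short sketch of repeated back references, assuming Python's re module as a stand-in for Regex (an illustrative choice; in Python '*' and '{3}' apply to a back reference in the same way):

```python
import re

# Assumption: Python's re module stands in for GNU Regex in this sketch.
star = re.compile(r"(a(b))\2*")      # 'ab' plus zero or more further 'b's
fixed = re.compile(r"(a(b))\2{3}")   # 'ab' plus exactly three further 'b's

print(star.fullmatch("ab") is not None)      # True
print(star.fullmatch("abbb") is not None)    # True
print(star.fullmatch("a") is not None)       # False: group 2 needs a 'b'
print(fixed.fullmatch("abbbb") is not None)  # True
print(fixed.fullmatch("abbb") is not None)   # False: only two further 'b's
```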
If there is no preceding digit-th subexpression, the regular expression is invalid.
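As an illustration of this rule, the sketch below assumes Python's re module as a stand-in for Regex; Python reports such a pattern as an error at compile time, and the way Regex itself reports the error is not shown here. The helper name is_valid is hypothetical.

```python
import re

# Assumption: Python's re module stands in for GNU Regex in this sketch.
def is_valid(pattern: str) -> bool:
    """Return True if the pattern compiles, False if it is rejected."""
    try:
        re.compile(pattern)
    except re.error:
        return False
    return True

print(is_valid(r"(a)\1"))   # True: group 1 precedes the back reference
print(is_valid(r"(a)\2"))   # False: there is no group 2
print(is_valid(r"\1(a)"))   # False: group 1 comes after the back reference
```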