POSIX generalizes the notion of a character to that of a collating element. It defines a collating element to be ``a sequence of one or more bytes defined in the current collating sequence as a unit of collation.''
This generalizes the notion of a character in two ways. First, a single character can map into two or more collating elements. For example, the German ``es-zet'' collates as the collating element s followed by another collating element s. Second, two or more characters can map into one collating element. For example, the Spanish ll collates after l and before m.
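The two mappings can be illustrated with a small sketch. It is a toy model only, not how POSIX or any locale implementation stores collation data: the tables EXPANSIONS, CONTRACTIONS, and ORDER, and the functions collating_elements and sort_key, are hypothetical names invented for this example, and the alphabet is limited to lowercase a through z plus the German ``es-zet'' (ß) and the Spanish ll.

```python
# Toy collation tables. Every name here is illustrative, not a POSIX API.
EXPANSIONS = {"ß": ["s", "s"]}   # one character -> two collating elements
CONTRACTIONS = ["ll"]            # two characters -> one collating element

# Collating order: ll sorts after l and before m.
ORDER = list("abcdefghijkl") + ["ll"] + list("mnopqrstuvwxyz")
WEIGHT = {elem: i for i, elem in enumerate(ORDER)}


def collating_elements(text):
    """Split text into collating elements using the toy tables above."""
    elems = []
    i = 0
    while i < len(text):
        for seq in CONTRACTIONS:
            if text.startswith(seq, i):
                # Several characters form a single collating element.
                elems.append(seq)
                i += len(seq)
                break
        else:
            # A single character yields one element, or several if it expands.
            ch = text[i]
            elems.extend(EXPANSIONS.get(ch, [ch]))
            i += 1
    return elems


def sort_key(text):
    """Compare strings element by element rather than character by character."""
    return [WEIGHT[e] for e in collating_elements(text)]


print(collating_elements("straße"))
print(collating_elements("llama"))
print(sorted(["luz", "mano", "llama"], key=sort_key))
```

Under these assumed tables, the word llama sorts between luz and mano, whereas a plain character-by-character comparison would place it before luz.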
Since POSIX's ``collating element'' preserves the essential idea of a ``character,'' we use the latter, more familiar, term in this document.