3.5.5. Using regular expressions

Regular expressions are the means by which you specify and match strings. A regular expression is either:

You can include low level symbols or high level symbols in a regular expression (see High level and low level symbols for more information).

Pattern matching follows the UNIX regexp(5) format, but without the special symbols ^ and $.

The following special characters modify the meaning of the preceding regular expression, and work only if such a regular expression is given:

*

Zero or more of the preceding regular expression. For example, A*B matches B, AB, and AAB.

?

Zero or one of the preceding regular expression. For example, AC?B matches AB and ACB but not ACCB.

+

One or more of the preceding regular expression. For example, AC+B matches ACB and ACCB, but not AB.
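The three modifiers above can be checked with a short sketch. It uses Python's re module as a stand-in matcher, which is an assumption: the matcher described here follows regexp(5) and may differ in detail. The helper name matches is hypothetical, and re.fullmatch is used so that the whole string must match the pattern.

```python
import re

def matches(pattern, text):
    """Return True if the whole of text matches pattern.

    Hypothetical helper: Python's re module is used only as an
    illustration and is not the matcher this section documents.
    """
    return re.fullmatch(pattern, text) is not None

# * : zero or more of the preceding expression
star = [matches("A*B", s) for s in ("B", "AB", "AAB")]

# ? : zero or one of the preceding expression
optional = [matches("AC?B", s) for s in ("AB", "ACB", "ACCB")]

# + : one or more of the preceding expression
plus = [matches("AC+B", s) for s in ("ACB", "ACCB", "AB")]

print(star, optional, plus)
```

Each list holds one result per test string, in the order the strings appear in the examples above.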

The following special characters are regular expressions in themselves:

\

Precedes any special character that you want to include literally in an expression to form a single regular expression. For example, \* matches a single asterisk (*) and \\ matches a single backslash (\). The regular expression \x is equivalent to x, as the character x is not a special character.

()

Allows grouping of characters. For example, (202)* matches 202202202 (as well as nothing at all), and (AC?B)+ looks for sequences of AB or ACB, such as ABACBAB.

.

Exactly one character. This differs from ? in that the period (.) is a regular expression in itself, so .* matches any sequence of characters, while ?* is invalid. Note that . does not match the end-of-line character.

[ ]

A set of characters, any one of which can appear in the search match. For example, the expression r[23] matches the strings r2 and r3. The expression [a-z] matches any single character between a and z.
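The standalone special characters can be exercised in the same way. This is a minimal sketch that again assumes Python's re module as a stand-in for the matcher described here. The helper names matches and is_valid are hypothetical. The \x example is left out because Python gives \x its own meaning (a hexadecimal escape).

```python
import re

def matches(pattern, text):
    """True if the whole of text matches pattern (hypothetical helper)."""
    return re.fullmatch(pattern, text) is not None

def is_valid(pattern):
    """True if Python's re module accepts pattern (hypothetical helper)."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# \ : take the following special character literally
escaped = [matches(r"\*", "*"), matches(r"\\", "\\")]

# () : grouping, so a modifier applies to the whole group
grouped = [matches("(202)*", "202202202"),
           matches("(202)*", ""),
           matches("(AC?B)+", "ABACBAB")]

# . : exactly one character, but not the end-of-line character
dot = [matches(".*", "any text"), matches(".", "\n")]

# ? has no preceding expression to modify, so ?* is rejected
validity = [is_valid(".*"), is_valid("?*")]

# [ ] : any one character from the set or range
sets = [matches("r[23]", "r2"), matches("r[23]", "r3"),
        matches("r[23]", "r4"), matches("[a-z]", "q"),
        matches("[a-z]", "Q")]

print(escaped, grouped, dot, validity, sets)
```

Raw strings (the r prefix) keep Python from interpreting the backslashes before the pattern reaches the matcher.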

Copyright © 1997, 1998 ARM Limited. All rights reserved. ARM DUI 0040D
Non-Confidential