14.7.2. Searching with regular expressions

By default, text searches do not assume regular expressions. You can set up your searches so that regular expressions are assumed, either for all sessions or for the current session only.

To enable this feature for all sessions, you must change your workspace settings. The procedure below configures regular expression matching for the current session only. When regular expression matching is set, the Find dialog box and the Find and Replace dialog box open with the radio button selected.

To enable regular expression searches for the current File Editor session:

  1. Select Find → Find... to start a search, or select Find → Replace... to search for and replace matched text.

  2. Select the Reg-expression option on the Find (or Find and Replace) dialog box.

  3. Enter the required expression in the Find field.

The following examples show how to use regular expressions in search operations.

Matching simple expressions

Most characters in a regular expression match themselves. For example, entering the regular expression struct in the Find field matches all occurrences of the string struct in your source file. This changes, however, when the regular expression contains metacharacters as listed in Table 14.3.

To match a metacharacter literally, precede the metacharacter with a backslash. For example, to find every occurrence of a dollar sign ($), type \$ in the Find field. The backslash instructs the File Editor to interpret the dollar sign as a literal character, rather than a special character. If you do not use the backslash, the search finds the end-of-line characters instead.
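The escaping rule can be tried outside the File Editor. The following is a minimal sketch that uses Python's re module as a stand-in. It assumes that the File Editor and Python agree on this basic syntax, which the source does not confirm, and the sample text is invented for illustration.

```python
import re

# Hypothetical sample text, not taken from any real source file.
text = "cost = $5\ntotal = $12"

# \$ matches a literal dollar sign.
literal_hits = [m.start() for m in re.finditer(r"\$", text)]

# An unescaped $ is an end-of-line anchor. MULTILINE makes it match
# at the end of every line, not only at the end of the whole text.
anchor_hits = [m.start() for m in re.finditer(r"$", text, flags=re.MULTILINE)]

print(literal_hits)  # offsets of the two dollar signs
print(anchor_hits)   # offsets of the two line ends
```

The first list holds the positions of the dollar signs themselves, while the second holds the positions where each line ends.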

Matching any single character

A dot or period (full stop) matches any single character. For example, entering the regular expression var. in the Find field matches any four-character sequence beginning with var, such as var1, var2, and var followed by a space. It does not match var followed by an end-of-line character.
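As an illustration, the behavior of the dot can be reproduced with Python's re module. This is a sketch only: Python is assumed to behave like the File Editor here, and the sample text is hypothetical.

```python
import re

# Hypothetical text: var1, var2, var followed by a space,
# and var followed by an end-of-line character.
text = "var1 var2 var x\nvar\n"

# The dot matches any single character except the end-of-line character.
dot_matches = re.findall(r"var.", text)

print(dot_matches)
```

The final var is not reported because the character that follows it is the end of the line.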

Matching alternative expressions

The alternation operator or pipe (|) matches alternative expressions. For example, entering the regular expression REG|Glob in the Find field matches either REG or Glob. This can also be combined with the not operator, for example [^REG|Glob], which applies to both REG and Glob.
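The basic alternation REG|Glob can be sketched with Python's re module, used here as an assumed stand-in for the File Editor. The sample text is hypothetical, and the sketch covers only the plain alternation, not its combination with the not operator.

```python
import re

# Hypothetical text containing both alternatives, plus lowercase
# variants that a case-sensitive search does not report.
text = "REG_A Glob_b reg glob"

alternatives = re.findall(r"REG|Glob", text)

print(alternatives)
```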

Matching repeating expressions

The following metacharacters enable you to match repeating occurrences of a regular expression in your search string:

  • a regular expression followed by an asterisk, *, matches none, one, or more occurrences of that regular expression

  • a regular expression followed by a plus sign, +, matches one or more occurrences of that regular expression

  • a regular expression followed by a question mark, ?, matches none or one occurrence of that regular expression.

Table 14.4 describes some simple examples to illustrate matching repeating expressions.

Table 14.4. Using repetition operators

Regular expression    Matches

s*ion                 None, one, or more occurrences of the s character immediately preceding the characters ion. This regular expression matches, for example, the ion in information, sections, and expressions, and the ssion and sion in expressions.

s+ion                 One or more occurrences of the s character immediately preceding the characters ion. This regular expression matches, for example, the ssion and sion in expressions.

s?ion                 None or one occurrence of the s character immediately preceding the characters ion. This regular expression matches, for example, the sion in expressions, and the ion in information, sections, and expressions.

0\.?                  The number zero, optionally followed by a dot or period (full stop). The backslash tells the File Editor to treat the dot as a literal character, and the ? operator acts on the dot. This regular expression matches, for example, 0. and also a 0 that is followed by any other character or by an end-of-line character.

The asterisk, question mark, and plus metacharacters can operate on both single character regular expressions and grouped regular expressions. See Grouping expressions for details.
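The rows of Table 14.4 can be checked with a short script. This sketch uses Python's re module on the assumption that its repetition operators behave like those of the File Editor. The helper first_match is a hypothetical name introduced for illustration, and it reports only the first match found in each word.

```python
import re

def first_match(pattern, text):
    """Return the first substring of text matched by pattern, or None."""
    found = re.search(pattern, text)
    return found.group(0) if found else None

words = ["information", "sections", "expressions"]

# Apply each repetition operator to the same three words.
for pattern in (r"s*ion", r"s+ion", r"s?ion"):
    print(pattern, [first_match(pattern, word) for word in words])

# The ? acts on the escaped dot, so the dot is optional.
zero_matches = re.findall(r"0\.?", "0. 0x 0")
print(zero_matches)
```

In this sketch s+ion finds nothing in information or sections, because no s immediately precedes ion in those words.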

Grouping expressions

If an expression is enclosed in parentheses or round brackets (), it is treated as a single unit, and repetition operators, such as the asterisk or plus sign, are applied to the whole expression.

For example, to find strings that match is, you can type the text string is in the Find field. However, you can also use ( i)s as a regular expression. This instructs the File Editor to look for the letter s, immediately preceded by a space and then the letter i. So, while the text string search is matches the is in This, this, and is, the regular expression ( i)s matches is only where a space comes immediately before it, as in the separate word is.
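The difference between the plain and grouped searches can be sketched as follows. Python's re module is assumed to be a reasonable stand-in for the File Editor, and the sample strings are hypothetical. The last two lines illustrate a repetition operator applied to a group rather than to a single character.

```python
import re

text = "This is this"

# Plain text search: is occurs inside This, is, and this.
plain_offsets = [m.start() for m in re.finditer(r"is", text)]

# Grouped search: only the is that directly follows a space.
grouped_offsets = [m.start() for m in re.finditer(r"( i)s", text)]

# A repetition operator after a group applies to the whole group.
group_repeat = re.search(r"(ab)+", "xabababy").group(0)
char_repeat = re.search(r"ab+", "xabababy").group(0)

print(plain_offsets, grouped_offsets)
print(group_repeat, char_repeat)
```

The grouped search reports a single match that starts at the space before the word is, whereas (ab)+ consumes every repeated ab and ab+ repeats only the b.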

Matching any character in a list

A string of characters enclosed in square brackets ([]) matches any one character in that string. For example, the regular expression [xyz] matches any of the characters x, y, or z.

To match any character that is not in the string enclosed within the square brackets, precede the enclosed expression with a caret or not operator (^). For example, the regular expression [^abc] matches every character in the search text other than a, b, and c.

To specify a range of consecutive characters, use a minus sign (-) between the start and end characters, and place the whole expression within square brackets. For example, the regular expression [0-9] is the same as [0123456789].

The following applies to characters within the square brackets:

  • If a minus sign is the first or last character within the square brackets, it is treated as a literal character. For example, the regular expression [-bc] matches any one of the characters -, b, and c.

  • A right square bracket immediately following a left square bracket does not terminate the string. It is considered to be one of the characters to match. For example, the regular expression []0-9] matches the right square bracket and any digit.

  • Metacharacters, such as backslash \, asterisk *, or plus sign +, immediately following the opening square bracket are treated as literal characters. For example, the regular expression [.] matches the dot or period (full stop).

You can use square brackets to group regular expressions in the same way as parentheses. The text string in the square brackets is treated as a single regular expression. For example, the regular expression [bsl]ag matches any of bag, sag, or lag, while [aeiou][0-9] matches any lowercase vowel followed by a digit, such as a1.
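The bracket rules above can be exercised one at a time. The sketch below uses Python's re module, which is assumed to treat these particular lists in the same way as the File Editor. All sample strings are hypothetical.

```python
import re

# Each entry pairs a bracket expression with hypothetical sample text.
samples = [
    (r"[xyz]", "lazy fox"),            # any one of x, y, or z
    (r"[^abc]", "aXbYc"),              # any character other than a, b, c
    (r"[0-9]", "r13 and r7"),          # range, same as [0123456789]
    (r"[-bc]", "a-b"),                 # leading minus sign is literal
    (r"[]0-9]", "a[1]"),               # leading right bracket is literal
    (r"[.]", "a.b"),                   # dot is literal inside brackets
    (r"[bsl]ag", "bag sag lag tag"),   # list followed by literal text
    (r"[aeiou][0-9]", "a1 b2 e9 A3"),  # lowercase vowel, then a digit
]

results = {pattern: re.findall(pattern, text) for pattern, text in samples}

for pattern, found in results.items():
    print(pattern, found)
```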

Matching the start or end of a line

You can use a regular expression to search for start-of-line and end-of-line characters:

  • If a caret, ^, is at the beginning of the entire regular expression, it matches the beginning of a line. For example, the regular expression ^reg_opt matches any occurrence of reg_opt but only at the start of a line.

  • If a dollar sign, $, is at the end of the entire regular expression, it matches the end of a line. For example, reg_opt$ matches any occurrence of the string reg_opt but only at the end of a line.

  • If an entire regular expression is enclosed by a caret and a dollar sign, it matches an entire line. For example, ^par_a4 == reg_opt$ matches only lines that consist of exactly par_a4 == reg_opt.

You can build complex search strings by combining regular expressions and metacharacters, for example ^([aeiou][0-9]) or ([aeiou][0-9])$.
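The anchoring rules can be sketched by testing each line of a small text separately, which approximates a line-oriented editor search. Python's re module is an assumed stand-in for the File Editor, the helper matching_lines is a hypothetical name, and the sample text is invented.

```python
import re

def matching_lines(pattern, text):
    """Return the 1-based numbers of the lines in which pattern matches."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if re.search(pattern, line)
    ]

# Hypothetical three-line source fragment.
text = "reg_opt = 1\npar_a4 == reg_opt\nx = reg_opt;\n"

unanchored = matching_lines(r"reg_opt", text)
at_start = matching_lines(r"^reg_opt", text)
at_end = matching_lines(r"reg_opt$", text)
whole_line = matching_lines(r"^par_a4 == reg_opt$", text)

print(unanchored, at_start, at_end, whole_line)
```

The unanchored search finds reg_opt on every line, while each anchored form narrows the result to the single line that satisfies the position constraint.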

Copyright © 2003, 2004 ARM Limited. All rights reserved.
ARM DUI 0234B
Non-Confidential