Regular Expressions

Tutorial
Regular expressions (regexes) are used to find specific strings in text. The syntax described here is defined in the POSIX 1003.2 standard as modern or extended regular expressions.

Basic Syntax
The most boring kind of regex is a plain string of ordinary characters, which matches only itself. Regexes also have some symbols that have special meanings, called metacharacters. Among these are the point and the pipe. The point '.' matches any single character. The pipe '|' matches one regex or another.
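A minimal sketch of these three ideas, using Python's re module. Python's regex dialect is not the POSIX one described here, but for the constructs in this example the two should behave alike. The sample words are invented for illustration.

```python
import re

# A literal regex matches exactly the text it spells out, anywhere in the input.
literal_hit = re.search(r"cat", "concatenate") is not None

# '.' matches any single character (in Python, any character except a newline).
dot_hits = re.findall(r"c.t", "cat cot cut ct")

# '|' matches either the regex on its left or the regex on its right.
pipe_hits = re.findall(r"cat|dog", "cat, dog, bird")

print(literal_hit)  # True
print(dot_hits)     # ['cat', 'cot', 'cut']
print(pipe_hits)    # ['cat', 'dog']
```

Note that 'ct' is not matched by c.t, because the point must consume exactly one character.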

Regexes can be nested using the '( )' metacharacters. If you want to match a string with a character that may or may not occur, you can use the '?' metacharacter.
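Nesting and the optional character can be sketched as follows in Python. The patterns and word lists are illustrative assumptions, and fullmatch is used so that the whole word has to match.

```python
import re

# Parentheses group a sub-regex, so the '|' applies only inside the group.
group_re = re.compile(r"gr(a|e)y")

# '?' makes the preceding character (or group) optional.
optional_re = re.compile(r"colou?r")

grouped = [w for w in ["gray", "grey", "groy"] if group_re.fullmatch(w)]
optional = [w for w in ["color", "colour", "colouur"] if optional_re.fullmatch(w)]

print(grouped)   # ['gray', 'grey']
print(optional)  # ['color', 'colour']
```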

If you want to match strings where a character occurs 0 or more times, you can use the '*' metacharacter. Similarly, the '+' metacharacter matches the character 1 or more times. You can even specify the exact number of times, or an interval of times, that a character may occur with the '{ }' curly brackets. Inside the brackets you can write a single number, a range, or an open-ended interval, for example '{3}', '{2,4}' or '{2,}'.

A set of characters can be specified using the '[ ]' operators. Inside the brackets, you write the characters to be matched. Conversely, '[^ ]' matches any character not inside the brackets. Furthermore, a range of characters can be specified with syntax like '[a-z]'.
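The repetition and character-set operators can be tried out with a small Python sketch. The helper name matches and all test strings are hypothetical, chosen only to show which candidates each pattern accepts in full.

```python
import re

def matches(pattern, candidates):
    """Return the candidates that the pattern matches from start to end."""
    return [c for c in candidates if re.fullmatch(pattern, c)]

star = matches(r"ab*c", ["ac", "abc", "abbbc"])            # 'b' zero or more times
plus = matches(r"ab+c", ["ac", "abc", "abbbc"])            # 'b' one or more times
exact = matches(r"a{3}", ["aa", "aaa", "aaaa"])            # exactly three
ranged = matches(r"a{2,3}", ["a", "aa", "aaa", "aaaa"])    # two to three
at_least = matches(r"a{2,}", ["a", "aa", "aaaaa"])         # two or more
char_set = matches(r"[abc]", ["a", "c", "d"])              # one of a, b, c
negated = matches(r"[^abc]", ["a", "d", "1"])              # anything but a, b, c
char_range = matches(r"[a-f0-9]+", ["cafe42", "xyz"])      # ranges inside a set

print(star)        # ['ac', 'abc', 'abbbc']
print(plus)        # ['abc', 'abbbc']
print(exact)       # ['aaa']
print(ranged)      # ['aa', 'aaa']
print(at_least)    # ['aa', 'aaaaa']
print(char_set)    # ['a', 'c']
print(negated)     # ['d', '1']
print(char_range)  # ['cafe42']
```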

Metacharacters
These are the most important of the metacharacters: . ^ $ ( ) | ? + * { } \ [ ] [^ ]

The caret '^' and the dollar sign '$' match the start and the end of the string, respectively.
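A short Python sketch of the two anchors, with made-up sample strings. One difference is worth knowing: in Python, '$' also matches just before a trailing newline, which does not affect the strings used here.

```python
import re

# '^' anchors the match to the start of the string.
starts = [s for s in ["hello world", "say hello"] if re.search(r"^hello", s)]

# '$' anchors the match to the end of the string.
ends = [s for s in ["hello world", "world peace"] if re.search(r"world$", s)]

# Using both forces the regex to cover the entire string.
whole = [s for s in ["hello", "hello!", "oh hello"] if re.search(r"^hello$", s)]

print(starts)  # ['hello world']
print(ends)    # ['hello world']
print(whole)   # ['hello']
```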

Using the metacharacters as ordinary characters:

Of course, if you want to match one of the metacharacters as an ordinary character, you will have to use a trick. The trick is prefixing the character with a backslash '\', which is called escaping the character. If you want to match a backslash itself, you should escape it as well, writing '\\'.
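Escaping can be illustrated with the Python sketch below. The strings are invented for the example, and re.escape is a Python convenience function, not part of the POSIX syntax described in this tutorial.

```python
import re

samples = ["3.14", "3x14"]

# An unescaped '.' matches any character, so "3x14" is accepted too.
unescaped = [s for s in samples if re.fullmatch(r"3.14", s)]

# An escaped '\.' matches only a literal point.
escaped = [s for s in samples if re.fullmatch(r"3\.14", s)]

# A literal backslash is matched by an escaped backslash.
backslash_hit = re.search(r"\\", "C:\\temp") is not None

# re.escape() escapes the metacharacters in a string for you.
auto_pattern = re.escape("1+1=2?")
auto_hit = re.fullmatch(auto_pattern, "1+1=2?") is not None

print(unescaped)      # ['3.14', '3x14']
print(escaped)        # ['3.14']
print(backslash_hit)  # True
print(auto_hit)       # True
```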

Combinations
The elements of the basic syntax can be combined for more clever matching.
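As a hedged illustration of such combinations, the Python sketch below builds two patterns out of the pieces introduced so far. Both patterns and their inputs are assumptions made up for this example. They check only the shape of the text, not whether a date is valid.

```python
import re

# Anchors + character ranges + exact counts: a date shaped like YYYY-MM-DD.
date_re = re.compile(r"^[0-9]{4}-[0-9]{2}-[0-9]{2}$")

# Group with alternatives + optional escaped point + ranges + repetition:
# a title followed by a capitalized name.
title_re = re.compile(r"^(Mr|Mrs|Ms)\.? [A-Z][a-z]+$")

dates = [s for s in ["2024-01-31", "24-1-31", "2024-01-31x"] if date_re.search(s)]
names = [s for s in ["Mr. Smith", "Ms Jones", "Dr. Who", "Mrs. smith"]
         if title_re.search(s)]

print(dates)  # ['2024-01-31']
print(names)  # ['Mr. Smith', 'Ms Jones']
```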

Substitution
The power of regexes really shows when doing substitution, which is basically the same as 'search and replace'. Substitution is an important feature of editors and programming languages.

In general, the syntax for substitution is s/search/replace/g, where 's' means 'substitute' and 'g' means 'global', signifying that the substitution should be done for all matches in the string. A simple example: s/cat/dog/g replaces every 'cat' in the string with 'dog'. When the search string is a regex, the substitution replaces all the substrings matching the regex. The replacement is not a regex, which is understandable, since it describes text to insert rather than text to match. However, the replacement string does have some special characters used for more flexible substitution.
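The s/search/replace/g notation comes from tools such as sed. As a rough Python equivalent, re.sub replaces all matches by default (like the 'g' flag), and its count argument limits the number of replacements. The input strings below are invented for illustration.

```python
import re

# Equivalent of s/cat/dog/g: every match is replaced.
all_replaced = re.sub(r"cat", "dog", "cat and cat")

# Equivalent of s/cat/dog/ without 'g': only the first match is replaced.
first_only = re.sub(r"cat", "dog", "cat and cat", count=1)

# The search part may be any regex; the replacement is plain text.
digits_masked = re.sub(r"[0-9]+", "#", "room 101, floor 3")

print(all_replaced)   # dog and dog
print(first_only)     # dog and cat
print(digits_masked)  # room #, floor #
```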

A part of the search regex enclosed in parentheses '( )' is called a group. In the replacement string, the text that a group actually matched can be inserted with '\n', where n is the number of the group. So the first group is inserted by writing '\1', the second by writing '\2', and so on. For example, s/(a+)(b+)/\2\1/ turns 'aaabb' into 'bbaaa'.
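Group references in the replacement can be sketched in Python, where re.sub accepts the same '\1', '\2' notation. The replacement is written as a raw string so that Python leaves the backslashes alone. The patterns and inputs are illustrative assumptions.

```python
import re

# Swap two words: group 1 and group 2 are inserted in reverse order.
swapped = re.sub(r"([a-z]+) ([a-z]+)", r"\2 \1", "hello world")

# Reorder a DD/MM/YYYY date into YYYY-MM-DD using three groups.
iso_date = re.sub(r"([0-9]{2})/([0-9]{2})/([0-9]{4})", r"\3-\2-\1", "31/01/2024")

# Wrap every capitalized word in angle brackets, reusing the matched text.
wrapped = re.sub(r"([A-Z][a-z]+)", r"<\1>", "Alice met Bob")

print(swapped)   # world hello
print(iso_date)  # 2024-01-31
print(wrapped)   # <Alice> met <Bob>
```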

Using Regular Expressions
Below are some programs that use regular expressions: