- @c Copyright (C) 1994--2021 Free Software Foundation, Inc.
- @c
- @c Permission is granted to copy, distribute and/or modify this document
- @c under the terms of the GNU Free Documentation License, Version 1.3 or
- @c any later version published by the Free Software Foundation; with no
- @c Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.
- @c A copy of the license is included in the ``GNU Free
- @c Documentation License'' file as part of this distribution.
- @c this regular expression description is for: findutils
- @menu
- * findutils-default regular expression syntax::
- * emacs regular expression syntax::
- * gnu-awk regular expression syntax::
- * grep regular expression syntax::
- * posix-awk regular expression syntax::
- * awk regular expression syntax::
- * posix-basic regular expression syntax::
- * posix-egrep regular expression syntax::
- * egrep regular expression syntax::
- * posix-extended regular expression syntax::
- @end menu
- @node findutils-default regular expression syntax
- @subsection @samp{findutils-default} regular expression syntax
- The character @samp{.} matches any single character.
- @table @samp
- @item +
- indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
- @item ?
- indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
- @item \+
- matches a @samp{+}.
- @item \?
- matches a @samp{?}.
- @end table
- Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are ignored. Within square brackets, @samp{\} is taken literally. Character classes are not supported, so for example you would need to use @samp{[0-9]} instead of @samp{[[:digit:]]}.
- GNU extensions are supported:
- @enumerate
- @item @samp{\w} matches a character within a word
- @item @samp{\W} matches a character which is not within a word
- @item @samp{\<} matches the beginning of a word
- @item @samp{\>} matches the end of a word
- @item @samp{\b} matches a word boundary
- @item @samp{\B} matches a position that is not a word boundary
- @item @samp{\`} matches the beginning of the whole input
- @item @samp{\'} matches the end of the whole input
- @end enumerate
- Grouping is performed with backslashes followed by parentheses @samp{\(}, @samp{\)}. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{\(}.
- The alternation operator is @samp{\|}.
- The character @samp{^} only represents the beginning of a string when it appears:
- @enumerate
- @item At the beginning of a regular expression
- @item After an open-group, signified by @samp{\(}
- @item After the alternation operator @samp{\|}
- @end enumerate
- The character @samp{$} only represents the end of a string when it appears:
- @enumerate
- @item At the end of a regular expression
- @item Before a close-group, signified by @samp{\)}
- @item Before the alternation operator @samp{\|}
- @end enumerate
- @samp{*}, @samp{+} and @samp{?} are special at any point in a regular expression except:
- @enumerate
- @item At the beginning of a regular expression
- @item After an open-group, signified by @samp{\(}
- @item After the alternation operator @samp{\|}
- @end enumerate
- The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
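As an illustration of the grouping and alternation rules above, here is a minimal sketch. It assumes GNU @command{find}, and it assumes that the default syntax is in effect when no @samp{-regextype} option is given. The file names are hypothetical.

```shell
# Scratch directory with three hypothetical files.
tmp=$(mktemp -d)
touch "$tmp/report.txt" "$tmp/report.log" "$tmp/notes.md"

# No -regextype is given, so the default syntax is assumed to apply.
# -regex has to match the whole path, hence the leading '.*'.
# \( \) group, \| alternates, and \. is a literal dot.
default_matches=$(cd "$tmp" && find . -regex '.*\.\(txt\|log\)' | LC_ALL=C sort)
printf '%s\n' "$default_matches"

rm -rf "$tmp"
```

Only the @file{.txt} and @file{.log} files should be listed; @file{notes.md} matches neither alternative.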
- @node emacs regular expression syntax
- @subsection @samp{emacs} regular expression syntax
- The character @samp{.} matches any single character except newline.
- @table @samp
- @item +
- indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
- @item ?
- indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
- @item \+
- matches a @samp{+}.
- @item \?
- matches a @samp{?}.
- @end table
- Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are ignored. Within square brackets, @samp{\} is taken literally. Character classes are not supported, so for example you would need to use @samp{[0-9]} instead of @samp{[[:digit:]]}.
- GNU extensions are supported:
- @enumerate
- @item @samp{\w} matches a character within a word
- @item @samp{\W} matches a character which is not within a word
- @item @samp{\<} matches the beginning of a word
- @item @samp{\>} matches the end of a word
- @item @samp{\b} matches a word boundary
- @item @samp{\B} matches a position that is not a word boundary
- @item @samp{\`} matches the beginning of the whole input
- @item @samp{\'} matches the end of the whole input
- @end enumerate
- Grouping is performed with backslashes followed by parentheses @samp{\(}, @samp{\)}. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{\(}.
- The alternation operator is @samp{\|}.
- The character @samp{^} only represents the beginning of a string when it appears:
- @enumerate
- @item At the beginning of a regular expression
- @item After an open-group, signified by @samp{\(}
- @item After the alternation operator @samp{\|}
- @end enumerate
- The character @samp{$} only represents the end of a string when it appears:
- @enumerate
- @item At the end of a regular expression
- @item Before a close-group, signified by @samp{\)}
- @item Before the alternation operator @samp{\|}
- @end enumerate
- @samp{*}, @samp{+} and @samp{?} are special at any point in a regular expression except:
- @enumerate
- @item At the beginning of a regular expression
- @item After an open-group, signified by @samp{\(}
- @item After the alternation operator @samp{\|}
- @end enumerate
- The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
- @node gnu-awk regular expression syntax
- @subsection @samp{gnu-awk} regular expression syntax
- The character @samp{.} matches any single character.
- @table @samp
- @item +
- indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
- @item ?
- indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
- @item \+
- matches a @samp{+}.
- @item \?
- matches a @samp{?}.
- @end table
- Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid. Within square brackets, @samp{\} can be used to quote the following character. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit.
- GNU extensions are supported:
- @enumerate
- @item @samp{\w} matches a character within a word
- @item @samp{\W} matches a character which is not within a word
- @item @samp{\<} matches the beginning of a word
- @item @samp{\>} matches the end of a word
- @item @samp{\b} matches a word boundary
- @item @samp{\B} matches a position that is not a word boundary
- @item @samp{\`} matches the beginning of the whole input
- @item @samp{\'} matches the end of the whole input
- @end enumerate
- Grouping is performed with parentheses @samp{()}. An unmatched @samp{)} matches just itself. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{(}.
- The alternation operator is @samp{|}.
- The characters @samp{^} and @samp{$} always represent the beginning and end of a string respectively, except within square brackets. Within brackets, @samp{^} can be used to invert the membership of the character class being specified.
- @samp{*}, @samp{+} and @samp{?} are special at any point in a regular expression except:
- @enumerate
- @item At the beginning of a regular expression
- @item After an open-group, signified by @samp{(}
- @item After the alternation operator @samp{|}
- @end enumerate
- Intervals are specified by @samp{@{} and @samp{@}}.
- Invalid intervals are treated as literals; for example, @samp{a@{1} is treated as @samp{a\@{1}.
- The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
- @node grep regular expression syntax
- @subsection @samp{grep} regular expression syntax
- The character @samp{.} matches any single character.
- @table @samp
- @item \+
- indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
- @item \?
- indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
- @item + and ?
- match themselves.
- @end table
- Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid. Within square brackets, @samp{\} is taken literally. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit.
- GNU extensions are supported:
- @enumerate
- @item @samp{\w} matches a character within a word
- @item @samp{\W} matches a character which is not within a word
- @item @samp{\<} matches the beginning of a word
- @item @samp{\>} matches the end of a word
- @item @samp{\b} matches a word boundary
- @item @samp{\B} matches a position that is not a word boundary
- @item @samp{\`} matches the beginning of the whole input
- @item @samp{\'} matches the end of the whole input
- @end enumerate
- Grouping is performed with backslashes followed by parentheses @samp{\(}, @samp{\)}. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{\(}.
- The alternation operator is @samp{\|}.
- The character @samp{^} only represents the beginning of a string when it appears:
- @enumerate
- @item At the beginning of a regular expression
- @item After an open-group, signified by @samp{\(}
- @item After a newline
- @item After the alternation operator @samp{\|}
- @end enumerate
- The character @samp{$} only represents the end of a string when it appears:
- @enumerate
- @item At the end of a regular expression
- @item Before a close-group, signified by @samp{\)}
- @item Before a newline
- @item Before the alternation operator @samp{\|}
- @end enumerate
- @samp{*}, @samp{\+} and @samp{\?} are special at any point in a regular expression except:
- @enumerate
- @item At the beginning of a regular expression
- @item After an open-group, signified by @samp{\(}
- @item After a newline
- @item After the alternation operator @samp{\|}
- @end enumerate
- Intervals are specified by @samp{\@{} and @samp{\@}}.
- Invalid intervals such as @samp{a\@{1z} are not accepted.
- The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
- @node posix-awk regular expression syntax
- @subsection @samp{posix-awk} regular expression syntax
- The character @samp{.} matches any single character except the null character.
- @table @samp
- @item +
- indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
- @item ?
- indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
- @item \+
- matches a @samp{+}.
- @item \?
- matches a @samp{?}.
- @end table
- Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid. Within square brackets, @samp{\} can be used to quote the following character. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit.
- GNU extensions are not supported and so @samp{\w}, @samp{\W}, @samp{\<}, @samp{\>}, @samp{\b}, @samp{\B}, @samp{\`}, and @samp{\'} match @samp{w}, @samp{W}, @samp{<}, @samp{>}, @samp{b}, @samp{B}, @samp{`}, and @samp{'} respectively.
- Grouping is performed with parentheses @samp{()}. An unmatched @samp{)} matches just itself. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{(}.
- The alternation operator is @samp{|}.
- The characters @samp{^} and @samp{$} always represent the beginning and end of a string respectively, except within square brackets. Within brackets, @samp{^} can be used to invert the membership of the character class being specified.
- @samp{*}, @samp{+} and @samp{?} are special at any point in a regular expression except the following places, where they are not allowed:
- @enumerate
- @item At the beginning of a regular expression
- @item After an open-group, signified by @samp{(}
- @item After the alternation operator @samp{|}
- @end enumerate
- Intervals are specified by @samp{@{} and @samp{@}}.
- Invalid intervals are treated as literals; for example, @samp{a@{1} is treated as @samp{a\@{1}.
- The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
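The difference in GNU extension support between @samp{gnu-awk} and @samp{posix-awk} can be sketched as follows. This assumes a GNU @command{find} that accepts both names with @samp{-regextype}; the file names are hypothetical.

```shell
# Scratch directory with three hypothetical files.
tmp=$(mktemp -d)
touch "$tmp/notes.txt" "$tmp/www.txt" "$tmp/my-notes.txt"

# gnu-awk: \w is a word-constituent character, so \w+ covers
# "notes" and "www" but stops at the hyphen in "my-notes".
gnu_awk_matches=$(cd "$tmp" && find . -regextype gnu-awk -regex '\./\w+\.txt' | LC_ALL=C sort)

# posix-awk: \w is just a literal 'w', so \w+ means "one or more w".
posix_awk_matches=$(cd "$tmp" && find . -regextype posix-awk -regex '\./\w+\.txt' | LC_ALL=C sort)

printf 'gnu-awk:\n%s\n' "$gnu_awk_matches"
printf 'posix-awk:\n%s\n' "$posix_awk_matches"

rm -rf "$tmp"
```

The same pattern therefore selects different files depending on the syntax in effect, which is worth checking whenever a pattern is moved between syntaxes.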
- @node awk regular expression syntax
- @subsection @samp{awk} regular expression syntax
- The character @samp{.} matches any single character except the null character.
- @table @samp
- @item +
- indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
- @item ?
- indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
- @item \+
- matches a @samp{+}.
- @item \?
- matches a @samp{?}.
- @end table
- Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid. Within square brackets, @samp{\} can be used to quote the following character. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit.
- GNU extensions are not supported and so @samp{\w}, @samp{\W}, @samp{\<}, @samp{\>}, @samp{\b}, @samp{\B}, @samp{\`}, and @samp{\'} match @samp{w}, @samp{W}, @samp{<}, @samp{>}, @samp{b}, @samp{B}, @samp{`}, and @samp{'} respectively.
- Grouping is performed with parentheses @samp{()}. An unmatched @samp{)} matches just itself. A backslash followed by a digit matches that digit.
- The alternation operator is @samp{|}.
- The characters @samp{^} and @samp{$} always represent the beginning and end of a string respectively, except within square brackets. Within brackets, @samp{^} can be used to invert the membership of the character class being specified.
- @samp{*}, @samp{+} and @samp{?} are special at any point in a regular expression except:
- @enumerate
- @item At the beginning of a regular expression
- @item After an open-group, signified by @samp{(}
- @item After the alternation operator @samp{|}
- @end enumerate
- The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
- @node posix-basic regular expression syntax
- @subsection @samp{posix-basic} regular expression syntax
- The character @samp{.} matches any single character except the null character.
- @table @samp
- @item \+
- indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
- @item \?
- indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
- @item + and ?
- match themselves.
- @end table
- Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid. Within square brackets, @samp{\} is taken literally. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit.
- GNU extensions are supported:
- @enumerate
- @item @samp{\w} matches a character within a word
- @item @samp{\W} matches a character which is not within a word
- @item @samp{\<} matches the beginning of a word
- @item @samp{\>} matches the end of a word
- @item @samp{\b} matches a word boundary
- @item @samp{\B} matches a position that is not a word boundary
- @item @samp{\`} matches the beginning of the whole input
- @item @samp{\'} matches the end of the whole input
- @end enumerate
- Grouping is performed with backslashes followed by parentheses @samp{\(}, @samp{\)}. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{\(}.
- The alternation operator is @samp{\|}.
- The character @samp{^} only represents the beginning of a string when it appears:
- @enumerate
- @item At the beginning of a regular expression
- @item After an open-group, signified by @samp{\(}
- @item After the alternation operator @samp{\|}
- @end enumerate
- The character @samp{$} only represents the end of a string when it appears:
- @enumerate
- @item At the end of a regular expression
- @item Before a close-group, signified by @samp{\)}
- @item Before the alternation operator @samp{\|}
- @end enumerate
- @samp{*}, @samp{\+} and @samp{\?} are special at any point in a regular expression except:
- @enumerate
- @item At the beginning of a regular expression
- @item After an open-group, signified by @samp{\(}
- @item After the alternation operator @samp{\|}
- @end enumerate
- Intervals are specified by @samp{\@{} and @samp{\@}}.
- Invalid intervals such as @samp{a\@{1z} are not accepted.
- The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
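The following sketch illustrates three of the @samp{posix-basic} rules above: a bare @samp{+} is literal, @samp{\+} repeats, and @samp{\1} refers back to the first group. It assumes GNU @command{find} with @samp{-regextype} support; the file names and the @code{run} helper are hypothetical.

```shell
# Scratch directory with three hypothetical files.
tmp=$(mktemp -d)
touch "$tmp/a+b" "$tmp/aab" "$tmp/abab"

# Hypothetical helper: list the paths that a posix-basic regex matches in full.
run() { (cd "$tmp" && find . -regextype posix-basic -regex "$1" | LC_ALL=C sort); }

literal_plus=$(run '\./a+b')      # '+' matches itself
one_or_more=$(run '\./a\+b')      # '\+' means one or more of the previous atom
backref=$(run '\./\(ab\)\1')      # '\1' repeats whatever group 1 matched

printf '%s\n' "$literal_plus" "$one_or_more" "$backref"

rm -rf "$tmp"
```

Because @samp{-regex} must match the whole path, @samp{\./a\+b} does not select @file{abab}, even though a prefix of that name would match.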
- @node posix-egrep regular expression syntax
- @subsection @samp{posix-egrep} regular expression syntax
- The character @samp{.} matches any single character.
- @table @samp
- @item +
- indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
- @item ?
- indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
- @item \+
- matches a @samp{+}.
- @item \?
- matches a @samp{?}.
- @end table
- Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid. Within square brackets, @samp{\} is taken literally. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit.
- GNU extensions are supported:
- @enumerate
- @item @samp{\w} matches a character within a word
- @item @samp{\W} matches a character which is not within a word
- @item @samp{\<} matches the beginning of a word
- @item @samp{\>} matches the end of a word
- @item @samp{\b} matches a word boundary
- @item @samp{\B} matches a position that is not a word boundary
- @item @samp{\`} matches the beginning of the whole input
- @item @samp{\'} matches the end of the whole input
- @end enumerate
- Grouping is performed with parentheses @samp{()}. An unmatched @samp{)} matches just itself. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{(}.
- The alternation operator is @samp{|}.
- The characters @samp{^} and @samp{$} always represent the beginning and end of a string respectively, except within square brackets. Within brackets, @samp{^} can be used to invert the membership of the character class being specified.
- The characters @samp{*}, @samp{+} and @samp{?} are special anywhere in a regular expression.
- Intervals are specified by @samp{@{} and @samp{@}}.
- Invalid intervals are treated as literals; for example, @samp{a@{1} is treated as @samp{a\@{1}.
- The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
- @node egrep regular expression syntax
- @subsection @samp{egrep} regular expression syntax
- This is a synonym for @samp{posix-egrep}.
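A quick way to see the equivalence is to run the same pattern under both names. This is a minimal sketch that assumes GNU @command{find} accepts both @samp{egrep} and @samp{posix-egrep} as @samp{-regextype} values; the file names are hypothetical.

```shell
# Scratch directory with three hypothetical files.
tmp=$(mktemp -d)
touch "$tmp/cat.png" "$tmp/dog.png" "$tmp/cow.png"

# Unescaped ( ) group and | alternates in both syntaxes.
posix_egrep_matches=$(cd "$tmp" && find . -regextype posix-egrep -regex '\./(cat|dog)\.png' | LC_ALL=C sort)
egrep_matches=$(cd "$tmp" && find . -regextype egrep -regex '\./(cat|dog)\.png' | LC_ALL=C sort)

printf 'posix-egrep:\n%s\n' "$posix_egrep_matches"
printf 'egrep:\n%s\n' "$egrep_matches"

rm -rf "$tmp"
```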
- @node posix-extended regular expression syntax
- @subsection @samp{posix-extended} regular expression syntax
- The character @samp{.} matches any single character except the null character.
- @table @samp
- @item +
- indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
- @item ?
- indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
- @item \+
- matches a @samp{+}.
- @item \?
- matches a @samp{?}.
- @end table
- Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid. Within square brackets, @samp{\} is taken literally. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit.
- GNU extensions are supported:
- @enumerate
- @item @samp{\w} matches a character within a word
- @item @samp{\W} matches a character which is not within a word
- @item @samp{\<} matches the beginning of a word
- @item @samp{\>} matches the end of a word
- @item @samp{\b} matches a word boundary
- @item @samp{\B} matches a position that is not a word boundary
- @item @samp{\`} matches the beginning of the whole input
- @item @samp{\'} matches the end of the whole input
- @end enumerate
- Grouping is performed with parentheses @samp{()}. An unmatched @samp{)} matches just itself. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{(}.
- The alternation operator is @samp{|}.
- The characters @samp{^} and @samp{$} always represent the beginning and end of a string respectively, except within square brackets. Within brackets, @samp{^} can be used to invert the membership of the character class being specified.
- @samp{*}, @samp{+} and @samp{?} are special at any point in a regular expression except the following places, where they are not allowed:
- @enumerate
- @item At the beginning of a regular expression
- @item After an open-group, signified by @samp{(}
- @item After the alternation operator @samp{|}
- @end enumerate
- Intervals are specified by @samp{@{} and @samp{@}}.
- Invalid intervals such as @samp{a@{1z} are not accepted.
- The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
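The sketch below combines a character class, an interval and an alternation in @samp{posix-extended} syntax. It assumes GNU @command{find} with @samp{-regextype} support; the file names are hypothetical.

```shell
# Scratch directory with four hypothetical log files.
tmp=$(mktemp -d)
touch "$tmp/2021-jan.log" "$tmp/2020-feb.log" "$tmp/2021-mar.log" "$tmp/21-feb.log"

# [[:digit:]]{4} requires exactly four digits; (jan|feb) picks the month.
extended_matches=$(cd "$tmp" && find . -regextype posix-extended \
    -regex '\./[[:digit:]]{4}-(jan|feb)\.log' | LC_ALL=C sort)
printf '%s\n' "$extended_matches"

rm -rf "$tmp"
```

@file{21-feb.log} is skipped because the interval demands four digits, and @file{2021-mar.log} is skipped because @samp{mar} is not one of the alternatives.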