In general, which characters in a regular expression need escaping?
For example, the following is not syntactically correct:
echo '[]' | grep '[]'
grep: Unmatched [ or [^
This, however, is syntactically correct:
echo '[]' | grep '\[]'
[]
Is there any documentation on which characters should be escaped in a regular expression, and which should not?
This depends on the application. In your example, [ must be quoted as an argument for grep but not for echo.
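As a rough illustration of the point above, here are three ways to make grep see a literal [ in the pattern. This is a minimal sketch assuming a POSIX shell and a POSIX-conforming grep; the variable names are made up for the example.

```shell
# The single quotes keep the shell from touching the pattern;
# everything below is about how grep itself reads it.
input='[]'

# Backslash-escape the [ so grep treats it as a literal character.
escaped=$(printf '%s\n' "$input" | grep '\[]')

# Put the [ inside a bracket expression; the trailing ] is then literal.
bracketed=$(printf '%s\n' "$input" | grep '[[]]')

# -F makes grep treat the whole pattern as a fixed string, not a regex.
fixed=$(printf '%s\n' "$input" | grep -F '[]')

printf '%s %s %s\n' "$escaped" "$bracketed" "$fixed"
```

Each of the three variables should end up holding the matched line []. If you only ever want literal matching, -F is probably the least error-prone choice.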
For the shell (from the POSIX specs):
Quoting is used to remove the special meaning of certain characters or words to the shell. Quoting can be used to preserve the literal meaning of the special characters in the next paragraph, prevent reserved words from being recognized as such, and prevent parameter expansion and command substitution within here-document processing (see Here-Document).
The application shall quote the following characters if they are to represent themselves:
| & ; < > ( ) $ ` \ " ' <space> <tab> <newline>
and the following may need to be quoted under certain circumstances. That is, these characters may be special depending on conditions described elsewhere in this volume of IEEE Std 1003.1-2001:
* ? [ # ~ = %
The various quoting mechanisms are the escape character, single-quotes, and double-quotes. The here-document represents another form of quoting; see Here-Document.
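The three quoting mechanisms named in the quoted text can be compared side by side. This is only a sketch assuming a POSIX shell; the variable names and the sample string are illustrative.

```shell
name=world

# Single quotes: every character is literal, including $ and ;
single=$(printf '%s' 'hello $name; ls')

# Double quotes: ; and spaces are literal, but $name is still expanded.
double=$(printf '%s' "hello $name; ls")

# Escape character: each special character is quoted individually.
escaped=$(printf '%s' hello\ \$name\;\ ls)

printf '%s\n' "$single" "$double" "$escaped"
```

The single-quoted and backslash-escaped forms should both produce hello $name; ls, while the double-quoted form expands the variable and produces hello world; ls.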
Specific programs that use regular expressions (grep, perl, awk, and so on) may have additional escaping requirements of their own.
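To show how such requirements can differ between tools, the sketch below matches literal parentheses under POSIX basic regular expressions (plain grep), extended regular expressions (grep -E), and awk, which also uses extended regular expressions. It assumes POSIX-conforming grep and awk; the variable names are invented for the example.

```shell
input='f(x)'

# BRE (plain grep): ( and ) are ordinary characters.
bre=$(printf '%s\n' "$input" | grep 'f(x)')

# ERE (grep -E): ( and ) group, so they must be escaped to match literally.
ere=$(printf '%s\n' "$input" | grep -E 'f\(x\)')

# awk also uses ERE, so the same escaping applies.
awk_out=$(printf '%s\n' "$input" | awk '/f\(x\)/')

# Unescaped in ERE, the pattern means "f followed by x" and does not match.
if printf '%s\n' "$input" | grep -E -q 'f(x)'; then
    ere_unescaped=match
else
    ere_unescaped=no-match
fi

printf '%s %s %s %s\n' "$bre" "$ere" "$awk_out" "$ere_unescaped"
```

The same character can therefore need escaping in one tool and not in another, so it is worth checking the documentation of the specific program and regex flavour in use.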