Regular expression to match a line that doesn't contain a word

knaser

I know it's possible to match a word and then reverse the matches using other tools (e.g. grep -v). However, is it possible to match lines that do not contain a specific word, e.g. hede, using a regular expression?

Input:

hoho
hihi
haha
hede

Code:

grep "<Regex for 'doesn't contain hede'>" input

Desired output:

hoho
hihi
haha

Bart Kiers

The notion that regex doesn't support inverse matching is not entirely true. You can mimic this behavior by using negative look-arounds:

^((?!hede).)*$

The regex above matches any string, or any single line without a line break, that does not contain the (sub)string 'hede'. This is not something regex is "good" at (or should do), but it is possible.
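
As a rough illustration, here is how the pattern behaves on the input from the question. This is only a sketch in Python (the question itself uses grep); the helper name lines_without_hede is made up for this example and is not part of any library.

```python
import re

# Pattern from the answer: before consuming each character, assert that
# "hede" does not start at the current position.
NO_HEDE = re.compile(r"^((?!hede).)*$")

def lines_without_hede(text):
    """Hypothetical helper: keep only the lines that do not contain 'hede'."""
    return [line for line in text.splitlines() if NO_HEDE.match(line)]

sample = "hoho\nhihi\nhaha\nhede"
result = lines_without_hede(sample)
print(result)  # ['hoho', 'hihi', 'haha']
```

The text is split into lines first, so the pattern only ever sees one line at a time and the dot never has to cross a line break.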

If the dot also needs to match line break characters, use the DOT-ALL modifier (the trailing s in the following pattern):

/^((?!hede).)*$/s

or use it inline:

/(?s)^((?!hede).)*$/

(where the /.../ are the regex delimiters, i.e., not part of the pattern)

If the DOT-ALL modifier is not available, you can mimic the same behavior with the character class [\s\S]:

/^((?!hede)[\s\S])*$/
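
To compare the variants above, the following sketch runs them against multi-line strings. It assumes Python's re module, where the DOT-ALL modifier is spelled re.DOTALL (or (?s) inline); other regex flavors use different spellings, so treat this as one possible rendering rather than the only one.

```python
import re

clean = "hoho\nhihi"   # two lines, no "hede" anywhere
dirty = "hoho\nhede"   # "hede" on the second line

plain = re.compile(r"^((?!hede).)*$")                    # dot stops at "\n"
dotall_flag = re.compile(r"^((?!hede).)*$", re.DOTALL)   # like the trailing s
dotall_inline = re.compile(r"(?s)^((?!hede).)*$")        # inline modifier
char_class = re.compile(r"^((?!hede)[\s\S])*$")          # no modifier needed

# Without DOT-ALL the dot cannot cross the line break, so the whole
# two-line string is rejected even though it contains no "hede".
plain_matches_clean = plain.match(clean) is not None

# The three DOT-ALL style variants should agree with each other.
variants = [dotall_flag, dotall_inline, char_class]
clean_results = [p.match(clean) is not None for p in variants]
dirty_results = [p.match(dirty) is not None for p in variants]

print(plain_matches_clean)  # False
print(clean_results)        # [True, True, True]
print(dirty_results)        # [False, False, False]
```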

Explanation

A string is just a list of n characters. Before and after each character there is an empty string, so a list of n characters has n+1 empty strings. Consider the string "ABhedeCD":

    ┌──┬───┬──┬───┬──┬───┬──┬───┬──┬───┬──┬───┬──┬───┬──┬───┬──┐
S = │e1│ A │e2│ B │e3│ h │e4│ e │e5│ d │e6│ e │e7│ C │e8│ D │e9│
    └──┴───┴──┴───┴──┴───┴──┴───┴──┴───┴──┴───┴──┴───┴──┴───┴──┘

index    0      1      2      3      4      5      6      7

where the e's are the empty strings. The regex (?!hede). looks ahead to check that the substring "hede" does not start at the current position, and if that is the case (so something else is seen), then the . (dot) will match any character except a line break. Look-arounds are also called zero-width assertions because they don't consume any characters. They only assert/validate something.

So, in my example, every empty string is first validated to see if there's no "hede" up ahead, before a character is consumed by the . (dot). The regex (?!hede). will do that only once, so it is wrapped in a group, and repeated zero or more times: ((?!hede).)*. Finally, the start- and end-of-input are anchored to make sure the entire input is consumed: ^((?!hede).)*$

As you can see, the input "ABhedeCD" will fail because on e3, the regex (?!hede) fails (there is "hede" up ahead!).
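
The walk over the empty strings can be reproduced with a short script. This is a sketch for illustration only: trace_negative_lookahead is a hypothetical helper, and it tests the bare look-ahead at every position rather than running the full anchored pattern.

```python
import re

def trace_negative_lookahead(s, word):
    """Hypothetical helper: for each position in s (the 'empty strings'
    e1..e(n+1)), report whether (?!word) succeeds at that position."""
    lookahead = re.compile(f"(?!{re.escape(word)})")
    return [(i, lookahead.match(s, i) is not None) for i in range(len(s) + 1)]

steps = trace_negative_lookahead("ABhedeCD", "hede")
for i, ok in steps:
    # Position i corresponds to the empty string e(i+1) in the diagram.
    print(f"e{i + 1}: {'ok' if ok else 'fails'}")

first_failure = next(i for i, ok in steps if not ok)
print(f"first failure at e{first_failure + 1}")  # first failure at e3
```

Only e3 fails, since that is the one position where "hede" starts. Because the full pattern is anchored with ^ and $ and has to pass every position in turn, that single failure is enough to reject the whole input.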
