Need help understanding this particular regular expression [^.]

phan Published at Dev

phan

[^.]+\.(txt|html)

I am learning regex, and am trying to parse this.

[^.] The ^ means "not", and the dot is a wildcard that means any character, so does this mean "find a match with not any character"? I still don't understand this part. Can anyone explain?

The plus is a Kleene Plus which means "1 or more". So now it's "one or more" "not any character".

I get \., it means a period.

(txt|html) means match a txt file or an html file. I think I understand everything after the plus sign. What I don't understand is why it doesn't look something like the DOS equivalent, where I can just write *.txt or *.(txt|html), with * meaning everything that ends in the file extension .txt or .html.

Is [^.] the equivalent of * in DOS?

Amal Murali

The dot (.) has no special meaning inside a character class, and does not need to be escaped there.

[^.] means "any character that is not a literal . character". [^.]+ matches one or more occurrences of any character that is not a dot.
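To see this in practice, here is a small sketch in Python's re module (Python is my choice for illustration, and the file names and the is_match helper are made up). It tests whole strings against the pattern from the question, which also shows that [^.]+ is not quite the same as the DOS *: it refuses to cross a dot.

```python
import re

# The pattern from the question: one or more non-dot characters,
# a literal dot, then either "txt" or "html".
PATTERN = re.compile(r"[^.]+\.(txt|html)")


def is_match(name: str) -> bool:
    """Return True if the whole name matches the pattern (hypothetical helper)."""
    return PATTERN.fullmatch(name) is not None


print(is_match("notes.txt"))        # True
print(is_match("index.html"))       # True
print(is_match("archive.tar.txt"))  # False: "archive.tar" contains a dot
print(is_match(".txt"))             # False: needs at least one character before the dot
print(is_match("notes.pdf"))        # False: extension is neither txt nor html
```

Note that fullmatch is used on purpose. An unanchored search on "archive.tar.txt" would still find "tar.txt" inside the string, so whether the pattern behaves like a file-name filter depends on how it is anchored.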

From regular-expressions.info:

In most regex flavors, the only special characters or meta-characters inside a character class are the closing bracket (]), the backslash (\), the caret (^), and the hyphen (-). The usual meta-characters are normal characters inside a character class, and do not need to be escaped by a backslash. Your regex will work fine if you escape the regular metacharacters inside a character class, but doing so significantly reduces readability.
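A quick way to check the point about escaping, again as a sketch in Python's re module with made-up sample strings: the escaped and unescaped forms of the class behave the same, while a dot outside a class is still the wildcard.

```python
import re

# Same character class, written without and with a backslash before the dot.
unescaped = re.compile(r"[^.]+")
escaped = re.compile(r"[^\.]+")

# Hypothetical sample strings; both patterns should find the same pieces.
samples = ["report.final.txt", "no dots here", "..."]
for s in samples:
    assert unescaped.findall(s) == escaped.findall(s)

print(unescaped.findall("report.final.txt"))  # ['report', 'final', 'txt']

# Outside a class the dot is a wildcard; inside a class it is a literal dot.
print(re.findall(r".", "a.b"))    # ['a', '.', 'b']
print(re.findall(r"[.]", "a.b"))  # ['.']
```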


Edited at 2021-02-10
