How do I improve my regex to grep a third-level domain without the extra trailing characters?

Dipesh Sunrait

This regex greps everything. How can I grep only the domain, without the extra characters?

echo "AAAA  cccc.google.com BBBB" | grep -oE "[^\.\n]*((\.[^\.\n]*){2}$)"  --color=always 

I want cccc.google.com to be matched, not AAAA cccc.google.com BBBB. Adding \b doesn't work:
echo "AAAA cccc.google.com BBBB" | grep -oE "\b[^\.\n]*((\.[^\.\n]*){2}\b$)\b" --color=always

Edit: I forgot to say that I need this for grepping third-level and fourth-level domains. Here's what I mean:

  • g.google.com: this is a third-level domain.
  • a.b.google.com: this is a fourth-level domain.

My regex above was grepping the third-level domain, but it also grabbed some other characters, which is why I asked. Let's say I have AAAA a.b.c.d.e.g.google.com BBBB. Then {3} should give me g.google.com, and {4} or {3,4} should give me e.g.google.com, while omitting the unwanted characters. My regex does exactly that, but with extra characters!

So, using this regex (modified from the answer):
echo "AAAA d.cccc.google.com BBB" | grep -oE '\w+(\.\w+){2}'
omits the .com part, which my regex doesn't (although mine prints extra characters). Could you please modify it to work in this case?

Chase

It looks like the OP wants an adjustable regex (as clarified in the comments) that can extract the last n levels of a domain, where n is variable.

Something like this should work: (?:\w+(?:\.|\b)){4}(?=\.\w+(?: |$))\.\w+

Usage

  • With {2}

    $ echo "AAAA  a.b.c.d.e.g.google.com BBB" | grep -oP "(?:\w+(?:\.|\b)){2}(?=\.\w+(?: |$))\.\w+"
    g.google.com
    
    Matches 2 labels, not counting the top-level domain (i.e. com), which is also included in the match.
  • With {3}

    $ echo "AAAA  a.b.c.d.e.g.google.com BBB" | grep -oP "(?:\w+(?:\.|\b)){3}(?=\.\w+(?: |$))\.\w+"
    e.g.google.com
    
    Matches 3 labels, not counting the top-level domain (i.e. com), which is also included in the match.

...and so on

Explanation

(?:\w+(?:\.|\b)){3} <- This is the same as in my original answer: it matches a run of word characters followed by a . or a word boundary, exactly 3 times.

(?=\.\w+(?: |$))\.\w+ <- This acts as the stopping point for the previous part. The lookahead checks that the next label is followed by a space or the end of the line, which marks it as the top-level domain, and \.\w+ then includes it in the match.
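
Since only the repetition count changes between the commands above, the pattern can be wrapped in a small shell function. This is a minimal sketch, not part of the original answer: the function name last_levels and its argument are made up for illustration, and it assumes a grep that supports -P (GNU grep built with PCRE).

```shell
# Hypothetical helper: for each hostname on stdin, print the last N labels
# before the top-level domain, plus the top-level domain itself.
# Assumes GNU grep with -P (PCRE) support.
last_levels() {
  # $1 is the repetition count N used in the answer's {N} quantifier.
  # Single quotes keep \w, \b and $ away from the shell; only N is spliced in.
  pattern='(?:\w+(?:\.|\b)){'"$1"'}(?=\.\w+(?: |$))\.\w+'
  grep -oP "$pattern"
}

echo "AAAA  a.b.c.d.e.g.google.com BBB" | last_levels 2   # g.google.com
echo "AAAA  a.b.c.d.e.g.google.com BBB" | last_levels 3   # e.g.google.com
```

Building the pattern from single-quoted pieces avoids having to escape $ and the backslashes for the shell, which is easy to get wrong inside double quotes.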

Original Answer

That regex seems completely wrong. If you only want to match names like cccc.google.com and www.google.com, but not google.com, use: (?:\w+(?:\.|\b)){3}
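
As for why the pattern in the question grabs the whole line: its character class [^\.\n] still allows spaces and letters, so it can extend past the hostname on both sides. In a POSIX bracket expression the backslash is literal, so the class excludes a backslash, a dot, and the letter n rather than a newline. The sketch below shows the effect, followed by a hypothetical ERE-only variant (three_labels is a made-up name, and the variant is an assumption, not something from the answer) that also excludes whitespace.

```shell
# The question's pattern: the class matches spaces, so the whole line is printed.
echo "AAAA  cccc.google.com BBBB" | grep -oE '[^\.\n]*((\.[^\.\n]*){2}$)'

# Hypothetical variant: excluding whitespace as well as dots keeps the
# match inside a single hostname. Works with plain -E, no -P needed.
three_labels() {
  grep -oE '[^.[:space:]]+(\.[^.[:space:]]+){2}'
}

echo "AAAA  cccc.google.com BBBB" | three_labels   # cccc.google.com
```

Note that this variant still takes the first three labels of a longer name (d.cccc.google for d.cccc.google.com), which is the same problem the question's edit describes.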

Explanation

The primary part is \w+(?:\.|\b): this matches a run of word characters that is immediately followed by a . or by a word boundary (for example, the position just before a space).

It is wrapped in (?:...){3}, which requires three such groups in a row.
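
A quick way to check this behaviour is to run the {3} pattern against names with two and three labels. The sketch below is only illustrative: has_three_levels is a made-up helper name, and -P assumes GNU grep with PCRE support.

```shell
# Hypothetical helper: succeeds if the line contains three consecutive
# "word characters followed by . or a word boundary" groups.
has_three_levels() {
  printf '%s\n' "$1" | grep -qP '(?:\w+(?:\.|\b)){3}'
}

for line in "AAAA cccc.google.com BBB" "AAAA www.google.com BBB" "AAAA google.com BBB"; do
  if has_three_levels "$line"; then
    echo "match:    $line"
  else
    echo "no match: $line"
  fi
done
```

Only the google.com line should report no match, because neither AAAA nor BBB is adjacent to another word group and google.com supplies just two groups.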

To also grep fourth-level domains, just change the {3} to {3,4}:

(?:\w+(?:\.|\b)){3,4}

This is how to do it with grep:

$ echo "AAAA  cccc.google.com BBB" | grep -oP "(?:\w+(?:\.|\b)){3,4}"
cccc.google.com

And with d.cccc.google.com:

$ echo "AAAA  d.cccc.google.com BBB" | grep -oP "(?:\w+(?:\.|\b)){3,4}"
d.cccc.google.com
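
One caveat with this original pattern, relevant to the longer hostname from the question's edit: grep -o reports the leftmost match first, so on a name with more than four labels the match starts at the first label rather than covering the last ones. The sketch below illustrates this, again assuming GNU grep with -P; leading_match is a made-up helper name.

```shell
# Hypothetical helper: print only the first match of the original {3,4} pattern.
leading_match() {
  grep -oP '(?:\w+(?:\.|\b)){3,4}' | head -n 1
}

# The first match begins at "a" and keeps a trailing dot.
echo "AAAA  a.b.c.d.e.g.google.com BBB" | leading_match   # a.b.c.d.
```

This is the gap that the lookahead (?=\.\w+(?: |$)) in the updated pattern is meant to close, since it forces the match to end at the top-level domain.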
