sed to edit only part of a file with regular expression

web2dev

I have a file named test.txt with the following content:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<test time="60" id="01">
<java.lang.String value="cat"/><java.lang.String value="dog"/>
<java.lang.String value="mouse"/>
<java.lang.String value="cow"/>
</test>

What I would like to do is edit the file so that every occurrence of <java.lang.String value="something"/> is changed to <animal>something</animal>.

So for the previous example, after running a script that uses sed/awk/grep, either the file content is changed or a new file is created, like the following:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<test time="60" id="01">
<animal>cat</animal><animal>dog</animal>
<animal>mouse</animal>
<animal>cow</animal>
</test>

I tried to extract that particular part using the following command:

$less test.txt | grep -Po 'java.lang.String value="\K[^"]*' | awk -F: '{print "<animal>" $1 "</animal>"}'

The output gives me only the changed parts, but I want them together with the rest of the file, unchanged:

<animal>cat</animal>
<animal>dog</animal>
<animal>mouse</animal>
<animal>cow</animal>

I am new to scripting and I don't know how to write the complete output to a file.

Grapsus
sed -r 's#<java\.lang\.String value="([^"]*)"/>#<animal>\1</animal>#g' test.txt

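And in general you should not do XML transformations with regular expressions: this works for input shaped exactly like the sample, but a regex knows nothing about XML syntax.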
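The question also asks how to get the complete output into a file. A minimal sketch, assuming GNU sed and a scratch directory; out.txt is a hypothetical file name, and the dots in the pattern are escaped so they match literally:

```shell
# Recreate the sample file from the question in a scratch directory.
workdir=$(mktemp -d)
cat > "$workdir/test.txt" <<'EOF'
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<test time="60" id="01">
<java.lang.String value="cat"/><java.lang.String value="dog"/>
<java.lang.String value="mouse"/>
<java.lang.String value="cow"/>
</test>
EOF

# Option 1: leave test.txt untouched and redirect the result to a new file
# (out.txt is a hypothetical name).
sed -r 's#<java\.lang\.String value="([^"]*)"/>#<animal>\1</animal>#g' \
    "$workdir/test.txt" > "$workdir/out.txt"

# Option 2: edit test.txt in place, keeping the original as test.txt.bak.
sed -r -i.bak 's#<java\.lang\.String value="([^"]*)"/>#<animal>\1</animal>#g' \
    "$workdir/test.txt"

cat "$workdir/test.txt"
```

After both commands, out.txt and test.txt hold the transformed content and test.txt.bak holds the original. Do not redirect to the input file itself (sed ... test.txt > test.txt), because the shell truncates the file before sed reads it.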

EDIT about how it works

By default sed uses "basic regular expressions", where grouping and several other operators have to be prefixed with \. The -r flag (GNU sed; -E is an equivalent, more portable spelling) switches to "extended regular expressions", where the syntax is less cumbersome. See OpenGroup for details.
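To make the difference concrete, here is the same substitution written both ways. This is only a sketch, assuming GNU sed; the variable names are made up for illustration:

```shell
line='<java.lang.String value="cat"/>'

# Basic regular expressions (the default): the group is written \( ... \).
bre=$(printf '%s\n' "$line" |
    sed 's#<java\.lang\.String value="\([^"]*\)"/>#<animal>\1</animal>#g')

# Extended regular expressions (-r): plain ( ... ) is enough.
ere=$(printf '%s\n' "$line" |
    sed -r 's#<java\.lang\.String value="([^"]*)"/>#<animal>\1</animal>#g')

printf '%s\n%s\n' "$bre" "$ere"
# prints <animal>cat</animal> twice
```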

By default sed prints every input line as-is unless a command modifies it. The substitution command has the form s#search_regexp#replacement#flags. The delimiter can be almost any character, such as /, #, or ,. I chose # so it doesn't clash with the / characters in the XML.
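A small sketch of both points, assuming GNU sed (the variable names are illustrative only): a line that does not match passes through untouched, and choosing # over / only changes how much escaping is needed.

```shell
line='<java.lang.String value="cow"/>'

# With / as the delimiter, every literal / in the pattern and the
# replacement has to be escaped as \/.
with_slash=$(printf '%s\n' "$line" |
    sed -r 's/<java\.lang\.String value="([^"]*)"\/>/<animal>\1<\/animal>/g')

# With # as the delimiter, the slashes can stay as they are.
with_hash=$(printf '%s\n' "$line" |
    sed -r 's#<java\.lang\.String value="([^"]*)"/>#<animal>\1</animal>#g')

# A line without a match is printed unchanged.
untouched=$(printf '%s\n' '</test>' |
    sed -r 's#<java\.lang\.String value="([^"]*)"/>#<animal>\1</animal>#g')

printf '%s\n%s\n%s\n' "$with_slash" "$with_hash" "$untouched"
```' ] || exit 1
</test>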

Then we match things like <java.lang.String value="anything_except_quotes"/>. The part that we want to reuse is wrapped in parentheses; this is called a capturing group. In the replacement, \1 refers to whatever that group captured.
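The "anything except quotes" part matters on lines that hold more than one element. A rough comparison, assuming GNU sed, of the [^"]* group against a greedy .* group on the cat/dog line from the question:

```shell
line='<java.lang.String value="cat"/><java.lang.String value="dog"/>'

# [^"]* cannot cross a double quote, so each element is captured separately.
bounded=$(printf '%s\n' "$line" |
    sed -r 's#<java\.lang\.String value="([^"]*)"/>#<animal>\1</animal>#g')

# .* is greedy and runs to the last "/> on the line, so the group
# swallows everything between the first value=" and the final quote.
greedy=$(printf '%s\n' "$line" |
    sed -r 's#<java\.lang\.String value="(.*)"/>#<animal>\1</animal>#g')

printf '%s\n' "$bounded"
# <animal>cat</animal><animal>dog</animal>
printf '%s\n' "$greedy"
# <animal>cat"/><java.lang.String value="dog</animal>
```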

The g flag makes sed replace all occurrences of the search pattern on each line, not only the first one.
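The effect of the flag is easiest to see on the line from the question that holds two elements. A short sketch, assuming GNU sed:

```shell
line='<java.lang.String value="cat"/><java.lang.String value="dog"/>'

# Without g, only the first match on the line is replaced.
first_only=$(printf '%s\n' "$line" |
    sed -r 's#<java\.lang\.String value="([^"]*)"/>#<animal>\1</animal>#')

# With g, every match on the line is replaced.
all_matches=$(printf '%s\n' "$line" |
    sed -r 's#<java\.lang\.String value="([^"]*)"/>#<animal>\1</animal>#g')

printf '%s\n' "$first_only"
# <animal>cat</animal><java.lang.String value="dog"/>
printf '%s\n' "$all_matches"
# <animal>cat</animal><animal>dog</animal>
```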
