I have a file named test.txt with the following content:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<test time="60" id="01">
<java.lang.String value="cat"/><java.lang.String value="dog"/>
<java.lang.String value="mouse"/>
<java.lang.String value="cow"/>
</test>
What I would like to do is edit the file so that wherever I find something like <java.lang.String value="something"/>, I change that part to <animal>something</animal>.
So for the previous example, after applying a script with sed/awk/grep, the file content will be changed (or a new file will be created) as follows:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<test time="60" id="01">
<animal>cat</animal><animal>dog</animal>
<animal>mouse</animal>
<animal>cow</animal>
</test>
I tried to extract that particular part using the following command:
$less test.txt | grep -Po 'java.lang.String value="\K[^"]*' | awk -F: '{print "<animal>" $1 "</animal>"}'
The output gives me only the changed part, shown below, but I want it together with the rest of the file left unchanged:
<animal>cat</animal>
<animal>dog</animal>
<animal>mouse</animal>
<animal>cow</animal>
I am new to scripting and don't know how to write the complete output to a file.
sed -r 's#<java.lang.String value="([^"]*)"/>#<animal>\1</animal>#g' test.txt
And you should not do XML transformations with regular expressions...
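The question also asks how to get the complete result into a file. A minimal sketch, assuming GNU sed and a scratch directory: redirecting sed's output with > writes the whole transformed document to a new file and leaves test.txt alone. The name animals.xml is only an illustration.

```shell
# Recreate the sample input from the question in a scratch directory.
workdir=$(mktemp -d)
cat > "$workdir/test.txt" <<'EOF'
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<test time="60" id="01">
<java.lang.String value="cat"/><java.lang.String value="dog"/>
<java.lang.String value="mouse"/>
<java.lang.String value="cow"/>
</test>
EOF

# Redirect sed's output to a new file (animals.xml is a made-up name);
# test.txt itself is not modified.
sed -r 's#<java.lang.String value="([^"]*)"/>#<animal>\1</animal>#g' \
    "$workdir/test.txt" > "$workdir/animals.xml"

cat "$workdir/animals.xml"
```

With GNU sed, the -i option edits the file in place instead of creating a new one; it is worth trying that on a copy first.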
EDIT about how it works
By default sed uses "basic regular expressions", where many special characters have to be prefixed with \. The -r flag (GNU sed; -E is the more portable spelling) switches to "extended regular expressions", where the syntax is less cumbersome. See OpenGroup for details.
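To make the difference concrete, here is a small sketch, assuming GNU sed, that runs the same substitution in both dialects on a single sample line. The variable names are made up for the illustration.

```shell
line='<java.lang.String value="cat"/>'

# Basic regular expressions (the default): the group is written \( ... \).
bre=$(printf '%s\n' "$line" |
    sed 's#<java.lang.String value="\([^"]*\)"/>#<animal>\1</animal>#g')

# Extended regular expressions (-r): plain parentheses form the group.
ere=$(printf '%s\n' "$line" |
    sed -r 's#<java.lang.String value="([^"]*)"/>#<animal>\1</animal>#g')

printf '%s\n%s\n' "$bre" "$ere"
```

Both commands produce the same result; only the amount of backslash escaping differs.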
By default sed prints each input line as-is unless a command modifies it. The substitution command has the form s#search_regexp#replacement#flags. The delimiter can be almost any character, such as /, #, or a comma. I chose # so it doesn't clash with the / character in XML.
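As a rough comparison, again assuming GNU sed, the sketch below writes the same substitution with # and with / as the delimiter. With /, every literal slash in the pattern and in the replacement has to be escaped as \/, which is what the choice of # avoids.

```shell
line='<java.lang.String value="cow"/>'

# '#' as delimiter: the slashes in "/>" and "</animal>" need no escaping.
with_hash=$(printf '%s\n' "$line" |
    sed -r 's#<java.lang.String value="([^"]*)"/>#<animal>\1</animal>#g')

# '/' as delimiter: each literal slash must be written as \/ .
with_slash=$(printf '%s\n' "$line" |
    sed -r 's/<java.lang.String value="([^"]*)"\/>/<animal>\1<\/animal>/g')

printf '%s\n%s\n' "$with_hash" "$with_slash"
```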
Then we match things like <java.lang.String value="anything_except_quotes"/>. The part that we want to reuse is wrapped in parentheses; this is called a capture group (or matching group). In the replacement, \1 refers to whatever the first group captured.
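Why "anything except quotes" rather than "anything"? The sketch below, assuming GNU sed, compares the [^"]* group from the command with a hypothetical .* variant on a line that holds two elements. Because .* is greedy, it captures everything up to the last "/> on the line.

```shell
line='<java.lang.String value="cat"/><java.lang.String value="dog"/>'

# [^"]* cannot cross a double quote, so each element is matched on its own.
bounded=$(printf '%s\n' "$line" |
    sed -r 's#<java.lang.String value="([^"]*)"/>#<animal>\1</animal>#g')

# .* is greedy: it runs to the last "/> on the line and swallows both elements.
greedy=$(printf '%s\n' "$line" |
    sed -r 's#<java.lang.String value="(.*)"/>#<animal>\1</animal>#g')

printf 'bounded: %s\ngreedy:  %s\n' "$bounded" "$greedy"
```

The bounded version yields two separate <animal> elements, while the greedy one wraps the leftover markup of the first element inside a single <animal> element.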
The g flag makes sed replace all occurrences of the search pattern on a line, not only the first one.
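The effect shows up on the line of the sample file that holds two elements. A small sketch, assuming GNU sed, runs the substitution once without and once with the g flag:

```shell
line='<java.lang.String value="cat"/><java.lang.String value="dog"/>'

# Without g: only the first match on the line is replaced.
first_only=$(printf '%s\n' "$line" |
    sed -r 's#<java.lang.String value="([^"]*)"/>#<animal>\1</animal>#')

# With g: every match on the line is replaced.
all_matches=$(printf '%s\n' "$line" |
    sed -r 's#<java.lang.String value="([^"]*)"/>#<animal>\1</animal>#g')

printf '%s\n%s\n' "$first_only" "$all_matches"
```

Without g, the second element on the line is left as <java.lang.String value="dog"/>.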