So I have a task whereby I have to manipulate an XML file through a bash shell script.
Here are the steps:
Here is a sample of the XML with non-essential info removed:
<fmreq:fileManagementRequestDetail xmlns:fmreq="http://foobar.com/filemanagement">
<fmreq:property>
<fmreq:name>form_category_cd</fmreq:name>
<fmreq:value>Memos</fmreq:value>
</fmreq:property>
<fmreq:property>
<fmreq:name>object_name</fmreq:name>
<fmreq:value>Correspondence</fmreq:value>
</fmreq:property>
</fmreq:fileManagementRequestDetail>
I have to get the value from the value element under object_name, cross reference it, and then replace the value under the form_category_cd value element with the new value:
So if object_name -> value is Correspondence then the form_category_cd -> value might need to be YYZ.
Here's the rub: I can only use the tools available on our server, as our operations group is restricting us to the tools at hand. It was a fight to get xmllint updated, and then it got overruled. I'm on a version that does not support --xpath, which, believe me, is difficult on a good day. The version I have available also doesn't support namespaces, so xmllint is out.
I've tried sed, but it seems to not like my regex even though every tester I try works fine.
Regex:
(<fmreq\:name>object_name<\/fmreq\:name>)(?:\n\s*)(<fmreq\:value>)(.*)(<\/fmreq\:value>)
I need to get group #3, but sed won't return it. Instead it returns the entire contents of the XML file.
sed -e 's/\(<fmreq\:name>object_name<\/fmreq\:name>\)\(?:\n\s*\)\(<fmreq\:value>\)\(.*\)\(<\/fmreq\:value>\)/\3/' < c3.xml
I'm not as familiar with awk / gawk, so I'm struggling to figure them out as well, but I'm open to them if a solution can be found.
Would love to have an awk / gawk solution just to make the boss happy since he's an old awk fan, but I'll take what I can get as I'm stumped.
Again I have to use the tools on hand and can't install anything new.
I think that there are a couple of problems in your sed command:
1. You don't use the -n option, so by default sed just prints every line of input to the output (possibly modified by a sed command).
2. You don't need the redirection < c3.xml, because sed recognizes the last argument as a filename.
3. sed is not very well suited for matches over multiple lines. See for example here.
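To illustrate the first and third points, here is a small sketch on a two-line scratch file (demo.txt is a made-up name, and I'm assuming GNU sed). Because sed reads one line at a time, a \n in the pattern has nothing to match unless you pull the next line into the pattern space yourself, for example with N. Also, as far as I know, sed has no (?:...) non-capturing groups; that is Perl-style syntax which the online testers accept but sed does not.

```shell
# Two-line input standing in for the name/value pair in the XML.
# demo.txt is a hypothetical scratch file used only for this demo.
printf 'name\nvalue\n' > demo.txt

# Without -n and without N: sed sees one line at a time, so \n never
# matches, nothing is substituted, and every line is printed as-is.
sed 's/name\nvalue/MATCH/' demo.txt
# prints:
#   name
#   value

# With N, the next line is appended to the pattern space (joined by a
# newline), so the same pattern can now match across the two lines.
sed 'N;s/name\nvalue/MATCH/' demo.txt
# prints:
#   MATCH
```

That is why your original command appears to return the whole file: no line ever matches, and each one is printed unchanged.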
The following seems to work on your example:
sed -n "/<fmreq:name>object_name<\/fmreq:name>/ {n;p}" c3.xml | sed "s/^\s*<fmreq:value>\(.*\)<\/fmreq:value>/\1/g"
Or, with only one sed
invocation:
sed -n "/<fmreq:name>object_name<\/fmreq:name>/ {n;s/^\s*<fmreq:value>\(.*\)<\/fmreq:value>/\1/g;p}" c3.xml
Breakdown of what this command does:
- The option -n tells sed not to print the pattern space after it has finished processing the line. Consequently, you need to use the command p explicitly to do so.
- /regex/ tells sed to execute the commands that follow only on the lines that match regex.
- The sed command n replaces the content of the pattern space with the next line of input, which is the one containing the value you are interested in.
- The sed command s/regex/replacement/ substitutes the first match of regex in the pattern space with replacement.
- The sed command p prints the line.
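Since you mentioned your boss would like awk, here is a rough sketch of the whole task (extract, cross-reference, replace) using the same "act on the line after the match" idea. It assumes a POSIX-style awk that supports -v, and that each value element sits on the line directly after its name element, as in your sample. The Correspondence to YYZ mapping comes from your example; the case statement, the variable names, and the output file c3.new.xml are placeholders I made up.

```shell
#!/bin/sh
# Sample input copied from the question.
cat > c3.xml <<'EOF'
<fmreq:fileManagementRequestDetail xmlns:fmreq="http://foobar.com/filemanagement">
<fmreq:property>
<fmreq:name>form_category_cd</fmreq:name>
<fmreq:value>Memos</fmreq:value>
</fmreq:property>
<fmreq:property>
<fmreq:name>object_name</fmreq:name>
<fmreq:value>Correspondence</fmreq:value>
</fmreq:property>
</fmreq:fileManagementRequestDetail>
EOF

# Step 1: print the text of the value element on the line that follows
# the object_name element. The "found" rule comes first, so it only
# fires on the line after the flag was set.
object_name=$(awk '
    found { gsub(/^[ \t]*<fmreq:value>|<\/fmreq:value>[ \t]*$/, ""); print; exit }
    /<fmreq:name>object_name<\/fmreq:name>/ { found = 1 }
' c3.xml)

# Step 2: cross-reference. Hypothetical mapping; replace with the real one.
case "$object_name" in
    Correspondence) new_cd=YYZ ;;
    *)              new_cd= ;;
esac

# Step 3: rewrite the value on the line after form_category_cd and write
# the result to a new file. Note: an "&" or backslash in the new value
# would need escaping, since sub() treats them specially.
if [ -n "$new_cd" ]; then
    awk -v new="$new_cd" '
        hit { sub(/<fmreq:value>.*<\/fmreq:value>/, "<fmreq:value>" new "</fmreq:value>"); hit = 0 }
        /<fmreq:name>form_category_cd<\/fmreq:name>/ { hit = 1 }
        { print }
    ' c3.xml > c3.new.xml
fi
```

Writing to a separate file and moving it into place afterwards avoids clobbering the input if something goes wrong halfway. Like the sed version, this is line-based text matching rather than real XML parsing, so it will break if the layout of the file changes.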