
howto - sed - work with xml files - get content inside one tag

Assuming you have a large xml file (say 400 megabytes) and you want to grep the content inside one tag, which tool would solve this better than sed?

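sed -n -e 's/.*<my_magicTag>\(.*\)<\/my_magicTag>.*/\1/p' myInputFile.xml > myInputFileFilteredByMyMagicTag.xml
So what are we doing? We are telling sed to match any amount of text (possibly none) before "<my_magicTag>", remember any amount of text (possibly none) up to "</my_magicTag>", and match whatever follows the closing tag. With "\1", we replace the whole line with the first remembered pattern (since we only use one "\(\)" group, we only have one in this command). The "-n" option suppresses sed's default output, and the "p" flag at the end of the substitution tells sed to print only the lines where the substitution succeeded. After that, as usual, we are using ">" to redirect the standard output into a file.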
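To see the effect on something smaller than 400 megabytes, here is a minimal sketch. It assumes the opening tag in the pattern is "<my_magicTag>", matching the closing tag in the command. The file names sample.xml and filtered.txt and the sample content are made up for illustration only.

```shell
# Build a tiny sample file (hypothetical content, for illustration only).
printf '%s\n' \
  '<root>' \
  '  <other>ignored</other>' \
  '  <my_magicTag>first value</my_magicTag>' \
  '  <my_magicTag>second value</my_magicTag>' \
  '</root>' > sample.xml

# Keep only the text between the opening and closing tag.
# -n suppresses the default output; the trailing p prints matching lines only.
sed -n -e 's/.*<my_magicTag>\(.*\)<\/my_magicTag>.*/\1/p' sample.xml > filtered.txt

# Show the result: one line per matching input line.
cat filtered.txt
```

Keep in mind that sed works line by line here, so this only catches tags whose opening and closing part sit on the same line. Since ".*" is greedy, a line containing the tag more than once will not be split into separate values.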

