I've been working on a piece of code that splits a file containing multiple XML files into individual XML files. The line count of each XML file varies so I have been using the XML head tag to know where the next file starts.

grep -n "$string" "$xmlfile" | sed -n 's/^\([0-9]*\)[:].*/\1/p'

This gives me the line number where each file starts. How can I use the head/tail commands with these line numbers to pull the files apart within a single automated script?

2 Answers

1
# x1 and x2 are the line numbers of two consecutive XML declarations;
# this prints lines x1 through x2-1, i.e. one complete XML file
head -n "$((x2 - 1))" myfile | tail -n "+$x1"

2 Comments

Not too sure how to get the variable number of declaration line numbers into x1/x2 etc. Any ideas?
Loop over the list of line numbers m1, m2, ..., mn. Start with x1 = 0 and x2 = m1. On each subsequent iteration i, assign x1 = m(i-1) and x2 = mi (read 1, 2, i, n as mathematical subscripts). The last file runs from mn to the end of the input.
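The loop described in the comment above could be sketched as a small POSIX shell function that uses only grep, cut, head, and tail. The function name split_xml, its three arguments, and the numbered output file names are assumptions made for illustration; they are not part of the original answer.

```shell
#!/bin/sh
# Hypothetical helper: split_xml INPUT MARKER PREFIX
# Writes PREFIX1.xml, PREFIX2.xml, ... with one file per line of INPUT
# that contains the fixed string MARKER (e.g. the XML declaration).
split_xml() {
    input=$1 marker=$2 prefix=$3
    count=0
    prev=
    # grep -n prints "LINENO:text"; cut keeps only the line number.
    for line in $(grep -n -F -e "$marker" "$input" | cut -d: -f1); do
        if [ -n "$prev" ]; then
            count=$((count + 1))
            # Lines prev .. line-1 make up one document.
            head -n "$((line - 1))" "$input" | tail -n "+$prev" > "$prefix$count.xml"
        fi
        prev=$line
    done
    # The last document runs from its declaration to the end of the input.
    if [ -n "$prev" ]; then
        count=$((count + 1))
        tail -n "+$prev" "$input" > "$prefix$count.xml"
    fi
}
```

Assuming every embedded file starts with a line containing `<?xml`, a call such as `split_xml combined.xml '<?xml' part` would write part1.xml, part2.xml, and so on. Anything before the first marker line is ignored in this sketch.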
0

When parsing XML files in your preferred shell, your best bet is to use the xmllint command-line tool together with XPath expressions.

xmllint ships with libxml2.

See http://www.xmlsoft.org/ & http://en.wikipedia.org/wiki/Xpath

1 Comment

Unfortunately neither is installed on the machine, and I don't have the rights to install anything. I will need to look into libxml and XPath when I get to another machine.
