Unix - Split a file containing XML files into single files

Question

I've been working on a piece of code that splits a file containing multiple XML files into individual XML files. The line count of each XML file varies, so I have been using the XML head tag to know where the next file starts.

grep -n "$string" "$xmlfile" | sed -n 's/^\([0-9]*\):.*/\1/p'

This gets me the line number of the start of each file. How can I use the head/tail commands with those line numbers to pull the files apart within a single automated script?

aefxx · Accepted Answer · 2012-10-12 19:50:08Z


# x1 and x2 are the line numbers of two consecutive XML declarations
head -n "$((x2 - 1))" myfile | tail -n "+$x1"
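A quick way to check the slicing on numbered input, assuming `seq` is available (the boundary values below are made up for illustration and are not from the answer). To keep lines x1 through x2-1, `head` has to stop at line x2-1 and `tail` needs the `+x1` form, because plain `tail -n x1` keeps the last x1 lines rather than starting at line x1.

```shell
# Hypothetical boundaries: one document starts at line 4, the next at line 7.
x1=4
x2=7

# Take lines x1..x2-1 of a ten-line numbered stream:
# head drops everything from line x2 onward, tail then starts at line x1.
slice=$(seq 1 10 | head -n "$((x2 - 1))" | tail -n "+$x1")

printf '%s\n' "$slice"
```

Swapping `seq 1 10` for `cat myfile` (or passing the file to `head` directly) gives the single-document extraction.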


2 Comments

Rellix Over a year ago

Not too sure how to get the variable amount of line declarations into the x1/x2 etc. Any ideas?

aefxx Over a year ago

Loop over the list of line numbers m1, m2, ..., mn. Let x1 = 0 and x2 = m1. For each subsequent iteration, assign x1 = m(i-1) and x2 = m(i), where 1, 2, i and n are to be read as indices.
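The loop described in this comment could be sketched roughly as follows. This is only one possible reading, under several assumptions that are not in the thread: the input is called `combined.xml`, every document starts on a line containing `<?xml`, the output files are named `part_1.xml`, `part_2.xml`, and so on, and the small sample input exists only so the sketch runs on its own. Instead of starting from x1 = 0, it appends "last line + 1" as a final boundary so the last document is written too.

```shell
#!/bin/sh
# Hypothetical names: combined.xml, marker, part_N.xml are illustrative only.
xmlfile=combined.xml
marker='<?xml'

# Sample input so the sketch is self-contained; use the real file instead.
cat > "$xmlfile" <<'EOF'
<?xml version="1.0"?>
<a>
  <b/>
</a>
<?xml version="1.0"?>
<c/>
EOF

# Line numbers where each document starts (grep -n prints "N:line").
starts=$(grep -n -F -e "$marker" "$xmlfile" | cut -d: -f1)

# One past the last line, used as the end boundary of the final document.
total=$(wc -l < "$xmlfile")

i=0
prev=
for cur in $starts $((total + 1)); do
    if [ -n "$prev" ]; then
        i=$((i + 1))
        # Keep lines prev..cur-1: stop before line cur, then start at prev.
        head -n "$((cur - 1))" "$xmlfile" | tail -n "+$prev" > "part_$i.xml"
    fi
    prev=$cur
done
```

Each iteration rereads the file from the top, which is fine for small inputs but slow for large ones. Anything before the first matching line is dropped, and a marker that also appears inside a document would cause an extra split.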

Gilles Quénot · Accepted Answer · 2012-10-12 19:47:29Z

When parsing XML files in your preferred shell, your best bet is to use the xmllint command-line tool and XPath expressions.

xmllint comes from libxml.

See http://www.xmlsoft.org/ & http://en.wikipedia.org/wiki/Xpath

Unfortunately, neither is installed on the machine and I don't have the rights to install them. I will need to look into libxml and XPath when I get to another machine.
