2

Hi, I'm working on a Perl script to split a big XML file into small chunks. I referred to this link: Split file by XML tag

My code looks like this:

if($line =~ /^</row>/) { $count++; } 

But I'm getting this error:

 works\filesplit.pl line 20.
Bareword found where operator expected at E:\Work\perl works\filesplit.pl line 20, near "/^</row"
(Missing operator before row?)
syntax error at E:\Work\perl works\filesplit.pl line 20, near "/^</row"
Search pattern not terminated at E:\Work\perl works\filesplit.pl line 20.

Can anyone help me?

Update

<row>
 <date></date>
 <ForeignpostingId />
 <country>11</country>
 <domain>http://www.xxxx.com</domain>
 <domainid>20813</domainid>
 </row>
<row>
 <date></date>
 <ForeignpostingId />
 <country>11</country>
 <domain>http://www.xxxx.com</domain>
 <domainid>20813</domainid>
 </row>
<row>
 <date></date>
 <ForeignpostingId />
 <country>11</country>
 <domain>http://www.xxxx.com</domain>
 <domainid>20813</domainid>
 </row>
3
  • How do you want the file 'chunked' and what do you want to do with those chunks? Commented Nov 28, 2013 at 6:41
  • @Kenosis: five <row> ... </row> elements should be chunked into a single file. Commented Nov 28, 2013 at 6:43
  • @Kenosis: My file is too large, so I want it chunked with 5 <row>...</row> elements in each file, like this: <row>...</row> <row>...</row> Commented Nov 28, 2013 at 6:45

4 Answers

3

Have you tried xml_split? It's a tool that comes with XML::Twig that's specifically designed to split big XML files, based on a variety of criteria (tag name, level, size).


2

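Perl treats the first unescaped / after the opening delimiter as the end of the pattern, so /^</row>/ ends at /^</ and row is then parsed as a bareword. Escape the slash: you need ^<\/row>, provided that you are trying to match </row> at the beginning of the line. Here is my test code: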

#!/usr/bin/perl
use strict;
use warnings;

my $line = "</row> something";
if ($line =~ /^<\/row>/) {
    print "found a match \n";
}

OUTPUT:

# perl test.pl
found a match
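As a side note, the escaping can be avoided by choosing a different delimiter for the match operator. The following is a minimal sketch of that alternative; the sub name `starts_with_row_end` is made up for illustration and is not part of the original answer.

```perl
use strict;
use warnings;

# Hypothetical helper: true when the line begins with a closing </row> tag.
# m{...} uses braces as delimiters, so the / in </row> needs no backslash.
sub starts_with_row_end {
    my ($line) = @_;
    return $line =~ m{^</row>} ? 1 : 0;
}

print starts_with_row_end("</row> something") ? "match\n" : "no match\n";  # prints "match"
print starts_with_row_end("<row>")            ? "match\n" : "no match\n";  # prints "no match"
```

Any pair of matching brackets, or a character such as `!` or `#`, works as a delimiter once the leading `m` is written explicitly.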

Update

Posting this update after the OP provided sample data.

You need ^\s+<\/row> in your regex because the closing tags in your sample do not start at the beginning of the line; they have a space before them. \s+ matches one or more whitespace characters at the beginning of the line before the actual tag. If some closing tags may start in the first column, use \s* (zero or more) instead.

code:

#!/usr/bin/perl -w
use strict;
use warnings;

while (my $line = <DATA>) {
    if ($line =~ /^\s+<\/row>/) {
        print "found a match \n";
    }
}

__DATA__
<row>
 <date></date>
 <ForeignpostingId />
 <country>11</country>
 <domain>http://www.xxxx.com</domain>
 <domainid>20813</domainid>
 </row>
<row>
 <date></date>
 <ForeignpostingId />
 <country>11</country>
 <domain>http://www.xxxx.com</domain>
 <domainid>20813</domainid>
 </row>
<row>
 <date></date>
 <ForeignpostingId />
 <country>11</country>
 <domain>http://www.xxxx.com</domain>
 <domainid>20813</domainid>
 </row>

Output:

# perl test.pl
found a match
found a match
found a match
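To tie this back to the `$count++` in the question, here is a small sketch that counts closing tags whether or not they are indented. It assumes each `</row>` sits on its own line; the sub name `count_row_ends` and the sample lines are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: counts lines whose first non-whitespace text is </row>.
# \s* accepts zero or more leading whitespace characters.
sub count_row_ends {
    my (@lines) = @_;
    my $count = 0;
    for my $line (@lines) {
        $count++ if $line =~ m{^\s*</row>};
    }
    return $count;
}

my @sample = (
    "<row>\n",
    " <country>11</country>\n",
    " </row>\n",    # one leading space
    "<row>\n",
    "</row>\n",     # no leading space
);
print count_row_ends(@sample), "\n";    # prints 2
```

With `\s+` in place of `\s*`, the last sample line would not be counted.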


2

Perhaps the following will be helpful:

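use strict;
use warnings;

my $i = 1;
local $/ = '<row>';    # read the input in records that end at each <row>

while (<>) {
    chomp;                 # strip the trailing <row> separator
    s!</row>!! or next;    # strip </row>; skip records that have none
    open my $fh, '>', 'File_' . ( sprintf '%05d', $i++ ) . '.xml' or die $!;
    print $fh $_;
}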

Usage: perl script.pl inFile.xml

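This sets Perl's input record separator $/ to <row>, so the XML file is read in 'chunks' delimited by <row>. chomp strips the trailing <row> from each chunk and the substitution removes the </row>, so each output file holds the contents of one row without the enclosing tags. Each chunk is then written to a file that follows the naming scheme "File_nnnnn.xml".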
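Since the question asks for five rows per output file, the same record-separator idea can be extended to group rows. This is only a sketch under a few assumptions: rows are not nested, `<row>` carries no attributes, and the input has no wrapping root element. The sub name `split_rows` and its callback interface are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: reads <row>...</row> records from $in and calls
# $emit->($n, $text) once for every group of up to $per_file rows.
# Returns the number of groups emitted.
sub split_rows {
    my ($in, $per_file, $emit) = @_;
    local $/ = '</row>';                    # one row per record
    my ($n, $held, $buffer) = (0, 0, '');
    while (my $record = <$in>) {
        next unless $record =~ m{<row>};    # skip text after the last row
        $record =~ s/^\s+//;                # drop whitespace between rows
        $buffer .= "$record\n";
        if (++$held == $per_file) {
            $emit->(++$n, $buffer);
            ($held, $buffer) = (0, '');
        }
    }
    $emit->(++$n, $buffer) if $held;        # leftover rows
    return $n;
}

# Usage: perl script.pl inFile.xml
# Writes File_00001.xml, File_00002.xml, ... with up to 5 rows each.
if (@ARGV) {
    open my $in, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    split_rows($in, 5, sub {
        my ($n, $text) = @_;
        my $name = sprintf 'File_%05d.xml', $n;
        open my $out, '>', $name or die "Cannot write $name: $!";
        print {$out} $text;
        close $out or die "Cannot close $name: $!";
    });
}
```

Only one group of rows is held in memory at a time, which matters for a large input. The output files keep the `<row>` tags but have no single root element, so a strict XML parser will reject them unless a wrapper is added.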

2 Comments

I'm getting a blank screen. Nothing happened.
Check the directory for the generated files.
0
#!/bin/perl -w
## splitting xml files using perl script
use strict;
use warnings;

print "Input File ? ";
chomp(my $XmlFile = <STDIN>);
open my $XmlFileHandle, '<', $XmlFile or die "Cannot open $XmlFile: $!";

print "\nSplit By which Tag ? ";
chomp(my $splitby = <STDIN>);
open my $OutputHandle, '>', 'OutputFile_' . $splitby
    or die "Cannot write OutputFile_$splitby: $!";

## to split by <user>...</user>
## skip ahead to the line that holds the opening tag
while (<$XmlFileHandle>) {
    if (/<\Q$splitby\E>/) {
        print $OutputHandle "<$splitby>\n";
        last;
    }
}
## copy lines until the closing tag is reached
while (my $line = <$XmlFileHandle>) {
    if ($line =~ m/<\/\Q$splitby\E>/) {
        print $OutputHandle "</$splitby>";
        last;
    }
    print $OutputHandle $line;
}
print "\nOutput File is : OutputFile_$splitby\n";
