Never ever use Regex to handle markup languages.
The original version of this answer (see below) used XML::XPath. Grant McLean said in the comments:
XML::XPath is an old and unmaintained module. XML::LibXML is a modern, maintained module with an almost identical API and it's faster too.
so I made a new version that uses XML::LibXML (thanks, Grant):
    use warnings;
    use strict;
    use XML::LibXML;

    my $doc   = XML::LibXML->load_xml(location => 'articles.xml');
    my $xp    = XML::LibXML::XPathContext->new($doc->documentElement);
    my $xpath = '/articles/article[position() < 4]';

    foreach my $article ( $xp->findnodes($xpath) ) {
        # now do something with $article
        print $article . ": " . $article->getName . "\n";
    }
For me this prints:
    XML::LibXML::Element=SCALAR(0x346ef90): article
    XML::LibXML::Element=SCALAR(0x346ef30): article
    XML::LibXML::Element=SCALAR(0x346efa8): article
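The loop in the XML::LibXML version only prints each node object and its name. As a rough sketch of what "do something with $article" could look like, the snippet below reads an attribute and a child element from each of the first three articles. The inline sample document, the `id` attribute and the `title` child element are all made up for illustration; the real articles.xml may be structured differently.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical sample data; the real articles.xml may look different.
my $xml = <<'XML';
<articles>
  <article id="a1"><title>First</title></article>
  <article id="a2"><title>Second</title></article>
  <article id="a3"><title>Third</title></article>
  <article id="a4"><title>Fourth</title></article>
</articles>
XML

my $doc = XML::LibXML->load_xml(string => $xml);

my @summaries;
foreach my $article ( $doc->findnodes('/articles/article[position() < 4]') ) {
    my $id    = $article->getAttribute('id');   # assumed attribute name
    my $title = $article->findvalue('title');   # XPath relative to $article
    push @summaries, "$id: $title";
}

print "$_\n" for @summaries;
```

With this sample input, the fourth article is skipped because the `position() < 4` predicate matches only the first three `article` elements.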
Links to the relevant documentation:
Original version of the answer, based on the XML::XPath package:
    use warnings;
    use strict;
    use XML::XPath;

    my $xp    = XML::XPath->new(filename => 'articles.xml');
    my $xpath = '/articles/article[position() < 4]';

    foreach my $article ( $xp->findnodes($xpath)->get_nodelist ) {
        # now do something with $article
        print $article . ": " . $article->getName . "\n";
    }
which prints this for me:
    XML::XPath::Node::Element=REF(0x38067b8): article
    XML::XPath::Node::Element=REF(0x38097e8): article
    XML::XPath::Node::Element=REF(0x3809ae8): article
Have a look at the docs to find out what you can do with the returned node objects.