To answer this question properly, we'd ideally need a better example - some valid XML is a good start.
Also - an example of desired output. You don't, for example, indicate where you'd want the <C> and <D> elements to end up within your resultant XML. They're already children of <B> - do you want to preserve B or reparent C and D to the root?
However, generically reconstructing XML is quite easy using XML::Twig and Perl.
For example:
    #!/usr/bin/perl
    use strict;
    use warnings;
    use XML::Twig;

    my @wanted = qw ( C D id );
    my %wanted = map { $_ => 1 } @wanted;

    sub delete_unwanted_tags {
        my ( $twig, $element ) = @_;
        my $tag = $element->tag;
        if ( not $wanted{$tag} ) {
            $element->delete;
        }
    }

    my $twig = XML::Twig->new(
        twig_handlers => { _all_ => \&delete_unwanted_tags } );
    $twig->parse( \*DATA );
    $twig->print;

    __DATA__
    <A>
      <id>123</id>
      <B>
        <C>value1</C>
        <D>value2</D>
        <E></E>
      </B>
      <Z></Z>
      <Y></Y>
    </A>
Because we haven't said "keep <B>" the result is:
    <A>
      <id>123</id>
    </A>
Adding <B> to the wanted list (so that it reads qw ( B C D id )) gives:
    <A>
      <id>123</id>
      <B>
        <C>value1</C>
        <D>value2</D>
      </B>
    </A>
If, however, what you want to do is reparent C and D into A:
    #!/usr/bin/perl
    use strict;
    use warnings;
    use XML::Twig;

    my @wanted   = qw ( id );
    my @reparent = qw ( C D );

    # turn the above into hashes, so we can do "if $wanted{$tag}"
    my %wanted   = map { $_ => 1 } @wanted;
    my %reparent = map { $_ => 1 } @reparent;

    sub delete_unwanted_tags {
        my ( $twig, $element ) = @_;
        my $tag = $element->tag;
        if ( $reparent{$tag} ) {
            # move, rather than delete-then-move, the tags we want to keep
            $element->move( 'last_child', $twig->root );
        }
        elsif ( not $wanted{$tag} ) {
            $element->delete;
        }
    }

    my $twig = XML::Twig->new(
        pretty_print  => 'indented_a',
        twig_handlers => { _all_ => \&delete_unwanted_tags }
    );
    $twig->parse( \*DATA );
    $twig->print;

    __DATA__
    <A>
      <id>123</id>
      <B>
        <C>value1</C>
        <D>value2</D>
        <E></E>
      </B>
      <Z></Z>
      <Y></Y>
    </A>
Note - the "twig handler" is called at the end of each element (when a close tag is encountered) which is why this works - we recurse down to find C and D before we finish processing (and deleting) B.
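To see that ordering for yourself, here is a minimal sketch that just records the tag names in the order the handler fires. The @order array and the heredoc in $xml are my own additions for illustration; the XML is the same sample as above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Same sample document as above, held in a string instead of __DATA__
my $xml = <<'XML';
<A>
  <id>123</id>
  <B>
    <C>value1</C>
    <D>value2</D>
    <E></E>
  </B>
  <Z></Z>
  <Y></Y>
</A>
XML

my @order;    # tag names, in the order their handlers are called

my $twig = XML::Twig->new(
    twig_handlers => {
        _all_ => sub {
            my ( $twig, $element ) = @_;
            push @order, $element->tag;
        },
    },
);
$twig->parse($xml);

# Children are listed before their parents: C, D and E come before B,
# and the root A comes last.
print join( ' ', @order ), "\n";
```

C and D show up before B in that list, which is why they can be moved out of B before B itself is deleted.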
This produces:
    <A>
      <id>123</id>
      <C>value1</C>
      <D>value2</D>
    </A>
In the above, I have used __DATA__, \*DATA and parse because it allows me to illustrate both the XML and the techniques. You should probably use parsefile('my_file.xml') instead of parse(\*DATA).
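A rough sketch of the parsefile variant follows. The temporary file is only a stand-in for your real file (I create it with File::Temp so the example is self-contained), and the short XML and the wanted list here are illustrative, not taken from your data.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw( tempfile );
use XML::Twig;

# Stand-in for a real file such as my_file.xml
my ( $fh, $filename ) = tempfile( SUFFIX => '.xml', UNLINK => 1 );
print {$fh} "<A><id>123</id><Z></Z></A>\n";
close $fh or die "Cannot close $filename: $!";

my %wanted = map { $_ => 1 } qw( A id );

my $twig = XML::Twig->new(
    twig_handlers => {
        _all_ => sub {
            my ( $twig, $element ) = @_;
            $element->delete if not $wanted{ $element->tag };
        },
    },
);

# parsefile takes a filename rather than a string or filehandle
$twig->parsefile($filename);

# sprint returns the document as a string instead of printing it
my $result = $twig->sprint;
print "$result\n";
```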
As for grep-based approaches such as:

    grep -o '<[CD]>[^<]*</[CD]>'
    grep -o '<\(parameterC\|parameterD\)>[^<]*</\1>'

XML is not a thing that's easily greppable, thanks to whitespace reformatting, tag nesting and unary (self-closing) tags. Not to mention handling broken XML appropriately (e.g. you should at least detect if tags aren't closed).
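To make that concrete, here is a small sketch comparing a regex in the spirit of the grep patterns with a parser. The two sample documents are made up for illustration: the first is valid XML that has merely been reformatted and given an attribute, the second has an unclosed <C> tag. I use safe_parse, which wraps parse in an eval and returns a false value on failure.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Valid XML, but with a line break inside one start tag and an
# attribute on another.
my $reformatted = <<'XML';
<A>
  <B>
    <C
      >value1</C>
    <D note="x">value2</D>
  </B>
</A>
XML

# The regex no longer matches either element.
my @regex_hits = $reformatted =~ m{<[CD]>[^<]*</[CD]>}g;
print "regex found ", scalar(@regex_hits), " element(s)\n";

# A parser does not care about the layout.
my $twig = XML::Twig->new;
$twig->parse($reformatted);
my @parsed = map { $_->text }
    map { $twig->root->descendants($_) } qw( C D );
print "parser found: @parsed\n";

# Broken XML: <C> is never closed. safe_parse reports it instead of
# silently carrying on.
my $broken = '<A><C>value1</A>';
my $ok     = XML::Twig->new->safe_parse($broken);
print $ok ? "parsed OK\n" : "broken XML detected\n";
```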