
I am using a shell script to remove the XML tags from a set of files in a folder. This is what my script looks like:

    #!/bin/sh
    find texts -type f -name '*.xml' -exec sh -c '
      mkdir -p modified
      file="$0"
      sed "s/<[^>]*>//g" "$file" > modified/modified_texts
    ' {} ';'

This is supposed to take all the files (using $file) in the "texts" folder, remove their XML tags, and write the tag-free text into the "modified" folder.

The problem is that, instead of processing all the files, it uses just one and fills the file "modified_texts" with the content of that one file (without XML tags; that part works).

I don't really understand what I'm doing wrong, so I would appreciate any help.

  • Your multiline code formatting is wrong. Please fix it. Commented Aug 16, 2019 at 16:21
  • Use >> instead of > Commented Aug 16, 2019 at 16:22
  • As PhuNgo says, your code does run over every file, but for each file it processes it deletes the current content of modified_texts and starts over from the beginning, so only the last file's content is present in your result. Commented Aug 16, 2019 at 16:28
  • Better than switching from > to >>, though, is to keep > modified/modified_texts the same, but move it to the very end of your find command. That way, instead of reopening the output file every time you run sed, you open the output file only once, before find starts at all, and keep it open all the way through execution. Commented Aug 16, 2019 at 16:29
  • Thank you. That fixed it. I hadn't realised that. Commented Aug 16, 2019 at 16:30

1 Answer

Instead of doing the output redirection (with truncation!) for every sed command, move it to the outer scope, so the output file is opened (and its prior contents are truncated) only once, before find is started at all.

    #!/bin/sh
    mkdir -p modified # this only needs to happen once, so move it outside
    find texts -type f -name '*.xml' -exec sed 's/<[^>]*>//g' {} ';' > modified/modified_texts
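
To see the difference between the two placements of the redirection, here is a small self-contained sketch. The temporary directory, the sample files one.xml and two.xml, and the output names per_file and once are made up for the demonstration and are not part of the question.

```shell
#!/bin/sh
# Hypothetical demo tree: two tiny XML files in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir -p texts modified
printf '<a>first</a>\n'  > texts/one.xml
printf '<b>second</b>\n' > texts/two.xml

# Variant 1: redirection inside the per-file command.
# Every sed run reopens and truncates the output, so only one file's text survives.
find texts -type f -name '*.xml' -exec sh -c '
  sed "s/<[^>]*>//g" "$0" > modified/per_file
' {} ';'

# Variant 2: redirection on the find command itself.
# The shell opens the output once and every sed run writes to the same open file.
find texts -type f -name '*.xml' -exec sed 's/<[^>]*>//g' {} ';' > modified/once

# Count the lines each variant produced (tr strips padding some wc versions add).
per_file_lines=$(wc -l < modified/per_file | tr -d ' ')
once_lines=$(wc -l < modified/once | tr -d ' ')
echo "per-file redirection: $per_file_lines line(s)"
echo "single redirection:   $once_lines line(s)"
```

With two one-line input files, the first variant should leave a single line in its output and the second should leave two. Which file "wins" in the first variant depends on the order in which find visits the files.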

4 Comments

This doesn't really seem to work for me. Using >> instead of > works, though. But thank you anyway.
If you need to use >>, then you're opening the file more than once. You should fix whatever in your code is making it do that, instead of just doing a hacky workaround of making the opens nondestructive, while still re-opening the file over and over.
(to be clear, this answer, exactly as given, really does work; if you're trying to apply it and getting a failure, you'll need to show a reproducer for that failure before I can speak to it).
It depends on how the OP wants to use the script. If they want to call it multiple times and append to the file each time, it would perhaps be better to merge the calls so the file is opened only once.
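
If appending across several runs really is the goal, one possible sketch is to keep the redirection on the find command and use >> there, so each run still opens the output only once. The strip_tags function name, the temporary directory, and the sample files are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical demo tree: two tiny XML files in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir -p texts modified
printf '<a>first</a>\n'  > texts/one.xml
printf '<b>second</b>\n' > texts/two.xml

# strip_tags is a made-up helper name standing in for one run of the script.
strip_tags() {
  # >> applies to the whole find command: one open per run, no truncation.
  find texts -type f -name '*.xml' -exec sed 's/<[^>]*>//g' {} ';' >> modified/modified_texts
}

strip_tags   # first run writes two lines
strip_tags   # second run appends two more

total_lines=$(wc -l < modified/modified_texts | tr -d ' ')
echo "lines after two runs: $total_lines"
```

Each run adds the stripped text of every file to the end of modified_texts, so the output grows with every invocation. Whether that is desirable depends on how the script is meant to be used.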
