Here is a test file:

    $ gunzip -c file_1.gz
    Line 1
    Line 2
    Line 3

I am executing bash commands this way:

    import subprocess

    cmd = "gunzip -c file_1.gz | grep 3"
    subprocess.call(cmd, shell=True)
    Line 3

I need to run this command on several files in parallel and then join the processes, so it seems I have to use subprocess.Popen().communicate(). However, Popen does not interpret the pipe; it passes it as an argument to the first command, gunzip in my case:

    subprocess.Popen(cmd.split()).communicate()
    gunzip: can't stat: | (|.gz): No such file or directory
    gunzip: can't stat: grep (grep.gz): No such file or directory
    gunzip: can't stat: 8 (8.gz): No such file or directory

I would like to keep the whole command and to avoid separating it this way:

    gunzip = subprocess.Popen('gunzip -c file_1.gz'.split(), stdout=subprocess.PIPE)
    grep = subprocess.Popen('grep 3'.split(), stdin=gunzip.stdout, stdout=subprocess.PIPE)
    gunzip.stdout.close()
    output = grep.communicate()[0]
    gunzip.wait()

Is there a way to not separate the commands and process the pipe correctly?

  • What does "join the processes" mean? Do you want to capture the output of several processes running concurrently? Here's a code example. Unrelated: your code is probably I/O bound, i.e., there might be no point in reading the files in parallel unless they are already in memory. Commented May 21, 2016 at 3:02
  • Sorry for the delay. By joining the processes I mean waiting until all the grep processes have finished, one per file. The answer you are referring to is noteworthy! Commented Aug 27, 2016 at 12:43

1 Answer

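The grep 3 command needs the output of the previous command, and the | that connects them is shell syntax, not an argument. So a single subprocess.Popen call with an argument list cannot run this pipeline; it only works if the whole string goes through a shell, as with shell=True in the question's subprocess.call.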
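
To see why the pipe is lost, it may help to look at what cmd.split() actually produces. This is a minimal sketch that only inspects the argument list and does not run gunzip; the name argv is introduced here for illustration.

```python
cmd = "gunzip -c file_1.gz | grep 3"

# str.split() only splits on whitespace; it knows nothing about shell syntax.
argv = cmd.split()

# argv[0] is the program to execute; everything after it is passed to that
# program as plain arguments, including "|", "grep" and "3".
program, arguments = argv[0], argv[1:]
print(program)    # gunzip
print(arguments)  # ['-c', 'file_1.gz', '|', 'grep', '3']
```

Without a shell in between, gunzip receives "|", "grep" and "3" as file names, which matches the "can't stat" errors in the question.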

If you always want to run grep 3 on all the files, you could instead combine the output of every gunzip -c file_x.gz and run grep only once on the combined result.

    subprocess.Popen('gunzip -c file_1.gz'.split(), stdout=subprocess.PIPE)
    subprocess.Popen('gunzip -c file_2.gz'.split(), stdout=subprocess.PIPE)
    ...
    # all_gunzip_stdout is a placeholder for the combined gunzip output; it is not defined here
    grep = subprocess.Popen('grep 3'.split(), stdin=all_gunzip_stdout, stdout=subprocess.PIPE)
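
If keeping each pipeline as one string matters more than avoiding the shell, one possible approach is to pass the string to Popen with shell=True, start all the processes first, and only then wait on them. The sketch below assumes a POSIX shell and Python 3.7+ (for text=True); run_pipelines is a hypothetical helper name, not part of subprocess.

```python
import subprocess


def run_pipelines(commands):
    """Start every shell pipeline at once, then wait for all of them.

    Returns a list of (returncode, stdout_text) tuples in input order.
    """
    # shell=True hands the whole string to the shell, so "|" acts as a pipe.
    # Only pass trusted strings: the shell interprets everything in them.
    procs = [
        subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE, text=True)
        for cmd in commands
    ]

    results = []
    for proc in procs:
        # communicate() reads all output and blocks until this pipeline exits;
        # the other pipelines keep running in the meantime.
        out, _ = proc.communicate()
        results.append((proc.returncode, out))
    return results
```

Calling run_pipelines(["gunzip -c file_1.gz | grep 3", "gunzip -c file_2.gz | grep 3"]) would start both pipelines before waiting on either. The return code is that of the last command in each pipeline, so grep finding no match shows up as 1.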