
I'm running a script via Python's subprocess module. Currently I use:

p = subprocess.Popen('/path/to/script', stdout=subprocess.PIPE, stderr=subprocess.PIPE)
result = p.communicate()

I then print the result to stdout. This is all fine, but since the script takes a long time to complete, I also want real-time output from the script to stdout. The reason I pipe the output is that I want to parse it.


4 Answers


To save the subprocess' stdout to a variable for further processing and to display it while the child process is running, as it arrives:

#!/usr/bin/env python3
from io import StringIO
from subprocess import Popen, PIPE

with Popen('/path/to/script', stdout=PIPE, bufsize=1,
           universal_newlines=True) as p, StringIO() as buf:
    for line in p.stdout:
        print(line, end='')
        buf.write(line)
    output = buf.getvalue()
rc = p.returncode

Saving both the subprocess's stdout and stderr is more complex, because you have to consume both streams concurrently to avoid a deadlock:

stdout_buf, stderr_buf = StringIO(), StringIO()
rc = teed_call('/path/to/script', stdout=stdout_buf, stderr=stderr_buf,
               universal_newlines=True)
output = stdout_buf.getvalue()
...

where teed_call() is defined here.
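As a rough idea of what such a helper does, here is a minimal thread-based sketch. The name teed_call and its signature are illustrative, not the linked implementation, and it assumes text-mode streams (pass universal_newlines=True):

```python
import subprocess
import sys
import threading

def _tee(stream, buf, display):
    # copy each line into the caller's buffer and echo it to the parent's stream
    for line in stream:
        buf.write(line)
        display.write(line)
    stream.close()

def teed_call(cmd, stdout=None, stderr=None, **kwargs):
    """Run cmd, teeing its stdout/stderr into the given file-like buffers
    while echoing them to the parent's stdout/stderr; returns the exit code."""
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                         **kwargs)
    threads = [
        threading.Thread(target=_tee, args=(p.stdout, stdout, sys.stdout)),
        threading.Thread(target=_tee, args=(p.stderr, stderr, sys.stderr)),
    ]
    for t in threads:
        t.start()
    for t in threads:  # both pipes are drained concurrently -> no deadlock
        t.join()
    return p.wait()
```

One reader thread per pipe is the key point: neither pipe buffer can fill up while the other is being read.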


Update: here's a simpler asyncio version.
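For reference, the same idea in modern async/await style (Python 3.8+) might look like the sketch below; the helper name read_and_display is kept from the old version that follows:

```python
import asyncio
import sys

async def _tee(stream, buf, display):
    # read one line at a time until EOF, saving and echoing each line
    while True:
        line = await stream.readline()
        if not line:  # EOF
            break
        buf.append(line)
        display.write(line)
        display.flush()

async def read_and_display(*cmd):
    """Run cmd, echoing its stdout/stderr as they arrive while capturing both."""
    process = await asyncio.create_subprocess_exec(
        *cmd, stdout=asyncio.subprocess.PIPE, stderr=asyncio.subprocess.PIPE)
    stdout, stderr = [], []
    # drain both pipes concurrently to avoid a deadlock
    await asyncio.gather(
        _tee(process.stdout, stdout, sys.stdout.buffer),
        _tee(process.stderr, stderr, sys.stderr.buffer))
    rc = await process.wait()
    return rc, b''.join(stdout), b''.join(stderr)
```

Usage: rc, out, err = asyncio.run(read_and_display(sys.executable, '-c', 'print("hi")')).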


Old version:

Here's a single-threaded solution based on the child_process.py example from tulip:

import asyncio
import sys

from asyncio.subprocess import PIPE

@asyncio.coroutine
def read_and_display(*cmd):
    """Read cmd's stdout, stderr while displaying them as they arrive."""
    # start process
    process = yield from asyncio.create_subprocess_exec(*cmd,
            stdout=PIPE, stderr=PIPE)

    # read child's stdout/stderr concurrently
    stdout, stderr = [], []  # stderr, stdout buffers
    tasks = {
        asyncio.Task(process.stdout.readline()): (
            stdout, process.stdout, sys.stdout.buffer),
        asyncio.Task(process.stderr.readline()): (
            stderr, process.stderr, sys.stderr.buffer)}
    while tasks:
        done, pending = yield from asyncio.wait(tasks,
                return_when=asyncio.FIRST_COMPLETED)
        assert done
        for future in done:
            buf, stream, display = tasks.pop(future)
            line = future.result()
            if line:  # not EOF
                buf.append(line)     # save for later
                display.write(line)  # display in terminal
                # schedule to read the next line
                tasks[asyncio.Task(stream.readline())] = buf, stream, display

    # wait for the process to exit
    rc = yield from process.wait()
    return rc, b''.join(stdout), b''.join(stderr)

The script runs the '/path/to/script' command and reads both its stdout and stderr concurrently, line by line. The lines are printed to the parent's stdout/stderr correspondingly and saved as bytestrings for later processing. To run the read_and_display() coroutine, we need an event loop:

import os

if os.name == 'nt':
    loop = asyncio.ProactorEventLoop()  # for subprocess' pipes on Windows
    asyncio.set_event_loop(loop)
else:
    loop = asyncio.get_event_loop()
try:
    rc, *output = loop.run_until_complete(read_and_display("/path/to/script"))
    if rc:
        sys.exit("child failed with '{}' exit code".format(rc))
finally:
    loop.close()



p.communicate() waits for the subprocess to complete and then returns its entire output at once.

Have you tried something like this instead, where you read the subprocess output line-by-line?

p = subprocess.Popen('/path/to/script', stdout=subprocess.PIPE, stderr=subprocess.PIPE)
for line in p.stdout:
    # do something with this individual line
    print line

3 Comments

If the child process generates enough output to fill the OS stderr pipe buffer (65K on my machine), then it hangs. You should consume p.stderr too, concurrently. Due to the read-ahead bug, for line in p.stdout will print in bursts; you could use for line in iter(p.stdout.readline, b'') instead. print line will print double newlines; you could use print line, (note the trailing comma) to avoid that.
Great point about consuming stderr too. I was assuming that a few lines of buffering wouldn't be an issue in a lengthy data stream, but that's something to consider as well.
"the script takes a long time to complete" -- it means that if the script writes progress to stderr then it can stall.
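Putting those suggestions together in Python 3, a deadlock-free line-by-line loop can merge stderr into stdout so a single reader drains both pipes. This is a sketch; the helper name stream_lines is illustrative:

```python
import subprocess

def stream_lines(cmd):
    """Yield the child's output line by line as it arrives; stderr is merged
    into stdout so a single loop drains both pipes (no deadlock)."""
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    for line in iter(p.stdout.readline, b''):
        print(line.decode(), end='')  # end='' avoids the doubled newline
        yield line
    p.stdout.close()
    p.wait()
```

The trade-off is the one noted in the last answer below: once merged, you can no longer tell which stream a line came from.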

The Popen.communicate doc clearly states:

Note: The data read is buffered in memory, so do not use this method if the data size is large or unlimited. 

https://docs.python.org/2/library/subprocess.html#subprocess.Popen.communicate

So if you need real-time output, you need to read the output line by line instead, for example:

stream_p = subprocess.Popen('/path/to/script', stdout=subprocess.PIPE, stderr=subprocess.PIPE)
for stream_line in stream_p.stdout:
    # Parse it the way you want
    print stream_line



This prints both stdout and stderr to the terminal as well as saving both stdout and stderr into a variable:

from subprocess import Popen, PIPE, STDOUT

with Popen(args, stdout=PIPE, stderr=STDOUT, text=True, bufsize=1) as p:
    output = "".join([print(buf, end="") or buf for buf in p.stdout])

However, depending on what exactly you're doing, this may be important to note: by using stderr=STDOUT, we can no longer differentiate between stdout and stderr, and because of the call to print, your output will always be printed to stdout, regardless of whether it came from stdout or stderr.

For Python < 3.7 you will need to use universal_newlines instead of text.

New in version 3.7: text was added as a more readable alias for universal_newlines.

Source: https://docs.python.org/3/library/subprocess.html#subprocess.Popen
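To see the pattern end to end, here is a runnable sketch with a concrete stand-in for args (the child command below is illustrative):

```python
import sys
from subprocess import Popen, PIPE, STDOUT

# a stand-in child process that writes one line to stdout and one to stderr
args = [sys.executable, '-c',
        'import sys; print("out"); sys.stderr.write("err\\n")']

with Popen(args, stdout=PIPE, stderr=STDOUT, text=True, bufsize=1) as p:
    # print() returns None, so `or buf` keeps each line for the final join
    output = "".join([print(buf, end="") or buf for buf in p.stdout])
```

Both lines end up in output, but since the streams are merged, their relative order depends on the child's buffering.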

