
My Python script uses subprocess to call a Linux utility that is very noisy. I want to store all of the output in a log file and show some of it to the user. I thought the following would work, but the output doesn't show up in my application until the utility has produced a significant amount of output.

# fake_utility.py, just generates lots of output over time
import time

i = 0
while True:
    print(hex(i) * 512)
    i += 1
    time.sleep(0.5)

In the parent process:

import subprocess

proc = subprocess.Popen(['python', 'fake_utility.py'], stdout=subprocess.PIPE)
for line in proc.stdout:
    # the real code does filtering here
    print("test:", line.rstrip())

The behavior I really want is for the filter script to print each line as it is received from the subprocess, like tee does but within Python code.

What am I missing? Is this even possible?


14 Answers


I think the problem is with the statement for line in proc.stdout, which reads the entire input before iterating over it. The solution is to use readline() instead:

# filters output
import subprocess

proc = subprocess.Popen(['python', 'fake_utility.py'], stdout=subprocess.PIPE)
while True:
    line = proc.stdout.readline()
    if not line:
        break
    # the real code does filtering here
    print "test:", line.rstrip()

Of course you still have to deal with the subprocess' buffering.
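For a non-Python child on Linux, one common workaround (not part of the original answer; 'noisy_utility' is a placeholder for the real command) is GNU coreutils' stdbuf, which can often force the child into line-buffered output:

import subprocess

# stdbuf -oL asks the child process to line-buffer its stdout
proc = subprocess.Popen(['stdbuf', '-oL', 'noisy_utility'],
                        stdout=subprocess.PIPE)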

Note: according to the documentation the solution with an iterator should be equivalent to using readline(), except for the read-ahead buffer, but (or exactly because of this) the proposed change did produce different results for me (Python 2.5 on Windows XP).


13 Comments

for file.readline() vs. for line in file see bugs.python.org/issue3907 (in short: it works on Python3; use io.open() on Python 2.6+)
The more pythonic test for an EOF, per the "Programming Recommendations" in PEP 8 (python.org/dev/peps/pep-0008), would be 'if not line:'.
@naxa: for pipes: for line in iter(proc.stdout.readline, ''):.
@Jan-PhilipGehrcke: yes. 1. you could use for line in proc.stdout on Python 3 (the read-ahead bug is not present there) 2. '' != b'' on Python 3 -- don't copy-paste the code blindly -- think about what it does and how it works.
@J.F.Sebastian: sure, the iter(f.readline, b'') solution is rather obvious (and also works on Python 2, if anyone is interested). The point of my comment was not to blame your solution (sorry if it appeared like that, I read that now, too!), but to describe the extent of the symptoms, which are quite severe in this case (most of the Py2/3 issues result in exceptions, whereas here a well-behaved loop changed to be endless, and garbage collection struggles fighting the flood of newly created objects, yielding memory usage oscillations with long period and large amplitude).

Bit late to the party, but was surprised not to see what I think is the simplest solution here:

import io
import subprocess

proc = subprocess.Popen(["prog", "arg"], stdout=subprocess.PIPE)
for line in io.TextIOWrapper(proc.stdout, encoding="utf-8"):  # or another encoding
    # do something with line, e.g.
    print(line, end="")

(This requires Python 3.)

9 Comments

I'd like to use this answer but I am getting: AttributeError: 'file' object has no attribute 'readable' py2.7
Works with python 3
@sorin neither of those things make it "not valid". If you're writing a library that still needs to support Python 2, then don't use this code. But many people have the luxury of being able to use software released more recently than a decade ago. If you try to read on a closed file you'll get that exception regardless of whether you use TextIOWrapper or not. You can simply handle the exception.
you may be late to the party, but your answer is up to date with the current version of Python, ty
@Ammad \n is the newline character. it's conventional in Python for the newline to not be removed when splitting by lines - you'll see the same behaviour if you iterate over a file's lines or use a readlines() method. You can get the line without it with just line[:-1] (TextIOWrapper operates in "universal newlines" mode by default, so even if you're on Windows and the line ends with \r\n, you'll only have \n at the end, so -1 works). You can also use line.rstrip() if you don't mind any other whitespace-like characters at the end of the line also being removed.

Indeed, if you have sorted out the iterator, buffering could now be your problem. You can tell the Python in the subprocess not to buffer its output.

proc = subprocess.Popen(['python','fake_utility.py'],stdout=subprocess.PIPE) 

becomes

proc = subprocess.Popen(['python','-u', 'fake_utility.py'],stdout=subprocess.PIPE) 

I have needed this when calling python from within python.



A function that allows iterating over both stdout and stderr concurrently, in realtime, line by line

In case you need to get the output stream for both stdout and stderr at the same time, you can use the following function.

The function uses Queues to merge both Popen pipes into a single iterator.

Here we create the function read_popen_pipes():

from queue import Queue, Empty
from concurrent.futures import ThreadPoolExecutor


def enqueue_output(file, queue):
    for line in iter(file.readline, ''):
        queue.put(line)
    file.close()


def read_popen_pipes(p):
    with ThreadPoolExecutor(2) as pool:
        q_stdout, q_stderr = Queue(), Queue()

        pool.submit(enqueue_output, p.stdout, q_stdout)
        pool.submit(enqueue_output, p.stderr, q_stderr)

        while True:
            if p.poll() is not None and q_stdout.empty() and q_stderr.empty():
                break

            out_line = err_line = ''

            try:
                out_line = q_stdout.get_nowait()
            except Empty:
                pass
            try:
                err_line = q_stderr.get_nowait()
            except Empty:
                pass

            yield (out_line, err_line)

read_popen_pipes() in use:

import subprocess as sp

with sp.Popen(my_cmd, stdout=sp.PIPE, stderr=sp.PIPE, text=True) as p:
    for out_line, err_line in read_popen_pipes(p):
        # Do stuff with each line, e.g.:
        print(out_line, end='')
        print(err_line, end='')

    return p.poll()  # return status-code (assumes this runs inside a function)



The subprocess module has come a long way since 2010, and most of the answers here are quite outdated.

Here is a simple way that works on modern Python versions:

from subprocess import Popen, PIPE, STDOUT

with Popen(args, stdout=PIPE, stderr=STDOUT, text=True) as proc:
    for line in proc.stdout:
        print(line)
rc = proc.returncode

About using Popen as a context manager (supported since Python 3.2): on exit of the with block, the standard file descriptors are closed and the process is waited for, which sets the returncode attribute. See subprocess.py:Popen.__exit__ in the CPython sources.
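Roughly, the manual equivalent of what the context manager does would look like this (a sketch, not from the original answer):

proc = Popen(args, stdout=PIPE, stderr=STDOUT, text=True)
try:
    for line in proc.stdout:
        print(line)
finally:
    proc.stdout.close()  # close our end of the pipe
    proc.wait()          # reap the child and set returncode
rc = proc.returncode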



You want to pass these extra parameters to subprocess.Popen:

bufsize=1, universal_newlines=True 

Then you can iterate as in your example. (Tested with Python 3.5)
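For example, a minimal sketch using the question's fake_utility.py:

import subprocess

proc = subprocess.Popen(['python', 'fake_utility.py'],
                        stdout=subprocess.PIPE,
                        bufsize=1, universal_newlines=True)
for line in proc.stdout:
    print("test:", line.rstrip())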

1 Comment

@nicoulaj It should work if using the subprocess32 package.

You can also read lines without a loop. Works in Python 3.6.

import subprocess

process = subprocess.Popen(command, stdout=subprocess.PIPE)
list_of_byte_strings = process.stdout.readlines()

2 Comments

Or to convert into strings: list_of_strings = [x.decode('utf-8').rstrip('\n') for x in iter(process.stdout.readlines())]
@ndtreviv, you can pass text=True to Popen or use its "encoding" kwarg if you want the output as strings, no need to convert it yourself
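For example, a sketch of that suggestion (command is a placeholder):

import subprocess

# let Popen decode the bytes for you instead of decoding each line
process = subprocess.Popen(command, stdout=subprocess.PIPE, text=True)
list_of_strings = [line.rstrip('\n') for line in process.stdout.readlines()]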

Python 3.5 added the run() function to the subprocess module; it returns a CompletedProcess object (the older call() returns only the exit code). With this you are fine using proc.stdout.splitlines():

proc = subprocess.run(
    command, shell=True, capture_output=True, text=True, check=True
)
for line in proc.stdout.splitlines():
    print("stdout:", line)

See also How to Execute Shell Commands in Python Using the Subprocess Run Method

3 Comments

This solution is short and effective. One problem, compared to the original question: it does not print each line "as it is received," which I think means printing the messages in realtime just as if running the process directly in the command line. Instead it only prints the output after the process finishes running.
Thanks @sfuqua for mentioning that. I use pipelines extensively and rely on streaming data and would have wrongly chosen this for its brevity.
This does not answer the question. It buffers entire output of subprocess into memory.

I tried this with Python 3 and it worked (source).

When you use Popen to spawn the new process, you tell the operating system to PIPE the stdout of the child process so the parent process can read it; here, stderr is merged into the same stream via stderr=subprocess.STDOUT.

In output_reader we read each line of the child's stdout by wrapping readline in an iterator (iter(proc.stdout.readline, b'')) that yields the child's output line by line whenever a new line is ready.

import subprocess
import threading
import time


def output_reader(proc):
    for line in iter(proc.stdout.readline, b''):
        print('got line: {0}'.format(line.decode('utf-8')), end='')


def main():
    proc = subprocess.Popen(['python', 'fake_utility.py'],
                            stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)

    t = threading.Thread(target=output_reader, args=(proc,))
    t.start()

    try:
        # the parent can do its own work here while output_reader
        # prints the child's output in the background
        time.sleep(0.2)
    finally:
        proc.terminate()
        try:
            proc.wait(timeout=0.2)
            print('== subprocess exited with rc =', proc.returncode)
        except subprocess.TimeoutExpired:
            print('subprocess did not terminate in time')

    t.join()


main()



The following modification of Rômulo's answer works for me on Python 2 and 3 (2.7.12 and 3.6.1):

import os
import subprocess

process = subprocess.Popen(command, stdout=subprocess.PIPE)
while True:
    line = process.stdout.readline()
    if line:  # readline() returns empty bytes/string at EOF
        os.write(1, line)
    else:
        break



I was having a problem with the arg list of Popen while updating servers; the following code resolves this a bit.

import getpass
from subprocess import Popen, PIPE

username = 'user1'
ip = '127.0.0.1'

print('What is the password?')
password = getpass.getpass()

cmd1 = f"""sshpass -p {password} ssh {username}@{ip}"""
cmd2 = f"""echo {password} | sudo -S apt update"""
cmd3 = " && "
cmd4 = f"""echo {password} | sudo -S apt upgrade -y"""
cmd5 = " && "
cmd6 = "exit"

commands = [cmd1, cmd2, cmd3, cmd4, cmd5, cmd6]
command = " ".join(commands)
cmd = command.split()

with Popen(cmd, stdout=PIPE, bufsize=1, universal_newlines=True) as p:
    for line in p.stdout:
        print(line, end='')

And to run the update on a local computer, the following code example does this.

import getpass
from subprocess import Popen, PIPE

print('What is the password?')
password = getpass.getpass()

cmd1_local = "apt update"
cmd2_local = "apt upgrade -y"
commands = [cmd1_local, cmd2_local]

with Popen(['echo', password], stdout=PIPE) as auth:
    for cmd in commands:
        cmd = cmd.split()
        with Popen(['sudo', '-S'] + cmd, stdin=auth.stdout,
                   stdout=PIPE, bufsize=1, universal_newlines=True) as p:
            for line in p.stdout:
                print(line, end='')



An improved version of https://stackoverflow.com/a/57093927/2580077, suitable for Python 3.10.

A function to iterate over both stdout and stderr of the process in parallel.

Improvements:

  • Unified queue to maintain the order of entries in stdout and stderr.
  • Yield all available lines in stdout and stderr - this is useful when the calling process is slower.
  • Use blocking in the loop to prevent the process from utilizing 100% of the CPU.
import time
from queue import Queue, Empty
from concurrent.futures import ThreadPoolExecutor


def enqueue_output(file, queue, level):
    for line in file:
        queue.put((level, line))
    file.close()


def read_popen_pipes(p, blocking_delay=0.5):
    with ThreadPoolExecutor(2) as pool:
        q = Queue()

        pool.submit(enqueue_output, p.stdout, q, 'stdout')
        pool.submit(enqueue_output, p.stderr, q, 'stderr')

        while True:
            if p.poll() is not None and q.empty():
                break

            lines = []
            while not q.empty():
                lines.append(q.get_nowait())
            if lines:
                yield lines

            # otherwise, the loop will run as fast as possible and
            # utilize 100% of the CPU
            time.sleep(blocking_delay)

Usage:

with subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                      bufsize=1, universal_newlines=True) as p:
    for lines in read_popen_pipes(p):
        # lines - all the log entries since the last loop run
        print('ext cmd', lines)
        # process lines



I came here with the same problem and found that none of the provided answers really worked for me. The closest was adding sys.stdout.flush() to the child process, which works but means modifying that process, which I didn't want to do.

Setting the bufsize=1 in the Popen() didn't seem to have any effect for my use case. I guess the problem is that the child process is buffering, regardless of how I call the Popen().

However, I found this question with similar problem (How can I flush the output of the print function?) and one of the answers is to set the environment variable PYTHONUNBUFFERED=1 when calling Popen. This works how I want it to, i.e. real-time line-by-line reading of the output of the child process.
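For example, a minimal sketch of that approach, reusing fake_utility.py from the question:

import os
import subprocess

# copy the current environment and disable the child interpreter's buffering
env = dict(os.environ, PYTHONUNBUFFERED='1')
proc = subprocess.Popen(['python', 'fake_utility.py'],
                        stdout=subprocess.PIPE, text=True, env=env)
for line in proc.stdout:
    print('test:', line.rstrip())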



On Linux (and presumably OSX), sometimes the parent process doesn't see the output immediately because the child process is buffering its output (see this article for a more detailed explanation).

If the child process is a Python program, you can disable this by setting the environment variable PYTHONUNBUFFERED to 1 as described in this answer.

If the child process is not a Python program, you can sometimes trick it into running in line-buffered mode by creating a pseudo-terminal like so:

import os
import pty
import subprocess

# Open a pseudo-terminal
master_fd, slave_fd = pty.openpty()

# Open the child process on the slave end of the PTY
with subprocess.Popen(
        ['python', 'fake_utility.py'],
        stdout=slave_fd, stdin=slave_fd, stderr=slave_fd) as proc:
    # Close our copy of the slave FD (without this we won't notice
    # when the child process closes theirs)
    os.close(slave_fd)

    # Convert the master FD into a file-like object
    with open(master_fd, 'r') as stdout:
        try:
            for line in stdout:
                # Do the actual filtering here
                print("test:", line.rstrip())
        except OSError:
            # This happens when the child process closes its STDOUT,
            # usually when it exits
            pass

If the child process needs to read from STDIN, you can get away without the stdin=slave_fd argument to subprocess.Popen(), as the child process should be checking the status of STDOUT (not STDIN) when it decides whether or not to use line-buffering.

Finally, some programs may actually directly open and write to their controlling terminal instead of writing to STDOUT. If you need to catch this case, you can use the setsid utility by replacing ['python', 'fake_utility.py'] with ['setsid', 'python', 'fake_utility.py'] in the call to subprocess.Popen().

