running subprocesses in parallel with Python

Question

I am trying to understand how I can build a parallel computing pipeline from multiple subprocesses. As far as I can see, each subprocess call waits for the previous one to finish, whereas the steps in my pipeline do not depend on the previous run and could be handled in parallel. I want to understand whether this is possible and, if so, a sample of the syntax would be a great help! Thanks in advance.

import sys
import os
import subprocess

subprocess.run("python pipelinecode1.py".split() + [run_date, this_wk, last_wk, prev_wk], shell=True)
subprocess.run("python pipelinecode2.py".split() + [run_date, this_wk, last_wk, prev_wk], shell=True)
subprocess.run("python pipelinecode3.py".split() + [run_date, this_wk, last_wk, prev_wk], shell=True)

user3666197 · Accepted Answer · 2020-03-03 22:08:56Z

The MCVE as-is shows zero dependency on the Python interpreter, so the most efficient way to run a set of mutually independent tasks (not a pipeline, where the one-step-after-another order of processing is what forms the "pipeline") is GNU parallel:

$ parallel python {} run_date this_wk last_wk prev_wk ::: pipelinecode1.py \
                                                          pipelinecode2.py \
                                                          pipelinecode3.py

This way you do not waste CPU or cache resources, and you avoid both the blocking calls and the GIL-lock-induced re-[SERIAL]-isation of the code execution, without any add-on overhead costs.

For all available configuration options, read the details in man parallel.
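
If installing GNU parallel is not an option, a similar fan-out can be sketched in Python itself with subprocess.Popen, which, unlike subprocess.run, returns without waiting for the child process to finish. This is only a minimal sketch under assumptions: the helper names run_in_parallel and pipeline_commands are hypothetical, the three scripts are assumed to sit in the current working directory, and sys.executable is used in place of the bare "python" command so that the children run under the same interpreter.

```python
import subprocess
import sys


def run_in_parallel(commands):
    """Start every command at once, then wait for all of them.

    Returns the exit codes in the same order as `commands`.
    (Hypothetical helper, not part of any library.)
    """
    # Popen returns immediately, so all child processes run concurrently.
    procs = [subprocess.Popen(cmd) for cmd in commands]
    # wait() blocks until that child exits and returns its exit code.
    return [proc.wait() for proc in procs]


def pipeline_commands(run_date, this_wk, last_wk, prev_wk):
    """Build one argument list per script named in the question."""
    scripts = ["pipelinecode1.py", "pipelinecode2.py", "pipelinecode3.py"]
    args = [run_date, this_wk, last_wk, prev_wk]
    # Each command is a list of strings, so no shell=True is needed.
    return [[sys.executable, script] + args for script in scripts]
```

Calling run_in_parallel(pipeline_commands(run_date, this_wk, last_wk, prev_wk)) would start all three scripts together and return a list of three exit codes once the slowest one has finished; the four arguments must be strings.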

I am not familiar with the syntax you suggest. Should I run it in cmd? I am using Atom as my editor.
@CagdasKanar Yes, this is a standard GNU package. Install it if it is not installed yet, read the man page, feel free to configure any of the additional tricks available, then run it from a terminal and enjoy the powers thereof.
