My understanding was that concurrent.futures relies on pickling arguments to pass them to the different processes (or threads) that run the function. Shouldn't pickling create a copy of the argument? On Linux it does not seem to do so, i.e., I have to pass a copy explicitly.
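As a sanity check of that expectation, here is a minimal sketch (the list values are only illustrative) showing that a pickle round-trip by itself does produce an independent copy. It says nothing about when ProcessPoolExecutor actually performs the pickling, which may be the part I'm missing.

```python
import pickle

rands = [17, 72, 97, 8, 32]

# Round-trip through pickle, as I assume happens when arguments
# are sent to a worker process.
snapshot = pickle.loads(pickle.dumps(rands))

# Mutate the original after the snapshot was taken.
rands.reverse()

print(snapshot is rands)  # the round-trip builds a new list object
print(snapshot)           # unaffected by the later mutation
print(rands)
```

So once the argument has been pickled, later changes to the original list should not be visible to the worker.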
I'm trying to make sense of the following results:
    <0> rands before submission: [17, 72, 97, 8, 32, 15, 63, 97, 57, 60]
    <1> rands before submission: [97, 15, 97, 32, 60, 17, 57, 72, 8, 63]
    <2> rands before submission: [15, 57, 63, 17, 97, 97, 8, 32, 60, 72]
    <3> rands before submission: [32, 97, 63, 72, 17, 57, 97, 8, 15, 60]
    in function 0 [97, 15, 97, 32, 60, 17, 57, 72, 8, 63]
    in function 1 [97, 32, 17, 15, 57, 97, 63, 72, 60, 8]
    in function 2 [97, 32, 17, 15, 57, 97, 63, 72, 60, 8]
    in function 3 [97, 32, 17, 15, 57, 97, 63, 72, 60, 8]

Here's the code:
    from __future__ import print_function
    import time
    import random
    try:
        from concurrent import futures
    except ImportError:
        import futures


    def work_with_rands(i, rands):
        print('in function', i, rands)


    def main():
        random.seed(1)
        rands = [random.randrange(100) for _ in range(10)]

        # sequence 1 and sequence 2 should give the same results but they don't
        # only difference is that one uses a copy of rands (i.e., rands.copy())
        # sequence 1
        with futures.ProcessPoolExecutor() as ex:
            for i in range(4):
                print("<{}> rands before submission: {}".format(i, rands))
                ex.submit(work_with_rands, i, rands)
                random.shuffle(rands)

        print('-' * 30)
        random.seed(1)
        rands = [random.randrange(100) for _ in range(10)]
        # sequence 2
        print("initial sequence: ", rands)
        with futures.ProcessPoolExecutor() as ex:
            for i in range(4):
                print("<{}> rands before submission: {}".format(i, rands))
                ex.submit(work_with_rands, i, rands[:])
                random.shuffle(rands)


    if __name__ == "__main__":
        main()

Where on earth is [97, 32, 17, 15, 57, 97, 63, 72, 60, 8] coming from? That's not even one of the sequences passed to submit.
The results differ slightly under Python 2.