
For a supervisor-like project, I use the threading library to manage some child processes. At some point, the user can enter commands to send instructions to the process-management thread. These commands are stored in a Queue object shared between the main thread and the process-management thread. I thought I'd need a mutex to solve concurrency issues, so I made a little script to try it out, but without a mutex first, to make sure I got the expected concurrency issue.

I expected the script to print a messy list of ints every second:

    import threading
    import time

    def longer(l, mutex=None):
        while 1:
            last_val = l[-1]
            l.append(last_val + 1)
            time.sleep(1)
        return

    dalist = [0]
    t = threading.Thread(target=longer, args=(dalist,))
    t.daemon = True
    t.start()

    while 1:
        last_val = dalist[-1]
        dalist.append(last_val + 1)
        print dalist
        time.sleep(1)

But in fact it prints a nice list of consecutive ints, like these:

    [0, 1, 2]
    [0, 1, 2, 3]
    [0, 1, 2, 3, 4, 5, 6]

From this answer in another post I thought it came from the threading library, so I tried the same thing with the multiprocessing lib:

    import multiprocessing as mp
    import time

    def longer(l, mutex=None):
        while 1:
            last_val = l[-1]
            l.append(last_val + 1)
            time.sleep(1)
        return

    dalist = [0]
    t = mp.Process(target=longer, args=(dalist,))
    t.start()

    while 1:
        last_val = dalist[-1]
        dalist.append(last_val + 1)
        print dalist
        time.sleep(1)

But I got the same result, just a bit 'slower':

    [0, 1]
    [0, 1, 2]
    [0, 1, 2, 3]
    [0, 1, 2, 3, 4]

So I wonder: do I really need a mutex to manage a Queue-like object shared between threads? And also, starting from one of the snippets above, how could I actually reproduce the concurrency issue I'm looking for?

Thanks for reading

Edit 1: Following the remarks of user4815162342, I changed the first snippet and managed to get some sort of race condition by moving the sleep call inside the longer function, between the value retrieval and the list append:

    import threading
    import time

    def longer(l, mutex=None):
        while 1:
            last_val = l[-1]
            time.sleep(1)
            l.append(last_val + 1)
        return

    dalist = [0]
    t = threading.Thread(target=longer, args=(dalist,))
    t.daemon = True
    t.start()

    while 1:
        last_val = dalist[-1]
        dalist.append(last_val + 1)
        print dalist
        time.sleep(1)

which gives me output like this:

    [0, 1]
    [0, 1, 1, 2]
    [0, 1, 1, 2, 2, 3]
    [0, 1, 1, 2, 2, 3, 3, 4]

and I managed to solve my artificial issue using a threading Lock, like this:

    import threading
    import time

    def longer(l, mutex=None):
        while 1:
            if mutex is not None:
                mutex.acquire()
            last_val = l[-1]
            time.sleep(1)
            l.append(last_val + 1)
            if mutex is not None:
                mutex.release()
        return

    dalist = [0]
    mutex = threading.Lock()
    t = threading.Thread(target=longer, args=(dalist, mutex))
    t.daemon = True
    t.start()

    while 1:
        if mutex is not None:
            mutex.acquire()
        last_val = dalist[-1]
        dalist.append(last_val + 1)
        if mutex is not None:
            mutex.release()
        print dalist
        time.sleep(1)

which then produces:

    [0, 1, 2]
    [0, 1, 2, 3, 4, 5]
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
  • Note that the idiomatic way to acquire and release a mutex would be using a with statement rather than explicit calls to acquire and release methods. Also, one rarely needs to pass the mutex to the function - a mutex protecting a global resource will reside in a global variable, and a mutex protecting an object-level resource will reside in the object and be reachable as self.some_attribute. Commented Oct 31, 2015 at 14:23
  • Thanks for the additional info :) I come from C programming, where I learned that global variables are a bad habit (security issues, code cleanliness, etc.). Do you know whether global variables are a special concern in Python? Commented Nov 2, 2015 at 12:58
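The with-statement idiom mentioned in the first comment can be sketched like this (a minimal, hypothetical rewrite of one iteration of the loop above, not from the original post; the names dalist and mutex are reused from the snippets):

```python
import threading

mutex = threading.Lock()
dalist = [0]

def step():
    # `with` acquires the lock on entry and releases it on exit,
    # even if the body raises an exception -- no explicit
    # acquire()/release() calls needed.
    with mutex:
        last_val = dalist[-1]
        dalist.append(last_val + 1)

step()
print(dalist)  # [0, 1]
```

Since mutex lives at module level here, it does not need to be passed as an argument, matching the comment's second point.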

1 Answer


Your first code snippet contains a race condition and does need a mutex. The global interpreter lock makes the race condition rare because only one thread runs Python code at any given time. However, every several bytecode instructions the current thread relinquishes ownership of the global interpreter lock to give other threads a chance to run. So, given your code:

    last_val = dalist[-1]
    dalist.append(last_val + 1)

If the thread switch happens after executing the first line, another thread will pick up the same last_val and append it to the list. When control returns to the initial thread, the value stored in last_val will be appended to the list a second time. A mutex would prevent the race in the obvious way: a context switch between the list read and the append would hand control to the other thread, but that thread would immediately block on the mutex and relinquish control back to the original thread.

Your second example only "works" because the two processes have separate list instances. Modifying one list doesn't affect the other, so the other process might as well not be running. Although multiprocessing has a drop-in replacement API for threading, the underlying concepts are vastly different, which needs to be accounted for when switching from one to the other.


2 Comments

Ok, so with the first snippet the race condition might be more obvious if I used more than one thread, I guess?
@shorty_ponton Yes, and smaller sleep times, etc. And, of course, inserting even a tiny sleep before dalist.append(...) should provoke it.
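The comment's suggestion can be sketched as follows (a hypothetical stress test, not from the original post; the thread and iteration counts are arbitrary). With several threads hammering the read-then-append pair and no sleep in between, lost updates become likely:

```python
import threading

dalist = [0]

def longer(n):
    for _ in range(n):
        last_val = dalist[-1]        # read ...
        dalist.append(last_val + 1)  # ... then write, non-atomically

threads = [threading.Thread(target=longer, args=(100000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The list always ends up with 4 * 100000 appends plus the initial 0,
# but the last value is usually below 400000: when two threads read the
# same last_val, both append the same successor, duplicating values.
print(len(dalist), dalist[-1])
```

The length is deterministic (each append adds exactly one element), while the final value reveals how many updates were lost to interleaving.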
