I have a subclass of Thread that I use across my project. In this class, I pass in the ContextVar manually. However, at times (once or twice a day), I notice that the ContextVar in the child thread is not set (reverted to a default value).

    class MyThread(Thread):
        def __init__(
            self,
            group: None = None,
            target: Callable[..., Any] | None = None,
            name: str | None = None,
            args: tuple[Any, ...] = (),
            kwargs: dict[str, Any] | None = None,
            *,
            daemon: bool | None = None,
        ):
            super().__init__(group=group, target=target, name=name, args=args, kwargs=kwargs, daemon=daemon)
            self.my_resource = get_resource_info()

        def run(self):
            self._exception = None
            try:
                set_my_resource_info(self.my_resource.name, self.my_resource.kind)
                self._return_value = super().run()
            except BaseException as e:
                self._exception = e

        def join(self, timeout: float | None = None):
            super().join(timeout)
            if self._exception:
                raise self._exception
            return self._return_value

And in another module I have:

    @dataclass
    class MyResourceInfo:
        name: str
        kind: str = "unknown"

    resource_info: ContextVar[MyResourceInfo] = ContextVar(
        'my_resource_info',
        default=MyResourceInfo(name=get_default_resource_name()),
    )

    def set_resource_info(name: str, kind: str = 'unknown') -> Token[MyResourceInfo]:
        return resource_info.set(MyResourceInfo(name=name, kind=kind))

Why does the context var revert to default value intermittently in child threads?

  • That is intriguing. Could you rearrange your code into a minimal reproducible example that triggers the problem? There is enough code to understand what the problem is, but it will be easier to answer if everything is assembled in a single file: the imports, a target for the threads, a repetition loop that launches enough threads to trigger the problem, and a data structure (e.g. a list) that shows it happening. I could run your code, but without a suitable body for the threads, I didn't see the problem taking place. Commented Oct 11, 2024 at 18:08
  • @jsbueno I am not sure it's easy to provide a minimal reproducible example. This module is part of a very complex Flask app. I can probably give you an approximate structure, but the call graph of nested threads it creates would probably be impossible to recreate. Commented Oct 15, 2024 at 14:45
  • Do you have an estimate of how often that happens? I have the feeling it could be part of a race condition when two new threads try to update a ContextVar concurrently. Commented Oct 15, 2024 at 14:51
  • Yes, that frequency is compatible with a race condition in Python's contextvars implementation itself. It can't reasonably be fixed (by the way, please state your exact Python version and platform; it is important at this point). For your problem, you will need a workaround: detect when the default value is in use, possibly right after trying to set it in .run itself, and retry the setting. Commented Oct 17, 2024 at 16:29
  • This is Python 3.10.14 running in a python:3.10-slim-bookworm Docker container. Commented Oct 17, 2024 at 21:12
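
The single-file layout asked for in the first comment could look roughly like the sketch below. It follows the question's pattern (capture the value in `__init__`, re-apply it in `run`) and adds nested threads plus a `failures` list. Every name that is not in the question is an assumption made for illustration: `DEFAULT_INFO` stands in for the default built from `get_default_resource_name()`, and `worker`, `failures`, and `reproduce` are hypothetical. It is not known to trigger the reported problem.

```python
import threading
from contextvars import ContextVar
from dataclasses import dataclass


@dataclass
class MyResourceInfo:
    name: str
    kind: str = "unknown"


# Assumption: stand-in for MyResourceInfo(name=get_default_resource_name()).
DEFAULT_INFO = MyResourceInfo(name="default")
resource_info: ContextVar[MyResourceInfo] = ContextVar(
    "my_resource_info", default=DEFAULT_INFO
)

# Hypothetical record of every thread that observed the default value.
# list.append is safe to call from several threads in CPython.
failures = []


class MyThread(threading.Thread):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Runs in the parent thread: capture the parent's current value.
        self.my_resource = resource_info.get()

    def run(self):
        # Runs in the child thread: re-apply the captured value.
        resource_info.set(self.my_resource)
        super().run()


def worker(depth: int) -> None:
    # The identity check works because get() returns the default object itself
    # when the variable has not been set in the current context.
    if resource_info.get() is DEFAULT_INFO:
        failures.append((threading.current_thread().name, depth))
    if depth > 0:
        # Nested thread, to approximate the "nested threads" in the real app.
        child = MyThread(target=worker, args=(depth - 1,))
        child.start()
        child.join()


def reproduce(n_threads: int = 200, depth: int = 2) -> list:
    failures.clear()
    token = resource_info.set(MyResourceInfo(name="request-1", kind="demo"))
    try:
        threads = [MyThread(target=worker, args=(depth,)) for _ in range(n_threads)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
    finally:
        resource_info.reset(token)
    return failures


if __name__ == "__main__":
    print(reproduce())
```

Any entry in the returned list would identify a thread, and its nesting depth, in which the variable read back as the default.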

1 Answer

I could not reproduce the problem.

Sorry. I already discussed what you can possibly do in the comments, but as this could reveal a serious bug in Python's contextvars implementation, I really tried to reproduce the problem.

I came up with the following script and ran it with different Python versions (3.10.7, 3.13.0, 3.13.0t) on Fedora Linux, and also with Python 3.10.15 in Docker (python:3.10-slim-bookworm, as indicated in the comments), spinning up from 1000 to 1_000_000 threads as fast as possible, with delays added at several different points and in several combinations. Not a single time did the ContextVar fail to be set.

(I also tried, as visible in the commented-out lines, setting the ContextVar in the regular target function, using the default threading.Thread class.)

contextkabum.py:

    import threading
    import contextvars
    import time
    import sys

    var = contextvars.ContextVar("var", default=0)
    errors = []
    delay = sys.getswitchinterval() * 3

    class T(threading.Thread):
        def run(self, *args):
            #time.sleep(delay)
            var.set(42)
            #time.sleep(delay)
            #time.sleep(0.001)
            return super().run(*args)

    def target():
        #var.set(42)
        #time.sleep(0.001)
        if (x:=var.get()) != 42:
            errors.append(y:=(threading.current_thread(), time.time(), x))
            print(y)
        time.sleep(0.001)

    def doit(n=1_000_000):
        threads = []
        for i in range(n):
            threads.append(T(target=target))
        for i, t in enumerate(threads):
            t.start()
            if not i % 30:
                pass
                #time.sleep(.01)
            if not i % 100:
                print(i)
        for t in threads:
            t.join()
        print(errors)

    doit()

Dockerfile:

    FROM python:3.10-slim-bookworm
    COPY contextkabum.py /root
    CMD python /root/contextkabum.py

Workaround:

As stated in the comments, for the purposes of your working environment, just add an explicit check (which would be redundant if the problem were entirely absent) and re-set the ContextVar in question:

        def run(self):
            self._exception = None
            try:
                set_my_resource_info(self.my_resource.name, self.my_resource.kind)
                if check_resource_is_default():
                    time.sleep(0.005)  # you can use "sys.getswitchinterval()" to not hardcode the 0.005 here
                    set_my_resource_info(self.my_resource.name, self.my_resource.kind)
                self._return_value = super().run()
            except BaseException as e:
                self._exception = e

    def check_resource_is_default():
        ...
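
The body of `check_resource_is_default` is left open above. One possible way to fill it in, as a minimal sketch only: keep a reference to the default object and compare by identity, since `ContextVar.get()` returns the default object itself when the variable has not been set in the current context. The names `_DEFAULT` and `set_with_retry` are hypothetical, and `_DEFAULT` stands in for the default built from `get_default_resource_name()` in the question. Whether retrying actually helps depends on the cause of the problem, which was never reproduced.

```python
import sys
import time
from contextvars import ContextVar
from dataclasses import dataclass


@dataclass
class MyResourceInfo:
    name: str
    kind: str = "unknown"


# Assumption: stand-in for MyResourceInfo(name=get_default_resource_name()).
_DEFAULT = MyResourceInfo(name="default-resource")
resource_info: ContextVar[MyResourceInfo] = ContextVar(
    "my_resource_info", default=_DEFAULT
)


def check_resource_is_default() -> bool:
    # An unset variable yields the very object passed as `default`,
    # so an identity test is enough to detect the "not set" state.
    return resource_info.get() is _DEFAULT


def set_with_retry(name: str, kind: str = "unknown", retries: int = 3) -> bool:
    """Hypothetical helper: set the variable, verify it, retry if it did not stick."""
    for _ in range(retries):
        resource_info.set(MyResourceInfo(name=name, kind=kind))
        if not check_resource_is_default():
            return True
        # Give other threads a chance to run before trying again.
        time.sleep(sys.getswitchinterval())
    return False
```

In `run`, the two `set_my_resource_info` calls and the check could then be replaced with a single `set_with_retry(self.my_resource.name, self.my_resource.kind)`, logging a warning when it returns `False`.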