
I see that the asyncio.to_thread() method was added in Python 3.9+; its description says it runs blocking code in a separate thread so it can run concurrently. See the example below:

import asyncio
import time

def blocking_io():
    print(f"start blocking_io at {time.strftime('%X')}")
    # Note that time.sleep() can be replaced with any blocking
    # IO-bound operation, such as file operations.
    time.sleep(1)
    print(f"blocking_io complete at {time.strftime('%X')}")

async def main():
    print(f"started main at {time.strftime('%X')}")
    await asyncio.gather(
        asyncio.to_thread(blocking_io),
        asyncio.sleep(1))
    print(f"finished main at {time.strftime('%X')}")

asyncio.run(main())

# Expected output:
#
# started main at 19:50:53
# start blocking_io at 19:50:53
# blocking_io complete at 19:50:54
# finished main at 19:50:54

From the description, it seems to use a thread mechanism rather than context switching between coroutines. Does this mean it is not actually async after all? Is it the same as traditional multithreading, as in concurrent.futures.ThreadPoolExecutor? What is the benefit of using a thread this way, then?

2 Answers


The source code of to_thread is quite simple. It boils down to awaiting run_in_executor with the default executor (the executor argument is None), which is a ThreadPoolExecutor.

In fact, yes, this is traditional multithreading. The code intended to run on a separate thread is not asynchronous, but to_thread allows you to await its result asynchronously.

Also note that the function runs in the context of the current task, so its context variable values will be available inside the func.

async def to_thread(func, /, *args, **kwargs):
    """Asynchronously run function *func* in a separate thread.

    Any *args and **kwargs supplied for this function are directly passed
    to *func*. Also, the current :class:`contextvars.Context` is propagated,
    allowing context variables from the main thread to be accessed in the
    separate thread.

    Return a coroutine that can be awaited to get the eventual result of *func*.
    """
    loop = events.get_running_loop()
    ctx = contextvars.copy_context()
    func_call = functools.partial(ctx.run, func, *args, **kwargs)
    return await loop.run_in_executor(None, func_call)
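A short, runnable sketch of the context-propagation behavior described above (the variable name is illustrative): a context variable set in the event-loop's task is visible inside the function that to_thread runs on a worker thread.

```python
import asyncio
import contextvars

# Hypothetical context variable, used only for this demonstration.
request_id = contextvars.ContextVar("request_id", default="unset")

def worker():
    # Runs in a worker thread, but sees the context copied by to_thread.
    return request_id.get()

async def main():
    request_id.set("abc-123")
    return await asyncio.to_thread(worker)

print(asyncio.run(main()))  # prints "abc-123", not "unset"
```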


@Cherrymelon reading/writing files should be done in threads in async code.
NOTE: The default executor has a bounded number of threads, because ThreadPoolExecutor defaults to min(32, os.cpu_count() + 4) workers. Thus, to_thread can become a bottleneck if you spawn many I/O-bound tasks at once. I would personally recommend defining your own to_thread that uses a separately created ThreadPoolExecutor for I/O-bound work.
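As a sketch of the commenter's suggestion (the function and pool names are my own, and the pool size is an illustrative choice, not a recommendation), a to_thread variant that runs on a dedicated executor instead of the loop's default one:

```python
import asyncio
import contextvars
import functools
import time
from concurrent.futures import ThreadPoolExecutor

# Dedicated pool for I/O-bound work, sized independently of CPU count.
_io_pool = ThreadPoolExecutor(max_workers=64, thread_name_prefix="io")

async def to_io_thread(func, /, *args, **kwargs):
    """Like asyncio.to_thread, but runs on our own executor."""
    loop = asyncio.get_running_loop()
    ctx = contextvars.copy_context()
    call = functools.partial(ctx.run, func, *args, **kwargs)
    return await loop.run_in_executor(_io_pool, call)

async def main():
    # time.sleep stands in for any blocking I/O call.
    results = await asyncio.gather(
        *(to_io_thread(time.sleep, 0.01) for _ in range(10)))
    return len(results)

print(asyncio.run(main()))  # 10
```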

You would use asyncio.to_thread whenever you need to call a blocking API from a third-party library that either does not have an asyncio adapter/interface, or when you do not want to create one because you are only using a limited number of functions from that library.


A concrete example:

I am currently writing an application that will eventually run as a daemon. At that point, it will use asyncio for its core event loop. The event loop will involve monitoring a Unix socket for notifications, which will trigger the daemon to take an action.

For rapid prototyping, it is currently a CLI. One of the dependencies/external systems the daemon will interact with is called libvirt, an abstraction layer for virtual machine management written in C, with a Python wrapper called libvirt-python.

The Python bindings are blocking and communicate with the libvirt daemon over a separate Unix socket using a blocking request-response protocol.

You can conceptually think of making a call to the libvirt bindings like making an HTTP request to a server and waiting for it to complete the action. The exact mechanics are not important for this discussion — just that it is a blocking I/O operation that depends on an external process and may take time. In other words, this is not a CPU-bound call and can be offloaded to a thread and awaited.

If I were to directly call:

domains = libvirt_conn.listAllDomains() 

in an async function, it would block my asyncio event loop until I got a response from libvirt.

So, if any events were received on the Unix socket that my main loop is monitoring, they would not be processed while we are waiting for the libvirt daemon to look up all domains and return a list of them back to us.

However, if I use:

domains = await asyncio.to_thread(libvirt_conn.listAllDomains) 

then the await call will suspend my current coroutine until we get the response, yielding execution back to the asyncio event loop. That means if the daemon receives a notification while we are waiting on libvirt, it can be scheduled to run concurrently instead of being blocked.


Another example:

In my application, I will also need to read and write to Linux special files in /sys. Linux has native AIO file support that can be used with asyncio via aiofiles, but it does not support the AIO interface for managing special files — so I have to use blocking I/O.

One way to do that in an async application would be to wrap the function that writes to the special files using asyncio.to_thread.

I could, and might, use a decorator to call run_in_executor directly since I own the write_sysfs function. But if I did not, then to_thread is more polite than monkeypatching someone else's library, and less work than creating my own wrapper API.
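One way that decorator approach might look (a sketch; write_sysfs and its signature are hypothetical, and I use asyncio.to_thread rather than run_in_executor for brevity):

```python
import asyncio
import functools

def run_in_thread(func):
    """Decorator that turns a blocking function into an awaitable one
    by offloading each call with asyncio.to_thread."""
    @functools.wraps(func)
    async def wrapper(*args, **kwargs):
        return await asyncio.to_thread(func, *args, **kwargs)
    return wrapper

@run_in_thread
def write_sysfs(path, value):
    # Hypothetical blocking write to a /sys special file.
    with open(path, "w") as f:
        f.write(value)

# Usage (inside an async function):
#     await write_sysfs("/sys/class/leds/led0/brightness", "1")
```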


Hopefully, those are useful examples of where you might want to use to_thread. It is really just a convenience function. You can use run_in_executor to do the same thing with some additional overhead.

If you need to support older Python releases, you might prefer run_in_executor since it predates the introduction of to_thread. But if you can assume Python 3.9+, it is a nice addition to leverage when you need to.
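For reference, the closest pre-3.9 equivalent is a sketch like the following (blocking_io is a stand-in for any blocking call; passing None selects the loop's default ThreadPoolExecutor):

```python
import asyncio
import functools
import time

def blocking_io(duration, label="io"):
    time.sleep(duration)
    return label

async def main():
    loop = asyncio.get_running_loop()
    # run_in_executor only forwards positional arguments, so keyword
    # arguments must be bound with functools.partial.
    result = await loop.run_in_executor(
        None,  # None selects the loop's default ThreadPoolExecutor
        functools.partial(blocking_io, 0.05, label="done"),
    )
    return result

print(asyncio.run(main()))  # "done"
```

Note that, unlike to_thread, plain run_in_executor does not copy the current contextvars.Context into the worker thread; you have to do that yourself if you need it.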
