85

What's the point of introducing async for and async with? I know there are PEPs for these statements, but they are clearly intended for language designers, not average users like me. A high-level rationale supplemented with examples would be greatly appreciated.

I did some research myself and found this answer:

The async for and async with statements are needed because you would break the yield from/await chain with the bare for and with statements.

The author didn't give an example of how the chain might be broken though, so I'm still confused. Furthermore, I notice that Python has async for and async with, but not async while and async try ... except. This sounds strange because for and with are just syntactic sugar for while and try ... except respectively. I mean, wouldn't async versions of the latter statements allow more flexibility, given that they are the building blocks of the former?

There is another answer discussing async for, but it only covers what it is not for, and doesn't say much about what it is for.

As a bonus, are async for and async with syntactic sugar? If they are, what are their verbose equivalent forms?

11 Comments

  • "for and with just syntax sugars for while and try ... except": Nope, far from it, they're each their own thing. Commented Apr 14, 2021 at 12:50
  • for and with invoke methods on the objects you put in, which are supposed to return certain values immediately. With async for and async with, these methods can be async, allowing them to do some non-blocking work. Commented Apr 14, 2021 at 12:52
  • @deceze Well, the official docs state that the with statement "is semantically equivalent to" try...except...finally. And you can easily implement a for loop with while and next. Maybe they are not syntactic sugar, but they are not that different either. Commented Apr 14, 2021 at 12:59
  • You need this new syntax, because where else would you put the await for async __enter__/__exit__/__iter__/__next__ if they're implicitly called by the "sugar" with/for statements? Commented Apr 14, 2021 at 13:07
  • If you want to put it this way, yes. for and with encapsulate protocols for specific patterns involving specific methods, which you can replicate "manually" with while and try..except..finally. But the point is exactly to make those patterns reusable instead of writing a ton of boilerplate every time. And since that boilerplate differs for async versions, you need specific async versions of them. Commented Apr 14, 2021 at 13:40

3 Answers

68

TLDR: for and with are non-trivial syntactic sugar that encapsulate several steps of calling related methods. This makes it impossible to manually add awaits between these steps – but properly usable async for/with need that. At the same time, this means it is vital to have async support for them.


Why we can't await nice things

Python's statements and expressions are backed by so-called protocols: When an object is used in some specific statement/expression, Python calls corresponding "special methods" on the object to allow customization. For example, x in [1, 2, 3] delegates to list.__contains__ to define what in actually means.
Most protocols are straightforward: There is one special method called for each statement/expression. If the only async feature we have is the primitive await, then we can still make all these "one special method" statements/expressions "async" by sprinkling await at the right place.
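
As a minimal sketch of the "one special method" case: subscription maps to a single call of `__getitem__`, so it can return an awaitable and the caller can await the whole expression. The class `RemoteTable` and its data are hypothetical names made up for this illustration. (The `in` operator itself is a poor candidate here, since Python converts the result of `__contains__` to a bool.)

```python
import asyncio


class RemoteTable:
    """Hypothetical stand-in for a store whose lookups need I/O."""

    def __init__(self, rows):
        self._rows = rows

    def __getitem__(self, key):
        # A plain special method that returns an awaitable (a coroutine object).
        return self._fetch(key)

    async def _fetch(self, key):
        await asyncio.sleep(0)  # pretend to wait for the network
        return self._rows[key]


async def main():
    table = RemoteTable({"answer": 42})
    # One special method, so there is exactly one place to put the await.
    return await table["answer"]


result = asyncio.run(main())
print(result)
```

There is only one step, so a single `await` in front of the expression covers it.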

In contrast, the for and with statements both correspond to multiple steps: for uses the iterator protocol to repeatedly fetch the __next__ item of an iterator, and with uses the context manager protocol to both enter and exit a context.
The important part is that both have more than one step that might need to be asynchronous. While we could manually sprinkle an await at one of these steps, we cannot hit all of them.

  • The easier case to look at is with: we can look at the __enter__ and __exit__ methods separately.

    We could naively define a synchronous context manager with asynchronous special methods. For entering, this actually works by adding an await strategically:

    with AsyncEnterContext() as acm:
        context = await acm
        print("I entered an async context and all I got was this lousy", context)

    However, it already breaks down if we use a single with statement for multiple contexts: We would first enter all contexts at once, then await all of them at once.

    with AsyncEnterContext() as acm1, AsyncEnterContext() as acm2:
        context1, context2 = await acm1, await acm2  # wrong! acm1 must be entered completely before loading acm2
        print("I entered many async contexts and all I got was a rules lawyer telling me I did it wrong!")

    Worse, there is just no single point where we could await exiting properly.

While it's true that for and with are syntactic sugar, they are non-trivial syntactic sugar: They make multiple actions nicer. As a result, one cannot naively await individual actions of them. Only a blanket async with and async for can cover every step.
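
To make the "every step" part concrete, here is a sketch of an asynchronous context manager next to a hand-written version of the same steps. The manual form loosely follows the expansion described in PEP 492 and is only an approximation; `AsyncResource` and the log strings are hypothetical names for illustration.

```python
import asyncio
import sys


class AsyncResource:
    """Hypothetical async context manager that logs each protocol step."""

    def __init__(self, log):
        self.log = log

    async def __aenter__(self):
        await asyncio.sleep(0)  # entering may suspend
        self.log.append("enter")
        return "resource"

    async def __aexit__(self, exc_type, exc, tb):
        await asyncio.sleep(0)  # exiting may suspend as well
        self.log.append("exit")
        return False  # do not swallow exceptions


async def with_statement(log):
    async with AsyncResource(log) as value:
        log.append(f"body:{value}")


async def by_hand(log):
    # Roughly what `async with` does, written out with explicit awaits.
    manager = AsyncResource(log)
    value = await type(manager).__aenter__(manager)
    try:
        log.append(f"body:{value}")
    except BaseException:
        if not await type(manager).__aexit__(manager, *sys.exc_info()):
            raise
    else:
        await type(manager).__aexit__(manager, None, None, None)


log_with, log_manual = [], []
asyncio.run(with_statement(log_with))
asyncio.run(by_hand(log_manual))
print(log_with)
print(log_manual)
```

Both functions record the same sequence of steps; the statement form simply hides the three separate awaits.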

Why we want to async nice things

Both for and with are abstractions: They fully encapsulate the idea of iteration/contextualisation.

Picking one of the two again, Python's for is the abstraction of internal iteration – for contrast, a while is the abstraction of external iteration. In short, that means the entire point of for is that the programmer does not have to know how iteration actually works.

  • Compare how one would iterate a list using for or while:
    some_list = list(range(20))
    index = 0                      # lists are indexed from 0
    while index < len(some_list):  # lists are indexed up to len-1
        print(some_list[index])    # lists are directly index'able
        index += 1                 # lists are evenly spaced

    for item in some_list:         # lists are iterable
        print(item)
    The external while iteration relies on knowledge about how lists work concretely: It pulls implementation details out of the iterable and puts them into the loop. In contrast, internal for iteration only relies on knowing that lists are iterable. It would work with any implementation of lists, and in fact any implementation of iterables.
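
A small sketch of that last claim: the same for loop works on a list, a generator, and a dict, while index-based access does not. The helper names `countdown` and `collect` are made up for this example.

```python
def countdown(start):
    """Generator: has no length and cannot be indexed."""
    while start > 0:
        yield start
        start -= 1


def collect(iterable):
    # The loop only relies on the iteration protocol, nothing else.
    items = []
    for item in iterable:
        items.append(item)
    return items


from_list = collect([3, 2, 1])
from_generator = collect(countdown(3))
from_dict_keys = collect({"a": 1, "b": 2})
print(from_list, from_generator, from_dict_keys)

try:
    countdown(3)[0]  # index-based access does not work on a generator
except TypeError as error:
    print("indexing failed:", error)
```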

Bottom line is the entire point of for – and with – is not to bother with implementation details. That includes having to know which steps we need to sprinkle with async. Only a blanket async with and async for can cover every step without us knowing which.

Why we need to async nice things

A valid question is why for and with get async variants, but others do not. There is a subtle point about for and with that is not obvious in daily usage: both represent concurrency – and concurrency is the domain of async.

Without going too much into detail, a handwavy explanation is the equivalence of handling routines (called with ()), iterables (for) and context managers (with). As has been established in the answer cited in the question, coroutines are actually a kind of generator. Obviously, generators are also iterables, and in fact we can express any iterable via a generator. The less obvious piece is that context managers are also equivalent to generators: most importantly, contextlib.contextmanager can translate generators to context managers.

To consistently handle all kinds of concurrency, we need async variants for routines (await), iterables (async for) and context managers (async with). Only a blanket async with and async for can cover every step consistently.
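
The generator/context manager link can be sketched with the standard library's `contextlib.contextmanager` and its async sibling `contextlib.asynccontextmanager`. The function names `managed` and `managed_async` are hypothetical; the code before the `yield` plays the role of entering, the code after it the role of exiting.

```python
import asyncio
from contextlib import asynccontextmanager, contextmanager


@contextmanager
def managed(log):
    log.append("enter")      # runs when the with block is entered
    try:
        yield "value"
    finally:
        log.append("exit")   # runs when the with block is left


@asynccontextmanager
async def managed_async(log):
    await asyncio.sleep(0)   # entering may suspend
    log.append("enter")
    try:
        yield "value"
    finally:
        await asyncio.sleep(0)  # exiting may suspend
        log.append("exit")


async def main(log):
    async with managed_async(log) as value:
        log.append(value)


sync_log, async_log = [], []
with managed(sync_log) as value:
    sync_log.append(value)
asyncio.run(main(async_log))
print(sync_log, async_log)
```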


13 Comments

@CharlieParker In your example for loop, only the body is async. In an async for, the iterable itself can be async – for example, it could fetch data from a remote database, waiting for each item until it arrives.
@CharlieParker Well, yes – you could manually unroll async for just like you can unroll a regular for to a while. Both are abstractions, not fundamental primitives. Similarly, you could "unroll" both async with and with using try: except:.
@CharlieParker That you assume for is about "just increasing indexes or counters" just underlines that it is an important abstraction for hiding details. Even simple nestings of higher-order iterators like map or the itertools are extremely complex in total. An async iterator can be simpler logically, because async for+event-loops switch deterministically whereas for+threads can arbitrarily interleave.
@CharlieParker Yes, that is basically correct. If you think of for x in y: as a while repeatedly running x = y.__next__(), you can similarly think of async for x in y: as a while repeatedly running x = await y.__anext__(). This allows suspending "inside" the async for while waiting for the async iterator to produce the next item.
But that seems bad to me. Wouldn't that mean that async for loops block? Wouldn't it be better to run a bunch of tasks in the loop and then await them later in a gather outside the loop? What is the use of async for loops if they block with the await keyword? They only provide a small benefit, in that perhaps we can run something else during the await, but beyond that it basically blocks. I feel I am missing something.
39
+100

async for and async with are a logical continuation of the development from lower to higher levels.

In the past, the for loop in a programming language used to be capable only of simple iterating over an array of values linearly indexed 0, 1, 2 ... max.

Python's for loop is a higher-level construct. It can iterate over anything supporting the iteration protocol, e.g. set elements or nodes in a tree - none of them has items numbered 0, 1, 2, ... etc.

The core of the iteration protocol is the __next__ special method. Each successive call returns the next item (which may be a computed value or retrieved data) or signals the end of iteration.
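
As a rough sketch (not the exact interpreter behaviour), a plain for loop can be spelled out by hand with `iter`, `next` and `StopIteration`. The function and variable names are made up for this example.

```python
def loop_with_for(iterable):
    seen = []
    for item in iterable:
        seen.append(item)
    return seen


def loop_by_hand(iterable):
    # Roughly what `for` does behind the scenes.
    seen = []
    iterator = iter(iterable)          # calls iterable.__iter__()
    while True:
        try:
            item = next(iterator)      # calls iterator.__next__()
        except StopIteration:          # the "end of iteration" signal
            break
        seen.append(item)
    return seen


tree_nodes = {"root", "left", "right"}  # a set has no positions 0, 1, 2, ...
assert loop_with_for(tree_nodes) == loop_by_hand(tree_nodes)
```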

The async for is the asynchronous counterpart: instead of calling the regular __next__, it awaits the asynchronous __anext__, and everything else remains the same. That allows using common idioms in async programs:

# 1. print lines of text stored in a file
for line in regular_file:
    print(line)

# 2A. print lines of text as they arrive over the network,
#
# The same idiom as above, but the asynchronous character makes
# it possible to execute other tasks while waiting for new data
async for line in tcp_stream:
    print(line)

# 2B: the same with a spawned command
async for line in running_subprocess.stdout:
    print(line)
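
The snippets above need a real stream to run. As a self-contained sketch, here is a hypothetical async iterator `Ticker` (a made-up class, standing in for something like a network stream), consumed once with async for and once with a hand-written loop that awaits `__anext__` directly. The manual loop is an approximation of what the statement does.

```python
import asyncio


class Ticker:
    """Hypothetical async iterator yielding 0..count-1, pausing before each item."""

    def __init__(self, count):
        self.count = count
        self.current = 0

    def __aiter__(self):
        return self

    async def __anext__(self):
        if self.current >= self.count:
            raise StopAsyncIteration   # async counterpart of StopIteration
        await asyncio.sleep(0)         # other tasks may run while we wait
        self.current += 1
        return self.current - 1


async def with_async_for():
    seen = []
    async for item in Ticker(3):
        seen.append(item)
    return seen


async def by_hand():
    # Roughly what `async for` does: await __anext__ until it signals the end.
    seen = []
    iterator = Ticker(3).__aiter__()
    while True:
        try:
            item = await iterator.__anext__()
        except StopAsyncIteration:
            break
        seen.append(item)
    return seen


results_for = asyncio.run(with_async_for())
results_manual = asyncio.run(by_hand())
print(results_for, results_manual)
```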

The situation with async with is similar. To summarize: the try .. finally construct was replaced by the more convenient with block, now considered idiomatic, which can communicate with anything supporting the context manager protocol through its __enter__ and __exit__ methods for entering and exiting the block. Naturally, everything formerly used in a try .. finally was rewritten to become a context manager (locks, pairs of open-close calls, etc.).

async with is again a counterpart with asynchronous __aenter__ and __aexit__ special methods. Other tasks may run while the asynchronous code for entering or exiting a with block waits for new data or a lock or some other condition to become fulfilled.
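
A small sketch of the lock case using `asyncio.Lock`, which supports async with. The `worker` function and the log messages are hypothetical; the point is that the second worker starts and waits at its `async with` while the first one still holds the lock.

```python
import asyncio


async def worker(name, lock, log):
    log.append(f"{name} waiting")
    async with lock:                 # suspends here until the lock is free
        log.append(f"{name} acquired")
        await asyncio.sleep(0.01)    # hold the lock for a moment
    log.append(f"{name} released")


async def main():
    lock = asyncio.Lock()
    log = []
    await asyncio.gather(worker("A", lock, log), worker("B", lock, log))
    return log


log = asyncio.run(main())
print(log)
```

While one worker waits for the lock, the event loop is free to run the other one; with a blocking lock in a plain with statement, the whole thread would be stuck instead.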

Note: unlike with for, it used to be possible to use asynchronous objects with the plain (not async) with statement, as in with await lock:. That form is deprecated or unsupported now (note that it was not an exact equivalent of async with).

12 Comments

Heads up that with await lock: could still be used, but it's something else than async with lock:. It means the object producing a context manager is async, not that the context manager itself is async.
Basically, it sounds like the async for syntax just makes sure the for loop works properly with async code, since it's non-trivial to implement (that's my guess). So the for loop works just as normal but now allows the await keyword to be used. Is this more or less right?
@CharlieParker I would say the implementation of an async for loop is at the same difficulty level as the plain for. The difference is that it loops over iterables that are working asynchronously in their internal implementation. In other words, you have to use the proper for to match the type of iterable. There are very few async iterables compared to regular iterables, which are literally everywhere. That makes occurrences of async for in code quite rare.
@VPfB thanks for your message! Let me reiterate just to make sure I understood. So the async for is essentially a generator getting things from I/O in an async manner, so each time something is ready (crucially, in the right order) it returns the next thing. Is that right? So async for does not only allow the keyword await to be used inside its body, but also lets the iterator get the next item in an asynchronous manner while respecting the order of the iterator. Is that right?
@CharlieParker: Re 1): Yes, that is exactly the main reason for async iteration, just a tiny note: a better term than "expensive" would be "I/O bound" (en.wikipedia.org/wiki/I/O_bound). Re 2): Well, it could be the case, as in those examples where the iterator assembles entire lines (or other data units), but in general it is not a main characteristic of async iteration. A plain iterator reading lines from a disk file does almost the same; the difference is that local file I/O is non-blocking and usually pretty fast, so we can consider the result immediately available.
6

My understanding of async with is that it allows Python to use await inside the context manager without raising errors. Removing the async from the with results in errors. This is useful because the object created is most likely going to do expensive I/O operations we have to wait for, so we will likely await methods of the object created by this async context manager. Without it, opening and closing the context manager correctly would likely cause problems within Python (otherwise why bother Python users with even more nuanced syntax and semantics to learn?).

I have not fully tested what async for does or the intricacies of it but would love to see an example and might later test it once I need it and update this answer. I will put the example here once I get to it: https://github.com/brando90/ultimate-utils/blob/master/tutorials_for_myself/concurrency/asyncio_for.py

For now, see my annotated example with async with (the script lives at https://github.com/brando90/ultimate-utils/blob/master/tutorials_for_myself/concurrency/asyncio_my_example.py):

"""
1. https://realpython.com/async-io-python/#the-asyncawait-syntax-and-native-coroutines
2. https://realpython.com/python-concurrency/
3. https://stackoverflow.com/questions/67092070/why-do-we-need-async-for-and-async-with

todo - async with, async for.

todo: meaning of:
    - The async for and async with statements are only needed to the extent that using plain for or with would “break”
        the nature of await in the coroutine. This distinction between asynchronicity and concurrency is a key one to
        grasp
    - One exception to this that you’ll see in the next code is the async with statement, which creates a context
        manager from an object you would normally await. While the semantics are a little different, the idea is the
        same: to flag this context manager as something that can get swapped out.
    - download_site() at the top is almost identical to the threading version with the exception of the async keyword
        on the function definition line and the async with keywords when you actually call session.get(). You’ll see
        later why Session can be passed in here rather than using thread-local storage.
    - An asynchronous context manager is a context manager that is able to suspend execution in its enter and exit
        methods.
"""

import asyncio
from asyncio import Task

import time

import aiohttp
from aiohttp.client_reqrep import ClientResponse

from typing import Coroutine


async def download_site(coroutine_name: str, session: aiohttp.ClientSession, url: str) -> ClientResponse:
    """
    Calls an expensive io (get data from a url) using the special session (awaitable) object. Note that not all
    objects are awaitable.
    """
    # - the with statement is bad here in my opinion since async with is already mysterious and it's being used twice
    # async with session.get(url) as response:
    #     print("Read {0} from {1}".format(response.content_length, url))
    # - this won't work since it only creates the coroutine. It **has** to be awaited. The trick to have it be (buggy)
    # synchronous is to have the main coroutine call each task we want in order instead of giving all the tasks we want
    # at once to the event loop e.g. with the asyncio.gather which gives all coroutines, gets the result in a list and
    # thus doesn't block!
    # response = session.get(url)
    # - right way to do async code is to have this await so someone else can run. Note, if the download_site/ parent
    # program is awaited in a for loop this won't work regardless.
    response = await session.get(url)
    print(f"Read {response.content_length} from {url} using {coroutine_name=}")
    return response


async def download_all_sites_not_actually_async_buggy(sites: list[str]) -> list[ClientResponse]:
    """
    Code to demo the non-async code. The code isn't truly asynchronous/concurrent because we are awaiting all the io
    calls (to the network) in the for loop. To avoid this issue, give the list of coroutines to a function that
    actually dispatches the io like asyncio.gather.

    My understanding is that async with allows the object given to be an awaitable object. This means that the object
    created is an object that does io calls so it might block so it's often the case we await it. Recall that when we
    run await f() f is either 1) coroutine that gains control (but might block code!) or 2) io call that takes a long
    time. But because of how python works after the await finishes the program expects the response to "actually be
    there". Thus, doing await blindly doesn't speed up the code. Do awaits on real io calls and call them with things
    that give it to the event loop (e.g. asyncio.gather).
    """
    # - create an awaitable object without having the context manager explode if it gives up execution.
    # - crucially, the session is an aiosession - so it is actually awaitable so we can actually give it to
    # - asyncio.gather and thus in the async code we truly take advantage of the concurrency of asynchronous programming
    async with aiohttp.ClientSession() as session:
    # with aiohttp.ClientSession() as session:  # won't work because there is an await inside this with
        tasks: list[Task] = []
        responses: list[ClientResponse] = []
        for i, url in enumerate(sites):
            task: Task = asyncio.ensure_future(download_site(f'coroutine{i}', session, url))
            tasks.append(task)
            response: ClientResponse = await session.get(url)
            responses.append(response)
        return responses


async def download_all_sites_truly_async(sites: list[str]) -> list[ClientResponse]:
    """
    Truly async program that creates a bunch of coroutines that download data from urls and then uses gather to
    have the event loop run it asynchronously (and thus efficiently). Note there is only one process though.
    """
    # - indicates that session is an async obj that will likely be awaited since it likely does an expensive io that
    # - waits so it wants to give control back to the event loop or other coroutines so they can do stuff while the
    # - io happens
    async with aiohttp.ClientSession() as session:
        tasks: list[Task] = []
        for i, url in enumerate(sites):
            task: Task = asyncio.ensure_future(download_site(f'coroutine{i}', session, url))
            tasks.append(task)
        responses: list[ClientResponse] = await asyncio.gather(*tasks, return_exceptions=True)
        return responses


if __name__ == "__main__":
    # - args
    sites = ["https://www.jython.org", "http://olympus.realpython.org/dice"] * 80
    start_time = time.time()

    # - run main async code
    # main_coroutine: Coroutine = download_all_sites_truly_async(sites)
    main_coroutine: Coroutine = download_all_sites_not_actually_async_buggy(sites)
    responses: list[ClientResponse] = asyncio.run(main_coroutine)

    # - print stats
    duration = time.time() - start_time
    print(f"Downloaded {len(sites)} sites in {duration} seconds")
    print('Success, done!\a')
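
The script above needs aiohttp and network access. As a dependency-free sketch of the same contrast (awaiting requests one after another versus handing them all to asyncio.gather), the following uses `asyncio.sleep` as a stand-in for the network. `fake_download`, the delay value, and the example URLs are assumptions made up for illustration, not part of the original script.

```python
import asyncio
import time


async def fake_download(url, delay=0.05):
    """Hypothetical stand-in for a network request."""
    await asyncio.sleep(delay)
    return f"body of {url}"


async def one_after_another(urls):
    # Each await finishes before the next request even starts.
    return [await fake_download(url) for url in urls]


async def all_at_once(urls):
    # gather schedules every coroutine, then waits for all of them together.
    return await asyncio.gather(*(fake_download(url) for url in urls))


def timed(coroutine):
    start = time.perf_counter()
    result = asyncio.run(coroutine)
    return result, time.perf_counter() - start


urls = [f"https://example.invalid/{i}" for i in range(5)]
sequential_result, sequential_time = timed(one_after_another(urls))
concurrent_result, concurrent_time = timed(all_at_once(urls))
print(f"sequential: {sequential_time:.2f}s, concurrent: {concurrent_time:.2f}s")
```

Both versions return the same data in the same order; only the total waiting time differs, because the sequential version adds up the delays while the gathered version overlaps them.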

2 Comments

I'm still sort of puzzled about the use of async for fors and withs. My understanding is that async def creates a coroutine, a function that can give up execution control to the caller. But in async for x in range(10), I don't understand why the async is needed, since I've written for loops that call awaits without issues, e.g. for i in range(num_steps): await asyncio.sleep(1). So I don't understand what I need the async on the for loop for. Can you clarify this?
Indeed, you don't need async for x in range(10); however, async for line in tcp_stream is concise and understandable.
