The following code has a memory leak, and I don't understand why there are still references to MyObj. run(1) and run(2) have finished, and the context has been cleared.

    import asyncio
    import gc
    from contextvars import ContextVar

    import objgraph

    ctx = ContextVar('ctx')


    class MyObj:
        def __init__(self, value):
            self.value = value


    async def run(value):
        ctx.set(MyObj(value))


    async def main():
        await asyncio.gather(run(1), run(2))
        gc.collect()
        print('# of MyObj=', objgraph.count('MyObj'))
        for obj in objgraph.by_type('MyObj'):
            print('MyObj.value=', obj.value)
        print('outer ctx=', ctx.get(None))
        objgraph.show_backrefs(objgraph.by_type('MyObj'), filename='ctx.png', max_depth=5)


    asyncio.run(main())

Output:
    # of MyObj= 2
    MyObj.value= 2
    MyObj.value= 1
    outer ctx= None

[Image: objgraph back-reference graph (ctx.png)]

Why are there still 'finished tasks' even though I manually called gc.collect()?

  • Actually, I doubt that the variables in the context were cleared, which is why you see references to them. My guess is that this is how ContextVar works, and that you should account for it by adding an additional coroutine wrapper around asyncio.gather. Commented May 22, 2024 at 19:16
  • Yes, we can see two Context objects in the diagram, owned by Task-3 and Task-2. The question is why Task-3 and Task-2 are not garbage collected. Commented May 22, 2024 at 19:25

1 Answer

Basically, in your case the finished tasks had not been cleaned up yet. You can see references to them in asyncio.tasks._all_tasks (a private attribute, so it may differ between Python versions):

    import asyncio
    from asyncio import tasks

    # run() is defined as in the question.
    async def main():
        task1, task2 = run(1), run(2)
        await asyncio.gather(task1, task2)
        futures = tasks._all_tasks  # private attribute
        for future in futures:
            print(future, future.done())

This code will output three tasks:

    <Task finished name='Task-2' coro=<run() done...> True
    <Task finished name='Task-3' coro=<run() done...> True
    <Task pending name='Task-1' coro=<main() running ...> False

So, because the tasks are still referenced in tasks._all_tasks, the context wasn't cleaned up. I guess this is one of the cases where Python's memory management is not very efficient. You can solve this problem in one of the following ways:

Solution 1
You can add a short asynchronous sleep after the gather call, which allows the garbage to be collected.

    async def main():
        await asyncio.gather(run(1), run(2))
        await asyncio.sleep(-0.01)
        gc.collect()
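To check the effect of the sleep without objgraph, here is a minimal sketch that tracks the MyObj instances through weak references. The names refs, alive and check are hypothetical helpers, not part of the original code, and the sketch uses a zero delay instead of -0.01 on the assumption that both simply yield to the event loop once. The counts it prints may depend on the Python version, so no expected output is given.

```python
import asyncio
import gc
import weakref
from contextvars import ContextVar

ctx = ContextVar("ctx")
refs = []  # hypothetical helper: weak references to every MyObj created


class MyObj:
    def __init__(self, value):
        self.value = value


async def run(value):
    obj = MyObj(value)
    refs.append(weakref.ref(obj))  # a weak reference does not keep obj alive
    ctx.set(obj)


def alive():
    """Count the MyObj instances that have not been freed yet."""
    return sum(1 for ref in refs if ref() is not None)


async def check():
    refs.clear()
    await asyncio.gather(run(1), run(2))
    gc.collect()
    before = alive()  # instances still alive right after gather()
    await asyncio.sleep(0)  # yield to the event loop once
    gc.collect()
    after = alive()  # instances still alive after the extra loop iteration
    return before, after


if __name__ == "__main__":
    before, after = asyncio.run(check())
    print("alive before sleep:", before, "alive after sleep:", after)
```

If the explanation above holds, the second number should be smaller than the first.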

Solution 2
Another option is to add an extra wrapper around asyncio.gather. This also allows the memory held by the context to be cleaned up properly.

    import asyncio
    import gc
    from contextvars import ContextVar

    import objgraph

    ctx = ContextVar('ctx')


    class MyObj:
        def __init__(self, value):
            self.value = value


    async def run(value):
        ctx.set(MyObj(value))


    async def both_run():
        task1 = asyncio.create_task(run(1))
        task2 = asyncio.create_task(run(2))
        await asyncio.gather(task1, task2)


    async def main():
        task = asyncio.create_task(both_run())
        await asyncio.gather(task)
        print('# of MyObj=', objgraph.count('MyObj'))
        for obj in objgraph.by_type('MyObj'):
            print('MyObj.value=', obj.value)
        print('outer ctx=', ctx.get(None))
        objgraph.show_backrefs(objgraph.by_type('MyObj'), filename='ctx.png', max_depth=5)


    asyncio.run(main())

5 Comments

Why do we need another layer of asyncio.create_task?
OK, maybe it is just not very efficient. Add a short wait before calling gc.collect(): await asyncio.gather(run(1), run(2)), then await asyncio.sleep(0.1), then gc.collect().
Kinda curious: why did you set the sleep time to a negative value?
Basically, this is the same as a zero sleep time. The task is moved to the end of the event loop's current queue.
Hi, I have updated my answer. Please check the new explanation.
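
The point made in the comments about negative delays can be checked with a small sketch. It assumes CPython's asyncio, where asyncio.sleep() with a delay of zero or less suspends the coroutine for one pass of the event loop; yield_order is a hypothetical helper name.

```python
import asyncio


async def yield_order(delay):
    """Record what runs while this coroutine sleeps for `delay` seconds."""
    order = []
    loop = asyncio.get_running_loop()
    # Queue a plain callback; it can only run once this coroutine yields.
    loop.call_soon(order.append, "queued callback")
    order.append("before sleep")
    await asyncio.sleep(delay)  # delay <= 0: give up control once, no timer
    order.append("after sleep")
    return order


if __name__ == "__main__":
    print(asyncio.run(yield_order(-0.01)))
    print(asyncio.run(yield_order(0)))
```

Both calls should report the queued callback between the two markers, which is consistent with the task being rescheduled behind whatever was already waiting in the loop.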
