
I am trying to request a bunch of URLs concurrently; the URLs are built from a list. Currently I am looping over the list and (I think) adding each request to the queue as I go. It is definitely 10x faster than requests.get, but I am not sure I am doing it correctly, so there may be room to optimize. I profiled it and noticed it is still blocking 90% of the time after the concurrent requests are done, i.e. start -> 10+ concurrent requests -> block for 5 seconds or so -> done.

Additionally, this code results in an Unclosed client session message at the end. Any idea why? I'm pretty sure I'm using a context manager properly.

I have searched and not found this exact question.

```python
import signal
import sys
import asyncio
import aiohttp
import json
import requests

lists = ['eth', 'btc', 'xmr', 'req', 'xlm', 'etc', 'omg', 'neo', 'btc', 'xmr', 'req', 'xlm', 'etc', 'omg', 'neo']

loop = asyncio.get_event_loop()
client = aiohttp.ClientSession(loop=loop)

async def fetch(client, url):
    async with client.get(url) as resp:
        assert resp.status == 200
        return await resp.text()

async def main(loop=loop, url=None):
    async with aiohttp.ClientSession(loop=loop) as client:
        html = await fetch(client, url)
        print(html)

def signal_handler(signal, frame):
    loop.stop()
    client.close()
    sys.exit(0)

signal.signal(signal.SIGINT, signal_handler)

tasks = []
for item in lists:
    url = "{url}/{endpoint}/{coin_name}".format(
        url='https://coincap.io',
        endpoint='page',
        coin_name=item.upper()
    )
    print(url)
    tasks.append(
        asyncio.ensure_future(main(url=url))
    )

loop.run_until_complete(asyncio.gather(*tasks))
```

1 Answer


Looks like what you have works, but as you suspected, you're not doing everything quite correctly:

  • you create a client which you never use and never close correctly (causing the Unclosed client session warning)
  • you're creating a client for each request, which is much less efficient than reusing a client.
  • you're not running most of your code inside a running event loop.
  • the signal handler as you have it is unnecessary; if you have long-running asyncio tasks you might want to use add_signal_handler instead.
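On the last point, here's a minimal sketch of add_signal_handler (Unix only; assumes Python 3.7+ for asyncio.run, and the SIGINT is self-sent purely to demonstrate the behavior):

```python
import asyncio
import os
import signal

async def worker():
    try:
        await asyncio.sleep(60)  # stand-in for a long-running request
        return "finished"
    except asyncio.CancelledError:
        return "cancelled"

async def main():
    loop = asyncio.get_running_loop()
    task = asyncio.ensure_future(worker())
    # On SIGINT, cancel the task instead of tearing down the process,
    # which lets the coroutine clean up and return normally
    loop.add_signal_handler(signal.SIGINT, task.cancel)
    # Simulate Ctrl-C shortly after startup (demonstration only)
    loop.call_later(0.1, os.kill, os.getpid(), signal.SIGINT)
    return await task

result = asyncio.run(main())
print(result)
```

Unlike signal.signal plus sys.exit, this keeps shutdown inside the event loop, so open sessions can be closed cleanly.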

Here's my simplified take on your code:

```python
import asyncio
import aiohttp

lists = ['eth', 'btc', 'xmr', 'req', 'xlm', 'etc', 'omg', 'neo', 'btc', 'xmr', 'req', 'xlm', 'etc', 'omg', 'neo']

async def fetch(client, item):
    url = 'https://coincap.io/{endpoint}/{coin_name}'.format(
        endpoint='page',
        coin_name=item.upper()
    )
    async with client.get(url) as resp:
        assert resp.status == 200
        html = await resp.text()
        print(html)

async def main():
    async with aiohttp.ClientSession() as client:
        await asyncio.gather(*[
            asyncio.ensure_future(fetch(client, item))
            for item in lists
        ])

loop = asyncio.get_event_loop()
loop.run_until_complete(main())
```

If you then want to process the html, you can either do it inside the fetch coroutine or operate on all the results from gather.
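As a sketch of the second option (with a hypothetical fetch_fake coroutine standing in for real network calls): gather returns its results in argument order, so they can be zipped back against the input list.

```python
import asyncio

async def fetch_fake(client, item):
    # stand-in for the real fetch(); pretend we downloaded something
    await asyncio.sleep(0.01)
    return item.upper()

async def main():
    lists = ['eth', 'btc', 'xmr']
    # gather() preserves the order of its arguments, so the results
    # line up one-to-one with the items that produced them
    pages = await asyncio.gather(*(fetch_fake(None, item) for item in lists))
    return dict(zip(lists, pages))

summary = asyncio.run(main())
print(summary)
```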


3 Comments

Great answer, just one minor nit: create_task() should be used in preference to ensure_future() if you know you have a coroutine object - rationale by Guido. And in this case neither is needed because asyncio.gather (and asyncio.wait etc.) will correctly handle being passed coroutine objects, or any other objects that can be converted to Future.
Yes, I meant to change it but forgot. In this case ensure_future will just call create_task, but it's still better to use create_task.
But here you don't need it - asyncio.gather(*(fetch(client, item) for item in lists)) should work just fine. gather is explicitly documented to accept coroutines or futures.
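To illustrate the point from the comments, a toy example (with a made-up double coroutine) showing that gather accepts bare coroutine objects, no ensure_future or create_task needed:

```python
import asyncio

async def double(n):
    await asyncio.sleep(0)
    return n * 2

async def main():
    # gather wraps each bare coroutine object in a Task itself
    return await asyncio.gather(*(double(n) for n in [1, 2, 3]))

results = asyncio.run(main())
print(results)
```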
