Return to Answer

added 204 characters in body

edited Jul 24, 2023 at 7:13

10.6k
4
54
51

The answers above are still using the old Python 3.4 style coroutines. Here is what you would write if you got Python 3.5+.

aiohttp supports http proxy now

import aiohttp import asyncio async def fetch(session, url): async with session.get(url) as response: return await response.text() async def main(): urls = [ 'http://python.org', 'https://google.com', 'http://yifei.me' ] tasks = [] async with aiohttp.ClientSession() as session: for url in urls: tasks.append(fetch(session, url)) htmls = await asyncio.gather(*tasks) for html in htmls: print(html[:100]) if __name__ == '__main__': loop = asyncio.get_event_loop() loop.run_until_complete(main())

There is also the httpx library, which is a drop in-in replacement for requests with async/await support. However, httpx is somewhat slower than aiohttp.

Another option is curl_cffi, which has the ability to impersonate browsers' ja3 and http2 fingerprints.

The answers above are still using the old Python 3.4 style coroutines. Here is what you would write if you got Python 3.5+.

aiohttp supports http proxy now

import aiohttp import asyncio async def fetch(session, url): async with session.get(url) as response: return await response.text() async def main(): urls = [ 'http://python.org', 'https://google.com', 'http://yifei.me' ] tasks = [] async with aiohttp.ClientSession() as session: for url in urls: tasks.append(fetch(session, url)) htmls = await asyncio.gather(*tasks) for html in htmls: print(html[:100]) if __name__ == '__main__': loop = asyncio.get_event_loop() loop.run_until_complete(main())

There is also the httpx library, which is a drop in replacement for requests with async/await support.

The answers above are still using the old Python 3.4 style coroutines. Here is what you would write if you got Python 3.5+.

aiohttp supports http proxy now

import aiohttp import asyncio async def fetch(session, url): async with session.get(url) as response: return await response.text() async def main(): urls = [ 'http://python.org', 'https://google.com', 'http://yifei.me' ] tasks = [] async with aiohttp.ClientSession() as session: for url in urls: tasks.append(fetch(session, url)) htmls = await asyncio.gather(*tasks) for html in htmls: print(html[:100]) if __name__ == '__main__': loop = asyncio.get_event_loop() loop.run_until_complete(main())

There is also the httpx library, which is a drop-in replacement for requests with async/await support. However, httpx is somewhat slower than aiohttp.

Another option is curl_cffi, which has the ability to impersonate browsers' ja3 and http2 fingerprints.

added 142 characters in body

Source Link

edited Jan 26, 2022 at 7:44

ospider

10.6k
4
54
51

The answers above are still using the old Python 3.4 style coroutines. Here is what you would write if you got Python 3.5+.

aiohttp supports http proxy now

import aiohttp import asyncio async def fetch(session, url): async with session.get(url) as response: return await response.text() async def main(): urls = [ 'http://python.org', 'https://google.com', 'http://yifei.me' ] tasks = [] async with aiohttp.ClientSession() as session: for url in urls: tasks.append(fetch(session, url)) htmls = await asyncio.gather(*tasks) for html in htmls: print(html[:100]) if __name__ == '__main__': loop = asyncio.get_event_loop() loop.run_until_complete(main())

There is also the httpx library, which is a drop in replacement for requests with async/await support.

The answers above are still using the old Python 3.4 style coroutines. Here is what you would write if you got Python 3.5+.

aiohttp supports http proxy now

import aiohttp import asyncio async def fetch(session, url): async with session.get(url) as response: return await response.text() async def main(): urls = [ 'http://python.org', 'https://google.com', 'http://yifei.me' ] tasks = [] async with aiohttp.ClientSession() as session: for url in urls: tasks.append(fetch(session, url)) htmls = await asyncio.gather(*tasks) for html in htmls: print(html[:100]) if __name__ == '__main__': loop = asyncio.get_event_loop() loop.run_until_complete(main())

The answers above are still using the old Python 3.4 style coroutines. Here is what you would write if you got Python 3.5+.

aiohttp supports http proxy now

import aiohttp import asyncio async def fetch(session, url): async with session.get(url) as response: return await response.text() async def main(): urls = [ 'http://python.org', 'https://google.com', 'http://yifei.me' ] tasks = [] async with aiohttp.ClientSession() as session: for url in urls: tasks.append(fetch(session, url)) htmls = await asyncio.gather(*tasks) for html in htmls: print(html[:100]) if __name__ == '__main__': loop = asyncio.get_event_loop() loop.run_until_complete(main())

There is also the httpx library, which is a drop in replacement for requests with async/await support.

updated for parallel download

Source Link

edited Jun 25, 2018 at 14:22

ospider

10.6k
4
54
51

The answers above are still using the old Python 3.4 style coroutines. Here is what you would write if you got Python 3.5+.

aiohttp supports http proxy now

import aiohttp import asyncio async def fetch(session, url): async with session.get(url) as response: return await response.text() async def main(): urls = [ 'http://python.org', 'https://google.com', 'http://yifei.me' ] tasks = [] async with aiohttp.ClientSession() as session: htmlfor =url awaitin urls:  tasks.append(fetch(session, 'http://pythonurl)) htmls = await asyncio.org'gather(*tasks) for html in htmls:   print(htmlhtml[:100]) if __name__ == '__main__': loop = asyncio.get_event_loop() loop.run_until_complete(main())

The answers above are still using the old Python 3.4 style coroutines. Here is what you would write if you got Python 3.5+.

aiohttp supports http proxy now

import aiohttp import asyncio async def fetch(session, url): async with session.get(url) as response: return await response.text() async def main(): async with aiohttp.ClientSession() as session: html = await fetch(session, 'http://python.org') print(html) if __name__ == '__main__': loop = asyncio.get_event_loop() loop.run_until_complete(main())

The answers above are still using the old Python 3.4 style coroutines. Here is what you would write if you got Python 3.5+.

aiohttp supports http proxy now

import aiohttp import asyncio async def fetch(session, url): async with session.get(url) as response: return await response.text() async def main(): urls = [ 'http://python.org', 'https://google.com', 'http://yifei.me' ] tasks = [] async with aiohttp.ClientSession() as session: for url in urls:  tasks.append(fetch(session, url)) htmls = await asyncio.gather(*tasks) for html in htmls:   print(html[:100]) if __name__ == '__main__': loop = asyncio.get_event_loop() loop.run_until_complete(main())

Source Link

answered May 13, 2018 at 5:17

ospider

10.6k
4
54
51

Collectives™ on Stack Overflow

Return to Answer