Skip to main content
added 204 characters in body
Source Link
ospider
  • 10.6k
  • 4
  • 54
  • 51

The answers above are still using the old Python 3.4 style coroutines. Here is what you would write if you got Python 3.5+.

aiohttp supports http proxy now

import aiohttp import asyncio async def fetch(session, url): async with session.get(url) as response: return await response.text() async def main(): urls = [ 'http://python.org', 'https://google.com', 'http://yifei.me' ] tasks = [] async with aiohttp.ClientSession() as session: for url in urls: tasks.append(fetch(session, url)) htmls = await asyncio.gather(*tasks) for html in htmls: print(html[:100]) if __name__ == '__main__': loop = asyncio.get_event_loop() loop.run_until_complete(main()) 

There is also the httpx library, which is a drop in-in replacement for requests with async/await support. However, httpx is somewhat slower than aiohttp.

Another option is curl_cffi, which has the ability to impersonate browsers' ja3 and http2 fingerprints.

The answers above are still using the old Python 3.4 style coroutines. Here is what you would write if you got Python 3.5+.

aiohttp supports http proxy now

import aiohttp import asyncio async def fetch(session, url): async with session.get(url) as response: return await response.text() async def main(): urls = [ 'http://python.org', 'https://google.com', 'http://yifei.me' ] tasks = [] async with aiohttp.ClientSession() as session: for url in urls: tasks.append(fetch(session, url)) htmls = await asyncio.gather(*tasks) for html in htmls: print(html[:100]) if __name__ == '__main__': loop = asyncio.get_event_loop() loop.run_until_complete(main()) 

There is also the httpx library, which is a drop in replacement for requests with async/await support.

The answers above are still using the old Python 3.4 style coroutines. Here is what you would write if you got Python 3.5+.

aiohttp supports http proxy now

import aiohttp import asyncio async def fetch(session, url): async with session.get(url) as response: return await response.text() async def main(): urls = [ 'http://python.org', 'https://google.com', 'http://yifei.me' ] tasks = [] async with aiohttp.ClientSession() as session: for url in urls: tasks.append(fetch(session, url)) htmls = await asyncio.gather(*tasks) for html in htmls: print(html[:100]) if __name__ == '__main__': loop = asyncio.get_event_loop() loop.run_until_complete(main()) 

There is also the httpx library, which is a drop-in replacement for requests with async/await support. However, httpx is somewhat slower than aiohttp.

Another option is curl_cffi, which has the ability to impersonate browsers' ja3 and http2 fingerprints.

added 142 characters in body
Source Link
ospider
  • 10.6k
  • 4
  • 54
  • 51

The answers above are still using the old Python 3.4 style coroutines. Here is what you would write if you got Python 3.5+.

aiohttp supports http proxy now

import aiohttp import asyncio async def fetch(session, url): async with session.get(url) as response: return await response.text() async def main(): urls = [ 'http://python.org', 'https://google.com', 'http://yifei.me' ] tasks = [] async with aiohttp.ClientSession() as session: for url in urls: tasks.append(fetch(session, url)) htmls = await asyncio.gather(*tasks) for html in htmls: print(html[:100]) if __name__ == '__main__': loop = asyncio.get_event_loop() loop.run_until_complete(main()) 

There is also the httpx library, which is a drop in replacement for requests with async/await support.

The answers above are still using the old Python 3.4 style coroutines. Here is what you would write if you got Python 3.5+.

aiohttp supports http proxy now

import aiohttp import asyncio async def fetch(session, url): async with session.get(url) as response: return await response.text() async def main(): urls = [ 'http://python.org', 'https://google.com', 'http://yifei.me' ] tasks = [] async with aiohttp.ClientSession() as session: for url in urls: tasks.append(fetch(session, url)) htmls = await asyncio.gather(*tasks) for html in htmls: print(html[:100]) if __name__ == '__main__': loop = asyncio.get_event_loop() loop.run_until_complete(main()) 

The answers above are still using the old Python 3.4 style coroutines. Here is what you would write if you got Python 3.5+.

aiohttp supports http proxy now

import aiohttp import asyncio async def fetch(session, url): async with session.get(url) as response: return await response.text() async def main(): urls = [ 'http://python.org', 'https://google.com', 'http://yifei.me' ] tasks = [] async with aiohttp.ClientSession() as session: for url in urls: tasks.append(fetch(session, url)) htmls = await asyncio.gather(*tasks) for html in htmls: print(html[:100]) if __name__ == '__main__': loop = asyncio.get_event_loop() loop.run_until_complete(main()) 

There is also the httpx library, which is a drop in replacement for requests with async/await support.

updated for parallel download
Source Link
ospider
  • 10.6k
  • 4
  • 54
  • 51

The answers above are still using the old Python 3.4 style coroutines. Here is what you would write if you got Python 3.5+.

aiohttp supports http proxy now

import aiohttp import asyncio async def fetch(session, url): async with session.get(url) as response: return await response.text() async def main(): urls = [ 'http://python.org', 'https://google.com', 'http://yifei.me' ] tasks = [] async with aiohttp.ClientSession() as session: htmlfor =url awaitin urls:  tasks.append(fetch(session, 'http://pythonurl)) htmls = await asyncio.org'gather(*tasks) for html in htmls:   print(htmlhtml[:100]) if __name__ == '__main__': loop = asyncio.get_event_loop() loop.run_until_complete(main()) 

The answers above are still using the old Python 3.4 style coroutines. Here is what you would write if you got Python 3.5+.

aiohttp supports http proxy now

import aiohttp import asyncio async def fetch(session, url): async with session.get(url) as response: return await response.text() async def main(): async with aiohttp.ClientSession() as session: html = await fetch(session, 'http://python.org') print(html) if __name__ == '__main__': loop = asyncio.get_event_loop() loop.run_until_complete(main()) 

The answers above are still using the old Python 3.4 style coroutines. Here is what you would write if you got Python 3.5+.

aiohttp supports http proxy now

import aiohttp import asyncio async def fetch(session, url): async with session.get(url) as response: return await response.text() async def main(): urls = [ 'http://python.org', 'https://google.com', 'http://yifei.me' ] tasks = [] async with aiohttp.ClientSession() as session: for url in urls:  tasks.append(fetch(session, url)) htmls = await asyncio.gather(*tasks) for html in htmls:   print(html[:100]) if __name__ == '__main__': loop = asyncio.get_event_loop() loop.run_until_complete(main()) 
Source Link
ospider
  • 10.6k
  • 4
  • 54
  • 51
Loading