
I have currently been working with my own "retry" function, where I would like to retry until the request works. There are some scenarios where, if I hit any 5xx status, I should retry with a long delay.

If I hit a specific status code, e.g. 200 or 404, it should not raise for the status; otherwise it should raise it.

So I have done something like this:

import time

import requests
from bs4 import BeautifulSoup
from requests import (
    RequestException,
    Timeout
)


def do_request():
    try:
        # There are some scenarios where I would use my own proxies by doing
        # requests.get("https://www.bbc.com/", timeout=0.1, proxies={'https': 'xxx.xxxx.xxx.xx'})
        while (response := requests.get("https://www.bbc.com/", timeout=0.1)).status_code >= 500:
            print("sleeping")
            time.sleep(20)

        if response.status_code not in (200, 404):
            response.raise_for_status()

        print("Successful requests!")
        soup = BeautifulSoup(response.text, 'html.parser')
        for link in soup.find_all("a", {"class": "media__link"}):
            yield link.get('href')

    except Timeout as err:
        print(f"Retry due to timed out: {err}")
    except RequestException as err:
        raise RequestException("Unexpected request error")


# ----------------------------------------------------#
if __name__ == '__main__':
    for found_links in do_request():
        print(found_links)

The problem for me now is that I have on purpose set the timeout to 0.1 to trigger the Timeout exception, and what I want to happen here is for it to retry the request again once it hits that exception.

Currently it just stops there, and I wonder what I should do to be able to retry the request again if it hits a timeout that I do not raise.

3 Answers


You can have the function call itself recursively in your case, although be careful about unexpected edge cases:

def do_request(retry: int = 3):
    try:
        # There are some scenarios where I would use my own proxies by doing
        # requests.get("https://www.bbc.com/", timeout=0.1, proxies={'https': 'xxx.xxxx.xxx.xx'})
        while (response := requests.get("https://www.bbc.com/", timeout=0.1)).status_code >= 500:
            print("sleeping")
            time.sleep(20)

        if response.status_code not in (200, 404):
            response.raise_for_status()

        print("Successful requests!")
        soup = BeautifulSoup(response.text, 'html.parser')
        for link in soup.find_all("a", {"class": "media__link"}):
            yield link.get('href')

    except Timeout as err:
        if retry:
            print(f"Retry due to timed out: {err}")
            yield from do_request(retry=retry - 1)
        else:
            raise
    except RequestException as err:
        raise RequestException("Unexpected request error")

This will attempt the request 3 times (or as many as you set in the parameter), until retry reaches 0 or another error is encountered.
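For example, a minimal way to consume the generator (assuming the imports from the question are in scope) could look like the sketch below; once retry reaches 0, the final raise propagates the Timeout to the caller, where it can be handled:

# Minimal usage sketch, assuming the imports from the question.
# If every attempt times out, the re-raised Timeout reaches the caller's
# for loop, where it can be caught.
if __name__ == '__main__':
    try:
        for found_link in do_request(retry=3):
            print(found_link)
    except Timeout:
        print("Gave up after exhausting all retries")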


4 Comments

Hello! Smart way! I just tested it and, sorry to say, it looks like it still stops at your if retry; it doesn't seem to retry again after the first round.
@ProtractorNewbie I've just noticed you're using yield to get values, so I've corrected for that and edited my answer; can you try again with the new version?
That did the job! :) Now, just out of curiosity, if I had more than just retry in the parameters for do_request, e.g. URL, I assume I would need to do something like yield from do_request(retry=retry - 1, url=URL) in that case?
Yes! It should work. Or you could use *args, **kwargs if you have many more parameters (but that's a subject for another question if you're not familiar with that); a rough sketch of that pattern is shown below.
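Not part of the original answer, but a rough illustration of the parameter-forwarding idea from the comment above might look like this (the url parameter and the request_kwargs name are hypothetical additions for illustration):

# Sketch of forwarding extra parameters through the recursive retry.
# url and **request_kwargs are illustrative; they are not in the original answer.
def do_request(url, retry: int = 3, **request_kwargs):
    try:
        while (response := requests.get(url, timeout=0.1, **request_kwargs)).status_code >= 500:
            print("sleeping")
            time.sleep(20)

        if response.status_code not in (200, 404):
            response.raise_for_status()

        soup = BeautifulSoup(response.text, 'html.parser')
        for link in soup.find_all("a", {"class": "media__link"}):
            yield link.get('href')
    except Timeout as err:
        if retry:
            print(f"Retry due to timed out: {err}")
            # Forward every parameter unchanged so the retried call is identical.
            yield from do_request(url, retry=retry - 1, **request_kwargs)
        else:
            raise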

I would put it in a while loop and break the loop when the action is achieved.

Sample:

def do_request():
    while True:
        try:
            # There are some scenarios where I would use my own proxies by doing
            # requests.get("https://www.bbc.com/", timeout=0.1, proxies={'https': 'xxx.xxxx.xxx.xx'})
            while (response := requests.get("https://www.bbc.com/", timeout=0.1)).status_code >= 500:
                print("sleeping")
                time.sleep(20)

            if response.status_code not in (200, 404):
                response.raise_for_status()

            print("Successful requests!")
            soup = BeautifulSoup(response.text, 'html.parser')
            for link in soup.find_all("a", {"class": "media__link"}):
                yield link.get('href')
            break
        except Timeout as err:
            print(f"Retry due to timed out: {err}")
        except RequestException as err:
            raise RequestException("Unexpected request error")

You can also add time.sleep(0.1) between each trial.
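For instance, a minimal sketch of that loop with a short pause between trials and a hypothetical cap on attempts (max_attempts is an illustrative addition, not part of the original answer) could look like:

# Sketch only: same while-loop approach, plus a pause between trials and an
# optional cap on how many timeouts to tolerate before giving up.
def do_request(max_attempts: int = 5):
    attempts = 0
    while True:
        try:
            response = requests.get("https://www.bbc.com/", timeout=0.1)
            if response.status_code >= 500:
                print("sleeping")
                time.sleep(20)
                continue
            if response.status_code not in (200, 404):
                response.raise_for_status()
            print("Successful requests!")
            soup = BeautifulSoup(response.text, 'html.parser')
            for link in soup.find_all("a", {"class": "media__link"}):
                yield link.get('href')
            break
        except Timeout as err:
            attempts += 1
            if attempts >= max_attempts:
                raise
            print(f"Retry due to timed out: {err}")
            time.sleep(0.1)  # brief pause between trials, as suggested above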

Comments


The package tenacity tackles all kinds of retrying problems gracefully.

For your problem, simply add a decorator like this:

from tenacity import retry, retry_if_exception_type


@retry(retry=retry_if_exception_type(Timeout))
def do_request():
    while (response := requests.get("https://www.bbc.com/", timeout=0.1)).status_code >= 500:
        print("sleeping")
        time.sleep(20)

    if response.status_code not in (200, 404):
        response.raise_for_status()

    print("Successful requests!")
    soup = BeautifulSoup(response.text, 'html.parser')
    for link in soup.find_all("a", {"class": "media__link"}):
        yield link.get('href')
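Not in the original answer, but tenacity can also bound the number of attempts and wait between them via stop and wait. One caveat worth noting, given the comments below: since do_request is a generator, calling the decorated function only creates the generator, and the Timeout is raised later while iterating, outside the wrapped call, so the decorator may never see it. A sketch under those assumptions, with the function rewritten to return a list instead of yielding:

from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_fixed


# Sketch only: returning a list (instead of yielding) means the request and the
# Timeout happen inside the decorated call, where tenacity can intercept them.
@retry(retry=retry_if_exception_type(Timeout),
       stop=stop_after_attempt(3),  # give up after 3 attempts
       wait=wait_fixed(2))          # wait 2 seconds between attempts
def do_request():
    while (response := requests.get("https://www.bbc.com/", timeout=0.1)).status_code >= 500:
        print("sleeping")
        time.sleep(20)

    if response.status_code not in (200, 404):
        response.raise_for_status()

    soup = BeautifulSoup(response.text, 'html.parser')
    return [link.get('href') for link in soup.find_all("a", {"class": "media__link"})]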

3 Comments

Hello! I just tested your code and it seems like nothing happens... It just exits when running this script. Have you checked on your end? Is it also possible to print out what kind of exception happened?
Just tested it again, throwing HTTPError instead of Timeout; it looks like it does not retry after all... Are you sure it is working? :)
I fixed an indent mistake. Does it work now?
