I am scraping a website which has 2 versions at the moment, and when you visit the site you never know which one you are going to get. For this reason I have had to set up two separate files to scrape it.

For the sake of simplicity I have a master file which controls the running of the two files:

    attempts = 0
    while attempts < 10:
        try:
            try:
                runfile('file1.py')
            except SomeException:
                runfile('file2.py')
            break
        except:
            attempts += 1

So basically this retries up to 10 times, until the version of the site that gets served matches the scraper file being run.

The problem with this is that the files launch a webdriver every time, so I can end up with several empty browsers clogging up the machine. Is there any command which can just close all webdriver instances? I cannot use driver.quit() because in the environment of this umbrella script, driver is not a recognized variable.

I also cannot use driver.quit() at the end of file1.py or file2.py because when file1.py encounters an error, it ceases to run and so the driver.quit() command will not be executed. I can't use a try / except because then my master file won't understand that there was an error in file1.py and thus won't run file2.py.

  • What do you mean driver isn't a recognized variable? You have to store the webdriver object somewhere. Why can't you call .quit() on that? Commented Mar 23, 2018 at 17:50
  • You can use driver.quit() in files file1.py and file2.py at the end (i.e. when the work of the webdriver is done). Commented Mar 23, 2018 at 17:52
  • @KeyurPotdar when file1.py encounters an error, it ceases to run and so the driver.quit() command will not be executed. I can't use a try / except because then my master file won't understand that there was an error in file1.py and thus won't run file2.py. Does this make sense? Commented Mar 24, 2018 at 13:56
  • Yes, it does make sense. You should add that part in the question; it'll help others to answer your question (seems important to solve the problem). Commented Mar 24, 2018 at 13:58

2 Answers

1

Handle the exception in individual runners, close the driver and raise a common exception that you then handle in the caller.

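In file1.py and file2.py:

    try:
        ...  # scraping routine
    except Exception:
        driver.quit()  # close the browser before reporting the failure
        raise  # re-raise so the master script still sees the error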

You can factor this out to the caller by initializing the driver in the caller, and passing the driver instance to functions instead of modules.
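A minimal sketch of that refactoring, under some assumptions: `FakeDriver` is a hypothetical stand-in for the Selenium webdriver so the example runs without a browser, and `WrongSiteVersion`, `scrape_v1`, `scrape_v2`, `run_once` and `scrape_with_retries` are illustrative names that do not appear in the question. The caller creates the driver, hands it to functions, and is the only place that quits it.

```python
class FakeDriver:
    """Hypothetical stand-in for a Selenium webdriver (assumption, not a real API)."""
    open_count = 0  # number of "browsers" currently open

    def __init__(self):
        FakeDriver.open_count += 1

    def quit(self):
        FakeDriver.open_count -= 1


class WrongSiteVersion(Exception):
    """Hypothetical error raised when a scraper meets the other site layout."""


def scrape_v1(driver):
    # Placeholder for the logic in file1.py; here it always fails.
    raise WrongSiteVersion("the other layout was served")


def scrape_v2(driver):
    # Placeholder for the logic in file2.py.
    return "scraped with v2"


def run_once(make_driver=FakeDriver):
    # The caller owns the driver, so it can always close it.
    driver = make_driver()
    try:
        try:
            return scrape_v1(driver)
        except WrongSiteVersion:
            return scrape_v2(driver)
    finally:
        driver.quit()


def scrape_with_retries(max_attempts=10):
    # Same retry idea as the master file, with one driver per attempt.
    for _ in range(max_attempts):
        try:
            return run_once()
        except Exception:
            continue
    raise RuntimeError("all attempts failed")


result = scrape_with_retries()
```

With a real driver you would pass a factory such as `webdriver.Chrome` as `make_driver`; the scraper functions then never need to know how the browser is opened or closed.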

1 Comment

I like that, didn't realise you could manually raise an exception. Thanks
1

You can have a try..finally block in runfile.

    def runfile(filename):
        driver = ...
        try:
            ...
        finally:
            # close the webdriver
            driver.quit()
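Note that a `finally` block does not swallow the error: the cleanup runs and the exception still reaches the caller, so the master file can still react to it. The sketch below shows this with the `try`/`finally` wrapped in a context manager. It assumes a hypothetical `FakeDriver` in place of the real webdriver, and `managed_driver` and `scrape` are illustrative names, not part of Selenium.

```python
from contextlib import contextmanager


class FakeDriver:
    """Hypothetical stand-in for a webdriver; only records whether quit() ran."""

    def __init__(self):
        self.closed = False

    def quit(self):
        self.closed = True


@contextmanager
def managed_driver(factory=FakeDriver):
    driver = factory()
    try:
        yield driver
    finally:
        # Runs on success and on error alike; the error keeps propagating.
        driver.quit()


def scrape(driver):
    # Placeholder for the real scraping routine; fails on purpose.
    raise ValueError("unexpected page layout")


seen = []
caught = None
try:
    with managed_driver() as driver:
        seen.append(driver)
        scrape(driver)
except ValueError as exc:
    # The caller still sees the failure, after the driver was closed.
    caught = exc
```

Here `seen[0].closed` ends up `True` and `caught` holds the `ValueError`, so the browser is gone and the caller is still free to try the other scraper.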
