I am scraping a website which has 2 versions at the moment, and when you visit the site you never know which one you are going to get. For this reason I have had to set up two separate files to scrape it.
For the sake of simplicity I have a master file which controls the running of the two files:

    attempts = 0
    while attempts < 10:
        try:
            try:
                runfile('file1.py')
            except SomeException:
                runfile('file2.py')
            break
        except:
            attempts += 1

So basically this keeps trying a maximum of 10 times until the correct version of the site meets the correct scraper file.
The problem with this is that the files launch a webdriver every time, so I can end up with several empty browsers clogging up the machine. Is there any command which can just close all webdriver instances? I cannot use driver.quit() because in the environment of this umbrella script, driver is not a recognized variable.
I also cannot use driver.quit() at the end of file1.py or file2.py because when file1.py encounters an error, it ceases to run and so the driver.quit() command will not be executed. I can't use a try / except because then my master file won't understand that there was an error in file1.py and thus won't run file2.py.
driver isn't a recognized variable? You have to store the webdriver object somewhere. Why can't you call .quit() on that?
driver.quit() in files file1.py and file2.py at the end (i.e. when the work of the webdriver is done).
file1.py encounters an error, it ceases to run and so the driver.quit() command will not be executed. I can't use a try / except because then my master file won't understand that there was an error in file1.py and thus won't run file2.py. Does this make sense?
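One possible way to reconcile the two constraints raised in the comments is a try / finally block. Unlike try / except, finally runs the cleanup and then lets the original exception continue upward, so the master script still sees the failure and can fall back to the second scraper. The sketch below is only an illustration under assumptions: FakeDriver stands in for a real webdriver object so the example runs without a browser, and LayoutMismatch, run_scraper, scrape_v1, scrape_v2 and run_with_fallback are hypothetical names, not anything from the original files.

```python
class FakeDriver:
    """Stand-in for a webdriver object (assumption); only quit() matters here."""

    def __init__(self):
        self.closed = False

    def quit(self):
        self.closed = True


class LayoutMismatch(Exception):
    """Hypothetical error raised when a scraper meets the wrong site version."""


def run_scraper(make_driver, scrape):
    """Open a driver, run one scraper, and always close the driver."""
    driver = make_driver()
    try:
        return scrape(driver)
    finally:
        # Runs on success and on failure; any exception still propagates.
        driver.quit()


def scrape_v1(driver):
    # Simulates file1.py hitting the other version of the site.
    raise LayoutMismatch("site served version 2")


def scrape_v2(driver):
    # Simulates file2.py succeeding.
    return "scraped with version 2 logic"


def run_with_fallback(make_driver, max_attempts=10):
    """Mirror the master loop: try scraper 1, fall back to scraper 2, retry."""
    for _ in range(max_attempts):
        try:
            try:
                return run_scraper(make_driver, scrape_v1)
            except LayoutMismatch:
                return run_scraper(make_driver, scrape_v2)
        except Exception:
            continue  # both scrapers failed this round; try again
    return None


opened = []  # track every driver created, to check none is left open


def make_driver():
    driver = FakeDriver()
    opened.append(driver)
    return driver


result = run_with_fallback(make_driver)
print(result, len(opened), all(d.closed for d in opened))
```

With a real browser, make_driver would return the actual webdriver object instead of FakeDriver. The same try / finally pattern could also be placed directly inside file1.py and file2.py around the scraping code, so each file closes its own browser before the error reaches the master file.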