22

When my selenium program crashes due to some error, it seems to leave behind running processes.

For example, here is my process list:

carol 30186 0.0 0.0 103576 7196 pts/11 Sl 00:45 0:00 /home/carol/test/chromedriver --port=51789
carol 30322 0.0 0.0 102552 7160 pts/11 Sl 00:45 0:00 /home/carol/test/chromedriver --port=33409
carol 30543 0.0 0.0 102552 7104 pts/11 Sl 00:48 0:00 /home/carol/test/chromedriver --port=42567
carol 30698 0.0 0.0 102552 7236 pts/11 Sl 00:50 0:00 /home/carol/test/chromedriver --port=46590
carol 30938 0.0 0.0 102552 7496 pts/11 Sl 00:55 0:00 /home/carol/test/chromedriver --port=51930
carol 31546 0.0 0.0 102552 7376 pts/11 Sl 01:16 0:00 /home/carol/test/chromedriver --port=53077
carol 31549 0.5 0.0 0 0 pts/11 Z 01:16 0:03 [chrome] <defunct>
carol 31738 0.0 0.0 102552 7388 pts/11 Sl 01:17 0:00 /home/carol/test/chromedriver --port=55414
carol 31741 0.3 0.0 0 0 pts/11 Z 01:17 0:02 [chrome] <defunct>
carol 31903 0.0 0.0 102552 7368 pts/11 Sl 01:19 0:00 /home/carol/test/chromedriver --port=54205
carol 31906 0.6 0.0 0 0 pts/11 Z 01:19 0:03 [chrome] <defunct>
carol 32083 0.0 0.0 102552 7292 pts/11 Sl 01:20 0:00 /home/carol/test/chromedriver --port=39083
carol 32440 0.0 0.0 102552 7412 pts/11 Sl 01:24 0:00 /home/carol/test/chromedriver --port=34326
carol 32443 1.7 0.0 0 0 pts/11 Z 01:24 0:03 [chrome] <defunct>
carol 32691 0.1 0.0 102552 7360 pts/11 Sl 01:26 0:00 /home/carol/test/chromedriver --port=36369
carol 32695 2.8 0.0 0 0 pts/11 Z 01:26 0:02 [chrome] <defunct>

Here is my code:

from selenium import webdriver

browser = webdriver.Chrome("path/to/chromedriver")
browser.get("http://stackoverflow.com")
browser.find_element_by_id('...').click()
browser.close()

Sometimes, the browser doesn't load the webpage elements quickly enough so Selenium crashes when it tries to click on something it didn't find. Other times it works fine.

This is a simple example for simplicity's sake, but with a more complex Selenium program, what is a guaranteed clean way of exiting without leaving behind running processes? It should exit cleanly both on an unexpected crash and on a successful run.

9 Answers

22

As already pointed out, you should run browser.quit().

But on Linux (inside Docker) this will leave defunct processes. These are typically not a real problem, as each is merely an entry in the process table and consumes no other resources. But if you have many of them, you will run out of process IDs. Typically my server melts down at 65k processes.

It looks like this:

root@dockerhost1:~/odi/docker/bf1# ps -ef | grep -i defunct | wc -l
28599
root@dockerhost1:~/odi/docker/bf1# ps -ef | grep -i defunct | tail
root 32757 10839 0 Oct18 ? 00:00:00 [chrome] <defunct>
root 32758 895 0 Oct18 ? 00:00:02 [chrome] <defunct>
root 32759 15393 0 Oct18 ? 00:00:00 [chrome] <defunct>
root 32760 13849 0 01:23 ? 00:00:00 [chrome] <defunct>
root 32761 472 0 Oct18 ? 00:00:00 [chrome] <defunct>
root 32762 19360 0 01:35 ? 00:00:00 [chrome] <defunct>
root 32763 30701 0 00:34 ? 00:00:00 [chrome] <defunct>
root 32764 17556 0 Oct18 ? 00:00:00 [chrome] <defunct>
root 32766 8102 0 00:49 ? 00:00:00 [cat] <defunct>
root 32767 9490 0 Oct18 ? 00:00:00 [chrome] <defunct>

The following code will solve the problem:

import logging
import os

log = logging.getLogger(__name__)


def quit_driver_and_reap_children(driver):
    log.debug('Quitting session: %s', driver.session_id)
    driver.quit()
    try:
        while True:
            # With WNOHANG, waitpid returns (0, 0) when children exist
            # but none of them has exited yet (Wonka's fix, see comments).
            pid, status = os.waitpid(-1, os.WNOHANG)
            if pid == 0:
                break
            log.debug("Reaped child: %s (status %s)", pid, status)
    except ChildProcessError:
        # Raised when there are no child processes left to wait for.
        pass

3 Comments

When I tried this solution, I sometimes got a pid of (0, 0), so the loop never ended. I solved it with try-except code that checks "if pid[0] == 0:" and then sets the variable pid = False. It seems to work correctly. I can edit your answer to fix it if someone is interested.
In my case (Linux machine), os.waitpid returns a (pid, status) tuple every time, and when no child process is found the ChildProcessError exception is raised (the (0, 0) tuple is never returned). With this knowledge the code can be simplified and the bare 'except' can be avoided. But I'm not sure whether it works the same way on other platforms.
@eNca I don't think that's correct - ChildProcessError is thrown when there are no alive or zombie child processes at all, (0, 0) is returned when there are child processes, but they are still alive and have not exited yet
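
To make the three cases discussed in these comments concrete, here is a minimal sketch for POSIX systems. The function name observe_waitpid is made up for illustration, and the sketch assumes the calling process has no other children; otherwise the last step would not raise.

```python
import os
import time


def observe_waitpid():
    """Return what os.waitpid(-1, os.WNOHANG) reports in three situations."""
    observations = []

    child_pid = os.fork()
    if child_pid == 0:
        # Child: stay alive briefly, then exit without running parent cleanup.
        time.sleep(0.5)
        os._exit(0)

    # 1. A child exists but is still running: waitpid returns (0, 0).
    observations.append(os.waitpid(-1, os.WNOHANG))

    # 2. Poll until the child has exited: waitpid returns (child_pid, status).
    while True:
        pid, status = os.waitpid(-1, os.WNOHANG)
        if pid != 0:
            observations.append((pid == child_pid, status))
            break
        time.sleep(0.1)

    # 3. No children remain at all: waitpid raises ChildProcessError.
    try:
        os.waitpid(-1, os.WNOHANG)
        observations.append("unexpected")
    except ChildProcessError:
        observations.append("ChildProcessError")

    return observations
```

Under that assumption, a reaping loop needs to handle both the (0, 0) return value and the exception.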
14

I see this is a pretty old thread, but maybe my case will be useful to somebody. For various reasons I had to run a lot of scrapers, each request with a separate webdriver instance and a headful (non-headless) browser, in a Docker container with Xvfb. Every request produced 2-3 zombie processes with Firefox (and 12 with Chromedriver), so after a few minutes of scraping I had thousands of zombie processes. driver.close() and driver.quit() did not help. Jimmy Engelbrecht's solution is better, but it killed only some of the processes. The only method that worked for me was to enable init in the Docker container.

docker run --init container 

It protects you from software that accidentally creates zombie processes, which can (over time!) starve your entire system for PIDs (and make it unusable).


13

What's happening is that your code is throwing an exception, which stops the Python process from continuing. As a result, the close/quit methods never get called on the browser object, so the chromedrivers just hang around indefinitely.

You need to use a try/finally block to ensure the quit method is called every time, even when an exception is thrown. A very simplistic example is:

from selenium import webdriver

browser = webdriver.Chrome("path/to/chromedriver")
try:
    browser.get("http://stackoverflow.com")
    browser.find_element_by_id('...').click()
finally:
    # Runs on success and on failure, so chromedriver is always shut down
    browser.quit()  # I exclusively use quit

There are a number of much more sophisticated approaches you can take here, such as creating a context manager to use with the with statement, but it's difficult to recommend one without having a better understanding of your codebase.
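
As a rough illustration of the context-manager idea, here is a minimal sketch. The names managed_driver, FakeDriver and demo are hypothetical and not part of Selenium; FakeDriver only stands in for a real WebDriver so the pattern can be run without a browser.

```python
from contextlib import contextmanager


@contextmanager
def managed_driver(factory):
    """Create a driver via factory() and always call quit() on it."""
    driver = factory()
    try:
        yield driver
    finally:
        # Runs whether the with-body finished normally or raised.
        driver.quit()


class FakeDriver:
    """Stand-in for a WebDriver, used only to demonstrate the pattern."""

    def __init__(self):
        self.quit_called = False

    def quit(self):
        self.quit_called = True


def demo():
    """Simulate a crash inside the with-body and report whether quit() ran."""
    fake = FakeDriver()
    try:
        with managed_driver(lambda: fake) as browser:
            raise RuntimeError("simulated crash")
    except RuntimeError:
        pass
    return fake.quit_called
```

With a real browser, the factory would presumably be something like lambda: webdriver.Chrome("path/to/chromedriver"), and the Selenium calls would go inside the with block.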

3 Comments

What to do when the python process is killed by OS?
@hldev you have to watch for SIGTERM or other OS signals: stackoverflow.com/questions/18499497/…
or try/finally in newer Python releases :)
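
One common way to handle the SIGTERM case raised in these comments is to turn the signal into a normal Python exit so that finally blocks still run. This is only a sketch: the function names are made up for illustration, the handler must be installed from the main thread, and SIGKILL cannot be caught this way.

```python
import signal
import sys


def _exit_on_signal(signum, frame):
    # sys.exit raises SystemExit, which unwinds the stack and runs finally blocks.
    sys.exit(128 + signum)


def install_sigterm_handler():
    """Turn SIGTERM into SystemExit. Must be called from the main thread."""
    signal.signal(signal.SIGTERM, _exit_on_signal)
```

After calling install_sigterm_handler() once at startup, a try/finally around the Selenium code gets a chance to call quit() when the process is terminated with SIGTERM.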
3

Chromedriver.exe crowds the Task Manager (in the case of Windows) every time Selenium runs on Chrome. Sometimes it doesn't clear even if the browser didn't crash.

I usually run a bat file or a cmd to kill all the existing chromedriver.exe processes before launching another one.

Take a look at this: release Selenium chromedriver.exe from memory

I know this is a Unix-related question but I am sure the way it has been handled in Windows can be applied there.

2 Comments

The question appears to be *nix related, given the example process output.
What if the program is multi-threaded? If we kill all the webdriver processes before running a new one, webdrivers belonging to other threads that are working fine might be killed too. Do you have any solution for this case?
1

I don't think that's a problem for the OP but it may help someone else landing here with a similar problem: deleting the arg no-sandbox fixed the issue for me (source)

C#:

public static string GetPageHeadless(string url, out string redirectedUrl)
{
    var options = new ChromeOptions();
    options.AddArguments(new List<string>() { "headless", "disable-gpu" });
    using var service = ChromeDriverService.CreateDefaultService();
    service.HideCommandPromptWindow = true;
    using var browser = new ChromeDriver(service, options);
    browser.Navigate().GoToUrl(url);
    var html = browser.ExecuteScript("return document.body.parentElement.outerHTML");
    redirectedUrl = browser.Url;
    browser.Quit();
    return html.ToString();
}


1

Using dumb-init fixed the issue for me.

I am using Chrome and Selenium inside an AMD64 Docker container to run a repetitive test that starts the browser, navigates to a specific site, and performs some actions. I can reproduce this issue only in Docker and not on my local system. The defunct processes left behind were child processes created by Chrome every time the web driver started; around 6 child processes (zygote, renderer, crash-handler, gpu-process, utility, etc.) were left behind with every web driver start and quit operation. Eventually these child processes caused issues that prevented the web driver from starting up, and the system required a restart to fix this.

The program was running inside Docker via ENTRYPOINT, in which case it runs as the init process, with a process ID (PID) of 1. Child processes spawned by the init process were not gracefully handling termination signals when we called driver.quit().

The dumb-init utility ensures our program is not run as the init process directly. After using this proxy program, defunct child processes of Chrome are no longer retained in my case.


0

I encountered the same problem running chromedriver in Docker: when quit() is called, chromedriver becomes a zombie process. I used dumb-init to solve my problem. I guess this problem is not specific to chromedriver; it is related to how Docker works: a container lacks some components of a full Linux system, so child processes are not handled correctly.

Add to the Dockerfile:

RUN wget https://github.com/Yelp/dumb-init/releases/download/v1.2.5/dumb-init_1.2.5_amd64.deb
RUN sudo dpkg -i dumb-init_*.deb
ENTRYPOINT ["/usr/bin/dumb-init", "--", "./entrypoint.sh"]

entrypoint.sh:

#!/bin/sh
echo "Using arguments: $*"
exec java -jar $JAR_NAME "$@"

ENTRYPOINT and exec are very important in Docker.


0

Sometimes driver.close() or driver.quit() still leaves behind zombie processes. You can kill those using taskkill like this:

import subprocess

subprocess.call("TASKKILL /f /IM CHROME.EXE /T")
subprocess.call("TASKKILL /f /IM CHROMEDRIVER.exe /T")

In cases where the image name is not conventional (in my case I used undetected_chromedriver), I handle it like this, since taskkill does not accept a wildcard at the start of the string:

import subprocess

# List running processes and keep only the lines that mention "chrome"
processes = subprocess.getoutput("tasklist /fo list | findstr \"chrome\"")
array = processes.split('\n')
for i in array:
    # Each matching line looks like "Image Name:   <name>"; the name is the third field
    image_name = list(filter(None, i.split(' ')))[2]
    subprocess.call(f"TASKKILL /f /IM {image_name} /T")
  • /f: force kill
  • /im: image name
  • /t: kill children too

More details are in the taskkill documentation.


-2

A Very Easy And Safe Way To Kill Selenium Zombies

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
import os
import psutil
from time import sleep


def killDriverZombies(driver):
    try:
        # Get the WebDriver process ID
        webdriver_pid = driver.service.process.pid
        # Get a psutil handle for the WebDriver process
        parent_process = psutil.Process(webdriver_pid)
        # Collect the PIDs of the WebDriver's children (the browser processes)
        browser_pids = [i.pid for i in parent_process.children()]
        # Send signal 9 (KILL) to the WebDriver process
        os.kill(parent_process.pid, 9)
        # Send signal 9 (KILL) to each child process
        for browser_pid in browser_pids:
            os.kill(browser_pid, 9)
        # Return True on success
        return True
    except Exception as e:
        # Handle exceptions, for example when the driver is already closed
        return [False, e]


driver = webdriver.Chrome(service=Service('./chromedriver.exe'))
driver.get("chrome://about")  # Works normally, as expected
sleep(6)  # Sleep 6 seconds before killing the driver
print(killDriverZombies(driver))
driver.get("chrome://version")  # Raises urllib3.exceptions.MaxRetryError: the WebDriver session and browser are gone

2 Comments

Your answer could be improved with additional supporting information. Please edit to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers in the help center.
Please avoid code only answer and provide an explanation. You also could have a look at how to answer.
