I have a bunch of commands that I start with a `start.sh` script, which stores their PIDs. Later, I want to stop them with a `stop.sh` script, run at the user's convenience.
Here is the trap:
- I run `start.sh` today, which stores the PIDs `15000`, `15001`, `15002` in a file.
- I forget to stop my processes, and a week later I reboot my computer.
- I now run the `stop.sh` script. It attempts to kill the tasks with PIDs `15000`, `15001`, `15002`, reading them blindly from the file.

=> If any tasks happen to have these PIDs on my rebooted system, they are no longer the ones started by my `start.sh` script, and killing them will put my system into an unknown state.
When I first capture the PID of a process with `$$` in a Linux script, how can I gather other information so that it cannot be confused with another task that gets the same PID in the future?
For example the PPID, the start date/time, or something else that ensures some kind of "universal uniqueness", if I can put it that way?
How do you gather process information and how do you kill it without confusion?
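One possible way to get the kind of uniqueness asked about here, as a rough sketch that assumes a Linux system with `/proc`: store, next to each PID, the kernel boot ID and the process start time (field 22 of `/proc/PID/stat`), and have the stop side refuse to kill any PID whose fingerprint no longer matches. The function names and the pid-file layout below are made up for illustration, and a tiny window between the check and the `kill` still remains.

```shell
#!/bin/sh
# Sketch only (Linux, relies on /proc). All function names are hypothetical.

# Print a fingerprint "BOOT_ID:STARTTIME" for a PID; fail if the PID is gone.
proc_fingerprint() {
    stat=$(cat "/proc/$1/stat" 2>/dev/null) || return 1
    # The command name (field 2) may contain spaces or ')', so drop everything
    # up to the last ") "; what remains starts at field 3 (state).
    rest=${stat##*) }
    # starttime is field 22 overall, i.e. the 20th field of $rest.
    set -- $rest
    printf '%s:%s\n' "$(cat /proc/sys/kernel/random/boot_id)" "${20}"
}

# start side: append "PID FINGERPRINT" to the pid file.
record_pid() {      # usage: record_pid PID FILE
    fp=$(proc_fingerprint "$1") || return 1
    printf '%s %s\n' "$1" "$fp" >> "$2"
}

# stop side: kill a recorded PID only if its fingerprint still matches.
kill_recorded() {   # usage: kill_recorded FILE
    while read -r pid fp; do
        if [ "$(proc_fingerprint "$pid")" = "$fp" ]; then
            kill "$pid"
        else
            echo "skipping $pid: not the process that was recorded" >&2
        fi
    done < "$1"
}
```

In `start.sh` this would be used as `some_command & record_pid "$!" "$pidfile"`, and in `stop.sh` as `kill_recorded "$pidfile"`. After a reboot the boot ID differs, so every stale entry is skipped even if its PID is in use again.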
`ENVVAR="somethingspecific" processname arguments`, and later show env vars in `ps` (e.g. `ps auxww`, on some OSes) and kill the ones having that specific combination? Children may also inherit that env var, which you can either overwrite (when starting them) or keep as is.

Instead of `command &` you may want to run `screen command`; then you have access to the running command, e.g. to send Ctrl-C to stop it later on.

... `waitpid()`. This is why competently designed process supervision systems don't have this problem: they run as the parent of the services they supervise, get an immediate SIGCHLD when a process exits, and then call `waitpid()` to reap the zombie process-table entry that the dead process leaves behind. Until they call `waitpid()`, nothing else can be assigned that process ID.
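The environment-variable tagging idea from the first comment above could look roughly like this on Linux, where each process's initial environment is readable (for your own processes) in `/proc/PID/environ`. This is only a sketch: `MYAPP_TAG` and the function names are invented for the example. Because the match is made on the tag and not on a stored PID, a reused PID is not mistaken for the original process.

```shell
#!/bin/sh
# Sketch only (Linux, relies on /proc). MYAPP_TAG and all function names
# are hypothetical.

# Run a command in the background with MYAPP_TAG=<tag> in its environment.
start_tagged() {    # usage: start_tagged TAG command [args...]
    tag=$1
    shift
    MYAPP_TAG="$tag" "$@" &
}

# Print the PIDs of all readable processes whose environment contains
# exactly MYAPP_TAG=<tag>.
find_tagged() {     # usage: find_tagged TAG
    for dir in /proc/[0-9]*; do
        # Entries are NUL-separated; unreadable or vanished ones are skipped.
        if { tr '\0' '\n' < "$dir/environ"; } 2>/dev/null |
            grep -qxF "MYAPP_TAG=$1"; then
            echo "${dir#/proc/}"
        fi
    done
}

# Send SIGTERM to every process carrying the tag.
stop_tagged() {     # usage: stop_tagged TAG
    for pid in $(find_tagged "$1"); do
        kill "$pid" 2>/dev/null || true   # it may already be gone
    done
}
```

A tag that is unique per run, for example one built from `$(date +%s)-$$` and saved to a file by `start.sh`, keeps separate runs apart. Children inherit the tag unless they are started with it overwritten, so `stop_tagged` reaches them as well.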