I'll treat this as a two-step problem: 1. detecting the orphan(s), and 2. setting an alert when a new orphan is found.
1. Detecting the orphan(s)
You may start with something like this:
ps -eo pid,ppid,ruser,stat,command
This will give you a table listing all processes with a header for the 5 outputs listed here (pid,ppid,ruser,stat,command). Please refer to the section STANDARD FORMAT SPECIFIERS of man ps for a description of these 5 output fields.
Successive pipes of the output to grep may be used to filter the list, or awk may be more efficient to capture the desired processes; e.g.:
ps -eo pid,ppid,ruser,stat,command | awk '{ if ($2 == 1 && $3 == "pi") print $0 "\n";}'
926 1 pi Ss /lib/systemd/systemd --user
Here, awk prints the entire line/record ($0), followed by an extra newline, for any line of the ps output where the second field ($2, the ppid) equals 1 AND the third field ($3, the ruser) matches a given user (pi in this case). Running this on my system yielded the result shown above.
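As a minimal sketch of the same filter, the function below takes the user name as an argument instead of hard-coding it. The name list_suspects and the sample rows are made up for illustration; unlike the command above, it does not append a blank line after each match.

```shell
# Hypothetical helper: read ps-style rows (pid ppid ruser stat command...)
# from stdin and keep those whose ppid is 1 and whose ruser is "$1".
list_suspects() {
    awk -v user="$1" '$2 == 1 && $3 == user'
}

# Demo on canned rows rather than live ps output (sample data only).
printf '%s\n' \
    '  PID  PPID RUSER    STAT COMMAND' \
    '  926     1 pi       Ss   /lib/systemd/systemd --user' \
    ' 1412   926 pi       S    /usr/bin/dbus-daemon --session' \
    ' 2001     1 root     Ss   /usr/sbin/cron -f' |
    list_suspects pi
```

On a live system this would be used as: ps -eo pid,ppid,ruser,stat,command | list_suspects pi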
But now it gets tricky... AFAIK, there is no value of the stat field that in and of itself identifies an orphan process; i.e. a process whose parent process has died, but which has been adopted by the init process, whose pid = 1. You may inspect the PROCESS STATE CODES section of man ps to review the possible values of the stat field. A value of "Z" indicates a zombie process, which is similar to an orphan, but different: a zombie has already terminated and is waiting to be reaped by its parent, whereas an orphan is still running.
In summary then, ps cannot tell us definitively whether or not a process is an orphan, and so we must regard the output of the command above as a list of suspects to be further investigated.
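To have a known orphan to experiment with, something like the following sketch may help. It assumes a procps-style ps that accepts -o and -p; the 300-second sleep is an arbitrary choice. The PPID shown is typically 1, although on some systems a different "subreaper" process adopts the orphan instead.

```shell
# Start a throwaway child whose parent shell exits immediately, so the
# child is orphaned. Its output is redirected so $(...) does not wait on it.
pid=$(sh -c 'sleep 300 >/dev/null 2>&1 & echo $!')

# Show which process adopted it: look at the PPID column.
ps -o pid,ppid,stat,command -p "$pid"

# Clean up the test process.
kill "$pid"
```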
Depending upon your users and your system, you may be able to eliminate some processes based on the value of the stat code, or the command field. For example, the command above found a process (systemd) with stat=Ss; this process is not an orphan. AFAICT, the 5 parameters being output from the ps command above provide a reasonable basis for meeting your objectives.
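One possible way to eliminate known non-orphans by their command field is sketched below. The function name drop_known and the sample rows are assumptions for illustration; the pattern passed in is whatever you consider legitimate on your own system.

```shell
# Hypothetical helper: drop suspect rows whose command matches the regular
# expression given as "$1". Rows are ps-style: pid ppid ruser stat command...
drop_known() {
    awk -v pat="$1" '{
        cmd = $0
        # Strip the first four fields to leave only the command text.
        sub(/^ *[^ ]+ +[^ ]+ +[^ ]+ +[^ ]+ +/, "", cmd)
        if (cmd !~ pat) print
    }'
}

# Demo on canned rows (sample data only): the systemd user manager is
# dropped, the other suspect is kept.
printf '%s\n' \
    '  926     1 pi       Ss   /lib/systemd/systemd --user' \
    ' 3050     1 pi       S    python3 /home/pi/job.py' |
    drop_known 'systemd --user'
```

A similar test on the fourth field could be used to filter on the stat code instead.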
2. Set an alert when a new orphan(s) is found
The following is a proposed approach to setting an alert using the orphan detection command from step 1 above.
Two (2) files will be used: OrphansOfRecord, and OrphansOfTheDay. OrphansOfTheDay will be generated as follows:
ps -eo pid,ppid,ruser,stat,command | awk '{ if ($2 == 1 && $3 == "pi") print $0 "\n";}' > OrphansOfTheDay
Once generated, each line of OrphansOfTheDay is compared with each line of OrphansOfRecord; i.e. for each line in OrphansOfTheDay:
- If that line is NOT found in OrphansOfRecord, it is a NEW Orphan, and the Alarm is set.
- If there are no NEW Orphans, the alarm is not set; i.e. no alarm for cleared Orphans
A NEW Orphan process is defined as follows:
- The pid is new OR the command is new
awk 'NR == FNR{a[$5]b[$1];} !($5 in a)||!($1 in b){print "ALERT" > "alertfile"; close ("alertfile")}' OrphansOfRecord OrphansOfTheDay
Parsing this awk command:
NR == FNR is a condition in awk - a condition that evaluates TRUE while the first file listed in the arguments is being read - OrphansOfRecord in this case. The condition evaluates FALSE after reading the first file.
While NR == FNR is true, the action {a[$5]b[$1];} is executed. This action stores the values of field $5 (the first word of the command) and $1 (pid) from each line in OrphansOfRecord as keys in the arrays a and b, respectively. For example, if there are 3 lines (or records) with distinct values in the OrphansOfRecord file, then a & b will each contain 3 elements once the first file has been read.
After reading all the lines in OrphansOfRecord, NR == FNR becomes false, and awk begins reading the second file OrphansOfTheDay. After the first line of OrphansOfTheDay is read, the 2nd condition is evaluated: !($5 in a)||!($1 in b). This condition compares the $5 value from that line against the keys in array a, and the $1 value from the same line against the keys in array b. Note that the condition is true when there is no match of $5 in a OR no match of $1 in b.
This continues as awk tests each line of OrphansOfTheDay. When the 2nd condition is TRUE, the 2nd action is executed: {print "ALERT" > "alertfile"; close ("alertfile")}. This action creates alertfile if it does not already exist by redirecting the print "ALERT" to alertfile & then closing it to ensure the output buffer is flushed.
The "ALERT" output may be used to signal that a new, suspected Orphan has been found. The presence of alertfile - or its contents "ALERT" - may be used to determine if an email needs to be sent.
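The comparison can be tried out on made-up data, as in the sketch below. The two sample files and the scratch directory are assumptions for illustration. The awk program is the one parsed above, except that the two array references are written as separate statements, next is added to skip the second test while the first file is read, and the output path is passed in as the variable out.

```shell
# Work in a scratch directory so no real files are touched.
tmp=$(mktemp -d)

# Sample data (not real output): the record holds one known suspect,
# today's list holds that suspect plus one new one.
printf '%s\n' '926 1 pi Ss /lib/systemd/systemd --user' > "$tmp/OrphansOfRecord"
printf '%s\n' \
    '926 1 pi Ss /lib/systemd/systemd --user' \
    '3050 1 pi S python3 /home/pi/job.py' > "$tmp/OrphansOfTheDay"

# Load pids and commands from the first file, then flag any line of the
# second file whose pid or command was not seen.
awk -v out="$tmp/alertfile" '
    NR == FNR { a[$5]; b[$1]; next }
    !($5 in a) || !($1 in b) { print "ALERT" > out; close(out) }
' "$tmp/OrphansOfRecord" "$tmp/OrphansOfTheDay"

# The alert file exists only because pid 3050 is new.
result=$(cat "$tmp/alertfile")
echo "$result"
rm -r "$tmp"
```

One caveat worth knowing: if OrphansOfRecord is empty, NR == FNR stays true while the second file is read, so no alert is raised in that case.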
At this point, OrphansOfTheDay has been processed, and an "ALERT" has been created if a suspected Orphan was found. Two things remain to be done:
- Write OrphansOfTheDay to OrphansOfRecord:
mv OrphansOfTheDay OrphansOfRecord
- If an "ALERT" was set, an email is sent & alertfile is cleared:
if [ -e alertfile ]
then
    mail -s "ALERT: NEW ORPHAN FOUND" pi < OrphansOfRecord
    rm alertfile
fi
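Pulling the pieces above together, a script might look roughly like the following sketch. The file names, the mail subject and the recipient come from the steps above; WORKDIR, WATCH_USER and the function names are assumptions made for this example. It also swaps NR == FNR for FILENAME == ARGV[1], so that an empty OrphansOfRecord (e.g. on the first run) still produces an alert.

```shell
#!/bin/sh
# Sketch only. WORKDIR and WATCH_USER are hypothetical settings.
WORKDIR=${WORKDIR:-$HOME/orphan-watch}
WATCH_USER=${WATCH_USER:-pi}

# Step 1: list suspects (ppid 1, owned by the watched user).
snapshot() {
    ps -eo pid,ppid,ruser,stat,command |
        awk -v user="$WATCH_USER" '$2 == 1 && $3 == user'
}

# Step 2: compare with the record, rotate the files, and mail on an alert.
# The body runs in a subshell so the cd does not affect the caller.
check_orphans() (
    mkdir -p "$WORKDIR" && cd "$WORKDIR" || exit 1
    touch OrphansOfRecord    # first run: start from an empty record
    snapshot > OrphansOfTheDay
    awk 'FILENAME == ARGV[1] { a[$5]; b[$1]; next }
         !($5 in a) || !($1 in b) { print "ALERT" > "alertfile"; close("alertfile") }' \
        OrphansOfRecord OrphansOfTheDay
    mv OrphansOfTheDay OrphansOfRecord
    if [ -e alertfile ]
    then
        mail -s "ALERT: NEW ORPHAN FOUND" "$WATCH_USER" < OrphansOfRecord
        rm alertfile
    fi
)
```

Adding a call to check_orphans as the last line of the script, and running the script periodically (from cron, for example), would complete the setup. This assumes a working mail command for local delivery.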
Xmas is calling, and I'm out of time for a few days. I'll put the script together asap - or you may proceed on your own.
cron output. If you can confirm that's actually what you want, I'll add that in an edit tomorrow.
"alerting about newly orphaned pids" & the subsequent reference "<--- send email...": Could you please clarify what sort of alert might work here? If email is what you need, do you have an SMTP server set up? Is the mail to be strictly a local message, or (for example) a gmail account?