Problem

I would like to kill a process called raspivid (a program that records video using a Raspberry Pi camera), but I cannot.

This is how I call it:

#!/bin/bash
# Start recording in the background; $1 is the output file
raspivid -w 800 -h 600 -t 15000 -o "$1" -v -n -rot 270 >> /home/pi/log/camera_output.txt 2>&1 &
# Wait for the video to complete
sleep 16
# Kill the child process (raspivid)
sudo kill -9 $!
# Kill the parent process (this script)
sudo kill -9 $$

If I search for this process, it is still there:

pi@raspberrypi ~ $ ps -ef | grep raspivid
root 7238 7234 0 21:53 ? 00:00:00 [raspivid]
pi 17096 14925 0 22:05 pts/0 00:00:00 grep --color=auto raspivid

If I try to kill it, it doesn't die. Instead, its parent PID changes to 1:

pi@raspberrypi ~ $ sudo killall raspivid
pi@raspberrypi ~ $ ps -ef | grep raspivid
root 7238 1 0 21:53 ? 00:00:00 [raspivid]
pi 17196 14925 0 22:05 pts/0 00:00:00 grep --color=auto raspivid
pi@raspberrypi ~ $ sudo killall raspivid

Observations:

  1. The call works fine for a while (about 2 hours), then it starts hanging.
  2. Only a physical power-off solves the issue. I cannot reboot from the terminal (that hangs too).

My questions:

  1. Why does Linux assign the parent PID to 1?
  2. Why can't the process be killed? (I also tried sudo kill -9 7238.)

EDIT:

aecolley was right. The S column of ps -el shows D:

0 D 0 11823 11819 0 80 0 - 0 down ? 00:00:00 raspivid 
  • It's probably a zombie process. Check with top how many zombies you have, or tell us which flags (STAT) this process has (if it has Z, it's a zombie), e.g. with ps wuax PID. Commented Feb 4, 2015 at 21:34
  • @kenorb, no, zombies usually have a (defunct) suffix. But the square brackets give a clue: it may be a kernel thread. Commented Feb 4, 2015 at 21:38
  • It might still be hanging on to the device. Commented Feb 4, 2015 at 21:46
  • @myaut On my machines, the suffix shown by ps for zombies is actually <defunct> (with angle brackets). Commented Feb 4, 2015 at 21:48
  • @vinc17, yep, that is the Solaris notation; I confused it with Linux. Commented Feb 4, 2015 at 21:49

1 Answer

If you run ps -el instead of ps -ef, you'll get an S column with the process state. My guess is that the process is in state D, which means uninterruptible wait.

In other words, the process is stuck in the messier parts of a device driver, and the kernel doesn't think it's safe to kill it until the device driver lets go of it. You sometimes see this with processes that talk to sick NFS servers, or devices with errors. In this case, it looks like it's talking to a video-capture device.
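As a minimal sketch of the state check described above, the same information that ps -el prints in its S column can be read from /proc on Linux. The function names proc_state and list_d_state are made up for this illustration, and the sketch assumes a Linux /proc filesystem that exposes /proc/PID/status and /proc/PID/comm.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; assumes a Linux /proc filesystem.

# Print the one-letter state (R, S, D, Z, ...) of the given PID.
# Prints nothing if the process does not exist.
proc_state() {
    awk '/^State:/ { print $2 }' "/proc/$1/status" 2>/dev/null
}

# Print "PID NAME" for every process currently in state D
# (uninterruptible wait).
list_d_state() {
    for dir in /proc/[0-9]*; do
        pid=${dir#/proc/}
        if [ "$(proc_state "$pid")" = "D" ]; then
            printf '%s %s\n' "$pid" "$(cat "$dir/comm" 2>/dev/null)"
        fi
    done
}
```

Calling proc_state 7238 on the stuck process from the question should print D if the guess is right, and list_d_state shows whether anything else is stuck in the same way.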

Unfortunately, there's no silver-bullet way to unstick a process from D-wait, except for rebooting the system. You could try using strace (the Linux counterpart of the Solaris command truss) to find out what the program did right before it got stuck, but there might not be anything you can do about it. You may just have a buggy device driver.

Finally, the reason the parent pid changes to 1 is that your killall is successfully killing the parent process. Whenever a process exits, its child processes are all inherited by pid 1. It's a minor mystery why the ps -f line for the parent process isn't matched by the grep.
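The reparenting can be reproduced without any camera hardware. The following is only a sketch: parent_of and spawn_orphan are names invented for this example, and it assumes a Linux /proc filesystem. On a plain system the new parent is PID 1; inside a container, or under a process that has registered itself as a subreaper, it may be a different PID, so the sketch only prints the two values for comparison.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; assumes a Linux /proc filesystem.

# Print the parent PID of the given process.
parent_of() {
    awk '/^PPid:/ { print $2 }' "/proc/$1/status"
}

# Run a short-lived shell that starts a background sleep and then exits,
# leaving the sleep orphaned. Prints "<shell PID> <sleep PID>".
spawn_orphan() {
    sh -c 'sleep 30 >/dev/null 2>&1 & echo "$$ $!"'
}

pids=$(spawn_orphan)
old_parent=${pids% *}   # PID of the shell that has already exited
child=${pids#* }        # PID of the orphaned sleep

sleep 1                 # give the kernel a moment to reparent the orphan
new_parent=$(parent_of "$child")
echo "original parent: $old_parent, current parent: $new_parent"

kill "$child"           # clean up the orphaned sleep
```

This mirrors what the ps output in the question shows: the PPID column of raspivid changed from 7234 to 1 once the original parent was gone.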

  • The question is about Linux, so strace is the Linux equivalent of truss on Unix. Commented Feb 5, 2015 at 11:41
  • @aecolley Thank you. That is exactly my issue. It prevents me from rebooting the machine. Whatever reboot command I try, it simply doesn't work. Commented Feb 5, 2015 at 14:04
  • @user1688175 If /sbin/reboot -f doesn't reboot, then it's definitely a buggy device driver. Commented Feb 6, 2015 at 22:03
  • @aecolley, see my changes to the question. Commented Feb 7, 2015 at 5:32