
For the purposes of profiling a partially evaluated program, I'm interested in knowing the best way to terminate a GHC program. This is useful for profiling programs that take a long time to run, possibly as long as forever.

With GHC 7.4.2, I was able to profile a non-terminating program by enabling profiling (-prof -auto-all) and running my program with +RTS -p. This generated incremental profiling data. The program could be killed with ^c, and the .prof file would contain data. In GHC 7.6 and later, it appears that if the program can be terminated with a single ^c, then profiling information is written to output. However (especially with newer versions of GHC?) a single ^c doesn't kill the program, at least not before I get impatient and hit ^c again. Usually two ^c will kill the program, but then no profiling data is written to output.

Concretely, consider the problem of trying to profile StupidFib.hs:

    fib n = fib (n - 1) + fib (n - 2)
    main = print $ fib 100

Compiling with -prof and running with +RTS -p, I can kill this program with a single ^c during roughly the first 10 seconds of execution, but after that only two ^c will do the job. Looking at my resource usage, this change appears to coincide with the program using all of my physical memory and moving to swap space, although that could be coincidental.

Why does ^c work sometimes, but not other times for the same program? What is the easiest way to ensure that profiling data will get printed when the program does not terminate on its own?

  • Are you sending ^C once or twice? Commented Nov 17, 2014 at 23:57
  • Usually I have to send it twice...once doesn't seem to kill the program. Commented Nov 18, 2014 at 0:15
  • Regardless of the GHC version, I've always found that if I can close it with one ^C it writes the profiling data and if it takes two, it doesn't. I'm not sure if there's a way around that though. Commented Nov 18, 2014 at 0:19
  • I don't think it matters how you kill the program, you can actually see the data when it is still running. I don't have 7.4.2 in front of me at the moment, but iirc that is how it works. Commented Nov 18, 2014 at 1:14
  • Well, here's an example I wrote up that seems to behave in this way (I've seen this in my own programs as well). It just eats memory quickly, so it's reasonable to kill it with one ^C, but it takes long enough that you could also use two ^Cs. If I enable profiling, it generates the profiling data with one ^C press, but it generates a blank file with two ^C presses: lpaste.net/114466. I'm using GHC 7.8.3. I'm guessing that when the RTS profiler receives a SIGINT, it writes its buffer to the file and cleans everything up. I'm not sure what happens with the second press exactly. Commented Nov 18, 2014 at 1:37

1 Answer


Most likely, the second signal is being delivered before the program has finished handling the first one, and at this point the signal's action has been reset to the default action, which (for SIGINT) is to terminate the program. Because of the swapping, there's a significant interval before the profiling code can write out the profiling data, during which time the program is vulnerable to a second SIGINT.

Moral of the story: be patient. If you wait long enough, the program will finish and the data will be written out. Regarding that second ^C, tell yourself, "Just don't do it!" :-)

One could argue that the Haskell runtime should set signal options such that a second SIGINT is ignored, but that would be risky because there'd be no easy way to terminate the program if things got really messed up trying to handle the signal.

You probably also want to avoid programs that exceed physical memory and induce a lot of swapping. At that point, your computation is effectively stalled and there's not much point in continuing. Use +RTS -M with a size argument (for example, +RTS -M1g) to limit the heap size and avoid getting into this situation.
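If waiting on ^C is unreliable, one alternative is to let the program stop itself, so that it exits normally and the profiler writes its report on the way out. The following is only a minimal sketch, not something from the question: it wraps the work in `timeout` from `System.Timeout` (part of base). The helper name `runWithDeadline` and the 30-second budget are hypothetical choices for illustration. It also assumes the computation allocates memory, as `fib` on `Integer` does, because an asynchronous exception such as the one `timeout` uses cannot be delivered to a loop that never allocates.

```haskell
import Control.Exception (evaluate)
import System.Timeout (timeout)

-- The non-terminating function from the question (it has no base case).
fib :: Integer -> Integer
fib n = fib (n - 1) + fib (n - 2)

-- Hypothetical helper: run an action for at most the given number of
-- seconds. Returns Nothing if the deadline passed before it finished.
runWithDeadline :: Int -> IO a -> IO (Maybe a)
runWithDeadline seconds = timeout (seconds * 1000000)

main :: IO ()
main = do
  -- 30 seconds is an arbitrary budget chosen for this sketch.
  result <- runWithDeadline 30 (evaluate (fib 100))
  case result of
    Just v  -> print v
    Nothing -> putStrLn "deadline reached; exiting normally"
```

Compiled with -prof -auto-all and run with +RTS -p as in the question, optionally combined with a heap limit such as -M1g, this should leave `main` by returning rather than by a signal, which is the case in which the .prof file is written.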


3 Comments

Well, sometimes it just.. does not work. It seems like any deadlock would make it fail to stop gracefully.
I'm not sure what you mean. Could you be a bit more specific?
Oh, never mind my comment. Figured it out: it was caused at the FFI boundary.
