6

We are working on a multithreaded, memory-intensive application written in C++. We have to execute lots of shell scripts and Linux commands (and get their return codes).

After reading that article, we understood that using system() in our context would be a bad idea.

One solution would be to fork after the program starts and before creating any threads, but communicating with that process may not be easy (socket? pipe?).

The second solution we considered is a dedicated daemon written in Python (using xinetd?) that would process our system calls.

Have you ever had that problem? How did you solve it?

Note: Here is a more complete article that explains this problem: http://developers.sun.com/solaris/articles/subprocess/subprocess.html They recommend using posix_spawn, which uses vfork() instead of the fork() used in system().
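
For illustration, a minimal sketch of running a shell command through posix_spawn() and collecting its return code might look like the following. The helper name run_command is hypothetical, and the sketch assumes a POSIX system with /bin/sh available; whether posix_spawn() actually uses vfork() internally depends on the platform.

```cpp
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <cerrno>
#include <string>

extern "C" char **environ;  // environment passed on to the child

// Hypothetical helper: runs `command` through /bin/sh using posix_spawn()
// and returns its exit status, or -1 if the process could not be started
// or did not exit normally (for example, it was killed by a signal).
int run_command(const std::string &command) {
    std::string sh = "sh", flag = "-c", cmd = command;
    char *argv[] = {&sh[0], &flag[0], &cmd[0], nullptr};

    pid_t pid;
    // posix_spawn() returns 0 on success and an error number on failure.
    if (posix_spawn(&pid, "/bin/sh", nullptr, nullptr, argv, environ) != 0) {
        return -1;
    }

    int status = 0;
    while (waitpid(pid, &status, 0) == -1) {
        if (errno != EINTR) return -1;  // retry only if interrupted
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A call such as run_command("ls /tmp > /dev/null") would then return the exit status of the shell, much like system() does, but without going through fork() in the calling code.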

3
  • 2
    It's probably a bad idea to refer to the running of external programs as doing "system calls". In C and C++, system calls generally means talking to the kernel, i.e. directly doing OS-level calls. Commented Dec 20, 2011 at 11:39
  • Have you linked to the right article? It's about fork() producing single-threaded child processes even if the parent is multi-threaded, and doesn't say anything about system() being a bad idea. Commented Dec 20, 2011 at 11:53
  • system() does a fork() and then exec() cf man pages :( Commented Dec 20, 2011 at 14:25

3 Answers

2

The article you link to mostly talks about issues that arise if you fork() and don't immediately follow it with an exec*(). As system() is typically implemented as a fork() followed by exec(), most of those issues don't apply. One issue that does apply is the point about closing file descriptors, though: unless you have specific reasons to do otherwise, opening files with O_CLOEXEC by default is probably a good rule of thumb.
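
As a small sketch of that rule of thumb (the helper names open_cloexec and is_cloexec are made up for this example), a file can be opened with the close-on-exec flag set atomically, and the flag can be checked afterwards with fcntl():

```cpp
#include <fcntl.h>
#include <unistd.h>

// Opens `path` read-only with close-on-exec set at open time, so the
// descriptor is not inherited by programs started later via exec*().
// Returns the descriptor, or -1 on error (errno is set by open()).
int open_cloexec(const char *path) {
    return open(path, O_RDONLY | O_CLOEXEC);
}

// Reports whether the FD_CLOEXEC flag is set on `fd`.
bool is_cloexec(int fd) {
    int flags = fcntl(fd, F_GETFD);
    return flags != -1 && (flags & FD_CLOEXEC) != 0;
}
```

Setting the flag in the open() call itself, rather than with a later fcntl(), avoids a window in which another thread could fork() and exec() while the descriptor is still inheritable.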

One issue with fork()+exec() in applications that consume a lot of memory is that if your OS is configured to not allow memory overcommit, the fork() may fail. One solution to this is to fork an "external process handler" process before you start allocating a lot of memory in your main process.

The best solution is if the functionality you require is available as a library, obviating the need to fork in the first place. That probably doesn't warm your heart in the short term, though.

3 Comments

fork normally should not fail just because overcommit=0, since pages are still shared (at least, until they start to diverge)
@jørgensen: The problem is that with overcommit disabled the OS must ensure that there is enough space in case the forked process decides to write to all its writable pages, because the OS cannot a priori assume that the process won't do that.
@jørgensen: fork() does fail if it considers there is not enough memory (physical + swap). Been there.
0

You will get answers about how you should call an external program (fork/exec/wait, how else), but that is only one part of the problem. The real issue is scheduling: I assume you don't want to run too many external programs in parallel.

Without knowing how threads are organized in your system, I can warn you about two issues.

The first important issue is keeping the load low by limiting external command/script calls. You may set up a parameter that specifies how many external commands may run at the same time. Before you call an external command, increase a variable that counts the active external processes; if it exceeds the limit parameter, sleep() for a while and try again. After the process has finished, decrease that variable. (Increasing and decreasing must be protected by a mutex.)
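
A minimal sketch of such a limit, assuming C++11 is available: the class name ProcessLimiter is hypothetical, and it uses a condition variable to wait for a free slot instead of the sleep-and-retry loop described above, which is a swapped-in variation on the same counter-plus-mutex idea.

```cpp
#include <condition_variable>
#include <mutex>

// Hypothetical helper: caps how many external commands run at once.
class ProcessLimiter {
public:
    explicit ProcessLimiter(int max_parallel) : max_(max_parallel) {}

    // Blocks until a slot is free, then claims it.
    void acquire() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return active_ < max_; });
        ++active_;
    }

    // Frees a slot and wakes one waiting thread, if any.
    void release() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            --active_;
        }
        cv_.notify_one();
    }

    // Number of external processes currently counted as active.
    int active() {
        std::lock_guard<std::mutex> lock(mutex_);
        return active_;
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    const int max_;
    int active_ = 0;
};
```

Each worker thread would call acquire() before starting an external command and release() after the command has been waited for, so that at most max_parallel commands run at any moment.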

Another issue, when you're using an external program, is managing its lifetime. You should set up a "timeout" for each external process, and kill it if it hangs for too long. There should be a "timeout" thread (or it should be the main thread) that supervises the others.
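
One possible way to enforce such a timeout is sketched below. The function name run_with_timeout is hypothetical; the sketch polls waitpid() from the calling thread rather than using a separate supervisor thread, and it puts the child in its own process group so that anything the shell started is killed along with it.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <signal.h>
#include <cerrno>
#include <chrono>

// Hypothetical helper: runs `command` via /bin/sh and kills it with SIGKILL
// if it is still running after `timeout`. Returns the exit status, or -1 if
// the child timed out, died from a signal, or could not be started.
int run_with_timeout(const char *command, std::chrono::milliseconds timeout) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        setpgid(0, 0);  // own process group, so the whole group can be killed
        execl("/bin/sh", "sh", "-c", command, static_cast<char *>(nullptr));
        _exit(127);     // only reached if exec failed
    }
    setpgid(pid, pid);  // also set it from the parent to avoid a race

    auto deadline = std::chrono::steady_clock::now() + timeout;
    int status = 0;
    for (;;) {
        pid_t done = waitpid(pid, &status, WNOHANG);
        if (done == pid) break;                       // child has finished
        if (done == -1 && errno != EINTR) return -1;  // unexpected error
        if (std::chrono::steady_clock::now() >= deadline) {
            kill(-pid, SIGKILL);                      // kill the whole group
            while (waitpid(pid, &status, 0) == -1 && errno == EINTR) {}
            return -1;
        }
        usleep(10 * 1000);  // poll every 10 ms
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, run_with_timeout("sleep 5", std::chrono::milliseconds(100)) gives up after roughly 100 ms and returns -1, while a command that finishes in time returns its normal exit status. The child calls only setpgid(), execl(), and _exit() between fork() and exec, which keeps it within what is safe after forking from a multithreaded process.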


0

How about this solution:

fork() at the very start of your program, and make the child dedicated to starting and managing your outside programs. Then have the parent start all of its threads and do the application logic, sending requests over a pipe when it needs an outside process.

This would step around the problems with threads as you fork before starting them.
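
A rough sketch of this idea, under several assumptions: the names CommandHelper, start_helper, and run_via_helper are invented for the example, commands are sent as newline-terminated lines (so they cannot contain newlines), and the exit code comes back as a raw int on a second pipe. Since the child is single-threaded, it can simply use system().

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cstdlib>
#include <string>

// Hypothetical handle for a helper process forked before any threads exist.
struct CommandHelper {
    pid_t pid = -1;
    int to_child = -1;    // parent writes commands here
    int from_child = -1;  // parent reads exit codes here
};

// Must be called while the program is still single-threaded.
CommandHelper start_helper() {
    CommandHelper h;
    int req[2], rep[2];
    if (pipe(req) != 0) return h;
    if (pipe(rep) != 0) { close(req[0]); close(req[1]); return h; }

    pid_t pid = fork();
    if (pid < 0) {
        close(req[0]); close(req[1]); close(rep[0]); close(rep[1]);
        return h;
    }
    if (pid == 0) {  // child: serve requests until the parent closes the pipe
        close(req[1]);
        close(rep[0]);
        std::string line;
        char c;
        while (read(req[0], &c, 1) == 1) {
            if (c != '\n') { line += c; continue; }
            int status = std::system(line.c_str());
            int code = (status != -1 && WIFEXITED(status)) ? WEXITSTATUS(status) : -1;
            if (write(rep[1], &code, sizeof code) != static_cast<ssize_t>(sizeof code)) break;
            line.clear();
        }
        _exit(0);
    }
    close(req[0]);
    close(rep[1]);
    h.pid = pid;
    h.to_child = req[1];
    h.from_child = rep[0];
    return h;
}

// Sends one command and waits for its exit code; returns -1 on I/O errors.
// With several threads, callers must serialize access (e.g. with a mutex).
int run_via_helper(const CommandHelper &h, const std::string &command) {
    std::string msg = command + "\n";
    if (write(h.to_child, msg.data(), msg.size()) != static_cast<ssize_t>(msg.size())) return -1;
    int code = -1;
    if (read(h.from_child, &code, sizeof code) != static_cast<ssize_t>(sizeof code)) return -1;
    return code;
}
```

The parent would call start_helper() first thing in main(), before creating threads or allocating much memory, and later call run_via_helper(h, "some command") whenever it needs an outside process. A real implementation would also need to handle partial reads and writes and decide how to run several commands concurrently.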

