6

I was trying to understand fork(), and tried the following in C:

    #include <stdio.h>
    #include <unistd.h>

    void forker()
    {
        printf("%d: A\n", (int)getpid());
        fork();
        wait();
        printf("%d: B\n", (int)getpid());
        printf("%d: C\n", (int)getpid());
        fork();
        wait();
        printf("%d: D\n", (int)getpid());
    }

    int main(void)
    {
        forker();
        return 0;
    }

When I compiled it and ran the resulting a.out, here is what I observed:

    > ./a.out
    3560: A
    3561: B
    3561: C
    3562: D
    3561: D
    3560: B
    3560: C
    3563: D
    3560: D

However when I do the following:

> ./a.out > t.txt 

something weird happens:

    > cat t.txt
    3564: A
    3565: B
    3565: C
    3566: D
    3564: A
    3565: B
    3565: C
    3565: D
    3564: A
    3564: B
    3564: C
    3567: D
    3564: A
    3564: B
    3564: C
    3564: D

Can someone please explain this behavior? Why is the output different when it is redirected to a file?

I am using Ubuntu 10.10, gcc version 4.4.5.

14
  • Are you sure you're showing the real code ? wait() makes little sense. Commented Nov 24, 2011 at 10:23
  • @cnicutar: Maybe it makes little sense, but it doesn't cause visible problems, so those calls can indeed be there. Commented Nov 24, 2011 at 10:25
  • I added wait() just to make sure that parent waits till the child is completely executed. Is that wrong/unnecessary? Commented Nov 24, 2011 at 10:26
  • 1
    Well, for one wait takes one argument. Commented Nov 24, 2011 at 10:27
  • 2
    @soulcheck - in C it's legal (bad, but legal) to use a function which has not been prototyped. I think salil got lucky here and that the parameter location likely contained 0, a null pointer, which wait() must check for and accept without use... otherwise he would have core dumped or overwritten a memory location. Commented Nov 24, 2011 at 10:44

3 Answers

10

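The reason this happens is stdio buffering. When stdout is a terminal it is line-buffered, so each line is written out as soon as its newline is printed. When stdout is redirected to a file it is fully buffered, so at the time of the fork() your output has not been flushed yet, and both the parent and the child end up with their own copy of the pending output buffer.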

Put a call to fflush(stdout); before each fork(); to resolve this.


5 Comments

Yes, thanks. that was it. I read about fflush, but didn't use it properly earlier.
btw, why is the issue there only when I redirect it to a file, and not when I print it on a console? Also, why does the child print starting from A in the file??
@Salil, see the link I provided in my answer. It explains how stdout is buffered by the C library (in user space, not in the kernel) when you redirect to a file.
@Tudor, is it only the buffering issue? Because the process numbers (PIDs) which I am printing are also messed up. So is the execution order of the threads also affected when redirecting?
@Salil - the execution order is not messed up, each printf() only executes once... however printf() does not directly send the output to its destination, it merely sends it to a buffer. fork() duplicates the buffer and its state (causing the buffered text to be sent to its destination twice), and without some explicit synchronization between the processes you cannot control which process will flush its output first.
3

The problem is that the output of printf is passed through a library buffer before being sent to the file, which causes the strange behavior you mentioned. If you add an fflush(stdout) after each printf, the output will also be correct in the file.

You can read more about this here: http://www.pixelbeat.org/programming/stdio_buffering/
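As a rough sketch of why the terminal and the file behave differently: the C library normally chooses the buffering mode for stdout based on whether it refers to an interactive device. The helper below (likely_stdout_mode is a made-up name, and the isatty() check is only an approximation of what a typical implementation does) returns the mode you would expect for a given file descriptor.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: returns the buffering mode stdio would typically
 * pick for an output stream on this descriptor. Assumes the common rule
 * "terminal => line-buffered, anything else => fully buffered". */
int likely_stdout_mode(int fd)
{
    return isatty(fd) ? _IOLBF : _IOFBF;
}
```

Passing STDOUT_FILENO tells you which case your program is in: it should give _IOLBF when run directly in a terminal and _IOFBF when run as ./a.out > t.txt.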

1 Comment

Thanks Tudor. That site is not currently up. I will have to check later. However, when the output is passed to file, why does the child process start printing from "A"? Also, the process numbers are confusing.
2

The other answers do not exactly describe what is happening, and I had to think a bit more to understand. So, in the second case (output buffered because of file redirection), and by using 1,2,3 and 4 instead of 3564, 3565, 3566 and 3567:

  • process 1 prints "A:1" in its internal stdout buffer;
  • process 1 forks and process 2 is created; this creation implies a copy of the internal stdout buffer, which has not been written out yet;
  • process 1 prints "B:1" and "C:1" in its internal stdout buffer, process 2 "B:2" and "C:2";
  • both processes fork (in your case 1->4 and 2->3, but it could have been different), duplicating both internal buffers;
  • all 4 processes print the D line in their buffers, then exit.

At this point, the contents of the 4 internal stdout buffers are:

    - process 1: A:1 B:1 C:1 D:1
    - process 2: A:1 B:2 C:2 D:2
    - process 3: A:1 B:2 C:2 D:3
    - process 4: A:1 B:1 C:1 D:4

  • Finally, the 4 buffers are printed in non deterministic order. In your case, the order was 3, 2, 4, 1.

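This does not happen when stdout is a terminal, or when fflush() is called before each fork(). In the terminal case stdout is line-buffered, so each line is written out at its newline; in the fflush() case the buffer is emptied explicitly. Either way, only empty buffers are duplicated.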
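The buffered case can also be replayed without any real fork(), just by copying strings. The sketch below is only a model: emit, sim_fork and simulate are names invented for the illustration, processes are numbered 1 to 4 as above, and the lines use the program's own "pid: letter" format.

```c
#include <stdio.h>
#include <string.h>

#define MAX_PROCS 4
#define BUF_LEN   64

/* Pending stdio buffer of each simulated process (index 0 is unused). */
static char bufs[MAX_PROCS + 1][BUF_LEN];

/* Simulated printf: append "<proc>: <label>\n" to that process's buffer. */
static void emit(int proc, char label)
{
    char line[32];
    snprintf(line, sizeof line, "%d: %c\n", proc, label);
    strncat(bufs[proc], line, BUF_LEN - strlen(bufs[proc]) - 1);
}

/* Simulated fork: the child starts with a copy of the parent's buffer. */
static void sim_fork(int parent, int child)
{
    strcpy(bufs[child], bufs[parent]);
}

/* Replays the whole run and returns the final buffer of one process,
 * or NULL if proc is out of range. */
const char *simulate(int proc)
{
    if (proc < 1 || proc > MAX_PROCS)
        return NULL;

    memset(bufs, 0, sizeof bufs);

    emit(1, 'A');
    sim_fork(1, 2);                 /* first fork: 1 -> 2 */

    for (int p = 1; p <= 2; p++) {
        emit(p, 'B');
        emit(p, 'C');
    }

    sim_fork(1, 4);                 /* second fork: 1 -> 4 and 2 -> 3 */
    sim_fork(2, 3);

    for (int p = 1; p <= MAX_PROCS; p++)
        emit(p, 'D');

    return bufs[proc];
}
```

Printing simulate(3), for example, gives the lines for A from process 1, B and C from process 2, and D from process 3, which is the same shape as the first four lines of t.txt in the question.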

1 Comment

They are not printed in order, but may even overlap, because the file write offset is the same for all processes and is not updated atomically with writes. It's not happening when the output is a pseudo-tty device (not a shell!), because such a device is not capable of seeking and the question of atomicity of the file write offset updates is immaterial. To avoid overlapping with regular files, they should have the O_APPEND flag set. FWIW, according to POSIX, behavior of concurrent writes to files is unspecified.
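A minimal sketch of the O_APPEND suggestion, assuming a throwaway temporary file (the path template and the function name append_from_two_processes are made up for the illustration): a parent and its child each write one line through a descriptor opened with O_APPEND, and the function returns the resulting file size.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: parent and child each append one line to a
 * temporary file opened with O_APPEND. Returns the final file size in
 * bytes, or -1 on error. */
long append_from_two_processes(void)
{
    char path[] = "/tmp/append_demo_XXXXXX";   /* illustrative template */
    int tmp = mkstemp(path);
    if (tmp == -1)
        return -1;
    close(tmp);

    /* With O_APPEND every write() lands at the current end of the file. */
    int fd = open(path, O_WRONLY | O_APPEND);
    if (fd == -1) {
        unlink(path);
        return -1;
    }

    const char *child_line = "child\n";
    const char *parent_line = "parent\n";
    long size = -1;

    pid_t pid = fork();
    if (pid == 0) {
        ssize_t n = write(fd, child_line, strlen(child_line));
        _exit(n < 0 ? 1 : 0);
    }
    if (pid > 0) {
        ssize_t n = write(fd, parent_line, strlen(parent_line));
        waitpid(pid, NULL, 0);

        struct stat st;
        if (n >= 0 && fstat(fd, &st) == 0)
            size = (long)st.st_size;
    }

    close(fd);
    unlink(path);
    return size;
}
```

Both lines should survive in full, so the size should be the sum of the two line lengths. From the shell, ./a.out >> t.txt opens the file in append mode in the same way.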
