I recently found that ksh may lose some data after printing more than 16K bytes to stdout if the reader of the pipe is blocked for a couple of seconds.
This test.sh script prints out 257*64 (16448) bytes:

    #!/usr/bin/ksh
    i=0
    while [[ i -lt 257 ]]
    do
        x=$(file /tmp)
        echo "0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDE"
        i=$((i+1))
    done | while read datafile
    do
        echo $datafile
    done

I performed the following test:

    0 $ ./test.sh | wc -c
    16448
    0 $ ./test.sh | (sleep 3; wc -c)
    16384

The line x=$(file /tmp) seems to affect this behaviour although it does not pipe anything to the second loop.
If I use bash, it works as expected.
It looks like a bug to me in ksh. I am using Solaris 5.10. Is there a solution or workaround for this? What is the root cause of this issue? I guess it might be related to pipe buffer size.
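As a quick sanity check on the numbers above, the arithmetic can be reproduced in the shell. This is only a sketch: the variable names are made up for illustration, and it says nothing about the actual pipe buffer size on Solaris. It only shows that the shortfall is exactly one 64-byte line and that the surviving 16384 bytes are 16 * 1024.

```shell
# The 63-character string echoed by test.sh; echo appends a newline.
LINE="0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDE"

line_bytes=$(( ${#LINE} + 1 ))      # characters in the string plus the newline
total=$(( 257 * line_bytes ))       # bytes the script should print
received=16384                      # what wc -c reported after the 3-second delay
lost=$(( total - received ))        # bytes that went missing

echo "bytes per line: $line_bytes"
echo "expected total: $total"
echo "lost: $lost ($(( lost / line_bytes )) line)"
echo "received is 16 KiB: $(( received == 16 * 1024 ))"
```

So the missing data corresponds to a single echo, which fits with one failed 64-byte write rather than a truncated buffer.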
Thanks, Peter
EDIT:
Running the test under truss, I can see an error when writing the last 64 bytes:

    ioctl(0, I_PEEK, 0x08046B40) = 0
    Received signal #18, SIGCLD, in write() [caught]
    siginfo: SIGCLD CLD_EXITED pid=6561 status=0x0000
    write(1, " 0 1 2 3 4 5 6 7 8 9 A B".., 64) Err#4 EINTR
    lwp_sigmask(SIG_SETMASK, 0x00020000, 0x00000000) = 0xFFBFFEFF [0x0000FFFF]
    setcontext(0x08046670)
    read(0, 0x0809064C, 1) = 0
    ioctl(0, TCGETA, 0x08046B18) Err#22 EINVAL

Running the same script with dtksh gives the trace below. As Stephane indicated, the failed write is reattempted.

    ioctl(0, I_PEEK, 0x08046694) = 1
    read(0, " 0 1 2 3 4 5 6 7 8 9 A B".., 64) = 64
    Received signal #18, SIGCLD, in write() [caught]
    siginfo: SIGCLD CLD_EXITED pid=28276 status=0x0000
    write(1, " 0 1 2 3 4 5 6 7 8 9 A B".., 64) Err#4 EINTR
    lwp_sigmask(SIG_SETMASK, 0x00020000, 0x00000000) = 0xFFBFFEFF [0x0000FFFF]
    waitid(P_ALL, 0, 0x08046500, WEXITED|WTRAPPED|WSTOPPED|WNOHANG) = 0
    waitid(P_ALL, 0, 0x08046500, WEXITED|WTRAPPED|WSTOPPED|WNOHANG) Err#10 ECHILD
    sigaction(SIGCLD, 0x08046510, 0x08046580) = 0
    setcontext(0x08046430)
    write(1, 0x080F0FD8, 64) (sleeping...)
    write(1, " 0 1 2 3 4 5 6 7 8 9 A B".., 64) = 64
    ioctl(0, I_PEEK, 0x08046694) = 0

Try truss to see what's going on. Small writes are meant to be atomic on pipes, but ksh is known sometimes to use sockets instead of pipes for shell pipes. Also, ksh is known to do wild optimisations that sometimes affect behavior. truss will tell you if, for instance, ksh does one write(2) per echo or if some write(2) calls fail or are partial. You could also use lsof to check whether ksh uses pipes or sockets. FWIW, I can't reproduce it with ksh93u+ on Linux. What version of ksh are you using?

    grep -i ver /usr/bin/ksh
    @(#)Version M-11/16/88i

The write for that echo is blocking, so it is likely to be interrupted by the SIGCLD. Doing (echo ...), that is, using a subshell, might work around the problem.
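For reference, here is a minimal sketch of the subshell workaround suggested in the last comment. It rests on several assumptions: that the affected echo is the one in the reading loop (the truss trace shows the failing write(1) right after a read(0)), that moving the write into a child process keeps it away from the shell that receives SIGCLD, and that nobody has confirmed this on ksh88. The function names produce and relay are made up for illustration, the loop test is written in portable [ ] form, and 2>/dev/null is added only to keep the output clean if file is not installed.

```shell
LINE="0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDE"

# Writer: same shape as the first loop of test.sh (257 lines of 64 bytes).
produce() {
  i=0
  while [ "$i" -lt 257 ]
  do
    x=$(file /tmp 2>/dev/null)   # command substitution kept from test.sh
    echo "$LINE"
    i=$((i+1))
  done
}

# Reader: the echo runs in a subshell, so the potentially blocking write
# is issued by a child process rather than by the looping shell itself.
relay() {
  while read datafile
  do
    (echo "$datafile")
  done
}

# Count the bytes that make it through both loops.
produce | relay | wc -c
```

To exercise the blocked-reader case from the question, replace the final wc -c with (sleep 3; wc -c). The extra fork per line makes the loop slower, so this is a stopgap rather than a fix.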