Bug 1064033 - ksh wait function doesn't appear to work in higher versions of rhel5
Summary: ksh wait function doesn't appear to work in higher versions of rhel5
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: ksh
Version: 5.10
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: ---
Assignee: Michal Hlavinka
QA Contact: BaseOS QE - Apps
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2014-02-11 20:49 UTC by Dave Sullivan
Modified: 2018-12-04 17:26 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-02-12 14:41:54 UTC



Description Dave Sullivan 2014-02-11 20:49:18 UTC
Description of problem:

The customer noticed the problem when going from 5.5z to 5.8+.



Here is my environment: a fully updated RHEL 5.9+ box, fully-updated-rhel5.


[root@fully-updated-rhel5 ~]# uname -a
Linux fully-updated-rhel5.sullyvon.com 2.6.18-348.16.1.el5 #1 SMP Sat Jul 27 01:05:23 EDT 2013 x86_64 x86_64 x86_64 GNU/Linux
[root@fully-updated-rhel5 ~]# rpm -qa | egrep "openssh|ksh"
openssh-4.3p2-82.el5
openssh-clients-4.3p2-82.el5
ksh-20100621-18.el5
openssh-server-4.3p2-82.el5
openssh-askpass-4.3p2-82.el5

On this box we have a driver script:

[root@fully-updated-rhel5 ~]# cat wait_test_5u2.sh 
#!/bin/ksh

for i in 192.168.1.81
do
  ssh $i /tmp/test.sh &
done

wait

echo "FINISHED"

It connects to the remote machine 192.168.1.81 and executes /tmp/test.sh there:

[root@unknown52540067dd0b ~]# ifconfig eth0 | grep 192
          inet addr:192.168.1.81  Bcast:192.168.1.255  Mask:255.255.255.0
[root@unknown52540067dd0b ~]# uname -a
Linux test5u2 2.6.18-92.el5 #1 SMP Tue Apr 29 13:16:15 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux

[root@unknown52540067dd0b ~]# cat /tmp/test.sh 
#!/bin/ksh
echo "`date` : EXECUTING $0 on `hostname`"

grep -ri "TEST" /* >> /tmp/test.out &

echo "`date` : FINISHED $0 on `hostname`"

This is our baseline. It works as expected: the driver script keeps waiting and does not exit until I press Ctrl-C.

[root@fully-updated-rhel5 ~]# strace -o /tmp/strace_remote_5u2.strace ./wait_test_5u2.sh
Tue Feb 11 20:15:03 CST 2014 : EXECUTING /tmp/test.sh on test5u2
Tue Feb 11 20:15:03 CST 2014 : FINISHED /tmp/test.sh on test5u2
grep: /dev/gpmctl: No such device or address
grep: /dev/log: No such device or address
grep: /dev/mixer: Invalid argument
grep: /dev/audio: Input/output error

The strace shows wait4 blocking until I press Ctrl-C, at which point it is interrupted by SIGINT and reports ERESTARTSYS:

wait4(-1, 0x7fff862dfd04, WSTOPPED|WCONTINUED, NULL) = ? ERESTARTSYS (To be restarted)
--- SIGINT (Interrupt) @ 0 (0) ---

The customer has complained that after upgrading from 5.5z to 5.8 this test now shows a problem.

I can recreate the problem against a 5.6 remote.

Using the same driver host as above, but now targeting a 5.6 remote, the driver script exits right away:

[root@fully-updated-rhel5 ~]# cat wait_test_5u6.sh 
#!/bin/ksh

for i in 192.168.1.82
do
  ssh $i  /tmp/test.sh &
done

wait

echo "FINISHED"

[root@unknown525400270209 ~]# uname -a
Linux unknown525400270209 2.6.18-238.el5 #1 SMP Sun Dec 19 14:22:44 EST 2010 x86_64 x86_64 x86_64 GNU/Linux
[root@unknown525400270209 ~]# ifconfig eth0 | grep 192
          inet addr:192.168.1.82  Bcast:192.168.1.255  Mask:255.255.255.0
[root@unknown525400270209 ~]# cat /tmp/test.sh 
#!/bin/ksh
echo "`date` : EXECUTING $0 on `hostname`"

grep -ri "TEST" /* >> /tmp/test.out &

echo "`date` : FINISHED $0 on `hostname`"

[root@fully-updated-rhel5 ~]# strace -o /tmp/strace_remote_5u6.strace ./wait_test_5u6.sh
Tue Feb 11 15:21:25 EST 2014 : EXECUTING /tmp/test.sh on unknown525400270209
Tue Feb 11 15:21:25 EST 2014 : FINISHED /tmp/test.sh on unknown525400270209
FINISHED

The strace does show a difference. Here wait4 returns a child PID (2943), and the follow-up calls report that no children remain:

wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WSTOPPED|WCONTINUED, NULL) = 2943
--- SIGCHLD (Child exited) @ 0 (0) ---
rt_sigreturn(0x11)                      = 2943
wait4(-1, 0x7fff881cf424, WNOHANG|WSTOPPED|WCONTINUED, NULL) = -1 ECHILD (No child processes)
wait4(-1, 0x7fff881cf424, WNOHANG|WSTOPPED|WCONTINUED, NULL) = -1 ECHILD (No child processes)
rt_sigaction(SIGCHLD, {0x424a50, [], SA_RESTORER|SA_INTERRUPT, 0x3dd1c302d0}, {0x424a50, [], SA_RESTORER|SA_INTERRUPT, 0x3dd1c302d0}, 8) = 0

I am looking for collaboration on this, as it might be a kernel issue.

Comparing the first wait4 call from each strace:

5u2
wait4(-1, 0x7fff862dfd04, WSTOPPED|WCONTINUED, NULL) = ? ERESTARTSYS (To be restarted)

5u6
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WSTOPPED|WCONTINUED, NULL) = 2943

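The two calls differ in what comes back: on 5u2 wait4 is interrupted (ERESTARTSYS) without filling in a status, while on 5u6 it returns PID 2943 with a status indicating a normal exit with code 0. For reference, the prototype is: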

 pid_t wait4(pid_t pid, int *status, int options,
             struct rusage *rusage);
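
For context, the shell's wait builtin surfaces that status as $?. A minimal sketch, not taken from the customer's scripts (the subshell and the exit code 3 are made up for illustration), shows how to check that wait really did reap a given child and what it exited with:

```shell
#!/bin/sh
# Hypothetical child: a background subshell that exits with a known status.
( exit 3 ) &
child=$!

# wait <pid> blocks until that child has exited and returns its exit status.
wait "$child"
status=$?

echo "child $child exited with status $status"
```

Recording $! for each ssh started in the driver loop and waiting on each PID in turn would make it visible whether ssh itself returned early, as opposed to wait failing to block.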

Additional info:

If the stdout redirection is removed from line 3 of the script test.sh (the grep command), then on RHEL 5.8 as well the “wait” (line 6) in the script “testwait.sh” waits until grep finishes.

Conversely, if a stderr redirection is added to line 3 of test.sh, in addition to the current stdout redirection, then even on RHEL 5.3 the “wait” (line 6) in “testwait.sh” DOES NOT wait for grep to finish.
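
These observations fit a general pipe rule: a reader sees end-of-file only after every process holding the write end has closed it. The sketch below is a local stand-in, not the customer's setup. It assumes that ssh behaves like the pipe reader here and stays connected while a remote background job still holds stdout or stderr, which this report does not confirm. The names remote_job and time_session are hypothetical, and sleep stands in for the long-running grep.

```shell
#!/bin/sh
# Hypothetical stand-in for the remote test.sh: start a background job with
# stdout redirected, optionally redirect stderr too, then return at once.
remote_job() {
    if [ "$1" = "detach" ]; then
        sleep 2 >/dev/null 2>&1 &   # job holds neither end of the pipe
    else
        sleep 2 >/dev/null &        # job still inherits stderr (the pipe)
    fi
    echo "FINISHED remote script"
}

# The pipe reader (cat) plays the role assumed for ssh: it returns only once
# all writers have closed the pipe. Uses date +%s (GNU/BSD date) for timing.
time_session() {
    start=$(date +%s)
    remote_job "$1" 2>&1 | cat >/dev/null
    end=$(date +%s)
    elapsed=$((end - start))
}

time_session detach; detached=$elapsed
time_session attach; attached=$elapsed

echo "stdout and stderr redirected: reader returned after ${detached}s"
echo "only stdout redirected:       reader returned after ${attached}s"
```

With only stdout redirected, the reader returns when the background job ends. With both streams redirected, it returns as soon as the script itself finishes.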

Comment 3 Dave Sullivan 2014-02-12 14:41:54 UTC
Agreed, closing. Thanks for the follow-up, Filip.

