It was a sort of a race condition.
The problem was that if the FORK function was really slow (e.g., it
GC'd), and the child process died really fast, then the scsh parent
would get a SIGCHLD interrupt, and reap the dead child before FORK
had a chance to stick the child's process-object data structure in
scsh's process-object table. When the interrupt handler then tried to look
up the dead child's process object (so it could stash the child's exit
status away), it was puzzled to find no object registered for that pid.
So it reported an error.
The fix is to make the pair of operations, forking the child and building
its process object, happen atomically by locking out interrupts. While I was
at it, I made the can't-find-the-pid's-process-object case trigger a warning
instead of an error, so that random child processes unknown to scsh (perhaps
started by alien C code?) don't blow it up anymore.
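For readers who don't have the scsh sources handy, here is a rough POSIX
analogue of the same idea in Python. It is not scsh code and not the patch;
it is only a sketch of the technique. The names procs, on_sigchld and
fork_registered are made up for illustration, and blocking SIGCHLD with
pthread_sigmask (Unix only) stands in for locking out interrupts.

```python
import os
import signal
import warnings

procs = {}  # hypothetical pid -> record table (stands in for scsh's table)

def on_sigchld(signum, frame):
    """Reap every child that has exited and stash its status."""
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            return              # no children left at all
        if pid == 0:
            return              # children exist, but none has exited yet
        rec = procs.get(pid)
        if rec is None:
            # Unknown child: warn rather than raise.
            warnings.warn(f"Exiting child pid {pid} has no proc object.")
        else:
            rec["status"] = status

def fork_registered():
    """Fork and register the child while SIGCHLD is held pending."""
    old = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGCHLD})
    try:
        pid = os.fork()
        if pid == 0:
            os._exit(7)         # child dies at once: the worst case
        procs[pid] = {"status": None}   # registered before any reaping
    finally:
        # Restore the old mask; a pending SIGCHLD is delivered after this.
        signal.pthread_sigmask(signal.SIG_SETMASK, old)
    return pid
```

To try it, install the handler with signal.signal(signal.SIGCHLD, on_sigchld)
and call fork_registered(). However quickly the child exits, the handler
cannot run until the table entry exists.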
We'll have this fix in the new release. But you can patch 0.5.0 by doing the
following.
1. Change ADD-REAPED-PROC! in scsh/procobj.scm to the following:
-------------------------------------------------------------------------------
;;; Add a newly-reaped proc to the list.
(define (add-reaped-proc! pid status)
  ((with-enabled-interrupts 0
     (cond ((maybe-pid->proc pid) =>
            (lambda (proc)
              (set-proc:%status proc status)
              (set! reaped-procs (cons (make-weak-pointer proc)
                                       reaped-procs))
              (lambda () #f)))
           (else (lambda ()     ; Do this w/interrupts enabled.
                   (warn "Exiting child pid has no proc object." pid
                         status)))))))
-------------------------------------------------------------------------------
2. Change REALLY-FORK in scsh/scsh.scm to the following:
-------------------------------------------------------------------------------
(define (really-fork clear-interactive? maybe-thunk)
  ((with-enabled-interrupts 0
     (let ((pid (%%fork)))
       (if (zero? pid)
           ;; Child
           (lambda ()           ; Do all this outside the WITH-INTERRUPTS.
             (set! reaped-procs '())
             (if clear-interactive?
                 (set-batch-mode?! #t)) ; Children are non-interactive.
             (and (pair? maybe-thunk)
                  (call-terminally (car maybe-thunk))))
           ;; Parent
           (let ((proc (new-child-proc pid)))
             (lambda () proc)))))))
-------------------------------------------------------------------------------
3. Either just do a
       make; make install
   or fire up a scsh interactively, get into the scsh internals module
   by saying
       ,in scsh-level-0
   then drop in the two bits of code above, and dump out a new scsh image.
-------------------------------------------------------------------------------
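A note on the doubled parentheses in both definitions above: the code run
with interrupts locked out does not do the slow or risky work itself. It
returns a procedure, and the outer pair of parentheses calls that procedure
once interrupts are back on. Here is a small sketch of the same shape in
Python. It is an analogy only: a threading.Lock stands in for "interrupts
disabled", and table, log and record_exit are hypothetical names.

```python
import threading

table_lock = threading.Lock()   # stands in for "interrupts disabled"
table = {}                      # hypothetical pid -> status table
log = []                        # what ran outside the critical section

def record_exit(pid, status):
    # Inside the lock: touch shared state and pick a follow-up action.
    with table_lock:
        if pid in table:
            table[pid] = status
            after = lambda: None
        else:
            # Deferred: reporting may do I/O, so keep it out of the lock.
            # The entry records whether the lock was held when this ran.
            after = lambda: log.append((pid, table_lock.locked()))
    # Outside the lock: run whatever the critical section chose.
    after()
```

The point of the design is that the critical section stays short: only the
table lookup and update are protected, and anything that might block or
signal a condition happens after the protection is dropped.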
Thank you for your patience and thanks for the bug reports.
-Olin