scsh-users

Fix for process-object problem

To: scsh-news@martigny.ai.mit.edu
Subject: Fix for process-object problem
From: Olin Shivers <shivers@lambda.ai.mit.edu>
Date: 30 Apr 1997 13:42:34 -0400
Organization: Artificial Intelligence Lab, MIT

It was a sort of race condition.

The problem was that if the FORK function was really slow (e.g., it
GC'd), and the child process died really fast, then the scsh parent
would get a SIGCHLD interrupt, and reap the dead child before FORK
had a chance to stick the child's process-object data structure in
scsh's process-object table. When the interrupt handler tried to look
up the dead child's process object (so it could stash the child's exit
status away), it was puzzled to find no object registered for that pid,
so it reported an error.

The fix is to make the fork-child/build-process-object pair of operations
happen atomically, by locking out interrupts. While I was at it, I made
the can't-find-the-pid's-process-object case trigger a warning, instead of
an error, so that random child processes unknown to scsh (perhaps started
by alien C code?) don't blow it up anymore.
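
To make the race and the fix concrete, here is a minimal sketch in portable
Scheme. It is only a single-threaded model, not scsh code: every name in it
(proc-table, deliver-sigchld!, with-interrupts-blocked, fork-and-register!)
is hypothetical, and the "child" is assumed to die immediately after the
fork, before the parent registers it.

```scheme
;; Hypothetical single-threaded model; none of these names are scsh's.
(define proc-table '())          ; alist: pid -> status (#f while running)
(define interrupts-enabled? #t)
(define pending '())             ; SIGCHLD deliveries deferred while blocked
(define events '())              ; what the handler observed, newest first

;; Model of the SIGCHLD handler: record the status, or warn if unknown.
(define (reap! pid status)
  (let ((entry (assv pid proc-table)))
    (if entry
        (begin (set-cdr! entry status)
               (set! events (cons (list 'reaped pid status) events)))
        (set! events (cons (list 'warning-no-proc-object pid) events)))))

;; Deliver a SIGCHLD now, or queue it if interrupts are locked out.
(define (deliver-sigchld! pid status)
  (if interrupts-enabled?
      (reap! pid status)
      (set! pending (append pending (list (cons pid status))))))

;; Run THUNK with interrupts locked out, then run anything deferred.
(define (with-interrupts-blocked thunk)
  (set! interrupts-enabled? #f)
  (let ((result (thunk)))
    (set! interrupts-enabled? #t)
    (let loop ()
      (if (pair? pending)
          (let ((p (car pending)))
            (set! pending (cdr pending))
            (reap! (car p) (cdr p))
            (loop))))
    result))

;; "Fork" a child that dies before the parent gets to register it.
(define (fork-and-register! pid)
  (deliver-sigchld! pid 0)                            ; child exits at once
  (set! proc-table (cons (cons pid #f) proc-table))   ; parent registers late
  pid)

(fork-and-register! 101)                                        ; unprotected
(with-interrupts-blocked (lambda () (fork-and-register! 102)))  ; protected

(define result (reverse events))
;; result is ((warning-no-proc-object 101) (reaped 102 0))
```

In the unprotected call the handler runs before pid 101 is in the table, so
it can only warn. In the protected call the delivery is held until the entry
for pid 102 exists, so the status is recorded normally.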

We'll have this fix in the new release. But you can patch 0.5.0 by doing the
following.

1. Change ADD-REAPED-PROC! in scsh/procobj.scm to the following:
-------------------------------------------------------------------------------
;;; Add a newly-reaped proc to the list.
(define (add-reaped-proc! pid status)
  ((with-enabled-interrupts 0
     (cond ((maybe-pid->proc pid) =>
            (lambda (proc)
              (set-proc:%status proc status)
              (set! reaped-procs (cons (make-weak-pointer proc)
                                       reaped-procs))
              (lambda () #f)))
           (else (lambda ()     ; Do this w/interrupts enabled.
                   (warn "Exiting child pid has no proc object."
                         pid status)))))))
-------------------------------------------------------------------------------

2. Change REALLY-FORK in scsh/scsh.scm to the following:
-------------------------------------------------------------------------------
(define (really-fork clear-interactive? maybe-thunk)
  ((with-enabled-interrupts 0
     (let ((pid (%%fork)))
       (if (zero? pid)                          

           ;; Child
           (lambda ()   ; Do all this outside the WITH-INTERRUPTS.
             (set! reaped-procs '())
             (if clear-interactive?
                 (set-batch-mode?! #t)) ; Children are non-interactive.
             (and (pair? maybe-thunk)
                  (call-terminally (car maybe-thunk))))

           ;; Parent
           (let ((proc (new-child-proc pid)))
             (lambda () proc)))))))
-------------------------------------------------------------------------------

3. Either just do a
    make; make install
   or fire up scsh interactively, get into the scsh internals module
   by saying
    ,in scsh-level-0
   then drop in the two bits of code above, and dump out a new scsh image.
-------------------------------------------------------------------------------
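
Both patched procedures use the same idiom: the body that runs with
interrupts locked out returns a thunk, and the extra pair of parentheses
around the whole form calls that thunk after the locked-out section has been
left. The sketch below shows only the shape of that idiom in portable Scheme.
The names critical, note! and lookup-or-warn are hypothetical stand-ins, and
critical merely counts nesting depth rather than touching real interrupts.

```scheme
;; Hypothetical stand-in for a section that runs with interrupts off.
(define depth 0)                 ; > 0 while "inside" the critical section
(define trace '())               ; recorded steps, newest first

(define (note! what)
  (set! trace
        (cons (list what (if (> depth 0) 'inside 'outside)) trace)))

(define (critical thunk)
  (dynamic-wind
    (lambda () (set! depth (+ depth 1)))
    thunk
    (lambda () (set! depth (- depth 1)))))

;; Do the short bookkeeping inside; hand back the rest as a thunk,
;; which the outer pair of parentheses calls once we are outside.
(define (lookup-or-warn table key)
  ((critical
    (lambda ()
      (note! 'lookup)
      (if (assq key table)
          (lambda () (note! 'found) #t)
          (lambda () (note! 'warn) #f))))))

(define answer (lookup-or-warn '((a . 1)) 'b))
(define result (reverse trace))
;; answer is #f; result is ((lookup inside) (warn outside))
```

The point of the design is that only the table lookup happens inside the
critical section, while anything slow or noisy (here, the warning) runs
after it has been exited.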

Thank you for your patience and thanks for the bug reports.
    -Olin
