scsh-users
[Top] [All Lists]

find in Scsh -- functional systems hacking

To: scsh@martigny.ai.mit.edu
Subject: find in Scsh -- functional systems hacking
From: shivers@ai.mit.edu (Olin Shivers)
Date: 06 Feb 1996 09:29:31 -0500
Organization: Artificial Intelligence Lab, MIT
Reply-to: shivers@ai.mit.edu
   From: shriram@mahasamatman.cs.rice.EDU (Shriram Krishnamurthi)
   Subject: find in Scsh
   Newsgroups: alt.lang.scheme.scsh
   I suspect this topic has been done to death, but I'll ask anyway.  As
   a disclaimer, I should add that I'm using documentation that is many
   months old, but the behavior doesn't seem to have changed.

You are not out-of-date, Shriram.

   What was the rationale behind the behavior of file-match, ie, the
   restriction on the depth of traversal?  

It's a simple matcher. You don't always want general recursion.
A generally recursive file-system matcher is a much more complex
piece of machinery. I have designed two; I'll append them. I wrote
mine when Bill Sommerfeld sent me a file-tree walker he wrote; my
code is based on his.

I haven't released this code because I'd like to work out a convenient
macro for doing find operations, but haven't managed to come up with a
design with which I'm really happy. But your post has pushed me over
the edge. Here's the code. If anyone does anything interesting with it,
please post your results to the list.

These routines are very powerful abstractions. Too powerful for casual use,
actually. The machinery is sufficiently complex that you would either
        (1) want to hide it behind some macro sugar, or
        (2) build some more special-purpose tools on top of these two
            procedures.

Note that a big difference between these procedures and the Unix find(1)
program is that Scheme code can pass values around, whereas find can only
invoke programs that don't communicate up and down the tree. So we have
the potential to compute functions over the file tree. For example, we
could easily do bottom-up calculations such as "maximum-size file in the
tree" or top-down calculations such as "rename all .c files occuring
somewhere under any STASH directory to be .c.stash".

I'll remind you guys we are taking submissions to the contributed-code
library...
        -Olin
-------------------------------------------------------------------------------
;;; Written by Olin Shivers and Bill Sommerfeld. October 1995.
;;; Two procedures for walking over and operating on Unix file trees,
;;; somewhat like the Unix find(1) utility. I'm not sure which one is the
;;; right one, so I'll provide both, and you can try them out and see which
;;; one fits your needs.

;;; (reduce-file-tree root prefunc leaffunc postfunc . state-vals)
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; Apply a triple of procedures across a directory tree in the file-system.
;;; This procedure is to file trees as the REDUCE operator is to lists --
;;; a basic control-structure abstraction.
;;;
;;; The three procedures PREFUNC, LEAFFUNC, and POSTFUNC are all of the
;;; form
;;;     (func file-name info . state-vals) => state-vals'
;;; That is, they are applied to a file-name, its corresponding FILE-INFO
;;; record, and a set of state values, returning a new set of state values.
;;; They must produce as many state values as they consume; this can be
;;; zero, one, or more. As a special, ugly exception to save you from lots
;;; of tediousness, if there are zero initial state values, then these
;;; procedures can return as many values as they like, which will simply
;;; be ignored -- R-F-T will simply pretend they returned zero values.
;;; Another special-case hack: if the function is given as #F, then it
;;; is taken to be the identity function on the state values.
;;;
;;; R-F-T descends the directory tree starting at ROOT.
;;; - Base case:
;;;   If ROOT is not a directory, it simply applies LEAFFUNC to ROOT and
;;;   returns the resulting state values.
;;;
;;; - Recursive case:
;;;   If ROOT is a directory, R-F-T first applies PREFUNC to ROOT, producing
;;;   a new set of state values, then it recurses across all of the files
;;;   in ROOT, iteratively transforming the state values, then it finally
;;;   applies POSTFUNC to ROOT, and returns the final set of state values.
;;; 
;;; If R-F-T is unable to stat ROOT, it silently returns the original
;;; state values. If R-F-T is unable to obtain the files in a directory,
;;; it acts as if there are none. This is an arguable design decision.

;;; Unresolved issues: 
;;; - The FILE-INFO op doesn't chase symlinks.
;;;   Perhaps a flag? Ech. Perhaps two different procs?
;;; - What do we do if one of the FUNCS raises an exception?
;;;   Catch it and treat that application of the child proc as an identity
;;;   op? Let the exception propagate and blow us out of the water (this
;;;   is what we do currently)?
;;; - The multiple-value hacks make this code much uglier than it would
;;;   otherwise be. ML does it right.
;;; - Are there useful special-cases we could build -- bottom-up
;;;   or top-down recursions, etc.?

(define (reduce-file-tree root prefunc leaffunc postfunc . state)
  ;; These ...-carefully procs return #f or '() if there's an error.
  (let* ((file-info-carefully (lambda (f)
                                (with-errno-handler ((errno misc) (else #f))
                                  (file-info f #f))))
         (directory-files-carefully (lambda (f)
                                      (with-errno-handler
                                          ((errno misc) (else '()))
                                        (directory-files f))))

         ;; Default #f values for the ...FUNCS.
         (state-identity (lambda (root info . state) (apply values state)))
         (prefunc  (or prefunc  state-identity))
         (leaffunc (or leaffunc state-identity))
         (postfunc (or postfunc state-identity)))

    (letrec ((rft (lambda (root . state) ; This is the core.
                    ;; "What is this thing called ROOT?"
                    (let ((info (file-info-carefully root)))
                      (if (not info) (apply values state) ;Unix is shy; give up

                          (case (file-info:type info)

                            ((directory)    ; ROOT is a directory => recurse.
                             (receive state (apply prefunc root info state)
                               (let* ((dir (file-name-as-directory root))
                                      (g (lambda (f . s)
                                           (apply rft (string-append dir f) s)))
                                      (files (directory-files-carefully root)))
                                 (receive state (apply mv-reduce g files state)
                                   (apply postfunc root info state)))))
                            
                            (else (apply leaffunc root info state)))))))

             ;; Ugh. I'm completely special-casing the pure-side-effect one.
             (rft0 (lambda (root)
                     (let ((info (file-info-carefully root)))
                       (if info
                           (case (file-info:type info)
                             ((directory)
                              (prefunc root info)
                              (let ((dir (file-name-as-directory root)))
                                (for-each (lambda (f) (rft0 (string-append dir 
f)))
                                          (directory-files-carefully root)))
                              (postfunc root info))
                             (else (leaffunc root info))))))))
    
      (if (pair? state)
          (apply rft root state)
          (begin (rft0 root) (values))))))


;;; (file-tree-recur root dirfunc leaffunc . state-vals)
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; Apply a pair of procedures across a directory tree in the file-system.
;;; This procedure is to file trees as the REDUCE operator is to lists --
;;; a basic control-structure abstraction.
;;;
;;; F-T-R descends the directory tree starting at ROOT.
;;; - Base case:
;;;   If ROOT is not a directory, it simply calls and returns the values of
;;;     (LEAFFUNC root info . state-vals)
;;;   where INFO is the FILE-INFO record for file ROOT.
;;; 
;;; - Recursive case:
;;;   If ROOT is a directory, F-T-R calls and returns the values of
;;;     (DIRFUNC ROOT info elts recur . state-vals)
;;;   where INFO is the FILE-INFO record for directory ROOT,
;;;         ELTS is the list of the filenames in directory ROOT, and
;;;         (RECUR f . state-vals) is a procedure that is equivalent to
;;;             (FILE-TREE-RECUR f DIRFUNC LEAFFUNC . state-vals)
;;;   DIRFUNC can recurse over the ELTS list using explicit loops,
;;;   MAP, FOR-EACH, MV-REDUCE (see below) or other mechanisms, together 
;;;   with RECUR.
;;;
;;; Both LEAFFUNC and DIRFUNC must return exactly as many values as there are
;;; state values, which can be zero, one or more. As a special, ugly exception
;;; to save you from lots of tediousness, if there are zero initial state
;;; values, then these procedures can return as many values as they like,
;;; which will simply be ignored -- F-T-R will simply pretend they returned
;;; zero values.

;;; If F-T-R is unable to stat ROOT, it silently returns the original
;;; state values. If F-T-R is unable to obtain the files in a directory,
;;; it acts as if there are none. This is an arguable design decision.

(define (file-tree-recur root dirfunc leaffunc . state)
  ;; These ...-carefully procs return #f or '() if there's an error.
  (letrec ((file-info-carefully (lambda (f)
                                  (with-errno-handler ((errno misc) (else #f))
                                    (file-info f #f))))
           (directory-files-carefully (lambda (f)
                                        (with-errno-handler
                                            ((errno misc) (else '()))
                                          (directory-files f))))

           ;; This is the real core of the procedure.
           (rft (lambda (root . state)
                  ;; "What is this thing called ROOT?"
                  (let ((info (file-info-carefully root)))
                    (if (not info) (apply values state) ; Unix is shy; give up.

                        (case (file-info:type info)

                          ((directory)  ; ROOT is a directory => recurse.
                           (let ((dir (file-name-as-directory root)))
                             (apply dirfunc root info
                                    (map (lambda (f) (string-append dir f))
                                         (directory-files-carefully root))
                                    rft
                                    state)))

                          (else (apply leaffunc root info state)))))))

           ;; Ugh. I'm completely special-casing the pure-side-effect one.
           (rft0 (lambda (root)
                   (let ((info (file-info-carefully root)))
                     (if info
                         (case (file-info:type info)
                           ((directory)
                            (let ((dir (file-name-as-directory root)))
                              (dirfunc root info
                                       (map (lambda (f) (string-append dir f))
                                            (directory-files-carefully root))
                                       rft0))
                           (else (leaffunc root info))))))))
    
    (if (pair? state)
        (apply rft root state)
        (begin (rft0 root) (values)))))
           

;;; Multiple-value list-reduce: apply (f elt . sv) => sv' across the list.

(define (mv-reduce f lis . state-vals)
  (letrec ((recur (lambda (lis . state-vals)
                    (if (pair? lis)
                        (let ((elt (car lis))
                              (lis (cdr lis)))
                          (receive state-vals (apply f elt state-vals)
                            (apply recur lis state-vals)))
                        (apply values state-vals)))))
    (apply recur lis state-vals)))

<Prev in Thread] Current Thread [Next in Thread>