I was wondering what kind of performance people have been getting
from the scsh awk construct.
I am currently writing a script in scsh that uses the output of the
UNIX cpp to extract the #include's from a .c file and build a
dependency file for make. Here are two versions of a function that
extracts the #include's; the first uses awk, the second doesn't:
(note: put-nondup-in-list just inserts an element in a list, making
sure there are no duplicates)
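A minimal sketch of what such a helper could look like. This is an
assumed definition, not necessarily the one used in the script; the
only thing the two functions below rely on is that it returns the
updated list:

```scheme
;; Hypothetical definition of put-nondup-in-list.
;; Returns lst unchanged if elem is already present (compared with
;; equal?, so strings match by content); otherwise returns a new list
;; with elem added at the front.
(define (put-nondup-in-list elem lst)
  (if (member elem lst)
      lst
      (cons elem lst)))
```

With this definition, the membership check is linear in the length of
the list, which stays cheap for a result of around 10 elements.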
(define (get-includes port-in)
  (let ((read (field-reader (infix-splitter "[ \t\n\"]"))))
    (awk (read port-in) (line fields) #f ((include-list '()))
      ("^(# 1 \")(.*h)(\")"
       (put-nondup-in-list (nth fields 3) include-list)))))
(define (get-includes-old port-in)
  (letrec ((get-line (record-reader))
           (get-include
            (lambda (include-list)
              (let ((line (get-line port-in)))
                (if (eof-object? line)
                    include-list
                    (let ((match (string-match "^(# 1 \")(.*h)(\")" line)))
                      (if (regexp-match? match)
                          (let ((string (match:substring match 2)))
                            (get-include
                             (put-nondup-in-list string include-list)))
                          (get-include include-list))))))))
    (get-include '())))
I then timed the functions (using the scsh time+ticks syscall)
and found that for a 5000-line file, the awk version took about
10 seconds and the non-awk version around 4 seconds. The output
list ended up with about 10 elements, with only about 8 or so
matches weeded out by put-nondup-in-list. I ran this on an SGI
multiprocessor.
Any ideas? I'm further wondering why the second version even took
4 seconds ... (FYI -- the run/port used to fork cpp took less than
1 second).
Thanks,
Mike
--
Michael Hicks
Ph.D. student, the University of Pennsylvania
mwh@gradient.cis.upenn.edu
"In studying the way, realizing it is hard; once you have realized it,
preserving it is hard. When you can preserve it, putting it into practice
is hard." -- Zen saying