scsh-users

Re: Regular expressions

To: scsh-news@zurich.ai.mit.edu
Subject: Re: Regular expressions
From: lord@emf.emf.net (Tom Lord)
Date: Mon, 04 Mar 2002 20:46:49 -0000
Organization: emf.net -- Quality Internet Access. (510) 704-2929 (Voice)
        Michel Schinz:
        But couldn't we imagine a regular expression matcher that fetches
        characters from ports as it needs? I sometimes wish I had a "lex"
        macro in scsh, providing something similar to what the (f)lex tools
        provides for C (and other languages). I find it sad that I have to
        revert to "read-char" as soon as I need to analyze text files which
        cannot be read line-by-line (e.g. any source file of a language
        allowing multi-line comments).

For general POSIX regexps, it isn't quite as simple as fetching
characters as needed -- the matcher needs random access to
characters already matched (to back up after a failed alternative,
for example, or to compare against a backreference).
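
To make the point concrete, here is a minimal sketch in Python (not scsh, and not any real engine's code). The names PortBuffer, match_here and match_star are made up for illustration. It is a toy backtracking matcher that understands only literals, "." and "*", nowhere near full POSIX semantics. It pulls characters from a port only when it needs them, but it has to keep every character it has read, because the "*" case backs up and re-reads them.

```python
import io


class PortBuffer:
    """Reads characters from a port on demand and keeps them for re-reading."""

    def __init__(self, port):
        self.port = port
        self.chars = []  # every character fetched so far, in order

    def char_at(self, i):
        # Pull characters from the port only until index i is available.
        while len(self.chars) <= i:
            c = self.port.read(1)
            if c == "":
                return None  # end of input
            self.chars.append(c)
        return self.chars[i]


def match_here(pat, buf, i):
    """Return the end index of a match of pat starting at index i, or None."""
    if pat == "":
        return i
    if len(pat) >= 2 and pat[1] == "*":
        return match_star(pat[0], pat[2:], buf, i)
    c = buf.char_at(i)
    if c is not None and (pat[0] == "." or pat[0] == c):
        return match_here(pat[1:], buf, i + 1)
    return None


def match_star(atom, rest, buf, i):
    # Greedy: consume as many copies of atom as possible...
    j = i
    while True:
        c = buf.char_at(j)
        if c is None or not (atom == "." or atom == c):
            break
        j += 1
    # ...then back up one character at a time until the rest matches.
    # This backing up is the random access to already-read characters.
    while j >= i:
        end = match_here(rest, buf, j)
        if end is not None:
            return end
        j -= 1
    return None


port = io.StringIO("aaab rest")
buf = PortBuffer(port)
end = match_here("a*ab", buf, 0)
```

Here "a*" first swallows all three a's, fails to find the following "a", and has to give one back. The port itself could never un-read that character; only the retained buffer makes the retry possible. Everything after the match is still sitting unread in the port.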

However, there is no need to use only your imagination.  The Rx
regexp engine permits matching over non-contiguous, dynamically
constructed strings.  Between the POSIX functions and some
lower-level functions, it has everything you need to make a fast
lexer -- I use it for that purpose in Systas Scheme.  See
http://www.regexps.com.
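
As a rough illustration of what matching over non-contiguous input means, here is a hand-rolled sketch in Python. This is not Rx's API and says nothing about how Rx is implemented; CommentScanner and its methods are hypothetical names. It hard-codes a small DFA for a C-style comment and carries its state from one fragment to the next, so the text never has to be glued into a single string.

```python
class CommentScanner:
    """Hand-written DFA for a C-style /* ... */ comment, fed in fragments."""

    START, SLASH, BODY, STAR, DONE, DEAD = range(6)

    def __init__(self):
        self.state = self.START
        self.consumed = 0  # characters examined so far, across all fragments

    def _step(self, state, ch):
        if state == self.START:
            return self.SLASH if ch == "/" else self.DEAD
        if state == self.SLASH:
            return self.BODY if ch == "*" else self.DEAD
        if state == self.BODY:
            return self.STAR if ch == "*" else self.BODY
        # state == STAR: a '*' was just seen inside the comment body
        if ch == "/":
            return self.DONE
        return self.STAR if ch == "*" else self.BODY

    def feed(self, fragment):
        """Advance over one fragment; return True once the comment has closed."""
        for ch in fragment:
            if self.state in (self.DONE, self.DEAD):
                break
            self.state = self._step(self.state, ch)
            self.consumed += 1
        return self.state == self.DONE


scanner = CommentScanner()
results = [scanner.feed(piece)
           for piece in ("/* multi", "-line\n *", "*/ tail")]
```

The comment is split across three fragments, including in the middle of its closing delimiter, and the scanner still recognizes it because the only thing carried between calls is the DFA state and a character count. A pure DFA like this needs no backing up, which is why it suits a lexer; the general POSIX case with submatches is harder.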

Primitives for fast I/O in scheme are a persistent need and Systas
has some nice examples of those, too.

Systas has a process-management interface largely cribbed from SCSH,
but it would need a few more lines of code to actually be source-level
compatible.

(In many other ways, though, Systas basically sucks and I don't recommend
using it for anything critical -- it's just a place to consider grabbing
ideas and techniques for better regexp support and neat I/O primitives.)

-t

