Two days ago I suggested that I would rather have a procedural interface to
my matching tools. My point was that designing a -notation-, rather than
some data types and operators, seemed to quickly get us into programming
language design issues. (Should our notation support `lambda' as well as
`let'? How do we handle variable binding? Should we have macros?)
Unfortunately I phrased my suggestion this way:
What I really want is a toolkit of procedures that operate on regular
languages.
I said "regular languages" because that seemed to be the obvious abstract
data type for a toolkit of operators to work with. Olin replied:
I considered this, as well. But I rejected it because I don't think you
do want the ability to operate on regular languages. For example, regular
languages are closed under intersection and set-complement....
Which got us into the current discussion about the tractability of various
operations on automata. Indeed, set-complement and intersection may be
something you want to avoid [*], but that is actually an -orthogonal-
issue to the issue I was trying to raise. I want to make sure that my
original point was understood properly, so let me say it again in a
different way:
Looking at examples like this make me realize that what I really want
isn't an S-expression notation for regular languages. What I really want
is a toolkit of procedures.
I'm not trying to address the issue of what the procedures operate -on-,
I'm just saying that I want procedures, and -not- a S-expression based
notation.
One problem that might be solved by thinking purely in procedural terms is
the problem of naming "submatches". The usual string-based method is to
number the submatches by counting the open parens from the beginning of the
expression. Unfortunately, this is a horribly non-compositional naming
scheme. If you construct a new regular expression out of some existing
regular expressions, the numbering of all the submatches in the new
expression is some complicated function of the numberings from the original
expressions. Yuck.
It's possible that a cool notation solves this problem neatly. But I think
it much -more- likely that a procedural toolkit can solve it neatly.
- Alan
[*] I found them very useful when I wrote the tokenizer for the Lisp
Machine's reader -- but never mind.
|