scsh-users
[Top] [All Lists]

Re: Regexp notation

To: scsh@martigny.ai.mit.edu
Subject: Re: Regexp notation
From: haertel@ichips.intel.com (Mike Haertel)
Date: 10 Jan 1997 05:36:23 GMT
Organization: Intel Corporation
In article <qijd8vebe9x.fsf@lambda.ai.mit.edu>,
Olin Shivers <shivers@ai.mit.edu> wrote:
>OK, I'm convinced it's handy. But it's also tricky. I do not think it is
>trivial to do complement or intersection on RE's using the standard
>notation. Systems that work by translating to classical string notation can no
>longer do a simple translation. The only way that I can think of to do this is
>to (1) compute the NDFA for the regexp(s), (2) do the complement or
>intersection op on the NDFA, (3) run the state-minimisation algorithm on the
>result (because intersection make a big NDFA), (4) run the NDFA -> regexp
>algorithm. This is a mess! Very slow, too. Could you preserve submatches info
>across all these operations? Anybody volunteering to write the code?

I strongly recommend that you abandon any attempt to do intersection
or complement of regular languages.  The reason is that intersection
can shorten the length of a regexp by an exponential amount, or
equivalently, grow the automaton by an expontential amount.

<Prev in Thread] Current Thread [Next in Thread>