Andreas Bernauer wrote:
> Hello,
Hi Andreas,
> can anybody help me with this? Why does using tmprx result in an
> error, but using tmp2rx works?
To me, this looks like a bug, maybe in simplify-regexp, see below.
First, some comments on what you wrote.
> Top level
> ;;
>> (fold (lambda (choice regexp)
> (rx (| (submatch ,choice) ,regexp)))
> (rx eos bos) ;; the regexp that never
> ;; matches anything
I would suggest that you use the ADT creation functions to create what
you want. Something like this:
  (re-choice (map (lambda (s) (make-re-submatch (re-string s)))
                  '("yes" "no")))
This way you do not need a seed regexp that matches nothing.
To come back to your problem, a nice way of seeing what is going on is
to use the regexp->sre function, as follows:
----------------------------------------------------------------------
> (define ab-rx (fold (lambda (choice regexp)
                        (rx (| (submatch ,choice) ,regexp)))
                      (rx eos bos)
                      '("yes" "no")))
> (regexp->sre ab-rx)
'(| (submatch "no") (| "yes" (: eos bos)))
----------------------------------------------------------------------
Here you can see that the submatch for "yes" was dropped when the
regexp was included with the comma, as you guessed. Now for the reason
why I think there is a bug:
----------------------------------------------------------------------
> (simplify-regexp ab-rx)
Error: exception
wrong-type-argument
(checked-record-ref '#{Re-choice} '#{Record-type 48 re-seq} 1)
1>
----------------------------------------------------------------------
Here I get the same error as you did. Now, if I do a round-trip with
regexp->sre and sre->regexp (the composition of which should be the
identity function, as I understand it), it works:
----------------------------------------------------------------------
> (simplify-regexp (sre->regexp (regexp->sre ab-rx)))
'#{Re-choice}
> (regexp->sre ##)
'(| (submatch "no") "yes" (: eos bos))
----------------------------------------------------------------------
But it's still not what you want. What I proposed above works, though:
----------------------------------------------------------------------
> (re-choice (map (lambda (s) (make-re-submatch (re-string s)))
                  '("yes" "no")))
'#{Re-choice}
> (regexp->sre ##)
'(| (submatch "yes") (submatch "no"))
----------------------------------------------------------------------
I find it a little strange that you want each alternative to be
wrapped in its own submatch, though. Are you sure that you do not
want to put the whole "or" part in a single submatch? If that is what
you want, you should use the following instead:
----------------------------------------------------------------------
> (make-re-submatch (re-choice (map re-string '("yes" "no"))))
'#{Re-submatch}
> (regexp->sre ##)
'(submatch (| "yes" "no"))
----------------------------------------------------------------------
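In case it helps you decide between the two variants, here is a minimal
sketch of how I would expect them to differ when matching. It assumes
scsh's regexp-search and match:substring in addition to the ADT
procedures used above. The names each-rx, whole-rx and submatches are
hypothetical, made up for this illustration only, and I have not run
this against your setup.

```scheme
;; Sketch for scsh. Assumes regexp-search and match:substring behave as
;; documented: match:substring returns #f for a submatch that did not
;; take part in the match.

;; One submatch per alternative: (| (submatch "yes") (submatch "no"))
(define each-rx
  (re-choice (map (lambda (s) (make-re-submatch (re-string s)))
                  '("yes" "no"))))

;; One submatch around the whole choice: (submatch (| "yes" "no"))
(define whole-rx
  (make-re-submatch (re-choice (map re-string '("yes" "no")))))

;; Hypothetical helper: search STR with RX and return the list of
;; submatches 1..N, or #f if RX does not match at all.
(define (submatches rx n str)
  (let ((m (regexp-search rx str)))
    (and m
         (let loop ((i n) (acc '()))
           (if (= i 0)
               acc
               (loop (- i 1) (cons (match:substring m i) acc)))))))

(submatches each-rx 2 "no")   ; one entry per alternative
(submatches whole-rx 1 "no")  ; a single entry for the whole choice
```

With each-rx, the position of the entry that is not #f tells you which
alternative matched. With whole-rx, you only get the matched text
itself, which is usually enough if you compare it afterwards anyway.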
HTH,
Michel.