## New pair of regexp replacement functions

Olin Shivers (shivers@ai.mit.edu)
27 Feb 1997 16:27:39 -0500
I have designed and implemented two new functions for doing string editing using regexps. I want to put them in the next release of scsh, so I'm describing them here for public comment: if you like them or hate them or have suggestions or criticisms, post them or mail them to me. I greatly improved my field-reader package this way when I designed the Awk package for scsh, so your comments really do have an effect.

We have two new functions,

    (substitute-regexp port match . items)
    (substitute-regexp/global port regexp string . items)

A design heuristic here is "don't encode": in this case, I have avoided inventing some string-transform language that would be embedded in a string. When you do that, the programmer has to use escape sequences to encode operations and to quote things that look like escape sequences. The procedure then has to parse this language at run time. So instead of using a Scheme string like

    "foo\\1bar\\\\bar\\1foo"

to mean

    First the string "foo", then submatch 1, then the string "bar",
    then the string "\bar", then submatch 1, then the string "foo"

I instead specify a sequence of Scheme values:

    "foo" 1 "bar\\bar" 1 "foo"

No parsing required. It would be very easy to macro-expand such a transform into equivalent code.

    (substitute-regexp port match . items)

Writes the items out to the port.

- If an item is a string, it is copied directly to the port.
- If an item is an integer, the corresponding submatch from MATCH is written to the port.
- If an item is 'PRE, the prefix of the matched string (the stuff before the match) is written to the port.
- If an item is 'POST, the suffix of the matched string (the stuff after the match) is written.
- If an item is a procedure, it is applied to zero arguments.
- If PORT is #f, nothing is written, and a string is constructed and returned instead. Procedure items must return strings in this case.

    (substitute-regexp/global port regexp string . items)

Matches REGEXP against STRING.

- If there is no match on STRING, then write nothing and return false.
- If there is a match, then write the items out to the port and return true.
  - If an item is a string, it is copied directly to the port.
  - If an item is an integer, the corresponding submatch from the match is written to the port.
  - If an item is 'PRE, the prefix of the matched string (the stuff before the match) is written to the port.
  - If an item is 'POST, the procedure recurses on the suffix string.
  - If an item is a procedure, it is applied to the regexp match.

If PORT is #f:

- If there is no match, return false.
- If there is a match, then construct and return a string as described above.
- Procedure items must return strings.

In either procedure, you may use the same submatch spec (an integer, 'PRE, or 'POST) multiple times as an item.

-Olin
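To make the item rules for `substitute-regexp` concrete, here is a minimal sketch in portable Scheme. It is not the scsh implementation. The match object is a hypothetical stand-in (the searched string plus a vector of start/end index pairs), and the names `make-sketch-match`, `sm-string`, `sm-start`, `sm-end`, `item->string` and `substitute-sketch` are invented for illustration. As a simplification, procedure items are assumed to return strings even when a port is supplied. It also assumes an `error` procedure, as found in most Scheme systems.

```scheme
;; Hypothetical stand-in for a regexp match object: the searched string
;; plus a vector of (start . end) index pairs; slot 0 is the whole match.
(define (make-sketch-match str spans) (cons str spans))
(define (sm-string m) (car m))
(define (sm-start m i) (car (vector-ref (cdr m) i)))
(define (sm-end m i) (cdr (vector-ref (cdr m) i)))

;; The string that a single item contributes to the output.
(define (item->string match item)
  (let ((s (sm-string match)))
    (cond ((string? item) item)
          ((integer? item)
           (substring s (sm-start match item) (sm-end match item)))
          ((eq? item 'pre)
           (substring s 0 (sm-start match 0)))
          ((eq? item 'post)
           (substring s (sm-end match 0) (string-length s)))
          ;; Applied to zero arguments; assumed here to return a string.
          ((procedure? item) (item))
          (else (error "unrecognised item" item)))))

;; Items are processed left to right. With a port, the result is written
;; to it; with #f, the result is returned as a string instead.
(define (substitute-sketch port match . items)
  (let loop ((items items) (pieces '()))
    (if (null? items)
        (let ((result (apply string-append (reverse pieces))))
          (if port (display result port) result))
        (loop (cdr items)
              (cons (item->string match (car items)) pieces)))))

;; Searched string "say hello world": the whole match is "hello" at 4..9,
;; and submatch 1 is "ell" at 5..8.
(define m
  (make-sketch-match "say hello world" (vector (cons 4 9) (cons 5 8))))

(substitute-sketch #f m "foo" 1 "bar\\bar" 1 "foo")
;; => "fooellbar\\barellfoo"

(substitute-sketch #f m 'pre "<" 0 ">" 'post)
;; => "say <hello> world"
```

The first call uses the item sequence from the "don't encode" example above; the second shows 'PRE and 'POST carrying the unmatched text around the replacement. The symbols are written in lower case so the sketch also behaves in case-sensitive Scheme systems.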
