From scsh-users-request@scsh.net Fri Oct 6 06:00:25 2006
To: scsh-users@scsh.net
From: William Xu
Subject: run/sexp doesn't support non-ascii characeters?
Date: Thu, 05 Oct 2006 23:00:59 +0800
Message-ID: <878xjuiz84.fsf@www.williamxu.com>

Hi all,

Seems run/sexp is unable to handle non-ASCII characters? Consider the
following; the second string consists of two Chinese characters.
> (run/sexp (echo "hi"))
'hi
> (run/sexp (echo "中国"))

Error: illegal character read
       #\ä
       #{Input-fdport #{Input-channel 3}}
1>

-- 
William

"Otherwise, please speak to a doctor about removing your head from your
ass, I believe it would be beneficial to all involved."

    -- Zephaniah E. Hull, flaming someone on a mailing list

From scsh-users-request@scsh.net Fri Oct 6 14:58:11 2006
From: Michael Sperber
To: William Xu
Cc: scsh-users@scsh.net
Subject: Re: run/sexp doesn't support non-ascii characeters?
References: <878xjuiz84.fsf@www.williamxu.com>
Date: Fri, 06 Oct 2006 14:57:19 +0200
In-Reply-To: <878xjuiz84.fsf@www.williamxu.com>

William Xu writes:

> Hi all,
>
> Seems run/sexp is unable to handle non-ASCII characters?

Well, scsh generally is not Unicode-capable, so it doesn't natively know
about, say, Chinese.

>> (run/sexp (echo "中国"))
>
> Error: illegal character read
>        #\ä
>        #{Input-fdport #{Input-channel 3}}

This, however, means that your input terminal is sending the wrong byte
sequence inside the quotes to scsh, probably one containing a " or a \
character.
-- 
Cheers =8-} Mike
Friede, Völkerverständigung und überhaupt blabla

From scsh-users-request@scsh.net Fri Oct 6 16:35:48 2006
To: scsh-users@scsh.net
From: William Xu
Subject: Re: run/sexp doesn't support non-ascii characeters?
Date: Fri, 06 Oct 2006 22:31:36 +0800
Message-ID: <87irixqzw7.fsf@www.williamxu.com>
References: <878xjuiz84.fsf@www.williamxu.com>

Michael Sperber writes:

>>> (run/sexp (echo "中国"))
>>
>> Error: illegal character read
>>        #\ä
>>        #{Input-fdport #{Input-channel 3}}
>
> This, however, means that your input terminal is sending the wrong
> byte sequence inside the quotes to scsh, probably one containing a "
> or a \ character.

Hmm, are there any workarounds?
-- 
William

From scsh-users-request@scsh.net Sat Oct 7 12:15:01 2006
From: Michael Sperber
To: William Xu
Cc: scsh-users@scsh.net
Subject: Re: run/sexp doesn't support non-ascii characeters?
References: <878xjuiz84.fsf@www.williamxu.com> <87irixqzw7.fsf@www.williamxu.com>
Date: Sat, 07 Oct 2006 12:09:27 +0200
In-Reply-To: <87irixqzw7.fsf@www.williamxu.com>

William Xu writes:

> Michael Sperber writes:
>
>>>> (run/sexp (echo "中国"))
>>>
>>> Error: illegal character read
>>>        #\ä
>>>        #{Input-fdport #{Input-channel 3}}
>>
>> This, however, means that your input terminal is sending the wrong
>> byte sequence inside the quotes to scsh, probably one containing a "
>> or a \ character.
>
> Hmm, are there any workarounds?

You need to configure your terminal differently.  scsh can't see what
you're doing.
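A side note on the error message quoted above: the character it names
lines up with the bytes of the argument itself. In UTF-8, 中国 is the
six bytes E4 B8 AD E5 9B BD, and byte 0xE4 read as ISO-8859-1 is ä. The
C sketch below only illustrates that correspondence. It assumes the
string reached scsh encoded as UTF-8, which the thread does not confirm,
and `first_byte_as_latin1` and `dump_bytes` are hypothetical helper
names, not part of scsh.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* UTF-8 encoding of U+4E2D U+56FD, written as explicit byte escapes so
   that the encoding of this source file does not matter. */
static const char zhongguo_utf8[] = "\xE4\xB8\xAD\xE5\x9B\xBD";

/* Hypothetical helper: return the first byte of s as an unsigned value,
   i.e. the code a byte-oriented ISO-8859-1 reader would see first. */
unsigned first_byte_as_latin1(const char *s)
{
    assert(s != NULL && s[0] != '\0');
    return (unsigned char)s[0];
}

/* Hypothetical helper: print every byte of s in hex, space-separated. */
void dump_bytes(const char *s)
{
    size_t i;

    assert(s != NULL);
    for (i = 0; s[i] != '\0'; i++)
        printf("%s%02X", i ? " " : "", (unsigned)(unsigned char)s[i]);
    putchar('\n');
}
```

Calling `dump_bytes(zhongguo_utf8)` prints the six bytes listed above,
and `first_byte_as_latin1(zhongguo_utf8)` yields 0xE4, the ISO-8859-1
code for ä.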
-- 
Cheers =8-} Mike
Friede, Völkerverständigung und überhaupt blabla

From scsh-users-request@scsh.net Sat Oct 7 17:04:37 2006
Message-ID: <4527C190.7090709@informatik.uni-tuebingen.de>
Date: Sat, 07 Oct 2006 17:02:40 +0200
From: Eric Knauel
To: scsh-users@scsh.net
Subject: MySQL client/server protocol implementation

I'm pleased to announce the release of the first version of Myscsh.
Myscsh is an implementation of the MySQL client/server protocol written
entirely in Scheme.  This package provides functions to connect to a
MySQL database server, authenticate, send queries, and receive and parse
result sets.
The API is quite low-level: it's just about reading and parsing messages
and sending messages.  Thus, using the low-level API of Myscsh requires
a bit of knowledge of the MySQL client/server protocol.  Future versions
of Myscsh will include a higher-level API for convenient database
programming.

Myscsh implements the MySQL 4.1 protocol (which has the internal
protocol version number 10) and has been tested in conjunction with a
MySQL 4.1 server on Linux.  It might work with a 5.x server as well, but
that's completely untested.  It won't work with 3.20 servers, that's for
sure.

* Download

The latest version is available at

* Why an implementation of the protocol?

An alternative approach for connecting to MySQL is to use the C library
libmysqlclient.so.  A lot of language implementations provide bindings
to this library.  Hence, this is a well-tested approach.  However, here
are some reasons for implementing the protocol in Scheme:

 o No C code to compile, no header files to search for, no shared
   libraries to search for, and no dynamic modules to load.  Less
   high-tech, less trouble.

 o In scsh and Scheme 48, calling C functions from Scheme blocks the
   whole Scheme system until the C function returns.  Hence, sending a
   complex query to MySQL using the corresponding C function will stop
   all Scheme threads until the SQL result is available.  This is not
   acceptable, especially since the SQL queries may take seconds to
   compute.  There is no easy way to fix this.

 o The Scheme code is quite portable and could be used by other Scheme
   implementations.  The code uses only very few Scheme 48 or scsh
   specific features, e.g. for network connections.

 o Writing C bindings is boring.  I have written too many C bindings in
   the past years.

Implementing the protocol is not a particularly original idea: for
example, there is a Ruby implementation of the 3.20 protocol.
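To give a flavour of what "reading and parsing messages" means at this
level: in the MySQL client/server protocol every packet starts with a
four-byte header, a three-byte little-endian payload length followed by
a one-byte sequence number. The C sketch below decodes such a header. It
is independent of Myscsh (which is written in Scheme and whose API is
not shown in this announcement), and the names `mysql_packet_header` and
`parse_packet_header` are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative type, not taken from Myscsh or libmysqlclient. */
struct mysql_packet_header {
    uint32_t payload_length;  /* 24-bit value, little-endian on the wire */
    uint8_t  sequence;        /* packet sequence number */
};

/* Decode the 4-byte packet header found at the start of buf.
   Returns 0 on success, -1 if fewer than 4 bytes are available. */
int parse_packet_header(const unsigned char *buf, size_t len,
                        struct mysql_packet_header *out)
{
    assert(buf != NULL && out != NULL);
    if (len < 4)
        return -1;
    out->payload_length = (uint32_t)buf[0]
                        | ((uint32_t)buf[1] << 8)
                        | ((uint32_t)buf[2] << 16);
    out->sequence = buf[3];
    return 0;
}
```

For the header bytes 2C 01 00 00 this yields a payload length of 300 and
sequence number 0; the payload itself would then be read and interpreted
according to the message type.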
* Source code repository

The latest version of the source code resides in a darcs repository at
the following address:

* Known bugs, limitations

Some things that may cause trouble:

 o The code completely ignores character encoding issues.  It just
   assumes that the character encoding the Scheme system is using is the
   right one for communicating with the server.  This works in many
   cases, but is a bad idea in principle.

Some things that have not been implemented or tested yet:

 o receiving result sets that contain binary values
 o prepared statements, parameter messages, and long data packets
 o compression

* Future work

 o Write some documentation
 o Add a high-level API for convenient database programming
 o Better error-handling using SRFI 34 and SRFI 35

* Bug reports, questions, patches, and author's address:

Please send bug reports, questions, patches directly to the author of
Myscsh:

Eric Knauel
knauel@informatik.uni-tuebingen.de

From scsh-users-request@scsh.net Mon Oct 23 05:20:04 2006
Date: Sun, 22 Oct 2006 22:19:56 -0500
From: "Matthew R. Dempsky"
To: scsh-users@scsh.net
Subject: Regexp errors on OpenBSD
Message-ID: <20061023031956.GB10866@odin.dempsky.org>

I installed scsh 0.6.7 on OpenBSD 3.9, but I'm having problems with
regular expressions:

    $ ( echo '(regexp-search (rx printing) "foo")'; echo ',exit' ) | ./go | cat -v
    Welcome to scsh 0.6.7 (R6RS)
    Type ,? for help.
    >
    Error: Posix regexp ([^M-2M-3M-9M-^A-^H^N-^_^?-M-^_]) : invalid character range
           #{Regexp}
    1>

(cat -v because the M-* and ^* sequences aren't ASCII characters and
aren't pasting properly otherwise.)

Any ideas what's up?  The same input to OpenBSD's scsh 0.6.2 package
runs without error.  I've tried using gdb to track down where the
error's originating, but after I hit the breakpoint at
posix_compile_regexp, trying to step forward at all immediately takes me
to the above error.
:/

From scsh-users-request@scsh.net Mon Oct 23 08:25:27 2006
From: Michael Sperber
To: "Matthew R. Dempsky"
Cc: scsh-users@scsh.net
Subject: Re: Regexp errors on OpenBSD
References: <20061023031956.GB10866@odin.dempsky.org>
Date: Mon, 23 Oct 2006 08:25:13 +0200
In-Reply-To: <20061023031956.GB10866@odin.dempsky.org>

"Matthew R. Dempsky" writes:

> I installed scsh 0.6.7 on OpenBSD 3.9, but I'm having problems with
> regular expressions:
>
>     $ ( echo '(regexp-search (rx printing) "foo")'; echo ',exit' ) | ./go | cat -v
>     Welcome to scsh 0.6.7 (R6RS)
>     Type ,? for help.
>     >
>     Error: Posix regexp ([^M-2M-3M-9M-^A-^H^N-^_^?-M-^_]) : invalid character range
>            #{Regexp}
>     1>
>
> (cat -v because the M-* and ^* sequences aren't ASCII characters and
> aren't pasting properly otherwise.)
>
> Any ideas what's up?  The same input to OpenBSD's scsh 0.6.2 package
> runs without error.  I've tried using gdb to track down where the
> error's originating, but after I hit the breakpoint at
> posix_compile_regexp, trying to step forward at all immediately takes
> me to the above error. :/

Yes, this has come up before.  OpenBSD's regexps, for some reason, don't
seem to allow characters above the ASCII range.

Possibly, this is a locale issue.  Have you tried running with
en_US.ISO8859-1 or whatever the local equivalent is?

(I have no explanation why it didn't occur with 0.6.2, but that was a
long time ago.)

-- 
Cheers =8-} Mike
Friede, Völkerverständigung und überhaupt blabla

From scsh-users-request@scsh.net Mon Oct 23 23:14:46 2006
Date: Mon, 23 Oct 2006 16:14:32 -0500
From: "Matthew R. Dempsky"
To: scsh-users@scsh.net
Subject: Re: Regexp errors on OpenBSD
Message-ID: <20061023211432.GA30094@odin.dempsky.org>
References: <20061023031956.GB10866@odin.dempsky.org>

On Mon, Oct 23, 2006 at 08:25:13AM +0200, Michael Sperber wrote:
> Yes, this has come up before.  OpenBSD's regexps, for some reason,
> don't seem to allow characters above the ASCII range.

A nuisance to be sure (and I'm looking into if it can be fixed), but
couldn't scsh easily avoid passing non-ASCII characters in cases where
they're unnecessary?

> Possibly, this is a locale issue.  Have you tried running with
> en_US.ISO8859-1 or whatever the local equivalent is?

I don't think OpenBSD's regex library cares about locale.  (I'd be
interested to hear if anyone can confirm that
(regexp-search (rx printing) "foo") works with scsh 0.6.7 on FreeBSD or
NetBSD.)

> (I have no explanation why it didn't occur with 0.6.2, but that was a
> long time ago.)

I tried setting a breakpoint on regcomp and seeing what string scsh
passes to the regex library, and scsh 0.6.7 passes the string mentioned
in my original post, while scsh 0.6.2 passes "[\t-\r -~]".  Any ideas
why this might have changed?
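Whether a particular regex library accepts a given bracket expression
can also be checked without scsh in the loop. The C sketch below hands a
pattern straight to POSIX regcomp() and reports the status; `try_pattern`
is a hypothetical helper written for this note, and the outcome depends
entirely on the libc (or other regex library) it is linked against.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: compile pattern as a POSIX extended regexp.
   Returns 0 if the library accepts it, otherwise the regcomp() error
   code; in that case the library's message is copied into errbuf
   (if one is supplied). */
int try_pattern(const char *pattern, char *errbuf, size_t errbuf_size)
{
    regex_t re;
    int status;

    assert(pattern != NULL);
    status = regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB);
    if (status != 0) {
        if (errbuf != NULL && errbuf_size > 0)
            regerror(status, &re, errbuf, errbuf_size);
        return status;
    }
    regfree(&re);
    return 0;
}
```

For instance, `try_pattern("[\t-\r -~]", buf, sizeof buf)` tests the
string that scsh 0.6.2 is reported to pass. The 0.6.7 string from the
original post can be tried the same way by spelling its high bytes as
octal or hex escapes; what a given platform answers for it is exactly
the open question here, so no result is claimed.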
From scsh-users-request@scsh.net Mon Oct 23 23:52:04 2006
Date: Mon, 23 Oct 2006 16:51:52 -0500
From: "Matthew R. Dempsky"
To: scsh-users@scsh.net
Subject: Patch for scsh regexp bug
Message-ID: <20061023215152.GB2145@odin.dempsky.org>

I noticed that in posix_compile_regexp, scsh doesn't properly compute
the flags argument to pass to regcomp (``|'' has higher precedence than
``?:'').  (Unfortunately, this is unrelated to the ``invalid character
range'' error from before.)

--- regexp1.c~	Mon Oct 23 16:29:43 2006
+++ regexp1.c	Mon Oct 23 16:29:43 2006
@@ -66,10 +66,10 @@
   s48_value sch_regex;
   int status;
   S48_DECLARE_GC_PROTECT(1);
-  int flags = S48_EXTRACT_BOOLEAN(extended_p) ? REG_EXTENDED : 0 |
-    S48_EXTRACT_BOOLEAN(ignore_case_p) ? REG_ICASE : 0 |
-    S48_EXTRACT_BOOLEAN(submatches_p) ? 0 : REG_NOSUB |
-    S48_EXTRACT_BOOLEAN(newline_p) ? REG_NEWLINE : 0;
+  int flags = (S48_EXTRACT_BOOLEAN(extended_p) ? REG_EXTENDED : 0) |
+    (S48_EXTRACT_BOOLEAN(ignore_case_p) ? REG_ICASE : 0) |
+    (S48_EXTRACT_BOOLEAN(submatches_p) ? 0 : REG_NOSUB) |
+    (S48_EXTRACT_BOOLEAN(newline_p) ? REG_NEWLINE : 0);
 
   S48_GC_PROTECT_1(pattern);

From scsh-users-request@scsh.net Tue Oct 24 08:09:55 2006
From: Michael Sperber
To: "Matthew R. Dempsky"
Cc: scsh-users@scsh.net
Subject: Re: Regexp errors on OpenBSD
References: <20061023031956.GB10866@odin.dempsky.org> <20061023211432.GA30094@odin.dempsky.org>
Date: Tue, 24 Oct 2006 08:04:13 +0200
In-Reply-To: <20061023211432.GA30094@odin.dempsky.org>

"Matthew R. Dempsky" writes:

> On Mon, Oct 23, 2006 at 08:25:13AM +0200, Michael Sperber wrote:
>> Yes, this has come up before.  OpenBSD's regexps, for some reason,
>> don't seem to allow characters above the ASCII range.
>
> A nuisance to be sure (and I'm looking into if it can be fixed), but
> couldn't scsh easily avoid passing non-ASCII characters in cases where
> they're unnecessary?

It could, but then you'd still have problems with the cases where
they're necessary.

> I tried setting a breakpoint on regcomp and seeing what string scsh
> passes to the regex library, and scsh 0.6.7 passes the string
> mentioned in my original post, while scsh 0.6.2 passes "[\t-\r -~]".
> Any ideas why this might have changed?

Probably to fix bugs related to handling characters above 127 (which
people do work with).
-- 
Cheers =8-} Mike
Friede, Völkerverständigung und überhaupt blabla

From scsh-users-request@scsh.net Fri Oct 27 16:51:41 2006
To: scsh-users@scsh.net
From: Emilio Lopes
Subject: Re: Regexp errors on OpenBSD
Date: Fri, 27 Oct 2006 16:45:05 +0200
References: <20061023031956.GB10866@odin.dempsky.org> <20061023211432.GA30094@odin.dempsky.org>

Matthew R Dempsky writes:

> On Mon, Oct 23, 2006 at 08:25:13AM +0200, Michael Sperber wrote:
>> Yes, this has come up before.
>> OpenBSD's regexps, for some reason,
>> don't seem to allow characters above the ASCII range.

This is also the case with the regexp library shipped with Cygwin.  As I
remember, it is also documented somewhere in the sources, maybe in the
README file.

I tracked the problem down to this change in scsh/rx/parse.scm, in the
procedure `char-set->in-pair':

   @@ -709,7 +714,7 @@
                           . ,ranges))))
               (values loose ranges)))))

   -    (let lp ((i 127) (from #f) (to #f) (loose '()) (ranges '()))
   +    (let lp ((i 255) (from #f) (to #f) (loose '()) (ranges '()))
          (if (< i 0)
              (add-range from to loose ranges)

I've not found any evil effects after reverting this one change some
months ago (knock on wood!).

-- 
Emílio C. Lopes
Munich, Germany

From scsh-users-request@scsh.net Sat Oct 28 11:16:18 2006
From: Michael Sperber
To: Emilio Lopes
Cc: scsh-users@scsh.net
Subject: Re: Regexp errors on OpenBSD
References: <20061023031956.GB10866@odin.dempsky.org> <20061023211432.GA30094@odin.dempsky.org>
Date: Sat, 28 Oct 2006 11:16:07 +0200

Emilio Lopes writes:

> I tracked the problem down to this change in scsh/rx/parse.scm, in the
> procedure `char-set->in-pair':
>
>    @@ -709,7 +714,7 @@
>                            . ,ranges))))
>                (values loose ranges)))))
>
>    -    (let lp ((i 127) (from #f) (to #f) (loose '()) (ranges '()))
>    +    (let lp ((i 255) (from #f) (to #f) (loose '()) (ranges '()))
>           (if (< i 0)
>               (add-range from to loose ranges)
>
> I've not found any evil effects after reverting this one change some
> months ago (knock on wood!).

Except you get unwanted effects with non-ASCII characters.
:-( --=20 Cheers =3D8-} Mike Friede, V=F6lkerverst=E4ndigung und =FCberhaupt blabla From scsh-users-request@scsh.net Tue Oct 31 10:35:53 2006 Return-Path: X-Original-To: scsh@informatik.uni-tuebingen.de Delivered-To: scsh@informatik.uni-tuebingen.de Received: from localhost (loopback [127.0.0.1]) by mx1.informatik.uni-tuebingen.de (Postfix) with ESMTP id 0B30F11F; Tue, 31 Oct 2006 10:35:52 +0100 (NFT) Received: from mx1.informatik.uni-tuebingen.de ([127.0.0.1]) by localhost (mx1 [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 66680-02; Tue, 31 Oct 2006 10:35:47 +0100 (NFT) Received: from www.scsh.net (bernard.Informatik.Uni-Tuebingen.De [134.2.12.122]) by mx1.informatik.uni-tuebingen.de (Postfix) with ESMTP id A4636172; Tue, 31 Oct 2006 10:35:44 +0100 (NFT) Received: by www.scsh.net (Postfix, from userid 3123) id 931C15EE4; Tue, 31 Oct 2006 10:35:44 +0100 (MET) Old-Return-Path: X-Original-To: scsh-users@scsh.net Delivered-To: scsh-users@scsh.net From: Michael Sperber To: "Matthew R. Dempsky" Cc: scsh-users@scsh.net Subject: Re: Patch for scsh regexp bug References: <20061023215152.GB2145@odin.dempsky.org> Date: Tue, 31 Oct 2006 10:35:37 +0100 In-Reply-To: <20061023215152.GB2145@odin.dempsky.org> (Matthew R. Dempsky's message of "Mon, 23 Oct 2006 16:51:52 -0500") Message-ID: User-Agent: Gnus/5.110006 (No Gnus v0.6) XEmacs/21.5-b27 (darwin) MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 X-Virus-Scanned: ClamAV using ClamSMTP Content-Transfer-Encoding: quoted-printable Resent-Message-ID: Resent-From: scsh-users@scsh.net X-Mailing-List: archive/latest/351 X-Loop: scsh-users@scsh.net List-Post: List-Help: List-Subscribe: List-Unsubscribe: Precedence: list Resent-Sender: scsh-users-request@scsh.net List-Id: List-Archive: Resent-Date: Tue, 31 Oct 2006 10:35:44 +0100 (MET) "Matthew R. 
Dempsky" writes: > I noticed that in posix_compile_regexp, scsh doesn't properly compute > the flags argument to pass to regcomp (``|'' has higher precedence > than ``?:''). Thanks; there are a few more instances of this type of bug in the same file. I've just committed the fix. -- Cheers =8-} Mike Peace, international understanding and all that blah blah From scsh-users-request@scsh.net Tue Oct 31 10:45:54 2006 Return-Path: X-Original-To: scsh@informatik.uni-tuebingen.de Delivered-To: scsh@informatik.uni-tuebingen.de Received: from localhost (loopback [127.0.0.1]) by mx4.informatik.uni-tuebingen.de (Postfix) with ESMTP id D216E137B; Tue, 31 Oct 2006 10:45:52 +0100 (NFT) Received: from mx4.informatik.uni-tuebingen.de ([127.0.0.1]) by localhost (mx4 [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 23704-01; Tue, 31 Oct 2006 10:45:15 +0100 (NFT) Received: from www.scsh.net (bernard.Informatik.Uni-Tuebingen.De [134.2.12.122]) by mx4.informatik.uni-tuebingen.de (Postfix) with ESMTP id BA4881322; Tue, 31 Oct 2006 10:44:37 +0100 (NFT) Received: by www.scsh.net (Postfix, from userid 3123) id 594435EDE; Tue, 31 Oct 2006 10:44:37 +0100 (MET) Old-Return-Path: X-Original-To: scsh-users@scsh.net Delivered-To: scsh-users@scsh.net X-Injected-Via-Gmane: http://gmane.org/ To: scsh-users@scsh.net From: Emilio Lopes Subject: Re: Regexp errors on OpenBSD Date: Tue, 31 Oct 2006 09:43:09 +0000 (UTC) Lines: 35 Message-ID: References: <20061023031956.GB10866@odin.dempsky.org> <20061023211432.GA30094@odin.dempsky.org> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Complaints-To: usenet@sea.gmane.org X-Gmane-NNTP-Posting-Host: main.gmane.org User-Agent: Loom/3.14 (http://gmane.org/) X-Loom-IP: 192.109.190.88 (Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.7) Gecko/20060909 Firefox/1.5.0.7) Sender: news Resent-Message-ID: Resent-From: scsh-users@scsh.net X-Mailing-List: archive/latest/352 X-Loop: scsh-users@scsh.net List-Post: 
List-Help: List-Subscribe: List-Unsubscribe: Precedence: list Resent-Sender: scsh-users-request@scsh.net List-Id: List-Archive: Resent-Date: Tue, 31 Oct 2006 10:44:37 +0100 (MET) Michael Sperber writes: > Emilio Lopes writes: > > > I tracked the problem down to this change in scsh/rx/parse.scm, in the > > procedure `char-set->in-pair': > > > > @@ -709,7 +714,7 @@ > > . ,ranges)))) > > (values loose ranges))))) > > > > - (let lp ((i 127) (from #f) (to #f) (loose '()) (ranges '())) > > + (let lp ((i 255) (from #f) (to #f) (loose '()) (ranges '())) > > (if (< i 0) > > (add-range from to loose ranges) > > > > I've not found any evil effects after reverting this one change some > > months ago (knocking three times on my head!). > > Except you get unwanted effects with non-ASCII characters. Michael is right. The right thing to do is to link Scsh against a POSIX-compliant regexp library. One can use the GNU regexp library, as described e.g. in http://www.openldap.org/faq/data/cache/148.html. It's really simple and worked for me on our old Cygwin version. I guess it would also work on OpenBSD. I'll upload Cygwin binaries for Scsh-0.6.7 to my web site in the next few days. -- Emílio C. Lopes Munich, Germany
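For readers following the thread, the precedence pitfall Matthew R. Dempsky reported in posix_compile_regexp can be sketched in C roughly as follows. This is only an illustration, not scsh's actual source: the helper names flags_as_parsed and flags_intended and the parameters icase and nosub are hypothetical, and it assumes REG_EXTENDED is nonzero, as it is in common regex.h implementations. Because ``|'' binds tighter than ``?:'', an unparenthesized flag expression collapses to a single flag instead of OR-ing the pieces together.

```c
#include <regex.h>

/* Hypothetical helpers for illustration only; not scsh's actual code. */

/* What the unparenthesized expression
 *     REG_EXTENDED | icase ? REG_ICASE : 0 | nosub ? REG_NOSUB : 0
 * means to the compiler, with the implicit grouping written out.
 * Assuming REG_EXTENDED is nonzero, the condition is always true,
 * so the result is always REG_ICASE. */
int flags_as_parsed(int icase, int nosub)
{
    return (REG_EXTENDED | icase)
               ? REG_ICASE
               : ((0 | nosub) ? REG_NOSUB : 0);
}

/* What was presumably meant: each conditional is parenthesized
 * so that the individual flags are OR-ed together. */
int flags_intended(int icase, int nosub)
{
    return REG_EXTENDED
         | (icase ? REG_ICASE : 0)
         | (nosub ? REG_NOSUB : 0);
}
```

Passing the result of flags_intended as the cflags argument of regcomp gives the expected combination, while the value from flags_as_parsed silently drops REG_EXTENDED and turns on case-insensitive matching whether or not it was requested.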