scsh-users

Re: report

To: shivers@mintaka.lcs.mit.edu
Subject: Re: report
From: Scott Schwartz <schwartz@galapagos.cse.psu.edu>
Date: Sun, 22 Jan 1995 23:26:41 -0500
Cc: scsh-bugs@martigny.ai.mit.edu
| Right. It's an ASCII reader, and strings are ASCII strings. I'd have to
| change the low-level string and character reps before I could change
| the reader.

While there is some merit to making Scheme 48 a fully Unicode-based
system, that wasn't what I was proposing (and, unfortunately, I don't
have nearly enough time for it).  Rather, I was objecting to the fact that
read.scm limits the characters that it will read into a symbol, instead
of just accepting any characters that aren't reserved for something else.
Accepting them would make scsh ISO Latin-1 clean, which would be enough
to use UTF-8 (multibyte-encoded Unicode characters) in symbol names, even
without teaching the Scheme runtime anything about Unicode or changing
the representation of anything.
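
A minimal sketch of that idea, in Python rather than Scheme and not taken
from read.scm: the reader works on raw bytes and accepts any byte that is
not reserved. The RESERVED set and the names is_symbol_constituent and
read_symbol are assumptions made for illustration. It works because every
byte of a multibyte UTF-8 sequence is 0x80 or above, so it can never be
mistaken for an ASCII delimiter.

```python
# Hypothetical sketch of an 8-bit-clean symbol reader (not the real read.scm).
# RESERVED is an assumed set of delimiter and special-syntax bytes.
RESERVED = set(b'()[]{}";\'`,') | set(b" \t\n\r\f")


def is_symbol_constituent(byte):
    """Accept any byte that is not reserved, including bytes >= 0x80."""
    return byte not in RESERVED


def read_symbol(data, start=0):
    """Scan one symbol from data at start; return (symbol bytes, next index)."""
    end = start
    while end < len(data) and is_symbol_constituent(data[end]):
        end += 1
    return data[start:end], end


# The symbol below contains U+03BB, U+00F6 and U+00DF, encoded as UTF-8.
source = "(define \u03bb-gr\u00f6\u00dfe 42)".encode("utf-8")
name, nxt = read_symbol(source, 8)  # index 8 is just past "(define "
print(name.decode("utf-8"))
```

The reader never decodes anything; it only passes the bytes through, and
whoever prints the symbol decides how to interpret them.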

In order to support Unicode, Plan 9 uses both arrays of chars (bytes)
and arrays of Runes (16-bit characters). The system stores files as
UTF-8 chars, not as Runes, just as in Unix.  That means that you always
need char*, and often need Rune*, but Scheme has only one character
type and one string type.  Sounds complicated to fix.
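
To make the two representations concrete, here is a rough Python analogy
(not Plan 9 code): a bytes object stands in for char*, and a list of
integer code points stands in for Rune*. The helper names to_runes and
to_utf8 are made up for this illustration.

```python
# Hypothetical analogy: bytes ~ char* (UTF-8 on disk), list of ints ~ Rune*.


def to_runes(utf8_bytes):
    """Decode UTF-8 bytes into a list of code points, one per character."""
    return [ord(ch) for ch in utf8_bytes.decode("utf-8")]


def to_utf8(runes):
    """Encode a list of code points back into UTF-8 bytes."""
    return "".join(chr(r) for r in runes).encode("utf-8")


text = "na\u00efve \u03bb"          # contains U+00EF and U+03BB
utf8_bytes = text.encode("utf-8")   # what a file would hold
runes = to_runes(utf8_bytes)        # what character-level code would use

print(len(utf8_bytes), len(runes))
```

The byte count and the character count differ, which is why code that
indexes by character wants the second form even though files hold the
first.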

  • report, Olin Shivers
    • Re: report, Scott Schwartz <=