scsh-users
[Top] [All Lists]

Rscheme etc.

To: gel@cygnus.com
Subject: Rscheme etc.
From: Tom Lord <lord@beehive.cygnus.com>
Date: Fri, 10 Nov 1995 10:42:23 -0800
Cc: scsh@martigny.ai.mit.edu, wilson@cs.utexas.edu
Reply-to: lord@cygnus.com
Although recent discussion of Guile has mentioned SCSH once or twice,
it is not really about SCSH.  Please direct all followups away from
the scsh mailing list.

All this chatter about Guile is kind of interesting, but hacking Guile
is *more* interesting.  I want to try to rebut some of the points Paul
Wilson and others make, but I hope not to say anything more in this
thread, other than to make the next release of Guile in a few weeks
time.

For one thing, the next release will come with a big juicy reference
manual that puts a lot of the interfaces explicitly on the table.  It
seems to me that lack of documentation is Paul's biggest and most
accurate complaint; a lot of the others are conceivably just
misunderstandings that documentation will help to clear up.

For another thing, all this talk about "replacing Guile", especially
by an implementation based on compiling, is a little silly.  Guile
deliberately has all sorts of yummy dynamic behaviors that would be
difficult and pointless to try to get right in a compiler (examples:
lazilly computed top-levels, Guile's low-level macro system, and
very-low-latency eval).

Guile also deliberatey has an execution model that is good for some
kinds of compiler (like hobbit and, i suspect, Rscheme's) and bad for
others (from what i read, the calling conventions would just be an
obstacle for, say, twobit).

I suspect that what makes more sense is "augmenting Guile".  

For example, hobbit is handy, but maybe Rscheme's Scheme->C compiler
will have some features hobbit lacks.  Or maybe hobbit will be good
for one kind of program while Rscheme does a good job on another.  The
best outcome is one in which *both* Rscheme and hobbit can generate
object files that are Guile compatible.  (This is analogous to, though
ulitimately more complicated than, the interchangability of C
compilers.)

Guile completely lacks any optimizing compiler that emits a portable
executable format; perhaps the Rscheme (or Bigloo or whatever)
compiler can be adapted for that.

I doubt any compiler will ever non-trivially compile *all* Guile
programs or be able to replace the Guile evaluator -- but many
compilers could be adapted so that the code they compile can call and
be called from Guile.

I really hope that forward thinking implementors like the Rscheme team
will consider the approach of "augmenting" Guile.  It is simpler all
around and yields the greatest utility.

For example, a compiler implementor might look at the upcoming
reference manual, and derive a spec that says "If you limit your use
of Guile to just features X,Y and Z...we're able to compile your
program into a <shared object file | dynamicly loaded object file |
bytecodes> that can be loaded into Guile.  Of course if you use
<low-level macros | the top-level as a first-class module | etc.> then
all bets are off.

But for now...um...err.... "flame on"?

[BTW, not all, maybe not even *any* of the following complaints should
really be blamed on Paul.]



     The Guile Is Good For You FAQ: Replies to Common Complaints
     ===========================================================

    (with apologies to the helpful critics who inspired this faq)
    -------------------------------------------------------------


* SCM is really ugly.  I don't like reading the code.

Please look at guile-iii.  Superficially, at least, the code was a lot
cleaner than that of SCM.  guile-iv is cleaner still.  And the
documentation is improving too.

On the subject of truth and beauty, I think close examination reveals
SCM to be a startlingly elegant and foresightful implementation
cunningly disguised by the miserable rags of a cramped code-formatting
style.





* SCM is really undisciplined code.   Implementation details aren't 
encapsulated!

To some extent, internally to libguille, this is true (though the
degree to which it is true has been decreasing over time).

More importantly, almost all implementation details ARE encapsulated
by the public C interface.  So the purported lack of discipline
describes only interpreter internals.

Let's take a specific example.  There is an internal SETCDR macro, but
it isn't used everywhere.  Some code just uses "CDR(X) =" which
exposes the detail that cdr mutation is just an ordinary C assignment.
That's unfortunate, because some day we will need a write barrier on
cdr mutation.  Yup, the code is really ugly that way.  It'll probably
take a whole day, maybe even two or three to fix cdr mutations, when
the time comes.

The public C interface, on the other hand, includes "gscm_set_cdr"
and if you try "gscm_cdr(X) =", the compiler will give an error.




* There's no documentation.

There is, but it is scattered.  Guile-iv will include a juicy
reference manual that cleverly collects documentation for all of the
mistakes and offensive design choices in one handy document handsomely
bound and suitable for reference while writing those midnight flames
against Guile.




* Its not standard!   You bastards!  You changed () into #f.

Actually, we changed #f into ().




* Its not sTanDarD!   yoU bAstArds!  yoU made sYmbols case SenSative!.

Yes, we did.




* Mark Sweep?  Mark!?! and Sweep?!?   Peasants!

This complaint usually comes from someone who hates GC pauses and who
maybe likes real-time guarantees.  The complaint is partly motivated
by a false assumption about the limitations of mark and sweep
collectors and partly by true assumptions about recent results in GC
research.

I don't think that anytime soon Guile is going to become the real-time
Scheme environment of choice; this honor will probably belong to an
implementation that doesn't try so hard to play with C and Unix.

On the other hand, I do believe in the medium-term prospects for a
concurrent, incremental, mark-sweep collector, giving Guile at least
modest real-time capabilities and making it useful for making
real-time C programs extensible.  (Say, you wanna help with this
tricky-and-interesting coding project?)





* Conservative Stack Scanning?!?!  Dweebs!

This complaint usually comes from someone who hates memory
fragmentation or who has other reasons to like copying collection.

It is true that without major major surgery, a strictly copying
collector (one that *must* copy objects during collection) won't fit
with the current Guile implementation.  

With a little clevery hackery, the public C API of Guile, which does
rely on conservative scanning, could be integrated with a strictly
copying collector.  I'm not sure i believe anyone will really want to
do this, but i believe it to be straightforwardly possible.

On the other hand, a mostly copying collector (one that copies only
objects not directly protected from a conservative region) could
certainly work, even with the current implementation, if someone felt
a genuine need to write one.

Something to keep in mind as you throw stones is that in the early
days of Guile, when C interfaces were proposed that did not bask in
the luxury of conservative stack scanning, many prospective users of
the library screamed bloody murder.  

Among those screaming the loudest were people with several years of
experience writing C code against an interpreter interface that did
not rely on conservative stack scanning.  Universally, they reported
that such interfaces are more error prone, and lead to hard to detect,
hard to track down bugs.  Anecdotally, at least, conservative stack
sweeps make for an easier to use, less error prone C interface.

Also note that the assumption of conservative scanning within libguile
leads to a faster implementation than if that assumption were lost:
Suppose Guile had to use a macro like Emacs' "GCPRO", that would be
extra instructions for every local variable -- mostly wasted
instructions at that since GC is rare.  On the other hand, suppose
Guile simply avoided using the C stack to hold SCM variables: in
addition to complicating the public C interface, this would lose all
the benefit currently obtained by letting the C compiler manage
storage for these variables.

Something else to keep in the back of one's mind is that if (someday)
the compiler helps, stack scanning need not be conservative, strict
copying collection becomes possible, and the C API to Guile imposes no
restriction on implementation strategies.  If you like, you can think of
the conservative scanning of today as a provisional implementation
to get us to the point where compiler-assisted precise scanning is
possible.

True, making an implementation that relies on a non-standard compiler
enhancement is not a decision to be made lightly, but the option is at
least plausible because we live in an era of increasingly ubiquitous,
free, C compilers such as GCC or LCC.  Such a decision might also be
palatable within the context of the current Guile implementation since
there already exists a widely ported GC that doesn't depend on
compiler help.

As for memory fragmentation...

Guile's strategy for large objects (a cons-pair handle pointing to a
large malloc region) is already useful for fighting fragmentation; the
malloced regions can be relocated at any time, nearly independently of
GC (for example, the GNU relocating malloc could be used).

Also, among the cons pair handles, the freelist is always sorted so
allocations of cons pairs tend to cluster.  (Not that this has been
measured. :-)





* Only dorks call-with-current-continuation by copying the stack!

Usually, this complaint is made by someone who wants to use continuations
to implement threads.

There is this rumour that comes to us out of history.  The rumour
goes:

        Context switching and "(call/cc proc)" where proc was created
        by "call/cc" are semanticly very close and (in the absense of
        dynamic-wind) can even be implemented by very similar
        instruction sequences -- therefore, call/cc should be used to
        implement context switching and implementations should optimize
        for this use.

Modern call-with-current-continuation is a lousy way to implement
threads.  First, it gives a model only for single-processor execution
of threads -- truly concurrent threads can't be implemented using
call-with-current-continuation, at least not without adding additional
non-standard constructs and complicating the semantics.  Second,
call-with-current-continuation interacts with dynamic-wind in a way
that further complicates threads: should thread switching require
unwinding one wind-stack and winding into another?  That's very
expensive!

Sometimes the complaint is made by someone who wants to use fast
continuations to implement some tricky algorithm, perhaps something
that does a lot of backtracking.  It's conceivable that Guile's
strategy for call-with-current-continuation could benefit some such
algorithms, but undoubtedly it will get in the way of others.  Let's
talk about the latter case.

In my opinion, such uses of call-with-current-continuation are
legitimate, but rare (as is *any* use of
call-with-current-continuation).  Guile hasn't yet been optimized for
that case.  Can it be?  Sure -- simply by implementing a top-level
environment in which programs are implicitly translated to explicit
continuation passing style.  That translation imposes a slight tax on
procedure application, but in return it provides the service of a very
fast implementation of call-with-current-continuation.

Alternatively, a wiley hacker who wants cheap call/cc within Guile
might take the approach of contributing to the development of a
suitable bytecode interpreter and port of a compiler to Guile.




* Can't we substitute implementation <xyzzy> instead, since it is so much 
better?

Sure, just so long as you can implement all the stuff in the upcoming
reference manual.  If you can't, one thing to do is to offer patches
to the reference manual AND the current implementation.  The sooner
this kind of activity takes place, the less inertia there will be
working against making changes.

Another approach, especially suitable if <xyzzy> is a Scheme compiler,
is not to *substitute* <xyzzy>, but to *add* <xyzzy>.  If you take
this approach, there is no need to make <xyzzy> able to do everything
the existing implementation can.  Instead, <xyzzy> can focus on
compiling a *subset* of Guile very well.  

All of Guile can be divided into three parts:

        * A Scheme programming environment, 
          written largely in Scheme, based on a certain 
          set of built-ins required of the implementation.

        * A C API

        * An implementation of the required built-ins and 
          C API.

And along side these, we're starting to accumulate a modest library of
plug-ins that depend on the Scheme environment and/or the C API.

Substituting implemenations makes perfect sense and the C API should
try not to preclude this.   However, ease of use, robustness,
completeness, and flexibility in combination with C programs are 
also very important priorities for the C API.  They are priorities
that can easily conflict with making a C API that is neutral wrt.
implementation strategies.






* Tags are for pinheads; scheme objects should be implemented as C++ objects!

Implementations that add a virtual table pointer to every cons pair
use 50% more memory per cons pair than Guile.  That's a lot.

Dispatching some performance critical functions (such as apply) on tag
bits requires fewer operations (especially: fewer memory references)
than a vtable lookup.

Tags are, therefore, nifty.

Tags are also awkward to manipulate in C which is why the Guile C API
hides them.

Incidently, if you want to see some virtuous documentation, i
recommend the first few pages "code.doc" the original programmer
documentation for SCM.  It vividly and completely sums up SCMs
allocation of tags in a very few lines of ASCII and reveals a great
deal about SCM's object model.   It's an excellent diagram.



* Interpretting S-expressions?  That's kid stuff!  Compile to bytecodes!

Actually, Guile *does* compile source forms to an internal word-code
that is close to Guile Scheme, but lower-level.  The compiler is
exceedingly simple, compilation is lazy (only code actually being
executed is compiled), on-the-fly (compilation is mingled with
evaluation), and almost no optimizations are performed (which at least
makes the generated code trivial to predict).  The compiler is
extensible using Guile's low-level macro facilities.  The code
generated by the compiler has list structure and a semantics close to
Scheme, so one can easily write mutators for it to achieve novel
semantic effects (i.e., its fun to hack on).

By design, the Guile compiler minimizes the latency between calling
"eval" on some source form, and beginning execution of that source
form.  Therefore, loading large bodies of source and starting to run
them can happen very quickly in Guile -- it involves about as much
work as the lexical analysis phase of a convential compiler.

In spite of the minimalist approach to compilation, Guile achieves
performance that competes with several bytecode compilers (and there
is head-room -- obvious ways to speed up Guile without making huge
changes).

Still, in some situations one wants to pay the price of optimizing
compilation to get the benefits (such as generating fast-loading,
fast-running but not-directly-mutable and not-humanly-readable object
files).  Personally, i very much want to see Guile have this
capability in at least two forms: one, an optimizing bytecode compiler
and a corresponding interpreter within guile; the other,
re-integration of hobbit with Guile in order to have a
Scheme->C->dynamicly loaded object compiler (as SCM has).  (The main
difference between these approaches is that one generates a portable
executable, the other a machine dependent executable.)

There are, currently in development, a number of free, promising,
optimizing bytecode compilers for Scheme.  I am sure at least one of
them can be integrated with Guile, if not more than one.




* Guile isn't designed to be compiled!  Are we having FUN yet?

This complaint is both true and false.

It is true that Guile has some features that no compiler could hope to
recover from if it tried to deal with them.  Lazy top levels and
low-level macros are two examples.  Features like this are being added
to Guile to support interactive use as an extension language
environment.

It is false because the design of Guile mixes very well with compiled
code.  Guile Scheme programs that avoid the more dynamic features of
Guile can be compiled perfectly well.  

The complaint is also false because the next release will contain a
module system that allows compiler writers to design Guile top levels
containing only compilable constructs but able to export definitions
to full-blown top-levels.



* Your module system SUCKS!  It isn't Scheme-y at all!

Actually, our module system is extensible, so if there is a more
Scheme-y interface that you'd prefer, it can probably be implemented
in just a few pages of code.



* Guile doesn't have an object sytem!  Where are your manners?!?

But it does, just well hidden.  The upcoming reference manual will
make it more apparent.



* So its got objects, but i still can't subclass cons pairs and 
  have the built-in CAR procedure "do-the-right-thing".  I'd
  have to write a *whole new CAR procedure* and make it available
  in some optional interface via the module system.  If i do that,
  users will have a *choice* about whether to pay the overhead cost
  of all that extra generality!  Choice, I say!  Nauseating!

That's "car" to you, buddy, not "CAR".

Actually, sooner or later i hope someone will come up with a simple
type-checking interface to replace the ad-hoc assertions currently
sprinkled all over libguile.  A nicer type-checking interface could
serve the dual purpose of making callouts to user-code in the event of
a type-error, effectively turning the built-in functions into
extensible generics.

We can add such type-checking calls to the public C interace, as well.

It would be a manageable amount of work to modify all the builtins in
libguile to use such a type-checking mechanism.  I'm sure it would be
worth the effort.

The net effect of this modification is that those who want a generic
car would still have to create a special (and optional) top-level, but
the generic car could often dispatch on its argument much faster than
if the built-in version of car doesn't help.

Another effect of the described modification might be an increase
in the robustness of type error checking and reporting.



* Guile doesn't have a C++ interface!

At least one has been written.  I hope more will be.

The usual problems are encountered: there are no free tools (that i
know of) that are easy to use and that parse c++ to yield handy
description of the definitions found there.  There are several
approximations of such tools, but they all have problems.

Moreover, there are no universally applicable conventions about memory
mgt that can be used to relate the C++ objects to the GC.  There are
many arbitrarilly decided ways to hack around these limitations, and
that's the point: there are many arbitrary possibilities to choose
from.

By supporting C cleanly, and given the compatability and layering of
C++ and C, guile makes it possible to design and implement a C++
interface to suit the specific needs of your project.  Have at it!



* Guile doesn't have a module system.

Beginning with Guile-iv, it does.


* Guile doesn't have a debugger.

Beginning with Guile-iv, it does.


* Guile doesn't have a compiler.

Hobbit (which may or may not be included in guile-iv) can compile
for Guile.


<Prev in Thread] Current Thread [Next in Thread>