>In article <firstname.lastname@example.org>,
>Bryan O'Sullivan <email@example.com> wrote:
>: Perl isn't going to escape outside of the areas where it already has
>: hegemony. If you don't use Perl very frequently, you're not going to
>: remember from one use of it to the next what on earth any of the
>: myriad of notational shortcuts do, or even what they look like. As
>: for Perl's evolution, recent additions to the language such as
>: references, garbage collection (don't forget, kids, you can't GC
>: circular structures in Perl!) and object orientation are sufficiently
>: crufty as to make strong men weep. Perl's appeal is self-limiting,
>: much like that of chainsaw juggling. Mind you, I'd hate to think of a
>: world without chainsaw juggling.
>English isn't going to escape outside of the areas where it already has
>hegemony. If you don't use English very frequently, you're not going
>to remember from one use of it to the next what on earth any of the
>myriad of notational shortcuts do, or even what they look like.
Are you suggesting that everyone should speak Perl as often as they
do their day-to-day human language, so it's okay for Perl syntax
to be as crufty as English syntax? :-)
But seriously, this metaphor has some utility. On the other hand, it
has severe limits, which are obvious if you take it seriously.
English is a hard language to learn to speak, if you're more than about
7 years old, and even then it takes years to become fluent. It's also
extraordinarily hard to learn to spell.
I think most linguists would agree that English is a very hacked language,
with some painful artifacts of its evolution carried along whether they're
good or not.
It's also optimized for a very different user community than any programming
language. It's optimized for a high communication rate in the presence
of noise and a poorly specified underlying semantics. It's designed for
people who are very good at detecting when they don't understand what
the text means, and are willing to go through extra iterations, reworking
the text, to clarify the meaning. (Hey, maybe English and Perl do have a
lot in common! :-)
Taking the Perl/English metaphor seriously, we're talking about crufty
languages that are designed primarily for Power Users, who live, eat, and
breathe in the language. Years of continuous exposure and practice are
required in order to be able to remember the rules and idioms, so that
old texts can be understood as intended.
Not my favorite set of goals for a programming language.
> As for
>English's evolution, recent additions to the language such as latinate
>words and the entire pronoun system (don't forget, kids, you can't make
>case distinctions on ordinary nouns in English) are sufficiently crufty
>as to make strong men weep.
Agreed. As a native speaker of English with many years of continuous
practice, I tend not to be bothered by it most of the time. But I have great
sympathy with various people around the world who hate English with
a passion because it's unnecessarily crufty and presents a high barrier
to entry. There are practical reasons why English has become The
International Language for the next century.
One reason is that it's The Standard, like it or not, without a close
contender, so it has too much momentum to stop. Another is that it has
lots of merits along with its horrid demerits, due to hundreds (and in
some sense, thousands) of years of evolution in heavy use.
I think the evolution of natural language has interesting lessons to teach
about programming language design. One of the lessons is that you should
be very concerned with how the language will actually be used, and you
should be willing to evolve the language in response. Another lesson is
that if you don't *design* the language well, it will be a painful
kludge that will cause unnecessary suffering for a very long time.
Unless, of course, you're lucky enough that the language dies of its
own cruftiness---as most languages have---to be replaced by something
better. If you're lucky, the best features of the earlier languages
get propagated into the new languages.
> English's appeal is self-limiting, much
>like that of chainsaw juggling. Mind you, I'd hate to think of a world
>without chainsaw juggling.
>Me either. :-)
Agreed. As long as *I* don't have to do it, and too many unsuspecting
people aren't misled into following the example of chainsaw jugglers,
with suboptimal results.
So what do we make of this? What useful lessons can we learn from
Perl and common shell languages, without having a Scheme-based shell
degenerate into a whole bunch of syntactic quirks?
What do we look for in a syntax for a shell language, including one
that's typed at an interactive prompt? I think the parenthophobes
hold sway, rightly or wrongly, and not *entirely* wrongly.
One thing I've been thinking about lately is that (again, rightly
or wrongly) people are willing to learn command languages that have
a very different syntax from normal programming languages, in that
the function position of a command is not distinguished from the
argument positions, e.g.,
    grep foo bar
Notice that this is like Scheme, only without the enclosing parentheses.
The delimiters are inferred somehow, using other rules and constraints.
What's interesting to me about this is that people who have tried to
put a less "objectionable" syntax on Lisp or Scheme have usually
tried to make it look more like an algebraic language. Why? Why
not make it look like a shell language, only better? For whatever
reason, people *are* willing to put up with shell syntax in many
cases. For whatever reason, many people already *know* basic shell
syntax (if only in the form of simple commands), so we can leverage
that to avoid alienating newbies and get them sucked into the Right
kind of language. :-)
Most people who try to design a new syntax for Lisp or Scheme try to
make it look like Pascal or C, which may be a poor fit. (Actually,
I like Dylan's syntax reasonably well, but it's a little too verbose,
and it's not designed for interactive commands.) I'm interested in
exploring other alternatives, which may seem radical in the context
of conventional programming languages, but not at all weird in the
context of normal interactive computer use. I don't want to sacrifice
having a good syntax for programming in the large, but I think that
maybe if we re-think syntax a little, we don't have to sacrifice much.
The idea I'm toying with is to eliminate most of the parentheses in
normal Scheme expressions, in a fairly straightforward way---by making
them implicit in common expression types. (Note: the following has
not been thoroughly thought out. Comments welcome, and flames to
    (define (foo bar baz)
      (if ...
          (quux (bleen baz))
          (quux baz)))
would look something like
    def (foo bar baz)
      if ...
        then (quux (bleen baz))
        else (quux baz)
      fi
The basic ideas are
0. We preserve the general Scheme property that there is no distinction
between commands and expressions. Everything's an expression that
may optionally have side effects.
1. Common special forms like IF use balanced keywords (maybe palindromic,
as above, maybe not) which act as implicit parentheses as well as
specifying which special form is meant.
2. You still parenthesize other subexpressions. I think this may
be less objectionable than it seems at first glance.
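To make ideas 1 and 2 concrete, here is a rough sketch of a reader for this kind of syntax. It is written in Python only so that it can be run as-is; the names tokenize, read_expr, read_until, unwrap, and read_command are made up for this illustration, as is the test expression bar in the example. It handles only if/then/else/fi and parenthesized subexpressions, does no error checking on unbalanced input, and treats a clause holding a single item as that item. None of this is a worked-out design.

```python
END = "<end>"  # sentinel marking the end of a command line


def tokenize(text):
    """Split a line into parentheses and whitespace-separated words."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def unwrap(items):
    """A clause with one item stands for that item; several items
    form an unparenthesized call such as `grep foo bar`."""
    return items[0] if len(items) == 1 else items


def read_until(tokens, pos, stops):
    """Read expressions until one of the stop tokens; return (items, pos)."""
    items = []
    while tokens[pos] not in stops:
        item, pos = read_expr(tokens, pos)
        items.append(item)
    return items, pos


def read_expr(tokens, pos):
    """Read one expression at tokens[pos]; return (tree, next position)."""
    tok = tokens[pos]
    if tok == "(":
        # Explicit parentheses delimit a subexpression, as in Scheme.
        items, pos = read_until(tokens, pos + 1, {")"})
        return items, pos + 1
    if tok == "if":
        # The keywords if ... fi act as an implicit pair of parentheses.
        test, pos = read_until(tokens, pos + 1, {"then"})
        then_part, pos = read_until(tokens, pos + 1, {"else", "fi"})
        form = ["if", unwrap(test), unwrap(then_part)]
        if tokens[pos] == "else":
            else_part, pos = read_until(tokens, pos + 1, {"fi"})
            form.append(unwrap(else_part))
        return form, pos + 1  # skip the closing "fi"
    return tok, pos + 1  # any other word is an atom


def read_command(text):
    """Read a whole command line; the outermost parentheses are implicit."""
    tokens = tokenize(text) + [END]
    items, _ = read_until(tokens, 0, {END})
    return unwrap(items)
```

With this, read_command("grep foo bar") gives the list ['grep', 'foo', 'bar'], and an if ... fi command comes out as the same nested list that reading the fully parenthesized (if ...) form would give, so everything after the reader could stay ordinary Scheme.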
#2 requires some explanation. I've been pretty surprised to find out
how many people will put up with Tcl, and that they don't seem to mind
having to put brackets around subexpressions. It's pretty natural
if you come from a command-oriented background to have the idea that
a command doesn't require parentheses around it (or around its arguments,
as in an algebraic syntax). They can easily understand that this
makes nested expressions ambiguous in many cases, and that they need
to put brackets around it to delimit "subcommands" that compute
results used by the enclosing commands. (In Tcl, these are all called
Note that I'm not defending Tcl here. I personally can't stand Tcl, which
is why I'm interested in coming up with a "command syntax" for Scheme shell
programming. Tcl has horrid rules for when things are evaluated and
when they are not, with the wrong defaults. (Like many shell languages,
you have to force evaluation of a variable, because the default is that
things are not evaluated. I prefer Lisp and Scheme's default, which
is that things *are* evaluated, which would be a problem in that lots
of literals would have to be quoted, except that many common literals
are "self evaluating." Apparently Ousterhout and most shell language
designers have never caught on to self-evaluating literals, because
they never learned Lisp. So you end up with things like set a [expr $a+1].)
I firmly believe that Scheme is a better basis for such things, if
we come up with some handy syntax that can easily be explained.
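The evaluate-by-default rule with self-evaluating literals is small enough to sketch. The following is a toy evaluator, written in Python only so that it can be run directly; Symbol and evaluate are names invented for this illustration, and it covers only variable references, procedure calls, and literals (no special forms, no quoting).

```python
import operator


class Symbol(str):
    """A variable name, as opposed to a string literal."""


def evaluate(expr, env):
    """Evaluate expr in env, where everything is evaluated by default."""
    if isinstance(expr, Symbol):
        # A symbol is a variable reference: look up its value.
        return env[expr]
    if isinstance(expr, list):
        # A list is a call: evaluate the operator and all the operands.
        fn = evaluate(expr[0], env)
        args = [evaluate(arg, env) for arg in expr[1:]]
        return fn(*args)
    # Numbers, plain strings and the like are self-evaluating literals,
    # so they need no quoting even though evaluation is the default.
    return expr


env = {Symbol("a"): 41, Symbol("+"): operator.add}
result = evaluate([Symbol("+"), Symbol("a"), 1], env)  # like (+ a 1)
```

Here a is looked up without any $ to force evaluation, and the literal 1 needs no quoting, so result is 42.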
Some issues to consider:
1. Can we come up with a command syntax that's nice and terse for
scripting purposes, but doesn't make it hard to write larger
routines? Ideally, I'd like a syntax that works for both
scripting and "real" programming, so I don't have to switch
syntaxes. (Naturally, there would be no problem of interoperability
of code written in different Scheme syntaxes---it's just sugar.)
I think this is more doable than it may seem at first glance.
Declaration syntax for fancy stuff doesn't have to be as terse,
because most scripts don't declare many new things---they just glom
together the same old things as a bunch of commands. So terseness
matters less for things like class declarations than for things like
normal statements and the commonest control constructs. (I'm assuming that any
good Scheme implementation will have an object system, or at
least a record-definition facility. It will probably have
a module system, but for scripting purposes a default scripting
environment can be used and the casual user won't have to worry
In particular, most little scripts consist mostly of sequences
of commands. As long as the syntax for sequences and function
calls is terse, most scripts will be terse. The commonest
control constructs and declarations should be terse as well, at
least in their commonest forms. (Equally important, they should
not be unnecessarily objectionable to newbies. The if...fi
syntax above isn't actually as terse as standard Scheme's (if ...),
but it's more understandable and familiar to a lot of people.)
2. How do we distinguish between calls to built-in or user-defined
Scheme functions and commands that are sent to programs outside
Scheme? For scsh, you currently have to explicitly indicate
that you want to run a UNIX program by using a (run ...) form
or whatever. I think this may be the right default for
nontrivial programs, but for many scripts it's awkward. Somebody
(John Ellis?) did a Lisp shell several years ago, where any
unbound function name was assumed to be an indication that
the programmer meant to call a UNIX program. So, for example,
if rm was bound, (rm foo) meant call the rm procedure, but if
rm was unbound, the shell would look for a unix program named
rm, and call it with the argument foo.
I think that for little scripts, something like this is a good
way to go. This raises the question of how to decide when to
do it, and how to present the choice to the user. I'd think
that for a shell language, this should be the default, but
the programmer should be able to turn it off. E.g., for a big
shell script that's a serious program, you might forget to
define a function, and then try to use it. If there just happens
to be a UNIX program by that name, surprising things could
I suggest a special kind of toplevel environment, which defaults
to UNIX as a special kind of outer "environment." If you
don't want this behavior, the first thing you do is switch
to a normal toplevel environment that doesn't have this default
behavior. Then you can still call UNIX programs explicitly,
when you say that's what you mean using (run...), but otherwise
you're just writing Scheme code. So the difference between
a scsh-like scripting language and a command interpreter is
just one of what kind of environment you're in.
3. Whatever syntax is used, it shouldn't make it hard to support
lexically-scoped ("hygienic") macros, like R4RS high-level
macros, or the Indiana syntax-case, or RScheme's (extended)
syntactic closure system. It's important to be able to add
new control constructs for shell and scripting applications,
and syntax extensibility makes that much easier than having
to hack the interpreter or compiler.
(The palindromic syntax for special forms is intended to make
it easy to parse in the context of lexically-scoped macros.
It essentially preserves Scheme's property of having a two-level
syntax, with a simple nested surface syntax for the first phase
of parsing---"reading"---and user-defined syntax for interpreting
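The lookup rule from issue 2 (a bound name means the Scheme procedure, an unbound name falls back to a UNIX program unless the environment has that default switched off) can be sketched roughly as follows. This is Python, used only so the sketch is runnable; resolve, UnboundName, and the unix_fallback flag are hypothetical names for this illustration, and the bindings table stands in for a real toplevel environment. It is not how scsh or any existing Lisp shell does it.

```python
import shutil


class UnboundName(Exception):
    """Raised when a name is neither bound nor allowed to mean a program."""


def resolve(name, bindings, unix_fallback=True, which=shutil.which):
    """Decide what a command name refers to.

    Returns ("procedure", value) if name is bound in bindings, or
    ("program", path) if it is unbound, the fallback is enabled, and
    a program of that name is found on the search path. Otherwise
    raises UnboundName. The `which` parameter exists only so the
    path search can be replaced in tests.
    """
    if name in bindings:
        # A binding always wins, so (rm foo) calls the rm procedure.
        return ("procedure", bindings[name])
    if unix_fallback:
        # "Shell" toplevel: treat the outside world as an outer environment.
        path = which(name)
        if path is not None:
            return ("program", path)
    # "Normal" toplevel: an unbound name is simply an error.
    raise UnboundName(name)
```

A script that wants the safer behavior would call resolve with unix_fallback=False, so a forgotten definition shows up as an UnboundName error instead of silently running whatever program happens to share the name.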
| Paul R. Wilson, Comp. Sci. Dept., U of Texas @ Austin (firstname.lastname@example.org)
| Papers on memory allocators, garbage collection, memory hierarchies,
| persistence and Scheme interpreters and compilers available via ftp from
| ftp.cs.utexas.edu, in pub/garbage (or http://www.cs.utexas.edu/users/wilson/)