>>>>> "Olin" == shivers <shivers@cc.gatech.edu> writes:
>>>> Martin, I have gone over your proposal and added comments and proposals
>>>> marked by ">>> " prefixes.
>>>> -Olin
Olin> FROM: Martin Gasbichler
Olin> DATE: 01/22/2001 06:20:32
Olin> SUBJECT: [Scsh-hackers] 0.6 API
Olin> Here comes my proposal for the new stuff in the 0.6 API. Note that it
Olin> sometimes doesn't match the CVS tree in which case the proposal is my
Olin> last thought. Packages marked with (o) are opened by default.
Olin> scsh-events:
Olin> -----------
Olin> A few weeks ago, I replaced David Fisher's implementation of events;
Olin> his code was based on placeholders. The problem with his implementation
Olin> was that the RTS deadlocked if all threads were waiting for an
Olin> interrupt, since the RTS only saw the blocked placeholders. Now the
Olin> base event system is part of the RTS itself, and the scheduler checks
Olin> whether any threads are waiting for an interrupt, just as it does
Olin> for I/O.
>>>> I don't understand the problem. Can you explain it again?
The RTS checks whether there are any runnable tasks. If there are
none, a deadlock exception is raised. The exception is not raised if
there are threads that sleep or wait for I/O completion. David
implemented events as placeholders which get their value set by the
signal handler. A thread waiting for a placeholder is not
runnable. The RTS did not see the possibility that the signal handler
would set the placeholder; the thread was just blocked, and if it was
the only thread, a deadlock occurred. In my new implementation the
event system is part of the RTS, and the RTS checks for threads that
are blocked on events in addition to sleeping threads and those
blocked on I/O.
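To make the difference concrete, here is a rough sketch in C of the two deadlock checks. It is not taken from the RTS sources; the struct and all names are made up for illustration, and the real scheduler is of course written in Scheme.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping a scheduler might keep. None of these
   names come from the Scheme48/scsh sources. */
struct sched_state {
    int runnable;         /* threads ready to run */
    int sleeping;         /* threads waiting on a timer */
    int blocked_on_io;    /* threads waiting for I/O completion */
    int blocked_on_event; /* threads waiting for an interrupt event */
};

/* Old check: threads waiting for an event were hidden behind
   placeholders, so the scheduler could not count them. */
bool deadlocked_old(const struct sched_state *s) {
    assert(s != NULL);
    return s->runnable == 0 && s->sleeping == 0 && s->blocked_on_io == 0;
}

/* New check: event waiters are visible to the scheduler and count as
   "something may still happen", just like sleepers and I/O waiters. */
bool deadlocked_new(const struct sched_state *s) {
    assert(s != NULL);
    return deadlocked_old(s) && s->blocked_on_event == 0;
}
```

With a single thread that is waiting for an interrupt, the old check reports a deadlock and the new one does not.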
Olin> ; obvious:
Olin> (most-recent-event)
Olin> ; block, until interrupt occurs:
Olin> (wait-interrupt int last-event)
Olin> ; block until one of the interrupts in set occurs:
Olin> ; interrupt sets are constructed as in 0.5.2
Olin> (wait-interrupt-set int-set last-event)
Olin> ; Same as above, but return if no pending interrupt exists
Olin> maybe-wait-interrupt
Olin> maybe-wait-interrupt-set
>>>> I am not so fond of the "maybe-" prefix to mean non-blocking.
>>>> Also, the WAIT-INTERRUPT procedure doesn't always wait. You can
>>>> use it to scan and re-scan events that are in the past, if you've
>>>> retained a pointer to an old event.
>>>> - Why not merge WAIT-INTERRUPT's functionality into NEXT-EVENT:
>>>> NEXT-EVENT event [filter] -> event
>>>> - Then make a NEXT-EVENT/NO-WAIT procedure that returns false
>>>> if you scan off the end of the event chain.
Agreed.
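Just to pin down the semantics we are agreeing on, here is a sketch in C of the non-blocking variant. The event chain is modelled as a singly linked list and the filter as a bitmask of event types; both are assumptions made for this illustration, not the actual representation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event record: a type code and a link to the next,
   more recent event. */
struct event_node {
    int type;
    struct event_node *next;
};

/* Return the first event after `ev` whose type is in `filter`
   (bit n set means type n is wanted), or NULL if we scan off the
   end of the chain. A blocking NEXT-EVENT would wait at that point
   instead of returning. */
const struct event_node *next_event_no_wait(const struct event_node *ev,
                                            unsigned filter) {
    assert(ev != NULL);
    for (const struct event_node *p = ev->next; p != NULL; p = p->next) {
        if (p->type >= 0 && p->type < 32 && (filter & (1u << p->type)))
            return p;
    }
    return NULL;
}
```

Because the caller passes in the event to start from, keeping a pointer to an old event lets it re-scan the past, which is the behaviour Olin describes above.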
>>>> The FILTER parameter is a set of "event classes" (e.g., interrupt codes)
>>>> or a general Scheme predicate; I could go either way on that one. A
>>>> predicate is Schemeish & general, but the generality prevents you from
>>>> putting up threads on a fixed set of event-class queues -- you simply have
>>>> to re-execute all predicates of all blocked threads whenever a new event
>>>> occurs. That doesn't scale well for lots of threads.
I don't think it makes sense to have lots of threads blocking on
signals, and I currently transform the interrupt sets into predicates
anyway. But what, other than the membership of an interrupt in a set,
should the predicates check?
Olin> ; record, returned by wait-interrupt-X
Olin> event?
Olin> next-event
Olin> event-type
>>>> - What is the range of the EVENT-TYPE function? Is it just Unix async
>>>> interrupts?
The async interrupts and the post-gc-interrupt.
>>>> - If the only events are Unix async interrupts, then "event" is perhaps
>>>> overly general. Are there other sorts of events? If not, possibly
>>>> change the name to "sigevent"? Do we anticipate ever extending the
>>>> set?
We certainly need more possibilities for inter-thread communication,
but probably not via this mechanism. So let's rename it to sigevent.
Olin> scsh-interrupts: (o)
Olin> ---------------
Olin> number-of-interrupts
Olin> ; from 0.5.2
Olin> interrupt-set
Olin> ; extensions to get a useful ADT
Olin> (interrupt-in-set? int set)
Olin> (insert-interrupt int set)
Olin> (remove-interrupt int set)
>>>> Urp. Are interrupt sets pure or side-effectable? Pure or "linear update"
>>>> would work, I think. In which case, we should use set lexemes from SRFI-1,
>>>> such as "adjoin", and "-contains-" instead of "-in-"
Let's make them pure. As I didn't find anything like -contains- in
SRFI-1, please list the exact names you wish to have.
Olin> ; includes all interrupts
Olin> full-interrupt-set
Olin> ; all interrupts
Olin> interrupt/...
Olin> signal-handler: (o)
Olin> --------------
Olin> There is a fundamental problem with the interaction of signal-handlers
Olin> and the event system: While it is possible to have both of them
Olin> (actually sighandlers is built on top of the event system now) the
Olin> default actions for signal handlers will normally just kill the
Olin> process. For compatibility with old code, the signal handlers should be
Olin> turned on by default. Maybe something like
Olin> (disable-all-signal-handlers!)
Olin> would come in handy. On the other hand, the signal handler for SIGINT
Olin> is very useful as it allows you to stop all threads.
Olin> interface as in 0.5.2, but without interrupt/
No comments for this????
Olin> ------------------------------------------------------------------------
Olin> I'd like to declare select and select! "deprecated" as it doesn't work
Olin> well with the thread system. There should be only one select in the
Olin> whole system.
>>>> Why? A single thread may still wish to attend to multiple i/o
>>>> sources/sinks. We just have to provide a "fake" select, just as we provide
>>>> "fake" blocking I/O implemented in terms of non-blocking I/O and SIGIO
>>>> or scheduler-loop polling.
Yes, that's another alternative I also thought of. It's just that the
use of select doesn't encourage a concurrent programming style. Select
was introduced because of the single-threadedness of C. And
implementing the exact behaviour of select is probably a tedious task.
Olin> ------------------------------------------------------------------------
Olin> network: (o)
Olin> -------
Olin> The code in 0.6 is built on top of channels. This was necessary to let the
Olin> scheduler call other threads if something blocks.
Olin> internet-host-addresses are now represented as byte-vectors. There
Olin> exist a few conversion functions:
Olin> (number->internet-host-address address32) ->bv
Olin> (internet-host-address->number bv) -> old representation
Olin> (bytes->internet-host-address b4 b3 b2 b1) ->bv
Olin> (internet-host-address->bytes bv) -> (b4 .. b1)
Olin> (internet-host-address->dotted-string bv) -> "123.123.123.123"
Olin> (dotted-string->internet-host-address string) ->bv
>>>> Brian should comment on this. Why do we need to represent IP addresses
>>>> with byte vectors?
I don't like the idea of having addresses floating around as normal
integers. Byte vectors are much more opaque, but maybe it's even
better to have a separate record for this.
>>>> Let us assume we have reasonable bit-ops on ints;
>>>> then we can always extract the octets as needed. And note that IP
>>>> addresses don't really come in octets; that's just an external
>>>> written form. The partitioning into net & subnet & netmask varies at
>>>> bit granularity. The dotted-string parser/unparser routines, however,
>>>> seem like a nice convenience.
So we probably need some masking procedures too.
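As a sketch of what such masking procedures would compute, here it is in C on plain 32-bit integers. The function names are invented for this mail and are not a proposal for the Scheme API.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Build a 32-bit IPv4 address from four octets, most significant first. */
uint32_t ip4_from_bytes(uint8_t b4, uint8_t b3, uint8_t b2, uint8_t b1) {
    return ((uint32_t)b4 << 24) | ((uint32_t)b3 << 16) |
           ((uint32_t)b2 << 8) | (uint32_t)b1;
}

/* Netmask with the top `prefix_len` bits set; prefix_len is 0..32. */
uint32_t ip4_netmask(unsigned prefix_len) {
    assert(prefix_len <= 32);
    if (prefix_len == 0)
        return 0; /* shifting by 32 would be undefined */
    return (uint32_t)(UINT32_MAX << (32 - prefix_len));
}

/* Network part of an address: the split is at bit granularity,
   not at octet boundaries. */
uint32_t ip4_network(uint32_t addr, unsigned prefix_len) {
    return addr & ip4_netmask(prefix_len);
}

/* Write the dotted form into buf; returns what snprintf returns. */
int ip4_to_dotted(uint32_t addr, char *buf, size_t len) {
    return snprintf(buf, len, "%u.%u.%u.%u",
                    (unsigned)((addr >> 24) & 0xffu),
                    (unsigned)((addr >> 16) & 0xffu),
                    (unsigned)((addr >> 8) & 0xffu),
                    (unsigned)(addr & 0xffu));
}
```

For example, 192.168.37.5 with a 20-bit prefix has the network part 192.168.32.0, which cuts through the middle of the third octet.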
>>>> It would be much shorter and just as clear to replace
>>>> "internet-host-address" with "ip-address", which is also a precise
>>>> technical name for the thing.
We also have to think about IP6, so how about ip4-address?
Olin> crypt:
Olin> -----
Olin> ; Simply calls the C library function and returns its return value.
Olin> (crypt key salt)
>>>> This was not Posix when I did the very first implementation of scsh. Of
>>>> course, neither were symlinks, and I put those in. How portable is crypt?
Maybe Mike can comment on this.
>>>> Doesn't FreeBSD complicate matters with the crypt-classic/crypt-MD5
>>>> split?
No, FreeBSD uses the first two characters to determine which algorithm
to use and is therefore backward compatible.
Olin> syslog:
Olin> ------
Olin> The Scsh system assigns syslog-ids to every call of openlog. The
Olin> syslog-id of the last openlog is recorded. If syslog-w/id is called
Olin> later and the syslog-id of the last open is not the same as the
Olin> argument of syslog-w/id, openlog is called with the values of
Olin> syslog-id prior to the actual syslog call.
>>>> Excellent idea -- another global resource eliminated.
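In C terms, the bookkeeping amounts to roughly the following. This is only a sketch of the idea, not the code in the CVS tree; the table, its size limit and the function names are made up here.

```c
#include <assert.h>
#include <syslog.h>

#define MAX_CHANNELS 16 /* arbitrary limit for this sketch */

/* What was passed to openlog for each id. The ident string must stay
   valid, because openlog may keep the pointer rather than copy it. */
struct channel {
    const char *ident;
    int option;
    int facility;
};

static struct channel channels[MAX_CHANNELS];
static int channel_count = 0;
static int last_opened = -1; /* id of the most recent openlog, if any */

/* Open the log and hand back a fresh id. */
int open_channel(const char *ident, int option, int facility) {
    assert(channel_count < MAX_CHANNELS);
    int id = channel_count++;
    channels[id] = (struct channel){ident, option, facility};
    openlog(ident, option, facility);
    last_opened = id;
    return id;
}

/* Log via a given id, re-running openlog first if some other id
   was opened or used in between. */
void log_with_channel(int id, int level, const char *message) {
    assert(id >= 0 && id < channel_count);
    if (id != last_opened) {
        openlog(channels[id].ident, channels[id].option,
                channels[id].facility);
        last_opened = id;
    }
    syslog(level, "%s", message);
}
```

The C library still has only one open log at any time; the ids just hide the re-opening from the caller.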
Olin> ; do openlog, return a syslog-id
Olin> (openlog ident [option [facility]]) -> syslog-id
Olin> ; version without syslog-id for the brave.
Olin> (syslog message [level [facility]])
Olin> ; call openlog, if current syslog-id is not the given one
Olin> (syslog-w/id syslog-id message-id [level [facility]])
Olin> (closelog)
>>>> - Not very Schemeish names. I propose OPEN-SYSLOG, CLOSE-SYSLOG and SYSLOG.
>>>> CLOSE-SYSLOG returns true if the syslog was previously open; false if
>>>> it was already closed. Syslogs are also closed by GC.
No, they are not Schemeish, they are taken from the C functions they
call ;-)
>>>> - Let's not call these things "syslog-ids." Let's call them "syslog
>>>> channels," since each one is a connection to the syslog system.
OK
>>>> Now we should play the standard game we play with global resources:
>>>> turn them into explicit resources, with facilities to allow us to
>>>> control the default with dynamic scope. There's a standard set of
>>>> facilities and naming conventions one does for these things, common
>>>> in architecture across current i/o ports, cwd, umask, and so forth.
But if we have the syslog channels, they are already explicit
resources. There is no concept of a "current syslog channel", so we
don't have to care about its meaning in a dynamic scope.
>>>> That would give us a core facility of the following
>>>> (open-syslog ident [option [facility]]) -> syslog-channel
>>>> (close-syslog syslog-channel) -> boolean
>>>> (syslog-channel? x) -> boolean
>>>> (syslog-write string [level [facility [syslog-channel]]]) -> unspecified
>>>> Passing SYSLOG-FACILITY/DEFAULT as the facility for SYSLOG-WRITE
>>>> gets you the facility you specified when you opened the channel.
>>>> Similarly for SYSLOG-LEVEL/DEFAULT. Or maybe allow #f for this case?
>>>>
>>>> Extension:
>>>> (syslog-format syslog-channel level facility fmt-string . params) ->
>>>> unspecified
>>>> Acts like FORMAT.
>>>>
>>>> (call/syslog-channel ident option facility proc) -> value(s) of proc.
>>>> Applies proc to the channel, and guarantees to close the channel
>>>> even if you throw out.
>>>>
>>>> Dynamic scoping of syslog channels:
>>>>
>>>> (with-current-syslog-channel* slchan thunk) -> value(s) of thunk
>>>> (with-current-syslog-channel slchan body ...) -> value(s) of body
>>>> Introduces new dynamic scope.
>>>>
>>>> (current-syslog-channel) -> syslog-channel
>>>> (set-current-syslog-channel! slchan) -> unspecified
>>>> Side effect is visible to all who share this dynamic scope.
>>>>
>>>> (with-current-syslog-channel* ident option facility thunk) -> value(s) of
>>>> thunk
>>>> (with-current-syslog-channel ident option facility body ...) -> value(s)
>>>> of body
>>>> These three close the channel for you if you throw out.
>>>> Err... I don't have good names for these two to distinguish them
>>>> from the simple current-syslog-channel binders. Don't we have
>>>> an analogous case in 0.6 with cwd's, where we have both "cursors" and
>>>> strings that name directories?
Olin> ; As syslog is not part of any standard, this is an intersection of
Olin> ; Linux, FreeBSD, AIX, IRIX, HP-UX and Solaris.
>>>> Too bad!
>>>> Below I list some alternate names for options. I like names that use
>>>> longer, lexemes-separated-with-hyphens Scheme names that are more clear.
>>>> This has been a consistent tradition in scsh naming (e.g., see the tty
>>>> driver options).
Olin> syslog-option/default
Olin> syslog-option/cons >>> syslog-option/console-on-error
Olin> syslog-option/ndelay >>> syslog-option/open-now
Olin> syslog-option/pid >>> syslog-option/include-pid ??? I dunno...
Yes, include-pid is ok. It means that the pid should be part of the
log message.
Olin> syslog-facility/default
Olin> syslog-facility/auth >>> /authorisation
Olin> syslog-facility/daemon
Olin> syslog-facility/kern >>> /kernel
Olin> syslog-facility/local0
Olin> syslog-facility/local1
Olin> syslog-facility/local2
Olin> syslog-facility/local3
Olin> syslog-facility/local4
Olin> syslog-facility/local5
Olin> syslog-facility/local6
Olin> syslog-facility/local7
Olin> syslog-facility/lpr
Olin> syslog-facility/mail
Olin> syslog-facility/user
Olin> syslog-level/default
Olin> syslog-level/emerg >>> /emergency
Olin> syslog-level/alert
Olin> syslog-level/crit >>> /critical
Olin> syslog-level/err >>> /error
Olin> syslog-level/warning
Olin> syslog-level/notice
Olin> syslog-level/info
Olin> syslog-level/debug
OK
>>>> Can y'all explain something to me? Here are (2 of 3 of) the syslog calls
>>>> on my Linux man page:
>>>>
>>>> void openlog( char *ident, int option, int facility)
>>>> void syslog( int priority, char *format, ...)
>>>>
>>>> It says that PRIORITY is a "combination" of facility & level. What does
>>>> this mean? You OR them or add them together to create a priority value? And
>>>> a 0 facility (i.e., just a level value) means "use the facility passed to
>>>> openlog"?
This is explained at
http://www.unix-systems.org/onlinepubs/007908799/xsh/closelog.html
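In short: according to that page, the priority is formed by OR-ing a level with an optional facility, and if no facility is given, the default one set by openlog is used. A small C sketch follows. The helper names are made up, and the assumption that the level sits in the low three bits holds on Linux and the BSDs but is not promised by the standard.

```c
#include <assert.h>
#include <syslog.h>

/* Combine facility and level the way syslog() expects them:
   a plain bitwise OR. */
int make_priority(int facility, int level) {
    /* Assumption: levels fit into the low three bits. */
    assert((level & ~0x07) == 0);
    return facility | level;
}

/* Split a priority again, under the same three-bit assumption. */
int priority_level(int priority) {
    return priority & 0x07;
}

int priority_facility(int priority) {
    return priority & ~0x07;
}
```

So a call like syslog(LOG_USER | LOG_ERR, ...) names both parts, while syslog(LOG_ERR, ...) leaves the facility to whatever openlog set.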
Olin> dot-locking:
Olin> -----------
Olin> Performs an obscure series of open, close, delete and ln,
Olin> ending up in a file named filename.lock.
Olin> ; I'd like to add further locking strategies, all obeying this interface:
Olin> (obtain-fs-lock filename)
Olin> (maybe-obtain-fs-lock filename)
Olin> (release-fs-lock filename)
Olin> (with-fs-lock filename body) :syntax
>>>> I don't understand what particular problem these functions solve,
>>>> and reading the source hasn't helped. Can you explain to me the intended
>>>> use?
You lock a file by creating a file named filename.lock. The
"algorithm" guarantees that only one process will successfully create
this file. That process now holds the lock on the file filename.
Maybe Mike can provide some comments on this.
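For reference, the usual way to do this at the C level looks roughly like the sketch below: create a private temporary file, then hard-link it to filename.lock. This is the well-known dot-locking recipe, not a transcription of the scsh code, and the function names and the temporary-file naming are invented here.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Try once to take the lock for `filename`; true on success. */
bool try_dot_lock(const char *filename) {
    char tmp[4096], lock[4096];
    struct stat st;

    assert(filename != NULL);
    if (snprintf(lock, sizeof lock, "%s.lock", filename) >= (int)sizeof lock)
        return false;
    if (snprintf(tmp, sizeof tmp, "%s.%ld.tmp", filename,
                 (long)getpid()) >= (int)sizeof tmp)
        return false;

    /* Create a temporary file that only this process uses. */
    int fd = open(tmp, O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd < 0)
        return false;
    close(fd);

    /* link() fails if the lock file already exists, so at most one
       process gets past this point. */
    bool got = (link(tmp, lock) == 0);

    /* On some network filesystems link() may report failure although
       the link was made; a link count of 2 settles the question. */
    if (!got && stat(tmp, &st) == 0 && st.st_nlink == 2)
        got = true;

    unlink(tmp); /* the temporary name is no longer needed */
    return got;
}

/* Give the lock back by deleting filename.lock. */
bool release_dot_lock(const char *filename) {
    char lock[4096];

    assert(filename != NULL);
    if (snprintf(lock, sizeof lock, "%s.lock", filename) >= (int)sizeof lock)
        return false;
    return unlink(lock) == 0;
}
```

A blocking obtain would call try_dot_lock in a loop with a sleep in between, which is exactly the polling Olin complains about below.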
>>>> BTW,
>>>> - I think you can simplify this code a little bit:
>>>> (define (create-temp filename)
>>>> (create-temp-file filename))
>>>> - the OBTAIN-LOCK loop waits 1000 seconds between tries. ???
:-) This uses the sleep of the thread structure, which takes milliseconds
as argument...
BTW: sleep is another problem in 0.6: We can't use the syscall for
obvious reasons. We could implement it via the thread
sleep. Unfortunately both procedures are currently named "sleep" so
what to do?
>>>> - Syntax of the form WITH-foo is conventionally accompanied by
>>>> a procedure with a name like WITH-foo*, which takes a thunk
>>>> where WITH-foo has a body block. So there ought to be a WITH-LOCK*
>>>> to go with WITH-LOCK (and then WITH-LOCK is a one-line macro).
OK
>>>> What properties do we want the locking system to have?
>>>> - Locks named in a process-global way in the filesystem namespace?
>>>> - Scalable (no polling)
>>>> If we don't need #1, we can use a hack involving pipes, where the pipe
>>>> either has a single byte in it (unlocked) or no byte (locked). If
>>>> we want locks to be visible in the filesystem namespace, for inter-process
>>>> coordination, can the FILENAME lock-name name a file we can
>>>> modify/delete/create, or must we *not* modify that file? (E.g., perhaps
>>>> we are locking *access* to a perfectly good file?)
>>>> Note that testing for the existence of a file requires polling, so the
>>>> implement-locks-by-creating-a-file trick doesn't scale, and you can do
>>>> it better with named fifos.
We need #1. This is for cooperation with other applications.
Olin> libscsh:
Olin> -------
Olin> Libscsh provides Scsh as a C library. It is intended for applications
Olin> that want to use Scheme/Scsh as their scripting language. This is
Olin> vital to our fight against guile.
>>>> Woo, cool!
Olin> libscsh = scshvm without "main". Call
Olin> int s48_main(long heap_size, long stack_size, char *image_name, int argc,
Olin> char** argv)
Olin> to fire up Scsh out of your own program. By default, s48_main behaves
Olin> just like Scsh itself: it will start a REPL. For batch mode, add the
Olin> appropriate switches to argv (e.g. "-c", "-s",...).
>>>> - Wait, I don't understand. s48_main() is not who determines if a repl
>>>> happens, it's the *image* that determines this. If I dump out an image
>>>> whose top-level does something else, then that's what happens when I
>>>> fire up the vm w/that image.
You're right, it depends on the image. But for images dumped by
dump-scsh and dump-scsh-program, my specification stands.
>>>> - I think we need to export an in-core heap-image or heap data structure.
>>>> Then you could have (1) a function to read a heap image from a file
>>>> into memory, and (2) another function to fire up the vm on a heap.
>>>> One advantage of this is that one could have read-only heap images
>>>> linked into the text segment of the binary, making a standalone
>>>> binary that could call out to scsh quickly.
>>>> - Possibly I am asking for something here that requires too much vm
>>>> hacking. But the s48_main interface above does seem pretty crude.
Mike has code which uses mmap to map in heap images for fast startup,
but it's for an older version of the VM and will require some work to
port to a newer version. Apart from that, I don't see why the
s48_main interface is "crude".
Olin> If you want to add your own C functions to call from Scheme, write
Olin> an initialization function as described in external.ps, but instead of
Olin> adding this function to EXTERNAL_INITIALIZERS of the Makefile, apply
Olin> int s48_add_external_init(void (*init)())
Olin> to it. This feature is new in Scsh; Scheme48 doesn't have it. You
Olin> have to ensure that all calls to s48_add_external_init() happen
Olin> before you call s48_main.
Olin> As the VM uses global variables, it's not possible to start several
Olin> Scshs at the same time.
>>>> This seems like a mistake that will get us eventually. Is there a way
>>>> we can do this, but make the API be such that we leave the path open to
>>>> possibly fixing this later?
How should we ever fix this? This would require hacking the
PreScheme-Compiler. Any volunteers?
To solve your problem, we could specify the maximum number of parallel
running VMs allowed and add a parameter to s48_main saying which of
the VMs you wish to start. This suggests turning the global variables
into arrays, and we could limit the number to 1 for now. Great idea,
isn't it?
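Spelled out in C, the idea would look something like this. It is purely an illustration: the struct fields, MAX_VMS and vm_start are invented, and the real globals come out of the PreScheme compiler.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_VMS 1 /* the limit we would ship with for now */

/* Hypothetical stand-in for the VM's global variables, gathered
   into one record per VM. */
struct vm_state {
    long heap_size;
    long stack_size;
    int running;
};

/* One slot per VM instead of one set of globals. */
static struct vm_state vms[MAX_VMS];

/* Start the VM in slot `vm_index`. Returns 0 on success, -1 if the
   index is out of range or that VM is already running. */
int vm_start(int vm_index, long heap_size, long stack_size) {
    assert(heap_size > 0 && stack_size > 0);
    if (vm_index < 0 || vm_index >= MAX_VMS)
        return -1;
    struct vm_state *vm = &vms[vm_index];
    if (vm->running)
        return -1;
    vm->heap_size = heap_size;
    vm->stack_size = stack_size;
    vm->running = 1;
    return 0;
}
```

Callers would have to pass the index from day one, so raising MAX_VMS later would not change the API.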
--
Martin