[Scsh-hackers] Proposal for packaging of scsh libraries

To: scsh-hackers@lists.sourceforge.net
Subject: [Scsh-hackers] Proposal for packaging of scsh libraries
From: Michel Schinz <Michel.Schinz@epfl.ch>
Date: Sat Nov 1 22:23:01 2003
List-id: Discussion among the implementors <scsh-hackers.lists.sourceforge.net>
Sender: scsh-hackers-admin@lists.sourceforge.net
So, as I promised, here is my first draft of a document describing the packaging of libraries for scsh. A few notes about it:

1. This is in fact more a set of notes, ideas and remarks about scsh packages than a draft of a document; my aim is to discuss the various points and collect different points of view, and then produce something looking like a real document for package authors.

2. I have included discussions and remarks about several things which are not really needed *now*, like a central repository of modules, or a tool to download/configure/install packages. I included that just to give an idea of what the ideal situation would be, in my opinion. I also wanted to try to avoid making mistakes now which could be annoying later.

3. I did not really try to make my proposal close to anything which exists now. That is, all the scsh packages I'm aware of will have to be changed if what I propose is accepted. My aim was not to show disdain for these packages, far from it, but since there are only a few of them and they have not established a de facto standard so far (i.e. they pretty much all have different installation procedures), I thought it was better to start from scratch.

4. Nothing is said about packages which include C code, like scx. That's largely because I don't know much about them, so I would be interested to get comments about the issues encountered with them. Also, these packages are implementation-dependent. As long as there is only one scsh implementation that's fine, but we should keep it in mind if we want to talk about them in this document.

So, here is the document, please comment!
---------------------
In Emacs, read this file in -*- Outline -*- mode.

* Goals

The following document proposes a standard way to prepare scsh
packages and install them on a target machine.

The main goals of this proposal are:

  * packages should be easy to find, by being stored at (or referenced
    from) a central repository with good searching capabilities,

  * packages should be easy to install by an end-user, either by hand
    or using a special tool which remains to be defined,

  * packages should be easy to remove from the system, either by hand
    or using the aforementioned tool,

  * it should be easy to find the list of all packages installed on a
    system, with their version,

  * it should be possible to have several versions of a given package
    installed simultaneously on a system, with one of these versions
    used as default,

  * it should be straightforward to use a package from a scsh program,
    through S48's module system; in particular, it should be possible
    to write portable scsh scripts which make use of packages through
    scsh's -ll option.

* Package identification and naming

A scsh package is identified by a name, which must be globally unique.

Several versions of a given package can exist. A version is
specified as a sequence of integers.

A given version of a package is identified by a name which is
obtained by concatenating:
  - the package name,
  - a hyphen ("-"),
  - the package version, with components separated by dots (".").

In what follows, the word "package" is often used to designate some
version of a package.
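
As an illustration of the naming rule above, here is a minimal sketch.
It is written in Python rather than scsh, and the function names are
hypothetical, not part of the proposal. Splitting at the *last* hyphen
is an assumption: it lets package names contain hyphens themselves,
since a version never does.

```python
def full_name(name, version):
    """Build a full name, e.g. ("sunet", (2, 0)) gives "sunet-2.0"."""
    if not version:
        raise ValueError("a version needs at least one component")
    return name + "-" + ".".join(str(component) for component in version)


def split_full_name(full):
    """Split a full name at its last hyphen into (name, version tuple)."""
    name, hyphen, version = full.rpartition("-")
    if not hyphen or not name:
        raise ValueError("no version found in %r" % full)
    # int() raises ValueError if a component is not an integer.
    return name, tuple(int(part) for part in version.split("."))
```

Keeping the version as a tuple of integers, rather than as a string,
means that versions compare component by component.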

* Package repository

There should be somewhere (on scsh.net I would guess) a package
repository, where people can upload their packages.

On the repository, packages should be stored in archive files whose
name is obtained by concatenating:

  - the full name of the package,
  - an extension indicating the kind of archive,
  - an extension indicating the compression method, if any.

All packages should be stored in a *single* directory, to make it
possible to fetch a package whose name is known.

Additionally, there could be a hierarchy of directories to classify
packages by subject. In these directories, links to the actual package
files could be stored.

For example, the packages for Functional Postscript, sunet and
sunterlib could be organised as follows in the package repository:

  all_packages
    sunterlib-0.3.tar.gz
    sunterlib-0.4.tar.gz
    sunterlib-0.5.tar.gz
    sunet-2.0.tar.gz
    fps-1.0.tar.gz

  by_subject
    misc
      (links to all versions of sunterlib in /all_packages)
    text
      (link to fps in /all_packages)
    networking
      (link to sunet in /all_packages)
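
Because all archives live in one directory and follow one naming
rule, a client can work out which versions of a package exist from a
plain listing of file names. The sketch below (Python, hypothetical
helper names) assumes gzip-compressed tar archives, as in the example
above.

```python
ARCHIVE_SUFFIX = ".tar.gz"  # assumption: tar archives compressed with gzip


def archive_name(name, version):
    """File name of a package archive, e.g. "sunet-2.0.tar.gz"."""
    dotted = ".".join(str(component) for component in version)
    return name + "-" + dotted + ARCHIVE_SUFFIX


def versions_of(name, file_names):
    """Versions of `name` found among archive file names, oldest first."""
    prefix = name + "-"
    found = []
    for file_name in file_names:
        if not (file_name.startswith(prefix)
                and file_name.endswith(ARCHIVE_SUFFIX)):
            continue
        parts = file_name[len(prefix):-len(ARCHIVE_SUFFIX)].split(".")
        # Skip names such as "sunet-extras-1.0.tar.gz" when asked for "sunet".
        if all(part.isdecimal() for part in parts):
            found.append(tuple(int(part) for part in parts))
    return sorted(found)
```

Sorting the versions as tuples of integers puts 0.10 after 0.9, which
a comparison of the file names as strings would not.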

* Package configuration, building, testing and installation

Once a package archive has been downloaded from the repository, it
must be decompressed (if needed) and expanded.

The archive should expand to a single directory, whose name is the
full name of the package. This directory should contain at least an
installation script called "install.scm". Invoking this script with
"scsh -s install.scm" should configure and install the package,
according to the rules of the following sections.
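
The two requirements on the archive (a single top-level directory
named after the package, containing "install.scm") are easy to check
mechanically. Here is a minimal sketch in Python, assuming a tar
archive; the function name and the wording of the messages are
hypothetical.

```python
import tarfile


def check_archive(archive_path, full_name):
    """Return a list of problems found in the archive (empty if none)."""
    problems = []
    with tarfile.open(archive_path) as archive:
        names = archive.getnames()
    for name in names:
        # Everything must live under the directory named after the package.
        if name.split("/", 1)[0] != full_name:
            problems.append("unexpected top-level entry: " + name)
    if full_name + "/install.scm" not in names:
        problems.append("missing " + full_name + "/install.scm")
    return problems
```

A repository could run such a check when a package is uploaded, so
that malformed archives are rejected before anyone downloads them.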

    Rationale: Traditionally, on Unix systems, configuration is
    handled by autoconf-produced scripts, and building/installation is
    handled by make.

    For scsh packages, though, a solution based on a scsh installation
    script seems preferable, for the following reasons:

      * autoconf seems superfluous, since it solves problems which are
        hidden by scsh's abstractions (and if running "configure" is
        really needed, it can be done by the scsh script),

      * everything which is important to know to install a package,
        like the location where packages should be put, is (or should
        be) accessible through scsh variables or functions,

      * package authors do not have to learn/use a different language
        for their configuration and installation script, and they have
        the full power of Scheme at their disposal, should they need
        it.

    To help package authors write their installation script, a small
    library could be provided with scsh. This library should contain
    functions to install packages according to the scheme proposed
    here. (That includes functions to get the name of the directory in
    which scsh packages should reside---what is called the package
    root below).

* Layout of installed packages' files

The files which make up a package have to be copied to various
locations on the target file system during installation. This section
discusses the layout of the part of the target file system containing
package files.

I propose the following layout:

  - There is somewhere a directory which contains all files related to
    scsh packages. This directory could be by default
    $prefix/lib/scsh/modules (where $prefix is the prefix given to
    configure during scsh's installation). This directory is called
    the "package root".

  - The package root contains two sub-directories (and nothing else):
    the first, called "installed", contains all versions of all
    installed packages; the second, called "active", contains exactly
    one symbolic link per installed package (see below). Both of these
    directories are included in scsh's default library directories.

  - The "installed" directory contains exactly one directory per
    package installed on the system. These directories have the same
    name as the package they contain (e.g. "sunet" or "sunterlib").
    They are referred to as "package directories" below.

  - Every package directory contains one or more directories, each
    containing a specific version of the package; these directories
    have the name of the version (e.g. "2.0" or "0.5.4"). They are
    referred to as "package version directories" below.

  - Every package version directory contains one file called "pkg.scm"
    (maybe "packages.scm" if we want to keep the current practice)
    which contains the definition of *all* the interfaces and
    structures which make up the package.

  - Every package version directory contains a sub-directory called
    "doc" containing the documentation about the package; in this
    sub-directory there are further sub-directories which hold the
    documentation in various formats:

      - "html" contains the HTML documentation, if any; this
        directory should contain at least a file called "index.html"
        which is the entry-point of the documentation,

      - "pdf" contains the PDF documentation, if any,

      - and so on for other formats (PostScript, text, info, ...).

    [Unfortunately, this organisation has a drawback: it is not really
     compatible with tools which look up documentation in a given set
     of directories. For example, "man" looks for documentation in the
     directories given by the MANPATH environment variable. This
     variable would have to be augmented each time a package is
     installed, which is not easy to do. I don't think this is a real
     problem, as "man" might not be the format of choice these days.]

  - For every package installed, there is exactly one symbolic link
    in the "active" directory. The link has the name of the package
    and points to a specific version directory of the package, which
    is the "active" one, i.e. the one to use by default.
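
To make the layout concrete, here is a minimal sketch of the path
computations and of the "activation" step an installation library
might perform. It is written in Python for illustration only; all
function names are hypothetical, and the use of a relative link
target is an assumption.

```python
import os


def package_root(prefix):
    """Default package root: $prefix/lib/scsh/modules."""
    return os.path.join(prefix, "lib", "scsh", "modules")


def version_directory(root, name, version):
    """Directory holding one version of a package: installed/<name>/<version>."""
    return os.path.join(root, "installed", name, version)


def activate(root, name, version):
    """Point active/<name> at installed/<name>/<version>."""
    if not os.path.isdir(version_directory(root, name, version)):
        raise ValueError("%s %s is not installed" % (name, version))
    link = os.path.join(root, "active", name)
    os.makedirs(os.path.dirname(link), exist_ok=True)
    if os.path.islink(link):
        os.remove(link)  # replace the link to the previously active version
    # A relative target keeps the link valid if the package root is moved.
    os.symlink(os.path.join("..", "installed", name, version), link)
```

Calling activate again with another installed version is all it
takes to change the default.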

With such an organisation, using a given package from a script can be
done with a simple

  -ll <package_name>/pkg.scm

which will select the active version of the package. A specific
version can be requested with a slightly more verbose syntax:

  -ll <package_name>/<package_version>/pkg.scm

(This relies on the fact that both the "active" and the "installed"
directories are in scsh's library directories).
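
The lookup this relies on can be sketched as follows. This is a
Python model for illustration, not scsh's actual implementation; it
assumes that a relative file name is tried against each library
directory in turn and that the first existing file wins.

```python
import os


def resolve_library_file(library_dirs, relative_path):
    """Return the first existing file relative_path under library_dirs, or None."""
    for directory in library_dirs:
        candidate = os.path.join(directory, relative_path)
        if os.path.isfile(candidate):
            return candidate
    return None
```

With the library directories set to active/ and installed/,
"sunet/pkg.scm" is found through the symbolic link in active/, while
"sunet/1.0/pkg.scm" only exists under installed/. The two forms
cannot clash as long as no package ships a sub-directory named like
one of its own versions.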

Note: if the files "pkg.scm" were given the name of their package
(e.g. "sunterlib.scm"), using a package could even be easier thanks
to scsh's ability to recursively search directories containing
modules. However, such a search would slow down startup, which makes
me prefer the above solution.

    Rationale: There are basically two ways to organise the file
    hierarchy:

      1. by grouping files by "kind",

      2. by grouping files by package.

    Grouping files by kind means that, for example, the documentation
    for all packages will reside under a common "documentation"
    directory, and all the Scheme code will reside in another
    directory. It is the standard way of organising files under Unix,
    where (for example) all executable files are put in /bin.

    Grouping files by package means that *all* the files which belong
    to a package reside in a directory, and this directory (and its
    children) contains *only* files belonging to that package.

    Among the two techniques described above, I think the second one
    has several advantages:

      - removing a package is trivial (just delete the directory
        containing it),

      - it is easy to know to which package some file belongs,

      - it is easy to obtain a list of all installed packages.

    All these operations can be accomplished without some external
    database containing information about the packages, which is
    something I want to avoid. Such databases are not trivial to
    maintain (i.e. a tool has to be written to manage them) and they
    can get out-of-sync with the file system if some user plays
    directly with the files which make up packages.

To give a small example of the layout proposed above, here is what the
package root would look like if versions 1.0 and 2.0 of package
"sunet" are installed (the latter being active), as well as version
"1.2.3" of package "sunterlib".

  installed/
    sunet/
      1.0/
        pkg.scm
        doc/
          ...
        ...
      2.0/
        pkg.scm
        doc/
          html/
            index.html
            ...
        ...
    sunterlib/
      1.2.3/
        pkg.scm
        doc/
          ...
        ...
  active/
    sunet (symbolic link to ../installed/sunet/2.0)
    sunterlib (symbolic link to ../installed/sunterlib/1.2.3)

Such a layout makes it quite easy to perform the operations which I
would consider common, namely:

  - using a package, be it some specific version or the currently
    active one: see example above,

  - removing version <v> of package <p>: remove directory
    installed/<p>/<v> and remove or correct symbolic link in active/
    (removing all versions of a package is equally trivial, of course),

  - listing all packages installed: list the contents of installed/
    (and its first level of sub-directories to know which versions are
    installed),

  - finding out which version of a package is active: look at what
    the symbolic link in active/ points to.
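
The operations listed above need nothing but the file system, as the
following sketch suggests. It is in Python for illustration; the
function names are hypothetical, and two choices are assumptions
rather than part of the proposal: the link in active/ is removed
(not redirected) when the active version goes away, and a package
directory left empty is removed too.

```python
import os
import shutil


def _version_key(version):
    # Compare versions as integer sequences, so that "0.10" follows "0.9".
    return tuple(int(part) for part in version.split("."))


def installed_packages(root):
    """Map each installed package name to its versions, oldest first."""
    installed = os.path.join(root, "installed")
    return {
        package: sorted(os.listdir(os.path.join(installed, package)),
                        key=_version_key)
        for package in os.listdir(installed)
    }


def active_version(root, package):
    """Version that active/<package> points to, or None if there is no link."""
    link = os.path.join(root, "active", package)
    if not os.path.islink(link):
        return None
    return os.path.basename(os.path.normpath(os.readlink(link)))


def remove_version(root, package, version):
    """Delete installed/<package>/<version> and tidy up around it."""
    if active_version(root, package) == version:
        os.remove(os.path.join(root, "active", package))
    package_dir = os.path.join(root, "installed", package)
    shutil.rmtree(os.path.join(package_dir, version))
    if not os.listdir(package_dir):
        os.rmdir(package_dir)  # no version left: the package is gone
```

None of these functions consults a database: the directory tree
itself is the only record of what is installed.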

* Image creation

Not written yet.

* Automatic management of packages

Ultimately, a tool to automate the management of packages could be nice
to have. Such a tool would make it possible to download, install,
update, list and remove packages.

Some "meta-data" about packages would need to be defined somewhere,
like the dependencies among packages.
---------------------

Michel.