Software Distribution Standards

Table of Contents

  1. Introduction
  2. User's perspective
  3. Pathnames
  4. Common system file modifications
  5. Products, subproducts and filesets
  6. Dependencies
  7. Versions
  8. Control scripts
  9. Compilation
  10. Software distribution
  11. Porting flow chart
  12. Configuration of machines used for porting/building packages

Introduction

The Software Porting and Archive Centre for HP-UX is a consortium of universities and HP-UX user groups dedicated to improving access to the thousands of high-quality public domain software packages for users and system administrators. The Centre ports and redistributes public domain software for HP-UX series 700 and 800 workstations and servers. To make the software as easy as possible to use, the Centre creates binary installation packages that can be installed on HP workstations using Software Distributor (SD-UX), just like HP software. The software is available through a wide variety of channels, including installation over the network and from CD-ROM.

In order for the software to fit seamlessly into a wide variety of HP-UX installations, and in order for the various software packages to work together without conflicts, every package must be carefully configured to match standard configurations.

One goal of the archive is to make it possible for users to upgrade their operating systems and still have all the previously installed software remain functional without requiring upgrades of all the Centre's software. Since the Centre is continually updating the software archive, and since the Centre does not match the HP-UX release schedule, binary packages distributed by the Centre must be able to run on several major and minor releases of the operating system. In addition, the software packages are updated independently, which means the Centre must minimize the effects and scope of version-specific dependencies between packages.

There are some packages for which the standard configuration may not make sense, such as perl, and it is expected that the porting engineers will use their best judgement. Basically, the software should follow the "Rule of least surprises", meaning we should offer the users as few surprises as possible.

User's perspective

There are two important classes of users: system administrators and end-users. The system administrator has to worry about installation and maintenance of the system, especially the software, while the end-user is the one who eventually uses the software. Of course, a single person often fills both roles at once.

From the system administrator's perspective, the software should be installed using the same software installation and management tools that are used to install software from HP. The software should also be configured in a standard way to minimize surprises, and different packages should work together without conflicts. Finally, system administrators frequently administer multiple machines, so the software should be configured so that administrators can allow multiple machines to use it via a read-only shared file system.

The end-users simply want to use the software without worrying about how each individual package is configured and where it is installed. It is important to remember that there are several environment variables that must be configured properly in order for the software to fit seamlessly in the user's environment. Whenever possible, package-specific modifications to each end-user's environment should be minimized by re-using common installation paths. Some examples of environment variables that may need to be correctly configured for packages to work properly are PATH, XUSERFILESEARCHPATH, and MANPATH.

System administrator

System administrators have several options for installing software: from CD-ROM, over the network, and from installation files. Administrators managing more than one machine can use a read-only shared file system for most of the files, and then install machine-specific filesets on each machine for those packages that need them.

To the system administrator, software from the Software Porting and Archive Centre should have a look and feel very similar to software delivered by HP, particularly when it is installed from a CD-ROM. The product/subproduct/fileset naming conventions should mirror the HP conventions, and the file layouts should mirror the HP file system layout.

End-user

As much as possible, end-user visible differences between Centre software and HP software should be minimized. Also, the end-user should have to make as few customizations as possible to their environment to use the Centre software.

End-users will have to make some simple modifications to their environment to use installed Centre software, primarily extending various environment PATH variables to include /opt/<appname>-based directories. Some of these additions can be handled invisibly by the package control scripts during installation by updating the /etc/PATH, /etc/MANPATH, and /etc/SHLIB_PATH files. The user may need to add the following entries to the listed environment variables:

Environment Variable    Entry
PATH                    :/opt/<appname>/bin:/opt/<appname>/bin/X11:
XUSERFILESEARCHPATH     :/opt/<appname>/lib/X11/app-defaults/:
MANPATH                 :/opt/<appname>/share/man:/opt/<appname>/share/man/%L:
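The entries in the table follow mechanically from the application name. A minimal Python sketch of that derivation is given here; the function names `environment_entries` and `extend`, and the example package name `gzip`, are hypothetical and used only for illustration. Dropping empty elements when extending a value is a simplification made for this sketch.

```python
def environment_entries(appname):
    """Return the search-path entries a user may need for one package.

    The directories are the ones listed in the table above; the function
    name and the shape of the return value are illustrative assumptions.
    """
    root = "/opt/" + appname
    return {
        "PATH": [root + "/bin", root + "/bin/X11"],
        "XUSERFILESEARCHPATH": [root + "/lib/X11/app-defaults/"],
        "MANPATH": [root + "/share/man", root + "/share/man/%L"],
    }


def extend(current, entries):
    """Append entries to a colon-separated value, skipping duplicates.

    Empty elements in the current value are dropped (a simplification).
    """
    parts = [p for p in current.split(":") if p]
    for entry in entries:
        if entry not in parts:
            parts.append(entry)
    return ":".join(parts)
```

For example, `extend("/usr/bin", environment_entries("gzip")["PATH"])` yields a PATH value with the two package directories appended after `/usr/bin`.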

Pathnames

The single most important aspect of configuring software is determining where the software should be installed. The configuration should conform to the HP-UX specification for optional software packages. It should also support accessing the software through a shared read-only network file system, although some packages may require some configuration or log files to be located in a system-dependent, writeable location.

Some packages will not easily conform to these standards, and the porting engineers should use their discretion. However, changes that are visible to the user (e.g. requiring new elements in the PATH or MANPATH) should be absolutely minimized.

A software package may contain several types of files, some generated by the original author and some created by the Centre.

In a networked environment with software shared between machines there are two types of files included in a software package: shared/read-only and machine-specific/writable. Shared files may reside in a shared location and may be used by all the machines while machine-specific files cannot be shared by different machines. In the file system layout, shared files are located under /opt, while machine-specific files are located under /var or /etc. However, files should not be directly installed into /var or /etc. Rather they should be installed under /opt and the control scripts should copy them to the machine-specific location during configuration.

The paths would look like this:

Access type                  Pathname                              Description
Shared / Read-only           /opt/<appname>/adm                    System admin scripts/config files
                             /opt/<appname>/bin                    Binaries (non-X11 progs)
                             /opt/<appname>/bin/X11                Binaries (X11 progs)
                             /opt/<appname>/etc                    System admin scripts/config files
                             /opt/<appname>/include/X11            Include files for prog (X11)
                             /opt/<appname>/include                Include files for prog (non-X11)
                             /opt/<appname>/lbin                   Run-time back-end executables
                             /opt/<appname>/lib/X11/app-defaults   X11 application defaults files
                             /opt/<appname>/lib/X11                Prog run-time files (X11)
                             /opt/<appname>/lib                    Prog run-time files (non-X11)
                             /opt/<appname>/newconfig              Machine-specific files to be copied to /etc and /var
                             /opt/<appname>/sbin                   Binaries (system administration)
                             /opt/<appname>/share                  Architecture-independent shareable files
                             /opt/<appname>/share/doc              Off-line prog docs (non-X11)
                             /opt/<appname>/share/info             GNU Info files
                             /opt/<appname>/share/lib              Miscellaneous shareable libraries
                             /opt/<appname>/share/man              Man page tree (man1, man1.Z etc.)
                             /opt/<appname>/src                    Sources for prog
Machine-specific / Writable  /etc/opt/<appname>                    Machine-specific configuration files
                             /var/opt/<appname>                    Machine-specific and log files
                             /var/opt/<appname>/spool              Queueing and spooling area
                             /sbin/init.d                          System startup scripts
                             /etc/rc.config.d                      System startup configuration scripts
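The rule that machine-specific files are shipped under /opt and copied into place during configuration can be sketched as follows. This is a minimal Python illustration, not the Centre's actual control script: the function name `install_newconfig` is hypothetical, and the policy of never overwriting an existing file is an assumption made here so that local edits survive a reinstall.

```python
import os
import shutil


def install_newconfig(newconfig_dir, target_dir):
    """Copy files from a package's newconfig area to a machine-specific area.

    Files that already exist under target_dir are left untouched (an
    assumed policy). Returns the sorted relative paths that were copied.
    """
    copied = []
    for dirpath, _dirnames, filenames in os.walk(newconfig_dir):
        rel_dir = os.path.relpath(dirpath, newconfig_dir)
        dest_dir = os.path.normpath(os.path.join(target_dir, rel_dir))
        os.makedirs(dest_dir, exist_ok=True)
        for name in filenames:
            dest = os.path.join(dest_dir, name)
            if os.path.exists(dest):
                continue  # keep the administrator's version
            shutil.copy2(os.path.join(dirpath, name), dest)
            copied.append(os.path.normpath(os.path.join(rel_dir, name)))
    return sorted(copied)
```

A configure script would call this with the package's newconfig directory as the source and a location such as /etc/opt/<appname> as the target.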

Key tradeoffs on the major pathname configuration proposals:

  1. /opt/<appname>/{bin,lib,share,...}
    1. Follows the standard filesystem layout for HP-UX 10.x
    2. Reduces possibility of inter-package conflicts
    3. Risk of conflicts between versions of a single software package
    4. Configuration of some utilities becomes difficult (e.g. imake)
    5. Explosion in length of various PATH environment variables
  2. /opt/<appname>-<version>/{bin,lib,share,...}
    1. Allows a system to have multiple versions of the same software without conflicts
    2. Exacerbates explosion in length of various PATH environment variables
    3. Potentially adds to the pollution of /opt with obsolete versions of software
    4. Means that every software dependency becomes version specific because software has to know how to find the files it depends upon.
  3. /opt/hppd/{bin,lib,share,...}
    1. Dramatically reduces the explosion of directories under /opt
    2. Minimal effect on various PATH environment variables
    3. Incompatible with the new standard
    4. Greater potential for files to collide between packages
    5. imake configuration is simpler
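The effect of each proposal on the PATH variable can be made concrete with a short Python sketch. The function name `path_value`, the layout labels, and the package names and versions in the usage note are hypothetical; only the directory patterns come from the three proposals above.

```python
def path_value(layout, packages):
    """Build the PATH additions needed for a set of installed packages.

    packages is a list of (appname, version) pairs. layout is one of
    "per-package", "per-version" or "shared", matching proposals 1 to 3.
    """
    if layout == "per-package":
        dirs = ["/opt/%s/bin" % name for name, _version in packages]
    elif layout == "per-version":
        dirs = ["/opt/%s-%s/bin" % (name, version) for name, version in packages]
    elif layout == "shared":
        dirs = ["/opt/hppd/bin"]
    else:
        raise ValueError("unknown layout: " + layout)
    return ":".join(dirs)
```

With two packages such as `("gzip", "1.2.4")` and `("perl", "5.002")`, the first two layouts each add two directories to PATH, while the shared layout adds one regardless of how many packages are installed.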

Common system file modifications

There are a number of system files which will need to be updated by some packages during installation. These updates should be handled automatically by the control scripts. This is a list of some of the more common file modifications:

File              Modification
/etc/rc.config    add references to files in /etc/rc.config.d for system startup
/etc/PATH         add system-wide PATH elements
/etc/MANPATH      add system-wide MANPATH elements
/etc/SHLIB_PATH   add system-wide SHLIB_PATH elements
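Because a package may be configured and unconfigured more than once, updates to files such as /etc/PATH need to be idempotent. The following Python sketch shows one way to do that, assuming the file holds a single colon-separated list; the function names are hypothetical and this is not the interface of any existing control-script library.

```python
def _read_elements(path_file):
    """Return the non-empty elements of a colon-separated list file."""
    try:
        with open(path_file) as handle:
            current = handle.read().strip()
    except FileNotFoundError:
        current = ""
    return [p for p in current.split(":") if p]


def _write_elements(path_file, parts):
    with open(path_file, "w") as handle:
        handle.write(":".join(parts) + "\n")


def add_path_element(path_file, element):
    """Append element if absent. Returns True if the file was changed."""
    parts = _read_elements(path_file)
    if element in parts:
        return False
    _write_elements(path_file, parts + [element])
    return True


def remove_path_element(path_file, element):
    """Remove element if present. Returns True if the file was changed."""
    parts = _read_elements(path_file)
    if element not in parts:
        return False
    _write_elements(path_file, [p for p in parts if p != element])
    return True
```

A configure script would call `add_path_element`, and the matching unconfigure script would call `remove_path_element` with the same arguments.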

Products, subproducts and filesets

Each software package should be contained in its own SD-UX product. Larger and more complicated packages should be broken down into subproducts and filesets which can be loaded independently. In order for users to easily determine which subproducts or filesets they require, we should follow conventions similar to HP's guidelines. This proposal is modelled on the policies that are used to develop HP-UX.

Subproducts

Only the larger software packages will require subproducts, and not all packages will use all subproducts.

Name             Description
Runtime          Filesets needed to execute the functionality in the product. Examples are most executables and shared libraries. [-RUN, -MIN, -AUX, -SHLIBS filesets]
MinimumRuntime   A strict subset of Runtime. Contains only the filesets needed for a minimum configuration of the product.
Manuals          Man pages and any other electronic documentation
Development      Filesets required only to do software development
Source           Software sources [-SRC fileset]

Filesets

Name          Description
base-RUN      The run-time fileset for base. If base is entirely in the execution environment, the -RUN suffix can be omitted (just name the fileset base). If the product's execution environment must be broken into smaller pieces for configurability, instead use -MIN and -AUX as shown below.
base-MIN      The minimum run-time fileset for base.
base-AUX      Auxiliary files for base other than those in base-MIN.
base-SHLIBS   Shared libraries for base.
base-PRG      Programming environment for base. If the header files for base programming are small, include them in this fileset.
base-INC      If header files for base programming must be placed in a separate fileset, use base-INC.
base-MAN      The manual entries associated with base. If the release notes for this product are small in size, include them in this fileset.
base-NOTES    Release notes for this package (if any). This should only be defined if the release notes are fairly large (say more than 50kB). Otherwise include the release notes in the base-MAN fileset.
base-SRC      The sources for base.
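The suffix conventions lend themselves to a simple lookup from fileset name to subproduct. The Python sketch below is illustrative only. The Runtime and Source assignments are taken from the subproduct table; placing -MAN and -NOTES under Manuals and -PRG and -INC under Development is an inference from the descriptions, not something the tables state. The function name `subproduct_for` is hypothetical.

```python
SUBPRODUCT_BY_SUFFIX = {
    # Stated in the subproduct table:
    "RUN": "Runtime",
    "MIN": "Runtime",
    "AUX": "Runtime",
    "SHLIBS": "Runtime",
    "SRC": "Source",
    # Assumed assignments, inferred from the descriptions:
    "MAN": "Manuals",
    "NOTES": "Manuals",
    "PRG": "Development",
    "INC": "Development",
}


def subproduct_for(fileset):
    """Return the subproduct a fileset belongs to, based on its suffix.

    A name without a recognized suffix is treated as a run-time fileset,
    since the -RUN suffix may be omitted.
    """
    _base, sep, suffix = fileset.rpartition("-")
    if sep and suffix in SUBPRODUCT_BY_SUFFIX:
        return SUBPRODUCT_BY_SUFFIX[suffix]
    return "Runtime"
```

Note that a -MIN fileset would also appear in MinimumRuntime, which the sketch does not model because that subproduct is a subset of Runtime.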

Dependencies

Software frequently requires the presence of other software to run correctly; these requirements are known as dependencies. There are many different reasons one package requires another to operate correctly. Each reason has different implications for system administrators. For example, it is common for a dynamically linked executable to require the presence of shared libraries belonging to other software packages. It is also common for one program to call another program or for one program to require a configuration file from another software package.

Software Distributor allows package developers to specify two types of dependencies: corequisite and prerequisite dependencies. Corequisite dependencies imply that the other package must be present only by the time you use the software, while prerequisite dependencies imply that the other package must be present before you install the software. Nearly all dependencies are of the corequisite variety; prerequisite dependencies are very rare.

Dependencies can include version specifications. This is useful when building a package that uses a shared library from another package, because shared libraries are rarely compatible across versions.
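The difference between the two dependency types can be expressed as a small check. The Python sketch below is a simplified model of the distinction as described in this section, not of Software Distributor's actual resolution logic; the function name and the package names in the usage note are hypothetical.

```python
def check_dependencies(installed, selected, prerequisites, corequisites):
    """Report unmet dependencies for a package about to be installed.

    installed: products already on the system.
    selected: other products being installed in the same session.
    Prerequisites must already be installed; corequisites only need to be
    present by the time the software is used, so products selected in the
    same session also satisfy them (a simplifying assumption).
    Returns (missing_prerequisites, missing_corequisites), each sorted.
    """
    installed = set(installed)
    selected = set(selected)
    missing_pre = sorted(set(prerequisites) - installed)
    missing_co = sorted(set(corequisites) - installed - selected)
    return missing_pre, missing_co
```

For instance, a package with a corequisite on a hypothetical `libfoo` can be installed together with `libfoo` in one session, whereas a prerequisite on `libfoo` would be reported as missing until `libfoo` is already on the system.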

Versions

Every package should have a version associated with it. Some packages are not assigned versions by their authors, in which case the porting engineer must assign a version number.

One of the most difficult problems currently faced by the Centre is specifying a general process for handling different versions of the same software package so version upgrades can proceed seamlessly. In general you would like executables and libraries to be available in a version-neutral form or name, so when the package is upgraded you invisibly start using the new version. Sometimes there are incompatible changes between software versions (e.g. perl4 and perl5), in which case the upgrade is not invisible and the version number must be exposed. This issue of version specification in files and dependencies is the major remaining open issue.

There is also the problem of managing the obsolescence of old versions of software. In most cases you probably want to obsolesce (remove) old software versions as soon as they are no longer required by any other package (or shortly thereafter). In some cases you want to keep the old software around forever because there is so much legacy software (not contained in the archive) that depends on the old version. A perfect example is perl4.

Whenever a package is upgraded, all packages that depend on the upgraded package should be checked for version incompatibility. If possible they should be modified (e.g. recompiled) to match the new package; otherwise the dependency should be modified to indicate the old version of the upgraded package.
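The check performed after an upgrade can be sketched as follows. This Python example is a minimal illustration under stated assumptions: the data structure (a mapping from each package to the versions it requires of its dependencies, with `None` meaning any version) and the function name `review_dependents` are hypothetical, and comparing versions by simple equality is a simplification.

```python
def review_dependents(dependencies, upgraded, new_version):
    """List the packages affected by an upgrade and what to do with each.

    dependencies maps a package name to a dict of
    {dependency name: required version, or None for any version}.
    Returns a dict mapping each dependent package to "ok" or to a note
    that it must be recompiled or kept on the old version.
    """
    report = {}
    for package, deps in dependencies.items():
        if upgraded not in deps:
            continue  # this package does not depend on the upgraded one
        required = deps[upgraded]
        if required is None or required == new_version:
            report[package] = "ok"
        else:
            report[package] = (
                "recompile for %s, or keep the dependency on %s"
                % (new_version, required)
            )
    return report
```

Packages reported as "ok" need no action; the others are the ones the paragraph above says should be recompiled or pinned to the old version.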

Control scripts

Software Distributor allows the software publisher to create ten control scripts that are used by SD-UX to perform various actions at different parts of the software life cycle. Some of the scripts are used during installation, configuration, de-configuration, or de-installation. Most activity should be done by the package's configure/unconfigure scripts rather than the pre/post-install or pre/post-deinstall scripts. In particular, copying files to the local system area and modifying system configuration files should be done by the configuration scripts.

There will be a script (/opt/hppd/etc/hppd.control) that provides routines for many common configure/deconfigure operations which can be used by control script authors. The script will be included in a base HPPD fileset (HPPD-CORE) which will be required by all packages with control scripts. The script will provide functions to:

In addition, there is a script /usr/lbin/sw/control_utils that contains general purpose utilities and procedures that can be used by control scripts.

Control scripts will never require user input. When user input is required or desired, the package will supply an additional script to be run by the user after installation, and the configure script will echo a message informing the user of this second script and giving a sample command line to be used to execute it. (Requiring user input during a configure script will cause the installation to hang.)

Compilation

There are a number of issues associated with compiling software so it will run properly on as many machines as possible. Since disk space is becoming less important over time, software should be compiled using static linking because it is most likely to work on as many machines as possible. Optionally, some software may be compiled using dynamic linking.

One critical issue is static vs. dynamic linking.

  1. Dynamic linking
    1. Reduced disk space consumption
    2. Dynamic loading works
    3. Very sensitive to the specific versions of every shared library (e.g. a patch or version upgrade to a shared library might break dozens of applications)
  2. Static linking
    1. Reduced sensitivity to version-specific dependencies
    2. Increased disk space consumption
    3. It is unclear whether dynamic loading works for statically linked binaries (e.g. a dynamically loaded module might require a function from libc.a which was not linked into the executable because the base executable did not reference that function)

For some packages the difference in size can be quite dramatic. However, statically linked binaries are far less fragile in the face of operating system and software upgrades. In addition, it requires some user sophistication to understand the difference between the two formats and to determine when each format is appropriate. So far the Centre has compromised by offering both statically and dynamically linked binary packages, but this compromise is somewhat unsatisfying.

Some standard configuration issues to address when compiling the software:

Software distribution

Software will be distributed in Software Distributor format through several channels: CD-ROM, anonymous FTP, and network installation. Each mirror site will continue to offer the individual software packages in separate compressed Software Distributor files. In addition, each mirror site will provide a Software Distributor depot containing all of the archive software, from which users can install software directly onto their machines over the network. The software depots should be updated nightly as part of the daily mirroring activity. If we continue to distribute two versions of software (statically linked and dynamically linked), then we should probably have two depots, one for statically linked packages and one for dynamically linked packages.

Software contained in the depots should be compressed both to save disk space and for the network installation to reduce network bandwidth requirements.

Periodically, some of the partners may create and distribute CD-ROMs containing a copy of the Software Distributor depot which will allow users to install software onto their machine from the CD-ROM. It is possible to create the CD-ROMs so that they contain a copy of the archive's web interface for improved browsing/searching capabilities. The web pages can include push-button installation actions that automatically install desired software.

Porting flow chart

Configuration of machines used for building/installing packages

 


Author: Carl Staelin <staelin@hpl.hp.com>
Last modified: May 31, 1996
Version: 0.2