NB: The following information is known to be accurate as of revision 851.

Setup

All scripts in the ISG account should be dependent on the central setup program, which performs the following tasks:

  • Sets the PATH environment variable so that all subdirectories of bin (the directory housing setup) are in the PATH, followed by an attempt at a reasonable cross-platform path setting. It guards against unnecessarily regenerating the PATH (a somewhat computationally expensive operation) by storing the path in the environment variable ISG_BIN_SETUP_DIRS, which it does not recompute if it already exists.
  • Modifies the environment so that things known to break code are no longer present; for example, a setting of LD_LIBRARY_PATH recommended by instructors for some courses prevents perl from executing.
  • Attempts to ensure that specific stable versions of perl and bash are available by default in the PATH.
  • Depends on /bin/showpath only if it exists, maximizing the use of xhier-based utilities while ensuring a reasonable degree of portability.
  • Sets environment variables for perl (PERL5LIB) and LaTeX (TEXINPUTS) so information on the ISG account can be found readily by these programs.
  • Provides a bash function which will add required packages to the PATH, dumping an error message if no such package can be found.
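
The ISG_BIN_SETUP_DIRS guard described in the first bullet can be sketched roughly as follows. This is only an illustration, not the code in setup: scan_bin_dirs is a hypothetical stand-in for the expensive directory scan, and the directory names it prints are made up.

```shell
# Hypothetical stand-in for the expensive scan of the bin directory tree.
scan_bin_dirs () {
   printf '%s\n' "$1/admin:$1/marking"
}

# Compute ISG_BIN_SETUP_DIRS only when it is not already in the environment,
# so nested scripts reuse the value exported by the first run.
ensure_setup_dirs () {
   if [ -z "${ISG_BIN_SETUP_DIRS:-}" ]; then
      ISG_BIN_SETUP_DIRS="$(scan_bin_dirs "$1")"
      export ISG_BIN_SETUP_DIRS
   fi
}
```

Because the variable is exported, any script started from this one inherits it and skips the scan.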

Setup can be used in two ways: it can be sourced directly (in which case the program should conditionally re-execute itself to ensure that a proper version of bash is used, checking for the ISG_BIN_SETUP_DIRS environment variable so it doesn't do this endlessly), or it can be executed with a number of arguments, which it will in turn execute in its modified environment.

Note that the requirement of setup means that the entry point of every normal ISG utility must be a bash script.

To keep individual scripts from needing to know where in the directory hierarchy they live, there are symlinks to the main setup in many subdirectories (and this must be done via symlinks). Setup can then just be executed/sourced via "$(dirname "$0")/setup"; the setup utility itself will resolve its real location to determine the root bin directory.

Typical use of setup

All script entry points are symlinks to ISGScriptStartTemplate; see that page for more details.

The template runs a script with the same name but ending with _impl, which then only has a small amount of work to do.

#!/usr/bin/env bash

# Make sure requirepackage, absolute, etc. are defined if needed by sourcing setup...
. "$(dirname "$0")/setup"

# Any needed requirepackage statements now go here

This script can then continue on with whatever other work it needs to do.

Old method

A typical way for all programs to start was as follows:

#!/usr/bin/env bash

# We want to make sure 'env bash' is the *right* bash; if setup has not yet
# been run by this or a calling script, rerun self in the environment created 
# by setup.
if [ -z "$ISG_BIN_SETUP_DIRS" ]; then

   # Try to make sure dirname is defined before calling it
   if [ -x /bin/showpath ]; then
      export PATH=`/bin/showpath standard`
   fi

   exec "$(dirname "$0")/setup" "$0" "$@"
fi

# Make sure requirepackage is defined if needed by sourcing setup...
. "$(dirname "$0")/setup"

# Any needed requirepackage statements now go here

They can then also exec another program if bash is not the desired implementation language.

Or, in the special case where this is just a wrapper for a perl program, this can be condensed to:

#!/usr/bin/env bash

# Try to make sure dirname is defined before calling it
if [ -x /bin/showpath ]; then
   export PATH=`/bin/showpath standard`
fi

exec "$(dirname "$0")"/setup perl <arg1> ... <argN>

The danger with this approach is that, in the absence of /bin/showpath, the script will fall back to the user-supplied PATH to call dirname and resolve the path of setup. This is not safe. The same is true of /usr/bin/env in the sh-bang line at the top of the file.

There is also a larger amount of duplication of code than under the newer recommended system.

Caching environment variables

TODO: Update to current conventions

One of the effects of running setup is that it attempts to clean the environment when it is first run. While this is normally desirable, it occasionally isn't (for example, when running a CGI program). In those cases, you'll need to cache the variables you care about explicitly and re-export them.

As an example, the following code is a modification of the above to explicitly retain the environment variables REMOTEUSER, USER, and TERM:

readonly e_remoteuser="$REMOTEUSER"
readonly e_user="$USER"
readonly e_term="$TERM"

if [ -z "$ISG_BIN_SETUP_DIRS" ]; then

   # Try to make sure dirname is defined before calling it
   if [ -x /bin/showpath ]; then
      export PATH=`/bin/showpath standard`
   fi

   exec "$(dirname "$0")/setup" "$0" "$e_remoteuser" "$e_user" "$e_term" "$@"
fi

export REMOTEUSER="$1"
export USER="$2"
export TERM="$3"

# Shift arguments we added artificially.
shift 3


# Make sure requirepackage is defined if needed by sourcing setup...  Also
# provides the absolute function, which helps this work on ubuntu.student.cs at
# the time of writing (where there is no absolute command).
#
# Since setup has already been run, this call will not wipe our environment variable settings
. "$(dirname "$0")/setup"

NB: This only works for a top-level script (i.e., the script that initially runs setup). If this script may be called either at the top level or by a script that has already run setup, it will require further modification.

Technical Details

The setup program is meant to be able to run in any version of bash from 2.0 onward. It is fairly complex, but that complexity is present here with the goal of removing complexity from the other scripts. They can all assume they are using a sane PATH, with a clean environment, running in bash 3, and with the ability to request particular versions of particular pieces of software to ensure that functionality isn't broken by changes caused by software upgrades.

Helper functions

absolute

This function exists to give a standardized way for scripts to find a path to a file that has all symlinks resolved. While the GNU tools provide readlink, which will accomplish this with the -f option, this was not present on our Solaris systems; there, instead, readlink was a program in the tetex-1.0 package which had a completely different interface. And while the Solaris systems have the MFCF-written absolute tool to handle this, in early iterations it was not present on the Linux machines, and there's always a possibility it will disappear in future setups of those machines.
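
A minimal sketch of how such a function could be written without readlink -f is shown below. It is not the actual absolute implementation: the name absolute_sketch and the 40-hop limit are illustrative, and it assumes a readlink that prints a link's target (as the GNU and BSD versions do), which the text above notes was not the case on the Solaris systems.

```shell
# Print the path of "$1" with all symlinks resolved (illustrative sketch).
absolute_sketch () {
   local target="$1" dir link hops=0

   # Follow symlinks on the final path component, one hop at a time.
   while [ -L "$target" ]; do
      hops=$((hops + 1))
      [ "$hops" -le 40 ] || return 1          # give up on symlink loops
      dir="$(cd "$(dirname "$target")" && pwd -P)" || return 1
      link="$(readlink "$target")" || return 1
      case "$link" in
         /*) target="$link" ;;                # absolute link target
         *)  target="$dir/$link" ;;           # relative to the link's directory
      esac
   done

   # Resolve symlinks in the directory part by physically changing into it.
   dir="$(cd "$(dirname "$target")" && pwd -P)" || return 1
   printf '%s/%s\n' "${dir%/}" "$(basename "$target")"
}
```

Called on one of the setup symlinks in a subdirectory, a function like this would print the location of the real setup file, from which the root bin directory follows via dirname.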

auto_ssh

This provides a way for ssh to operate more reasonably non-interactively, given the expectations and requirements of the ISG scripts. It suppresses any key warnings for localhost and ensures no password prompts are given.

   ssh -q -o "NoHostAuthenticationForLocalhost=yes" -o "StrictHostKeyChecking=no" "$@"

path_add

Takes a directory and only adds it to the PATH if the directory is readable and executable. If it's on a system with /bin/showpath, it will try to use that command to add to the PATH in a way that avoids duplicates; otherwise, grep is used on the PATH to try to prevent this.
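
The fallback behaviour (no /bin/showpath) might look something like the sketch below. The name pathadd_sketch is illustrative, and a case pattern match is used here in place of the grep mentioned above; the real function may differ in both respects.

```shell
# Append a directory to PATH only if it is usable and not already listed.
pathadd_sketch () {
   local dir="$1"

   # Skip anything that is not a readable, searchable directory.
   [ -d "$dir" ] && [ -r "$dir" ] && [ -x "$dir" ] || return 0

   # Skip directories already present; the surrounding colons make the
   # pattern match whole PATH entries rather than substrings.
   case ":$PATH:" in
      *":$dir:"*) return 0 ;;
   esac

   if [ -z "$PATH" ]; then
      PATH="$dir"
   else
      PATH="$PATH:$dir"
   fi
   export PATH
}
```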

init_path

Tries to set up some sort of reasonable default path before all of the other ISG tools are added, making educated guesses about desired directories and their priorities.

initpath () {
   export PATH=''
   pathadd '/bin'
   pathadd '/sbin'
   pathadd '/usr/bin'
   pathadd '/usr/sbin'
   pathadd '/usr/local/bin'
   pathadd '/usr/local/sbin'
   pathadd '/opt/local/bin'
   pathadd '/opt/local/sbin'
   pathadd '/sw/bin'
   pathadd '/sw/sbin'
   pathadd '/software/.admin/bins/bin'
   if [[ -x "/bin/showpath" ]]; then
      export PATH=$("/bin/showpath" -PackageWarnings gnu standard current)
   fi
}

timed_auto_ssh

This is a limited-use function that puts a limit on how long ssh can attempt to run before timing out. It is only used in ISGScriptRequirePackage.
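
How the limit is enforced is not described here. One common shell technique, shown purely as an assumption about the general approach and not as the actual timed_auto_ssh code, is to run the command in the background alongside a watchdog that kills it after a deadline. The name run_with_timeout_sketch is hypothetical.

```shell
# Run a command, but terminate it if it is still running after LIMIT seconds.
# Usage: run_with_timeout_sketch LIMIT command [args...]
run_with_timeout_sketch () {
   local limit="$1"
   shift

   # Start the real command in the background and remember its PID.
   "$@" &
   local pid=$!

   # Watchdog: sleep for the limit, then signal the command if it still exists.
   ( sleep "$limit"; kill "$pid" 2>/dev/null ) >/dev/null 2>&1 &
   local watchdog=$!

   # Collect the command's exit status; a command killed by the watchdog
   # reports a non-zero status (128 + SIGTERM in bash).
   wait "$pid"
   local status=$?

   # The command has finished, so stop and reap the watchdog.
   kill "$watchdog" 2>/dev/null
   wait "$watchdog" 2>/dev/null
   return "$status"
}
```

With a helper like this, a call such as run_with_timeout_sketch 30 auto_ssh somehost somecommand would give up after 30 seconds; the host and command names are placeholders.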

recursedirectories

This is the function that is used to add all of the subdirectories of ~isg/bin to the PATH. It makes explicit use of GNU find, so it will die mysteriously if this is not at the front of the path already. It returns these directories as a colon-delimited list.
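
The core of such a function can be sketched as below. The name recursedirectories_sketch is illustrative, the sort is added only to make the output deterministic, and the real function may pass additional options to GNU find (for example, to skip version-control directories).

```shell
# Print every directory under "$1" (including "$1" itself) as one
# colon-delimited line, suitable for prepending to PATH.
recursedirectories_sketch () {
   find "$1" -type d | LC_ALL=C sort | paste -s -d: -
}
```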

dvips_tmp

Unfortunately, it was not possible to install the same version of LaTeX on both the Solaris and Linux servers, so the two installations have different interfaces. One of the stated purposes of this setup script and ISGScriptRequirePackage is to avoid exactly this situation. Fortunately, the damage appears to be isolated to the dvips command; unfortunately, that means it needs a special hack so it can run on both systems.

# This should make everybody cry.
# Force dvips to work uniformly in absence of tetex-1.0 package.
# This is not natural and undesirable, and I look forward to
# a day when the LaTeX packaging on all systems can be unified
# under a solid unchanging package and this function can be
# removed permanently.
dvips_tmp () {
   local DVIPS="$(new-which dvips)"

   if [[ "$(uname -s)" != 'Linux' ]]; then
      # Assume this is a Solaris 8 system; run the Xhiered dvips
      # in the tetex-1.0 package.
      "$DVIPS" "$@"
   else
      # Assume we're on a system where dvips auto-writes to a file.
      # Now, run dvips, specifying output to a newly-created temporary file,
      # hoping this dvips supports the -o flag
      local tmpfile="$(createUniqueFile /tmp/.dvips.$(whoami))"
      "$DVIPS" -o "$tmpfile" "$@"

      # Then, cat the file and rm it.
      if [ -e "$tmpfile" ]; then
         cat "$tmpfile"
         rm "$tmpfile"
      fi
   fi
}

General Functionality

First, initpath is called; this wipes out any user-supplied path and replaces it with something standardized for every run of setup on this machine.

Next, a check is done to see if setup has been run before (existence of a value in the ISG_BIN_SETUP_DIRS environment variable). If it has not, the following actions are taken:

  • All environment variables with the exception of PATH are unset.
  • BASE_ISG_DIRECTORY is set to ~isg/bin (or the equivalent in a subversion checkout), and ISG_BIN_SETUP_DIRS is set to the result of calling recursedirectories on BASE_ISG_DIRECTORY. See ISGScriptRequirePackage for details.
  • A number of other variables are set:
    • PERL5LIB so various perl modules can be found
    • TEXINPUTS so provided styles for LaTeX are available
    • ISG_PREFERRED_SOLARIS and ISG_PREFERRED_LINUX are set to the round-robin servers for each architecture type.
    • ISG_COURSE_ACCESS_GROUP is set to cs-marks, which is the Unix group used to restrict access to some of the utilities to the course accounts.
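
The first step in the list above, unsetting everything except PATH, could be done along these lines. This is a sketch under assumptions: clean_environment_sketch is a hypothetical name, and awk is used to enumerate the environment because it copes with multi-line values; setup itself may use a different mechanism.

```shell
# Unset every environment variable except PATH.
clean_environment_sketch () {
   local name

   # awk's ENVIRON array lists the names currently in the environment.
   for name in $(awk 'BEGIN { for (k in ENVIRON) print k }'); do
      case "$name" in
         PATH) continue ;;                       # the one variable that is kept
         ''|[0-9]*|*[!A-Za-z0-9_]*) continue ;;  # not a valid shell identifier
      esac
      unset "$name" 2>/dev/null                  # read-only variables stay set
   done
   return 0
}
```

Variables such as PERL5LIB and TEXINPUTS would then be set afresh, so their values never depend on what the calling user had exported.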

At this point, in every case, setup prepends the contents of ISG_BIN_SETUP_DIRS to the PATH, requires the packages perl-5.8.1 and bash-3.0, and then either reruns itself, executes the passed-in command, or falls through if it was sourced.

The rerun behaviour handles the case where we are running on Solaris, because in a bash 2 context not every action above would have worked correctly. The other two depend on how setup was called; it can be run in a context where it will execute another command, guaranteeing this is done in a reasonable environment, or it can be sourced from another bash script so the provided functions like auto_ssh and requirepackage become available for use.

Topic revision: r3 - 2021-09-08 - YiLee
 