The lam-helpfile provides detailed error messages and suggestions on how to fix common problems. In many places in LAM, when an error occurs, this help file is consulted to display a detailed description of the error and, when possible, suggestions on how to fix it. It contains much of the information from the LAM FAQ (particularly on getting LAM up and running).
More tools are expected to use this help file in future releases; if you have suggestions for locations where more detailed error messages would be helpful, please let us know. At present, the following LAM tools use it:
hboot lamboot lamexec lamhalt lamnodes lamwipe mpicc (hcc) mpiCC (hcp) mpif77 (hf77) mpirun recon tkill tping
The help file consists of multiple blocks of help text separated by single-line delimiters. The delimiter lines are of the format:
    -*-programname:topicname-*-
where programname is the general name of the program (or group of programs) that this help message applies to, and topicname is the specific topic that this message applies to.
The special keyword ALL can be used for either the programname or the topicname in some cases; this is usually a "wildcard" value where little specific information is available.
Within the block of the message, lines that begin with a "#" are treated as comments; they are not printed out.
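The block structure described above can be illustrated with a small parser. This is only a sketch, not LAM's own implementation: the delimiter pattern is inferred from the example delimiter "-*-boot:open-hostfile-*-" shown later in this page, and the function name parse_helpfile is hypothetical. It treats any line starting with "#" as a comment.

```python
import re

# Assumed delimiter shape, inferred from the example "-*-boot:open-hostfile-*-".
DELIMITER = re.compile(r"^-\*-([^:]+):(.+)-\*-\s*$")


def parse_helpfile(text):
    """Split help-file text into {(programname, topicname): [message lines]}.

    Hypothetical helper: lines before the first delimiter are ignored, and
    lines beginning with "#" are dropped as comments.
    """
    blocks = {}
    current = None
    for line in text.splitlines():
        match = DELIMITER.match(line)
        if match:
            # A delimiter line starts a new block of help text.
            current = (match.group(1), match.group(2))
            blocks[current] = []
        elif current is not None and not line.startswith("#"):
            blocks[current].append(line)
    return blocks
```

Keying the result on the (programname, topicname) pair keeps lookups simple; how LAM itself resolves the ALL keyword is not modeled here.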
Three special escape sequences can be used within the help message:
The exact location of the help file is configurable. This allows system administrators and/or users to customize the help file for their particular environment.
When LAM attempts to print an error message from the help file, it looks for the help file in the following locations (in order):
    $HOME/lam-helpfile
    $HOME/lam-7.1.2-helpfile
    $HOME/etc/lam-helpfile
    $HOME/etc/lam-7.1.2-helpfile
    $LAMHELPDIR/lam-helpfile
    $LAMHELPDIR/lam-7.1.2-helpfile
    $LAMHOME/etc/lam-helpfile
    $LAMHOME/etc/lam-7.1.2-helpfile
    $TROLLIUSHOME/etc/lam-helpfile
    $TROLLIUSHOME/etc/lam-7.1.2-helpfile
    $SYSCONFDIR/lam-helpfile
    $SYSCONFDIR/lam-7.1.2-helpfile
Note the variable $LAMHELPDIR; this variable can be set according to platform, for example, to provide operating system-specific information, or information specific to particular groups of machines, etc. It can also be set to provide help messages in different languages.
$SYSCONFDIR is typically $prefix/etc, where $prefix is the location where LAM was installed; this is the installation prefix supplied to ./configure when LAM was built (/usr/local/lam-7.1.2 by default). Note, however, that the value of $SYSCONFDIR can be overridden with the --sysconfdir switch when LAM is configured.
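The search order above can be sketched as a function that builds the candidate paths in sequence. This is an illustrative sketch only: the function names are hypothetical, $SYSCONFDIR is passed in as a parameter because it is fixed when LAM is configured rather than read from the environment, and skipping variables that are unset is an assumption about behavior that this page does not specify.

```python
import os


def helpfile_candidates(version, sysconfdir, env=None):
    """Return candidate help-file paths in the documented search order."""
    env = os.environ if env is None else env
    names = ["lam-helpfile", "lam-%s-helpfile" % version]
    # (environment variable, subdirectory) pairs, in documented order.
    roots = [
        ("HOME", ""),
        ("HOME", "etc"),
        ("LAMHELPDIR", ""),
        ("LAMHOME", "etc"),
        ("TROLLIUSHOME", "etc"),
    ]
    dirs = []
    for var, sub in roots:
        base = env.get(var)
        if base:  # assumption: unset or empty variables are skipped
            dirs.append(os.path.join(base, sub) if sub else base)
    dirs.append(sysconfdir)  # build-time value, e.g. $prefix/etc
    return [os.path.join(d, name) for d in dirs for name in names]


def find_helpfile(version, sysconfdir, env=None):
    """Return the first candidate that exists as a file, or None."""
    for path in helpfile_candidates(version, sysconfdir, env):
        if os.path.isfile(path):
            return path
    return None
```

For example, find_helpfile("7.1.2", "/usr/local/lam-7.1.2/etc") would return a user's own $HOME/lam-helpfile ahead of the system-wide copy if both exist.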
The following is an example customization of the help for the hboot and lamboot programs, when the user supplies a host file name that is not found.
    -*-boot:open-hostfile-*-
    %1 could not open the hostfile "%2" for the following reason:

        %perror

    Things to check:

        - ensure that the file exists
          try "ls -l %2"

        - ensure that you have read permissions on the file
          try "cat %2"

    You may not need to specify a host file at all; the system administrators
    have defined all of the Beowulf cluster host names in the LAM default host
    name list. If you wish to use all of the Beowulf nodes, simply execute:

        %1 -v

    If you have any problems with LAM, please send mail to:

        email@example.com
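To illustrate how the placeholders in the example above might be filled in, the following sketch substitutes the two kinds of sequence that appear in it. It rests on assumptions: that %1, %2, and so on stand for positional arguments supplied by the calling program (here the program name and the host file name), and that %perror stands for the system error string for a given errno value. The function render_help is hypothetical and covers only these two forms, not the full set of escape sequences.

```python
import errno
import os
import re

# Matches the two placeholder forms seen in the example: %perror and %1..%9.
PLACEHOLDER = re.compile(r"%(perror|[1-9])")


def render_help(lines, args, errnum):
    """Fill in %N and %perror placeholders in a block of help text."""

    def substitute(match):
        token = match.group(1)
        if token == "perror":
            # Assumption: %perror maps to the system message for errnum.
            return os.strerror(errnum)
        index = int(token) - 1
        # Leave the placeholder untouched if no such argument was given.
        return args[index] if index < len(args) else match.group(0)

    return "\n".join(PLACEHOLDER.sub(substitute, line) for line in lines)


if __name__ == "__main__":
    block = ['%1 could not open the hostfile "%2" for the following reason:',
             "    %perror"]
    print(render_help(block, ["lamboot", "myhosts"], errno.ENOENT))
```

The exact wording printed for %perror depends on the operating system's error strings.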