Section: Sun Grid Engine Administrative Commands (8)
NAME
sge_shadowd - Sun Grid Engine shadow master daemon
SYNOPSIS
sge_shadowd
DESCRIPTION
sge_shadowd
is a lightweight process that can be run on so-called shadow
master hosts in a Sun Grid Engine cluster to detect failure of the current
Sun Grid Engine master daemon, sge_qmaster(8),
and to start up a new sge_qmaster(8)
on the host on which
sge_shadowd
runs. If multiple shadow daemons are active in a cluster, they
run a protocol which ensures that only one of them will start up
a new master daemon.
The hosts suitable for being used as shadow master hosts must have
shared root read/write access to the directory $SGE_ROOT/$SGE_CELL/common
as well as to the master daemon spool directory
(by default $SGE_ROOT/$SGE_CELL/spool/qmaster).
The names of the shadow master hosts must be listed in the file
$SGE_ROOT/$SGE_CELL/common/shadow_masters.
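As a rough illustration of the host-list requirement, the following Python sketch checks whether a host name appears in a shadow_masters file. It assumes one host name per line and case-insensitive comparison of names; neither detail is stated on this page, and the function name is_shadow_master is hypothetical.

```python
def is_shadow_master(hostname, shadow_file):
    """Return True if hostname is listed in shadow_file.

    Assumption of this sketch: the file holds one host name per line,
    and names are compared case-insensitively.
    """
    with open(shadow_file) as fh:
        # Skip blank lines and strip surrounding whitespace.
        listed = {line.strip().lower() for line in fh if line.strip()}
    return hostname.lower() in listed
```

A caller would typically pass socket.gethostname() and the path $SGE_ROOT/$SGE_CELL/common/shadow_masters expanded from the environment.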
RESTRICTIONS
sge_shadowd
may only be started by root.
ENVIRONMENT VARIABLES
SGE_ROOT
Specifies the location of the Sun Grid Engine standard configuration
files.
SGE_CELL
If set, specifies the default Sun Grid Engine cell. To address a Sun Grid Engine
cell
sge_shadowd
uses (in the order of precedence):
The name of the cell specified in the environment
variable SGE_CELL, if it is set.
The name of the default cell, i.e. default.
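The precedence above can be sketched in a few lines of Python. This is an illustration only: the function name resolve_cell is hypothetical, and treating an empty SGE_CELL value as unset is an assumption of the sketch.

```python
import os

def resolve_cell(env=None):
    """Return the cell name using the precedence described above."""
    env = os.environ if env is None else env
    # 1. SGE_CELL, if set (an empty value is treated as unset here,
    #    which is an assumption of this sketch).
    # 2. Otherwise the default cell, named "default".
    return env.get("SGE_CELL") or "default"
```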
SGE_DEBUG_LEVEL
If set, specifies that debug information
should be written to stderr. The value also defines the level of
detail of the debug information that is generated.
SGE_QMASTER_PORT
If set, specifies the TCP port on which
sge_qmaster(8) is expected to listen for communication requests.
Most installations will instead use a services map entry for the
service "sge_qmaster" to define that port.
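A minimal Python sketch of how a client-side tool might locate the master port is shown here. It assumes the environment variable takes precedence over the services map when both are present; the function name resolve_qmaster_port and the injectable lookup parameter are hypothetical and exist only to keep the example testable.

```python
import os
import socket

def resolve_qmaster_port(env=None, lookup=socket.getservbyname):
    """Return the TCP port of the master daemon.

    Assumption: SGE_QMASTER_PORT, when set, wins over the services map.
    """
    env = os.environ if env is None else env
    raw = env.get("SGE_QMASTER_PORT")
    if raw:
        return int(raw)
    # Fall back to the "sge_qmaster" entry in the services map.
    # socket.getservbyname raises OSError if no such entry exists.
    return lookup("sge_qmaster", "tcp")
```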
SGE_DELAY_TIME
This variable controls how long
sge_shadowd
pauses if a takeover bid fails. This value is used only when there are multiple
sge_shadowd
instances and they are contending to be the master.
The default is 600 seconds.
SGE_CHECK_INTERVAL
This variable controls the interval at which
sge_shadowd
checks the heartbeat file (60 seconds by default).
SGE_GET_ACTIVE_INTERVAL
This variable controls how long the heartbeat file must remain unchanged before a
sge_shadowd
instance tries to take over.
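To show how the three timing variables fit together, here is a hedged Python sketch that reads them from the environment and decides whether a takeover attempt is due. The defaults of 600 and 60 seconds come from the text above; no default for SGE_GET_ACTIVE_INTERVAL is stated on this page, so the sketch leaves it as None when unset. The function names and the decision rule are illustrative assumptions, not the daemon's actual implementation.

```python
import os

# Defaults in seconds, as documented above. SGE_GET_ACTIVE_INTERVAL has
# no documented default on this page, so it stays None when unset.
DEFAULTS = {
    "SGE_DELAY_TIME": 600,
    "SGE_CHECK_INTERVAL": 60,
    "SGE_GET_ACTIVE_INTERVAL": None,
}

def load_intervals(env=None):
    """Return the timing settings, falling back to DEFAULTS when unset."""
    env = os.environ if env is None else env
    settings = {}
    for name, default in DEFAULTS.items():
        raw = env.get(name)
        settings[name] = int(raw) if raw else default
    return settings

def takeover_due(unchanged_seconds, get_active_interval):
    """Hypothetical rule: a takeover attempt is due once the heartbeat
    file has been unchanged for at least get_active_interval seconds."""
    return unchanged_seconds >= get_active_interval
```

In a polling loop, a shadow host would sleep for SGE_CHECK_INTERVAL between reads of the heartbeat file, call takeover_due with the time elapsed since the last observed change, and sleep for SGE_DELAY_TIME after a failed bid.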