The RLS server
globus-rls-server
supports both a Local Replica Catalog (LRC) server, which manages
Logical FileName (LFN) to Physical FileName (PFN) mappings in a database,
and a Replica Location Index (RLI) server, which manages mappings of
LFNs to LRC servers.
globus-rls-server
may be configured as either an LRC or RLI server, or both. Both LRCs
and RLIs may be configured to send updates to other RLIs (using
globus-rls-admin(8)).
Clients wishing to locate one or more physical filenames associated with
a logical filename may first contact an RLI server, which will return
a list of LRCs that may know about the LFN. The LRC servers are then
contacted in turn to find the physical filenames. Note that RLI information
may be out of date, so clients should be prepared to get a negative response
when contacting an LRC (or no response at all if the LRC server is
unavailable).
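The two-step lookup described above can be sketched as follows. This is a minimal illustration in Python, not the RLS client API: locate, rli_index and lrc_catalogs are hypothetical in-memory stand-ins for queries to real RLI and LRC servers, and the URLs are invented for the example.

```python
# Hypothetical in-memory stand-ins; a real client would query RLS servers.
def locate(lfn, rli_index, lrc_catalogs):
    """Return the PFNs for lfn by asking the RLI, then each candidate LRC."""
    pfns = []
    for lrc_url in rli_index.get(lfn, []):
        catalog = lrc_catalogs.get(lrc_url)
        if catalog is None:
            # LRC server unavailable: no response at all, try the next one.
            continue
        # The RLI may be out of date, so an empty (negative) answer is normal.
        pfns.extend(catalog.get(lfn, []))
    return pfns


rli_index = {"lfn1": ["rls://lrc-a", "rls://lrc-b", "rls://lrc-c"]}
lrc_catalogs = {
    "rls://lrc-a": {"lfn1": ["gsiftp://host-a/data/f1"]},
    "rls://lrc-b": {},  # reachable, but no longer knows lfn1 (stale RLI entry)
    # rls://lrc-c is absent, which models an unavailable server
}
```

The sketch treats a negative response and an unreachable LRC the same way: both are skipped, and the lookup continues with the remaining candidates.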
globus-rls-server
uses syslog(3) to log errors and other information (facility LOG_DAEMON)
when it's running in normal (daemon) mode. If the -d option (debug)
is specified then log messages are written to stdout.
LRC to RLI Updates
Two methods exist for LRC or RLI servers to inform RLI servers of their LFNs.
By default the full list of LFNs is sent from the source to the RLI. This
can be time consuming if the number of LFNs is large, but it gives
the RLI an exact list of the LFNs known to the LRC, which allows
wildcard searching of the RLI. Alternatively, Bloom filters may be sent.
These are highly compressed summaries of the LFNs, but they do not allow
wildcard searching, and they generate more "false positives" when
querying an RLI. Bloom filters are described in more detail below.
The program
globus-rls-admin(8)
can be used to manage the list of RLIs that an LRC or RLI server sends
updates to, including partitioning LFNs among multiple RLI servers.
A softstate algorithm is used for updates: periodically the
source server sends its state (LFN information) to the RLI servers it updates.
The RLI servers add these LFNs to their index, or update a timestamp
if the LFNs were already known. RLI servers expire information about LFN,LRC
mappings if they haven't been updated for a period longer than the softstate
update interval.
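The receiving side of the softstate algorithm can be sketched as below. This is a simplified in-memory model, not the server's implementation (which keeps its index in a database); the class RliIndex and its method names are hypothetical.

```python
class RliIndex:
    """Toy model of an RLI index keyed by (LFN, LRC) with a timestamp."""

    def __init__(self, stale_after):
        self.stale_after = stale_after  # seconds, plays the role of rli_expire_stale
        self.entries = {}               # (lfn, lrc_url) -> time of last update

    def apply_update(self, lrc_url, lfns, now):
        # New mappings are added; known mappings just get a fresh timestamp.
        for lfn in lfns:
            self.entries[(lfn, lrc_url)] = now

    def expire(self, now):
        """Drop mappings not refreshed within stale_after; return the count."""
        stale = [key for key, stamp in self.entries.items()
                 if now - stamp > self.stale_after]
        for key in stale:
            del self.entries[key]
        return len(stale)
```

A mapping that stops appearing in updates (for example because the LFN was deleted from the LRC) is never refreshed, so it ages out on a later call to expire.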
Options that can be configured to control the softstate algorithm
when a source server updates an RLI include:
rli_expire_int (seconds)
How often an RLI server will check for stale entries in its database.
rli_expire_stale (seconds)
How old an entry must be in an RLI database before it's considered stale.
This value should be no smaller than
update_ll_int.
Note that if the LRC server is responding this value is not used; instead the
value of
update_ll_int
or
update_bf_int
is retrieved from the LRC server, multiplied by 1.2, and used as the
value for
rli_expire_stale.
update_bf_int (seconds)
Interval between RLI updates when using Bloom filters.
update_ll_int (seconds)
Interval between RLI updates when using LFN lists for softstate updates.
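The rule for choosing the staleness threshold can be written out as a small function. This is a sketch of the rule as described under rli_expire_stale; the function name and the use of None to mean "LRC not responding" are assumptions made for the example.

```python
def effective_expire_stale(configured_stale, lrc_update_int=None):
    """Return the staleness threshold (seconds) an RLI would apply.

    lrc_update_int is the update_ll_int or update_bf_int value retrieved
    from a responding LRC, or None if the LRC is not responding
    (hypothetical convention for this sketch).
    """
    if lrc_update_int is None:
        # LRC not responding: fall back to the configured rli_expire_stale.
        return configured_stale
    # LRC responding: use its update interval plus a 20% margin.
    return lrc_update_int * 1.2
```

With an LRC that sends Bloom filters every 900 seconds, entries would be considered stale after about 1080 seconds, regardless of the configured rli_expire_stale.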
Updates to an LRC (new LFNs or deleted LFNs) normally don't propagate
to RLI servers until the next softstate update (controlled
by
update_ll_int
and
update_bf_int).
However, by enabling "immediate update" mode an LRC will send updates to
an RLI within
update_buftime
seconds. Immediate updates are enabled by setting
update_immediate
to true. If updates are done with LFN lists then only the LFNs that
have been added to or deleted from the source server are sent; if Bloom filters
are used then the entire Bloom filter is sent.
When immediate updates are enabled
the interval between softstate updates is multiplied by
update_factor
as long as no updates have failed (source and RLI are considered to
be in sync). This can greatly reduce the number of softstate updates a
source needs to send to an RLI. Incremental updates are buffered by the source
server until either 100 updates have accumulated (when LFN lists are used), or
update_buftime
seconds have passed since the last update.
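The buffering and interval rules for immediate mode can be sketched as follows. This is an illustrative model only: UpdateBuffer, softstate_interval and the send callback are hypothetical names, and timestamps are passed in explicitly so the behaviour is easy to follow.

```python
class UpdateBuffer:
    """Buffers incremental LFN-list updates destined for one RLI."""

    MAX_PENDING = 100  # flush threshold stated in the text for LFN lists

    def __init__(self, update_buftime, send):
        self.update_buftime = update_buftime  # seconds
        self.send = send                      # callable taking a list of (op, lfn)
        self.pending = []
        self.last_flush = 0.0

    def record(self, op, lfn, now):
        """Queue an 'add' or 'delete' and flush if a limit has been reached."""
        self.pending.append((op, lfn))
        return self.maybe_flush(now)

    def maybe_flush(self, now):
        if not self.pending:
            return False
        full = len(self.pending) >= self.MAX_PENDING
        timed_out = now - self.last_flush >= self.update_buftime
        if not (full or timed_out):
            return False
        self.send(list(self.pending))
        self.pending.clear()
        self.last_flush = now
        return True


def softstate_interval(base_interval, update_factor, update_immediate, in_sync):
    """Interval between full softstate updates, stretched while in sync."""
    if update_immediate and in_sync:
        return base_interval * update_factor
    return base_interval
```

For example, with a base interval of 86400 seconds and an illustrative update_factor of 3, a source that is in sync with its RLI would send a full softstate update every 259200 seconds, while incremental changes still reach the RLI within update_buftime seconds.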
A Bloom filter is an array of bits. Each LFN is hashed multiple times
and the corresponding bits in the Bloom filter are set. Querying an RLI
to verify if an LFN exists is done by performing the same hashes, and
checking if the bits in the filter are on. If any bit is off then the LFN is known
not to exist; if they are all on then all that is known is that
the LFN probably exists. The size of the Bloom filter (as a multiple
of the number of LFNs) and the number of hash functions control the
false positive rate. The default values of 10 and 3 give a false positive
rate of approximately 1%. The advantage of Bloom filters is their
efficiency. For example, if the LRC has 1,000,000 LFNs in its database,
of average length 20 bytes, then 20,000,000 bytes must be sent to an RLI
during a softstate update (assuming no partitioning). The RLI server must
perform 1,000,000 updates to its database to create new LFN,LRC mappings,
or update timestamps on existing entries. With Bloom filters only 1,250,000
bytes are sent (10 x 1,000,000 bits / 8), and there are no database operations
on the RLI (Bloom filters are maintained entirely in memory). In one
comparison, a 1,000,000 LFN update took 20 minutes when sending all the
LFNs, and less than 1 second when using a Bloom filter. However, as noted before,
wildcard searches of an RLI are not supported with Bloom filters.
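A minimal Bloom filter along the lines described above is sketched here. It is not the server's implementation: the hash functions used by globus-rls-server are not specified in this page, so salted SHA-256 is substituted purely for illustration, and the class and function names are hypothetical.

```python
import hashlib
import math


class BloomFilter:
    def __init__(self, num_lfns, ratio=10, num_hash=3):
        # ratio and num_hash mirror lrc_bloomfilter_ratio and
        # lrc_bloomfilter_numhash; the filter holds ratio bits per LFN.
        self.num_bits = max(1, num_lfns * ratio)
        self.num_hash = num_hash
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, lfn):
        # Assumption: one salted SHA-256 digest per hash function.
        for i in range(self.num_hash):
            digest = hashlib.sha256(("%d:%s" % (i, lfn)).encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, lfn):
        for pos in self._positions(lfn):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, lfn):
        # False means definitely absent; True means probably present.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(lfn))


def false_positive_rate(ratio, num_hash):
    """Textbook estimate (1 - e^(-k/ratio))^k for a filter at design load."""
    return (1.0 - math.exp(-num_hash / ratio)) ** num_hash
```

A filter built for 1,000,000 LFNs with the default ratio occupies 1,250,000 bytes, matching the figure above. The textbook estimate for the default ratio and hash count comes out at roughly 1.7%, the same order of magnitude as the approximate rate quoted above.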
The options that control Bloom filter updates are:
rli_bloomfilter true|false
RLI servers must have this set to accept Bloom filter updates.
rli_bloomfilter_dir none|default|pathname
Bloom filters are saved in this directory, and read at start time, if it is
not "none". See CONFIGURATION for details.
lrc_bloomfilter_numhash N
Number of hash functions, an integer from 1 to 8. The default is 3.
lrc_bloomfilter_ratio N
Size of the Bloom filter as a multiple of the number of LFNs in the LRC
database. Too small a value will generate too many false positives; too
large a value wastes memory and network bandwidth.
Note that an LRC server can update some RLIs with Bloom filters, and others with
LFNs. However, an RLI server can only be updated using one method, and
an RLI acting as a source for updates can only send the type of updates
that it receives.
OPTIONS
-b maxbackoff
Maximum time, in seconds, that
globus-rls-server
will attempt to reopen the socket it listens on after an I/O error.
-C rlscertfile
Name of X.509 certificate file that identifies the server. Sets environment
variable X509_USER_CERT.
-c conffile
Name of configuration file for server. The default is
$GLOBUS_LOCATION/etc/globus-rls-server.conf
if the environment variable GLOBUS_LOCATION is set, else
/etc/globus-rls-server.conf.
-d
Enable debugging. Server will not detach from controlling terminal and
log messages will be written to stdout rather than syslog. For additional
logging verbosity set loglevel (see -L option) to higher values.
-e rli_expire_int
Interval (seconds) at which an RLI server should expire stale entries.
-F update_factor
If
update_immediate
mode is on, and the source server is in sync with an RLI server (an LRC and
RLI are synced if there have been no failed updates since the last full
softstate update), then the interval between RLI updates for this server
(
update_ll_int
) is multiplied by
update_factor.
-f maxfreethreads
Maximum number of idle threads server will leave running. Excess threads
are terminated.
-I true|false
Turns LRC to RLI immediate update mode on or off. Default is false.
-i idletimeout
Seconds after which idle client connections are timed out.
-K rlskeyfile
Name of X.509 key file. Sets environment variable X509_USER_KEY.
-L loglevel
Sets log level. By default this is 0, which means only errors will
be logged. Higher values mean more verbose logging. Level 1 causes logging
of major events (e.g. start of full softstate update), 2 includes medium level
events (e.g. writing pending updates to an RLI), 3 enables all tracing. Level
4 includes all the SQL commands executed by the server.
-l true|false
Configure whether server is an LRC server. Default is false.
-M maxconnections
Maximum number of active connections. Should be small enough to prevent
server from running out of open file descriptors. Default is 100.
-m maxthreads
Maximum number of threads server will start up to support simultaneous
requests.
-N
Disable authentication checking. Intended for debugging. Clients
should use the URL
RLSN://host
to disable authentication on the client side.
-o update_buftime
Softstate updates are buffered until either the buffer is full or this
much time has elapsed since the last update. Default is 30 seconds.
-p pidfiledir
Directory where pid file should be written.
-r true|false
Configure whether server is an RLI server. Default is false.
-S rli_expire_stale
Interval after which entries in the RLI database are considered stale
(presumably because they were deleted in the LRC). Stale entries are
not returned in queries.
-s startthreads
Number of threads to start up initially.
-t timeout
Timeout (in seconds) for calls to other RLS servers (e.g. for LRC calls to
send an update to an RLI). A value of 0 disables timeouts. The default
is 30 seconds.
-U myurl
URL for this server.
-u update_ll_int
Interval (in seconds) between LFN-list LRC to RLI updates.
-v
Show version and exit.
SIGNALS
The server will reread its configuration file if it receives a HUP signal.
It will wait for all current requests to complete and shut down cleanly if
sent an INT, QUIT or TERM signal.
CONFIGURATION
If the configuration file is not specified on the command line (see the -c
option) then it's looked for in
$GLOBUS_LOCATION/etc/globus-rls-server.conf,
or
/etc/globus-rls-server.conf
if GLOBUS_LOCATION is not set.
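The lookup order for the default configuration file can be expressed as a short function. This is a sketch of the rule stated above; the function name is hypothetical, and treating an empty GLOBUS_LOCATION as unset is an assumption.

```python
import os


def default_conf_path(environ=None):
    """Return the configuration file path used when -c is not given."""
    env = os.environ if environ is None else environ
    globus_location = env.get("GLOBUS_LOCATION")
    if globus_location:  # assumption: an empty value counts as not set
        return globus_location + "/etc/globus-rls-server.conf"
    return "/etc/globus-rls-server.conf"
```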
Most command line options may also be set in the configuration file;
however, command line options always override items found in the configuration
file. The configuration file is a sequence of lines consisting of a
keyword, whitespace, and a value. Comments begin with a # and end
with a newline.
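The file format and the override rule can be illustrated with a small parser. This is a sketch based only on the description above, not the server's parser; parse_conf, apply_overrides and the SAMPLE text are hypothetical, and the handling of repeated keywords (the last one wins) is an assumption that would not suit repeatable keywords such as acl.

```python
SAMPLE = """\
# sample configuration
lrc_server    true
rli_server    false   # trailing comment
update_ll_int 86400
"""


def parse_conf(text):
    """Return (keyword, value) pairs in file order."""
    pairs = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # a comment runs to end of line
        if not line:
            continue
        parts = line.split(None, 1)           # keyword, whitespace, value
        pairs.append((parts[0], parts[1] if len(parts) > 1 else ""))
    return pairs


def apply_overrides(pairs, cli_options):
    """Merge file settings with command line options; the latter win."""
    settings = dict(pairs)  # assumption: a repeated keyword keeps its last value
    settings.update(cli_options)
    return settings
```

For instance, passing {"update_ll_int": "3600"} as the command line options replaces the value of 86400 read from the sample text.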
acl user: permission [permission]
user
is a regular expression matching distinguished names (or local usernames if
a gridmap file is used) of users allowed to make calls to the server.
Permission is one or more of
lrc_read, lrc_update, rli_read, rli_update, admin, stats,
and
all.
There may be multiple
acl
entries; the first match found is used to determine a user's privileges.
The
admin
privilege is necessary to update an LRC's list of RLIs to send updates to.
The
stats
privilege allows a client to read performance statistics.
A gridmap file may also be used to map DNs to local usernames, which
in turn are matched against the regular expressions in the
acl
list to determine the user's permissions.
acl
entries may be a combination of DNs and local usernames. If a DN is
not found in the gridmap file then it is used to search the
acl
list.
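The matching order described for acl entries can be sketched as below. This is an illustration under stated assumptions: permissions_for is a hypothetical name, the DNs are invented, Python's re module stands in for whatever regular expression library the server uses, and an unanchored search is assumed because the page does not say whether patterns must match the whole name.

```python
import re


def permissions_for(dn, acl_entries, gridmap=None):
    """Return the permission set granted to the client identified by dn.

    acl_entries is a list of (pattern, permissions) in configuration order.
    gridmap maps DNs to local usernames and may be None.
    """
    # A DN found in the gridmap is replaced by its local username;
    # otherwise the DN itself is matched against the acl list.
    identity = (gridmap or {}).get(dn, dn)
    for pattern, perms in acl_entries:
        if re.search(pattern, identity):  # assumption: unanchored match
            return set(perms)             # first match wins
    return set()


acl = [
    (r"^alice$", {"lrc_read", "lrc_update"}),
    (r".*", {"lrc_read"}),
]
gridmap = {"/O=Grid/CN=Alice": "alice"}
```

Because the first match decides, more specific patterns need to come before catch-all patterns such as the second entry here.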
authentication true|false
Enable or disable GSI authentication. The default is true. If
authentication is enabled clients should use the URL scheme "rls:"
to connect to the server; if it is disabled, "rlsn:".
db_pwd password
Password to use to connect to MySQL server, default is
changethis.
db_user databaseuser
Username to use to connect to MySQL server, default is
dbperson.
idletimeout seconds
Seconds after which idle connections are closed, default is
900.
loglevel N
Sets loglevel to N (default is 0). Higher levels mean more verbosity.
lrc_bloomfilter_numhash N
Number of hash functions to use in Bloom filters. The default is 3.
Possible values are 1 to 8.
This value, in conjunction with
lrc_bloomfilter_ratio,
will determine the number of false positives that may be expected when
querying an RLI that is updated via Bloom filters. The default values of
3 and 10 give a false positive rate of approximately 1%.
lrc_bloomfilter_ratio N
Sets ratio of Bloom filter size (in bits) to number of LFNs in the LRC
catalog. Only meaningful if Bloom filters are used to update an RLI.
The default is 10.
lrc_dbname database
Name of LRC database, default is
lrcdb.
lrc_server true|false
True if LRC server, default is
false.
maxbackoff seconds
Max seconds to wait before retrying listen in the event of an I/O error,
default is
300.
maxfreethreads N
Maximum number of idle threads; excess threads are killed. Default is
5.
maxconnections N
Maximum number of simultaneous connections. Default is 100.
maxthreads N
Maximum number of threads running at one time, default is 30.
myurl URL
URL of server. Default is
rls://<hostname>:port
odbcini filename
Sets environment variable ODBCINI. If not specified, and ODBCINI is not
already set, then defaults to
$GLOBUS_LOCATION/var/odbc.ini.
pidfiledir directory
Directory where pid file should be written, default is /var/run.
port N
Port server listens on, default is
39281.
result_limit limit
Sets the maximum number of results returned by a query. If a query
request includes a limit greater than this value an error (GLOBUS_RLS_BADARG)
is returned. If the query request has no limit specified then at most
result_limit
records are returned by a query. A value of zero means no limit; this
is the default.
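The interaction between a client's requested limit and result_limit can be sketched as a function. The names are hypothetical: BadArg stands in for the GLOBUS_RLS_BADARG error, and None is used here to mean that no limit applies or was requested.

```python
class BadArg(Exception):
    """Stand-in for the GLOBUS_RLS_BADARG error returned by the server."""


def effective_limit(requested, result_limit=0):
    """Return the maximum number of records a query may return.

    requested is the limit in the query request, or None if the client
    specified none. A return value of None means unlimited.
    """
    if result_limit == 0:
        return requested          # no server-side cap (the default)
    if requested is None:
        return result_limit       # server cap applies silently
    if requested > result_limit:
        raise BadArg("requested limit exceeds result_limit")
    return requested
```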
rli_bloomfilter true|false
If true then only Bloom filter updates are accepted from source servers,
otherwise full LFN lists are accepted. Note if Bloom filters are enabled
then the RLI does not support wildcarded queries.
rli_bloomfilter_dir none|default|pathname
If an RLI is configured to accept Bloom filters (rli_bloomfilter true)
then Bloom filters may be saved to this directory after updates.
This directory is scanned when an RLI server starts up and is used to
initialize Bloom filters for each LRC that updated the RLI. This option
is useful when it is desired that the RLI recover its data immediately
after a restart rather than wait for LRCs to send another update. If
the LRCs are updating frequently this option is unnecessary, and may
be wasteful in that each Bloom filter is written to disk after each
update.
If
rli_bloomfilter_dir
is set to the string "none" then Bloom filters are not saved to disk;
this is the default. If "default" then the default directory is used,
which is $GLOBUS_LOCATION/var/rls-bloomfilters if GLOBUS_LOCATION
is set, else /tmp/rls-bloomfilters. Any other string is used as the
directory name unchanged. The Bloom filter files in this directory
have the name of the URL of the LRC that sent the Bloom filter, with
slashes (/) changed to percent signs (%), and ".bf" appended.
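The directory choice and file naming rule can be sketched as two small functions. The function names and the example LRC URL are hypothetical; the behaviour follows the description above.

```python
def bloomfilter_dir(setting, environ):
    """Resolve rli_bloomfilter_dir to a directory, or None if not saving."""
    if setting == "none":
        return None
    if setting == "default":
        globus_location = environ.get("GLOBUS_LOCATION")
        if globus_location:
            return globus_location + "/var/rls-bloomfilters"
        return "/tmp/rls-bloomfilters"
    return setting  # any other string is used unchanged


def bloomfilter_filename(lrc_url):
    """File name for the Bloom filter received from the LRC at lrc_url."""
    return lrc_url.replace("/", "%") + ".bf"
```

Under this rule a filter received from rls://lrc.example.org:39281 would be stored as rls:%%lrc.example.org:39281.bf.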
rli_dbname database
Name of RLI database, default is
rlidb.
rli_expire_int seconds
Interval between RLI expirations of stale entries. Default is
28800
seconds.
rli_expire_stale seconds
Interval after which entries in the RLI database are considered stale
(presumably because they were deleted in the LRC). Default is
86400
seconds. Stale RLI entries are not returned in queries.
rli_server true|false
True if RLI server, default is
false.
rlscertfile filename
Name of X.509 certificate file identifying the server. Sets
environment variable X509_USER_CERT.
rlskeyfile filename
Name of X.509 key file for the server. Sets environment variable
X509_USER_KEY.
startthreads N
Number of threads to start initially, default is
3.
timeout seconds
Timeout (in seconds) for calls to other RLS servers (e.g. for LRC calls to
send an update to an RLI).
update_bf_int seconds
Interval between RLI updates when the RLI is updated by Bloom filters.
The default is 900 seconds.
update_buftime N
RLI updates are buffered until either the buffer is full or this
much time has elapsed since the last update. Default is 30 seconds.
update_factor N
If
update_immediate
mode is on, and the source server is in sync with an RLI server (a source and
RLI are synced if there have been no failed updates since the last full
softstate update), then the interval between RLI updates for this server
(
update_ll_int
) is multiplied by
update_factor.
update_immediate true|false
Turn LRC to RLI immediate mode updates on or off. Default is false.
update_ll_int seconds
Seconds between LFN-list softstate updates, default is
86400
seconds.
update_retry seconds
Seconds to wait before a source server will retry to connect to an RLI server
that it needs to update. Default is 300.