The dggath program gathers distributed graphs into centralized
graphs. It reads a set of files igfile representing fragments of a
distributed source graph, and writes them back in the form of a
single centralized source graph ogfile.
The dgscat program scatters centralized source graphs into
distributed graphs. It reads a centralized source graph igfile and
writes it back in the form of a set of files ogfile representing
fragments of the corresponding distributed source graph.
The gscat program does exactly the same as dgscat, but does not
need to be run in a parallel environment. Since gscat processes
the input centralized graph file as a text stream, it does not need
to load the full graph in memory before building the distributed
graph fragment files. It therefore consumes far fewer resources,
but cannot check graph consistency, as it has no global view of
the graph structure.
When file names are not specified, data is read from standard input
and written to standard output. Standard streams can also be
explicitly represented by a dash '-'.
When the proper libraries have been included at compile time, dggath
and dgscat can directly handle compressed graphs, both as input and
output. A stream is treated as compressed whenever its name ends
with a compressed file extension, such as
in 'brol.grf.bz2' or '-.gz'. The compression formats that can be
supported are the bzip2 format ('.bz2'), the gzip format ('.gz'),
and the lzma format ('.lzma', on input only).
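The suffix rule above can be illustrated by a minimal Python sketch. It is not the code used by dggath or dgscat; the names SUFFIXES, compression_of and check_stream are hypothetical, and only the three extensions listed above are assumed.

```python
# Hypothetical table of the extensions listed in the text.
SUFFIXES = {".bz2": "bzip2", ".gz": "gzip", ".lzma": "lzma"}

def compression_of(name):
    """Return the format implied by the name's suffix, or None if there is none."""
    for suffix, fmt in SUFFIXES.items():
        if name.endswith(suffix):
            return fmt
    return None

def check_stream(name, writing):
    """Return the format of a stream, rejecting lzma for output streams."""
    fmt = compression_of(name)
    if fmt == "lzma" and writing:
        # The text states that lzma is supported on input only.
        raise ValueError("lzma is supported on input only: " + name)
    return fmt
```

With this sketch, both 'brol.grf.bz2' and the standard stream name '-.gz' are recognized as compressed, while 'brol.grf' is not.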
dggath and dgscat rely on an implementation of the MPI interface to
spread work across the processing elements. They are therefore
unlikely to be run directly, but rather through some launcher command
such as mpirun.
DISTRIBUTED FILE NAMES
In order to tell whether programs should read from, or write to, a
single file located on only one processor, or to multiple instances
of the same file on all of the processors, or else to distinct files
on each of the processors, a special grammar has been designed,
which is based on the '%' escape character. Four such escape
sequences are defined, which are interpreted independently on every
processor, prior to file opening. By default, when a filename is
provided, it is assumed that the file is to be opened on only one of
the processors, called the root processor, which is usually process
0 of the communicator within which the program is run. The index
of the root processor can be changed by means of the -r
option. Using any of the first three escape sequences below
instructs programs to open, in parallel on every processor on which
they are run, a file whose name is the interpreted filename.
%p
Replaced by the number of processes in the global communicator in
which the program is run. Leads to parallel opening.
%r
Replaced on each process running the program by the rank of this
process in the global communicator. Leads to parallel opening.
%-
Discarded, but leads to parallel opening. This sequence is mainly
used to instruct programs to open on every processor a file of
identical name. Depending on whether the given path leads to a
shared directory or to directories that are local to each
processor, this results either in the opening of multiple
instances of the same file, or in the opening of distinct files
which may each have different contents, respectively (but in
this latter case it is strongly recommended to identify files by
means of the '%r' sequence).
%%
Replaced by a single '%' character. File names using this escape
sequence are not considered for parallel opening, unless one or
several of the three other escape sequences are also present.
For instance, filename 'brol' will lead to the opening of file 'brol'
on the root processor only, filename '%-brol' (or even 'br%-ol') will
lead to the parallel opening of files called 'brol' on every
processor, and filename 'brol%p-%r' will lead to the opening of
files 'brol2-0' and 'brol2-1' on processes 0 and 1, respectively,
if the program is run on two processors.
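The expansion rules above can be summarized by a minimal Python sketch. It is an illustration only, not the routine used by the programs; the name expand_filename and its parameters are hypothetical, and leaving an unrecognized '%' sequence untouched is an assumption that the text does not specify.

```python
def expand_filename(name, proc_count, rank):
    """Expand '%' escapes; return (expanded_name, opened_in_parallel)."""
    out = []
    parallel = False
    i = 0
    while i < len(name):
        ch = name[i]
        if ch == "%" and i + 1 < len(name):
            nxt = name[i + 1]
            if nxt == "p":            # number of processes
                out.append(str(proc_count))
                parallel = True
            elif nxt == "r":          # rank of this process
                out.append(str(rank))
                parallel = True
            elif nxt == "-":          # discarded, only forces parallel opening
                parallel = True
            elif nxt == "%":          # literal '%', no parallel opening
                out.append("%")
            else:                     # assumption: keep unknown sequences as-is
                out.append("%" + nxt)
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out), parallel
```

Calling expand_filename('brol%p-%r', 2, 1) yields the name 'brol2-1' with the parallel flag set, whereas expand_filename('brol', 2, 1) leaves the name unchanged with the flag cleared, meaning that only the root process would open the file.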
OPTIONS
-c
For dggath and dgscat only. Check the consistency of the
input source graph after loading it into memory.
-h
Display some help.
-rpnum
Set the root process for centralized files to pnum (default is 0).
-V
Display program version and copyright.
EXAMPLE
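Run dgscat on 5 processing elements to scatter centralized graph
file brol.grf into 5 gzipped file fragments brol5-0.dgr.gz to
brol5-4.dgr.gz. Assuming a launcher such as mpirun that takes the
number of processes through an -np option, the command could read:
    mpirun -np 5 dgscat brol.grf brol%p-%r.dgr.gz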