Parallel::Sort

Section: C Library Functions (3) Updated: Thu Apr 7 2011

NAME

Parallel::Sort - parallel sorting across all processors, templated on the type of data to be sorted

SYNOPSIS


#include <parallel_sort.h>  

Public Member Functions


Sort (std::vector< KeyType > &d, const unsigned int n_procs=libMesh::n_processors(), const unsigned int proc_id=libMesh::processor_id())

void sort ()

const std::vector< KeyType > & bin ()
 

Private Member Functions


void binsort ()

void communicate_bins ()

void sort_local_bin ()

template<> void binsort ()

template<> void communicate_bins ()
 

Private Attributes


const unsigned int _n_procs

const unsigned int _proc_id

bool _bin_is_sorted

std::vector< KeyType > & _data

std::vector< unsigned int > _local_bin_sizes

std::vector< KeyType > _my_bin
 

Detailed Description

 

template<typename KeyType> class Parallel::Sort< KeyType >

The parallel sorting method is templated on the type of data which is to be sorted. It may later be templated on other things if we are ambitious. This class knows about MPI, and knows how many processors there are. It is responsible for transmitting data between the processors and ensuring that the data is properly sorted between all the processors. We assume that a Sort is instantiated on all processors.

Definition at line 46 of file parallel_sort.h.  
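
A minimal usage sketch (assuming a double key type and placeholder data; the default n_procs and proc_id arguments from the synopsis are used, and error handling is omitted):

#include <parallel_sort.h>

#include <vector>

void example_parallel_sort ()
{
  // Each processor holds an arbitrary, unsorted chunk of the global data
  // (the values here are placeholders).
  std::vector<double> local_data = {3.0, 1.0, 2.0};

  // Construction sorts local_data locally; sort() then performs the
  // parallel sort across all processors.
  Parallel::Sort<double> sorter(local_data);
  sorter.sort();

  // Afterwards this processor holds one globally-ordered bin of the data.
  const std::vector<double>& my_bin = sorter.bin();
  (void) my_bin; // e.g. print it to verify the sort
}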

Constructor & Destructor Documentation

 

template<typename KeyType > Parallel::Sort< KeyType >::Sort (std::vector< KeyType > &d, const unsigned int n_procs = libMesh::n_processors(), const unsigned int proc_id = libMesh::processor_id())

Constructor takes the number of processors, the processor id, and a reference to a vector of data to be sorted. This vector is sorted by the constructor; construction of a Sort object therefore takes O(n log n) time, where n is the length of the vector.

Definition at line 42 of file parallel_sort.C.

References Parallel::Sort< KeyType >::_data, Parallel::Sort< KeyType >::_local_bin_sizes, Parallel::Sort< KeyType >::_n_procs, and Parallel::Sort< KeyType >::sort().

Sort (std::vector<KeyType>& d,
      const unsigned int n_procs,
      const unsigned int proc_id) :
  _n_procs(n_procs),
  _proc_id(proc_id),
  _bin_is_sorted(false),
  _data(d)
{
  std::sort(_data.begin(), _data.end());

  // Allocate storage
  _local_bin_sizes.resize(_n_procs);
}
 

Member Function Documentation

 

template<typename KeyType > const std::vector< KeyType > & Parallel::Sort< KeyType >::bin ()

Return a constant reference to _my_bin. This allows us to do things like check if sorting was successful by printing _my_bin.

Definition at line 358 of file parallel_sort.C.

Referenced by MeshCommunication::assign_global_indices(), and MeshCommunication::find_global_indices().

{
  if (!_bin_is_sorted)
    {
      std::cout << "Warning! Bin is not yet sorted!" << std::endl;
    }

  return _my_bin;
}
 

template<typename KeyType > void Parallel::Sort< KeyType >::binsort () [private]

Sorts the local data into bins across all processors. Right now it constructs a BinSorter<KeyType> object. In the future this could be a template parameter.

Definition at line 91 of file parallel_sort.C.

References Parallel::BinSorter< KeyType >::binsort(), std::max(), and Parallel::BinSorter< KeyType >::sizeof_bin().

{  
  // Find the global min and max from all the
  // processors.
  std::vector<KeyType> global_min_max(2);

  // Insert the local min and max for this processor
  global_min_max[0] = -_data.front();
  global_min_max[1] =  _data.back();

  // Communicate to determine the global
  // min and max for all processors.
  Parallel::max(global_min_max);

  // Multiply the min by -1 to obtain the true min
  global_min_max[0] *= -1;

  // Bin-Sort based on the global min and max
  Parallel::BinSorter<KeyType> bs(_data);
  bs.binsort(_n_procs, global_min_max[1], global_min_max[0]);

  // Now save the local bin sizes in a vector so
  // we don't have to keep around the BinSorter.
  for (unsigned int i=0; i<_n_procs; ++i)
    _local_bin_sizes[i] = bs.sizeof_bin(i);
}
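
Negating the local minimum above lets a single element-wise max reduction recover both global extrema. A standalone illustration of the same trick, with plain ints and a simulated two-processor reduction (not libMesh code):

#include <algorithm>
#include <iostream>
#include <vector>

int main ()
{
  // Sorted local data on two hypothetical processors.
  std::vector<int> rank0 = {3, 7, 11};
  std::vector<int> rank1 = {1, 5, 9};

  // Each rank contributes { -local_min, local_max }.
  std::vector<int> contrib0 = {-rank0.front(), rank0.back()};
  std::vector<int> contrib1 = {-rank1.front(), rank1.back()};

  // Element-wise max reduction (the role played by Parallel::max() above).
  std::vector<int> reduced = {std::max(contrib0[0], contrib1[0]),
                              std::max(contrib0[1], contrib1[1])};

  const int global_min = -reduced[0]; // undo the negation
  const int global_max =  reduced[1];

  std::cout << "global min = " << global_min   // prints 1
            << ", global max = " << global_max // prints 11
            << std::endl;
  return 0;
}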
 

template<> void Parallel::Sort< Hilbert::HilbertIndices >::binsort () [private]

Definition at line 125 of file parallel_sort.C.

References Parallel::BinSorter< KeyType >::binsort(), libMesh::COMM_WORLD, and Parallel::BinSorter< KeyType >::sizeof_bin().

{
  // Find the global min and max from all the
  // processors.  Do this using MPI_Allreduce.
  Hilbert::HilbertIndices 
    local_min,  local_max,
    global_min, global_max;

  if (_data.empty())
    {
      local_min.rack0 = local_min.rack1 = local_min.rack2 = static_cast<Hilbert::inttype>(-1);
      local_max.rack0 = local_max.rack1 = local_max.rack2 = 0;
    }
  else
    {
      local_min = _data.front();
      local_max = _data.back();
    }
  
  MPI_Op hilbert_max, hilbert_min;
  MPI_Datatype hilbert_type;

  MPI_Type_contiguous (3, MPI_UNSIGNED, &hilbert_type);
  MPI_Type_commit     (&hilbert_type);
  MPI_Op_create       ((MPI_User_function*)__hilbert_max_op, true, &hilbert_max);
  MPI_Op_create       ((MPI_User_function*)__hilbert_min_op, true, &hilbert_min);

  // Communicate to determine the global
  // min and max for all processors.
  MPI_Allreduce(&local_min,
                &global_min,
                1,
                hilbert_type,
                hilbert_min,
                libMesh::COMM_WORLD);

  MPI_Allreduce(&local_max,
                &global_max,
                1,
                hilbert_type,
                hilbert_max,
                libMesh::COMM_WORLD);

  MPI_Type_free (&hilbert_type);
  MPI_Op_free   (&hilbert_max);
  MPI_Op_free   (&hilbert_min);

  // Bin-Sort based on the global min and max
  Parallel::BinSorter<Hilbert::HilbertIndices> bs(_data);
  bs.binsort(_n_procs, global_max, global_min);

  // Now save the local bin sizes in a vector so
  // we don't have to keep around the BinSorter.
  for (unsigned int i=0; i<_n_procs; ++i)
    _local_bin_sizes[i] = bs.sizeof_bin(i);
}
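
The MPI_Op_create calls above register user-defined element-wise reductions for the contiguous 3 x MPI_UNSIGNED Hilbert index type. The following is only a sketch of what such a reduction function could look like, assuming a hypothetical Index3 struct and a lexicographic comparison; it is not the library's __hilbert_max_op:

#include <mpi.h>

struct Index3 { unsigned int rack0, rack1, rack2; };

// Matches the MPI_User_function signature expected by MPI_Op_create.
void index3_max_op (void* invec, void* inoutvec, int* len, MPI_Datatype* /*dtype*/)
{
  Index3* in    = static_cast<Index3*>(invec);
  Index3* inout = static_cast<Index3*>(inoutvec);

  for (int i = 0; i < *len; ++i)
    {
      // Keep the lexicographically larger of the two indices.
      const bool in_greater =
        (in[i].rack0 != inout[i].rack0) ? (in[i].rack0 > inout[i].rack0) :
        (in[i].rack1 != inout[i].rack1) ? (in[i].rack1 > inout[i].rack1) :
                                          (in[i].rack2 > inout[i].rack2);
      if (in_greater)
        inout[i] = in[i];
    }
}

// Registration mirrors the pattern used above:
//   MPI_Op op;
//   MPI_Op_create ((MPI_User_function*)index3_max_op, /*commute=*/true, &op);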
 

template<typename KeyType > void Parallel::Sort< KeyType >::communicate_bins () [private]

Communicates the bins from each processor to the appropriate processor. By the time this function is finished, each processor will hold only its own bin(s).

Definition at line 186 of file parallel_sort.C.

References libMesh::COMM_WORLD.

{
#ifdef LIBMESH_HAVE_MPI
  // Create storage for the global bin sizes.  This
  // is the number of keys which will be held in
  // each bin over all processors.
  std::vector<unsigned int> global_bin_sizes = _local_bin_sizes;
  
  // Sum to find the total number of entries in each bin.
  Parallel::sum(global_bin_sizes);

  // Create a vector to temporarily hold the results of MPI_Gatherv
  // calls.  The vector dest  may be saved away to _my_bin depending on which
  // processor is being MPI_Gatherv'd.
  std::vector<KeyType> dest;

  unsigned int local_offset = 0;
  
  for (unsigned int i=0; i<_n_procs; ++i)
    {
      // Vector to receive the total bin size for each
      // processor.  Processor i's bin size will be
      // held in proc_bin_size[i]
      std::vector<unsigned int> proc_bin_size;

      // Find the number of contributions coming from each
      // processor for this bin.  Note: allgather combines
      // the MPI_Gather and MPI_Bcast operations into one.
      Parallel::allgather(_local_bin_sizes[i], proc_bin_size);

      // Compute the offsets into my_bin for each processor's
      // portion of the bin.  These are basically partial sums
      // of the proc_bin_size vector.
      std::vector<unsigned int> displacements(_n_procs);
      for (unsigned int j=1; j<_n_procs; ++j)
        displacements[j] = proc_bin_size[j-1] + displacements[j-1];

      // Resize the destination buffer
      dest.resize (global_bin_sizes[i]);
          
      MPI_Gatherv((_data.size() > local_offset) ?
                    &_data[local_offset] :
                    NULL,                        // Points to the beginning of the bin to be sent
                  _local_bin_sizes[i],           // How much data is in the bin being sent.
                  Parallel::datatype<KeyType>(), // The data type we are sorting
                  (dest.empty()) ?
                    NULL :
                    &dest[0],                    // Enough storage to hold all bin contributions
                  (int*) &proc_bin_size[0],      // How much is to be received from each processor
                  (int*) &displacements[0],      // Offsets into the receive buffer
                  Parallel::datatype<KeyType>(), // The data type we are sorting
                  i,                             // The root process (we do this once for each proc)
                  libMesh::COMM_WORLD);

      // Copy the destination buffer if it
      // corresponds to the bin for this processor
      if (i == _proc_id)
        _my_bin = dest;
          
      // Increment the local offset counter
      local_offset += _local_bin_sizes[i];
    } 
#endif // LIBMESH_HAVE_MPI
}
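
The displacements computed inside the loop above form an exclusive prefix sum of proc_bin_size. A standalone numeric check, using hypothetical contributions of 4, 2 and 5 keys from three processors (not library code):

#include <cassert>
#include <vector>

int main ()
{
  // Hypothetical per-processor contributions to one bin.
  const std::vector<unsigned int> proc_bin_size = {4, 2, 5};

  // Exclusive prefix sum, as in communicate_bins() above.
  std::vector<unsigned int> displacements(proc_bin_size.size(), 0);
  for (std::size_t j = 1; j < proc_bin_size.size(); ++j)
    displacements[j] = proc_bin_size[j-1] + displacements[j-1];

  // Offsets into the receive buffer are {0, 4, 6}; it holds 11 keys total.
  assert(displacements[0] == 0);
  assert(displacements[1] == 4);
  assert(displacements[2] == 6);
  return 0;
}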
 

template<> void Parallel::Sort< Hilbert::HilbertIndices >::communicate_bins () [private]

Definition at line 258 of file parallel_sort.C.

References libMesh::COMM_WORLD.

{
  // Create storage for the global bin sizes.  This
  // is the number of keys which will be held in
  // each bin over all processors.
  std::vector<unsigned int> global_bin_sizes(_n_procs);
  
  libmesh_assert (_local_bin_sizes.size() == global_bin_sizes.size());

  // Sum to find the total number of entries in each bin.
  // This is stored in global_bin_sizes.  Note, we
  // explicitly know that we are communicating MPI_UNSIGNED's here.
  MPI_Allreduce(&_local_bin_sizes[0],
                &global_bin_sizes[0],
                _n_procs,
                MPI_UNSIGNED,
                MPI_SUM,
                libMesh::COMM_WORLD);

  MPI_Datatype hilbert_type;

  MPI_Type_contiguous (3, MPI_UNSIGNED, &hilbert_type);
  MPI_Type_commit     (&hilbert_type);

  // Create a vector to temporarily hold the results of MPI_Gatherv
  // calls.  The vector dest  may be saved away to _my_bin depending on which
  // processor is being MPI_Gatherv'd.
  std::vector<Hilbert::HilbertIndices> sendbuf, dest;

  unsigned int local_offset = 0;
  
  for (unsigned int i=0; i<_n_procs; ++i)
    {
      // Vector to receive the total bin size for each
      // processor.  Processor i's bin size will be
      // held in proc_bin_size[i]
      std::vector<unsigned int> proc_bin_size(_n_procs);

      // Find the number of contributions coming from each
      // processor for this bin.  Note: Allgather combines
      // the MPI_Gather and MPI_Bcast operations into one.
      // Note: Here again we know that we are communicating
      // MPI_UNSIGNED's so there is no need to check the MPI_traits.
      MPI_Allgather(&_local_bin_sizes[i], // Source: # of entries on this proc in bin i
                    1,                    // Number of items to gather                 
                    MPI_UNSIGNED,           
                    &proc_bin_size[0],    // Destination: Total # of entries in bin i
                    1,
                    MPI_UNSIGNED,
                    libMesh::COMM_WORLD);
      
      // Compute the offsets into my_bin for each processor's
      // portion of the bin.  These are basically partial sums
      // of the proc_bin_size vector.
      std::vector<unsigned int> displacements(_n_procs);
      for (unsigned int j=1; j<_n_procs; ++j)
        displacements[j] = proc_bin_size[j-1] + displacements[j-1];

      // Resize the destination buffer
      dest.resize (global_bin_sizes[i]);
      
      MPI_Gatherv((_data.size() > local_offset) ?
                    &_data[local_offset] :
                    NULL,                   // Points to the beginning of the bin to be sent
                  _local_bin_sizes[i],      // How much data is in the bin being sent.
                  hilbert_type,             // The data type we are sorting
                  (dest.empty()) ?
                    NULL :
                    &dest[0],               // Enough storage to hold all bin contributions
                  (int*) &proc_bin_size[0], // How much is to be received from each processor
                  (int*) &displacements[0], // Offsets into the receive buffer
                  hilbert_type,             // The data type we are sorting
                  i,                        // The root process (we do this once for each proc)
                  libMesh::COMM_WORLD);

      // Copy the destination buffer if it
      // corresponds to the bin for this processor
      if (i == _proc_id)
        _my_bin = dest;

      // Increment the local offset counter
      local_offset += _local_bin_sizes[i];
    }

  MPI_Type_free (&hilbert_type);  
}
 

template<typename KeyType > void Parallel::Sort< KeyType >::sort ()

This is the only method which needs to be called by the user. Its only responsibility is to call three private methods in the correct order.

Definition at line 59 of file parallel_sort.C.

Referenced by MeshCommunication::assign_global_indices(), MeshCommunication::find_global_indices(), and Parallel::Sort< KeyType >::Sort().

{
  // Find the global data size.  The sorting
  // algorithms assume they have a range to
  // work with, so catch the degenerate cases here
  unsigned int global_data_size = _data.size();

  Parallel::sum (global_data_size);

  if (global_data_size < 2)
    {
      // the entire global range is either empty
      // or contains only one element
      _my_bin = _data;

      Parallel::allgather (static_cast<unsigned int>(_my_bin.size()),
                           _local_bin_sizes);
    }
  else
    {
      this->binsort();
      this->communicate_bins();
      this->sort_local_bin();
    }
  
  // Set sorted flag to true
  _bin_is_sorted = true;
}
 

template<typename KeyType > void Parallel::Sort< KeyType >::sort_local_bin () [private]

After all the bins have been communicated, we can sort our local bin. This is nothing more than a call to std::sort.

Definition at line 350 of file parallel_sort.C.

{
  std::sort(_my_bin.begin(), _my_bin.end());
}
 

Member Data Documentation

 

template<typename KeyType> bool Parallel::Sort< KeyType >::_bin_is_sorted [private]

Flag which lets you know if sorting is complete.

Definition at line 92 of file parallel_sort.h.

template<typename KeyType> std::vector<KeyType>& Parallel::Sort< KeyType >::_data [private]

The raw, unsorted data which will need to be sorted (in parallel) across all processors.

Definition at line 99 of file parallel_sort.h.

Referenced by Parallel::Sort< KeyType >::Sort().

template<typename KeyType> std::vector<unsigned int> Parallel::Sort< KeyType >::_local_bin_sizes [private]

Vector which holds the size of each bin on this processor. It has size equal to _n_procs.

Definition at line 106 of file parallel_sort.h.

Referenced by Parallel::Sort< KeyType >::Sort().

template<typename KeyType> std::vector<KeyType> Parallel::Sort< KeyType >::_my_bin [private]

The bin which will eventually be held by this processor. It may be shorter or longer than _data. It will be dynamically resized when it is needed.

Definition at line 114 of file parallel_sort.h.

template<typename KeyType> const unsigned int Parallel::Sort< KeyType >::_n_procs [private]

The number of processors to work with.

Definition at line 82 of file parallel_sort.h.

Referenced by Parallel::Sort< KeyType >::Sort().

template<typename KeyType> const unsigned int Parallel::Sort< KeyType >::_proc_id [private]

The identity of this processor.

Definition at line 87 of file parallel_sort.h.

 

Author

Generated automatically by Doxygen for libMesh from the source code.


 
