MRS  1.0
A C++ Class Library for Statistical Set Processing
subpavings::AdaptiveHistogramCollator Class Reference

A wrapper or manager for a CollatorSPnode.


Public Member Functions

 AdaptiveHistogramCollator ()
 Default constructor.
 AdaptiveHistogramCollator (const AdaptiveHistogram &adh)
 Initialised constructor. Initialised with an AdaptiveHistogram object.
 AdaptiveHistogramCollator (const AdaptiveHistogramCollator &other)
 Copy constructor.
AdaptiveHistogramCollator & operator= (AdaptiveHistogramCollator rhs)
 Copy assignment operator.
 ~AdaptiveHistogramCollator ()
 Destructor.
CollatorSPnode * getSubPaving () const
 Return a pointer to the CollatorSPnode this manages.
bool isEmptyCollation () const
 Return true if there is nothing in the collator this manages.
cxsc::ivector getRootBox () const
 Get the box of the subpaving managed by this.
int getDimensions () const
 Get the dimensions of the subpaving this manages.
AdaptiveHistogramCollator & operator+= (const AdaptiveHistogramCollator &rhs)
const AdaptiveHistogramCollator operator+ (const AdaptiveHistogramCollator &rhs) const
AdaptiveHistogramCollator & operator+= (const AdaptiveHistogram &rhs)
const AdaptiveHistogramCollator operator+ (const AdaptiveHistogram &rhs) const
const AdaptiveHistogramCollator makeAverage () const
const AdaptiveHistogramCollator makeNormalised () const
const AdaptiveHistogramCollator makeMarginal (const std::vector< int > &reqDims) const
 Make a marginalised version of this histogram collator.
double findCoverage (const rvector &pt) const
 Find the coverage value for a data point.
std::vector< const CollatorSPnode * > & findDensityRegion (std::vector< const subpavings::CollatorSPnode * > &covNodes, double cov) const
double findEmpiricalDensity (const rvector &pt) const
 Find the empirical density for a data point.
bool histCollatorBoxContains (const rvector &pt) const
 Get whether the root box for this contains a given point.
bool splitToShape (std::string instruction)
 Split subpaving managed by this to a specified shape.
void addToCollation (const AdaptiveHistogram &adh)
 Add an AdaptiveHistogram object to the collation.
void addToCollation (const std::vector< AdaptiveHistogram > &samples)
 Add a collection of AdaptiveHistogram objects to the collation.
size_t getNumberCollated () const
 Get the number of Adaptive Histogram objects collated.
bool outputGraphDot () const
 Make a .dot graph file from collated histogram structure.
std::ostream & outputToStreamTabs (std::ostream &os, int prec=5) const
 Output the subpaving managed by this to a given stream.
void exportCollator (const std::string &s, int prec=5) const
 Export a description of the collation in a format that can be read in again to remake it.
void outputLog (const std::string &s, const int i, int prec=5) const
 Add current state of collation to a log file.
void swap (AdaptiveHistogramCollator &adh)
Get a collection of the L1 distance values for each element of the collation against an AdaptiveHistogram.

The L1 distance for an element of this against a statistical subpaving is defined over all the leaves of a non-minimal union of the subpaving managed by this and the given statistical subpaving. The L1 distance for an element in this is the sum, over these leaves, of the volume of the leaf multiplied by the absolute difference between the 'height' value for that element for the leaf and the 'height' value of the statistical subpaving for the leaf (ie counter/volume normalised by total count in the whole paving).

Throws a NullSubpavingPointer_Error if the pointer to the subpaving managed by this is NULL or if the pointer to the subpaving managed by adh is NULL.

Parameters:
adh: a pointer to a statistical subpaving to calculate distances against.
container: a reference to the container to use to store the L1 distance values. Any contents of the given container will be discarded before new values are added.
Returns:
An ordered collection of the L1 distance values for each element of the collation against adh, in the same order as the elements were added to this collation.
Precondition:
the pointers to the subpavings managed by this and by adh should be non-NULL.
RealVec getL1DistancesToAverage () const
RealVec & getL1DistancesToAverage (RealVec &container) const
RealVec getL1DistancesToAverage (const AdaptiveHistogramCollator &other) const
RealVec & getL1DistancesToAverage (RealVec &container, const AdaptiveHistogramCollator &other) const
RealVec getL1Distances (const AdaptiveHistogram &adh) const
RealVec & getL1Distances (RealVec &container, const AdaptiveHistogram &adh) const
void outputAverageToTxtTabs (const std::string &s, int prec=5) const
 Output average normalised histogram over collation to a txt file.
void outputAverageToTxtTabs (const std::string &s, int prec, bool confirm) const
void outputToTxtTabs (const std::string &s, int prec=5) const
 Output the collated information to a txt file.
void outputToTxtTabs (const std::string &s, int prec, bool confirm) const

Static Public Member Functions

static AdaptiveHistogramCollator importCollator (const std::string &s)
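
The L1 distance described for getL1Distances in the member list above can be illustrated with a small standalone sketch. It does not use the library: UnionLeaf and l1Distances are hypothetical names introduced here, and the leaves are assumed to already be the leaves of the common (union) subpaving.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for one leaf of the union subpaving (not a library type).
struct UnionLeaf {
    double volume;                        // volume of the leaf box
    std::vector<double> collatedHeights;  // one height per element of the collation
    double referenceHeight;               // height of the other histogram on this leaf
};

// For each collated element: sum over leaves of |height - reference| * volume.
std::vector<double> l1Distances(const std::vector<UnionLeaf>& leaves) {
    std::vector<double> distances;
    bool first = true;
    for (const UnionLeaf& leaf : leaves) {
        if (first) {
            distances.assign(leaf.collatedHeights.size(), 0.0);
            first = false;
        }
        if (leaf.collatedHeights.size() != distances.size()) {
            throw std::invalid_argument("l1Distances: inconsistent summary sizes");
        }
        for (std::size_t i = 0; i < distances.size(); ++i) {
            distances[i] +=
                std::fabs(leaf.collatedHeights[i] - leaf.referenceHeight) * leaf.volume;
        }
    }
    return distances;
}
```

The result has one entry per collated element, in the order the heights appear in each leaf's container.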

Detailed Description

A wrapper or manager for a CollatorSPnode.

AdaptiveHistogramCollator class objects manage CollatorSPnode objects for the purpose of collating information from a number of AdaptiveHistogram objects.

The AdaptiveHistogramCollator's CollatorSPnode tree represents the subpaving that is the union of all the subpavings associated with each AdaptiveHistogram in the collation. Each node in the tree has a data member which is a container structure holding one value for each collated histogram. For a collation of histograms the container holds the normalised height, for each collated histogram, of the histogram bin represented by the box of that node.

(The normalised height associated with a bin which is represented by the box of a leaf node of a tree managed by an AdaptiveHistogram object is the number of data points associated with that bin divided by (the total number of data points in the histogram x the volume of the bin). Thus the areas (heights x volumes) of the bins sum to 1.)

Since the tree represents the union of the subpavings associated with each AdaptiveHistogram in the collation, the collation tree will have at least as many and usually more bins than any of the collated histograms.

Each collated histogram will be represented in the summaries in the order in which it was added to the collation. Eg, the heights of the bins of the first histogram to be collated will be first (index [1]) in the summary container.

If the collated AdaptiveHistograms have been properly formed and added to the collation, then for each collated histogram the sum, over all the leaf nodes of the collation, of the volume of the box of the leaf node multiplied by the value in the leaf node summary corresponding to that histogram will be 1.


Constructor & Destructor Documentation

AdaptiveHistogramCollator::AdaptiveHistogramCollator ( const AdaptiveHistogram &  adh)

Initialised constructor. Initialised with an AdaptiveHistogram object.

Throws a NullSubpavingPointer_Error if the pointer to the statistical subpaving managed by adh is NULL.

  : rootCollator(NULL)
{
  try {
    if (!adh.hasSubPaving()) {
      throw NullSubpavingPointer_Error(
        "AdaptiveHistogramCollator::AdaptiveHistogramCollator(const AdaptiveHistogram&)");
    }

    rootCollator = new CollatorSPnode(*adh.getSubPaving());
  }
  catch (exception const& e) {
    constructor_error_handler();
  }
}

AdaptiveHistogramCollator::AdaptiveHistogramCollator ( const AdaptiveHistogramCollator &  other)

Copy constructor.

Throws a NullSubpavingPointer_Error if the pointer to the collator managed by other is NULL.

  : rootCollator(NULL)
{
  try {
    if (other.getSubPaving() == NULL) {
      // this should not be possible with current constructors
      throw NullSubpavingPointer_Error(
        "AdaptiveHistogramCollator::AdaptiveHistogramCollator(const AdaptiveHistogramCollator&)");
    }

    rootCollator = new CollatorSPnode(*(other.getSubPaving()));
  }
  catch (exception const& e) {
    constructor_error_handler();
  }
}

Member Function Documentation

void AdaptiveHistogramCollator::addToCollation ( const AdaptiveHistogram &  adh)

Add an AdaptiveHistogram object to the collation.

Attempts to add an AdaptiveHistogram object into the collation of AdaptiveHistogram information.

Warning:
If an exception is thrown during the addition process, this may be left in an incoherent state. Users can make a 'backup copy' of a collator before adding to the collation if they want to be able to return to the state before the failed addition process.
Parameters:
adh: the AdaptiveHistogram to be included in the collation.
Precondition:
if !isEmptyCollation(), adh must have the same dimensions as this.
Postcondition:
This will include summary data from adh.
{
    
  (*this) += adh;
  
}
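
The 'backup copy' advice in the warning can be sketched as a small generic helper. addWithRollback is a hypothetical name introduced here; it relies only on a copy constructor, addToCollation and swap, which appear in the member list for this class. ToyCollator is an invented stand-in used to exercise the helper without the library, and whether the real swap is suitable for this purpose is an assumption.

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical helper: take a backup, attempt the addition, and restore the
// backup if the addition throws. The exception is rethrown to the caller.
template <typename Collator, typename Hist>
void addWithRollback(Collator& collator, const Hist& hist) {
    Collator backup(collator);      // 'backup copy' made before the risky call
    try {
        collator.addToCollation(hist);
    }
    catch (...) {
        collator.swap(backup);      // return to the state before the failed addition
        throw;
    }
}

// Invented stand-in type: it changes its state and only then fails,
// mimicking an addition that leaves the object in an incoherent state.
struct ToyCollator {
    std::vector<int> collated;

    void addToCollation(int value) {
        collated.push_back(value);
        if (value < 0) throw std::invalid_argument("ToyCollator: negative value");
    }

    void swap(ToyCollator& other) { collated.swap(other.collated); }
};
```

The cost of this pattern is one full copy of the collator per addition, which may matter for large collations.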
void AdaptiveHistogramCollator::addToCollation ( const std::vector< AdaptiveHistogram > &  samples)

Add a collection of AdaptiveHistogram objects to the collation.

Warning:
If an exception is thrown during the addition process, this may be left in an incoherent state. Users can make a 'backup copy' of a collator before adding to the collation if they want to be able to return to the state before the failed addition process.
Parameters:
samples: the collection of AdaptiveHistogram objects to be included in the collation.
Precondition:
everything in samples must have the same dimensions as each other. If !isEmptyCollation(), everything in samples must have the same dimensions as this.
Postcondition:
This will include summary data from the elements of samples.
{
  for (std::vector < AdaptiveHistogram >::const_iterator it = samples.begin();
        it < samples.end();
        ++it) {
    (*this) += (*it);
  }
}
void AdaptiveHistogramCollator::exportCollator ( const std::string &  s,
int  prec = 5 
) const

Export a description of the collation in a format that can be read in again to remake it.

If isEmptyCollation() then the file s created will be empty.

Parameters:
s: the name of the file to export to.
prec: the precision for output formatting, ie the number of decimal places.
{
  ofstream os(s.c_str());
  if (os.is_open()) {
  
    if (!isEmptyCollation() ) {
      vector <RealVec> ranges;
      
      ranges = getSubPaving()->getAllRangeCollections(ranges);
      
      string leafNodeLevels = getSubPaving()->getLeafNodeLevelsString();
      
      ivector box = getRootBox();
    
      os << leafNodeLevels << endl;
      
      os << cxsc::SetPrecision(prec+2, prec);
      
      os << box << endl;
      
      for (vector <RealVec>::const_iterator it = ranges.begin();
          it < ranges.end();
          ++it) {
        os << (*it) << endl;
      }
    }
    os.close();
    
  }
  else {
      std::cerr << "Error: could not open file named "
        << s << std::endl << std::endl;
  }
}
double AdaptiveHistogramCollator::findCoverage ( const rvector &  pt) const

Find the coverage value for a data point.

The coverage value is 1 - (sum of density of all the boxes with heights > the height of the box where the data point is).

Height of a box is taken as the total summary value and density is taken as normalised product of height and volume of box.

If the point is not in the histogram at all, coverage = 0. If the point is in the lowest box in the histogram, coverage = count lowest box / total count. If the point is in the highest box of the histogram, coverage = 1.

Warning:
Coverage only makes sense for collations of positive heights (i.e. proper histograms). The results are unpredictable if collations have somehow got negative heights.

Throws an UnfulfillableRequest_Error if this has nothing collated.

Throws an IncompatibleDimensions_Error if the dimensions of this and pt are not equal.

Parameters:
pt: the point to find coverage for.
Returns:
coverage for the point given.
Precondition:
!isEmptyCollation() and dimensions of pt and this must match.
{
  if (isEmptyCollation()) {
    throw UnfulfillableRequest_Error("AdaptiveHistogramCollator::findCoverage(const rvector&)");
  }
  if (getDimensions() != (Ub(pt) - Lb(pt) + 1)) {
    throw IncompatibleDimensions_Error("AdaptiveHistogramCollator::findCoverage(const rvector&)");
  }
  
  return _coverage(pt);
}
double AdaptiveHistogramCollator::findEmpiricalDensity ( const rvector &  pt) const

Find the empirical density for a data point.

The empirical density is the relative density of the histogram at the box the given data point is in.

If the point is not in the histogram at all, empirical density = 0. If the point is in some leaf box in the histogram with 'normalised' height d (d = total summary for box / sum over leaves of (total summary x box volume)), then d is the empirical density.

Warning:
Empirical density only makes sense for collations of positive heights (i.e. proper histograms). The results are unpredictable if collations have somehow got negative heights.

Throws an UnfulfillableRequest_Error if this has nothing collated.

Throws an IncompatibleDimensions_Error if the dimensions of this and pt are not equal.

Parameters:
pt: the point to find empirical density for.
Returns:
the empirical density at the point.
Precondition:
!isEmptyCollation() and the dimensions of the point and the root paving match.
{
  if (isEmptyCollation()) {
    throw UnfulfillableRequest_Error(
      "AdaptiveHistogramCollator::findEmpiricalDensity(const rvector&)");
  }
  if (getDimensions() != (Ub(pt) - Lb(pt) + 1)) {
    throw IncompatibleDimensions_Error(
      "AdaptiveHistogramCollator::findEmpiricalDensity(const rvector&)");
  }
  
  return _empiricalDensity(pt);
}

int AdaptiveHistogramCollator::getDimensions ( ) const

Get the dimensions of the subpaving this manages.

Returns:
0 if this does not have a subpaving with a box, else returns the dimensions of the subpaving.
{
  int retValue = 0;
  if ((getSubPaving() != NULL) && (!getSubPaving()->isEmpty())) {
    
    retValue = getSubPaving()->getDimension();
  }
  return retValue;
}
cxsc::ivector AdaptiveHistogramCollator::getRootBox ( ) const

Get the box of the subpaving managed by this.

Note:
with the present constructors, it is possible for this to have a subpaving but for the subpaving to have no box.
Returns:
copy of the box of the subpaving managed by this.
Precondition:
!getSubPaving()->isEmpty().
{
  if (getSubPaving() == NULL) {
    throw NullSubpavingPointer_Error(
            "AdaptiveHistogramCollator::getRootBox()");
  }
  if (getSubPaving()->isEmpty()) {
    throw NoBox_Error(
            "AdaptiveHistogramCollator::getRootBox()");
  }
  return getSubPaving()->getBox();
}
bool AdaptiveHistogramCollator::histCollatorBoxContains ( const rvector &  pt) const

Get whether the root box for this contains a given point.

Does the support of the collator include the point pt?

Throws an UnfulfillableRequest_Error if this has nothing collated.

Throws an IncompatibleDimensions_Error if the given point does not have the same dimensions as the subpaving that this manages.

Parameters:
pt: the point to check.
Returns:
true if the root box of the collator subpaving this manages contains pt, false otherwise.
Precondition:
!isEmptyCollation() and the dimensions of the point and the root paving match.
{
  if (isEmptyCollation()) {
    throw UnfulfillableRequest_Error(
        "AdaptiveHistogramCollator::histCollatorBoxContains(const rvector&)");
  }
  if (getDimensions() != (Ub(pt) - Lb(pt) + 1)) {
    throw IncompatibleDimensions_Error(
      "AdaptiveHistogramCollator::histCollatorBoxContains(const rvector&)");
  }
  
  return ( getSubPaving()->findContainingNode(pt) != NULL ); 
}

bool AdaptiveHistogramCollator::isEmptyCollation ( ) const

Return true if there is nothing in the collator this manages.

Returns:
True if this' rootCollator has collated no histograms, false otherwise.

const AdaptiveHistogramCollator AdaptiveHistogramCollator::makeAverage ( ) const

An AdaptiveHistogramCollator which is the average over this collation.

Makes and returns an AdaptiveHistogramCollator which is the average over the collation of histograms represented by this. The tree managed by the average has structure exactly the same as the tree managed by this and one value in the summary for each node where that value is the average of the summary of the corresponding node in this.

Throws an UnfulfillableRequest_Error if this has nothing collated.

Returns:
An AdaptiveHistogramCollator which is the average of this collation.
Precondition:
!isEmptyCollation().
{
  
  if (isEmptyCollation()) {
    throw UnfulfillableRequest_Error("AdaptiveHistogramCollator::makeAverage()");
  }

  AdaptiveHistogramCollator temp(*this);

  temp._average();

  return temp;
}
const AdaptiveHistogramCollator AdaptiveHistogramCollator::makeMarginal ( const std::vector< int > &  reqDims) const

Make a marginalised version of this histogram collator.

Marginalises to take out the given dimensions and adjust summaries so that for each element in the collation, sum of (node vol x node value) is the same as before marginalisation, and hence that the overall sum of (node vol x accumulated values for node) is the same as before marginalisation.

Note:
allowed dimensions start at 1, ie dimensions to marginalise on can include 1, 2, ... dimensions of this.

Throws an UnfulfillableRequest_Error if this has nothing collated.

Throws an std::invalid_argument if the required dimensions reqDims is empty or contains dimensions outside the range of the dimensions of this.

Parameters:
reqDims: a vector of the dimensions to include in the marginal.
Returns:
An AdaptiveHistogramCollator managing a subpaving which is the marginalised version of the subpaving managed by this.
Precondition:
reqDims must be compatible with current dimensions and !isEmptyCollation().
Postcondition:
returned histogram will have the same size of collation as before and have sum of (node vol x accumulated summaries) equal to that for this.
{
  if (isEmptyCollation()) {
    throw UnfulfillableRequest_Error(
      "AdaptiveHistogramCollator::makeMarginal(const std::vector<int>&)");
  }
  
  AdaptiveHistogramCollator temp(*this);
  
  temp._marginalise(reqDims);
  return temp;
}

const AdaptiveHistogramCollator AdaptiveHistogramCollator::makeNormalised ( ) const

Make a normalised version of this collation.

Normalises this collation so that the sum, over all the leaf nodes, of the product of the volume of the box represented by the leaf node and the total summary value of the leaf node (ie, height) is 1.

All summaries in the returned collation contain only one summary value.

Throws an UnfulfillableRequest_Error if this has nothing collated.

Throws an std::logic_error if the subpaving managed by this has no 'area', ie getTotalAbsValueTimesVol() == 0.

Returns:
The normalised version of this.
Precondition:
!isEmptyCollation() and getTotalAbsValueTimesVol() != 0.
{
  if (isEmptyCollation()) {
    throw UnfulfillableRequest_Error(
      "AdaptiveHistogramCollator::makeNormalised()");
  }
  
  AdaptiveHistogramCollator temp(*this);
  
  temp._normalise();
  return temp;
}
const AdaptiveHistogramCollator AdaptiveHistogramCollator::operator+ ( const AdaptiveHistogramCollator &  rhs) const

Addition operator.

Addition gives a histogram collator managing a tree which represents a subpaving which is the union of the subpavings represented by the operand collators. The summary for each node in the tree contains all the values from the summaries of the corresponding nodes in the trees managed by the operand AdaptiveHistogramCollators.

{
    AdaptiveHistogramCollator temp(*this);
      
  temp += rhs;
  
  return temp;

}
const AdaptiveHistogramCollator AdaptiveHistogramCollator::operator+ ( const AdaptiveHistogram &  rhs) const

Addition operator.

Addition gives a histogram collator managing a tree which represents a subpaving which is the union of the subpavings represented by this collator and the collator representation of rhs. The summary for each node in the tree contains all the values from the summaries of the corresponding nodes in the trees for this collator and the collator representation of rhs.

{
  AdaptiveHistogramCollator temp(*this);
      
  temp += rhs;
  
  return temp;
}
AdaptiveHistogramCollator & AdaptiveHistogramCollator::operator+= ( const AdaptiveHistogramCollator &  rhs)

Incremental or in-place addition operator.

Addition gives this histogram collator an expanded tree which represents a subpaving which is the union of the subpavings represented by this's original tree and the tree represented by the rhs collator. The summary for each node in the expanded tree contains all the values from the summaries of the original nodes and the summaries of the corresponding nodes in the tree managed by the rhs collator.

{
  //nothing to add
  if ( rhs.isEmptyCollation() ) {
    
    return *this;
  }
      
  // if this has no subpaving or an empty one, it should just copy the other one
  if ( isEmptyCollation() ) {
    
    *this = rhs;
    return *this;
  }

  // get here only if both have something in their collations
  
    getSubPaving()->addPaving( rhs.getSubPaving() );
    return *this;
}
AdaptiveHistogramCollator & AdaptiveHistogramCollator::operator+= ( const AdaptiveHistogram &  rhs)

Incremental or in-place addition operator.

Addition gives this histogram collator an expanded tree which represents a subpaving which is the union of the subpaving represented by this's original tree and the tree represented by rhs as a collator. The summary for each node in the expanded tree contains all the values from the summaries of the original nodes and the summaries of the corresponding nodes in the tree managed by the collator representation of rhs.

Warning:
If an exception is thrown during the addition process, this may be left in an incoherent state. Users can make a 'backup copy' of a collator before addition if they want to be able to return to the state before the failed addition process.
{

  //nothing to add
  if ( !rhs.hasSubPaving() || rhs.getSubPaving()->isEmpty() ) {
    
    return *this;
  }
      
  // if this has nothing collated
  if ( isEmptyCollation() ) {
    
    *this = AdaptiveHistogramCollator(rhs);
    return *this;
  }

  // get here only if both have subpavings
  
  getSubPaving()->addPaving( rhs.getSubPaving() );
  return *this;
}
void AdaptiveHistogramCollator::outputAverageToTxtTabs ( const std::string &  s,
int  prec = 5 
) const

Output average normalised histogram over collation to a txt file.

This method does not make the average histogram directly but, for each leaf node in the collated tree, calculates and outputs the average of the summary associated with that leaf node.

Output tab delimited data on the average to a text file. Outputs the normalised average histogram bins and heights.

Parameters:
s: the name of the file to send the output to.
prec: the precision for output formatting, ie the number of decimal places.
confirm: a boolean controlling whether confirmation goes to console output.
{
  outputAverageToTxtTabs(s, prec, false);
}

bool AdaptiveHistogramCollator::outputGraphDot ( ) const

Make a .dot graph file from collated histogram structure.

Makes a simple .dot graph from the histogram using node names, and makes the .png image for this graph.

Postcondition:
a .dot file and a .png file exist in the directory in which the program creating them was run.
{
  bool success = false;
  
    if (isEmptyCollation()) {

        std::cerr << "Sorry, you can't make a graph with nothing collated"
                << std::endl;
    }

    else success = getSubPaving()->outputGraphDot();

    return success;
}
void AdaptiveHistogramCollator::outputLog ( const std::string &  s,
const int  i,
int  prec = 5 
) const

Add current state of collation to a log file.

Parameters:
s: the name of the file to log to.
i: a number representing the index of this state in a sequence.
prec: the precision for output formatting, ie the number of decimal places.
{
    // To add output of the AdaptiveHistogramCollator object to file
    ofstream os(s.c_str(), ios::app);         // append
    if (os.is_open()) {
        os << std::endl;
        os << "Pass " << i << std::endl; // numbering
        getSubPaving()->leavesOutputTabs(os, prec); // the output
        os.close();
    }
    else {
        std::cerr << "Error: could not open file named "
            << s << std::endl << std::endl;
    }
}
std::ostream & AdaptiveHistogramCollator::outputToStreamTabs ( std::ostream &  os,
int  prec = 5 
) const

Output the subpaving managed by this to a given stream.

Format is a tab-delimited file of numeric data starting with nodeName, then the node box volume, then the node summary, then the description of the node box as a tab-delimited list of interval upper and lower bounds.

Parameters:
os: a reference to the stream to output the histogram to.
prec: the precision for output formatting, ie the number of decimal places.
Returns:
a reference to the given stream.
{
  if (NULL != getSubPaving()) {

     return getSubPaving()->leavesOutputTabs(os, prec); // the output
  }
  
    else return os;
}
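
The line format described above (node name, box volume, summary values, then box bounds, all tab-delimited) can be sketched as follows. This is only an illustration of the layout, not the library's leavesOutputTabs: LeafRecord and writeLeafTabs are hypothetical names, the node name "XL" in the usage note is illustrative, and writing each interval as lower bound then upper bound is an assumption.

```cpp
#include <cstddef>
#include <iomanip>
#include <ios>
#include <ostream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the data of one leaf node (not a library type).
struct LeafRecord {
    std::string nodeName;                         // name of the node
    double volume;                                // volume of the node's box
    std::vector<double> summary;                  // one value per collated histogram
    std::vector<std::pair<double, double> > box;  // (lower, upper) bound per dimension
};

// Write one tab-delimited line for a leaf, with prec decimal places.
std::ostream& writeLeafTabs(std::ostream& os, const LeafRecord& leaf, int prec = 5) {
    // Remember the stream's formatting so the caller's settings are restored.
    const std::ios_base::fmtflags oldFlags = os.flags();
    const std::streamsize oldPrec = os.precision();

    os << leaf.nodeName;
    os << std::fixed << std::setprecision(prec);
    os << '\t' << leaf.volume;
    for (std::size_t i = 0; i < leaf.summary.size(); ++i) {
        os << '\t' << leaf.summary[i];
    }
    for (std::size_t i = 0; i < leaf.box.size(); ++i) {
        os << '\t' << leaf.box[i].first << '\t' << leaf.box[i].second;
    }
    os << '\n';

    os.flags(oldFlags);
    os.precision(oldPrec);
    return os;
}
```

A leaf named "XL" with volume 0.5, summary values 0.75 and 0.4 and box [0, 0.5], written with prec = 2, gives the fields XL, 0.50, 0.75, 0.40, 0.00, 0.50 separated by tabs.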
void AdaptiveHistogramCollator::outputToTxtTabs ( const std::string &  s,
int  prec = 5 
) const

Output the collated information to a txt file.

Output tab delimited data on the collation to a text file.

Parameters:
s: the name of the file to send the output to.
prec: the precision for output formatting, ie the number of decimal places.
confirm: a boolean controlling whether confirmation goes to console output.
{
  outputToTxtTabs(s, prec, false);
}
bool AdaptiveHistogramCollator::splitToShape ( std::string  instruction)

Split subpaving managed by this to a specified shape.

Used for testing.

Throws a NullSubpavingPointer_Error if the subpaving that this manages is a NULL pointer.

Throws a NoBox_Error if the subpaving box is empty.

Prints a message to the standard error output if the instruction could not be carried out.

Parameters:
instruction: specifies the required shape, eg "3, 3, 2, 1"
Returns:
true if the split was successful, false otherwise
Precondition:
!isEmptyCollation().
{
  
  // checks:  is there a root paving, is the string properly formed?
  if (NULL == getSubPaving()) {
    throw NullSubpavingPointer_Error(
        "AdaptiveHistogramCollator::splitToShape()");
  }
  bool success = false;
  CollatorSPnode temp(*getSubPaving()); // copy to temp
  try {
    if (instruction.length() == 0) {
      throw std::invalid_argument(
        "AdaptiveHistogramCollator::splitToShape() : No instruction");
    }

    std::string legal(", 0123456789");
    if (instruction.find_first_not_of(legal) != std::string::npos) {
      throw std::invalid_argument(
        "AdaptiveHistogramCollator::splitToShape() : Illegal character");
    }

    // all seems to be okay, we can start splitting the root paving
    
    success = getSubPaving()->splitRootToShape(instruction);

    if (!success) {
      handleSplitToShapeError(temp);
     }
     
  }
  catch (std::invalid_argument const& ia) {
    cerr << ia.what() << endl;
    handleSplitToShapeError(temp);
    success = false;
  }
  catch (std::logic_error const& le) {
    cerr << le.what() << endl;
    handleSplitToShapeError(temp);
    success = false;
  }
  return success;
  // any other exceptions are unhandled
}
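
The instruction checks in the listing above (non-empty string, only digits, commas and spaces) can be combined with a simple parse into integers, as in the standalone sketch below. parseShapeInstruction is a hypothetical helper, not part of the library, and it does not check that the parsed numbers describe a valid tree shape.

```cpp
#include <algorithm>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Validate an instruction such as "3, 3, 2, 1" and return its numbers in order.
std::vector<int> parseShapeInstruction(const std::string& instruction) {
    if (instruction.empty()) {
        throw std::invalid_argument("parseShapeInstruction: No instruction");
    }
    const std::string legal(", 0123456789");
    if (instruction.find_first_not_of(legal) != std::string::npos) {
        throw std::invalid_argument("parseShapeInstruction: Illegal character");
    }

    // Turn commas into spaces so the numbers can be read with operator>>.
    std::string spaced(instruction);
    std::replace(spaced.begin(), spaced.end(), ',', ' ');

    std::istringstream in(spaced);
    std::vector<int> levels;
    int level = 0;
    while (in >> level) {
        levels.push_back(level);
    }
    return levels;
}
```

Calling parseShapeInstruction("3, 3, 2, 1") returns the four values 3, 3, 2, 1, while an empty string or a string containing any other character throws std::invalid_argument.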

The documentation for this class was generated from the following files: