$results = sigma_clip( coords => $coords, weight => $weight, mask => $mask, %opts);
This module collects various algorithms for determining the center of a dataset in one place. It accepts data stored as PDL variables (piddles).
Currently it contains a single function, sigma_clip, an iterative algorithm that successively removes outliers by clipping data whose distances from the current center are greater than a given number of standard deviations.
sigma_clip finds the center of a dataset by:
- ignoring the data whose distance to the current center is greater than a specified number of standard deviations
- calculating a new center by performing a (weighted) centroid of the remaining data
- calculating the standard deviation of the distance from the data to the center
- repeating from the first step until either the convergence tolerance has been met or the iteration limit has been exceeded
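
The steps above can be sketched in plain Perl, without PDL, to make the loop concrete. This is an illustration under stated assumptions, not the module's implementation: the function name `sigma_clip_sketch` and the argument names `nsigma`, `iterlim`, and `dtol` are hypothetical, the "standard deviation" is taken to be the weighted RMS distance to the center, and the set of ignored points is recomputed from the full dataset on every pass.

```perl
use strict;
use warnings;

# Euclidean distance between two points given as array refs.
sub dist {
    my ( $p, $c ) = @_;
    my $sum = 0;
    $sum += ( $p->[$_] - $c->[$_] )**2 for 0 .. $#$c;
    return sqrt $sum;
}

# Weighted centroid and weighted RMS distance of the points flagged in $keep.
# (Using the RMS distance as the "standard deviation" is an assumption.)
sub center_and_sigma {
    my ( $coords, $weight, $keep ) = @_;
    my @idx    = grep { $keep->[$_] } 0 .. $#$coords;
    my $wsum   = 0;
    my @center = (0) x @{ $coords->[0] };
    for my $i (@idx) {
        $wsum += $weight->[$i];
        $center[$_] += $weight->[$i] * $coords->[$i][$_] for 0 .. $#center;
    }
    $_ /= $wsum for @center;
    my $var = 0;
    $var += $weight->[$_] * dist( $coords->[$_], \@center )**2 for @idx;
    return ( \@center, sqrt( $var / $wsum ) );
}

# Hypothetical sketch of the iteration; all argument names are illustrative.
sub sigma_clip_sketch {
    my %arg     = @_;
    my $coords  = $arg{coords};
    my $weight  = $arg{weight} // [ (1) x @$coords ];
    my $nsigma  = $arg{nsigma};
    my $iterlim = $arg{iterlim} // 10;
    my $dtol    = $arg{dtol} // 1e-6;

    # Initial center and standard deviation from the entire dataset.
    my @keep = (1) x @$coords;
    my ( $center, $sigma ) = center_and_sigma( $coords, $weight, \@keep );

    my %result = ( converged => 0, iterations => 0 );
    for my $iter ( 1 .. $iterlim ) {
        # Ignore data farther than nsigma standard deviations from the center.
        @keep = map { ( dist( $_, $center ) <= $nsigma * $sigma ) ? 1 : 0 } @$coords;
        last unless grep { $_ } @keep;    # nothing left to centroid

        # New weighted centroid and standard deviation of the remaining data.
        my ( $new, $new_sigma ) = center_and_sigma( $coords, $weight, \@keep );
        my $shift = dist( $new, $center );
        ( $center, $sigma ) = ( $new, $new_sigma );
        $result{iterations} = $iter;

        # Stop once the center has moved by no more than the tolerance.
        if ( $shift <= $dtol ) { $result{converged} = 1; last }
    }

    $result{center} = $center;
    $result{sigma}  = $sigma;
    $result{nkept}  = scalar( grep { $_ } @keep );
    return \%result;
}

# Five points around the origin plus one distant outlier.
my $result = sigma_clip_sketch(
    coords => [ [ -1, 0 ], [ 1, 0 ], [ 0, -1 ], [ 0, 1 ], [ 0, 0 ], [ 100, 100 ] ],
    nsigma => 1.5,
);
printf "center = (%.3f, %.3f) after %d iterations\n",
  @{ $result->{center} }, $result->{iterations};
```

In this example the outlier pulls the initial centroid away from the origin, is rejected on the first pass, and the second pass leaves the center unchanged, so the loop stops on the convergence test rather than the iteration limit.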
The initial center may be explicitly specified, or may be calculated by performing a (weighted) centroid of the data.
The initial standard deviation is calculated about the initial center, using either the entire dataset or a clipped region around the initial center.
sigma_clip can center sparse datasets (the input is a list of coordinates) or dense datasets (the input is a hyper-rectangle), with or without weights. It also accepts a mask that directs it to use only certain elements of the dataset.
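
To illustrate how a dense dataset relates to a sparse one, the following plain-Perl sketch flattens a small 2-D grid into a coordinate list with weights, skipping masked elements, and then takes the weighted centroid. It rests on assumptions: the helper names `dense_to_sparse` and `weighted_centroid` are hypothetical, the element values are treated as weights, and the (row, column) index order of the nested arrays need not match PDL's dimension order.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a dense 2-D grid into (coords, weight) lists.
# Element indices become coordinates; elements with a false mask are skipped.
sub dense_to_sparse {
    my ( $grid, $mask ) = @_;
    my ( @coords, @weight );
    for my $i ( 0 .. $#$grid ) {
        for my $j ( 0 .. $#{ $grid->[$i] } ) {
            next if $mask && !$mask->[$i][$j];
            push @coords, [ $i, $j ];
            push @weight, $grid->[$i][$j];
        }
    }
    return ( \@coords, \@weight );
}

# Hypothetical helper: weighted centroid of a coordinate list.
sub weighted_centroid {
    my ( $coords, $weight ) = @_;
    my $wsum   = 0;
    my @center = (0) x @{ $coords->[0] };
    for my $k ( 0 .. $#$coords ) {
        $wsum += $weight->[$k];
        $center[$_] += $weight->[$k] * $coords->[$k][$_] for 0 .. $#center;
    }
    $_ /= $wsum for @center;
    return \@center;
}

my $grid = [
    [ 0, 0, 0 ],
    [ 0, 4, 2 ],
    [ 0, 2, 9 ],
];
my $mask = [
    [ 1, 1, 1 ],
    [ 1, 1, 1 ],
    [ 1, 1, 0 ],    # exclude the bright corner element
];

my ( $coords, $weight ) = dense_to_sparse( $grid, $mask );
my $center = weighted_centroid( $coords, $weight );
printf "centroid = (%.2f, %.2f) from %d elements\n", @$center, scalar @$coords;
```

With the corner element masked out, only the weights 4, 2, and 2 contribute, so the centroid sits between those three elements instead of being dragged toward the corner.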
The coordinates may be transformed using [PDL::Transform](https://metacpan.org/pod/PDL::Transform). This is mostly useful for dense datasets, where coordinates are generated from the indices of the passed hyper-rectangle. This functionality is not currently documented, as tests for it have not yet been written.
More information is available at the GitHub repository page, https://github.com/djerius/PDLx-Algorithm-Center