## Synopsis

```
$results = sigma_clip( coords => $coords,
                       weight => $weight,
                       mask   => $mask,
                       %opts );
```

## Description

This module collects various algorithms for determining the center of a dataset into one place. It accepts data stored as PDL variables (piddles).

Currently it contains a single function, `sigma_clip`, an iterative algorithm that successively removes outliers by clipping data whose distance from the current center is greater than a given number of standard deviations.

`sigma_clip` finds the center of a dataset by:

1. ignoring the data whose distance to the current center is greater than a specified number of standard deviations
2. calculating a new center by performing a (weighted) centroid of the remaining data
3. calculating the standard deviation of the distance from the data to the center
4. repeating from step 1 until either the convergence tolerance has been met or the iteration limit has been exceeded
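The loop described above can be sketched in plain Perl, without PDL. This is only an illustration of the technique, not the module's implementation: `sigma_clip_sketch` and its `nsigma`, `iterlim`, and `tol` options are hypothetical names, the data are unweighted, and the "standard deviation" is assumed here to be the RMS distance of the retained points from the center.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical helper, NOT part of PDLx::Algorithm::Center.
# $coords is an array ref of points, each an array ref of coordinates.
sub sigma_clip_sketch {
    my ( $coords, %opt ) = @_;
    die "need at least one point\n" unless @$coords;

    my $nsigma  = $opt{nsigma}  // 3;       # clip threshold (assumed name)
    my $iterlim = $opt{iterlim} // 10;      # iteration limit (assumed name)
    my $tol     = $opt{tol}     // 1e-6;    # convergence tolerance (assumed name)
    my $ndim    = @{ $coords->[0] };

    # Unweighted centroid of a list of points.
    my $centroid = sub {
        my @pts = @_;
        return [ map { my $d = $_; sum( map { $_->[$d] } @pts ) / scalar(@pts) }
                0 .. $ndim - 1 ];
    };

    # Euclidean distance between two points.
    my $dist = sub {
        my ( $p, $q ) = @_;
        return sqrt( sum( map { ( $p->[$_] - $q->[$_] )**2 } 0 .. $ndim - 1 ) );
    };

    # RMS distance of the points from a center (assumed definition of sigma).
    my $sigma_of = sub {
        my ( $c, @pts ) = @_;
        return sqrt( sum( map { $dist->( $_, $c )**2 } @pts ) / scalar(@pts) );
    };

    # Initial center and sigma from the entire dataset.
    my @kept   = @$coords;
    my $center = $centroid->(@kept);
    my $sigma  = $sigma_of->( $center, @kept );

    for my $iter ( 1 .. $iterlim ) {
        # Step 1: ignore points farther than nsigma * sigma from the center.
        my @next = grep { $dist->( $_, $center ) <= $nsigma * $sigma } @$coords;
        last unless @next;

        # Steps 2 and 3: new center and new sigma from the remaining points.
        my $new_center = $centroid->(@next);
        my $new_sigma  = $sigma_of->( $new_center, @next );
        my $shift      = $dist->( $new_center, $center );

        ( $center, $sigma, @kept ) = ( $new_center, $new_sigma, @next );

        # Step 4: stop once the center has moved less than the tolerance.
        last if $shift < $tol;
    }

    return { center => $center, sigma => $sigma, nkept => scalar @kept };
}

# A tight cluster around the origin plus one distant outlier.
my @points = (
    [ 0, 0 ], [ 1, 0 ],  [ 0, 1 ],   [ 1, 1 ],  [ -1, 0 ],
    [ 0, -1 ], [ -1, -1 ], [ 1, -1 ], [ -1, 1 ], [ 100, 100 ],
);
my $res = sigma_clip_sketch( \@points, nsigma => 2 );
printf "center = (%.3f, %.3f), kept %d of %d points\n",
    @{ $res->{center} }, $res->{nkept}, scalar @points;
```

The example uses a threshold of 2 rather than 3 because the single outlier inflates the initial sigma so much that it lies just inside 3 sigma of the initial centroid and would survive the first pass. Starting from a clipped region or an explicit initial center, as described below, is another way around this.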

The initial center may be explicitly specified, or may be calculated by performing a (weighted) centroid of the data.

The initial standard deviation is calculated using the initial center and either the entire dataset or a clipped region about the initial center.

`sigma_clip` can center sparse datasets (the input is a list of coordinates) or dense datasets (the input is a hyper-rectangle), with or without weights. It accepts a mask which directs it to use only certain elements in the dataset.
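To make the dense case concrete, here is a minimal plain-Perl sketch of a weighted centroid over a 2-D grid, where each element's coordinates are simply its indices and the element's value acts as its weight. The function name `dense_centroid_sketch` is hypothetical, and the mask convention (a true value means "use this element") is an assumption made for the illustration, not a statement about the module's interface.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of PDLx::Algorithm::Center.
# $grid->[$i][$j] is the weight at index ($i, $j).
# $mask->[$i][$j], if a mask is given, is true for elements to use (assumed).
sub dense_centroid_sketch {
    my ( $grid, $mask ) = @_;
    my $wsum = 0;
    my @csum = ( 0, 0 );

    for my $i ( 0 .. $#$grid ) {
        for my $j ( 0 .. $#{ $grid->[$i] } ) {
            next if $mask && !$mask->[$i][$j];    # skip masked-out elements
            my $w = $grid->[$i][$j];
            $wsum    += $w;
            $csum[0] += $w * $i;                  # coordinates come from indices
            $csum[1] += $w * $j;
        }
    }

    return unless $wsum;                          # no usable weight
    return [ map { $_ / $wsum } @csum ];
}

my $grid = [ [ 0, 0, 0 ], [ 0, 1, 0 ], [ 0, 0, 3 ] ];
my $mask = [ [ 1, 1, 1 ], [ 1, 1, 1 ], [ 1, 1, 0 ] ];

my $all    = dense_centroid_sketch($grid);           # every element used
my $masked = dense_centroid_sketch( $grid, $mask );  # element (2, 2) ignored
```

A sparse dataset is the same calculation with the coordinates supplied explicitly instead of being derived from the indices.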

The coordinates may be transformed using [PDL::Transform](https://metacpan.org/pod/PDL::Transform). This is mostly useful for dense datasets, where coordinates are generated from the indices of the passed hyper-rectangle. This functionality is not currently documented, as tests for it have not yet been written.

More information is available at the GitHub repository page, https://github.com/djerius/PDLx-Algorithm-Center
