The Kahan summation is a compensated summation algorithm, which significantly reduces numerical errors when adding a sequence of finite-precision floating point numbers.
This is done by keeping a separate running compensation (a variable to accumulate small errors).
This class can internally use multiple accumulators (template parameter N
). When filled from a collection that supports index access from a contiguous block of memory, compilers such as gcc, clang and icc can auto-vectorise the accumulation. This happens by cycling through the internal accumulators based on the value of "`index % N`", so N
accumulators can be filled from a block of N
numbers in a single instruction.
The usage of multiple accumulators might slightly increase the precision in comparison to the single-accumulator version with N = 1
. This depends on the order and magnitude of the numbers being accumulated. Therefore, in rare cases, the accumulation result can change in dependence of N, even when the data are identical. The magnitude of such differences is well below the precision of the floating point type, and will therefore mostly show in the compensation sum(see Carry()). Increasing the number of accumulators therefore only makes sense to speed up the accumulation, but not to increase precision.
T | The type of the values to be accumulated. |
N | Number of accumulators. Defaults to 1. Ideal values are the widths of a vector register on the relevant architecture. Depending on the instruction set, good values are:
|
Public Member Functions | |
KahanSum (T initialValue=T{}) | |
Initialise the sum. More... | |
template<class Container_t > | |
void | Add (const Container_t &inputs) |
Fill from a container that supports index access. More... | |
template<class Iterator > | |
void | Add (Iterator begin, Iterator end) |
Accumulate from a range denoted by iterators. More... | |
void | Add (T x) |
Single-element accumulation. Will not vectorise. More... | |
void | AddIndexed (T input, std::size_t index) |
Add input to the sum. More... | |
T | Carry () const |
operator T () const | |
Auto-convert to type T. More... | |
KahanSum< T, N > & | operator+= (T arg) |
Add arg into accumulator. Does not vectorise. More... | |
T | Result () const |
T | Sum () const |
Static Public Member Functions | |
template<class Iterator > | |
static KahanSum< T, N > | Accumulate (Iterator begin, Iterator end, T initialValue=T{}) |
Iterate over a range and return an instance of a KahanSum. More... | |
Private Attributes | |
T | fCarry [N] |
T | fSum [N] |
#include <Math/Util.h>
|
inline |
|
inlinestatic |
Iterate over a range and return an instance of a KahanSum.
See Add(Iterator,Iterator) for details.
[in] | begin | Beginning of a range. |
[in] | end | End of the range. |
[in] | initialValue | Optional initial value. |
|
inline |
|
inline |
Accumulate from a range denoted by iterators.
This function will auto-vectorise with random-access iterators.
[in] | begin | Beginning of a range. Needs to be a random access iterator for automatic vectorisation, because a contiguous block of memory needs to be read. |
[in] | end | End of the range. |
|
inline |
|
inline |
Add input
to the sum.
Particularly helpful when filling from a for loop. This function can be inlined and auto-vectorised if the index parameter is used to enumerate consecutive fills. Use Add() or Accumulate() when no index is available.
[in] | input | Value to accumulate. |
[in] | index | Index of the value. Determines internal accumulator that this value is added to. Make sure that consecutive fills have consecutive indices to make a loop auto-vectorisable. The actual value of the index does not matter, as long as it is consecutive. |
|
inline |
|
inline |
|
inline |
|
inline |
|
inline |
|
private |
|
private |