Template Numerical Library version main:df396df
Namespace for fundamental TNL algorithms.
Namespaces | |
namespace | Segments |
Namespace for the segments data structures. | |
Classes | |
struct | AtomicOperations |
struct | AtomicOperations< Devices::Cuda > |
struct | AtomicOperations< Devices::Host > |
struct | AtomicOperations< Devices::Sequential > |
class | CudaReductionBuffer |
struct | Multireduction |
struct | Multireduction< Devices::Cuda > |
struct | Multireduction< Devices::Host > |
struct | Multireduction< Devices::Sequential > |
struct | SegmentedScan |
Computes segmented scan (or prefix sum) on a vector. | |
struct | SegmentedScan< Devices::Cuda, Type > |
struct | SegmentedScan< Devices::Host, Type > |
struct | SegmentedScan< Devices::Sequential, Type > |
struct | SequentialFor |
Wrapper to ParallelFor which makes it run sequentially. | |
Functions | |
template<typename Array , typename Sorter = typename Sorting::DefaultSorter< typename Array::DeviceType >::SorterType> | |
void | ascendingSort (Array &array, const Sorter &sorter=Sorter{}) |
Function for sorting elements of array or vector in ascending order. | |
template<typename Array > | |
bool | contains (const Array &array, typename Array::ValueType value, typename Array::IndexType begin=0, typename Array::IndexType end=0) |
Checks if an array/vector/view contains an element with given value. | |
template<typename Array > | |
bool | containsOnlyValue (const Array &array, typename Array::ValueType value, typename Array::IndexType begin=0, typename Array::IndexType end=0) |
Checks if all elements of an array/vector/view have the given value. | |
template<typename DestinationDevice , typename SourceDevice = DestinationDevice, typename DestinationElement , typename SourceElement , typename Index > | |
void | copy (DestinationElement *destination, const SourceElement *source, Index size) |
Copies memory from source to destination. | |
template<typename DestinationDevice , typename DestinationElement , typename Index , typename SourceIterator > | |
void | copy (DestinationElement *destination, Index destinationSize, SourceIterator begin, SourceIterator end) |
Copies memory from source iterator range to destination. | |
template<typename Array , typename DestinationElement , typename = std::enable_if_t< IsArrayType< Array >::value >> | |
void | copy (std::vector< DestinationElement > &destination, const Array &source) |
Copies memory from the source TNL array-like container to the destination STL vector. | |
template<typename Array , typename Sorter = typename Sorting::DefaultSorter< typename Array::DeviceType >::SorterType> | |
void | descendingSort (Array &array, const Sorter &sorter=Sorter{}) |
Function for sorting elements of array or vector in descending order. | |
template<typename InputDistributedArray , typename OutputDistributedArray , typename Reduction > | |
void | distributedExclusiveScan (const InputDistributedArray &input, OutputDistributedArray &output, typename InputDistributedArray::IndexType begin, typename InputDistributedArray::IndexType end, Reduction &&reduction, typename OutputDistributedArray::ValueType identity) |
Computes an exclusive scan (or prefix sum) of an input distributed array and stores it in an output distributed array. | |
template<typename InputDistributedArray , typename OutputDistributedArray , typename Reduction = TNL::Plus> | |
void | distributedExclusiveScan (const InputDistributedArray &input, OutputDistributedArray &output, typename InputDistributedArray::IndexType begin=0, typename InputDistributedArray::IndexType end=0, Reduction &&reduction=TNL::Plus{}) |
Overload of distributedExclusiveScan which uses a TNL functional object for reduction. TNL::Plus is used by default. | |
template<typename InputDistributedArray , typename OutputDistributedArray , typename Reduction > | |
void | distributedInclusiveScan (const InputDistributedArray &input, OutputDistributedArray &output, typename InputDistributedArray::IndexType begin, typename InputDistributedArray::IndexType end, Reduction &&reduction, typename OutputDistributedArray::ValueType identity) |
Computes an inclusive scan (or prefix sum) of an input distributed array and stores it in an output distributed array. | |
template<typename InputDistributedArray , typename OutputDistributedArray , typename Reduction = TNL::Plus> | |
void | distributedInclusiveScan (const InputDistributedArray &input, OutputDistributedArray &output, typename InputDistributedArray::IndexType begin=0, typename InputDistributedArray::IndexType end=0, Reduction &&reduction=TNL::Plus{}) |
Overload of distributedInclusiveScan which uses a TNL functional object for reduction. TNL::Plus is used by default. | |
template<typename DistributedArray , typename Reduction > | |
void | distributedInplaceExclusiveScan (DistributedArray &array, typename DistributedArray::IndexType begin, typename DistributedArray::IndexType end, Reduction &&reduction, typename DistributedArray::ValueType identity) |
Computes an exclusive scan (or prefix sum) of a distributed array in-place. | |
template<typename DistributedArray , typename Reduction = TNL::Plus> | |
void | distributedInplaceExclusiveScan (DistributedArray &array, typename DistributedArray::IndexType begin=0, typename DistributedArray::IndexType end=0, Reduction &&reduction=TNL::Plus{}) |
Overload of distributedInplaceExclusiveScan which uses a TNL functional object for reduction. TNL::Plus is used by default. | |
template<typename DistributedArray , typename Reduction > | |
void | distributedInplaceInclusiveScan (DistributedArray &array, typename DistributedArray::IndexType begin, typename DistributedArray::IndexType end, Reduction &&reduction, typename DistributedArray::ValueType identity) |
Computes an inclusive scan (or prefix sum) of a distributed array in-place. | |
template<typename DistributedArray , typename Reduction = TNL::Plus> | |
void | distributedInplaceInclusiveScan (DistributedArray &array, typename DistributedArray::IndexType begin=0, typename DistributedArray::IndexType end=0, Reduction &&reduction=TNL::Plus{}) |
Overload of distributedInplaceInclusiveScan which uses a TNL functional object for reduction. TNL::Plus is used by default. | |
template<typename DestinationDevice , typename SourceDevice = DestinationDevice, typename DestinationElement , typename SourceElement , typename Index > | |
bool | equal (DestinationElement *destination, const SourceElement *source, Index size) |
Compares memory from source with destination. | |
template<typename InputArray , typename OutputArray , typename Reduction > | |
void | exclusiveScan (const InputArray &input, OutputArray &output, typename InputArray::IndexType begin, typename InputArray::IndexType end, typename OutputArray::IndexType outputBegin, Reduction &&reduction, typename OutputArray::ValueType identity) |
Computes an exclusive scan (or prefix sum) of an input array and stores it in an output array. | |
template<typename InputArray , typename OutputArray , typename Reduction = TNL::Plus> | |
void | exclusiveScan (const InputArray &input, OutputArray &output, typename InputArray::IndexType begin=0, typename InputArray::IndexType end=0, typename OutputArray::IndexType outputBegin=0, Reduction &&reduction=TNL::Plus{}) |
Overload of exclusiveScan which uses a TNL functional object for reduction. TNL::Plus is used by default. | |
template<typename Device , typename Element , typename Index > | |
void | fill (Element *data, const Element &value, Index size) |
Fills memory between data and data + size with a value. | |
template<typename Device , typename Element , typename Index > | |
void | fillRandom (Element *data, Index size, Element min_val, Element max_val) |
Fills memory between data and data + size with random Element values in the given range. | |
template<typename Container , typename ValueType > | |
std::pair< bool, typename Container::IndexType > | find (const Container &container, const ValueType &value) |
Find the first occurrence of a value in an array. | |
template<typename InputArray , typename OutputArray , typename Reduction > | |
void | inclusiveScan (const InputArray &input, OutputArray &output, typename InputArray::IndexType begin, typename InputArray::IndexType end, typename OutputArray::IndexType outputBegin, Reduction &&reduction, typename OutputArray::ValueType identity) |
Computes an inclusive scan (or prefix sum) of an input array and stores it in an output array. | |
template<typename InputArray , typename OutputArray , typename Reduction = TNL::Plus> | |
void | inclusiveScan (const InputArray &input, OutputArray &output, typename InputArray::IndexType begin=0, typename InputArray::IndexType end=0, typename OutputArray::IndexType outputBegin=0, Reduction &&reduction=TNL::Plus{}) |
Overload of inclusiveScan which uses a TNL functional object for reduction. TNL::Plus is used by default. | |
template<typename Array , typename Reduction > | |
void | inplaceExclusiveScan (Array &array, typename Array::IndexType begin, typename Array::IndexType end, Reduction &&reduction, typename Array::ValueType identity) |
Computes an exclusive scan (or prefix sum) of an array in-place. | |
template<typename Array , typename Reduction = TNL::Plus> | |
void | inplaceExclusiveScan (Array &array, typename Array::IndexType begin=0, typename Array::IndexType end=0, Reduction &&reduction=TNL::Plus{}) |
Overload of inplaceExclusiveScan which uses a TNL functional object for reduction. TNL::Plus is used by default. | |
template<typename Array , typename Reduction > | |
void | inplaceInclusiveScan (Array &array, typename Array::IndexType begin, typename Array::IndexType end, Reduction &&reduction, typename Array::ValueType identity) |
Computes an inclusive scan (or prefix sum) of an array in-place. | |
template<typename Array , typename Reduction = TNL::Plus> | |
void | inplaceInclusiveScan (Array &array, typename Array::IndexType begin=0, typename Array::IndexType end=0, Reduction &&reduction=TNL::Plus{}) |
Overload of inplaceInclusiveScan which uses a TNL functional object for reduction. TNL::Plus is used by default. | |
template<typename Array > | |
bool | isAscending (const Array &arr) |
Function returning true if the array elements are sorted in ascending order. | |
template<typename Array > | |
bool | isDescending (const Array &arr) |
Function returning true if the array elements are sorted in descending order. | |
template<typename Array , typename Compare > | |
bool | isSorted (const Array &arr, const Compare &compare) |
Function returning true if the array elements are sorted according to the comparison lambda function compare. | |
template<typename Device , typename Begin , typename End , typename Function , typename... FunctionArgs> | |
std::enable_if_t< std::is_integral_v< Begin > &&std::is_integral_v< End > > | parallelFor (const Begin &begin, const End &end, Function f, FunctionArgs... args) |
Parallel for-loop function for 1D range specified with integral values with default launch configuration. | |
template<typename Device , typename Begin , typename End , typename Function , typename... FunctionArgs> | |
std::enable_if_t< IsStaticArrayType< Begin >::value &&IsStaticArrayType< End >::value > | parallelFor (const Begin &begin, const End &end, Function f, FunctionArgs... args) |
Parallel for-loop function for range specified with multi-index values with default launch configuration. | |
template<typename Device , typename Begin , typename End , typename Function , typename... FunctionArgs> | |
std::enable_if_t< std::is_integral_v< Begin > &&std::is_integral_v< End > > | parallelFor (const Begin &begin, const End &end, typename Device::LaunchConfiguration launch_config, Function f, FunctionArgs... args) |
Parallel for-loop function for 1D range specified with integral values. | |
template<typename Device , typename Begin , typename End , typename Function , typename... FunctionArgs> | |
std::enable_if_t< IsStaticArrayType< Begin >::value &&IsStaticArrayType< End >::value > | parallelFor (const Begin &begin, const End &end, typename Device::LaunchConfiguration launch_config, Function f, FunctionArgs... args) |
Parallel for-loop function for range specified with multi-index values. | |
template<typename Array , typename Device = typename Array::DeviceType, typename Reduction , typename Result > | |
auto | reduce (const Array &array, Reduction &&reduction, Result identity) |
Variant of reduce for arrays, views and compatible objects. | |
template<typename Array , typename Device = typename Array::DeviceType, typename Reduction = TNL::Plus> | |
auto | reduce (const Array &array, Reduction &&reduction=TNL::Plus{}) |
Variant of reduce for arrays, views and compatible objects. | |
template<typename Device , typename Index , typename Result , typename Fetch , typename Reduction > | |
Result | reduce (Index begin, Index end, Fetch &&fetch, Reduction &&reduction, const Result &identity) |
reduce implements (parallel) reduction for vectors and arrays. | |
template<typename Device , typename Index , typename Fetch , typename Reduction = TNL::Plus> | |
auto | reduce (Index begin, Index end, Fetch &&fetch, Reduction &&reduction=TNL::Plus{}) |
Variant of reduce with functional instead of reduction lambda function. | |
template<typename Array , typename Device = typename Array::DeviceType, typename Reduction > | |
auto | reduceWithArgument (const Array &array, Reduction &&reduction) |
Variant of reduceWithArgument for arrays, views and compatible objects. | |
template<typename Array , typename Device = typename Array::DeviceType, typename Reduction , typename Result > | |
auto | reduceWithArgument (const Array &array, Reduction &&reduction, Result identity) |
Variant of reduceWithArgument for arrays, views and compatible objects. | |
template<typename Device , typename Index , typename Fetch , typename Reduction > | |
auto | reduceWithArgument (Index begin, Index end, Fetch &&fetch, Reduction &&reduction) |
Variant of reduceWithArgument with functional instead of reduction lambda function. | |
template<typename Device , typename Index , typename Result , typename Fetch , typename Reduction > | |
std::pair< Result, Index > | reduceWithArgument (Index begin, Index end, Fetch &&fetch, Reduction &&reduction, const Result &identity) |
Variant of reduce returning also the position of the element of interest. | |
template<typename Array , typename Compare , typename Sorter = typename Sorting::DefaultSorter< typename Array::DeviceType >::SorterType> | |
void | sort (Array &array, const Compare &compare, const Sorter &sorter=Sorter{}) |
Function for sorting elements of array or vector based on a user defined comparison lambda function. | |
template<typename Device , typename Index , typename Compare , typename Swap , typename Sorter = typename Sorting::DefaultInplaceSorter< Device >::SorterType> | |
void | sort (const Index begin, const Index end, Compare &&compare, Swap &&swap, const Sorter &sorter=Sorter{}) |
Function for general sorting based on lambda functions for comparison and swapping of two elements. | |
template<typename Index , Index begin, Index end, typename Func , typename... ArgTypes> | |
constexpr void | staticFor (Func &&f, ArgTypes &&... args) |
Generic loop with constant bounds and indices usable in constant expressions. | |
template<typename Index , Index begin, Index end, Index unrollFactor = 8, typename Func > | |
constexpr void | unrolledFor (Func &&f) |
Generic for-loop with explicit unrolling. | |
Namespace for fundamental TNL algorithms.
It contains algorithms like for-loops, memory operations, (parallel) reduction, multireduction, scan etc.
void TNL::Algorithms::ascendingSort | ( | Array & | array, |
const Sorter & | sorter = Sorter{} ) |
Function for sorting elements of array or vector in ascending order.
Array | is a type of container to be sorted. It can be, for example, TNL::Containers::Array, TNL::Containers::ArrayView, TNL::Containers::Vector, TNL::Containers::VectorView. |
Sorter | is an algorithm for sorting. It can be TNL::Algorithms::Sorting::STLSort for sorting on host and TNL::Algorithms::Sorting::Quicksort or TNL::Algorithms::Sorting::BitonicSort for sorting on CUDA GPU. |
array | is an instance of array/array view/vector/vector view for sorting. |
sorter | is an instance of sorter. |
bool TNL::Algorithms::contains | ( | const Array & | array, |
typename Array::ValueType | value, | ||
typename Array::IndexType | begin = 0, | ||
typename Array::IndexType | end = 0 ) |
Checks if an array/vector/view contains an element with given value.
By default, all elements of the array are checked. If begin or end is set to a non-zero value, only elements in the sub-interval [begin, end) are checked.
array | The array to be searched. |
value | The value to be checked. |
begin | The beginning of the array sub-interval. It is 0 by default. |
end | The end of the array sub-interval. The default value is 0 which is, however, replaced with the array size. |
Returns true if there is at least one element in the sub-interval [begin, end) which has the value value. Returns false if the range is empty.
bool TNL::Algorithms::containsOnlyValue | ( | const Array & | array, |
typename Array::ValueType | value, | ||
typename Array::IndexType | begin = 0, | ||
typename Array::IndexType | end = 0 ) |
Checks if all elements of an array/vector/view have the given value.
By default, all elements of the array are checked. If begin or end is set to a non-zero value, only elements in the sub-interval [begin, end) are checked.
array | The array to be searched. |
value | The value to be checked. |
begin | The beginning of the array sub-interval. It is 0 by default. |
end | The end of the array sub-interval. The default value is 0 which is, however, replaced with the array size. |
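The range convention shared by contains and containsOnlyValue (an end of 0 means "up to the array size", and the checked interval is half-open) can be illustrated with a plain host-side sketch. It uses std::vector rather than a TNL container, and the names resolveEnd, containsSketch and containsOnlyValueSketch are hypothetical; this is not how TNL implements the functions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: an end of 0 stands for "up to the array size".
inline std::size_t resolveEnd(std::size_t size, std::size_t end)
{
    return end == 0 ? size : end;
}

// Sketch of the documented semantics of contains on a host vector.
template <typename T>
bool containsSketch(const std::vector<T>& a, const T& value,
                    std::size_t begin = 0, std::size_t end = 0)
{
    end = resolveEnd(a.size(), end);
    assert(begin <= end && end <= a.size());
    // any_of over an empty range is false, matching the documented result.
    return std::any_of(a.begin() + static_cast<std::ptrdiff_t>(begin),
                       a.begin() + static_cast<std::ptrdiff_t>(end),
                       [&](const T& x) { return x == value; });
}

// Sketch of the documented semantics of containsOnlyValue on a host vector.
template <typename T>
bool containsOnlyValueSketch(const std::vector<T>& a, const T& value,
                             std::size_t begin = 0, std::size_t end = 0)
{
    end = resolveEnd(a.size(), end);
    assert(begin <= end && end <= a.size());
    // all_of over an empty range is true, matching the documented result.
    return std::all_of(a.begin() + static_cast<std::ptrdiff_t>(begin),
                       a.begin() + static_cast<std::ptrdiff_t>(end),
                       [&](const T& x) { return x == value; });
}
```

For the vector {1, 1, 2}, containsOnlyValueSketch(v, 1, 0, 2) is true because only the first two elements are inspected, while containsOnlyValueSketch(v, 1) is false.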
Returns true if all elements in the sub-interval [begin, end) have the value value. Returns true if the range is empty.
void TNL::Algorithms::copy | ( | DestinationElement * | destination, |
const SourceElement * | source, | ||
Index | size ) |
Copies memory from source to destination.
The source data is allocated on the device specified by SourceDevice and the destination data is allocated on the device specified by DestinationDevice.
DestinationDevice | is the device where the destination data is allocated. |
SourceDevice | is the device where the source data is allocated. |
DestinationElement | is the type of the destination data. |
SourceElement | is the type of the source data. |
Index | is the type of the size of the data. |
destination | is the pointer to the destination data. |
source | is the pointer to the source data. |
size | is the size of the data. |
void TNL::Algorithms::copy | ( | DestinationElement * | destination, |
Index | destinationSize, | ||
SourceIterator | begin, | ||
SourceIterator | end ) |
Copies memory from source iterator range to destination.
The source data must be allocated on the host device. The destination data is allocated on the device specified by DestinationDevice.
DestinationDevice | is the device where the destination data is allocated. |
DestinationElement | is the type of the destination data. |
Index | is the type of the size of the data. |
SourceIterator | is the iterator type for the source data. |
destination | is the pointer to the destination data. |
destinationSize | is the size of the destination data. |
begin | is the iterator to the first element of the source data range. |
end | is the one-past-the-end iterator of the source data range. |
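A host-only sketch of what copying from an iterator range into a sized destination buffer involves is given below. The function name copyRangeSketch is hypothetical, the device template parameter is left out, and throwing std::length_error when the source range does not fit is an assumption of this sketch rather than documented behaviour.

```cpp
#include <cstddef>
#include <stdexcept>

// Host-only sketch: copy [begin, end) into destination[0 .. destinationSize).
// Throwing std::length_error on overflow is an assumption of this sketch.
template <typename DestinationElement, typename Index, typename SourceIterator>
void copyRangeSketch(DestinationElement* destination, Index destinationSize,
                     SourceIterator begin, SourceIterator end)
{
    Index i = 0;
    for (; begin != end; ++begin, ++i) {
        if (i >= destinationSize)
            throw std::length_error("source range is larger than the destination");
        // The element types may differ, so convert explicitly.
        destination[i] = static_cast<DestinationElement>(*begin);
    }
}
```

Any forward-traversable range works as the source, including raw pointers: copyRangeSketch(dst, 4, src, src + 3) copies three elements and leaves dst[3] untouched.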
void TNL::Algorithms::copy | ( | std::vector< DestinationElement > & | destination, |
const Array & | source ) |
Copies memory from the source TNL array-like container to the destination STL vector.
Array | is the type of array where the source data is stored. It can be for example TNL::Containers::Array, TNL::Containers::ArrayView, TNL::Containers::Vector or TNL::Containers::VectorView. |
DestinationElement | is the type of the destination data stored in the STL vector. |
destination | is the destination STL vector. |
source | is the source TNL array. |
void TNL::Algorithms::descendingSort | ( | Array & | array, |
const Sorter & | sorter = Sorter{} ) |
Function for sorting elements of array or vector in descending order.
Array | is a type of container to be sorted. It can be, for example, TNL::Containers::Array, TNL::Containers::ArrayView, TNL::Containers::Vector, TNL::Containers::VectorView. |
Sorter | is an algorithm for sorting. It can be TNL::Algorithms::Sorting::STLSort for sorting on host and TNL::Algorithms::Sorting::Quicksort or TNL::Algorithms::Sorting::BitonicSort for sorting on CUDA GPU. |
array | is an instance of array/array view/vector/vector view for sorting. |
sorter | is an instance of sorter. |
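The pattern of passing a sorter object with a default, as in the signatures of ascendingSort and descendingSort, can be sketched on the host as follows. StdSorterSketch and its call interface are hypothetical and only stand in for a sorter type; the interface of TNL's own sorter classes is not reproduced here.

```cpp
#include <algorithm>
#include <functional>
#include <vector>  // only needed for the usage shown below the block

// Hypothetical sorter policy; TNL's own sorter classes are not reproduced here.
struct StdSorterSketch {
    template <typename Iterator, typename Compare>
    void operator()(Iterator first, Iterator last, Compare compare) const
    {
        std::sort(first, last, compare);
    }
};

// Ascending order: delegate to the sorter with a "less than" comparison.
template <typename Array, typename Sorter = StdSorterSketch>
void ascendingSortSketch(Array& array, const Sorter& sorter = Sorter{})
{
    sorter(array.begin(), array.end(), std::less<typename Array::value_type>{});
}

// Descending order: same sorter, "greater than" comparison.
template <typename Array, typename Sorter = StdSorterSketch>
void descendingSortSketch(Array& array, const Sorter& sorter = Sorter{})
{
    sorter(array.begin(), array.end(), std::greater<typename Array::value_type>{});
}
```

Calling ascendingSortSketch(v) on a std::vector<int> holding {3, 1, 2} leaves {1, 2, 3}; descendingSortSketch(v) then leaves {3, 2, 1}.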
void TNL::Algorithms::distributedExclusiveScan | ( | const InputDistributedArray & | input, |
OutputDistributedArray & | output, | ||
typename InputDistributedArray::IndexType | begin, | ||
typename InputDistributedArray::IndexType | end, | ||
Reduction && | reduction, | ||
typename OutputDistributedArray::ValueType | identity ) |
Computes an exclusive scan (or prefix sum) of an input distributed array and stores it in an output distributed array.
Exclusive scan (or prefix sum) operation turns a sequence \(a_1, \ldots, a_n\) into a sequence \(\sigma_1, \ldots, \sigma_n\) defined as
\[ \sigma_i = \sum_{j=1}^{i-1} a_j. \]
InputDistributedArray | type of the distributed array to be scanned |
OutputDistributedArray | type of the distributed array where the result is stored |
Reduction | type of the reduction functor |
input | input array |
output | output array |
begin | the first element in the array to be scanned |
end | the end of the interval to be scanned, i.e. the index one past the last scanned element |
reduction | functor implementing the reduction operation |
identity | is the identity element for the reduction operation, i.e. element which does not change the result of the reduction. |
The reduction functor takes the two values to be reduced and returns the result of combining them.
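As an illustration of the exclusive-scan definition with a general reduction functor and identity, here is a sequential sketch on a single, non-distributed host vector. The name exclusiveScanSketch is hypothetical, and the code only demonstrates the semantics, not the distributed implementation.

```cpp
#include <cstddef>
#include <vector>

// sigma[i] combines a[0] .. a[i-1]; the first output is the identity.
template <typename T, typename Reduction>
std::vector<T> exclusiveScanSketch(const std::vector<T>& a, Reduction reduction, T identity)
{
    std::vector<T> sigma(a.size());
    T running = identity;
    for (std::size_t i = 0; i < a.size(); ++i) {
        sigma[i] = running;                  // excludes a[i]
        running = reduction(running, a[i]);  // now includes a[i]
    }
    return sigma;
}
```

With addition and identity 0, the input {1, 2, 3, 4} gives {0, 1, 3, 6}. With multiplication and identity 1, the same input gives {1, 1, 2, 6}.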
void TNL::Algorithms::distributedExclusiveScan | ( | const InputDistributedArray & | input, |
OutputDistributedArray & | output, | ||
typename InputDistributedArray::IndexType | begin = 0, | ||
typename InputDistributedArray::IndexType | end = 0, | ||
Reduction && | reduction = TNL::Plus{} ) |
Overload of distributedExclusiveScan which uses a TNL functional object for reduction. TNL::Plus is used by default.
The identity element is taken as reduction.template getIdentity< typename OutputDistributedArray::ValueType >(). See distributedExclusiveScan for the explanation of other parameters. Note that when end equals 0 (the default), it is set to input.getSize().
void TNL::Algorithms::distributedInclusiveScan | ( | const InputDistributedArray & | input, |
OutputDistributedArray & | output, | ||
typename InputDistributedArray::IndexType | begin, | ||
typename InputDistributedArray::IndexType | end, | ||
Reduction && | reduction, | ||
typename OutputDistributedArray::ValueType | identity ) |
Computes an inclusive scan (or prefix sum) of an input distributed array and stores it in an output distributed array.
Inclusive scan (or prefix sum) operation turns a sequence \(a_1, \ldots, a_n\) into a sequence \(s_1, \ldots, s_n\) defined as
\[ s_i = \sum_{j=1}^i a_j. \]
InputDistributedArray | type of the distributed array to be scanned |
OutputDistributedArray | type of the distributed array where the result is stored |
Reduction | type of the reduction functor |
input | input array |
output | output array |
begin | the first element in the array to be scanned |
end | the end of the interval to be scanned, i.e. the index one past the last scanned element |
reduction | functor implementing the reduction operation |
identity | is the identity element for the reduction operation, i.e. element which does not change the result of the reduction. |
The reduction functor takes the two values to be reduced and returns the result of combining them.
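The inclusive variant differs from the exclusive one only in that each output already contains the current element. The sequential host sketch below uses a running maximum to show that the reduction need not be a sum; inclusiveScanSketch and maxIdentity are hypothetical names, and the code illustrates the semantics only.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Identity for a running maximum: the lowest representable value of T.
template <typename T>
T maxIdentity()
{
    return std::numeric_limits<T>::lowest();
}

// s[i] combines a[0] .. a[i], so the current element is included.
template <typename T, typename Reduction>
std::vector<T> inclusiveScanSketch(const std::vector<T>& a, Reduction reduction, T identity)
{
    std::vector<T> s(a.size());
    T running = identity;
    for (std::size_t i = 0; i < a.size(); ++i) {
        running = reduction(running, a[i]);  // include a[i] first
        s[i] = running;
    }
    return s;
}
```

With a maximum reduction and maxIdentity<int>() as the identity, the input {3, 1, 4, 1, 5} gives {3, 3, 4, 4, 5}. With addition and identity 0, the input {1, 2, 3, 4} gives {1, 3, 6, 10}.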
void TNL::Algorithms::distributedInclusiveScan | ( | const InputDistributedArray & | input, |
OutputDistributedArray & | output, | ||
typename InputDistributedArray::IndexType | begin = 0, | ||
typename InputDistributedArray::IndexType | end = 0, | ||
Reduction && | reduction = TNL::Plus{} ) |
Overload of distributedInclusiveScan which uses a TNL functional object for reduction. TNL::Plus is used by default.
The identity element is taken as reduction.template getIdentity< typename OutputDistributedArray::ValueType >(). See distributedInclusiveScan for the explanation of other parameters. Note that when end equals 0 (the default), it is set to input.getSize().
void TNL::Algorithms::distributedInplaceExclusiveScan | ( | DistributedArray & | array, |
typename DistributedArray::IndexType | begin, | ||
typename DistributedArray::IndexType | end, | ||
Reduction && | reduction, | ||
typename DistributedArray::ValueType | identity ) |
Computes an exclusive scan (or prefix sum) of a distributed array in-place.
Exclusive scan (or prefix sum) operation turns a sequence \(a_1, \ldots, a_n\) into a sequence \(\sigma_1, \ldots, \sigma_n\) defined as
\[ \sigma_i = \sum_{j=1}^{i-1} a_j. \]
DistributedArray | type of the distributed array to be scanned |
Reduction | type of the reduction functor |
array | input array, the result of scan is stored in the same array |
begin | the first element in the array to be scanned |
end | the end of the interval to be scanned, i.e. the index one past the last scanned element |
reduction | functor implementing the reduction operation |
identity | is the identity element for the reduction operation, i.e. element which does not change the result of the reduction. |
The reduction functor takes the two values to be reduced and returns the result of combining them.
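One common way to scan an array that is split into blocks (for example, one block per rank) is a three-phase scheme: reduce each block, scan the block totals, then scan each block locally starting from its offset. The sketch below simulates the blocks with nested std::vector objects in a single process. It is an assumption that this textbook scheme resembles what happens internally; the page does not describe TNL's implementation, and blockedInplaceExclusiveScanSketch is a hypothetical name. The scheme relies on the reduction being associative.

```cpp
#include <cstddef>
#include <vector>

// Each inner vector stands for the local part of the array owned by one rank.
template <typename T, typename Reduction>
void blockedInplaceExclusiveScanSketch(std::vector<std::vector<T>>& blocks,
                                       Reduction reduction, T identity)
{
    // Phase 1: every block reduces its own elements to a single total.
    std::vector<T> offsets(blocks.size(), identity);
    for (std::size_t b = 0; b < blocks.size(); ++b)
        for (const T& x : blocks[b])
            offsets[b] = reduction(offsets[b], x);

    // Phase 2: an exclusive scan of the totals yields each block's offset.
    // In a real distributed run this is the step that needs communication.
    T running = identity;
    for (std::size_t b = 0; b < blocks.size(); ++b) {
        const T total = offsets[b];
        offsets[b] = running;
        running = reduction(running, total);
    }

    // Phase 3: local in-place exclusive scan seeded with the block's offset.
    for (std::size_t b = 0; b < blocks.size(); ++b) {
        T local = offsets[b];
        for (T& x : blocks[b]) {
            const T value = x;
            x = local;
            local = reduction(local, value);
        }
    }
}
```

For the blocks {1, 2}, {3}, {4, 5} with addition and identity 0, the result is {0, 1}, {3}, {6, 10}, which is the exclusive scan of 1, 2, 3, 4, 5 split at the same block boundaries.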
void TNL::Algorithms::distributedInplaceExclusiveScan | ( | DistributedArray & | array, |
typename DistributedArray::IndexType | begin = 0, | ||
typename DistributedArray::IndexType | end = 0, | ||
Reduction && | reduction = TNL::Plus{} ) |
Overload of distributedInplaceExclusiveScan which uses a TNL functional object for reduction. TNL::Plus is used by default.
The identity element is taken as reduction.template getIdentity< typename DistributedArray::ValueType >(). See distributedInplaceExclusiveScan for the explanation of other parameters. Note that when end equals 0 (the default), it is set to array.getSize().
void TNL::Algorithms::distributedInplaceInclusiveScan | ( | DistributedArray & | array, |
typename DistributedArray::IndexType | begin, | ||
typename DistributedArray::IndexType | end, | ||
Reduction && | reduction, | ||
typename DistributedArray::ValueType | identity ) |
Computes an inclusive scan (or prefix sum) of a distributed array in-place.
Inclusive scan (or prefix sum) operation turns a sequence \(a_1, \ldots, a_n\) into a sequence \(s_1, \ldots, s_n\) defined as
\[ s_i = \sum_{j=1}^i a_j. \]
DistributedArray | type of the distributed array to be scanned |
Reduction | type of the reduction functor |
array | input array, the result of scan is stored in the same array |
begin | the first element in the array to be scanned |
end | the end of the interval to be scanned, i.e. the index one past the last scanned element |
reduction | functor implementing the reduction operation |
identity | is the identity element for the reduction operation, i.e. element which does not change the result of the reduction. |
The reduction functor takes the two values to be reduced and returns the result of combining them.
void TNL::Algorithms::distributedInplaceInclusiveScan | ( | DistributedArray & | array, |
typename DistributedArray::IndexType | begin = 0, | ||
typename DistributedArray::IndexType | end = 0, | ||
Reduction && | reduction = TNL::Plus{} ) |
Overload of distributedInplaceInclusiveScan which uses a TNL functional object for reduction. TNL::Plus is used by default.
The identity element is taken as reduction.template getIdentity< typename DistributedArray::ValueType >(). See distributedInplaceInclusiveScan for the explanation of other parameters. Note that when end equals 0 (the default), it is set to array.getSize().
bool TNL::Algorithms::equal | ( | DestinationElement * | destination, |
const SourceElement * | source, | ||
Index | size ) |
Compares memory from source with destination.
The source data is allocated on the device specified by SourceDevice and the destination data is allocated on the device specified by DestinationDevice.
DestinationDevice | is the device where the destination data is allocated. |
SourceDevice | is the device where the source data is allocated. |
DestinationElement | is the type of the destination data. |
SourceElement | is the type of the source data. |
Index | is the type of the size of the data. |
destination | is the pointer to the destination data. |
source | is the pointer to the source data. |
size | is the size of the data. |
Returns true if all elements are equal, false otherwise.
void TNL::Algorithms::exclusiveScan | ( | const InputArray & | input, |
OutputArray & | output, | ||
typename InputArray::IndexType | begin, | ||
typename InputArray::IndexType | end, | ||
typename OutputArray::IndexType | outputBegin, | ||
Reduction && | reduction, | ||
typename OutputArray::ValueType | identity ) |
Computes an exclusive scan (or prefix sum) of an input array and stores it in an output array.
Exclusive scan (or prefix sum) operation turns a sequence \(a_1, \ldots, a_n\) into a sequence \(\sigma_1, \ldots, \sigma_n\) defined as
\[ \sigma_i = \sum_{j=1}^{i-1} a_j. \]
InputArray | type of the array to be scanned |
OutputArray | type of the output array |
Reduction | type of the reduction functor |
input | the input array to be scanned |
output | the array where the result will be stored |
begin | the first element in the array to be scanned |
end | the end of the interval to be scanned, i.e. the index one past the last scanned element |
outputBegin | the first element in the output array to be written. There must be at least end - begin elements in the output array starting at the position given by outputBegin. |
reduction | functor implementing the reduction operation |
identity | is the identity element for the reduction operation, i.e. element which does not change the result of the reduction. |
The reduction functor takes the two values to be reduced and returns the result of combining them.
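How begin, end and outputBegin interact can be seen in a sequential host sketch. The name exclusiveScanRangeSketch is hypothetical, std::vector replaces the TNL containers, and the assertions encode the requirement that the output holds at least end - begin elements starting at outputBegin.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scans input[begin .. end) and writes the result to output[outputBegin ..).
template <typename T, typename Reduction>
void exclusiveScanRangeSketch(const std::vector<T>& input, std::vector<T>& output,
                              std::size_t begin, std::size_t end, std::size_t outputBegin,
                              Reduction reduction, T identity)
{
    assert(begin <= end && end <= input.size());
    assert(outputBegin + (end - begin) <= output.size());
    T running = identity;
    for (std::size_t i = begin; i < end; ++i) {
        output[outputBegin + (i - begin)] = running;  // excludes input[i]
        running = reduction(running, input[i]);
    }
}
```

With input {10, 1, 2, 3, 20}, an output of five elements initialised to -1, begin = 1, end = 4 and outputBegin = 2, an additive scan leaves the output as {-1, -1, 0, 1, 3}: only three elements are written, and elements outside the target window are untouched.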
void TNL::Algorithms::exclusiveScan | ( | const InputArray & | input, |
OutputArray & | output, | ||
typename InputArray::IndexType | begin = 0, | ||
typename InputArray::IndexType | end = 0, | ||
typename OutputArray::IndexType | outputBegin = 0, | ||
Reduction && | reduction = TNL::Plus{} ) |
Overload of exclusiveScan which uses a TNL functional object for reduction. TNL::Plus is used by default.
The identity element is taken as reduction.template getIdentity< typename OutputArray::ValueType >(). See exclusiveScan for the explanation of other parameters. Note that when end equals 0 (the default), it is set to input.getSize().
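The idea of a functional object that supplies its own identity through a getIdentity member template can be sketched as follows. PlusSketch, MaxSketch and exclusiveScanWithFunctorSketch are hypothetical stand-ins written for this illustration; they are not the TNL functional classes, and only the getIdentity call shape is taken from the text above.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-ins for functional objects; the names are not TNL's.
struct PlusSketch {
    template <typename T>
    constexpr T operator()(const T& a, const T& b) const { return a + b; }
    template <typename T>
    static constexpr T getIdentity() { return T(0); }
};

struct MaxSketch {
    template <typename T>
    constexpr T operator()(const T& a, const T& b) const { return a < b ? b : a; }
    template <typename T>
    static constexpr T getIdentity() { return std::numeric_limits<T>::lowest(); }
};

// Overload pattern: the identity comes from the functor, not from the caller.
template <typename T, typename Reduction = PlusSketch>
std::vector<T> exclusiveScanWithFunctorSketch(const std::vector<T>& a,
                                              Reduction reduction = Reduction{})
{
    const T identity = reduction.template getIdentity<T>();
    std::vector<T> sigma(a.size());
    T running = identity;
    for (std::size_t i = 0; i < a.size(); ++i) {
        sigma[i] = running;
        running = reduction(running, a[i]);
    }
    return sigma;
}
```

Called without a functor, the sketch sums: {1, 2, 3} gives {0, 1, 3}. Called with MaxSketch{}, the input {2, 5, 1} gives {lowest, 2, 5}, where lowest is std::numeric_limits<int>::lowest(). A plain lambda cannot be used with this overload because it has no getIdentity member, which is why the other overload takes the identity explicitly.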
void TNL::Algorithms::fill | ( | Element * | data, |
const Element & | value, | ||
Index | size ) |
Fills memory between data and data + size with the given value.
Device | is the device where the data is allocated. |
Element | is the type of the data. |
Index | is the type of the size of the data. |
data | is the pointer to the memory where the value will be set. |
value | is the value to be filled. |
size | is the size of the data. |
void TNL::Algorithms::fillRandom | ( | Element * | data, |
Index | size, | ||
Element | min_val, | ||
Element | max_val ) |
Fills memory between data and data + size with random Element values in the given range.
Device | is the device where the data is allocated. |
Element | is the type of the data. |
Index | is the type of the size of the data. |
data | is the pointer to the memory where the random values will be set. |
size | is the size of the data. |
min_val | is the minimum random value |
max_val | is the maximum random value |
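A host-only sketch of filling a buffer with random values in a given range is shown below, using the standard <random> facilities. The name fillRandomSketch and the seed parameter are hypothetical. Whether max_val itself can be produced is not stated on this page; in the sketch, integral types use a closed interval and floating-point types a half-open one, which is simply the behaviour of the standard distributions.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <type_traits>

// Host-only sketch; the seed parameter is an addition made for reproducibility.
template <typename Element>
void fillRandomSketch(Element* data, std::size_t size,
                      Element min_val, Element max_val, unsigned seed = 42)
{
    assert(min_val <= max_val);
    // Pick the standard distribution that matches the element type.
    using Distribution = typename std::conditional<
        std::is_integral<Element>::value,
        std::uniform_int_distribution<Element>,     // closed interval [min, max]
        std::uniform_real_distribution<Element>     // half-open interval [min, max)
    >::type;
    std::mt19937 engine(seed);
    Distribution distribution(min_val, max_val);
    for (std::size_t i = 0; i < size; ++i)
        data[i] = distribution(engine);
}
```

For example, fillRandomSketch(buffer, 64, 3, 5) on an int buffer produces only the values 3, 4 and 5.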
std::pair< bool, typename Container::IndexType > TNL::Algorithms::find | ( | const Container & | container, |
const ValueType & | value ) |
Find the first occurrence of a value in an array.
Container | is the type of the container. |
ValueType | is the type of the value to be found. |
IndexType | is the type used for indexing. |
container | is the array where the value is searched. |
value | is the value to be found. |
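The (found, position) result of find can be illustrated with a sequential host sketch. The name findSketch is hypothetical and std::vector replaces the TNL container. The position reported when nothing is found is not specified on this page, so the sketch returns 0 in that case as an assumption; callers should test the boolean first.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Returns (true, index of first match) or (false, 0) when there is no match.
template <typename T>
std::pair<bool, std::size_t> findSketch(const std::vector<T>& container, const T& value)
{
    for (std::size_t i = 0; i < container.size(); ++i)
        if (container[i] == value)
            return { true, i };
    // The position is meaningless here; 0 is an assumption of this sketch.
    return { false, 0 };
}
```

For the vector {4, 7, 7, 1}, searching for 7 yields (true, 1) because only the first occurrence is reported, and searching for 9 yields a pair whose first member is false.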
Returns a pair (found, position) where found is a boolean indicating if the value was found and position is the position of the first occurrence in the container.
void TNL::Algorithms::inclusiveScan | ( | const InputArray & | input, |
OutputArray & | output, | ||
typename InputArray::IndexType | begin, | ||
typename InputArray::IndexType | end, | ||
typename OutputArray::IndexType | outputBegin, | ||
Reduction && | reduction, | ||
typename OutputArray::ValueType | identity ) |
Computes an inclusive scan (or prefix sum) of an input array and stores it in an output array.
Inclusive scan (or prefix sum) operation turns a sequence \(a_1, \ldots, a_n\) into a sequence \(s_1, \ldots, s_n\) defined as
\[ s_i = \sum_{j=1}^i a_j. \]
InputArray | type of the array to be scanned |
OutputArray | type of the output array |
Reduction | type of the reduction functor |
input | the input array to be scanned |
output | the array where the result will be stored |
begin | the first element in the array to be scanned |
end | the last element in the array to be scanned |
outputBegin | the first element in the output array to be written. There must be at least end - begin elements in the output array starting at the position given by outputBegin . |
reduction | functor implementing the reduction operation |
identity | is the identity element for the reduction operation, i.e. element which does not change the result of the reduction. |
The reduction functor takes two variables to be reduced:
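As a minimal sketch of these semantics in plain C++ (standard library only, not the TNL implementation), the following sequential reference applies a two-argument reduction functor over a range and writes the running result starting at outputBegin. The name inclusiveScanSketch is hypothetical, and the range is assumed to be half-open, [begin, end).

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Sequential reference sketch, NOT the TNL implementation.
// inclusiveScanSketch is a hypothetical name; [begin, end) is assumed half-open.
template <typename T, typename Reduction>
void inclusiveScanSketch(const std::vector<T>& input, std::vector<T>& output,
                         std::size_t begin, std::size_t end,
                         std::size_t outputBegin, Reduction reduction, T identity)
{
    assert(begin <= end && end <= input.size());
    assert(outputBegin + (end - begin) <= output.size());
    T running = identity;  // the identity does not change the first result
    for (std::size_t i = begin; i < end; ++i) {
        running = reduction(running, input[i]);       // s_i = s_{i-1} (+) a_i
        output[outputBegin + (i - begin)] = running;  // element i itself is included
    }
}

// Prefix sums of {1, 2, 3, 4} computed with std::plus and identity 0.
inline std::vector<int> inclusiveScanDemo()
{
    const std::vector<int> input{1, 2, 3, 4};
    std::vector<int> output(input.size(), 0);
    inclusiveScanSketch(input, output, 0, input.size(), 0, std::plus<int>{}, 0);
    return output;
}
```

Here the reduction functor is simply a callable taking two values of the element type and returning their combination.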
void TNL::Algorithms::inclusiveScan | ( | const InputArray & | input, |
OutputArray & | output, | ||
typename InputArray::IndexType | begin = 0, | ||
typename InputArray::IndexType | end = 0, | ||
typename OutputArray::IndexType | outputBegin = 0, | ||
Reduction && | reduction = TNL::Plus{} ) |
Overload of inclusiveScan which uses a TNL functional object for reduction. TNL::Plus is used by default.
The identity element is taken as reduction.template getIdentity< typename OutputArray::ValueType >(). See inclusiveScan for the explanation of other parameters. Note that when end equals 0 (the default), it is set to input.getSize().
void TNL::Algorithms::inplaceExclusiveScan | ( | Array & | array, |
typename Array::IndexType | begin, | ||
typename Array::IndexType | end, | ||
Reduction && | reduction, | ||
typename Array::ValueType | identity ) |
Computes an exclusive scan (or prefix sum) of an array in-place.
Exclusive scan (or prefix sum) operation turns a sequence \(a_1, \ldots, a_n\) into a sequence \(\sigma_1, \ldots, \sigma_n\) defined as
\[ \sigma_i = \sum_{j=1}^{i-1} a_j. \]
Array | type of the array to be scanned |
Reduction | type of the reduction functor |
array | input array, the result of scan is stored in the same array |
begin | the first element in the array to be scanned |
end | the last element in the array to be scanned |
reduction | functor implementing the reduction operation |
identity | is the identity element for the reduction operation, i.e. element which does not change the result of the reduction. |
The reduction functor takes two variables to be reduced:
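The difference from the inclusive variant can be sketched in plain C++ as follows. This is a sequential illustration only, not the TNL implementation; inplaceExclusiveScanSketch is a hypothetical name and the range is assumed to be half-open, [begin, end).

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Sequential reference sketch, NOT the TNL implementation.
// Each element is replaced by the reduction of all elements before it.
template <typename T, typename Reduction>
void inplaceExclusiveScanSketch(std::vector<T>& array, std::size_t begin,
                                std::size_t end, Reduction reduction, T identity)
{
    assert(begin <= end && end <= array.size());
    T running = identity;
    for (std::size_t i = begin; i < end; ++i) {
        const T current = array[i];              // keep the input value before overwriting
        array[i] = running;                      // sigma_i excludes a_i itself
        running = reduction(running, current);
    }
}

// Exclusive prefix sums of {1, 2, 3, 4}: the first result is the identity.
inline std::vector<int> exclusiveScanDemo()
{
    std::vector<int> array{1, 2, 3, 4};
    inplaceExclusiveScanSketch(array, 0, array.size(), std::plus<int>{}, 0);
    return array;
}
```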
void TNL::Algorithms::inplaceExclusiveScan | ( | Array & | array, |
typename Array::IndexType | begin = 0, | ||
typename Array::IndexType | end = 0, | ||
Reduction && | reduction = TNL::Plus{} ) |
Overload of inplaceExclusiveScan which uses a TNL functional object for reduction. TNL::Plus is used by default.
The identity element is taken as reduction.template getIdentity< typename Array::ValueType >(). See inplaceExclusiveScan for the explanation of other parameters. Note that when end equals 0 (the default), it is set to array.getSize().
void TNL::Algorithms::inplaceInclusiveScan | ( | Array & | array, |
typename Array::IndexType | begin, | ||
typename Array::IndexType | end, | ||
Reduction && | reduction, | ||
typename Array::ValueType | identity ) |
Computes an inclusive scan (or prefix sum) of an array in-place.
Inclusive scan (or prefix sum) operation turns a sequence \(a_1, \ldots, a_n\) into a sequence \(s_1, \ldots, s_n\) defined as
\[ s_i = \sum_{j=1}^i a_j. \]
Array | type of the array to be scanned |
Reduction | type of the reduction functor |
array | input array, the result of scan is stored in the same array |
begin | the first element in the array to be scanned |
end | the last element in the array to be scanned |
reduction | functor implementing the reduction operation |
identity | is the identity element for the reduction operation, i.e. element which does not change the result of the reduction. |
The reduction functor takes two variables to be reduced:
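The role of the identity element becomes visible with a reduction other than addition. The sketch below (plain C++, not the TNL implementation, with the hypothetical name inplaceInclusiveScanSketch) computes a running maximum in place, using the lowest representable value as the identity of the maximum operation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Sequential reference sketch, NOT the TNL implementation.
template <typename T, typename Reduction>
void inplaceInclusiveScanSketch(std::vector<T>& array, std::size_t begin,
                                std::size_t end, Reduction reduction, T identity)
{
    assert(begin <= end && end <= array.size());
    T running = identity;
    for (std::size_t i = begin; i < end; ++i) {
        running = reduction(running, array[i]);
        array[i] = running;  // overwrite the input with the running result
    }
}

// Running maximum: max(lowest, x) == x, so lowest() acts as the identity.
inline std::vector<int> runningMaxDemo()
{
    std::vector<int> array{3, 1, 4, 1, 5};
    auto maximum = [](int a, int b) { return std::max(a, b); };
    inplaceInclusiveScanSketch(array, 0, array.size(), maximum,
                               std::numeric_limits<int>::lowest());
    return array;
}
```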
void TNL::Algorithms::inplaceInclusiveScan | ( | Array & | array, |
typename Array::IndexType | begin = 0, | ||
typename Array::IndexType | end = 0, | ||
Reduction && | reduction = TNL::Plus{} ) |
Overload of inplaceInclusiveScan which uses a TNL functional object for reduction. TNL::Plus is used by default.
The identity element is taken as reduction.template getIdentity< typename Array::ValueType >(). See inplaceInclusiveScan for the explanation of other parameters. Note that when end equals 0 (the default), it is set to array.getSize().
bool TNL::Algorithms::isAscending | ( | const Array & | arr | ) |
Function returning true if the array elements are sorted in ascending order.
Array | is the type of array/vector. It can be, for example, TNL::Containers::Array, TNL::Containers::ArrayView, TNL::Containers::Vector, TNL::Containers::VectorView. |
arr | is an instance of tested array. |
bool TNL::Algorithms::isDescending | ( | const Array & | arr | ) |
Function returning true if the array elements are sorted in descending order.
Array | is the type of array/vector. It can be, for example, TNL::Containers::Array, TNL::Containers::ArrayView, TNL::Containers::Vector, TNL::Containers::VectorView. |
arr | is an instance of tested array. |
bool TNL::Algorithms::isSorted | ( | const Array & | arr, |
const Compare & | compare ) |
Function returning true if the array elements are sorted according to the lambda function comparison.
Array | is the type of array/vector. It can be, for example, TNL::Containers::Array, TNL::Containers::ArrayView, TNL::Containers::Vector, TNL::Containers::VectorView. |
Compare | is a lambda function for comparing two elements. It returns true if the first argument should be ordered before the second - both are given by indices representing their positions. The lambda function is supposed to be defined as follows: |
arr | is an instance of tested array. |
compare | is an instance of the lambda function for elements comparison. |
std::enable_if_t< std::is_integral_v< Begin > &&std::is_integral_v< End > > TNL::Algorithms::parallelFor | ( | const Begin & | begin, |
const End & | end, | ||
typename Device::LaunchConfiguration | launch_config, | ||
Function | f, | ||
FunctionArgs... | args ) |
Parallel for-loop function for 1D range specified with integral values.
Device | is a type of the device where the loop will be executed. |
Begin | must be an integral type. |
End | must be an integral type. |
begin | is the left bound of the iteration range [begin, end) . |
end | is the right bound of the iteration range [begin, end) . |
f | is the function to be called in each iteration. Arguments of the function are the iteration index and arguments from the args... variadic pack. |
launch_config | specifies kernel launch parameters. |
args | are additional parameters to be passed to the function f. |
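The calling convention of f can be illustrated with a sequential stand-in written in plain C++. This is only a sketch of the iteration semantics: forEachIndexSketch is a hypothetical name, nothing runs in parallel here, and the launch_config parameter is omitted.

```cpp
#include <vector>

// Sequential stand-in for the 1D loop semantics, NOT the TNL implementation.
// f is called as f(i, args...) for every i in the half-open range [begin, end).
template <typename Index, typename Function, typename... FunctionArgs>
void forEachIndexSketch(Index begin, Index end, Function f, FunctionArgs... args)
{
    for (Index i = begin; i < end; ++i)
        f(i, args...);
}

// Each iteration writes only its own element, so the iterations are independent,
// which is the property a parallel execution would rely on.
inline std::vector<int> squaresDemo()
{
    std::vector<int> v(5, 0);
    forEachIndexSketch(0, 5, [](int i, int* data) { data[i] = i * i; }, v.data());
    return v;
}
```

The raw pointer passed through args mirrors the common pattern of handing the loop body a lightweight handle to the data instead of the container itself.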
std::enable_if_t< IsStaticArrayType< Begin >::value &&IsStaticArrayType< End >::value > TNL::Algorithms::parallelFor | ( | const Begin & | begin, |
const End & | end, | ||
typename Device::LaunchConfiguration | launch_config, | ||
Function | f, | ||
FunctionArgs... | args ) |
Parallel for-loop function for range specified with multi-index values.
Device | is a type of the device where the loop will be executed. |
Begin | must satisfy the constraints checked by the TNL::IsStaticArrayType type trait. |
End | must satisfy the constraints checked by the TNL::IsStaticArrayType type trait. |
begin | is the left bound of the iteration range [begin, end) . |
end | is the right bound of the iteration range [begin, end) . |
f | is the function to be called in each iteration. Arguments of the function are the iteration multi-index, which is an instance of the End type, and arguments from the args... variadic pack. |
launch_config | specifies kernel launch parameters. |
args | are additional parameters to be passed to the function f. |
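For the multi-index case, a two-dimensional sequential stand-in in plain C++ might look as follows. std::array plays the role of the static array type here, forEachMultiIndex2DSketch is a hypothetical name, and the order in which the indices are visited is an arbitrary choice of this sketch.

```cpp
#include <array>

// Sequential 2D stand-in, NOT the TNL implementation.
// f is called as f(idx, args...) for every idx with begin[k] <= idx[k] < end[k].
template <typename Function, typename... FunctionArgs>
void forEachMultiIndex2DSketch(const std::array<int, 2>& begin,
                               const std::array<int, 2>& end,
                               Function f, FunctionArgs... args)
{
    for (int j = begin[1]; j < end[1]; ++j)
        for (int i = begin[0]; i < end[0]; ++i)
            f(std::array<int, 2>{i, j}, args...);
}

// Visits the 2 x 3 index block and accumulates 10 * idx[0] + idx[1].
// The sum does not depend on the visiting order.
inline int multiIndexSumDemo()
{
    const std::array<int, 2> begin{0, 0};
    const std::array<int, 2> end{2, 3};
    int sum = 0;
    forEachMultiIndex2DSketch(
        begin, end,
        [](const std::array<int, 2>& idx, int* acc) { *acc += 10 * idx[0] + idx[1]; },
        &sum);
    return sum;
}
```

Accumulating into a shared variable is safe only because this sketch is sequential; a parallel loop body would need independent writes or atomic operations.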
auto TNL::Algorithms::reduce | ( | const Array & | array, |
Reduction && | reduction, | ||
Result | identity ) |
Variant of reduce for arrays, views and compatible objects.
The referenced reduce function is called with:
Device, which is typename Array::DeviceType by default, as the Device type,
0 as the beginning of the interval for reduction,
array.getSize() as the end of the interval for reduction,
array.getConstView() as the fetch functor,
reduction as the reduction operation,
identity as the identity element of the reduction.
auto TNL::Algorithms::reduce | ( | const Array & | array, |
Reduction && | reduction = TNL::Plus{} ) |
Variant of reduce for arrays, views and compatible objects.
Reduction can be one of the following TNL::Plus, TNL::Multiplies, TNL::Min, TNL::Max, TNL::LogicalAnd, TNL::LogicalOr, TNL::BitAnd or TNL::BitOr. TNL::Plus is used by default.
The referenced reduce function is called with:
Device, which is typename Array::DeviceType by default, as the Device type,
0 as the beginning of the interval for reduction,
array.getSize() as the end of the interval for reduction,
array.getConstView() as the fetch functor,
reduction as the reduction operation.
Result TNL::Algorithms::reduce | ( | Index | begin, |
Index | end, | ||
Fetch && | fetch, | ||
Reduction && | reduction, | ||
const Result & | identity ) |
reduce implements (parallel) reduction for vectors and arrays.
Reduction can be used for operations that take the elements of one or more vectors (or arrays) as input and return one number (or element) as output. Examples of such operations are comparison of vectors/arrays, vector norm, scalar product of two vectors, or computing the minimum or maximum. If the position of the smallest or the largest element is needed as well, the function reduceWithArgument can be used.
Device | is a type of the device where the reduction will be performed. |
Index | is a type for indexing. |
Result | is a type of the reduction result. |
Fetch | is a lambda function for fetching the input data. |
Reduction | is a lambda function performing the reduction. |
Device can be one of the following: TNL::Devices::Sequential, TNL::Devices::Host and TNL::Devices::Cuda.
begin | defines range [begin, end) of indexes which will be used for the reduction. |
end | defines range [begin, end) of indexes which will be used for the reduction. |
fetch | is a lambda function fetching the input data. |
reduction | is a lambda function defining the reduction operation. |
identity | is the identity element for the reduction operation, i.e. element which does not change the result of the reduction. |
The fetch lambda function takes one argument which is the index of the element to be fetched:
The reduction lambda function takes two variables which are supposed to be reduced:
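How the fetch and reduction callables cooperate can be sketched with a sequential reference in plain C++. This is not the TNL implementation: reduceSketch is a hypothetical name and the Device parameter is left out. The example computes a scalar product, one of the operations mentioned above.

```cpp
#include <cstddef>
#include <vector>

// Sequential reference sketch, NOT the TNL implementation.
// fetch(i) produces the i-th input value, reduction(a, b) combines two values.
template <typename Index, typename Fetch, typename Reduction, typename Result>
Result reduceSketch(Index begin, Index end, Fetch fetch, Reduction reduction,
                    const Result& identity)
{
    Result result = identity;
    for (Index i = begin; i < end; ++i)
        result = reduction(result, fetch(i));
    return result;
}

// Scalar product of u and v: fetch multiplies the pair, reduction adds.
inline double scalarProductDemo()
{
    const std::vector<double> u{1.0, 2.0, 3.0};
    const std::vector<double> v{4.0, 5.0, 6.0};
    auto fetch = [&](std::size_t i) { return u[i] * v[i]; };
    auto reduction = [](double a, double b) { return a + b; };
    return reduceSketch(std::size_t{0}, u.size(), fetch, reduction, 0.0);
}
```

Because the input is accessed only through fetch, the same reduction can combine several arrays without creating a temporary for the element-wise products.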
auto TNL::Algorithms::reduce | ( | Index | begin, |
Index | end, | ||
Fetch && | fetch, | ||
Reduction && | reduction = TNL::Plus{} ) |
Variant of reduce with functional instead of reduction lambda function.
Device | is a type of the device where the reduction will be performed. |
Index | is a type for indexing. |
Fetch | is a lambda function for fetching the input data. |
Reduction | is a functional performing the reduction. |
Device can be one of the following: TNL::Devices::Sequential, TNL::Devices::Host and TNL::Devices::Cuda.
Reduction can be one of the following: TNL::Plus, TNL::Multiplies, TNL::Min, TNL::Max, TNL::LogicalAnd, TNL::LogicalOr, TNL::BitAnd or TNL::BitOr. TNL::Plus is used by default.
begin | defines range [begin, end) of indexes which will be used for the reduction. |
end | defines range [begin, end) of indexes which will be used for the reduction. |
fetch | is a lambda function fetching the input data. |
reduction | is a functional defining the reduction operation. |
The fetch lambda function takes one argument which is the index of the element to be fetched:
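The idea of a functional that carries its own identity element can be sketched in plain C++ as follows. PlusSketch and reduceWithFunctionalSketch are hypothetical names that only imitate the getIdentity pattern mentioned for the scan overloads; they are not the TNL types.

```cpp
// Hypothetical stand-in for a functional object such as a "plus" operation.
// It bundles the operation with its identity element.
struct PlusSketch {
    template <typename T>
    static constexpr T getIdentity() { return T(0); }

    template <typename T>
    constexpr T operator()(const T& a, const T& b) const { return a + b; }
};

// Sequential reference sketch, NOT the TNL implementation.
// The result type is deduced from fetch and no identity argument is needed.
template <typename Index, typename Fetch, typename Reduction = PlusSketch>
auto reduceWithFunctionalSketch(Index begin, Index end, Fetch fetch,
                                Reduction reduction = Reduction{})
{
    using Result = decltype(fetch(begin));
    Result result = reduction.template getIdentity<Result>();
    for (Index i = begin; i < end; ++i)
        result = reduction(result, fetch(i));
    return result;
}

// Sum of 1 + 2 + 3 + 4 + 5 using the default functional.
inline int sumDemo()
{
    return reduceWithFunctionalSketch(0, 5, [](int i) { return i + 1; });
}
```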
auto TNL::Algorithms::reduceWithArgument | ( | const Array & | array, |
Reduction && | reduction ) |
Variant of reduceWithArgument for arrays, views and compatible objects.
Reduction can be one of TNL::MinWithArg, TNL::MaxWithArg.
The referenced reduceWithArgument function is called with:
Device, which is typename Array::DeviceType by default, as the Device type,
0 as the beginning of the interval for reduction,
array.getSize() as the end of the interval for reduction,
array.getConstView() as the fetch functor,
reduction as the reduction operation.
auto TNL::Algorithms::reduceWithArgument | ( | const Array & | array, |
Reduction && | reduction, | ||
Result | identity ) |
Variant of reduceWithArgument for arrays, views and compatible objects.
The referenced reduceWithArgument function is called with:
Device, which is typename Array::DeviceType by default, as the Device type,
0 as the beginning of the interval for reduction,
array.getSize() as the end of the interval for reduction,
array.getConstView() as the fetch functor,
reduction as the reduction operation,
identity as the identity element of the reduction.
auto TNL::Algorithms::reduceWithArgument | ( | Index | begin, |
Index | end, | ||
Fetch && | fetch, | ||
Reduction && | reduction ) |
Variant of reduceWithArgument with functional instead of reduction lambda function.
Device | is a type of the device where the reduction will be performed. |
Index | is a type for indexing. |
Result | is a type of the reduction result. |
Reduction | is a functional performing the reduction. |
Fetch | is a lambda function for fetching the input data. |
Device can be one of the following: TNL::Devices::Sequential, TNL::Devices::Host and TNL::Devices::Cuda.
Reduction can be one of TNL::MinWithArg, TNL::MaxWithArg.
begin | defines range [begin, end) of indexes which will be used for the reduction. |
end | defines range [begin, end) of indexes which will be used for the reduction. |
fetch | is a lambda function fetching the input data. |
reduction | is a functional defining the reduction operation and managing the element positions. |
The result is returned as a pair: pair.first is the reduction result and pair.second is the element position.
The fetch lambda function takes one argument which is the index of the element to be fetched:
The reduction lambda function takes two variables which are supposed to be reduced:
std::pair< Result, Index > TNL::Algorithms::reduceWithArgument | ( | Index | begin, |
Index | end, | ||
Fetch && | fetch, | ||
Reduction && | reduction, | ||
const Result & | identity ) |
Variant of reduce returning also the position of the element of interest.
For example, in case of computing minimal or maximal element in array/vector, the position of the element having given value can be obtained. This method is, however, more flexible.
Device | is a type of the device where the reduction will be performed. |
Index | is a type for indexing. |
Result | is a type of the reduction result. |
Reduction | is a lambda function performing the reduction. |
Fetch | is a lambda function for fetching the input data. |
Device can be one of the following: TNL::Devices::Sequential, TNL::Devices::Host and TNL::Devices::Cuda.
begin | defines range [begin, end) of indexes which will be used for the reduction. |
end | defines range [begin, end) of indexes which will be used for the reduction. |
fetch | is a lambda function fetching the input data. |
reduction | is a lambda function defining the reduction operation and managing the element positions. |
identity | is the identity element for the reduction operation, i.e. element which does not change the result of the reduction. |
The result is returned as a std::pair< Result, Index >: pair.first is the reduction result and pair.second is the element position.
The fetch lambda function takes one argument which is the index of the element to be fetched:
The reduction lambda function takes two variables which are supposed to be reduced:
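A reduction that also tracks the position can be sketched in plain C++ as below. This is a sequential illustration, not the TNL implementation. The name reduceWithArgumentSketch and the calling convention of the reduction callable (it updates the running value and position in place) are assumptions of this sketch.

```cpp
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

// Sequential reference sketch, NOT the TNL implementation.
// Assumed convention: reduction(a, b, aIdx, bIdx) merges candidate (b, bIdx)
// into the running pair (a, aIdx) in place.
template <typename Index, typename Fetch, typename Reduction, typename Result>
std::pair<Result, Index> reduceWithArgumentSketch(Index begin, Index end, Fetch fetch,
                                                  Reduction reduction,
                                                  const Result& identity)
{
    Result value = identity;
    Index position = end;  // `end` marks "no element selected yet"
    for (Index i = begin; i < end; ++i)
        reduction(value, fetch(i), position, i);
    return {value, position};
}

// Minimum of the array together with the position of its first occurrence.
inline std::pair<double, std::size_t> minWithPositionDemo()
{
    const std::vector<double> data{4.0, 1.5, 3.0, 1.5};
    auto fetch = [&](std::size_t i) { return data[i]; };
    auto reduction = [](double& a, const double& b,
                        std::size_t& aIdx, const std::size_t& bIdx) {
        // Prefer the smaller value; on ties prefer the smaller index.
        if (b < a || (b == a && bIdx < aIdx)) {
            a = b;
            aIdx = bIdx;
        }
    };
    return reduceWithArgumentSketch(std::size_t{0}, data.size(), fetch, reduction,
                                    std::numeric_limits<double>::max());
}
```

The tie-breaking rule matters for parallel execution, where partial results may be merged in any order; without it the reported position would not be deterministic.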
void TNL::Algorithms::sort | ( | Array & | array, |
const Compare & | compare, | ||
const Sorter & | sorter = Sorter{} ) |
Function for sorting elements of array or vector based on a user defined comparison lambda function.
Array | is a type of container to be sorted. It can be, for example, TNL::Containers::Array, TNL::Containers::ArrayView, TNL::Containers::Vector, TNL::Containers::VectorView. |
Compare | is a lambda function for comparing two elements. It returns true if the first argument should be ordered before the second. The lambda function is supposed to be defined as follows (ValueType is the type of the array elements): auto compare = [] __cuda_callable__ ( const ValueType& a , const ValueType& b ) -> bool { return .... }; |
Sorter | is an algorithm for sorting. It can be TNL::Algorithms::Sorting::STLSort for sorting on host and TNL::Algorithms::Sorting::Quicksort or TNL::Algorithms::Sorting::BitonicSort for sorting on CUDA GPU. |
array | is an instance of array/array view/vector/vector view for sorting. |
compare | is an instance of the lambda function for comparison of two elements. |
sorter | is an instance of sorter. |
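The meaning of the comparison callable can be illustrated with the standard library. The sketch below uses std::sort on the host rather than any TNL sorter, and sortDescendingDemo is a hypothetical name; the comparison returns true when its first argument should be ordered before the second.

```cpp
#include <algorithm>
#include <vector>

// Host-only illustration with std::sort, NOT a TNL sorter.
inline std::vector<int> sortDescendingDemo()
{
    std::vector<int> array{3, 1, 4, 1, 5};
    // Returns true if a should be ordered before b: larger values come first.
    auto compare = [](const int& a, const int& b) -> bool { return a > b; };
    std::sort(array.begin(), array.end(), compare);
    return array;
}
```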
void TNL::Algorithms::sort | ( | const Index | begin, |
const Index | end, | ||
Compare && | compare, | ||
Swap && | swap, | ||
const Sorter & | sorter = Sorter{} ) |
Function for general sorting based on lambda functions for comparison and swapping of two elements.
Device | is device on which the sorting algorithms should be executed. |
Index | is type used for indexing of the sorted data. |
Compare | is a lambda function for comparing two elements. It returns true if the first argument should be ordered before the second - both are given by indices representing their positions. The lambda function is supposed to be defined as follows: |
Swap | is a lambda function for swapping two elements which are ordered the wrong way. Both elements are represented by indices as well. |
Sorter | is an algorithm for sorting. It can be TNL::Algorithms::Sorting::BitonicSort for sorting on CUDA GPU. Currently there is no algorithm for CPU :(. |
begin | is the first index of the range [begin, end) to be sorted. |
end | is the end index of the range [begin, end) to be sorted. |
compare | is an instance of the lambda function for comparison of two elements. |
swap | is an instance of the lambda function for swapping of two elements. |
sorter | is an instance of sorter. |
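Sorting through index-based callables is useful when the data is spread over several arrays. The following plain C++ sketch uses a simple bubble sort as a stand-in for the sorter (it is not one of the TNL algorithms, and indexSortSketch is a hypothetical name) and reorders a second array along with the keys.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Bubble sort driven only by index-based callables, NOT a TNL sorter.
// compare(i, j) is true if the element at position i belongs before the one at j.
// swap(i, j) exchanges the elements at positions i and j.
template <typename Index, typename Compare, typename Swap>
void indexSortSketch(Index begin, Index end, Compare compare, Swap swap)
{
    for (Index pass = begin; pass < end; ++pass)
        for (Index i = begin; i + 1 < end; ++i)
            if (compare(i + 1, i))  // neighbours are in the wrong order
                swap(i, i + 1);
}

// Sorts keys in ascending order and carries the labels along.
inline std::string sortByKeyDemo()
{
    std::vector<int> keys{3, 1, 2};
    std::string labels = "cab";
    auto compare = [&](std::size_t a, std::size_t b) { return keys[a] < keys[b]; };
    auto swap = [&](std::size_t a, std::size_t b) {
        std::swap(keys[a], keys[b]);
        std::swap(labels[a], labels[b]);
    };
    indexSortSketch(std::size_t{0}, keys.size(), compare, swap);
    return labels;
}
```

The sorting routine never touches the data directly, so the callables decide what a key is and what has to move together with it.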
|
constexpr |
Generic loop with constant bounds and indices usable in constant expressions.
staticFor is a generic C++17 implementation of a static for-loop using constexpr functions and template metaprogramming. It is equivalent to executing a function f(i, args...) for arguments i from the integral range [begin, end), but with the type std::integral_constant rather than int or std::size_t representing the indices. Hence, each index has its own distinct C++ type and the value of the index can be deduced from the type. The args... are additional user-supplied arguments that are forwarded to the staticFor function.
Also note that thanks to the constexpr cast operator, the argument i can be used in constant expressions and the staticFor function can be used from the host code as well as CUDA kernels (TNL requires the --expt-relaxed-constexpr parameter when compiled by nvcc).
Index | is the type of the loop indices. |
begin | is the left bound of the iteration range [begin, end) . |
end | is the right bound of the iteration range [begin, end) . |
Func | is the type of the functor (it is usually deduced from the argument used in the function call). |
ArgTypes | are the types of additional arguments passed to the function. |
f | is the functor to be called in each iteration. |
args | are additional user-supplied arguments that are forwarded to each call of f. |
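The technique can be sketched in standard C++17 with std::integer_sequence and a fold expression. The names staticForSketch and staticForImpl are hypothetical, the extra args... are omitted for brevity, and this is only an illustration of the idea, not the TNL implementation.

```cpp
#include <cstddef>
#include <tuple>
#include <type_traits>
#include <utility>

// Calls f once per offset, each time with a distinct std::integral_constant type.
template <typename Index, Index begin, typename Func, Index... offsets>
constexpr void staticForImpl(Func&& f, std::integer_sequence<Index, offsets...>)
{
    (f(std::integral_constant<Index, begin + offsets>{}), ...);
}

// Hypothetical sketch of a static for-loop over [begin, end).
template <typename Index, Index begin, Index end, typename Func>
constexpr void staticForSketch(Func&& f)
{
    static_assert(begin <= end, "the range must not be reversed");
    staticForImpl<Index, begin>(f, std::make_integer_sequence<Index, end - begin>{});
}

// The index is usable as a template argument, here for std::get on a tuple
// whose elements have different types.
inline double tupleSumDemo()
{
    const std::tuple<int, double, long> t{1, 2.5, 3L};
    double sum = 0.0;
    staticForSketch<std::size_t, 0, 3>([&](auto i) {
        sum += static_cast<double>(std::get<i>(t));  // i converts at compile time
    });
    return sum;
}
```

An ordinary run-time loop could not index the tuple this way, because std::get requires a compile-time constant.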
|
constexpr |
Generic for-loop with explicit unrolling.
unrolledFor performs explicit loop unrolling of short loops, which can improve performance in some cases. The bounds of the for-loop must be constant (i.e. known at compile time). Loops longer than unrollFactor are not unrolled and are executed as a normal for-loop.
The unroll factor is configurable, but note that full unrolling does not make sense for very long loops. It might even trigger the compiler's limit on recursive template instantiation. Also note that the compiler will (at least partially) unroll loops with static bounds anyway.
Index | is the type of the loop indices. |
begin | is the left bound of the iteration range [begin, end) . |
end | is the right bound of the iteration range [begin, end) . |
unrollFactor | is the maximum length of loops to fully unroll via recursive template instantiation. |
Func | is the type of the functor (it is usually deduced from the argument used in the function call). |
f | is the functor to be called in each iteration. |
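The switch between unrolling and a normal loop can be sketched in standard C++17 with if constexpr. The name unrolledForSketch and the default factor of 8 are assumptions made for this illustration, as is passing the index to f as a plain value; it is not the TNL implementation.

```cpp
// Hypothetical sketch: short loops are unrolled by recursive template
// instantiation, longer ones fall back to an ordinary for-loop.
template <typename Index, Index begin, Index end, Index unrollFactor = 8, typename Func>
constexpr void unrolledForSketch(Func&& f)
{
    if constexpr (end - begin <= unrollFactor) {
        if constexpr (begin < end) {
            f(begin);  // one explicit iteration per instantiation
            unrolledForSketch<Index, begin + 1, end, unrollFactor>(f);
        }
    } else {
        for (Index i = begin; i < end; ++i)
            f(i);
    }
}

// Four iterations: handled by the unrolled branch.
inline int unrolledSumDemo()
{
    int sum = 0;
    unrolledForSketch<int, 0, 4>([&](int i) { sum += i; });
    return sum;
}

// One hundred iterations exceed the factor: handled by the normal loop.
inline int longLoopSumDemo()
{
    int sum = 0;
    unrolledForSketch<int, 0, 100>([&](int i) { sum += i; });
    return sum;
}
```

Keeping the fallback branch avoids deep template recursion for long loops, which is the compiler limit mentioned above.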