Template Numerical Library version main:4904c12
TNL::Matrices Namespace Reference

Namespace for matrix formats. More...

Classes

class  DenseMatrix
 Implementation of a dense matrix, i.e. a matrix that explicitly stores all of its elements, including zeros. More...
class  DenseMatrixBase
 A common base class for DenseMatrix and DenseMatrixView. More...
class  DenseMatrixElement
 Accessor for dense matrix elements. More...
class  DenseMatrixRowView
 RowView is a simple structure for accessing rows of a dense matrix. More...
class  DenseMatrixView
 Implementation of dense matrix view. More...
class  DistributedMatrix
struct  GeneralMatrix
 General non-symmetric matrix type. More...
class  GinkgoOperator
 Wraps a general TNL matrix as a Ginkgo LinOp. More...
class  HypreCSRMatrix
 Wrapper for Hypre's sequential CSR matrix. More...
class  HypreParCSRMatrix
 Wrapper for Hypre's parallel CSR matrix. More...
struct  isSparseCSRMatrix
 Checks whether the sparse matrix is in CSR format. More...
struct  isSparseCSRMatrix< SparseMatrix< Real, Device, Index, MatrixType, Segments, ComputeReal, RealAllocator, IndexAllocator > >
struct  isSparseCSRMatrix< SparseMatrixView< Real, Device, Index, MatrixType, SegmentsView, ComputeReal > >
class  LambdaMatrix
 "Matrix-free matrix" based on lambda functions. More...
class  LambdaMatrixElement
 Accessor for elements of lambda matrix. More...
struct  LambdaMatrixFactory
 Helper class for creating instances of LambdaMatrix. More...
class  LambdaMatrixRowView
 RowView is a simple structure for accessing rows of a lambda matrix. More...
class  LambdaMatrixRowViewIterator
class  MatrixBase
 Base class for the implementation of concrete matrix types. More...
struct  MatrixInfo
class  MatrixOperations
class  MatrixOperations< Devices::Cuda >
class  MatrixReader
 Helper class for importing matrices from different input formats. More...
class  MatrixRowViewIterator
struct  MatrixType
 Structure for specifying type of sparse matrix. More...
class  MatrixWriter
 Helper class for exporting matrices to different output formats. More...
class  MultidiagonalMatrix
 Implementation of sparse multidiagonal matrix. More...
class  MultidiagonalMatrixBase
 A common base class for MultidiagonalMatrix and MultidiagonalMatrixView. More...
class  MultidiagonalMatrixElement
 Accessor for multidiagonal matrix elements. More...
class  MultidiagonalMatrixRowView
 RowView is a simple structure for accessing rows of a multidiagonal matrix. More...
class  MultidiagonalMatrixView
 Implementation of sparse multidiagonal matrix view. More...
class  SparseMatrix
 Implementation of a sparse matrix, i.e. a matrix storing only its non-zero elements. More...
class  SparseMatrixBase
 A common base class for SparseMatrix and SparseMatrixView. More...
class  SparseMatrixElement
 Accessor for sparse matrix elements. More...
class  SparseMatrixRowView
 RowView is a simple structure for accessing rows of a sparse matrix. More...
class  SparseMatrixView
 Implementation of sparse matrix view. More...
class  StaticMatrix
struct  SymmetricMatrix
 Symmetric matrix type. More...
class  TridiagonalMatrix
 Implementation of sparse tridiagonal matrix. More...
class  TridiagonalMatrixBase
 A common base class for TridiagonalMatrix and TridiagonalMatrixView. More...
class  TridiagonalMatrixRowView
 RowView is a simple structure for accessing rows of a tridiagonal matrix. More...
class  TridiagonalMatrixView
 Implementation of sparse tridiagonal matrix view. More...
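
The distinction between dense storage (all elements, zeros included) and sparse storage (non-zero elements only) drawn in the class list above can be sketched without the library. The following is a minimal illustration using plain std::vector. The names DenseStorage, CsrStorage and toCsr are hypothetical, and the layouts are assumptions made for the example; they are not TNL's API or its internal representation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in types for illustration only; not TNL classes.
struct DenseStorage
{
   std::size_t rows, columns;
   std::vector< double > values;  // row-major, rows * columns entries, zeros included
};

struct CsrStorage
{
   std::size_t rows, columns;
   std::vector< std::size_t > rowOffsets;     // rows + 1 entries
   std::vector< std::size_t > columnIndexes;  // one entry per stored element
   std::vector< double > values;              // non-zero values only
};

// Convert dense storage to CSR, keeping only the non-zero elements.
inline CsrStorage
toCsr( const DenseStorage& dense )
{
   assert( dense.values.size() == dense.rows * dense.columns );
   CsrStorage csr{ dense.rows, dense.columns, { 0 }, {}, {} };
   for( std::size_t i = 0; i < dense.rows; i++ ) {
      for( std::size_t j = 0; j < dense.columns; j++ ) {
         const double v = dense.values[ i * dense.columns + j ];
         if( v != 0.0 ) {
            csr.columnIndexes.push_back( j );
            csr.values.push_back( v );
         }
      }
      // Row i ends where the next row begins.
      csr.rowOffsets.push_back( csr.values.size() );
   }
   return csr;
}
```

For a matrix with many zeros, the CSR arrays hold far fewer entries than rows * columns, at the cost of one column index per stored value.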

Typedefs

template<typename T>
using is_dense_matrix = decltype( isDenseMatrix( std::declval< T >() ) )
template<typename T>
using is_matrix = decltype( isMatrix( std::declval< T >() ) )
template<typename T>
using is_matrix_view = decltype( isMatrixView( std::declval< T >() ) )
template<typename Matrix>
using is_sparse_csr_matrix = isSparseCSRMatrix< Matrix >
template<typename Matrix>
using is_sparse_matrix = decltype( isSparseMatrix( std::declval< Matrix >() ) )
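
The aliases above take the return type of an overloaded detection function via decltype and std::declval. The idiom can be sketched in isolation as follows. MyDenseBase, MyDense, NotAMatrix, isMyDense and is_my_dense are hypothetical stand-ins that only mimic the shape of the declarations in this listing; they are not TNL types.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical stand-in types; they only mimic the shape of the idiom.
template< typename Real >
struct MyDenseBase
{};

template< typename Real >
struct MyDense : MyDenseBase< Real >
{};

struct NotAMatrix
{};

// Fallback overload: the C-style ellipsis ranks lowest in overload resolution.
constexpr std::false_type
isMyDense( ... )
{
   return {};
}

// Preferred overload: matches the base class template and anything derived from it.
template< typename Real >
constexpr std::true_type
isMyDense( const MyDenseBase< Real >& )
{
   return {};
}

// The alias names std::true_type or std::false_type; nothing is evaluated at run time.
template< typename T >
using is_my_dense = decltype( isMyDense( std::declval< T >() ) );

static_assert( is_my_dense< MyDense< double > >::value );
static_assert( ! is_my_dense< NotAMatrix >::value );
```

Because the preferred overload takes the base class by const reference, one overload covers every class derived from it, which is presumably why the detection functions in this namespace are declared on the *Base and *View class templates.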

Enumerations

enum  ElementsOrganization
enum class  MatrixElementsEncoding : std::uint8_t { Complete , SymmetricLower , SymmetricUpper , SymmetricMixed }
 Encoding of the matrix elements in initializer lists or STL maps. More...
enum class  TransposeState : std::uint8_t { None , Transpose }

Functions

template<typename Matrix>
void compressSparseMatrix (Matrix &A)
 Compresses the sparse matrix so that unnecessary zero elements are not stored.
template<typename Matrix, typename AdjacencyMatrix>
void copyAdjacencyStructure (const Matrix &A, AdjacencyMatrix &B, bool has_symmetric_pattern=false, bool ignore_diagonal=true)
template<typename Matrix1, typename Matrix2>
void copyDenseToDenseMatrix (Matrix1 &A, const Matrix2 &B)
template<typename Matrix1, typename Matrix2>
void copyDenseToSparseMatrix (Matrix1 &A, const Matrix2 &B)
template<typename Matrix1, typename Matrix2>
void copySparseMatrix (Matrix1 &A, const Matrix2 &B)
template<typename Matrix1, typename Matrix2>
void copySparseToDenseMatrix (Matrix1 &A, const Matrix2 &B)
template<typename TargetMatrix, typename SourceMatrix>
void copySparseToSparseMatrix (TargetMatrix &A, const SourceMatrix &B)
 Copies sparse matrix to sparse matrix.
template<typename Matrix1, typename Matrix2>
void copySymmetricSparseToGeneralSparseMatrix (Matrix1 &A, const Matrix2 &B)
template<typename Real>
__cuda_callable__ Real determinant (const StaticMatrix< Real, 2, 2 > &A)
template<typename Real>
__cuda_callable__ Real determinant (const StaticMatrix< Real, 3, 3 > &A)
template<typename Real>
__cuda_callable__ Real determinant (const StaticMatrix< Real, 4, 4 > &A)
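
The listing gives no formulas for the fixed-size determinant overloads above. As a reference point, the textbook closed forms for the 2x2 and 3x3 cases can be sketched as below. SmallMatrix and det are hypothetical names built on std::array; this is not the StaticMatrix implementation, and the element layout is an assumption of the example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a small fixed-size matrix; not TNL::Matrices::StaticMatrix.
template< typename Real, std::size_t N >
using SmallMatrix = std::array< std::array< Real, N >, N >;

// 2x2 determinant: a00 * a11 - a01 * a10.
template< typename Real >
constexpr Real
det( const SmallMatrix< Real, 2 >& A )
{
   return A[ 0 ][ 0 ] * A[ 1 ][ 1 ] - A[ 0 ][ 1 ] * A[ 1 ][ 0 ];
}

// 3x3 determinant by cofactor expansion along the first row.
template< typename Real >
constexpr Real
det( const SmallMatrix< Real, 3 >& A )
{
   return A[ 0 ][ 0 ] * ( A[ 1 ][ 1 ] * A[ 2 ][ 2 ] - A[ 1 ][ 2 ] * A[ 2 ][ 1 ] )
        - A[ 0 ][ 1 ] * ( A[ 1 ][ 0 ] * A[ 2 ][ 2 ] - A[ 1 ][ 2 ] * A[ 2 ][ 0 ] )
        + A[ 0 ][ 2 ] * ( A[ 1 ][ 0 ] * A[ 2 ][ 1 ] - A[ 1 ][ 1 ] * A[ 2 ][ 0 ] );
}
```

Overloading on the matrix size, as the listing does for 2x2, 3x3 and 4x4, lets each case use a fully unrolled formula instead of a general elimination routine.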
template<typename Matrix, typename Function>
void forAllElements (const Matrix &matrix, Function &&function, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Iterates in parallel over all elements of all matrix rows of a constant matrix and applies the specified lambda function.
template<typename Matrix, typename Function>
void forAllElements (Matrix &matrix, Function &&function, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Iterates in parallel over all elements of all matrix rows and applies the specified lambda function.
template<typename Matrix, typename Condition, typename Function>
void forAllElementsIf (const Matrix &matrix, Condition &&condition, Function &&function, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Iterates in parallel over all elements of all matrix rows based on a condition.
template<typename Matrix, typename Condition, typename Function>
void forAllElementsIf (Matrix &matrix, Condition &&condition, Function &&function, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Iterates in parallel over all elements of all matrix rows based on a condition.
template<typename Matrix, typename Function>
void forAllRows (const Matrix &matrix, Function &&function, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Iterates in parallel over all matrix rows and applies the given lambda function to each row. This function is for constant matrices.
template<typename Matrix, typename Function>
void forAllRows (Matrix &matrix, Function &&function, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Iterates in parallel over all matrix rows and applies the given lambda function to each row.
template<typename Matrix, typename RowCondition, typename Function>
void forAllRowsIf (const Matrix &matrix, RowCondition &&rowCondition, Function &&function, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Iterates in parallel over all matrix rows, applying a condition to determine whether each row should be processed. This function is for constant matrices.
template<typename Matrix, typename RowCondition, typename Function>
void forAllRowsIf (Matrix &matrix, RowCondition &&rowCondition, Function &&function, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Iterates in parallel over all matrix rows, applying a condition to determine whether each row should be processed.
template<typename Matrix, typename Array, typename Function>
void forElements (const Matrix &matrix, const Array &rowIndexes, Function &&function, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Iterates in parallel over all elements of matrix rows with the given indexes and applies the specified lambda function. This function is for constant matrices.
template<typename Matrix, typename Array, typename IndexBegin, typename IndexEnd, typename Function>
void forElements (const Matrix &matrix, const Array &rowIndexes, IndexBegin begin, IndexEnd end, Function &&function, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Iterates in parallel over all elements of matrix rows with the given indexes and applies the specified lambda function. This function is for constant matrices.
template<typename Matrix, typename IndexBegin, typename IndexEnd, typename Function>
void forElements (const Matrix &matrix, IndexBegin begin, IndexEnd end, Function &&function, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Iterates in parallel over all elements of a constant matrix in the given range of matrix rows and applies the specified lambda function.
template<typename Matrix, typename Array, typename Function>
void forElements (Matrix &matrix, const Array &rowIndexes, Function &&function, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Iterates in parallel over all elements of matrix rows with the given indexes and applies the specified lambda function.
template<typename Matrix, typename Array, typename IndexBegin, typename IndexEnd, typename Function>
void forElements (Matrix &matrix, const Array &rowIndexes, IndexBegin begin, IndexEnd end, Function &&function, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Iterates in parallel over all elements of matrix rows with the given indexes and applies the specified lambda function.
template<typename Matrix, typename IndexBegin, typename IndexEnd, typename Function>
void forElements (Matrix &matrix, IndexBegin begin, IndexEnd end, Function &&function, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Iterates in parallel over all elements in the given range of matrix rows and applies the specified lambda function.
template<typename Matrix, typename IndexBegin, typename IndexEnd, typename Condition, typename Function>
void forElementsIf (const Matrix &matrix, IndexBegin begin, IndexEnd end, Condition &&condition, Function &&function, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Iterates in parallel over all elements in a given range of rows based on a condition. This function is for constant matrices.
template<typename Matrix, typename IndexBegin, typename IndexEnd, typename Condition, typename Function>
void forElementsIf (Matrix &matrix, IndexBegin begin, IndexEnd end, Condition &&condition, Function &&function, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Iterates in parallel over all elements in a given range of rows based on a condition.
template<typename Matrix, typename Array, typename Function, typename T = std::enable_if_t< IsArrayType< Array >::value >>
void forRows (const Matrix &matrix, const Array &rowIndexes, Function &&function, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Iterates in parallel over matrix rows with the given indexes and applies the specified lambda function to each row. This function is for constant matrices.
template<typename Matrix, typename Array, typename IndexBegin, typename IndexEnd, typename Function, typename T = std::enable_if_t< IsArrayType< Array >::value && std::is_integral_v< IndexBegin > && std::is_integral_v< IndexEnd > >>
void forRows (const Matrix &matrix, const Array &rowIndexes, IndexBegin begin, IndexEnd end, Function &&function, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Iterates in parallel over matrix rows with the given indexes and applies the specified lambda function to each row. This function is for constant matrices.
template<typename Matrix, typename IndexBegin, typename IndexEnd, typename Function, typename T = std::enable_if_t< std::is_integral_v< IndexBegin > && std::is_integral_v< IndexEnd > >>
void forRows (const Matrix &matrix, IndexBegin begin, IndexEnd end, Function &&function, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Iterates in parallel over matrix rows within the specified range of row indexes and applies the given lambda function to each row. This function is for constant matrices.
template<typename Matrix, typename Array, typename Function, typename T = std::enable_if_t< IsArrayType< Array >::value >>
void forRows (Matrix &matrix, const Array &rowIndexes, Function &&function, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Iterates in parallel over matrix rows with the given indexes and applies the specified lambda function to each row.
template<typename Matrix, typename Array, typename IndexBegin, typename IndexEnd, typename Function, typename T = std::enable_if_t< IsArrayType< Array >::value && std::is_integral_v< IndexBegin > && std::is_integral_v< IndexEnd > >>
void forRows (Matrix &matrix, const Array &rowIndexes, IndexBegin begin, IndexEnd end, Function &&function, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Iterates in parallel over matrix rows with the given indexes and applies the specified lambda function to each row.
template<typename Matrix, typename IndexBegin, typename IndexEnd, typename Function, typename T = std::enable_if_t< std::is_integral_v< IndexBegin > && std::is_integral_v< IndexEnd > >>
void forRows (Matrix &matrix, IndexBegin begin, IndexEnd end, Function &&function, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Iterates in parallel over matrix rows within the specified range of row indexes and applies the given lambda function to each row.
template<typename Matrix, typename IndexBegin, typename IndexEnd, typename RowCondition, typename Function, typename T = std::enable_if_t< std::is_integral_v< IndexBegin > && std::is_integral_v< IndexEnd > >>
void forRowsIf (const Matrix &matrix, IndexBegin begin, IndexEnd end, RowCondition &&rowCondition, Function &&function, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Iterates in parallel over rows within the given range of row indexes, applying a condition to determine whether each row should be processed. This function is for constant matrices.
template<typename Matrix, typename IndexBegin, typename IndexEnd, typename RowCondition, typename Function, typename T = std::enable_if_t< std::is_integral_v< IndexBegin > && std::is_integral_v< IndexEnd > >>
void forRowsIf (Matrix &matrix, IndexBegin begin, IndexEnd end, RowCondition &&rowCondition, Function &&function, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Iterates in parallel over rows within the given range of row indexes, applying a condition to determine whether each row should be processed.
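
The traversal functions above share one pattern: select rows (all, a range, or an index array), optionally filter them with a condition, and apply a lambda. A sequential sketch of the conditional element traversal follows. Csr and forElementsIfSketch are hypothetical names, and three points are assumptions rather than facts from this listing: the row range is taken as half-open, the condition is evaluated per row, and the callback receives (row, column, value).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CSR container for illustration only; not a TNL type.
struct Csr
{
   std::vector< std::size_t > rowOffsets;     // rows + 1 entries
   std::vector< std::size_t > columnIndexes;  // column of each stored element
   std::vector< double > values;              // value of each stored element
};

// Sequential sketch: visit rows in [begin, end), skip rows rejected by
// condition, and call function on every stored element of the remaining rows.
template< typename Condition, typename Function >
void
forElementsIfSketch( Csr& A, std::size_t begin, std::size_t end, Condition&& condition, Function&& function )
{
   assert( begin <= end && end + 1 <= A.rowOffsets.size() );
   for( std::size_t row = begin; row < end; row++ ) {
      if( ! condition( row ) )
         continue;
      for( std::size_t k = A.rowOffsets[ row ]; k < A.rowOffsets[ row + 1 ]; k++ )
         // The value is passed as an lvalue, so the callback may modify it.
         function( row, A.columnIndexes[ k ], A.values[ k ] );
   }
}
```

The library versions run in parallel, so a callback should not rely on any particular visiting order; the loop order here is an artifact of the sequential sketch.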
template<typename RealType, typename IndexType>
__global__ void GeamCudaKernel (const IndexType m, const IndexType n, const RealType alpha, const RealType *A, const IndexType lda, const RealType beta, const RealType *B, const IndexType ldb, RealType *C, const IndexType ldc)
template<typename RealType, typename IndexType>
__global__ void GemvCudaKernel (const IndexType m, const IndexType n, const RealType alpha, const RealType *A, const IndexType lda, const RealType *x, const RealType beta, RealType *y)
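
The two kernels above carry no description. Their names and parameter lists follow the BLAS convention, where GEMV denotes y = alpha * A * x + beta * y. A CPU reference for that operation, useful when checking a GPU result, might look like the sketch below. gemvReference is a hypothetical name, and the column-major layout with leading dimension lda is an assumption borrowed from BLAS, not something this listing confirms for the TNL kernel.

```cpp
#include <cassert>

// CPU reference for y = alpha * A * x + beta * y (the BLAS GEMV operation).
// Assumption: A is m x n, stored column-major with leading dimension lda >= m.
template< typename Real, typename Index >
void
gemvReference( Index m, Index n, Real alpha, const Real* A, Index lda, const Real* x, Real beta, Real* y )
{
   assert( lda >= m );
   for( Index i = 0; i < m; i++ ) {
      Real sum = 0;
      for( Index j = 0; j < n; j++ )
         sum += A[ i + j * lda ] * x[ j ];  // element (i, j) in column-major storage
      y[ i ] = alpha * sum + beta * y[ i ];
   }
}
```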
template<typename Matrix>
auto getDiagonal (const Matrix &matrix) -> TNL::Containers::Vector< typename Matrix::RealType, typename Matrix::DeviceType, typename Matrix::IndexType >
template<typename Matrix, typename Vector>
void getDiagonal (const Matrix &matrix, Vector &diagonalElements)
template<typename Matrix>
auto getGinkgoMatrixCsr (std::shared_ptr< const gko::Executor > exec, Matrix &matrix) -> std::unique_ptr< gko::matrix::Csr< typename Matrix::RealType, typename Matrix::IndexType > >
 Converts any TNL sparse matrix to a Ginkgo Csr matrix.
template<typename Matrix>
auto getGinkgoMatrixCsrView (std::shared_ptr< const gko::Executor > exec, Matrix &matrix) -> std::unique_ptr< gko::matrix::Csr< typename Matrix::RealType, typename Matrix::IndexType > >
 Creates a Ginkgo Csr matrix view from a TNL CSR matrix.
template<typename Matrix, typename Real, int tileDim = 16>
void getInPlaceTransposition (Matrix &matrix, Real matrixMultiplicator=1.0)
template<typename ResultMatrix, typename Matrix1, typename Matrix2, typename Real, int tileDim = 16>
void getMatrixProduct (ResultMatrix &resultMatrix, const Matrix1 &matrix1, const Matrix2 &matrix2, Real matrixMultiplicator=1.0, TransposeState transposeA=TransposeState::None, TransposeState transposeB=TransposeState::None)
template<typename OutMatrix, typename InMatrix>
OutMatrix getSymmetricPart (const InMatrix &inMatrix)
 This function computes \(( A + A^T ) / 2 \), where \( A \) is a square matrix.
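
As a worked instance of the formula above, for an arbitrarily chosen \( 2 \times 2 \) matrix:

\[
A = \begin{pmatrix} 1 & 4 \\ 2 & 3 \end{pmatrix}, \quad
A^T = \begin{pmatrix} 1 & 2 \\ 4 & 3 \end{pmatrix}, \quad
\frac{A + A^T}{2} = \frac{1}{2} \begin{pmatrix} 2 & 6 \\ 6 & 6 \end{pmatrix} = \begin{pmatrix} 1 & 3 \\ 3 & 3 \end{pmatrix}.
\]

The result equals its own transpose, as expected of a symmetric part.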
template<typename ResultMatrix, typename Matrix, typename Real, int tileDim = 16>
void getTransposition (ResultMatrix &resultMatrix, const Matrix &matrix, Real matrixMultiplicator=1.0)
template<typename Real>
__cuda_callable__ StaticMatrix< Real, 2, 2 > inverse (const StaticMatrix< Real, 2, 2 > &A)
template<typename Real>
__cuda_callable__ StaticMatrix< Real, 3, 3 > inverse (const StaticMatrix< Real, 3, 3 > &A)
template<typename Real>
__cuda_callable__ StaticMatrix< Real, 4, 4 > inverse (const StaticMatrix< Real, 4, 4 > &A)
constexpr std::false_type isDenseMatrix (...)
 Checks whether the given type is a dense matrix.
template<typename Real, typename Device, typename Index, ElementsOrganization Organization>
constexpr std::true_type isDenseMatrix (const DenseMatrixBase< Real, Device, Index, Organization > &)
constexpr std::false_type isMatrix (...)
 Checks whether the given type is a matrix.
template<typename MatrixElementsLambda, typename CompressedRowLengthsLambda, typename Real, typename Device, typename Index>
constexpr std::true_type isMatrix (const LambdaMatrix< MatrixElementsLambda, CompressedRowLengthsLambda, Real, Device, Index > &)
template<typename Real, typename Device, typename Index, typename MatrixType, ElementsOrganization Organization>
constexpr std::true_type isMatrix (const MatrixBase< Real, Device, Index, MatrixType, Organization > &)
constexpr std::false_type isMatrixView (...)
 Checks whether the given type is a matrix view.
template<typename Real, typename Device, typename Index, ElementsOrganization Organization>
constexpr std::true_type isMatrixView (const DenseMatrixView< Real, Device, Index, Organization > &)
template<typename Real, typename Device, typename Index, ElementsOrganization Organization>
constexpr std::true_type isMatrixView (const MultidiagonalMatrixView< Real, Device, Index, Organization > &)
template<typename Real, typename Device, typename Index, typename MatrixType, template< typename, typename > typename SegmentsView, typename ComputeReal>
constexpr std::true_type isMatrixView (const SparseMatrixView< Real, Device, Index, MatrixType, SegmentsView, ComputeReal > &)
template<typename Real, typename Device, typename Index, ElementsOrganization Organization>
constexpr std::true_type isMatrixView (const TridiagonalMatrixView< Real, Device, Index, Organization > &)
constexpr std::false_type isSparseMatrix (...)
 Checks whether the given type is a sparse matrix.
template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>
constexpr std::true_type isSparseMatrix (const SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal > &)
template<typename Value, std::size_t Rows1, std::size_t SharedDim, std::size_t Columns2, typename Permutation>
StaticMatrix< Value, Rows1, Columns2, Permutation > operator* (const StaticMatrix< Value, Rows1, SharedDim, Permutation > &matrix1, const StaticMatrix< Value, SharedDim, Columns2, Permutation > &matrix2)
template<typename Value, std::size_t Rows, std::size_t Columns, typename Permutation, typename T>
__cuda_callable__ StaticMatrix< Value, Rows, Columns, Permutation > operator* (const T &value, StaticMatrix< Value, Rows, Columns, Permutation > a)
template<typename Value, std::size_t Rows, std::size_t Columns, typename Permutation, typename T>
__cuda_callable__ StaticMatrix< Value, Rows, Columns, Permutation > operator* (StaticMatrix< Value, Rows, Columns, Permutation > a, const T &value)
template<typename Value, std::size_t Rows, std::size_t Columns, typename Permutation>
__cuda_callable__ StaticMatrix< Value, Rows, Columns, Permutation > operator+ (StaticMatrix< Value, Rows, Columns, Permutation > a, const StaticMatrix< Value, Rows, Columns > &b)
template<typename Value, std::size_t Rows, std::size_t Columns, typename Permutation>
__cuda_callable__ StaticMatrix< Value, Rows, Columns, Permutation > operator- (StaticMatrix< Value, Rows, Columns, Permutation > a, const StaticMatrix< Value, Rows, Columns > &b)
template<typename Value, std::size_t Rows, std::size_t Columns, typename Permutation, typename T>
__cuda_callable__ StaticMatrix< Value, Rows, Columns, Permutation > operator/ (StaticMatrix< Value, Rows, Columns, Permutation > a, const T &b)
template<typename Real, typename Device, typename Index, ElementsOrganization Organization>
File & operator<< (File &&file, const DenseMatrixBase< Real, Device, Index, Organization > &matrix)
template<typename Real, typename Device, typename Index, ElementsOrganization Organization>
File & operator<< (File &&file, const MultidiagonalMatrixBase< Real, Device, Index, Organization > &matrix)
template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>
File & operator<< (File &&file, const SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal > &matrix)
template<typename Real, typename Device, typename Index, ElementsOrganization Organization>
File & operator<< (File &&file, const TridiagonalMatrixBase< Real, Device, Index, Organization > &matrix)
template<typename Real, typename Device, typename Index, ElementsOrganization Organization>
File & operator<< (File &file, const DenseMatrixBase< Real, Device, Index, Organization > &matrix)
 Serialization of dense matrices into binary files.
template<typename Real, typename Device, typename Index, ElementsOrganization Organization>
File & operator<< (File &file, const MultidiagonalMatrixBase< Real, Device, Index, Organization > &matrix)
 Serialization of multidiagonal matrices into binary files.
template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>
File & operator<< (File &file, const SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal > &matrix)
 Serialization of sparse matrices into binary files.
template<typename Real, typename Device, typename Index, ElementsOrganization Organization>
File & operator<< (File &file, const TridiagonalMatrixBase< Real, Device, Index, Organization > &matrix)
 Serialization of tridiagonal matrices into binary files.
template<typename Real, typename Device, typename Index, ElementsOrganization Organization>
std::ostream & operator<< (std::ostream &str, const DenseMatrixBase< Real, Device, Index, Organization > &matrix)
 Insertion operator for dense matrix and output stream.
template<typename MatrixElementsLambda, typename CompressedRowLengthsLambda, typename Real, typename Device, typename Index>
std::ostream & operator<< (std::ostream &str, const LambdaMatrix< MatrixElementsLambda, CompressedRowLengthsLambda, Real, Device, Index > &matrix)
 Insertion operator for lambda matrix and output stream.
template<typename MatrixElementsLambda, typename CompressedRowLengthsLambda, typename Real, typename Index>
std::ostream & operator<< (std::ostream &str, const LambdaMatrixRowView< MatrixElementsLambda, CompressedRowLengthsLambda, Real, Index > &row)
 Insertion operator for a Lambda matrix row.
template<typename Real, typename Device, typename Index, ElementsOrganization Organization>
std::ostream & operator<< (std::ostream &str, const MultidiagonalMatrixBase< Real, Device, Index, Organization > &matrix)
 Overloaded insertion operator for printing a matrix to output stream.
template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>
std::ostream & operator<< (std::ostream &str, const SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal > &matrix)
 Overloaded insertion operator for printing a matrix to output stream.
template<typename SegmentView, typename ValuesView, typename ColumnsIndexesView>
std::ostream & operator<< (std::ostream &str, const SparseMatrixRowView< SegmentView, ValuesView, ColumnsIndexesView > &row)
 Insertion operator for a sparse matrix row.
template<typename Value, std::size_t Rows, std::size_t Columns, typename Permutation>
std::ostream & operator<< (std::ostream &str, const StaticMatrix< Value, Rows, Columns, Permutation > &matrix)
template<typename Real, typename Device, typename Index, ElementsOrganization Organization>
std::ostream & operator<< (std::ostream &str, const TridiagonalMatrixBase< Real, Device, Index, Organization > &matrix)
 Overloaded insertion operator for printing a matrix to output stream.
template<typename Real, typename Device, typename Index, ElementsOrganization Organization, typename RealAllocator>
File & operator>> (File &&file, DenseMatrix< Real, Device, Index, Organization, RealAllocator > &matrix)
template<typename Real, typename Device, typename Index, ElementsOrganization Organization>
File & operator>> (File &&file, DenseMatrixView< Real, Device, Index, Organization > &matrix)
template<typename Real, typename Device, typename Index, ElementsOrganization Organization, typename RealAllocator, typename IndexAllocator>
File & operator>> (File &&file, MultidiagonalMatrix< Real, Device, Index, Organization, RealAllocator, IndexAllocator > &matrix)
template<typename Real, typename Device, typename Index, ElementsOrganization Organization>
File & operator>> (File &&file, MultidiagonalMatrixView< Real, Device, Index, Organization > &matrix)
template<typename Real, typename Device, typename Index, typename MatrixType, template< typename, typename, typename > class Segments, typename ComputeReal, typename RealAllocator, typename IndexAllocator>
File & operator>> (File &&file, SparseMatrix< Real, Device, Index, MatrixType, Segments, ComputeReal, RealAllocator, IndexAllocator > &matrix)
template<typename Real, typename Device, typename Index, typename MatrixType, template< typename, typename > class SegmentsView, typename ComputeReal>
File & operator>> (File &&file, SparseMatrixView< Real, Device, Index, MatrixType, SegmentsView, ComputeReal > &matrix)
template<typename Real, typename Device, typename Index, ElementsOrganization Organization, typename RealAllocator>
File & operator>> (File &&file, TridiagonalMatrix< Real, Device, Index, Organization, RealAllocator > &matrix)
template<typename Real, typename Device, typename Index, ElementsOrganization Organization>
File & operator>> (File &&file, TridiagonalMatrixView< Real, Device, Index, Organization > &matrix)
template<typename Real, typename Device, typename Index, ElementsOrganization Organization, typename RealAllocator>
File & operator>> (File &file, DenseMatrix< Real, Device, Index, Organization, RealAllocator > &matrix)
 Deserialization of dense matrices from binary files.
template<typename Real, typename Device, typename Index, ElementsOrganization Organization>
File & operator>> (File &file, DenseMatrixView< Real, Device, Index, Organization > &matrix)
 Deserialization of dense matrix views from binary files.
template<typename Real, typename Device, typename Index, ElementsOrganization Organization, typename RealAllocator, typename IndexAllocator>
File & operator>> (File &file, MultidiagonalMatrix< Real, Device, Index, Organization, RealAllocator, IndexAllocator > &matrix)
 Deserialization of multidiagonal matrices from binary files.
template<typename Real, typename Device, typename Index, ElementsOrganization Organization>
File & operator>> (File &file, MultidiagonalMatrixView< Real, Device, Index, Organization > &matrix)
 Deserialization of multidiagonal matrix views from binary files.
template<typename Real, typename Device, typename Index, typename MatrixType, template< typename, typename, typename > class Segments, typename ComputeReal, typename RealAllocator, typename IndexAllocator>
File & operator>> (File &file, SparseMatrix< Real, Device, Index, MatrixType, Segments, ComputeReal, RealAllocator, IndexAllocator > &matrix)
 Deserialization of sparse matrices from binary files.
template<typename Real, typename Device, typename Index, typename MatrixType, template< typename, typename > class SegmentsView, typename ComputeReal>
File & operator>> (File &file, SparseMatrixView< Real, Device, Index, MatrixType, SegmentsView, ComputeReal > &matrix)
 Deserialization of sparse matrix views from binary files.
template<typename Real, typename Device, typename Index, ElementsOrganization Organization, typename RealAllocator>
File & operator>> (File &file, TridiagonalMatrix< Real, Device, Index, Organization, RealAllocator > &matrix)
 Deserialization of tridiagonal matrices from binary files.
template<typename Real, typename Device, typename Index, ElementsOrganization Organization>
File & operator>> (File &file, TridiagonalMatrixView< Real, Device, Index, Organization > &matrix)
 Deserialization of tridiagonal matrix views from binary files.
template<typename Matrix, typename PermutationArray>
void permuteMatrixColumns (Matrix &matrix, const PermutationArray &iperm)
template<typename Matrix, typename PermutationArray>
void permuteMatrixRows (Matrix &matrix, const PermutationArray &perm)
template<typename Matrix, typename Fetch, typename Reduction, typename Store>
void reduceAllRows (const Matrix &matrix, Fetch &&fetch, Reduction &&reduction, Store &&store, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Performs parallel reduction within each matrix row over all rows with automatic identity deduction (const version).
template<typename Matrix, typename Fetch, typename Reduction, typename Store, typename FetchValue>
void reduceAllRows (const Matrix &matrix, Fetch &&fetch, Reduction &&reduction, Store &&store, const FetchValue &identity, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Performs parallel reduction within each matrix row over all rows (const version).
template<typename Matrix, typename Fetch, typename Reduction, typename Store>
void reduceAllRows (Matrix &matrix, Fetch &&fetch, Reduction &&reduction, Store &&store, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Performs parallel reduction within each matrix row over all rows with automatic identity deduction.
template<typename Matrix, typename Fetch, typename Reduction, typename Store, typename FetchValue>
void reduceAllRows (Matrix &matrix, Fetch &&fetch, Reduction &&reduction, Store &&store, const FetchValue &identity, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Performs parallel reduction within each matrix row over all rows.
template<typename Matrix, typename Condition, typename Fetch, typename Reduction, typename Store>
Matrix::IndexType reduceAllRowsIf (const Matrix &matrix, Condition &&condition, Fetch &&fetch, Reduction &&reduction, Store &&store, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Performs parallel reduction within each matrix row over all rows based on a condition with automatic identity deduction (const version).
template<typename Matrix, typename Condition, typename Fetch, typename Reduction, typename Store, typename FetchValue>
Matrix::IndexType reduceAllRowsIf (const Matrix &matrix, Condition &&condition, Fetch &&fetch, Reduction &&reduction, Store &&store, const FetchValue &identity, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Performs parallel reduction within each matrix row over all rows based on a condition (const version).
template<typename Matrix, typename Condition, typename Fetch, typename Reduction, typename Store>
Matrix::IndexType reduceAllRowsIf (Matrix &matrix, Condition &&condition, Fetch &&fetch, Reduction &&reduction, Store &&store, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Performs parallel reduction within each matrix row over all rows based on a condition with automatic identity deduction.
template<typename Matrix, typename Condition, typename Fetch, typename Reduction, typename Store, typename FetchValue>
Matrix::IndexType reduceAllRowsIf (Matrix &matrix, Condition &&condition, Fetch &&fetch, Reduction &&reduction, Store &&store, const FetchValue &identity, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Performs parallel reduction within each matrix row over all rows based on a condition.
template<typename Matrix, typename Fetch, typename Reduction, typename Store>
void reduceAllRowsWithArgument (const Matrix &matrix, Fetch &&fetch, Reduction &&reduction, Store &&store, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Performs parallel reduction within each matrix row over all rows while returning also the position of the element of interest with automatic identity deduction (const version).
template<typename Matrix, typename Fetch, typename Reduction, typename Store, typename FetchValue>
void reduceAllRowsWithArgument (const Matrix &matrix, Fetch &&fetch, Reduction &&reduction, Store &&store, const FetchValue &identity, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Performs parallel reduction within each matrix row over all rows while returning also the position of the element of interest (const version).
template<typename Matrix, typename Fetch, typename Reduction, typename Store>
void reduceAllRowsWithArgument (Matrix &matrix, Fetch &&fetch, Reduction &&reduction, Store &&store, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Performs parallel reduction within each matrix row over all rows while returning also the position of the element of interest with automatic identity deduction.
template<typename Matrix, typename Fetch, typename Reduction, typename Store, typename FetchValue>
void reduceAllRowsWithArgument (Matrix &matrix, Fetch &&fetch, Reduction &&reduction, Store &&store, const FetchValue &identity, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Performs parallel reduction within each matrix row over all rows while returning also the position of the element of interest.
template<typename Matrix, typename Condition, typename Fetch, typename Reduction, typename Store>
Matrix::IndexType reduceAllRowsWithArgumentIf (const Matrix &matrix, Condition &&condition, Fetch &&fetch, Reduction &&reduction, Store &&store, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Performs parallel reduction within each matrix row over all rows based on a condition while returning also the position of the element of interest with automatic identity deduction (const version).
template<typename Matrix, typename Condition, typename Fetch, typename Reduction, typename Store, typename FetchValue = decltype( std::declval< Fetch >()( 0, 0, std::declval< typename Matrix::RealType >() ) )>
Matrix::IndexType reduceAllRowsWithArgumentIf (const Matrix &matrix, Condition &&condition, Fetch &&fetch, Reduction &&reduction, Store &&store, const FetchValue &identity, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Performs parallel reduction within each matrix row over all rows based on a condition while returning also the position of the element of interest (const version).
template<typename Matrix, typename Condition, typename Fetch, typename Reduction, typename Store>
Matrix::IndexType reduceAllRowsWithArgumentIf (Matrix &matrix, Condition &&condition, Fetch &&fetch, Reduction &&reduction, Store &&store, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Performs parallel reduction within each matrix row over all rows based on a condition while returning also the position of the element of interest with automatic identity deduction.
template<typename Matrix, typename Condition, typename Fetch, typename Reduction, typename Store, typename FetchValue = decltype( std::declval< Fetch >()( 0, 0, std::declval< typename Matrix::RealType >() ) )>
Matrix::IndexType reduceAllRowsWithArgumentIf (Matrix &matrix, Condition &&condition, Fetch &&fetch, Reduction &&reduction, Store &&store, const FetchValue &identity, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Performs parallel reduction within each matrix row over all rows based on a condition while returning also the position of the element of interest.
template<typename Matrix, typename Array, typename Fetch, typename Reduction, typename Store, typename T = typename std::enable_if_t< IsArrayType< Array >::value >>
void reduceRows (const Matrix &matrix, const Array &rowIndexes, Fetch &&fetch, Reduction &&reduction, Store &&store, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Performs parallel reduction within matrix rows specified by a given set of row indexes with automatic identity deduction (const version).
template<typename Matrix, typename Array, typename Fetch, typename Reduction, typename Store, typename FetchValue, typename T = typename std::enable_if_t< IsArrayType< Array >::value >>
void reduceRows (const Matrix &matrix, const Array &rowIndexes, Fetch &&fetch, Reduction &&reduction, Store &&store, const FetchValue &identity, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Performs parallel reduction within matrix rows specified by a given set of row indexes (const version).
template<typename Matrix, typename IndexBegin, typename IndexEnd, typename Fetch, typename Reduction, typename Store, typename T = typename std::enable_if_t< std::is_integral_v< IndexBegin > && std::is_integral_v< IndexEnd > >>
void reduceRows (const Matrix &matrix, IndexBegin begin, IndexEnd end, Fetch &&fetch, Reduction &&reduction, Store &&store, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Performs parallel reduction within each matrix row over a given range of row indexes with automatic identity deduction (const version).
template<typename Matrix, typename IndexBegin, typename IndexEnd, typename Fetch, typename Reduction, typename Store, typename FetchValue, typename T = typename std::enable_if_t< std::is_integral_v< IndexBegin > && std::is_integral_v< IndexEnd > >>
void reduceRows (const Matrix &matrix, IndexBegin begin, IndexEnd end, Fetch &&fetch, Reduction &&reduction, Store &&store, const FetchValue &identity, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Performs parallel reduction within each matrix row over a given range of row indexes (const version).
template<typename Matrix, typename Array, typename Fetch, typename Reduction, typename Store, typename T = typename std::enable_if_t< IsArrayType< Array >::value >>
void reduceRows (Matrix &matrix, const Array &rowIndexes, Fetch &&fetch, Reduction &&reduction, Store &&store, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Performs parallel reduction within matrix rows specified by a given set of row indexes with automatic identity deduction.
template<typename Matrix, typename Array, typename Fetch, typename Reduction, typename Store, typename FetchValue, typename T = typename std::enable_if_t< IsArrayType< Array >::value >>
void reduceRows (Matrix &matrix, const Array &rowIndexes, Fetch &&fetch, Reduction &&reduction, Store &&store, const FetchValue &identity, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Performs parallel reduction within matrix rows specified by a given set of row indexes.
template<typename Matrix, typename IndexBegin, typename IndexEnd, typename Fetch, typename Reduction, typename Store, typename T = typename std::enable_if_t< std::is_integral_v< IndexBegin > && std::is_integral_v< IndexEnd > >>
void reduceRows (Matrix &matrix, IndexBegin begin, IndexEnd end, Fetch &&fetch, Reduction &&reduction, Store &&store, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Performs parallel reduction within each matrix row over a given range of row indexes with automatic identity deduction.
template<typename Matrix, typename IndexBegin, typename IndexEnd, typename Fetch, typename Reduction, typename Store, typename FetchValue, typename T = typename std::enable_if_t< std::is_integral_v< IndexBegin > && std::is_integral_v< IndexEnd > >>
void reduceRows (Matrix &matrix, IndexBegin begin, IndexEnd end, Fetch &&fetch, Reduction &&reduction, Store &&store, const FetchValue &identity, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Performs parallel reduction within each matrix row over a given range of row indexes.
template<typename Matrix, typename IndexBegin, typename IndexEnd, typename Condition, typename Fetch, typename Reduction, typename Store>
Matrix::IndexType reduceRowsIf (const Matrix &matrix, IndexBegin begin, IndexEnd end, Condition &&condition, Fetch &&fetch, Reduction &&reduction, Store &&store, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Performs parallel reduction within each matrix row over a given range of row indexes based on a condition with automatic identity deduction (const version).
template<typename Matrix, typename IndexBegin, typename IndexEnd, typename Condition, typename Fetch, typename Reduction, typename Store, typename FetchValue>
Matrix::IndexType reduceRowsIf (const Matrix &matrix, IndexBegin begin, IndexEnd end, Condition &&condition, Fetch &&fetch, Reduction &&reduction, Store &&store, const FetchValue &identity, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Performs parallel reduction within each matrix row over a given range of row indexes based on a condition (const version).
template<typename Matrix, typename IndexBegin, typename IndexEnd, typename Condition, typename Fetch, typename Reduction, typename Store>
Matrix::IndexType reduceRowsIf (Matrix &matrix, IndexBegin begin, IndexEnd end, Condition &&condition, Fetch &&fetch, Reduction &&reduction, Store &&store, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Performs parallel reduction within each matrix row over a given range of row indexes based on a condition with automatic identity deduction.
template<typename Matrix, typename IndexBegin, typename IndexEnd, typename Condition, typename Fetch, typename Reduction, typename Store, typename FetchValue>
Matrix::IndexType reduceRowsIf (Matrix &matrix, IndexBegin begin, IndexEnd end, Condition &&condition, Fetch &&fetch, Reduction &&reduction, Store &&store, const FetchValue &identity, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Performs parallel reduction within each matrix row over a given range of row indexes based on a condition.
template<typename Matrix, typename Array, typename Fetch, typename Reduction, typename Store, typename T = typename std::enable_if_t< IsArrayType< Array >::value >>
void reduceRowsWithArgument (const Matrix &matrix, const Array &rowIndexes, Fetch &&fetch, Reduction &&reduction, Store &&store, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Performs parallel reduction within matrix rows specified by a given set of row indexes while returning also the position of the element of interest with automatic identity deduction (const version).
template<typename Matrix, typename Array, typename Fetch, typename Reduction, typename Store, typename FetchValue, typename T = typename std::enable_if_t< IsArrayType< Array >::value >>
void reduceRowsWithArgument (const Matrix &matrix, const Array &rowIndexes, Fetch &&fetch, Reduction &&reduction, Store &&store, const FetchValue &identity, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Performs parallel reduction within matrix rows specified by a given set of row indexes while returning also the position of the element of interest (const version).
template<typename Matrix, typename IndexBegin, typename IndexEnd, typename Fetch, typename Reduction, typename Store, typename T = typename std::enable_if_t< std::is_integral_v< IndexBegin > && std::is_integral_v< IndexEnd > >>
void reduceRowsWithArgument (const Matrix &matrix, IndexBegin begin, IndexEnd end, Fetch &&fetch, Reduction &&reduction, Store &&store, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Performs parallel reduction within each matrix row over a given range of row indexes while returning also the position of the element of interest with automatic identity deduction (const version).
template<typename Matrix, typename IndexBegin, typename IndexEnd, typename Fetch, typename Reduction, typename Store, typename FetchValue, typename T = typename std::enable_if_t< std::is_integral_v< IndexBegin > && std::is_integral_v< IndexEnd > >>
void reduceRowsWithArgument (const Matrix &matrix, IndexBegin begin, IndexEnd end, Fetch &&fetch, Reduction &&reduction, Store &&store, const FetchValue &identity, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Performs parallel reduction within each matrix row over a given range of row indexes while returning also the position of the element of interest (const version).
template<typename Matrix, typename Array, typename Fetch, typename Reduction, typename Store, typename T = typename std::enable_if_t< IsArrayType< Array >::value >>
void reduceRowsWithArgument (Matrix &matrix, const Array &rowIndexes, Fetch &&fetch, Reduction &&reduction, Store &&store, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Performs parallel reduction within matrix rows specified by a given set of row indexes while returning also the position of the element of interest with automatic identity deduction.
template<typename Matrix, typename Array, typename Fetch, typename Reduction, typename Store, typename FetchValue, typename T = typename std::enable_if_t< IsArrayType< Array >::value >>
void reduceRowsWithArgument (Matrix &matrix, const Array &rowIndexes, Fetch &&fetch, Reduction &&reduction, Store &&store, const FetchValue &identity, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Performs parallel reduction within matrix rows specified by a given set of row indexes while returning also the position of the element of interest.
template<typename Matrix, typename IndexBegin, typename IndexEnd, typename Fetch, typename Reduction, typename Store, typename T = typename std::enable_if_t< std::is_integral_v< IndexBegin > && std::is_integral_v< IndexEnd > >>
void reduceRowsWithArgument (Matrix &matrix, IndexBegin begin, IndexEnd end, Fetch &&fetch, Reduction &&reduction, Store &&store, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Performs parallel reduction within each matrix row over a given range of row indexes while returning also the position of the element of interest with automatic identity deduction.
template<typename Matrix, typename IndexBegin, typename IndexEnd, typename Fetch, typename Reduction, typename Store, typename FetchValue, typename T = typename std::enable_if_t< std::is_integral_v< IndexBegin > && std::is_integral_v< IndexEnd > >>
void reduceRowsWithArgument (Matrix &matrix, IndexBegin begin, IndexEnd end, Fetch &&fetch, Reduction &&reduction, Store &&store, const FetchValue &identity, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Performs parallel reduction within each matrix row over a given range of row indexes while returning also the position of the element of interest.
template<typename Matrix, typename IndexBegin, typename IndexEnd, typename Condition, typename Fetch, typename Reduction, typename Store>
Matrix::IndexType reduceRowsWithArgumentIf (const Matrix &matrix, IndexBegin begin, IndexEnd end, Condition &&condition, Fetch &&fetch, Reduction &&reduction, Store &&store, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Performs parallel reduction within each matrix row over a given range of row indexes based on a condition while returning also the position of the element of interest with automatic identity deduction (const version).
template<typename Matrix, typename IndexBegin, typename IndexEnd, typename Condition, typename Fetch, typename Reduction, typename Store, typename FetchValue = decltype( std::declval< Fetch >()( 0, 0, std::declval< typename Matrix::RealType >() ) )>
Matrix::IndexType reduceRowsWithArgumentIf (const Matrix &matrix, IndexBegin begin, IndexEnd end, Condition &&condition, Fetch &&fetch, Reduction &&reduction, Store &&store, const FetchValue &identity, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Performs parallel reduction within each matrix row over a given range of row indexes based on a condition while returning also the position of the element of interest (const version).
template<typename Matrix, typename IndexBegin, typename IndexEnd, typename Condition, typename Fetch, typename Reduction, typename Store>
Matrix::IndexType reduceRowsWithArgumentIf (Matrix &matrix, IndexBegin begin, IndexEnd end, Condition &&condition, Fetch &&fetch, Reduction &&reduction, Store &&store, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Performs parallel reduction within each matrix row over a given range of row indexes based on a condition while returning also the position of the element of interest with automatic identity deduction.
template<typename Matrix, typename IndexBegin, typename IndexEnd, typename Condition, typename Fetch, typename Reduction, typename Store, typename FetchValue = decltype( std::declval< Fetch >()( 0, 0, std::declval< typename Matrix::RealType >() ) )>
Matrix::IndexType reduceRowsWithArgumentIf (Matrix &matrix, IndexBegin begin, IndexEnd end, Condition &&condition, Fetch &&fetch, Reduction &&reduction, Store &&store, const FetchValue &identity, Algorithms::Segments::LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Performs parallel reduction within each matrix row over a given range of row indexes based on a condition while returning also the position of the element of interest.
template<typename Array1, typename Array2, typename PermutationArray>
void reorderArray (const Array1 &src, Array2 &dest, const PermutationArray &perm)
template<typename Matrix1, typename Matrix2, typename PermutationArray>
void reorderSparseMatrix (const Matrix1 &matrix1, Matrix2 &matrix2, const PermutationArray &perm, const PermutationArray &iperm)
template<typename Real>
__cuda_callable__ Containers::StaticVector< 2, Real > solve (const StaticMatrix< Real, 2, 2 > &A, const Containers::StaticVector< 2, Real > &b)
template<typename Real>
__cuda_callable__ Containers::StaticVector< 3, Real > solve (const StaticMatrix< Real, 3, 3 > &A, const Containers::StaticVector< 3, Real > &b)
template<typename Real>
__cuda_callable__ Containers::StaticVector< 4, Real > solve (const StaticMatrix< Real, 4, 4 > &A, const Containers::StaticVector< 4, Real > &b)
template<typename Value, std::size_t Rows, std::size_t Columns, typename Permutation>
StaticMatrix< Value, Columns, Rows, Permutation > transpose (const StaticMatrix< Value, Rows, Columns, Permutation > &A)
template<typename Device, typename Real, typename Index>
SparseMatrixView< Real, Device, Index, GeneralMatrix, Algorithms::Segments::CSRView > wrapCSRMatrix (const Index &rows, const Index &columns, Index *rowPointers, Real *values, Index *columnIndexes)
 Function for wrapping of arrays defining CSR format into a sparse matrix view.
template<typename Device, typename Real, typename Index, ElementsOrganization Organization = Algorithms::Segments::DefaultElementsOrganization< Device >::getOrganization()>
DenseMatrixView< Real, Device, Index, Organization > wrapDenseMatrix (const Index &rows, const Index &columns, Real *values)
 Function for wrapping an array of values into a dense matrix view.
template<typename Device, ElementsOrganization Organization, typename Real, typename Index, int Alignment = 1>
auto wrapEllpackMatrix (const Index rows, const Index columns, const Index nonzerosPerRow, Real *values, Index *columnIndexes) -> decltype(EllpackMatrixWrapper< Device, Organization, Real, Index, Alignment >::wrap(rows, columns, nonzerosPerRow, values, columnIndexes))
 Function for wrapping of arrays defining Ellpack format into a sparse matrix view.
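The wrapping functions above take raw arrays that already describe a matrix in the given format. As a minimal sketch of what such arrays look like for CSR, the plain C++ below does not use TNL at all: csrRowSum and csrGetElement are hypothetical helper names, and the code assumes the usual CSR convention that rowPointers has rows + 1 entries and that row i occupies the index range [rowPointers[i], rowPointers[i+1]).

```cpp
#include <cassert>

// Example 3x4 matrix and its CSR arrays (same roles as the rowPointers,
// values and columnIndexes arguments of wrapCSRMatrix):
//
//   [ 1 0 2 0 ]      rowPointers   = { 0, 2, 2, 4 }
//   [ 0 0 0 0 ]      columnIndexes = { 0, 2, 1, 3 }
//   [ 0 3 0 4 ]      values        = { 1, 2, 3, 4 }

// Hypothetical helper, not part of TNL: sums the stored values of one row.
double csrRowSum( int row, const int* rowPointers, const double* values )
{
   assert( row >= 0 );
   double sum = 0.0;
   for( int i = rowPointers[ row ]; i < rowPointers[ row + 1 ]; i++ )
      sum += values[ i ];
   return sum;
}

// Hypothetical helper, not part of TNL: returns element (row, column),
// or zero when the element is not stored.
double csrGetElement( int row, int column, const int* rowPointers, const double* values, const int* columnIndexes )
{
   assert( row >= 0 && column >= 0 );
   for( int i = rowPointers[ row ]; i < rowPointers[ row + 1 ]; i++ )
      if( columnIndexes[ i ] == column )
         return values[ i ];
   return 0.0;
}
```

An empty row, such as row 1 in the example, shows up as two equal consecutive entries in rowPointers.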

Variables

template<typename T>
constexpr bool is_dense_matrix_v = is_dense_matrix< T >::value
template<typename T>
constexpr bool is_matrix_v = is_matrix< T >::value
template<typename T>
constexpr bool is_matrix_view_v = is_matrix_view< T >::value
template<typename Matrix>
constexpr bool is_sparse_csr_matrix_v = isSparseCSRMatrix< Matrix >::value
template<typename Matrix>
constexpr bool is_sparse_matrix_v = is_sparse_matrix< Matrix >::value
template<typename Index>
constexpr Index paddingIndex = static_cast< Index >( -1 )
 Padding index value.
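
The definition above yields -1 for signed index types and the largest representable value for unsigned ones. The following sketch restates the definition in a hypothetical namespace sketch (so it does not depend on TNL) and adds a hypothetical helper countUsedSlots. It assumes that unused slots of a column-index array are marked with the padding value, which is an illustration rather than a statement about any particular TNL format.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

namespace sketch {

// Local copy of the definition shown above.
template< typename Index >
constexpr Index paddingIndex = static_cast< Index >( -1 );

// Hypothetical helper: counts the slots that hold a real column index,
// assuming that padding slots carry paddingIndex< Index >.
template< typename Index >
int countUsedSlots( const Index* columnIndexes, int slots )
{
   int used = 0;
   for( int i = 0; i < slots; i++ )
      if( columnIndexes[ i ] != paddingIndex< Index > )
         used++;
   return used;
}

}  // namespace sketch

// Signed types give -1, unsigned types wrap around to their maximum.
static_assert( sketch::paddingIndex< int > == -1, "signed padding index is -1" );
static_assert( sketch::paddingIndex< std::uint32_t > == std::numeric_limits< std::uint32_t >::max(),
               "unsigned padding index is the maximum value" );
```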

Detailed Description

Namespace for matrix formats.

Enumeration Type Documentation

◆ MatrixElementsEncoding

Encoding of the matrix elements in initializer lists or STL maps.

Enumerator
Complete 

All elements of the matrix are provided.

SymmetricLower 

Only the lower part of the matrix is provided.

SymmetricUpper 

Only the upper part of the matrix is provided.

SymmetricMixed 

For each pair of non-zero elements a_ij and a_ji, at least one is provided. This is handy, for example, for adjacency matrices of undirected graphs.
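
To make the symmetric encodings concrete, here is a small sketch in plain C++ that expands a set of provided entries into the complete symmetric matrix. It does not use TNL: the Entries alias and the function expandSymmetric are hypothetical names, and the code only illustrates the idea that each provided a_ij also stands for a_ji.

```cpp
#include <cassert>
#include <map>
#include <utility>

// (row, column) -> value
using Entries = std::map< std::pair< int, int >, double >;

// Hypothetical helper, not part of TNL: builds the complete set of elements
// of a symmetric matrix from entries in which, for each pair a_ij / a_ji,
// at least one element is present.
Entries expandSymmetric( const Entries& provided )
{
   Entries complete;
   for( const auto& entry : provided ) {
      const int i = entry.first.first;
      const int j = entry.first.second;
      complete[ { i, j } ] = entry.second;
      complete[ { j, i } ] = entry.second;  // mirror across the diagonal
   }
   return complete;
}
```

The same mirroring covers the SymmetricLower and SymmetricUpper cases, where the provided entries happen to lie on one side of the diagonal only.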

Function Documentation

◆ compressSparseMatrix()

template<typename Matrix>
void TNL::Matrices::compressSparseMatrix ( Matrix & A)

Removes unnecessary zero elements from a sparse matrix.

This function is especially useful for removing explicitly stored zero elements, but it also removes unnecessary padding elements.

Template Parameters
Matrix is the matrix type.
Parameters
A is the matrix to be compressed.
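
What "removing explicitly stored zeros" means can be sketched on raw CSR arrays. The code below is plain C++ and is not how TNL implements compressSparseMatrix: the Csr struct and the function dropExplicitZeros are hypothetical, and the sketch builds a new matrix instead of modifying its argument.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal CSR container used only for this illustration.
struct Csr
{
   std::vector< int > rowPointers;  // rows + 1 entries
   std::vector< int > columnIndexes;
   std::vector< double > values;
};

// Hypothetical helper: returns a copy of the matrix in which elements that
// are stored but equal to zero have been dropped.
Csr dropExplicitZeros( const Csr& in )
{
   Csr out;
   out.rowPointers.push_back( 0 );
   for( std::size_t row = 0; row + 1 < in.rowPointers.size(); row++ ) {
      for( int i = in.rowPointers[ row ]; i < in.rowPointers[ row + 1 ]; i++ )
         if( in.values[ i ] != 0.0 ) {
            out.columnIndexes.push_back( in.columnIndexes[ i ] );
            out.values.push_back( in.values[ i ] );
         }
      // The row ends where the kept values currently end.
      out.rowPointers.push_back( static_cast< int >( out.values.size() ) );
   }
   return out;
}
```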

◆ copySparseToSparseMatrix()

template<typename TargetMatrix, typename SourceMatrix>
void TNL::Matrices::copySparseToSparseMatrix ( TargetMatrix & A,
const SourceMatrix & B )

Copies one sparse matrix to another sparse matrix.

If the source matrix is TNL::Matrices::GeneralMatrix and the target matrix is TNL::Matrices::SymmetricMatrix, the values of the source matrix are assumed to be symmetric. No check is performed; only the lower part of the matrix and its diagonal are copied.

Template Parameters
TargetMatrix is the target symmetric sparse matrix type.
SourceMatrix is the source general sparse matrix type.
Parameters
A is the target symmetric sparse matrix.
B is the source general sparse matrix.
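
The consequence of copying only the lower part and the diagonal without a symmetry check can be sketched as follows. This is plain C++ without TNL: the Entries alias and the function keepLowerPart are hypothetical and only mimic the described behaviour on a map of (row, column) pairs.

```cpp
#include <cassert>
#include <map>
#include <utility>

// (row, column) -> value
using Entries = std::map< std::pair< int, int >, double >;

// Hypothetical helper, not part of TNL: keeps the diagonal and the lower
// part of a general matrix. Elements above the diagonal are dropped and
// no symmetry check is performed.
Entries keepLowerPart( const Entries& general )
{
   Entries lower;
   for( const auto& entry : general )
      if( entry.first.second <= entry.first.first )  // column <= row
         lower.insert( entry );
   return lower;
}
```

If the source is not actually symmetric, the values above the diagonal are lost silently, so under this reading the symmetry of the source has to be ensured by the caller.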

◆ forAllElements() [1/2]

template<typename Matrix, typename Function>
void TNL::Matrices::forAllElements ( const Matrix & matrix,
Function && function,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Iterates in parallel over all elements of all rows of a constant matrix and applies the specified lambda function.

See also: Overview of Matrix Traversal Functions

Template Parameters
Matrix: The type of the matrix.
Function: The type of the lambda function to be applied to each element.
Parameters
matrix: The matrix whose elements will be processed using the lambda function.
function: Lambda function to be applied to each element. See Traversal Function (Const Matrix).
launchConfig: The configuration of the launch - see TNL::Algorithms::Segments::LaunchConfiguration.
Example
#include <iostream>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Matrices/traverse.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
forElementsExample()
{
   /***
    * Create a 5x5 dense matrix.
    */
   TNL::Matrices::DenseMatrix< double, Device > denseMatrix( 5, 5 );
   denseMatrix.setValue( 0.0 );

   /***
    * Use forElements to set lower triangular matrix elements.
    */
   auto setLowerTriangular = [] __cuda_callable__( int rowIdx, int localIdx, int columnIdx, double& value )
   {
      if( columnIdx <= rowIdx )
         value = rowIdx + columnIdx;
   };

   TNL::Matrices::forElements( denseMatrix, 0, denseMatrix.getRows(), setLowerTriangular );
   std::cout << "Dense matrix with lower triangular elements set:\n";
   std::cout << denseMatrix << '\n';

   /***
    * Create a 5x5 sparse matrix.
    */
   TNL::Matrices::SparseMatrix< double, Device > sparseMatrix( { 1, 2, 3, 4, 5 }, 5 );

   /***
    * Use forElements to initialize sparse matrix elements.
    */
   auto setSparse = [] __cuda_callable__( int rowIdx, int localIdx, int& columnIdx, double& value )
   {
      if( rowIdx >= localIdx ) {
         columnIdx = localIdx;
         value = rowIdx + localIdx + 1;
      }
   };

   TNL::Matrices::forElements( sparseMatrix, 0, sparseMatrix.getRows(), setSparse );
   std::cout << "Sparse matrix initialized:\n";
   std::cout << sparseMatrix << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running on host:\n";
   forElementsExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << "Running on CUDA device:\n";
   forElementsExample< TNL::Devices::Cuda >();
#endif
}
Output
Running on host:
Dense matrix with lower triangular elements set:
Row: 0 -> 0:0 1:0 2:0 3:0 4:0
Row: 1 -> 0:1 1:2 2:0 3:0 4:0
Row: 2 -> 0:2 1:3 2:4 3:0 4:0
Row: 3 -> 0:3 1:4 2:5 3:6 4:0
Row: 4 -> 0:4 1:5 2:6 3:7 4:8
Sparse matrix initialized:
Row: 0 -> 0:1
Row: 1 -> 0:2 1:3
Row: 2 -> 0:3 1:4 2:5
Row: 3 -> 0:4 1:5 2:6 3:7
Row: 4 -> 0:5 1:6 2:7 3:8 4:9
Running on CUDA device:
Dense matrix with lower triangular elements set:
Row: 0 -> 0:0 1:0 2:0 3:0 4:0
Row: 1 -> 0:1 1:2 2:0 3:0 4:0
Row: 2 -> 0:2 1:3 2:4 3:0 4:0
Row: 3 -> 0:3 1:4 2:5 3:6 4:0
Row: 4 -> 0:4 1:5 2:6 3:7 4:8
Sparse matrix initialized:
Row: 0 -> 0:1
Row: 1 -> 0:2 1:3
Row: 2 -> 0:3 1:4 2:5
Row: 3 -> 0:4 1:5 2:6 3:7
Row: 4 -> 0:5 1:6 2:7 3:8 4:9

◆ forAllElements() [2/2]

template<typename Matrix, typename Function>
void TNL::Matrices::forAllElements ( Matrix & matrix,
Function && function,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Iterates in parallel over all elements of all matrix rows and applies the specified lambda function.

See also: Overview of Matrix Traversal Functions

Template Parameters
Matrix: The type of the matrix.
Function: The type of the lambda function to be applied to each element.
Parameters
matrix: The matrix whose elements will be processed using the lambda function.
function: Lambda function to be applied to each element. See Traversal Function (Non-Const Matrix).
launchConfig: The configuration of the launch - see TNL::Algorithms::Segments::LaunchConfiguration.
Example
#include <iostream>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Matrices/traverse.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
forElementsExample()
{
   /***
    * Create a 5x5 dense matrix.
    */
   TNL::Matrices::DenseMatrix< double, Device > denseMatrix( 5, 5 );
   denseMatrix.setValue( 0.0 );

   /***
    * Use forElements to set lower triangular matrix elements.
    */
   auto setLowerTriangular = [] __cuda_callable__( int rowIdx, int localIdx, int columnIdx, double& value )
   {
      if( columnIdx <= rowIdx )
         value = rowIdx + columnIdx;
   };

   TNL::Matrices::forElements( denseMatrix, 0, denseMatrix.getRows(), setLowerTriangular );
   std::cout << "Dense matrix with lower triangular elements set:\n";
   std::cout << denseMatrix << '\n';

   /***
    * Create a 5x5 sparse matrix.
    */
   TNL::Matrices::SparseMatrix< double, Device > sparseMatrix( { 1, 2, 3, 4, 5 }, 5 );

   /***
    * Use forElements to initialize sparse matrix elements.
    */
   auto setSparse = [] __cuda_callable__( int rowIdx, int localIdx, int& columnIdx, double& value )
   {
      if( rowIdx >= localIdx ) {
         columnIdx = localIdx;
         value = rowIdx + localIdx + 1;
      }
   };

   TNL::Matrices::forElements( sparseMatrix, 0, sparseMatrix.getRows(), setSparse );
   std::cout << "Sparse matrix initialized:\n";
   std::cout << sparseMatrix << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running on host:\n";
   forElementsExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << "Running on CUDA device:\n";
   forElementsExample< TNL::Devices::Cuda >();
#endif
}
Output
Running on host:
Dense matrix with lower triangular elements set:
Row: 0 -> 0:0 1:0 2:0 3:0 4:0
Row: 1 -> 0:1 1:2 2:0 3:0 4:0
Row: 2 -> 0:2 1:3 2:4 3:0 4:0
Row: 3 -> 0:3 1:4 2:5 3:6 4:0
Row: 4 -> 0:4 1:5 2:6 3:7 4:8
Sparse matrix initialized:
Row: 0 -> 0:1
Row: 1 -> 0:2 1:3
Row: 2 -> 0:3 1:4 2:5
Row: 3 -> 0:4 1:5 2:6 3:7
Row: 4 -> 0:5 1:6 2:7 3:8 4:9
Running on CUDA device:
Dense matrix with lower triangular elements set:
Row: 0 -> 0:0 1:0 2:0 3:0 4:0
Row: 1 -> 0:1 1:2 2:0 3:0 4:0
Row: 2 -> 0:2 1:3 2:4 3:0 4:0
Row: 3 -> 0:3 1:4 2:5 3:6 4:0
Row: 4 -> 0:4 1:5 2:6 3:7 4:8
Sparse matrix initialized:
Row: 0 -> 0:1
Row: 1 -> 0:2 1:3
Row: 2 -> 0:3 1:4 2:5
Row: 3 -> 0:4 1:5 2:6 3:7
Row: 4 -> 0:5 1:6 2:7 3:8 4:9

◆ forAllElementsIf() [1/2]

template<typename Matrix, typename Condition, typename Function>
void TNL::Matrices::forAllElementsIf ( const Matrix & matrix,
Condition && condition,
Function && function,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Iterates in parallel over all elements of all matrix rows based on a condition.

See also: Overview of Matrix Traversal Functions

This function is for constant matrices.

For each matrix row, a condition lambda function is evaluated based on the row index. If the condition lambda function returns true, all elements of the row are traversed, and the specified lambda function is applied to each element. If the condition lambda function returns false, the row is skipped.

Template Parameters
Matrix: The type of the matrix.
Condition: The type of the condition lambda function.
Function: The type of the lambda function to be applied to each element.
Parameters
matrix: The matrix whose elements will be processed using the lambda function.
condition: Lambda function that decides, based on the row index, whether a row is processed. See Condition Lambda.
function: Lambda function to be applied to each element. See Traversal Function (Const Matrix).
launchConfig: The configuration of the launch; see TNL::Algorithms::Segments::LaunchConfiguration.
Example
#include <iostream>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Matrices/traverse.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
forElementsIfExample()
{
   /***
    * Create a 5x5 dense matrix.
    */
   TNL::Matrices::DenseMatrix< double, Device > denseMatrix( 5, 5 );
   denseMatrix.setValue( 0.0 );

   /***
    * Use forElementsIf to set elements only in even rows.
    */
   auto evenRowCondition = [] __cuda_callable__( int rowIdx ) -> bool
   {
      return rowIdx % 2 == 0;
   };

   auto setElements = [] __cuda_callable__( int rowIdx, int localIdx, int columnIdx, double& value )
   {
      if( columnIdx <= rowIdx )
         value = rowIdx + columnIdx;
   };

   TNL::Matrices::forElementsIf( denseMatrix, 0, denseMatrix.getRows(), evenRowCondition, setElements );
   std::cout << "Dense matrix with elements set only in even rows:\n";
   std::cout << denseMatrix << '\n';

   /***
    * Create a 5x5 sparse matrix.
    */
   TNL::Matrices::SparseMatrix< double, Device > sparseMatrix( { 1, 2, 3, 4, 5 }, 5 );

   /***
    * Use forElementsIf to set elements only in rows where rowIdx > 1.
    */
   auto rowCondition = [] __cuda_callable__( int rowIdx ) -> bool
   {
      return rowIdx > 1;
   };

   auto setSparseElements = [] __cuda_callable__( int rowIdx, int localIdx, int& columnIdx, double& value )
   {
      if( rowIdx >= localIdx ) {
         columnIdx = localIdx;
         value = rowIdx + localIdx + 1;
      }
   };

   TNL::Matrices::forElementsIf( sparseMatrix, 0, sparseMatrix.getRows(), rowCondition, setSparseElements );
   std::cout << "Sparse matrix with elements set only in rows where rowIdx > 1:\n";
   std::cout << sparseMatrix << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running on host:\n";
   forElementsIfExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << "Running on CUDA device:\n";
   forElementsIfExample< TNL::Devices::Cuda >();
#endif
}
Output
Running on host:
Dense matrix with elements set only in even rows:
Row: 0 -> 0:0 1:0 2:0 3:0 4:0
Row: 1 -> 0:0 1:0 2:0 3:0 4:0
Row: 2 -> 0:2 1:3 2:4 3:0 4:0
Row: 3 -> 0:0 1:0 2:0 3:0 4:0
Row: 4 -> 0:4 1:5 2:6 3:7 4:8
Sparse matrix with elements set only in rows where rowIdx > 1:
Row: 0 ->
Row: 1 ->
Row: 2 -> 0:3 1:4 2:5
Row: 3 -> 0:4 1:5 2:6 3:7
Row: 4 -> 0:5 1:6 2:7 3:8 4:9
Running on CUDA device:
Dense matrix with elements set only in even rows:
Row: 0 -> 0:0 1:0 2:0 3:0 4:0
Row: 1 -> 0:0 1:0 2:0 3:0 4:0
Row: 2 -> 0:2 1:3 2:4 3:0 4:0
Row: 3 -> 0:0 1:0 2:0 3:0 4:0
Row: 4 -> 0:4 1:5 2:6 3:7 4:8
Sparse matrix with elements set only in rows where rowIdx > 1:
Row: 0 ->
Row: 1 ->
Row: 2 -> 0:3 1:4 2:5
Row: 3 -> 0:4 1:5 2:6 3:7
Row: 4 -> 0:5 1:6 2:7 3:8 4:9
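
Because this overload takes a constant matrix, the element lambda can only read values. The following is a minimal serial sketch in standard C++ of such a read-only conditional traversal; it is not TNL code. `SketchDenseMatrix`, `sketchForAllElementsIf` and `sumOfEvenRows` are hypothetical names, and passing the column index as the local index for a dense row is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical row-major dense matrix used only for this sketch.
struct SketchDenseMatrix
{
   int rows;
   int columns;
   std::vector< double > values;

   SketchDenseMatrix( int rows_, int columns_ )
   : rows( rows_ ), columns( columns_ ),
     values( static_cast< std::size_t >( rows_ ) * static_cast< std::size_t >( columns_ ), 0.0 )
   {
      assert( rows_ >= 0 && columns_ >= 0 );
   }
};

// Serial model: the condition is evaluated once per row; rows for which it
// returns false are skipped entirely. The matrix is const, so the element
// function only sees const references to the values.
template< typename Condition, typename Function >
void
sketchForAllElementsIf( const SketchDenseMatrix& matrix, Condition&& condition, Function&& function )
{
   for( int rowIdx = 0; rowIdx < matrix.rows; rowIdx++ ) {
      if( ! condition( rowIdx ) )
         continue;
      for( int columnIdx = 0; columnIdx < matrix.columns; columnIdx++ )
         function( rowIdx, columnIdx, columnIdx, matrix.values[ rowIdx * matrix.columns + columnIdx ] );
   }
}

// Sums the values stored in even rows without modifying the matrix.
double
sumOfEvenRows( const SketchDenseMatrix& matrix )
{
   double sum = 0.0;
   sketchForAllElementsIf( matrix,
      []( int rowIdx ) { return rowIdx % 2 == 0; },
      [ &sum ]( int, int, int, const double& value ) { sum += value; } );
   return sum;
}
```

The accumulation into `sum` is safe only because this sketch is serial. In a parallel traversal, a shared accumulator like this would need an atomic update or a dedicated reduction.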

◆ forAllElementsIf() [2/2]

template<typename Matrix, typename Condition, typename Function>
void TNL::Matrices::forAllElementsIf ( Matrix & matrix,
Condition && condition,
Function && function,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Iterates in parallel over all elements of all matrix rows based on a condition.

See also: Overview of Matrix Traversal Functions

For each matrix row, a condition lambda function is evaluated based on the row index. If the condition lambda function returns true, all elements of the row are traversed, and the specified lambda function is applied to each element. If the condition lambda function returns false, the row is skipped.

Template Parameters
Matrix: The type of the matrix.
Condition: The type of the condition lambda function.
Function: The type of the lambda function to be applied to each element.
Parameters
matrix: The matrix whose elements will be processed using the lambda function.
condition: Lambda function that decides, based on the row index, whether a row is processed. See Condition Lambda.
function: Lambda function to be applied to each element. See Traversal Function (Non-Const Matrix).
launchConfig: The configuration of the launch; see TNL::Algorithms::Segments::LaunchConfiguration.
Example
#include <iostream>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Matrices/traverse.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
forElementsIfExample()
{
   /***
    * Create a 5x5 dense matrix.
    */
   TNL::Matrices::DenseMatrix< double, Device > denseMatrix( 5, 5 );
   denseMatrix.setValue( 0.0 );

   /***
    * Use forElementsIf to set elements only in even rows.
    */
   auto evenRowCondition = [] __cuda_callable__( int rowIdx ) -> bool
   {
      return rowIdx % 2 == 0;
   };

   auto setElements = [] __cuda_callable__( int rowIdx, int localIdx, int columnIdx, double& value )
   {
      if( columnIdx <= rowIdx )
         value = rowIdx + columnIdx;
   };

   TNL::Matrices::forElementsIf( denseMatrix, 0, denseMatrix.getRows(), evenRowCondition, setElements );
   std::cout << "Dense matrix with elements set only in even rows:\n";
   std::cout << denseMatrix << '\n';

   /***
    * Create a 5x5 sparse matrix.
    */
   TNL::Matrices::SparseMatrix< double, Device > sparseMatrix( { 1, 2, 3, 4, 5 }, 5 );

   /***
    * Use forElementsIf to set elements only in rows where rowIdx > 1.
    */
   auto rowCondition = [] __cuda_callable__( int rowIdx ) -> bool
   {
      return rowIdx > 1;
   };

   auto setSparseElements = [] __cuda_callable__( int rowIdx, int localIdx, int& columnIdx, double& value )
   {
      if( rowIdx >= localIdx ) {
         columnIdx = localIdx;
         value = rowIdx + localIdx + 1;
      }
   };

   TNL::Matrices::forElementsIf( sparseMatrix, 0, sparseMatrix.getRows(), rowCondition, setSparseElements );
   std::cout << "Sparse matrix with elements set only in rows where rowIdx > 1:\n";
   std::cout << sparseMatrix << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running on host:\n";
   forElementsIfExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << "Running on CUDA device:\n";
   forElementsIfExample< TNL::Devices::Cuda >();
#endif
}
Output
Running on host:
Dense matrix with elements set only in even rows:
Row: 0 -> 0:0 1:0 2:0 3:0 4:0
Row: 1 -> 0:0 1:0 2:0 3:0 4:0
Row: 2 -> 0:2 1:3 2:4 3:0 4:0
Row: 3 -> 0:0 1:0 2:0 3:0 4:0
Row: 4 -> 0:4 1:5 2:6 3:7 4:8
Sparse matrix with elements set only in rows where rowIdx > 1:
Row: 0 ->
Row: 1 ->
Row: 2 -> 0:3 1:4 2:5
Row: 3 -> 0:4 1:5 2:6 3:7
Row: 4 -> 0:5 1:6 2:7 3:8 4:9
Running on CUDA device:
Dense matrix with elements set only in even rows:
Row: 0 -> 0:0 1:0 2:0 3:0 4:0
Row: 1 -> 0:0 1:0 2:0 3:0 4:0
Row: 2 -> 0:2 1:3 2:4 3:0 4:0
Row: 3 -> 0:0 1:0 2:0 3:0 4:0
Row: 4 -> 0:4 1:5 2:6 3:7 4:8
Sparse matrix with elements set only in rows where rowIdx > 1:
Row: 0 ->
Row: 1 ->
Row: 2 -> 0:3 1:4 2:5
Row: 3 -> 0:4 1:5 2:6 3:7
Row: 4 -> 0:5 1:6 2:7 3:8 4:9
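
The example above calls forElementsIf over the range 0 to getRows(), which suggests that the "all rows" variant behaves like the ranged variant applied to every row. The following serial sketch in standard C++ is written under that assumption and is not TNL's implementation; `SketchMatrix`, `sketchForElementsIf`, `sketchForAllElementsIf` and `evenRowsLowerTriangle` are hypothetical names.

```cpp
#include <cassert>
#include <vector>

using SketchMatrix = std::vector< std::vector< double > >;

// Serial model of the ranged variant: rows in [ begin, end ) are visited,
// and a row's elements are traversed only if the condition accepts its index.
template< typename Condition, typename Function >
void
sketchForElementsIf( SketchMatrix& matrix, int begin, int end, Condition&& condition, Function&& function )
{
   assert( 0 <= begin && begin <= end && end <= static_cast< int >( matrix.size() ) );
   for( int rowIdx = begin; rowIdx < end; rowIdx++ )
      if( condition( rowIdx ) )
         for( int columnIdx = 0; columnIdx < static_cast< int >( matrix[ rowIdx ].size() ); columnIdx++ )
            function( rowIdx, columnIdx, columnIdx, matrix[ rowIdx ][ columnIdx ] );
}

// Assumed relationship: "all rows" is the ranged variant over [ 0, rows ).
template< typename Condition, typename Function >
void
sketchForAllElementsIf( SketchMatrix& matrix, Condition&& condition, Function&& function )
{
   sketchForElementsIf( matrix, 0, static_cast< int >( matrix.size() ), condition, function );
}

// Repeats the dense part of the example: the lower triangle of even rows is set.
SketchMatrix
evenRowsLowerTriangle()
{
   SketchMatrix matrix( 5, std::vector< double >( 5, 0.0 ) );
   sketchForAllElementsIf( matrix,
      []( int rowIdx ) { return rowIdx % 2 == 0; },
      []( int rowIdx, int, int columnIdx, double& value )
      {
         if( columnIdx <= rowIdx )
            value = rowIdx + columnIdx;
      } );
   return matrix;
}
```

The result of `evenRowsLowerTriangle()` matches the dense output listed above: rows 1 and 3 stay zero, and row 2 holds 2, 3, 4, 0, 0.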

◆ forAllRows() [1/2]

template<typename Matrix, typename Function>
void TNL::Matrices::forAllRows ( const Matrix & matrix,
Function && function,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Iterates in parallel over all matrix rows and applies the given lambda function to each row. This function is for constant matrices.

See also: Overview of Matrix Traversal Functions

Template Parameters
Matrix: The type of the matrix.
Function: The type of the lambda function to be executed on each row.
Parameters
matrix: The matrix on which the lambda function will be applied.
function: Lambda function to be applied to each row. See Row Traversal Function (Const Matrix).
launchConfig: The configuration of the launch; see TNL::Algorithms::Segments::LaunchConfiguration.
Example
#include <iostream>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Matrices/traverse.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
forRowsExample()
{
   /***
    * Create a 5x5 dense matrix.
    */
   TNL::Matrices::DenseMatrix< double, Device > denseMatrix( 5, 5 );
   denseMatrix.setValue( 0.0 );

   /***
    * Use forRows to process rows 1 to 3, i.e. the half-open interval [1, 4).
    */
   using DenseRowView = typename TNL::Matrices::DenseMatrix< double, Device >::RowView;

   auto processDenseRow = [] __cuda_callable__( DenseRowView & row )
   {
      const int rowIdx = row.getRowIndex();
      for( int i = 0; i < row.getSize(); i++ )
         if( i <= rowIdx )
            row.setValue( i, rowIdx + i );
   };

   TNL::Matrices::forRows( denseMatrix, 1, 4, processDenseRow );
   std::cout << "Dense matrix with rows 1-3 processed:\n";
   std::cout << denseMatrix << '\n';

   /***
    * Create a 5x5 sparse matrix.
    */
   TNL::Matrices::SparseMatrix< double, Device > sparseMatrix( { 1, 3, 3, 3, 1 }, 5 );

   /***
    * Use forRows to set up a tridiagonal structure.
    */
   using SparseRowView = typename TNL::Matrices::SparseMatrix< double, Device >::RowView;

   auto processSparseRow = [] __cuda_callable__( SparseRowView & row )
   {
      const int rowIdx = row.getRowIndex();
      const int size = 5;

      if( rowIdx == 0 )
         row.setElement( 0, rowIdx, 2.0 );  // diagonal element
      else if( rowIdx == size - 1 )
         row.setElement( 0, rowIdx, 2.0 );  // diagonal element
      else {
         row.setElement( 0, rowIdx - 1, 1.0 );  // below diagonal
         row.setElement( 1, rowIdx, 2.0 );      // diagonal
         row.setElement( 2, rowIdx + 1, 1.0 );  // above diagonal
      }
   };

   TNL::Matrices::forRows( sparseMatrix, 0, 5, processSparseRow );
   std::cout << "Sparse tridiagonal matrix:\n";
   std::cout << sparseMatrix << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running on host:\n";
   forRowsExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << "Running on CUDA device:\n";
   forRowsExample< TNL::Devices::Cuda >();
#endif
}
Output
Running on host:
Dense matrix with rows 1-3 processed:
Row: 0 -> 0:0 1:0 2:0 3:0 4:0
Row: 1 -> 0:1 1:2 2:0 3:0 4:0
Row: 2 -> 0:2 1:3 2:4 3:0 4:0
Row: 3 -> 0:3 1:4 2:5 3:6 4:0
Row: 4 -> 0:0 1:0 2:0 3:0 4:0
Sparse tridiagonal matrix:
Row: 0 -> 0:2
Row: 1 -> 0:1 1:2 2:1
Row: 2 -> 1:1 2:2 3:1
Row: 3 -> 2:1 3:2 4:1
Row: 4 -> 4:2
Running on CUDA device:
Dense matrix with rows 1-3 processed:
Row: 0 -> 0:0 1:0 2:0 3:0 4:0
Row: 1 -> 0:1 1:2 2:0 3:0 4:0
Row: 2 -> 0:2 1:3 2:4 3:0 4:0
Row: 3 -> 0:3 1:4 2:5 3:6 4:0
Row: 4 -> 0:0 1:0 2:0 3:0 4:0
Sparse tridiagonal matrix:
Row: 0 -> 0:2
Row: 1 -> 0:1 1:2 2:1
Row: 2 -> 1:1 2:2 3:1
Row: 3 -> 2:1 3:2 4:1
Row: 4 -> 4:2
#include <iostream>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Matrices/traverse.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
forRowsExample2()
{
   /***
    * Create a sparse matrix and set up a tridiagonal structure.
    */
   const int size = 5;
   TNL::Matrices::SparseMatrix< double, Device > matrix( { 1, 3, 3, 3, 1 }, size );

   using RowView = typename TNL::Matrices::SparseMatrix< double, Device >::RowView;

   auto setupRow = [] __cuda_callable__( RowView & row )
   {
      const int rowIdx = row.getRowIndex();
      const int size = 5;

      if( rowIdx == 0 )
         row.setElement( 0, rowIdx, 2.0 );
      else if( rowIdx == size - 1 )
         row.setElement( 0, rowIdx, 2.0 );
      else {
         row.setElement( 0, rowIdx - 1, 1.0 );
         row.setElement( 1, rowIdx, 2.0 );
         row.setElement( 2, rowIdx + 1, 1.0 );
      }
   };

   TNL::Matrices::forRows( matrix, 0, size, setupRow );
   std::cout << "Initial tridiagonal matrix:\n";
   std::cout << matrix << '\n';

   /***
    * Normalize each row by dividing by the sum of its elements.
    */
   auto normalizeRow = [] __cuda_callable__( RowView & row )
   {
      double sum = 0.0;
      for( auto element : row )
         sum += element.value();

      for( auto element : row )
         element.value() /= sum;
   };

   TNL::Matrices::forRows( matrix, 0, size, normalizeRow );
   std::cout << "Row-normalized matrix:\n";
   std::cout << matrix << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running on host:\n";
   forRowsExample2< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << "Running on CUDA device:\n";
   forRowsExample2< TNL::Devices::Cuda >();
#endif
}
Output
Running on host:
Initial tridiagonal matrix:
Row: 0 -> 0:2
Row: 1 -> 0:1 1:2 2:1
Row: 2 -> 1:1 2:2 3:1
Row: 3 -> 2:1 3:2 4:1
Row: 4 -> 4:2
Row-normalized matrix:
Row: 0 -> 0:1
Row: 1 -> 0:0.25 1:0.5 2:0.25
Row: 2 -> 1:0.25 2:0.5 3:0.25
Row: 3 -> 2:0.25 3:0.5 4:0.25
Row: 4 -> 4:1
Running on CUDA device:
Initial tridiagonal matrix:
Row: 0 -> 0:2
Row: 1 -> 0:1 1:2 2:1
Row: 2 -> 1:1 2:2 3:1
Row: 3 -> 2:1 3:2 4:1
Row: 4 -> 4:2
Row-normalized matrix:
Row: 0 -> 0:1
Row: 1 -> 0:0.25 1:0.5 2:0.25
Row: 2 -> 1:0.25 2:0.5 3:0.25
Row: 3 -> 2:0.25 3:0.5 4:0.25
Row: 4 -> 4:1
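
The row-normalization step of the second example can be modelled with a small row view that supports a range-based for loop. The following is a serial sketch in standard C++ and not TNL code: `SketchElement`, `SketchRow`, `SketchRowView`, `sketchForAllRows` and `normalizeRows` are hypothetical names. Unlike the example, which iterates over accessor objects by value, this sketch iterates over stored elements by reference.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stored element: a column index and a value.
struct SketchElement
{
   int columnIdx;
   double value;
};

using SketchRow = std::vector< SketchElement >;

// Hypothetical row view: knows its row index and exposes begin()/end()
// so that it can be used in a range-based for loop.
struct SketchRowView
{
   int rowIdx;
   SketchRow& elements;

   int getRowIndex() const { return rowIdx; }
   SketchRow::iterator begin() { return elements.begin(); }
   SketchRow::iterator end() { return elements.end(); }
};

// Serial model: the function is called once per row with a view of that row.
template< typename Function >
void
sketchForAllRows( std::vector< SketchRow >& matrix, Function&& function )
{
   for( int rowIdx = 0; rowIdx < static_cast< int >( matrix.size() ); rowIdx++ ) {
      SketchRowView row{ rowIdx, matrix[ rowIdx ] };
      function( row );
   }
}

// Divides every element of a row by the sum of that row's elements.
void
normalizeRows( std::vector< SketchRow >& matrix )
{
   sketchForAllRows( matrix,
      []( SketchRowView& row )
      {
         double sum = 0.0;
         for( const SketchElement& element : row )
            sum += element.value;
         assert( sum != 0.0 );
         for( SketchElement& element : row )
            element.value /= sum;
      } );
}
```

Each row is handled independently of the others, which is what makes a per-row operation like this a natural fit for a parallel row traversal.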

◆ forAllRows() [2/2]

template<typename Matrix, typename Function>
void TNL::Matrices::forAllRows ( Matrix & matrix,
Function && function,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Iterates in parallel over all matrix rows and applies the given lambda function to each row.

See also: Overview of Matrix Traversal Functions

Template Parameters
Matrix: The type of the matrix.
Function: The type of the lambda function to be executed on each row.
Parameters
matrix: The matrix on which the lambda function will be applied.
function: Lambda function to be applied to each row. See Row Traversal Function (Non-Const Matrix).
launchConfig: The configuration of the launch; see TNL::Algorithms::Segments::LaunchConfiguration.
Example
#include <iostream>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Matrices/traverse.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
forRowsExample()
{
   /***
    * Create a 5x5 dense matrix.
    */
   TNL::Matrices::DenseMatrix< double, Device > denseMatrix( 5, 5 );
   denseMatrix.setValue( 0.0 );

   /***
    * Use forRows to process rows 1 to 3, i.e. the half-open interval [1, 4).
    */
   using DenseRowView = typename TNL::Matrices::DenseMatrix< double, Device >::RowView;

   auto processDenseRow = [] __cuda_callable__( DenseRowView & row )
   {
      const int rowIdx = row.getRowIndex();
      for( int i = 0; i < row.getSize(); i++ )
         if( i <= rowIdx )
            row.setValue( i, rowIdx + i );
   };

   TNL::Matrices::forRows( denseMatrix, 1, 4, processDenseRow );
   std::cout << "Dense matrix with rows 1-3 processed:\n";
   std::cout << denseMatrix << '\n';

   /***
    * Create a 5x5 sparse matrix.
    */
   TNL::Matrices::SparseMatrix< double, Device > sparseMatrix( { 1, 3, 3, 3, 1 }, 5 );

   /***
    * Use forRows to set up a tridiagonal structure.
    */
   using SparseRowView = typename TNL::Matrices::SparseMatrix< double, Device >::RowView;

   auto processSparseRow = [] __cuda_callable__( SparseRowView & row )
   {
      const int rowIdx = row.getRowIndex();
      const int size = 5;

      if( rowIdx == 0 )
         row.setElement( 0, rowIdx, 2.0 );  // diagonal element
      else if( rowIdx == size - 1 )
         row.setElement( 0, rowIdx, 2.0 );  // diagonal element
      else {
         row.setElement( 0, rowIdx - 1, 1.0 );  // below diagonal
         row.setElement( 1, rowIdx, 2.0 );      // diagonal
         row.setElement( 2, rowIdx + 1, 1.0 );  // above diagonal
      }
   };

   TNL::Matrices::forRows( sparseMatrix, 0, 5, processSparseRow );
   std::cout << "Sparse tridiagonal matrix:\n";
   std::cout << sparseMatrix << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running on host:\n";
   forRowsExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << "Running on CUDA device:\n";
   forRowsExample< TNL::Devices::Cuda >();
#endif
}
Output
Running on host:
Dense matrix with rows 1-3 processed:
Row: 0 -> 0:0 1:0 2:0 3:0 4:0
Row: 1 -> 0:1 1:2 2:0 3:0 4:0
Row: 2 -> 0:2 1:3 2:4 3:0 4:0
Row: 3 -> 0:3 1:4 2:5 3:6 4:0
Row: 4 -> 0:0 1:0 2:0 3:0 4:0
Sparse tridiagonal matrix:
Row: 0 -> 0:2
Row: 1 -> 0:1 1:2 2:1
Row: 2 -> 1:1 2:2 3:1
Row: 3 -> 2:1 3:2 4:1
Row: 4 -> 4:2
Running on CUDA device:
Dense matrix with rows 1-3 processed:
Row: 0 -> 0:0 1:0 2:0 3:0 4:0
Row: 1 -> 0:1 1:2 2:0 3:0 4:0
Row: 2 -> 0:2 1:3 2:4 3:0 4:0
Row: 3 -> 0:3 1:4 2:5 3:6 4:0
Row: 4 -> 0:0 1:0 2:0 3:0 4:0
Sparse tridiagonal matrix:
Row: 0 -> 0:2
Row: 1 -> 0:1 1:2 2:1
Row: 2 -> 1:1 2:2 3:1
Row: 3 -> 2:1 3:2 4:1
Row: 4 -> 4:2
#include <iostream>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Matrices/traverse.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
forRowsExample2()
{
   /***
    * Create a sparse matrix and set up a tridiagonal structure.
    */
   const int size = 5;
   TNL::Matrices::SparseMatrix< double, Device > matrix( { 1, 3, 3, 3, 1 }, size );

   using RowView = typename TNL::Matrices::SparseMatrix< double, Device >::RowView;

   auto setupRow = [] __cuda_callable__( RowView & row )
   {
      const int rowIdx = row.getRowIndex();
      const int size = 5;

      if( rowIdx == 0 )
         row.setElement( 0, rowIdx, 2.0 );
      else if( rowIdx == size - 1 )
         row.setElement( 0, rowIdx, 2.0 );
      else {
         row.setElement( 0, rowIdx - 1, 1.0 );
         row.setElement( 1, rowIdx, 2.0 );
         row.setElement( 2, rowIdx + 1, 1.0 );
      }
   };

   TNL::Matrices::forRows( matrix, 0, size, setupRow );
   std::cout << "Initial tridiagonal matrix:\n";
   std::cout << matrix << '\n';

   /***
    * Normalize each row by dividing by the sum of its elements.
    */
   auto normalizeRow = [] __cuda_callable__( RowView & row )
   {
      double sum = 0.0;
      for( auto element : row )
         sum += element.value();

      for( auto element : row )
         element.value() /= sum;
   };

   TNL::Matrices::forRows( matrix, 0, size, normalizeRow );
   std::cout << "Row-normalized matrix:\n";
   std::cout << matrix << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running on host:\n";
   forRowsExample2< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << "Running on CUDA device:\n";
   forRowsExample2< TNL::Devices::Cuda >();
#endif
}
Output
Running on host:
Initial tridiagonal matrix:
Row: 0 -> 0:2
Row: 1 -> 0:1 1:2 2:1
Row: 2 -> 1:1 2:2 3:1
Row: 3 -> 2:1 3:2 4:1
Row: 4 -> 4:2
Row-normalized matrix:
Row: 0 -> 0:1
Row: 1 -> 0:0.25 1:0.5 2:0.25
Row: 2 -> 1:0.25 2:0.5 3:0.25
Row: 3 -> 2:0.25 3:0.5 4:0.25
Row: 4 -> 4:1
Running on CUDA device:
Initial tridiagonal matrix:
Row: 0 -> 0:2
Row: 1 -> 0:1 1:2 2:1
Row: 2 -> 1:1 2:2 3:1
Row: 3 -> 2:1 3:2 4:1
Row: 4 -> 4:2
Row-normalized matrix:
Row: 0 -> 0:1
Row: 1 -> 0:0.25 1:0.5 2:0.25
Row: 2 -> 1:0.25 2:0.5 3:0.25
Row: 3 -> 2:0.25 3:0.5 4:0.25
Row: 4 -> 4:1
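
In the first example above, the call forRows( denseMatrix, 1, 4, ... ) produces the output titled "rows 1-3 processed", so the row range is half-open: `begin` is included and `end` is not. The following serial sketch in standard C++ models that behavior; it is not TNL's implementation. `SketchDenseRowView`, `sketchForRows`, `sketchForAllRows` and `processRows1To3` are hypothetical names, and treating the "all rows" variant as the range [0, rows) is an assumption.

```cpp
#include <cassert>
#include <vector>

using SketchDense = std::vector< std::vector< double > >;

// Hypothetical dense row view offering the three calls used in the example.
class SketchDenseRowView
{
public:
   SketchDenseRowView( int rowIndex, std::vector< double >& rowValues )
   : rowIndex_( rowIndex ), rowValues_( rowValues )
   {}

   int getRowIndex() const { return rowIndex_; }
   int getSize() const { return static_cast< int >( rowValues_.size() ); }

   void setValue( int column, double value )
   {
      assert( 0 <= column && column < getSize() );
      rowValues_[ column ] = value;
   }

private:
   int rowIndex_;
   std::vector< double >& rowValues_;
};

// Serial model of a row traversal over the half-open range [ begin, end ).
template< typename Function >
void
sketchForRows( SketchDense& matrix, int begin, int end, Function&& function )
{
   assert( 0 <= begin && begin <= end && end <= static_cast< int >( matrix.size() ) );
   for( int rowIdx = begin; rowIdx < end; rowIdx++ ) {
      SketchDenseRowView row( rowIdx, matrix[ rowIdx ] );
      function( row );
   }
}

// Assumed relationship: "all rows" is the ranged variant over [ 0, rows ).
template< typename Function >
void
sketchForAllRows( SketchDense& matrix, Function&& function )
{
   sketchForRows( matrix, 0, static_cast< int >( matrix.size() ), function );
}

// Repeats the dense part of the example: rows 1, 2 and 3 are processed.
SketchDense
processRows1To3()
{
   SketchDense matrix( 5, std::vector< double >( 5, 0.0 ) );
   sketchForRows( matrix, 1, 4,
      []( SketchDenseRowView& row )
      {
         const int rowIdx = row.getRowIndex();
         for( int i = 0; i < row.getSize(); i++ )
            if( i <= rowIdx )
               row.setValue( i, rowIdx + i );
      } );
   return matrix;
}
```

In the result of `processRows1To3()`, rows 0 and 4 remain zero and row 3 holds 3, 4, 5, 6, 0, in agreement with the dense output above.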

◆ forAllRowsIf() [1/2]

template<typename Matrix, typename RowCondition, typename Function>
void TNL::Matrices::forAllRowsIf ( const Matrix & matrix,
RowCondition && rowCondition,
Function && function,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Iterates in parallel over all matrix rows, applying a condition to determine whether each row should be processed. This function is for constant matrices.

See also: Overview of Matrix Traversal Functions

For each row, a condition lambda function is evaluated based on the row index. If the condition lambda function returns true, the specified lambda function is executed for the row. If the condition lambda function returns false, the row is skipped.

Template Parameters
Matrix: The type of the matrix.
RowCondition: The type of the condition lambda function.
Function: The type of the lambda function to be executed on each row.
Parameters
matrix: The matrix on which the lambda function will be applied.
rowCondition: Lambda function that decides, based on the row index, whether a row is processed. See Condition Lambda.
function: Lambda function to be applied to each row. See Row Traversal Function (Const Matrix).
launchConfig: The configuration of the launch; see TNL::Algorithms::Segments::LaunchConfiguration.
Example
#include <iostream>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Matrices/traverse.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
forRowsIfExample()
{
   /***
    * Create a 5x5 dense matrix.
    */
   TNL::Matrices::DenseMatrix< double, Device > denseMatrix( 5, 5 );
   denseMatrix.setValue( 0.0 );

   /***
    * Use forRowsIf to process only even-numbered rows.
    */
   auto evenRowCondition = [] __cuda_callable__( int rowIdx ) -> bool
   {
      return rowIdx % 2 == 0;
   };

   using DenseRowView = typename TNL::Matrices::DenseMatrix< double, Device >::RowView;

   auto processDenseRow = [] __cuda_callable__( DenseRowView & row )
   {
      const int rowIdx = row.getRowIndex();
      for( int i = 0; i < row.getSize(); i++ )
         row.setValue( i, rowIdx + i );
   };

   TNL::Matrices::forRowsIf( denseMatrix, 0, 5, evenRowCondition, processDenseRow );
   std::cout << "Dense matrix with only even rows set:\n";
   std::cout << denseMatrix << '\n';

   /***
    * Create a 5x5 sparse matrix.
    */
   TNL::Matrices::SparseMatrix< double, Device > sparseMatrix( { 1, 3, 3, 3, 1 }, 5 );

   /***
    * Use forRowsIf to process only rows where rowIdx > 0 and rowIdx < 4.
    */
   auto innerRowCondition = [] __cuda_callable__( int rowIdx ) -> bool
   {
      return rowIdx > 0 && rowIdx < 4;
   };

   using SparseRowView = typename TNL::Matrices::SparseMatrix< double, Device >::RowView;

   auto processSparseRow = [] __cuda_callable__( SparseRowView & row )
   {
      const int rowIdx = row.getRowIndex();
      row.setElement( 0, rowIdx - 1, 1.0 );
      row.setElement( 1, rowIdx, 2.0 );
      row.setElement( 2, rowIdx + 1, 1.0 );
   };

   TNL::Matrices::forRowsIf( sparseMatrix, 0, 5, innerRowCondition, processSparseRow );
   std::cout << "Sparse matrix with only inner rows (1-3) set:\n";
   std::cout << sparseMatrix << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running on host:\n";
   forRowsIfExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << "Running on CUDA device:\n";
   forRowsIfExample< TNL::Devices::Cuda >();
#endif
}
Output
Running on host:
Dense matrix with only even rows set:
Row: 0 -> 0:0 1:1 2:2 3:3 4:4
Row: 1 -> 0:0 1:0 2:0 3:0 4:0
Row: 2 -> 0:2 1:3 2:4 3:5 4:6
Row: 3 -> 0:0 1:0 2:0 3:0 4:0
Row: 4 -> 0:4 1:5 2:6 3:7 4:8
Sparse matrix with only inner rows (1-3) set:
Row: 0 ->
Row: 1 -> 0:1 1:2 2:1
Row: 2 -> 1:1 2:2 3:1
Row: 3 -> 2:1 3:2 4:1
Row: 4 ->
Running on CUDA device:
Dense matrix with only even rows set:
Row: 0 -> 0:0 1:1 2:2 3:3 4:4
Row: 1 -> 0:0 1:0 2:0 3:0 4:0
Row: 2 -> 0:2 1:3 2:4 3:5 4:6
Row: 3 -> 0:0 1:0 2:0 3:0 4:0
Row: 4 -> 0:4 1:5 2:6 3:7 4:8
Sparse matrix with only inner rows (1-3) set:
Row: 0 ->
Row: 1 -> 0:1 1:2 2:1
Row: 2 -> 1:1 2:2 3:1
Row: 3 -> 2:1 3:2 4:1
Row: 4 ->

◆ forAllRowsIf() [2/2]

template<typename Matrix, typename RowCondition, typename Function>
void TNL::Matrices::forAllRowsIf ( Matrix & matrix,
RowCondition && rowCondition,
Function && function,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Iterates in parallel over all matrix rows, applying a condition to determine whether each row should be processed.

See also: Overview of Matrix Traversal Functions

For each row, a condition lambda function is evaluated based on the row index. If the condition lambda function returns true, the specified lambda function is executed for the row. If the condition lambda function returns false, the row is skipped.

Template Parameters
Matrix: The type of the matrix.
RowCondition: The type of the condition lambda function.
Function: The type of the lambda function to be executed on each row.
Parameters
matrix: The matrix on which the lambda function will be applied.
rowCondition: Lambda function that decides, based on the row index, whether a row is processed. See Condition Lambda.
function: Lambda function to be applied to each row. See Row Traversal Function (Non-Const Matrix).
launchConfig: The configuration of the launch; see TNL::Algorithms::Segments::LaunchConfiguration.
Example
#include <iostream>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Matrices/traverse.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
forRowsIfExample()
{
   /***
    * Create a 5x5 dense matrix.
    */
   TNL::Matrices::DenseMatrix< double, Device > denseMatrix( 5, 5 );
   denseMatrix.setValue( 0.0 );

   /***
    * Use forRowsIf to process only even-numbered rows.
    */
   auto evenRowCondition = [] __cuda_callable__( int rowIdx ) -> bool
   {
      return rowIdx % 2 == 0;
   };

   using DenseRowView = typename TNL::Matrices::DenseMatrix< double, Device >::RowView;

   auto processDenseRow = [] __cuda_callable__( DenseRowView & row )
   {
      const int rowIdx = row.getRowIndex();
      for( int i = 0; i < row.getSize(); i++ )
         row.setValue( i, rowIdx + i );
   };

   TNL::Matrices::forRowsIf( denseMatrix, 0, 5, evenRowCondition, processDenseRow );
   std::cout << "Dense matrix with only even rows set:\n";
   std::cout << denseMatrix << '\n';

   /***
    * Create a 5x5 sparse matrix.
    */
   TNL::Matrices::SparseMatrix< double, Device > sparseMatrix( { 1, 3, 3, 3, 1 }, 5 );

   /***
    * Use forRowsIf to process only rows where rowIdx > 0 and rowIdx < 4.
    */
   auto innerRowCondition = [] __cuda_callable__( int rowIdx ) -> bool
   {
      return rowIdx > 0 && rowIdx < 4;
   };

   using SparseRowView = typename TNL::Matrices::SparseMatrix< double, Device >::RowView;

   auto processSparseRow = [] __cuda_callable__( SparseRowView & row )
   {
      const int rowIdx = row.getRowIndex();
      row.setElement( 0, rowIdx - 1, 1.0 );
      row.setElement( 1, rowIdx, 2.0 );
      row.setElement( 2, rowIdx + 1, 1.0 );
   };

   TNL::Matrices::forRowsIf( sparseMatrix, 0, 5, innerRowCondition, processSparseRow );
   std::cout << "Sparse matrix with only inner rows (1-3) set:\n";
   std::cout << sparseMatrix << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running on host:\n";
   forRowsIfExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << "Running on CUDA device:\n";
   forRowsIfExample< TNL::Devices::Cuda >();
#endif
}
Output
Running on host:
Dense matrix with only even rows set:
Row: 0 -> 0:0 1:1 2:2 3:3 4:4
Row: 1 -> 0:0 1:0 2:0 3:0 4:0
Row: 2 -> 0:2 1:3 2:4 3:5 4:6
Row: 3 -> 0:0 1:0 2:0 3:0 4:0
Row: 4 -> 0:4 1:5 2:6 3:7 4:8
Sparse matrix with only inner rows (1-3) set:
Row: 0 ->
Row: 1 -> 0:1 1:2 2:1
Row: 2 -> 1:1 2:2 3:1
Row: 3 -> 2:1 3:2 4:1
Row: 4 ->
Running on CUDA device:
Dense matrix with only even rows set:
Row: 0 -> 0:0 1:1 2:2 3:3 4:4
Row: 1 -> 0:0 1:0 2:0 3:0 4:0
Row: 2 -> 0:2 1:3 2:4 3:5 4:6
Row: 3 -> 0:0 1:0 2:0 3:0 4:0
Row: 4 -> 0:4 1:5 2:6 3:7 4:8
Sparse matrix with only inner rows (1-3) set:
Row: 0 ->
Row: 1 -> 0:1 1:2 2:1
Row: 2 -> 1:1 2:2 3:1
Row: 3 -> 2:1 3:2 4:1
Row: 4 ->
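
In the sparse part of the example above, the row condition does more than select rows: the row lambda writes three slots and uses columns rowIdx - 1 and rowIdx + 1, so it is only valid for inner rows. Rows 0 and 4 have a capacity of one element and would get an out-of-range column. The following serial sketch in standard C++ illustrates this; it is not TNL code. `SketchSlots`, `SketchSparseRowView`, `sketchForAllRowsIf` and `buildInnerRows` are hypothetical names, and column index -1 as the marker for an unused slot is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical row storage: a fixed number of (column, value) slots.
// Column -1 marks an unused slot (an assumption of this sketch).
using SketchSlots = std::vector< std::pair< int, double > >;

class SketchSparseRowView
{
public:
   SketchSparseRowView( int rowIndex, SketchSlots& slots )
   : rowIndex_( rowIndex ), slots_( slots )
   {}

   int getRowIndex() const { return rowIndex_; }

   // Writes slot localIdx; the slot must exist within the row's capacity.
   void setElement( int localIdx, int columnIdx, double value )
   {
      assert( 0 <= localIdx && localIdx < static_cast< int >( slots_.size() ) );
      slots_[ localIdx ] = std::make_pair( columnIdx, value );
   }

private:
   int rowIndex_;
   SketchSlots& slots_;
};

// Serial model: the row function runs only for rows accepted by the condition.
template< typename RowCondition, typename Function >
void
sketchForAllRowsIf( std::vector< SketchSlots >& matrix, RowCondition&& rowCondition, Function&& function )
{
   for( int rowIdx = 0; rowIdx < static_cast< int >( matrix.size() ); rowIdx++ ) {
      if( ! rowCondition( rowIdx ) )
         continue;
      SketchSparseRowView row( rowIdx, matrix[ rowIdx ] );
      function( row );
   }
}

// Repeats the sparse part of the example with row capacities { 1, 3, 3, 3, 1 }.
std::vector< SketchSlots >
buildInnerRows()
{
   const std::vector< int > capacities{ 1, 3, 3, 3, 1 };
   std::vector< SketchSlots > matrix;
   for( int capacity : capacities )
      matrix.push_back( SketchSlots( static_cast< std::size_t >( capacity ), std::make_pair( -1, 0.0 ) ) );

   sketchForAllRowsIf( matrix,
      []( int rowIdx ) { return rowIdx > 0 && rowIdx < 4; },
      []( SketchSparseRowView& row )
      {
         const int rowIdx = row.getRowIndex();
         row.setElement( 0, rowIdx - 1, 1.0 );
         row.setElement( 1, rowIdx, 2.0 );
         row.setElement( 2, rowIdx + 1, 1.0 );
      } );
   return matrix;
}
```

If the condition were replaced by one that accepts every row, the assertion in `setElement` would fail for rows 0 and 4 in this sketch, which is the situation the condition in the example avoids.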

◆ forElements() [1/6]

template<typename Matrix, typename Array, typename Function>
void TNL::Matrices::forElements ( const Matrix & matrix,
const Array & rowIndexes,
Function && function,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Iterates in parallel over all elements of matrix rows with the given indexes and applies the specified lambda function. This function is for constant matrices.

See also: Overview of Matrix Traversal Functions

Template Parameters
Matrix: The type of the matrix.
Array: The type of the array containing the indexes of the matrix rows to iterate over. This can be a container such as TNL::Containers::Array, TNL::Containers::ArrayView, TNL::Containers::Vector, or TNL::Containers::VectorView.
Function: The type of the lambda function to be applied to each element.
Parameters
matrix: The matrix whose elements will be processed using the lambda function.
rowIndexes: The array containing the indexes of the matrix rows to iterate over.
function: Lambda function to be applied to each element. See Traversal Function (Const Matrix).
launchConfig: The configuration of the launch; see TNL::Algorithms::Segments::LaunchConfiguration.
Example
#include <iostream>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Matrices/traverse.h>
#include <TNL/Containers/Vector.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
forElementsWithIndexesExample()
{
   /***
    * Create a 5x5 dense matrix and sparse matrix.
    */
   TNL::Matrices::DenseMatrix< double, Device > denseMatrix( 5, 5 );
   denseMatrix.setValue( 0.0 );

   TNL::Matrices::SparseMatrix< double, Device > sparseMatrix( { 1, 2, 3, 2, 1 }, 5 );

   /***
    * Create a vector with row indexes to process (rows 1, 2, and 4).
    */
   TNL::Containers::Vector< int, Device > rowIndexes{ 1, 2, 4 };

   /***
    * Use forElements with row indexes to set specific matrix rows.
    */
   auto setDenseElements = [] __cuda_callable__( int rowIdx, int localIdx, int columnIdx, double& value )
   {
      value = rowIdx * 10 + columnIdx;
   };

   TNL::Matrices::forElements( denseMatrix, rowIndexes, setDenseElements );
   std::cout << "Dense matrix with selected rows set:\n";
   std::cout << denseMatrix << '\n';

   /***
    * Set sparse matrix elements for selected rows.
    */
   auto setSparseElements = [] __cuda_callable__( int rowIdx, int localIdx, int& columnIdx, double& value )
   {
      columnIdx = localIdx;
      value = rowIdx + localIdx + 1;
   };

   TNL::Matrices::forElements( sparseMatrix, rowIndexes, setSparseElements );
   std::cout << "Sparse matrix with selected rows set:\n";
   std::cout << sparseMatrix << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running on host:\n";
   forElementsWithIndexesExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << "Running on CUDA device:\n";
   forElementsWithIndexesExample< TNL::Devices::Cuda >();
#endif
}
Output
Running on host:
Dense matrix with selected rows set:
Row: 0 -> 0:0 1:0 2:0 3:0 4:0
Row: 1 -> 0:10 1:11 2:12 3:13 4:14
Row: 2 -> 0:20 1:21 2:22 3:23 4:24
Row: 3 -> 0:0 1:0 2:0 3:0 4:0
Row: 4 -> 0:40 1:41 2:42 3:43 4:44
Sparse matrix with selected rows set:
Row: 0 ->
Row: 1 -> 0:2 1:3
Row: 2 -> 0:3 1:4 2:5
Row: 3 ->
Row: 4 -> 0:5
Running on CUDA device:
Dense matrix with selected rows set:
Row: 0 -> 0:0 1:0 2:0 3:0 4:0
Row: 1 -> 0:10 1:11 2:12 3:13 4:14
Row: 2 -> 0:20 1:21 2:22 3:23 4:24
Row: 3 -> 0:0 1:0 2:0 3:0 4:0
Row: 4 -> 0:40 1:41 2:42 3:43 4:44
Sparse matrix with selected rows set:
Row: 0 ->
Row: 1 -> 0:2 1:3
Row: 2 -> 0:3 1:4 2:5
Row: 3 ->
Row: 4 -> 0:5
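
Traversal by an explicit list of row indexes can be modelled as a loop over that list instead of a loop over a contiguous range. The following is a serial sketch in standard C++, not TNL's implementation: `SketchGrid`, `sketchForElementsInRows` and `setSelectedRows` are hypothetical names, and a `std::vector< int >` stands in for the TNL index containers named above.

```cpp
#include <cassert>
#include <vector>

using SketchGrid = std::vector< std::vector< double > >;

// Serial model: only the rows listed in rowIndexes are visited, in list order.
// Every listed index must be a valid row of the matrix.
template< typename Function >
void
sketchForElementsInRows( SketchGrid& matrix, const std::vector< int >& rowIndexes, Function&& function )
{
   for( int rowIdx : rowIndexes ) {
      assert( 0 <= rowIdx && rowIdx < static_cast< int >( matrix.size() ) );
      for( int columnIdx = 0; columnIdx < static_cast< int >( matrix[ rowIdx ].size() ); columnIdx++ )
         function( rowIdx, columnIdx, columnIdx, matrix[ rowIdx ][ columnIdx ] );
   }
}

// Repeats the dense part of the example for rows 1, 2 and 4.
SketchGrid
setSelectedRows()
{
   SketchGrid matrix( 5, std::vector< double >( 5, 0.0 ) );
   const std::vector< int > rowIndexes{ 1, 2, 4 };
   sketchForElementsInRows( matrix, rowIndexes,
      []( int rowIdx, int, int columnIdx, double& value )
      {
         value = rowIdx * 10 + columnIdx;
      } );
   return matrix;
}
```

A row index that appears twice in the list is simply processed twice in this serial sketch. In a parallel traversal, duplicate indexes could lead to concurrent writes to the same elements, so a list without duplicates is the safer input.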

◆ forElements() [2/6]

template<typename Matrix, typename Array, typename IndexBegin, typename IndexEnd, typename Function>
void TNL::Matrices::forElements ( const Matrix & matrix,
const Array & rowIndexes,
IndexBegin begin,
IndexEnd end,
Function && function,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Iterates in parallel over all elements of matrix rows with the given indexes and applies the specified lambda function. This function is for constant matrices.

See also: Overview of Matrix Traversal Functions

Template Parameters
Matrix: The type of the matrix.
Array: The type of the array containing the indexes of the matrix rows to iterate over. This can be a container such as TNL::Containers::Array, TNL::Containers::ArrayView, TNL::Containers::Vector, or TNL::Containers::VectorView.
IndexBegin: The type of the index defining the beginning of the interval [ begin, end ) of row indexes whose elements will be processed using the lambda function.
IndexEnd: The type of the index defining the end of the interval [ begin, end ) of row indexes whose elements will be processed using the lambda function.
Function: The type of the lambda function to be applied to each element.
Parameters
matrix: The matrix whose elements will be processed using the lambda function.
rowIndexes: The array containing the indexes of the matrix rows to iterate over.
begin: The beginning of the interval [ begin, end ) of row indexes whose elements will be processed using the lambda function.
end: The end of the interval [ begin, end ) of row indexes whose elements will be processed using the lambda function.
function: Lambda function to be applied to each element. See Traversal Function (Const Matrix).
launchConfig: The launch configuration; see TNL::Algorithms::Segments::LaunchConfiguration.
Example
#include <iostream>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Matrices/traverse.h>
#include <TNL/Containers/Vector.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
forElementsWithIndexesExample()
{
   /***
    * Create a 5x5 dense matrix and sparse matrix.
    */
   // Assumed declaration: this line is missing from the listing.
   TNL::Matrices::DenseMatrix< double, Device > denseMatrix( 5, 5 );
   denseMatrix.setValue( 0.0 );

   TNL::Matrices::SparseMatrix< double, Device > sparseMatrix( { 1, 2, 3, 2, 1 }, 5 );

   /***
    * Create a vector with row indexes to process (rows 1, 2, and 4).
    */
   TNL::Containers::Vector< int, Device > rowIndexes{ 1, 2, 4 };

   /***
    * Use forElements with row indexes to set specific matrix rows.
    */
   auto setDenseElements = [] __cuda_callable__( int rowIdx, int localIdx, int columnIdx, double& value )
   {
      value = rowIdx * 10 + columnIdx;
   };

   TNL::Matrices::forElements( denseMatrix, rowIndexes, setDenseElements );
   std::cout << "Dense matrix with selected rows set:\n";
   std::cout << denseMatrix << '\n';

   /***
    * Set sparse matrix elements for selected rows.
    */
   auto setSparseElements = [] __cuda_callable__( int rowIdx, int localIdx, int& columnIdx, double& value )
   {
      columnIdx = localIdx;
      value = rowIdx + localIdx + 1;
   };

   TNL::Matrices::forElements( sparseMatrix, rowIndexes, setSparseElements );
   std::cout << "Sparse matrix with selected rows set:\n";
   std::cout << sparseMatrix << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running on host:\n";
   forElementsWithIndexesExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << "Running on CUDA device:\n";
   forElementsWithIndexesExample< TNL::Devices::Cuda >();
#endif
}
Output
Running on host:
Dense matrix with selected rows set:
Row: 0 -> 0:0 1:0 2:0 3:0 4:0
Row: 1 -> 0:10 1:11 2:12 3:13 4:14
Row: 2 -> 0:20 1:21 2:22 3:23 4:24
Row: 3 -> 0:0 1:0 2:0 3:0 4:0
Row: 4 -> 0:40 1:41 2:42 3:43 4:44
Sparse matrix with selected rows set:
Row: 0 ->
Row: 1 -> 0:2 1:3
Row: 2 -> 0:3 1:4 2:5
Row: 3 ->
Row: 4 -> 0:5
Running on CUDA device:
Dense matrix with selected rows set:
Row: 0 -> 0:0 1:0 2:0 3:0 4:0
Row: 1 -> 0:10 1:11 2:12 3:13 4:14
Row: 2 -> 0:20 1:21 2:22 3:23 4:24
Row: 3 -> 0:0 1:0 2:0 3:0 4:0
Row: 4 -> 0:40 1:41 2:42 3:43 4:44
Sparse matrix with selected rows set:
Row: 0 ->
Row: 1 -> 0:2 1:3
Row: 2 -> 0:3 1:4 2:5
Row: 3 ->
Row: 4 -> 0:5
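
The interplay of rowIndexes with begin and end can be modelled sequentially in plain C++. The sketch below is not TNL code: forElementsModel and selectedRowsDemo are hypothetical names, the matrix is a row-major std::vector, and the sketch assumes that begin and end select positions within rowIndexes, which is how the parameter descriptions read. TNL runs the traversal in parallel, so the visiting order shown here is not guaranteed.

```cpp
#include <cassert>
#include <vector>

// Hypothetical sequential model (not the TNL implementation): visits the rows
// rowIndexes[ begin ], ..., rowIndexes[ end - 1 ] of a row-major dense matrix.
template< typename Function >
void
forElementsModel( std::vector< double >& values,
                  int columns,
                  const std::vector< int >& rowIndexes,
                  int begin,
                  int end,
                  Function&& function )
{
   assert( 0 <= begin && begin <= end && end <= static_cast< int >( rowIndexes.size() ) );
   for( int i = begin; i < end; i++ ) {
      const int rowIdx = rowIndexes[ i ];
      for( int columnIdx = 0; columnIdx < columns; columnIdx++ )
         // For a dense matrix the local index equals the column index.
         function( rowIdx, columnIdx, columnIdx, values[ rowIdx * columns + columnIdx ] );
   }
}

// Applies the dense lambda of the example to positions [ 1, 3 ) of
// rowIndexes, i.e. to rows 2 and 4 of a 5x5 zero matrix.
std::vector< double >
selectedRowsDemo()
{
   std::vector< double > values( 25, 0.0 );
   const std::vector< int > rowIndexes{ 1, 2, 4 };
   auto setDenseElements = []( int rowIdx, int /* localIdx */, int columnIdx, double& value )
   {
      value = rowIdx * 10 + columnIdx;
   };
   forElementsModel( values, 5, rowIndexes, 1, 3, setDenseElements );
   return values;
}
```

Under this assumption, row 1 stays zero because its position 0 lies outside [ 1, 3 ), while rows 2 and 4 receive the values 20 to 24 and 40 to 44.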

◆ forElements() [3/6]

template<typename Matrix, typename IndexBegin, typename IndexEnd, typename Function>
void TNL::Matrices::forElements ( const Matrix & matrix,
IndexBegin begin,
IndexEnd end,
Function && function,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Iterates in parallel over all elements of a constant matrix in the given range of matrix rows and applies the specified lambda function.

See also: Overview of Matrix Traversal Functions

Template Parameters
Matrix: The type of the matrix.
IndexBegin: The type of the index defining the beginning of the interval [ begin, end ) of rows whose elements will be processed using the lambda function.
IndexEnd: The type of the index defining the end of the interval [ begin, end ) of rows whose elements will be processed using the lambda function.
Function: The type of the lambda function to be applied to each element.
Parameters
matrix: The matrix whose elements will be processed using the lambda function.
begin: The beginning of the interval [ begin, end ) of rows whose elements will be processed using the lambda function.
end: The end of the interval [ begin, end ) of rows whose elements will be processed using the lambda function.
function: Lambda function to be applied to each element. See Traversal Function (Const Matrix).
launchConfig: The launch configuration; see TNL::Algorithms::Segments::LaunchConfiguration.
Example
#include <iostream>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Matrices/traverse.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
forElementsExample()
{
   /***
    * Create a 5x5 dense matrix.
    */
   // Assumed declaration: this line is missing from the listing.
   TNL::Matrices::DenseMatrix< double, Device > denseMatrix( 5, 5 );
   denseMatrix.setValue( 0.0 );

   /***
    * Use forElements to set lower triangular matrix elements.
    */
   auto setLowerTriangular = [] __cuda_callable__( int rowIdx, int localIdx, int columnIdx, double& value )
   {
      if( columnIdx <= rowIdx )
         value = rowIdx + columnIdx;
   };

   TNL::Matrices::forElements( denseMatrix, 0, denseMatrix.getRows(), setLowerTriangular );
   std::cout << "Dense matrix with lower triangular elements set:\n";
   std::cout << denseMatrix << '\n';

   /***
    * Create a 5x5 sparse matrix.
    */
   TNL::Matrices::SparseMatrix< double, Device > sparseMatrix( { 1, 2, 3, 4, 5 }, 5 );

   /***
    * Use forElements to initialize sparse matrix elements.
    */
   auto setSparse = [] __cuda_callable__( int rowIdx, int localIdx, int& columnIdx, double& value )
   {
      if( rowIdx >= localIdx ) {
         columnIdx = localIdx;
         value = rowIdx + localIdx + 1;
      }
   };

   TNL::Matrices::forElements( sparseMatrix, 0, sparseMatrix.getRows(), setSparse );
   std::cout << "Sparse matrix initialized:\n";
   std::cout << sparseMatrix << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running on host:\n";
   forElementsExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << "Running on CUDA device:\n";
   forElementsExample< TNL::Devices::Cuda >();
#endif
}
Output
Running on host:
Dense matrix with lower triangular elements set:
Row: 0 -> 0:0 1:0 2:0 3:0 4:0
Row: 1 -> 0:1 1:2 2:0 3:0 4:0
Row: 2 -> 0:2 1:3 2:4 3:0 4:0
Row: 3 -> 0:3 1:4 2:5 3:6 4:0
Row: 4 -> 0:4 1:5 2:6 3:7 4:8
Sparse matrix initialized:
Row: 0 -> 0:1
Row: 1 -> 0:2 1:3
Row: 2 -> 0:3 1:4 2:5
Row: 3 -> 0:4 1:5 2:6 3:7
Row: 4 -> 0:5 1:6 2:7 3:8 4:9
Running on CUDA device:
Dense matrix with lower triangular elements set:
Row: 0 -> 0:0 1:0 2:0 3:0 4:0
Row: 1 -> 0:1 1:2 2:0 3:0 4:0
Row: 2 -> 0:2 1:3 2:4 3:0 4:0
Row: 3 -> 0:3 1:4 2:5 3:6 4:0
Row: 4 -> 0:4 1:5 2:6 3:7 4:8
Sparse matrix initialized:
Row: 0 -> 0:1
Row: 1 -> 0:2 1:3
Row: 2 -> 0:3 1:4 2:5
Row: 3 -> 0:4 1:5 2:6 3:7
Row: 4 -> 0:5 1:6 2:7 3:8 4:9
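
Since this overload takes a constant matrix, a typical use is a read-only traversal that writes its results elsewhere. The following sequential sketch in plain C++ is not TNL code: forElementsConstModel and diagonalDemo are hypothetical names, and passing value by const reference is an assumption about the const-matrix lambda signature. Each diagonal element is written to its own output slot, so the same lambda body would stay free of write conflicts under a parallel traversal.

```cpp
#include <cassert>
#include <vector>

// Hypothetical sequential model of a read-only traversal over the rows
// [ begin, end ) of a row-major dense matrix; not the TNL implementation.
template< typename Function >
void
forElementsConstModel( const std::vector< double >& values, int rows, int columns, int begin, int end, Function&& function )
{
   assert( static_cast< int >( values.size() ) == rows * columns );
   assert( 0 <= begin && begin <= end && end <= rows );
   for( int rowIdx = begin; rowIdx < end; rowIdx++ )
      for( int columnIdx = 0; columnIdx < columns; columnIdx++ )
         function( rowIdx, columnIdx, columnIdx, values[ rowIdx * columns + columnIdx ] );
}

// Copies the diagonal entries of rows 1 and 2 of a 3x3 matrix into a vector.
std::vector< double >
diagonalDemo()
{
   const std::vector< double > values{ 0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0 };
   std::vector< double > diagonal( 3, -1.0 );  // -1 marks rows that were not visited
   auto copyDiagonal = [ &diagonal ]( int rowIdx, int /* localIdx */, int columnIdx, const double& value )
   {
      if( columnIdx == rowIdx )
         diagonal[ rowIdx ] = value;
   };
   forElementsConstModel( values, 3, 3, 1, 3, copyDiagonal );
   return diagonal;
}
```

Row 0 lies outside [ 1, 3 ), so its slot keeps the marker value. On a CUDA device a by-reference capture like the one above would not be usable; TNL examples typically capture a view of the output vector by value instead.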

◆ forElements() [4/6]

template<typename Matrix, typename Array, typename Function>
void TNL::Matrices::forElements ( Matrix & matrix,
const Array & rowIndexes,
Function && function,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Iterates in parallel over all elements of matrix rows with the given indexes and applies the specified lambda function.

See also: Overview of Matrix Traversal Functions

Template Parameters
Matrix: The type of the matrix.
Array: The type of the array containing the indexes of the matrix rows to iterate over. This can be a container such as TNL::Containers::Array, TNL::Containers::ArrayView, TNL::Containers::Vector, or TNL::Containers::VectorView.
Function: The type of the lambda function to be applied to each element.
Parameters
matrix: The matrix whose elements will be processed using the lambda function.
rowIndexes: The array containing the indexes of the matrix rows to iterate over.
function: Lambda function to be applied to each element. See Traversal Function (Non-Const Matrix).
launchConfig: The launch configuration; see TNL::Algorithms::Segments::LaunchConfiguration.
Example
#include <iostream>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Matrices/traverse.h>
#include <TNL/Containers/Vector.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
forElementsWithIndexesExample()
{
   /***
    * Create a 5x5 dense matrix and sparse matrix.
    */
   // Assumed declaration: this line is missing from the listing.
   TNL::Matrices::DenseMatrix< double, Device > denseMatrix( 5, 5 );
   denseMatrix.setValue( 0.0 );

   TNL::Matrices::SparseMatrix< double, Device > sparseMatrix( { 1, 2, 3, 2, 1 }, 5 );

   /***
    * Create a vector with row indexes to process (rows 1, 2, and 4).
    */
   TNL::Containers::Vector< int, Device > rowIndexes{ 1, 2, 4 };

   /***
    * Use forElements with row indexes to set specific matrix rows.
    */
   auto setDenseElements = [] __cuda_callable__( int rowIdx, int localIdx, int columnIdx, double& value )
   {
      value = rowIdx * 10 + columnIdx;
   };

   TNL::Matrices::forElements( denseMatrix, rowIndexes, setDenseElements );
   std::cout << "Dense matrix with selected rows set:\n";
   std::cout << denseMatrix << '\n';

   /***
    * Set sparse matrix elements for selected rows.
    */
   auto setSparseElements = [] __cuda_callable__( int rowIdx, int localIdx, int& columnIdx, double& value )
   {
      columnIdx = localIdx;
      value = rowIdx + localIdx + 1;
   };

   TNL::Matrices::forElements( sparseMatrix, rowIndexes, setSparseElements );
   std::cout << "Sparse matrix with selected rows set:\n";
   std::cout << sparseMatrix << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running on host:\n";
   forElementsWithIndexesExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << "Running on CUDA device:\n";
   forElementsWithIndexesExample< TNL::Devices::Cuda >();
#endif
}
Output
Running on host:
Dense matrix with selected rows set:
Row: 0 -> 0:0 1:0 2:0 3:0 4:0
Row: 1 -> 0:10 1:11 2:12 3:13 4:14
Row: 2 -> 0:20 1:21 2:22 3:23 4:24
Row: 3 -> 0:0 1:0 2:0 3:0 4:0
Row: 4 -> 0:40 1:41 2:42 3:43 4:44
Sparse matrix with selected rows set:
Row: 0 ->
Row: 1 -> 0:2 1:3
Row: 2 -> 0:3 1:4 2:5
Row: 3 ->
Row: 4 -> 0:5
Running on CUDA device:
Dense matrix with selected rows set:
Row: 0 -> 0:0 1:0 2:0 3:0 4:0
Row: 1 -> 0:10 1:11 2:12 3:13 4:14
Row: 2 -> 0:20 1:21 2:22 3:23 4:24
Row: 3 -> 0:0 1:0 2:0 3:0 4:0
Row: 4 -> 0:40 1:41 2:42 3:43 4:44
Sparse matrix with selected rows set:
Row: 0 ->
Row: 1 -> 0:2 1:3
Row: 2 -> 0:3 1:4 2:5
Row: 3 ->
Row: 4 -> 0:5
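
The sparse lambda in the example receives columnIdx by reference: the sparse matrix is constructed from row capacities, and the lambda decides which column each allocated slot of a row refers to. A minimal sequential model in plain C++ (not TNL code) makes the roles of localIdx and columnIdx concrete. SparseModel, forSparseElementsModel, sparseDemo, and the use of -1 for an unused slot are assumptions of this sketch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical CSR-like storage used only for this sketch; not TNL's SparseMatrix.
struct SparseModel
{
   std::vector< int > rowOffsets;     // slots of row r are [ rowOffsets[ r ], rowOffsets[ r + 1 ] )
   std::vector< int > columnIndexes;  // -1 marks an unused slot (assumed convention)
   std::vector< double > values;

   explicit SparseModel( const std::vector< int >& rowCapacities )
   : rowOffsets( 1, 0 )
   {
      for( int capacity : rowCapacities )
         rowOffsets.push_back( rowOffsets.back() + capacity );
      columnIndexes.assign( rowOffsets.back(), -1 );
      values.assign( rowOffsets.back(), 0.0 );
   }
};

// Visits every allocated slot of the listed rows, one row after another.
template< typename Function >
void
forSparseElementsModel( SparseModel& matrix, const std::vector< int >& rowIndexes, Function&& function )
{
   const int rows = static_cast< int >( matrix.rowOffsets.size() ) - 1;
   for( int rowIdx : rowIndexes ) {
      assert( 0 <= rowIdx && rowIdx < rows );
      const int first = matrix.rowOffsets[ rowIdx ];
      const int last = matrix.rowOffsets[ rowIdx + 1 ];
      for( int slot = first; slot < last; slot++ )
         // localIdx is the position of the slot within its row; the column
         // index and the value are both writable.
         function( rowIdx, slot - first, matrix.columnIndexes[ slot ], matrix.values[ slot ] );
   }
}

// Row capacities and row indexes are taken from the example.
SparseModel
sparseDemo()
{
   const std::vector< int > rowCapacities{ 1, 2, 3, 2, 1 };
   const std::vector< int > rowIndexes{ 1, 2, 4 };
   SparseModel matrix( rowCapacities );
   auto setSparseElements = []( int rowIdx, int localIdx, int& columnIdx, double& value )
   {
      columnIdx = localIdx;
      value = rowIdx + localIdx + 1;
   };
   forSparseElementsModel( matrix, rowIndexes, setSparseElements );
   return matrix;
}
```

In this model rows 0 and 3 keep their unused slots, which corresponds to the empty rows in the printed output, while row 2 ends up with the pairs 0:3, 1:4, and 2:5.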

◆ forElements() [5/6]

template<typename Matrix, typename Array, typename IndexBegin, typename IndexEnd, typename Function>
void TNL::Matrices::forElements ( Matrix & matrix,
const Array & rowIndexes,
IndexBegin begin,
IndexEnd end,
Function && function,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Iterates in parallel over all elements of matrix rows with the given indexes and applies the specified lambda function.

See also: Overview of Matrix Traversal Functions

Template Parameters
Matrix: The type of the matrix.
Array: The type of the array containing the indexes of the matrix rows to iterate over. This can be a container such as TNL::Containers::Array, TNL::Containers::ArrayView, TNL::Containers::Vector, or TNL::Containers::VectorView.
IndexBegin: The type of the index defining the beginning of the interval [ begin, end ) of row indexes whose elements will be processed using the lambda function.
IndexEnd: The type of the index defining the end of the interval [ begin, end ) of row indexes whose elements will be processed using the lambda function.
Function: The type of the lambda function to be applied to each element.
Parameters
matrix: The matrix whose elements will be processed using the lambda function.
rowIndexes: The array containing the indexes of the matrix rows to iterate over.
begin: The beginning of the interval [ begin, end ) of row indexes whose elements will be processed using the lambda function.
end: The end of the interval [ begin, end ) of row indexes whose elements will be processed using the lambda function.
function: Lambda function to be applied to each element. See Traversal Function (Non-Const Matrix).
launchConfig: The launch configuration; see TNL::Algorithms::Segments::LaunchConfiguration.
Example
#include <iostream>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Matrices/traverse.h>
#include <TNL/Containers/Vector.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
forElementsWithIndexesExample()
{
   /***
    * Create a 5x5 dense matrix and sparse matrix.
    */
   // Assumed declaration: this line is missing from the listing.
   TNL::Matrices::DenseMatrix< double, Device > denseMatrix( 5, 5 );
   denseMatrix.setValue( 0.0 );

   TNL::Matrices::SparseMatrix< double, Device > sparseMatrix( { 1, 2, 3, 2, 1 }, 5 );

   /***
    * Create a vector with row indexes to process (rows 1, 2, and 4).
    */
   TNL::Containers::Vector< int, Device > rowIndexes{ 1, 2, 4 };

   /***
    * Use forElements with row indexes to set specific matrix rows.
    */
   auto setDenseElements = [] __cuda_callable__( int rowIdx, int localIdx, int columnIdx, double& value )
   {
      value = rowIdx * 10 + columnIdx;
   };

   TNL::Matrices::forElements( denseMatrix, rowIndexes, setDenseElements );
   std::cout << "Dense matrix with selected rows set:\n";
   std::cout << denseMatrix << '\n';

   /***
    * Set sparse matrix elements for selected rows.
    */
   auto setSparseElements = [] __cuda_callable__( int rowIdx, int localIdx, int& columnIdx, double& value )
   {
      columnIdx = localIdx;
      value = rowIdx + localIdx + 1;
   };

   TNL::Matrices::forElements( sparseMatrix, rowIndexes, setSparseElements );
   std::cout << "Sparse matrix with selected rows set:\n";
   std::cout << sparseMatrix << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running on host:\n";
   forElementsWithIndexesExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << "Running on CUDA device:\n";
   forElementsWithIndexesExample< TNL::Devices::Cuda >();
#endif
}
Output
Running on host:
Dense matrix with selected rows set:
Row: 0 -> 0:0 1:0 2:0 3:0 4:0
Row: 1 -> 0:10 1:11 2:12 3:13 4:14
Row: 2 -> 0:20 1:21 2:22 3:23 4:24
Row: 3 -> 0:0 1:0 2:0 3:0 4:0
Row: 4 -> 0:40 1:41 2:42 3:43 4:44
Sparse matrix with selected rows set:
Row: 0 ->
Row: 1 -> 0:2 1:3
Row: 2 -> 0:3 1:4 2:5
Row: 3 ->
Row: 4 -> 0:5
Running on CUDA device:
Dense matrix with selected rows set:
Row: 0 -> 0:0 1:0 2:0 3:0 4:0
Row: 1 -> 0:10 1:11 2:12 3:13 4:14
Row: 2 -> 0:20 1:21 2:22 3:23 4:24
Row: 3 -> 0:0 1:0 2:0 3:0 4:0
Row: 4 -> 0:40 1:41 2:42 3:43 4:44
Sparse matrix with selected rows set:
Row: 0 ->
Row: 1 -> 0:2 1:3
Row: 2 -> 0:3 1:4 2:5
Row: 3 ->
Row: 4 -> 0:5

◆ forElements() [6/6]

template<typename Matrix, typename IndexBegin, typename IndexEnd, typename Function>
void TNL::Matrices::forElements ( Matrix & matrix,
IndexBegin begin,
IndexEnd end,
Function && function,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Iterates in parallel over all elements in the given range of matrix rows and applies the specified lambda function.

See also: Overview of Matrix Traversal Functions

Template Parameters
Matrix: The type of the matrix.
IndexBegin: The type of the index defining the beginning of the interval [ begin, end ) of rows whose elements will be processed using the lambda function.
IndexEnd: The type of the index defining the end of the interval [ begin, end ) of rows whose elements will be processed using the lambda function.
Function: The type of the lambda function to be applied to each element.
Parameters
matrix: The matrix whose elements will be processed using the lambda function.
begin: The beginning of the interval [ begin, end ) of rows whose elements will be processed using the lambda function.
end: The end of the interval [ begin, end ) of rows whose elements will be processed using the lambda function.
function: Lambda function to be applied to each element. See Traversal Function (Non-Const Matrix).
launchConfig: The launch configuration; see TNL::Algorithms::Segments::LaunchConfiguration.
Example
#include <iostream>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Matrices/traverse.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
forElementsExample()
{
   /***
    * Create a 5x5 dense matrix.
    */
   // Assumed declaration: this line is missing from the listing.
   TNL::Matrices::DenseMatrix< double, Device > denseMatrix( 5, 5 );
   denseMatrix.setValue( 0.0 );

   /***
    * Use forElements to set lower triangular matrix elements.
    */
   auto setLowerTriangular = [] __cuda_callable__( int rowIdx, int localIdx, int columnIdx, double& value )
   {
      if( columnIdx <= rowIdx )
         value = rowIdx + columnIdx;
   };

   TNL::Matrices::forElements( denseMatrix, 0, denseMatrix.getRows(), setLowerTriangular );
   std::cout << "Dense matrix with lower triangular elements set:\n";
   std::cout << denseMatrix << '\n';

   /***
    * Create a 5x5 sparse matrix.
    */
   TNL::Matrices::SparseMatrix< double, Device > sparseMatrix( { 1, 2, 3, 4, 5 }, 5 );

   /***
    * Use forElements to initialize sparse matrix elements.
    */
   auto setSparse = [] __cuda_callable__( int rowIdx, int localIdx, int& columnIdx, double& value )
   {
      if( rowIdx >= localIdx ) {
         columnIdx = localIdx;
         value = rowIdx + localIdx + 1;
      }
   };

   TNL::Matrices::forElements( sparseMatrix, 0, sparseMatrix.getRows(), setSparse );
   std::cout << "Sparse matrix initialized:\n";
   std::cout << sparseMatrix << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running on host:\n";
   forElementsExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << "Running on CUDA device:\n";
   forElementsExample< TNL::Devices::Cuda >();
#endif
}
Output
Running on host:
Dense matrix with lower triangular elements set:
Row: 0 -> 0:0 1:0 2:0 3:0 4:0
Row: 1 -> 0:1 1:2 2:0 3:0 4:0
Row: 2 -> 0:2 1:3 2:4 3:0 4:0
Row: 3 -> 0:3 1:4 2:5 3:6 4:0
Row: 4 -> 0:4 1:5 2:6 3:7 4:8
Sparse matrix initialized:
Row: 0 -> 0:1
Row: 1 -> 0:2 1:3
Row: 2 -> 0:3 1:4 2:5
Row: 3 -> 0:4 1:5 2:6 3:7
Row: 4 -> 0:5 1:6 2:7 3:8 4:9
Running on CUDA device:
Dense matrix with lower triangular elements set:
Row: 0 -> 0:0 1:0 2:0 3:0 4:0
Row: 1 -> 0:1 1:2 2:0 3:0 4:0
Row: 2 -> 0:2 1:3 2:4 3:0 4:0
Row: 3 -> 0:3 1:4 2:5 3:6 4:0
Row: 4 -> 0:4 1:5 2:6 3:7 4:8
Sparse matrix initialized:
Row: 0 -> 0:1
Row: 1 -> 0:2 1:3
Row: 2 -> 0:3 1:4 2:5
Row: 3 -> 0:4 1:5 2:6 3:7
Row: 4 -> 0:5 1:6 2:7 3:8 4:9

◆ forElementsIf() [1/2]

template<typename Matrix, typename IndexBegin, typename IndexEnd, typename Condition, typename Function>
void TNL::Matrices::forElementsIf ( const Matrix & matrix,
IndexBegin begin,
IndexEnd end,
Condition && condition,
Function && function,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Iterates in parallel over all elements in a given range of rows based on a condition. This function is for constant matrices.

See also: Overview of Matrix Traversal Functions

For each matrix row, a condition lambda function is evaluated based on the row index. If the condition lambda function returns true, all elements of the row are traversed, and the specified lambda function is applied to each element. If the condition lambda function returns false, the row is skipped.
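
The behaviour described in this paragraph can be sketched as a sequential loop in plain C++. This is a model under stated assumptions rather than the TNL implementation: forElementsIfModel and evenRowsDemo are hypothetical names, the matrix is a row-major std::vector, and TNL evaluates the rows in parallel instead of in order.

```cpp
#include <cassert>
#include <vector>

// Hypothetical sequential model of a condition-based traversal over the rows
// [ begin, end ) of a row-major dense matrix.
template< typename Condition, typename Function >
void
forElementsIfModel( std::vector< double >& values,
                    int rows,
                    int columns,
                    int begin,
                    int end,
                    Condition&& condition,
                    Function&& function )
{
   assert( static_cast< int >( values.size() ) == rows * columns );
   assert( 0 <= begin && begin <= end && end <= rows );
   for( int rowIdx = begin; rowIdx < end; rowIdx++ ) {
      // The condition sees only the row index; a rejected row is skipped entirely.
      if( ! condition( rowIdx ) )
         continue;
      for( int columnIdx = 0; columnIdx < columns; columnIdx++ )
         function( rowIdx, columnIdx, columnIdx, values[ rowIdx * columns + columnIdx ] );
   }
}

// Sets the lower triangular part of the even rows of a 5x5 zero matrix.
std::vector< double >
evenRowsDemo()
{
   std::vector< double > values( 25, 0.0 );
   auto evenRowCondition = []( int rowIdx ) -> bool
   {
      return rowIdx % 2 == 0;
   };
   auto setElements = []( int rowIdx, int /* localIdx */, int columnIdx, double& value )
   {
      if( columnIdx <= rowIdx )
         value = rowIdx + columnIdx;
   };
   forElementsIfModel( values, 5, 5, 0, 5, evenRowCondition, setElements );
   return values;
}
```

Note the two levels of filtering: the condition selects whole rows, whereas the test inside the element lambda selects individual elements within an accepted row.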

Template Parameters
Matrix: The type of the matrix.
IndexBegin: The type of the index defining the beginning of the interval [ begin, end ) of rows whose elements will be processed using the lambda function.
IndexEnd: The type of the index defining the end of the interval [ begin, end ) of rows whose elements will be processed using the lambda function.
Condition: The type of the condition lambda function.
Function: The type of the lambda function to be applied to each element.
Parameters
matrix: The matrix whose elements will be processed using the lambda function.
begin: The beginning of the interval [ begin, end ) of rows whose elements will be processed using the lambda function.
end: The end of the interval [ begin, end ) of rows whose elements will be processed using the lambda function.
condition: Lambda function to check the row condition. See Condition Lambda.
function: Lambda function to be applied to each element. See Traversal Function (Const Matrix).
launchConfig: The launch configuration; see TNL::Algorithms::Segments::LaunchConfiguration.
Example
#include <iostream>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Matrices/traverse.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
forElementsIfExample()
{
   /***
    * Create a 5x5 dense matrix.
    */
   // Assumed declaration: this line is missing from the listing.
   TNL::Matrices::DenseMatrix< double, Device > denseMatrix( 5, 5 );
   denseMatrix.setValue( 0.0 );

   /***
    * Use forElementsIf to set elements only in even rows.
    */
   auto evenRowCondition = [] __cuda_callable__( int rowIdx ) -> bool
   {
      return rowIdx % 2 == 0;
   };

   auto setElements = [] __cuda_callable__( int rowIdx, int localIdx, int columnIdx, double& value )
   {
      if( columnIdx <= rowIdx )
         value = rowIdx + columnIdx;
   };

   TNL::Matrices::forElementsIf( denseMatrix, 0, denseMatrix.getRows(), evenRowCondition, setElements );
   std::cout << "Dense matrix with elements set only in even rows:\n";
   std::cout << denseMatrix << '\n';

   /***
    * Create a 5x5 sparse matrix.
    */
   TNL::Matrices::SparseMatrix< double, Device > sparseMatrix( { 1, 2, 3, 4, 5 }, 5 );

   /***
    * Use forElementsIf to set elements only in rows where rowIdx > 1.
    */
   auto rowCondition = [] __cuda_callable__( int rowIdx ) -> bool
   {
      return rowIdx > 1;
   };

   auto setSparseElements = [] __cuda_callable__( int rowIdx, int localIdx, int& columnIdx, double& value )
   {
      if( rowIdx >= localIdx ) {
         columnIdx = localIdx;
         value = rowIdx + localIdx + 1;
      }
   };

   TNL::Matrices::forElementsIf( sparseMatrix, 0, sparseMatrix.getRows(), rowCondition, setSparseElements );
   std::cout << "Sparse matrix with elements set only in rows where rowIdx > 1:\n";
   std::cout << sparseMatrix << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running on host:\n";
   forElementsIfExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << "Running on CUDA device:\n";
   forElementsIfExample< TNL::Devices::Cuda >();
#endif
}
Output
Running on host:
Dense matrix with elements set only in even rows:
Row: 0 -> 0:0 1:0 2:0 3:0 4:0
Row: 1 -> 0:0 1:0 2:0 3:0 4:0
Row: 2 -> 0:2 1:3 2:4 3:0 4:0
Row: 3 -> 0:0 1:0 2:0 3:0 4:0
Row: 4 -> 0:4 1:5 2:6 3:7 4:8
Sparse matrix with elements set only in rows where rowIdx > 1:
Row: 0 ->
Row: 1 ->
Row: 2 -> 0:3 1:4 2:5
Row: 3 -> 0:4 1:5 2:6 3:7
Row: 4 -> 0:5 1:6 2:7 3:8 4:9
Running on CUDA device:
Dense matrix with elements set only in even rows:
Row: 0 -> 0:0 1:0 2:0 3:0 4:0
Row: 1 -> 0:0 1:0 2:0 3:0 4:0
Row: 2 -> 0:2 1:3 2:4 3:0 4:0
Row: 3 -> 0:0 1:0 2:0 3:0 4:0
Row: 4 -> 0:4 1:5 2:6 3:7 4:8
Sparse matrix with elements set only in rows where rowIdx > 1:
Row: 0 ->
Row: 1 ->
Row: 2 -> 0:3 1:4 2:5
Row: 3 -> 0:4 1:5 2:6 3:7
Row: 4 -> 0:5 1:6 2:7 3:8 4:9

◆ forElementsIf() [2/2]

template<typename Matrix, typename IndexBegin, typename IndexEnd, typename Condition, typename Function>
void TNL::Matrices::forElementsIf ( Matrix & matrix,
IndexBegin begin,
IndexEnd end,
Condition && condition,
Function && function,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Iterates in parallel over all elements in a given range of rows based on a condition.

See also: Overview of Matrix Traversal Functions

For each matrix row, a condition lambda function is evaluated based on the row index. If the condition lambda function returns true, all elements of the row are traversed, and the specified lambda function is applied to each element. If the condition lambda function returns false, the row is skipped.

Template Parameters
Matrix: The type of the matrix.
IndexBegin: The type of the index defining the beginning of the interval [ begin, end ) of rows whose elements will be processed using the lambda function.
IndexEnd: The type of the index defining the end of the interval [ begin, end ) of rows whose elements will be processed using the lambda function.
Condition: The type of the condition lambda function.
Function: The type of the lambda function to be applied to each element.
Parameters
matrix: The matrix whose elements will be processed using the lambda function.
begin: The beginning of the interval [ begin, end ) of rows whose elements will be processed using the lambda function.
end: The end of the interval [ begin, end ) of rows whose elements will be processed using the lambda function.
condition: Lambda function to check the row condition. See Condition Lambda.
function: Lambda function to be applied to each element. See Traversal Function (Non-Const Matrix).
launchConfig: The launch configuration; see TNL::Algorithms::Segments::LaunchConfiguration.
Example
#include <iostream>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Matrices/traverse.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
forElementsIfExample()
{
   /***
    * Create a 5x5 dense matrix.
    */
   // Assumed declaration: this line is missing from the listing.
   TNL::Matrices::DenseMatrix< double, Device > denseMatrix( 5, 5 );
   denseMatrix.setValue( 0.0 );

   /***
    * Use forElementsIf to set elements only in even rows.
    */
   auto evenRowCondition = [] __cuda_callable__( int rowIdx ) -> bool
   {
      return rowIdx % 2 == 0;
   };

   auto setElements = [] __cuda_callable__( int rowIdx, int localIdx, int columnIdx, double& value )
   {
      if( columnIdx <= rowIdx )
         value = rowIdx + columnIdx;
   };

   TNL::Matrices::forElementsIf( denseMatrix, 0, denseMatrix.getRows(), evenRowCondition, setElements );
   std::cout << "Dense matrix with elements set only in even rows:\n";
   std::cout << denseMatrix << '\n';

   /***
    * Create a 5x5 sparse matrix.
    */
   TNL::Matrices::SparseMatrix< double, Device > sparseMatrix( { 1, 2, 3, 4, 5 }, 5 );

   /***
    * Use forElementsIf to set elements only in rows where rowIdx > 1.
    */
   auto rowCondition = [] __cuda_callable__( int rowIdx ) -> bool
   {
      return rowIdx > 1;
   };

   auto setSparseElements = [] __cuda_callable__( int rowIdx, int localIdx, int& columnIdx, double& value )
   {
      if( rowIdx >= localIdx ) {
         columnIdx = localIdx;
         value = rowIdx + localIdx + 1;
      }
   };

   TNL::Matrices::forElementsIf( sparseMatrix, 0, sparseMatrix.getRows(), rowCondition, setSparseElements );
   std::cout << "Sparse matrix with elements set only in rows where rowIdx > 1:\n";
   std::cout << sparseMatrix << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running on host:\n";
   forElementsIfExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << "Running on CUDA device:\n";
   forElementsIfExample< TNL::Devices::Cuda >();
#endif
}
Output
Running on host:
Dense matrix with elements set only in even rows:
Row: 0 -> 0:0 1:0 2:0 3:0 4:0
Row: 1 -> 0:0 1:0 2:0 3:0 4:0
Row: 2 -> 0:2 1:3 2:4 3:0 4:0
Row: 3 -> 0:0 1:0 2:0 3:0 4:0
Row: 4 -> 0:4 1:5 2:6 3:7 4:8
Sparse matrix with elements set only in rows where rowIdx > 1:
Row: 0 ->
Row: 1 ->
Row: 2 -> 0:3 1:4 2:5
Row: 3 -> 0:4 1:5 2:6 3:7
Row: 4 -> 0:5 1:6 2:7 3:8 4:9
Running on CUDA device:
Dense matrix with elements set only in even rows:
Row: 0 -> 0:0 1:0 2:0 3:0 4:0
Row: 1 -> 0:0 1:0 2:0 3:0 4:0
Row: 2 -> 0:2 1:3 2:4 3:0 4:0
Row: 3 -> 0:0 1:0 2:0 3:0 4:0
Row: 4 -> 0:4 1:5 2:6 3:7 4:8
Sparse matrix with elements set only in rows where rowIdx > 1:
Row: 0 ->
Row: 1 ->
Row: 2 -> 0:3 1:4 2:5
Row: 3 -> 0:4 1:5 2:6 3:7
Row: 4 -> 0:5 1:6 2:7 3:8 4:9

◆ forRows() [1/6]

template<typename Matrix, typename Array, typename Function, typename T = std::enable_if_t< IsArrayType< Array >::value >>
void TNL::Matrices::forRows ( const Matrix & matrix,
const Array & rowIndexes,
Function && function,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Iterates in parallel over matrix rows with the given indexes and applies the specified lambda function to each row. This function is for constant matrices.

See also: Overview of Matrix Traversal Functions

Template Parameters
Matrix: The type of the matrix.
Array: The type of the array containing the indexes of the matrix rows to iterate over. This can be a container such as TNL::Containers::Array, TNL::Containers::ArrayView, TNL::Containers::Vector, or TNL::Containers::VectorView.
Function: The type of the lambda function to be executed on each row.
Parameters
matrix: The matrix on which the lambda function will be applied.
rowIndexes: The array containing the indexes of the matrix rows to iterate over.
function: Lambda function to be applied to each row. See Row Traversal Function (Const Matrix).
launchConfig: The launch configuration; see TNL::Algorithms::Segments::LaunchConfiguration.
Example
#include <iostream>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Matrices/traverse.h>
#include <TNL/Containers/Vector.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
forRowsWithIndexesExample()
{
   /***
    * Create a 5x5 dense matrix.
    */
   // Assumed declaration: this line is missing from the listing.
   TNL::Matrices::DenseMatrix< double, Device > denseMatrix( 5, 5 );
   denseMatrix.setValue( 0.0 );

   /***
    * Create a vector with row indexes to process.
    */
   TNL::Containers::Vector< int, Device > rowIndexes{ 0, 2, 4 };

   /***
    * Use forRows with row indexes to set specific matrix rows.
    */
   // Assumed alias: this line is missing from the listing.
   using DenseRowView = typename TNL::Matrices::DenseMatrix< double, Device >::RowView;

   auto processDenseRow = [] __cuda_callable__( DenseRowView & row )
   {
      const int rowIdx = row.getRowIndex();
      for( int i = 0; i < row.getSize(); i++ )
         row.setValue( i, rowIdx * 10 + i );
   };

   TNL::Matrices::forRows( denseMatrix, rowIndexes, processDenseRow );
   std::cout << "Dense matrix with selected rows (0, 2, 4) set:\n";
   std::cout << denseMatrix << '\n';

   /***
    * Create a 5x5 sparse matrix.
    */
   TNL::Matrices::SparseMatrix< double, Device > sparseMatrix( { 1, 3, 3, 3, 1 }, 5 );

   /***
    * Use forRows with row indexes to process selected rows.
    */
   using SparseRowView = typename TNL::Matrices::SparseMatrix< double, Device >::RowView;

   auto processSparseRow = [] __cuda_callable__( SparseRowView & row )
   {
      const int rowIdx = row.getRowIndex();
      const int size = 5;

      if( rowIdx == 0 || rowIdx == size - 1 )
         row.setElement( 0, rowIdx, 2.0 );
      else {
         row.setElement( 0, rowIdx - 1, 1.0 );
         row.setElement( 1, rowIdx, 2.0 );
         row.setElement( 2, rowIdx + 1, 1.0 );
      }
   };

   TNL::Matrices::forRows( sparseMatrix, rowIndexes, processSparseRow );
   std::cout << "Sparse matrix with selected rows (0, 2, 4) set:\n";
   std::cout << sparseMatrix << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running on host:\n";
   forRowsWithIndexesExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << "Running on CUDA device:\n";
   forRowsWithIndexesExample< TNL::Devices::Cuda >();
#endif
}
Output
Running on host:
Dense matrix with selected rows (0, 2, 4) set:
Row: 0 -> 0:0 1:1 2:2 3:3 4:4
Row: 1 -> 0:0 1:0 2:0 3:0 4:0
Row: 2 -> 0:20 1:21 2:22 3:23 4:24
Row: 3 -> 0:0 1:0 2:0 3:0 4:0
Row: 4 -> 0:40 1:41 2:42 3:43 4:44
Sparse matrix with selected rows (0, 2, 4) set:
Row: 0 -> 0:2
Row: 1 ->
Row: 2 -> 1:1 2:2 3:1
Row: 3 ->
Row: 4 -> 4:2
Running on CUDA device:
Dense matrix with selected rows (0, 2, 4) set:
Row: 0 -> 0:0 1:1 2:2 3:3 4:4
Row: 1 -> 0:0 1:0 2:0 3:0 4:0
Row: 2 -> 0:20 1:21 2:22 3:23 4:24
Row: 3 -> 0:0 1:0 2:0 3:0 4:0
Row: 4 -> 0:40 1:41 2:42 3:43 4:44
Sparse matrix with selected rows (0, 2, 4) set:
Row: 0 -> 0:2
Row: 1 ->
Row: 2 -> 1:1 2:2 3:1
Row: 3 ->
Row: 4 -> 4:2

◆ forRows() [2/6]

template<typename Matrix, typename Array, typename IndexBegin, typename IndexEnd, typename Function, typename T = std::enable_if_t< IsArrayType< Array >::value && std::is_integral_v< IndexBegin > && std::is_integral_v< IndexEnd > >>
void TNL::Matrices::forRows ( const Matrix & matrix,
const Array & rowIndexes,
IndexBegin begin,
IndexEnd end,
Function && function,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Iterates in parallel over matrix rows with the given indexes and applies the specified lambda function to each row. This function is for constant matrices.
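
The const and non-const overloads of forRows differ only in how the matrix argument is taken. The sketch below shows how standard C++ overload resolution chooses between a `const Matrix&` and a `Matrix&` function template. `ToyMatrix` and `whichOverload` are hypothetical names invented for this illustration and are not part of TNL; that forRows is selected by the same rule is an assumption based on the signatures listed on this page.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a matrix type; not part of TNL.
struct ToyMatrix
{
   int rows = 5;
};

// Overload taking a constant matrix (read-only access).
template< typename Matrix >
std::string
whichOverload( const Matrix& )
{
   return "const";
}

// Overload taking a mutable matrix.
template< typename Matrix >
std::string
whichOverload( Matrix& )
{
   return "non-const";
}
```

Calling `whichOverload` with a mutable `ToyMatrix` object selects the `Matrix&` overload, while a `const ToyMatrix` object selects the `const Matrix&` overload, because partial ordering treats `const Matrix&` as the more specialized template.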

See also: Overview of Matrix Traversal Functions

Template Parameters
Matrix  The type of the matrix.
Array  The type of the array containing the indexes of the matrix rows to iterate over. This can be a container such as TNL::Containers::Array, TNL::Containers::ArrayView, TNL::Containers::Vector, or TNL::Containers::VectorView.
IndexBegin  The type of the index defining the beginning of the interval [ begin, end ) of rows on which the lambda function will be applied.
IndexEnd  The type of the index defining the end of the interval [ begin, end ) of rows on which the lambda function will be applied.
Function  The type of the lambda function to be executed on each row.
Parameters
matrix  The matrix on which the lambda function will be applied.
rowIndexes  The array containing the indexes of the matrix rows to iterate over.
begin  The beginning of the interval [ begin, end ) of row indexes whose corresponding rows will be processed using the lambda function.
end  The end of the interval [ begin, end ) of row indexes whose corresponding rows will be processed using the lambda function.
function  Lambda function to be applied to each row. See Row Traversal Function (Const Matrix).
launchConfig  The configuration of the launch - see TNL::Algorithms::Segments::LaunchConfiguration.
Example
#include <iostream>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Matrices/traverse.h>
#include <TNL/Containers/Vector.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
forRowsWithIndexesExample()
{
   /***
    * Create a 5x5 dense matrix.
    */
   TNL::Matrices::DenseMatrix< double, Device > denseMatrix( 5, 5 );
   denseMatrix.setValue( 0.0 );

   /***
    * Create a vector with row indexes to process.
    */
   TNL::Containers::Vector< int, Device > rowIndexes{ 0, 2, 4 };

   /***
    * Use forRows with row indexes to set specific matrix rows.
    */
   using DenseRowView = typename TNL::Matrices::DenseMatrix< double, Device >::RowView;

   auto processDenseRow = [] __cuda_callable__( DenseRowView & row )
   {
      const int rowIdx = row.getRowIndex();
      for( int i = 0; i < row.getSize(); i++ )
         row.setValue( i, rowIdx * 10 + i );
   };

   TNL::Matrices::forRows( denseMatrix, rowIndexes, processDenseRow );
   std::cout << "Dense matrix with selected rows (0, 2, 4) set:\n";
   std::cout << denseMatrix << '\n';

   /***
    * Create a 5x5 sparse matrix.
    */
   TNL::Matrices::SparseMatrix< double, Device > sparseMatrix( { 1, 3, 3, 3, 1 }, 5 );

   /***
    * Use forRows with row indexes to process selected rows.
    */
   using SparseRowView = typename TNL::Matrices::SparseMatrix< double, Device >::RowView;

   auto processSparseRow = [] __cuda_callable__( SparseRowView & row )
   {
      const int rowIdx = row.getRowIndex();
      const int size = 5;

      if( rowIdx == 0 || rowIdx == size - 1 )
         row.setElement( 0, rowIdx, 2.0 );
      else {
         row.setElement( 0, rowIdx - 1, 1.0 );
         row.setElement( 1, rowIdx, 2.0 );
         row.setElement( 2, rowIdx + 1, 1.0 );
      }
   };

   TNL::Matrices::forRows( sparseMatrix, rowIndexes, processSparseRow );
   std::cout << "Sparse matrix with selected rows (0, 2, 4) set:\n";
   std::cout << sparseMatrix << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running on host:\n";
   forRowsWithIndexesExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << "Running on CUDA device:\n";
   forRowsWithIndexesExample< TNL::Devices::Cuda >();
#endif
}
Output
Running on host:
Dense matrix with selected rows (0, 2, 4) set:
Row: 0 -> 0:0 1:1 2:2 3:3 4:4
Row: 1 -> 0:0 1:0 2:0 3:0 4:0
Row: 2 -> 0:20 1:21 2:22 3:23 4:24
Row: 3 -> 0:0 1:0 2:0 3:0 4:0
Row: 4 -> 0:40 1:41 2:42 3:43 4:44
Sparse matrix with selected rows (0, 2, 4) set:
Row: 0 -> 0:2
Row: 1 ->
Row: 2 -> 1:1 2:2 3:1
Row: 3 ->
Row: 4 -> 4:2
Running on CUDA device:
Dense matrix with selected rows (0, 2, 4) set:
Row: 0 -> 0:0 1:1 2:2 3:3 4:4
Row: 1 -> 0:0 1:0 2:0 3:0 4:0
Row: 2 -> 0:20 1:21 2:22 3:23 4:24
Row: 3 -> 0:0 1:0 2:0 3:0 4:0
Row: 4 -> 0:40 1:41 2:42 3:43 4:44
Sparse matrix with selected rows (0, 2, 4) set:
Row: 0 -> 0:2
Row: 1 ->
Row: 2 -> 1:1 2:2 3:1
Row: 3 ->
Row: 4 -> 4:2

◆ forRows() [3/6]

template<typename Matrix, typename IndexBegin, typename IndexEnd, typename Function, typename T = std::enable_if_t< std::is_integral_v< IndexBegin > && std::is_integral_v< IndexEnd > >>
void TNL::Matrices::forRows ( const Matrix & matrix,
IndexBegin begin,
IndexEnd end,
Function && function,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Iterates in parallel over matrix rows within the specified range of row indexes and applies the given lambda function to each row. This function is for constant matrices.

See also: Overview of Matrix Traversal Functions

Template Parameters
Matrix  The type of the matrix.
IndexBegin  The type of the index defining the beginning of the interval [ begin, end ) of matrix rows on which the lambda function will be applied.
IndexEnd  The type of the index defining the end of the interval [ begin, end ) of matrix rows on which the lambda function will be applied.
Function  The type of the lambda function to be executed on each row.
Parameters
matrix  The matrix on which the lambda function will be applied.
begin  The beginning of the interval [ begin, end ) of matrix rows that will be processed using the lambda function.
end  The end of the interval [ begin, end ) of matrix rows that will be processed using the lambda function.
function  Lambda function to be applied to each row. See Row Traversal Function (Const Matrix).
launchConfig  The configuration of the launch - see TNL::Algorithms::Segments::LaunchConfiguration.
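
The interval [ begin, end ) is half-open: row `begin` is processed and row `end` is not. The following sketch models that behaviour sequentially in plain C++ and reproduces the dense part of the example for this overload. `DenseModel`, `forRowsModel` and `lowerTriangleModel` are hypothetical names used only here; the real function runs the rows in parallel, so the lambda should not rely on any row order.

```cpp
#include <cassert>
#include <vector>

// Hypothetical dense matrix model: one std::vector per row.
using DenseModel = std::vector< std::vector< double > >;

// Sequential stand-in for a range-based row traversal: calls
// function( rowIdx, row ) for every rowIdx in [ begin, end ).
template< typename Function >
void
forRowsModel( DenseModel& matrix, int begin, int end, Function&& function )
{
   for( int rowIdx = begin; rowIdx < end; rowIdx++ )
      function( rowIdx, matrix[ rowIdx ] );
}

// Builds a 5x5 matrix in which rows 1, 2 and 3 receive rowIdx + i on and
// below the diagonal, while rows 0 and 4 stay zero.
DenseModel
lowerTriangleModel()
{
   DenseModel matrix( 5, std::vector< double >( 5, 0.0 ) );
   forRowsModel( matrix, 1, 4,
      []( int rowIdx, std::vector< double >& row )
      {
         for( int i = 0; i <= rowIdx; i++ )
            row[ i ] = rowIdx + i;
      } );
   return matrix;
}
```

With `begin = 1` and `end = 4`, row 3 of the model becomes 3, 4, 5, 6, 0 and row 4 is left untouched, which matches the "rows 1-3 processed" output of the example.
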
Example
#include <iostream>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Matrices/traverse.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
forRowsExample()
{
   /***
    * Create a 5x5 dense matrix.
    */
   TNL::Matrices::DenseMatrix< double, Device > denseMatrix( 5, 5 );
   denseMatrix.setValue( 0.0 );

   /***
    * Use forRows to process rows 1 to 3, i.e. the interval [ 1, 4 ).
    */
   using DenseRowView = typename TNL::Matrices::DenseMatrix< double, Device >::RowView;

   auto processDenseRow = [] __cuda_callable__( DenseRowView & row )
   {
      const int rowIdx = row.getRowIndex();
      for( int i = 0; i < row.getSize(); i++ )
         if( i <= rowIdx )
            row.setValue( i, rowIdx + i );
   };

   TNL::Matrices::forRows( denseMatrix, 1, 4, processDenseRow );
   std::cout << "Dense matrix with rows 1-3 processed:\n";
   std::cout << denseMatrix << '\n';

   /***
    * Create a 5x5 sparse matrix.
    */
   TNL::Matrices::SparseMatrix< double, Device > sparseMatrix( { 1, 3, 3, 3, 1 }, 5 );

   /***
    * Use forRows to set up a tridiagonal structure.
    */
   using SparseRowView = typename TNL::Matrices::SparseMatrix< double, Device >::RowView;

   auto processSparseRow = [] __cuda_callable__( SparseRowView & row )
   {
      const int rowIdx = row.getRowIndex();
      const int size = 5;

      if( rowIdx == 0 )
         row.setElement( 0, rowIdx, 2.0 );  // diagonal element
      else if( rowIdx == size - 1 )
         row.setElement( 0, rowIdx, 2.0 );  // diagonal element
      else {
         row.setElement( 0, rowIdx - 1, 1.0 );  // below diagonal
         row.setElement( 1, rowIdx, 2.0 );      // diagonal
         row.setElement( 2, rowIdx + 1, 1.0 );  // above diagonal
      }
   };

   TNL::Matrices::forRows( sparseMatrix, 0, 5, processSparseRow );
   std::cout << "Sparse tridiagonal matrix:\n";
   std::cout << sparseMatrix << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running on host:\n";
   forRowsExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << "Running on CUDA device:\n";
   forRowsExample< TNL::Devices::Cuda >();
#endif
}
Output
Running on host:
Dense matrix with rows 1-3 processed:
Row: 0 -> 0:0 1:0 2:0 3:0 4:0
Row: 1 -> 0:1 1:2 2:0 3:0 4:0
Row: 2 -> 0:2 1:3 2:4 3:0 4:0
Row: 3 -> 0:3 1:4 2:5 3:6 4:0
Row: 4 -> 0:0 1:0 2:0 3:0 4:0
Sparse tridiagonal matrix:
Row: 0 -> 0:2
Row: 1 -> 0:1 1:2 2:1
Row: 2 -> 1:1 2:2 3:1
Row: 3 -> 2:1 3:2 4:1
Row: 4 -> 4:2
Running on CUDA device:
Dense matrix with rows 1-3 processed:
Row: 0 -> 0:0 1:0 2:0 3:0 4:0
Row: 1 -> 0:1 1:2 2:0 3:0 4:0
Row: 2 -> 0:2 1:3 2:4 3:0 4:0
Row: 3 -> 0:3 1:4 2:5 3:6 4:0
Row: 4 -> 0:0 1:0 2:0 3:0 4:0
Sparse tridiagonal matrix:
Row: 0 -> 0:2
Row: 1 -> 0:1 1:2 2:1
Row: 2 -> 1:1 2:2 3:1
Row: 3 -> 2:1 3:2 4:1
Row: 4 -> 4:2
#include <iostream>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Matrices/traverse.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
forRowsExample2()
{
   /***
    * Create a sparse matrix and set up a tridiagonal structure.
    */
   const int size = 5;
   TNL::Matrices::SparseMatrix< double, Device > matrix( { 1, 3, 3, 3, 1 }, size );

   using RowView = typename TNL::Matrices::SparseMatrix< double, Device >::RowView;

   auto setupRow = [] __cuda_callable__( RowView & row )
   {
      const int rowIdx = row.getRowIndex();
      const int size = 5;

      if( rowIdx == 0 )
         row.setElement( 0, rowIdx, 2.0 );
      else if( rowIdx == size - 1 )
         row.setElement( 0, rowIdx, 2.0 );
      else {
         row.setElement( 0, rowIdx - 1, 1.0 );
         row.setElement( 1, rowIdx, 2.0 );
         row.setElement( 2, rowIdx + 1, 1.0 );
      }
   };

   TNL::Matrices::forRows( matrix, 0, size, setupRow );
   std::cout << "Initial tridiagonal matrix:\n";
   std::cout << matrix << '\n';

   /***
    * Normalize each row by dividing by the sum of its elements.
    */
   auto normalizeRow = [] __cuda_callable__( RowView & row )
   {
      double sum = 0.0;
      for( auto element : row )
         sum += element.value();

      for( auto element : row )
         element.value() /= sum;
   };

   TNL::Matrices::forRows( matrix, 0, size, normalizeRow );
   std::cout << "Row-normalized matrix:\n";
   std::cout << matrix << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running on host:\n";
   forRowsExample2< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << "Running on CUDA device:\n";
   forRowsExample2< TNL::Devices::Cuda >();
#endif
}
Output
Running on host:
Initial tridiagonal matrix:
Row: 0 -> 0:2
Row: 1 -> 0:1 1:2 2:1
Row: 2 -> 1:1 2:2 3:1
Row: 3 -> 2:1 3:2 4:1
Row: 4 -> 4:2
Row-normalized matrix:
Row: 0 -> 0:1
Row: 1 -> 0:0.25 1:0.5 2:0.25
Row: 2 -> 1:0.25 2:0.5 3:0.25
Row: 3 -> 2:0.25 3:0.5 4:0.25
Row: 4 -> 4:1
Running on CUDA device:
Initial tridiagonal matrix:
Row: 0 -> 0:2
Row: 1 -> 0:1 1:2 2:1
Row: 2 -> 1:1 2:2 3:1
Row: 3 -> 2:1 3:2 4:1
Row: 4 -> 4:2
Row-normalized matrix:
Row: 0 -> 0:1
Row: 1 -> 0:0.25 1:0.5 2:0.25
Row: 2 -> 1:0.25 2:0.5 3:0.25
Row: 3 -> 2:0.25 3:0.5 4:0.25
Row: 4 -> 4:1
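
The row normalization in the second example is plain arithmetic: an inner row stores 1, 2, 1, whose sum is 4, so the row becomes 0.25, 0.5, 0.25, and a boundary row stores only 2, which becomes 1. A minimal sketch of the same computation on a single row, written against `std::vector` instead of a TNL row view (`normalizeRowModel` is a hypothetical helper, not a TNL function):

```cpp
#include <cassert>
#include <vector>

// Divides every stored value of one row by the sum of the row's values.
// Assumes the sum is nonzero.
void
normalizeRowModel( std::vector< double >& values )
{
   double sum = 0.0;
   for( double v : values )
      sum += v;

   // Iterate by reference so that the division modifies the stored values.
   for( double& v : values )
      v /= sum;
}
```

The model iterates with `double&` to write back into the row. The TNL example iterates with `auto element` and still changes the matrix, as its output shows, which suggests that `element.value()` refers to the matrix storage rather than to a copy.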

◆ forRows() [4/6]

template<typename Matrix, typename Array, typename Function, typename T = std::enable_if_t< IsArrayType< Array >::value >>
void TNL::Matrices::forRows ( Matrix & matrix,
const Array & rowIndexes,
Function && function,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Iterates in parallel over matrix rows with the given indexes and applies the specified lambda function to each row.

See also: Overview of Matrix Traversal Functions

Template Parameters
Matrix  The type of the matrix.
Array  The type of the array containing the indexes of the matrix rows to iterate over. This can be a container such as TNL::Containers::Array, TNL::Containers::ArrayView, TNL::Containers::Vector, or TNL::Containers::VectorView.
Function  The type of the lambda function to be executed on each row.
Parameters
matrix  The matrix on which the lambda function will be applied.
rowIndexes  The array containing the indexes of the matrix rows to iterate over.
function  Lambda function to be applied to each row. See Row Traversal Function (Non-Const Matrix).
launchConfig  The configuration of the launch - see TNL::Algorithms::Segments::LaunchConfiguration.
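
Traversal by an index array visits only the rows whose indexes are stored in `rowIndexes`; all other rows are left as they are. The sketch below models this sequentially with `std::vector`. The names `DenseModel`, `forSelectedRowsModel` and `selectedRowsModel` are hypothetical and exist only in this illustration; the real function processes the selected rows in parallel.

```cpp
#include <cassert>
#include <vector>

// Hypothetical dense matrix model: one std::vector per row.
using DenseModel = std::vector< std::vector< double > >;

// Sequential stand-in: calls function( rowIdx, row ) once for each
// entry of rowIndexes, in the order the entries are stored.
template< typename Function >
void
forSelectedRowsModel( DenseModel& matrix, const std::vector< int >& rowIndexes, Function&& function )
{
   for( int rowIdx : rowIndexes )
      function( rowIdx, matrix[ rowIdx ] );
}

// Builds a 5x5 matrix in which only rows 0, 2 and 4 are filled
// with rowIdx * 10 + columnIdx.
DenseModel
selectedRowsModel()
{
   DenseModel matrix( 5, std::vector< double >( 5, 0.0 ) );
   forSelectedRowsModel( matrix, { 0, 2, 4 },
      []( int rowIdx, std::vector< double >& row )
      {
         for( int i = 0; i < static_cast< int >( row.size() ); i++ )
            row[ i ] = rowIdx * 10 + i;
      } );
   return matrix;
}
```

Row 2 of the model holds 20, 21, 22, 23, 24 and rows 1 and 3 remain zero, in line with the dense output of the example for this overload.
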
Example
#include <iostream>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Matrices/traverse.h>
#include <TNL/Containers/Vector.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
forRowsWithIndexesExample()
{
   /***
    * Create a 5x5 dense matrix.
    */
   TNL::Matrices::DenseMatrix< double, Device > denseMatrix( 5, 5 );
   denseMatrix.setValue( 0.0 );

   /***
    * Create a vector with row indexes to process.
    */
   TNL::Containers::Vector< int, Device > rowIndexes{ 0, 2, 4 };

   /***
    * Use forRows with row indexes to set specific matrix rows.
    */
   using DenseRowView = typename TNL::Matrices::DenseMatrix< double, Device >::RowView;

   auto processDenseRow = [] __cuda_callable__( DenseRowView & row )
   {
      const int rowIdx = row.getRowIndex();
      for( int i = 0; i < row.getSize(); i++ )
         row.setValue( i, rowIdx * 10 + i );
   };

   TNL::Matrices::forRows( denseMatrix, rowIndexes, processDenseRow );
   std::cout << "Dense matrix with selected rows (0, 2, 4) set:\n";
   std::cout << denseMatrix << '\n';

   /***
    * Create a 5x5 sparse matrix.
    */
   TNL::Matrices::SparseMatrix< double, Device > sparseMatrix( { 1, 3, 3, 3, 1 }, 5 );

   /***
    * Use forRows with row indexes to process selected rows.
    */
   using SparseRowView = typename TNL::Matrices::SparseMatrix< double, Device >::RowView;

   auto processSparseRow = [] __cuda_callable__( SparseRowView & row )
   {
      const int rowIdx = row.getRowIndex();
      const int size = 5;

      if( rowIdx == 0 || rowIdx == size - 1 )
         row.setElement( 0, rowIdx, 2.0 );
      else {
         row.setElement( 0, rowIdx - 1, 1.0 );
         row.setElement( 1, rowIdx, 2.0 );
         row.setElement( 2, rowIdx + 1, 1.0 );
      }
   };

   TNL::Matrices::forRows( sparseMatrix, rowIndexes, processSparseRow );
   std::cout << "Sparse matrix with selected rows (0, 2, 4) set:\n";
   std::cout << sparseMatrix << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running on host:\n";
   forRowsWithIndexesExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << "Running on CUDA device:\n";
   forRowsWithIndexesExample< TNL::Devices::Cuda >();
#endif
}
Output
Running on host:
Dense matrix with selected rows (0, 2, 4) set:
Row: 0 -> 0:0 1:1 2:2 3:3 4:4
Row: 1 -> 0:0 1:0 2:0 3:0 4:0
Row: 2 -> 0:20 1:21 2:22 3:23 4:24
Row: 3 -> 0:0 1:0 2:0 3:0 4:0
Row: 4 -> 0:40 1:41 2:42 3:43 4:44
Sparse matrix with selected rows (0, 2, 4) set:
Row: 0 -> 0:2
Row: 1 ->
Row: 2 -> 1:1 2:2 3:1
Row: 3 ->
Row: 4 -> 4:2
Running on CUDA device:
Dense matrix with selected rows (0, 2, 4) set:
Row: 0 -> 0:0 1:1 2:2 3:3 4:4
Row: 1 -> 0:0 1:0 2:0 3:0 4:0
Row: 2 -> 0:20 1:21 2:22 3:23 4:24
Row: 3 -> 0:0 1:0 2:0 3:0 4:0
Row: 4 -> 0:40 1:41 2:42 3:43 4:44
Sparse matrix with selected rows (0, 2, 4) set:
Row: 0 -> 0:2
Row: 1 ->
Row: 2 -> 1:1 2:2 3:1
Row: 3 ->
Row: 4 -> 4:2

◆ forRows() [5/6]

template<typename Matrix, typename Array, typename IndexBegin, typename IndexEnd, typename Function, typename T = std::enable_if_t< IsArrayType< Array >::value && std::is_integral_v< IndexBegin > && std::is_integral_v< IndexEnd > >>
void TNL::Matrices::forRows ( Matrix & matrix,
const Array & rowIndexes,
IndexBegin begin,
IndexEnd end,
Function && function,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Iterates in parallel over matrix rows with the given indexes and applies the specified lambda function to each row.

See also: Overview of Matrix Traversal Functions

Template Parameters
Matrix  The type of the matrix.
Array  The type of the array containing the indexes of the matrix rows to iterate over. This can be a container such as TNL::Containers::Array, TNL::Containers::ArrayView, TNL::Containers::Vector, or TNL::Containers::VectorView.
IndexBegin  The type of the index defining the beginning of the interval [ begin, end ) of rows on which the lambda function will be applied.
IndexEnd  The type of the index defining the end of the interval [ begin, end ) of rows on which the lambda function will be applied.
Function  The type of the lambda function to be executed on each row.
Parameters
matrix  The matrix on which the lambda function will be applied.
rowIndexes  The array containing the indexes of the matrix rows to iterate over.
begin  The beginning of the interval [ begin, end ) of row indexes whose corresponding rows will be processed using the lambda function.
end  The end of the interval [ begin, end ) of row indexes whose corresponding rows will be processed using the lambda function.
function  Lambda function to be applied to each row. See Row Traversal Function (Non-Const Matrix).
launchConfig  The configuration of the launch - see TNL::Algorithms::Segments::LaunchConfiguration.
Example
#include <iostream>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Matrices/traverse.h>
#include <TNL/Containers/Vector.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
forRowsWithIndexesExample()
{
   /***
    * Create a 5x5 dense matrix.
    */
   TNL::Matrices::DenseMatrix< double, Device > denseMatrix( 5, 5 );
   denseMatrix.setValue( 0.0 );

   /***
    * Create a vector with row indexes to process.
    */
   TNL::Containers::Vector< int, Device > rowIndexes{ 0, 2, 4 };

   /***
    * Use forRows with row indexes to set specific matrix rows.
    */
   using DenseRowView = typename TNL::Matrices::DenseMatrix< double, Device >::RowView;

   auto processDenseRow = [] __cuda_callable__( DenseRowView & row )
   {
      const int rowIdx = row.getRowIndex();
      for( int i = 0; i < row.getSize(); i++ )
         row.setValue( i, rowIdx * 10 + i );
   };

   TNL::Matrices::forRows( denseMatrix, rowIndexes, processDenseRow );
   std::cout << "Dense matrix with selected rows (0, 2, 4) set:\n";
   std::cout << denseMatrix << '\n';

   /***
    * Create a 5x5 sparse matrix.
    */
   TNL::Matrices::SparseMatrix< double, Device > sparseMatrix( { 1, 3, 3, 3, 1 }, 5 );

   /***
    * Use forRows with row indexes to process selected rows.
    */
   using SparseRowView = typename TNL::Matrices::SparseMatrix< double, Device >::RowView;

   auto processSparseRow = [] __cuda_callable__( SparseRowView & row )
   {
      const int rowIdx = row.getRowIndex();
      const int size = 5;

      if( rowIdx == 0 || rowIdx == size - 1 )
         row.setElement( 0, rowIdx, 2.0 );
      else {
         row.setElement( 0, rowIdx - 1, 1.0 );
         row.setElement( 1, rowIdx, 2.0 );
         row.setElement( 2, rowIdx + 1, 1.0 );
      }
   };

   TNL::Matrices::forRows( sparseMatrix, rowIndexes, processSparseRow );
   std::cout << "Sparse matrix with selected rows (0, 2, 4) set:\n";
   std::cout << sparseMatrix << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running on host:\n";
   forRowsWithIndexesExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << "Running on CUDA device:\n";
   forRowsWithIndexesExample< TNL::Devices::Cuda >();
#endif
}
Output
Running on host:
Dense matrix with selected rows (0, 2, 4) set:
Row: 0 -> 0:0 1:1 2:2 3:3 4:4
Row: 1 -> 0:0 1:0 2:0 3:0 4:0
Row: 2 -> 0:20 1:21 2:22 3:23 4:24
Row: 3 -> 0:0 1:0 2:0 3:0 4:0
Row: 4 -> 0:40 1:41 2:42 3:43 4:44
Sparse matrix with selected rows (0, 2, 4) set:
Row: 0 -> 0:2
Row: 1 ->
Row: 2 -> 1:1 2:2 3:1
Row: 3 ->
Row: 4 -> 4:2
Running on CUDA device:
Dense matrix with selected rows (0, 2, 4) set:
Row: 0 -> 0:0 1:1 2:2 3:3 4:4
Row: 1 -> 0:0 1:0 2:0 3:0 4:0
Row: 2 -> 0:20 1:21 2:22 3:23 4:24
Row: 3 -> 0:0 1:0 2:0 3:0 4:0
Row: 4 -> 0:40 1:41 2:42 3:43 4:44
Sparse matrix with selected rows (0, 2, 4) set:
Row: 0 -> 0:2
Row: 1 ->
Row: 2 -> 1:1 2:2 3:1
Row: 3 ->
Row: 4 -> 4:2

◆ forRows() [6/6]

template<typename Matrix, typename IndexBegin, typename IndexEnd, typename Function, typename T = std::enable_if_t< std::is_integral_v< IndexBegin > && std::is_integral_v< IndexEnd > >>
void TNL::Matrices::forRows ( Matrix & matrix,
IndexBegin begin,
IndexEnd end,
Function && function,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Iterates in parallel over matrix rows within the specified range of row indexes and applies the given lambda function to each row.
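
The defaulted template parameter `T` in the signature above is built with `std::enable_if_t`, which removes the overload from consideration unless both index types are integral. Because `launchConfig` has a default value, a call with an index array and a call with an index range can have the same number of arguments, so such a constraint is one way to keep them apart. The sketch below illustrates the technique with hypothetical names (`IsArrayLike`, `LaunchModel`, `describeCall`); it is not TNL code, and the trait only approximates the role of `IsArrayType`.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical trait standing in for IsArrayType; true only for std::vector here.
template< typename T >
struct IsArrayLike : std::false_type
{};

template< typename T >
struct IsArrayLike< std::vector< T > > : std::true_type
{};

// Hypothetical placeholder for a launch configuration object.
struct LaunchModel
{};

// Participates in overload resolution only when Array is array-like.
template< typename Array,
          typename Function,
          typename T = std::enable_if_t< IsArrayLike< Array >::value > >
std::string
describeCall( const Array&, Function&&, LaunchModel = LaunchModel() )
{
   return "index array";
}

// Participates only when both bounds are integral types.
template< typename IndexBegin,
          typename IndexEnd,
          typename Function,
          typename T = std::enable_if_t< std::is_integral_v< IndexBegin > && std::is_integral_v< IndexEnd > > >
std::string
describeCall( IndexBegin, IndexEnd, Function&&, LaunchModel = LaunchModel() )
{
   return "index range";
}
```

A three-argument call such as `describeCall( indexes, f, LaunchModel{} )` could match either template by argument count alone; the `enable_if_t` condition discards the range overload because a `std::vector` is not an integral type.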

See also: Overview of Matrix Traversal Functions

Template Parameters
Matrix  The type of the matrix.
IndexBegin  The type of the index defining the beginning of the interval [ begin, end ) of matrix rows on which the lambda function will be applied.
IndexEnd  The type of the index defining the end of the interval [ begin, end ) of matrix rows on which the lambda function will be applied.
Function  The type of the lambda function to be executed on each row.
Parameters
matrix  The matrix on which the lambda function will be applied.
begin  The beginning of the interval [ begin, end ) of matrix rows that will be processed using the lambda function.
end  The end of the interval [ begin, end ) of matrix rows that will be processed using the lambda function.
function  Lambda function to be applied to each row. See Row Traversal Function (Non-Const Matrix).
launchConfig  The configuration of the launch - see TNL::Algorithms::Segments::LaunchConfiguration.
Example
#include <iostream>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Matrices/traverse.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
forRowsExample()
{
   /***
    * Create a 5x5 dense matrix.
    */
   TNL::Matrices::DenseMatrix< double, Device > denseMatrix( 5, 5 );
   denseMatrix.setValue( 0.0 );

   /***
    * Use forRows to process rows 1 to 3, i.e. the interval [ 1, 4 ).
    */
   using DenseRowView = typename TNL::Matrices::DenseMatrix< double, Device >::RowView;

   auto processDenseRow = [] __cuda_callable__( DenseRowView & row )
   {
      const int rowIdx = row.getRowIndex();
      for( int i = 0; i < row.getSize(); i++ )
         if( i <= rowIdx )
            row.setValue( i, rowIdx + i );
   };

   TNL::Matrices::forRows( denseMatrix, 1, 4, processDenseRow );
   std::cout << "Dense matrix with rows 1-3 processed:\n";
   std::cout << denseMatrix << '\n';

   /***
    * Create a 5x5 sparse matrix.
    */
   TNL::Matrices::SparseMatrix< double, Device > sparseMatrix( { 1, 3, 3, 3, 1 }, 5 );

   /***
    * Use forRows to set up a tridiagonal structure.
    */
   using SparseRowView = typename TNL::Matrices::SparseMatrix< double, Device >::RowView;

   auto processSparseRow = [] __cuda_callable__( SparseRowView & row )
   {
      const int rowIdx = row.getRowIndex();
      const int size = 5;

      if( rowIdx == 0 )
         row.setElement( 0, rowIdx, 2.0 );  // diagonal element
      else if( rowIdx == size - 1 )
         row.setElement( 0, rowIdx, 2.0 );  // diagonal element
      else {
         row.setElement( 0, rowIdx - 1, 1.0 );  // below diagonal
         row.setElement( 1, rowIdx, 2.0 );      // diagonal
         row.setElement( 2, rowIdx + 1, 1.0 );  // above diagonal
      }
   };

   TNL::Matrices::forRows( sparseMatrix, 0, 5, processSparseRow );
   std::cout << "Sparse tridiagonal matrix:\n";
   std::cout << sparseMatrix << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running on host:\n";
   forRowsExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << "Running on CUDA device:\n";
   forRowsExample< TNL::Devices::Cuda >();
#endif
}
Output
Running on host:
Dense matrix with rows 1-3 processed:
Row: 0 -> 0:0 1:0 2:0 3:0 4:0
Row: 1 -> 0:1 1:2 2:0 3:0 4:0
Row: 2 -> 0:2 1:3 2:4 3:0 4:0
Row: 3 -> 0:3 1:4 2:5 3:6 4:0
Row: 4 -> 0:0 1:0 2:0 3:0 4:0
Sparse tridiagonal matrix:
Row: 0 -> 0:2
Row: 1 -> 0:1 1:2 2:1
Row: 2 -> 1:1 2:2 3:1
Row: 3 -> 2:1 3:2 4:1
Row: 4 -> 4:2
Running on CUDA device:
Dense matrix with rows 1-3 processed:
Row: 0 -> 0:0 1:0 2:0 3:0 4:0
Row: 1 -> 0:1 1:2 2:0 3:0 4:0
Row: 2 -> 0:2 1:3 2:4 3:0 4:0
Row: 3 -> 0:3 1:4 2:5 3:6 4:0
Row: 4 -> 0:0 1:0 2:0 3:0 4:0
Sparse tridiagonal matrix:
Row: 0 -> 0:2
Row: 1 -> 0:1 1:2 2:1
Row: 2 -> 1:1 2:2 3:1
Row: 3 -> 2:1 3:2 4:1
Row: 4 -> 4:2
#include <iostream>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Matrices/traverse.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
forRowsExample2()
{
   /***
    * Create a sparse matrix and set up a tridiagonal structure.
    */
   const int size = 5;
   TNL::Matrices::SparseMatrix< double, Device > matrix( { 1, 3, 3, 3, 1 }, size );

   using RowView = typename TNL::Matrices::SparseMatrix< double, Device >::RowView;

   auto setupRow = [] __cuda_callable__( RowView & row )
   {
      const int rowIdx = row.getRowIndex();
      const int size = 5;

      if( rowIdx == 0 )
         row.setElement( 0, rowIdx, 2.0 );
      else if( rowIdx == size - 1 )
         row.setElement( 0, rowIdx, 2.0 );
      else {
         row.setElement( 0, rowIdx - 1, 1.0 );
         row.setElement( 1, rowIdx, 2.0 );
         row.setElement( 2, rowIdx + 1, 1.0 );
      }
   };

   TNL::Matrices::forRows( matrix, 0, size, setupRow );
   std::cout << "Initial tridiagonal matrix:\n";
   std::cout << matrix << '\n';

   /***
    * Normalize each row by dividing by the sum of its elements.
    */
   auto normalizeRow = [] __cuda_callable__( RowView & row )
   {
      double sum = 0.0;
      for( auto element : row )
         sum += element.value();

      for( auto element : row )
         element.value() /= sum;
   };

   TNL::Matrices::forRows( matrix, 0, size, normalizeRow );
   std::cout << "Row-normalized matrix:\n";
   std::cout << matrix << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running on host:\n";
   forRowsExample2< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << "Running on CUDA device:\n";
   forRowsExample2< TNL::Devices::Cuda >();
#endif
}
Output
Running on host:
Initial tridiagonal matrix:
Row: 0 -> 0:2
Row: 1 -> 0:1 1:2 2:1
Row: 2 -> 1:1 2:2 3:1
Row: 3 -> 2:1 3:2 4:1
Row: 4 -> 4:2
Row-normalized matrix:
Row: 0 -> 0:1
Row: 1 -> 0:0.25 1:0.5 2:0.25
Row: 2 -> 1:0.25 2:0.5 3:0.25
Row: 3 -> 2:0.25 3:0.5 4:0.25
Row: 4 -> 4:1
Running on CUDA device:
Initial tridiagonal matrix:
Row: 0 -> 0:2
Row: 1 -> 0:1 1:2 2:1
Row: 2 -> 1:1 2:2 3:1
Row: 3 -> 2:1 3:2 4:1
Row: 4 -> 4:2
Row-normalized matrix:
Row: 0 -> 0:1
Row: 1 -> 0:0.25 1:0.5 2:0.25
Row: 2 -> 1:0.25 2:0.5 3:0.25
Row: 3 -> 2:0.25 3:0.5 4:0.25
Row: 4 -> 4:1

◆ forRowsIf() [1/2]

template<typename Matrix, typename IndexBegin, typename IndexEnd, typename RowCondition, typename Function, typename T = std::enable_if_t< std::is_integral_v< IndexBegin > && std::is_integral_v< IndexEnd > >>
void TNL::Matrices::forRowsIf ( const Matrix & matrix,
IndexBegin begin,
IndexEnd end,
RowCondition && rowCondition,
Function && function,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Iterates in parallel over rows within the given range of row indexes, applying a condition to determine whether each row should be processed. This function is for constant matrices.

See also: Overview of Matrix Traversal Functions

For each row, a condition lambda function is evaluated based on the row index. If the condition lambda function returns true, the specified lambda function is executed for the row. If the condition lambda function returns false, the row is skipped.
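
The rule described above can be modelled sequentially as follows. This is a minimal sketch in plain C++ with hypothetical names (`forRowsIfModel`, `processedRows`), not the TNL implementation; in particular the real function evaluates rows in parallel, so collecting indexes into a shared container as done here would not be safe there.

```cpp
#include <cassert>
#include <vector>

// Sequential stand-in for a conditional row traversal: for each rowIdx in
// [ begin, end ), function( rowIdx ) runs only if rowCondition( rowIdx ) is true.
template< typename RowCondition, typename Function >
void
forRowsIfModel( int begin, int end, RowCondition&& rowCondition, Function&& function )
{
   for( int rowIdx = begin; rowIdx < end; rowIdx++ )
      if( rowCondition( rowIdx ) )
         function( rowIdx );
}

// Returns the indexes of the rows in [ begin, end ) that pass the condition.
template< typename RowCondition >
std::vector< int >
processedRows( int begin, int end, RowCondition&& rowCondition )
{
   std::vector< int > visited;
   forRowsIfModel( begin, end, rowCondition,
      [ &visited ]( int rowIdx )
      {
         visited.push_back( rowIdx );
      } );
   return visited;
}
```

For the range [ 0, 5 ), the condition `rowIdx % 2 == 0` selects rows 0, 2 and 4, and the condition `rowIdx > 0 && rowIdx < 4` selects rows 1, 2 and 3; rows that fail the condition are skipped without calling the function.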

Template Parameters
Matrix  The type of the matrix.
IndexBegin  The type of the index defining the beginning of the interval [ begin, end ) of rows on which the lambda function will be applied.
IndexEnd  The type of the index defining the end of the interval [ begin, end ) of rows on which the lambda function will be applied.
RowCondition  The type of the condition lambda function.
Function  The type of the lambda function to be executed on each row.
Parameters
matrix  The matrix on which the lambda function will be applied.
begin  The beginning of the interval [ begin, end ) of row indexes whose corresponding rows will be processed using the lambda function.
end  The end of the interval [ begin, end ) of row indexes whose corresponding rows will be processed using the lambda function.
rowCondition  Lambda function to check row condition. See Condition Lambda.
function  Lambda function to be applied to each row. See Row Traversal Function (Const Matrix).
launchConfig  The configuration of the launch - see TNL::Algorithms::Segments::LaunchConfiguration.
Example
#include <iostream>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Matrices/traverse.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
forRowsIfExample()
{
   /***
    * Create a 5x5 dense matrix.
    */
   TNL::Matrices::DenseMatrix< double, Device > denseMatrix( 5, 5 );
   denseMatrix.setValue( 0.0 );

   /***
    * Use forRowsIf to process only even-numbered rows.
    */
   auto evenRowCondition = [] __cuda_callable__( int rowIdx ) -> bool
   {
      return rowIdx % 2 == 0;
   };

   using DenseRowView = typename TNL::Matrices::DenseMatrix< double, Device >::RowView;

   auto processDenseRow = [] __cuda_callable__( DenseRowView & row )
   {
      const int rowIdx = row.getRowIndex();
      for( int i = 0; i < row.getSize(); i++ )
         row.setValue( i, rowIdx + i );
   };

   TNL::Matrices::forRowsIf( denseMatrix, 0, 5, evenRowCondition, processDenseRow );
   std::cout << "Dense matrix with only even rows set:\n";
   std::cout << denseMatrix << '\n';

   /***
    * Create a 5x5 sparse matrix.
    */
   TNL::Matrices::SparseMatrix< double, Device > sparseMatrix( { 1, 3, 3, 3, 1 }, 5 );

   /***
    * Use forRowsIf to process only rows where rowIdx > 0 and rowIdx < 4.
    */
   auto innerRowCondition = [] __cuda_callable__( int rowIdx ) -> bool
   {
      return rowIdx > 0 && rowIdx < 4;
   };

   using SparseRowView = typename TNL::Matrices::SparseMatrix< double, Device >::RowView;

   auto processSparseRow = [] __cuda_callable__( SparseRowView & row )
   {
      const int rowIdx = row.getRowIndex();
      row.setElement( 0, rowIdx - 1, 1.0 );
      row.setElement( 1, rowIdx, 2.0 );
      row.setElement( 2, rowIdx + 1, 1.0 );
   };

   TNL::Matrices::forRowsIf( sparseMatrix, 0, 5, innerRowCondition, processSparseRow );
   std::cout << "Sparse matrix with only inner rows (1-3) set:\n";
   std::cout << sparseMatrix << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running on host:\n";
   forRowsIfExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << "Running on CUDA device:\n";
   forRowsIfExample< TNL::Devices::Cuda >();
#endif
}
Output
Running on host:
Dense matrix with only even rows set:
Row: 0 -> 0:0 1:1 2:2 3:3 4:4
Row: 1 -> 0:0 1:0 2:0 3:0 4:0
Row: 2 -> 0:2 1:3 2:4 3:5 4:6
Row: 3 -> 0:0 1:0 2:0 3:0 4:0
Row: 4 -> 0:4 1:5 2:6 3:7 4:8
Sparse matrix with only inner rows (1-3) set:
Row: 0 ->
Row: 1 -> 0:1 1:2 2:1
Row: 2 -> 1:1 2:2 3:1
Row: 3 -> 2:1 3:2 4:1
Row: 4 ->
Running on CUDA device:
Dense matrix with only even rows set:
Row: 0 -> 0:0 1:1 2:2 3:3 4:4
Row: 1 -> 0:0 1:0 2:0 3:0 4:0
Row: 2 -> 0:2 1:3 2:4 3:5 4:6
Row: 3 -> 0:0 1:0 2:0 3:0 4:0
Row: 4 -> 0:4 1:5 2:6 3:7 4:8
Sparse matrix with only inner rows (1-3) set:
Row: 0 ->
Row: 1 -> 0:1 1:2 2:1
Row: 2 -> 1:1 2:2 3:1
Row: 3 -> 2:1 3:2 4:1
Row: 4 ->

◆ forRowsIf() [2/2]

template<typename Matrix, typename IndexBegin, typename IndexEnd, typename RowCondition, typename Function, typename T = std::enable_if_t< std::is_integral_v< IndexBegin > && std::is_integral_v< IndexEnd > >>
void TNL::Matrices::forRowsIf ( Matrix & matrix,
IndexBegin begin,
IndexEnd end,
RowCondition && rowCondition,
Function && function,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Iterates in parallel over rows within the given range of row indexes, applying a condition to determine whether each row should be processed.

See also: Overview of Matrix Traversal Functions

For each row, a condition lambda function is evaluated based on the row index. If the condition lambda function returns true, the specified lambda function is executed for the row. If the condition lambda function returns false, the row is skipped.

Template Parameters
Matrix: The type of the matrix.
IndexBegin: The type of the index defining the beginning of the interval [ begin, end ) of rows on which the lambda function will be applied.
IndexEnd: The type of the index defining the end of the interval [ begin, end ) of rows on which the lambda function will be applied.
RowCondition: The type of the condition lambda function.
Function: The type of the lambda function to be executed on each row.
Parameters
matrix: The matrix on which the lambda function will be applied.
begin: The beginning of the interval [ begin, end ) of row indexes whose corresponding rows will be processed using the lambda function.
end: The end of the interval [ begin, end ) of row indexes whose corresponding rows will be processed using the lambda function.
rowCondition: Lambda function to check row condition. See Condition Lambda.
function: Lambda function to be applied to each row. See Row Traversal Function (Non-Const Matrix).
launchConfig: The configuration of the launch. See TNL::Algorithms::Segments::LaunchConfiguration.
Example
#include <iostream>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Matrices/traverse.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
forRowsIfExample()
{
   /***
    * Create a 5x5 dense matrix.
    */
   TNL::Matrices::DenseMatrix< double, Device > denseMatrix( 5, 5 );
   denseMatrix.setValue( 0.0 );

   /***
    * Use forRowsIf to process only even-numbered rows.
    */
   auto evenRowCondition = [] __cuda_callable__( int rowIdx ) -> bool
   {
      return rowIdx % 2 == 0;
   };

   using DenseRowView = typename TNL::Matrices::DenseMatrix< double, Device >::RowView;

   auto processDenseRow = [] __cuda_callable__( DenseRowView & row )
   {
      const int rowIdx = row.getRowIndex();
      for( int i = 0; i < row.getSize(); i++ )
         row.setValue( i, rowIdx + i );
   };

   TNL::Matrices::forRowsIf( denseMatrix, 0, 5, evenRowCondition, processDenseRow );
   std::cout << "Dense matrix with only even rows set:\n";
   std::cout << denseMatrix << '\n';

   /***
    * Create a 5x5 sparse matrix.
    */
   TNL::Matrices::SparseMatrix< double, Device > sparseMatrix( { 1, 3, 3, 3, 1 }, 5 );

   /***
    * Use forRowsIf to process only rows where rowIdx > 0 and rowIdx < 4.
    */
   auto innerRowCondition = [] __cuda_callable__( int rowIdx ) -> bool
   {
      return rowIdx > 0 && rowIdx < 4;
   };

   using SparseRowView = typename TNL::Matrices::SparseMatrix< double, Device >::RowView;

   auto processSparseRow = [] __cuda_callable__( SparseRowView & row )
   {
      const int rowIdx = row.getRowIndex();
      row.setElement( 0, rowIdx - 1, 1.0 );
      row.setElement( 1, rowIdx, 2.0 );
      row.setElement( 2, rowIdx + 1, 1.0 );
   };

   TNL::Matrices::forRowsIf( sparseMatrix, 0, 5, innerRowCondition, processSparseRow );
   std::cout << "Sparse matrix with only inner rows (1-3) set:\n";
   std::cout << sparseMatrix << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running on host:\n";
   forRowsIfExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << "Running on CUDA device:\n";
   forRowsIfExample< TNL::Devices::Cuda >();
#endif
}
Output
Running on host:
Dense matrix with only even rows set:
Row: 0 -> 0:0 1:1 2:2 3:3 4:4
Row: 1 -> 0:0 1:0 2:0 3:0 4:0
Row: 2 -> 0:2 1:3 2:4 3:5 4:6
Row: 3 -> 0:0 1:0 2:0 3:0 4:0
Row: 4 -> 0:4 1:5 2:6 3:7 4:8
Sparse matrix with only inner rows (1-3) set:
Row: 0 ->
Row: 1 -> 0:1 1:2 2:1
Row: 2 -> 1:1 2:2 3:1
Row: 3 -> 2:1 3:2 4:1
Row: 4 ->
Running on CUDA device:
Dense matrix with only even rows set:
Row: 0 -> 0:0 1:1 2:2 3:3 4:4
Row: 1 -> 0:0 1:0 2:0 3:0 4:0
Row: 2 -> 0:2 1:3 2:4 3:5 4:6
Row: 3 -> 0:0 1:0 2:0 3:0 4:0
Row: 4 -> 0:4 1:5 2:6 3:7 4:8
Sparse matrix with only inner rows (1-3) set:
Row: 0 ->
Row: 1 -> 0:1 1:2 2:1
Row: 2 -> 1:1 2:2 3:1
Row: 3 -> 2:1 3:2 4:1
Row: 4 ->
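
The conditional traversal described for forRowsIf can be pictured with a small sequential model in plain C++. This is only a sketch of the semantics, not TNL code: RowSketch, forRowsIfSketch and evenRowsDemo are hypothetical names, the matrix is a std::vector of rows, and the loop runs serially whereas TNL processes the rows in parallel.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a matrix row view; this is not a TNL type.
struct RowSketch
{
   int rowIdx;
   std::vector< double >& values;
};

// Sequential model of a conditional row traversal over [ begin, end ).
// The condition sees only the row index; the function sees the row itself.
template< typename RowCondition, typename Function >
void
forRowsIfSketch( std::vector< std::vector< double > >& matrix,
                 int begin,
                 int end,
                 RowCondition&& rowCondition,
                 Function&& function )
{
   assert( 0 <= begin && begin <= end && end <= static_cast< int >( matrix.size() ) );
   for( int rowIdx = begin; rowIdx < end; rowIdx++ ) {
      if( ! rowCondition( rowIdx ) )
         continue;  // condition returned false: the row is skipped
      RowSketch row{ rowIdx, matrix[ rowIdx ] };
      function( row );
   }
}

// Mirrors the dense part of the example: only even rows receive rowIdx + columnIdx.
inline std::vector< std::vector< double > >
evenRowsDemo()
{
   std::vector< std::vector< double > > matrix( 5, std::vector< double >( 5, 0.0 ) );
   forRowsIfSketch(
      matrix,
      0,
      5,
      []( int rowIdx ) { return rowIdx % 2 == 0; },
      []( RowSketch& row )
      {
         for( std::size_t i = 0; i < row.values.size(); i++ )
            row.values[ i ] = row.rowIdx + static_cast< double >( i );
      } );
   return matrix;
}
```

Calling evenRowsDemo() leaves rows 1 and 3 at zero and fills rows 0, 2 and 4, which matches the dense output listed above.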

◆ getSymmetricPart()

template<typename OutMatrix, typename InMatrix>
OutMatrix TNL::Matrices::getSymmetricPart ( const InMatrix & inMatrix)

This function computes \(( A + A^T ) / 2 \), where \( A \) is a square matrix.

Template Parameters
InMatrix: the type of the input matrix.
OutMatrix: the type of the output matrix.
Parameters
inMatrix: the input matrix \( A \), which must be square.
Returns
The symmetric part \(( A + A^T ) / 2 \) as a matrix of type OutMatrix.
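
As an illustration of the formula \(( A + A^T ) / 2 \), the following plain C++ sketch computes the symmetric part element by element. It does not use TNL: DenseSketch and symmetricPartSketch are hypothetical names, and the dense row-major std::vector storage is an assumption made only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical dense row-major storage; this is not a TNL matrix type.
using DenseSketch = std::vector< std::vector< double > >;

// Returns S with S(i,j) = ( A(i,j) + A(j,i) ) / 2 for a square matrix A.
inline DenseSketch
symmetricPartSketch( const DenseSketch& a )
{
   const std::size_t n = a.size();
   for( const auto& row : a )
      assert( row.size() == n );  // the formula is defined only for square matrices

   DenseSketch s( n, std::vector< double >( n, 0.0 ) );
   for( std::size_t i = 0; i < n; i++ )
      for( std::size_t j = 0; j < n; j++ )
         s[ i ][ j ] = 0.5 * ( a[ i ][ j ] + a[ j ][ i ] );
   return s;
}
```

For the 2x2 input with rows ( 1, 2 ) and ( 4, 3 ), both off-diagonal entries of the result become ( 2 + 4 ) / 2 = 3, while the diagonal is unchanged.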

◆ operator<<() [1/7]

template<typename Real, typename Device, typename Index, ElementsOrganization Organization>
std::ostream & TNL::Matrices::operator<< ( std::ostream & str,
const DenseMatrixBase< Real, Device, Index, Organization > & matrix )

Insertion operator for dense matrix and output stream.

Parameters
str: the output stream.
matrix: the dense matrix.
Returns
reference to the stream.
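
Returning a reference to the stream is what allows chained calls such as `std::cout << matrix << '\n'`. The sketch below shows the pattern on a hypothetical type, PrintableSketch, which is not part of TNL. The "Row: i -> column:value" layout imitates the example outputs on this page; the exact whitespace is an assumption.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical printable matrix; this is not a TNL type.
struct PrintableSketch
{
   std::vector< std::vector< double > > rows;
};

// Insertion operator: writes one line per row and returns the stream for chaining.
inline std::ostream&
operator<<( std::ostream& str, const PrintableSketch& matrix )
{
   for( std::size_t i = 0; i < matrix.rows.size(); i++ ) {
      str << "Row: " << i << " ->";
      for( std::size_t j = 0; j < matrix.rows[ i ].size(); j++ )
         str << ' ' << j << ':' << matrix.rows[ i ][ j ];
      str << '\n';
   }
   return str;
}

// Helper that captures the printed form in a string.
inline std::string
toStringSketch( const PrintableSketch& matrix )
{
   std::ostringstream stream;
   stream << matrix;
   return stream.str();
}
```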

◆ operator<<() [2/7]

template<typename MatrixElementsLambda, typename CompressedRowLengthsLambda, typename Real, typename Device, typename Index>
std::ostream & TNL::Matrices::operator<< ( std::ostream & str,
const LambdaMatrix< MatrixElementsLambda, CompressedRowLengthsLambda, Real, Device, Index > & matrix )

Insertion operator for lambda matrix and output stream.

Parameters
str: the output stream.
matrix: the lambda matrix.
Returns
reference to the stream.

◆ operator<<() [3/7]

template<typename MatrixElementsLambda, typename CompressedRowLengthsLambda, typename Real, typename Index>
std::ostream & TNL::Matrices::operator<< ( std::ostream & str,
const LambdaMatrixRowView< MatrixElementsLambda, CompressedRowLengthsLambda, Real, Index > & row )

Insertion operator for a Lambda matrix row.

Parameters
str: an output stream.
row: an input Lambda matrix row.
Returns
reference to the output stream.

◆ operator<<() [4/7]

template<typename Real, typename Device, typename Index, ElementsOrganization Organization>
std::ostream & TNL::Matrices::operator<< ( std::ostream & str,
const MultidiagonalMatrixBase< Real, Device, Index, Organization > & matrix )

Overloaded insertion operator for printing a matrix to output stream.

Template Parameters
Real: the type of the matrix elements.
Device: the device where the matrix is allocated.
Index: the type used for indexing the matrix elements.
Parameters
str: an output stream.
matrix: the matrix to be printed.
Returns
a reference to the output stream std::ostream.

◆ operator<<() [5/7]

template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>
std::ostream & TNL::Matrices::operator<< ( std::ostream & str,
const SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal > & matrix )

Overloaded insertion operator for printing a matrix to output stream.

Template Parameters
Real: the type of the matrix elements.
Device: the device where the matrix is allocated.
Index: the type used for indexing the matrix elements.
Parameters
str: an output stream.
matrix: the matrix to be printed.
Returns
a reference to the output stream std::ostream.

◆ operator<<() [6/7]

template<typename SegmentView, typename ValuesView, typename ColumnsIndexesView>
std::ostream & TNL::Matrices::operator<< ( std::ostream & str,
const SparseMatrixRowView< SegmentView, ValuesView, ColumnsIndexesView > & row )

Insertion operator for a sparse matrix row.

Parameters
str: an output stream.
row: an input sparse matrix row.
Returns
reference to the output stream.

◆ operator<<() [7/7]

template<typename Real, typename Device, typename Index, ElementsOrganization Organization>
std::ostream & TNL::Matrices::operator<< ( std::ostream & str,
const TridiagonalMatrixBase< Real, Device, Index, Organization > & matrix )

Overloaded insertion operator for printing a matrix to output stream.

Template Parameters
Real: the type of the matrix elements.
Device: the device where the matrix is allocated.
Index: the type used for indexing the matrix elements.
Parameters
str: an output stream.
matrix: the matrix to be printed.
Returns
a reference to the output stream std::ostream.

◆ reduceAllRows() [1/4]

template<typename Matrix, typename Fetch, typename Reduction, typename Store>
void TNL::Matrices::reduceAllRows ( const Matrix & matrix,
Fetch && fetch,
Reduction && reduction,
Store && store,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Performs parallel reduction within each matrix row over all rows with automatic identity deduction (const version).

See also: Overview of Matrix Reduction Functions

Template Parameters
Matrix: The type of the matrix.
Fetch: The type of the lambda function used for data fetching.
Reduction: The type of the function object defining the reduction operation.
Store: The type of the lambda function used for storing results from individual rows.
Parameters
matrix: The matrix on which the reduction will be performed.
fetch: Lambda function for fetching data. See For Const Matrices.
reduction: Function object for reduction operation. See Function objects for reduction operations.
store: Lambda function for storing results. See Basic Store (Row Index Only).
launchConfig: The configuration of the launch. See TNL::Algorithms::Segments::LaunchConfiguration.
Example
#include <iostream>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Matrices/reduce.h>
#include <TNL/Matrices/traverse.h>
#include <TNL/Containers/Vector.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
reduceAllRowsExample()
{
   /***
    * Create a 5x5 dense matrix.
    */
   TNL::Matrices::DenseMatrix< double, Device > matrix( 5, 5 );
   matrix.setValue( 0.0 );

   /***
    * Fill the matrix with values: row i has values i+1, i+2, i+3, ...
    */
   auto fillMatrix = [] __cuda_callable__( int rowIdx, int localIdx, int columnIdx, double& value )
   {
      value = rowIdx + columnIdx + 1;
   };
   TNL::Matrices::forAllElements( matrix, fillMatrix );

   std::cout << "Matrix:\n";
   std::cout << matrix << '\n';

   /***
    * Compute row sums using reduceAllRows.
    */
   TNL::Containers::Vector< double, Device > rowSums( matrix.getRows() );
   auto rowSums_view = rowSums.getView();

   auto fetch = [] __cuda_callable__( int rowIdx, int columnIdx, const double& value ) -> double
   {
      return value;
   };

   auto store = [ = ] __cuda_callable__( int rowIdx, const double& sum ) mutable
   {
      rowSums_view[ rowIdx ] = sum;
   };

   TNL::Matrices::reduceAllRows( matrix, fetch, TNL::Plus{}, store );

   std::cout << "Row sums: " << rowSums << '\n';

   /***
    * Compute row maxima.
    */
   TNL::Containers::Vector< double, Device > rowMaxima( matrix.getRows() );
   auto rowMaxima_view = rowMaxima.getView();

   auto storeMax = [ = ] __cuda_callable__( int rowIdx, const double& max ) mutable
   {
      rowMaxima_view[ rowIdx ] = max;
   };

   TNL::Matrices::reduceAllRows( matrix, fetch, TNL::Max{}, storeMax );

   std::cout << "Row maxima: " << rowMaxima << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running on host:\n";
   reduceAllRowsExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << '\n' << "Running on CUDA device:\n";
   reduceAllRowsExample< TNL::Devices::Cuda >();
#endif
}
Output
Running on host:
Matrix:
Row: 0 -> 0:1 1:2 2:3 3:4 4:5
Row: 1 -> 0:2 1:3 2:4 3:5 4:6
Row: 2 -> 0:3 1:4 2:5 3:6 4:7
Row: 3 -> 0:4 1:5 2:6 3:7 4:8
Row: 4 -> 0:5 1:6 2:7 3:8 4:9
Row sums: [ 15, 20, 25, 30, 35 ]
Row maxima: [ 5, 6, 7, 8, 9 ]
Running on CUDA device:
Matrix:
Row: 0 -> 0:1 1:2 2:3 3:4 4:5
Row: 1 -> 0:2 1:3 2:4 3:5 4:6
Row: 2 -> 0:3 1:4 2:5 3:6 4:7
Row: 3 -> 0:4 1:5 2:6 3:7 4:8
Row: 4 -> 0:5 1:6 2:7 3:8 4:9
Row sums: [ 15, 20, 25, 30, 35 ]
Row maxima: [ 5, 6, 7, 8, 9 ]
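
The roles of fetch, reduction and store can be modeled sequentially in plain C++. The sketch below is an assumption-laden illustration rather than TNL code: PlusSketch and MaxSketch are hypothetical function objects that carry their own identity element (which is how "automatic identity deduction" is modeled here), and reduceAllRowsSketch loops over rows serially instead of in parallel.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <vector>

using DenseSketch = std::vector< std::vector< double > >;  // hypothetical storage, not a TNL type

// Hypothetical function objects that know their own identity element.
struct PlusSketch
{
   static double identity() { return 0.0; }
   double operator()( double a, double b ) const { return a + b; }
};

struct MaxSketch
{
   static double identity() { return std::numeric_limits< double >::lowest(); }
   double operator()( double a, double b ) const { return std::max( a, b ); }
};

// For every row: start from the identity, fold in each fetched value, then store the result.
template< typename Fetch, typename Reduction, typename Store >
void
reduceAllRowsSketch( const DenseSketch& matrix, Fetch&& fetch, Reduction&& reduction, Store&& store )
{
   for( int rowIdx = 0; rowIdx < static_cast< int >( matrix.size() ); rowIdx++ ) {
      double result = reduction.identity();
      const auto& row = matrix[ rowIdx ];
      for( int columnIdx = 0; columnIdx < static_cast< int >( row.size() ); columnIdx++ )
         result = reduction( result, fetch( rowIdx, columnIdx, row[ columnIdx ] ) );
      store( rowIdx, result );
   }
}

// Builds the 5x5 matrix from the example (value = rowIdx + columnIdx + 1) and reduces its rows.
template< typename Reduction >
std::vector< double >
reduceExampleMatrix( Reduction reduction )
{
   DenseSketch matrix( 5, std::vector< double >( 5 ) );
   for( int i = 0; i < 5; i++ )
      for( int j = 0; j < 5; j++ )
         matrix[ i ][ j ] = i + j + 1;

   std::vector< double > results( matrix.size() );
   reduceAllRowsSketch(
      matrix,
      []( int, int, const double& value ) -> double { return value; },
      reduction,
      [ &results ]( int rowIdx, const double& value )
      {
         assert( rowIdx >= 0 && rowIdx < static_cast< int >( results.size() ) );
         results[ rowIdx ] = value;
      } );
   return results;
}
```

With PlusSketch the model yields the row sums 15, 20, 25, 30, 35, and with MaxSketch the row maxima 5, 6, 7, 8, 9, in agreement with the output above.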

◆ reduceAllRows() [2/4]

template<typename Matrix, typename Fetch, typename Reduction, typename Store, typename FetchValue>
void TNL::Matrices::reduceAllRows ( const Matrix & matrix,
Fetch && fetch,
Reduction && reduction,
Store && store,
const FetchValue & identity,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Performs parallel reduction within each matrix row over all rows (const version).

See also: Overview of Matrix Reduction Functions

Template Parameters
Matrix: The type of the matrix.
Fetch: The type of the lambda function used for data fetching.
Reduction: The type of the reduction operation.
Store: The type of the lambda function used for storing results from individual rows.
FetchValue: The type returned by the Fetch lambda function.
Parameters
matrix: The matrix on which the reduction will be performed.
fetch: Lambda function for fetching data. See For Const Matrices.
reduction: Lambda function for reduction operation. See Basic Reduction (Without Arguments).
store: Lambda function for storing results. See Basic Store (Row Index Only).
identity: The initial value for the reduction operation.
launchConfig: The configuration of the launch. See TNL::Algorithms::Segments::LaunchConfiguration.
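
When the reduction is an arbitrary lambda, there is nothing to deduce an identity from, so the initial value has to be supplied by the caller. The following plain C++ sketch models that variant sequentially; it is not TNL code, all names in it are hypothetical, and the choice of counting entries greater than 4 is made up purely to show a FetchValue (int) that differs from the element type (double).

```cpp
#include <cassert>
#include <vector>

using DenseSketch = std::vector< std::vector< double > >;  // hypothetical storage, not a TNL type

// Sequential model: each row starts from the caller-supplied identity.
template< typename Fetch, typename Reduction, typename Store, typename FetchValue >
void
reduceAllRowsWithIdentitySketch( const DenseSketch& matrix,
                                 Fetch&& fetch,
                                 Reduction&& reduction,
                                 Store&& store,
                                 const FetchValue& identity )
{
   for( int rowIdx = 0; rowIdx < static_cast< int >( matrix.size() ); rowIdx++ ) {
      FetchValue result = identity;
      const auto& row = matrix[ rowIdx ];
      for( int columnIdx = 0; columnIdx < static_cast< int >( row.size() ); columnIdx++ )
         result = reduction( result, fetch( rowIdx, columnIdx, row[ columnIdx ] ) );
      store( rowIdx, result );
   }
}

// Counts, per row, the entries greater than 4 in the 5x5 matrix with value = rowIdx + columnIdx + 1.
inline std::vector< int >
countLargeEntriesDemo()
{
   DenseSketch matrix( 5, std::vector< double >( 5 ) );
   for( int i = 0; i < 5; i++ )
      for( int j = 0; j < 5; j++ )
         matrix[ i ][ j ] = i + j + 1;

   std::vector< int > counts( matrix.size(), -1 );
   reduceAllRowsWithIdentitySketch(
      matrix,
      []( int, int, const double& value ) -> int { return value > 4.0 ? 1 : 0; },  // fetch returns int
      []( int a, int b ) { return a + b; },                                        // reduction on int
      [ &counts ]( int rowIdx, int count )
      {
         assert( rowIdx >= 0 && rowIdx < static_cast< int >( counts.size() ) );
         counts[ rowIdx ] = count;
      },
      0 );  // identity of integer addition
   return counts;
}
```

Here the identity 0 fixes FetchValue as int, and the model produces one count per row: 1, 2, 3, 4, 5.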

◆ reduceAllRows() [3/4]

template<typename Matrix, typename Fetch, typename Reduction, typename Store>
void TNL::Matrices::reduceAllRows ( Matrix & matrix,
Fetch && fetch,
Reduction && reduction,
Store && store,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Performs parallel reduction within each matrix row over all rows with automatic identity deduction.

See also: Overview of Matrix Reduction Functions

Template Parameters
Matrix: The type of the matrix.
Fetch: The type of the lambda function used for data fetching.
Reduction: The type of the function object defining the reduction operation.
Store: The type of the lambda function used for storing results from individual rows.
Parameters
matrix: The matrix on which the reduction will be performed.
fetch: Lambda function for fetching data. See For Non-Const Matrices.
reduction: Function object for reduction operation. See Function objects for reduction operations.
store: Lambda function for storing results. See Basic Store (Row Index Only).
launchConfig: The configuration of the launch. See TNL::Algorithms::Segments::LaunchConfiguration.
Example
#include <iostream>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Matrices/reduce.h>
#include <TNL/Matrices/traverse.h>
#include <TNL/Containers/Vector.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
reduceAllRowsExample()
{
   /***
    * Create a 5x5 dense matrix.
    */
   TNL::Matrices::DenseMatrix< double, Device > matrix( 5, 5 );
   matrix.setValue( 0.0 );

   /***
    * Fill the matrix with values: row i has values i+1, i+2, i+3, ...
    */
   auto fillMatrix = [] __cuda_callable__( int rowIdx, int localIdx, int columnIdx, double& value )
   {
      value = rowIdx + columnIdx + 1;
   };
   TNL::Matrices::forAllElements( matrix, fillMatrix );

   std::cout << "Matrix:\n";
   std::cout << matrix << '\n';

   /***
    * Compute row sums using reduceAllRows.
    */
   TNL::Containers::Vector< double, Device > rowSums( matrix.getRows() );
   auto rowSums_view = rowSums.getView();

   auto fetch = [] __cuda_callable__( int rowIdx, int columnIdx, const double& value ) -> double
   {
      return value;
   };

   auto store = [ = ] __cuda_callable__( int rowIdx, const double& sum ) mutable
   {
      rowSums_view[ rowIdx ] = sum;
   };

   TNL::Matrices::reduceAllRows( matrix, fetch, TNL::Plus{}, store );

   std::cout << "Row sums: " << rowSums << '\n';

   /***
    * Compute row maxima.
    */
   TNL::Containers::Vector< double, Device > rowMaxima( matrix.getRows() );
   auto rowMaxima_view = rowMaxima.getView();

   auto storeMax = [ = ] __cuda_callable__( int rowIdx, const double& max ) mutable
   {
      rowMaxima_view[ rowIdx ] = max;
   };

   TNL::Matrices::reduceAllRows( matrix, fetch, TNL::Max{}, storeMax );

   std::cout << "Row maxima: " << rowMaxima << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running on host:\n";
   reduceAllRowsExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << '\n' << "Running on CUDA device:\n";
   reduceAllRowsExample< TNL::Devices::Cuda >();
#endif
}
Output
Running on host:
Matrix:
Row: 0 -> 0:1 1:2 2:3 3:4 4:5
Row: 1 -> 0:2 1:3 2:4 3:5 4:6
Row: 2 -> 0:3 1:4 2:5 3:6 4:7
Row: 3 -> 0:4 1:5 2:6 3:7 4:8
Row: 4 -> 0:5 1:6 2:7 3:8 4:9
Row sums: [ 15, 20, 25, 30, 35 ]
Row maxima: [ 5, 6, 7, 8, 9 ]
Running on CUDA device:
Matrix:
Row: 0 -> 0:1 1:2 2:3 3:4 4:5
Row: 1 -> 0:2 1:3 2:4 3:5 4:6
Row: 2 -> 0:3 1:4 2:5 3:6 4:7
Row: 3 -> 0:4 1:5 2:6 3:7 4:8
Row: 4 -> 0:5 1:6 2:7 3:8 4:9
Row sums: [ 15, 20, 25, 30, 35 ]
Row maxima: [ 5, 6, 7, 8, 9 ]

◆ reduceAllRows() [4/4]

template<typename Matrix, typename Fetch, typename Reduction, typename Store, typename FetchValue>
void TNL::Matrices::reduceAllRows ( Matrix & matrix,
Fetch && fetch,
Reduction && reduction,
Store && store,
const FetchValue & identity,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Performs parallel reduction within each matrix row over all rows.

See also: Overview of Matrix Reduction Functions

Template Parameters
Matrix: The type of the matrix.
Fetch: The type of the lambda function used for data fetching.
Reduction: The type of the reduction operation.
Store: The type of the lambda function used for storing results from individual rows.
FetchValue: The type returned by the Fetch lambda function.
Parameters
matrix: The matrix on which the reduction will be performed.
fetch: Lambda function for fetching data. See For Non-Const Matrices.
reduction: Lambda function for reduction operation. See Basic Reduction (Without Arguments).
store: Lambda function for storing results. See Basic Store (Row Index Only).
identity: The initial value for the reduction operation.
launchConfig: The configuration of the launch. See TNL::Algorithms::Segments::LaunchConfiguration.

◆ reduceAllRowsIf() [1/4]

template<typename Matrix, typename Condition, typename Fetch, typename Reduction, typename Store>
Matrix::IndexType TNL::Matrices::reduceAllRowsIf ( const Matrix & matrix,
Condition && condition,
Fetch && fetch,
Reduction && reduction,
Store && store,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Performs parallel reduction within each matrix row over all rows based on a condition with automatic identity deduction (const version).

See also: Overview of Matrix Reduction Functions

Template Parameters
Matrix: The type of the matrix.
Condition: The type of the lambda function used for the condition check.
Fetch: The type of the lambda function used for data fetching.
Reduction: The type of the function object defining the reduction operation.
Store: The type of the lambda function used for storing results from individual rows.
Parameters
matrix: The matrix on which the reduction will be performed.
condition: Lambda function for condition check. See Condition Check.
fetch: Lambda function for fetching data. See For Const Matrices.
reduction: Function object for reduction operation. See Function objects for reduction operations.
store: Lambda function for storing results. See Basic Store (Row Index Only).
launchConfig: The configuration of the launch. See TNL::Algorithms::Segments::LaunchConfiguration.
Returns
The number of processed rows, i.e. rows for which the condition was true.
Example
#include <iostream>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Matrices/reduce.h>
#include <TNL/Matrices/traverse.h>
#include <TNL/Containers/Vector.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
reduceAllRowsIfExample()
{
   /***
    * Create a 6x6 dense matrix.
    */
   TNL::Matrices::DenseMatrix< double, Device > matrix( 6, 6 );
   matrix.setValue( 0.0 );

   /***
    * Fill the matrix with values.
    */
   auto fillMatrix = [] __cuda_callable__( int rowIdx, int localIdx, int columnIdx, double& value )
   {
      value = rowIdx + columnIdx;
   };
   TNL::Matrices::forAllElements( matrix, fillMatrix );

   std::cout << "Matrix:\n";
   std::cout << matrix << '\n';

   /***
    * Compute sums only for rows with even indices.
    */
   TNL::Containers::Vector< double, Device > evenRowSums( matrix.getRows() );
   TNL::Containers::Vector< double, Device > compressedEvenRowSums( matrix.getRows() );
   auto evenRowSums_view = evenRowSums.getView();
   auto compressedEvenRowSums_view = compressedEvenRowSums.getView();

   auto fetch = [] __cuda_callable__( int rowIdx, int columnIdx, const double& value ) -> double
   {
      return value;
   };

   auto rowCondition = [] __cuda_callable__( int rowIdx ) -> bool
   {
      return rowIdx % 2 == 0;  // Only even row indices
   };

   auto store = [ = ] __cuda_callable__( int indexOfRowIdx, int rowIdx, const double& sum ) mutable
   {
      evenRowSums_view[ rowIdx ] = sum;
      compressedEvenRowSums_view[ indexOfRowIdx ] = sum;
   };

   // Initialize with -1 to see which rows were processed
   evenRowSums.setValue( -1.0 );

   auto evenRowsCount = TNL::Matrices::reduceAllRowsIf( matrix, rowCondition, fetch, TNL::Plus{}, store );

   std::cout << "Sums for even-indexed rows (odd indices show -1): " << evenRowSums << '\n';
   std::cout << "Compressed sums for even-indexed rows: " << compressedEvenRowSums.getView( 0, evenRowsCount ) << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running on host:\n";
   reduceAllRowsIfExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << '\n' << "Running on CUDA device:\n";
   reduceAllRowsIfExample< TNL::Devices::Cuda >();
#endif
}
Output
Running on host:
Matrix:
Row: 0 -> 0:0 1:1 2:2 3:3 4:4 5:5
Row: 1 -> 0:1 1:2 2:3 3:4 4:5 5:6
Row: 2 -> 0:2 1:3 2:4 3:5 4:6 5:7
Row: 3 -> 0:3 1:4 2:5 3:6 4:7 5:8
Row: 4 -> 0:4 1:5 2:6 3:7 4:8 5:9
Row: 5 -> 0:5 1:6 2:7 3:8 4:9 5:10
Sums for even-indexed rows (odd indices show -1): [ 15, -1, 27, -1, 39, -1 ]
Compressed sums for even-indexed rows: [ 15, 27, 39 ]
Running on CUDA device:
Matrix:
Row: 0 -> 0:0 1:1 2:2 3:3 4:4 5:5
Row: 1 -> 0:1 1:2 2:3 3:4 4:5 5:6
Row: 2 -> 0:2 1:3 2:4 3:5 4:6 5:7
Row: 3 -> 0:3 1:4 2:5 3:6 4:7 5:8
Row: 4 -> 0:4 1:5 2:6 3:7 4:8 5:9
Row: 5 -> 0:5 1:6 2:7 3:8 4:9 5:10
Sums for even-indexed rows (odd indices show -1): [ 15, -1, 27, -1, 39, -1 ]
Compressed sums for even-indexed rows: [ 15, 27, 39 ]
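
The example's store lambda receives both the row index and indexOfRowIdx, the position of the row among the processed rows, and the function returns how many rows passed the condition. A sequential plain C++ model of that behavior is sketched below. It is not TNL code: all names are hypothetical, the identity is passed explicitly for simplicity, and assigning indexOfRowIdx in increasing row order is an assumption inferred from the example output.

```cpp
#include <cassert>
#include <vector>

using DenseSketch = std::vector< std::vector< double > >;  // hypothetical storage, not a TNL type

// Reduces only the rows accepted by the condition and returns how many were processed.
template< typename Condition, typename Fetch, typename Reduction, typename Store >
int
reduceAllRowsIfSketch( const DenseSketch& matrix,
                       Condition&& condition,
                       Fetch&& fetch,
                       Reduction&& reduction,
                       Store&& store,
                       double identity )
{
   int processedRows = 0;
   for( int rowIdx = 0; rowIdx < static_cast< int >( matrix.size() ); rowIdx++ ) {
      if( ! condition( rowIdx ) )
         continue;  // rejected rows are neither reduced nor stored
      double result = identity;
      const auto& row = matrix[ rowIdx ];
      for( int columnIdx = 0; columnIdx < static_cast< int >( row.size() ); columnIdx++ )
         result = reduction( result, fetch( rowIdx, columnIdx, row[ columnIdx ] ) );
      store( processedRows, rowIdx, result );  // ( indexOfRowIdx, rowIdx, value )
      processedRows++;
   }
   return processedRows;
}

struct ConditionalSumsSketch
{
   std::vector< double > perRow;      // -1 marks rows that were skipped
   std::vector< double > compressed;  // results of processed rows only
   int processedRows;
};

// Mirrors the example: 6x6 matrix with value = rowIdx + columnIdx, sums of even rows.
inline ConditionalSumsSketch
evenRowSumsDemo()
{
   DenseSketch matrix( 6, std::vector< double >( 6 ) );
   for( int i = 0; i < 6; i++ )
      for( int j = 0; j < 6; j++ )
         matrix[ i ][ j ] = i + j;

   ConditionalSumsSketch out{ std::vector< double >( 6, -1.0 ), std::vector< double >( 6, 0.0 ), 0 };
   out.processedRows = reduceAllRowsIfSketch(
      matrix,
      []( int rowIdx ) { return rowIdx % 2 == 0; },
      []( int, int, const double& value ) -> double { return value; },
      []( double a, double b ) { return a + b; },
      [ &out ]( int indexOfRowIdx, int rowIdx, double sum )
      {
         out.perRow[ rowIdx ] = sum;
         out.compressed[ indexOfRowIdx ] = sum;
      },
      0.0 );
   assert( out.processedRows <= static_cast< int >( matrix.size() ) );
   out.compressed.resize( out.processedRows );  // keep only the filled prefix
   return out;
}
```

The model returns 3 processed rows, per-row results 15, -1, 27, -1, 39, -1 and the compressed results 15, 27, 39, matching the output above.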

◆ reduceAllRowsIf() [2/4]

template<typename Matrix, typename Condition, typename Fetch, typename Reduction, typename Store, typename FetchValue>
Matrix::IndexType TNL::Matrices::reduceAllRowsIf ( const Matrix & matrix,
Condition && condition,
Fetch && fetch,
Reduction && reduction,
Store && store,
const FetchValue & identity,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Performs parallel reduction within each matrix row over all rows based on a condition (const version).

See also: Overview of Matrix Reduction Functions

Template Parameters
Matrix: The type of the matrix.
Condition: The type of the lambda function used for the condition check.
Fetch: The type of the lambda function used for data fetching.
Reduction: The type of the reduction operation.
Store: The type of the lambda function used for storing results from individual rows.
FetchValue: The type returned by the Fetch lambda function.
Parameters
matrix: The matrix on which the reduction will be performed.
condition: Lambda function for condition check. See Condition Check.
fetch: Lambda function for fetching data. See For Const Matrices.
reduction: Lambda function for reduction operation. See Basic Reduction (Without Arguments).
store: Lambda function for storing results. See Basic Store (Row Index Only).
identity: The initial value for the reduction operation.
launchConfig: The configuration of the launch. See TNL::Algorithms::Segments::LaunchConfiguration.
Returns
The number of processed rows, i.e. rows for which the condition was true.
Example
#include <iostream>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Matrices/reduce.h>
#include <TNL/Matrices/traverse.h>
#include <TNL/Containers/Vector.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
reduceAllRowsIfExample()
{
   /***
    * Create a 6x6 dense matrix.
    */
   TNL::Matrices::DenseMatrix< double, Device > matrix( 6, 6 );
   matrix.setValue( 0.0 );

   /***
    * Fill the matrix with values.
    */
   auto fillMatrix = [] __cuda_callable__( int rowIdx, int localIdx, int columnIdx, double& value )
   {
      value = rowIdx + columnIdx;
   };
   TNL::Matrices::forAllElements( matrix, fillMatrix );

   std::cout << "Matrix:\n";
   std::cout << matrix << '\n';

   /***
    * Compute sums only for rows with even indices.
    */
   TNL::Containers::Vector< double, Device > evenRowSums( matrix.getRows() );
   TNL::Containers::Vector< double, Device > compressedEvenRowSums( matrix.getRows() );
   auto evenRowSums_view = evenRowSums.getView();
   auto compressedEvenRowSums_view = compressedEvenRowSums.getView();

   auto fetch = [] __cuda_callable__( int rowIdx, int columnIdx, const double& value ) -> double
   {
      return value;
   };

   auto rowCondition = [] __cuda_callable__( int rowIdx ) -> bool
   {
      return rowIdx % 2 == 0;  // Only even row indices
   };

   auto store = [ = ] __cuda_callable__( int indexOfRowIdx, int rowIdx, const double& sum ) mutable
   {
      evenRowSums_view[ rowIdx ] = sum;
      compressedEvenRowSums_view[ indexOfRowIdx ] = sum;
   };

   // Initialize with -1 to see which rows were processed
   evenRowSums.setValue( -1.0 );

   auto evenRowsCount = TNL::Matrices::reduceAllRowsIf( matrix, rowCondition, fetch, TNL::Plus{}, store );

   std::cout << "Sums for even-indexed rows (odd indices show -1): " << evenRowSums << '\n';
   std::cout << "Compressed sums for even-indexed rows: " << compressedEvenRowSums.getView( 0, evenRowsCount ) << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running on host:\n";
   reduceAllRowsIfExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << '\n' << "Running on CUDA device:\n";
   reduceAllRowsIfExample< TNL::Devices::Cuda >();
#endif
}
Output
Running on host:
Matrix:
Row: 0 -> 0:0 1:1 2:2 3:3 4:4 5:5
Row: 1 -> 0:1 1:2 2:3 3:4 4:5 5:6
Row: 2 -> 0:2 1:3 2:4 3:5 4:6 5:7
Row: 3 -> 0:3 1:4 2:5 3:6 4:7 5:8
Row: 4 -> 0:4 1:5 2:6 3:7 4:8 5:9
Row: 5 -> 0:5 1:6 2:7 3:8 4:9 5:10
Sums for even-indexed rows (odd indices show -1): [ 15, -1, 27, -1, 39, -1 ]
Compressed sums for even-indexed rows: [ 15, 27, 39 ]
Running on CUDA device:
Matrix:
Row: 0 -> 0:0 1:1 2:2 3:3 4:4 5:5
Row: 1 -> 0:1 1:2 2:3 3:4 4:5 5:6
Row: 2 -> 0:2 1:3 2:4 3:5 4:6 5:7
Row: 3 -> 0:3 1:4 2:5 3:6 4:7 5:8
Row: 4 -> 0:4 1:5 2:6 3:7 4:8 5:9
Row: 5 -> 0:5 1:6 2:7 3:8 4:9 5:10
Sums for even-indexed rows (odd indices show -1): [ 15, -1, 27, -1, 39, -1 ]
Compressed sums for even-indexed rows: [ 15, 27, 39 ]

◆ reduceAllRowsIf() [3/4]

template<typename Matrix, typename Condition, typename Fetch, typename Reduction, typename Store>
Matrix::IndexType TNL::Matrices::reduceAllRowsIf ( Matrix & matrix,
Condition && condition,
Fetch && fetch,
Reduction && reduction,
Store && store,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Performs parallel reduction within each matrix row over all rows based on a condition with automatic identity deduction.

See also: Overview of Matrix Reduction Functions

Template Parameters
Matrix: The type of the matrix.
Condition: The type of the lambda function used for the condition check.
Fetch: The type of the lambda function used for data fetching.
Reduction: The type of the function object defining the reduction operation.
Store: The type of the lambda function used for storing results from individual rows.
Parameters
matrix: The matrix on which the reduction will be performed.
condition: Lambda function for condition check. See Condition Check.
fetch: Lambda function for fetching data. See For Non-Const Matrices.
reduction: Function object for reduction operation. See Function objects for reduction operations.
store: Lambda function for storing results. See Basic Store (Row Index Only).
launchConfig: The configuration of the launch. See TNL::Algorithms::Segments::LaunchConfiguration.
Returns
The number of processed rows, i.e. rows for which the condition was true.
Example
#include <iostream>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Matrices/reduce.h>
#include <TNL/Matrices/traverse.h>
#include <TNL/Containers/Vector.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
reduceAllRowsIfExample()
{
   /***
    * Create a 6x6 dense matrix.
    */
   TNL::Matrices::DenseMatrix< double, Device > matrix( 6, 6 );
   matrix.setValue( 0.0 );

   /***
    * Fill the matrix with values.
    */
   auto fillMatrix = [] __cuda_callable__( int rowIdx, int localIdx, int columnIdx, double& value )
   {
      value = rowIdx + columnIdx;
   };
   TNL::Matrices::forAllElements( matrix, fillMatrix );

   std::cout << "Matrix:\n";
   std::cout << matrix << '\n';

   /***
    * Compute sums only for rows with even indices.
    */
   TNL::Containers::Vector< double, Device > evenRowSums( matrix.getRows() );
   TNL::Containers::Vector< double, Device > compressedEvenRowSums( matrix.getRows() );
   auto evenRowSums_view = evenRowSums.getView();
   auto compressedEvenRowSums_view = compressedEvenRowSums.getView();

   auto fetch = [] __cuda_callable__( int rowIdx, int columnIdx, const double& value ) -> double
   {
      return value;
   };

   auto rowCondition = [] __cuda_callable__( int rowIdx ) -> bool
   {
      return rowIdx % 2 == 0;  // Only even row indices
   };

   auto store = [ = ] __cuda_callable__( int indexOfRowIdx, int rowIdx, const double& sum ) mutable
   {
      evenRowSums_view[ rowIdx ] = sum;
      compressedEvenRowSums_view[ indexOfRowIdx ] = sum;
   };

   // Initialize with -1 to see which rows were processed
   evenRowSums.setValue( -1.0 );

   auto evenRowsCount = TNL::Matrices::reduceAllRowsIf( matrix, rowCondition, fetch, TNL::Plus{}, store );

   std::cout << "Sums for even-indexed rows (odd indices show -1): " << evenRowSums << '\n';
   std::cout << "Compressed sums for even-indexed rows: " << compressedEvenRowSums.getView( 0, evenRowsCount ) << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running on host:\n";
   reduceAllRowsIfExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << '\n' << "Running on CUDA device:\n";
   reduceAllRowsIfExample< TNL::Devices::Cuda >();
#endif
}
Output
Running on host:
Matrix:
Row: 0 -> 0:0 1:1 2:2 3:3 4:4 5:5
Row: 1 -> 0:1 1:2 2:3 3:4 4:5 5:6
Row: 2 -> 0:2 1:3 2:4 3:5 4:6 5:7
Row: 3 -> 0:3 1:4 2:5 3:6 4:7 5:8
Row: 4 -> 0:4 1:5 2:6 3:7 4:8 5:9
Row: 5 -> 0:5 1:6 2:7 3:8 4:9 5:10
Sums for even-indexed rows (odd indices show -1): [ 15, -1, 27, -1, 39, -1 ]
Compressed sums for even-indexed rows: [ 15, 27, 39 ]
Running on CUDA device:
Matrix:
Row: 0 -> 0:0 1:1 2:2 3:3 4:4 5:5
Row: 1 -> 0:1 1:2 2:3 3:4 4:5 5:6
Row: 2 -> 0:2 1:3 2:4 3:5 4:6 5:7
Row: 3 -> 0:3 1:4 2:5 3:6 4:7 5:8
Row: 4 -> 0:4 1:5 2:6 3:7 4:8 5:9
Row: 5 -> 0:5 1:6 2:7 3:8 4:9 5:10
Sums for even-indexed rows (odd indices show -1): [ 15, -1, 27, -1, 39, -1 ]
Compressed sums for even-indexed rows: [ 15, 27, 39 ]

◆ reduceAllRowsIf() [4/4]

template<typename Matrix, typename Condition, typename Fetch, typename Reduction, typename Store, typename FetchValue>
Matrix::IndexType TNL::Matrices::reduceAllRowsIf ( Matrix & matrix,
Condition && condition,
Fetch && fetch,
Reduction && reduction,
Store && store,
const FetchValue & identity,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Performs parallel reduction within each matrix row over all rows based on a condition.

See also: Overview of Matrix Reduction Functions

Template Parameters
Matrix: The type of the matrix.
Condition: The type of the lambda function used for the condition check.
Fetch: The type of the lambda function used for data fetching.
Reduction: The type of the reduction operation.
Store: The type of the lambda function used for storing results from individual rows.
FetchValue: The type returned by the Fetch lambda function.
Parameters
matrix: The matrix on which the reduction will be performed.
condition: Lambda function for condition check. See Condition Check.
fetch: Lambda function for fetching data. See For Non-Const Matrices.
reduction: Lambda function for reduction operation. See Basic Reduction (Without Arguments).
store: Lambda function for storing results. See Basic Store (Row Index Only).
identity: The initial value for the reduction operation.
launchConfig: The configuration of the launch - see TNL::Algorithms::Segments::LaunchConfiguration.
Returns
The number of processed rows, i.e. rows for which the condition was true.
Example
#include <iostream>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Matrices/reduce.h>
#include <TNL/Matrices/traverse.h>
#include <TNL/Containers/Vector.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
reduceAllRowsIfExample()
{
   /***
    * Create a 6x6 dense matrix.
    */
   TNL::Matrices::DenseMatrix< double, Device > matrix( 6, 6 );
   matrix.setValue( 0.0 );

   /***
    * Fill the matrix with values.
    */
   auto fillMatrix = [] __cuda_callable__( int rowIdx, int localIdx, int columnIdx, double& value )
   {
      value = rowIdx + columnIdx;
   };
   TNL::Matrices::forAllElements( matrix, fillMatrix );

   std::cout << "Matrix:\n";
   std::cout << matrix << '\n';

   /***
    * Compute sums only for rows with even indices.
    */
   TNL::Containers::Vector< double, Device > evenRowSums( matrix.getRows() );
   TNL::Containers::Vector< double, Device > compressedEvenRowSums( matrix.getRows() );
   auto evenRowSums_view = evenRowSums.getView();
   auto compressedEvenRowSums_view = compressedEvenRowSums.getView();

   auto fetch = [] __cuda_callable__( int rowIdx, int columnIdx, const double& value ) -> double
   {
      return value;
   };

   auto rowCondition = [] __cuda_callable__( int rowIdx ) -> bool
   {
      return rowIdx % 2 == 0;  // Only even row indices
   };

   auto store = [ = ] __cuda_callable__( int indexOfRowIdx, int rowIdx, const double& sum ) mutable
   {
      evenRowSums_view[ rowIdx ] = sum;
      compressedEvenRowSums_view[ indexOfRowIdx ] = sum;
   };

   // Initialize with -1 to see which rows were processed
   evenRowSums.setValue( -1.0 );

   auto evenRowsCount = TNL::Matrices::reduceAllRowsIf( matrix, rowCondition, fetch, TNL::Plus{}, store );

   std::cout << "Sums for even-indexed rows (odd indices show -1): " << evenRowSums << '\n';
   std::cout << "Compressed sums for even-indexed rows: " << compressedEvenRowSums.getView( 0, evenRowsCount ) << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running on host:\n";
   reduceAllRowsIfExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << '\n' << "Running on CUDA device:\n";
   reduceAllRowsIfExample< TNL::Devices::Cuda >();
#endif
}
Output
Running on host:
Matrix:
Row: 0 -> 0:0 1:1 2:2 3:3 4:4 5:5
Row: 1 -> 0:1 1:2 2:3 3:4 4:5 5:6
Row: 2 -> 0:2 1:3 2:4 3:5 4:6 5:7
Row: 3 -> 0:3 1:4 2:5 3:6 4:7 5:8
Row: 4 -> 0:4 1:5 2:6 3:7 4:8 5:9
Row: 5 -> 0:5 1:6 2:7 3:8 4:9 5:10
Sums for even-indexed rows (odd indices show -1): [ 15, -1, 27, -1, 39, -1 ]
Compressed sums for even-indexed rows: [ 15, 27, 39 ]
Running on CUDA device:
Matrix:
Row: 0 -> 0:0 1:1 2:2 3:3 4:4 5:5
Row: 1 -> 0:1 1:2 2:3 3:4 4:5 5:6
Row: 2 -> 0:2 1:3 2:4 3:5 4:6 5:7
Row: 3 -> 0:3 1:4 2:5 3:6 4:7 5:8
Row: 4 -> 0:4 1:5 2:6 3:7 4:8 5:9
Row: 5 -> 0:5 1:6 2:7 3:8 4:9 5:10
Sums for even-indexed rows (odd indices show -1): [ 15, -1, 27, -1, 39, -1 ]
Compressed sums for even-indexed rows: [ 15, 27, 39 ]
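
The sums printed above can be cross-checked without TNL. The following is a minimal sequential sketch in standard C++ of what a conditional row reduction computes; it is not the library's implementation. The names reduceAllRowsIfSequential, EvenRowSums and sumEvenRows are hypothetical, and the matrix is assumed to be stored as a plain vector of rows.

```cpp
#include <cassert>
#include <vector>

// Hypothetical sequential reference, not part of TNL: visits each row, skips
// rows rejected by `condition`, reduces the fetched values of the remaining
// rows, and hands each result to `store`.
template< typename Condition, typename Fetch, typename Reduction, typename Store >
int
reduceAllRowsIfSequential( const std::vector< std::vector< double > >& matrix,
                           Condition condition,
                           Fetch fetch,
                           Reduction reduction,
                           Store store,
                           double identity )
{
   int processedRows = 0;
   for( int rowIdx = 0; rowIdx < static_cast< int >( matrix.size() ); rowIdx++ ) {
      if( ! condition( rowIdx ) )
         continue;
      double result = identity;
      const std::vector< double >& row = matrix[ rowIdx ];
      for( int columnIdx = 0; columnIdx < static_cast< int >( row.size() ); columnIdx++ )
         result = reduction( result, fetch( rowIdx, columnIdx, row[ columnIdx ] ) );
      // processedRows plays the role of indexOfRowIdx in the example's store lambda.
      store( processedRows, rowIdx, result );
      processedRows++;
   }
   return processedRows;
}

struct EvenRowSums
{
   std::vector< double > full;        // one slot per row, -1 for skipped rows
   std::vector< double > compressed;  // one slot per processed row
};

// Rebuilds the 6x6 matrix from the example (value = rowIdx + columnIdx) and
// sums the even-indexed rows.
EvenRowSums
sumEvenRows()
{
   const int size = 6;
   std::vector< std::vector< double > > matrix( size, std::vector< double >( size ) );
   for( int i = 0; i < size; i++ )
      for( int j = 0; j < size; j++ )
         matrix[ i ][ j ] = i + j;

   EvenRowSums sums;
   sums.full.assign( size, -1.0 );
   sums.compressed.assign( size, 0.0 );
   const int count = reduceAllRowsIfSequential(
      matrix,
      []( int rowIdx ) { return rowIdx % 2 == 0; },
      []( int, int, const double& value ) { return value; },
      []( double a, double b ) { return a + b; },
      [ &sums ]( int indexOfRowIdx, int rowIdx, double sum ) {
         sums.full[ rowIdx ] = sum;
         sums.compressed[ indexOfRowIdx ] = sum;
      },
      0.0 );
   assert( count <= size );
   sums.compressed.resize( count );  // keep only the filled prefix
   return sums;
}
```

Calling sumEvenRows() yields the same full and compressed sums as the listing above. The sketch visits rows in order, so the compressed slots are filled in ascending row order; whether a parallel run guarantees the same order is not something this sketch establishes.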

◆ reduceAllRowsWithArgument() [1/4]

template<typename Matrix, typename Fetch, typename Reduction, typename Store>
void TNL::Matrices::reduceAllRowsWithArgument ( const Matrix & matrix,
Fetch && fetch,
Reduction && reduction,
Store && store,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Performs parallel reduction within each matrix row over all rows while returning also the position of the element of interest with automatic identity deduction (const version).

See also: Overview of Matrix Reduction Functions

Template Parameters
Matrix: The type of the matrix.
Fetch: The type of the lambda function used for data fetching.
Reduction: The type of the function object defining the reduction operation.
Store: The type of the lambda function used for storing results from individual rows.
Parameters
matrix: The matrix on which the reduction will be performed.
fetch: Lambda function for fetching data. See For Const Matrices.
reduction: Function object for reduction with argument tracking. See Function objects for reduction operations.
store: Lambda function for storing results with position tracking. See Store With Argument (Position Tracking).
launchConfig: The configuration of the launch - see TNL::Algorithms::Segments::LaunchConfiguration.
Example
#include <iostream>
#include <limits>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Matrices/reduce.h>
#include <TNL/Matrices/traverse.h>
#include <TNL/Containers/Vector.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
reduceAllRowsWithArgumentExample()
{
   /***
    * Create a 6x6 dense matrix.
    */
   TNL::Matrices::DenseMatrix< double, Device > matrix( 6, 6 );
   matrix.setValue( 0.0 );

   /***
    * Fill the matrix with values.
    */
   auto fillMatrix = [] __cuda_callable__( int rowIdx, int localIdx, int columnIdx, double& value )
   {
      value = ( rowIdx * 13 + columnIdx * 7 ) % 20;
   };
   TNL::Matrices::forAllElements( matrix, fillMatrix );

   std::cout << "Dense matrix:\n";
   std::cout << matrix << '\n';

   /***
    * Find maximum value and its column index in each row.
    */
   TNL::Containers::Vector< double, Device > rowMaxValues( matrix.getRows() );
   TNL::Containers::Vector< int, Device > rowMaxColumns( matrix.getRows() );
   auto maxValues_view = rowMaxValues.getView();
   auto maxColumns_view = rowMaxColumns.getView();

   auto fetch = [] __cuda_callable__( int rowIdx, int columnIdx, const double& value ) -> double
   {
      return value;
   };

   auto reduction = [] __cuda_callable__( double& a, const double& b, int& aIdx, const int& bIdx )
   {
      if( a < b ) {
         a = b;
         aIdx = bIdx;
      }
      else if( a == b && bIdx < aIdx ) {
         aIdx = bIdx;
      }
   };

   auto store = [ = ] __cuda_callable__( int rowIdx, int localIdx, int columnIdx, const double& value, bool emptyRow ) mutable
   {
      maxValues_view[ rowIdx ] = value;
      if( ! emptyRow )
         maxColumns_view[ rowIdx ] = columnIdx;
   };

   TNL::Matrices::reduceAllRowsWithArgument( matrix, fetch, reduction, store, std::numeric_limits< double >::lowest() );
   // You may also use TNL::MaxWithArg{} instead of defining your own reduction lambda.

   std::cout << "Row maxima values: " << rowMaxValues << '\n';
   std::cout << "Column indices of maxima: " << rowMaxColumns << '\n';

   /***
    * Find minimum value and its column index in each row.
    */
   TNL::Containers::Vector< double, Device > rowMinValues( matrix.getRows() );
   TNL::Containers::Vector< int, Device > rowMinColumns( matrix.getRows() );
   auto minValues_view = rowMinValues.getView();
   auto minColumns_view = rowMinColumns.getView();

   auto reductionMin = [] __cuda_callable__( double& a, const double& b, int& aIdx, const int& bIdx )
   {
      if( a > b ) {
         a = b;
         aIdx = bIdx;
      }
      else if( a == b && bIdx < aIdx ) {
         aIdx = bIdx;
      }
   };

   auto storeMin =
      [ = ] __cuda_callable__( int rowIdx, int localIdx, int columnIdx, const double& value, bool emptyRow ) mutable
   {
      minValues_view[ rowIdx ] = value;
      if( ! emptyRow )
         minColumns_view[ rowIdx ] = columnIdx;
   };

   TNL::Matrices::reduceAllRowsWithArgument( matrix, fetch, reductionMin, storeMin, std::numeric_limits< double >::max() );
   // You may also use TNL::MinWithArg{} instead of defining your own reduction lambda.

   std::cout << "Row minima values: " << rowMinValues << '\n';
   std::cout << "Column indices of minima: " << rowMinColumns << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running on host:\n";
   reduceAllRowsWithArgumentExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << '\n' << "Running on CUDA device:\n";
   reduceAllRowsWithArgumentExample< TNL::Devices::Cuda >();
#endif
}
Output
Running on host:
Dense matrix:
Row: 0 -> 0:0 1:7 2:14 3:1 4:8 5:15
Row: 1 -> 0:13 1:0 2:7 3:14 4:1 5:8
Row: 2 -> 0:6 1:13 2:0 3:7 4:14 5:1
Row: 3 -> 0:19 1:6 2:13 3:0 4:7 5:14
Row: 4 -> 0:12 1:19 2:6 3:13 4:0 5:7
Row: 5 -> 0:5 1:12 2:19 3:6 4:13 5:0
Row maxima values: [ 15, 14, 14, 19, 19, 19 ]
Column indices of maxima: [ 5, 3, 4, 0, 1, 2 ]
Row minima values: [ 0, 0, 0, 0, 0, 0 ]
Column indices of minima: [ 0, 1, 2, 3, 4, 5 ]
Running on CUDA device:
Dense matrix:
Row: 0 -> 0:0 1:7 2:14 3:1 4:8 5:15
Row: 1 -> 0:13 1:0 2:7 3:14 4:1 5:8
Row: 2 -> 0:6 1:13 2:0 3:7 4:14 5:1
Row: 3 -> 0:19 1:6 2:13 3:0 4:7 5:14
Row: 4 -> 0:12 1:19 2:6 3:13 4:0 5:7
Row: 5 -> 0:5 1:12 2:19 3:6 4:13 5:0
Row maxima values: [ 15, 14, 14, 19, 19, 19 ]
Column indices of maxima: [ 5, 3, 4, 0, 1, 2 ]
Row minima values: [ 0, 0, 0, 0, 0, 0 ]
Column indices of minima: [ 0, 1, 2, 3, 4, 5 ]
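
The overloads with automatic identity deduction take a function object in place of a lambda plus a separate identity argument. As a rough illustration of why that can work, the sketch below bundles an argmax step with its own identity value. ArgMaxSketch, its identity() member and reduceRowWithArgument are hypothetical names invented for this sketch; they do not describe how TNL::MaxWithArg is actually defined.

```cpp
#include <cassert>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical function object, written for illustration only. It bundles the
// argmax step with the identity that an explicit-identity overload would
// otherwise receive as a separate argument.
struct ArgMaxSketch
{
   template< typename Value >
   static Value
   identity()
   {
      return std::numeric_limits< Value >::lowest();
   }

   template< typename Value, typename Index >
   void
   operator()( Value& a, const Value& b, Index& aIdx, const Index& bIdx ) const
   {
      if( a < b ) {
         a = b;
         aIdx = bIdx;
      }
      else if( a == b && bIdx < aIdx ) {
         aIdx = bIdx;  // on ties keep the smaller index
      }
   }
};

// Reduces a single row. The caller passes no identity; it is requested from
// the reduction type instead. Returns ( value, index ), with index -1 for an
// empty row.
template< typename Reduction >
std::pair< double, int >
reduceRowWithArgument( const std::vector< double >& row, const Reduction& reduction )
{
   double value = Reduction::template identity< double >();
   int index = -1;
   for( int i = 0; i < static_cast< int >( row.size() ); i++ )
      reduction( value, row[ i ], index, i );
   assert( index < static_cast< int >( row.size() ) );
   return std::make_pair( value, index );
}
```

For row 1 of the matrix above, reduceRowWithArgument( { 13, 0, 7, 14, 1, 8 }, ArgMaxSketch{} ) gives the value 14 at index 3, which agrees with the printed row maxima.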

◆ reduceAllRowsWithArgument() [2/4]

template<typename Matrix, typename Fetch, typename Reduction, typename Store, typename FetchValue>
void TNL::Matrices::reduceAllRowsWithArgument ( const Matrix & matrix,
Fetch && fetch,
Reduction && reduction,
Store && store,
const FetchValue & identity,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Performs parallel reduction within each matrix row over all rows while returning also the position of the element of interest (const version).

See also: Overview of Matrix Reduction Functions

Template Parameters
Matrix: The type of the matrix.
Fetch: The type of the lambda function used for data fetching.
Reduction: The type of the reduction operation.
Store: The type of the lambda function used for storing results from individual rows.
FetchValue: The type returned by the Fetch lambda function.
Parameters
matrix: The matrix on which the reduction will be performed.
fetch: Lambda function for fetching data. See For Const Matrices.
reduction: Lambda function for reduction with argument tracking. See Reduction With Argument (Position Tracking).
store: Lambda function for storing results with position tracking. See Store With Argument (Position Tracking).
identity: The initial value for the reduction operation.
launchConfig: The configuration of the launch - see TNL::Algorithms::Segments::LaunchConfiguration.
Example
#include <iostream>
#include <limits>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Matrices/reduce.h>
#include <TNL/Matrices/traverse.h>
#include <TNL/Containers/Vector.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
reduceAllRowsWithArgumentExample()
{
   /***
    * Create a 6x6 dense matrix.
    */
   TNL::Matrices::DenseMatrix< double, Device > matrix( 6, 6 );
   matrix.setValue( 0.0 );

   /***
    * Fill the matrix with values.
    */
   auto fillMatrix = [] __cuda_callable__( int rowIdx, int localIdx, int columnIdx, double& value )
   {
      value = ( rowIdx * 13 + columnIdx * 7 ) % 20;
   };
   TNL::Matrices::forAllElements( matrix, fillMatrix );

   std::cout << "Dense matrix:\n";
   std::cout << matrix << '\n';

   /***
    * Find maximum value and its column index in each row.
    */
   TNL::Containers::Vector< double, Device > rowMaxValues( matrix.getRows() );
   TNL::Containers::Vector< int, Device > rowMaxColumns( matrix.getRows() );
   auto maxValues_view = rowMaxValues.getView();
   auto maxColumns_view = rowMaxColumns.getView();

   auto fetch = [] __cuda_callable__( int rowIdx, int columnIdx, const double& value ) -> double
   {
      return value;
   };

   auto reduction = [] __cuda_callable__( double& a, const double& b, int& aIdx, const int& bIdx )
   {
      if( a < b ) {
         a = b;
         aIdx = bIdx;
      }
      else if( a == b && bIdx < aIdx ) {
         aIdx = bIdx;
      }
   };

   auto store = [ = ] __cuda_callable__( int rowIdx, int localIdx, int columnIdx, const double& value, bool emptyRow ) mutable
   {
      maxValues_view[ rowIdx ] = value;
      if( ! emptyRow )
         maxColumns_view[ rowIdx ] = columnIdx;
   };

   TNL::Matrices::reduceAllRowsWithArgument( matrix, fetch, reduction, store, std::numeric_limits< double >::lowest() );
   // You may also use TNL::MaxWithArg{} instead of defining your own reduction lambda.

   std::cout << "Row maxima values: " << rowMaxValues << '\n';
   std::cout << "Column indices of maxima: " << rowMaxColumns << '\n';

   /***
    * Find minimum value and its column index in each row.
    */
   TNL::Containers::Vector< double, Device > rowMinValues( matrix.getRows() );
   TNL::Containers::Vector< int, Device > rowMinColumns( matrix.getRows() );
   auto minValues_view = rowMinValues.getView();
   auto minColumns_view = rowMinColumns.getView();

   auto reductionMin = [] __cuda_callable__( double& a, const double& b, int& aIdx, const int& bIdx )
   {
      if( a > b ) {
         a = b;
         aIdx = bIdx;
      }
      else if( a == b && bIdx < aIdx ) {
         aIdx = bIdx;
      }
   };

   auto storeMin =
      [ = ] __cuda_callable__( int rowIdx, int localIdx, int columnIdx, const double& value, bool emptyRow ) mutable
   {
      minValues_view[ rowIdx ] = value;
      if( ! emptyRow )
         minColumns_view[ rowIdx ] = columnIdx;
   };

   TNL::Matrices::reduceAllRowsWithArgument( matrix, fetch, reductionMin, storeMin, std::numeric_limits< double >::max() );
   // You may also use TNL::MinWithArg{} instead of defining your own reduction lambda.

   std::cout << "Row minima values: " << rowMinValues << '\n';
   std::cout << "Column indices of minima: " << rowMinColumns << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running on host:\n";
   reduceAllRowsWithArgumentExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << '\n' << "Running on CUDA device:\n";
   reduceAllRowsWithArgumentExample< TNL::Devices::Cuda >();
#endif
}
Output
Running on host:
Dense matrix:
Row: 0 -> 0:0 1:7 2:14 3:1 4:8 5:15
Row: 1 -> 0:13 1:0 2:7 3:14 4:1 5:8
Row: 2 -> 0:6 1:13 2:0 3:7 4:14 5:1
Row: 3 -> 0:19 1:6 2:13 3:0 4:7 5:14
Row: 4 -> 0:12 1:19 2:6 3:13 4:0 5:7
Row: 5 -> 0:5 1:12 2:19 3:6 4:13 5:0
Row maxima values: [ 15, 14, 14, 19, 19, 19 ]
Column indices of maxima: [ 5, 3, 4, 0, 1, 2 ]
Row minima values: [ 0, 0, 0, 0, 0, 0 ]
Column indices of minima: [ 0, 1, 2, 3, 4, 5 ]
Running on CUDA device:
Dense matrix:
Row: 0 -> 0:0 1:7 2:14 3:1 4:8 5:15
Row: 1 -> 0:13 1:0 2:7 3:14 4:1 5:8
Row: 2 -> 0:6 1:13 2:0 3:7 4:14 5:1
Row: 3 -> 0:19 1:6 2:13 3:0 4:7 5:14
Row: 4 -> 0:12 1:19 2:6 3:13 4:0 5:7
Row: 5 -> 0:5 1:12 2:19 3:6 4:13 5:0
Row maxima values: [ 15, 14, 14, 19, 19, 19 ]
Column indices of maxima: [ 5, 3, 4, 0, 1, 2 ]
Row minima values: [ 0, 0, 0, 0, 0, 0 ]
Column indices of minima: [ 0, 1, 2, 3, 4, 5 ]
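
The store lambda in the listing receives an emptyRow flag and skips writing the column index when it is set. The dense example never exercises that branch, so the following standard C++ sketch adds one empty row to show the case. It is a sequential illustration only: SketchRow, reduceAllRowsWithArgumentSequential, RowMaxima and rowMaximaSketch are hypothetical names, rows are assumed to be stored as ( columnIdx, value ) pairs, and the rule used here for deciding that a row is empty is an assumption of the sketch rather than documented TNL behaviour.

```cpp
#include <cassert>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical row storage for this sketch: each row is a list of
// ( columnIdx, value ) pairs, so a row may be empty.
using SketchRow = std::vector< std::pair< int, double > >;

// Hypothetical sequential reference, not part of TNL.
template< typename Fetch, typename Reduction, typename Store >
void
reduceAllRowsWithArgumentSequential( const std::vector< SketchRow >& rows,
                                     Fetch fetch,
                                     Reduction reduction,
                                     Store store,
                                     double identity )
{
   for( int rowIdx = 0; rowIdx < static_cast< int >( rows.size() ); rowIdx++ ) {
      const SketchRow& row = rows[ rowIdx ];
      double value = identity;
      int localIdx = -1;  // position within the row of the selected element
      for( int i = 0; i < static_cast< int >( row.size() ); i++ )
         reduction( value, fetch( rowIdx, row[ i ].first, row[ i ].second ), localIdx, i );
      assert( localIdx < static_cast< int >( row.size() ) );
      // Assumption of this sketch: a row counts as empty when nothing was selected.
      const bool emptyRow = localIdx < 0;
      const int columnIdx = emptyRow ? -1 : row[ localIdx ].first;
      store( rowIdx, localIdx, columnIdx, value, emptyRow );
   }
}

struct RowMaxima
{
   std::vector< double > values;
   std::vector< int > columns;
};

// Row maxima for the example's matrix plus one extra empty row (row 6).
RowMaxima
rowMaximaSketch()
{
   std::vector< SketchRow > rows( 7 );
   for( int i = 0; i < 6; i++ )
      for( int j = 0; j < 6; j++ )
         rows[ i ].push_back( std::make_pair( j, double( ( i * 13 + j * 7 ) % 20 ) ) );

   RowMaxima result;
   result.values.assign( rows.size(), 0.0 );
   result.columns.assign( rows.size(), -1 );
   reduceAllRowsWithArgumentSequential(
      rows,
      []( int, int, const double& value ) { return value; },
      []( double& a, const double& b, int& aIdx, const int& bIdx ) {
         if( a < b ) {
            a = b;
            aIdx = bIdx;
         }
         else if( a == b && bIdx < aIdx ) {
            aIdx = bIdx;
         }
      },
      [ &result ]( int rowIdx, int, int columnIdx, const double& value, bool emptyRow ) {
         result.values[ rowIdx ] = value;
         if( ! emptyRow )
            result.columns[ rowIdx ] = columnIdx;
      },
      std::numeric_limits< double >::lowest() );
   return result;
}
```

In this sketch the empty row keeps the identity as its value and its column slot stays at the initial -1, which is why the guard on emptyRow in the store lambda matters.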

◆ reduceAllRowsWithArgument() [3/4]

template<typename Matrix, typename Fetch, typename Reduction, typename Store>
void TNL::Matrices::reduceAllRowsWithArgument ( Matrix & matrix,
Fetch && fetch,
Reduction && reduction,
Store && store,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Performs parallel reduction within each matrix row over all rows while returning also the position of the element of interest with automatic identity deduction.

See also: Overview of Matrix Reduction Functions

Template Parameters
Matrix: The type of the matrix.
Fetch: The type of the lambda function used for data fetching.
Reduction: The type of the function object defining the reduction operation.
Store: The type of the lambda function used for storing results from individual rows.
Parameters
matrix: The matrix on which the reduction will be performed.
fetch: Lambda function for fetching data. See For Non-Const Matrices.
reduction: Function object for reduction with argument tracking. See Function objects for reduction operations.
store: Lambda function for storing results with position tracking. See Store With Argument (Position Tracking).
launchConfig: The configuration of the launch - see TNL::Algorithms::Segments::LaunchConfiguration.
Example
#include <iostream>
#include <limits>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Matrices/reduce.h>
#include <TNL/Matrices/traverse.h>
#include <TNL/Containers/Vector.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
reduceAllRowsWithArgumentExample()
{
   /***
    * Create a 6x6 dense matrix.
    */
   TNL::Matrices::DenseMatrix< double, Device > matrix( 6, 6 );
   matrix.setValue( 0.0 );

   /***
    * Fill the matrix with values.
    */
   auto fillMatrix = [] __cuda_callable__( int rowIdx, int localIdx, int columnIdx, double& value )
   {
      value = ( rowIdx * 13 + columnIdx * 7 ) % 20;
   };
   TNL::Matrices::forAllElements( matrix, fillMatrix );

   std::cout << "Dense matrix:\n";
   std::cout << matrix << '\n';

   /***
    * Find maximum value and its column index in each row.
    */
   TNL::Containers::Vector< double, Device > rowMaxValues( matrix.getRows() );
   TNL::Containers::Vector< int, Device > rowMaxColumns( matrix.getRows() );
   auto maxValues_view = rowMaxValues.getView();
   auto maxColumns_view = rowMaxColumns.getView();

   auto fetch = [] __cuda_callable__( int rowIdx, int columnIdx, const double& value ) -> double
   {
      return value;
   };

   auto reduction = [] __cuda_callable__( double& a, const double& b, int& aIdx, const int& bIdx )
   {
      if( a < b ) {
         a = b;
         aIdx = bIdx;
      }
      else if( a == b && bIdx < aIdx ) {
         aIdx = bIdx;
      }
   };

   auto store = [ = ] __cuda_callable__( int rowIdx, int localIdx, int columnIdx, const double& value, bool emptyRow ) mutable
   {
      maxValues_view[ rowIdx ] = value;
      if( ! emptyRow )
         maxColumns_view[ rowIdx ] = columnIdx;
   };

   TNL::Matrices::reduceAllRowsWithArgument( matrix, fetch, reduction, store, std::numeric_limits< double >::lowest() );
   // You may also use TNL::MaxWithArg{} instead of defining your own reduction lambda.

   std::cout << "Row maxima values: " << rowMaxValues << '\n';
   std::cout << "Column indices of maxima: " << rowMaxColumns << '\n';

   /***
    * Find minimum value and its column index in each row.
    */
   TNL::Containers::Vector< double, Device > rowMinValues( matrix.getRows() );
   TNL::Containers::Vector< int, Device > rowMinColumns( matrix.getRows() );
   auto minValues_view = rowMinValues.getView();
   auto minColumns_view = rowMinColumns.getView();

   auto reductionMin = [] __cuda_callable__( double& a, const double& b, int& aIdx, const int& bIdx )
   {
      if( a > b ) {
         a = b;
         aIdx = bIdx;
      }
      else if( a == b && bIdx < aIdx ) {
         aIdx = bIdx;
      }
   };

   auto storeMin =
      [ = ] __cuda_callable__( int rowIdx, int localIdx, int columnIdx, const double& value, bool emptyRow ) mutable
   {
      minValues_view[ rowIdx ] = value;
      if( ! emptyRow )
         minColumns_view[ rowIdx ] = columnIdx;
   };

   TNL::Matrices::reduceAllRowsWithArgument( matrix, fetch, reductionMin, storeMin, std::numeric_limits< double >::max() );
   // You may also use TNL::MinWithArg{} instead of defining your own reduction lambda.

   std::cout << "Row minima values: " << rowMinValues << '\n';
   std::cout << "Column indices of minima: " << rowMinColumns << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running on host:\n";
   reduceAllRowsWithArgumentExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << '\n' << "Running on CUDA device:\n";
   reduceAllRowsWithArgumentExample< TNL::Devices::Cuda >();
#endif
}
Output
Running on host:
Dense matrix:
Row: 0 -> 0:0 1:7 2:14 3:1 4:8 5:15
Row: 1 -> 0:13 1:0 2:7 3:14 4:1 5:8
Row: 2 -> 0:6 1:13 2:0 3:7 4:14 5:1
Row: 3 -> 0:19 1:6 2:13 3:0 4:7 5:14
Row: 4 -> 0:12 1:19 2:6 3:13 4:0 5:7
Row: 5 -> 0:5 1:12 2:19 3:6 4:13 5:0
Row maxima values: [ 15, 14, 14, 19, 19, 19 ]
Column indices of maxima: [ 5, 3, 4, 0, 1, 2 ]
Row minima values: [ 0, 0, 0, 0, 0, 0 ]
Column indices of minima: [ 0, 1, 2, 3, 4, 5 ]
Running on CUDA device:
Dense matrix:
Row: 0 -> 0:0 1:7 2:14 3:1 4:8 5:15
Row: 1 -> 0:13 1:0 2:7 3:14 4:1 5:8
Row: 2 -> 0:6 1:13 2:0 3:7 4:14 5:1
Row: 3 -> 0:19 1:6 2:13 3:0 4:7 5:14
Row: 4 -> 0:12 1:19 2:6 3:13 4:0 5:7
Row: 5 -> 0:5 1:12 2:19 3:6 4:13 5:0
Row maxima values: [ 15, 14, 14, 19, 19, 19 ]
Column indices of maxima: [ 5, 3, 4, 0, 1, 2 ]
Row minima values: [ 0, 0, 0, 0, 0, 0 ]
Column indices of minima: [ 0, 1, 2, 3, 4, 5 ]

◆ reduceAllRowsWithArgument() [4/4]

template<typename Matrix, typename Fetch, typename Reduction, typename Store, typename FetchValue>
void TNL::Matrices::reduceAllRowsWithArgument ( Matrix & matrix,
Fetch && fetch,
Reduction && reduction,
Store && store,
const FetchValue & identity,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Performs parallel reduction within each matrix row over all rows while returning also the position of the element of interest.

See also: Overview of Matrix Reduction Functions

Template Parameters
Matrix: The type of the matrix.
Fetch: The type of the lambda function used for data fetching.
Reduction: The type of the reduction operation.
Store: The type of the lambda function used for storing results from individual rows.
FetchValue: The type returned by the Fetch lambda function.
Parameters
matrix: The matrix on which the reduction will be performed.
fetch: Lambda function for fetching data. See For Non-Const Matrices.
reduction: Lambda function for reduction with argument tracking. See Reduction With Argument (Position Tracking).
store: Lambda function for storing results with position tracking. See Store With Argument (Position Tracking).
identity: The initial value for the reduction operation.
launchConfig: The configuration of the launch - see TNL::Algorithms::Segments::LaunchConfiguration.
Example
#include <iostream>
#include <limits>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Matrices/reduce.h>
#include <TNL/Matrices/traverse.h>
#include <TNL/Containers/Vector.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
reduceAllRowsWithArgumentExample()
{
   /***
    * Create a 6x6 dense matrix.
    */
   TNL::Matrices::DenseMatrix< double, Device > matrix( 6, 6 );
   matrix.setValue( 0.0 );

   /***
    * Fill the matrix with values.
    */
   auto fillMatrix = [] __cuda_callable__( int rowIdx, int localIdx, int columnIdx, double& value )
   {
      value = ( rowIdx * 13 + columnIdx * 7 ) % 20;
   };
   TNL::Matrices::forAllElements( matrix, fillMatrix );

   std::cout << "Dense matrix:\n";
   std::cout << matrix << '\n';

   /***
    * Find maximum value and its column index in each row.
    */
   TNL::Containers::Vector< double, Device > rowMaxValues( matrix.getRows() );
   TNL::Containers::Vector< int, Device > rowMaxColumns( matrix.getRows() );
   auto maxValues_view = rowMaxValues.getView();
   auto maxColumns_view = rowMaxColumns.getView();

   auto fetch = [] __cuda_callable__( int rowIdx, int columnIdx, const double& value ) -> double
   {
      return value;
   };

   auto reduction = [] __cuda_callable__( double& a, const double& b, int& aIdx, const int& bIdx )
   {
      if( a < b ) {
         a = b;
         aIdx = bIdx;
      }
      else if( a == b && bIdx < aIdx ) {
         aIdx = bIdx;
      }
   };

   auto store = [ = ] __cuda_callable__( int rowIdx, int localIdx, int columnIdx, const double& value, bool emptyRow ) mutable
   {
      maxValues_view[ rowIdx ] = value;
      if( ! emptyRow )
         maxColumns_view[ rowIdx ] = columnIdx;
   };

   TNL::Matrices::reduceAllRowsWithArgument( matrix, fetch, reduction, store, std::numeric_limits< double >::lowest() );
   // You may also use TNL::MaxWithArg{} instead of defining your own reduction lambda.

   std::cout << "Row maxima values: " << rowMaxValues << '\n';
   std::cout << "Column indices of maxima: " << rowMaxColumns << '\n';

   /***
    * Find minimum value and its column index in each row.
    */
   TNL::Containers::Vector< double, Device > rowMinValues( matrix.getRows() );
   TNL::Containers::Vector< int, Device > rowMinColumns( matrix.getRows() );
   auto minValues_view = rowMinValues.getView();
   auto minColumns_view = rowMinColumns.getView();

   auto reductionMin = [] __cuda_callable__( double& a, const double& b, int& aIdx, const int& bIdx )
   {
      if( a > b ) {
         a = b;
         aIdx = bIdx;
      }
      else if( a == b && bIdx < aIdx ) {
         aIdx = bIdx;
      }
   };

   auto storeMin =
      [ = ] __cuda_callable__( int rowIdx, int localIdx, int columnIdx, const double& value, bool emptyRow ) mutable
   {
      minValues_view[ rowIdx ] = value;
      if( ! emptyRow )
         minColumns_view[ rowIdx ] = columnIdx;
   };

   TNL::Matrices::reduceAllRowsWithArgument( matrix, fetch, reductionMin, storeMin, std::numeric_limits< double >::max() );
   // You may also use TNL::MinWithArg{} instead of defining your own reduction lambda.

   std::cout << "Row minima values: " << rowMinValues << '\n';
   std::cout << "Column indices of minima: " << rowMinColumns << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running on host:\n";
   reduceAllRowsWithArgumentExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << '\n' << "Running on CUDA device:\n";
   reduceAllRowsWithArgumentExample< TNL::Devices::Cuda >();
#endif
}
Output
Running on host:
Dense matrix:
Row: 0 -> 0:0 1:7 2:14 3:1 4:8 5:15
Row: 1 -> 0:13 1:0 2:7 3:14 4:1 5:8
Row: 2 -> 0:6 1:13 2:0 3:7 4:14 5:1
Row: 3 -> 0:19 1:6 2:13 3:0 4:7 5:14
Row: 4 -> 0:12 1:19 2:6 3:13 4:0 5:7
Row: 5 -> 0:5 1:12 2:19 3:6 4:13 5:0
Row maxima values: [ 15, 14, 14, 19, 19, 19 ]
Column indices of maxima: [ 5, 3, 4, 0, 1, 2 ]
Row minima values: [ 0, 0, 0, 0, 0, 0 ]
Column indices of minima: [ 0, 1, 2, 3, 4, 5 ]
Running on CUDA device:
Dense matrix:
Row: 0 -> 0:0 1:7 2:14 3:1 4:8 5:15
Row: 1 -> 0:13 1:0 2:7 3:14 4:1 5:8
Row: 2 -> 0:6 1:13 2:0 3:7 4:14 5:1
Row: 3 -> 0:19 1:6 2:13 3:0 4:7 5:14
Row: 4 -> 0:12 1:19 2:6 3:13 4:0 5:7
Row: 5 -> 0:5 1:12 2:19 3:6 4:13 5:0
Row maxima values: [ 15, 14, 14, 19, 19, 19 ]
Column indices of maxima: [ 5, 3, 4, 0, 1, 2 ]
Row minima values: [ 0, 0, 0, 0, 0, 0 ]
Column indices of minima: [ 0, 1, 2, 3, 4, 5 ]

◆ reduceAllRowsWithArgumentIf() [1/4]

template<typename Matrix, typename Condition, typename Fetch, typename Reduction, typename Store>
Matrix::IndexType TNL::Matrices::reduceAllRowsWithArgumentIf ( const Matrix & matrix,
Condition && condition,
Fetch && fetch,
Reduction && reduction,
Store && store,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Performs parallel reduction within each matrix row over all rows based on a condition while returning also the position of the element of interest with automatic identity deduction (const version).

See also: Overview of Matrix Reduction Functions

Template Parameters
Matrix: The type of the matrix.
Condition: The type of the lambda function used for the condition check.
Fetch: The type of the lambda function used for data fetching.
Reduction: The type of the function object defining the reduction operation.
Store: The type of the lambda function used for storing results from individual rows.
Parameters
matrix: The matrix on which the reduction will be performed.
condition: Lambda function for row condition checking. See Condition Check.
fetch: Lambda function for fetching data. See For Const Matrices.
reduction: Function object for reduction with argument tracking. See Function objects for reduction operations.
store: Lambda function for storing results with position tracking. See Store With Argument (Position Tracking).
launchConfig: The configuration of the launch - see TNL::Algorithms::Segments::LaunchConfiguration.
Returns
The number of processed rows, i.e. rows for which the condition was true.
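
The returned count is what lets a caller trim result buffers that were sized for the worst case of every row passing the condition. A minimal sequential sketch of that pattern in standard C++ follows; it is not TNL code. CompressedArgmax, argmaxOfSelectedRows and argmaxIfMatrixSketch are hypothetical names, and the matrix formula ( rowIdx * 7 + columnIdx * 11 ) % 20 is the one used by the example for this function.

```cpp
#include <cassert>
#include <limits>
#include <vector>

struct CompressedArgmax
{
   std::vector< double > values;
   std::vector< int > columns;
};

// Hypothetical sequential reference, not part of TNL: argmax of every row
// accepted by `condition`, stored densely in the order the rows are visited.
template< typename Condition >
CompressedArgmax
argmaxOfSelectedRows( const std::vector< std::vector< double > >& matrix, Condition condition )
{
   CompressedArgmax result;
   // One slot per row: the number of accepted rows is unknown up front.
   result.values.assign( matrix.size(), 0.0 );
   result.columns.assign( matrix.size(), -1 );

   int processedRows = 0;
   for( int rowIdx = 0; rowIdx < static_cast< int >( matrix.size() ); rowIdx++ ) {
      if( ! condition( rowIdx ) )
         continue;
      double best = std::numeric_limits< double >::lowest();
      int bestColumn = -1;
      const std::vector< double >& row = matrix[ rowIdx ];
      for( int columnIdx = 0; columnIdx < static_cast< int >( row.size() ); columnIdx++ )
         if( best < row[ columnIdx ] ) {  // strict comparison keeps the smallest column on ties
            best = row[ columnIdx ];
            bestColumn = columnIdx;
         }
      result.values[ processedRows ] = best;
      result.columns[ processedRows ] = bestColumn;
      processedRows++;
   }
   assert( processedRows <= static_cast< int >( matrix.size() ) );
   // The count marks the valid prefix of both buffers.
   result.values.resize( processedRows );
   result.columns.resize( processedRows );
   return result;
}

// Builds the 6x6 matrix with value = ( rowIdx * 7 + columnIdx * 11 ) % 20.
std::vector< std::vector< double > >
argmaxIfMatrixSketch()
{
   std::vector< std::vector< double > > matrix( 6, std::vector< double >( 6 ) );
   for( int i = 0; i < 6; i++ )
      for( int j = 0; j < 6; j++ )
         matrix[ i ][ j ] = ( i * 7 + j * 11 ) % 20;
   return matrix;
}
```

Selecting the even-indexed rows of argmaxIfMatrixSketch() leaves three entries in each buffer, one per accepted row.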
Example
#include <iostream>
#include <limits>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Matrices/reduce.h>
#include <TNL/Matrices/traverse.h>
#include <TNL/Containers/Vector.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
reduceAllRowsWithArgumentIfExample()
{
   /***
    * Create a 6x6 dense matrix.
    */
   TNL::Matrices::DenseMatrix< double, Device > matrix( 6, 6 );
   matrix.setValue( 0.0 );

   /***
    * Fill the matrix with values.
    */
   auto fillMatrix = [] __cuda_callable__( int rowIdx, int localIdx, int columnIdx, double& value )
   {
      value = ( rowIdx * 7 + columnIdx * 11 ) % 20;
   };
   TNL::Matrices::forAllElements( matrix, fillMatrix );

   std::cout << "Dense matrix:\n";
   std::cout << matrix << '\n';

   /***
    * Find argmax only for even-indexed rows.
    */
   TNL::Containers::Vector< double, Device > maxValues( matrix.getRows() );
   TNL::Containers::Vector< int, Device > maxColumns( matrix.getRows() );
   TNL::Containers::Vector< double, Device > compressedMaxValues( matrix.getRows() );
   TNL::Containers::Vector< int, Device > compressedMaxColumns( matrix.getRows() );

   auto maxValues_view = maxValues.getView();
   auto maxColumns_view = maxColumns.getView();
   auto compressedMaxValues_view = compressedMaxValues.getView();
   auto compressedMaxColumns_view = compressedMaxColumns.getView();

   auto fetch = [] __cuda_callable__( int rowIdx, int columnIdx, const double& value ) -> double
   {
      return value;
   };

   auto reduction = [] __cuda_callable__( double& a, const double& b, int& aIdx, const int& bIdx )
   {
      if( a < b ) {
         a = b;
         aIdx = bIdx;
      }
      else if( a == b && bIdx < aIdx ) {
         aIdx = bIdx;
      }
   };

   auto rowCondition = [] __cuda_callable__( int rowIdx ) -> bool
   {
      return rowIdx % 2 == 0;  // Only even row indices
   };

   auto store = [ = ] __cuda_callable__(
                   int indexOfRowIdx, int rowIdx, int localIdx, int columnIdx, const double& value, bool emptyRow ) mutable
   {
      maxValues_view[ rowIdx ] = value;
      compressedMaxValues_view[ indexOfRowIdx ] = value;
      if( ! emptyRow ) {
         maxColumns_view[ rowIdx ] = columnIdx;
         compressedMaxColumns_view[ indexOfRowIdx ] = columnIdx;
      }
   };

   // Initialize with -1 to see which rows were processed
   maxValues.setValue( -1.0 );
   maxColumns.setValue( -1 );

   auto evenRowsCount = TNL::Matrices::reduceAllRowsWithArgumentIf(
      matrix, rowCondition, fetch, reduction, store, std::numeric_limits< double >::lowest() );
   // You may also use TNL::MaxWithArg{} instead of defining your own reduction lambda.

   std::cout << "Argmax for even-indexed rows:\n";
   std::cout << " Max values (odd indices show -1): " << maxValues << '\n';
   std::cout << " Column indices: " << maxColumns << '\n';
   std::cout << "Compressed argmax for even-indexed rows:\n";
   std::cout << " Max values: " << compressedMaxValues.getView( 0, evenRowsCount ) << '\n';
   std::cout << " Column indices: " << compressedMaxColumns.getView( 0, evenRowsCount ) << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running on host:\n";
   reduceAllRowsWithArgumentIfExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << '\n' << "Running on CUDA device:\n";
   reduceAllRowsWithArgumentIfExample< TNL::Devices::Cuda >();
#endif
}
Output
Running on host:
Dense matrix:
Row: 0 -> 0:0 1:11 2:2 3:13 4:4 5:15
Row: 1 -> 0:7 1:18 2:9 3:0 4:11 5:2
Row: 2 -> 0:14 1:5 2:16 3:7 4:18 5:9
Row: 3 -> 0:1 1:12 2:3 3:14 4:5 5:16
Row: 4 -> 0:8 1:19 2:10 3:1 4:12 5:3
Row: 5 -> 0:15 1:6 2:17 3:8 4:19 5:10
Argmax for even-indexed rows:
Max values (odd indices show -1): [ 15, -1, 18, -1, 19, -1 ]
Column indices: [ 5, -1, 4, -1, 1, -1 ]
Compressed argmax for even-indexed rows:
Max values: [ 15, 18, 19 ]
Column indices: [ 5, 4, 1 ]
Running on CUDA device:
Dense matrix:
Row: 0 -> 0:0 1:11 2:2 3:13 4:4 5:15
Row: 1 -> 0:7 1:18 2:9 3:0 4:11 5:2
Row: 2 -> 0:14 1:5 2:16 3:7 4:18 5:9
Row: 3 -> 0:1 1:12 2:3 3:14 4:5 5:16
Row: 4 -> 0:8 1:19 2:10 3:1 4:12 5:3
Row: 5 -> 0:15 1:6 2:17 3:8 4:19 5:10
Argmax for even-indexed rows:
Max values (odd indices show -1): [ 15, -1, 18, -1, 19, -1 ]
Column indices: [ 5, -1, 4, -1, 1, -1 ]
Compressed argmax for even-indexed rows:
Max values: [ 15, 18, 19 ]
Column indices: [ 5, 4, 1 ]

◆ reduceAllRowsWithArgumentIf() [2/4]

template<typename Matrix, typename Condition, typename Fetch, typename Reduction, typename Store, typename FetchValue = decltype( std::declval< Fetch >()( 0, 0, std::declval< typename Matrix::RealType >() ) )>
Matrix::IndexType TNL::Matrices::reduceAllRowsWithArgumentIf ( const Matrix & matrix,
Condition && condition,
Fetch && fetch,
Reduction && reduction,
Store && store,
const FetchValue & identity,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Performs a parallel reduction within each matrix row for which the condition holds, while also reporting the position of the element of interest (const version).

See also: Overview of Matrix Reduction Functions

Template Parameters
Matrix: The type of the matrix.
Condition: The type of the lambda function used for the condition check.
Fetch: The type of the lambda function used for data fetching.
Reduction: The type of the function object defining the reduction operation.
Store: The type of the lambda function used for storing results from individual rows.
FetchValue: The type returned by the Fetch lambda function.
Parameters
matrix: The matrix on which the reduction will be performed.
condition: Lambda function for row condition checking. See Condition Check.
fetch: Lambda function for fetching data. See For Const Matrices.
reduction: Function object for reduction with argument tracking. See Function objects for reduction operations.
store: Lambda function for storing results with position tracking. See Store With Argument (Position Tracking).
identity: The identity element for the reduction operation.
launchConfig: The configuration of the launch - see TNL::Algorithms::Segments::LaunchConfiguration.
Returns
The number of processed rows, i.e. rows for which the condition was true.
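The contract described above can be modeled sequentially. The following is a minimal sketch that uses only the C++ standard library, not TNL: Rows, reduceAllRowsWithArgumentIfSketch, ArgmaxResult and argmaxOfEvenRows are hypothetical names introduced for illustration, and the matrix is assumed to be dense and held as a plain vector of rows. It mirrors the callback signatures of the documented example for this function and is not a description of how TNL runs the reduction in parallel.

```cpp
#include <cassert>
#include <limits>
#include <vector>

// Plain row-major storage standing in for a dense matrix (not a TNL type).
using Rows = std::vector< std::vector< double > >;

// Sequential model: reduce every row accepted by `condition`, track the column
// at which the reduced value was found, and return the number of such rows.
template< typename Condition, typename Fetch, typename Reduction, typename Store >
int
reduceAllRowsWithArgumentIfSketch( const Rows& matrix, Condition condition, Fetch fetch,
                                   Reduction reduction, Store store, double identity )
{
   int processedRows = 0;
   for( int rowIdx = 0; rowIdx < static_cast< int >( matrix.size() ); rowIdx++ ) {
      if( ! condition( rowIdx ) )
         continue;  // rejected rows never reach `store`
      const std::vector< double >& row = matrix[ rowIdx ];
      double result = identity;
      int argument = -1;  // stays -1 if no element replaces the identity
      for( int columnIdx = 0; columnIdx < static_cast< int >( row.size() ); columnIdx++ )
         reduction( result, fetch( rowIdx, columnIdx, row[ columnIdx ] ), argument, columnIdx );
      // Assumption: for dense storage the local index equals the column index.
      store( processedRows, rowIdx, argument, argument, result, row.empty() );
      processedRows++;
   }
   return processedRows;
}

struct ArgmaxResult
{
   std::vector< double > values;
   std::vector< int > columns;
   int processedRows;
};

// Argmax of the even rows of the 6x6 matrix ( rowIdx * 7 + columnIdx * 11 ) % 20.
ArgmaxResult
argmaxOfEvenRows()
{
   Rows matrix( 6, std::vector< double >( 6 ) );
   for( int i = 0; i < 6; i++ )
      for( int j = 0; j < 6; j++ )
         matrix[ i ][ j ] = ( i * 7 + j * 11 ) % 20;

   ArgmaxResult out{ std::vector< double >( 6 ), std::vector< int >( 6 ), 0 };
   auto fetch = []( int, int, const double& value ) { return value; };
   auto reduction = []( double& a, const double& b, int& aIdx, const int& bIdx ) {
      if( a < b ) { a = b; aIdx = bIdx; }
      else if( a == b && bIdx < aIdx ) aIdx = bIdx;
   };
   auto rowCondition = []( int rowIdx ) { return rowIdx % 2 == 0; };
   auto store = [ &out ]( int indexOfRowIdx, int, int, int columnIdx, const double& value, bool emptyRow ) {
      assert( indexOfRowIdx < static_cast< int >( out.values.size() ) );
      out.values[ indexOfRowIdx ] = value;
      if( ! emptyRow )
         out.columns[ indexOfRowIdx ] = columnIdx;
   };
   out.processedRows = reduceAllRowsWithArgumentIfSketch(
      matrix, rowCondition, fetch, reduction, store, std::numeric_limits< double >::lowest() );
   // The return value tells how long the compressed prefix is.
   out.values.resize( out.processedRows );
   out.columns.resize( out.processedRows );
   return out;
}
```

Calling argmaxOfEvenRows() yields three processed rows with values 15, 18, 19 at columns 5, 4, 1, which agrees with the compressed output of the documented example.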
Example
The example and its output are identical to those shown for reduceAllRowsWithArgumentIf() [1/4].

◆ reduceAllRowsWithArgumentIf() [3/4]

template<typename Matrix, typename Condition, typename Fetch, typename Reduction, typename Store>
Matrix::IndexType TNL::Matrices::reduceAllRowsWithArgumentIf ( Matrix & matrix,
Condition && condition,
Fetch && fetch,
Reduction && reduction,
Store && store,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Performs a parallel reduction within each matrix row for which the condition holds, while also reporting the position of the element of interest, with automatic identity deduction.

See also: Overview of Matrix Reduction Functions

Template Parameters
Matrix: The type of the matrix.
Condition: The type of the lambda function used for the condition check.
Fetch: The type of the lambda function used for data fetching.
Reduction: The type of the function object defining the reduction operation.
Store: The type of the lambda function used for storing results from individual rows.
Parameters
matrix: The matrix on which the reduction will be performed.
condition: Lambda function for row condition checking. See Condition Check.
fetch: Lambda function for fetching data. See For Non-Const Matrices.
reduction: Function object for reduction with argument tracking. See Function objects for reduction operations.
store: Lambda function for storing results with position tracking. See Store With Argument (Position Tracking).
launchConfig: The configuration of the launch - see TNL::Algorithms::Segments::LaunchConfiguration.
Returns
The number of processed rows, i.e. rows for which the condition was true.
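This overload takes no identity argument, so the starting value has to come from the reduction function object itself. The sketch below shows one way such a deduction can work. It is an assumption about the general mechanism, not a statement about TNL internals: MaxWithArgSketch and argReduceRowSketch are hypothetical names, and only the C++ standard library is used.

```cpp
#include <cassert>
#include <limits>
#include <vector>

// Hypothetical function object (not TNL's definition): it bundles the
// combination rule with the identity element that belongs to it.
struct MaxWithArgSketch
{
   template< typename Value >
   static constexpr Value
   getIdentity()
   {
      return std::numeric_limits< Value >::lowest();
   }

   template< typename Value, typename Index >
   void
   operator()( Value& a, const Value& b, Index& aIdx, const Index& bIdx ) const
   {
      if( a < b ) {
         a = b;
         aIdx = bIdx;
      }
      else if( a == b && bIdx < aIdx )
         aIdx = bIdx;
   }
};

// Explicit-identity form: the caller supplies the starting value.
template< typename Reduction >
int
argReduceRowSketch( const std::vector< double >& row, Reduction reduction, double identity, double& result )
{
   result = identity;
   int argument = -1;
   for( int columnIdx = 0; columnIdx < static_cast< int >( row.size() ); columnIdx++ )
      reduction( result, row[ columnIdx ], argument, columnIdx );
   assert( argument < static_cast< int >( row.size() ) );
   return argument;
}

// Deduced-identity form: the starting value is taken from the function object.
template< typename Reduction >
int
argReduceRowSketch( const std::vector< double >& row, Reduction reduction, double& result )
{
   return argReduceRowSketch( row, reduction, Reduction::template getIdentity< double >(), result );
}
```

In this sketch a plain lambda has no getIdentity member, so it can only be used with the explicit-identity form. A function object that carries its identity works with both.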
Example
The example and its output are identical to those shown for reduceAllRowsWithArgumentIf() [1/4].

◆ reduceAllRowsWithArgumentIf() [4/4]

template<typename Matrix, typename Condition, typename Fetch, typename Reduction, typename Store, typename FetchValue = decltype( std::declval< Fetch >()( 0, 0, std::declval< typename Matrix::RealType >() ) )>
Matrix::IndexType TNL::Matrices::reduceAllRowsWithArgumentIf ( Matrix & matrix,
Condition && condition,
Fetch && fetch,
Reduction && reduction,
Store && store,
const FetchValue & identity,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Performs a parallel reduction within each matrix row for which the condition holds, while also reporting the position of the element of interest.

See also: Overview of Matrix Reduction Functions

Template Parameters
Matrix: The type of the matrix.
Condition: The type of the lambda function used for the condition check.
Fetch: The type of the lambda function used for data fetching.
Reduction: The type of the function object defining the reduction operation.
Store: The type of the lambda function used for storing results from individual rows.
FetchValue: The type returned by the Fetch lambda function.
Parameters
matrix: The matrix on which the reduction will be performed.
condition: Lambda function for row condition checking. See Condition Check.
fetch: Lambda function for fetching data. See For Non-Const Matrices.
reduction: Function object for reduction with argument tracking. See Function objects for reduction operations.
store: Lambda function for storing results with position tracking. See Store With Argument (Position Tracking).
identity: The identity element for the reduction operation.
launchConfig: The configuration of the launch - see TNL::Algorithms::Segments::LaunchConfiguration.
Returns
The number of processed rows, i.e. rows for which the condition was true.
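The reduction lambda of the documented example breaks ties in favor of the smaller index. A parallel reduction may combine elements in an order that differs from the column order, so a rule without such a tie-break can report different positions for the same row. The sketch below illustrates this with the C++ standard library only; maxSmallestIndex, maxFirstVisited, argmaxInOrder and tieBreakExample are hypothetical names, and the visiting order is simulated by an explicit index list rather than by real parallel execution.

```cpp
#include <cassert>
#include <limits>
#include <vector>

// Same rule as the `reduction` lambda of the example: ties go to the smaller index.
inline void
maxSmallestIndex( double& a, const double& b, int& aIdx, const int& bIdx )
{
   if( a < b ) {
      a = b;
      aIdx = bIdx;
   }
   else if( a == b && bIdx < aIdx )
      aIdx = bIdx;
}

// Variant without the tie-break: the first maximum that is visited wins.
inline void
maxFirstVisited( double& a, const double& b, int& aIdx, const int& bIdx )
{
   if( a < b ) {
      a = b;
      aIdx = bIdx;
   }
}

// Visits the columns of `row` in the given order and returns the tracked index.
template< typename Reduction >
int
argmaxInOrder( const std::vector< double >& row, const std::vector< int >& order, Reduction reduction )
{
   double best = std::numeric_limits< double >::lowest();
   int bestIdx = -1;
   for( const int columnIdx : order ) {
      assert( columnIdx >= 0 && columnIdx < static_cast< int >( row.size() ) );
      reduction( best, row[ columnIdx ], bestIdx, columnIdx );
   }
   return bestIdx;
}

void
tieBreakExample()
{
   const std::vector< double > row{ 3, 7, 7, 1 };  // the maximum 7 occurs in columns 1 and 2
   const std::vector< int > forward{ 0, 1, 2, 3 };
   const std::vector< int > backward{ 3, 2, 1, 0 };
   // With the tie-break the reported column does not depend on the order.
   assert( argmaxInOrder( row, forward, maxSmallestIndex ) == 1 );
   assert( argmaxInOrder( row, backward, maxSmallestIndex ) == 1 );
   // Without it the result follows the visiting order.
   assert( argmaxInOrder( row, forward, maxFirstVisited ) == 1 );
   assert( argmaxInOrder( row, backward, maxFirstVisited ) == 2 );
}
```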
Example
The example and its output are identical to those shown for reduceAllRowsWithArgumentIf() [1/4].

◆ reduceRows() [1/8]

template<typename Matrix, typename Array, typename Fetch, typename Reduction, typename Store, typename T = typename std::enable_if_t< IsArrayType< Array >::value >>
void TNL::Matrices::reduceRows ( const Matrix & matrix,
const Array & rowIndexes,
Fetch && fetch,
Reduction && reduction,
Store && store,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Performs parallel reduction within matrix rows specified by a given set of row indexes with automatic identity deduction (const version).

See also: Overview of Matrix Reduction Functions

Template Parameters
Matrix: The type of the matrix.
Array: The type of the array containing the indexes of the rows to iterate over.
Fetch: The type of the lambda function used for data fetching.
Reduction: The type of the function object defining the reduction operation.
Store: The type of the lambda function used for storing results from individual rows.
Parameters
matrix: The matrix on which the reduction will be performed.
rowIndexes: The array containing the indexes of the rows to iterate over.
fetch: Lambda function for fetching data. See For Const Matrices.
reduction: Function object for reduction operation. See Function objects for reduction operations.
store: Lambda function for storing results. See Store With Row Index Array Or Condition.
launchConfig: The configuration of the launch - see TNL::Algorithms::Segments::LaunchConfiguration.
Example
#include <iostream>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Matrices/reduce.h>
#include <TNL/Matrices/traverse.h>
#include <TNL/Containers/Vector.h>
#include <TNL/Containers/Array.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
reduceRowsExample()
{
   /***
    * Create a 7x7 dense matrix.
    */
   TNL::Matrices::DenseMatrix< double, Device > matrix( 7, 7 );
   matrix.setValue( 0.0 );

   /***
    * Fill the matrix with values.
    */
   auto fillMatrix = [] __cuda_callable__( int rowIdx, int localIdx, int columnIdx, double& value )
   {
      value = rowIdx * 10 + columnIdx;
   };
   TNL::Matrices::forAllElements( matrix, fillMatrix );

   std::cout << "Dense matrix:\n";
   std::cout << matrix << '\n';

   /***
    * Compute sums for rows 2-5 (range variant).
    */
   TNL::Containers::Vector< double, Device > rangeSums( 4 );  // 4 rows
   auto rangeSums_view = rangeSums.getView();

   auto fetch = [] __cuda_callable__( int rowIdx, int columnIdx, const double& value ) -> double
   {
      return value;
   };

   auto storeRange = [ = ] __cuda_callable__( int rowIdx, const double& sum ) mutable
   {
      rangeSums_view[ rowIdx - 2 ] = sum;  // Offset by begin index
   };

   // The row range is half-open: [ 2, 6 ) covers rows 2, 3, 4 and 5.
   TNL::Matrices::reduceRows( matrix, 2, 6, fetch, TNL::Plus{}, storeRange );

   std::cout << "Sums for rows 2-5: " << rangeSums << '\n';

   /***
    * Compute sums for specific rows (array variant).
    */
   TNL::Containers::Array< int, Device > rowIndexes{ 0, 2, 4, 6 };
   // arraySums is not initialized: entries of rows that are not listed in
   // rowIndexes are never written and show unspecified values in the output.
   TNL::Containers::Vector< double, Device > arraySums( matrix.getRows() );
   TNL::Containers::Vector< double, Device > compressedSums( rowIndexes.getSize() );
   auto arraySums_view = arraySums.getView();
   auto compressedSums_view = compressedSums.getView();

   // rowIdx is the matrix row, indexOfRowIdx its position within rowIndexes.
   auto storeArray = [ = ] __cuda_callable__( int indexOfRowIdx, int rowIdx, const double& sum ) mutable
   {
      arraySums_view[ rowIdx ] = sum;
      compressedSums_view[ indexOfRowIdx ] = sum;
   };

   TNL::Matrices::reduceRows( matrix, rowIndexes, fetch, TNL::Plus{}, storeArray );

   std::cout << "Sums for rows [0, 2, 4, 6]: " << arraySums << '\n';
   std::cout << "Compressed sums for rows [0, 2, 4, 6]: " << compressedSums.getView( 0, rowIndexes.getSize() ) << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running on host:\n";
   reduceRowsExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << '\n' << "Running on CUDA device:\n";
   reduceRowsExample< TNL::Devices::Cuda >();
#endif
}
Output
Running on host:
Dense matrix:
Row: 0 -> 0:0 1:1 2:2 3:3 4:4 5:5 6:6
Row: 1 -> 0:10 1:11 2:12 3:13 4:14 5:15 6:16
Row: 2 -> 0:20 1:21 2:22 3:23 4:24 5:25 6:26
Row: 3 -> 0:30 1:31 2:32 3:33 4:34 5:35 6:36
Row: 4 -> 0:40 1:41 2:42 3:43 4:44 5:45 6:46
Row: 5 -> 0:50 1:51 2:52 3:53 4:54 5:55 6:56
Row: 6 -> 0:60 1:61 2:62 3:63 4:64 5:65 6:66
Sums for rows 2-5: [ 161, 231, 301, 371 ]
Sums for rows [0, 2, 4, 6]: [ 21, 0, 161, 4.67421e-310, 301, 4.67421e-310, 441 ]
Compressed sums for rows [0, 2, 4, 6]: [ 21, 161, 301, 441 ]
Running on CUDA device:
Dense matrix:
Row: 0 -> 0:0 1:1 2:2 3:3 4:4 5:5 6:6
Row: 1 -> 0:10 1:11 2:12 3:13 4:14 5:15 6:16
Row: 2 -> 0:20 1:21 2:22 3:23 4:24 5:25 6:26
Row: 3 -> 0:30 1:31 2:32 3:33 4:34 5:35 6:36
Row: 4 -> 0:40 1:41 2:42 3:43 4:44 5:45 6:46
Row: 5 -> 0:50 1:51 2:52 3:53 4:54 5:55 6:56
Row: 6 -> 0:60 1:61 2:62 3:63 4:64 5:65 6:66
Sums for rows 2-5: [ 161, 231, 301, 371 ]
Sums for rows [0, 2, 4, 6]: [ 21, 0, 161, 0, 301, 0, 441 ]
Compressed sums for rows [0, 2, 4, 6]: [ 21, 161, 301, 441 ]
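The array variant hands two indexes to store: the position of the row within rowIndexes and the row index itself. The following is a minimal sequential sketch of that contract, assuming a dense matrix held as a plain vector of rows. Rows, reduceRowsSketch, RowSums and sumSelectedRows are hypothetical names, the code uses only the C++ standard library, and the identity 0 of the sum is passed explicitly here instead of being deduced.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Plain row-major storage standing in for a dense matrix (not a TNL type).
using Rows = std::vector< std::vector< double > >;

// Sequential model of a reduction over the rows listed in `rowIndexes`.
template< typename Fetch, typename Reduction, typename Store >
void
reduceRowsSketch( const Rows& matrix, const std::vector< int >& rowIndexes,
                  Fetch fetch, Reduction reduction, Store store, double identity )
{
   for( int indexOfRowIdx = 0; indexOfRowIdx < static_cast< int >( rowIndexes.size() ); indexOfRowIdx++ ) {
      const int rowIdx = rowIndexes[ indexOfRowIdx ];
      assert( rowIdx >= 0 && rowIdx < static_cast< int >( matrix.size() ) );
      const std::vector< double >& row = matrix[ rowIdx ];
      double result = identity;
      for( int columnIdx = 0; columnIdx < static_cast< int >( row.size() ); columnIdx++ )
         result = reduction( result, fetch( rowIdx, columnIdx, row[ columnIdx ] ) );
      store( indexOfRowIdx, rowIdx, result );
   }
}

struct RowSums
{
   std::vector< double > full;        // one entry per matrix row
   std::vector< double > compressed;  // one entry per selected row
};

// Sums of rows 0, 2, 4 and 6 of the 7x7 matrix rowIdx * 10 + columnIdx.
RowSums
sumSelectedRows()
{
   Rows matrix( 7, std::vector< double >( 7 ) );
   for( int i = 0; i < 7; i++ )
      for( int j = 0; j < 7; j++ )
         matrix[ i ][ j ] = i * 10 + j;

   const std::vector< int > rowIndexes{ 0, 2, 4, 6 };
   // Unlike the documented example, `full` is zero-initialized on purpose.
   RowSums sums{ std::vector< double >( matrix.size(), 0.0 ),
                 std::vector< double >( rowIndexes.size(), 0.0 ) };
   auto fetch = []( int, int, const double& value ) { return value; };
   auto store = [ &sums ]( int indexOfRowIdx, int rowIdx, const double& sum ) {
      sums.full[ rowIdx ] = sum;
      sums.compressed[ indexOfRowIdx ] = sum;
   };
   reduceRowsSketch( matrix, rowIndexes, fetch, std::plus< double >{}, store, 0.0 );
   return sums;
}
```

Initializing the full-size vector before the call avoids the unspecified entries that the host output of the example shows for rows 3 and 5.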

◆ reduceRows() [2/8]

template<typename Matrix, typename Array, typename Fetch, typename Reduction, typename Store, typename FetchValue, typename T = typename std::enable_if_t< IsArrayType< Array >::value >>
void TNL::Matrices::reduceRows ( const Matrix & matrix,
const Array & rowIndexes,
Fetch && fetch,
Reduction && reduction,
Store && store,
const FetchValue & identity,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Performs parallel reduction within matrix rows specified by a given set of row indexes (const version).

See also: Overview of Matrix Reduction Functions

Template Parameters
Matrix: The type of the matrix.
Array: The type of the array containing the indexes of the rows to iterate over.
Fetch: The type of the lambda function used for data fetching.
Reduction: The type of the reduction operation.
Store: The type of the lambda function used for storing results from individual rows.
FetchValue: The type returned by the Fetch lambda function.
Parameters
matrix: The matrix on which the reduction will be performed.
rowIndexes: The array containing the indexes of the rows to iterate over.
fetch: Lambda function for fetching data. See For Const Matrices.
reduction: Lambda function for reduction operation. See Basic Reduction (Without Arguments).
store: Lambda function for storing results. See Store With Row Index Array Or Condition.
identity: The initial value for the reduction operation.
launchConfig: The configuration of the launch - see TNL::Algorithms::Segments::LaunchConfiguration.
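The identity passed to this overload is the value every row reduction starts from, so it has to be neutral for the chosen operation. The sketch below, which uses only the C++ standard library, shows the effect of a neutral and a non-neutral starting value on a row-wise minimum. reduceRowSketch, minimumOf and identityChoiceExample are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <limits>
#include <vector>

// Folds one row with `reduction`, starting from `identity`.
template< typename Reduction >
double
reduceRowSketch( const std::vector< double >& row, Reduction reduction, double identity )
{
   double result = identity;
   for( const double value : row )
      result = reduction( result, value );
   return result;
}

// Minimum written as a binary reduction.
inline double
minimumOf( double a, double b )
{
   return a < b ? a : b;
}

void
identityChoiceExample()
{
   const std::vector< double > row{ 20, 21, 22 };
   const double largest = std::numeric_limits< double >::max();

   // The largest representable value is neutral for a minimum.
   assert( reduceRowSketch( row, minimumOf, largest ) == 20 );
   // 0 is neutral for a sum but not for a minimum: it beats every positive entry.
   assert( reduceRowSketch( row, minimumOf, 0.0 ) == 0 );
   // For an empty row the result is the identity itself.
   assert( reduceRowSketch( std::vector< double >{}, minimumOf, largest ) == largest );
}
```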

◆ reduceRows() [3/8]

template<typename Matrix, typename IndexBegin, typename IndexEnd, typename Fetch, typename Reduction, typename Store, typename T = typename std::enable_if_t< std::is_integral_v< IndexBegin > && std::is_integral_v< IndexEnd > >>
void TNL::Matrices::reduceRows ( const Matrix & matrix,
IndexBegin begin,
IndexEnd end,
Fetch && fetch,
Reduction && reduction,
Store && store,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Performs parallel reduction within each matrix row over a given range of row indexes with automatic identity deduction (const version).

See also: Overview of Matrix Reduction Functions

Template Parameters
Matrix: The type of the matrix.
IndexBegin: The type of the index defining the beginning of the interval [ begin, end ) of rows where the reduction will be performed.
IndexEnd: The type of the index defining the end of the interval [ begin, end ) of rows where the reduction will be performed.
Fetch: The type of the lambda function used for data fetching.
Reduction: The type of the function object defining the reduction operation.
Store: The type of the lambda function used for storing results from individual rows.
Parameters
matrix: The matrix on which the reduction will be performed.
begin: The beginning of the interval [ begin, end ) of rows where the reduction will be performed.
end: The end of the interval [ begin, end ) of rows where the reduction will be performed.
fetch: Lambda function for fetching data. See For Const Matrices.
reduction: Function object for reduction operation. See Function objects for reduction operations.
store: Lambda function for storing results. See Basic Store (Row Index Only).
launchConfig: The configuration of the launch - see TNL::Algorithms::Segments::LaunchConfiguration.
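In the range variant, store receives the absolute row index, so an output vector sized to the range has to be addressed with rowIdx - begin. The following is a minimal sequential sketch of the half-open interval [ begin, end ), assuming a dense matrix held as a plain vector of rows. Rows, reduceRowRangeSketch and sumsOfRows2To5 are hypothetical names, and only the C++ standard library is used.

```cpp
#include <cassert>
#include <vector>

// Plain row-major storage standing in for a dense matrix (not a TNL type).
using Rows = std::vector< std::vector< double > >;

// Sequential model of a reduction over the rows begin, ..., end - 1.
template< typename Fetch, typename Reduction, typename Store >
void
reduceRowRangeSketch( const Rows& matrix, int begin, int end,
                      Fetch fetch, Reduction reduction, Store store, double identity )
{
   assert( 0 <= begin && begin <= end && end <= static_cast< int >( matrix.size() ) );
   for( int rowIdx = begin; rowIdx < end; rowIdx++ ) {
      const std::vector< double >& row = matrix[ rowIdx ];
      double result = identity;
      for( int columnIdx = 0; columnIdx < static_cast< int >( row.size() ); columnIdx++ )
         result = reduction( result, fetch( rowIdx, columnIdx, row[ columnIdx ] ) );
      store( rowIdx, result );  // absolute row index, not an offset
   }
}

// Sums of rows 2 to 5 of the 7x7 matrix rowIdx * 10 + columnIdx.
std::vector< double >
sumsOfRows2To5()
{
   Rows matrix( 7, std::vector< double >( 7 ) );
   for( int i = 0; i < 7; i++ )
      for( int j = 0; j < 7; j++ )
         matrix[ i ][ j ] = i * 10 + j;

   const int begin = 2;
   const int end = 6;  // exclusive
   std::vector< double > sums( end - begin );
   auto fetch = []( int, int, const double& value ) { return value; };
   auto plus = []( double a, double b ) { return a + b; };
   auto store = [ &sums, begin ]( int rowIdx, const double& sum ) {
      sums[ rowIdx - begin ] = sum;  // shift into the range-sized vector
   };
   reduceRowRangeSketch( matrix, begin, end, fetch, plus, store, 0.0 );
   return sums;
}
```

sumsOfRows2To5() returns 161, 231, 301, 371, the same values the range variant prints in the reduceRows() example.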
Example
The example and its output are identical to those shown for reduceRows() [1/8].

◆ reduceRows() [4/8]

template<typename Matrix, typename IndexBegin, typename IndexEnd, typename Fetch, typename Reduction, typename Store, typename FetchValue, typename T = typename std::enable_if_t< std::is_integral_v< IndexBegin > && std::is_integral_v< IndexEnd > >>
void TNL::Matrices::reduceRows ( const Matrix & matrix,
IndexBegin begin,
IndexEnd end,
Fetch && fetch,
Reduction && reduction,
Store && store,
const FetchValue & identity,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Performs parallel reduction within each matrix row over a given range of row indexes (const version).

See also: Overview of Matrix Reduction Functions

Template Parameters
Matrix        The type of the matrix.
IndexBegin    The type of the index defining the beginning of the interval [ begin, end ) of rows where the reduction will be performed.
IndexEnd      The type of the index defining the end of the interval [ begin, end ) of rows where the reduction will be performed.
Fetch         The type of the lambda function used for data fetching.
Reduction     The type of the reduction operation.
Store         The type of the lambda function used for storing results from individual rows.
FetchValue    The type returned by the Fetch lambda function.
Parameters
matrix        The matrix on which the reduction will be performed.
begin         The beginning of the interval [ begin, end ) of rows where the reduction will be performed.
end           The end of the interval [ begin, end ) of rows where the reduction will be performed.
fetch         Lambda function for fetching data. See For Const Matrices.
reduction     Lambda function for the reduction operation. See Basic Reduction (Without Arguments).
store         Lambda function for storing results. See Basic Store (Row Index Only).
identity      The initial value for the reduction operation.
launchConfig  The configuration of the launch - see TNL::Algorithms::Segments::LaunchConfiguration.
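
To make the roles of fetch, reduction, store and identity concrete, here is a minimal sequential sketch in plain C++ of what a row reduction over [ begin, end ) with an explicit identity computes. It does not use TNL: DenseModel, reduceRowsModel and rowMaximaExample are hypothetical names introduced only for illustration, and the real function processes the rows in parallel on the selected device.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-in for a dense matrix (row-major storage); not a TNL type.
struct DenseModel
{
   int rows;
   int columns;
   std::vector< double > values;

   DenseModel( int r, int c )
   : rows( r ), columns( c ), values( static_cast< std::size_t >( r ) * static_cast< std::size_t >( c ), 0.0 )
   {}

   double&
   at( int row, int column )
   {
      return values[ static_cast< std::size_t >( row ) * columns + column ];
   }

   const double&
   at( int row, int column ) const
   {
      return values[ static_cast< std::size_t >( row ) * columns + column ];
   }
};

// Sequential model of a row-wise reduction over rows [ begin, end ).
template< typename Fetch, typename Reduction, typename Store, typename FetchValue >
void
reduceRowsModel( const DenseModel& matrix, int begin, int end,
                 Fetch&& fetch, Reduction&& reduction, Store&& store, const FetchValue& identity )
{
   assert( 0 <= begin && begin <= end && end <= matrix.rows );
   for( int rowIdx = begin; rowIdx < end; rowIdx++ ) {
      FetchValue result = identity;  // every row starts from the identity
      for( int columnIdx = 0; columnIdx < matrix.columns; columnIdx++ )
         result = reduction( result, fetch( rowIdx, columnIdx, matrix.at( rowIdx, columnIdx ) ) );
      store( rowIdx, result );  // basic store: row index and reduced value
   }
}

// Maxima of rows 2-5 of a 7x7 matrix with entries rowIdx * 10 + columnIdx.
std::vector< double >
rowMaximaExample()
{
   DenseModel matrix( 7, 7 );
   for( int r = 0; r < 7; r++ )
      for( int c = 0; c < 7; c++ )
         matrix.at( r, c ) = r * 10 + c;

   std::vector< double > maxima( 4 );
   reduceRowsModel(
      matrix, 2, 6,
      []( int, int, const double& value ) { return value; },
      []( const double& a, const double& b ) { return a > b ? a : b; },
      [ &maxima ]( int rowIdx, const double& max ) { maxima[ rowIdx - 2 ] = max; },  // offset by begin
      std::numeric_limits< double >::lowest() );  // identity of the maximum
   return maxima;
}
```

In this sketch the identity is the lowest representable double because the reduction is a maximum; for a sum it would be 0 and for a product 1.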

◆ reduceRows() [5/8]

template<typename Matrix, typename Array, typename Fetch, typename Reduction, typename Store, typename T = typename std::enable_if_t< IsArrayType< Array >::value >>
void TNL::Matrices::reduceRows ( Matrix & matrix,
const Array & rowIndexes,
Fetch && fetch,
Reduction && reduction,
Store && store,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Performs parallel reduction within matrix rows specified by a given set of row indexes with automatic identity deduction.

See also: Overview of Matrix Reduction Functions

Template Parameters
Matrix        The type of the matrix.
Array         The type of the array containing the indexes of the rows to iterate over.
Fetch         The type of the lambda function used for data fetching.
Reduction     The type of the function object defining the reduction operation.
Store         The type of the lambda function used for storing results from individual rows.
Parameters
matrix        The matrix on which the reduction will be performed.
rowIndexes    The array containing the indexes of the rows to iterate over.
fetch         Lambda function for fetching data. See For Non-Const Matrices.
reduction     Function object for reduction operation. See Function objects for reduction operations.
store         Lambda function for storing results. See Store With Row Index Array Or Condition.
launchConfig  The configuration of the launch - see TNL::Algorithms::Segments::LaunchConfiguration.
Example
#include <iostream>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Matrices/reduce.h>
#include <TNL/Matrices/traverse.h>
#include <TNL/Containers/Vector.h>
#include <TNL/Containers/Array.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
reduceRowsExample()
{
   /***
    * Create a 7x7 dense matrix.
    */
   TNL::Matrices::DenseMatrix< double, Device > matrix( 7, 7 );
   matrix.setValue( 0.0 );

   /***
    * Fill the matrix with values.
    */
   auto fillMatrix = [] __cuda_callable__( int rowIdx, int localIdx, int columnIdx, double& value )
   {
      value = rowIdx * 10 + columnIdx;
   };
   TNL::Matrices::forAllElements( matrix, fillMatrix );

   std::cout << "Dense matrix:\n";
   std::cout << matrix << '\n';

   /***
    * Compute sums for rows 2-5 (range variant).
    */
   TNL::Containers::Vector< double, Device > rangeSums( 4 );  // 4 rows
   auto rangeSums_view = rangeSums.getView();

   auto fetch = [] __cuda_callable__( int rowIdx, int columnIdx, const double& value ) -> double
   {
      return value;
   };

   auto storeRange = [ = ] __cuda_callable__( int rowIdx, const double& sum ) mutable
   {
      rangeSums_view[ rowIdx - 2 ] = sum;  // Offset by begin index
   };

   TNL::Matrices::reduceRows( matrix, 2, 6, fetch, TNL::Plus{}, storeRange );

   std::cout << "Sums for rows 2-5: " << rangeSums << '\n';

   /***
    * Compute sums for specific rows (array variant).
    */
   TNL::Containers::Array< int, Device > rowIndexes{ 0, 2, 4, 6 };
   // arraySums is not initialized, so entries for rows outside rowIndexes stay indeterminate.
   TNL::Containers::Vector< double, Device > arraySums( matrix.getRows() );
   TNL::Containers::Vector< double, Device > compressedSums( rowIndexes.getSize() );
   auto arraySums_view = arraySums.getView();
   auto compressedSums_view = compressedSums.getView();

   auto storeArray = [ = ] __cuda_callable__( int indexOfRowIdx, int rowIdx, const double& sum ) mutable
   {
      arraySums_view[ rowIdx ] = sum;                // indexed by the matrix row
      compressedSums_view[ indexOfRowIdx ] = sum;    // indexed by the position in rowIndexes
   };

   TNL::Matrices::reduceRows( matrix, rowIndexes, fetch, TNL::Plus{}, storeArray );

   std::cout << "Sums for rows [0, 2, 4, 6]: " << arraySums << '\n';
   std::cout << "Compressed sums for rows [0, 2, 4, 6]: " << compressedSums.getView( 0, rowIndexes.getSize() ) << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running on host:\n";
   reduceRowsExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << '\n' << "Running on CUDA device:\n";
   reduceRowsExample< TNL::Devices::Cuda >();
#endif
}
Output
Running on host:
Dense matrix:
Row: 0 -> 0:0 1:1 2:2 3:3 4:4 5:5 6:6
Row: 1 -> 0:10 1:11 2:12 3:13 4:14 5:15 6:16
Row: 2 -> 0:20 1:21 2:22 3:23 4:24 5:25 6:26
Row: 3 -> 0:30 1:31 2:32 3:33 4:34 5:35 6:36
Row: 4 -> 0:40 1:41 2:42 3:43 4:44 5:45 6:46
Row: 5 -> 0:50 1:51 2:52 3:53 4:54 5:55 6:56
Row: 6 -> 0:60 1:61 2:62 3:63 4:64 5:65 6:66
Sums for rows 2-5: [ 161, 231, 301, 371 ]
Sums for rows [0, 2, 4, 6]: [ 21, 0, 161, 4.67421e-310, 301, 4.67421e-310, 441 ]
Compressed sums for rows [0, 2, 4, 6]: [ 21, 161, 301, 441 ]
Running on CUDA device:
Dense matrix:
Row: 0 -> 0:0 1:1 2:2 3:3 4:4 5:5 6:6
Row: 1 -> 0:10 1:11 2:12 3:13 4:14 5:15 6:16
Row: 2 -> 0:20 1:21 2:22 3:23 4:24 5:25 6:26
Row: 3 -> 0:30 1:31 2:32 3:33 4:34 5:35 6:36
Row: 4 -> 0:40 1:41 2:42 3:43 4:44 5:45 6:46
Row: 5 -> 0:50 1:51 2:52 3:53 4:54 5:55 6:56
Row: 6 -> 0:60 1:61 2:62 3:63 4:64 5:65 6:66
Sums for rows 2-5: [ 161, 231, 301, 371 ]
Sums for rows [0, 2, 4, 6]: [ 21, 0, 161, 0, 301, 0, 441 ]
Compressed sums for rows [0, 2, 4, 6]: [ 21, 161, 301, 441 ]
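
The overloads "with automatic identity deduction" take a function object rather than a plain lambda, which suggests that the reduction object itself supplies the starting value. The following plain C++ sketch illustrates that idea only; PlusModel, MaxModel and reduceRowModel are hypothetical names, and the way TNL's own function objects expose their identity is not shown in this section, so the member name identity() is an assumption.

```cpp
#include <cassert>
#include <limits>
#include <vector>

// Hypothetical function objects that carry their own identity element.
struct PlusModel
{
   template< typename T >
   static constexpr T
   identity()
   {
      return T( 0 );
   }

   template< typename T >
   constexpr T
   operator()( const T& a, const T& b ) const
   {
      return a + b;
   }
};

struct MaxModel
{
   template< typename T >
   static constexpr T
   identity()
   {
      return std::numeric_limits< T >::lowest();
   }

   template< typename T >
   constexpr T
   operator()( const T& a, const T& b ) const
   {
      return a > b ? a : b;
   }
};

// Reduce a single row; the starting value comes from the reduction type,
// so the caller does not pass an identity argument.
template< typename Reduction >
double
reduceRowModel( const std::vector< double >& row, const Reduction& reduction )
{
   double result = Reduction::template identity< double >();
   for( const double value : row )
      result = reduction( result, value );
   return result;
}
```

With this design an empty row reduces to the identity itself, which is why a plain lambda (which cannot report an identity) requires the overload with the explicit identity parameter.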

◆ reduceRows() [6/8]

template<typename Matrix, typename Array, typename Fetch, typename Reduction, typename Store, typename FetchValue, typename T = typename std::enable_if_t< IsArrayType< Array >::value >>
void TNL::Matrices::reduceRows ( Matrix & matrix,
const Array & rowIndexes,
Fetch && fetch,
Reduction && reduction,
Store && store,
const FetchValue & identity,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Performs parallel reduction within matrix rows specified by a given set of row indexes.

See also: Overview of Matrix Reduction Functions

Template Parameters
Matrix        The type of the matrix.
Array         The type of the array containing the indexes of the rows to iterate over.
Fetch         The type of the lambda function used for data fetching.
Reduction     The type of the reduction operation.
Store         The type of the lambda function used for storing results from individual rows.
FetchValue    The type returned by the Fetch lambda function.
Parameters
matrix        The matrix on which the reduction will be performed.
rowIndexes    The array containing the indexes of the rows to iterate over.
fetch         Lambda function for fetching data. See For Non-Const Matrices.
reduction     Lambda function for reduction operation. See Basic Reduction (Without Arguments).
store         Lambda function for storing results. See Store With Row Index Array Or Condition.
identity      The initial value for the reduction operation.
launchConfig  The configuration of the launch - see TNL::Algorithms::Segments::LaunchConfiguration.
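
As a rough illustration of the row-index-array variant, the sketch below models it sequentially in plain C++. It is not TNL code: MatrixModel, reduceRowsByIndexModel and selectedRowSumsExample are hypothetical names, and the store signature ( indexOfRowIdx, rowIdx, value ) follows the storeArray lambda used in the examples on this page.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a dense matrix; not a TNL type.
using MatrixModel = std::vector< std::vector< double > >;

// Sequential model of a reduction over the rows listed in rowIndexes.
template< typename Fetch, typename Reduction, typename Store, typename FetchValue >
void
reduceRowsByIndexModel( const MatrixModel& matrix, const std::vector< int >& rowIndexes,
                        Fetch&& fetch, Reduction&& reduction, Store&& store, const FetchValue& identity )
{
   for( std::size_t i = 0; i < rowIndexes.size(); i++ ) {
      const int rowIdx = rowIndexes[ i ];
      assert( rowIdx >= 0 && static_cast< std::size_t >( rowIdx ) < matrix.size() );
      const auto& row = matrix[ rowIdx ];
      FetchValue result = identity;
      for( std::size_t columnIdx = 0; columnIdx < row.size(); columnIdx++ )
         result = reduction( result, fetch( rowIdx, static_cast< int >( columnIdx ), row[ columnIdx ] ) );
      // The store receives both the position in rowIndexes and the matrix row index.
      store( static_cast< int >( i ), rowIdx, result );
   }
}

// Sums of rows 0, 2, 4, 6 of a 7x7 matrix with entries rowIdx * 10 + columnIdx.
// Returns { sums indexed by matrix row, sums indexed by position in rowIndexes }.
std::pair< std::vector< double >, std::vector< double > >
selectedRowSumsExample()
{
   MatrixModel matrix( 7, std::vector< double >( 7 ) );
   for( int r = 0; r < 7; r++ )
      for( int c = 0; c < 7; c++ )
         matrix[ r ][ c ] = r * 10 + c;

   const std::vector< int > rowIndexes{ 0, 2, 4, 6 };
   std::vector< double > fullSums( matrix.size(), 0.0 );  // zero-initialized, unlike arraySums in the TNL example
   std::vector< double > compressedSums( rowIndexes.size(), 0.0 );

   reduceRowsByIndexModel(
      matrix, rowIndexes,
      []( int, int, const double& value ) { return value; },
      []( const double& a, const double& b ) { return a + b; },
      [ & ]( int indexOfRowIdx, int rowIdx, const double& sum )
      {
         fullSums[ rowIdx ] = sum;
         compressedSums[ indexOfRowIdx ] = sum;
      },
      0.0 );  // identity of the sum
   return { fullSums, compressedSums };
}
```

Indexing the result by indexOfRowIdx yields a dense vector with one entry per selected row, while indexing by rowIdx leaves the entries of unselected rows untouched.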

◆ reduceRows() [7/8]

template<typename Matrix, typename IndexBegin, typename IndexEnd, typename Fetch, typename Reduction, typename Store, typename T = typename std::enable_if_t< std::is_integral_v< IndexBegin > && std::is_integral_v< IndexEnd > >>
void TNL::Matrices::reduceRows ( Matrix & matrix,
IndexBegin begin,
IndexEnd end,
Fetch && fetch,
Reduction && reduction,
Store && store,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Performs parallel reduction within each matrix row over a given range of row indexes with automatic identity deduction.

See also: Overview of Matrix Reduction Functions

Template Parameters
Matrix        The type of the matrix.
IndexBegin    The type of the index defining the beginning of the interval [ begin, end ) of rows where the reduction will be performed.
IndexEnd      The type of the index defining the end of the interval [ begin, end ) of rows where the reduction will be performed.
Fetch         The type of the lambda function used for data fetching.
Reduction     The type of the function object defining the reduction operation.
Store         The type of the lambda function used for storing results from individual rows.
Parameters
matrix        The matrix on which the reduction will be performed.
begin         The beginning of the interval [ begin, end ) of rows where the reduction will be performed.
end           The end of the interval [ begin, end ) of rows where the reduction will be performed.
fetch         Lambda function for fetching data. See For Non-Const Matrices.
reduction     Function object for reduction operation. See Function objects for reduction operations.
store         Lambda function for storing results. See Basic Store (Row Index Only).
launchConfig  The configuration of the launch - see TNL::Algorithms::Segments::LaunchConfiguration.

◆ reduceRows() [8/8]

template<typename Matrix, typename IndexBegin, typename IndexEnd, typename Fetch, typename Reduction, typename Store, typename FetchValue, typename T = typename std::enable_if_t< std::is_integral_v< IndexBegin > && std::is_integral_v< IndexEnd > >>
void TNL::Matrices::reduceRows ( Matrix & matrix,
IndexBegin begin,
IndexEnd end,
Fetch && fetch,
Reduction && reduction,
Store && store,
const FetchValue & identity,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Performs parallel reduction within each matrix row over a given range of row indexes.

See also: Overview of Matrix Reduction Functions

Template Parameters
Matrix        The type of the matrix.
IndexBegin    The type of the index defining the beginning of the interval [ begin, end ) of rows where the reduction will be performed.
IndexEnd      The type of the index defining the end of the interval [ begin, end ) of rows where the reduction will be performed.
Fetch         The type of the lambda function used for data fetching.
Reduction     The type of the reduction operation.
Store         The type of the lambda function used for storing results from individual rows.
FetchValue    The type returned by the Fetch lambda function.
Parameters
matrix        The matrix on which the reduction will be performed.
begin         The beginning of the interval [ begin, end ) of rows where the reduction will be performed.
end           The end of the interval [ begin, end ) of rows where the reduction will be performed.
fetch         Lambda function for fetching data. See For Non-Const Matrices.
reduction     Lambda function for the reduction operation. See Basic Reduction (Without Arguments).
store         Lambda function for storing results. See Basic Store (Row Index Only).
identity      The initial value for the reduction operation.
launchConfig  The configuration of the launch - see TNL::Algorithms::Segments::LaunchConfiguration.

◆ reduceRowsIf() [1/4]

template<typename Matrix, typename IndexBegin, typename IndexEnd, typename Condition, typename Fetch, typename Reduction, typename Store>
Matrix::IndexType TNL::Matrices::reduceRowsIf ( const Matrix & matrix,
IndexBegin begin,
IndexEnd end,
Condition && condition,
Fetch && fetch,
Reduction && reduction,
Store && store,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Performs parallel reduction within each matrix row over a given range of row indexes based on a condition with automatic identity deduction (const version).

See also: Overview of Matrix Reduction Functions

Template Parameters
Matrix        The type of the matrix.
IndexBegin    The type of the index defining the beginning of the interval [ begin, end ) of rows where the reduction will be performed.
IndexEnd      The type of the index defining the end of the interval [ begin, end ) of rows where the reduction will be performed.
Condition     The type of the lambda function used for the condition check.
Fetch         The type of the lambda function used for data fetching.
Reduction     The type of the function object defining the reduction operation.
Store         The type of the lambda function used for storing results from individual rows.
Parameters
matrix        The matrix on which the reduction will be performed.
begin         The beginning of the interval [ begin, end ) of rows where the reduction will be performed.
end           The end of the interval [ begin, end ) of rows where the reduction will be performed.
condition     Lambda function for condition check. See Condition Check.
fetch         Lambda function for fetching data. See For Const Matrices.
reduction     Function object for reduction operation. See Function objects for reduction operations.
store         Lambda function for storing results. See Store With Row Index Array Or Condition.
launchConfig  The configuration of the launch - see TNL::Algorithms::Segments::LaunchConfiguration.
Returns
The number of processed rows, i.e. rows for which the condition was true.
Example
#include <iostream>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Matrices/reduce.h>
#include <TNL/Matrices/traverse.h>
#include <TNL/Containers/Vector.h>
#include <TNL/Containers/Array.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
reduceRowsIfExample()
{
   /***
    * Create an 8x8 dense matrix.
    */
   TNL::Matrices::DenseMatrix< double, Device > matrix( 8, 8 );
   matrix.setValue( 0.0 );

   /***
    * Fill the matrix with values.
    */
   auto fillMatrix = [] __cuda_callable__( int rowIdx, int localIdx, int columnIdx, double& value )
   {
      value = rowIdx * 10 + columnIdx;
   };
   TNL::Matrices::forAllElements( matrix, fillMatrix );

   std::cout << "Matrix:\n";
   std::cout << matrix << '\n';

   /***
    * Compute sums for rows 2-6, but only for even-indexed rows (range + condition).
    */
   TNL::Containers::Vector< double, Device > rangeSums( 5 );  // 5 rows
   auto rangeSums_view = rangeSums.getView();

   auto fetch = [] __cuda_callable__( int rowIdx, int columnIdx, const double& value ) -> double
   {
      return value;
   };

   auto evenRowCondition = [] __cuda_callable__( int rowIdx ) -> bool
   {
      return rowIdx % 2 == 0;
   };

   auto storeRange = [ = ] __cuda_callable__( int indexOfRowIdx, int rowIdx, const double& sum ) mutable
   {
      rangeSums_view[ rowIdx - 2 ] = sum;  // Offset by begin index
   };

   rangeSums.setValue( -1.0 );  // Marks rows skipped by the condition

   TNL::Matrices::reduceRowsIf( matrix, 2, 7, evenRowCondition, fetch, TNL::Plus{}, storeRange );

   std::cout << "Sums for rows 2-6 (only even indices, others show -1): " << rangeSums << '\n';

   /***
    * Compute maxima over all rows, but only for odd row indexes greater than 3 (all rows + condition).
    */
   TNL::Containers::Vector< double, Device > arrayMaxima( matrix.getRows() );
   TNL::Containers::Vector< double, Device > compressedMaxima( matrix.getRows() );
   auto arrayMaxima_view = arrayMaxima.getView();
   auto compressedMaxima_view = compressedMaxima.getView();

   auto rowCondition = [] __cuda_callable__( int rowIdx ) -> bool
   {
      return rowIdx > 3 && rowIdx % 2 == 1;  // Only odd row indices greater than 3
   };

   auto storeArray = [ = ] __cuda_callable__( int indexOfRowIdx, int rowIdx, const double& max ) mutable
   {
      arrayMaxima_view[ rowIdx ] = max;
      compressedMaxima_view[ indexOfRowIdx ] = max;
   };

   arrayMaxima.setValue( -1.0 );

   // The return value is the number of rows for which the condition held.
   auto processedRows = TNL::Matrices::reduceAllRowsIf( matrix, rowCondition, fetch, TNL::Max{}, storeArray );

   std::cout << "Maxima for rows [1, 3, 5, 7] where rowIdx > 3: " << arrayMaxima << '\n';
   std::cout << "Compressed maxima for rows [1, 3, 5, 7] where rowIdx > 3: " << compressedMaxima.getView( 0, processedRows )
             << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running on host:\n";
   reduceRowsIfExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << '\n' << "Running on CUDA device:\n";
   reduceRowsIfExample< TNL::Devices::Cuda >();
#endif
}
Output
Running on host:
Matrix:
Row: 0 -> 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7
Row: 1 -> 0:10 1:11 2:12 3:13 4:14 5:15 6:16 7:17
Row: 2 -> 0:20 1:21 2:22 3:23 4:24 5:25 6:26 7:27
Row: 3 -> 0:30 1:31 2:32 3:33 4:34 5:35 6:36 7:37
Row: 4 -> 0:40 1:41 2:42 3:43 4:44 5:45 6:46 7:47
Row: 5 -> 0:50 1:51 2:52 3:53 4:54 5:55 6:56 7:57
Row: 6 -> 0:60 1:61 2:62 3:63 4:64 5:65 6:66 7:67
Row: 7 -> 0:70 1:71 2:72 3:73 4:74 5:75 6:76 7:77
Sums for rows 2-6 (only even indices, others show -1): [ 188, -1, 348, -1, 508 ]
Maxima for rows [1, 3, 5, 7] where rowIdx > 3: [ -1, -1, -1, -1, -1, 57, -1, 77 ]
Compressed maxima for rows [1, 3, 5, 7] where rowIdx > 3: [ 57, 77 ]
Running on CUDA device:
Matrix:
Row: 0 -> 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7
Row: 1 -> 0:10 1:11 2:12 3:13 4:14 5:15 6:16 7:17
Row: 2 -> 0:20 1:21 2:22 3:23 4:24 5:25 6:26 7:27
Row: 3 -> 0:30 1:31 2:32 3:33 4:34 5:35 6:36 7:37
Row: 4 -> 0:40 1:41 2:42 3:43 4:44 5:45 6:46 7:47
Row: 5 -> 0:50 1:51 2:52 3:53 4:54 5:55 6:56 7:57
Row: 6 -> 0:60 1:61 2:62 3:63 4:64 5:65 6:66 7:67
Row: 7 -> 0:70 1:71 2:72 3:73 4:74 5:75 6:76 7:77
Sums for rows 2-6 (only even indices, others show -1): [ 188, -1, 348, -1, 508 ]
Maxima for rows [1, 3, 5, 7] where rowIdx > 3: [ -1, -1, -1, -1, -1, 57, -1, 77 ]
Compressed maxima for rows [1, 3, 5, 7] where rowIdx > 3: [ 57, 77 ]
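
The interplay of the condition, the store arguments and the return value can be modelled sequentially in plain C++ as sketched below. This is not TNL code: MatrixModel, reduceRowsIfModel, IfExampleResult and evenRowSumsExample are hypothetical names, the identity is passed explicitly for simplicity, and treating indexOfRowIdx as a running count of accepted rows is an assumption inferred from the compressed output above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a dense matrix; not a TNL type.
using MatrixModel = std::vector< std::vector< double > >;

// Sequential model of a conditional row reduction over rows [ begin, end ).
// Returns the number of rows for which the condition was true.
template< typename Condition, typename Fetch, typename Reduction, typename Store, typename FetchValue >
int
reduceRowsIfModel( const MatrixModel& matrix, int begin, int end, Condition&& condition,
                   Fetch&& fetch, Reduction&& reduction, Store&& store, const FetchValue& identity )
{
   assert( 0 <= begin && begin <= end && static_cast< std::size_t >( end ) <= matrix.size() );
   int processedRows = 0;
   for( int rowIdx = begin; rowIdx < end; rowIdx++ ) {
      if( ! condition( rowIdx ) )
         continue;  // rows failing the condition are neither reduced nor stored
      const auto& row = matrix[ rowIdx ];
      FetchValue result = identity;
      for( std::size_t columnIdx = 0; columnIdx < row.size(); columnIdx++ )
         result = reduction( result, fetch( rowIdx, static_cast< int >( columnIdx ), row[ columnIdx ] ) );
      store( processedRows, rowIdx, result );  // assumed: first argument counts accepted rows
      processedRows++;
   }
   return processedRows;
}

struct IfExampleResult
{
   std::vector< double > sums;        // indexed by rowIdx - begin, -1 for skipped rows
   std::vector< double > compressed;  // one entry per accepted row
   int processedRows;
};

// Sums of the even-indexed rows among rows 2-6 of an 8x8 matrix
// with entries rowIdx * 10 + columnIdx.
IfExampleResult
evenRowSumsExample()
{
   MatrixModel matrix( 8, std::vector< double >( 8 ) );
   for( int r = 0; r < 8; r++ )
      for( int c = 0; c < 8; c++ )
         matrix[ r ][ c ] = r * 10 + c;

   IfExampleResult result{ std::vector< double >( 5, -1.0 ), {}, 0 };
   result.processedRows = reduceRowsIfModel(
      matrix, 2, 7,
      []( int rowIdx ) { return rowIdx % 2 == 0; },
      []( int, int, const double& value ) { return value; },
      []( const double& a, const double& b ) { return a + b; },
      [ &result ]( int indexOfRowIdx, int rowIdx, const double& sum )
      {
         result.sums[ rowIdx - 2 ] = sum;
         result.compressed.resize( indexOfRowIdx + 1 );
         result.compressed[ indexOfRowIdx ] = sum;
      },
      0.0 );
   return result;
}
```

The returned count is what makes the compressed layout usable: only the first processedRows entries of the compressed vector are meaningful, just as the example above prints compressedMaxima.getView( 0, processedRows ).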

◆ reduceRowsIf() [2/4]

template<typename Matrix, typename IndexBegin, typename IndexEnd, typename Condition, typename Fetch, typename Reduction, typename Store, typename FetchValue>
Matrix::IndexType TNL::Matrices::reduceRowsIf ( const Matrix & matrix,
IndexBegin begin,
IndexEnd end,
Condition && condition,
Fetch && fetch,
Reduction && reduction,
Store && store,
const FetchValue & identity,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Performs parallel reduction within each matrix row over a given range of row indexes based on a condition (const version).

See also: Overview of Matrix Reduction Functions

Template Parameters
Matrix        The type of the matrix.
IndexBegin    The type of the index defining the beginning of the interval [ begin, end ) of rows where the reduction will be performed.
IndexEnd      The type of the index defining the end of the interval [ begin, end ) of rows where the reduction will be performed.
Condition     The type of the lambda function used for the condition check.
Fetch         The type of the lambda function used for data fetching.
Reduction     The type of the reduction operation.
Store         The type of the lambda function used for storing results from individual rows.
FetchValue    The type returned by the Fetch lambda function.
Parameters
matrix        The matrix on which the reduction will be performed.
begin         The beginning of the interval [ begin, end ) of rows where the reduction will be performed.
end           The end of the interval [ begin, end ) of rows where the reduction will be performed.
condition     Lambda function for condition check. See Condition Check.
fetch         Lambda function for fetching data. See For Const Matrices.
reduction     Lambda function for reduction operation. See Basic Reduction (Without Arguments).
store         Lambda function for storing results. See Store With Row Index Array Or Condition.
identity      The initial value for the reduction operation.
launchConfig  The configuration of the launch - see TNL::Algorithms::Segments::LaunchConfiguration.
Returns
The number of processed rows, i.e. rows for which the condition was true.

◆ reduceRowsIf() [3/4]

template<typename Matrix, typename IndexBegin, typename IndexEnd, typename Condition, typename Fetch, typename Reduction, typename Store>
Matrix::IndexType TNL::Matrices::reduceRowsIf ( Matrix & matrix,
IndexBegin begin,
IndexEnd end,
Condition && condition,
Fetch && fetch,
Reduction && reduction,
Store && store,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Performs a parallel reduction within each matrix row in the range [ begin, end ) whose index satisfies a condition. The identity element is deduced automatically from the reduction function object.

See also: Overview of Matrix Reduction Functions

Template Parameters
Matrix        The type of the matrix.
IndexBegin    The type of the index defining the beginning of the interval [ begin, end ) of rows where the reduction will be performed.
IndexEnd      The type of the index defining the end of the interval [ begin, end ) of rows where the reduction will be performed.
Condition     The type of the lambda function used for the condition check.
Fetch         The type of the lambda function used for data fetching.
Reduction     The type of the function object defining the reduction operation.
Store         The type of the lambda function used for storing results from individual rows.
Parameters
matrix        The matrix on which the reduction will be performed.
begin         The beginning of the interval [ begin, end ) of rows where the reduction will be performed.
end           The end of the interval [ begin, end ) of rows where the reduction will be performed.
condition     Lambda function for condition check. See Condition Check.
fetch         Lambda function for fetching data. See For Non-Const Matrices.
reduction     Function object for reduction operation. See Function objects for reduction operations.
store         Lambda function for storing results. See Basic Store (Row Index Only).
launchConfig  The configuration of the launch - see TNL::Algorithms::Segments::LaunchConfiguration.
Returns
The number of processed rows, i.e. rows for which the condition was true.
Example
#include <iostream>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Matrices/reduce.h>
#include <TNL/Matrices/traverse.h>
#include <TNL/Containers/Vector.h>
#include <TNL/Containers/Array.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
reduceRowsIfExample()
{
   /***
    * Create an 8x8 dense matrix.
    */
   // Declaration missing from the listing; reconstructed from the comment above.
   TNL::Matrices::DenseMatrix< double, Device > matrix( 8, 8 );
   matrix.setValue( 0.0 );

   /***
    * Fill the matrix with values.
    */
   auto fillMatrix = [] __cuda_callable__( int rowIdx, int localIdx, int columnIdx, double& value )
   {
      value = rowIdx * 10 + columnIdx;
   };
   TNL::Matrices::forAllElements( matrix, fillMatrix );

   std::cout << "Matrix:\n";
   std::cout << matrix << '\n';

   /***
    * Compute sums for rows 2-6, but only for even-indexed rows (range + condition).
    */
   // Declaration missing from the listing; the size 5 is assumed from the printed output.
   TNL::Containers::Vector< double, Device > rangeSums( 5 );
   auto rangeSums_view = rangeSums.getView();

   auto fetch = [] __cuda_callable__( int rowIdx, int columnIdx, const double& value ) -> double
   {
      return value;
   };

   auto evenRowCondition = [] __cuda_callable__( int rowIdx ) -> bool
   {
      return rowIdx % 2 == 0;
   };

   auto storeRange = [ = ] __cuda_callable__( int indexOfRowIdx, int rowIdx, const double& sum ) mutable
   {
      rangeSums_view[ rowIdx - 2 ] = sum;
   };

   rangeSums.setValue( -1.0 );

   TNL::Matrices::reduceRowsIf( matrix, 2, 7, evenRowCondition, fetch, TNL::Plus{}, storeRange );

   std::cout << "Sums for rows 2-6 (only even indices, others show -1): " << rangeSums << '\n';

   /***
    * Compute maxima over all rows, but only for odd row indices greater than 3 (all rows + condition).
    */
   TNL::Containers::Vector< double, Device > arrayMaxima( matrix.getRows() );
   TNL::Containers::Vector< double, Device > compressedMaxima( matrix.getRows() );
   auto arrayMaxima_view = arrayMaxima.getView();
   auto compressedMaxima_view = compressedMaxima.getView();

   auto rowCondition = [] __cuda_callable__( int rowIdx ) -> bool
   {
      return rowIdx > 3 && rowIdx % 2 == 1;  // Only odd row indices greater than 3
   };

   auto storeArray = [ = ] __cuda_callable__( int indexOfRowIdx, int rowIdx, const double& max ) mutable
   {
      arrayMaxima_view[ rowIdx ] = max;
      compressedMaxima_view[ indexOfRowIdx ] = max;
   };

   arrayMaxima.setValue( -1.0 );

   auto processedRows = TNL::Matrices::reduceAllRowsIf( matrix, rowCondition, fetch, TNL::Max{}, storeArray );

   std::cout << "Maxima for rows [1, 3, 5, 7] where rowIdx > 3: " << arrayMaxima << '\n';
   std::cout << "Compressed maxima for rows [1, 3, 5, 7] where rowIdx > 3: " << compressedMaxima.getView( 0, processedRows )
             << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running on host:\n";
   reduceRowsIfExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << '\n' << "Running on CUDA device:\n";
   reduceRowsIfExample< TNL::Devices::Cuda >();
#endif
}
Output
Running on host:
Matrix:
Row: 0 -> 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7
Row: 1 -> 0:10 1:11 2:12 3:13 4:14 5:15 6:16 7:17
Row: 2 -> 0:20 1:21 2:22 3:23 4:24 5:25 6:26 7:27
Row: 3 -> 0:30 1:31 2:32 3:33 4:34 5:35 6:36 7:37
Row: 4 -> 0:40 1:41 2:42 3:43 4:44 5:45 6:46 7:47
Row: 5 -> 0:50 1:51 2:52 3:53 4:54 5:55 6:56 7:57
Row: 6 -> 0:60 1:61 2:62 3:63 4:64 5:65 6:66 7:67
Row: 7 -> 0:70 1:71 2:72 3:73 4:74 5:75 6:76 7:77
Sums for rows 2-6 (only even indices, others show -1): [ 188, -1, 348, -1, 508 ]
Maxima for rows [1, 3, 5, 7] where rowIdx > 3: [ -1, -1, -1, -1, -1, 57, -1, 77 ]
Compressed maxima for rows [1, 3, 5, 7] where rowIdx > 3: [ 57, 77 ]
Running on CUDA device:
Matrix:
Row: 0 -> 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7
Row: 1 -> 0:10 1:11 2:12 3:13 4:14 5:15 6:16 7:17
Row: 2 -> 0:20 1:21 2:22 3:23 4:24 5:25 6:26 7:27
Row: 3 -> 0:30 1:31 2:32 3:33 4:34 5:35 6:36 7:37
Row: 4 -> 0:40 1:41 2:42 3:43 4:44 5:45 6:46 7:47
Row: 5 -> 0:50 1:51 2:52 3:53 4:54 5:55 6:56 7:57
Row: 6 -> 0:60 1:61 2:62 3:63 4:64 5:65 6:66 7:67
Row: 7 -> 0:70 1:71 2:72 3:73 4:74 5:75 6:76 7:77
Sums for rows 2-6 (only even indices, others show -1): [ 188, -1, 348, -1, 508 ]
Maxima for rows [1, 3, 5, 7] where rowIdx > 3: [ -1, -1, -1, -1, -1, 57, -1, 77 ]
Compressed maxima for rows [1, 3, 5, 7] where rowIdx > 3: [ 57, 77 ]
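
The following is a minimal sequential sketch of what the range variant with a condition appears to compute, written in plain C++ without TNL. It is not the library's implementation: the names DenseRows, reduceRowsIfSketch, makeExampleMatrix and evenRowSums are hypothetical, the identity is passed explicitly as 0.0, and the meaning of indexOfRowIdx (the rank of a row among the rows that passed the condition) is an assumption inferred from the compressed output printed above.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a dense matrix: one std::vector< double > per row.
using DenseRows = std::vector< std::vector< double > >;

// Sequential sketch of a conditional row reduction over rows [ begin, end ).
// Rows failing `condition` are skipped; for every other row, `store` receives
// the rank of the row among the accepted rows, the row index and the reduced
// value. The return value is the number of accepted rows.
template< typename Condition, typename Fetch, typename Reduction, typename Store >
int
reduceRowsIfSketch( const DenseRows& rows, int begin, int end,
                    Condition condition, Fetch fetch, Reduction reduction,
                    Store store, double identity )
{
   assert( 0 <= begin && begin <= end && end <= static_cast< int >( rows.size() ) );
   int accepted = 0;
   for( int rowIdx = begin; rowIdx < end; rowIdx++ ) {
      if( ! condition( rowIdx ) )
         continue;
      double result = identity;
      const auto& row = rows[ rowIdx ];
      for( int columnIdx = 0; columnIdx < static_cast< int >( row.size() ); columnIdx++ )
         result = reduction( result, fetch( rowIdx, columnIdx, row[ columnIdx ] ) );
      store( accepted, rowIdx, result );
      accepted++;
   }
   return accepted;
}

// Rebuilds the 8x8 matrix of the example: value = rowIdx * 10 + columnIdx.
inline DenseRows
makeExampleMatrix()
{
   DenseRows rows( 8, std::vector< double >( 8 ) );
   for( int i = 0; i < 8; i++ )
      for( int j = 0; j < 8; j++ )
         rows[ i ][ j ] = i * 10 + j;
   return rows;
}

// Sums of the even rows in [ 2, 7 ); skipped rows keep the sentinel -1.
inline std::vector< double >
evenRowSums( const DenseRows& rows, int& processedRows )
{
   std::vector< double > sums( 5, -1.0 );
   processedRows = reduceRowsIfSketch(
      rows, 2, 7,
      []( int rowIdx ) { return rowIdx % 2 == 0; },
      []( int, int, double value ) { return value; },
      []( double a, double b ) { return a + b; },
      [ &sums ]( int, int rowIdx, double sum ) { sums[ rowIdx - 2 ] = sum; },
      0.0 );
   return sums;
}
```

Under these assumptions, evenRowSums reproduces the "Sums for rows 2-6" line of the output above and reports three processed rows (rows 2, 4 and 6).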

◆ reduceRowsIf() [4/4]

template<typename Matrix, typename IndexBegin, typename IndexEnd, typename Condition, typename Fetch, typename Reduction, typename Store, typename FetchValue>
Matrix::IndexType TNL::Matrices::reduceRowsIf ( Matrix & matrix,
IndexBegin begin,
IndexEnd end,
Condition && condition,
Fetch && fetch,
Reduction && reduction,
Store && store,
const FetchValue & identity,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Performs a parallel reduction within each matrix row in the range [ begin, end ) whose index satisfies a condition, starting from an explicitly given identity value.

See also: Overview of Matrix Reduction Functions

Template Parameters
Matrix        The type of the matrix.
IndexBegin    The type of the index defining the beginning of the interval [ begin, end ) of rows where the reduction will be performed.
IndexEnd      The type of the index defining the end of the interval [ begin, end ) of rows where the reduction will be performed.
Condition     The type of the lambda function used for the condition check.
Fetch         The type of the lambda function used for data fetching.
Reduction     The type of the reduction operation.
Store         The type of the lambda function used for storing results from individual rows.
FetchValue    The type returned by the Fetch lambda function.
Parameters
matrix        The matrix on which the reduction will be performed.
begin         The beginning of the interval [ begin, end ) of rows where the reduction will be performed.
end           The end of the interval [ begin, end ) of rows where the reduction will be performed.
condition     Lambda function for condition check. See Condition Check.
fetch         Lambda function for fetching data. See For Non-Const Matrices.
reduction     Lambda function for reduction operation. See Basic Reduction (Without Arguments).
store         Lambda function for storing results. See Basic Store (Row Index Only).
identity      The initial value for the reduction operation.
launchConfig  The configuration of the launch - see TNL::Algorithms::Segments::LaunchConfiguration.
Returns
The number of processed rows, i.e. rows for which the condition was true.
Example
The example and its output are identical to those listed under reduceRowsIf() [3/4] above.

◆ reduceRowsWithArgument() [1/8]

template<typename Matrix, typename Array, typename Fetch, typename Reduction, typename Store, typename T = typename std::enable_if_t< IsArrayType< Array >::value >>
void TNL::Matrices::reduceRowsWithArgument ( const Matrix & matrix,
const Array & rowIndexes,
Fetch && fetch,
Reduction && reduction,
Store && store,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Performs parallel reduction within matrix rows specified by a given set of row indexes while also returning the position of the element of interest, with automatic identity deduction (const version).

See also: Overview of Matrix Reduction Functions

Template Parameters
Matrix        The type of the matrix.
Array         The type of the array containing the indexes of the rows to iterate over.
Fetch         The type of the lambda function used for data fetching.
Reduction     The type of the function object defining the reduction operation.
Store         The type of the lambda function used for storing results from individual rows.
Parameters
matrix        The matrix on which the reduction will be performed.
rowIndexes    The array containing the indexes of the rows to iterate over.
fetch         Lambda function for fetching data. See For Const Matrices.
reduction     Function object for reduction with argument tracking. See Function objects for reduction operations.
store         Lambda function for storing results with position tracking. See Store With Row Index Array and With Argument (Position Tracking).
launchConfig  The configuration of the launch - see TNL::Algorithms::Segments::LaunchConfiguration.
Example
#include <iostream>
#include <limits>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Matrices/reduce.h>
#include <TNL/Matrices/traverse.h>
#include <TNL/Containers/Vector.h>
#include <TNL/Containers/Array.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
reduceRowsWithArgumentExample()
{
   /***
    * Create an 8x8 dense matrix.
    */
   // Declaration missing from the listing; reconstructed from the comment above.
   TNL::Matrices::DenseMatrix< double, Device > matrix( 8, 8 );
   matrix.setValue( 0.0 );

   /***
    * Fill the matrix with values.
    */
   auto fillMatrix = [] __cuda_callable__( int rowIdx, int localIdx, int columnIdx, double& value )
   {
      value = ( rowIdx * 7 + columnIdx * 11 ) % 25;
   };
   TNL::Matrices::forAllElements( matrix, fillMatrix );

   std::cout << "Matrix:\n";
   std::cout << matrix << '\n';

   /***
    * Find argmax for rows 2-6 (range variant).
    */
   int rangeBegin = 2;
   int rangeEnd = 7;
   int rangeSize = rangeEnd - rangeBegin;
   TNL::Containers::Vector< double, Device > rangeMaxValues( rangeSize );
   TNL::Containers::Vector< int, Device > rangeMaxColumns( rangeSize );
   auto rangeMaxValues_view = rangeMaxValues.getView();
   auto rangeMaxColumns_view = rangeMaxColumns.getView();

   auto fetch = [] __cuda_callable__( int rowIdx, int columnIdx, const double& value ) -> double
   {
      return value;
   };

   auto reduction = [] __cuda_callable__( double& a, const double& b, int& aIdx, const int& bIdx )
   {
      if( a < b ) {
         a = b;
         aIdx = bIdx;
      }
      else if( a == b && bIdx < aIdx ) {
         aIdx = bIdx;
      }
   };

   auto storeRange =
      [ = ] __cuda_callable__( int rowIdx, int localIdx, int columnIdx, const double& value, bool emptyRow ) mutable
   {
      rangeMaxValues_view[ rowIdx - rangeBegin ] = value;
      if( ! emptyRow )
         rangeMaxColumns_view[ rowIdx - rangeBegin ] = columnIdx;
   };

   // Call line missing from the listing; the function name is assumed from the section this example documents.
   TNL::Matrices::reduceRowsWithArgument(
      matrix, rangeBegin, rangeEnd, fetch, reduction, storeRange, std::numeric_limits< double >::lowest() );
   // You may also use TNL::MaxWithArg{} instead of defining your own reduction lambda.

   std::cout << "Maxima for rows 2-6:\n";
   std::cout << " Values: " << rangeMaxValues << '\n';
   std::cout << " Columns: " << rangeMaxColumns << '\n';

   /***
    * Find argmin for specific rows (array variant).
    */
   TNL::Containers::Array< int, Device > rowIndexes{ 1, 3, 5, 7 };
   // Only the entries of rows listed in rowIndexes are written below; the other
   // entries of the full-size vectors stay uninitialized, which is why the
   // output shows arbitrary values at those positions.
   TNL::Containers::Vector< double, Device > arrayMinValues( matrix.getRows() );
   TNL::Containers::Vector< int, Device > arrayMinColumns( matrix.getRows() );
   TNL::Containers::Vector< double, Device > compressedMinValues( rowIndexes.getSize() );
   TNL::Containers::Vector< int, Device > compressedMinColumns( rowIndexes.getSize() );
   auto arrayMinValues_view = arrayMinValues.getView();
   auto arrayMinColumns_view = arrayMinColumns.getView();
   auto compressedMinValues_view = compressedMinValues.getView();
   auto compressedMinColumns_view = compressedMinColumns.getView();

   auto reductionMin = [] __cuda_callable__( double& a, const double& b, int& aIdx, const int& bIdx )
   {
      if( a > b ) {
         a = b;
         aIdx = bIdx;
      }
      else if( a == b && bIdx < aIdx ) {
         aIdx = bIdx;
      }
   };

   auto storeArray = [ = ] __cuda_callable__(
                        int indexOfRowIdx, int rowIdx, int localIdx, int columnIdx, const double& value, bool emptyRow ) mutable
   {
      arrayMinValues_view[ rowIdx ] = value;
      compressedMinValues_view[ indexOfRowIdx ] = value;
      if( ! emptyRow ) {
         arrayMinColumns_view[ rowIdx ] = columnIdx;
         compressedMinColumns_view[ indexOfRowIdx ] = columnIdx;
      }
   };

   // Call line missing from the listing; the function name is assumed from the section this example documents.
   TNL::Matrices::reduceRowsWithArgument(
      matrix, rowIndexes, fetch, reductionMin, storeArray, std::numeric_limits< double >::max() );
   // You may also use TNL::MinWithArg{} instead of defining your own reduction lambda.

   std::cout << "Minima for rows [1, 3, 5, 7]:\n";
   std::cout << " Values: " << arrayMinValues << '\n';
   std::cout << " Columns: " << arrayMinColumns << '\n';
   std::cout << "Compressed minima for rows [1, 3, 5, 7]: \n";
   std::cout << " Values: " << compressedMinValues.getView( 0, rowIndexes.getSize() ) << '\n';
   std::cout << " Columns: " << compressedMinColumns.getView( 0, rowIndexes.getSize() ) << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running on host:\n";
   reduceRowsWithArgumentExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << "\nRunning on CUDA device:\n";
   reduceRowsWithArgumentExample< TNL::Devices::Cuda >();
#endif
}
Output
Running on host:
Matrix:
Row: 0 -> 0:0 1:11 2:22 3:8 4:19 5:5 6:16 7:2
Row: 1 -> 0:7 1:18 2:4 3:15 4:1 5:12 6:23 7:9
Row: 2 -> 0:14 1:0 2:11 3:22 4:8 5:19 6:5 7:16
Row: 3 -> 0:21 1:7 2:18 3:4 4:15 5:1 6:12 7:23
Row: 4 -> 0:3 1:14 2:0 3:11 4:22 5:8 6:19 7:5
Row: 5 -> 0:10 1:21 2:7 3:18 4:4 5:15 6:1 7:12
Row: 6 -> 0:17 1:3 2:14 3:0 4:11 5:22 6:8 7:19
Row: 7 -> 0:24 1:10 2:21 3:7 4:18 5:4 6:15 7:1
Maxima for rows 2-6:
Values: [ 22, 23, 22, 21, 22 ]
Columns: [ 3, 7, 4, 1, 5 ]
Minima for rows [1, 3, 5, 7]:
Values: [ 6.93695e-310, 1, 6.95272e-310, 1, 0, 1, 6.93695e-310, 1 ]
Columns: [ -832209541, 4, 0, 5, 2, 6, -1397772448, 7 ]
Compressed minima for rows [1, 3, 5, 7]:
Values: [ 1, 1, 1, 1 ]
Columns: [ 4, 5, 6, 7 ]
Running on CUDA device:
Matrix:
Row: 0 -> 0:0 1:11 2:22 3:8 4:19 5:5 6:16 7:2
Row: 1 -> 0:7 1:18 2:4 3:15 4:1 5:12 6:23 7:9
Row: 2 -> 0:14 1:0 2:11 3:22 4:8 5:19 6:5 7:16
Row: 3 -> 0:21 1:7 2:18 3:4 4:15 5:1 6:12 7:23
Row: 4 -> 0:3 1:14 2:0 3:11 4:22 5:8 6:19 7:5
Row: 5 -> 0:10 1:21 2:7 3:18 4:4 5:15 6:1 7:12
Row: 6 -> 0:17 1:3 2:14 3:0 4:11 5:22 6:8 7:19
Row: 7 -> 0:24 1:10 2:21 3:7 4:18 5:4 6:15 7:1
Maxima for rows 2-6:
Values: [ 22, 23, 22, 21, 22 ]
Columns: [ 3, 7, 4, 1, 5 ]
Minima for rows [1, 3, 5, 7]:
Values: [ 0, 1, 0, 1, 0, 1, 0, 1 ]
Columns: [ 0, 4, 0, 5, 0, 6, 0, 7 ]
Compressed minima for rows [1, 3, 5, 7]:
Values: [ 1, 1, 1, 1 ]
Columns: [ 4, 5, 6, 7 ]

◆ reduceRowsWithArgument() [2/8]

template<typename Matrix, typename Array, typename Fetch, typename Reduction, typename Store, typename FetchValue, typename T = typename std::enable_if_t< IsArrayType< Array >::value >>
void TNL::Matrices::reduceRowsWithArgument ( const Matrix & matrix,
const Array & rowIndexes,
Fetch && fetch,
Reduction && reduction,
Store && store,
const FetchValue & identity,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Performs parallel reduction within matrix rows specified by a given set of row indexes while also returning the position of the element of interest (const version).

See also: Overview of Matrix Reduction Functions

Template Parameters
Matrix        The type of the matrix.
Array         The type of the array containing the indexes of the rows to iterate over.
Fetch         The type of the lambda function used for data fetching.
Reduction     The type of the reduction operation.
Store         The type of the lambda function used for storing results from individual rows.
FetchValue    The type returned by the Fetch lambda function.
Parameters
matrix        The matrix on which the reduction will be performed.
rowIndexes    The array containing the indexes of the rows to iterate over.
fetch         Lambda function for fetching data. See For Const Matrices.
reduction     Lambda function for reduction with argument tracking. See Reduction With Argument (Position Tracking).
store         Lambda function for storing results with position tracking. See Store With Row Index Array and With Argument (Position Tracking).
identity      The initial value for the reduction operation.
launchConfig  The configuration of the launch - see TNL::Algorithms::Segments::LaunchConfiguration.
Example
The example and its output are identical to those listed under reduceRowsWithArgument() [1/8] above.
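
To make the two indexing schemes of the array variant concrete, here is a small sequential sketch in plain C++ (no TNL). It assumes that the store callback receives indexOfRowIdx as the position of the row inside rowIndexes and rowIdx as the row itself, as the example under reduceRowsWithArgument() [1/8] suggests. The names DenseRows, argMinOverRowList, ArgMinOutputs and runArgMinExample are hypothetical, and the full-size vector is pre-filled with a sentinel, which the documented example does not do.

```cpp
#include <cassert>
#include <limits>
#include <vector>

// Hypothetical stand-in for a dense matrix: one std::vector< double > per row.
using DenseRows = std::vector< std::vector< double > >;

// Sequential sketch: argmin of every row listed in rowIndexes.
// store( indexOfRowIdx, rowIdx, localIdx, columnIdx, value, emptyRow ) follows
// the argument order of `storeArray` in the example.
template< typename Store >
void
argMinOverRowList( const DenseRows& rows, const std::vector< int >& rowIndexes, Store store )
{
   for( int indexOfRowIdx = 0; indexOfRowIdx < static_cast< int >( rowIndexes.size() ); indexOfRowIdx++ ) {
      const int rowIdx = rowIndexes[ indexOfRowIdx ];
      assert( 0 <= rowIdx && rowIdx < static_cast< int >( rows.size() ) );
      const auto& row = rows[ rowIdx ];
      double value = std::numeric_limits< double >::max();  // identity for min
      int columnIdx = std::numeric_limits< int >::max();    // assumed initial position
      for( int j = 0; j < static_cast< int >( row.size() ); j++ )
         if( row[ j ] < value || ( row[ j ] == value && j < columnIdx ) ) {
            value = row[ j ];
            columnIdx = j;
         }
      // For a dense row the local index equals the column index.
      store( indexOfRowIdx, rowIdx, columnIdx, columnIdx, value, row.empty() );
   }
}

struct ArgMinOutputs
{
   std::vector< double > fullValues;        // indexed by rowIdx
   std::vector< double > compressedValues;  // indexed by indexOfRowIdx
   std::vector< int > compressedColumns;    // indexed by indexOfRowIdx
};

inline ArgMinOutputs
runArgMinExample()
{
   // Same fill rule as the example: value = ( rowIdx * 7 + columnIdx * 11 ) % 25.
   DenseRows rows( 8, std::vector< double >( 8 ) );
   for( int i = 0; i < 8; i++ )
      for( int j = 0; j < 8; j++ )
         rows[ i ][ j ] = ( i * 7 + j * 11 ) % 25;

   const std::vector< int > rowIndexes{ 1, 3, 5, 7 };
   // The sentinel -1 marks rows that are not listed in rowIndexes.
   ArgMinOutputs out{ std::vector< double >( 8, -1.0 ), std::vector< double >( 4 ), std::vector< int >( 4 ) };
   argMinOverRowList( rows, rowIndexes,
      [ &out ]( int indexOfRowIdx, int rowIdx, int, int columnIdx, double value, bool emptyRow ) {
         out.fullValues[ rowIdx ] = value;
         out.compressedValues[ indexOfRowIdx ] = value;
         if( ! emptyRow )
            out.compressedColumns[ indexOfRowIdx ] = columnIdx;
      } );
   return out;
}
```

With the sentinel in place, the entries of rows 0, 2, 4 and 6 stay at -1 instead of showing whatever the memory happened to contain, while the compressed vectors match the "Compressed minima" lines of the documented output.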

◆ reduceRowsWithArgument() [3/8]

template<typename Matrix, typename IndexBegin, typename IndexEnd, typename Fetch, typename Reduction, typename Store, typename T = typename std::enable_if_t< std::is_integral_v< IndexBegin > && std::is_integral_v< IndexEnd > >>
void TNL::Matrices::reduceRowsWithArgument ( const Matrix & matrix,
IndexBegin begin,
IndexEnd end,
Fetch && fetch,
Reduction && reduction,
Store && store,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Performs parallel reduction within each matrix row over a given range of row indexes while also returning the position of the element of interest, with automatic identity deduction (const version).

See also: Overview of Matrix Reduction Functions

Template Parameters
Matrix        The type of the matrix.
IndexBegin    The type of the index defining the beginning of the interval [ begin, end ) of rows where the reduction will be performed.
IndexEnd      The type of the index defining the end of the interval [ begin, end ) of rows where the reduction will be performed.
Fetch         The type of the lambda function used for data fetching.
Reduction     The type of the function object defining the reduction operation.
Store         The type of the lambda function used for storing results from individual rows.
Parameters
matrix        The matrix on which the reduction will be performed.
begin         The beginning of the interval [ begin, end ) of rows where the reduction will be performed.
end           The end of the interval [ begin, end ) of rows where the reduction will be performed.
fetch         Lambda function for fetching data. See For Const Matrices.
reduction     Function object for reduction operation with argument. See Function objects for reduction operations.
store         Lambda function for storing results. See Store With Argument (Position Tracking).
launchConfig  The configuration of the launch - see TNL::Algorithms::Segments::LaunchConfiguration.
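
As a rough illustration of what the range variant with position tracking computes, the sketch below performs an argmax over rows [ begin, end ) sequentially in plain C++. It is not TNL code: DenseRows, argMaxUpdate, reduceRowsWithArgumentSketch and makeArgExampleMatrix are hypothetical names, the store argument order (rowIdx, localIdx, columnIdx, value, emptyRow) is taken from the storeRange lambda of the reduceRowsWithArgument() [1/8] example, and starting the position at the largest int is an assumption.

```cpp
#include <cassert>
#include <limits>
#include <vector>

// Hypothetical stand-in for a dense matrix: one std::vector< double > per row.
using DenseRows = std::vector< std::vector< double > >;

// Same update rule as the `reduction` lambda of the example: keep the larger
// value and, on ties, the smaller column index.
inline void
argMaxUpdate( double& a, const double& b, int& aIdx, const int& bIdx )
{
   if( a < b ) {
      a = b;
      aIdx = bIdx;
   }
   else if( a == b && bIdx < aIdx ) {
      aIdx = bIdx;
   }
}

// Sequential sketch of an argmax reduction over rows [ begin, end ).
template< typename Store >
void
reduceRowsWithArgumentSketch( const DenseRows& rows, int begin, int end, Store store )
{
   assert( 0 <= begin && begin <= end && end <= static_cast< int >( rows.size() ) );
   for( int rowIdx = begin; rowIdx < end; rowIdx++ ) {
      double value = std::numeric_limits< double >::lowest();  // identity for max
      int columnIdx = std::numeric_limits< int >::max();       // assumed initial position
      const auto& row = rows[ rowIdx ];
      for( int j = 0; j < static_cast< int >( row.size() ); j++ )
         argMaxUpdate( value, row[ j ], columnIdx, j );
      // For a dense row the local index equals the column index.
      store( rowIdx, columnIdx, columnIdx, value, row.empty() );
   }
}

// Same fill rule as the example: value = ( rowIdx * 7 + columnIdx * 11 ) % 25.
inline DenseRows
makeArgExampleMatrix()
{
   DenseRows rows( 8, std::vector< double >( 8 ) );
   for( int i = 0; i < 8; i++ )
      for( int j = 0; j < 8; j++ )
         rows[ i ][ j ] = ( i * 7 + j * 11 ) % 25;
   return rows;
}
```

Calling reduceRowsWithArgumentSketch on the matrix from makeArgExampleMatrix with begin = 2 and end = 7, and a store callback that writes to position rowIdx - 2, yields the values and columns printed as "Maxima for rows 2-6" in the [1/8] example output. The tie branch in argMaxUpdate is redundant in this left-to-right loop; it matters once partial results are combined in an arbitrary order, as a parallel reduction may do.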

◆ reduceRowsWithArgument() [4/8]

template<typename Matrix, typename IndexBegin, typename IndexEnd, typename Fetch, typename Reduction, typename Store, typename FetchValue, typename T = typename std::enable_if_t< std::is_integral_v< IndexBegin > && std::is_integral_v< IndexEnd > >>
void TNL::Matrices::reduceRowsWithArgument ( const Matrix & matrix,
IndexBegin begin,
IndexEnd end,
Fetch && fetch,
Reduction && reduction,
Store && store,
const FetchValue & identity,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Performs parallel reduction within each matrix row over a given range of row indexes while also returning the position of the element of interest (const version).

See also: Overview of Matrix Reduction Functions

Template Parameters
Matrix: The type of the matrix.
IndexBegin: The type of the index defining the beginning of the interval [ begin, end ) of rows where the reduction will be performed.
IndexEnd: The type of the index defining the end of the interval [ begin, end ) of rows where the reduction will be performed.
Fetch: The type of the lambda function used for data fetching.
Reduction: The type of the reduction operation.
Store: The type of the lambda function used for storing results from individual rows.
FetchValue: The type returned by the Fetch lambda function.
Parameters
matrix: The matrix on which the reduction will be performed.
begin: The beginning of the interval [ begin, end ) of rows where the reduction will be performed.
end: The end of the interval [ begin, end ) of rows where the reduction will be performed.
fetch: Lambda function for fetching data. See "For Const Matrices".
reduction: Lambda function for reduction operation with argument. See "Reduction With Argument (Position Tracking)".
store: Lambda function for storing results. See "Store With Argument (Position Tracking)".
identity: The initial value for the reduction operation.
launchConfig: The configuration of the launch - see TNL::Algorithms::Segments::LaunchConfiguration.
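
To make the roles of fetch, reduction, store and identity concrete, here is a sequential reference sketch in plain C++ that mimics the range-based reduction with argument on a dense matrix. It is not TNL's implementation: DenseSketch, makeExampleMatrix and reduceRowsWithArgumentSketch are hypothetical names, the matrix is a plain vector of rows, and the rows are processed one after another instead of in parallel. The matrix uses the same fill formula as the reduceRowsWithArgument example on this page.

```cpp
#include <cassert>
#include <limits>
#include <vector>

// Hypothetical stand-in for a dense matrix: a vector of rows.
using DenseSketch = std::vector< std::vector< double > >;

// Builds an 8x8 matrix with the fill formula of the example on this page.
DenseSketch
makeExampleMatrix()
{
   DenseSketch matrix( 8, std::vector< double >( 8 ) );
   for( int rowIdx = 0; rowIdx < 8; rowIdx++ )
      for( int columnIdx = 0; columnIdx < 8; columnIdx++ )
         matrix[ rowIdx ][ columnIdx ] = ( rowIdx * 7 + columnIdx * 11 ) % 25;
   return matrix;
}

// Sequential reference: for each row in [ begin, end ), fold the fetched
// values with `reduction`, starting from `identity`, and hand the result
// together with its position to `store`.
template< typename Fetch, typename Reduction, typename Store >
void
reduceRowsWithArgumentSketch( const DenseSketch& matrix,
                              int begin,
                              int end,
                              Fetch fetch,
                              Reduction reduction,
                              Store store,
                              double identity )
{
   assert( 0 <= begin && begin <= end && end <= static_cast< int >( matrix.size() ) );
   for( int rowIdx = begin; rowIdx < end; rowIdx++ ) {
      const auto& row = matrix[ rowIdx ];
      double result = identity;
      int argument = 0;
      for( int columnIdx = 0; columnIdx < static_cast< int >( row.size() ); columnIdx++ ) {
         const double candidate = fetch( rowIdx, columnIdx, row[ columnIdx ] );
         reduction( result, candidate, argument, columnIdx );
      }
      // In a dense row the local index and the column index coincide.
      store( rowIdx, argument, argument, result, row.empty() );
   }
}
```

Called with the argmax lambdas of the example, the range [ 2, 7 ) and std::numeric_limits< double >::lowest() as the identity, this sketch reproduces the maxima and columns listed in the example output for rows 2-6.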

◆ reduceRowsWithArgument() [5/8]

template<typename Matrix, typename Array, typename Fetch, typename Reduction, typename Store, typename T = typename std::enable_if_t< IsArrayType< Array >::value >>
void TNL::Matrices::reduceRowsWithArgument ( Matrix & matrix,
const Array & rowIndexes,
Fetch && fetch,
Reduction && reduction,
Store && store,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Performs a parallel reduction within the matrix rows specified by a given set of row indexes, while also returning the position of the element of interest, with automatic identity deduction.

See also: Overview of Matrix Reduction Functions

Template Parameters
Matrix: The type of the matrix.
Array: The type of the array containing the indexes of the rows to iterate over.
Fetch: The type of the lambda function used for data fetching.
Reduction: The type of the function object defining the reduction operation.
Store: The type of the lambda function used for storing results from individual rows.
Parameters
matrix: The matrix on which the reduction will be performed.
rowIndexes: The array containing the indexes of the rows to iterate over.
fetch: Lambda function for fetching data. See "For Non-Const Matrices".
reduction: Function object for reduction with argument tracking. See "Function objects for reduction operations".
store: Lambda function for storing results with position tracking. See "Store With Row Index Array and With Argument (Position Tracking)".
launchConfig: The configuration of the launch - see TNL::Algorithms::Segments::LaunchConfiguration.
Example
#include <iostream>
#include <limits>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Matrices/reduce.h>
#include <TNL/Matrices/traverse.h>
#include <TNL/Containers/Vector.h>
#include <TNL/Containers/Array.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
reduceRowsWithArgumentExample()
{
   /***
    * Create an 8x8 dense matrix.
    */
   TNL::Matrices::DenseMatrix< double, Device > matrix( 8, 8 );
   matrix.setValue( 0.0 );

   /***
    * Fill the matrix with values.
    */
   auto fillMatrix = [] __cuda_callable__( int rowIdx, int localIdx, int columnIdx, double& value )
   {
      value = ( rowIdx * 7 + columnIdx * 11 ) % 25;
   };
   TNL::Matrices::forAllElements( matrix, fillMatrix );

   std::cout << "Matrix:\n";
   std::cout << matrix << '\n';

   /***
    * Find argmax for rows 2-6 (range variant).
    */
   int rangeBegin = 2;
   int rangeEnd = 7;
   int rangeSize = rangeEnd - rangeBegin;
   TNL::Containers::Vector< double, Device > rangeMaxValues( rangeSize );
   TNL::Containers::Vector< int, Device > rangeMaxColumns( rangeSize );
   auto rangeMaxValues_view = rangeMaxValues.getView();
   auto rangeMaxColumns_view = rangeMaxColumns.getView();

   auto fetch = [] __cuda_callable__( int rowIdx, int columnIdx, const double& value ) -> double
   {
      return value;
   };

   // Keeps the larger value; on ties the smaller column index wins.
   auto reduction = [] __cuda_callable__( double& a, const double& b, int& aIdx, const int& bIdx )
   {
      if( a < b ) {
         a = b;
         aIdx = bIdx;
      }
      else if( a == b && bIdx < aIdx ) {
         aIdx = bIdx;
      }
   };

   auto storeRange =
      [ = ] __cuda_callable__( int rowIdx, int localIdx, int columnIdx, const double& value, bool emptyRow ) mutable
   {
      rangeMaxValues_view[ rowIdx - rangeBegin ] = value;
      if( ! emptyRow )
         rangeMaxColumns_view[ rowIdx - rangeBegin ] = columnIdx;
   };

   TNL::Matrices::reduceRowsWithArgument(
      matrix, rangeBegin, rangeEnd, fetch, reduction, storeRange, std::numeric_limits< double >::lowest() );
   // You may also use TNL::MaxWithArg{} instead of defining your own reduction lambda.

   std::cout << "Maxima for rows 2-6:\n";
   std::cout << " Values: " << rangeMaxValues << '\n';
   std::cout << " Columns: " << rangeMaxColumns << '\n';

   /***
    * Find argmin for specific rows (array variant).
    */
   TNL::Containers::Array< int, Device > rowIndexes{ 1, 3, 5, 7 };
   // arrayMinValues and arrayMinColumns are indexed by row. Only the entries for rows listed in
   // rowIndexes are written by the store lambda; the remaining entries stay uninitialized.
   TNL::Containers::Vector< double, Device > arrayMinValues( matrix.getRows() );
   TNL::Containers::Vector< int, Device > arrayMinColumns( matrix.getRows() );
   TNL::Containers::Vector< double, Device > compressedMinValues( rowIndexes.getSize() );
   TNL::Containers::Vector< int, Device > compressedMinColumns( rowIndexes.getSize() );
   auto arrayMinValues_view = arrayMinValues.getView();
   auto arrayMinColumns_view = arrayMinColumns.getView();
   auto compressedMinValues_view = compressedMinValues.getView();
   auto compressedMinColumns_view = compressedMinColumns.getView();

   // Keeps the smaller value; on ties the smaller column index wins.
   auto reductionMin = [] __cuda_callable__( double& a, const double& b, int& aIdx, const int& bIdx )
   {
      if( a > b ) {
         a = b;
         aIdx = bIdx;
      }
      else if( a == b && bIdx < aIdx ) {
         aIdx = bIdx;
      }
   };

   auto storeArray = [ = ] __cuda_callable__(
                        int indexOfRowIdx, int rowIdx, int localIdx, int columnIdx, const double& value, bool emptyRow ) mutable
   {
      arrayMinValues_view[ rowIdx ] = value;
      compressedMinValues_view[ indexOfRowIdx ] = value;
      if( ! emptyRow ) {
         arrayMinColumns_view[ rowIdx ] = columnIdx;
         compressedMinColumns_view[ indexOfRowIdx ] = columnIdx;
      }
   };

   TNL::Matrices::reduceRowsWithArgument(
      matrix, rowIndexes, fetch, reductionMin, storeArray, std::numeric_limits< double >::max() );
   // You may also use TNL::MinWithArg{} instead of defining your own reduction lambda.

   std::cout << "Minima for rows [1, 3, 5, 7]:\n";
   std::cout << " Values: " << arrayMinValues << '\n';
   std::cout << " Columns: " << arrayMinColumns << '\n';
   std::cout << "Compressed minima for rows [1, 3, 5, 7]: \n";
   std::cout << " Values: " << compressedMinValues.getView( 0, rowIndexes.getSize() ) << '\n';
   std::cout << " Columns: " << compressedMinColumns.getView( 0, rowIndexes.getSize() ) << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running on host:\n";
   reduceRowsWithArgumentExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << "\nRunning on CUDA device:\n";
   reduceRowsWithArgumentExample< TNL::Devices::Cuda >();
#endif
}
Output
Running on host:
Matrix:
Row: 0 -> 0:0 1:11 2:22 3:8 4:19 5:5 6:16 7:2
Row: 1 -> 0:7 1:18 2:4 3:15 4:1 5:12 6:23 7:9
Row: 2 -> 0:14 1:0 2:11 3:22 4:8 5:19 6:5 7:16
Row: 3 -> 0:21 1:7 2:18 3:4 4:15 5:1 6:12 7:23
Row: 4 -> 0:3 1:14 2:0 3:11 4:22 5:8 6:19 7:5
Row: 5 -> 0:10 1:21 2:7 3:18 4:4 5:15 6:1 7:12
Row: 6 -> 0:17 1:3 2:14 3:0 4:11 5:22 6:8 7:19
Row: 7 -> 0:24 1:10 2:21 3:7 4:18 5:4 6:15 7:1
Maxima for rows 2-6:
Values: [ 22, 23, 22, 21, 22 ]
Columns: [ 3, 7, 4, 1, 5 ]
Minima for rows [1, 3, 5, 7]:
Values: [ 6.93695e-310, 1, 6.95272e-310, 1, 0, 1, 6.93695e-310, 1 ]
Columns: [ -832209541, 4, 0, 5, 2, 6, -1397772448, 7 ]
Compressed minima for rows [1, 3, 5, 7]:
Values: [ 1, 1, 1, 1 ]
Columns: [ 4, 5, 6, 7 ]
Running on CUDA device:
Matrix:
Row: 0 -> 0:0 1:11 2:22 3:8 4:19 5:5 6:16 7:2
Row: 1 -> 0:7 1:18 2:4 3:15 4:1 5:12 6:23 7:9
Row: 2 -> 0:14 1:0 2:11 3:22 4:8 5:19 6:5 7:16
Row: 3 -> 0:21 1:7 2:18 3:4 4:15 5:1 6:12 7:23
Row: 4 -> 0:3 1:14 2:0 3:11 4:22 5:8 6:19 7:5
Row: 5 -> 0:10 1:21 2:7 3:18 4:4 5:15 6:1 7:12
Row: 6 -> 0:17 1:3 2:14 3:0 4:11 5:22 6:8 7:19
Row: 7 -> 0:24 1:10 2:21 3:7 4:18 5:4 6:15 7:1
Maxima for rows 2-6:
Values: [ 22, 23, 22, 21, 22 ]
Columns: [ 3, 7, 4, 1, 5 ]
Minima for rows [1, 3, 5, 7]:
Values: [ 0, 1, 0, 1, 0, 1, 0, 1 ]
Columns: [ 0, 4, 0, 5, 0, 6, 0, 7 ]
Compressed minima for rows [1, 3, 5, 7]:
Values: [ 1, 1, 1, 1 ]
Columns: [ 4, 5, 6, 7 ]
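
The store lambda of the array variant receives two row-related indexes: indexOfRowIdx, the position of the row within rowIndexes, and rowIdx, the row itself. The sequential sketch below, written in plain C++ without TNL, illustrates the difference between writing results "compressed" (by indexOfRowIdx) and "scattered" (by rowIdx). DenseSketch, makeExampleMatrix and reduceSelectedRowsSketch are hypothetical names introduced for this illustration only.

```cpp
#include <cassert>
#include <limits>
#include <vector>

// Hypothetical stand-in for a dense matrix: a vector of rows.
using DenseSketch = std::vector< std::vector< double > >;

// Builds an 8x8 matrix with the fill formula of the example above.
DenseSketch
makeExampleMatrix()
{
   DenseSketch matrix( 8, std::vector< double >( 8 ) );
   for( int rowIdx = 0; rowIdx < 8; rowIdx++ )
      for( int columnIdx = 0; columnIdx < 8; columnIdx++ )
         matrix[ rowIdx ][ columnIdx ] = ( rowIdx * 7 + columnIdx * 11 ) % 25;
   return matrix;
}

// Sequential reference for the array variant: only the rows listed in
// rowIndexes are reduced, and `store` is told both the position in the
// list ( indexOfRowIdx ) and the actual row ( rowIdx ).
template< typename Fetch, typename Reduction, typename Store >
void
reduceSelectedRowsSketch( const DenseSketch& matrix,
                          const std::vector< int >& rowIndexes,
                          Fetch fetch,
                          Reduction reduction,
                          Store store,
                          double identity )
{
   for( int indexOfRowIdx = 0; indexOfRowIdx < static_cast< int >( rowIndexes.size() ); indexOfRowIdx++ ) {
      const int rowIdx = rowIndexes[ indexOfRowIdx ];
      assert( 0 <= rowIdx && rowIdx < static_cast< int >( matrix.size() ) );
      const auto& row = matrix[ rowIdx ];
      double result = identity;
      int argument = 0;
      for( int columnIdx = 0; columnIdx < static_cast< int >( row.size() ); columnIdx++ )
         reduction( result, fetch( rowIdx, columnIdx, row[ columnIdx ] ), argument, columnIdx );
      // In a dense row the local index and the column index coincide.
      store( indexOfRowIdx, rowIdx, argument, argument, result, row.empty() );
   }
}
```

Indexing the output by indexOfRowIdx needs only rowIndexes.size() entries, all of which are written. Indexing by rowIdx needs one entry per matrix row and leaves the entries of unlisted rows untouched, so such an output should be initialized beforehand if it is going to be read as a whole.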

◆ reduceRowsWithArgument() [6/8]

template<typename Matrix, typename Array, typename Fetch, typename Reduction, typename Store, typename FetchValue, typename T = typename std::enable_if_t< IsArrayType< Array >::value >>
void TNL::Matrices::reduceRowsWithArgument ( Matrix & matrix,
const Array & rowIndexes,
Fetch && fetch,
Reduction && reduction,
Store && store,
const FetchValue & identity,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Performs a parallel reduction within the matrix rows specified by a given set of row indexes, while also returning the position of the element of interest.

See also: Overview of Matrix Reduction Functions

Template Parameters
Matrix: The type of the matrix.
Array: The type of the array containing the indexes of the rows to iterate over.
Fetch: The type of the lambda function used for data fetching.
Reduction: The type of the reduction operation.
Store: The type of the lambda function used for storing results from individual rows.
FetchValue: The type returned by the Fetch lambda function.
Parameters
matrix: The matrix on which the reduction will be performed.
rowIndexes: The array containing the indexes of the rows to iterate over.
fetch: Lambda function for fetching data. See "For Non-Const Matrices".
reduction: Lambda function for reduction with argument tracking. See "Reduction With Argument (Position Tracking)".
store: Lambda function for storing results with position tracking. See "Store With Row Index Array and With Argument (Position Tracking)".
identity: The initial value for the reduction operation.
launchConfig: The configuration of the launch - see TNL::Algorithms::Segments::LaunchConfiguration.
Example
#include <iostream>
#include <limits>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Matrices/reduce.h>
#include <TNL/Matrices/traverse.h>
#include <TNL/Containers/Vector.h>
#include <TNL/Containers/Array.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
reduceRowsWithArgumentExample()
{
   /***
    * Create an 8x8 dense matrix.
    */
   TNL::Matrices::DenseMatrix< double, Device > matrix( 8, 8 );
   matrix.setValue( 0.0 );

   /***
    * Fill the matrix with values.
    */
   auto fillMatrix = [] __cuda_callable__( int rowIdx, int localIdx, int columnIdx, double& value )
   {
      value = ( rowIdx * 7 + columnIdx * 11 ) % 25;
   };
   TNL::Matrices::forAllElements( matrix, fillMatrix );

   std::cout << "Matrix:\n";
   std::cout << matrix << '\n';

   /***
    * Find argmax for rows 2-6 (range variant).
    */
   int rangeBegin = 2;
   int rangeEnd = 7;
   int rangeSize = rangeEnd - rangeBegin;
   TNL::Containers::Vector< double, Device > rangeMaxValues( rangeSize );
   TNL::Containers::Vector< int, Device > rangeMaxColumns( rangeSize );
   auto rangeMaxValues_view = rangeMaxValues.getView();
   auto rangeMaxColumns_view = rangeMaxColumns.getView();

   auto fetch = [] __cuda_callable__( int rowIdx, int columnIdx, const double& value ) -> double
   {
      return value;
   };

   // Keeps the larger value; on ties the smaller column index wins.
   auto reduction = [] __cuda_callable__( double& a, const double& b, int& aIdx, const int& bIdx )
   {
      if( a < b ) {
         a = b;
         aIdx = bIdx;
      }
      else if( a == b && bIdx < aIdx ) {
         aIdx = bIdx;
      }
   };

   auto storeRange =
      [ = ] __cuda_callable__( int rowIdx, int localIdx, int columnIdx, const double& value, bool emptyRow ) mutable
   {
      rangeMaxValues_view[ rowIdx - rangeBegin ] = value;
      if( ! emptyRow )
         rangeMaxColumns_view[ rowIdx - rangeBegin ] = columnIdx;
   };

   TNL::Matrices::reduceRowsWithArgument(
      matrix, rangeBegin, rangeEnd, fetch, reduction, storeRange, std::numeric_limits< double >::lowest() );
   // You may also use TNL::MaxWithArg{} instead of defining your own reduction lambda.

   std::cout << "Maxima for rows 2-6:\n";
   std::cout << " Values: " << rangeMaxValues << '\n';
   std::cout << " Columns: " << rangeMaxColumns << '\n';

   /***
    * Find argmin for specific rows (array variant).
    */
   TNL::Containers::Array< int, Device > rowIndexes{ 1, 3, 5, 7 };
   // arrayMinValues and arrayMinColumns are indexed by row. Only the entries for rows listed in
   // rowIndexes are written by the store lambda; the remaining entries stay uninitialized.
   TNL::Containers::Vector< double, Device > arrayMinValues( matrix.getRows() );
   TNL::Containers::Vector< int, Device > arrayMinColumns( matrix.getRows() );
   TNL::Containers::Vector< double, Device > compressedMinValues( rowIndexes.getSize() );
   TNL::Containers::Vector< int, Device > compressedMinColumns( rowIndexes.getSize() );
   auto arrayMinValues_view = arrayMinValues.getView();
   auto arrayMinColumns_view = arrayMinColumns.getView();
   auto compressedMinValues_view = compressedMinValues.getView();
   auto compressedMinColumns_view = compressedMinColumns.getView();

   // Keeps the smaller value; on ties the smaller column index wins.
   auto reductionMin = [] __cuda_callable__( double& a, const double& b, int& aIdx, const int& bIdx )
   {
      if( a > b ) {
         a = b;
         aIdx = bIdx;
      }
      else if( a == b && bIdx < aIdx ) {
         aIdx = bIdx;
      }
   };

   auto storeArray = [ = ] __cuda_callable__(
                        int indexOfRowIdx, int rowIdx, int localIdx, int columnIdx, const double& value, bool emptyRow ) mutable
   {
      arrayMinValues_view[ rowIdx ] = value;
      compressedMinValues_view[ indexOfRowIdx ] = value;
      if( ! emptyRow ) {
         arrayMinColumns_view[ rowIdx ] = columnIdx;
         compressedMinColumns_view[ indexOfRowIdx ] = columnIdx;
      }
   };

   TNL::Matrices::reduceRowsWithArgument(
      matrix, rowIndexes, fetch, reductionMin, storeArray, std::numeric_limits< double >::max() );
   // You may also use TNL::MinWithArg{} instead of defining your own reduction lambda.

   std::cout << "Minima for rows [1, 3, 5, 7]:\n";
   std::cout << " Values: " << arrayMinValues << '\n';
   std::cout << " Columns: " << arrayMinColumns << '\n';
   std::cout << "Compressed minima for rows [1, 3, 5, 7]: \n";
   std::cout << " Values: " << compressedMinValues.getView( 0, rowIndexes.getSize() ) << '\n';
   std::cout << " Columns: " << compressedMinColumns.getView( 0, rowIndexes.getSize() ) << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running on host:\n";
   reduceRowsWithArgumentExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << "\nRunning on CUDA device:\n";
   reduceRowsWithArgumentExample< TNL::Devices::Cuda >();
#endif
}
Output
Running on host:
Matrix:
Row: 0 -> 0:0 1:11 2:22 3:8 4:19 5:5 6:16 7:2
Row: 1 -> 0:7 1:18 2:4 3:15 4:1 5:12 6:23 7:9
Row: 2 -> 0:14 1:0 2:11 3:22 4:8 5:19 6:5 7:16
Row: 3 -> 0:21 1:7 2:18 3:4 4:15 5:1 6:12 7:23
Row: 4 -> 0:3 1:14 2:0 3:11 4:22 5:8 6:19 7:5
Row: 5 -> 0:10 1:21 2:7 3:18 4:4 5:15 6:1 7:12
Row: 6 -> 0:17 1:3 2:14 3:0 4:11 5:22 6:8 7:19
Row: 7 -> 0:24 1:10 2:21 3:7 4:18 5:4 6:15 7:1
Maxima for rows 2-6:
Values: [ 22, 23, 22, 21, 22 ]
Columns: [ 3, 7, 4, 1, 5 ]
Minima for rows [1, 3, 5, 7]:
Values: [ 6.93695e-310, 1, 6.95272e-310, 1, 0, 1, 6.93695e-310, 1 ]
Columns: [ -832209541, 4, 0, 5, 2, 6, -1397772448, 7 ]
Compressed minima for rows [1, 3, 5, 7]:
Values: [ 1, 1, 1, 1 ]
Columns: [ 4, 5, 6, 7 ]
Running on CUDA device:
Matrix:
Row: 0 -> 0:0 1:11 2:22 3:8 4:19 5:5 6:16 7:2
Row: 1 -> 0:7 1:18 2:4 3:15 4:1 5:12 6:23 7:9
Row: 2 -> 0:14 1:0 2:11 3:22 4:8 5:19 6:5 7:16
Row: 3 -> 0:21 1:7 2:18 3:4 4:15 5:1 6:12 7:23
Row: 4 -> 0:3 1:14 2:0 3:11 4:22 5:8 6:19 7:5
Row: 5 -> 0:10 1:21 2:7 3:18 4:4 5:15 6:1 7:12
Row: 6 -> 0:17 1:3 2:14 3:0 4:11 5:22 6:8 7:19
Row: 7 -> 0:24 1:10 2:21 3:7 4:18 5:4 6:15 7:1
Maxima for rows 2-6:
Values: [ 22, 23, 22, 21, 22 ]
Columns: [ 3, 7, 4, 1, 5 ]
Minima for rows [1, 3, 5, 7]:
Values: [ 0, 1, 0, 1, 0, 1, 0, 1 ]
Columns: [ 0, 4, 0, 5, 0, 6, 0, 7 ]
Compressed minima for rows [1, 3, 5, 7]:
Values: [ 1, 1, 1, 1 ]
Columns: [ 4, 5, 6, 7 ]

◆ reduceRowsWithArgument() [7/8]

template<typename Matrix, typename IndexBegin, typename IndexEnd, typename Fetch, typename Reduction, typename Store, typename T = typename std::enable_if_t< std::is_integral_v< IndexBegin > && std::is_integral_v< IndexEnd > >>
void TNL::Matrices::reduceRowsWithArgument ( Matrix & matrix,
IndexBegin begin,
IndexEnd end,
Fetch && fetch,
Reduction && reduction,
Store && store,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Performs a parallel reduction within each matrix row over a given range of row indexes, while also returning the position of the element of interest, with automatic identity deduction.

See also: Overview of Matrix Reduction Functions

Template Parameters
Matrix: The type of the matrix.
IndexBegin: The type of the index defining the beginning of the interval [ begin, end ) of rows where the reduction will be performed.
IndexEnd: The type of the index defining the end of the interval [ begin, end ) of rows where the reduction will be performed.
Fetch: The type of the lambda function used for data fetching.
Reduction: The type of the function object defining the reduction operation.
Store: The type of the lambda function used for storing results from individual rows.
Parameters
matrix: The matrix on which the reduction will be performed.
begin: The beginning of the interval [ begin, end ) of rows where the reduction will be performed.
end: The end of the interval [ begin, end ) of rows where the reduction will be performed.
fetch: Lambda function for fetching data. See "For Non-Const Matrices".
reduction: Function object for reduction operation with argument. See "Function objects for reduction operations".
store: Lambda function for storing results. See "Store With Argument (Position Tracking)".
launchConfig: The configuration of the launch - see TNL::Algorithms::Segments::LaunchConfiguration.

◆ reduceRowsWithArgument() [8/8]

template<typename Matrix, typename IndexBegin, typename IndexEnd, typename Fetch, typename Reduction, typename Store, typename FetchValue, typename T = typename std::enable_if_t< std::is_integral_v< IndexBegin > && std::is_integral_v< IndexEnd > >>
void TNL::Matrices::reduceRowsWithArgument ( Matrix & matrix,
IndexBegin begin,
IndexEnd end,
Fetch && fetch,
Reduction && reduction,
Store && store,
const FetchValue & identity,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Performs a parallel reduction within each matrix row over a given range of row indexes, while also returning the position of the element of interest.

See also: Overview of Matrix Reduction Functions

Template Parameters
Matrix: The type of the matrix.
IndexBegin: The type of the index defining the beginning of the interval [ begin, end ) of rows where the reduction will be performed.
IndexEnd: The type of the index defining the end of the interval [ begin, end ) of rows where the reduction will be performed.
Fetch: The type of the lambda function used for data fetching.
Reduction: The type of the reduction operation.
Store: The type of the lambda function used for storing results from individual rows.
FetchValue: The type returned by the Fetch lambda function.
Parameters
matrix: The matrix on which the reduction will be performed.
begin: The beginning of the interval [ begin, end ) of rows where the reduction will be performed.
end: The end of the interval [ begin, end ) of rows where the reduction will be performed.
fetch: Lambda function for fetching data. See "For Non-Const Matrices".
reduction: Lambda function for reduction operation with argument. See "Reduction With Argument (Position Tracking)".
store: Lambda function for storing results. See "Store With Argument (Position Tracking)".
identity: The initial value for the reduction operation.
launchConfig: The configuration of the launch - see TNL::Algorithms::Segments::LaunchConfiguration.
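
The store lambda takes both a localIdx and a columnIdx, plus an emptyRow flag. For a dense matrix the two indexes coincide, but for a sparse matrix they differ. The following plain C++ sketch illustrates this on a minimal CSR layout. It rests on assumptions that are not confirmed by this page: CsrSketch and reduceCsrRowsWithArgumentSketch are hypothetical names, localIdx is taken to be the rank of the element within its row, and for an empty row the store is assumed to receive the identity as the value together with emptyRow set to true. The fetch step is omitted for brevity.

```cpp
#include <cassert>
#include <limits>
#include <vector>

// Hypothetical minimal CSR storage, for illustration only.
struct CsrSketch
{
   std::vector< int > rowPointers;    // size: number of rows + 1
   std::vector< int > columnIndexes;  // column of each stored element
   std::vector< double > values;      // value of each stored element
};

// Sequential reference: reduces the stored elements of each row in
// [ begin, end ) and reports the winner's local index and column index.
template< typename Reduction, typename Store >
void
reduceCsrRowsWithArgumentSketch( const CsrSketch& matrix,
                                 int begin,
                                 int end,
                                 Reduction reduction,
                                 Store store,
                                 double identity )
{
   assert( 0 <= begin && begin <= end && end + 1 <= static_cast< int >( matrix.rowPointers.size() ) );
   for( int rowIdx = begin; rowIdx < end; rowIdx++ ) {
      const int first = matrix.rowPointers[ rowIdx ];
      const int last = matrix.rowPointers[ rowIdx + 1 ];
      double result = identity;
      int localIdx = 0;
      for( int k = first; k < last; k++ )
         reduction( result, matrix.values[ k ], localIdx, k - first );
      const bool emptyRow = ( first == last );
      // Assumption: an empty row has no meaningful column, reported here as -1.
      const int columnIdx = emptyRow ? -1 : matrix.columnIndexes[ first + localIdx ];
      store( rowIdx, localIdx, columnIdx, result, emptyRow );
   }
}
```

This is why the examples on this page guard the column write with `if( ! emptyRow )`: for an empty row there is no element whose position could be stored.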

◆ reduceRowsWithArgumentIf() [1/4]

template<typename Matrix, typename IndexBegin, typename IndexEnd, typename Condition, typename Fetch, typename Reduction, typename Store>
Matrix::IndexType TNL::Matrices::reduceRowsWithArgumentIf ( const Matrix & matrix,
IndexBegin begin,
IndexEnd end,
Condition && condition,
Fetch && fetch,
Reduction && reduction,
Store && store,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Performs a parallel reduction within each matrix row over a given range of row indexes, restricted to rows satisfying a condition, while also returning the position of the element of interest, with automatic identity deduction (const version).

See also: Overview of Matrix Reduction Functions

Template Parameters
Matrix: The type of the matrix.
IndexBegin: The type of the index defining the beginning of the interval [ begin, end ) of rows where the reduction will be performed.
IndexEnd: The type of the index defining the end of the interval [ begin, end ) of rows where the reduction will be performed.
Condition: The type of the lambda function used for the condition check.
Fetch: The type of the lambda function used for data fetching.
Reduction: The type of the function object defining the reduction operation.
Store: The type of the lambda function used for storing results from individual rows.
Parameters
matrix: The matrix on which the reduction will be performed.
begin: The beginning of the interval [ begin, end ) of rows where the reduction will be performed.
end: The end of the interval [ begin, end ) of rows where the reduction will be performed.
condition: Lambda function for row condition checking. See "Condition Check".
fetch: Lambda function for fetching data. See "For Const Matrices".
reduction: Function object for reduction with argument tracking. See "Function objects for reduction operations".
store: Lambda function for storing results with position tracking. See "Store With Argument (Position Tracking)".
launchConfig: The configuration of the launch - see TNL::Algorithms::Segments::LaunchConfiguration.
Returns
The number of processed rows, i.e. rows for which the condition was true.
Example
#include <iostream>
#include <limits>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Matrices/reduce.h>
#include <TNL/Matrices/traverse.h>
#include <TNL/Containers/Vector.h>
#include <TNL/Containers/Array.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
reduceRowsWithArgumentIfExample()
{
   /***
    * Create a 10x10 dense matrix.
    */
   TNL::Matrices::DenseMatrix< double, Device > matrix( 10, 10 );
   matrix.setValue( 0.0 );

   /***
    * Fill the matrix with values.
    */
   auto fillMatrix = [] __cuda_callable__( int rowIdx, int localIdx, int columnIdx, double& value )
   {
      value = ( rowIdx * 13 + columnIdx * 7 ) % 30;
   };
   TNL::Matrices::forAllElements( matrix, fillMatrix );

   std::cout << "Matrix:\n";
   std::cout << matrix << '\n';

   /***
    * Find argmax for rows 3-8, but only for even-indexed rows (range + condition).
    */
   int rangeBegin = 3;
   int rangeEnd = 9;
   int rangeSize = rangeEnd - rangeBegin;
   TNL::Containers::Vector< double, Device > rangeMaxValues( rangeSize );
   TNL::Containers::Vector< int, Device > rangeMaxColumns( rangeSize );
   TNL::Containers::Vector< double, Device > compressedMaxValues( rangeSize );
   TNL::Containers::Vector< int, Device > compressedMaxColumns( rangeSize );
   auto rangeMaxValues_view = rangeMaxValues.getView();
   auto rangeMaxColumns_view = rangeMaxColumns.getView();
   auto compressedMaxValues_view = compressedMaxValues.getView();
   auto compressedMaxColumns_view = compressedMaxColumns.getView();

   auto fetch = [] __cuda_callable__( int rowIdx, int columnIdx, const double& value ) -> double
   {
      return value;
   };

   // Keeps the larger value; on ties the smaller column index wins.
   auto reduction = [] __cuda_callable__( double& a, const double& b, int& aIdx, const int& bIdx )
   {
      if( a < b ) {
         a = b;
         aIdx = bIdx;
      }
      else if( a == b && bIdx < aIdx ) {
         aIdx = bIdx;
      }
   };

   auto evenRowCondition = [] __cuda_callable__( int rowIdx ) -> bool
   {
      return rowIdx % 2 == 0;
   };

   auto storeRange = [ = ] __cuda_callable__(
                        int indexOfRowIdx, int rowIdx, int localIdx, int columnIdx, const double& value, bool emptyRow ) mutable
   {
      rangeMaxValues_view[ rowIdx - rangeBegin ] = value;
      compressedMaxValues_view[ indexOfRowIdx ] = value;
      if( ! emptyRow ) {
         rangeMaxColumns_view[ rowIdx - rangeBegin ] = columnIdx;
         compressedMaxColumns_view[ indexOfRowIdx ] = columnIdx;
      }
   };

   // Mark all entries as "not processed"; rows failing the condition keep -1.
   rangeMaxValues.setValue( -1.0 );
   rangeMaxColumns.setValue( -1 );

   auto evenRowsCount = TNL::Matrices::reduceRowsWithArgumentIf(
      matrix, rangeBegin, rangeEnd, evenRowCondition, fetch, reduction, storeRange, std::numeric_limits< double >::lowest() );
   // You may also use TNL::MaxWithArg{} instead of defining your own reduction lambda.

   std::cout << "Argmax for rows 3-8 (only even indices):\n";
   std::cout << " Max values: " << rangeMaxValues << '\n';
   std::cout << " Column indices: " << rangeMaxColumns << '\n';
   std::cout << " Compressed max values: " << compressedMaxValues.getView( 0, evenRowsCount ) << '\n';
   std::cout << " Compressed column indices: " << compressedMaxColumns.getView( 0, evenRowsCount ) << '\n';

   /***
    * Find argmin for rows 5-9, but only for odd-indexed rows.
    */
   TNL::Containers::Vector< double, Device > oddMinValues( 5 );
   TNL::Containers::Vector< int, Device > oddMinColumns( 5 );
   TNL::Containers::Vector< double, Device > compressedOddMinValues( 5 );
   TNL::Containers::Vector< int, Device > compressedOddMinColumns( 5 );
   auto oddMinValues_view = oddMinValues.getView();
   auto oddMinColumns_view = oddMinColumns.getView();
   auto compressedOddMinValues_view = compressedOddMinValues.getView();
   auto compressedOddMinColumns_view = compressedOddMinColumns.getView();

   auto oddRowCondition = [] __cuda_callable__( int rowIdx ) -> bool
   {
      return rowIdx % 2 == 1;
   };

   // Keeps the smaller value; on ties the smaller column index wins.
   auto reductionMin = [] __cuda_callable__( double& a, const double& b, int& aIdx, const int& bIdx )
   {
      if( a > b ) {
         a = b;
         aIdx = bIdx;
      }
      else if( a == b && bIdx < aIdx ) {
         aIdx = bIdx;
      }
   };

   auto storeOddMin =
      [ = ] __cuda_callable__(
         int indexOfRowIdx, int rowIdx, int localIdx, int columnIdx, const double& value, bool emptyRow ) mutable
   {
      oddMinValues_view[ rowIdx - 5 ] = value;
      compressedOddMinValues_view[ indexOfRowIdx ] = value;
      if( ! emptyRow ) {
         oddMinColumns_view[ rowIdx - 5 ] = columnIdx;
         compressedOddMinColumns_view[ indexOfRowIdx ] = columnIdx;
      }
   };

   oddMinValues.setValue( -1.0 );
   oddMinColumns.setValue( -1 );

   auto oddRowsCount = TNL::Matrices::reduceRowsWithArgumentIf(
      matrix, 5, 10, oddRowCondition, fetch, reductionMin, storeOddMin, std::numeric_limits< double >::max() );
   // You may also use TNL::MinWithArg{} instead of defining your own reduction lambda.

   std::cout << "Argmin for rows 5-9 (only odd indices):\n";
   std::cout << " Min values: " << oddMinValues << '\n';
   std::cout << " Column indices: " << oddMinColumns << '\n';
   std::cout << " Compressed min values: " << compressedOddMinValues.getView( 0, oddRowsCount ) << '\n';
   std::cout << " Compressed column indices: " << compressedOddMinColumns.getView( 0, oddRowsCount ) << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running on host:\n";
   reduceRowsWithArgumentIfExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << '\n' << "Running on CUDA device:\n";
   reduceRowsWithArgumentIfExample< TNL::Devices::Cuda >();
#endif
}
Output
Running on host:
Matrix:
Row: 0 -> 0:0 1:7 2:14 3:21 4:28 5:5 6:12 7:19 8:26 9:3
Row: 1 -> 0:13 1:20 2:27 3:4 4:11 5:18 6:25 7:2 8:9 9:16
Row: 2 -> 0:26 1:3 2:10 3:17 4:24 5:1 6:8 7:15 8:22 9:29
Row: 3 -> 0:9 1:16 2:23 3:0 4:7 5:14 6:21 7:28 8:5 9:12
Row: 4 -> 0:22 1:29 2:6 3:13 4:20 5:27 6:4 7:11 8:18 9:25
Row: 5 -> 0:5 1:12 2:19 3:26 4:3 5:10 6:17 7:24 8:1 9:8
Row: 6 -> 0:18 1:25 2:2 3:9 4:16 5:23 6:0 7:7 8:14 9:21
Row: 7 -> 0:1 1:8 2:15 3:22 4:29 5:6 6:13 7:20 8:27 9:4
Row: 8 -> 0:14 1:21 2:28 3:5 4:12 5:19 6:26 7:3 8:10 9:17
Row: 9 -> 0:27 1:4 2:11 3:18 4:25 5:2 6:9 7:16 8:23 9:0
Argmax for rows 3-8 (only even indices):
Max values: [ -1, 29, -1, 25, -1, 28 ]
Column indices: [ -1, 1, -1, 1, -1, 2 ]
Compressed max values: [ 29, 25, 28 ]
Compressed column indices: [ 1, 1, 2 ]
Argmin for rows 5-9 (only odd indices):
Min values: [ 1, -1, 1, -1, 0 ]
Column indices: [ 8, -1, 0, -1, 9 ]
Compressed min values: [ 1, 1, 0 ]
Compressed column indices: [ 8, 0, 9 ]
Running on CUDA device:
Matrix:
Row: 0 -> 0:0 1:7 2:14 3:21 4:28 5:5 6:12 7:19 8:26 9:3
Row: 1 -> 0:13 1:20 2:27 3:4 4:11 5:18 6:25 7:2 8:9 9:16
Row: 2 -> 0:26 1:3 2:10 3:17 4:24 5:1 6:8 7:15 8:22 9:29
Row: 3 -> 0:9 1:16 2:23 3:0 4:7 5:14 6:21 7:28 8:5 9:12
Row: 4 -> 0:22 1:29 2:6 3:13 4:20 5:27 6:4 7:11 8:18 9:25
Row: 5 -> 0:5 1:12 2:19 3:26 4:3 5:10 6:17 7:24 8:1 9:8
Row: 6 -> 0:18 1:25 2:2 3:9 4:16 5:23 6:0 7:7 8:14 9:21
Row: 7 -> 0:1 1:8 2:15 3:22 4:29 5:6 6:13 7:20 8:27 9:4
Row: 8 -> 0:14 1:21 2:28 3:5 4:12 5:19 6:26 7:3 8:10 9:17
Row: 9 -> 0:27 1:4 2:11 3:18 4:25 5:2 6:9 7:16 8:23 9:0
Argmax for rows 3-8 (only even indices):
Max values: [ -1, 29, -1, 25, -1, 28 ]
Column indices: [ -1, 1, -1, 1, -1, 2 ]
Compressed max values: [ 29, 25, 28 ]
Compressed column indices: [ 1, 1, 2 ]
Argmin for rows 5-9 (only odd indices):
Min values: [ 1, -1, 1, -1, 0 ]
Column indices: [ 8, -1, 0, -1, 9 ]
Compressed min values: [ 1, 1, 0 ]
Compressed column indices: [ 8, 0, 9 ]
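
The relation between the condition, the indexOfRowIdx passed to store and the returned count can be summarized by a sequential sketch in plain C++. It is an illustration under assumptions, not TNL's implementation: DenseSketch, makeConditionExampleMatrix and reduceRowsWithArgumentIfSketch are hypothetical names, and indexOfRowIdx is modeled as the rank of the row among the rows that satisfy the condition, which is consistent with the compressed output shown above.

```cpp
#include <cassert>
#include <limits>
#include <vector>

// Hypothetical stand-in for a dense matrix: a vector of rows.
using DenseSketch = std::vector< std::vector< double > >;

// Builds a 10x10 matrix with the fill formula of the example above.
DenseSketch
makeConditionExampleMatrix()
{
   DenseSketch matrix( 10, std::vector< double >( 10 ) );
   for( int rowIdx = 0; rowIdx < 10; rowIdx++ )
      for( int columnIdx = 0; columnIdx < 10; columnIdx++ )
         matrix[ rowIdx ][ columnIdx ] = ( rowIdx * 13 + columnIdx * 7 ) % 30;
   return matrix;
}

// Sequential reference: rows in [ begin, end ) failing `condition` are
// skipped entirely; the others are reduced and numbered consecutively.
// Returns the number of rows that were processed.
template< typename Condition, typename Fetch, typename Reduction, typename Store >
int
reduceRowsWithArgumentIfSketch( const DenseSketch& matrix,
                                int begin,
                                int end,
                                Condition condition,
                                Fetch fetch,
                                Reduction reduction,
                                Store store,
                                double identity )
{
   assert( 0 <= begin && begin <= end && end <= static_cast< int >( matrix.size() ) );
   int processedRows = 0;
   for( int rowIdx = begin; rowIdx < end; rowIdx++ ) {
      if( ! condition( rowIdx ) )
         continue;
      const auto& row = matrix[ rowIdx ];
      double result = identity;
      int argument = 0;
      for( int columnIdx = 0; columnIdx < static_cast< int >( row.size() ); columnIdx++ )
         reduction( result, fetch( rowIdx, columnIdx, row[ columnIdx ] ), argument, columnIdx );
      // In a dense row the local index and the column index coincide.
      store( processedRows, rowIdx, argument, argument, result, row.empty() );
      processedRows++;
   }
   return processedRows;
}
```

The returned count is what the example uses to take a view of only the filled prefix of the compressed output vectors.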

◆ reduceRowsWithArgumentIf() [2/4]

template<typename Matrix, typename IndexBegin, typename IndexEnd, typename Condition, typename Fetch, typename Reduction, typename Store, typename FetchValue = decltype( std::declval< Fetch >()( 0, 0, std::declval< typename Matrix::RealType >() ) )>
Matrix::IndexType TNL::Matrices::reduceRowsWithArgumentIf ( const Matrix & matrix,
IndexBegin begin,
IndexEnd end,
Condition && condition,
Fetch && fetch,
Reduction && reduction,
Store && store,
const FetchValue & identity,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Performs a parallel reduction within each matrix row over a given range of row indexes, restricted to rows satisfying a condition, while also returning the position of the element of interest (const version).

See also: Overview of Matrix Reduction Functions

Template Parameters
Matrix: The type of the matrix.
IndexBegin: The type of the index defining the beginning of the interval [ begin, end ) of rows where the reduction will be performed.
IndexEnd: The type of the index defining the end of the interval [ begin, end ) of rows where the reduction will be performed.
Condition: The type of the lambda function used for the condition check.
Fetch: The type of the lambda function used for data fetching.
Reduction: The type of the function object defining the reduction operation.
Store: The type of the lambda function used for storing results from individual rows.
FetchValue: The type returned by the Fetch lambda function.
Parameters
matrix: The matrix on which the reduction will be performed.
begin: The beginning of the interval [ begin, end ) of rows where the reduction will be performed.
end: The end of the interval [ begin, end ) of rows where the reduction will be performed.
condition: Lambda function for row condition checking. See "Condition Check".
fetch: Lambda function for fetching data. See "For Const Matrices".
reduction: Function object for reduction with argument tracking. See "Function objects for reduction operations".
store: Lambda function for storing results with position tracking. See "Store With Argument (Position Tracking)".
identity: The identity element for the reduction operation.
launchConfig: The configuration of the launch - see TNL::Algorithms::Segments::LaunchConfiguration.
Returns
The number of processed rows, i.e. rows for which the condition was true.
Example
#include <iostream>
#include <limits>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Matrices/reduce.h>
#include <TNL/Matrices/traverse.h>
#include <TNL/Containers/Vector.h>
#include <TNL/Containers/Array.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
reduceRowsWithArgumentIfExample()
{
   /***
    * Create a 10x10 dense matrix.
    */
   TNL::Matrices::DenseMatrix< double, Device > matrix( 10, 10 );
   matrix.setValue( 0.0 );

   /***
    * Fill the matrix with values.
    */
   auto fillMatrix = [] __cuda_callable__( int rowIdx, int localIdx, int columnIdx, double& value )
   {
      value = ( rowIdx * 13 + columnIdx * 7 ) % 30;
   };
   TNL::Matrices::forAllElements( matrix, fillMatrix );

   std::cout << "Matrix:\n";
   std::cout << matrix << '\n';

   /***
    * Find argmax for rows 3-8, but only for even-indexed rows (range + condition).
    */
   int rangeBegin = 3;
   int rangeEnd = 9;
   int rangeSize = rangeEnd - rangeBegin;
   TNL::Containers::Vector< double, Device > rangeMaxValues( rangeSize );
   TNL::Containers::Vector< int, Device > rangeMaxColumns( rangeSize );
   TNL::Containers::Vector< double, Device > compressedMaxValues( rangeSize );
   TNL::Containers::Vector< int, Device > compressedMaxColumns( rangeSize );
   auto rangeMaxValues_view = rangeMaxValues.getView();
   auto rangeMaxColumns_view = rangeMaxColumns.getView();
   auto compressedMaxValues_view = compressedMaxValues.getView();
   auto compressedMaxColumns_view = compressedMaxColumns.getView();

   auto fetch = [] __cuda_callable__( int rowIdx, int columnIdx, const double& value ) -> double
   {
      return value;
   };

   // Keeps the larger value; on ties keeps the smaller column index.
   auto reduction = [] __cuda_callable__( double& a, const double& b, int& aIdx, const int& bIdx )
   {
      if( a < b ) {
         a = b;
         aIdx = bIdx;
      }
      else if( a == b && bIdx < aIdx ) {
         aIdx = bIdx;
      }
   };

   auto evenRowCondition = [] __cuda_callable__( int rowIdx ) -> bool
   {
      return rowIdx % 2 == 0;
   };

   // Writes each result twice: at its position within the range and at its
   // position among the rows that passed the condition (compressed output).
   auto storeRange = [ = ] __cuda_callable__(
                        int indexOfRowIdx, int rowIdx, int localIdx, int columnIdx, const double& value, bool emptyRow ) mutable
   {
      rangeMaxValues_view[ rowIdx - rangeBegin ] = value;
      compressedMaxValues_view[ indexOfRowIdx ] = value;
      if( ! emptyRow ) {
         rangeMaxColumns_view[ rowIdx - rangeBegin ] = columnIdx;
         compressedMaxColumns_view[ indexOfRowIdx ] = columnIdx;
      }
   };

   rangeMaxValues.setValue( -1.0 );
   rangeMaxColumns.setValue( -1 );

   auto evenRowsCount = TNL::Matrices::reduceRowsWithArgumentIf(
      matrix, rangeBegin, rangeEnd, evenRowCondition, fetch, reduction, storeRange, std::numeric_limits< double >::lowest() );
   // You may also use TNL::MaxWithArg{} instead of defining your own reduction lambda.

   std::cout << "Argmax for rows 3-8 (only even indices):\n";
   std::cout << " Max values: " << rangeMaxValues << '\n';
   std::cout << " Column indices: " << rangeMaxColumns << '\n';
   std::cout << " Compressed max values: " << compressedMaxValues.getView( 0, evenRowsCount ) << '\n';
   std::cout << " Compressed column indices: " << compressedMaxColumns.getView( 0, evenRowsCount ) << '\n';

   /***
    * Find argmin for rows 5-9, but only for odd-indexed rows.
    */
   TNL::Containers::Vector< double, Device > oddMinValues( 5 );
   TNL::Containers::Vector< int, Device > oddMinColumns( 5 );
   TNL::Containers::Vector< double, Device > compressedOddMinValues( 5 );
   TNL::Containers::Vector< int, Device > compressedOddMinColumns( 5 );
   auto oddMinValues_view = oddMinValues.getView();
   auto oddMinColumns_view = oddMinColumns.getView();
   auto compressedOddMinValues_view = compressedOddMinValues.getView();
   auto compressedOddMinColumns_view = compressedOddMinColumns.getView();

   auto oddRowCondition = [] __cuda_callable__( int rowIdx ) -> bool
   {
      return rowIdx % 2 == 1;
   };

   // Keeps the smaller value; on ties keeps the smaller column index.
   auto reductionMin = [] __cuda_callable__( double& a, const double& b, int& aIdx, const int& bIdx )
   {
      if( a > b ) {
         a = b;
         aIdx = bIdx;
      }
      else if( a == b && bIdx < aIdx ) {
         aIdx = bIdx;
      }
   };

   auto storeOddMin =
      [ = ] __cuda_callable__(
         int indexOfRowIdx, int rowIdx, int localIdx, int columnIdx, const double& value, bool emptyRow ) mutable
   {
      oddMinValues_view[ rowIdx - 5 ] = value;
      compressedOddMinValues_view[ indexOfRowIdx ] = value;
      if( ! emptyRow ) {
         oddMinColumns_view[ rowIdx - 5 ] = columnIdx;
         compressedOddMinColumns_view[ indexOfRowIdx ] = columnIdx;
      }
   };

   oddMinValues.setValue( -1.0 );
   oddMinColumns.setValue( -1 );

   auto oddRowsCount = TNL::Matrices::reduceRowsWithArgumentIf(
      matrix, 5, 10, oddRowCondition, fetch, reductionMin, storeOddMin, std::numeric_limits< double >::max() );
   // You may also use TNL::MinWithArg{} instead of defining your own reduction lambda.

   std::cout << "Argmin for rows 5-9 (only odd indices):\n";
   std::cout << " Min values: " << oddMinValues << '\n';
   std::cout << " Column indices: " << oddMinColumns << '\n';
   std::cout << " Compressed min values: " << compressedOddMinValues.getView( 0, oddRowsCount ) << '\n';
   std::cout << " Compressed column indices: " << compressedOddMinColumns.getView( 0, oddRowsCount ) << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running on host:\n";
   reduceRowsWithArgumentIfExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << '\n' << "Running on CUDA device:\n";
   reduceRowsWithArgumentIfExample< TNL::Devices::Cuda >();
#endif
}
Output
Running on host:
Matrix:
Row: 0 -> 0:0 1:7 2:14 3:21 4:28 5:5 6:12 7:19 8:26 9:3
Row: 1 -> 0:13 1:20 2:27 3:4 4:11 5:18 6:25 7:2 8:9 9:16
Row: 2 -> 0:26 1:3 2:10 3:17 4:24 5:1 6:8 7:15 8:22 9:29
Row: 3 -> 0:9 1:16 2:23 3:0 4:7 5:14 6:21 7:28 8:5 9:12
Row: 4 -> 0:22 1:29 2:6 3:13 4:20 5:27 6:4 7:11 8:18 9:25
Row: 5 -> 0:5 1:12 2:19 3:26 4:3 5:10 6:17 7:24 8:1 9:8
Row: 6 -> 0:18 1:25 2:2 3:9 4:16 5:23 6:0 7:7 8:14 9:21
Row: 7 -> 0:1 1:8 2:15 3:22 4:29 5:6 6:13 7:20 8:27 9:4
Row: 8 -> 0:14 1:21 2:28 3:5 4:12 5:19 6:26 7:3 8:10 9:17
Row: 9 -> 0:27 1:4 2:11 3:18 4:25 5:2 6:9 7:16 8:23 9:0
Argmax for rows 3-8 (only even indices):
Max values: [ -1, 29, -1, 25, -1, 28 ]
Column indices: [ -1, 1, -1, 1, -1, 2 ]
Compressed max values: [ 29, 25, 28 ]
Compressed column indices: [ 1, 1, 2 ]
Argmin for rows 5-9 (only odd indices):
Min values: [ 1, -1, 1, -1, 0 ]
Column indices: [ 8, -1, 0, -1, 9 ]
Compressed min values: [ 1, 1, 0 ]
Compressed column indices: [ 8, 0, 9 ]
Running on CUDA device:
Matrix:
Row: 0 -> 0:0 1:7 2:14 3:21 4:28 5:5 6:12 7:19 8:26 9:3
Row: 1 -> 0:13 1:20 2:27 3:4 4:11 5:18 6:25 7:2 8:9 9:16
Row: 2 -> 0:26 1:3 2:10 3:17 4:24 5:1 6:8 7:15 8:22 9:29
Row: 3 -> 0:9 1:16 2:23 3:0 4:7 5:14 6:21 7:28 8:5 9:12
Row: 4 -> 0:22 1:29 2:6 3:13 4:20 5:27 6:4 7:11 8:18 9:25
Row: 5 -> 0:5 1:12 2:19 3:26 4:3 5:10 6:17 7:24 8:1 9:8
Row: 6 -> 0:18 1:25 2:2 3:9 4:16 5:23 6:0 7:7 8:14 9:21
Row: 7 -> 0:1 1:8 2:15 3:22 4:29 5:6 6:13 7:20 8:27 9:4
Row: 8 -> 0:14 1:21 2:28 3:5 4:12 5:19 6:26 7:3 8:10 9:17
Row: 9 -> 0:27 1:4 2:11 3:18 4:25 5:2 6:9 7:16 8:23 9:0
Argmax for rows 3-8 (only even indices):
Max values: [ -1, 29, -1, 25, -1, 28 ]
Column indices: [ -1, 1, -1, 1, -1, 2 ]
Compressed max values: [ 29, 25, 28 ]
Compressed column indices: [ 1, 1, 2 ]
Argmin for rows 5-9 (only odd indices):
Min values: [ 1, -1, 1, -1, 0 ]
Column indices: [ 8, -1, 0, -1, 9 ]
Compressed min values: [ 1, 1, 0 ]
Compressed column indices: [ 8, 0, 9 ]
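
To make the semantics of the call concrete, here is a minimal sequential sketch in plain C++ (no TNL) of what the argmax part of the example computes. The names `reduceRowsWithArgumentIfSketch` and `exampleArgmaxOfEvenRows` and the flat row-major `std::vector` storage are assumptions made for illustration only; the real function processes rows in parallel and works with TNL matrix types.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical sequential stand-in for the parallel TNL function. The matrix is
// assumed to be dense and stored row by row in a flat array.
template< typename Condition, typename Fetch, typename Reduction, typename Store >
int
reduceRowsWithArgumentIfSketch( const std::vector< double >& matrix, int columns, int begin, int end,
                                Condition condition, Fetch fetch, Reduction reduction, Store store,
                                double identity )
{
   assert( 0 <= begin && begin <= end && columns >= 0 );
   assert( matrix.size() >= static_cast< std::size_t >( end ) * columns );
   int processedRows = 0;
   for( int rowIdx = begin; rowIdx < end; rowIdx++ ) {
      if( ! condition( rowIdx ) )
         continue;  // skipped rows are neither reduced nor stored
      double result = identity;
      int argument = -1;  // -1 means no element has been selected yet
      for( int columnIdx = 0; columnIdx < columns; columnIdx++ ) {
         const double value = fetch( rowIdx, columnIdx, matrix[ rowIdx * columns + columnIdx ] );
         reduction( result, value, argument, columnIdx );
      }
      // For a dense row the local index and the column index coincide.
      store( processedRows, rowIdx, argument, argument, result, columns == 0 );
      processedRows++;
   }
   return processedRows;
}

// Argmax over the even rows in [3, 9) of the 10x10 matrix used in the example.
int
exampleArgmaxOfEvenRows( std::vector< double >& maxValues, std::vector< int >& maxColumns )
{
   const int size = 10, begin = 3, end = 9;
   std::vector< double > matrix( size * size );
   for( int i = 0; i < size; i++ )
      for( int j = 0; j < size; j++ )
         matrix[ i * size + j ] = ( i * 13 + j * 7 ) % 30;

   maxValues.assign( end - begin, 0.0 );
   maxColumns.assign( end - begin, -1 );
   auto condition = []( int rowIdx ) { return rowIdx % 2 == 0; };
   auto fetch = []( int, int, const double& value ) { return value; };
   auto reduction = []( double& a, const double& b, int& aIdx, const int& bIdx ) {
      if( a < b ) {
         a = b;
         aIdx = bIdx;
      }
      else if( a == b && bIdx < aIdx )
         aIdx = bIdx;
   };
   // Only the compressed output is kept here, indexed by indexOfRowIdx.
   auto store = [ & ]( int indexOfRowIdx, int, int, int columnIdx, const double& value, bool emptyRow ) {
      maxValues[ indexOfRowIdx ] = value;
      if( ! emptyRow )
         maxColumns[ indexOfRowIdx ] = columnIdx;
   };
   const int count = reduceRowsWithArgumentIfSketch(
      matrix, size, begin, end, condition, fetch, reduction, store, std::numeric_limits< double >::lowest() );
   maxValues.resize( count );
   maxColumns.resize( count );
   return count;
}
```

Calling `exampleArgmaxOfEvenRows` yields three processed rows (4, 6 and 8) with the same compressed values and column indices as in the output above.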

◆ reduceRowsWithArgumentIf() [3/4]

template<typename Matrix, typename IndexBegin, typename IndexEnd, typename Condition, typename Fetch, typename Reduction, typename Store>
Matrix::IndexType TNL::Matrices::reduceRowsWithArgumentIf ( Matrix & matrix,
IndexBegin begin,
IndexEnd end,
Condition && condition,
Fetch && fetch,
Reduction && reduction,
Store && store,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Performs a parallel reduction within each matrix row over a given range of row indexes, restricted to rows that satisfy a condition, while also tracking the position of the element of interest. The identity element is deduced automatically from the reduction function object.

See also: Overview of Matrix Reduction Functions

Template Parameters
Matrix      The type of the matrix.
IndexBegin  The type of the index defining the beginning of the interval [ begin, end ) of rows where the reduction will be performed.
IndexEnd    The type of the index defining the end of the interval [ begin, end ) of rows where the reduction will be performed.
Condition   The type of the lambda function used for the condition check.
Fetch       The type of the lambda function used for data fetching.
Reduction   The type of the function object defining the reduction operation.
Store       The type of the lambda function used for storing results from individual rows.
Parameters
matrix        The matrix on which the reduction will be performed.
begin         The beginning of the interval [ begin, end ) of rows where the reduction will be performed.
end           The end of the interval [ begin, end ) of rows where the reduction will be performed.
condition     Lambda function for row condition checking. See Condition Check.
fetch         Lambda function for fetching data. See For Non-Const Matrices.
reduction     Function object for reduction with argument tracking. See Function objects for reduction operations.
store         Lambda function for storing results with position tracking. See Store With Argument (Position Tracking).
launchConfig  The configuration of the launch - see TNL::Algorithms::Segments::LaunchConfiguration.
Returns
The number of processed rows, i.e. rows for which the condition was true.
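
This overload takes no `identity` argument, so the identity has to come from the reduction function object itself. The following plain C++ sketch shows one way such a deduction can work. `MaxWithArgSketch`, its `getIdentity` member and `reduceRowWithArgument` are hypothetical names modelled on the idea behind `TNL::MaxWithArg`; they are not claimed to match the library's actual interface.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical function object: it bundles the reduction step with its
// identity element, so callers do not have to pass the identity separately.
struct MaxWithArgSketch
{
   template< typename Value >
   static constexpr Value
   getIdentity()
   {
      return std::numeric_limits< Value >::lowest();
   }

   // Keeps the larger value; on ties keeps the smaller index.
   template< typename Value, typename Index >
   void
   operator()( Value& a, const Value& b, Index& aIdx, const Index& bIdx ) const
   {
      if( a < b ) {
         a = b;
         aIdx = bIdx;
      }
      else if( a == b && bIdx < aIdx )
         aIdx = bIdx;
   }
};

// Reduces a single row. The starting value is obtained from the Reduction
// type instead of from an extra function parameter.
template< typename Value, typename Reduction >
Value
reduceRowWithArgument( const std::vector< Value >& row, Reduction reduction, int& argument )
{
   assert( row.size() <= static_cast< std::size_t >( std::numeric_limits< int >::max() ) );
   Value result = Reduction::template getIdentity< Value >();
   argument = -1;  // -1 means no element has been selected yet
   for( int i = 0; i < static_cast< int >( row.size() ); i++ )
      reduction( result, row[ i ], argument, i );
   return result;
}
```

For row 4 of the example matrix, `reduceRowWithArgument( row, MaxWithArgSketch{}, argument )` gives the value 29 at position 1. A user-defined lambda has no such member, which is why the overloads with an explicit `identity` parameter exist.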
Example
The example program and its output are identical to those listed for reduceRowsWithArgumentIf() [2/4] above.

◆ reduceRowsWithArgumentIf() [4/4]

template<typename Matrix, typename IndexBegin, typename IndexEnd, typename Condition, typename Fetch, typename Reduction, typename Store, typename FetchValue = decltype( std::declval< Fetch >()( 0, 0, std::declval< typename Matrix::RealType >() ) )>
Matrix::IndexType TNL::Matrices::reduceRowsWithArgumentIf ( Matrix & matrix,
IndexBegin begin,
IndexEnd end,
Condition && condition,
Fetch && fetch,
Reduction && reduction,
Store && store,
const FetchValue & identity,
Algorithms::Segments::LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Performs a parallel reduction within each matrix row over a given range of row indexes, restricted to rows that satisfy a condition, while also tracking the position of the element of interest.

See also: Overview of Matrix Reduction Functions

Template Parameters
Matrix      The type of the matrix.
IndexBegin  The type of the index defining the beginning of the interval [ begin, end ) of rows where the reduction will be performed.
IndexEnd    The type of the index defining the end of the interval [ begin, end ) of rows where the reduction will be performed.
Condition   The type of the lambda function used for the condition check.
Fetch       The type of the lambda function used for data fetching.
Reduction   The type of the function object defining the reduction operation.
Store       The type of the lambda function used for storing results from individual rows.
FetchValue  The type returned by the Fetch lambda function.
Parameters
matrix        The matrix on which the reduction will be performed.
begin         The beginning of the interval [ begin, end ) of rows where the reduction will be performed.
end           The end of the interval [ begin, end ) of rows where the reduction will be performed.
condition     Lambda function for row condition checking. See Condition Check.
fetch         Lambda function for fetching data. See For Non-Const Matrices.
reduction     Function object for reduction with argument tracking. See Function objects for reduction operations.
store         Lambda function for storing results with position tracking. See Store With Argument (Position Tracking).
identity      The identity element for the reduction operation.
launchConfig  The configuration of the launch - see TNL::Algorithms::Segments::LaunchConfiguration.
Returns
The number of processed rows, i.e. rows for which the condition was true.
Example
The example program and its output are identical to those listed for reduceRowsWithArgumentIf() [2/4] above.

◆ wrapCSRMatrix()

template<typename Device, typename Real, typename Index>
SparseMatrixView< Real, Device, Index, GeneralMatrix, Algorithms::Segments::CSRView > TNL::Matrices::wrapCSRMatrix ( const Index & rows,
const Index & columns,
Index * rowPointers,
Real * values,
Index * columnIndexes )
nodiscard

Function for wrapping of arrays defining CSR format into a sparse matrix view.

Template Parameters
Device  is a device on which the arrays are allocated.
Real    is a type of matrix elements values.
Index   is a type for matrix elements indexing.
Parameters
rows           is a number of matrix rows.
columns        is a number of matrix columns.
rowPointers    is an array holding row pointers of the CSR format (ROW_INDEX in the linked CSR description).
values         is an array with values of matrix elements (V in the linked CSR description).
columnIndexes  is an array with column indexes of matrix elements (COL_INDEX in the linked CSR description).
Returns
instance of SparseMatrixView with CSR format.

The size of the array rowPointers must be equal to the number of rows + 1. The last element of this array equals the total number of nonzero matrix elements. The sizes of the arrays values and columnIndexes must both be equal to this number.
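
The size constraints above can be checked before the arrays are wrapped. The following is a minimal plain C++ sketch; `isValidCSR` and `getElementCSR` are hypothetical helpers, not part of TNL. The checks that the first row pointer is 0 and that the row pointers are non-decreasing follow the usual CSR convention and are assumptions here, since the text above does not state them.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: verifies the CSR size constraints for the three arrays.
bool
isValidCSR( int rows, int columns,
            const std::vector< int >& rowPointers,
            const std::vector< double >& values,
            const std::vector< int >& columnIndexes )
{
   // rowPointers must hold rows + 1 entries.
   if( rows < 0 || static_cast< int >( rowPointers.size() ) != rows + 1 )
      return false;
   // Assumed convention: the first row starts at offset 0.
   if( rowPointers.front() != 0 )
      return false;
   // The last row pointer is the number of stored elements.
   const int nonzeros = rowPointers.back();
   if( static_cast< int >( values.size() ) != nonzeros
       || static_cast< int >( columnIndexes.size() ) != nonzeros )
      return false;
   for( int i = 0; i < rows; i++ )
      if( rowPointers[ i ] > rowPointers[ i + 1 ] )
         return false;
   for( int column : columnIndexes )
      if( column < 0 || column >= columns )
         return false;
   return true;
}

// Reads element (row, column); elements that are not stored are zero.
double
getElementCSR( const std::vector< int >& rowPointers,
               const std::vector< double >& values,
               const std::vector< int >& columnIndexes,
               int row, int column )
{
   assert( row >= 0 && row + 1 < static_cast< int >( rowPointers.size() ) );
   for( int k = rowPointers[ row ]; k < rowPointers[ row + 1 ]; k++ )
      if( columnIndexes[ k ] == column )
         return values[ k ];
   return 0.0;
}
```

For a 4x4 matrix with `rowPointers = { 0, 2, 3, 4, 6 }`, `values = { 1, 2, 6, 9, 15, 16 }` and `columnIndexes = { 0, 1, 1, 0, 2, 3 }`, the check passes and element (3, 2) reads as 15.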

Example
#include <iostream>
#include <TNL/Containers/Vector.h>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Matrices/MatrixWrapping.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
wrapMatrixView()
{
   /***
    * Encode the following matrix to CSR format...
    *
    * /  1  2  0  0 \.
    * |  0  6  0  0 |
    * |  9  0  0  0 |
    * \  0  0 15 16 /
    */
   const int rows = 4;
   const int columns = 4;
   TNL::Containers::Vector< double, Device > valuesVector{ 1, 2, 6, 9, 15, 16 };
   TNL::Containers::Vector< int, Device > columnIndexesVector{ 0, 1, 1, 0, 2, 3 };
   TNL::Containers::Vector< int, Device > rowPointersVector{ 0, 2, 3, 4, 6 };
   double* values = valuesVector.getData();
   int* columnIndexes = columnIndexesVector.getData();
   int* rowPointers = rowPointersVector.getData();

   /***
    * Wrap the arrays `rowPointers`, `values` and `columnIndexes` to sparse matrix view
    */
   auto matrix = TNL::Matrices::wrapCSRMatrix< Device >( rows, columns, rowPointers, values, columnIndexes );
   std::cout << "Matrix reads as:\n" << matrix << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Wrapping matrix view on host:\n";
   wrapMatrixView< TNL::Devices::Host >();
#ifdef __CUDACC__
   std::cout << "Wrapping matrix view on CUDA device:\n";
   wrapMatrixView< TNL::Devices::Cuda >();
#endif
}
Output
Wrapping matrix view on host:
Matrix reads as:
Row: 0 -> 0:1 1:2
Row: 1 -> 1:6
Row: 2 -> 0:9
Row: 3 -> 2:15 3:16
Wrapping matrix view on CUDA device:
Matrix reads as:
Row: 0 -> 0:1 1:2
Row: 1 -> 1:6
Row: 2 -> 0:9
Row: 3 -> 2:15 3:16

◆ wrapDenseMatrix()

template<typename Device, typename Real, typename Index, ElementsOrganization Organization = Algorithms::Segments::DefaultElementsOrganization< Device >::getOrganization()>
DenseMatrixView< Real, Device, Index, Organization > TNL::Matrices::wrapDenseMatrix ( const Index & rows,
const Index & columns,
Real * values )
nodiscard

Function for wrapping an array of values into a dense matrix view.

Template Parameters
Device        is a device on which the array is allocated.
Real          is a type of array elements.
Index         is a type for indexing of matrix elements.
Organization  is matrix elements organization - see TNL::Algorithms::Segments::ElementsOrganization.
Parameters
rows     is a number of matrix rows.
columns  is a number of matrix columns.
values   is the array with matrix elements values.
Returns
instance of DenseMatrixView wrapping the array.

The array size must be equal to the product of rows and columns. The dense matrix view does not deallocate the input array at the end of its lifetime.

Example
#include <iostream>
#include <TNL/Containers/Vector.h>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Matrices/MatrixWrapping.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
wrapMatrixView()
{
   const int rows = 3;
   const int columns = 4;
   TNL::Containers::Vector< double, Device > valuesVector{
      // clang-format off
      1,  2,  3,  4,
      5,  6,  7,  8,
      9, 10, 11, 12
      // clang-format on
   };
   double* values = valuesVector.getData();

   /***
    * Wrap the array `values` to dense matrix view
    */
   auto matrix = TNL::Matrices::wrapDenseMatrix< Device >( rows, columns, values );
   std::cout << "Matrix reads as:\n" << matrix << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Wrapping matrix view on host:\n";
   wrapMatrixView< TNL::Devices::Host >();
#ifdef __CUDACC__
   std::cout << "Wrapping matrix view on CUDA device:\n";
   wrapMatrixView< TNL::Devices::Cuda >();
#endif
}
Output
Wrapping matrix view on host:
Matrix reads as:
Row: 0 -> 0:1 1:2 2:3 3:4
Row: 1 -> 0:5 1:6 2:7 3:8
Row: 2 -> 0:9 1:10 2:11 3:12
Wrapping matrix view on CUDA device:
Matrix reads as:
Row: 0 -> 0:1 1:4 2:7 3:10
Row: 1 -> 0:2 1:5 2:8 3:11
Row: 2 -> 0:3 1:6 2:9 3:12

◆ wrapEllpackMatrix()

template<typename Device, ElementsOrganization Organization, typename Real, typename Index, int Alignment = 1>
auto TNL::Matrices::wrapEllpackMatrix ( const Index rows,
const Index columns,
const Index nonzerosPerRow,
Real * values,
Index * columnIndexes ) -> decltype(EllpackMatrixWrapper< Device, Organization, Real, Index, Alignment >::wrap( rows, columns, nonzerosPerRow, values, columnIndexes))
nodiscard

Function for wrapping of arrays defining Ellpack format into a sparse matrix view.


Template Parameters
Device is a device on which the arrays are allocated.
Organization is matrix elements organization - see TNL::Algorithms::Segments::ElementsOrganization.
Real is a type of matrix elements values.
Index is a type for matrix elements indexing.
Alignment defines alignment of data. The number of matrix rows is rounded to a multiple of this number. It is useful mainly for GPUs.
Parameters
rows is a number of matrix rows.
columns is a number of matrix columns.
nonzerosPerRow is the number of elements stored for each row; rows with fewer nonzero elements are filled with padding zeros.
values is an array with values of matrix elements.
columnIndexes is an array with column indexes of matrix elements.
Returns
instance of SparseMatrixView with Ellpack format.

The sizes of the arrays values and columnIndexes must be equal to rows * nonzerosPerRow. Use -1 (see paddingIndex) as the column index of padding zeros.

Example
#include <iostream>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Matrices/MatrixWrapping.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>
template< typename Device >
void
wrapMatrixView()
{
/***
* Encode the following matrix to Ellpack format...
*
* / 1 2 0 0 \.
* | 0 6 0 0 |
* | 9 0 0 0 |
* \ 0 0 15 16 /
*/
const int rows = 4;
const int columns = 4;
TNL::Containers::Vector< double, Device > valuesVector{ 1, 2, 6, 0, 9, 0, 15, 16 };
TNL::Containers::Vector< int, Device > columnIndexesVector{ 0, 1, 1, -1, 0, -1, 2, 3 };
double* values = valuesVector.getData();
int* columnIndexes = columnIndexesVector.getData();
/***
* Wrap the arrays `values` and `columnIndexes` to sparse matrix view
*/
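auto matrix = TNL::Matrices::wrapEllpackMatrix< Device, TNL::Algorithms::Segments::RowMajorOrder >(
rows, columns, 2, values, columnIndexes );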
std::cout << "Matrix reads as:\n" << matrix << '\n';
}
int
main( int argc, char* argv[] )
{
std::cout << "Wrapping matrix view on host:\n";
wrapMatrixView< TNL::Devices::Host >();
#ifdef __CUDACC__
std::cout << "Wrapping matrix view on CUDA device:\n";
wrapMatrixView< TNL::Devices::Cuda >();
#endif
}
Definition MatrixWrapping.h:135
Output
Wrapping matrix view on host:
Matrix reads as:
Row: 0 -> 0:1 1:2
Row: 1 -> 1:6
Row: 2 -> 0:9
Row: 3 -> 2:15 3:16
Wrapping matrix view on CUDA device:
Matrix reads as:
Row: 0 -> 0:1 1:2
Row: 1 -> 1:6
Row: 2 -> 0:9
Row: 3 -> 2:15 3:16

Variable Documentation

◆ paddingIndex

template<typename Index>
Index TNL::Matrices::paddingIndex = static_cast< Index >( -1 )
constexpr

Padding index value.

Padding index is used for column indexes of padding zeros. Padding zeros are used in some sparse matrix formats for better data alignment in memory.