Template Numerical Library: TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal

Implementation of sparse matrix view. More...

#include <TNL/Matrices/SparseMatrixBase.h>

Inheritance diagram for TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >:

[legend]

Collaboration diagram for TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >:

[legend]

Public Types
using	ColumnIndexesViewType

using	ComputeRealType = ComputeReal

using	ConstRowView = typename RowView::ConstRowView
	Type for accessing constant matrix rows.

using	DefaultSegmentsReductionKernel = typename Algorithms::SegmentsReductionKernels::DefaultKernel< SegmentsView >::type
	Type of the kernel used for parallel reductions on segments.

using	DeviceType = Device
	The device where the matrix is allocated.

using	IndexType = Index
	The type used for matrix elements indexing.

using	RealType = typename Base::RealType
	The type of matrix elements.

using	RowView
	Type for accessing matrix rows.

using	SegmentsViewType = SegmentsView
	Type of segments view used by this matrix. It represents the sparse matrix format.

Public Types inherited from TNL::Matrices::MatrixBase< Real, Device, Index, MatrixType, SegmentsView::getOrganization() >
using	ConstValuesViewType
	Type of constant vector view holding values of matrix elements.

using	DeviceType
	The device where the matrix is allocated.

using	IndexType
	The type used for matrix elements indexing.

using	RealType
	The type of matrix elements.

using	RowCapacitiesType

using	ValuesViewType
	Type of vector view holding values of matrix elements.

Public Member Functions
__cuda_callable__	SparseMatrixBase ()=default
	Constructor with no parameters.

__cuda_callable__	SparseMatrixBase (const SparseMatrixBase &matrix)=default
	Copy constructor.

__cuda_callable__	SparseMatrixBase (IndexType rows, IndexType columns, typename Base::ValuesViewType values, ColumnIndexesViewType columnIndexes, SegmentsViewType segments)
	Constructor with all necessary data and views.

__cuda_callable__	SparseMatrixBase (SparseMatrixBase &&matrix) noexcept=default
	Move constructor.

__cuda_callable__ void	addElement (IndexType row, IndexType column, const RealType &value, const RealType &thisElementMultiplicator=1.0)
	Add element at given row and column to given value.

__cuda_callable__ IndexType	findElement (IndexType row, IndexType column) const
	Finds element in the matrix and returns its position in the arrays with values and columnIndexes.

template<typename Function>
void	forAllElements (Function &&function)
	This method calls forElements for all matrix rows.

template<typename Function>
void	forAllElements (Function &&function) const
	This method calls forElements for all matrix rows (for constant instances).

template<typename Function>
void	forAllRows (Function &&function)
	Method for parallel iteration over all matrix rows.

template<typename Function>
void	forAllRows (Function &&function) const
	Method for parallel iteration over all matrix rows for constant instances.

template<typename Function>
void	forElements (IndexType begin, IndexType end, Function &&function)
	Method for iteration over all matrix rows for non-constant instances.

template<typename Function>
void	forElements (IndexType begin, IndexType end, Function &&function) const
	Method for iteration over all matrix rows for constant instances.

template<typename Function>
void	forRows (IndexType begin, IndexType end, Function &&function)
	Method for parallel iteration over matrix rows from interval `[begin, end)`.

template<typename Function>
void	forRows (IndexType begin, IndexType end, Function &&function) const
	Method for parallel iteration over matrix rows from interval `[begin, end)` for constant instances.

ColumnIndexesViewType &	getColumnIndexes ()
	Getter of column indexes for nonconstant instances.

const ColumnIndexesViewType &	getColumnIndexes () const
	Getter of column indexes for constant instances.

template<typename Vector>
void	getCompressedRowLengths (Vector &rowLengths) const
	Computes number of non-zeros in each row.

__cuda_callable__ RealType	getElement (IndexType row, IndexType column) const
	Returns value of matrix element at position given by its row and column index.

IndexType	getNonzeroElementsCount () const override
	Returns number of non-zero matrix elements.

__cuda_callable__ RowView	getRow (IndexType rowIdx)
	Non-constant getter of simple structure for accessing given matrix row.

__cuda_callable__ ConstRowView	getRow (IndexType rowIdx) const
	Constant getter of simple structure for accessing given matrix row.

template<typename Vector>
void	getRowCapacities (Vector &rowCapacities) const
	Compute capacities of all rows.

__cuda_callable__ IndexType	getRowCapacity (IndexType row) const
	Returns capacity of given matrix row.

SegmentsViewType &	getSegments ()
	Getter of segments for non-constant instances.

const SegmentsViewType &	getSegments () const
	Getter of segments for constant instances.

template<typename Matrix>
bool	operator!= (const Matrix &matrix) const
	Comparison operator with another arbitrary matrix type.

SparseMatrixBase &	operator= (const SparseMatrixBase &)=delete
	Copy-assignment operator.

SparseMatrixBase &	operator= (SparseMatrixBase &&)=delete
	Move-assignment operator.

template<typename Matrix>
bool	operator== (const Matrix &matrix) const
	Comparison operator with another arbitrary matrix type.

void	print (std::ostream &str) const
	Method for printing the matrix to output stream.

template<typename Fetch, typename Reduce, typename Keep, typename FetchValue, typename SegmentsReductionKernel = DefaultSegmentsReductionKernel>
std::enable_if_t< Algorithms::SegmentsReductionKernels::isSegmentReductionKernel< SegmentsReductionKernel >::value >	reduceAllRows (Fetch &&fetch, const Reduce &reduce, Keep &&keep, const FetchValue &identity, const SegmentsReductionKernel &kernel=SegmentsReductionKernel{}) const
	Method for performing general reduction on all matrix rows for constant instances.

template<typename Fetch, typename Reduce, typename Keep, typename SegmentsReductionKernel = DefaultSegmentsReductionKernel>
std::enable_if_t< Algorithms::SegmentsReductionKernels::isSegmentReductionKernel< SegmentsReductionKernel >::value >	reduceAllRows (Fetch &&fetch, const Reduce &reduce, Keep &&keep, const SegmentsReductionKernel &kernel=SegmentsReductionKernel{}) const
	Method for performing general reduction on all matrix rows for constant instances with function object instead of lambda function for reduction.

template<typename Fetch, typename Reduce, typename Keep, typename FetchValue, typename SegmentsReductionKernel = DefaultSegmentsReductionKernel>
std::enable_if_t< Algorithms::SegmentsReductionKernels::isSegmentReductionKernel< SegmentsReductionKernel >::value >	reduceRows (IndexType begin, IndexType end, Fetch &&fetch, const Reduce &reduce, Keep &&keep, const FetchValue &identity, const SegmentsReductionKernel &kernel=SegmentsReductionKernel{}) const
	Method for performing general reduction on matrix rows for constant instances.

template<typename Fetch, typename Reduce, typename Keep, typename SegmentsReductionKernel = DefaultSegmentsReductionKernel>
std::enable_if_t< Algorithms::SegmentsReductionKernels::isSegmentReductionKernel< SegmentsReductionKernel >::value >	reduceRows (IndexType begin, IndexType end, Fetch &&fetch, const Reduce &reduce, Keep &&keep, const SegmentsReductionKernel &kernel=SegmentsReductionKernel{}) const
	Method for performing general reduction on matrix rows for constant instances with function object instead of lamda function for reduction.

template<typename Function>
void	sequentialForAllRows (Function &&function)
	This method calls sequentialForRows for all matrix rows.

template<typename Function>
void	sequentialForAllRows (Function &&function) const
	This method calls sequentialForRows for all matrix rows (for constant instances).

template<typename Function>
void	sequentialForRows (IndexType begin, IndexType end, Function &&function)
	Method for sequential iteration over all matrix rows for non-constant instances.

template<typename Function>
void	sequentialForRows (IndexType begin, IndexType end, Function &&function) const
	Method for sequential iteration over all matrix rows for constant instances.

__cuda_callable__ void	setElement (IndexType row, IndexType column, const RealType &value)
	Sets element at given row and column to given value.

void	sortColumnIndexes ()
	Sort matrix elements in each row by column indexes in ascending order.

template<typename InVector, typename OutVector>
void	transposedVectorProduct (const InVector &inVector, OutVector &outVector, ComputeReal matrixMultiplicator=1.0, ComputeReal outVectorMultiplicator=0.0, Index begin=0, Index end=0) const
	Computes product of transposed matrix and vector.

template<typename InVector, typename OutVector, typename SegmentsReductionKernel = DefaultSegmentsReductionKernel>
void	vectorProduct (const InVector &inVector, OutVector &outVector, ComputeRealType matrixMultiplicator=1.0, ComputeRealType outVectorMultiplicator=0.0, IndexType begin=0, IndexType end=0, const SegmentsReductionKernel &kernel=SegmentsReductionKernel{}) const
	Computes product of matrix and vector.

template<typename InVector, typename OutVector, typename SegmentsReductionKernel, typename..., std::enable_if_t< ! std::is_convertible_v< SegmentsReductionKernel, ComputeRealType >, bool > = true>
void	vectorProduct (const InVector &inVector, OutVector &outVector, const SegmentsReductionKernel &kernel) const

Public Member Functions inherited from TNL::Matrices::MatrixBase< Real, Device, Index, MatrixType, SegmentsView::getOrganization() >
__cuda_callable__	MatrixBase ()=default
	Basic constructor with no parameters.

__cuda_callable__	MatrixBase ()=default
	Basic constructor with no parameters.

__cuda_callable__	MatrixBase (const MatrixBase &view)=default
	Shallow copy constructor.

__cuda_callable__	MatrixBase (const MatrixBase &view)=default
	Shallow copy constructor.

__cuda_callable__	MatrixBase (IndexType rows, IndexType columns, ValuesViewType values)
	Constructor with matrix dimensions and matrix elements values.

__cuda_callable__	MatrixBase (IndexType rows, IndexType columns, ValuesViewType values)
	Constructor with matrix dimensions and matrix elements values.

__cuda_callable__	MatrixBase (MatrixBase &&view) noexcept=default
	Move constructor.

__cuda_callable__	MatrixBase (MatrixBase &&view) noexcept=default
	Move constructor.

IndexType	getAllocatedElementsCount () const
	Tells the number of allocated matrix elements.

IndexType	getAllocatedElementsCount () const
	Tells the number of allocated matrix elements.

__cuda_callable__ IndexType	getColumns () const
	Returns number of matrix columns.

__cuda_callable__ IndexType	getColumns () const
	Returns number of matrix columns.

__cuda_callable__ IndexType	getRows () const
	Returns number of matrix rows.

__cuda_callable__ IndexType	getRows () const
	Returns number of matrix rows.

__cuda_callable__ ValuesViewType &	getValues ()
	Returns a reference to a vector with the matrix elements values.

__cuda_callable__ ValuesViewType &	getValues ()
	Returns a reference to a vector with the matrix elements values.

__cuda_callable__ const ValuesViewType &	getValues () const
	Returns a constant reference to a vector with the matrix elements values.

__cuda_callable__ const ValuesViewType &	getValues () const
	Returns a constant reference to a vector with the matrix elements values.

__cuda_callable__ MatrixBase &	operator= (const MatrixBase &)=delete
	Copy-assignment operator.

__cuda_callable__ MatrixBase &	operator= (const MatrixBase &)=delete
	Copy-assignment operator.

__cuda_callable__ MatrixBase &	operator= (MatrixBase &&)=delete
	Move-assignment operator.

__cuda_callable__ MatrixBase &	operator= (MatrixBase &&)=delete
	Move-assignment operator.

Static Public Member Functions
static std::string	getSerializationType ()
	Returns string with serialization type.

Static Public Member Functions inherited from TNL::Matrices::MatrixBase< Real, Device, Index, MatrixType, SegmentsView::getOrganization() >
static constexpr ElementsOrganization	getOrganization ()
	Matrix elements organization getter.

static constexpr ElementsOrganization	getOrganization ()
	Matrix elements organization getter.

static constexpr bool	isBinary ()
	Test of binary matrix type.

static constexpr bool	isBinary ()
	Test of binary matrix type.

static constexpr bool	isSymmetric ()
	Test of symmetric matrix type.

static constexpr bool	isSymmetric ()
	Test of symmetric matrix type.

Protected Member Functions
__cuda_callable__ void	bind (IndexType rows, IndexType columns, typename Base::ValuesViewType values, ColumnIndexesViewType columnIndexes, SegmentsViewType segments)
	Re-initializes the internal attributes of the base class.

Protected Member Functions inherited from TNL::Matrices::MatrixBase< Real, Device, Index, MatrixType, SegmentsView::getOrganization() >
__cuda_callable__ void	bind (IndexType rows, IndexType columns, ValuesViewType values)
	Re-initializes the internal attributes of the base class.

__cuda_callable__ void	bind (IndexType rows, IndexType columns, ValuesViewType values)
	Re-initializes the internal attributes of the base class.

Protected Attributes
ColumnIndexesViewType	columnIndexes

SegmentsViewType	segments

Protected Attributes inherited from TNL::Matrices::MatrixBase< Real, Device, Index, MatrixType, SegmentsView::getOrganization() >
IndexType	columns

IndexType	columns

IndexType	rows

IndexType	rows

ValuesViewType	values

ValuesViewType	values

Detailed Description

template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>
class TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >

Implementation of sparse matrix view.

It serves as an accessor to SparseMatrix for example when passing the matrix to lambda functions. SparseMatrix view can be also created in CUDA kernels.

Template Parameters

Real	is a type of matrix elements. If Real equals bool the matrix is treated as binary and so the matrix elements values are not stored in the memory since we need to remember only coordinates of non-zero elements (which equal one).
Device	is a device where the matrix is allocated.
Index	is a type for indexing of the matrix elements.
MatrixType	specifies a symmetry of matrix. See MatrixType. Symmetric matrices store only lower part of the matrix and its diagonal. The upper part is reconstructed on the fly. GeneralMatrix with no symmetry is used by default.
Segments	is a structure representing the sparse matrix format. Depending on the pattern of the non-zero elements different matrix formats can perform differently especially on GPUs. By default Algorithms::Segments::CSR format is used. See also Algorithms::Segments::Ellpack, Algorithms::Segments::SlicedEllpack, Algorithms::Segments::ChunkedEllpack, and Algorithms::Segments::BiEllpack.
ComputeReal	is the same as Real mostly but for binary matrices it is set to Index type. This can be changed by the user, of course.

Member Typedef Documentation

◆ ColumnIndexesViewType

template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>

using TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::ColumnIndexesViewType

Initial value:

Containers::VectorView< typename TNL::copy_const< Index >::template from< Real >::type, Device, Index >

TNL::Containers::VectorView

VectorView extends ArrayView with algebraic operations.

Definition VectorView.h:24

◆ DefaultSegmentsReductionKernel

template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>

using TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::DefaultSegmentsReductionKernel = typename Algorithms::SegmentsReductionKernels::DefaultKernel< SegmentsView >::type

Type of the kernel used for parallel reductions on segments.

We are assuming that the default segments reduction kernel provides a static reduceAllSegments method, and thus it does not have to be instantiated and initialized. If the user wants to use a more complicated kernel, such as CSRAdaptive, it must be instantiated and initialized by the user and the object must be passed to the vectorProduct or reduceRows method.

◆ RowView

template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>

using TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::RowView

Initial value:

SparseMatrixRowView< typename SegmentsViewType::SegmentViewType, typename Base::ValuesViewType, ColumnIndexesViewType >

TNL::Matrices::SparseMatrixRowView< typename SegmentsViewType::SegmentViewType, typename Base::ValuesViewType, ColumnIndexesViewType >

Type for accessing matrix rows.

Constructor & Destructor Documentation

◆ SparseMatrixBase() [1/3]

template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>

__cuda_callable__ TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::SparseMatrixBase	(	IndexType	rows,
		IndexType	columns,
		typename Base::ValuesViewType	values,
		ColumnIndexesViewType	columnIndexes,
		SegmentsViewType	segments )

Constructor with all necessary data and views.

Parameters

rows	is a number of matrix rows.
columns	is a number of matrix columns.
values	is a vector view with matrix elements values.
columnIndexes	is a vector view with matrix elements column indexes.
segments	is a segments view representing the sparse matrix format.

◆ SparseMatrixBase() [2/3]

template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>

__cuda_callable__ TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::SparseMatrixBase ( const SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal > & matrix )

default

Copy constructor.

Parameters

matrix is an input sparse matrix view.

◆ SparseMatrixBase() [3/3]

template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>

__cuda_callable__ TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::SparseMatrixBase ( SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal > && matrix )

defaultnoexcept

Move constructor.

Parameters

matrix is an input sparse matrix view.

Member Function Documentation

◆ addElement()

template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>

__cuda_callable__ void TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::addElement	(	IndexType	row,
		IndexType	column,
		const RealType &	value,
		const RealType &	thisElementMultiplicator = 1.0 )

Add element at given row and column to given value.

This method can be called from the host system (CPU) no matter where the matrix is allocated. If the matrix is allocated on GPU this method can be called even from device kernels. If the matrix is allocated in GPU device this method is called from CPU, it transfers values of each matrix element separately and so the performance is very low. For higher performance see. SparseMatrix::getRow or SparseMatrix::forElements and SparseMatrix::forAllElements. The call may fail if the matrix row capacity is exhausted.

Parameters

row	is row index of the element.
column	is columns index of the element.
value	is the value the element will be set to.
thisElementMultiplicator	is multiplicator the original matrix element value is multiplied by before addition of given value.

Example: #include <iostream>

#include <TNL/Matrices/SparseMatrix.h>

#include <TNL/Devices/Host.h>

template< typename Device >

void

addElements()

{

TNL::Matrices::SparseMatrix< double, Device > matrix( { 5, 5, 5, 5, 5 }, 5 );

auto matrixView = matrix.getView();

for( int i = 0; i < 5; i++ )

matrixView.setElement( i, i, i ); // or matrix.setElement

std::cout << "Initial matrix is: " << std::endl << matrix << std::endl;

for( int i = 0; i < 5; i++ )

for( int j = 0; j < 5; j++ )

matrixView.addElement( i, j, 1.0, 5.0 ); // or matrix.addElement

std::cout << "Matrix after addition is: " << std::endl << matrix << std::endl;

}

int

main( int argc, char* argv[] )

{

std::cout << "Add elements on host:" << std::endl;

addElements< TNL::Devices::Host >();

#ifdef __CUDACC__

std::cout << "Add elements on CUDA device:" << std::endl;

addElements< TNL::Devices::Cuda >();

#endif

}

std::cout

TNL::Matrices::SparseMatrixBase::setElement
__cuda_callable__ void setElement(IndexType row, IndexType column, const RealType &value)
Sets element at given row and column to given value.
Definition SparseMatrixBase.hpp:150

TNL::Matrices::SparseMatrixBase::addElement
__cuda_callable__ void addElement(IndexType row, IndexType column, const RealType &value, const RealType &thisElementMultiplicator=1.0)
Add element at given row and column to given value.
Definition SparseMatrixBase.hpp:160

TNL::Matrices::SparseMatrix
Implementation of sparse matrix, i.e. matrix storing only non-zero elements.
Definition SparseMatrix.h:57

TNL::Matrices::SparseMatrix::getView
ViewType getView()
Returns a modifiable view of the sparse matrix.
Definition SparseMatrix.hpp:166

std::endl
T endl(T... args)

Output: Add elements on host:

Initial matrix is:

Row: 0 ->

Row: 1 -> 1:1

Row: 2 -> 2:2

Row: 3 -> 3:3

Row: 4 -> 4:4

Matrix after addition is:

Row: 0 -> 0:1 1:1 2:1 3:1 4:1

Row: 1 -> 0:1 1:6 2:1 3:1 4:1

Row: 2 -> 0:1 1:1 2:11 3:1 4:1

Row: 3 -> 0:1 1:1 2:1 3:16 4:1

Row: 4 -> 0:1 1:1 2:1 3:1 4:21

Add elements on CUDA device:

Initial matrix is:

Row: 0 ->

Row: 1 -> 1:1

Row: 2 -> 2:2

Row: 3 -> 3:3

Row: 4 -> 4:4

Matrix after addition is:

Row: 0 -> 0:1 1:1 2:1 3:1 4:1

Row: 1 -> 0:1 1:6 2:1 3:1 4:1

Row: 2 -> 0:1 1:1 2:11 3:1 4:1

Row: 3 -> 0:1 1:1 2:1 3:16 4:1

Row: 4 -> 0:1 1:1 2:1 3:1 4:21

◆ bind()

template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>

__cuda_callable__ void TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::bind	(	IndexType	rows,
		IndexType	columns,
		typename Base::ValuesViewType	values,
		ColumnIndexesViewType	columnIndexes,
		SegmentsViewType	segments )

protected

Re-initializes the internal attributes of the base class.

Note that this function is protected to ensure that the user cannot modify the base class of a matrix. For the same reason, in future code development we also need to make sure that all non-const functions in the base class return by value and not by reference.

◆ findElement()

template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>

__cuda_callable__ Index TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::findElement	(	IndexType	row,
		IndexType	column ) const

nodiscard

Finds element in the matrix and returns its position in the arrays with values and columnIndexes.

If the element is not found, the method returns the padding index.

Parameters

row	is the row index of the element.
column	is the column index of the element.

Returns: the position of the element in the arrays with values and columnIndexes or the padding index if the element is not found.

◆ forAllElements() [1/2]

template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>

template<typename Function>

void TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::forAllElements ( Function && function )

This method calls forElements for all matrix rows.

See SparseMatrix::forElements.

Template Parameters

Function is a type of lambda function that will operate on matrix elements.

Parameters

function is an instance of the lambda function to be called in each row.

Example: #include <iostream>

#include <TNL/Matrices/SparseMatrix.h>

#include <TNL/Devices/Host.h>

#include <TNL/Devices/Cuda.h>

template< typename Device >

void

forAllElementsExample()

{

TNL::Matrices::SparseMatrix< double, Device > matrix( { 1, 2, 3, 4, 5 }, 5 );

auto matrixView = matrix.getView();

auto f = [] __cuda_callable__( int rowIdx, int localIdx, int& columnIdx, double& value )

{

// This is important, some matrix formats may allocate more matrix elements

// than we requested. These padding elements are processed here as well.

if( rowIdx >= localIdx ) {

columnIdx = localIdx;

value = rowIdx + localIdx + 1;

}

};

matrixView.forAllElements( f ); // or matrix.forAllElements

std::cout << matrix << std::endl;

}

int

main( int argc, char* argv[] )

{

std::cout << "Creating matrix on host: " << std::endl;

forAllElementsExample< TNL::Devices::Host >();

#ifdef __CUDACC__

std::cout << "Creating matrix on CUDA device: " << std::endl;

forAllElementsExample< TNL::Devices::Cuda >();

#endif

}

__cuda_callable__
#define __cuda_callable__
Definition Macros.h:49

TNL::Matrices::SparseMatrixBase::forAllElements
void forAllElements(Function &&function) const
This method calls forElements for all matrix rows (for constant instances).
Definition SparseMatrixBase.hpp:574

Output: Creating matrix on host:

Row: 0 -> 0:1

Row: 1 -> 0:2 1:3

Row: 2 -> 0:3 1:4 2:5

Row: 3 -> 0:4 1:5 2:6 3:7

Row: 4 -> 0:5 1:6 2:7 3:8 4:9

Creating matrix on CUDA device:

Row: 0 -> 0:1

Row: 1 -> 0:2 1:3

Row: 2 -> 0:3 1:4 2:5

Row: 3 -> 0:4 1:5 2:6 3:7

Row: 4 -> 0:5 1:6 2:7 3:8 4:9

◆ forAllElements() [2/2]

template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>

template<typename Function>

void TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::forAllElements ( Function && function ) const

This method calls forElements for all matrix rows (for constant instances).

See SparseMatrix::forElements.

Template Parameters

Function is a type of lambda function that will operate on matrix elements.

Parameters

function is an instance of the lambda function to be called in each row.

◆ forAllRows() [1/2]

template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>

template<typename Function>

void TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::forAllRows ( Function && function )

Method for parallel iteration over all matrix rows.

In each row, given lambda function is performed. Each row is processed by at most one thread unlike the method SparseMatrixBase::forAllElements where more than one thread can be mapped to each row.

Template Parameters

Function is type of the lambda function.

Parameters

function is an instance of the lambda function to be called for each row.

auto function = [] __cuda_callable__ ( RowView& row ) { ... };

TNL::Matrices::SparseMatrixBase::RowView

SparseMatrixRowView< typename SegmentsViewType::SegmentViewType, typename Base::ValuesViewType, ColumnIndexesViewType > RowView

Type for accessing matrix rows.

Definition SparseMatrixBase.h:95

RowView represents matrix row - see TNL::Matrices::SparseMatrixBase::RowView.

Example: #include <iostream>

#include <TNL/Matrices/SparseMatrix.h>

#include <TNL/Devices/Host.h>

#include <TNL/Devices/Cuda.h>

template< typename Device >

void

forRowsExample()

{

/***

* Set the following matrix (dots represent zero matrix elements):

*

* / 2 . . . . \

* | 1 2 1 . . |

* | . 1 2 1. . |

* | . . 1 2 1 |

* \ . . . . 2 /

*/

const int size( 5 );

using MatrixType = TNL::Matrices::SparseMatrix< double, Device >;

using RowView = typename MatrixType::RowView;

MatrixType matrix( { 1, 3, 3, 3, 1 }, size );

auto matrixView = matrix.getView();

/***

* Set the matrix elements.

*/

auto f = [] __cuda_callable__( RowView & row )

{

const int rowIdx = row.getRowIndex();

if( rowIdx == 0 )

row.setElement( 0, rowIdx, 2.0 ); // diagonal element

else if( rowIdx == size - 1 )

row.setElement( 0, rowIdx, 2.0 ); // diagonal element

else {

row.setElement( 0, rowIdx - 1, 1.0 ); // elements below the diagonal

row.setElement( 1, rowIdx, 2.0 ); // diagonal element

row.setElement( 2, rowIdx + 1, 1.0 ); // elements above the diagonal

}

};

matrixView.forAllRows( f ); // or matrix.forAllRows

std::cout << matrix << std::endl;

/***

* Divide each matrix row by a sum of all elements in the row - with use of iterators.

*/

matrix.forAllRows(

[] __cuda_callable__( RowView & row )

{

double sum = 0.0;

for( auto element : row )

sum += element.value();

for( auto element : row )

element.value() /= sum;

} );

std::cout << "Divide each matrix row by a sum of all elements in the row ... " << std::endl;

std::cout << matrix << std::endl;

}

int

main( int argc, char* argv[] )

{

std::cout << "Getting matrix rows on host: " << std::endl;

forRowsExample< TNL::Devices::Host >();

#ifdef __CUDACC__

std::cout << "Getting matrix rows on CUDA device: " << std::endl;

forRowsExample< TNL::Devices::Cuda >();

#endif

}

TNL::Matrices::MatrixType
Structure for specifying type of sparse matrix.
Definition MatrixType.h:17

Output: Getting matrix rows on host:

Row: 0 -> 0:2

Row: 1 -> 0:1 1:2 2:1

Row: 2 -> 1:1 2:2 3:1

Row: 3 -> 2:1 3:2 4:1

Row: 4 -> 4:2

Divide each matrix row by a sum of all elements in the row ...

Row: 0 -> 0:1

Row: 1 -> 0:0.25 1:0.5 2:0.25

Row: 2 -> 1:0.25 2:0.5 3:0.25

Row: 3 -> 2:0.25 3:0.5 4:0.25

Row: 4 -> 4:1

Getting matrix rows on CUDA device:

Row: 0 -> 0:2

Row: 1 -> 0:1 1:2 2:1

Row: 2 -> 1:1 2:2 3:1

Row: 3 -> 2:1 3:2 4:1

Row: 4 -> 4:2

Divide each matrix row by a sum of all elements in the row ...

Row: 0 -> 0:1

Row: 1 -> 0:0.25 1:0.5 2:0.25

Row: 2 -> 1:0.25 2:0.5 3:0.25

Row: 3 -> 2:0.25 3:0.5 4:0.25

Row: 4 -> 4:1

◆ forAllRows() [2/2]

template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>

template<typename Function>

void TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::forAllRows ( Function && function ) const

Method for parallel iteration over all matrix rows for constant instances.

In each row, given lambda function is performed. Each row is processed by at most one thread unlike the method SparseMatrixBase::forAllElements where more than one thread can be mapped to each row.

Template Parameters

Function is type of the lambda function.

Parameters

function is an instance of the lambda function to be called for each row.

auto function = [] __cuda_callable__ ( const ConstRowView& row ) { ... };

TNL::Matrices::SparseMatrixBase::ConstRowView

typename RowView::ConstRowView ConstRowView

Type for accessing constant matrix rows.

Definition SparseMatrixBase.h:101

ConstRowView represents matrix row - see TNL::Matrices::SparseMatrixBase::ConstRowView.

◆ forElements() [1/2]

template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>

template<typename Function>

void TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::forElements	(	IndexType	begin,
		IndexType	end,
		Function &&	function )

Method for iteration over all matrix rows for non-constant instances.

Template Parameters

Function is type of lambda function that will operate on matrix elements. It should have form like

auto function = [] __cuda_callable__ ( IndexType rowIdx, IndexType localIdx, IndexType& columnIdx, RealType& value )

{ ... };

TNL::Matrices::SparseMatrixBase::IndexType

Index IndexType

The type used for matrix elements indexing.

Definition SparseMatrixBase.h:85

TNL::Matrices::SparseMatrixBase::RealType

typename Base::RealType RealType

The type of matrix elements.

Definition SparseMatrixBase.h:73

The localIdx parameter is a rank of the non-zero element in given row.

Parameters

begin	defines beginning of the range `[begin, end)` of rows to be processed.
end	defines ending of the range `[begin, end)` of rows to be processed.
function	is an instance of the lambda function to be called in each row.

Example: #include <iostream>

#include <TNL/Matrices/SparseMatrix.h>

#include <TNL/Devices/Host.h>

#include <TNL/Devices/Cuda.h>

template< typename Device >

void

forElementsExample()

{

TNL::Matrices::SparseMatrix< double, Device > matrix( { 1, 2, 3, 4, 5 }, 5 );

auto matrixView = matrix.getView();

auto f = [] __cuda_callable__( int rowIdx, int localIdx, int& columnIdx, double& value )

{

// This is important, some matrix formats may allocate more matrix elements

// than we requested. These padding elements are processed here as well.

if( rowIdx >= localIdx ) {

columnIdx = localIdx;

value = rowIdx + localIdx + 1;

}

};

matrixView.forElements( 0, matrix.getRows(), f ); // or matrix.forElements

std::cout << matrix << std::endl;

}

int

main( int argc, char* argv[] )

{

std::cout << "Creating matrix on host: " << std::endl;

forElementsExample< TNL::Devices::Host >();

#ifdef __CUDACC__

std::cout << "Creating matrix on CUDA device: " << std::endl;

forElementsExample< TNL::Devices::Cuda >();

#endif

}

TNL::Matrices::SparseMatrixBase::forElements
void forElements(IndexType begin, IndexType end, Function &&function) const
Method for iteration over all matrix rows for constant instances.
Definition SparseMatrixBase.hpp:528

Output: Creating matrix on host:

Row: 0 -> 0:1

Row: 1 -> 0:2 1:3

Row: 2 -> 0:3 1:4 2:5

Row: 3 -> 0:4 1:5 2:6 3:7

Row: 4 -> 0:5 1:6 2:7 3:8 4:9

Creating matrix on CUDA device:

Row: 0 -> 0:1

Row: 1 -> 0:2 1:3

Row: 2 -> 0:3 1:4 2:5

Row: 3 -> 0:4 1:5 2:6 3:7

Row: 4 -> 0:5 1:6 2:7 3:8 4:9

◆ forElements() [2/2]

template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>

template<typename Function>

void TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::forElements	(	IndexType	begin,
		IndexType	end,
		Function &&	function ) const

Method for iteration over all matrix rows for constant instances.

Template Parameters

Function is type of lambda function that will operate on matrix elements. It should have form like

auto function = [] __cuda_callable__

( IndexType rowIdx, IndexType localIdx, const IndexType& columnIdx, const RealType& value )

{ ... };

The localIdx parameter is a rank of the non-zero element in given row.

Parameters

begin	defines beginning of the range `[begin, end)` of rows to be processed.
end	defines ending of the range `[begin, end)` of rows to be processed.
function	is an instance of the lambda function to be called in each row.

◆ forRows() [1/2]

template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>

template<typename Function>

void TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::forRows	(	IndexType	begin,
		IndexType	end,
		Function &&	function )

Method for parallel iteration over matrix rows from interval [begin, end).

In each row, given lambda function is performed. Each row is processed by at most one thread unlike the method SparseMatrixBase::forElements where more than one thread can be mapped to each row.

Template Parameters

Function is type of the lambda function.

Parameters

begin	defines beginning of the range `[begin, end)` of rows to be processed.
end	defines ending of the range `[begin, end)` of rows to be processed.
function	is an instance of the lambda function to be called for each row.

auto function = [] __cuda_callable__ ( RowView& row ) { ... };

RowView represents matrix row - see TNL::Matrices::SparseMatrixBase::RowView.

Example: #include <iostream>

#include <TNL/Matrices/SparseMatrix.h>

#include <TNL/Devices/Host.h>

#include <TNL/Devices/Cuda.h>

template< typename Device >

void

forRowsExample()

{

/***

* Set the following matrix (dots represent zero matrix elements):

*

* / 2 . . . . \

* | 1 2 1 . . |

* | . 1 2 1. . |

* | . . 1 2 1 |

* \ . . . . 2 /

*/

const int size( 5 );

using MatrixType = TNL::Matrices::SparseMatrix< double, Device >;

using RowView = typename MatrixType::RowView;

MatrixType matrix( { 1, 3, 3, 3, 1 }, size );

auto matrixView = matrix.getView();

/***

* Set the matrix elements.

*/

auto f = [] __cuda_callable__( RowView & row )

{

const int rowIdx = row.getRowIndex();

if( rowIdx == 0 )

row.setElement( 0, rowIdx, 2.0 ); // diagonal element

else if( rowIdx == size - 1 )

row.setElement( 0, rowIdx, 2.0 ); // diagonal element

else {

row.setElement( 0, rowIdx - 1, 1.0 ); // elements below the diagonal

row.setElement( 1, rowIdx, 2.0 ); // diagonal element

row.setElement( 2, rowIdx + 1, 1.0 ); // elements above the diagonal

}

};

matrixView.forAllRows( f ); // or matrix.forAllRows

std::cout << matrix << std::endl;

/***

* Divide each matrix row by a sum of all elements in the row - with use of iterators.

*/

matrix.forAllRows(

[] __cuda_callable__( RowView & row )

{

double sum = 0.0;

for( auto element : row )

sum += element.value();

for( auto element : row )

element.value() /= sum;

} );

std::cout << "Divide each matrix row by a sum of all elements in the row ... " << std::endl;

std::cout << matrix << std::endl;

}

int

main( int argc, char* argv[] )

{

std::cout << "Getting matrix rows on host: " << std::endl;

forRowsExample< TNL::Devices::Host >();

#ifdef __CUDACC__

std::cout << "Getting matrix rows on CUDA device: " << std::endl;

forRowsExample< TNL::Devices::Cuda >();

#endif

}

Output: Getting matrix rows on host:

Row: 0 -> 0:2

Row: 1 -> 0:1 1:2 2:1

Row: 2 -> 1:1 2:2 3:1

Row: 3 -> 2:1 3:2 4:1

Row: 4 -> 4:2

Divide each matrix row by a sum of all elements in the row ...

Row: 0 -> 0:1

Row: 1 -> 0:0.25 1:0.5 2:0.25

Row: 2 -> 1:0.25 2:0.5 3:0.25

Row: 3 -> 2:0.25 3:0.5 4:0.25

Row: 4 -> 4:1

Getting matrix rows on CUDA device:

Row: 0 -> 0:2

Row: 1 -> 0:1 1:2 2:1

Row: 2 -> 1:1 2:2 3:1

Row: 3 -> 2:1 3:2 4:1

Row: 4 -> 4:2

Divide each matrix row by a sum of all elements in the row ...

Row: 0 -> 0:1

Row: 1 -> 0:0.25 1:0.5 2:0.25

Row: 2 -> 1:0.25 2:0.5 3:0.25

Row: 3 -> 2:0.25 3:0.5 4:0.25

Row: 4 -> 4:1

◆ forRows() [2/2]

template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>

template<typename Function>

void TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::forRows	(	IndexType	begin,
		IndexType	end,
		Function &&	function ) const

Method for parallel iteration over matrix rows from interval [begin, end) for constant instances.

In each row, given lambda function is performed. Each row is processed by at most one thread unlike the method SparseMatrixBase::forElements where more than one thread can be mapped to each row.

Template Parameters

Function is type of the lambda function.

Parameters

begin	defines beginning of the range `[begin, end)` of rows to be processed.
end	defines ending of the range `[begin, end)` of rows to be processed.
function	is an instance of the lambda function to be called for each row.

auto function = [] __cuda_callable__ ( const ConstRowView& row ) { ... };

ConstRowView represents matrix row - see TNL::Matrices::SparseMatrixBase::ConstRowView.

◆ getColumnIndexes() [1/2]

template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>

auto TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::getColumnIndexes ( )

nodiscard

Getter of column indexes for nonconstant instances.

Returns: Reference to a vector with matrix elements column indexes.

◆ getColumnIndexes() [2/2]

template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>

auto TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::getColumnIndexes ( ) const

nodiscard

Getter of column indexes for constant instances.

Returns: Constant reference to a vector with matrix elements column indexes.

◆ getCompressedRowLengths()

template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>

template<typename Vector>

void TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::getCompressedRowLengths ( Vector & rowLengths ) const

Computes number of non-zeros in each row.

Parameters

rowLengths is a vector into which the number of non-zeros in each row will be stored.

Example: #include <iostream>

#include <TNL/Matrices/SparseMatrix.h>

#include <TNL/Devices/Host.h>

#include <TNL/Devices/Cuda.h>

template< typename Device >

void

getCompressedRowLengthsExample()

{

TNL::Matrices::SparseMatrix< double, Device > triangularMatrix( 5, 5 );

triangularMatrix.setElements( {

// clang-format off

{ 0, 0, 1 },

{ 1, 0, 2 }, { 1, 1, 3 },

{ 2, 0, 4 }, { 2, 1, 5 }, { 2, 2, 6 },

{ 3, 0, 7 }, { 3, 1, 8 }, { 3, 2, 9 }, { 3, 3, 10 },

{ 4, 0, 11 }, { 4, 1, 12 }, { 4, 2, 13 }, { 4, 3, 14 }, { 4, 4, 15 },

// clang-format on

} );

std::cout << triangularMatrix << std::endl;

TNL::Containers::Vector< int, Device > rowLengths;

triangularMatrix.getCompressedRowLengths( rowLengths );

std::cout << "Compressed row lengths are: " << rowLengths << std::endl;

}

int

main( int argc, char* argv[] )

{

std::cout << "Getting compressed row lengths on host: " << std::endl;

getCompressedRowLengthsExample< TNL::Devices::Host >();

#ifdef __CUDACC__

std::cout << "Getting compressed row lengths on CUDA device: " << std::endl;

getCompressedRowLengthsExample< TNL::Devices::Cuda >();

#endif

}

TNL::Containers::Vector
Vector extends Array with algebraic operations.
Definition Vector.h:36

Output: Getting compressed row lengths on host:

Row: 0 -> 0:1

Row: 1 -> 0:2 1:3

Row: 2 -> 0:4 1:5 2:6

Row: 3 -> 0:7 1:8 2:9 3:10

Row: 4 -> 0:11 1:12 2:13 3:14 4:15

Compressed row lengths are: [ 1, 2, 3, 4, 5 ]

Getting compressed row lengths on CUDA device:

Row: 0 -> 0:1

Row: 1 -> 0:2 1:3

Row: 2 -> 0:4 1:5 2:6

Row: 3 -> 0:7 1:8 2:9 3:10

Row: 4 -> 0:11 1:12 2:13 3:14 4:15

Compressed row lengths are: [ 1, 2, 3, 4, 5 ]

◆ getElement()

template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>

__cuda_callable__ auto TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::getElement	(	IndexType	row,
		IndexType	column ) const

nodiscard

Returns value of matrix element at position given by its row and column index.

Parameters

row	is a row index of the matrix element.
column	i a column index of the matrix element.

Returns: value of given matrix element.

Example: #include <iostream>

#include <iomanip>

#include <TNL/Matrices/SparseMatrix.h>

#include <TNL/Devices/Host.h>

template< typename Device >

void

getElements()

{

TNL::Matrices::SparseMatrix< double, Device > matrix(

// number of matrix rows

5,

// number of matrix columns

5,

// matrix elements definition

{

// clang-format off

{ 0, 0, 2.0 },

{ 1, 0, -1.0 }, { 1, 1, 2.0 }, { 1, 2, -1.0 },

{ 2, 1, -1.0 }, { 2, 2, 2.0 }, { 2, 3, -1.0 },

{ 3, 2, -1.0 }, { 3, 3, 2.0 }, { 3, 4, -1.0 },

{ 4, 4, 2.0 },

// clang-format on

} );

for( int i = 0; i < 5; i++ ) {

for( int j = 0; j < 5; j++ )

std::cout << std::setw( 5 ) << matrix.getElement( i, j );

std::cout << std::endl;

}

}

int

main( int argc, char* argv[] )

{

std::cout << "Get elements on host:" << std::endl;

getElements< TNL::Devices::Host >();

#ifdef __CUDACC__

std::cout << "Get elements on CUDA device:" << std::endl;

getElements< TNL::Devices::Cuda >();

#endif

}

std::setw
T setw(T... args)

Output: Get elements on host:

2 0 0 0 0

-1 2 -1 0 0

0 -1 2 -1 0

0 0 -1 2 -1

0 0 0 0 2

Get elements on CUDA device:

2 0 0 0 0

-1 2 -1 0 0

0 -1 2 -1 0

0 0 -1 2 -1

0 0 0 0 2

◆ getNonzeroElementsCount()

template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>

Index TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::getNonzeroElementsCount ( ) const

nodiscardoverridevirtual

Returns number of non-zero matrix elements.

This method really counts the non-zero matrix elements and so it returns zero for matrix having all allocated elements set to zero.

Returns: number of non-zero matrix elements.

Reimplemented from TNL::Matrices::MatrixBase< Real, Device, Index, MatrixType, SegmentsView::getOrganization() >.

◆ getRow() [1/2]

template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>

__cuda_callable__ auto TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::getRow ( IndexType rowIdx )

nodiscard

Non-constant getter of simple structure for accessing given matrix row.

Parameters

rowIdx is matrix row index.

Returns: RowView for accessing given matrix row.

Example: #include <iostream>

#include <TNL/Algorithms/parallelFor.h>

#include <TNL/Matrices/SparseMatrix.h>

#include <TNL/Devices/Host.h>

#include <TNL/Devices/Cuda.h>

template< typename Device >

void

getRowExample()

{

/***

* Set the following matrix (dots represent zero matrix elements):

*

* / 2 . . . . \

* | 1 2 1 . . |

* | . 1 2 1. . |

* | . . 1 2 1 |

* \ . . . . 2 /

*/

const int size = 5;

TNL::Matrices::SparseMatrix< double, Device > matrix( { 1, 3, 3, 3, 1 }, size );

auto view = matrix.getView();

/***

* Set the matrix elements.

*/

auto f = [ = ] __cuda_callable__( int rowIdx ) mutable

{

auto row = view.getRow( rowIdx );

if( rowIdx == 0 )

row.setElement( 0, rowIdx, 2.0 ); // diagonal element

else if( rowIdx == size - 1 )

row.setElement( 0, rowIdx, 2.0 ); // diagonal element

else {

row.setElement( 0, rowIdx - 1, 1.0 ); // elements below the diagonal

row.setElement( 1, rowIdx, 2.0 ); // diagonal element

row.setElement( 2, rowIdx + 1, 1.0 ); // elements above the diagonal

}

};

TNL::Algorithms::parallelFor< Device >( 0, matrix.getRows(), f );

std::cout << matrix << std::endl;

}

int

main( int argc, char* argv[] )

{

std::cout << "Getting matrix rows on host: " << std::endl;

getRowExample< TNL::Devices::Host >();

#ifdef __CUDACC__

std::cout << "Getting matrix rows on CUDA device: " << std::endl;

getRowExample< TNL::Devices::Cuda >();

#endif

}

TNL::Matrices::SparseMatrixBase::getRow
__cuda_callable__ ConstRowView getRow(IndexType rowIdx) const
Constant getter of simple structure for accessing given matrix row.
Definition SparseMatrixBase.hpp:132

TNL::Algorithms::parallelFor
std::enable_if_t< std::is_integral_v< Begin > &&std::is_integral_v< End > > parallelFor(const Begin &begin, const End &end, typename Device::LaunchConfiguration launch_config, Function f, FunctionArgs... args)
Parallel for-loop function for 1D range specified with integral values.
Definition parallelFor.h:41

Output: Getting matrix rows on host:

Row: 0 -> 0:2

Row: 1 -> 0:1 1:2 2:1

Row: 2 -> 1:1 2:2 3:1

Row: 3 -> 2:1 3:2 4:1

Row: 4 -> 4:2

Getting matrix rows on CUDA device:

Row: 0 -> 0:2

Row: 1 -> 0:1 1:2 2:1

Row: 2 -> 1:1 2:2 3:1

Row: 3 -> 2:1 3:2 4:1

Row: 4 -> 4:2

See SparseMatrixRowView.

◆ getRow() [2/2]

template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>

__cuda_callable__ auto TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::getRow ( IndexType rowIdx ) const

nodiscard

Constant getter of simple structure for accessing given matrix row.

Parameters

rowIdx is matrix row index.

Returns: RowView for accessing given matrix row.

Example: #include <iostream>

#include <TNL/Matrices/SparseMatrix.h>

#include <TNL/Devices/Host.h>

#include <TNL/Devices/Cuda.h>

template< typename Device >

void

getRowExample()

{

TNL::Matrices::SparseMatrix< double, Device > matrix(

//

5,

5,

{

// clang-format off

{ 0, 0, 1 },

{ 1, 0, 1 }, { 1, 1, 2 },

{ 2, 0, 1 }, { 2, 1, 2 }, { 2, 2, 3 },

{ 3, 0, 1 }, { 3, 1, 2 }, { 3, 2, 3 }, { 3, 3, 4 },

{ 4, 0, 1 }, { 4, 1, 2 }, { 4, 2, 3 }, { 4, 3, 4 }, { 4, 4, 5 },

// clang-format on

} );

const auto matrixView = matrix.getView();

/***

* Fetch lambda function returns diagonal element in each row.

*/

auto fetch = [ = ] __cuda_callable__( int rowIdx ) -> double

{

auto row = matrixView.getRow( rowIdx );

return row.getValue( rowIdx );

};

/***

* Compute the matrix trace.

*/

int trace = TNL::Algorithms::reduce< Device >( 0, matrix.getRows(), fetch, std::plus<>{}, 0 );

std::cout << "Matrix trace is " << trace << "." << std::endl;

}

int

main( int argc, char* argv[] )

{

std::cout << "Getting matrix rows on host: " << std::endl;

getRowExample< TNL::Devices::Host >();

#ifdef __CUDACC__

std::cout << "Getting matrix rows on CUDA device: " << std::endl;

getRowExample< TNL::Devices::Cuda >();

#endif

}

TNL::Algorithms::reduce
Result reduce(Index begin, Index end, Fetch &&fetch, Reduction &&reduction, const Result &identity)
reduce implements (parallel) reduction for vectors and arrays.
Definition reduce.h:66

std::plus

Output: Getting matrix rows on host:

Matrix trace is 15.

Getting matrix rows on CUDA device:

Matrix trace is 15.

See SparseMatrixRowView.

◆ getRowCapacities()

template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>

template<typename Vector>

void TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::getRowCapacities ( Vector & rowCapacities ) const

Compute capacities of all rows.

The row capacities are not stored explicitly and must be computed.

Parameters

rowCapacities is a vector where the row capacities will be stored.

◆ getRowCapacity()

template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>

__cuda_callable__ Index TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::getRowCapacity ( IndexType row ) const

nodiscard

Returns capacity of given matrix row.

Parameters

row	index of matrix row.

Returns: number of matrix elements allocated for the row.

◆ getSegments() [1/2]

template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>

auto TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::getSegments ( )

nodiscard

Getter of segments for non-constant instances.

Segments are a structure for addressing the matrix elements columns and values. In fact, Segments represent the sparse matrix format.

Returns: Non-constant reference to segments.

◆ getSegments() [2/2]

template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>

auto TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::getSegments ( ) const

nodiscard

Getter of segments for constant instances.

Segments are a structure for addressing the matrix elements columns and values. In fact, Segments represent the sparse matrix format.

Returns: Constant reference to segments.

◆ getSerializationType()

template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>

std::string TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::getSerializationType ( )

staticnodiscard

Returns string with serialization type.

The string has a form Matrices::SparseMatrix< RealType, [any_device], IndexType, General/Symmetric, Format, [any_allocator] >.

Returns: String with the serialization type.

Example: #include <iostream>

#include <TNL/Matrices/SparseMatrix.h>

#include <TNL/Devices/Host.h>

template< typename Device >

void

getSerializationTypeExample()

{

TNL::Matrices::SparseMatrix< double, Device > matrix;

std::cout << "Matrix type is: " << matrix.getSerializationType();

}

int

main( int argc, char* argv[] )

{

std::cout << "Get serialization type on CPU ... " << std::endl;

getSerializationTypeExample< TNL::Devices::Host >();

#ifdef __CUDACC__

std::cout << "Get serialization type on CUDA GPU ... " << std::endl;

getSerializationTypeExample< TNL::Devices::Cuda >();

#endif

}

TNL::Matrices::SparseMatrix::getSerializationType
static std::string getSerializationType()
Returns string with serialization type.
Definition SparseMatrixBase.hpp:45

Output: Get serialization type on CPU ...

Matrix type is: Matrices::SparseMatrix< double, CSR< int >, [any_device], int, General, [any_allocator], [any_allocator] >Get serialization type on CUDA GPU ...

Matrix type is: Matrices::SparseMatrix< double, CSR< int >, [any_device], int, General, [any_allocator], [any_allocator] >

◆ operator!=()

template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>

template<typename Matrix>

bool TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::operator!= ( const Matrix & matrix ) const

nodiscard

Comparison operator with another arbitrary matrix type.

Parameters

matrix is the right-hand side matrix.

Returns: false if the RHS matrix is equal, true otherwise.

◆ operator=()

template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>

SparseMatrixBase & TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::operator= ( const SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal > & )

delete

Copy-assignment operator.

It is a deleted function, because sparse matrix assignment in general requires reallocation.

◆ operator==()

template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>

template<typename Matrix>

bool TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::operator== ( const Matrix & matrix ) const

nodiscard

Comparison operator with another arbitrary matrix type.

Parameters

matrix is the right-hand side matrix.

Returns: true if the RHS matrix is equal, false otherwise.

◆ print()

template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>

void TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::print ( std::ostream & str ) const

Method for printing the matrix to output stream.

Parameters

str	is the output stream.

◆ reduceAllRows() [1/2]

template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>

template<typename Fetch, typename Reduce, typename Keep, typename FetchValue, typename SegmentsReductionKernel>

std::enable_if_t< Algorithms::SegmentsReductionKernels::isSegmentReductionKernel< SegmentsReductionKernel >::value > TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::reduceAllRows	(	Fetch &&	fetch,
		const Reduce &	reduce,
		Keep &&	keep,
		const FetchValue &	identity,
		const SegmentsReductionKernel &	kernel = SegmentsReductionKernel{} ) const

Method for performing general reduction on all matrix rows for constant instances.

Template Parameters

Fetch is a type of lambda function for data fetch declared as

auto fetch = [] __cuda_callable__ ( IndexType rowIdx, IndexType& columnIdx, RealType& elementValue ) -> FetchValue

{ ... };

The return type of this lambda can be any non void.

Template Parameters

Reduce is a function object for reduction (some of Function objects for reduction operations) or a lambda function defined as

auto reduce = [] __cuda_callable__ ( const FetchValue& v1, const FetchValue& v2 ) -> FetchValue { ... };

Template Parameters

Keep	is a type of lambda function for storing results of reduction in each row. It is declared as

auto keep = [=] __cuda_callable__ ( IndexType rowIdx, const RealType& value ) { ... };

Template Parameters

FetchValue is type returned by the Fetch lambda function.

Parameters

fetch	is an instance of lambda function for data fetch.
reduce	is an instance of lambda function for reduction.
keep	in an instance of lambda function for storing results.
identity	is the identity element for the reduction operation, i.e. element which does not change the result of the reduction.
kernel	is an instance of the segments reduction kernel to be used for the operation.

Example: #include <iostream>

#include <iomanip>

#include <functional>

#include <TNL/Matrices/SparseMatrix.h>

#include <TNL/Devices/Host.h>

#include <TNL/Devices/Cuda.h>

template< typename Device >

void

reduceAllRows()

{

TNL::Matrices::SparseMatrix< double, Device > matrix(

// number of matrix rows

5,

// number of matrix columns

5,

// matrix elements definition

{

// clang-format off

{ 0, 0, 1 },

{ 1, 1, 1 }, { 1, 2, 8 },

{ 2, 2, 1 }, { 2, 3, 9 },

{ 3, 3, 1 }, { 3, 4, 9 },

{ 4, 4, 1 },

// clang-format on

} );

/***

* Find largest element in each row.

*/

TNL::Containers::Vector< double, Device > rowMax( matrix.getRows() );

/***

* Prepare vector view and matrix view for lambdas.

*/

auto rowMaxView = rowMax.getView();

/***

* Fetch lambda just returns absolute value of matrix elements.

*/

auto fetch = [] __cuda_callable__( int rowIdx, int columnIdx, const double& value ) -> double

{

return TNL::abs( value );

};

/***

* Reduce lambda return maximum of given values.

*/

auto reduce = [] __cuda_callable__( double& a, const double& b ) -> double

{

return TNL::max( a, b );

};

/***

* Keep lambda store the largest value in each row to the vector rowMax.

*/

auto keep = [ = ] __cuda_callable__( int rowIdx, const double& value ) mutable

{

rowMaxView[ rowIdx ] = value;

};

/***

* Compute the largest values in each row.

*/

matrix.reduceAllRows( fetch, reduce, keep, std::numeric_limits< double >::lowest() );

std::cout << "The matrix reads as: " << std::endl << matrix << std::endl;

std::cout << "Max. elements in rows are: " << rowMax << std::endl;

}

int

main( int argc, char* argv[] )

{

std::cout << "All rows reduction on host:" << std::endl;

reduceAllRows< TNL::Devices::Host >();

#ifdef __CUDACC__

std::cout << "All rows reduction on CUDA device:" << std::endl;

reduceAllRows< TNL::Devices::Cuda >();

#endif

}

TNL::Matrices::SparseMatrixBase::reduceAllRows
std::enable_if_t< Algorithms::SegmentsReductionKernels::isSegmentReductionKernel< SegmentsReductionKernel >::value > reduceAllRows(Fetch &&fetch, const Reduce &reduce, Keep &&keep, const FetchValue &identity, const SegmentsReductionKernel &kernel=SegmentsReductionKernel{}) const
Method for performing general reduction on all matrix rows for constant instances.
Definition SparseMatrixBase.hpp:502

std::numeric_limits::lowest
T lowest(T... args)

TNL::abs
__cuda_callable__ T abs(const T &n)
This function returns absolute value of given number n.
Definition Math.h:74

TNL::max
constexpr ResultType max(const T1 &a, const T2 &b)
This function returns maximum of two numbers.
Definition Math.h:48

Output: All rows reduction on host:

The matrix reads as:

Row: 0 -> 0:1

Row: 1 -> 1:1 2:8

Row: 2 -> 2:1 3:9

Row: 3 -> 3:1 4:9

Row: 4 -> 4:1

Max. elements in rows are: [ 1, 8, 9, 9, 1 ]

All rows reduction on CUDA device:

The matrix reads as:

Row: 0 -> 0:1

Row: 1 -> 1:1 2:8

Row: 2 -> 2:1 3:9

Row: 3 -> 3:1 4:9

Row: 4 -> 4:1

Max. elements in rows are: [ 1, 8, 9, 9, 1 ]

◆ reduceAllRows() [2/2]

template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>

template<typename Fetch, typename Reduce, typename Keep, typename SegmentsReductionKernel>

std::enable_if_t< Algorithms::SegmentsReductionKernels::isSegmentReductionKernel< SegmentsReductionKernel >::value > TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::reduceAllRows	(	Fetch &&	fetch,
		const Reduce &	reduce,
		Keep &&	keep,
		const SegmentsReductionKernel &	kernel = SegmentsReductionKernel{} ) const

Method for performing general reduction on all matrix rows for constant instances with function object instead of lambda function for reduction.

Template Parameters

Fetch is a type of lambda function for data fetch declared as

auto fetch = [] __cuda_callable__ ( IndexType rowIdx, IndexType& columnIdx, RealType& elementValue ) -> FetchValue

{ ... };

The return type of this lambda can be any non void.

Template Parameters

Reduce	is a function object for reduction (some of Function objects for reduction operations).
Keep	is a type of lambda function for storing results of reduction in each row. It is declared as

auto keep = [=] __cuda_callable__ ( IndexType rowIdx, const RealType& value ) { ... };

Template Parameters

FetchValue is type returned by the Fetch lambda function.

Parameters

fetch	is an instance of lambda function for data fetch.
reduce	is an instance of function object for reduction.
keep	in an instance of lambda function for storing results.
kernel	is an instance of the segments reduction kernel to be used for the operation.

Example: #include <iostream>

#include <iomanip>

#include <functional>

#include <TNL/Matrices/SparseMatrix.h>

#include <TNL/Devices/Host.h>

#include <TNL/Devices/Cuda.h>

template< typename Device >

void

reduceAllRows()

{

TNL::Matrices::SparseMatrix< double, Device > matrix(

// number of matrix rows

5,

// number of matrix columns

5,

// matrix elements definition

{

// clang-format off

{ 0, 0, 1 },

{ 1, 1, 1 }, { 1, 2, 8 },

{ 2, 2, 1 }, { 2, 3, 9 },

{ 3, 3, 1 }, { 3, 4, 9 },

{ 4, 4, 1 },

// clang-format on

} );

/***

* Find largest element in each row.

*/

TNL::Containers::Vector< double, Device > rowMax( matrix.getRows() );

/***

* Prepare vector view and matrix view for lambdas.

*/

auto rowMaxView = rowMax.getView();

/***

* Fetch lambda just returns absolute value of matrix elements.

*/

auto fetch = [] __cuda_callable__( int rowIdx, int columnIdx, const double& value ) -> double

{

return TNL::abs( value );

};

/***

* Keep lambda store the largest value in each row to the vector rowMax.

*/

auto keep = [ = ] __cuda_callable__( int rowIdx, const double& value ) mutable

{

rowMaxView[ rowIdx ] = value;

};

/***

* Compute the largest values in each row.

*/

matrix.reduceAllRows( fetch, TNL::Max{}, keep );

std::cout << "The matrix reads as: " << std::endl << matrix << std::endl;

std::cout << "Max. elements in rows are: " << rowMax << std::endl;

}

int

main( int argc, char* argv[] )

{

std::cout << "All rows reduction on host:" << std::endl;

reduceAllRows< TNL::Devices::Host >();

#ifdef __CUDACC__

std::cout << "All rows reduction on CUDA device:" << std::endl;

reduceAllRows< TNL::Devices::Cuda >();

#endif

}

TNL::Max
Function object implementing max(x, y).
Definition Functional.h:272

Output: All rows reduction on host:

The matrix reads as:

Row: 0 -> 0:1

Row: 1 -> 1:1 2:8

Row: 2 -> 2:1 3:9

Row: 3 -> 3:1 4:9

Row: 4 -> 4:1

Max. elements in rows are: [ 1, 8, 9, 9, 1 ]

All rows reduction on CUDA device:

The matrix reads as:

Row: 0 -> 0:1

Row: 1 -> 1:1 2:8

Row: 2 -> 2:1 3:9

Row: 3 -> 3:1 4:9

Row: 4 -> 4:1

Max. elements in rows are: [ 1, 8, 9, 9, 1 ]

◆ reduceRows() [1/2]

template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>

template<typename Fetch, typename Reduce, typename Keep, typename FetchValue, typename SegmentsReductionKernel>

std::enable_if_t< Algorithms::SegmentsReductionKernels::isSegmentReductionKernel< SegmentsReductionKernel >::value > TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::reduceRows	(	IndexType	begin,
		IndexType	end,
		Fetch &&	fetch,
		const Reduce &	reduce,
		Keep &&	keep,
		const FetchValue &	identity,
		const SegmentsReductionKernel &	kernel = SegmentsReductionKernel{} ) const

Method for performing general reduction on matrix rows for constant instances.

Template Parameters

Fetch is a type of lambda function for data fetch declared as

auto fetch = [] __cuda_callable__ ( IndexType rowIdx, IndexType& columnIdx, RealType& elementValue ) -> FetchValue

{ ... };

The return type of this lambda can be any non void.

Template Parameters

Reduce is a function object for reduction (some of Function objects for reduction operations) or a lambda function defined as

auto reduce = [] __cuda_callable__ ( const FetchValue& v1, const FetchValue& v2 ) -> FetchValue { ... };

Template Parameters

Keep	is a type of lambda function for storing results of reduction in each row. It is declared as

auto keep = [=] __cuda_callable__ ( IndexType rowIdx, const RealType& value ) { ... };

Template Parameters

FetchValue is type returned by the Fetch lambda function.

Parameters

begin	defines beginning of the range `[begin, end)` of rows to be processed.
end	defines ending of the range `[begin, end)` of rows to be processed.
fetch	is an instance of lambda function for data fetch.
reduce	is an instance of lambda function for reduction.
keep	in an instance of lambda function for storing results.
identity	is the identity element for the reduction operation, i.e. element which does not change the result of the reduction.
kernel	is an instance of the segments reduction kernel to be used for the operation.

Example: #include <iostream>

#include <iomanip>

#include <functional>

#include <TNL/Matrices/SparseMatrix.h>

#include <TNL/Devices/Host.h>

template< typename Device >

void

reduceRows()

{

TNL::Matrices::SparseMatrix< double, Device > matrix(

// number of matrix rows

5,

// number of matrix columns

5,

// matrix elements definition

{

// clang-format off

{ 0, 0, 1 },

{ 1, 1, 1 }, { 1, 2, 8 },

{ 2, 2, 1 }, { 2, 3, 9 },

{ 3, 3, 1 }, { 3, 4, 9 },

{ 4, 4, 1 },

// clang-format on

} );

/***

* Find largest element in each row.

*/

TNL::Containers::Vector< double, Device > rowMax( matrix.getRows() );

/***

* Prepare vector view for lambdas.

*/

auto rowMaxView = rowMax.getView();

/***

* Fetch lambda just returns absolute value of matrix elements.

*/

auto fetch = [] __cuda_callable__( int rowIdx, int columnIdx, const double& value ) -> double

{

return TNL::abs( value );

};

/***

* Reduce lambda return maximum of given values.

*/

auto reduce = [] __cuda_callable__( double& a, const double& b ) -> double

{

return TNL::max( a, b );

};

/***

* Keep lambda store the largest value in each row to the vector rowMax.

*/

auto keep = [ = ] __cuda_callable__( int rowIdx, const double& value ) mutable

{

rowMaxView[ rowIdx ] = value;

};

/***

* Compute the largest values in each row.

*/

matrix.reduceRows( 0, matrix.getRows(), fetch, reduce, keep, std::numeric_limits< double >::lowest() );

std::cout << "The matrix reads as: " << std::endl << matrix << std::endl;

std::cout << "Max. elements in rows are: " << rowMax << std::endl;

}

int

main( int argc, char* argv[] )

{

std::cout << "Rows reduction on host:" << std::endl;

reduceRows< TNL::Devices::Host >();

#ifdef __CUDACC__

std::cout << "Rows reduction on CUDA device:" << std::endl;

reduceRows< TNL::Devices::Cuda >();

#endif

}

TNL::Matrices::SparseMatrixBase::reduceRows
std::enable_if_t< Algorithms::SegmentsReductionKernels::isSegmentReductionKernel< SegmentsReductionKernel >::value > reduceRows(IndexType begin, IndexType end, Fetch &&fetch, const Reduce &reduce, Keep &&keep, const FetchValue &identity, const SegmentsReductionKernel &kernel=SegmentsReductionKernel{}) const
Method for performing general reduction on matrix rows for constant instances.
Definition SparseMatrixBase.hpp:451

Output: Rows reduction on host:

The matrix reads as:

Row: 0 -> 0:1

Row: 1 -> 1:1 2:8

Row: 2 -> 2:1 3:9

Row: 3 -> 3:1 4:9

Row: 4 -> 4:1

Max. elements in rows are: [ 1, 8, 9, 9, 1 ]

Rows reduction on CUDA device:

The matrix reads as:

Row: 0 -> 0:1

Row: 1 -> 1:1 2:8

Row: 2 -> 2:1 3:9

Row: 3 -> 3:1 4:9

Row: 4 -> 4:1

Max. elements in rows are: [ 1, 8, 9, 9, 1 ]

◆ reduceRows() [2/2]

template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>

template<typename Fetch, typename Reduce, typename Keep, typename SegmentsReductionKernel>

std::enable_if_t< Algorithms::SegmentsReductionKernels::isSegmentReductionKernel< SegmentsReductionKernel >::value > TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::reduceRows	(	IndexType	begin,
		IndexType	end,
		Fetch &&	fetch,
		const Reduce &	reduce,
		Keep &&	keep,
		const SegmentsReductionKernel &	kernel = SegmentsReductionKernel{} ) const

Method for performing general reduction on matrix rows for constant instances with function object instead of lamda function for reduction.

Template Parameters

Fetch is a type of lambda function for data fetch declared as

auto fetch = [] __cuda_callable__ ( IndexType rowIdx, IndexType& columnIdx, RealType& elementValue ) -> FetchValue

{ ... };

The return type of this lambda can be any non void.

Template Parameters

Reduce	is a function object for reduction (some of Function objects for reduction operations).
Keep	is a type of lambda function for storing results of reduction in each row. It is declared as

auto keep = [=] __cuda_callable__ ( IndexType rowIdx, const RealType& value ) { ... };

Template Parameters

FetchValue is type returned by the Fetch lambda function.

Parameters

begin	defines beginning of the range `[begin, end)` of rows to be processed.
end	defines ending of the range `[begin, end)` of rows to be processed.
fetch	is an instance of lambda function for data fetch.
reduce	is an instance of function object for reduction.
keep	in an instance of lambda function for storing results.
kernel	is an instance of the segments reduction kernel to be used for the operation.

Example: #include <iostream>

#include <iomanip>

#include <functional>

#include <TNL/Matrices/SparseMatrix.h>

#include <TNL/Devices/Host.h>

template< typename Device >

void

reduceRows()

{

TNL::Matrices::SparseMatrix< double, Device > matrix(

// number of matrix rows

5,

// number of matrix columns

5,

// matrix elements definition

{

// clang-format off

{ 0, 0, 1 },

{ 1, 1, 1 }, { 1, 2, 8 },

{ 2, 2, 1 }, { 2, 3, 9 },

{ 3, 3, 1 }, { 3, 4, 9 },

{ 4, 4, 1 },

// clang-format on

} );

/***

* Find largest element in each row.

*/

TNL::Containers::Vector< double, Device > rowMax( matrix.getRows() );

/***

* Prepare vector view for lambdas.

*/

auto rowMaxView = rowMax.getView();

/***

* Fetch lambda just returns absolute value of matrix elements.

*/

auto fetch = [] __cuda_callable__( int rowIdx, int columnIdx, const double& value ) -> double

{

return TNL::abs( value );

};

/***

* Keep lambda store the largest value in each row to the vector rowMax.

*/

auto keep = [ = ] __cuda_callable__( int rowIdx, const double& value ) mutable

{

rowMaxView[ rowIdx ] = value;

};

/***

* Compute the largest values in each row.

*/

matrix.reduceRows( 0, matrix.getRows(), fetch, TNL::Max{}, keep );

std::cout << "The matrix reads as: " << std::endl << matrix << std::endl;

std::cout << "Max. elements in rows are: " << rowMax << std::endl;

}

int

main( int argc, char* argv[] )

{

std::cout << "Rows reduction on host:" << std::endl;

reduceRows< TNL::Devices::Host >();

#ifdef __CUDACC__

std::cout << "Rows reduction on CUDA device:" << std::endl;

reduceRows< TNL::Devices::Cuda >();

#endif

}

Output: Rows reduction on host:

The matrix reads as:

Row: 0 -> 0:1

Row: 1 -> 1:1 2:8

Row: 2 -> 2:1 3:9

Row: 3 -> 3:1 4:9

Row: 4 -> 4:1

Max. elements in rows are: [ 1, 8, 9, 9, 1 ]

Rows reduction on CUDA device:

The matrix reads as:

Row: 0 -> 0:1

Row: 1 -> 1:1 2:8

Row: 2 -> 2:1 3:9

Row: 3 -> 3:1 4:9

Row: 4 -> 4:1

Max. elements in rows are: [ 1, 8, 9, 9, 1 ]

◆ sequentialForAllRows() [1/2]

template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>

template<typename Function>

void TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::sequentialForAllRows ( Function && function )

This method calls sequentialForRows for all matrix rows.

See SparseMatrixBase::sequentialForAllRows.

Template Parameters

Function is a type of lambda function that will operate on matrix elements.

Parameters

function is an instance of the lambda function to be called in each row.

◆ sequentialForAllRows() [2/2]

template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>

template<typename Function>

void TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::sequentialForAllRows ( Function && function ) const

This method calls sequentialForRows for all matrix rows (for constant instances).

See SparseMatrixBase::sequentialForRows.

Template Parameters

Function is a type of lambda function that will operate on matrix elements.

Parameters

function is an instance of the lambda function to be called in each row.

◆ sequentialForRows() [1/2]

template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>

template<typename Function>

void TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::sequentialForRows	(	IndexType	begin,
		IndexType	end,
		Function &&	function )

Method for sequential iteration over all matrix rows for non-constant instances.

Template Parameters

Function is type of lambda function that will operate on matrix elements. It should have form like

auto function = [] __cuda_callable__ ( RowView& row ) { ... };

RowView represents matrix row - see TNL::Matrices::SparseMatrixBase::RowView.

Parameters

begin	defines beginning of the range `[begin, end)` of rows to be processed.
end	defines ending of the range `[begin, end)` of rows to be processed.
function	is an instance of the lambda function to be called in each row.

◆ sequentialForRows() [2/2]

template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>

template<typename Function>

void TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::sequentialForRows	(	IndexType	begin,
		IndexType	end,
		Function &&	function ) const

Method for sequential iteration over all matrix rows for constant instances.

Template Parameters

Function is type of lambda function that will operate on matrix elements. It should have form like

auto function = [] __cuda_callable__ ( const ConstRowView& row ) { ... };

ConstRowView represents matrix row - see TNL::Matrices::SparseMatrixBase::ConstRowView.

Parameters

begin	defines beginning of the range `[begin, end)` of rows to be processed.
end	defines ending of the range `[begin, end)` of rows to be processed.
function	is an instance of the lambda function to be called in each row.

◆ setElement()

template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>

__cuda_callable__ void TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::setElement	(	IndexType	row,
		IndexType	column,
		const RealType &	value )

Sets element at given row and column to given value.

Parameters

row	is row index of the element.
column	is columns index of the element.
value	is the value the element will be set to.

Example: #include <iostream>

#include <TNL/Algorithms/parallelFor.h>

#include <TNL/Matrices/SparseMatrix.h>

#include <TNL/Devices/Host.h>

#include <TNL/Devices/Cuda.h>

template< typename Device >

void

setElements()

{

TNL::Matrices::SparseMatrix< double, Device > matrix( { 1, 1, 1, 1, 1 }, 5 );

/****

* Get the matrix view.

*/

auto view = matrix.getView();

for( int i = 0; i < 5; i++ )

view.setElement( i, i, i );

std::cout << "Matrix set from the host:" << std::endl;

std::cout << matrix << std::endl;

auto f = [ = ] __cuda_callable__( int i ) mutable

{

view.setElement( i, i, -i );

};

TNL::Algorithms::parallelFor< Device >( 0, 5, f );

std::cout << "Matrix set from its native device:" << std::endl;

std::cout << matrix << std::endl;

}

int

main( int argc, char* argv[] )

{

std::cout << "Set elements on host:" << std::endl;

setElements< TNL::Devices::Host >();

#ifdef __CUDACC__

std::cout << "Set elements on CUDA device:" << std::endl;

setElements< TNL::Devices::Cuda >();

#endif

}

Output: Set elements on host:

Matrix set from the host:

Row: 0 ->

Row: 1 -> 1:1

Row: 2 -> 2:2

Row: 3 -> 3:3

Row: 4 -> 4:4

Matrix set from its native device:

Row: 0 ->

Row: 1 -> 1:-1

Row: 2 -> 2:-2

Row: 3 -> 3:-3

Row: 4 -> 4:-4

Set elements on CUDA device:

Matrix set from the host:

Row: 0 ->

Row: 1 -> 1:1

Row: 2 -> 2:2

Row: 3 -> 3:3

Row: 4 -> 4:4

Matrix set from its native device:

Row: 0 ->

Row: 1 -> 1:-1

Row: 2 -> 2:-2

Row: 3 -> 3:-3

Row: 4 -> 4:-4

◆ transposedVectorProduct()

template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>

template<typename InVector, typename OutVector>

void TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::transposedVectorProduct	(	const InVector &	inVector,
		OutVector &	outVector,
		ComputeReal	matrixMultiplicator = 1.0,
		ComputeReal	outVectorMultiplicator = 0.0,
		Index	begin = 0,
		Index	end = 0 ) const

Computes product of transposed matrix and vector.

Template Parameters

InVector	is type of input vector. It can be TNL::Containers::Vector, TNL::Containers::VectorView, TNL::Containers::Array, TNL::Containers::ArrayView, or similar container.
OutVector	is type of output vector. It can be TNL::Containers::Vector, TNL::Containers::VectorView, TNL::Containers::Array, TNL::Containers::ArrayView, or similar container.

Parameters

inVector	is input vector.
outVector	is output vector.
matrixMultiplicator	is a factor by which the matrix is multiplied. It is one by default.
outVectorMultiplicator	is a factor by which the outVector is multiplied before added
begin	is the beginning of the rows range for which the vector product is computed. It is zero by default.
end	is the end of the rows range for which the vector product is computed. It is number if the matrix rows by default.

◆ vectorProduct()

template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>

template<typename InVector, typename OutVector, typename SegmentsReductionKernel>

void TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::vectorProduct	(	const InVector &	inVector,
		OutVector &	outVector,
		ComputeRealType	matrixMultiplicator = 1.0,
		ComputeRealType	outVectorMultiplicator = 0.0,
		IndexType	begin = 0,
		IndexType	end = 0,
		const SegmentsReductionKernel &	kernel = SegmentsReductionKernel{} ) const

Computes product of matrix and vector.

More precisely, it computes:

outVector = matrixMultiplicator * ( * this ) * inVector + outVectorMultiplicator * outVector

Template Parameters

InVector	is type of input vector. It can be TNL::Containers::Vector, TNL::Containers::VectorView, TNL::Containers::Array, TNL::Containers::ArrayView, or similar container.
OutVector	is type of output vector. It can be TNL::Containers::Vector, TNL::Containers::VectorView, TNL::Containers::Array, TNL::Containers::ArrayView, or similar container.

Parameters

inVector	is input vector.
outVector	is output vector.
matrixMultiplicator	is a factor by which the matrix is multiplied. It is one by default.
outVectorMultiplicator	is a factor by which the outVector is multiplied before added to the result of matrix-vector product. It is zero by default.
begin	is the beginning of the rows range for which the vector product is computed. It is zero by default.
end	is the end of the rows range for which the vector product is computed. It is number if the matrix rows by default.
kernel	is an instance of the segments reduction kernel to be used for the operation.

The documentation for this class was generated from the following files:

src/TNL/Matrices/SparseMatrixBase.h
src/TNL/Matrices/SparseMatrixBase.hpp

Public Types

Public Member Functions

Static Public Member Functions

Protected Member Functions

Protected Attributes

Detailed Description

Member Typedef Documentation

◆ ColumnIndexesViewType

◆ DefaultSegmentsReductionKernel

◆ RowView

Constructor & Destructor Documentation

◆ SparseMatrixBase() [1/3]

◆ SparseMatrixBase() [2/3]

◆ SparseMatrixBase() [3/3]

Member Function Documentation

◆ addElement()

◆ bind()

◆ findElement()

◆ forAllElements() [1/2]

◆ forAllElements() [2/2]

◆ forAllRows() [1/2]

◆ forAllRows() [2/2]

◆ forElements() [1/2]

◆ forElements() [2/2]

◆ forRows() [1/2]

◆ forRows() [2/2]

◆ getColumnIndexes() [1/2]

◆ getColumnIndexes() [2/2]

◆ getCompressedRowLengths()

◆ getElement()

◆ getNonzeroElementsCount()

◆ getRow() [1/2]

◆ getRow() [2/2]

◆ getRowCapacities()

◆ getRowCapacity()

◆ getSegments() [1/2]

◆ getSegments() [2/2]

◆ getSerializationType()

◆ operator!=()

◆ operator=()

◆ operator==()

◆ print()

◆ reduceAllRows() [1/2]

◆ reduceAllRows() [2/2]

◆ reduceRows() [1/2]

◆ reduceRows() [2/2]

◆ sequentialForAllRows() [1/2]

◆ sequentialForAllRows() [2/2]

◆ sequentialForRows() [1/2]

◆ sequentialForRows() [2/2]

◆ setElement()

◆ transposedVectorProduct()

◆ vectorProduct()