Template Numerical Library version\ main:bb09b17
Loading...
Searching...
No Matches
TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal > Class Template Reference

Implementation of sparse matrix view. More...

#include <TNL/Matrices/SparseMatrixBase.h>

Inheritance diagram for TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >:
Collaboration diagram for TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >:

Public Types

using ColumnIndexesViewType
 
using ComputeRealType = ComputeReal
 
using ConstRowView = typename RowView::ConstRowView
 Type for accessing constant matrix rows.
 
using DefaultSegmentsReductionKernel = typename Algorithms::SegmentsReductionKernels::DefaultKernel< SegmentsView >::type
 Type of the kernel used for parallel reductions on segments.
 
using DeviceType = Device
 The device where the matrix is allocated.
 
using IndexType = Index
 The type used for matrix elements indexing.
 
using RealType = typename Base::RealType
 The type of matrix elements.
 
using RowView
 Type for accessing matrix rows.
 
using SegmentsViewType = SegmentsView
 Type of segments view used by this matrix. It represents the sparse matrix format.
 
- Public Types inherited from TNL::Matrices::MatrixBase< Real, Device, Index, MatrixType, SegmentsView::getOrganization() >
using ConstValuesViewType
 Type of constant vector view holding values of matrix elements.
 
using DeviceType
 The device where the matrix is allocated.
 
using IndexType
 The type used for matrix elements indexing.
 
using RealType
 The type of matrix elements.
 
using RowCapacitiesType
 
using ValuesViewType
 Type of vector view holding values of matrix elements.
 

Public Member Functions

__cuda_callable__ SparseMatrixBase ()=default
 Constructor with no parameters.
 
__cuda_callable__ SparseMatrixBase (const SparseMatrixBase &matrix)=default
 Copy constructor.
 
__cuda_callable__ SparseMatrixBase (IndexType rows, IndexType columns, typename Base::ValuesViewType values, ColumnIndexesViewType columnIndexes, SegmentsViewType segments)
 Constructor with all necessary data and views.
 
__cuda_callable__ SparseMatrixBase (SparseMatrixBase &&matrix) noexcept=default
 Move constructor.
 
__cuda_callable__ void addElement (IndexType row, IndexType column, const RealType &value, const RealType &thisElementMultiplicator=1.0)
 Add element at given row and column to given value.
 
__cuda_callable__ IndexType findElement (IndexType row, IndexType column) const
 Finds element in the matrix and returns its position in the arrays with values and columnIndexes.
 
template<typename Function >
void forAllElements (Function &&function)
 This method calls forElements for all matrix rows.
 
template<typename Function >
void forAllElements (Function &&function) const
 This method calls forElements for all matrix rows (for constant instances).
 
template<typename Function >
void forAllRows (Function &&function)
 Method for parallel iteration over all matrix rows.
 
template<typename Function >
void forAllRows (Function &&function) const
 Method for parallel iteration over all matrix rows for constant instances.
 
template<typename Function >
void forElements (IndexType begin, IndexType end, Function &&function)
 Method for iteration over all matrix rows for non-constant instances.
 
template<typename Function >
void forElements (IndexType begin, IndexType end, Function &&function) const
 Method for iteration over all matrix rows for constant instances.
 
template<typename Function >
void forRows (IndexType begin, IndexType end, Function &&function)
 Method for parallel iteration over matrix rows from interval [begin, end).
 
template<typename Function >
void forRows (IndexType begin, IndexType end, Function &&function) const
 Method for parallel iteration over matrix rows from interval [begin, end) for constant instances.
 
ColumnIndexesViewTypegetColumnIndexes ()
 Getter of column indexes for nonconstant instances.
 
const ColumnIndexesViewTypegetColumnIndexes () const
 Getter of column indexes for constant instances.
 
template<typename Vector >
void getCompressedRowLengths (Vector &rowLengths) const
 Computes number of non-zeros in each row.
 
__cuda_callable__ RealType getElement (IndexType row, IndexType column) const
 Returns value of matrix element at position given by its row and column index.
 
IndexType getNonzeroElementsCount () const override
 Returns number of non-zero matrix elements.
 
__cuda_callable__ RowView getRow (IndexType rowIdx)
 Non-constant getter of simple structure for accessing given matrix row.
 
__cuda_callable__ ConstRowView getRow (IndexType rowIdx) const
 Constant getter of simple structure for accessing given matrix row.
 
template<typename Vector >
void getRowCapacities (Vector &rowCapacities) const
 Compute capacities of all rows.
 
__cuda_callable__ IndexType getRowCapacity (IndexType row) const
 Returns capacity of given matrix row.
 
SegmentsViewTypegetSegments ()
 Getter of segments for non-constant instances.
 
const SegmentsViewTypegetSegments () const
 Getter of segments for constant instances.
 
template<typename Matrix >
bool operator!= (const Matrix &matrix) const
 Comparison operator with another arbitrary matrix type.
 
SparseMatrixBaseoperator= (const SparseMatrixBase &)=delete
 Copy-assignment operator.
 
SparseMatrixBaseoperator= (SparseMatrixBase &&)=delete
 Move-assignment operator.
 
template<typename Matrix >
bool operator== (const Matrix &matrix) const
 Comparison operator with another arbitrary matrix type.
 
void print (std::ostream &str) const
 Method for printing the matrix to output stream.
 
template<typename Fetch , typename Reduce , typename Keep , typename FetchValue , typename SegmentsReductionKernel = DefaultSegmentsReductionKernel>
void reduceAllRows (Fetch &&fetch, const Reduce &reduce, Keep &&keep, const FetchValue &identity, const SegmentsReductionKernel &kernel=SegmentsReductionKernel{}) const
 Method for performing general reduction on all matrix rows for constant instances.
 
template<typename Fetch , typename Reduce , typename Keep , typename FetchValue , typename SegmentsReductionKernel = DefaultSegmentsReductionKernel>
void reduceRows (IndexType begin, IndexType end, Fetch &&fetch, const Reduce &reduce, Keep &&keep, const FetchValue &identity, const SegmentsReductionKernel &kernel=SegmentsReductionKernel{}) const
 Method for performing general reduction on matrix rows for constant instances.
 
template<typename Function >
void sequentialForAllRows (Function &&function)
 This method calls sequentialForRows for all matrix rows.
 
template<typename Function >
void sequentialForAllRows (Function &&function) const
 This method calls sequentialForRows for all matrix rows (for constant instances).
 
template<typename Function >
void sequentialForRows (IndexType begin, IndexType end, Function &&function)
 Method for sequential iteration over all matrix rows for non-constant instances.
 
template<typename Function >
void sequentialForRows (IndexType begin, IndexType end, Function &&function) const
 Method for sequential iteration over all matrix rows for constant instances.
 
__cuda_callable__ void setElement (IndexType row, IndexType column, const RealType &value)
 Sets element at given row and column to given value.
 
void sortColumnIndexes ()
 Sort matrix elements in each row by column indexes in ascending order.
 
template<typename InVector , typename OutVector >
void transposedVectorProduct (const InVector &inVector, OutVector &outVector, ComputeReal matrixMultiplicator=1.0, ComputeReal outVectorMultiplicator=0.0, Index begin=0, Index end=0) const
 Computes product of transposed matrix and vector.
 
template<typename InVector , typename OutVector , typename SegmentsReductionKernel = DefaultSegmentsReductionKernel>
void vectorProduct (const InVector &inVector, OutVector &outVector, ComputeRealType matrixMultiplicator=1.0, ComputeRealType outVectorMultiplicator=0.0, IndexType begin=0, IndexType end=0, const SegmentsReductionKernel &kernel=SegmentsReductionKernel{}) const
 Computes product of matrix and vector.
 
template<typename InVector , typename OutVector , typename SegmentsReductionKernel , typename... , std::enable_if_t< ! std::is_convertible_v< SegmentsReductionKernel, ComputeRealType >, bool > = true>
void vectorProduct (const InVector &inVector, OutVector &outVector, const SegmentsReductionKernel &kernel) const
 
- Public Member Functions inherited from TNL::Matrices::MatrixBase< Real, Device, Index, MatrixType, SegmentsView::getOrganization() >
__cuda_callable__ MatrixBase ()=default
 Basic constructor with no parameters.
 
__cuda_callable__ MatrixBase (const MatrixBase &view)=default
 Shallow copy constructor.
 
__cuda_callable__ MatrixBase (IndexType rows, IndexType columns, ValuesViewType values)
 Constructor with matrix dimensions and matrix elements values.
 
__cuda_callable__ MatrixBase (MatrixBase &&view) noexcept=default
 Move constructor.
 
IndexType getAllocatedElementsCount () const
 Tells the number of allocated matrix elements.
 
__cuda_callable__ IndexType getColumns () const
 Returns number of matrix columns.
 
__cuda_callable__ IndexType getRows () const
 Returns number of matrix rows.
 
__cuda_callable__ ValuesViewTypegetValues ()
 Returns a reference to a vector with the matrix elements values.
 
__cuda_callable__ const ValuesViewTypegetValues () const
 Returns a constant reference to a vector with the matrix elements values.
 
bool operator!= (const Matrix &matrix) const
 Comparison operator with another arbitrary matrix view type.
 
bool operator!= (const MatrixT &matrix) const
 
__cuda_callable__ MatrixBaseoperator= (const MatrixBase &)=delete
 Copy-assignment operator.
 
__cuda_callable__ MatrixBaseoperator= (MatrixBase &&)=delete
 Move-assignment operator.
 
bool operator== (const Matrix &matrix) const
 Comparison operator with another arbitrary matrix view type.
 
bool operator== (const MatrixT &matrix) const
 

Static Public Member Functions

static std::string getSerializationType ()
 Returns string with serialization type.
 
- Static Public Member Functions inherited from TNL::Matrices::MatrixBase< Real, Device, Index, MatrixType, SegmentsView::getOrganization() >
static constexpr ElementsOrganization getOrganization ()
 Matrix elements organization getter.
 
static constexpr bool isBinary ()
 Test of binary matrix type.
 
static constexpr bool isSymmetric ()
 Test of symmetric matrix type.
 

Protected Member Functions

__cuda_callable__ void bind (IndexType rows, IndexType columns, typename Base::ValuesViewType values, ColumnIndexesViewType columnIndexes, SegmentsViewType segments)
 Re-initializes the internal attributes of the base class.
 
- Protected Member Functions inherited from TNL::Matrices::MatrixBase< Real, Device, Index, MatrixType, SegmentsView::getOrganization() >
__cuda_callable__ void bind (IndexType rows, IndexType columns, ValuesViewType values)
 Re-initializes the internal attributes of the base class.
 

Protected Attributes

ColumnIndexesViewType columnIndexes
 
SegmentsViewType segments
 
- Protected Attributes inherited from TNL::Matrices::MatrixBase< Real, Device, Index, MatrixType, SegmentsView::getOrganization() >
IndexType columns
 
IndexType rows
 
ValuesViewType values
 

Detailed Description

template<typename Real, typename Device, typename Index, typename MatrixType, typename SegmentsView, typename ComputeReal>
class TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >

Implementation of sparse matrix view.

It serves as an accessor to SparseMatrix for example when passing the matrix to lambda functions. SparseMatrix view can be also created in CUDA kernels.

Template Parameters
Realis a type of matrix elements. If Real equals bool the matrix is treated as binary and so the matrix elements values are not stored in the memory since we need to remember only coordinates of non-zero elements (which equal one).
Deviceis a device where the matrix is allocated.
Indexis a type for indexing of the matrix elements.
MatrixTypespecifies a symmetry of matrix. See MatrixType. Symmetric matrices store only lower part of the matrix and its diagonal. The upper part is reconstructed on the fly. GeneralMatrix with no symmetry is used by default.
Segmentsis a structure representing the sparse matrix format. Depending on the pattern of the non-zero elements different matrix formats can perform differently especially on GPUs. By default Algorithms::Segments::CSR format is used. See also Algorithms::Segments::Ellpack, Algorithms::Segments::SlicedEllpack, Algorithms::Segments::ChunkedEllpack, and Algorithms::Segments::BiEllpack.
ComputeRealis the same as Real mostly but for binary matrices it is set to Index type. This can be changed by the user, of course.

Member Typedef Documentation

◆ ColumnIndexesViewType

template<typename Real , typename Device , typename Index , typename MatrixType , typename SegmentsView , typename ComputeReal >
using TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::ColumnIndexesViewType
Initial value:
Containers::VectorView< typename TNL::copy_const< Index >::template from< Real >::type, Device, Index >

◆ DefaultSegmentsReductionKernel

template<typename Real , typename Device , typename Index , typename MatrixType , typename SegmentsView , typename ComputeReal >
using TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::DefaultSegmentsReductionKernel = typename Algorithms::SegmentsReductionKernels::DefaultKernel< SegmentsView >::type

Type of the kernel used for parallel reductions on segments.

We are assuming that the default segments reduction kernel provides a static reduceAllSegments method, and thus it does not have to be instantiated and initialized. If the user wants to use a more complicated kernel, such as CSRAdaptive, it must be instantiated and initialized by the user and the object must be passed to the vectorProduct or reduceRows method.

◆ RowView

template<typename Real , typename Device , typename Index , typename MatrixType , typename SegmentsView , typename ComputeReal >
using TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::RowView
Initial value:
SparseMatrixRowView< typename SegmentsViewType::SegmentViewType, typename Base::ValuesViewType, ColumnIndexesViewType >

Type for accessing matrix rows.

Constructor & Destructor Documentation

◆ SparseMatrixBase() [1/3]

template<typename Real , typename Device , typename Index , typename MatrixType , typename SegmentsView , typename ComputeReal >
__cuda_callable__ TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::SparseMatrixBase ( IndexType rows,
IndexType columns,
typename Base::ValuesViewType values,
ColumnIndexesViewType columnIndexes,
SegmentsViewType segments )

Constructor with all necessary data and views.

Parameters
rowsis a number of matrix rows.
columnsis a number of matrix columns.
valuesis a vector view with matrix elements values.
columnIndexesis a vector view with matrix elements column indexes.
segmentsis a segments view representing the sparse matrix format.

◆ SparseMatrixBase() [2/3]

template<typename Real , typename Device , typename Index , typename MatrixType , typename SegmentsView , typename ComputeReal >
__cuda_callable__ TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::SparseMatrixBase ( const SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal > & matrix)
default

Copy constructor.

Parameters
matrixis an input sparse matrix view.

◆ SparseMatrixBase() [3/3]

template<typename Real , typename Device , typename Index , typename MatrixType , typename SegmentsView , typename ComputeReal >
__cuda_callable__ TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::SparseMatrixBase ( SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal > && matrix)
defaultnoexcept

Move constructor.

Parameters
matrixis an input sparse matrix view.

Member Function Documentation

◆ addElement()

template<typename Real , typename Device , typename Index , typename MatrixType , typename SegmentsView , typename ComputeReal >
__cuda_callable__ void TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::addElement ( IndexType row,
IndexType column,
const RealType & value,
const RealType & thisElementMultiplicator = 1.0 )

Add element at given row and column to given value.

This method can be called from the host system (CPU) no matter where the matrix is allocated. If the matrix is allocated on GPU this method can be called even from device kernels. If the matrix is allocated in GPU device this method is called from CPU, it transfers values of each matrix element separately and so the performance is very low. For higher performance see. SparseMatrix::getRow or SparseMatrix::forElements and SparseMatrix::forAllElements. The call may fail if the matrix row capacity is exhausted.

Parameters
rowis row index of the element.
columnis columns index of the element.
valueis the value the element will be set to.
thisElementMultiplicatoris multiplicator the original matrix element value is multiplied by before addition of given value.
Example
#include <iostream>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Devices/Host.h>
template< typename Device >
void
addElements()
{
TNL::Matrices::SparseMatrix< double, Device > matrix( { 5, 5, 5, 5, 5 }, 5 );
auto matrixView = matrix.getView();
for( int i = 0; i < 5; i++ )
matrixView.setElement( i, i, i ); // or matrix.setElement
std::cout << "Initial matrix is: " << std::endl << matrix << std::endl;
for( int i = 0; i < 5; i++ )
for( int j = 0; j < 5; j++ )
matrixView.addElement( i, j, 1.0, 5.0 ); // or matrix.addElement
std::cout << "Matrix after addition is: " << std::endl << matrix << std::endl;
}
int
main( int argc, char* argv[] )
{
std::cout << "Add elements on host:" << std::endl;
addElements< TNL::Devices::Host >();
#ifdef __CUDACC__
std::cout << "Add elements on CUDA device:" << std::endl;
addElements< TNL::Devices::Cuda >();
#endif
}
__cuda_callable__ void setElement(IndexType row, IndexType column, const RealType &value)
Sets element at given row and column to given value.
Definition SparseMatrixBase.hpp:148
__cuda_callable__ void addElement(IndexType row, IndexType column, const RealType &value, const RealType &thisElementMultiplicator=1.0)
Add element at given row and column to given value.
Definition SparseMatrixBase.hpp:158
Implementation of sparse matrix, i.e. matrix storing only non-zero elements.
Definition SparseMatrix.h:57
ViewType getView()
Returns a modifiable view of the sparse matrix.
Definition SparseMatrix.hpp:165
T endl(T... args)
Output
Add elements on host:
Initial matrix is:
Row: 0 ->
Row: 1 -> 1:1
Row: 2 -> 2:2
Row: 3 -> 3:3
Row: 4 -> 4:4
Matrix after addition is:
Row: 0 -> 0:1 1:1 2:1 3:1 4:1
Row: 1 -> 0:1 1:6 2:1 3:1 4:1
Row: 2 -> 0:1 1:1 2:11 3:1 4:1
Row: 3 -> 0:1 1:1 2:1 3:16 4:1
Row: 4 -> 0:1 1:1 2:1 3:1 4:21
Add elements on CUDA device:
Initial matrix is:
Row: 0 ->
Row: 1 -> 1:1
Row: 2 -> 2:2
Row: 3 -> 3:3
Row: 4 -> 4:4
Matrix after addition is:
Row: 0 -> 0:1 1:1 2:1 3:1 4:1
Row: 1 -> 0:1 1:6 2:1 3:1 4:1
Row: 2 -> 0:1 1:1 2:11 3:1 4:1
Row: 3 -> 0:1 1:1 2:1 3:16 4:1
Row: 4 -> 0:1 1:1 2:1 3:1 4:21

◆ bind()

template<typename Real , typename Device , typename Index , typename MatrixType , typename SegmentsView , typename ComputeReal >
__cuda_callable__ void TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::bind ( IndexType rows,
IndexType columns,
typename Base::ValuesViewType values,
ColumnIndexesViewType columnIndexes,
SegmentsViewType segments )
protected

Re-initializes the internal attributes of the base class.

Note that this function is protected to ensure that the user cannot modify the base class of a matrix. For the same reason, in future code development we also need to make sure that all non-const functions in the base class return by value and not by reference.

◆ findElement()

template<typename Real , typename Device , typename Index , typename MatrixType , typename SegmentsView , typename ComputeReal >
__cuda_callable__ Index TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::findElement ( IndexType row,
IndexType column ) const
nodiscard

Finds element in the matrix and returns its position in the arrays with values and columnIndexes.

If the element is not found, the method returns the padding index.

Parameters
rowis the row index of the element.
columnis the column index of the element.
Returns
the position of the element in the arrays with values and columnIndexes or the padding index if the element is not found.

◆ forAllElements() [1/2]

template<typename Real , typename Device , typename Index , typename MatrixType , typename SegmentsView , typename ComputeReal >
template<typename Function >
void TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::forAllElements ( Function && function)

This method calls forElements for all matrix rows.

See SparseMatrix::forElements.

Template Parameters
Functionis a type of lambda function that will operate on matrix elements.
Parameters
functionis an instance of the lambda function to be called in each row.
Example
#include <iostream>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>
template< typename Device >
void
forAllElementsExample()
{
TNL::Matrices::SparseMatrix< double, Device > matrix( { 1, 2, 3, 4, 5 }, 5 );
auto matrixView = matrix.getView();
auto f = [] __cuda_callable__( int rowIdx, int localIdx, int& columnIdx, double& value )
{
// This is important, some matrix formats may allocate more matrix elements
// than we requested. These padding elements are processed here as well.
if( rowIdx >= localIdx ) {
columnIdx = localIdx;
value = rowIdx + localIdx + 1;
}
};
matrixView.forAllElements( f ); // or matrix.forAllElements
std::cout << matrix << std::endl;
}
int
main( int argc, char* argv[] )
{
std::cout << "Creating matrix on host: " << std::endl;
forAllElementsExample< TNL::Devices::Host >();
#ifdef __CUDACC__
std::cout << "Creating matrix on CUDA device: " << std::endl;
forAllElementsExample< TNL::Devices::Cuda >();
#endif
}
#define __cuda_callable__
Definition Macros.h:49
void forAllElements(Function &&function) const
This method calls forElements for all matrix rows (for constant instances).
Definition SparseMatrixBase.hpp:537
Output
Creating matrix on host:
Row: 0 -> 0:1
Row: 1 -> 0:2 1:3
Row: 2 -> 0:3 1:4 2:5
Row: 3 -> 0:4 1:5 2:6 3:7
Row: 4 -> 0:5 1:6 2:7 3:8 4:9
Creating matrix on CUDA device:
Row: 0 -> 0:1
Row: 1 -> 0:2 1:3
Row: 2 -> 0:3 1:4 2:5
Row: 3 -> 0:4 1:5 2:6 3:7
Row: 4 -> 0:5 1:6 2:7 3:8 4:9

◆ forAllElements() [2/2]

template<typename Real , typename Device , typename Index , typename MatrixType , typename SegmentsView , typename ComputeReal >
template<typename Function >
void TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::forAllElements ( Function && function) const

This method calls forElements for all matrix rows (for constant instances).

See SparseMatrix::forElements.

Template Parameters
Functionis a type of lambda function that will operate on matrix elements.
Parameters
functionis an instance of the lambda function to be called in each row.

◆ forAllRows() [1/2]

template<typename Real , typename Device , typename Index , typename MatrixType , typename SegmentsView , typename ComputeReal >
template<typename Function >
void TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::forAllRows ( Function && function)

Method for parallel iteration over all matrix rows.

In each row, given lambda function is performed. Each row is processed by at most one thread unlike the method SparseMatrixBase::forAllElements where more than one thread can be mapped to each row.

Template Parameters
Functionis type of the lambda function.
Parameters
functionis an instance of the lambda function to be called for each row.
auto function = [] __cuda_callable__ ( RowView& row ) { ... };
RowView is a simple structure for accessing rows of sparse matrix.
Definition SparseMatrixRowView.h:29

RowView represents matrix row - see TNL::Matrices::SparseMatrixBase::RowView.

Example
#include <iostream>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>
template< typename Device >
void
forRowsExample()
{
/***
* Set the following matrix (dots represent zero matrix elements):
*
* / 2 . . . . \
* | 1 2 1 . . |
* | . 1 2 1. . |
* | . . 1 2 1 |
* \ . . . . 2 /
*/
const int size( 5 );
using RowView = typename MatrixType::RowView;
MatrixType matrix( { 1, 3, 3, 3, 1 }, size );
auto matrixView = matrix.getView();
/***
* Set the matrix elements.
*/
auto f = [] __cuda_callable__( RowView & row )
{
const int rowIdx = row.getRowIndex();
if( rowIdx == 0 )
row.setElement( 0, rowIdx, 2.0 ); // diagonal element
else if( rowIdx == size - 1 )
row.setElement( 0, rowIdx, 2.0 ); // diagonal element
else {
row.setElement( 0, rowIdx - 1, 1.0 ); // elements below the diagonal
row.setElement( 1, rowIdx, 2.0 ); // diagonal element
row.setElement( 2, rowIdx + 1, 1.0 ); // elements above the diagonal
}
};
matrixView.forAllRows( f ); // or matrix.forAllRows
std::cout << matrix << std::endl;
/***
* Divide each matrix row by a sum of all elements in the row - with use of iterators.
*/
matrix.forAllRows(
[] __cuda_callable__( RowView & row )
{
double sum = 0.0;
for( auto element : row )
sum += element.value();
for( auto element : row )
element.value() /= sum;
} );
std::cout << "Divide each matrix row by a sum of all elements in the row ... " << std::endl;
std::cout << matrix << std::endl;
}
int
main( int argc, char* argv[] )
{
std::cout << "Getting matrix rows on host: " << std::endl;
forRowsExample< TNL::Devices::Host >();
#ifdef __CUDACC__
std::cout << "Getting matrix rows on CUDA device: " << std::endl;
forRowsExample< TNL::Devices::Cuda >();
#endif
}
Structure for specifying type of sparse matrix.
Definition MatrixType.h:17
Output
Getting matrix rows on host:
Row: 0 -> 0:2
Row: 1 -> 0:1 1:2 2:1
Row: 2 -> 1:1 2:2 3:1
Row: 3 -> 2:1 3:2 4:1
Row: 4 -> 4:2
Divide each matrix row by a sum of all elements in the row ...
Row: 0 -> 0:1
Row: 1 -> 0:0.25 1:0.5 2:0.25
Row: 2 -> 1:0.25 2:0.5 3:0.25
Row: 3 -> 2:0.25 3:0.5 4:0.25
Row: 4 -> 4:1
Getting matrix rows on CUDA device:
Row: 0 -> 0:2
Row: 1 -> 0:1 1:2 2:1
Row: 2 -> 1:1 2:2 3:1
Row: 3 -> 2:1 3:2 4:1
Row: 4 -> 4:2
Divide each matrix row by a sum of all elements in the row ...
Row: 0 -> 0:1
Row: 1 -> 0:0.25 1:0.5 2:0.25
Row: 2 -> 1:0.25 2:0.5 3:0.25
Row: 3 -> 2:0.25 3:0.5 4:0.25
Row: 4 -> 4:1

◆ forAllRows() [2/2]

template<typename Real , typename Device , typename Index , typename MatrixType , typename SegmentsView , typename ComputeReal >
template<typename Function >
void TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::forAllRows ( Function && function) const

Method for parallel iteration over all matrix rows for constant instances.

In each row, given lambda function is performed. Each row is processed by at most one thread unlike the method SparseMatrixBase::forAllElements where more than one thread can be mapped to each row.

Template Parameters
Functionis type of the lambda function.
Parameters
functionis an instance of the lambda function to be called for each row.
auto function = [] __cuda_callable__ ( const ConstRowView& row ) { ... };
typename RowView::ConstRowView ConstRowView
Type for accessing constant matrix rows.
Definition SparseMatrixBase.h:101

ConstRowView represents matrix row - see TNL::Matrices::SparseMatrixBase::ConstRowView.

◆ forElements() [1/2]

template<typename Real , typename Device , typename Index , typename MatrixType , typename SegmentsView , typename ComputeReal >
template<typename Function >
void TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::forElements ( IndexType begin,
IndexType end,
Function && function )

Method for iteration over all matrix rows for non-constant instances.

Template Parameters
Functionis type of lambda function that will operate on matrix elements. It should have form like
auto function = [] __cuda_callable__ ( IndexType rowIdx, IndexType localIdx, IndexType& columnIdx, RealType& value )
{ ... };
Index IndexType
The type used for matrix elements indexing.
Definition SparseMatrixBase.h:85
typename Base::RealType RealType
The type of matrix elements.
Definition SparseMatrixBase.h:73

The localIdx parameter is a rank of the non-zero element in given row.

Parameters
begindefines beginning of the range [begin, end) of rows to be processed.
enddefines ending of the range [begin, end) of rows to be processed.
functionis an instance of the lambda function to be called in each row.
Example
#include <iostream>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>
template< typename Device >
void
forElementsExample()
{
TNL::Matrices::SparseMatrix< double, Device > matrix( { 1, 2, 3, 4, 5 }, 5 );
auto matrixView = matrix.getView();
auto f = [] __cuda_callable__( int rowIdx, int localIdx, int& columnIdx, double& value )
{
// This is important, some matrix formats may allocate more matrix elements
// than we requested. These padding elements are processed here as well.
if( rowIdx >= localIdx ) {
columnIdx = localIdx;
value = rowIdx + localIdx + 1;
}
};
matrixView.forElements( 0, matrix.getRows(), f ); // or matrix.forElements
std::cout << matrix << std::endl;
}
int
main( int argc, char* argv[] )
{
std::cout << "Creating matrix on host: " << std::endl;
forElementsExample< TNL::Devices::Host >();
#ifdef __CUDACC__
std::cout << "Creating matrix on CUDA device: " << std::endl;
forElementsExample< TNL::Devices::Cuda >();
#endif
}
void forElements(IndexType begin, IndexType end, Function &&function) const
Method for iteration over all matrix rows for constant instances.
Definition SparseMatrixBase.hpp:490
Output
Creating matrix on host:
Row: 0 -> 0:1
Row: 1 -> 0:2 1:3
Row: 2 -> 0:3 1:4 2:5
Row: 3 -> 0:4 1:5 2:6 3:7
Row: 4 -> 0:5 1:6 2:7 3:8 4:9
Creating matrix on CUDA device:
Row: 0 -> 0:1
Row: 1 -> 0:2 1:3
Row: 2 -> 0:3 1:4 2:5
Row: 3 -> 0:4 1:5 2:6 3:7
Row: 4 -> 0:5 1:6 2:7 3:8 4:9

◆ forElements() [2/2]

template<typename Real , typename Device , typename Index , typename MatrixType , typename SegmentsView , typename ComputeReal >
template<typename Function >
void TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::forElements ( IndexType begin,
IndexType end,
Function && function ) const

Method for iteration over all matrix rows for constant instances.

Template Parameters
Functionis type of lambda function that will operate on matrix elements. It should have form like
auto function = [] __cuda_callable__
( IndexType rowIdx, IndexType localIdx, const IndexType& columnIdx, const RealType& value )
{ ... };

The localIdx parameter is a rank of the non-zero element in given row.

Parameters
begindefines beginning of the range [begin, end) of rows to be processed.
enddefines ending of the range [begin, end) of rows to be processed.
functionis an instance of the lambda function to be called in each row.

◆ forRows() [1/2]

template<typename Real , typename Device , typename Index , typename MatrixType , typename SegmentsView , typename ComputeReal >
template<typename Function >
void TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::forRows ( IndexType begin,
IndexType end,
Function && function )

Method for parallel iteration over matrix rows from interval [begin, end).

In each row, given lambda function is performed. Each row is processed by at most one thread unlike the method SparseMatrixBase::forElements where more than one thread can be mapped to each row.

Template Parameters
Functionis type of the lambda function.
Parameters
begindefines beginning of the range [begin, end) of rows to be processed.
enddefines ending of the range [begin, end) of rows to be processed.
functionis an instance of the lambda function to be called for each row.
auto function = [] __cuda_callable__ ( RowView& row ) { ... };

RowView represents matrix row - see TNL::Matrices::SparseMatrixBase::RowView.

Example
#include <iostream>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>
template< typename Device >
void
forRowsExample()
{
/***
* Set the following matrix (dots represent zero matrix elements):
*
* / 2 . . . . \
* | 1 2 1 . . |
* | . 1 2 1. . |
* | . . 1 2 1 |
* \ . . . . 2 /
*/
const int size( 5 );
using RowView = typename MatrixType::RowView;
MatrixType matrix( { 1, 3, 3, 3, 1 }, size );
auto matrixView = matrix.getView();
/***
* Set the matrix elements.
*/
auto f = [] __cuda_callable__( RowView & row )
{
const int rowIdx = row.getRowIndex();
if( rowIdx == 0 )
row.setElement( 0, rowIdx, 2.0 ); // diagonal element
else if( rowIdx == size - 1 )
row.setElement( 0, rowIdx, 2.0 ); // diagonal element
else {
row.setElement( 0, rowIdx - 1, 1.0 ); // elements below the diagonal
row.setElement( 1, rowIdx, 2.0 ); // diagonal element
row.setElement( 2, rowIdx + 1, 1.0 ); // elements above the diagonal
}
};
matrixView.forAllRows( f ); // or matrix.forAllRows
std::cout << matrix << std::endl;
/***
* Divide each matrix row by a sum of all elements in the row - with use of iterators.
*/
matrix.forAllRows(
[] __cuda_callable__( RowView & row )
{
double sum = 0.0;
for( auto element : row )
sum += element.value();
for( auto element : row )
element.value() /= sum;
} );
std::cout << "Divide each matrix row by a sum of all elements in the row ... " << std::endl;
std::cout << matrix << std::endl;
}
int
main( int argc, char* argv[] )
{
std::cout << "Getting matrix rows on host: " << std::endl;
forRowsExample< TNL::Devices::Host >();
#ifdef __CUDACC__
std::cout << "Getting matrix rows on CUDA device: " << std::endl;
forRowsExample< TNL::Devices::Cuda >();
#endif
}
Output
Getting matrix rows on host:
Row: 0 -> 0:2
Row: 1 -> 0:1 1:2 2:1
Row: 2 -> 1:1 2:2 3:1
Row: 3 -> 2:1 3:2 4:1
Row: 4 -> 4:2
Divide each matrix row by a sum of all elements in the row ...
Row: 0 -> 0:1
Row: 1 -> 0:0.25 1:0.5 2:0.25
Row: 2 -> 1:0.25 2:0.5 3:0.25
Row: 3 -> 2:0.25 3:0.5 4:0.25
Row: 4 -> 4:1
Getting matrix rows on CUDA device:
Row: 0 -> 0:2
Row: 1 -> 0:1 1:2 2:1
Row: 2 -> 1:1 2:2 3:1
Row: 3 -> 2:1 3:2 4:1
Row: 4 -> 4:2
Divide each matrix row by a sum of all elements in the row ...
Row: 0 -> 0:1
Row: 1 -> 0:0.25 1:0.5 2:0.25
Row: 2 -> 1:0.25 2:0.5 3:0.25
Row: 3 -> 2:0.25 3:0.5 4:0.25
Row: 4 -> 4:1

◆ forRows() [2/2]

template<typename Real , typename Device , typename Index , typename MatrixType , typename SegmentsView , typename ComputeReal >
template<typename Function >
void TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::forRows ( IndexType begin,
IndexType end,
Function && function ) const

Method for parallel iteration over matrix rows from interval [begin, end) for constant instances.

In each row, given lambda function is performed. Each row is processed by at most one thread unlike the method SparseMatrixBase::forElements where more than one thread can be mapped to each row.

Template Parameters
Functionis type of the lambda function.
Parameters
begindefines beginning of the range [begin, end) of rows to be processed.
enddefines ending of the range [begin, end) of rows to be processed.
functionis an instance of the lambda function to be called for each row.
auto function = [] __cuda_callable__ ( const ConstRowView& row ) { ... };

ConstRowView represents matrix row - see TNL::Matrices::SparseMatrixBase::ConstRowView.

◆ getColumnIndexes() [1/2]

template<typename Real , typename Device , typename Index , typename MatrixType , typename SegmentsView , typename ComputeReal >
auto TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::getColumnIndexes ( )
nodiscard

Getter of column indexes for nonconstant instances.

Returns
Reference to a vector with matrix elements column indexes.

◆ getColumnIndexes() [2/2]

template<typename Real , typename Device , typename Index , typename MatrixType , typename SegmentsView , typename ComputeReal >
auto TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::getColumnIndexes ( ) const
nodiscard

Getter of column indexes for constant instances.

Returns
Constant reference to a vector with matrix elements column indexes.

◆ getCompressedRowLengths()

template<typename Real , typename Device , typename Index , typename MatrixType , typename SegmentsView , typename ComputeReal >
template<typename Vector >
void TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::getCompressedRowLengths ( Vector & rowLengths) const

Computes number of non-zeros in each row.

Parameters
rowLengthsis a vector into which the number of non-zeros in each row will be stored.
Example
#include <iostream>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>
template< typename Device >
void
getCompressedRowLengthsExample()
{
triangularMatrix.setElements( {
// clang-format off
{ 0, 0, 1 },
{ 1, 0, 2 }, { 1, 1, 3 },
{ 2, 0, 4 }, { 2, 1, 5 }, { 2, 2, 6 },
{ 3, 0, 7 }, { 3, 1, 8 }, { 3, 2, 9 }, { 3, 3, 10 },
{ 4, 0, 11 }, { 4, 1, 12 }, { 4, 2, 13 }, { 4, 3, 14 }, { 4, 4, 15 }
// clang-format on
} );
std::cout << triangularMatrix << std::endl;
triangularMatrix.getCompressedRowLengths( rowLengths );
std::cout << "Compressed row lengths are: " << rowLengths << std::endl;
}
int
main( int argc, char* argv[] )
{
std::cout << "Getting compressed row lengths on host: " << std::endl;
getCompressedRowLengthsExample< TNL::Devices::Host >();
#ifdef __CUDACC__
std::cout << "Getting compressed row lengths on CUDA device: " << std::endl;
getCompressedRowLengthsExample< TNL::Devices::Cuda >();
#endif
}
Vector extends Array with algebraic operations.
Definition Vector.h:36
Output
Getting compressed row lengths on host:
Row: 0 -> 0:1
Row: 1 -> 0:2 1:3
Row: 2 -> 0:4 1:5 2:6
Row: 3 -> 0:7 1:8 2:9 3:10
Row: 4 -> 0:11 1:12 2:13 3:14 4:15
Compressed row lengths are: [ 1, 2, 3, 4, 5 ]
Getting compressed row lengths on CUDA device:
Row: 0 -> 0:1
Row: 1 -> 0:2 1:3
Row: 2 -> 0:4 1:5 2:6
Row: 3 -> 0:7 1:8 2:9 3:10
Row: 4 -> 0:11 1:12 2:13 3:14 4:15
Compressed row lengths are: [ 1, 2, 3, 4, 5 ]

◆ getElement()

template<typename Real , typename Device , typename Index , typename MatrixType , typename SegmentsView , typename ComputeReal >
__cuda_callable__ auto TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::getElement ( IndexType row,
IndexType column ) const
nodiscard

Returns value of matrix element at position given by its row and column index.

This method can be called from the host system (CPU) no matter where the matrix is allocated. If the matrix is allocated on GPU this method can be called even from device kernels. If the matrix is allocated in GPU device this method is called from CPU, it transfers values of each matrix element separately and so the performance is very low. For higher performance see. SparseMatrix::getRow or SparseMatrix::forElements and SparseMatrix::forAllElements.

Parameters
rowis a row index of the matrix element.
columni a column index of the matrix element.
Returns
value of given matrix element.
Example
#include <iostream>
#include <iomanip>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Devices/Host.h>
template< typename Device >
void
getElements()
{
// number of matrix rows
5,
// number of matrix columns
5,
// matrix elements definition
{
// clang-format off
{ 0, 0, 2.0 },
{ 1, 0, -1.0 }, { 1, 1, 2.0 }, { 1, 2, -1.0 },
{ 2, 1, -1.0 }, { 2, 2, 2.0 }, { 2, 3, -1.0 },
{ 3, 2, -1.0 }, { 3, 3, 2.0 }, { 3, 4, -1.0 },
{ 4, 4, 2.0 }
// clang-format on
} );
for( int i = 0; i < 5; i++ ) {
for( int j = 0; j < 5; j++ )
std::cout << std::setw( 5 ) << matrix.getElement( i, j );
}
}
int
main( int argc, char* argv[] )
{
std::cout << "Get elements on host:" << std::endl;
getElements< TNL::Devices::Host >();
#ifdef __CUDACC__
std::cout << "Get elements on CUDA device:" << std::endl;
getElements< TNL::Devices::Cuda >();
#endif
}
T setw(T... args)
Output
Get elements on host:
2 0 0 0 0
-1 2 -1 0 0
0 -1 2 -1 0
0 0 -1 2 -1
0 0 0 0 2
Get elements on CUDA device:
2 0 0 0 0
-1 2 -1 0 0
0 -1 2 -1 0
0 0 -1 2 -1
0 0 0 0 2

◆ getNonzeroElementsCount()

template<typename Real , typename Device , typename Index , typename MatrixType , typename SegmentsView , typename ComputeReal >
Index TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::getNonzeroElementsCount ( ) const
nodiscardoverridevirtual

Returns number of non-zero matrix elements.

This method really counts the non-zero matrix elements and so it returns zero for matrix having all allocated elements set to zero.

Returns
number of non-zero matrix elements.

Reimplemented from TNL::Matrices::MatrixBase< Real, Device, Index, MatrixType, SegmentsView::getOrganization() >.

◆ getRow() [1/2]

template<typename Real , typename Device , typename Index , typename MatrixType , typename SegmentsView , typename ComputeReal >
__cuda_callable__ auto TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::getRow ( IndexType rowIdx)
nodiscard

Non-constant getter of simple structure for accessing given matrix row.

Parameters
rowIdxis matrix row index.
Returns
RowView for accessing given matrix row.
Example
#include <iostream>
#include <TNL/Algorithms/parallelFor.h>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>
template< typename Device >
void
getRowExample()
{
/***
* Set the following matrix (dots represent zero matrix elements):
*
* / 2 . . . . \
* | 1 2 1 . . |
* | . 1 2 1. . |
* | . . 1 2 1 |
* \ . . . . 2 /
*/
const int size = 5;
TNL::Matrices::SparseMatrix< double, Device > matrix( { 1, 3, 3, 3, 1 }, size );
auto view = matrix.getView();
/***
* Set the matrix elements.
*/
auto f = [ = ] __cuda_callable__( int rowIdx ) mutable
{
auto row = view.getRow( rowIdx );
if( rowIdx == 0 )
row.setElement( 0, rowIdx, 2.0 ); // diagonal element
else if( rowIdx == size - 1 )
row.setElement( 0, rowIdx, 2.0 ); // diagonal element
else {
row.setElement( 0, rowIdx - 1, 1.0 ); // elements below the diagonal
row.setElement( 1, rowIdx, 2.0 ); // diagonal element
row.setElement( 2, rowIdx + 1, 1.0 ); // elements above the diagonal
}
};
TNL::Algorithms::parallelFor< Device >( 0, matrix.getRows(), f );
std::cout << matrix << std::endl;
}
int
main( int argc, char* argv[] )
{
std::cout << "Getting matrix rows on host: " << std::endl;
getRowExample< TNL::Devices::Host >();
#ifdef __CUDACC__
std::cout << "Getting matrix rows on CUDA device: " << std::endl;
getRowExample< TNL::Devices::Cuda >();
#endif
}
__cuda_callable__ ConstRowView getRow(IndexType rowIdx) const
Constant getter of simple structure for accessing given matrix row.
Definition SparseMatrixBase.hpp:130
std::enable_if_t< std::is_integral_v< Begin > &&std::is_integral_v< End > > parallelFor(const Begin &begin, const End &end, typename Device::LaunchConfiguration launch_config, Function f, FunctionArgs... args)
Parallel for-loop function for 1D range specified with integral values.
Definition parallelFor.h:41
Output
Getting matrix rows on host:
Row: 0 -> 0:2
Row: 1 -> 0:1 1:2 2:1
Row: 2 -> 1:1 2:2 3:1
Row: 3 -> 2:1 3:2 4:1
Row: 4 -> 4:2
Getting matrix rows on CUDA device:
Row: 0 -> 0:2
Row: 1 -> 0:1 1:2 2:1
Row: 2 -> 1:1 2:2 3:1
Row: 3 -> 2:1 3:2 4:1
Row: 4 -> 4:2

See SparseMatrixRowView.

◆ getRow() [2/2]

template<typename Real , typename Device , typename Index , typename MatrixType , typename SegmentsView , typename ComputeReal >
__cuda_callable__ auto TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::getRow ( IndexType rowIdx) const
nodiscard

Constant getter of simple structure for accessing given matrix row.

Parameters
rowIdxis matrix row index.
Returns
RowView for accessing given matrix row.
Example
#include <iostream>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>
template< typename Device >
void
getRowExample()
{
//
5,
5,
{
// clang-format off
{ 0, 0, 1 },
{ 1, 0, 1 }, { 1, 1, 2 },
{ 2, 0, 1 }, { 2, 1, 2 }, { 2, 2, 3 },
{ 3, 0, 1 }, { 3, 1, 2 }, { 3, 2, 3 }, { 3, 3, 4 },
{ 4, 0, 1 }, { 4, 1, 2 }, { 4, 2, 3 }, { 4, 3, 4 }, { 4, 4, 5 }
// clang-format on
} );
const auto matrixView = matrix.getView();
/***
* Fetch lambda function returns diagonal element in each row.
*/
auto fetch = [ = ] __cuda_callable__( int rowIdx ) -> double
{
auto row = matrixView.getRow( rowIdx );
return row.getValue( rowIdx );
};
/***
* Compute the matrix trace.
*/
int trace = TNL::Algorithms::reduce< Device >( 0, matrix.getRows(), fetch, std::plus<>{}, 0 );
std::cout << "Matrix trace is " << trace << "." << std::endl;
}
int
main( int argc, char* argv[] )
{
std::cout << "Getting matrix rows on host: " << std::endl;
getRowExample< TNL::Devices::Host >();
#ifdef __CUDACC__
std::cout << "Getting matrix rows on CUDA device: " << std::endl;
getRowExample< TNL::Devices::Cuda >();
#endif
}
Result reduce(Index begin, Index end, Fetch &&fetch, Reduction &&reduction, const Result &identity)
reduce implements (parallel) reduction for vectors and arrays.
Definition reduce.h:65
Output
Getting matrix rows on host:
Matrix trace is 15.
Getting matrix rows on CUDA device:
Matrix trace is 15.

See SparseMatrixRowView.

◆ getRowCapacities()

template<typename Real , typename Device , typename Index , typename MatrixType , typename SegmentsView , typename ComputeReal >
template<typename Vector >
void TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::getRowCapacities ( Vector & rowCapacities) const

Compute capacities of all rows.

The row capacities are not stored explicitly and must be computed.

Parameters
rowCapacitiesis a vector where the row capacities will be stored.

◆ getRowCapacity()

template<typename Real , typename Device , typename Index , typename MatrixType , typename SegmentsView , typename ComputeReal >
__cuda_callable__ Index TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::getRowCapacity ( IndexType row) const
nodiscard

Returns capacity of given matrix row.

Parameters
rowindex of matrix row.
Returns
number of matrix elements allocated for the row.

◆ getSegments() [1/2]

template<typename Real , typename Device , typename Index , typename MatrixType , typename SegmentsView , typename ComputeReal >
auto TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::getSegments ( )
nodiscard

Getter of segments for non-constant instances.

Segments are a structure for addressing the matrix elements columns and values. In fact, Segments represent the sparse matrix format.

Returns
Non-constant reference to segments.

◆ getSegments() [2/2]

template<typename Real , typename Device , typename Index , typename MatrixType , typename SegmentsView , typename ComputeReal >
auto TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::getSegments ( ) const
nodiscard

Getter of segments for constant instances.

Segments are a structure for addressing the matrix elements columns and values. In fact, Segments represent the sparse matrix format.

Returns
Constant reference to segments.

◆ getSerializationType()

template<typename Real , typename Device , typename Index , typename MatrixType , typename SegmentsView , typename ComputeReal >
std::string TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::getSerializationType ( )
staticnodiscard

Returns string with serialization type.

The string has a form Matrices::SparseMatrix< RealType, [any_device], IndexType, General/Symmetric, Format, [any_allocator] >.

Returns
String with the serialization type.
Example
#include <iostream>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Devices/Host.h>
template< typename Device >
void
getSerializationTypeExample()
{
std::cout << "Matrix type is: " << matrix.getSerializationType();
}
int
main( int argc, char* argv[] )
{
std::cout << "Get serialization type on CPU ... " << std::endl;
getSerializationTypeExample< TNL::Devices::Host >();
#ifdef __CUDACC__
std::cout << "Get serialization type on CUDA GPU ... " << std::endl;
getSerializationTypeExample< TNL::Devices::Cuda >();
#endif
}
static std::string getSerializationType()
Returns string with serialization type.
Definition SparseMatrixBase.hpp:43
Output
Get serialization type on CPU ...
Matrix type is: Matrices::SparseMatrix< double, CSR< int >, [any_device], int, General, [any_allocator], [any_allocator] >Get serialization type on CUDA GPU ...
Matrix type is: Matrices::SparseMatrix< double, CSR< int >, [any_device], int, General, [any_allocator], [any_allocator] >

◆ operator!=()

template<typename Real , typename Device , typename Index , typename MatrixType , typename SegmentsView , typename ComputeReal >
template<typename Matrix >
bool TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::operator!= ( const Matrix & matrix) const
nodiscard

Comparison operator with another arbitrary matrix type.

Parameters
matrixis the right-hand side matrix.
Returns
false if the RHS matrix is equal, true otherwise.

◆ operator=()

template<typename Real , typename Device , typename Index , typename MatrixType , typename SegmentsView , typename ComputeReal >
SparseMatrixBase & TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::operator= ( const SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal > & )
delete

Copy-assignment operator.

It is a deleted function, because sparse matrix assignment in general requires reallocation.

◆ operator==()

template<typename Real , typename Device , typename Index , typename MatrixType , typename SegmentsView , typename ComputeReal >
template<typename Matrix >
bool TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::operator== ( const Matrix & matrix) const
nodiscard

Comparison operator with another arbitrary matrix type.

Parameters
matrixis the right-hand side matrix.
Returns
true if the RHS matrix is equal, false otherwise.

◆ print()

template<typename Real , typename Device , typename Index , typename MatrixType , typename SegmentsView , typename ComputeReal >
void TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::print ( std::ostream & str) const

Method for printing the matrix to output stream.

Parameters
stris the output stream.

◆ reduceAllRows()

template<typename Real , typename Device , typename Index , typename MatrixType , typename SegmentsView , typename ComputeReal >
template<typename Fetch , typename Reduce , typename Keep , typename FetchValue , typename SegmentsReductionKernel >
void TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::reduceAllRows ( Fetch && fetch,
const Reduce & reduce,
Keep && keep,
const FetchValue & identity,
const SegmentsReductionKernel & kernel = SegmentsReductionKernel{} ) const

Method for performing general reduction on all matrix rows for constant instances.

Template Parameters
Fetchis a type of lambda function for data fetch declared as
auto fetch = [] __cuda_callable__ ( IndexType rowIdx, IndexType& columnIdx, RealType& elementValue ) -> FetchValue
{ ... };

The return type of this lambda can be any non void.

Template Parameters
Reduceis a type of lambda function for reduction declared as
auto reduce = [] __cuda_callable__ ( const FetchValue& v1, const FetchValue& v2 ) -> FetchValue { ... };
Template Parameters
Keepis a type of lambda function for storing results of reduction in each row. It is declared as
auto keep = [=] __cuda_callable__ ( IndexType rowIdx, const RealType& value ) { ... };
Template Parameters
FetchValueis type returned by the Fetch lambda function.
Parameters
fetchis an instance of lambda function for data fetch.
reduceis an instance of lambda function for reduction.
keepin an instance of lambda function for storing results.
identityis the identity element for the reduction operation, i.e. element which does not change the result of the reduction.
kernelis an instance of the segments reduction kernel to be used for the operation.
Example
#include <iostream>
#include <iomanip>
#include <functional>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>
template< typename Device >
void
reduceAllRows()
{
// number of matrix rows
5,
// number of matrix columns
5,
// matrix elements definition
{
// clang-format off
{ 0, 0, 1 },
{ 1, 1, 1 }, { 1, 2, 8 },
{ 2, 2, 1 }, { 2, 3, 9 },
{ 3, 3, 1 }, { 3, 4, 9 },
{ 4, 4, 1 }
// clang-format on
} );
/***
* Find largest element in each row.
*/
TNL::Containers::Vector< double, Device > rowMax( matrix.getRows() );
/***
* Prepare vector view and matrix view for lambdas.
*/
auto rowMaxView = rowMax.getView();
/***
* Fetch lambda just returns absolute value of matrix elements.
*/
auto fetch = [] __cuda_callable__( int rowIdx, int columnIdx, const double& value ) -> double
{
return TNL::abs( value );
};
/***
* Reduce lambda return maximum of given values.
*/
auto reduce = [] __cuda_callable__( double& a, const double& b ) -> double
{
return TNL::max( a, b );
};
/***
* Keep lambda store the largest value in each row to the vector rowMax.
*/
auto keep = [ = ] __cuda_callable__( int rowIdx, const double& value ) mutable
{
rowMaxView[ rowIdx ] = value;
};
/***
* Compute the largest values in each row.
*/
matrix.reduceAllRows( fetch, reduce, keep, std::numeric_limits< double >::lowest() );
std::cout << "The matrix reads as: " << std::endl << matrix << std::endl;
std::cout << "Max. elements in rows are: " << rowMax << std::endl;
}
int
main( int argc, char* argv[] )
{
std::cout << "All rows reduction on host:" << std::endl;
reduceAllRows< TNL::Devices::Host >();
#ifdef __CUDACC__
std::cout << "All rows reduction on CUDA device:" << std::endl;
reduceAllRows< TNL::Devices::Cuda >();
#endif
}
__cuda_callable__ T abs(const T &n)
This function returns absolute value of given number n.
Definition Math.h:74
constexpr ResultType max(const T1 &a, const T2 &b)
This function returns maximum of two numbers.
Definition Math.h:48
Output
All rows reduction on host:
The matrix reads as:
Row: 0 -> 0:1
Row: 1 -> 1:1 2:8
Row: 2 -> 2:1 3:9
Row: 3 -> 3:1 4:9
Row: 4 -> 4:1
Max. elements in rows are: [ 1, 8, 9, 9, 1 ]
All rows reduction on CUDA device:
The matrix reads as:
Row: 0 -> 0:1
Row: 1 -> 1:1 2:8
Row: 2 -> 2:1 3:9
Row: 3 -> 3:1 4:9
Row: 4 -> 4:1
Max. elements in rows are: [ 1, 8, 9, 9, 1 ]

◆ reduceRows()

template<typename Real , typename Device , typename Index , typename MatrixType , typename SegmentsView , typename ComputeReal >
template<typename Fetch , typename Reduce , typename Keep , typename FetchValue , typename SegmentsReductionKernel >
void TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::reduceRows ( IndexType begin,
IndexType end,
Fetch && fetch,
const Reduce & reduce,
Keep && keep,
const FetchValue & identity,
const SegmentsReductionKernel & kernel = SegmentsReductionKernel{} ) const

Method for performing general reduction on matrix rows for constant instances.

Template Parameters
Fetchis a type of lambda function for data fetch declared as
auto fetch = [] __cuda_callable__ ( IndexType rowIdx, IndexType& columnIdx, RealType& elementValue ) -> FetchValue
{ ... };

The return type of this lambda can be any non void.

Template Parameters
Reduceis a type of lambda function for reduction declared as
auto reduce = [] __cuda_callable__ ( const FetchValue& v1, const FetchValue& v2 ) -> FetchValue { ... };
Template Parameters
Keepis a type of lambda function for storing results of reduction in each row. It is declared as
auto keep = [=] __cuda_callable__ ( IndexType rowIdx, const RealType& value ) { ... };
Template Parameters
FetchValueis type returned by the Fetch lambda function.
Parameters
begindefines beginning of the range [begin, end) of rows to be processed.
enddefines ending of the range [begin, end) of rows to be processed.
fetchis an instance of lambda function for data fetch.
reduceis an instance of lambda function for reduction.
keepin an instance of lambda function for storing results.
identityis the identity element for the reduction operation, i.e. element which does not change the result of the reduction.
kernelis an instance of the segments reduction kernel to be used for the operation.
Example
#include <iostream>
#include <iomanip>
#include <functional>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Devices/Host.h>
template< typename Device >
void
reduceRows()
{
// number of matrix rows
5,
// number of matrix columns
5,
// matrix elements definition
{
// clang-format off
{ 0, 0, 1 },
{ 1, 1, 1 }, { 1, 2, 8 },
{ 2, 2, 1 }, { 2, 3, 9 },
{ 3, 3, 1 }, { 3, 4, 9 },
{ 4, 4, 1 }
// clang-format on
} );
/***
* Find largest element in each row.
*/
TNL::Containers::Vector< double, Device > rowMax( matrix.getRows() );
/***
* Prepare vector view for lambdas.
*/
auto rowMaxView = rowMax.getView();
/***
* Fetch lambda just returns absolute value of matrix elements.
*/
auto fetch = [] __cuda_callable__( int rowIdx, int columnIdx, const double& value ) -> double
{
return TNL::abs( value );
};
/***
* Reduce lambda return maximum of given values.
*/
auto reduce = [] __cuda_callable__( double& a, const double& b ) -> double
{
return TNL::max( a, b );
};
/***
* Keep lambda store the largest value in each row to the vector rowMax.
*/
auto keep = [ = ] __cuda_callable__( int rowIdx, const double& value ) mutable
{
rowMaxView[ rowIdx ] = value;
};
/***
* Compute the largest values in each row.
*/
matrix.reduceRows( 0, matrix.getRows(), fetch, reduce, keep, std::numeric_limits< double >::lowest() );
std::cout << "The matrix reads as: " << std::endl << matrix << std::endl;
std::cout << "Max. elements in rows are: " << rowMax << std::endl;
}
int
main( int argc, char* argv[] )
{
std::cout << "Rows reduction on host:" << std::endl;
reduceRows< TNL::Devices::Host >();
#ifdef __CUDACC__
std::cout << "Rows reduction on CUDA device:" << std::endl;
reduceRows< TNL::Devices::Cuda >();
#endif
}
Output
Rows reduction on host:
The matrix reads as:
Row: 0 -> 0:1
Row: 1 -> 1:1 2:8
Row: 2 -> 2:1 3:9
Row: 3 -> 3:1 4:9
Row: 4 -> 4:1
Max. elements in rows are: [ 1, 8, 9, 9, 1 ]
Rows reduction on CUDA device:
The matrix reads as:
Row: 0 -> 0:1
Row: 1 -> 1:1 2:8
Row: 2 -> 2:1 3:9
Row: 3 -> 3:1 4:9
Row: 4 -> 4:1
Max. elements in rows are: [ 1, 8, 9, 9, 1 ]

◆ sequentialForAllRows() [1/2]

template<typename Real , typename Device , typename Index , typename MatrixType , typename SegmentsView , typename ComputeReal >
template<typename Function >
void TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::sequentialForAllRows ( Function && function)

This method calls sequentialForRows for all matrix rows.

See SparseMatrixBase::sequentialForAllRows.

Template Parameters
Functionis a type of lambda function that will operate on matrix elements.
Parameters
functionis an instance of the lambda function to be called in each row.

◆ sequentialForAllRows() [2/2]

template<typename Real , typename Device , typename Index , typename MatrixType , typename SegmentsView , typename ComputeReal >
template<typename Function >
void TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::sequentialForAllRows ( Function && function) const

This method calls sequentialForRows for all matrix rows (for constant instances).

See SparseMatrixBase::sequentialForRows.

Template Parameters
Functionis a type of lambda function that will operate on matrix elements.
Parameters
functionis an instance of the lambda function to be called in each row.

◆ sequentialForRows() [1/2]

template<typename Real , typename Device , typename Index , typename MatrixType , typename SegmentsView , typename ComputeReal >
template<typename Function >
void TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::sequentialForRows ( IndexType begin,
IndexType end,
Function && function )

Method for sequential iteration over all matrix rows for non-constant instances.

Template Parameters
Functionis type of lambda function that will operate on matrix elements. It should have form like
auto function = [] __cuda_callable__ ( RowView& row ) { ... };

RowView represents matrix row - see TNL::Matrices::SparseMatrixBase::RowView.

Parameters
begindefines beginning of the range [begin, end) of rows to be processed.
enddefines ending of the range [begin, end) of rows to be processed.
functionis an instance of the lambda function to be called in each row.

◆ sequentialForRows() [2/2]

template<typename Real , typename Device , typename Index , typename MatrixType , typename SegmentsView , typename ComputeReal >
template<typename Function >
void TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::sequentialForRows ( IndexType begin,
IndexType end,
Function && function ) const

Method for sequential iteration over all matrix rows for constant instances.

Template Parameters
Functionis type of lambda function that will operate on matrix elements. It should have form like
auto function = [] __cuda_callable__ ( const ConstRowView& row ) { ... };

ConstRowView represents matrix row - see TNL::Matrices::SparseMatrixBase::ConstRowView.

Parameters
begindefines beginning of the range [begin, end) of rows to be processed.
enddefines ending of the range [begin, end) of rows to be processed.
functionis an instance of the lambda function to be called in each row.

◆ setElement()

template<typename Real , typename Device , typename Index , typename MatrixType , typename SegmentsView , typename ComputeReal >
__cuda_callable__ void TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::setElement ( IndexType row,
IndexType column,
const RealType & value )

Sets element at given row and column to given value.

This method can be called from the host system (CPU) no matter where the matrix is allocated. If the matrix is allocated on GPU this method can be called even from device kernels. If the matrix is allocated in GPU device this method is called from CPU, it transfers values of each matrix element separately and so the performance is very low. For higher performance see. SparseMatrix::getRow or SparseMatrix::forElements and SparseMatrix::forAllElements. The call may fail if the matrix row capacity is exhausted.

Parameters
rowis row index of the element.
columnis columns index of the element.
valueis the value the element will be set to.
Example
#include <iostream>
#include <TNL/Algorithms/parallelFor.h>
#include <TNL/Matrices/SparseMatrix.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>
template< typename Device >
void
setElements()
{
TNL::Matrices::SparseMatrix< double, Device > matrix( { 1, 1, 1, 1, 1 }, 5 );
/****
* Get the matrix view.
*/
auto view = matrix.getView();
for( int i = 0; i < 5; i++ )
view.setElement( i, i, i );
std::cout << "Matrix set from the host:" << std::endl;
std::cout << matrix << std::endl;
auto f = [ = ] __cuda_callable__( int i ) mutable
{
view.setElement( i, i, -i );
};
std::cout << "Matrix set from its native device:" << std::endl;
std::cout << matrix << std::endl;
}
int
main( int argc, char* argv[] )
{
std::cout << "Set elements on host:" << std::endl;
setElements< TNL::Devices::Host >();
#ifdef __CUDACC__
std::cout << "Set elements on CUDA device:" << std::endl;
setElements< TNL::Devices::Cuda >();
#endif
}
Output
Set elements on host:
Matrix set from the host:
Row: 0 ->
Row: 1 -> 1:1
Row: 2 -> 2:2
Row: 3 -> 3:3
Row: 4 -> 4:4
Matrix set from its native device:
Row: 0 ->
Row: 1 -> 1:-1
Row: 2 -> 2:-2
Row: 3 -> 3:-3
Row: 4 -> 4:-4
Set elements on CUDA device:
Matrix set from the host:
Row: 0 ->
Row: 1 -> 1:1
Row: 2 -> 2:2
Row: 3 -> 3:3
Row: 4 -> 4:4
Matrix set from its native device:
Row: 0 ->
Row: 1 -> 1:-1
Row: 2 -> 2:-2
Row: 3 -> 3:-3
Row: 4 -> 4:-4

◆ transposedVectorProduct()

template<typename Real , typename Device , typename Index , typename MatrixType , typename SegmentsView , typename ComputeReal >
template<typename InVector , typename OutVector >
void TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::transposedVectorProduct ( const InVector & inVector,
OutVector & outVector,
ComputeReal matrixMultiplicator = 1.0,
ComputeReal outVectorMultiplicator = 0.0,
Index begin = 0,
Index end = 0 ) const

Computes product of transposed matrix and vector.

Template Parameters
InVectoris type of input vector. It can be TNL::Containers::Vector, TNL::Containers::VectorView, TNL::Containers::Array, TNL::Containers::ArrayView, or similar container.
OutVectoris type of output vector. It can be TNL::Containers::Vector, TNL::Containers::VectorView, TNL::Containers::Array, TNL::Containers::ArrayView, or similar container.
Parameters
inVectoris input vector.
outVectoris output vector.
matrixMultiplicatoris a factor by which the matrix is multiplied. It is one by default.
outVectorMultiplicatoris a factor by which the outVector is multiplied before added
beginis the beginning of the rows range for which the vector product is computed. It is zero by default.
endis the end of the rows range for which the vector product is computed. It is number if the matrix rows by default.

◆ vectorProduct()

template<typename Real , typename Device , typename Index , typename MatrixType , typename SegmentsView , typename ComputeReal >
template<typename InVector , typename OutVector , typename SegmentsReductionKernel >
void TNL::Matrices::SparseMatrixBase< Real, Device, Index, MatrixType, SegmentsView, ComputeReal >::vectorProduct ( const InVector & inVector,
OutVector & outVector,
ComputeRealType matrixMultiplicator = 1.0,
ComputeRealType outVectorMultiplicator = 0.0,
IndexType begin = 0,
IndexType end = 0,
const SegmentsReductionKernel & kernel = SegmentsReductionKernel{} ) const

Computes product of matrix and vector.

More precisely, it computes:

outVector = matrixMultiplicator * ( * this ) * inVector + outVectorMultiplicator * outVector
Template Parameters
InVectoris type of input vector. It can be TNL::Containers::Vector, TNL::Containers::VectorView, TNL::Containers::Array, TNL::Containers::ArrayView, or similar container.
OutVectoris type of output vector. It can be TNL::Containers::Vector, TNL::Containers::VectorView, TNL::Containers::Array, TNL::Containers::ArrayView, or similar container.
Parameters
inVectoris input vector.
outVectoris output vector.
matrixMultiplicatoris a factor by which the matrix is multiplied. It is one by default.
outVectorMultiplicatoris a factor by which the outVector is multiplied before added to the result of matrix-vector product. It is zero by default.
beginis the beginning of the rows range for which the vector product is computed. It is zero by default.
endis the end of the rows range for which the vector product is computed. It is number if the matrix rows by default.
kernelis an instance of the segments reduction kernel to be used for the operation.

The documentation for this class was generated from the following files: