Template Numerical Library: TNL::Matrices::DenseMatrixBase< Real, Device, Index, Organization >

Implementation of dense matrix view.

#include <TNL/Matrices/DenseMatrixBase.h>

Public Types
using	ConstRowView = typename RowView::ConstRowView
	Type for accessing immutable matrix row.

using	DeviceType = Device
	The device where the matrix is allocated.

using	IndexType = Index
	The type used for matrix elements indexing.

using	RealType = Real
	The type of matrix elements.

using	RowView = DenseMatrixRowView< SegmentViewType, typename Base::ValuesViewType >
	Type for accessing matrix row.

Public Types inherited from TNL::Matrices::MatrixBase< Real, Device, Index, GeneralMatrix, Organization >
using	ConstValuesViewType
	Type of constant vector view holding values of matrix elements.

using	DeviceType
	The device where the matrix is allocated.

using	IndexType
	The type used for matrix elements indexing.

using	RealType
	The type of matrix elements.

using	RowCapacitiesType

using	ValuesViewType
	Type of vector view holding values of matrix elements.

Public Member Functions
__cuda_callable__	DenseMatrixBase ()=default
	Constructor without parameters.

__cuda_callable__	DenseMatrixBase (const DenseMatrixBase &matrix)=default
	Copy constructor.

__cuda_callable__	DenseMatrixBase (DenseMatrixBase &&matrix) noexcept=default
	Move constructor.

__cuda_callable__	DenseMatrixBase (IndexType rows, IndexType columns, typename Base::ValuesViewType values)
	Constructor with matrix dimensions and values.

__cuda_callable__ void	addElement (IndexType row, IndexType column, const RealType &value, const RealType &thisElementMultiplicator=1.0)
	Adds the given value to the element at the given row and column.

template<typename Matrix >
void	addMatrix (const Matrix &matrix, const RealType &matrixMultiplicator=1.0, const RealType &thisMatrixMultiplicator=1.0)

template<typename Function >
void	forAllElements (Function &&function)
	This method calls forElements for all matrix rows.

template<typename Function >
void	forAllElements (Function &&function) const
	This method calls forElements for all matrix rows.

template<typename Function >
void	forAllRows (Function &&function)
	Method for parallel iteration over all matrix rows.

template<typename Function >
void	forAllRows (Function &&function) const
	Method for parallel iteration over all matrix rows for constant instances.

template<typename Function >
void	forElements (IndexType begin, IndexType end, Function &&function)
	Method for iteration over elements of matrix rows from interval `[begin, end)` for non-constant instances.

template<typename Function >
void	forElements (IndexType begin, IndexType end, Function &&function) const
	Method for iteration over elements of matrix rows from interval `[begin, end)` for constant instances.

template<typename Function >
void	forRows (IndexType begin, IndexType end, Function &&function)
	Method for parallel iteration over matrix rows from interval `[begin, end)`.

template<typename Function >
void	forRows (IndexType begin, IndexType end, Function &&function) const
	Method for parallel iteration over matrix rows from interval `[begin, end)` for constant instances.

template<typename Vector >
void	getCompressedRowLengths (Vector &rowLengths) const
	Computes number of non-zeros in each row.

__cuda_callable__ Real	getElement (IndexType row, IndexType column) const
	Returns value of matrix element at position given by its row and column index.

__cuda_callable__ RowView	getRow (IndexType rowIdx)
	Non-constant getter of simple structure for accessing given matrix row.

__cuda_callable__ ConstRowView	getRow (IndexType rowIdx) const
	Constant getter of simple structure for accessing given matrix row.

template<typename Vector >
void	getRowCapacities (Vector &rowCapacities) const
	Compute capacities of all rows.

template<typename Real_ , typename Device_ , typename Index_ >
bool	operator!= (const DenseMatrixBase< Real_, Device_, Index_, Organization > &matrix) const
	Comparison operator with another dense matrix view.

__cuda_callable__ Real &	operator() (IndexType row, IndexType column)
	Returns non-constant reference to element at row `row` and column `column`.

__cuda_callable__ const Real &	operator() (IndexType row, IndexType column) const
	Returns constant reference to element at row `row` and column `column`.

DenseMatrixBase &	operator= (const DenseMatrixBase &)=delete
	Copy-assignment operator.

DenseMatrixBase &	operator= (DenseMatrixBase &&)=delete
	Move-assignment operator.

template<typename Real_ , typename Device_ , typename Index_ >
bool	operator== (const DenseMatrixBase< Real_, Device_, Index_, Organization > &matrix) const
	Comparison operator with another dense matrix view.

void	print (std::ostream &str) const
	Method for printing the matrix to output stream.

template<typename Fetch , typename Reduce , typename Keep , typename FetchReal >
void	reduceAllRows (Fetch &fetch, const Reduce &reduce, Keep &keep, const FetchReal &identity) const
	Method for performing general reduction on ALL matrix rows for constant instances.

template<typename Fetch , typename Reduce , typename Keep , typename FetchReal >
void	reduceRows (IndexType begin, IndexType end, Fetch &fetch, const Reduce &reduce, Keep &keep, const FetchReal &identity) const
	Method for performing general reduction on matrix rows for constant instances.

template<typename Fetch , typename Reduce , typename Keep , typename FetchValue >
void	reduceRows (IndexType begin, IndexType end, Fetch &fetch, const Reduce &reduce, Keep &keep, const FetchValue &identity) const

template<typename Function >
void	sequentialForAllRows (Function &&function)
	This method calls sequentialForRows for all matrix rows.

template<typename Function >
void	sequentialForAllRows (Function &&function) const
	This method calls sequentialForRows for all matrix rows (for constant instances).

template<typename Function >
void	sequentialForRows (IndexType begin, IndexType end, Function &&function)
	Method for sequential iteration over matrix rows from interval `[begin, end)` for non-constant instances.

template<typename Function >
void	sequentialForRows (IndexType begin, IndexType end, Function &&function) const
	Method for sequential iteration over matrix rows from interval `[begin, end)` for constant instances.

__cuda_callable__ void	setElement (IndexType row, IndexType column, const RealType &value)
	Sets element at given row and column to given value.

void	setValue (const RealType &v)
	Sets all matrix elements to value v.

template<typename InVector , typename OutVector >
void	vectorProduct (const InVector &inVector, OutVector &outVector, const RealType &matrixMultiplicator=1.0, const RealType &outVectorMultiplicator=0.0, IndexType begin=0, IndexType end=0) const
	Computes product of matrix and vector.

Public Member Functions inherited from TNL::Matrices::MatrixBase< Real, Device, Index, GeneralMatrix, Organization >
__cuda_callable__	MatrixBase ()=default
	Basic constructor with no parameters.

__cuda_callable__	MatrixBase (const MatrixBase &view)=default
	Shallow copy constructor.

__cuda_callable__	MatrixBase (IndexType rows, IndexType columns, ValuesViewType values)
	Constructor with matrix dimensions and matrix elements values.

__cuda_callable__	MatrixBase (MatrixBase &&view) noexcept=default
	Move constructor.

IndexType	getAllocatedElementsCount () const
	Tells the number of allocated matrix elements.

__cuda_callable__ IndexType	getColumns () const
	Returns number of matrix columns.

virtual IndexType	getNonzeroElementsCount () const
	Computes a current number of nonzero matrix elements.

__cuda_callable__ IndexType	getRows () const
	Returns number of matrix rows.

__cuda_callable__ ValuesViewType &	getValues ()
	Returns a reference to a vector with the matrix elements values.

__cuda_callable__ const ValuesViewType &	getValues () const
	Returns a constant reference to a vector with the matrix elements values.

bool	operator!= (const Matrix &matrix) const
	Comparison operator with another arbitrary matrix view type.

bool	operator!= (const MatrixT &matrix) const

__cuda_callable__ MatrixBase &	operator= (const MatrixBase &)=delete
	Copy-assignment operator.

__cuda_callable__ MatrixBase &	operator= (MatrixBase &&)=delete
	Move-assignment operator.

bool	operator== (const Matrix &matrix) const
	Comparison operator with another arbitrary matrix view type.

bool	operator== (const MatrixT &matrix) const

Static Public Member Functions
static std::string	getSerializationType ()
	Returns string with serialization type.

Static Public Member Functions inherited from TNL::Matrices::MatrixBase< Real, Device, Index, GeneralMatrix, Organization >
static constexpr ElementsOrganization	getOrganization ()
	Matrix elements organization getter.

static constexpr bool	isBinary ()
	Test of binary matrix type.

static constexpr bool	isMatrix ()
	Test of matrix type.

static constexpr bool	isSymmetric ()
	Test of symmetric matrix type.

Protected Types
using	Base = MatrixBase< Real, Device, Index, GeneralMatrix, Organization >

using	SegmentsReductionKernel = Algorithms::SegmentsReductionKernels::EllpackKernel< Index, Device >

using	SegmentsType

using	SegmentsViewType = typename SegmentsType::ViewType

using	SegmentViewType = typename SegmentsType::SegmentViewType

Protected Member Functions
__cuda_callable__ void	bind (IndexType rows, IndexType columns, typename Base::ValuesViewType values, SegmentsViewType segments)
	Re-initializes the internal attributes of the base class.

__cuda_callable__ IndexType	getElementIndex (IndexType row, IndexType column) const

Protected Member Functions inherited from TNL::Matrices::MatrixBase< Real, Device, Index, GeneralMatrix, Organization >
__cuda_callable__ void	bind (IndexType rows, IndexType columns, ValuesViewType values)
	Re-initializes the internal attributes of the base class.

Protected Attributes
SegmentsViewType	segments

Protected Attributes inherited from TNL::Matrices::MatrixBase< Real, Device, Index, GeneralMatrix, Organization >
IndexType	columns

IndexType	rows

ValuesViewType	values

Detailed Description

template<typename Real, typename Device, typename Index, ElementsOrganization Organization>
class TNL::Matrices::DenseMatrixBase< Real, Device, Index, Organization >

Implementation of dense matrix view.

It serves as an accessor to DenseMatrix, for example when passing the matrix to lambda functions. A DenseMatrix view can also be created in CUDA kernels.

Template Parameters

Real	is a type of matrix elements.
Device	is a device where the matrix is allocated.
Index	is a type for indexing of the matrix elements.
Organization	tells the ordering of matrix elements in memory. It is either TNL::Algorithms::Segments::RowMajorOrder or TNL::Algorithms::Segments::ColumnMajorOrder.

See DenseMatrix.
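
To make the effect of the Organization parameter concrete, the following is a minimal sketch of the textbook row-major and column-major index formulas for an unpadded rows x columns array. The names `OrganizationSketch` and `elementIndexSketch` are hypothetical and not part of TNL; the library itself stores elements through Ellpack segments, whose actual layout may differ (for example by alignment or padding).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the Organization template parameter.
enum class OrganizationSketch { RowMajor, ColumnMajor };

// Position of element (row, column) in a flat, unpadded array
// that holds a rows x columns dense matrix.
inline std::size_t
elementIndexSketch( std::size_t row,
                    std::size_t column,
                    std::size_t rows,
                    std::size_t columns,
                    OrganizationSketch organization )
{
   assert( row < rows && column < columns );
   if( organization == OrganizationSketch::RowMajor )
      return row * columns + column;  // elements of one row are adjacent
   return column * rows + row;        // elements of one column are adjacent
}
```

In a 3 x 4 matrix, element (1, 2) lands at offset 6 in row-major order and at offset 7 in column-major order.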

Member Typedef Documentation

◆ SegmentsType

template<typename Real , typename Device , typename Index , ElementsOrganization Organization>

using TNL::Matrices::DenseMatrixBase< Real, Device, Index, Organization >::SegmentsType

protected

Initial value:

Algorithms::Segments::Ellpack< Device, Index, typename Allocators::Default< Device >::template Allocator< Index >, Organization, 1 >

Constructor & Destructor Documentation

◆ DenseMatrixBase() [1/3]

template<typename Real , typename Device , typename Index , ElementsOrganization Organization>

__cuda_callable__ TNL::Matrices::DenseMatrixBase< Real, Device, Index, Organization >::DenseMatrixBase	(	IndexType	rows,
		IndexType	columns,
		typename Base::ValuesViewType	values )

Constructor with matrix dimensions and values.

Parameters

rows	number of matrix rows.
columns	number of matrix columns.
values	is vector view with matrix elements values.

◆ DenseMatrixBase() [2/3]

template<typename Real , typename Device , typename Index , ElementsOrganization Organization>

__cuda_callable__ TNL::Matrices::DenseMatrixBase< Real, Device, Index, Organization >::DenseMatrixBase ( const DenseMatrixBase< Real, Device, Index, Organization > & matrix )

default

Copy constructor.

Parameters

matrix is the source matrix view.

◆ DenseMatrixBase() [3/3]

template<typename Real , typename Device , typename Index , ElementsOrganization Organization>

__cuda_callable__ TNL::Matrices::DenseMatrixBase< Real, Device, Index, Organization >::DenseMatrixBase ( DenseMatrixBase< Real, Device, Index, Organization > && matrix )

default noexcept

Move constructor.

Parameters

matrix is the source matrix view.

Member Function Documentation

◆ addElement()

template<typename Real , typename Device , typename Index , ElementsOrganization Organization>

__cuda_callable__ void TNL::Matrices::DenseMatrixBase< Real, Device, Index, Organization >::addElement	(	IndexType	row,
		IndexType	column,
		const RealType &	value,
		const RealType &	thisElementMultiplicator = 1.0 )

Adds the given value to the element at the given row and column.

This method can be called from the host system (CPU) no matter where the matrix is allocated. If the matrix is allocated on the GPU, this method can also be called from device kernels. If the matrix is allocated on the GPU and this method is called from the CPU, it transfers the value of each matrix element separately, so the performance is very low. For higher performance see DenseMatrix::getRow or DenseMatrix::forElements and DenseMatrix::forAllElements.

Parameters

row	is row index of the element.
column	is column index of the element.
value	is the value to be added to the element.
thisElementMultiplicator	is the multiplicator by which the original matrix element value is multiplied before the given value is added.
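
The update rule described by these parameters can be sketched on plain host memory as follows. This is only an illustration under stated assumptions: `addElementSketch` is a hypothetical helper, not a TNL function, and it assumes unpadded row-major storage in a `std::vector`. The rule itself (new value = thisElementMultiplicator * old value + value) matches the example output for this method, where the diagonal entry 2 becomes 11 after `addElement( i, j, 1.0, 5.0 )`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of TNL: the matrix is stored row by row
// in a std::vector without padding.
inline void
addElementSketch( std::vector< double >& values,
                  int columns,
                  int row,
                  int column,
                  double value,
                  double thisElementMultiplicator = 1.0 )
{
   assert( row >= 0 && column >= 0 && column < columns );
   const std::size_t idx = static_cast< std::size_t >( row ) * columns + column;
   assert( idx < values.size() );
   // Scale the stored element first, then add the given value.
   values[ idx ] = thisElementMultiplicator * values[ idx ] + value;
}
```

With the default multiplicator of 1.0 the call reduces to a plain `element += value`.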

Example:

#include <iostream>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
addElements()
{
   TNL::Matrices::DenseMatrix< double, Device > matrix( 5, 5 );
   auto matrixView = matrix.getView();

   // Put the row index on the diagonal.
   for( int i = 0; i < 5; i++ )
      matrixView.setElement( i, i, i );  // or matrix.setElement

   std::cout << "Initial matrix is: " << std::endl << matrix << std::endl;

   // Multiply every element by 5.0 and then add 1.0 to it.
   for( int i = 0; i < 5; i++ )
      for( int j = 0; j < 5; j++ )
         matrixView.addElement( i, j, 1.0, 5.0 );  // or matrix.addElement

   std::cout << "Matrix after addition is: " << std::endl << matrix << std::endl;
}

int
main( int argc, char* argv[] )
{
   std::cout << "Add elements on host:" << std::endl;
   addElements< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << "Add elements on CUDA device:" << std::endl;
   addElements< TNL::Devices::Cuda >();
#endif
}

Output:

Add elements on host:
Initial matrix is:
Row: 0 -> 0:0 1:0 2:0 3:0 4:0
Row: 1 -> 0:0 1:1 2:0 3:0 4:0
Row: 2 -> 0:0 1:0 2:2 3:0 4:0
Row: 3 -> 0:0 1:0 2:0 3:3 4:0
Row: 4 -> 0:0 1:0 2:0 3:0 4:4
Matrix after addition is:
Row: 0 -> 0:1 1:1 2:1 3:1 4:1
Row: 1 -> 0:1 1:6 2:1 3:1 4:1
Row: 2 -> 0:1 1:1 2:11 3:1 4:1
Row: 3 -> 0:1 1:1 2:1 3:16 4:1
Row: 4 -> 0:1 1:1 2:1 3:1 4:21
Add elements on CUDA device:
Initial matrix is:
Row: 0 -> 0:0 1:0 2:0 3:0 4:0
Row: 1 -> 0:0 1:1 2:0 3:0 4:0
Row: 2 -> 0:0 1:0 2:2 3:0 4:0
Row: 3 -> 0:0 1:0 2:0 3:3 4:0
Row: 4 -> 0:0 1:0 2:0 3:0 4:4
Matrix after addition is:
Row: 0 -> 0:1 1:1 2:1 3:1 4:1
Row: 1 -> 0:1 1:6 2:1 3:1 4:1
Row: 2 -> 0:1 1:1 2:11 3:1 4:1
Row: 3 -> 0:1 1:1 2:1 3:16 4:1
Row: 4 -> 0:1 1:1 2:1 3:1 4:21

◆ bind()

template<typename Real , typename Device , typename Index , ElementsOrganization Organization>

__cuda_callable__ void TNL::Matrices::DenseMatrixBase< Real, Device, Index, Organization >::bind	(	IndexType	rows,
		IndexType	columns,
		typename Base::ValuesViewType	values,
		SegmentsViewType	segments )

protected

Re-initializes the internal attributes of the base class.

Note that this function is protected to ensure that the user cannot modify the base class of a matrix. For the same reason, in future code development we also need to make sure that all non-const functions in the base class return by value and not by reference.

◆ forAllElements() [1/2]

template<typename Real , typename Device , typename Index , ElementsOrganization Organization>

template<typename Function >

void TNL::Matrices::DenseMatrixBase< Real, Device, Index, Organization >::forAllElements ( Function && function )

This method calls forElements for all matrix rows.

See DenseMatrix::forAllElements.

Template Parameters

Function is a type of lambda function that will operate on matrix elements.

Parameters

function is an instance of the lambda function to be called in each row.

Example:

#include <iostream>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
forAllElementsExample()
{
   TNL::Matrices::DenseMatrix< double, Device > matrix( 5, 5 );
   auto matrixView = matrix.getView();

   auto f = [] __cuda_callable__( int rowIdx, int columnIdx, int globalIdx, double& value )
   {
      if( rowIdx >= columnIdx )
         value = rowIdx + columnIdx;
   };

   matrixView.forAllElements( f );  // or matrix.forAllElements
   std::cout << matrix << std::endl;
}

int
main( int argc, char* argv[] )
{
   std::cout << "Creating matrix on host: " << std::endl;
   forAllElementsExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << "Creating matrix on CUDA device: " << std::endl;
   forAllElementsExample< TNL::Devices::Cuda >();
#endif
}

Output:

Creating matrix on host:
Row: 0 -> 0:0 1:0 2:0 3:0 4:0
Row: 1 -> 0:1 1:2 2:0 3:0 4:0
Row: 2 -> 0:2 1:3 2:4 3:0 4:0
Row: 3 -> 0:3 1:4 2:5 3:6 4:0
Row: 4 -> 0:4 1:5 2:6 3:7 4:8
Creating matrix on CUDA device:
Row: 0 -> 0:0 1:0 2:0 3:0 4:0
Row: 1 -> 0:1 1:2 2:0 3:0 4:0
Row: 2 -> 0:2 1:3 2:4 3:0 4:0
Row: 3 -> 0:3 1:4 2:5 3:6 4:0
Row: 4 -> 0:4 1:5 2:6 3:7 4:8

◆ forAllElements() [2/2]

template<typename Real , typename Device , typename Index , ElementsOrganization Organization>

template<typename Function >

void TNL::Matrices::DenseMatrixBase< Real, Device, Index, Organization >::forAllElements ( Function && function ) const

This method calls forElements for all matrix rows.

See DenseMatrix::forElements.

Template Parameters

Function is a type of lambda function that will operate on matrix elements.

Parameters

function is an instance of the lambda function to be called in each row.

◆ forAllRows() [1/2]

template<typename Real , typename Device , typename Index , ElementsOrganization Organization>

template<typename Function >

void TNL::Matrices::DenseMatrixBase< Real, Device, Index, Organization >::forAllRows ( Function && function )

Method for parallel iteration over all matrix rows.

The given lambda function is called for each row. Each row is processed by at most one thread, unlike the method DenseMatrixBase::forAllElements, where more than one thread can be mapped to each row.

Template Parameters

Function is type of the lambda function.

Parameters

function is an instance of the lambda function to be called for each row.

auto function = [] __cuda_callable__ ( RowView& row ) { ... };

RowView represents matrix row - see TNL::Matrices::DenseMatrixBase::RowView.

Example:

#include <iostream>
#include <limits>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
forRowsExample()
{
   using MatrixType = TNL::Matrices::DenseMatrix< double, Device >;
   using RowView = typename MatrixType::RowView;
   const int size = 5;
   MatrixType matrix( size, size );
   auto view = matrix.getView();

   /***
    * Set the matrix elements.
    */
   auto f = [] __cuda_callable__( RowView & row )
   {
      const int& rowIdx = row.getRowIndex();
      if( rowIdx > 0 )
         row.setValue( rowIdx - 1, -1.0 );
      row.setValue( rowIdx, rowIdx + 1.0 );
      if( rowIdx < size - 1 )
         row.setValue( rowIdx + 1, -1.0 );
   };
   view.forAllRows( f );  // or matrix.forAllRows
   std::cout << matrix << std::endl;

   /***
    * Now divide each matrix row by its largest element - with the use of iterators.
    */
   view.forAllRows(
      [] __cuda_callable__( RowView & row )
      {
         double largest = std::numeric_limits< double >::lowest();
         for( auto element : row )
            largest = TNL::max( largest, element.value() );
         for( auto element : row )
            element.value() /= largest;
      } );
   std::cout << "Divide each matrix row by its largest element... " << std::endl;
   std::cout << matrix << std::endl;
}

int
main( int argc, char* argv[] )
{
   std::cout << "Getting matrix rows on host: " << std::endl;
   forRowsExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << "Getting matrix rows on CUDA device: " << std::endl;
   forRowsExample< TNL::Devices::Cuda >();
#endif
}

Output:

Getting matrix rows on host:
Row: 0 -> 0:1 1:-1 2:0 3:0 4:0
Row: 1 -> 0:-1 1:2 2:-1 3:0 4:0
Row: 2 -> 0:0 1:-1 2:3 3:-1 4:0
Row: 3 -> 0:0 1:0 2:-1 3:4 4:-1
Row: 4 -> 0:0 1:0 2:0 3:-1 4:5
Divide each matrix row by its largest element...
Row: 0 -> 0:1 1:-1 2:0 3:0 4:0
Row: 1 -> 0:-0.5 1:1 2:-0.5 3:0 4:0
Row: 2 -> 0:0 1:-0.333333 2:1 3:-0.333333 4:0
Row: 3 -> 0:0 1:0 2:-0.25 3:1 4:-0.25
Row: 4 -> 0:0 1:0 2:0 3:-0.2 4:1
Getting matrix rows on CUDA device:
Row: 0 -> 0:1 1:-1 2:0 3:0 4:0
Row: 1 -> 0:-1 1:2 2:-1 3:0 4:0
Row: 2 -> 0:0 1:-1 2:3 3:-1 4:0
Row: 3 -> 0:0 1:0 2:-1 3:4 4:-1
Row: 4 -> 0:0 1:0 2:0 3:-1 4:5
Divide each matrix row by its largest element...
Row: 0 -> 0:1 1:-1 2:0 3:0 4:0
Row: 1 -> 0:-0.5 1:1 2:-0.5 3:0 4:0
Row: 2 -> 0:0 1:-0.333333 2:1 3:-0.333333 4:0
Row: 3 -> 0:0 1:0 2:-0.25 3:1 4:-0.25
Row: 4 -> 0:0 1:0 2:0 3:-0.2 4:1

◆ forAllRows() [2/2]

template<typename Real , typename Device , typename Index , ElementsOrganization Organization>

template<typename Function >

void TNL::Matrices::DenseMatrixBase< Real, Device, Index, Organization >::forAllRows ( Function && function ) const

Method for parallel iteration over all matrix rows for constant instances.

The given lambda function is called for each row. Each row is processed by at most one thread, unlike the method DenseMatrixBase::forAllElements, where more than one thread can be mapped to each row.

Template Parameters

Function is type of the lambda function.

Parameters

function is an instance of the lambda function to be called for each row.

auto function = [] __cuda_callable__ ( const ConstRowView& row ) { ... };

ConstRowView represents matrix row - see TNL::Matrices::DenseMatrixBase::ConstRowView.

◆ forElements() [1/2]

template<typename Real , typename Device , typename Index , ElementsOrganization Organization>

template<typename Function >

void TNL::Matrices::DenseMatrixBase< Real, Device, Index, Organization >::forElements	(	IndexType	begin,
		IndexType	end,
		Function &&	function )

Method for iteration over elements of matrix rows from interval [begin, end) for non-constant instances.

Template Parameters

Function is the type of a lambda function that will operate on matrix elements. It should have the form

auto function = [=] __cuda_callable__ ( IndexType rowIdx, IndexType columnIdx, IndexType columnIdx_, RealType& value ) { ... };

The column index repeats twice only for compatibility with sparse matrices.

Parameters

begin	defines beginning of the range [begin,end) of rows to be processed.
end	defines ending of the range [begin,end) of rows to be processed.
function	is an instance of the lambda function to be called in each row.
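
The calling convention described above can be illustrated with a small sequential, host-only sketch. It is not the TNL implementation (which runs in parallel on the selected device): `forElementsSketch` is a hypothetical function, and unpadded row-major storage in a `std::vector` is assumed. It only shows that rows outside `[begin, end)` are skipped and that the column index is passed in both the second and the third argument.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sequential stand-in for forElements on a row-major matrix.
template< typename Function >
void
forElementsSketch( std::vector< double >& values, int columns, int begin, int end, Function&& function )
{
   assert( 0 <= begin && begin <= end && columns >= 0 );
   assert( static_cast< std::size_t >( end ) * columns <= values.size() );
   for( int rowIdx = begin; rowIdx < end; rowIdx++ )
      for( int columnIdx = 0; columnIdx < columns; columnIdx++ )
         // The column index is passed twice, mirroring the sparse-matrix signature.
         function( rowIdx,
                   columnIdx,
                   columnIdx,
                   values[ static_cast< std::size_t >( rowIdx ) * columns + columnIdx ] );
}
```

A lambda such as `[]( int rowIdx, int columnIdx, int columnIdx_, double& value ) { ... }` can be passed as `function`; because `value` is a reference, assignments inside the lambda modify the stored element.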

Example:

#include <iostream>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
forElementsExample()
{
   TNL::Matrices::DenseMatrix< double, Device > matrix( 5, 5 );
   auto matrixView = matrix.getView();

   auto f = [] __cuda_callable__( int rowIdx, int columnIdx, int globalIdx, double& value )
   {
      if( columnIdx <= rowIdx )
         value = rowIdx + columnIdx;
   };

   matrixView.forElements( 0, matrix.getRows(), f );  // or matrix.forElements
   std::cout << matrix << std::endl;
}

int
main( int argc, char* argv[] )
{
   std::cout << "Creating matrix on host: " << std::endl;
   forElementsExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << "Creating matrix on CUDA device: " << std::endl;
   forElementsExample< TNL::Devices::Cuda >();
#endif
}

Output:

Creating matrix on host:
Row: 0 -> 0:0 1:0 2:0 3:0 4:0
Row: 1 -> 0:1 1:2 2:0 3:0 4:0
Row: 2 -> 0:2 1:3 2:4 3:0 4:0
Row: 3 -> 0:3 1:4 2:5 3:6 4:0
Row: 4 -> 0:4 1:5 2:6 3:7 4:8
Creating matrix on CUDA device:
Row: 0 -> 0:0 1:0 2:0 3:0 4:0
Row: 1 -> 0:1 1:2 2:0 3:0 4:0
Row: 2 -> 0:2 1:3 2:4 3:0 4:0
Row: 3 -> 0:3 1:4 2:5 3:6 4:0
Row: 4 -> 0:4 1:5 2:6 3:7 4:8

◆ forElements() [2/2]

template<typename Real , typename Device , typename Index , ElementsOrganization Organization>

template<typename Function >

void TNL::Matrices::DenseMatrixBase< Real, Device, Index, Organization >::forElements	(	IndexType	begin,
		IndexType	end,
		Function &&	function ) const

Method for iteration over elements of matrix rows from interval [begin, end) for constant instances.

Template Parameters

Function is the type of a lambda function that will operate on matrix elements. It should have the form

auto function = [] __cuda_callable__ ( IndexType rowIdx, IndexType columnIdx, IndexType columnIdx_, const RealType& value ) { ... };

The column index repeats twice only for compatibility with sparse matrices.

Parameters

begin	defines beginning of the range [begin,end) of rows to be processed.
end	defines ending of the range [begin,end) of rows to be processed.
function	is an instance of the lambda function to be called in each row.

◆ forRows() [1/2]

template<typename Real , typename Device , typename Index , ElementsOrganization Organization>

template<typename Function >

void TNL::Matrices::DenseMatrixBase< Real, Device, Index, Organization >::forRows	(	IndexType	begin,
		IndexType	end,
		Function &&	function )

Method for parallel iteration over matrix rows from interval [begin, end).

The given lambda function is called for each row. Each row is processed by at most one thread, unlike the method DenseMatrix::forElements, where more than one thread can be mapped to each row.

Template Parameters

Function is type of the lambda function.

Parameters

begin	defines beginning of the range `[begin, end)` of rows to be processed.
end	defines ending of the range `[begin, end)` of rows to be processed.
function	is an instance of the lambda function to be called for each row.

auto function = [] __cuda_callable__ ( RowView& row ) { ... };

RowView represents matrix row - see TNL::Matrices::DenseMatrix::RowView.

Example:

#include <iostream>
#include <limits>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
forRowsExample()
{
   using MatrixType = TNL::Matrices::DenseMatrix< double, Device >;
   using RowView = typename MatrixType::RowView;
   const int size = 5;
   MatrixType matrix( size, size );
   auto view = matrix.getView();

   /***
    * Set the matrix elements.
    */
   auto f = [] __cuda_callable__( RowView & row )
   {
      const int& rowIdx = row.getRowIndex();
      if( rowIdx > 0 )
         row.setValue( rowIdx - 1, -1.0 );
      row.setValue( rowIdx, rowIdx + 1.0 );
      if( rowIdx < size - 1 )
         row.setValue( rowIdx + 1, -1.0 );
   };
   view.forAllRows( f );  // or matrix.forAllRows
   std::cout << matrix << std::endl;

   /***
    * Now divide each matrix row by its largest element - with the use of iterators.
    */
   view.forAllRows(
      [] __cuda_callable__( RowView & row )
      {
         double largest = std::numeric_limits< double >::lowest();
         for( auto element : row )
            largest = TNL::max( largest, element.value() );
         for( auto element : row )
            element.value() /= largest;
      } );
   std::cout << "Divide each matrix row by its largest element... " << std::endl;
   std::cout << matrix << std::endl;
}

int
main( int argc, char* argv[] )
{
   std::cout << "Getting matrix rows on host: " << std::endl;
   forRowsExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << "Getting matrix rows on CUDA device: " << std::endl;
   forRowsExample< TNL::Devices::Cuda >();
#endif
}

Output:

Getting matrix rows on host:
Row: 0 -> 0:1 1:-1 2:0 3:0 4:0
Row: 1 -> 0:-1 1:2 2:-1 3:0 4:0
Row: 2 -> 0:0 1:-1 2:3 3:-1 4:0
Row: 3 -> 0:0 1:0 2:-1 3:4 4:-1
Row: 4 -> 0:0 1:0 2:0 3:-1 4:5
Divide each matrix row by its largest element...
Row: 0 -> 0:1 1:-1 2:0 3:0 4:0
Row: 1 -> 0:-0.5 1:1 2:-0.5 3:0 4:0
Row: 2 -> 0:0 1:-0.333333 2:1 3:-0.333333 4:0
Row: 3 -> 0:0 1:0 2:-0.25 3:1 4:-0.25
Row: 4 -> 0:0 1:0 2:0 3:-0.2 4:1
Getting matrix rows on CUDA device:
Row: 0 -> 0:1 1:-1 2:0 3:0 4:0
Row: 1 -> 0:-1 1:2 2:-1 3:0 4:0
Row: 2 -> 0:0 1:-1 2:3 3:-1 4:0
Row: 3 -> 0:0 1:0 2:-1 3:4 4:-1
Row: 4 -> 0:0 1:0 2:0 3:-1 4:5
Divide each matrix row by its largest element...
Row: 0 -> 0:1 1:-1 2:0 3:0 4:0
Row: 1 -> 0:-0.5 1:1 2:-0.5 3:0 4:0
Row: 2 -> 0:0 1:-0.333333 2:1 3:-0.333333 4:0
Row: 3 -> 0:0 1:0 2:-0.25 3:1 4:-0.25
Row: 4 -> 0:0 1:0 2:0 3:-0.2 4:1

◆ forRows() [2/2]

template<typename Real , typename Device , typename Index , ElementsOrganization Organization>

template<typename Function >

void TNL::Matrices::DenseMatrixBase< Real, Device, Index, Organization >::forRows	(	IndexType	begin,
		IndexType	end,
		Function &&	function ) const

Method for parallel iteration over matrix rows from interval [begin, end) for constant instances.

The given lambda function is called for each row. Each row is processed by at most one thread, unlike the method DenseMatrixBase::forElements, where more than one thread can be mapped to each row.

Template Parameters

Function is type of the lambda function.

Parameters

begin	defines beginning of the range `[begin, end)` of rows to be processed.
end	defines ending of the range `[begin, end)` of rows to be processed.
function	is an instance of the lambda function to be called for each row.

auto function = [] __cuda_callable__ ( const ConstRowView& row ) { ... };

ConstRowView represents matrix row - see TNL::Matrices::DenseMatrixBase::ConstRowView.

◆ getCompressedRowLengths()

template<typename Real , typename Device , typename Index , ElementsOrganization Organization>

template<typename Vector >

void TNL::Matrices::DenseMatrixBase< Real, Device, Index, Organization >::getCompressedRowLengths ( Vector & rowLengths ) const

Computes number of non-zeros in each row.

Parameters

rowLengths is a vector into which the number of non-zeros in each row will be stored.
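
What this method computes can be sketched on plain host data as follows. This is an illustration only, assuming unpadded row-major storage in a `std::vector`; `compressedRowLengthsSketch` is a hypothetical name, and the real method works on the matrix view and writes into a TNL vector on the matrix device.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of TNL: counts the non-zero elements
// in each row of a row-major rows x columns matrix.
inline std::vector< int >
compressedRowLengthsSketch( const std::vector< double >& values, int rows, int columns )
{
   assert( rows >= 0 && columns >= 0 );
   assert( values.size() == static_cast< std::size_t >( rows ) * columns );
   std::vector< int > rowLengths( rows, 0 );
   for( int row = 0; row < rows; row++ )
      for( int column = 0; column < columns; column++ )
         if( values[ static_cast< std::size_t >( row ) * columns + column ] != 0.0 )
            rowLengths[ row ]++;
   return rowLengths;
}
```

For a dense matrix the result is generally smaller than the row capacity, since every row allocates all of its columns regardless of how many of them are zero.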

Example:

#include <iostream>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
getCompressedRowLengthsExample()
{
   TNL::Matrices::DenseMatrix< double, Device > denseMatrix{
      // clang-format off
      { 1 },
      { 2, 3 },
      { 4, 5, 6 },
      { 7, 8, 9, 10 },
      { 11, 12, 13, 14, 15 }
      // clang-format on
   };
   auto denseMatrixView = denseMatrix.getConstView();
   std::cout << denseMatrixView << std::endl;

   TNL::Containers::Vector< int, Device > rowLengths;
   denseMatrixView.getCompressedRowLengths( rowLengths );
   std::cout << "Compressed row lengths are: " << rowLengths << std::endl;
}

int
main( int argc, char* argv[] )
{
   std::cout << "Getting compressed row lengths on host: " << std::endl;
   getCompressedRowLengthsExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << "Getting compressed row lengths on CUDA device: " << std::endl;
   getCompressedRowLengthsExample< TNL::Devices::Cuda >();
#endif
}

Output:

Getting compressed row lengths on host:

Row: 0 -> 0:1 1:0 2:0 3:0 4:0

Row: 1 -> 0:2 1:3 2:0 3:0 4:0

Row: 2 -> 0:4 1:5 2:6 3:0 4:0

Row: 3 -> 0:7 1:8 2:9 3:10 4:0

Row: 4 -> 0:11 1:12 2:13 3:14 4:15

Compressed row lengths are: [ 1, 2, 3, 4, 5 ]

Getting compressed row lengths on CUDA device:

Row: 0 -> 0:1 1:0 2:0 3:0 4:0

Row: 1 -> 0:2 1:3 2:0 3:0 4:0

Row: 2 -> 0:4 1:5 2:6 3:0 4:0

Row: 3 -> 0:7 1:8 2:9 3:10 4:0

Row: 4 -> 0:11 1:12 2:13 3:14 4:15

Compressed row lengths are: [ 1, 2, 3, 4, 5 ]

◆ getElement()

template<typename Real , typename Device , typename Index , ElementsOrganization Organization>

__cuda_callable__ Real TNL::Matrices::DenseMatrixBase< Real, Device, Index, Organization >::getElement	(	IndexType	row,
		IndexType	column ) const

Returns value of matrix element at position given by its row and column index.

Parameters

row	is a row index of the matrix element.
column	is a column index of the matrix element.

Returns: value of given matrix element.

Example:

#include <iostream>

#include <iomanip>

#include <TNL/Matrices/DenseMatrix.h>

#include <TNL/Devices/Host.h>

#include <TNL/Devices/Cuda.h>

template< typename Device >

void

getElements()

{

TNL::Matrices::DenseMatrix< double, Device > matrix{

// clang-format off

{ 1, 0, 0, 0, 0 },

{ -1, 2, -1, 0, 0 },

{ 0, -1, 2, -1, 0 },

{ 0, 0, -1, 2, -1 },

{ 0, 0, 0, 0, 1 }

// clang-format on

};

auto matrixView = matrix.getConstView();

for( int i = 0; i < 5; i++ ) {

for( int j = 0; j < 5; j++ )

std::cout << std::setw( 5 ) << std::right << matrixView.getElement( i, j ); // or matrix.getElement

std::cout << std::endl;

}

}

int

main( int argc, char* argv[] )

{

std::cout << "Get elements on host:" << std::endl;

getElements< TNL::Devices::Host >();

#ifdef __CUDACC__

std::cout << "Get elements on CUDA device:" << std::endl;

getElements< TNL::Devices::Cuda >();

#endif

}

Output:

Get elements on host:

    1    0    0    0    0

   -1    2   -1    0    0

    0   -1    2   -1    0

    0    0   -1    2   -1

    0    0    0    0    1

Get elements on CUDA device:

    1    0    0    0    0

   -1    2   -1    0    0

    0   -1    2   -1    0

    0    0   -1    2   -1

    0    0    0    0    1

◆ getRow() [1/2]

template<typename Real , typename Device , typename Index , ElementsOrganization Organization>

__cuda_callable__ auto TNL::Matrices::DenseMatrixBase< Real, Device, Index, Organization >::getRow ( IndexType rowIdx )

Non-constant getter of simple structure for accessing given matrix row.

Parameters

rowIdx is matrix row index.

Returns: RowView for accessing given matrix row.

Example:

#include <iostream>

#include <TNL/Algorithms/parallelFor.h>

#include <TNL/Matrices/DenseMatrix.h>

#include <TNL/Devices/Host.h>

#include <TNL/Devices/Cuda.h>

template< typename Device >

void

getRowExample()

{

const int size = 5;

TNL::Matrices::DenseMatrix< double, Device > matrix( size, size );

/***

* Create dense matrix view which can be captured by the following lambda

* function.

*/

auto matrixView = matrix.getView();

auto f = [ = ] __cuda_callable__( int rowIdx ) mutable

{

auto row = matrixView.getRow( rowIdx );

if( rowIdx > 0 )

row.setValue( rowIdx - 1, -1.0 );

row.setValue( rowIdx, rowIdx + 1.0 );

if( rowIdx < size - 1 )

row.setValue( rowIdx + 1, -1.0 );

};

/***

* Set the matrix elements.

*/

TNL::Algorithms::parallelFor< Device >( 0, matrix.getRows(), f );

std::cout << matrix << std::endl;

}

int

main( int argc, char* argv[] )

{

std::cout << "Getting matrix rows on host: " << std::endl;

getRowExample< TNL::Devices::Host >();

#ifdef __CUDACC__

std::cout << "Getting matrix rows on CUDA device: " << std::endl;

getRowExample< TNL::Devices::Cuda >();

#endif

}

Output:

Getting matrix rows on host:

Row: 0 -> 0:1 1:-1 2:0 3:0 4:0

Row: 1 -> 0:-1 1:2 2:-1 3:0 4:0

Row: 2 -> 0:0 1:-1 2:3 3:-1 4:0

Row: 3 -> 0:0 1:0 2:-1 3:4 4:-1

Row: 4 -> 0:0 1:0 2:0 3:-1 4:5

Getting matrix rows on CUDA device:

Row: 0 -> 0:1 1:-1 2:0 3:0 4:0

Row: 1 -> 0:-1 1:2 2:-1 3:0 4:0

Row: 2 -> 0:0 1:-1 2:3 3:-1 4:0

Row: 3 -> 0:0 1:0 2:-1 3:4 4:-1

Row: 4 -> 0:0 1:0 2:0 3:-1 4:5

See DenseMatrixRowView.

◆ getRow() [2/2]

template<typename Real , typename Device , typename Index , ElementsOrganization Organization>

__cuda_callable__ auto TNL::Matrices::DenseMatrixBase< Real, Device, Index, Organization >::getRow ( IndexType rowIdx ) const

Constant getter of simple structure for accessing given matrix row.

Parameters

rowIdx is matrix row index.

Returns: ConstRowView for read-only access to the given matrix row.

Example:

#include <iostream>

#include <functional>

#include <TNL/Algorithms/reduce.h>

#include <TNL/Matrices/DenseMatrix.h>

#include <TNL/Devices/Host.h>

#include <TNL/Devices/Cuda.h>

template< typename Device >

void

getRowExample()

{

TNL::Matrices::DenseMatrix< double, Device > matrix{

// clang-format off

{ 1, 0, 0, 0, 0 },

{ 1, 2, 0, 0, 0 },

{ 1, 2, 3, 0, 0 },

{ 1, 2, 3, 4, 0 },

{ 1, 2, 3, 4, 5 }

// clang-format on

};

/***

* We need a matrix view to pass the matrix to lambda function even on CUDA device.

*/

const auto matrixView = matrix.getConstView();

/***

* Fetch lambda function returns diagonal element in each row.

*/

auto fetch = [ = ] __cuda_callable__( int rowIdx ) -> double

{

auto row = matrixView.getRow( rowIdx );

return row.getValue( rowIdx );

};

int trace = TNL::Algorithms::reduce< Device >( 0, matrix.getRows(), fetch, std::plus<>{}, 0 );

std::cout << "Matrix trace is " << trace << "." << std::endl;

}

int

main( int argc, char* argv[] )

{

std::cout << "Getting matrix rows on host: " << std::endl;

getRowExample< TNL::Devices::Host >();

#ifdef __CUDACC__

std::cout << "Getting matrix rows on CUDA device: " << std::endl;

getRowExample< TNL::Devices::Cuda >();

#endif

}

Output:

Getting matrix rows on host:

Matrix trace is 15.

Getting matrix rows on CUDA device:

Matrix trace is 15.

See DenseMatrixRowView.

◆ getRowCapacities()

template<typename Real , typename Device , typename Index , ElementsOrganization Organization>

template<typename Vector >

void TNL::Matrices::DenseMatrixBase< Real, Device, Index, Organization >::getRowCapacities ( Vector & rowCapacities ) const

Compute capacities of all rows.

The row capacities are not stored explicitly and must be computed.

Parameters

rowCapacities is a vector where the row capacities will be stored.

◆ getSerializationType()

template<typename Real , typename Device , typename Index , ElementsOrganization Organization>

std::string TNL::Matrices::DenseMatrixBase< Real, Device, Index, Organization >::getSerializationType ( )

static

Returns string with serialization type.

The string has the form `Matrices::DenseMatrix< RealType, [any_device], IndexType, [any_allocator], true/false >`.

Returns: String with the serialization type.

◆ operator!=()

template<typename Real , typename Device , typename Index , ElementsOrganization Organization>

template<typename Real_ , typename Device_ , typename Index_ >

bool TNL::Matrices::DenseMatrixBase< Real, Device, Index, Organization >::operator!= ( const DenseMatrixBase< Real_, Device_, Index_, Organization > & matrix ) const

Comparison operator with another dense matrix view.

Parameters

matrix is the right-hand side matrix.

Returns: false if the RHS matrix view is equal, true otherwise.

◆ operator()() [1/2]

template<typename Real , typename Device , typename Index , ElementsOrganization Organization>

__cuda_callable__ Real & TNL::Matrices::DenseMatrixBase< Real, Device, Index, Organization >::operator()	(	IndexType	row,
		IndexType	column )

Returns non-constant reference to the element at row `row` and column `column`.

Since this method returns a reference to the element, it cannot be called across different address spaces. It can be called only from the CPU if the matrix is allocated on the CPU, or only from GPU kernels if the matrix is allocated on the GPU.

Parameters

row	is a row index of the element.
column	is a column index of the element.

Returns: reference to given matrix element.
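
The address-space restriction follows from the return type: a reference is an address that the caller dereferences directly. The host-only sketch below models this in plain C++. `HostDenseView`, `makeView`, `setDiagonal` and the row-major layout are hypothetical names and assumptions made for illustration; this is not TNL's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a dense matrix view over host memory.
// A row-major layout is assumed here purely for illustration.
struct HostDenseView
{
   double* values;       // non-owning pointer to rows * columns elements
   std::size_t rows;
   std::size_t columns;

   // Returns a reference, so the caller reads and writes the storage directly.
   double& operator()( std::size_t row, std::size_t column )
   {
      assert( row < rows && column < columns );
      return values[ row * columns + column ];
   }

   // Constant overload: read-only access to the same storage.
   const double& operator()( std::size_t row, std::size_t column ) const
   {
      assert( row < rows && column < columns );
      return values[ row * columns + column ];
   }
};

// Wraps host storage in a view; the vector must outlive the view.
inline HostDenseView makeView( std::vector< double >& storage, std::size_t rows, std::size_t columns )
{
   assert( storage.size() == rows * columns );
   return HostDenseView{ storage.data(), rows, columns };
}

// Writes the diagonal through the reference-returning accessor.
inline void setDiagonal( HostDenseView& view, double value )
{
   for( std::size_t i = 0; i < view.rows && i < view.columns; i++ )
      view( i, i ) = value;
}
```

In this model `view( i, i ) = value` is a plain store through `values`, which is only meaningful where that pointer can be dereferenced. The value-based getElement and setElement documented on this page are the accessors whose examples are run from the host for matrices on either device.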

◆ operator()() [2/2]

template<typename Real , typename Device , typename Index , ElementsOrganization Organization>

__cuda_callable__ const Real & TNL::Matrices::DenseMatrixBase< Real, Device, Index, Organization >::operator()	(	IndexType	row,
		IndexType	column ) const

Returns constant reference to the element at row `row` and column `column`.

Parameters

row	is a row index of the element.
column	is a column index of the element.

Returns: constant reference to given matrix element.

◆ operator=()

template<typename Real , typename Device , typename Index , ElementsOrganization Organization>

DenseMatrixBase & TNL::Matrices::DenseMatrixBase< Real, Device, Index, Organization >::operator= ( const DenseMatrixBase< Real, Device, Index, Organization > & )

delete

Copy-assignment operator.

It is a deleted function, because matrix assignment in general requires reallocation.

◆ operator==()

template<typename Real , typename Device , typename Index , ElementsOrganization Organization>

template<typename Real_ , typename Device_ , typename Index_ >

bool TNL::Matrices::DenseMatrixBase< Real, Device, Index, Organization >::operator== ( const DenseMatrixBase< Real_, Device_, Index_, Organization > & matrix ) const

Comparison operator with another dense matrix view.

Parameters

matrix is the right-hand side matrix view.

Returns: true if the RHS matrix view is equal, false otherwise.

◆ print()

template<typename Real , typename Device , typename Index , ElementsOrganization Organization>

void TNL::Matrices::DenseMatrixBase< Real, Device, Index, Organization >::print ( std::ostream & str ) const

Method for printing the matrix to output stream.

Parameters

str	is the output stream.

◆ reduceAllRows()

template<typename Real , typename Device , typename Index , ElementsOrganization Organization>

template<typename Fetch , typename Reduce , typename Keep , typename FetchReal >

void TNL::Matrices::DenseMatrixBase< Real, Device, Index, Organization >::reduceAllRows	(	Fetch &	fetch,
		const Reduce &	reduce,
		Keep &	keep,
		const FetchReal &	identity ) const

Method for performing general reduction on ALL matrix rows for constant instances.

Template Parameters

Fetch is a type of lambda function for data fetch declared as

auto fetch = [] __cuda_callable__ ( IndexType rowIdx, IndexType columnIdx, RealType elementValue ) -> FetchValue { ... };

The return type of this lambda can be any non-void type.

Template Parameters

Reduce is a type of lambda function for reduction declared as

auto reduce = [] __cuda_callable__ ( const FetchValue& v1, const FetchValue& v2 ) -> FetchValue { ... };

Template Parameters

Keep	is a type of lambda function for storing results of reduction in each row. It is declared as

auto keep = [=] __cuda_callable__ ( IndexType rowIdx, const RealType& value ) { ... };

Template Parameters

FetchValue (named FetchReal in the template signature above) is the type returned by the Fetch lambda function.

Parameters

fetch	is an instance of lambda function for data fetch.
reduce	is an instance of lambda function for reduction.
keep	is an instance of lambda function for storing results.
identity	is the identity element for the reduction operation, i.e. element which does not change the result of the reduction.

Example:

#include <iostream>

#include <iomanip>

#include <functional>

#include <limits>

#include <TNL/Matrices/DenseMatrix.h>

#include <TNL/Devices/Host.h>

#include <TNL/Devices/Cuda.h>

template< typename Device >

void

reduceAllRows()

{

TNL::Matrices::DenseMatrix< double, Device > matrix{

// clang-format off

{ 1, 0, 0, 0, 0 },

{ 1, 2, 0, 0, 0 },

{ 0, 1, 8, 0, 0 },

{ 0, 0, 1, 9, 0 },

{ 0, 0, 0, 0, 1 }

// clang-format on

};

auto matrixView = matrix.getView();

/***

* Find largest element in each row.

*/

TNL::Containers::Vector< double, Device > rowMax( matrix.getRows() );

/***

* Prepare vector view and matrix view for lambdas.

*/

auto rowMaxView = rowMax.getView();

/***

* Fetch lambda just returns absolute value of matrix elements.

*/

auto fetch = [] __cuda_callable__( int rowIdx, int columnIdx, const double& value ) -> double

{

return TNL::abs( value );

};

/***

* Reduce lambda returns the maximum of given values.

*/

auto reduce = [] __cuda_callable__( const double& a, const double& b ) -> double

{

return TNL::max( a, b );

};

/***

* Keep lambda stores the largest value in each row to the vector rowMax.

*/

auto keep = [ = ] __cuda_callable__( int rowIdx, const double& value ) mutable

{

rowMaxView[ rowIdx ] = value;

};

/***

* Compute the largest values in each row.

*/

matrixView.reduceAllRows( fetch, reduce, keep, std::numeric_limits< double >::lowest() ); // or matrix.reduceAllRows

std::cout << "Max. elements in rows are: " << rowMax << std::endl;

}

int

main( int argc, char* argv[] )

{

std::cout << "All rows reduction on host:" << std::endl;

reduceAllRows< TNL::Devices::Host >();

#ifdef __CUDACC__

std::cout << "All rows reduction on CUDA device:" << std::endl;

reduceAllRows< TNL::Devices::Cuda >();

#endif

}

Output:

All rows reduction on host:

Max. elements in rows are: [ 1, 2, 8, 9, 1 ]

All rows reduction on CUDA device:

Max. elements in rows are: [ 1, 2, 8, 9, 1 ]

◆ reduceRows()

template<typename Real , typename Device , typename Index , ElementsOrganization Organization>

template<typename Fetch , typename Reduce , typename Keep , typename FetchReal >

void TNL::Matrices::DenseMatrixBase< Real, Device, Index, Organization >::reduceRows	(	IndexType	begin,
		IndexType	end,
		Fetch &	fetch,
		const Reduce &	reduce,
		Keep &	keep,
		const FetchReal &	identity ) const

Method for performing general reduction on matrix rows for constant instances.

Template Parameters

Fetch is a type of lambda function for data fetch declared as

auto fetch = [] __cuda_callable__ ( IndexType rowIdx, IndexType columnIdx, RealType elementValue ) -> FetchValue { ... };

The return type of this lambda can be any non-void type.

Template Parameters

Reduce is a type of lambda function for reduction declared as

auto reduce = [] __cuda_callable__ ( const FetchValue& v1, const FetchValue& v2 ) -> FetchValue { ... };

Template Parameters

Keep	is a type of lambda function for storing results of reduction in each row. It is declared as

auto keep = [=] __cuda_callable__ ( IndexType rowIdx, const RealType& value ) { ... };

Template Parameters

FetchValue (named FetchReal in the template signature above) is the type returned by the Fetch lambda function.

Parameters

begin	defines beginning of the range `[begin, end)` of rows to be processed.
end	defines ending of the range `[begin, end)` of rows to be processed.
fetch	is an instance of lambda function for data fetch.
reduce	is an instance of lambda function for reduction.
keep	is an instance of lambda function for storing results.
identity	is the identity element for the reduction operation, i.e. element which does not change the result of the reduction.
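
The roles of fetch, reduce, keep and identity can be modelled sequentially in plain C++. The following is a sketch of the semantics only: it assumes a row-major `std::vector` as storage, `reduceRowsModel` and `rowMaxAbs` are hypothetical names, and the documented method performs the reduction on the matrix's own device rather than in a simple loop.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sequential model of a row-wise reduction over rows [begin, end).
// `values` holds the elements in row-major order (an assumption of this sketch).
template< typename Fetch, typename Reduce, typename Keep, typename FetchValue >
void reduceRowsModel( const std::vector< double >& values,
                      std::size_t columns,
                      std::size_t begin,
                      std::size_t end,
                      Fetch fetch,
                      Reduce reduce,
                      Keep keep,
                      const FetchValue& identity )
{
   assert( begin <= end && end * columns <= values.size() );
   for( std::size_t row = begin; row < end; row++ ) {
      FetchValue result = identity;  // every row starts from the identity element
      for( std::size_t column = 0; column < columns; column++ )
         result = reduce( result, fetch( row, column, values[ row * columns + column ] ) );
      keep( row, result );  // hand the per-row result to the caller
   }
}

// Largest absolute value in each row of [begin, end); rows outside the range stay 0.
inline std::vector< double > rowMaxAbs( const std::vector< double >& values,
                                        std::size_t rows,
                                        std::size_t columns,
                                        std::size_t begin,
                                        std::size_t end )
{
   std::vector< double > out( rows, 0.0 );
   auto fetch = []( std::size_t, std::size_t, double v ) { return v < 0 ? -v : v; };
   auto reduce = []( double a, double b ) { return a < b ? b : a; };
   auto keep = [ &out ]( std::size_t row, double v ) { out[ row ] = v; };
   // 0 is an identity for the maximum of absolute values, since none of them is negative.
   reduceRowsModel( values, columns, begin, end, fetch, reduce, keep, 0.0 );
   return out;
}
```

The identity must leave the result unchanged for every value the fetch can return. Here 0 suffices because the fetch yields absolute values; a maximum over signed values would need the lowest representable number instead.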

Example:

#include <iostream>

#include <iomanip>

#include <functional>

#include <limits>

#include <TNL/Matrices/DenseMatrix.h>

#include <TNL/Devices/Host.h>

#include <TNL/Devices/Cuda.h>

template< typename Device >

void

reduceRows()

{

TNL::Matrices::DenseMatrix< double, Device > matrix{

// clang-format off

{ 1, 0, 0, 0, 0 },

{ 1, 2, 0, 0, 0 },

{ 0, 1, 8, 0, 0 },

{ 0, 0, 1, 9, 0 },

{ 0, 0, 0, 0, 1 }

// clang-format on

};

auto matrixView = matrix.getView();

/***

* Find largest element in each row.

*/

TNL::Containers::Vector< double, Device > rowMax( matrix.getRows() );

/***

* Prepare vector view for lambdas.

*/

auto rowMaxView = rowMax.getView();

/***

* Fetch lambda just returns absolute value of matrix elements.

*/

auto fetch = [] __cuda_callable__( int rowIdx, int columnIdx, const double& value ) -> double

{

return TNL::abs( value );

};

/***

* Reduce lambda returns the maximum of given values.

*/

auto reduce = [] __cuda_callable__( const double& a, const double& b ) -> double

{

return TNL::max( a, b );

};

/***

* Keep lambda stores the largest value in each row to the vector rowMax.

*/

auto keep = [ = ] __cuda_callable__( int rowIdx, const double& value ) mutable

{

rowMaxView[ rowIdx ] = value;

};

/***

* Compute the largest values in each row.

*/

matrixView.reduceRows(

0, matrix.getRows(), fetch, reduce, keep, std::numeric_limits< double >::lowest() ); // or matrix.reduceRows

std::cout << "Max. elements in rows are: " << rowMax << std::endl;

}

int

main( int argc, char* argv[] )

{

std::cout << "Rows reduction on host:" << std::endl;

reduceRows< TNL::Devices::Host >();

#ifdef __CUDACC__

std::cout << "Rows reduction on CUDA device:" << std::endl;

reduceRows< TNL::Devices::Cuda >();

#endif

}

Output:

Rows reduction on host:

Max. elements in rows are: [ 1, 2, 8, 9, 1 ]

Rows reduction on CUDA device:

Max. elements in rows are: [ 1, 2, 8, 9, 1 ]

◆ sequentialForAllRows() [1/2]

template<typename Real , typename Device , typename Index , ElementsOrganization Organization>

template<typename Function >

void TNL::Matrices::DenseMatrixBase< Real, Device, Index, Organization >::sequentialForAllRows ( Function && function )

This method calls sequentialForRows for all matrix rows.

See DenseMatrixBase::sequentialForRows.

Template Parameters

Function is a type of lambda function that will operate on matrix elements.

Parameters

function is an instance of the lambda function to be called in each row.

◆ sequentialForAllRows() [2/2]

template<typename Real , typename Device , typename Index , ElementsOrganization Organization>

template<typename Function >

void TNL::Matrices::DenseMatrixBase< Real, Device, Index, Organization >::sequentialForAllRows ( Function && function ) const

This method calls sequentialForRows for all matrix rows (for constant instances).

See DenseMatrixBase::sequentialForRows.

Template Parameters

Function is a type of lambda function that will operate on matrix elements.

Parameters

function is an instance of the lambda function to be called in each row.

◆ sequentialForRows() [1/2]

template<typename Real , typename Device , typename Index , ElementsOrganization Organization>

template<typename Function >

void TNL::Matrices::DenseMatrixBase< Real, Device, Index, Organization >::sequentialForRows	(	IndexType	begin,
		IndexType	end,
		Function &&	function )

Method for sequential iteration over matrix rows from interval [begin, end) for non-constant instances.

Template Parameters

Function is the type of the lambda function that will operate on matrix elements. It should have the form

auto function = [] __cuda_callable__ ( RowView& row ) { ... };

RowView represents matrix row - see TNL::Matrices::DenseMatrixBase::RowView.

Parameters

begin	defines beginning of the range [begin,end) of rows to be processed.
end	defines ending of the range [begin,end) of rows to be processed.
function	is an instance of the lambda function to be called in each row.
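
Because the rows are visited one after another, the lambda may safely update state shared between calls. The sketch below models that behaviour in plain C++. `RowViewModel`, `sequentialForRowsModel` and `fillDiagonalInOrder` are hypothetical, simplified stand-ins, and the row-major storage is an assumption of the sketch, not a statement about TNL's internals.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified row accessor used only by this sketch.
struct RowViewModel
{
   double* data;        // first element of the row
   std::size_t size;    // number of columns
   std::size_t rowIdx;  // index of the row within the matrix

   std::size_t getSize() const { return size; }
   std::size_t getRowIndex() const { return rowIdx; }

   void setValue( std::size_t column, double value )
   {
      assert( column < size );
      data[ column ] = value;
   }

   double getValue( std::size_t column ) const
   {
      assert( column < size );
      return data[ column ];
   }
};

// Visits rows begin, begin + 1, ..., end - 1 in order, one at a time.
template< typename Function >
void sequentialForRowsModel( std::vector< double >& values,
                             std::size_t columns,
                             std::size_t begin,
                             std::size_t end,
                             Function&& function )
{
   assert( begin <= end && end * columns <= values.size() );
   for( std::size_t row = begin; row < end; row++ ) {
      RowViewModel view{ values.data() + row * columns, columns, row };
      function( view );
   }
}

// Sets the diagonal element of every visited row and records the visiting order.
inline std::vector< std::size_t > fillDiagonalInOrder( std::vector< double >& values,
                                                       std::size_t columns,
                                                       std::size_t begin,
                                                       std::size_t end )
{
   std::vector< std::size_t > visited;
   sequentialForRowsModel( values, columns, begin, end, [ &visited ]( RowViewModel& row ) {
      if( row.getRowIndex() < row.getSize() )
         row.setValue( row.getRowIndex(), row.getRowIndex() + 1.0 );
      visited.push_back( row.getRowIndex() );  // safe only because calls do not overlap
   } );
   return visited;
}
```

Appending to `visited` from inside the lambda would be a data race if rows were processed concurrently, which is the practical difference from the parallel forRows described earlier on this page.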

◆ sequentialForRows() [2/2]

template<typename Real , typename Device , typename Index , ElementsOrganization Organization>

template<typename Function >

void TNL::Matrices::DenseMatrixBase< Real, Device, Index, Organization >::sequentialForRows	(	IndexType	begin,
		IndexType	end,
		Function &&	function ) const

Method for sequential iteration over matrix rows from interval [begin, end) for constant instances.

Template Parameters

Function is the type of the lambda function that will operate on matrix elements. It should have the form

auto function = [] __cuda_callable__ ( const ConstRowView& row ) { ... };

ConstRowView represents matrix row - see TNL::Matrices::DenseMatrixBase::ConstRowView.

Parameters

begin	defines beginning of the range [begin,end) of rows to be processed.
end	defines ending of the range [begin,end) of rows to be processed.
function	is an instance of the lambda function to be called in each row.

◆ setElement()

template<typename Real , typename Device , typename Index , ElementsOrganization Organization>

__cuda_callable__ void TNL::Matrices::DenseMatrixBase< Real, Device, Index, Organization >::setElement	(	IndexType	row,
		IndexType	column,
		const RealType &	value )

Sets element at given row and column to given value.

Parameters

row	is row index of the element.
column	is column index of the element.
value	is the value the element will be set to.

Example:

#include <iostream>

#include <TNL/Algorithms/parallelFor.h>

#include <TNL/Containers/StaticArray.h>

#include <TNL/Matrices/DenseMatrix.h>

#include <TNL/Devices/Host.h>

#include <TNL/Devices/Cuda.h>

template< typename Device >

void

setElements()

{

TNL::Matrices::DenseMatrix< double, Device > matrix( 5, 5 );

auto matrixView = matrix.getView();

for( int i = 0; i < 5; i++ )

matrixView.setElement( i, i, i ); // or matrix.setElement

std::cout << "Matrix set from the host:" << std::endl;

std::cout << matrix << std::endl;

auto f = [ = ] __cuda_callable__( const TNL::Containers::StaticArray< 2, int >& i ) mutable

{

matrixView.addElement( i[ 0 ], i[ 1 ], 5.0 );

};

TNL::Containers::StaticArray< 2, int > begin = { 0, 0 };

TNL::Containers::StaticArray< 2, int > end = { 5, 5 };

TNL::Algorithms::parallelFor< Device >( begin, end, f );

std::cout << "Matrix set from its native device:" << std::endl;

std::cout << matrix << std::endl;

}

int

main( int argc, char* argv[] )

{

std::cout << "Set elements on host:" << std::endl;

setElements< TNL::Devices::Host >();

#ifdef __CUDACC__

std::cout << "Set elements on CUDA device:" << std::endl;

setElements< TNL::Devices::Cuda >();

#endif

}

Output:

Set elements on host:

Matrix set from the host:

Row: 0 -> 0:0 1:0 2:0 3:0 4:0

Row: 1 -> 0:0 1:1 2:0 3:0 4:0

Row: 2 -> 0:0 1:0 2:2 3:0 4:0

Row: 3 -> 0:0 1:0 2:0 3:3 4:0

Row: 4 -> 0:0 1:0 2:0 3:0 4:4

Matrix set from its native device:

Row: 0 -> 0:5 1:5 2:5 3:5 4:5

Row: 1 -> 0:5 1:6 2:5 3:5 4:5

Row: 2 -> 0:5 1:5 2:7 3:5 4:5

Row: 3 -> 0:5 1:5 2:5 3:8 4:5

Row: 4 -> 0:5 1:5 2:5 3:5 4:9

Set elements on CUDA device:

Matrix set from the host:

Row: 0 -> 0:0 1:0 2:0 3:0 4:0

Row: 1 -> 0:0 1:1 2:0 3:0 4:0

Row: 2 -> 0:0 1:0 2:2 3:0 4:0

Row: 3 -> 0:0 1:0 2:0 3:3 4:0

Row: 4 -> 0:0 1:0 2:0 3:0 4:4

Matrix set from its native device:

Row: 0 -> 0:5 1:5 2:5 3:5 4:5

Row: 1 -> 0:5 1:6 2:5 3:5 4:5

Row: 2 -> 0:5 1:5 2:7 3:5 4:5

Row: 3 -> 0:5 1:5 2:5 3:8 4:5

Row: 4 -> 0:5 1:5 2:5 3:5 4:9

◆ setValue()

template<typename Real , typename Device , typename Index , ElementsOrganization Organization>

void TNL::Matrices::DenseMatrixBase< Real, Device, Index, Organization >::setValue ( const RealType & v )

Sets all matrix elements to value v.

Parameters

v	is value all matrix elements will be set to.

◆ vectorProduct()

template<typename Real , typename Device , typename Index , ElementsOrganization Organization>

template<typename InVector , typename OutVector >

void TNL::Matrices::DenseMatrixBase< Real, Device, Index, Organization >::vectorProduct	(	const InVector &	inVector,
		OutVector &	outVector,
		const RealType &	matrixMultiplicator = 1.0,
		const RealType &	outVectorMultiplicator = 0.0,
		IndexType	begin = 0,
		IndexType	end = 0 ) const

Computes product of matrix and vector.

More precisely, it computes:

outVector = matrixMultiplicator * ( *this ) * inVector + outVectorMultiplicator * outVector

Template Parameters

InVector	is type of input vector. It can be TNL::Containers::Vector, TNL::Containers::VectorView, TNL::Containers::Array, TNL::Containers::ArrayView, or similar container.
OutVector	is type of output vector. It can be TNL::Containers::Vector, TNL::Containers::VectorView, TNL::Containers::Array, TNL::Containers::ArrayView, or similar container.

Parameters

inVector	is input vector.
outVector	is output vector.
matrixMultiplicator	is a factor by which the matrix is multiplied. It is one by default.
outVectorMultiplicator	is a factor by which the outVector is multiplied before it is added to the result of the matrix-vector product. It is zero by default.
begin	is the beginning of the rows range for which the vector product is computed. It is zero by default.
end	is the end of the rows range for which the vector product is computed. It is the number of matrix rows by default.

Note that the output vector dimension must be the same as the number of matrix rows no matter how we set begin and end parameters. These parameters just say that some matrix rows and the output vector elements are omitted.
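
The formula and the effect of the row range can be modelled sequentially in plain C++. This is a sketch under stated assumptions, not TNL's implementation: `vectorProductModel` is a hypothetical name, the storage is assumed to be row-major, and treating `end = 0` as "all rows" is an assumption inferred from the default value in the signature and the parameter description.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sequential model of
//    outVector = matrixMultiplicator * A * inVector + outVectorMultiplicator * outVector
// restricted to rows [begin, end). Rows outside the range are left untouched.
inline void vectorProductModel( const std::vector< double >& values,
                                std::size_t rows,
                                std::size_t columns,
                                const std::vector< double >& inVector,
                                std::vector< double >& outVector,
                                double matrixMultiplicator = 1.0,
                                double outVectorMultiplicator = 0.0,
                                std::size_t begin = 0,
                                std::size_t end = 0 )
{
   assert( values.size() == rows * columns );
   assert( inVector.size() == columns );
   assert( outVector.size() == rows );  // full length, regardless of begin and end
   if( end == 0 )
      end = rows;  // assumption of this sketch: 0 stands for "up to the last row"
   assert( begin <= end && end <= rows );

   for( std::size_t row = begin; row < end; row++ ) {
      double sum = 0.0;
      for( std::size_t column = 0; column < columns; column++ )
         sum += values[ row * columns + column ] * inVector[ column ];
      outVector[ row ] = matrixMultiplicator * sum + outVectorMultiplicator * outVector[ row ];
   }
}
```

With the default multiplicators the call reduces to `outVector = A * inVector`. Passing `outVectorMultiplicator = 1.0` accumulates the product into the existing contents of `outVector`.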

The documentation for this class was generated from the following files:

src/TNL/Matrices/DenseMatrixBase.h
src/TNL/Matrices/DenseMatrixBase.hpp
