Template Numerical Library version\ main:bb09b17
Loading...
Searching...
No Matches
TNL::Matrices::LambdaMatrix< MatrixElementsLambda, CompressedRowLengthsLambda, Real, Device, Index > Class Template Reference

"Matrix-free matrix" based on lambda functions. More...

#include <TNL/Matrices/LambdaMatrix.h>

Public Types

using CompressedRowLengthsLambdaType = CompressedRowLengthsLambda
 Type of the lambda function returning the number of non-zero elements in each row.
 
using ConstRowView = RowView
 Type of constant Lambda matrix row view.
 
using DeviceType = Device
 The device where the matrix is allocated.
 
using IndexType = Index
 The type used for matrix elements indexing.
 
using MatrixElementsLambdaType = MatrixElementsLambda
 Type of the lambda function returning the matrix elements.
 
using RealType = Real
 The type of matrix elements.
 
using RowView = LambdaMatrixRowView< MatrixElementsLambdaType, CompressedRowLengthsLambdaType, RealType, IndexType >
 Type of Lambda matrix row view.
 

Public Member Functions

 LambdaMatrix (const LambdaMatrix &matrix)=default
 Copy constructor.
 
 LambdaMatrix (IndexType rows, IndexType columns, MatrixElementsLambda &matrixElements, CompressedRowLengthsLambda &compressedRowLengths)
 Constructor with matrix dimensions and lambda functions defining the matrix elements.
 
 LambdaMatrix (LambdaMatrix &&matrix) noexcept=default
 Move constructor.
 
 LambdaMatrix (MatrixElementsLambda &matrixElements, CompressedRowLengthsLambda &compressedRowLengths)
 Constructor with lambda functions defining the matrix elements.
 
template<typename Function >
void forAllElements (Function &function) const
 This method calls forElements for all matrix rows (for constant instances).
 
template<typename Function >
void forAllRows (Function &&function) const
 Method for parallel iteration over all matrix rows for constant instances.
 
template<typename Function >
void forElements (IndexType begin, IndexType end, Function &function) const
 Method for iteration over all matrix rows for constant instances.
 
template<typename Function >
void forRows (IndexType begin, IndexType end, Function &&function) const
 Method for parallel iteration over matrix rows from interval [ begin, end) for constant instances.
 
__cuda_callable__ IndexType getColumns () const
 Returns a number of matrix columns.
 
template<typename RowLengthsVector >
void getCompressedRowLengths (RowLengthsVector &rowLengths) const
 Computes number of non-zeros in each row.
 
template<typename Vector >
void getCompressedRowLengths (Vector &rowLengths) const
 
__cuda_callable__ const CompressedRowLengthsLambda & getCompressedRowLengthsLambda () const
 Get reference to the lambda function returning number of non-zero elements in each row.
 
RealType getElement (IndexType row, IndexType column) const
 Returns value of matrix element at position given by its row and column index.
 
__cuda_callable__ const MatrixElementsLambda & getMatrixElementsLambda () const
 Get reference to the lambda function returning the matrix elements values and column indexes.
 
IndexType getNonzeroElementsCount () const
 Returns number of non-zero matrix elements.
 
__cuda_callable__ ConstRowView getRow (IndexType rowIdx) const
 Getter of simple structure for accessing given matrix row.
 
template<typename Vector >
void getRowCapacities (Vector &rowCapacities) const
 Compute capacities of all rows.
 
__cuda_callable__ IndexType getRows () const
 Returns a number of matrix rows.
 
void print (std::ostream &str) const
 Method for printing the matrix to output stream.
 
template<typename Fetch , typename Reduce , typename Keep , typename FetchReal >
void reduceAllRows (Fetch &&fetch, const Reduce &reduce, Keep &&keep, const FetchReal &identity) const
 Method for performing general reduction on ALL matrix rows.
 
template<typename Fetch , typename Reduce , typename Keep , typename FetchReal >
void reduceRows (IndexType begin, IndexType end, Fetch &&fetch, const Reduce &reduce, Keep &&keep, const FetchReal &identity) const
 Method for performing general reduction on matrix rows.
 
template<typename Function >
void sequentialForAllRows (Function &&function) const
 This method calls sequentialForRows for all matrix rows (for constant instances).
 
template<typename Function >
void sequentialForRows (IndexType begin, IndexType end, Function &&function) const
 Method for sequential iteration over all matrix rows for constant instances.
 
void setDimensions (IndexType rows, IndexType columns)
 Set number of rows and columns of this matrix.
 
template<typename InVector , typename OutVector >
void vectorProduct (const InVector &inVector, OutVector &outVector, const RealType &matrixMultiplicator=1.0, const RealType &outVectorMultiplicator=0.0, IndexType begin=0, IndexType end=0) const
 Computes product of matrix and vector.
 

Static Public Member Functions

static constexpr bool isBinary ()
 
static constexpr bool isSymmetric ()
 

Protected Attributes

IndexType columns
 
CompressedRowLengthsLambda compressedRowLengthsLambda
 
MatrixElementsLambda matrixElementsLambda
 
IndexType rows
 

Detailed Description

template<typename MatrixElementsLambda, typename CompressedRowLengthsLambda, typename Real = double, typename Device = Devices::Host, typename Index = int>
class TNL::Matrices::LambdaMatrix< MatrixElementsLambda, CompressedRowLengthsLambda, Real, Device, Index >

"Matrix-free matrix" based on lambda functions.

The elements of this matrix are not stored explicitly in memory but implicitly on a form of lambda functions.

Template Parameters
MatrixElementsLambdais a lambda function returning matrix elements values and positions.
MatrixElementsLambdais a lambda function returning matrix elements values and positions.

It has the following form:

auto matrixElements = [] __cuda_callable__ ( Index rows, Index columns, Index rowIdx, Index localIdx, Index& columnIdx, Real&
value ) { ... }
#define __cuda_callable__
Definition Macros.h:49
Definition Real.h:14

where rows is the number of matrix rows, columns is the number of matrix columns, rowIdx is the index of matrix row being queried, localIdx is the rank of the non-zero element in given row, columnIdx is a column index of the matrix element computed by this lambda and value is a value of the matrix element computed by this lambda.

Template Parameters
CompressedRowLengthsLambdais a lambda function returning a number of non-zero elements in each row.

It has the following form:

auto rowLengths = [] __cuda_callable__ ( Index rows, Index columns, Index rowIdx ) -> IndexType { ... }
Index IndexType
The type used for matrix elements indexing.
Definition LambdaMatrix.h:69

where rows is the number of matrix rows, columns is the number of matrix columns and rowIdx is an index of the row being queried.

Template Parameters
Realis a type of matrix elements values.
Deviceis a device on which the lambda functions will be evaluated.
Indexis a type to be used for indexing.

Constructor & Destructor Documentation

◆ LambdaMatrix() [1/4]

template<typename MatrixElementsLambda , typename CompressedRowLengthsLambda , typename Real , typename Device , typename Index >
TNL::Matrices::LambdaMatrix< MatrixElementsLambda, CompressedRowLengthsLambda, Real, Device, Index >::LambdaMatrix ( MatrixElementsLambda & matrixElements,
CompressedRowLengthsLambda & compressedRowLengths )

Constructor with lambda functions defining the matrix elements.

Note: It might be difficult to express the types of the lambdas. For easier creation of LambdaMatrix you may use LambdaMatrixFactory.

Parameters
matrixElementsis a lambda function giving matrix elements position and value.
compressedRowLengthsis a lambda function returning how many non-zero matrix elements are in given row.
Example
#include <iostream>
#include <TNL/Matrices/LambdaMatrix.h>
int
main( int argc, char* argv[] )
{
/***
* Lambda functions defining the matrix.
*/
auto compressedRowLengths = [ = ] __cuda_callable__( const int rows, const int columns, const int rowIdx ) -> int
{
return 1;
};
auto matrixElements1 =
const int rows, const int columns, const int rowIdx, const int localIdx, int& columnIdx, double& value )
{
columnIdx = rowIdx;
value = 1.0;
};
auto matrixElements2 =
const int rows, const int columns, const int rowIdx, const int localIdx, int& columnIdx, double& value )
{
columnIdx = rowIdx;
value = rowIdx;
};
const int size = 5;
/***
* Matrix construction with explicit type definition.
*/
matrixElements1, compressedRowLengths ) );
MatrixType m1( size, size, matrixElements1, compressedRowLengths );
/***
* Matrix construction using 'auto'.
*/
auto m2 =
m2.setDimensions( size, size );
std::cout << "The first lambda matrix: " << std::endl << m1 << std::endl;
std::cout << "The second lambda matrix: " << std::endl << m2 << std::endl;
}
T endl(T... args)
static auto create(MatrixElementsLambda &matrixElementsLambda, CompressedRowLengthsLambda &compressedRowLengthsLambda) -> LambdaMatrix< MatrixElementsLambda, CompressedRowLengthsLambda, Real, Device, Index >
Creates lambda matrix with given lambda functions.
Definition LambdaMatrix.h:572
Output
The first lambda matrix:
Row: 0 -> 0:1
Row: 1 -> 1:1
Row: 2 -> 2:1
Row: 3 -> 3:1
Row: 4 -> 4:1
The second lambda matrix:
Row: 0 ->
Row: 1 -> 1:1
Row: 2 -> 2:2
Row: 3 -> 3:3
Row: 4 -> 4:4

◆ LambdaMatrix() [2/4]

template<typename MatrixElementsLambda , typename CompressedRowLengthsLambda , typename Real , typename Device , typename Index >
TNL::Matrices::LambdaMatrix< MatrixElementsLambda, CompressedRowLengthsLambda, Real, Device, Index >::LambdaMatrix ( IndexType rows,
IndexType columns,
MatrixElementsLambda & matrixElements,
CompressedRowLengthsLambda & compressedRowLengths )

Constructor with matrix dimensions and lambda functions defining the matrix elements.

Note: It might be difficult to express the types of the lambdas. For easier creation of LambdaMatrix you may use LambdaMatrixFactory.

Parameters
rowsis a number of the matrix rows.
columnsis a number of the matrix columns.
matrixElementsis a lambda function giving matrix elements position and value.
compressedRowLengthsis a lambda function returning how many non-zero matrix elements are in given row.
Example
#include <iostream>
#include <TNL/Matrices/LambdaMatrix.h>
int
main( int argc, char* argv[] )
{
/***
* Lambda functions defining the matrix.
*/
auto compressedRowLengths = [ = ] __cuda_callable__( const int rows, const int columns, const int rowIdx ) -> int
{
return 1;
};
auto matrixElements1 =
const int rows, const int columns, const int rowIdx, const int localIdx, int& columnIdx, double& value )
{
columnIdx = rowIdx;
value = 1.0;
};
auto matrixElements2 =
const int rows, const int columns, const int rowIdx, const int localIdx, int& columnIdx, double& value )
{
columnIdx = rowIdx;
value = rowIdx;
};
const int size = 5;
/***
* Matrix construction with explicit type definition.
*/
matrixElements1, compressedRowLengths ) );
MatrixType m1( size, size, matrixElements1, compressedRowLengths );
/***
* Matrix construction using 'auto'.
*/
auto m2 =
m2.setDimensions( size, size );
std::cout << "The first lambda matrix: " << std::endl << m1 << std::endl;
std::cout << "The second lambda matrix: " << std::endl << m2 << std::endl;
}
Output
The first lambda matrix:
Row: 0 -> 0:1
Row: 1 -> 1:1
Row: 2 -> 2:1
Row: 3 -> 3:1
Row: 4 -> 4:1
The second lambda matrix:
Row: 0 ->
Row: 1 -> 1:1
Row: 2 -> 2:2
Row: 3 -> 3:3
Row: 4 -> 4:4

◆ LambdaMatrix() [3/4]

template<typename MatrixElementsLambda , typename CompressedRowLengthsLambda , typename Real = double, typename Device = Devices::Host, typename Index = int>
TNL::Matrices::LambdaMatrix< MatrixElementsLambda, CompressedRowLengthsLambda, Real, Device, Index >::LambdaMatrix ( const LambdaMatrix< MatrixElementsLambda, CompressedRowLengthsLambda, Real, Device, Index > & matrix)
default

Copy constructor.

Parameters
matrixis input matrix.

◆ LambdaMatrix() [4/4]

template<typename MatrixElementsLambda , typename CompressedRowLengthsLambda , typename Real = double, typename Device = Devices::Host, typename Index = int>
TNL::Matrices::LambdaMatrix< MatrixElementsLambda, CompressedRowLengthsLambda, Real, Device, Index >::LambdaMatrix ( LambdaMatrix< MatrixElementsLambda, CompressedRowLengthsLambda, Real, Device, Index > && matrix)
defaultnoexcept

Move constructor.

Parameters
matrixis input matrix.

Member Function Documentation

◆ forAllElements()

template<typename MatrixElementsLambda , typename CompressedRowLengthsLambda , typename Real , typename Device , typename Index >
template<typename Function >
void TNL::Matrices::LambdaMatrix< MatrixElementsLambda, CompressedRowLengthsLambda, Real, Device, Index >::forAllElements ( Function & function) const

This method calls forElements for all matrix rows (for constant instances).

See LambdaMatrix::forElements.

Template Parameters
Functionis a type of lambda function that will operate on matrix elements.
Parameters
functionis an instance of the lambda function to be called in each row.
Example
#include <iostream>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Matrices/LambdaMatrix.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>
template< typename Device >
void
forAllElementsExample()
{
/***
* Lambda functions defining the matrix.
*/
auto compressedRowLengths = [ = ] __cuda_callable__( const int rows, const int columns, const int rowIdx ) -> int
{
return columns;
};
auto matrixElements =
const int rows, const int columns, const int rowIdx, const int localIdx, int& columnIdx, double& value )
{
columnIdx = localIdx;
value = TNL::max( rowIdx - columnIdx + 1, 0 );
};
auto matrix = MatrixFactory::create( 5, 5, matrixElements, compressedRowLengths );
auto denseView = denseMatrix.getView();
auto f = [ = ] __cuda_callable__( int rowIdx, int localIdx, int columnIdx, double value ) mutable
{
denseView.setElement( rowIdx, columnIdx, value );
};
matrix.forAllElements( f );
std::cout << "Original lambda matrix:" << std::endl << matrix << std::endl;
std::cout << "Dense matrix:" << std::endl << denseMatrix << std::endl;
}
int
main( int argc, char* argv[] )
{
std::cout << "Copying matrix on host: " << std::endl;
forAllElementsExample< TNL::Devices::Host >();
#ifdef __CUDACC__
std::cout << "Copying matrix on CUDA device: " << std::endl;
forAllElementsExample< TNL::Devices::Cuda >();
#endif
}
Implementation of dense matrix, i.e. matrix storing explicitly all of its elements including zeros.
Definition DenseMatrix.h:32
constexpr ResultType max(const T1 &a, const T2 &b)
This function returns maximum of two numbers.
Definition Math.h:48
Helper class for creating instances of LambdaMatrix.
Definition LambdaMatrix.h:553
Output
Copying matrix on host:
Original lambda matrix:
Row: 0 -> 0:1
Row: 1 -> 0:2 1:1
Row: 2 -> 0:3 1:2 2:1
Row: 3 -> 0:4 1:3 2:2 3:1
Row: 4 -> 0:5 1:4 2:3 3:2 4:1
Dense matrix:
Row: 0 -> 0:1 1:0 2:0 3:0 4:0
Row: 1 -> 0:2 1:1 2:0 3:0 4:0
Row: 2 -> 0:3 1:2 2:1 3:0 4:0
Row: 3 -> 0:4 1:3 2:2 3:1 4:0
Row: 4 -> 0:5 1:4 2:3 3:2 4:1
Copying matrix on CUDA device:
Original lambda matrix:
Row: 0 -> 0:1
Row: 1 -> 0:2 1:1
Row: 2 -> 0:3 1:2 2:1
Row: 3 -> 0:4 1:3 2:2 3:1
Row: 4 -> 0:5 1:4 2:3 3:2 4:1
Dense matrix:
Row: 0 -> 0:1 1:0 2:0 3:0 4:0
Row: 1 -> 0:2 1:1 2:0 3:0 4:0
Row: 2 -> 0:3 1:2 2:1 3:0 4:0
Row: 3 -> 0:4 1:3 2:2 3:1 4:0
Row: 4 -> 0:5 1:4 2:3 3:2 4:1

◆ forAllRows()

template<typename MatrixElementsLambda , typename CompressedRowLengthsLambda , typename Real , typename Device , typename Index >
template<typename Function >
void TNL::Matrices::LambdaMatrix< MatrixElementsLambda, CompressedRowLengthsLambda, Real, Device, Index >::forAllRows ( Function && function) const

Method for parallel iteration over all matrix rows for constant instances.

In each row, given lambda function is performed. Each row is processed by at most one thread unlike the method LambdaMatrix::forAllElements where more than one thread can be mapped to each row.

Template Parameters
Functionis type of the lambda function.
Parameters
functionis an instance of the lambda function to be called for each row.
auto function = [] __cuda_callable__ ( RowView& row ) { ... };
RowView is a simple structure for accessing rows of Lambda matrix.
Definition LambdaMatrixRowView.h:50

RowView represents matrix row - see TNL::Matrices::LambdaMatrix::RowView.

Example
#include <iostream>
#include <TNL/Matrices/LambdaMatrix.h>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>
template< typename Device >
void
forRowsExample()
{
/***
* Set matrix representing approximation of the Laplace operator on regular
* grid using the finite difference method.
*/
const int gridSize( 4 );
const int matrixSize = gridSize * gridSize;
auto rowLengths = [ = ] __cuda_callable__( const int rows, const int columns, const int rowIdx ) -> int
{
const int gridRow = rowIdx / gridSize; // coordinates in the numerical grid
const int gridColumn = rowIdx % gridSize;
if( gridRow == 0 || gridRow == gridSize - 1 || // boundary grid node
gridColumn == 0 || gridColumn == gridSize - 1 )
return 1;
return 5;
};
auto matrixElements =
const int rows, const int columns, const int rowIdx, const int localIdx, int& columnIdx, double& value )
{
const int gridRow = rowIdx / gridSize; // coordinates in the numerical grid
const int gridColumn = rowIdx % gridSize;
if( gridRow == 0 || gridRow == gridSize - 1 || // boundary grid node
gridColumn == 0 || gridColumn == gridSize - 1 )
{
columnIdx = rowIdx; // diagonal element ....
value = 1.0; // ... is set to 1
}
else // interior grid node
{
switch( localIdx ) // set diagonal element to 4
{ // and the others to -1
case 0:
columnIdx = rowIdx - gridSize;
value = 1;
break;
case 1:
columnIdx = rowIdx - 1;
value = 1;
break;
case 2:
columnIdx = rowIdx;
value = -4;
break;
case 3:
columnIdx = rowIdx + 1;
value = 1;
break;
case 4:
columnIdx = rowIdx + gridSize;
value = 1;
break;
}
}
};
auto matrix =
TNL::Matrices::LambdaMatrixFactory< double, Device, int >::create( matrixSize, matrixSize, matrixElements, rowLengths );
using MatrixType = decltype( matrix );
using RowView = typename MatrixType::RowView;
TNL::Matrices::DenseMatrix< double, Device > denseMatrix( matrixSize, matrixSize );
denseMatrix.setValue( 0.0 );
auto dense_view = denseMatrix.getView();
auto f = [ = ] __cuda_callable__( const RowView& row ) mutable
{
auto dense_row = dense_view.getRow( row.getRowIndex() );
for( int localIdx = 0; localIdx < row.getSize(); localIdx++ )
dense_row.setValue( row.getColumnIndex( localIdx ), row.getValue( localIdx ) );
};
matrix.forAllRows( f );
std::cout << "Laplace operator lambda matrix: " << std::endl << matrix << std::endl;
std::cout << "Laplace operator dense matrix: " << std::endl << denseMatrix << std::endl;
/***
* Compute sum of elements in each row and store it into a vector.
*/
auto sum_view = sum_vector.getView();
matrix.forAllRows(
[ = ] __cuda_callable__( const RowView& row ) mutable
{
double sum( 0.0 );
for( auto element : row )
sum += TNL::abs( element.value() );
sum_view[ row.getRowIndex() ] = sum;
} );
std::cout << "Sums in matrix rows = " << sum_vector << std::endl;
}
int
main( int argc, char* argv[] )
{
std::cout << "Running example on CPU ... " << std::endl;
forRowsExample< TNL::Devices::Host >();
#ifdef __CUDACC__
std::cout << "Running example on CUDA GPU ... " << std::endl;
forRowsExample< TNL::Devices::Cuda >();
#endif
}
Vector extends Array with algebraic operations.
Definition Vector.h:36
__cuda_callable__ T abs(const T &n)
This function returns absolute value of given number n.
Definition Math.h:74
Output
Running example on CPU ...
Laplace operator lambda matrix:
Row: 0 -> 0:1
Row: 1 -> 1:1
Row: 2 -> 2:1
Row: 3 -> 3:1
Row: 4 -> 4:1
Row: 5 -> 1:1 4:1 5:-4 6:1 9:1
Row: 6 -> 2:1 5:1 6:-4 7:1 10:1
Row: 7 -> 7:1
Row: 8 -> 8:1
Row: 9 -> 5:1 8:1 9:-4 10:1 13:1
Row: 10 -> 6:1 9:1 10:-4 11:1 14:1
Row: 11 -> 11:1
Row: 12 -> 12:1
Row: 13 -> 13:1
Row: 14 -> 14:1
Row: 15 -> 15:1
Laplace operator dense matrix:
Row: 0 -> 0:1 1:0 2:0 3:0 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:0 13:0 14:0 15:0
Row: 1 -> 0:0 1:1 2:0 3:0 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:0 13:0 14:0 15:0
Row: 2 -> 0:0 1:0 2:1 3:0 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:0 13:0 14:0 15:0
Row: 3 -> 0:0 1:0 2:0 3:1 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:0 13:0 14:0 15:0
Row: 4 -> 0:0 1:0 2:0 3:0 4:1 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:0 13:0 14:0 15:0
Row: 5 -> 0:0 1:1 2:0 3:0 4:1 5:-4 6:1 7:0 8:0 9:1 10:0 11:0 12:0 13:0 14:0 15:0
Row: 6 -> 0:0 1:0 2:1 3:0 4:0 5:1 6:-4 7:1 8:0 9:0 10:1 11:0 12:0 13:0 14:0 15:0
Row: 7 -> 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:1 8:0 9:0 10:0 11:0 12:0 13:0 14:0 15:0
Row: 8 -> 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:0 8:1 9:0 10:0 11:0 12:0 13:0 14:0 15:0
Row: 9 -> 0:0 1:0 2:0 3:0 4:0 5:1 6:0 7:0 8:1 9:-4 10:1 11:0 12:0 13:1 14:0 15:0
Row: 10 -> 0:0 1:0 2:0 3:0 4:0 5:0 6:1 7:0 8:0 9:1 10:-4 11:1 12:0 13:0 14:1 15:0
Row: 11 -> 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:1 12:0 13:0 14:0 15:0
Row: 12 -> 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:1 13:0 14:0 15:0
Row: 13 -> 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:0 13:1 14:0 15:0
Row: 14 -> 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:0 13:0 14:1 15:0
Row: 15 -> 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:0 13:0 14:0 15:1
Sums in matrix rows = [ 1, 1, 1, 1, 1, 8, 8, 1, 1, 8, 8, 1, 1, 1, 1, 1 ]
Running example on CUDA GPU ...
Laplace operator lambda matrix:
Row: 0 -> 0:1
Row: 1 -> 1:1
Row: 2 -> 2:1
Row: 3 -> 3:1
Row: 4 -> 4:1
Row: 5 -> 1:1 4:1 5:-4 6:1 9:1
Row: 6 -> 2:1 5:1 6:-4 7:1 10:1
Row: 7 -> 7:1
Row: 8 -> 8:1
Row: 9 -> 5:1 8:1 9:-4 10:1 13:1
Row: 10 -> 6:1 9:1 10:-4 11:1 14:1
Row: 11 -> 11:1
Row: 12 -> 12:1
Row: 13 -> 13:1
Row: 14 -> 14:1
Row: 15 -> 15:1
Laplace operator dense matrix:
Row: 0 -> 0:1 1:0 2:0 3:0 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:0 13:0 14:0 15:0
Row: 1 -> 0:0 1:1 2:0 3:0 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:0 13:0 14:0 15:0
Row: 2 -> 0:0 1:0 2:1 3:0 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:0 13:0 14:0 15:0
Row: 3 -> 0:0 1:0 2:0 3:1 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:0 13:0 14:0 15:0
Row: 4 -> 0:0 1:0 2:0 3:0 4:1 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:0 13:0 14:0 15:0
Row: 5 -> 0:0 1:1 2:0 3:0 4:1 5:-4 6:1 7:0 8:0 9:1 10:0 11:0 12:0 13:0 14:0 15:0
Row: 6 -> 0:0 1:0 2:1 3:0 4:0 5:1 6:-4 7:1 8:0 9:0 10:1 11:0 12:0 13:0 14:0 15:0
Row: 7 -> 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:1 8:0 9:0 10:0 11:0 12:0 13:0 14:0 15:0
Row: 8 -> 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:0 8:1 9:0 10:0 11:0 12:0 13:0 14:0 15:0
Row: 9 -> 0:0 1:0 2:0 3:0 4:0 5:1 6:0 7:0 8:1 9:-4 10:1 11:0 12:0 13:1 14:0 15:0
Row: 10 -> 0:0 1:0 2:0 3:0 4:0 5:0 6:1 7:0 8:0 9:1 10:-4 11:1 12:0 13:0 14:1 15:0
Row: 11 -> 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:1 12:0 13:0 14:0 15:0
Row: 12 -> 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:1 13:0 14:0 15:0
Row: 13 -> 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:0 13:1 14:0 15:0
Row: 14 -> 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:0 13:0 14:1 15:0
Row: 15 -> 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:0 13:0 14:0 15:1
Sums in matrix rows = [ 1, 1, 1, 1, 1, 8, 8, 1, 1, 8, 8, 1, 1, 1, 1, 1 ]

◆ forElements()

template<typename MatrixElementsLambda , typename CompressedRowLengthsLambda , typename Real , typename Device , typename Index >
template<typename Function >
void TNL::Matrices::LambdaMatrix< MatrixElementsLambda, CompressedRowLengthsLambda, Real, Device, Index >::forElements ( IndexType begin,
IndexType end,
Function & function ) const

Method for iteration over all matrix rows for constant instances.

Template Parameters
Functionis type of lambda function that will operate on matrix elements. It is should have form like
auto function = [=] __cuda_callable__ ( IndexType rowIdx, IndexType columnIdx, IndexType columnIdx_, const RealType& value
) { ... };

The column index repeats twice only for compatibility with sparse matrices.

Parameters
begindefines beginning of the range [begin,end) of rows to be processed.
enddefines ending of the range [begin,end) of rows to be processed.
functionis an instance of the lambda function to be called in each row.
Example
#include <iostream>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Matrices/LambdaMatrix.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>
template< typename Device >
void
forElementsExample()
{
/***
* Lambda functions defining the matrix.
*/
auto rowLengths = [ = ] __cuda_callable__( const int rows, const int columns, const int rowIdx ) -> int
{
return columns;
};
auto matrixElements =
const int rows, const int columns, const int rowIdx, const int localIdx, int& columnIdx, double& value )
{
columnIdx = localIdx;
value = TNL::max( rowIdx - columnIdx + 1, 0 );
};
auto matrix = MatrixFactory::create( 5, 5, matrixElements, rowLengths );
auto denseView = denseMatrix.getView();
auto f = [ = ] __cuda_callable__( int rowIdx, int localIdx, int columnIdx, double value ) mutable
{
denseView.setElement( rowIdx, columnIdx, value );
};
matrix.forElements( 0, matrix.getRows(), f );
std::cout << "Original lambda matrix:" << std::endl << matrix << std::endl;
std::cout << "Dense matrix:" << std::endl << denseMatrix << std::endl;
}
int
main( int argc, char* argv[] )
{
std::cout << "Copying matrix on host: " << std::endl;
forElementsExample< TNL::Devices::Host >();
#ifdef __CUDACC__
std::cout << "Copying matrix on CUDA device: " << std::endl;
forElementsExample< TNL::Devices::Cuda >();
#endif
}
Output
Copying matrix on host:
Original lambda matrix:
Row: 0 -> 0:1
Row: 1 -> 0:2 1:1
Row: 2 -> 0:3 1:2 2:1
Row: 3 -> 0:4 1:3 2:2 3:1
Row: 4 -> 0:5 1:4 2:3 3:2 4:1
Dense matrix:
Row: 0 -> 0:1 1:0 2:0 3:0 4:0
Row: 1 -> 0:2 1:1 2:0 3:0 4:0
Row: 2 -> 0:3 1:2 2:1 3:0 4:0
Row: 3 -> 0:4 1:3 2:2 3:1 4:0
Row: 4 -> 0:5 1:4 2:3 3:2 4:1
Copying matrix on CUDA device:
Original lambda matrix:
Row: 0 -> 0:1
Row: 1 -> 0:2 1:1
Row: 2 -> 0:3 1:2 2:1
Row: 3 -> 0:4 1:3 2:2 3:1
Row: 4 -> 0:5 1:4 2:3 3:2 4:1
Dense matrix:
Row: 0 -> 0:1 1:0 2:0 3:0 4:0
Row: 1 -> 0:2 1:1 2:0 3:0 4:0
Row: 2 -> 0:3 1:2 2:1 3:0 4:0
Row: 3 -> 0:4 1:3 2:2 3:1 4:0
Row: 4 -> 0:5 1:4 2:3 3:2 4:1

◆ forRows()

template<typename MatrixElementsLambda , typename CompressedRowLengthsLambda , typename Real , typename Device , typename Index >
template<typename Function >
void TNL::Matrices::LambdaMatrix< MatrixElementsLambda, CompressedRowLengthsLambda, Real, Device, Index >::forRows ( IndexType begin,
IndexType end,
Function && function ) const

Method for parallel iteration over matrix rows from interval [ begin, end) for constant instances.

In each row, given lambda function is performed. Each row is processed by at most one thread unlike the method LambdaMatrix::forElements where more than one thread can be mapped to each row.

Template Parameters
Functionis type of the lambda function.
Parameters
begindefines beginning of the range [ begin,end ) of rows to be processed.
enddefines ending of the range [ begin, end ) of rows to be processed.
functionis an instance of the lambda function to be called for each row.
auto function = [] __cuda_callable__ ( RowView& row ) { ... };

RowView represents matrix row - see TNL::Matrices::LambdaMatrix::RowView.

Example
#include <iostream>
#include <TNL/Matrices/LambdaMatrix.h>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>
template< typename Device >
void
forRowsExample()
{
/***
* Set matrix representing approximation of the Laplace operator on regular
* grid using the finite difference method.
*/
const int gridSize( 4 );
const int matrixSize = gridSize * gridSize;
auto rowLengths = [ = ] __cuda_callable__( const int rows, const int columns, const int rowIdx ) -> int
{
const int gridRow = rowIdx / gridSize; // coordinates in the numerical grid
const int gridColumn = rowIdx % gridSize;
if( gridRow == 0 || gridRow == gridSize - 1 || // boundary grid node
gridColumn == 0 || gridColumn == gridSize - 1 )
return 1;
return 5;
};
auto matrixElements =
const int rows, const int columns, const int rowIdx, const int localIdx, int& columnIdx, double& value )
{
const int gridRow = rowIdx / gridSize; // coordinates in the numerical grid
const int gridColumn = rowIdx % gridSize;
if( gridRow == 0 || gridRow == gridSize - 1 || // boundary grid node
gridColumn == 0 || gridColumn == gridSize - 1 )
{
columnIdx = rowIdx; // diagonal element ....
value = 1.0; // ... is set to 1
}
else // interior grid node
{
switch( localIdx ) // set diagonal element to 4
{ // and the others to -1
case 0:
columnIdx = rowIdx - gridSize;
value = 1;
break;
case 1:
columnIdx = rowIdx - 1;
value = 1;
break;
case 2:
columnIdx = rowIdx;
value = -4;
break;
case 3:
columnIdx = rowIdx + 1;
value = 1;
break;
case 4:
columnIdx = rowIdx + gridSize;
value = 1;
break;
}
}
};
auto matrix =
TNL::Matrices::LambdaMatrixFactory< double, Device, int >::create( matrixSize, matrixSize, matrixElements, rowLengths );
using MatrixType = decltype( matrix );
using RowView = typename MatrixType::RowView;
TNL::Matrices::DenseMatrix< double, Device > denseMatrix( matrixSize, matrixSize );
denseMatrix.setValue( 0.0 );
auto dense_view = denseMatrix.getView();
auto f = [ = ] __cuda_callable__( const RowView& row ) mutable
{
auto dense_row = dense_view.getRow( row.getRowIndex() );
for( int localIdx = 0; localIdx < row.getSize(); localIdx++ )
dense_row.setValue( row.getColumnIndex( localIdx ), row.getValue( localIdx ) );
};
matrix.forAllRows( f );
std::cout << "Laplace operator lambda matrix: " << std::endl << matrix << std::endl;
std::cout << "Laplace operator dense matrix: " << std::endl << denseMatrix << std::endl;
/***
* Compute sum of elements in each row and store it into a vector.
*/
auto sum_view = sum_vector.getView();
matrix.forAllRows(
[ = ] __cuda_callable__( const RowView& row ) mutable
{
double sum( 0.0 );
for( auto element : row )
sum += TNL::abs( element.value() );
sum_view[ row.getRowIndex() ] = sum;
} );
std::cout << "Sums in matrix rows = " << sum_vector << std::endl;
}
int
main( int argc, char* argv[] )
{
std::cout << "Running example on CPU ... " << std::endl;
forRowsExample< TNL::Devices::Host >();
#ifdef __CUDACC__
std::cout << "Running example on CUDA GPU ... " << std::endl;
forRowsExample< TNL::Devices::Cuda >();
#endif
}
Output
Running example on CPU ...
Laplace operator lambda matrix:
Row: 0 -> 0:1
Row: 1 -> 1:1
Row: 2 -> 2:1
Row: 3 -> 3:1
Row: 4 -> 4:1
Row: 5 -> 1:1 4:1 5:-4 6:1 9:1
Row: 6 -> 2:1 5:1 6:-4 7:1 10:1
Row: 7 -> 7:1
Row: 8 -> 8:1
Row: 9 -> 5:1 8:1 9:-4 10:1 13:1
Row: 10 -> 6:1 9:1 10:-4 11:1 14:1
Row: 11 -> 11:1
Row: 12 -> 12:1
Row: 13 -> 13:1
Row: 14 -> 14:1
Row: 15 -> 15:1
Laplace operator dense matrix:
Row: 0 -> 0:1 1:0 2:0 3:0 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:0 13:0 14:0 15:0
Row: 1 -> 0:0 1:1 2:0 3:0 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:0 13:0 14:0 15:0
Row: 2 -> 0:0 1:0 2:1 3:0 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:0 13:0 14:0 15:0
Row: 3 -> 0:0 1:0 2:0 3:1 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:0 13:0 14:0 15:0
Row: 4 -> 0:0 1:0 2:0 3:0 4:1 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:0 13:0 14:0 15:0
Row: 5 -> 0:0 1:1 2:0 3:0 4:1 5:-4 6:1 7:0 8:0 9:1 10:0 11:0 12:0 13:0 14:0 15:0
Row: 6 -> 0:0 1:0 2:1 3:0 4:0 5:1 6:-4 7:1 8:0 9:0 10:1 11:0 12:0 13:0 14:0 15:0
Row: 7 -> 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:1 8:0 9:0 10:0 11:0 12:0 13:0 14:0 15:0
Row: 8 -> 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:0 8:1 9:0 10:0 11:0 12:0 13:0 14:0 15:0
Row: 9 -> 0:0 1:0 2:0 3:0 4:0 5:1 6:0 7:0 8:1 9:-4 10:1 11:0 12:0 13:1 14:0 15:0
Row: 10 -> 0:0 1:0 2:0 3:0 4:0 5:0 6:1 7:0 8:0 9:1 10:-4 11:1 12:0 13:0 14:1 15:0
Row: 11 -> 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:1 12:0 13:0 14:0 15:0
Row: 12 -> 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:1 13:0 14:0 15:0
Row: 13 -> 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:0 13:1 14:0 15:0
Row: 14 -> 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:0 13:0 14:1 15:0
Row: 15 -> 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:0 13:0 14:0 15:1
Sums in matrix rows = [ 1, 1, 1, 1, 1, 8, 8, 1, 1, 8, 8, 1, 1, 1, 1, 1 ]
Running example on CUDA GPU ...
Laplace operator lambda matrix:
Row: 0 -> 0:1
Row: 1 -> 1:1
Row: 2 -> 2:1
Row: 3 -> 3:1
Row: 4 -> 4:1
Row: 5 -> 1:1 4:1 5:-4 6:1 9:1
Row: 6 -> 2:1 5:1 6:-4 7:1 10:1
Row: 7 -> 7:1
Row: 8 -> 8:1
Row: 9 -> 5:1 8:1 9:-4 10:1 13:1
Row: 10 -> 6:1 9:1 10:-4 11:1 14:1
Row: 11 -> 11:1
Row: 12 -> 12:1
Row: 13 -> 13:1
Row: 14 -> 14:1
Row: 15 -> 15:1
Laplace operator dense matrix:
Row: 0 -> 0:1 1:0 2:0 3:0 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:0 13:0 14:0 15:0
Row: 1 -> 0:0 1:1 2:0 3:0 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:0 13:0 14:0 15:0
Row: 2 -> 0:0 1:0 2:1 3:0 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:0 13:0 14:0 15:0
Row: 3 -> 0:0 1:0 2:0 3:1 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:0 13:0 14:0 15:0
Row: 4 -> 0:0 1:0 2:0 3:0 4:1 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:0 13:0 14:0 15:0
Row: 5 -> 0:0 1:1 2:0 3:0 4:1 5:-4 6:1 7:0 8:0 9:1 10:0 11:0 12:0 13:0 14:0 15:0
Row: 6 -> 0:0 1:0 2:1 3:0 4:0 5:1 6:-4 7:1 8:0 9:0 10:1 11:0 12:0 13:0 14:0 15:0
Row: 7 -> 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:1 8:0 9:0 10:0 11:0 12:0 13:0 14:0 15:0
Row: 8 -> 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:0 8:1 9:0 10:0 11:0 12:0 13:0 14:0 15:0
Row: 9 -> 0:0 1:0 2:0 3:0 4:0 5:1 6:0 7:0 8:1 9:-4 10:1 11:0 12:0 13:1 14:0 15:0
Row: 10 -> 0:0 1:0 2:0 3:0 4:0 5:0 6:1 7:0 8:0 9:1 10:-4 11:1 12:0 13:0 14:1 15:0
Row: 11 -> 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:1 12:0 13:0 14:0 15:0
Row: 12 -> 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:1 13:0 14:0 15:0
Row: 13 -> 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:0 13:1 14:0 15:0
Row: 14 -> 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:0 13:0 14:1 15:0
Row: 15 -> 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:0 13:0 14:0 15:1
Sums in matrix rows = [ 1, 1, 1, 1, 1, 8, 8, 1, 1, 8, 8, 1, 1, 1, 1, 1 ]

◆ getColumns()

template<typename MatrixElementsLambda , typename CompressedRowLengthsLambda , typename Real , typename Device , typename Index >
__cuda_callable__ Index TNL::Matrices::LambdaMatrix< MatrixElementsLambda, CompressedRowLengthsLambda, Real, Device, Index >::getColumns ( ) const
nodiscard

Returns a number of matrix columns.

Returns
number of matrix columns.

◆ getCompressedRowLengths()

template<typename MatrixElementsLambda , typename CompressedRowLengthsLambda , typename Real = double, typename Device = Devices::Host, typename Index = int>
template<typename RowLengthsVector >
void TNL::Matrices::LambdaMatrix< MatrixElementsLambda, CompressedRowLengthsLambda, Real, Device, Index >::getCompressedRowLengths ( RowLengthsVector & rowLengths) const

Computes number of non-zeros in each row.

Parameters
rowLengthsis a vector into which the number of non-zeros in each row will be stored.
Example
#include <iostream>
#include <TNL/Matrices/LambdaMatrix.h>
int
main( int argc, char* argv[] )
{
/***
* Lambda functions defining the matrix.
*/
auto rowLengths = [ = ] __cuda_callable__( const int rows, const int columns, const int rowIdx ) -> int
{
return columns;
};
auto matrixElements =
const int rows, const int columns, const int rowIdx, const int localIdx, int& columnIdx, double& value )
{
columnIdx = localIdx;
value = TNL::max( rowIdx - columnIdx + 1, 0 );
};
const int size = 5;
auto matrix =
matrix.getCompressedRowLengths( rowLengthsVector );
std::cout << "Matrix looks as:" << std::endl << matrix << std::endl;
std::cout << "Compressed row lengths are: " << rowLengthsVector << std::endl;
}
Output
Matrix looks as:
Row: 0 -> 0:1
Row: 1 -> 0:2 1:1
Row: 2 -> 0:3 1:2 2:1
Row: 3 -> 0:4 1:3 2:2 3:1
Row: 4 -> 0:5 1:4 2:3 3:2 4:1
Compressed row lengths are: [ 1, 2, 3, 4, 5 ]

◆ getCompressedRowLengthsLambda()

template<typename MatrixElementsLambda , typename CompressedRowLengthsLambda , typename Real , typename Device , typename Index >
__cuda_callable__ const CompressedRowLengthsLambda & TNL::Matrices::LambdaMatrix< MatrixElementsLambda, CompressedRowLengthsLambda, Real, Device, Index >::getCompressedRowLengthsLambda ( ) const
nodiscard

Get reference to the lambda function returning number of non-zero elements in each row.

Returns
constant reference to CompressedRowLengthsLambda.

◆ getElement()

template<typename MatrixElementsLambda , typename CompressedRowLengthsLambda , typename Real , typename Device , typename Index >
Real TNL::Matrices::LambdaMatrix< MatrixElementsLambda, CompressedRowLengthsLambda, Real, Device, Index >::getElement ( IndexType row,
IndexType column ) const
nodiscard

Returns value of matrix element at position given by its row and column index.

Parameters
rowis a row index of the matrix element.
columni a column index of the matrix element.
Returns
value of given matrix element.

◆ getMatrixElementsLambda()

template<typename MatrixElementsLambda , typename CompressedRowLengthsLambda , typename Real , typename Device , typename Index >
__cuda_callable__ const MatrixElementsLambda & TNL::Matrices::LambdaMatrix< MatrixElementsLambda, CompressedRowLengthsLambda, Real, Device, Index >::getMatrixElementsLambda ( ) const
nodiscard

Get reference to the lambda function returning the matrix elements values and column indexes.

Returns
constant reference to MatrixElementsLambda.

◆ getNonzeroElementsCount()

template<typename MatrixElementsLambda , typename CompressedRowLengthsLambda , typename Real , typename Device , typename Index >
Index TNL::Matrices::LambdaMatrix< MatrixElementsLambda, CompressedRowLengthsLambda, Real, Device, Index >::getNonzeroElementsCount ( ) const
nodiscard

Returns number of non-zero matrix elements.

Returns
number of all non-zero matrix elements.
Example
#include <iostream>
#include <TNL/Matrices/LambdaMatrix.h>
int
main( int argc, char* argv[] )
{
/***
* Lambda functions defining the matrix.
*/
auto rowLengths = [ = ] __cuda_callable__( const int rows, const int columns, const int rowIdx ) -> int
{
return columns;
};
auto matrixElements =
const int rows, const int columns, const int rowIdx, const int localIdx, int& columnIdx, double& value )
{
columnIdx = localIdx;
value = TNL::max( rowIdx - columnIdx + 1, 0 );
};
const int size = 5;
auto matrix =
std::cout << "Matrix looks as:" << std::endl << matrix << std::endl;
std::cout << "Non-zero elements count is: " << matrix.getNonzeroElementsCount() << std::endl;
}
Output
Matrix looks as:
Row: 0 -> 0:1
Row: 1 -> 0:2 1:1
Row: 2 -> 0:3 1:2 2:1
Row: 3 -> 0:4 1:3 2:2 3:1
Row: 4 -> 0:5 1:4 2:3 3:2 4:1
Non-zero elements count is: 15

◆ getRow()

template<typename MatrixElementsLambda , typename CompressedRowLengthsLambda , typename Real , typename Device , typename Index >
__cuda_callable__ auto TNL::Matrices::LambdaMatrix< MatrixElementsLambda, CompressedRowLengthsLambda, Real, Device, Index >::getRow ( IndexType rowIdx) const
nodiscard

Getter of simple structure for accessing given matrix row.

Parameters
rowIdxis matrix row index.
Returns
RowView for accessing given matrix row.
Example
#include <iostream>
#include <TNL/Algorithms/parallelFor.h>
#include <TNL/Matrices/LambdaMatrix.h>
#include <TNL/Matrices/DenseMatrix.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>
template< typename Device >
void
getRowExample()
{
/***
* Set matrix representing approximation of the Laplace operator on regular
* grid using the finite difference method.
*/
const int gridSize( 4 );
const int matrixSize = gridSize * gridSize;
auto rowLengths = [ = ] __cuda_callable__( const int rows, const int columns, const int rowIdx ) -> int
{
const int gridRow = rowIdx / gridSize; // coordinates in the numerical grid
const int gridColumn = rowIdx % gridSize;
if( gridRow == 0 || gridRow == gridSize - 1 || // boundary grid node
gridColumn == 0 || gridColumn == gridSize - 1 )
return 1;
return 5;
};
auto matrixElements =
const int rows, const int columns, const int rowIdx, const int localIdx, int& columnIdx, double& value )
{
const int gridRow = rowIdx / gridSize; // coordinates in the numerical grid
const int gridColumn = rowIdx % gridSize;
if( gridRow == 0 || gridRow == gridSize - 1 || // boundary grid node
gridColumn == 0 || gridColumn == gridSize - 1 )
{
columnIdx = rowIdx; // diagonal element ....
value = 1.0; // ... is set to 1
}
else // interior grid node
{
switch( localIdx ) // set diagonal element to 4
{ // and the others to -1
case 0:
columnIdx = rowIdx - gridSize;
value = 1;
break;
case 1:
columnIdx = rowIdx - 1;
value = 1;
break;
case 2:
columnIdx = rowIdx;
value = -4;
break;
case 3:
columnIdx = rowIdx + 1;
value = 1;
break;
case 4:
columnIdx = rowIdx + gridSize;
value = 1;
break;
}
}
};
auto matrix =
TNL::Matrices::LambdaMatrixFactory< double, Device, int >::create( matrixSize, matrixSize, matrixElements, rowLengths );
TNL::Matrices::DenseMatrix< double, Device > denseMatrix( matrixSize, matrixSize );
denseMatrix.setValue( 0.0 );
auto dense_view = denseMatrix.getView();
auto f = [ = ] __cuda_callable__( const int& rowIdx ) mutable
{
auto row = matrix.getRow( rowIdx );
auto dense_row = dense_view.getRow( rowIdx );
for( int localIdx = 0; localIdx < row.getSize(); localIdx++ )
dense_row.setValue( row.getColumnIndex( localIdx ), row.getValue( localIdx ) );
};
std::cout << "Laplace operator lambda matrix: " << std::endl << matrix << std::endl;
std::cout << "Laplace operator dense matrix: " << std::endl << denseMatrix << std::endl;
}
int
main( int argc, char* argv[] )
{
std::cout << "Running example on CPU ... " << std::endl;
getRowExample< TNL::Devices::Host >();
#ifdef __CUDACC__
std::cout << "Running example on CUDA GPU ... " << std::endl;
getRowExample< TNL::Devices::Cuda >();
#endif
}
std::enable_if_t< std::is_integral_v< Begin > &&std::is_integral_v< End > > parallelFor(const Begin &begin, const End &end, typename Device::LaunchConfiguration launch_config, Function f, FunctionArgs... args)
Parallel for-loop function for 1D range specified with integral values.
Definition parallelFor.h:41
Output
Running example on CPU ...
Laplace operator lambda matrix:
Row: 0 -> 0:1
Row: 1 -> 1:1
Row: 2 -> 2:1
Row: 3 -> 3:1
Row: 4 -> 4:1
Row: 5 -> 1:1 4:1 5:-4 6:1 9:1
Row: 6 -> 2:1 5:1 6:-4 7:1 10:1
Row: 7 -> 7:1
Row: 8 -> 8:1
Row: 9 -> 5:1 8:1 9:-4 10:1 13:1
Row: 10 -> 6:1 9:1 10:-4 11:1 14:1
Row: 11 -> 11:1
Row: 12 -> 12:1
Row: 13 -> 13:1
Row: 14 -> 14:1
Row: 15 -> 15:1
Laplace operator dense matrix:
Row: 0 -> 0:1 1:0 2:0 3:0 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:0 13:0 14:0 15:0
Row: 1 -> 0:0 1:1 2:0 3:0 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:0 13:0 14:0 15:0
Row: 2 -> 0:0 1:0 2:1 3:0 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:0 13:0 14:0 15:0
Row: 3 -> 0:0 1:0 2:0 3:1 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:0 13:0 14:0 15:0
Row: 4 -> 0:0 1:0 2:0 3:0 4:1 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:0 13:0 14:0 15:0
Row: 5 -> 0:0 1:1 2:0 3:0 4:1 5:-4 6:1 7:0 8:0 9:1 10:0 11:0 12:0 13:0 14:0 15:0
Row: 6 -> 0:0 1:0 2:1 3:0 4:0 5:1 6:-4 7:1 8:0 9:0 10:1 11:0 12:0 13:0 14:0 15:0
Row: 7 -> 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:1 8:0 9:0 10:0 11:0 12:0 13:0 14:0 15:0
Row: 8 -> 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:0 8:1 9:0 10:0 11:0 12:0 13:0 14:0 15:0
Row: 9 -> 0:0 1:0 2:0 3:0 4:0 5:1 6:0 7:0 8:1 9:-4 10:1 11:0 12:0 13:1 14:0 15:0
Row: 10 -> 0:0 1:0 2:0 3:0 4:0 5:0 6:1 7:0 8:0 9:1 10:-4 11:1 12:0 13:0 14:1 15:0
Row: 11 -> 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:1 12:0 13:0 14:0 15:0
Row: 12 -> 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:1 13:0 14:0 15:0
Row: 13 -> 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:0 13:1 14:0 15:0
Row: 14 -> 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:0 13:0 14:1 15:0
Row: 15 -> 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:0 13:0 14:0 15:1
Running example on CUDA GPU ...
Laplace operator lambda matrix:
Row: 0 -> 0:1
Row: 1 -> 1:1
Row: 2 -> 2:1
Row: 3 -> 3:1
Row: 4 -> 4:1
Row: 5 -> 1:1 4:1 5:-4 6:1 9:1
Row: 6 -> 2:1 5:1 6:-4 7:1 10:1
Row: 7 -> 7:1
Row: 8 -> 8:1
Row: 9 -> 5:1 8:1 9:-4 10:1 13:1
Row: 10 -> 6:1 9:1 10:-4 11:1 14:1
Row: 11 -> 11:1
Row: 12 -> 12:1
Row: 13 -> 13:1
Row: 14 -> 14:1
Row: 15 -> 15:1
Laplace operator dense matrix:
Row: 0 -> 0:1 1:0 2:0 3:0 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:0 13:0 14:0 15:0
Row: 1 -> 0:0 1:1 2:0 3:0 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:0 13:0 14:0 15:0
Row: 2 -> 0:0 1:0 2:1 3:0 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:0 13:0 14:0 15:0
Row: 3 -> 0:0 1:0 2:0 3:1 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:0 13:0 14:0 15:0
Row: 4 -> 0:0 1:0 2:0 3:0 4:1 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:0 13:0 14:0 15:0
Row: 5 -> 0:0 1:1 2:0 3:0 4:1 5:-4 6:1 7:0 8:0 9:1 10:0 11:0 12:0 13:0 14:0 15:0
Row: 6 -> 0:0 1:0 2:1 3:0 4:0 5:1 6:-4 7:1 8:0 9:0 10:1 11:0 12:0 13:0 14:0 15:0
Row: 7 -> 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:1 8:0 9:0 10:0 11:0 12:0 13:0 14:0 15:0
Row: 8 -> 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:0 8:1 9:0 10:0 11:0 12:0 13:0 14:0 15:0
Row: 9 -> 0:0 1:0 2:0 3:0 4:0 5:1 6:0 7:0 8:1 9:-4 10:1 11:0 12:0 13:1 14:0 15:0
Row: 10 -> 0:0 1:0 2:0 3:0 4:0 5:0 6:1 7:0 8:0 9:1 10:-4 11:1 12:0 13:0 14:1 15:0
Row: 11 -> 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:1 12:0 13:0 14:0 15:0
Row: 12 -> 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:1 13:0 14:0 15:0
Row: 13 -> 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:0 13:1 14:0 15:0
Row: 14 -> 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:0 13:0 14:1 15:0
Row: 15 -> 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:0 12:0 13:0 14:0 15:1

See LambdaMatrixRowView.

◆ getRowCapacities()

template<typename MatrixElementsLambda , typename CompressedRowLengthsLambda , typename Real , typename Device , typename Index >
template<typename Vector >
void TNL::Matrices::LambdaMatrix< MatrixElementsLambda, CompressedRowLengthsLambda, Real, Device, Index >::getRowCapacities ( Vector & rowCapacities) const

Compute capacities of all rows.

The row capacities are not stored explicitly and must be computed.

Parameters
rowCapacitiesis a vector where the row capacities will be stored.

◆ getRows()

template<typename MatrixElementsLambda , typename CompressedRowLengthsLambda , typename Real , typename Device , typename Index >
__cuda_callable__ Index TNL::Matrices::LambdaMatrix< MatrixElementsLambda, CompressedRowLengthsLambda, Real, Device, Index >::getRows ( ) const
nodiscard

Returns a number of matrix rows.

Returns
number of matrix rows.

◆ print()

template<typename MatrixElementsLambda , typename CompressedRowLengthsLambda , typename Real , typename Device , typename Index >
void TNL::Matrices::LambdaMatrix< MatrixElementsLambda, CompressedRowLengthsLambda, Real, Device, Index >::print ( std::ostream & str) const

Method for printing the matrix to output stream.

Parameters
stris the output stream.

◆ reduceAllRows()

template<typename MatrixElementsLambda , typename CompressedRowLengthsLambda , typename Real , typename Device , typename Index >
template<typename Fetch , typename Reduce , typename Keep , typename FetchReal >
void TNL::Matrices::LambdaMatrix< MatrixElementsLambda, CompressedRowLengthsLambda, Real, Device, Index >::reduceAllRows ( Fetch && fetch,
const Reduce & reduce,
Keep && keep,
const FetchReal & identity ) const

Method for performing general reduction on ALL matrix rows.

Template Parameters
Fetchis a type of lambda function for data fetch declared as
auto fetch = [=] __cuda_callable__ ( IndexType rowIdx, IndexType columnIdx, RealType elementValue ) -> FetchValue

The return type of this lambda can be any non void.

Template Parameters
Reduceis a type of lambda function for reduction declared as
auto reduce = [=] __cuda_callable__ ( const FetchValue& v1, const FetchValue& v2 ) -> FetchValue
Template Parameters
Keepis a type of lambda function for storing results of reduction in each row. It is declared as
auto keep = [=] __cuda_callable__ ( IndexType rowIdx, const RealType& value )
Template Parameters
FetchValueis type returned by the Fetch lambda function.
Parameters
fetchis an instance of lambda function for data fetch.
reduceis an instance of lambda function for reduction.
keepin an instance of lambda function for storing results.
identityis the identity element for the reduction operation, i.e. element which does not change the result of the reduction.
Example
#include <iostream>
#include <iomanip>
#include <functional>
#include <TNL/Matrices/LambdaMatrix.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>
template< typename Device >
void
reduceAllRows()
{
/***
* Lambda functions defining the matrix.
*/
auto rowLengths = [ = ] __cuda_callable__( const int rows, const int columns, const int rowIdx ) -> int
{
return columns;
};
auto matrixElements =
const int rows, const int columns, const int rowIdx, const int localIdx, int& columnIdx, double& value )
{
columnIdx = localIdx;
value = TNL::max( rowIdx - columnIdx + 1, 0 );
};
auto matrix = MatrixFactory::create( 5, 5, matrixElements, rowLengths );
/***
* Find largest element in each row.
*/
TNL::Containers::Vector< double, Device > rowMax( matrix.getRows() );
/***
* Prepare vector view and matrix view for lambdas.
*/
auto rowMaxView = rowMax.getView();
/***
* Fetch lambda just returns absolute value of matrix elements.
*/
auto fetch = [ = ] __cuda_callable__( int rowIdx, int columnIdx, const double& value ) -> double
{
return TNL::abs( value );
};
/***
* Reduce lambda return maximum of given values.
*/
auto reduce = [ = ] __cuda_callable__( double& a, const double& b ) -> double
{
return TNL::max( a, b );
};
/***
* Keep lambda store the largest value in each row to the vector rowMax.
*/
auto keep = [ = ] __cuda_callable__( int rowIdx, const double& value ) mutable
{
rowMaxView[ rowIdx ] = value;
};
/***
* Compute the largest values in each row.
*/
matrix.reduceAllRows( fetch, reduce, keep, std::numeric_limits< double >::lowest() );
std::cout << "The matrix reads as: " << std::endl << matrix << std::endl;
std::cout << "Max. elements in rows are: " << rowMax << std::endl;
}
int
main( int argc, char* argv[] )
{
std::cout << "All rows reduction on host:" << std::endl;
reduceAllRows< TNL::Devices::Host >();
#ifdef __CUDACC__
std::cout << "All rows reduction on CUDA device:" << std::endl;
reduceAllRows< TNL::Devices::Cuda >();
#endif
}
Output
All rows reduction on host:
The matrix reads as:
Row: 0 -> 0:1
Row: 1 -> 0:2 1:1
Row: 2 -> 0:3 1:2 2:1
Row: 3 -> 0:4 1:3 2:2 3:1
Row: 4 -> 0:5 1:4 2:3 3:2 4:1
Max. elements in rows are: [ 1, 2, 3, 4, 5 ]
All rows reduction on CUDA device:
The matrix reads as:
Row: 0 -> 0:1
Row: 1 -> 0:2 1:1
Row: 2 -> 0:3 1:2 2:1
Row: 3 -> 0:4 1:3 2:2 3:1
Row: 4 -> 0:5 1:4 2:3 3:2 4:1
Max. elements in rows are: [ 1, 2, 3, 4, 5 ]

◆ reduceRows()

template<typename MatrixElementsLambda , typename CompressedRowLengthsLambda , typename Real , typename Device , typename Index >
template<typename Fetch , typename Reduce , typename Keep , typename FetchReal >
void TNL::Matrices::LambdaMatrix< MatrixElementsLambda, CompressedRowLengthsLambda, Real, Device, Index >::reduceRows ( IndexType begin,
IndexType end,
Fetch && fetch,
const Reduce & reduce,
Keep && keep,
const FetchReal & identity ) const

Method for performing general reduction on matrix rows.

Template Parameters
Fetchis a type of lambda function for data fetch declared as
auto fetch = [=] __cuda_callable__ ( IndexType rowIdx, IndexType columnIdx, RealType elementValue ) -> FetchValue

The return type of this lambda can be any non void.

Template Parameters
Reduceis a type of lambda function for reduction declared as
auto reduce = [=] __cuda_callbale__ ( const FetchValue& v1, const FetchValue& v2 ) -> FetchValue
Template Parameters
Keepis a type of lambda function for storing results of reduction in each row. It is declared as
auto keep = [=] __cuda_callable__ ( IndexType rowIdx, const RealType& value )
Template Parameters
FetchValueis type returned by the Fetch lambda function.
Parameters
begindefines beginning of the range [begin, end) of rows to be processed.
enddefines ending of the range [begin,end) of rows to be processed.
fetchis an instance of lambda function for data fetch.
reduceis an instance of lambda function for reduction.
keepin an instance of lambda function for storing results.
identityis the identity element for the reduction operation, i.e. element which does not change the result of the reduction.
Example
#include <iostream>
#include <iomanip>
#include <functional>
#include <TNL/Matrices/LambdaMatrix.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>
template< typename Device >
void
reduceRows()
{
/***
* Lambda functions defining the matrix.
*/
auto rowLengths = [ = ] __cuda_callable__( const int rows, const int columns, const int rowIdx ) -> int
{
return columns;
};
auto matrixElements =
const int rows, const int columns, const int rowIdx, const int localIdx, int& columnIdx, double& value )
{
columnIdx = localIdx;
value = TNL::max( rowIdx - columnIdx + 1, 0 );
};
auto matrix = MatrixFactory::create( 5, 5, matrixElements, rowLengths );
/***
* Find largest element in each row.
*/
TNL::Containers::Vector< double, Device > rowMax( matrix.getRows() );
/***
* Prepare vector view for lambdas.
*/
auto rowMaxView = rowMax.getView();
/***
* Fetch lambda just returns absolute value of matrix elements.
*/
auto fetch = [ = ] __cuda_callable__( int rowIdx, int columnIdx, const double& value ) -> double
{
return TNL::abs( value );
};
/***
* Reduce lambda return maximum of given values.
*/
auto reduce = [ = ] __cuda_callable__( const double& a, const double& b ) -> double
{
return TNL::max( a, b );
};
/***
* Keep lambda store the largest value in each row to the vector rowMax.
*/
auto keep = [ = ] __cuda_callable__( int rowIdx, const double& value ) mutable
{
rowMaxView[ rowIdx ] = value;
};
/***
* Compute the largest values in each row.
*/
matrix.reduceRows( 0, matrix.getRows(), fetch, reduce, keep, std::numeric_limits< double >::lowest() );
std::cout << "The matrix reads as: " << std::endl << matrix << std::endl;
std::cout << "Max. elements in rows are: " << rowMax << std::endl;
}
int
main( int argc, char* argv[] )
{
std::cout << "Rows reduction on host:" << std::endl;
reduceRows< TNL::Devices::Host >();
#ifdef __CUDACC__
std::cout << "Rows reduction on CUDA device:" << std::endl;
reduceRows< TNL::Devices::Cuda >();
#endif
}
Output
Rows reduction on host:
The matrix reads as:
Row: 0 -> 0:1
Row: 1 -> 0:2 1:1
Row: 2 -> 0:3 1:2 2:1
Row: 3 -> 0:4 1:3 2:2 3:1
Row: 4 -> 0:5 1:4 2:3 3:2 4:1
Max. elements in rows are: [ 1, 2, 3, 4, 5 ]
Rows reduction on CUDA device:
The matrix reads as:
Row: 0 -> 0:1
Row: 1 -> 0:2 1:1
Row: 2 -> 0:3 1:2 2:1
Row: 3 -> 0:4 1:3 2:2 3:1
Row: 4 -> 0:5 1:4 2:3 3:2 4:1
Max. elements in rows are: [ 1, 2, 3, 4, 5 ]

◆ sequentialForAllRows()

template<typename MatrixElementsLambda , typename CompressedRowLengthsLambda , typename Real , typename Device , typename Index >
template<typename Function >
void TNL::Matrices::LambdaMatrix< MatrixElementsLambda, CompressedRowLengthsLambda, Real, Device, Index >::sequentialForAllRows ( Function && function) const

This method calls sequentialForRows for all matrix rows (for constant instances).

See LambdaMatrix::sequentialForRows.

Template Parameters
Functionis a type of lambda function that will operate on matrix elements.
Parameters
functionis an instance of the lambda function to be called in each row.

◆ sequentialForRows()

template<typename MatrixElementsLambda , typename CompressedRowLengthsLambda , typename Real , typename Device , typename Index >
template<typename Function >
void TNL::Matrices::LambdaMatrix< MatrixElementsLambda, CompressedRowLengthsLambda, Real, Device, Index >::sequentialForRows ( IndexType begin,
IndexType end,
Function && function ) const

Method for sequential iteration over all matrix rows for constant instances.

Template Parameters
Functionis type of lambda function that will operate on matrix elements. It is should have form like
auto function = [] __cuda_callable__ ( RowView& row ) { ... };

RowView represents matrix row - see TNL::Matrices::LambdaMatrix::RowView.

Parameters
begindefines beginning of the range [begin,end) of rows to be processed.
enddefines ending of the range [begin,end) of rows to be processed.
functionis an instance of the lambda function to be called in each row.

◆ setDimensions()

template<typename MatrixElementsLambda , typename CompressedRowLengthsLambda , typename Real , typename Device , typename Index >
void TNL::Matrices::LambdaMatrix< MatrixElementsLambda, CompressedRowLengthsLambda, Real, Device, Index >::setDimensions ( IndexType rows,
IndexType columns )

Set number of rows and columns of this matrix.

Parameters
rowsis the number of matrix rows.
columnsis the number of matrix columns.

◆ vectorProduct()

template<typename MatrixElementsLambda , typename CompressedRowLengthsLambda , typename Real , typename Device , typename Index >
template<typename InVector , typename OutVector >
void TNL::Matrices::LambdaMatrix< MatrixElementsLambda, CompressedRowLengthsLambda, Real, Device, Index >::vectorProduct ( const InVector & inVector,
OutVector & outVector,
const RealType & matrixMultiplicator = 1.0,
const RealType & outVectorMultiplicator = 0.0,
IndexType begin = 0,
IndexType end = 0 ) const

Computes product of matrix and vector.

More precisely, it computes:

outVector = matrixMultiplicator * ( *this ) * inVector + outVectorMultiplicator * outVector
Template Parameters
InVectoris type of input vector. It can be TNL::Containers::Vector, TNL::Containers::VectorView, TNL::Containers::Array, TNL::Containers::ArrayView, or similar container.
OutVectoris type of output vector. It can be TNL::Containers::Vector, TNL::Containers::VectorView, TNL::Containers::Array, TNL::Containers::ArrayView, or similar container.
Parameters
inVectoris input vector.
outVectoris output vector.
matrixMultiplicatoris a factor by which the matrix is multiplied. It is one by default.
outVectorMultiplicatoris a factor by which the outVector is multiplied before added to the result of matrix-vector product. It is zero by default.
beginis the beginning of the rows range for which the vector product is computed. It is zero by default.
endis the end of the rows range for which the vector product is computed. It is number if the matrix rows by default.

The documentation for this class was generated from the following files: