Template Numerical Library version main:eacc201d
TNL::Algorithms::ParallelFor< Device > Struct Template Reference

Parallel for-loop over a one-dimensional interval of indices.

#include <TNL/Algorithms/ParallelFor.h>

Static Public Member Functions

template<typename Index, typename Function, typename... FunctionArgs>
static void exec (Index start, Index end, Function f, FunctionArgs... args)
 Static method for the execution of the loop.
 
template<typename Index, typename Function, typename... FunctionArgs>
static void exec (Index start, Index end, typename Device::LaunchConfiguration launch_config, Function f, FunctionArgs... args)
 Overload with custom launch configuration (which is ignored for TNL::Devices::Sequential).
 

Detailed Description

template<typename Device = Devices::Sequential>
struct TNL::Algorithms::ParallelFor< Device >

Parallel for-loop over a one-dimensional interval of indices.

Template Parameters
Device specifies the device where the for-loop will be executed. It can be TNL::Devices::Host, TNL::Devices::Cuda or TNL::Devices::Sequential.

Member Function Documentation

◆ exec()

template<typename Device = Devices::Sequential>
template<typename Index, typename Function, typename... FunctionArgs>
static void TNL::Algorithms::ParallelFor< Device >::exec ( Index start, Index end, Function f, FunctionArgs... args )
inline static

Static method for the execution of the loop.

Template Parameters
Index is the type of the loop indices.
Function is the type of the functor to be called in each iteration (it is usually deduced from the argument used in the function call).
FunctionArgs is a variadic pack of types for additional parameters that are forwarded to the functor in every iteration.
Parameters
start is the left (inclusive) bound of the iteration range [start, end).
end is the right (exclusive) bound of the iteration range [start, end).
f is the function to be called in each iteration.
args are additional parameters to be passed to the function f.
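The parameter semantics above can be illustrated with a minimal, TNL-free sketch. The names SequentialForSketch and fillAffine are hypothetical stand-ins introduced only for this illustration; the sketch mimics the documented signature of exec and is not TNL's actual implementation. It assumes, as the parameter list suggests, that the functor receives the loop index first, followed by the forwarded args.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for illustration only; NOT TNL's implementation.
// It mirrors the documented signature of ParallelFor< Device >::exec.
struct SequentialForSketch
{
   template< typename Index, typename Function, typename... FunctionArgs >
   static void exec( Index start, Index end, Function f, FunctionArgs... args )
   {
      // Half-open range: index `start` is visited, index `end` is not.
      for( Index i = start; i < end; i++ )
         f( i, args... );  // the same extra arguments are passed in every iteration
   }
};

// Hypothetical helper: returns a vector of size `end` in which the elements
// with indices in [start, end) are set to scale * i + offset; the rest stay 0.
inline std::vector< double > fillAffine( int start, int end, double scale, double offset )
{
   std::vector< double > out( static_cast< std::size_t >( end ), 0.0 );
   double* data = out.data();
   // The functor takes the index plus the two forwarded parameters.
   auto body = [data]( int i, double s, double o ) { data[ i ] = s * i + o; };
   SequentialForSketch::exec( start, end, body, scale, offset );
   return out;
}
```

For instance, fillAffine( 2, 5, 2.0, 1.0 ) leaves elements 0 and 1 untouched and writes only indices 2, 3 and 4, which shows that the right bound is exclusive.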
Example
#include <iostream>
#include <cstdlib>
#include <TNL/Containers/Vector.h>
#include <TNL/Algorithms/ParallelFor.h>

using namespace TNL;
using namespace TNL::Containers;
using namespace TNL::Algorithms;

/****
 * Set all elements of the vector v to the constant c.
 */
template< typename Device >
void initVector( Vector< double, Device >& v,
                 const double& c )
{
   // The view is captured by value so the lambda can also run on the GPU.
   auto view = v.getView();
   auto init = [=] __cuda_callable__ ( int i ) mutable
   {
      view[ i ] = c;
   };
   // Call init for every index in [0, v.getSize()).
   ParallelFor< Device >::exec( 0, v.getSize(), init );
}

int main( int argc, char* argv[] )
{
   /***
    * Firstly, test the vector initialization on CPU.
    */
   Vector< double, Devices::Host > host_v( 10 );
   initVector( host_v, 1.0 );
   std::cout << "host_v = " << host_v << std::endl;

   /***
    * And then also on GPU.
    */
#ifdef HAVE_CUDA
   Vector< double, Devices::Cuda > cuda_v( 10 );
   initVector( cuda_v, 1.0 );
   std::cout << "cuda_v = " << cuda_v << std::endl;
#endif
   return EXIT_SUCCESS;
}
Output
host_v = [ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 ]
cuda_v = [ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 ]

The documentation for this struct was generated from the following file: