DevicePointer is like SharedPointer, except that it wraps an existing host object: it calls neither the constructor nor the destructor of ObjectType.
template<typename Object, typename Device = typename Object::DeviceType>
class TNL::Pointers::DevicePointer< Object, Device >
**NOTE: When using smart pointers to pass objects to the GPU, one must call Pointers::synchronizeSmartPointersOnDevice< Devices::Cuda >() before launching any CUDA kernel that works with smart pointers.**
- Template Parameters

| Parameter | Description |
| --- | --- |
| Object | The type of the object to be owned by the pointer. |
| Device | The device where the object is to be allocated. The object is always allocated on the host system as well, for easier object manipulation. |
See also UniquePointer and SharedPointer.
See also DevicePointer< Object, Devices::Host > and DevicePointer< Object, Devices::Cuda >.
- Example
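#include <iostream>
#include <cstdlib>
#include <TNL/Containers/Array.h>
#include <TNL/Pointers/DevicePointer.h>

using namespace TNL;

using ArrayCuda = Containers::Array< int, Devices::Cuda >;

struct Tuple
{
   Tuple( ArrayCuda& _a1, ArrayCuda& _a2 )
   : a1( _a1 ),
     a2( _a2 )
   {}

   // Device pointers wrapping the existing arrays; no array is constructed or destroyed here.
   Pointers::DevicePointer< ArrayCuda > a1, a2;
};

#ifdef __CUDACC__
// Kernel printing both arrays; the tuple is passed by value.
__global__
void
printTuple( const Tuple t )
{
   printf( "Tuple size is: %d\n", t.a1->getSize() );
   for( int i = 0; i < t.a1->getSize(); i++ ) {
      printf( "a1[ %d ] = %d \n", i, ( *t.a1 )[ i ] );
      printf( "a2[ %d ] = %d \n", i, ( *t.a2 )[ i ] );
   }
}
#endif

int
main( int argc, char* argv[] )
{
#ifdef __CUDACC__
   // Create a tuple of arrays and print them in a CUDA kernel.
   ArrayCuda a1( 3 ), a2( 3 );
   Tuple t( a1, a2 );
   a1 = 1;
   a2 = 2;
   // Required before launching a kernel that works with smart pointers (see the note above).
   Pointers::synchronizeSmartPointersOnDevice< Devices::Cuda >();
   printTuple<<< 1, 1 >>>( t );

   // Resize the arrays, refill them, and print them again.
   a1.setSize( 5 );
   a2.setSize( 5 );
   a1 = 3;
   a2 = 4;
   Pointers::synchronizeSmartPointersOnDevice< Devices::Cuda >();
   printTuple<<< 1, 1 >>>( t );
#endif
   return EXIT_SUCCESS;
}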
- Output
Tuple size is: 3
a1[ 0 ] = 1
a2[ 0 ] = 2
a1[ 1 ] = 1
a2[ 1 ] = 2
a1[ 2 ] = 1
a2[ 2 ] = 2
Tuple size is: 3
a1[ 0 ] = 3
a2[ 0 ] = 4
a1[ 1 ] = 3
a2[ 1 ] = 4
a1[ 2 ] = 3
a2[ 2 ] = 4