Template Numerical Library version main:4904c12
TNL::Algorithms::Segments Namespace Reference

Namespace for the segments data structures. More...

Classes

class  AdaptiveCSR
 Data structure for the Adaptive CSR segments format. More...
class  AdaptiveCSRView
 AdaptiveCSRView provides a non-owning encapsulation of meta-data stored in the AdaptiveCSR segments. More...
class  BiEllpack
 Data structure for Bisection Ellpack segments. More...
class  BiEllpackBase
 BiEllpackBase serves as a base class for TNL::Algorithms::Segments::BiEllpack and TNL::Algorithms::Segments::BiEllpackView. More...
class  BiEllpackSegmentView
 Data structure for accessing a particular segment of BiEllpack segments. More...
class  BiEllpackView
 BiEllpackView provides a non-owning encapsulation of meta-data stored in the TNL::Algorithms::Segments::BiEllpack segments. More...
class  ChunkedEllpack
 Data structure for Chunked Ellpack segments. More...
class  ChunkedEllpackBase
 ChunkedEllpackBase serves as a base class for TNL::Algorithms::Segments::ChunkedEllpack and TNL::Algorithms::Segments::ChunkedEllpackView. More...
class  ChunkedEllpackSegmentView
class  ChunkedEllpackSegmentView< Index, ColumnMajorOrder >
 Data structure for accessing a particular segment of column-major Chunked Ellpack segments. More...
class  ChunkedEllpackSegmentView< Index, RowMajorOrder >
 Data structure for accessing a particular segment of row-major Chunked Ellpack segments. More...
class  ChunkedEllpackView
 ChunkedEllpackView provides a non-owning encapsulation of meta-data stored in the TNL::Algorithms::Segments::ChunkedEllpack segments. More...
class  CSR
 Data structure for CSR segments. More...
class  CSRBase
 CSRBase serves as a base class for TNL::Algorithms::Segments::CSR and TNL::Algorithms::Segments::CSRView. More...
class  CSRView
 CSRView provides a non-owning encapsulation of meta-data stored in the TNL::Algorithms::Segments::CSR segments. More...
struct  DefaultElementsOrganization
class  Ellpack
 Data structure for Ellpack segments. More...
class  EllpackBase
 EllpackBase serves as a base class for TNL::Algorithms::Segments::Ellpack and TNL::Algorithms::Segments::EllpackView. More...
class  EllpackView
 EllpackView provides a non-owning encapsulation of meta-data stored in the TNL::Algorithms::Segments::Ellpack segments. More...
struct  GrowingSegments
struct  GrowingSegmentsView
struct  HasGetSegmentCountMethod
struct  isAdaptiveCSRSegments
struct  isAdaptiveCSRSegments< AdaptiveCSR< Device, Index, IndexAllocator > >
struct  isAdaptiveCSRSegments< AdaptiveCSRView< Device, Index > >
struct  isBiEllpackSegments
struct  isBiEllpackSegments< BiEllpack< Device, Index, IndexAllocator, Organization, WarpSize_ > >
struct  isBiEllpackSegments< BiEllpackView< Device, Index, Organization, WarpSize_ > >
struct  isChunkedEllpackSegments
struct  isChunkedEllpackSegments< ChunkedEllpack< Device, Index, IndexAllocator, Organization > >
struct  isChunkedEllpackSegments< ChunkedEllpackView< Device, Index, Organization > >
struct  isCSRSegments
struct  isCSRSegments< CSR< Device, Index, IndexAllocator > >
struct  isCSRSegments< CSRView< Device, Index > >
struct  isEllpackSegments
struct  isEllpackSegments< Ellpack< Device, Index, IndexAllocator, Organization, Alignment > >
struct  isEllpackSegments< EllpackView< Device, Index, Organization, Alignment > >
struct  isSlicedEllpackSegments
struct  isSlicedEllpackSegments< SlicedEllpack< Device, Index, IndexAllocator, Organization, SliceSize > >
struct  isSlicedEllpackSegments< SlicedEllpackView< Device, Index, Organization, SliceSize > >
struct  isSortedSegments
struct  isSortedSegments< SortedAdaptiveCSR< Device, Index, IndexAllocator > >
struct  isSortedSegments< SortedAdaptiveCSRView< Device, Index > >
struct  isSortedSegments< SortedBiEllpack< Device, Index, IndexAllocator, Organization, WarpSize_ > >
struct  isSortedSegments< SortedBiEllpackView< Device, Index, Organization, WarpSize_ > >
struct  isSortedSegments< SortedChunkedEllpack< Device, Index, IndexAllocator, Organization > >
struct  isSortedSegments< SortedChunkedEllpackView< Device, Index, Organization > >
struct  isSortedSegments< SortedCSRView< Device, Index > >
struct  isSortedSegments< SortedEllpack< Device, Index, IndexAllocator, Organization, Alignment > >
struct  isSortedSegments< SortedEllpackView< Device, Index, Organization, Alignment > >
struct  isSortedSegments< SortedSegments< CSR< Device, Index, IndexAllocator > > >
struct  isSortedSegments< SortedSegments< EmbeddedSegments > >
struct  isSortedSegments< SortedSegmentsView< EmbeddedSegments > >
struct  isSortedSegments< SortedSlicedEllpack< Device, Index, IndexAllocator, Organization, SliceSize > >
struct  isSortedSegments< SortedSlicedEllpackView< Device, Index, Organization, SliceSize > >
struct  LaunchConfiguration
 Launch configuration for segment operations. More...
struct  LaunchConfigurationSetter_Default
 Creates default launch configuration for segments. More...
struct  LaunchConfigurationSetter_HybridCSR
 Launch configuration setter for CSR segments. More...
struct  LaunchConfigurationSetter_LightCSR
 Launch configuration setter for CSR segments. More...
class  SegmentElement
 Simple structure representing one element of a segment. More...
struct  SegmentsPrinter
class  SegmentView
 Data structure for accessing a particular segment. More...
class  SegmentView< Index, ColumnMajorOrder >
 Data structure for accessing a particular segment. More...
class  SegmentView< Index, RowMajorOrder >
class  SegmentViewIterator
 Iterator for iterating over elements of a segment. More...
class  SlicedEllpack
 Data structure for Sliced Ellpack segments. More...
class  SlicedEllpackBase
 SlicedEllpackBase serves as a base class for TNL::Algorithms::Segments::SlicedEllpack and TNL::Algorithms::Segments::SlicedEllpackView. More...
class  SlicedEllpackView
 SlicedEllpackView provides a non-owning encapsulation of meta-data stored in the TNL::Algorithms::Segments::SlicedEllpack segments. More...
class  SortedSegments
 Data structure for sorted segments. More...
class  SortedSegmentsBase
 SortedSegmentsBase serves as a base class for TNL::Algorithms::Segments::SortedSegments and TNL::Algorithms::Segments::SortedSegmentsView. More...
class  SortedSegmentsView
 SortedSegmentsView provides a non-owning encapsulation of meta-data stored in the TNL::Algorithms::Segments::SortedSegments segments. More...
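
The classes above all describe segments, i.e. a partition of a linear range of elements into consecutive groups (for example, the rows of a sparse matrix). As a rough illustration of the idea behind the CSR variant only, the following self-contained sketch stores the segment boundaries in an offsets array. It does not use TNL; `CsrSegmentsSketch` and all of its members are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the idea behind CSR segments (not the TNL class):
// segment i owns the global indexes [ offsets[ i ], offsets[ i + 1 ] ).
struct CsrSegmentsSketch
{
   std::vector< std::size_t > offsets;  // size = number of segments + 1

   // Build the offsets as a prefix sum of the segment sizes.
   explicit CsrSegmentsSketch( const std::vector< std::size_t >& sizes )
   : offsets( sizes.size() + 1, 0 )
   {
      for( std::size_t i = 0; i < sizes.size(); i++ )
         offsets[ i + 1 ] = offsets[ i ] + sizes[ i ];
   }

   std::size_t segmentCount() const { return offsets.size() - 1; }

   std::size_t segmentSize( std::size_t segment ) const
   {
      assert( segment < segmentCount() );
      return offsets[ segment + 1 ] - offsets[ segment ];
   }

   // Map (segment, local index) to the position in the linear storage.
   std::size_t globalIndex( std::size_t segment, std::size_t local ) const
   {
      assert( local < segmentSize( segment ) );
      return offsets[ segment ] + local;
   }
};
```

With segment sizes 1, 3, 0 and 2, the sketch stores the offsets 0, 1, 4, 4, 6, so empty segments cost only one extra offset entry.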

Typedefs

template<typename Device, typename Index, typename IndexAllocator = typename Allocators::Default< Device >::template Allocator< Index >, int WarpSize = Backend::getWarpSize()>
using ColumnMajorBiEllpack = BiEllpack< Device, Index, IndexAllocator, ColumnMajorOrder, WarpSize >
 Alias for column-major BiEllpack segments.
template<typename Device, typename Index, int WarpSize = Backend::getWarpSize()>
using ColumnMajorBiEllpackView = BiEllpackView< Device, Index, ColumnMajorOrder, WarpSize >
 Alias for column-major BiEllpack segments view.
template<typename Device, typename Index, typename IndexAllocator = typename Allocators::Default< Device >::template Allocator< Index >>
using ColumnMajorChunkedEllpack = ChunkedEllpack< Device, Index, IndexAllocator, ColumnMajorOrder >
 Alias for column-major ChunkedEllpack segments.
template<typename Device, typename Index>
using ColumnMajorChunkedEllpackView = ChunkedEllpackView< Device, Index, ColumnMajorOrder >
 Alias for column-major ChunkedEllpack segments view.
template<typename Device, typename Index, typename IndexAllocator = typename Allocators::Default< Device >::template Allocator< Index >, int Alignment = 32>
using ColumnMajorEllpack = Ellpack< Device, Index, IndexAllocator, ColumnMajorOrder, Alignment >
 Alias for column-major Ellpack segments.
template<typename Device, typename Index, int Alignment = 32>
using ColumnMajorEllpackView = EllpackView< Device, Index, ColumnMajorOrder, Alignment >
 Alias for column-major Ellpack segments view.
template<typename Device, typename Index, typename IndexAllocator = typename Allocators::Default< Device >::template Allocator< Index >, int SliceSize = 32>
using ColumnMajorSlicedEllpack = SlicedEllpack< Device, Index, IndexAllocator, ColumnMajorOrder, SliceSize >
 Alias for column-major SlicedEllpack segments.
template<typename Device, typename Index, int SliceSize = 32>
using ColumnMajorSlicedEllpackView = SlicedEllpackView< Device, Index, ColumnMajorOrder, SliceSize >
 Alias for column-major SlicedEllpack segments view.
template<typename Device, typename Index, typename IndexAllocator = typename Allocators::Default< Device >::template Allocator< Index >, int WarpSize = Backend::getWarpSize()>
using RowMajorBiEllpack = BiEllpack< Device, Index, IndexAllocator, RowMajorOrder, WarpSize >
 Alias for row-major BiEllpack segments.
template<typename Device, typename Index, int WarpSize = Backend::getWarpSize()>
using RowMajorBiEllpackView = BiEllpackView< Device, Index, RowMajorOrder, WarpSize >
 Alias for row-major BiEllpack segments view.
template<typename Device, typename Index, typename IndexAllocator = typename Allocators::Default< Device >::template Allocator< Index >>
using RowMajorChunkedEllpack = ChunkedEllpack< Device, Index, IndexAllocator, RowMajorOrder >
 Alias for row-major ChunkedEllpack segments.
template<typename Device, typename Index>
using RowMajorChunkedEllpackView = ChunkedEllpackView< Device, Index, RowMajorOrder >
 Alias for row-major ChunkedEllpack segments view.
template<typename Device, typename Index, typename IndexAllocator = typename Allocators::Default< Device >::template Allocator< Index >, int Alignment = 32>
using RowMajorEllpack = Ellpack< Device, Index, IndexAllocator, RowMajorOrder, Alignment >
 Alias for row-major Ellpack segments.
template<typename Device, typename Index, int Alignment = 32>
using RowMajorEllpackView = EllpackView< Device, Index, RowMajorOrder, Alignment >
 Alias for row-major Ellpack segments view.
template<typename Device, typename Index, typename IndexAllocator = typename Allocators::Default< Device >::template Allocator< Index >, int SliceSize = 32>
using RowMajorSlicedEllpack = SlicedEllpack< Device, Index, IndexAllocator, RowMajorOrder, SliceSize >
 Alias for row-major SlicedEllpack segments.
template<typename Device, typename Index, int SliceSize = 32>
using RowMajorSlicedEllpackView = SlicedEllpackView< Device, Index, RowMajorOrder, SliceSize >
 Alias for row-major SlicedEllpack segments view.
template<typename Device, typename Index, typename IndexAllocator = typename Allocators::Default< Device >::template Allocator< Index >>
using SortedAdaptiveCSR = SortedSegments< AdaptiveCSR< Device, Index, IndexAllocator > >
 Alias for sorted segments based on AdaptiveCSR segments.
template<typename Device, typename Index>
using SortedAdaptiveCSRView = SortedSegmentsView< AdaptiveCSRView< Device, Index > >
 Alias for sorted segments based on AdaptiveCSRView segments.
template<typename Device, typename Index, typename IndexAllocator = typename Allocators::Default< Device >::template Allocator< Index >, ElementsOrganization Organization = TNL::Algorithms::Segments::DefaultElementsOrganization< Device >::getOrganization(), int WarpSize = Backend::getWarpSize()>
using SortedBiEllpack = SortedSegments< BiEllpack< Device, Index, IndexAllocator, Organization, WarpSize > >
 Alias for sorted segments based on BiEllpack segments.
template<typename Device, typename Index, ElementsOrganization Organization = Segments::DefaultElementsOrganization< Device >::getOrganization(), int WarpSize = Backend::getWarpSize()>
using SortedBiEllpackView = SortedSegmentsView< BiEllpackView< Device, Index, Organization, WarpSize > >
 Alias for sorted segments based on BiEllpackView segments.
template<typename Device, typename Index, typename IndexAllocator = typename Allocators::Default< Device >::template Allocator< Index >, ElementsOrganization Organization = TNL::Algorithms::Segments::DefaultElementsOrganization< Device >::getOrganization()>
using SortedChunkedEllpack = SortedSegments< ChunkedEllpack< Device, Index, IndexAllocator, Organization > >
 Alias for sorted segments based on ChunkedEllpack segments.
template<typename Device, typename Index, ElementsOrganization Organization = Segments::DefaultElementsOrganization< Device >::getOrganization()>
using SortedChunkedEllpackView = SortedSegmentsView< ChunkedEllpackView< Device, Index, Organization > >
 Alias for sorted segments based on ChunkedEllpackView segments.
template<typename Device, typename Index, typename IndexAllocator = typename Allocators::Default< Device >::template Allocator< Index >, int WarpSize = Backend::getWarpSize()>
using SortedColumnMajorBiEllpack = SortedSegments< ColumnMajorBiEllpack< Device, Index, IndexAllocator, WarpSize > >
 Alias for sorted segments based on column-major BiEllpack segments.
template<typename Device, typename Index, int WarpSize = Backend::getWarpSize()>
using SortedColumnMajorBiEllpackView = SortedSegmentsView< ColumnMajorBiEllpackView< Device, Index, WarpSize > >
 Alias for sorted segments based on column-major BiEllpackView segments.
template<typename Device, typename Index, typename IndexAllocator = typename Allocators::Default< Device >::template Allocator< Index >>
using SortedColumnMajorChunkedEllpack = SortedSegments< ColumnMajorChunkedEllpack< Device, Index, IndexAllocator > >
 Alias for sorted segments based on column-major ChunkedEllpack segments.
template<typename Device, typename Index>
using SortedColumnMajorChunkedEllpackView = SortedSegmentsView< ColumnMajorChunkedEllpackView< Device, Index > >
 Alias for sorted segments based on column-major ChunkedEllpackView segments.
template<typename Device, typename Index, typename IndexAllocator = typename Allocators::Default< Device >::template Allocator< Index >, int Alignment = 32>
using SortedColumnMajorEllpack = SortedSegments< ColumnMajorEllpack< Device, Index, IndexAllocator, Alignment > >
 Alias for sorted segments based on column-major Ellpack segments.
template<typename Device, typename Index, int Alignment = 32>
using SortedColumnMajorEllpackView = SortedSegmentsView< ColumnMajorEllpackView< Device, Index, Alignment > >
 Alias for sorted segments based on column-major EllpackView segments.
template<typename Device, typename Index, typename IndexAllocator = typename Allocators::Default< Device >::template Allocator< Index >, int SliceSize = 32>
using SortedColumnMajorSlicedEllpack = SortedSegments< ColumnMajorSlicedEllpack< Device, Index, IndexAllocator, SliceSize > >
 Alias for sorted column-major SlicedEllpack segments.
template<typename Device, typename Index, int SliceSize = 32>
using SortedColumnMajorSlicedEllpackView = SortedSegmentsView< ColumnMajorSlicedEllpackView< Device, Index, SliceSize > >
 Alias for sorted segments based on column-major SlicedEllpackView segments.
template<typename Device, typename Index, typename IndexAllocator = typename Allocators::Default< Device >::template Allocator< Index >>
using SortedCSR = SortedSegments< CSR< Device, Index, IndexAllocator > >
 Alias for sorted segments based on CSR segments.
template<typename Device, typename Index>
using SortedCSRView = SortedSegmentsView< CSRView< Device, Index > >
 Alias for sorted segments based on CSRView segments.
template<typename Device, typename Index, typename IndexAllocator = typename Allocators::Default< Device >::template Allocator< Index >, ElementsOrganization Organization = TNL::Algorithms::Segments::DefaultElementsOrganization< Device >::getOrganization(), int Alignment = 32>
using SortedEllpack = SortedSegments< Ellpack< Device, Index, IndexAllocator, Organization, Alignment > >
 Alias for sorted segments based on Ellpack segments.
template<typename Device, typename Index, ElementsOrganization Organization = Segments::DefaultElementsOrganization< Device >::getOrganization(), int Alignment = 32>
using SortedEllpackView = SortedSegmentsView< EllpackView< Device, Index, Organization, Alignment > >
 Alias for sorted segments based on EllpackView segments.
template<typename Device, typename Index, typename IndexAllocator = typename Allocators::Default< Device >::template Allocator< Index >, int WarpSize = Backend::getWarpSize()>
using SortedRowMajorBiEllpack = SortedSegments< RowMajorBiEllpack< Device, Index, IndexAllocator, WarpSize > >
 Alias for sorted segments based on row-major BiEllpack segments.
template<typename Device, typename Index, int WarpSize = Backend::getWarpSize()>
using SortedRowMajorBiEllpackView = SortedSegmentsView< RowMajorBiEllpackView< Device, Index, WarpSize > >
 Alias for sorted segments based on row-major BiEllpackView segments.
template<typename Device, typename Index, typename IndexAllocator = typename Allocators::Default< Device >::template Allocator< Index >>
using SortedRowMajorChunkedEllpack = SortedSegments< RowMajorChunkedEllpack< Device, Index, IndexAllocator > >
 Alias for sorted segments based on row-major ChunkedEllpack segments.
template<typename Device, typename Index>
using SortedRowMajorChunkedEllpackView = SortedSegmentsView< RowMajorChunkedEllpackView< Device, Index > >
 Alias for sorted segments based on row-major ChunkedEllpackView segments.
template<typename Device, typename Index, typename IndexAllocator = typename Allocators::Default< Device >::template Allocator< Index >, int Alignment = 32>
using SortedRowMajorEllpack = SortedSegments< RowMajorEllpack< Device, Index, IndexAllocator, Alignment > >
 Alias for sorted segments based on row-major Ellpack segments.
template<typename Device, typename Index, int Alignment = 32>
using SortedRowMajorEllpackView = SortedSegmentsView< RowMajorEllpackView< Device, Index, Alignment > >
 Alias for sorted segments based on row-major EllpackView segments.
template<typename Device, typename Index, typename IndexAllocator = typename Allocators::Default< Device >::template Allocator< Index >, int SliceSize = 32>
using SortedRowMajorSlicedEllpack = SortedSegments< RowMajorSlicedEllpack< Device, Index, IndexAllocator, SliceSize > >
 Alias for sorted row-major SlicedEllpack segments.
template<typename Device, typename Index, int SliceSize = 32>
using SortedRowMajorSlicedEllpackView = SortedSegmentsView< RowMajorSlicedEllpackView< Device, Index, SliceSize > >
 Alias for sorted segments based on row-major SlicedEllpackView segments.
template<typename Device, typename Index, typename IndexAllocator = typename Allocators::Default< Device >::template Allocator< Index >, ElementsOrganization Organization = TNL::Algorithms::Segments::DefaultElementsOrganization< Device >::getOrganization(), int SliceSize = 32>
using SortedSlicedEllpack = SortedSegments< SlicedEllpack< Device, Index, IndexAllocator, Organization, SliceSize > >
 Alias for sorted SlicedEllpack segments.
template<typename Device, typename Index, ElementsOrganization Organization = Segments::DefaultElementsOrganization< Device >::getOrganization(), int SliceSize = 32>
using SortedSlicedEllpackView = SortedSegmentsView< SlicedEllpackView< Device, Index, Organization, SliceSize > >
 Alias for sorted segments based on SlicedEllpackView segments.
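
The aliases above each pin one template argument (the element organization, or the embedded segments type) of a more general class template and forward the rest. A minimal sketch of that pattern follows. The names `EllpackSketch`, `Organization`, `RowMajorEllpackSketch` and `ColumnMajorEllpackSketch` are hypothetical and the parameter list is deliberately shorter than in the real declarations.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical names for illustration; the real templates take more parameters.
enum class Organization : std::uint8_t { ColumnMajor, RowMajor };

template< typename Index, Organization Org, int Alignment = 32 >
struct EllpackSketch
{
   using IndexType = Index;
   static constexpr Organization organization = Org;
   static constexpr int alignment = Alignment;
};

// Alias templates that fix the organization and forward the remaining parameters.
template< typename Index, int Alignment = 32 >
using RowMajorEllpackSketch = EllpackSketch< Index, Organization::RowMajor, Alignment >;

template< typename Index, int Alignment = 32 >
using ColumnMajorEllpackSketch = EllpackSketch< Index, Organization::ColumnMajor, Alignment >;

// An alias names the same type as the template it refers to; it is not a new type.
static_assert( std::is_same_v< RowMajorEllpackSketch< int >,
                               EllpackSketch< int, Organization::RowMajor, 32 > >,
               "alias and underlying template denote the same type" );
static_assert( ColumnMajorEllpackSketch< long, 16 >::alignment == 16,
               "non-default arguments are forwarded" );
```

Because an alias is not a distinct type, a trait specialized for the underlying template also matches every alias of it.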

Enumerations

enum  ElementsOrganization : std::uint8_t { ColumnMajorOrder = 0 , RowMajorOrder }
enum class  ThreadsToSegmentsMapping : std::uint8_t {
  Fixed , Warp , Block , BlockMerged ,
  DynamicGrouping
}
 Enumeration for mapping threads to segments. More...
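
`ElementsOrganization` selects between row-major and column-major storage of the segment elements. As a sketch of what that means for an Ellpack-like layout in which every segment is padded to the same size, the helper below computes the storage position of an element. The formula is the textbook one and is an assumption about the layout, not code taken from the library; `globalIndex` and its parameters are hypothetical.

```cpp
#include <cstdint>

enum ElementsOrganization : std::uint8_t { ColumnMajorOrder = 0, RowMajorOrder };

// Position of element `local` of segment `segment` in a dense, padded storage
// holding `segmentCount` segments of `segmentSize` slots each (hypothetical helper).
constexpr long globalIndex( ElementsOrganization organization,
                            long segment,
                            long local,
                            long segmentCount,
                            long segmentSize )
{
   return organization == RowMajorOrder
             ? segment * segmentSize + local     // elements of one segment are contiguous
             : local * segmentCount + segment;   // equally positioned elements of all segments are contiguous
}

static_assert( globalIndex( RowMajorOrder, 2, 1, 4, 3 ) == 7, "2 * 3 + 1" );
static_assert( globalIndex( ColumnMajorOrder, 2, 1, 4, 3 ) == 6, "1 * 4 + 2" );
```

In general, the column-major form tends to suit GPUs, where neighboring threads that process neighboring segments then touch neighboring memory locations, while the row-major form tends to suit CPU caches.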

Functions

template<typename Segments, typename Fetch, typename Reduce, typename Write>
void exclusiveScanAllSegments (const Segments &segments, Fetch &&fetch, Reduce &&reduce, Write &&write, LaunchConfiguration launchConfig=LaunchConfiguration())
 Compute exclusive prefix-sum (scan) within all segments.
template<typename Segments, typename Condition, typename Fetch, typename Reduce, typename Write>
void exclusiveScanAllSegmentsIf (const Segments &segments, Condition &&condition, Fetch &&fetch, Reduce &&reduce, Write &&write, LaunchConfiguration launchConfig=LaunchConfiguration())
 Compute exclusive conditional prefix-sum (scan) within all segments.
template<typename SegmentView, typename Fetch, typename Reduce, typename Write>
__cuda_callable__ void exclusiveScanSegment (SegmentView &segment, Fetch &&fetch, Reduce &&reduce, Write &&write)
 Computes an exclusive scan (or prefix sum) within a segment.
template<typename Segments, typename Array, typename Fetch, typename Reduce, typename Write>
void exclusiveScanSegments (const Segments &segments, const Array &segmentIndexes, Fetch &&fetch, Reduce &&reduce, Write &&write, LaunchConfiguration launchConfig=LaunchConfiguration())
 Compute exclusive prefix-sum (scan) within all segments specified by a segment index array.
template<typename Segments, typename IndexBegin, typename IndexEnd, typename Fetch, typename Reduce, typename Write, typename T = std::enable_if_t< std::is_integral_v< IndexBegin > && std::is_integral_v< IndexEnd > >>
void exclusiveScanSegments (const Segments &segments, IndexBegin begin, IndexEnd end, Fetch &&fetch, Reduce &&reduce, Write &&write, LaunchConfiguration launchConfig=LaunchConfiguration())
 Compute exclusive prefix-sum (scan) within specified segments in a range.
template<typename Segments, typename IndexBegin, typename IndexEnd, typename Condition, typename Fetch, typename Reduce, typename Write, typename T = typename std::enable_if_t< std::is_integral_v< IndexBegin > && std::is_integral_v< IndexEnd > >>
void exclusiveScanSegmentsIf (const Segments &segments, IndexBegin begin, IndexEnd end, Condition &&condition, Fetch &&fetch, Reduce &&reduce, Write &&write, LaunchConfiguration launchConfig=LaunchConfiguration())
 Compute exclusive conditional prefix-sum (scan) within specified segments in a range.
template<typename Segments, typename Function>
void forAllElements (const Segments &segments, Function &&function, LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Iterates in parallel over all elements of all segments and applies the specified lambda function.
template<typename Segments, typename Condition, typename Function>
void forAllElementsIf (const Segments &segments, Condition condition, Function function, LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Iterates in parallel over all elements of all segments based on a condition.
template<typename Segments, typename Function>
void forAllSegments (const Segments &segments, Function &&function, LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Iterates in parallel over all segments and applies the given lambda function to each segment.
template<typename Segments, typename SegmentCondition, typename Function>
void forAllSegmentsIf (const Segments &segments, SegmentCondition &&segmentCondition, Function &&function, LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Iterates in parallel over all segments, applying a condition to determine whether each segment should be processed.
template<typename Segments, typename Array, typename Function>
void forElements (const Segments &segments, const Array &segmentIndexes, Function function, LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Iterates in parallel over all elements of segments with the given indexes and applies the specified lambda function.
template<typename Segments, typename IndexBegin, typename IndexEnd, typename Function>
void forElements (const Segments &segments, IndexBegin begin, IndexEnd end, Function &&function, LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Iterates in parallel over all elements in the given range of segments and applies the specified lambda function.
template<typename Segments, typename IndexBegin, typename IndexEnd, typename Condition, typename Function>
void forElementsIf (const Segments &segments, IndexBegin begin, IndexEnd end, Condition condition, Function function, LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Iterates in parallel over all elements in a given range of segments based on a condition.
template<typename Segments, typename Array, typename Function, typename T = std::enable_if_t< IsArrayType< Array >::value >>
void forSegments (const Segments &segments, const Array &segmentIndexes, Function &&function, LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Iterates in parallel over segments with the given indexes and applies the specified lambda function to each segment.
template<typename Segments, typename IndexBegin, typename IndexEnd, typename Function, typename T = std::enable_if_t< std::is_integral_v< IndexBegin > && std::is_integral_v< IndexEnd > >>
void forSegments (const Segments &segments, IndexBegin begin, IndexEnd end, Function &&function, LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Iterates in parallel over segments within the specified range of segment indexes and applies the given lambda function to each segment.
template<typename Segments, typename IndexBegin, typename IndexEnd, typename SegmentCondition, typename Function, typename T = std::enable_if_t< std::is_integral_v< IndexBegin > && std::is_integral_v< IndexEnd > >>
void forSegmentsIf (const Segments &segments, IndexBegin begin, IndexEnd end, SegmentCondition &&segmentCondition, Function &&function, LaunchConfiguration launchConfig=Algorithms::Segments::LaunchConfiguration())
 Iterates in parallel over segments within the given range of segment indexes, applying a condition to determine whether each segment should be processed.
template<typename Segments, typename Fetch, typename Reduce, typename Write>
void inclusiveScanAllSegments (const Segments &segments, Fetch &&fetch, Reduce &&reduce, Write &&write, LaunchConfiguration launchConfig=LaunchConfiguration())
 Compute inclusive prefix-sum (scan) within all segments.
template<typename Segments, typename Condition, typename Fetch, typename Reduce, typename Write>
void inclusiveScanAllSegmentsIf (const Segments &segments, Condition &&condition, Fetch &&fetch, Reduce &&reduce, Write &&write, LaunchConfiguration launchConfig=LaunchConfiguration())
 Compute inclusive conditional prefix-sum (scan) within all segments.
template<typename SegmentView, typename Fetch, typename Reduce, typename Write>
__cuda_callable__ void inclusiveScanSegment (SegmentView &segment, Fetch &&fetch, Reduce &&reduce, Write &&write)
 Computes an inclusive scan (or prefix sum) within a segment.
template<typename Segments, typename Array, typename Fetch, typename Reduce, typename Write>
void inclusiveScanSegments (const Segments &segments, const Array &segmentIndexes, Fetch &&fetch, Reduce &&reduce, Write &&write, LaunchConfiguration launchConfig=LaunchConfiguration())
 Compute inclusive prefix-sum (scan) within segments specified by a segment index array.
template<typename Segments, typename IndexBegin, typename IndexEnd, typename Fetch, typename Reduce, typename Write, typename T = std::enable_if_t< std::is_integral_v< IndexBegin > && std::is_integral_v< IndexEnd > >>
void inclusiveScanSegments (const Segments &segments, IndexBegin begin, IndexEnd end, Fetch &&fetch, Reduce &&reduce, Write &&write, LaunchConfiguration launchConfig=LaunchConfiguration())
 Compute inclusive prefix-sum (scan) within specified segments in a range.
template<typename Segments, typename IndexBegin, typename IndexEnd, typename Condition, typename Fetch, typename Reduce, typename Write, typename T = typename std::enable_if_t< std::is_integral_v< IndexBegin > && std::is_integral_v< IndexEnd > >>
void inclusiveScanSegmentsIf (const Segments &segments, IndexBegin begin, IndexEnd end, Condition &&condition, Fetch &&fetch, Reduce &&reduce, Write &&write, LaunchConfiguration launchConfig=LaunchConfiguration())
 Compute inclusive conditional prefix-sum (scan) within specified segments in a range.
template<typename Segments, typename T = std::enable_if_t< isSegments_v< Segments > >>
std::ostream & operator<< (std::ostream &str, const Segments &segments)
 Insertion operator of segments to output stream.
template<typename Segments, typename Fetch, std::enable_if_t< isSegments_v< Segments >, bool > = true>
SegmentsPrinter< typename Segments::ConstViewType, Fetch > print (const Segments &segments, Fetch fetch)
 Print segments sizes, i.e. the segments setup.
template<typename Segments>
std::ostream & printSegments (std::ostream &str, const Segments &segments)
 Print segments sizes, i.e. the segments setup.
template<typename Segments, typename Fetch>
std::ostream & printSegments (std::ostream &str, const Segments &segments, Fetch &&fetch)
template<typename SegmentView, typename Fetch, typename Compare, typename Swap>
__cuda_callable__ void segmentInsertionSort (SegmentView segment, Fetch &&fetch, Compare &&compare, Swap &&swap)
 Sorts a segment using insertion sort.
template<typename Segments, typename Function>
void sequentialForAllSegments (const Segments &segments, Function &&function)
 Iterates sequentially over all segments and calls the given lambda function for each segment.
template<typename Segments, typename IndexBegin, typename IndexEnd, typename Function>
void sequentialForSegments (const Segments &segments, IndexBegin begin, IndexEnd end, Function &&function)
 Iterates sequentially over segments in the given range of segment indexes and calls the given lambda function for each segment.
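
The scan functions listed above take `fetch`, `reduce` and `write` callables. The following sequential sketch shows what an exclusive and an inclusive scan within each segment compute, using CSR-like offsets to delimit the segments. It is not the library implementation: the helper name `scanSegmentsSketch`, the callable signatures used here (`fetch( globalIdx )`, `write( globalIdx, value )`) and the explicit `identity` argument are simplifying assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sequential reference for a per-segment scan (hypothetical helper, not the TNL API).
// Segment i covers the global indexes [ offsets[ i ], offsets[ i + 1 ] );
// `identity` is the neutral element of `reduce`.
template< typename Fetch, typename Reduce, typename Write, typename Value >
void scanSegmentsSketch( const std::vector< std::size_t >& offsets,
                         Fetch&& fetch,
                         Reduce&& reduce,
                         Write&& write,
                         Value identity,
                         bool inclusive )
{
   assert( ! offsets.empty() );
   for( std::size_t seg = 0; seg + 1 < offsets.size(); seg++ ) {
      Value running = identity;  // the scan restarts in every segment
      for( std::size_t i = offsets[ seg ]; i < offsets[ seg + 1 ]; i++ ) {
         const Value current = fetch( i );
         if( inclusive ) {
            running = reduce( running, current );
            write( i, running );  // result includes the current element
         }
         else {
            write( i, running );  // result excludes the current element
            running = reduce( running, current );
         }
      }
   }
}
```

For the data 1, 2, 3 | 4, 5 split into two segments and addition as the reduction, the exclusive scan writes 0, 1, 3 | 0, 4 and the inclusive scan writes 1, 3, 6 | 4, 9.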

Variables

template<typename Segments>
constexpr bool isAdaptiveCSRSegments_v = isAdaptiveCSRSegments< Segments >::value
 Returns true if the given type is AdaptiveCSR segments.
template<typename Segments>
constexpr bool isChunkedEllpackSegments_v = isChunkedEllpackSegments< Segments >::value
 Returns true if the given type is ChunkedEllpack segments.
template<typename Segments>
constexpr bool isCSRSegments_v = isCSRSegments< Segments >::value
 Returns true if the given type is CSR segments.
template<typename Segments>
constexpr bool isEllpackSegments_v = isEllpackSegments< Segments >::value
 Returns true if the given type is Ellpack segments.
template<typename Segments>
constexpr bool isSegments_v = HasGetSegmentCountMethod< Segments >::value
template<typename Segments>
constexpr bool isSlicedEllpackSegments_v = isSlicedEllpackSegments< Segments >::value
 Returns true if the given type is SlicedEllpack segments.
template<typename Segments>
constexpr bool isSortedSegments_v = isSortedSegments< Segments >::value
 Returns true if the given type is sorted segments.
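
The variables above follow the standard-library convention of pairing a trait struct with a `_v` variable template. A minimal sketch of that pattern is given below, together with a detection trait similar in spirit to `HasGetSegmentCountMethod`. All types here (`CsrSketch`, `CsrViewSketch`, `NotSegments`, `isCsrSketch`, `hasGetSegmentCount`) are hypothetical stand-ins, and the use of `std::void_t` (C++17) for detection is an assumption, not necessarily how the library implements it.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical stand-ins for a segments type, its view, and an unrelated type.
template< typename Index >
struct CsrSketch { Index getSegmentCount() const { return 0; } };
template< typename Index >
struct CsrViewSketch { Index getSegmentCount() const { return 0; } };
struct NotSegments {};

// Primary template: false for every type...
template< typename T >
struct isCsrSketch : std::false_type {};
// ...plus one partial specialization per matching class template.
template< typename Index >
struct isCsrSketch< CsrSketch< Index > > : std::true_type {};
template< typename Index >
struct isCsrSketch< CsrViewSketch< Index > > : std::true_type {};

template< typename T >
constexpr bool isCsrSketch_v = isCsrSketch< T >::value;

// Detection idiom: true when `t.getSegmentCount()` is a valid expression.
template< typename T, typename = void >
struct hasGetSegmentCount : std::false_type {};
template< typename T >
struct hasGetSegmentCount< T, std::void_t< decltype( std::declval< const T& >().getSegmentCount() ) > >
: std::true_type {};

template< typename T >
constexpr bool isSegmentsSketch_v = hasGetSegmentCount< T >::value;

static_assert( isCsrSketch_v< CsrViewSketch< int > >, "the view is recognized too" );
static_assert( ! isCsrSketch_v< NotSegments >, "unrelated types are rejected" );
static_assert( isSegmentsSketch_v< CsrSketch< long > >, "has getSegmentCount()" );
static_assert( ! isSegmentsSketch_v< int >, "int has no getSegmentCount()" );
```

Such traits are typically used in `std::enable_if_t` constraints or `if constexpr` branches to select a format-specific code path at compile time.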

Detailed Description

Namespace for the segments data structures.

Typedef Documentation

◆ ColumnMajorBiEllpack

template<typename Device, typename Index, typename IndexAllocator = typename Allocators::Default< Device >::template Allocator< Index >, int WarpSize = Backend::getWarpSize()>
using TNL::Algorithms::Segments::ColumnMajorBiEllpack = BiEllpack< Device, Index, IndexAllocator, ColumnMajorOrder, WarpSize >

Alias for column-major BiEllpack segments.

See TNL::Algorithms::Segments::BiEllpack for more details.

Template Parameters
Device  The type of device on which the segments will operate.
Index  The type used for indexing elements managed by the segments.
IndexAllocator  The allocator used for managing index containers.
Alignment  The alignment of the number of segments (to optimize data alignment, particularly on GPUs).

◆ ColumnMajorBiEllpackView

template<typename Device, typename Index, int WarpSize = Backend::getWarpSize()>
using TNL::Algorithms::Segments::ColumnMajorBiEllpackView = BiEllpackView< Device, Index, ColumnMajorOrder, WarpSize >

Alias for column-major BiEllpack segments view.

See TNL::Algorithms::Segments::BiEllpackView for more details.

Template Parameters
Device  The type of device on which the segments will operate.
Index  The type used for indexing elements managed by the segments.
IndexAllocator  The allocator used for managing index containers.
Alignment  The alignment of the number of segments (to optimize data alignment, particularly on GPUs).

◆ ColumnMajorChunkedEllpack

template<typename Device, typename Index, typename IndexAllocator = typename Allocators::Default< Device >::template Allocator< Index >>
using TNL::Algorithms::Segments::ColumnMajorChunkedEllpack = ChunkedEllpack< Device, Index, IndexAllocator, ColumnMajorOrder >

Alias for column-major ChunkedEllpack segments.

See TNL::Algorithms::Segments::ChunkedEllpack for more details.

Template Parameters
DeviceThe type of device on which the segments will operate.
IndexThe type used for indexing elements managed by the segments.
IndexAllocatorThe allocator used for managing index containers.

◆ ColumnMajorChunkedEllpackView

template<typename Device, typename Index>
using TNL::Algorithms::Segments::ColumnMajorChunkedEllpackView = ChunkedEllpackView< Device, Index, ColumnMajorOrder >

Alias for column-major ChunkedEllpack segments view.

See TNL::Algorithms::Segments::ChunkedEllpackView for more details.

Template Parameters
DeviceThe type of device on which the segments will operate.
IndexThe type used for indexing elements managed by the segments.

◆ ColumnMajorEllpack

template<typename Device, typename Index, typename IndexAllocator = typename Allocators::Default< Device >::template Allocator< Index >, int Alignment = 32>
using TNL::Algorithms::Segments::ColumnMajorEllpack = Ellpack< Device, Index, IndexAllocator, ColumnMajorOrder, Alignment >

Alias for column-major Ellpack segments.

See TNL::Algorithms::Segments::Ellpack for more details.

Template Parameters
DeviceThe type of device on which the segments will operate.
IndexThe type used for indexing elements managed by the segments.
IndexAllocatorThe allocator used for managing index containers.
AlignmentThe alignment of the number of segments (to optimize data alignment, particularly on GPUs).
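To make the Alignment parameter more concrete, the sketch below rounds a segment count up to a multiple of the alignment and derives the resulting storage size for a fixed-width layout. The helper names roundUpToMultiple and paddedEllpackStorage are hypothetical, and whether TNL pads the segment count in exactly this way for every elements organization is an assumption rather than something stated on this page.

```cpp
#include <cassert>

// Round `count` up to the nearest multiple of `alignment` (alignment > 0).
constexpr int roundUpToMultiple( int count, int alignment )
{
   return ( ( count + alignment - 1 ) / alignment ) * alignment;
}

// Number of slots in a fixed-width layout where every segment has
// `segmentSize` slots and the number of segments is padded to `alignment`.
constexpr int paddedEllpackStorage( int segmentCount, int segmentSize, int alignment )
{
   return roundUpToMultiple( segmentCount, alignment ) * segmentSize;
}

static_assert( roundUpToMultiple( 5, 32 ) == 32 );
static_assert( paddedEllpackStorage( 5, 3, 32 ) == 96 );
```

With the default alignment of 32, five segments of width three would occupy 96 slots under this assumption, which trades some memory for aligned accesses.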

◆ ColumnMajorEllpackView

template<typename Device, typename Index, int Alignment = 32>
using TNL::Algorithms::Segments::ColumnMajorEllpackView = EllpackView< Device, Index, ColumnMajorOrder, Alignment >

Alias for column-major Ellpack segments view.

See TNL::Algorithms::Segments::EllpackView for more details.

Template Parameters
DeviceThe type of device on which the segments will operate.
IndexThe type used for indexing elements managed by the segments.
AlignmentThe alignment of the number of segments (to optimize data alignment, particularly on GPUs).

◆ ColumnMajorSlicedEllpack

template<typename Device, typename Index, typename IndexAllocator = typename Allocators::Default< Device >::template Allocator< Index >, int SliceSize = 32>
using TNL::Algorithms::Segments::ColumnMajorSlicedEllpack = SlicedEllpack< Device, Index, IndexAllocator, ColumnMajorOrder, SliceSize >

Alias for column-major SlicedEllpack segments.

See TNL::Algorithms::Segments::SlicedEllpack for more details.

Template Parameters
DeviceThe type of device on which the segments will operate.
IndexThe type used for indexing elements managed by the segments.
IndexAllocatorThe allocator used for managing index containers.
SliceSizeThe number of segments grouped into one slice.

◆ ColumnMajorSlicedEllpackView

template<typename Device, typename Index, int SliceSize = 32>
using TNL::Algorithms::Segments::ColumnMajorSlicedEllpackView = SlicedEllpackView< Device, Index, ColumnMajorOrder, SliceSize >

Alias for column-major SlicedEllpack segments view.

See TNL::Algorithms::Segments::SlicedEllpackView for more details.

Template Parameters
DeviceThe type of device on which the segments will operate.
IndexThe type used for indexing elements managed by the segments.
SliceSizeThe number of segments grouped into one slice.
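The role of SliceSize can be illustrated with a small storage computation. The sketch below assumes the usual sliced Ellpack idea: segments are grouped into slices of SliceSize consecutive segments, and each slice is padded to the size of its longest segment. The function slicedStorageSize is hypothetical, and padding the last, incomplete slice to a full slice is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Number of slots needed when segments are grouped into slices of
// `sliceSize` segments (sliceSize > 0) and every slice is padded to the
// size of its longest segment. The last slice is treated as a full slice.
inline int slicedStorageSize( const std::vector< int >& segmentSizes, int sliceSize )
{
   const std::size_t step = static_cast< std::size_t >( sliceSize );
   int total = 0;
   for( std::size_t begin = 0; begin < segmentSizes.size(); begin += step ) {
      const std::size_t end = std::min( begin + step, segmentSizes.size() );
      const int longest = *std::max_element( segmentSizes.begin() + begin, segmentSizes.begin() + end );
      total += longest * sliceSize;
   }
   return total;
}
```

For segment sizes 1, 2, 3, 4, 5 a slice size of 1 needs 15 slots (no padding), a slice size of 2 needs 22, and a slice size of 32 needs 160, so smaller slices waste less memory at the cost of less regular access.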

◆ RowMajorBiEllpack

template<typename Device, typename Index, typename IndexAllocator = typename Allocators::Default< Device >::template Allocator< Index >, int WarpSize = Backend::getWarpSize()>
using TNL::Algorithms::Segments::RowMajorBiEllpack = BiEllpack< Device, Index, IndexAllocator, RowMajorOrder, WarpSize >

Alias for row-major BiEllpack segments.

See TNL::Algorithms::Segments::BiEllpack for more details.

Template Parameters
DeviceThe type of device on which the segments will operate.
IndexThe type used for indexing elements managed by the segments.
IndexAllocatorThe allocator used for managing index containers.
WarpSizeThe warp size assumed by the segments (defaults to Backend::getWarpSize()).

◆ RowMajorBiEllpackView

template<typename Device, typename Index, int WarpSize = Backend::getWarpSize()>
using TNL::Algorithms::Segments::RowMajorBiEllpackView = BiEllpackView< Device, Index, RowMajorOrder, WarpSize >

Alias for row-major BiEllpack segments view.

See TNL::Algorithms::Segments::BiEllpackView for more details.

Template Parameters
DeviceThe type of device on which the segments will operate.
IndexThe type used for indexing elements managed by the segments.
WarpSizeThe warp size assumed by the segments (defaults to Backend::getWarpSize()).

◆ RowMajorChunkedEllpack

template<typename Device, typename Index, typename IndexAllocator = typename Allocators::Default< Device >::template Allocator< Index >>
using TNL::Algorithms::Segments::RowMajorChunkedEllpack = ChunkedEllpack< Device, Index, IndexAllocator, RowMajorOrder >

Alias for row-major ChunkedEllpack segments.

See TNL::Algorithms::Segments::ChunkedEllpack for more details.

Template Parameters
DeviceThe type of device on which the segments will operate.
IndexThe type used for indexing elements managed by the segments.
IndexAllocatorThe allocator used for managing index containers.

◆ RowMajorChunkedEllpackView

template<typename Device, typename Index>
using TNL::Algorithms::Segments::RowMajorChunkedEllpackView = ChunkedEllpackView< Device, Index, RowMajorOrder >

Alias for row-major ChunkedEllpack segments view.

See TNL::Algorithms::Segments::ChunkedEllpackView for more details.

Template Parameters
DeviceThe type of device on which the segments will operate.
IndexThe type used for indexing elements managed by the segments.

◆ RowMajorEllpack

template<typename Device, typename Index, typename IndexAllocator = typename Allocators::Default< Device >::template Allocator< Index >, int Alignment = 32>
using TNL::Algorithms::Segments::RowMajorEllpack = Ellpack< Device, Index, IndexAllocator, RowMajorOrder, Alignment >

Alias for row-major Ellpack segments.

See TNL::Algorithms::Segments::Ellpack for more details.

Template Parameters
DeviceThe type of device on which the segments will operate.
IndexThe type used for indexing elements managed by the segments.
IndexAllocatorThe allocator used for managing index containers.
AlignmentThe alignment of the number of segments (to optimize data alignment, particularly on GPUs).

◆ RowMajorEllpackView

template<typename Device, typename Index, int Alignment = 32>
using TNL::Algorithms::Segments::RowMajorEllpackView = EllpackView< Device, Index, RowMajorOrder, Alignment >

Alias for row-major Ellpack segments view.

See TNL::Algorithms::Segments::EllpackView for more details.

Template Parameters
DeviceThe type of device on which the segments will operate.
IndexThe type used for indexing elements managed by the segments.
AlignmentThe alignment of the number of segments (to optimize data alignment, particularly on GPUs).

◆ RowMajorSlicedEllpack

template<typename Device, typename Index, typename IndexAllocator = typename Allocators::Default< Device >::template Allocator< Index >, int SliceSize = 32>
using TNL::Algorithms::Segments::RowMajorSlicedEllpack = SlicedEllpack< Device, Index, IndexAllocator, RowMajorOrder, SliceSize >

Alias for row-major SlicedEllpack segments.

See TNL::Algorithms::Segments::SlicedEllpack for more details.

Template Parameters
DeviceThe type of device on which the segments will operate.
IndexThe type used for indexing elements managed by the segments.
IndexAllocatorThe allocator used for managing index containers.
SliceSizeThe number of segments grouped into one slice.

◆ RowMajorSlicedEllpackView

template<typename Device, typename Index, int SliceSize = 32>
using TNL::Algorithms::Segments::RowMajorSlicedEllpackView = SlicedEllpackView< Device, Index, RowMajorOrder, SliceSize >

Alias for row-major SlicedEllpack segments view.

See TNL::Algorithms::Segments::SlicedEllpackView for more details.

Template Parameters
DeviceThe type of device on which the segments will operate.
IndexThe type used for indexing elements managed by the segments.
SliceSizeThe number of segments grouped into one slice.

◆ SortedAdaptiveCSR

template<typename Device, typename Index, typename IndexAllocator = typename Allocators::Default< Device >::template Allocator< Index >>
using TNL::Algorithms::Segments::SortedAdaptiveCSR = SortedSegments< AdaptiveCSR< Device, Index, IndexAllocator > >

Alias for sorted segments based on AdaptiveCSR segments.

Template Parameters
DeviceThe type of device on which the segments will operate.
IndexThe type used for indexing elements managed by the segments.
IndexAllocatorThe allocator used for managing index containers.

◆ SortedAdaptiveCSRView

template<typename Device, typename Index>
using TNL::Algorithms::Segments::SortedAdaptiveCSRView = SortedSegmentsView< AdaptiveCSRView< Device, Index > >

Alias for sorted segments based on AdaptiveCSRView segments.

Template Parameters
DeviceThe type of device on which the segments will operate.
IndexThe type used for indexing elements managed by the segments.

◆ SortedBiEllpack

template<typename Device, typename Index, typename IndexAllocator = typename Allocators::Default< Device >::template Allocator< Index >, ElementsOrganization Organization = TNL::Algorithms::Segments::DefaultElementsOrganization< Device >::getOrganization(), int WarpSize = Backend::getWarpSize()>
using TNL::Algorithms::Segments::SortedBiEllpack = SortedSegments< BiEllpack< Device, Index, IndexAllocator, Organization, WarpSize > >

Alias for sorted segments based on BiEllpack segments.

Template Parameters
DeviceThe type of device on which the segments will operate.
IndexThe type used for indexing elements managed by the segments.
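IndexAllocatorThe allocator used for managing index containers.
OrganizationThe organization of elements within the segments (RowMajorOrder or ColumnMajorOrder).
WarpSizeThe warp size assumed by the segments (defaults to Backend::getWarpSize()).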

◆ SortedBiEllpackView

template<typename Device, typename Index, ElementsOrganization Organization = Segments::DefaultElementsOrganization< Device >::getOrganization(), int WarpSize = Backend::getWarpSize()>
using TNL::Algorithms::Segments::SortedBiEllpackView = SortedSegmentsView< BiEllpackView< Device, Index, Organization, WarpSize > >

Alias for sorted segments based on BiEllpackView segments.

Template Parameters
DeviceThe type of device on which the segments will operate.
IndexThe type used for indexing elements managed by the segments.
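OrganizationThe organization of elements within the segments (RowMajorOrder or ColumnMajorOrder).
WarpSizeThe warp size assumed by the segments (defaults to Backend::getWarpSize()).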

◆ SortedChunkedEllpack

template<typename Device, typename Index, typename IndexAllocator = typename Allocators::Default< Device >::template Allocator< Index >, ElementsOrganization Organization = TNL::Algorithms::Segments::DefaultElementsOrganization< Device >::getOrganization()>
using TNL::Algorithms::Segments::SortedChunkedEllpack = SortedSegments< ChunkedEllpack< Device, Index, IndexAllocator, Organization > >

Alias for sorted segments based on ChunkedEllpack segments.

Template Parameters
DeviceThe type of device on which the segments will operate.
IndexThe type used for indexing elements managed by the segments.
IndexAllocatorThe allocator used for managing index containers.
OrganizationThe organization of elements within the segments (RowMajorOrder or ColumnMajorOrder).

◆ SortedChunkedEllpackView

template<typename Device, typename Index, ElementsOrganization Organization = Segments::DefaultElementsOrganization< Device >::getOrganization()>
using TNL::Algorithms::Segments::SortedChunkedEllpackView = SortedSegmentsView< ChunkedEllpackView< Device, Index, Organization > >

Alias for sorted segments based on ChunkedEllpackView segments.

Template Parameters
DeviceThe type of device on which the segments will operate.
IndexThe type used for indexing elements managed by the segments.
OrganizationThe organization of elements within the segments (RowMajorOrder or ColumnMajorOrder).

◆ SortedColumnMajorBiEllpack

template<typename Device, typename Index, typename IndexAllocator = typename Allocators::Default< Device >::template Allocator< Index >, int WarpSize = Backend::getWarpSize()>
using TNL::Algorithms::Segments::SortedColumnMajorBiEllpack = SortedSegments< ColumnMajorBiEllpack< Device, Index, IndexAllocator, WarpSize > >

Alias for sorted segments based on column-major BiEllpack segments.

Template Parameters
DeviceThe type of device on which the segments will operate.
IndexThe type used for indexing elements managed by the segments.
IndexAllocatorThe allocator used for managing index containers.
WarpSizeThe warp size assumed by the segments (defaults to Backend::getWarpSize()).

◆ SortedColumnMajorBiEllpackView

template<typename Device, typename Index, int WarpSize = Backend::getWarpSize()>
using TNL::Algorithms::Segments::SortedColumnMajorBiEllpackView = SortedSegmentsView< ColumnMajorBiEllpackView< Device, Index, WarpSize > >

Alias for sorted segments based on column-major BiEllpackView segments.

Template Parameters
DeviceThe type of device on which the segments will operate.
IndexThe type used for indexing elements managed by the segments.
WarpSizeThe warp size assumed by the segments (defaults to Backend::getWarpSize()).

◆ SortedColumnMajorChunkedEllpack

template<typename Device, typename Index, typename IndexAllocator = typename Allocators::Default< Device >::template Allocator< Index >>
using TNL::Algorithms::Segments::SortedColumnMajorChunkedEllpack = SortedSegments< ColumnMajorChunkedEllpack< Device, Index, IndexAllocator > >

Alias for sorted segments based on column-major ChunkedEllpack segments.

Template Parameters
DeviceThe type of device on which the segments will operate.
IndexThe type used for indexing elements managed by the segments.
IndexAllocatorThe allocator used for managing index containers.

◆ SortedColumnMajorChunkedEllpackView

template<typename Device, typename Index>
using TNL::Algorithms::Segments::SortedColumnMajorChunkedEllpackView = SortedSegmentsView< ColumnMajorChunkedEllpackView< Device, Index > >

Alias for sorted segments based on column-major ChunkedEllpackView segments.

Template Parameters
DeviceThe type of device on which the segments will operate.
IndexThe type used for indexing elements managed by the segments.

◆ SortedColumnMajorEllpack

template<typename Device, typename Index, typename IndexAllocator = typename Allocators::Default< Device >::template Allocator< Index >, int Alignment = 32>
using TNL::Algorithms::Segments::SortedColumnMajorEllpack = SortedSegments< ColumnMajorEllpack< Device, Index, IndexAllocator, Alignment > >

Alias for sorted segments based on column-major Ellpack segments.

Template Parameters
DeviceThe type of device on which the segments will operate.
IndexThe type used for indexing elements managed by the segments.
IndexAllocatorThe allocator used for managing index containers.
AlignmentThe alignment of the number of segments (to optimize data alignment, particularly on GPUs).

◆ SortedColumnMajorEllpackView

template<typename Device, typename Index, int Alignment = 32>
using TNL::Algorithms::Segments::SortedColumnMajorEllpackView = SortedSegmentsView< ColumnMajorEllpackView< Device, Index, Alignment > >

Alias for sorted segments based on column-major EllpackView segments.

Template Parameters
DeviceThe type of device on which the segments will operate.
IndexThe type used for indexing elements managed by the segments.
AlignmentThe alignment of the number of segments (to optimize data alignment, particularly on GPUs).

◆ SortedColumnMajorSlicedEllpack

template<typename Device, typename Index, typename IndexAllocator = typename Allocators::Default< Device >::template Allocator< Index >, int SliceSize = 32>
using TNL::Algorithms::Segments::SortedColumnMajorSlicedEllpack = SortedSegments< ColumnMajorSlicedEllpack< Device, Index, IndexAllocator, SliceSize > >

Alias for sorted column-major SlicedEllpack segments.

See TNL::Algorithms::Segments::SlicedEllpack for more details.

Template Parameters
DeviceThe type of device on which the segments will operate.
IndexThe type used for indexing elements managed by the segments.
IndexAllocatorThe allocator used for managing index containers.
SliceSizeThe number of segments grouped into one slice.

◆ SortedColumnMajorSlicedEllpackView

template<typename Device, typename Index, int SliceSize = 32>
using TNL::Algorithms::Segments::SortedColumnMajorSlicedEllpackView = SortedSegmentsView< ColumnMajorSlicedEllpackView< Device, Index, SliceSize > >

Alias for sorted segments based on column-major SlicedEllpackView segments.

Template Parameters
DeviceThe type of device on which the segments will operate.
IndexThe type used for indexing elements managed by the segments.
SliceSizeThe number of segments grouped into one slice.

◆ SortedCSR

template<typename Device, typename Index, typename IndexAllocator = typename Allocators::Default< Device >::template Allocator< Index >>
using TNL::Algorithms::Segments::SortedCSR = SortedSegments< CSR< Device, Index, IndexAllocator > >

Alias for sorted segments based on CSR segments.

Template Parameters
DeviceThe type of device on which the segments will operate.
IndexThe type used for indexing elements managed by the segments.
IndexAllocatorThe allocator used for managing index containers.

◆ SortedCSRView

template<typename Device, typename Index>
using TNL::Algorithms::Segments::SortedCSRView = SortedSegmentsView< CSRView< Device, Index > >

Alias for sorted segments based on CSRView segments.

Template Parameters
DeviceThe type of device on which the segments will operate.
IndexThe type used for indexing elements managed by the segments.

◆ SortedEllpack

template<typename Device, typename Index, typename IndexAllocator = typename Allocators::Default< Device >::template Allocator< Index >, ElementsOrganization Organization = TNL::Algorithms::Segments::DefaultElementsOrganization< Device >::getOrganization(), int Alignment = 32>
using TNL::Algorithms::Segments::SortedEllpack = SortedSegments< Ellpack< Device, Index, IndexAllocator, Organization, Alignment > >

Alias for sorted segments based on Ellpack segments.

Template Parameters
DeviceThe type of device on which the segments will operate.
IndexThe type used for indexing elements managed by the segments.
IndexAllocatorThe allocator used for managing index containers.
OrganizationThe organization of elements within the segments (RowMajorOrder or ColumnMajorOrder).
AlignmentThe alignment of the number of segments (to optimize data alignment, particularly on GPUs).

◆ SortedEllpackView

template<typename Device, typename Index, ElementsOrganization Organization = Segments::DefaultElementsOrganization< Device >::getOrganization(), int Alignment = 32>
using TNL::Algorithms::Segments::SortedEllpackView = SortedSegmentsView< EllpackView< Device, Index, Organization, Alignment > >

Alias for sorted segments based on EllpackView segments.

Template Parameters
DeviceThe type of device on which the segments will operate.
IndexThe type used for indexing elements managed by the segments.
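OrganizationThe organization of elements within the segments (RowMajorOrder or ColumnMajorOrder).
AlignmentThe alignment of the number of segments (to optimize data alignment, particularly on GPUs).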

◆ SortedRowMajorBiEllpack

template<typename Device, typename Index, typename IndexAllocator = typename Allocators::Default< Device >::template Allocator< Index >, int WarpSize = Backend::getWarpSize()>
using TNL::Algorithms::Segments::SortedRowMajorBiEllpack = SortedSegments< RowMajorBiEllpack< Device, Index, IndexAllocator, WarpSize > >

Alias for sorted segments based on row-major BiEllpack segments.

Template Parameters
DeviceThe type of device on which the segments will operate.
IndexThe type used for indexing elements managed by the segments.
IndexAllocatorThe allocator used for managing index containers.
WarpSizeThe warp size assumed by the segments (defaults to Backend::getWarpSize()).

◆ SortedRowMajorBiEllpackView

template<typename Device, typename Index, int WarpSize = Backend::getWarpSize()>
using TNL::Algorithms::Segments::SortedRowMajorBiEllpackView = SortedSegmentsView< RowMajorBiEllpackView< Device, Index, WarpSize > >

Alias for sorted segments based on row-major BiEllpackView segments.

Template Parameters
DeviceThe type of device on which the segments will operate.
IndexThe type used for indexing elements managed by the segments.
WarpSizeThe warp size assumed by the segments (defaults to Backend::getWarpSize()).

◆ SortedRowMajorChunkedEllpack

template<typename Device, typename Index, typename IndexAllocator = typename Allocators::Default< Device >::template Allocator< Index >>
using TNL::Algorithms::Segments::SortedRowMajorChunkedEllpack = SortedSegments< RowMajorChunkedEllpack< Device, Index, IndexAllocator > >

Alias for sorted segments based on row-major ChunkedEllpack segments.

Template Parameters
DeviceThe type of device on which the segments will operate.
IndexThe type used for indexing elements managed by the segments.
IndexAllocatorThe allocator used for managing index containers.

◆ SortedRowMajorChunkedEllpackView

template<typename Device, typename Index>
using TNL::Algorithms::Segments::SortedRowMajorChunkedEllpackView = SortedSegmentsView< RowMajorChunkedEllpackView< Device, Index > >

Alias for sorted segments based on row-major ChunkedEllpackView segments.

Template Parameters
DeviceThe type of device on which the segments will operate.
IndexThe type used for indexing elements managed by the segments.

◆ SortedRowMajorEllpack

template<typename Device, typename Index, typename IndexAllocator = typename Allocators::Default< Device >::template Allocator< Index >, int Alignment = 32>
using TNL::Algorithms::Segments::SortedRowMajorEllpack = SortedSegments< RowMajorEllpack< Device, Index, IndexAllocator, Alignment > >

Alias for sorted segments based on row-major Ellpack segments.

Template Parameters
DeviceThe type of device on which the segments will operate.
IndexThe type used for indexing elements managed by the segments.
IndexAllocatorThe allocator used for managing index containers.
AlignmentThe alignment of the number of segments (to optimize data alignment, particularly on GPUs).

◆ SortedRowMajorEllpackView

template<typename Device, typename Index, int Alignment = 32>
using TNL::Algorithms::Segments::SortedRowMajorEllpackView = SortedSegmentsView< RowMajorEllpackView< Device, Index, Alignment > >

Alias for sorted segments based on row-major EllpackView segments.

Template Parameters
DeviceThe type of device on which the segments will operate.
IndexThe type used for indexing elements managed by the segments.
AlignmentThe alignment of the number of segments (to optimize data alignment, particularly on GPUs).

◆ SortedRowMajorSlicedEllpack

template<typename Device, typename Index, typename IndexAllocator = typename Allocators::Default< Device >::template Allocator< Index >, int SliceSize = 32>
using TNL::Algorithms::Segments::SortedRowMajorSlicedEllpack = SortedSegments< RowMajorSlicedEllpack< Device, Index, IndexAllocator, SliceSize > >

Alias for sorted row-major SlicedEllpack segments.

See TNL::Algorithms::Segments::SlicedEllpack for more details.

Template Parameters
DeviceThe type of device on which the segments will operate.
IndexThe type used for indexing elements managed by the segments.
IndexAllocatorThe allocator used for managing index containers.
SliceSizeThe number of segments grouped into one slice.

◆ SortedRowMajorSlicedEllpackView

template<typename Device, typename Index, int SliceSize = 32>
using TNL::Algorithms::Segments::SortedRowMajorSlicedEllpackView = SortedSegmentsView< RowMajorSlicedEllpackView< Device, Index, SliceSize > >

Alias for sorted segments based on row-major SlicedEllpackView segments.

Template Parameters
DeviceThe type of device on which the segments will operate.
IndexThe type used for indexing elements managed by the segments.
SliceSizeThe number of segments grouped into one slice.

◆ SortedSlicedEllpack

template<typename Device, typename Index, typename IndexAllocator = typename Allocators::Default< Device >::template Allocator< Index >, ElementsOrganization Organization = TNL::Algorithms::Segments::DefaultElementsOrganization< Device >::getOrganization(), int SliceSize = 32>
using TNL::Algorithms::Segments::SortedSlicedEllpack = SortedSegments< SlicedEllpack< Device, Index, IndexAllocator, Organization, SliceSize > >

Alias for sorted SlicedEllpack segments.

See TNL::Algorithms::Segments::SlicedEllpack for more details.

Template Parameters
DeviceThe type of device on which the segments will operate.
IndexThe type used for indexing elements managed by the segments.
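IndexAllocatorThe allocator used for managing index containers.
OrganizationThe organization of elements within the segments (RowMajorOrder or ColumnMajorOrder).
SliceSizeThe number of segments grouped into one slice.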

◆ SortedSlicedEllpackView

template<typename Device, typename Index, ElementsOrganization Organization = Segments::DefaultElementsOrganization< Device >::getOrganization(), int SliceSize = 32>
using TNL::Algorithms::Segments::SortedSlicedEllpackView = SortedSegmentsView< SlicedEllpackView< Device, Index, Organization, SliceSize > >

Alias for sorted segments based on SlicedEllpackView segments.

Template Parameters
DeviceThe type of device on which the segments will operate.
IndexThe type used for indexing elements managed by the segments.
OrganizationThe organization of elements within the segments (RowMajorOrder or ColumnMajorOrder).
SliceSizeThe number of segments grouped into one slice.

Enumeration Type Documentation

◆ ElementsOrganization

Enumerator
ColumnMajorOrder 

Column-major order.

RowMajorOrder 

Row-major order.
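The difference between the two orders can be shown with the index arithmetic of a fixed-width layout, in which every segment has the same number of slots. The functions rowMajorIndex and columnMajorIndex are hypothetical helpers written for this illustration; they ignore padding and are not taken from TNL.

```cpp
#include <cassert>

// Row-major: the slots of one segment are stored contiguously.
constexpr int rowMajorIndex( int segmentIdx, int localIdx, int segmentSize )
{
   return segmentIdx * segmentSize + localIdx;
}

// Column-major: slots with the same local index in consecutive segments
// are stored next to each other.
constexpr int columnMajorIndex( int segmentIdx, int localIdx, int segmentCount )
{
   return localIdx * segmentCount + segmentIdx;
}

// Element 2 of segment 1, with 3 segments of 4 slots each.
static_assert( rowMajorIndex( 1, 2, 4 ) == 6 );
static_assert( columnMajorIndex( 1, 2, 3 ) == 7 );
```

Row-major order tends to suit sequential processing of one segment by one CPU thread, while column-major order lets neighbouring GPU threads, each handling its own segment, read neighbouring memory locations.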

◆ ThreadsToSegmentsMapping

Enumeration for mapping threads to segments.

This enumeration defines how threads are mapped to segments during parallel operations. It includes options for mapping one thread per segment, one warp per segment, and user-defined mappings.

Function Documentation

◆ exclusiveScanAllSegments()

template<typename Segments, typename Fetch, typename Reduce, typename Write>
void TNL::Algorithms::Segments::exclusiveScanAllSegments ( const Segments & segments,
Fetch && fetch,
Reduce && reduce,
Write && write,
LaunchConfiguration launchConfig = LaunchConfiguration() )

Compute exclusive prefix-sum (scan) within all segments.

This is a convenience function that computes exclusive prefix-sum in all segments. It internally calls exclusiveScanSegments with the full range of segments.

Template Parameters
SegmentsType of the segments container.
FetchType of the fetch function.
ReduceType of the function object performing the reduction. See Function objects for reduction operations.
WriteType of the write function.
Parameters
segmentsThe segments container.
fetchFunction to fetch element value at given position. See Fetch Lambda.
reduceFunction object performing the reduction. See Reduction Function Object.
writeFunction to write result at given position. See Write Lambda.
launchConfigConfiguration for parallel execution.
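The semantics of the fetch, reduce and write parameters can be sketched sequentially in plain C++. The sketch assumes a CSR-like layout described by an offsets array, where segment i owns the global indexes offsets[i] to offsets[i+1]-1. The names exclusiveScanAllSegmentsSketch and exclusivePlusScan are hypothetical, the identity element is passed explicitly (an assumption made for simplicity), and the real function runs in parallel and supports other segment formats.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sequential sketch of an exclusive scan inside every segment.
// fetch( segmentIdx, localIdx, globalIdx ) returns the input value,
// reduce( a, b ) combines two values, write( globalIdx, value ) stores
// the combination of all elements that precede globalIdx in its segment.
template< typename Fetch, typename Reduce, typename Write, typename Value >
void exclusiveScanAllSegmentsSketch( const std::vector< int >& offsets, Fetch fetch, Reduce reduce, Write write, Value identity )
{
   for( std::size_t seg = 0; seg + 1 < offsets.size(); seg++ ) {
      Value running = identity;  // every segment starts from the identity
      for( int globalIdx = offsets[ seg ]; globalIdx < offsets[ seg + 1 ]; globalIdx++ ) {
         // Fetch before writing so that an in-place scan stays correct.
         const Value current = fetch( static_cast< int >( seg ), globalIdx - offsets[ seg ], globalIdx );
         write( globalIdx, running );
         running = reduce( running, current );
      }
   }
}

// Convenience wrapper: exclusive scan with addition over `data`.
inline std::vector< double > exclusivePlusScan( const std::vector< int >& offsets, const std::vector< double >& data )
{
   std::vector< double > result( data.size() );
   exclusiveScanAllSegmentsSketch(
      offsets,
      [ & ]( int, int, int globalIdx ) { return data[ globalIdx ]; },
      []( double a, double b ) { return a + b; },
      [ & ]( int globalIdx, double value ) { result[ globalIdx ] = value; },
      0.0 );
   return result;
}
```

For three segments holding [ 1 ], [ 2, 2 ] and [ 3, 3, 3 ], the wrapper yields [ 0 ], [ 0, 2 ] and [ 0, 3, 6 ]: each segment restarts from the identity and the last element of a segment never contributes to its own scan.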
Example
#include <iostream>
#include <TNL/Algorithms/Segments/CSR.h>
#include <TNL/Algorithms/Segments/scan.h>
#include <TNL/Algorithms/Segments/print.h>
#include <TNL/Containers/Vector.h>
#include <TNL/Devices/Host.h>

template< typename Device, typename Value = double, typename Index = int >
void
scanExample()
{
   // Create segments with different sizes
   TNL::Containers::Vector< Index, Device > segmentsSizes{ 1, 2, 3, 4, 5 };
   auto segmentsSizesView = segmentsSizes.getConstView();
   TNL::Algorithms::Segments::CSR< Device, Index > segments( segmentsSizes );

   // Create data to be scanned within segments
   TNL::Containers::Vector< Value, Device, Index > data( segments.getStorageSize() );
   TNL::Containers::Vector< Value, Device, Index > inclusive_result( segments.getStorageSize() );
   TNL::Containers::Vector< Value, Device, Index > exclusive_result( segments.getStorageSize() );
   auto inclusive_result_view = inclusive_result.getView();
   auto exclusive_result_view = exclusive_result.getView();

   // Initialize data with segment index + 1
   auto data_view = data.getView();
   TNL::Algorithms::Segments::forAllElements(
      segments,
      [ = ] __cuda_callable__( Index segmentIdx, Index localIdx, Index globalIdx ) mutable
      {
         data_view[ globalIdx ] = segmentIdx + 1;
      } );

   // Print original data
   std::cout << "Original data in segments:\n";
   std::cout << TNL::Algorithms::Segments::print(
      segments,
      [ = ] __cuda_callable__( Index globalIdx ) -> Value
      {
         return data_view[ globalIdx ];
      } ) << '\n';

   // Define fetch, reduce and write functions
   auto fetch = [ = ] __cuda_callable__( Index segmentIdx, Index localIdx, Index globalIdx ) -> Value
   {
      if( localIdx < segmentsSizesView[ segmentIdx ] )
         return data_view[ globalIdx ];
      else
         return 0;  // Return 0 for padding elements
   };
   auto write_inclusive = [ = ] __cuda_callable__( Index globalIdx, Value value ) mutable
   {
      inclusive_result_view[ globalIdx ] = value;
   };

   auto write_exclusive = [ = ] __cuda_callable__( Index globalIdx, Value value ) mutable
   {
      exclusive_result_view[ globalIdx ] = value;
   };

   // Perform inclusive scan
   TNL::Algorithms::Segments::inclusiveScanAllSegments( segments, fetch, TNL::Plus{}, write_inclusive );

   // Perform exclusive scan
   TNL::Algorithms::Segments::exclusiveScanAllSegments( segments, fetch, TNL::Plus{}, write_exclusive );

   // Print results
   std::cout << "\nInclusive scan results:\n";
   std::cout << TNL::Algorithms::Segments::print(
      segments,
      [ = ] __cuda_callable__( Index globalIdx ) -> Value
      {
         return inclusive_result_view[ globalIdx ];
      } ) << '\n';

   std::cout << "\nExclusive scan results:\n";
   std::cout << TNL::Algorithms::Segments::print(
      segments,
      [ = ] __cuda_callable__( Index globalIdx ) -> Value
      {
         return exclusive_result_view[ globalIdx ];
      } ) << '\n';

   // Example of scanning only specific segments
   TNL::Containers::Vector< Index, Device > segmentIndexes{ 1, 3 };  // Scan only segments 1 and 3

   auto write_partial = [ = ] __cuda_callable__( Index globalIdx, Value value ) mutable
   {
      data_view[ globalIdx ] = value;  // The segment scan algorithms can also be used in place
   };

   // Perform inclusive scan on selected segments
   TNL::Algorithms::Segments::inclusiveScanSegments( segments, segmentIndexes, fetch, TNL::Plus{}, write_partial );

   std::cout << "\nPartial inclusive inplace scan results (only segments 1 and 3):\n";
   std::cout << TNL::Algorithms::Segments::print(
      segments,
      [ = ] __cuda_callable__( Index globalIdx ) -> Value
      {
         return data_view[ globalIdx ];
      } ) << '\n';

   // Scanning the rest of segments using condition
   auto condition = [ = ] __cuda_callable__( Index segmentIdx ) mutable -> bool
   {
      return segmentIdx % 2 == 0;  // Only scan even segments
   };

   // Perform inclusive scan on the segments selected by the condition
   TNL::Algorithms::Segments::inclusiveScanAllSegmentsIf( segments, condition, fetch, TNL::Plus{}, write_partial );

   std::cout << "\nPartial inclusive inplace scan results (only even segments):\n";
   std::cout << TNL::Algorithms::Segments::print(
      segments,
      [ = ] __cuda_callable__( Index globalIdx ) -> Value
      {
         return data_view[ globalIdx ];
      } ) << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running example on Host:\n";
   scanExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << "\nRunning example on Cuda:\n";
   scanExample< TNL::Devices::Cuda >();
#endif

   return 0;
}
Output
Running example on Host:
Original data in segments:
Segment 0: [ 1 ]
Segment 1: [ 2, 2 ]
Segment 2: [ 3, 3, 3 ]
Segment 3: [ 4, 4, 4, 4 ]
Segment 4: [ 5, 5, 5, 5, 5 ]
Inclusive scan results:
Segment 0: [ 1 ]
Segment 1: [ 2, 4 ]
Segment 2: [ 3, 6, 9 ]
Segment 3: [ 4, 8, 12, 16 ]
Segment 4: [ 5, 10, 15, 20, 25 ]
Exclusive scan results:
Segment 0: [ 0 ]
Segment 1: [ 0, 2 ]
Segment 2: [ 0, 3, 6 ]
Segment 3: [ 0, 4, 8, 12 ]
Segment 4: [ 0, 5, 10, 15, 20 ]
Partial inclusive inplace scan results (only segments 1 and 3):
Segment 0: [ 1 ]
Segment 1: [ 2, 4 ]
Segment 2: [ 3, 3, 3 ]
Segment 3: [ 4, 8, 12, 16 ]
Segment 4: [ 5, 5, 5, 5, 5 ]
Partial inclusive inplace scan results (only even segments):
Segment 0: [ 1 ]
Segment 1: [ 2, 4 ]
Segment 2: [ 3, 6, 9 ]
Segment 3: [ 4, 8, 12, 16 ]
Segment 4: [ 5, 10, 15, 20, 25 ]
Running example on Cuda:
Original data in segments:
Segment 0: [ 1 ]
Segment 1: [ 2, 2 ]
Segment 2: [ 3, 3, 3 ]
Segment 3: [ 4, 4, 4, 4 ]
Segment 4: [ 5, 5, 5, 5, 5 ]
Inclusive scan results:
Segment 0: [ 1 ]
Segment 1: [ 2, 4 ]
Segment 2: [ 3, 6, 9 ]
Segment 3: [ 4, 8, 12, 16 ]
Segment 4: [ 5, 10, 15, 20, 25 ]
Exclusive scan results:
Segment 0: [ 0 ]
Segment 1: [ 0, 2 ]
Segment 2: [ 0, 3, 6 ]
Segment 3: [ 0, 4, 8, 12 ]
Segment 4: [ 0, 5, 10, 15, 20 ]
Partial inclusive inplace scan results (only segments 1 and 3):
Segment 0: [ 1 ]
Segment 1: [ 2, 4 ]
Segment 2: [ 3, 3, 3 ]
Segment 3: [ 4, 8, 12, 16 ]
Segment 4: [ 5, 5, 5, 5, 5 ]
Partial inclusive inplace scan results (only even segments):
Segment 0: [ 1 ]
Segment 1: [ 2, 4 ]
Segment 2: [ 3, 6, 9 ]
Segment 3: [ 4, 8, 12, 16 ]
Segment 4: [ 5, 10, 15, 20, 25 ]

◆ exclusiveScanAllSegmentsIf()

template<typename Segments, typename Condition, typename Fetch, typename Reduce, typename Write>
void TNL::Algorithms::Segments::exclusiveScanAllSegmentsIf ( const Segments & segments,
Condition && condition,
Fetch && fetch,
Reduce && reduce,
Write && write,
LaunchConfiguration launchConfig = LaunchConfiguration() )

Compute exclusive conditional prefix-sum (scan) within all segments.

This is a convenience function that computes the exclusive prefix-sum in all segments, but only in those segments for which the given condition returns true. It internally calls exclusiveScanSegmentsIf with the full range of segments.

Template Parameters
Segments    Type of the segments container.
Condition   Type of the condition function.
Fetch       Type of the fetch function.
Reduce      Type of the function object performing the reduction. See Function objects for reduction operations.
Write       Type of the write function.
Parameters
segments      The segments container.
condition     Function that returns true for segments to include in the scan. See Condition Lambda.
fetch         Function to fetch the element value at a given position. See Fetch Lambda.
reduce        Function object performing the reduction. See Reduction Function Object.
write         Function to write the result at a given position. See Write Lambda.
launchConfig  Configuration for parallel execution.
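
The forwarding described above can be pictured with a small sequential model in standard C++. This is only a sketch and not TNL code: the names `exclusiveScanSegmentsIfModel` and `exclusiveScanAllSegmentsIfModel`, the CSR-style `offsets` array, the fixed addition with identity 0, and the in-place update are all assumptions made for illustration. The condition receives a segment index, as in the example listing for this function.

```cpp
#include <cassert>
#include <vector>

// Plain C++ model, not TNL code.
// Segment s owns the global indexes [ offsets[ s ], offsets[ s + 1 ] ).
template< typename Condition >
void
exclusiveScanSegmentsIfModel( const std::vector< int >& offsets,
                              int begin,
                              int end,
                              Condition condition,
                              std::vector< double >& data )
{
   assert( 0 <= begin && begin <= end && end < static_cast< int >( offsets.size() ) );
   for( int s = begin; s < end; s++ ) {
      if( ! condition( s ) )
         continue;  // skipped segment: its values stay as they are
      double running = 0.0;  // identity of addition
      for( int g = offsets[ s ]; g < offsets[ s + 1 ]; g++ ) {
         const double current = data[ g ];  // read before overwriting (in-place scan)
         data[ g ] = running;
         running += current;
      }
   }
}

// The "all segments" variant only supplies the full range of segments.
template< typename Condition >
void
exclusiveScanAllSegmentsIfModel( const std::vector< int >& offsets, Condition condition, std::vector< double >& data )
{
   exclusiveScanSegmentsIfModel( offsets, 0, static_cast< int >( offsets.size() ) - 1, condition, data );
}
```

With segment sizes 1, 2, 3, 4, 5 and a condition selecting even segment indexes, only segments 0, 2 and 4 are rewritten; segments 1 and 3 keep their input values.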
Example
#include <iostream>
#include <TNL/Algorithms/Segments/CSR.h>
#include <TNL/Algorithms/Segments/scan.h>
#include <TNL/Algorithms/Segments/print.h>
#include <TNL/Containers/Vector.h>
#include <TNL/Devices/Host.h>

template< typename Device, typename Value = double, typename Index = int >
void
scanExample()
{
   // Create segments with different sizes
   TNL::Containers::Vector< Index, Device > segmentsSizes{ 1, 2, 3, 4, 5 };
   auto segmentsSizesView = segmentsSizes.getConstView();
   TNL::Algorithms::Segments::CSR< Device, Index > segments( segmentsSizes );

   // Create data to be scanned within segments
   TNL::Containers::Vector< Value, Device, Index > data( segments.getStorageSize() );
   TNL::Containers::Vector< Value, Device, Index > inclusive_result( segments.getStorageSize() );
   TNL::Containers::Vector< Value, Device, Index > exclusive_result( segments.getStorageSize() );
   auto inclusive_result_view = inclusive_result.getView();
   auto exclusive_result_view = exclusive_result.getView();

   // Initialize data with segment index + 1
   auto data_view = data.getView();
   TNL::Algorithms::Segments::forAllElements(
      segments,
      [ = ] __cuda_callable__( Index segmentIdx, Index localIdx, Index globalIdx ) mutable
      {
         data_view[ globalIdx ] = segmentIdx + 1;
      } );

   // Print original data
   std::cout << "Original data in segments:\n";
   std::cout << TNL::Algorithms::Segments::print(
      segments,
      [ = ] __cuda_callable__( Index globalIdx ) -> Value
      {
         return data_view[ globalIdx ];
      } ) << '\n';

   // Define fetch and write functions; the reduction is TNL::Plus
   auto fetch = [ = ] __cuda_callable__( Index segmentIdx, Index localIdx, Index globalIdx ) -> Value
   {
      if( localIdx < segmentsSizesView[ segmentIdx ] )
         return data_view[ globalIdx ];
      else
         return 0;  // Return 0 for padding elements
   };
   auto write_inclusive = [ = ] __cuda_callable__( Index globalIdx, Value value ) mutable
   {
      inclusive_result_view[ globalIdx ] = value;
   };

   auto write_exclusive = [ = ] __cuda_callable__( Index globalIdx, Value value ) mutable
   {
      exclusive_result_view[ globalIdx ] = value;
   };

   // Perform inclusive scan
   TNL::Algorithms::Segments::inclusiveScanAllSegments( segments, fetch, TNL::Plus{}, write_inclusive );

   // Perform exclusive scan
   TNL::Algorithms::Segments::exclusiveScanAllSegments( segments, fetch, TNL::Plus{}, write_exclusive );

   // Print results
   std::cout << "\nInclusive scan results:\n";
   std::cout << TNL::Algorithms::Segments::print(
      segments,
      [ = ] __cuda_callable__( Index globalIdx ) -> Value
      {
         return inclusive_result_view[ globalIdx ];
      } ) << '\n';

   std::cout << "\nExclusive scan results:\n";
   std::cout << TNL::Algorithms::Segments::print(
      segments,
      [ = ] __cuda_callable__( Index globalIdx ) -> Value
      {
         return exclusive_result_view[ globalIdx ];
      } ) << '\n';

   // Example of scanning only specific segments
   TNL::Containers::Vector< Index, Device > segmentIndexes{ 1, 3 };  // Scan only segments 1 and 3

   auto write_partial = [ = ] __cuda_callable__( Index globalIdx, Value value ) mutable
   {
      data_view[ globalIdx ] = value;  // Segment scans can also run in place
   };

   // Perform inclusive scan on selected segments
   TNL::Algorithms::Segments::inclusiveScanSegments( segments, segmentIndexes, fetch, TNL::Plus{}, write_partial );

   std::cout << "\nPartial inclusive inplace scan results (only segments 1 and 3):\n";
   std::cout << TNL::Algorithms::Segments::print(
      segments,
      [ = ] __cuda_callable__( Index globalIdx ) -> Value
      {
         return data_view[ globalIdx ];
      } ) << '\n';

   // Scanning the rest of the segments using a condition
   auto condition = [ = ] __cuda_callable__( Index segmentIdx ) mutable -> bool
   {
      return segmentIdx % 2 == 0;  // Only scan even segments
   };

   // Perform inclusive scan on segments satisfying the condition
   TNL::Algorithms::Segments::inclusiveScanAllSegmentsIf( segments, condition, fetch, TNL::Plus{}, write_partial );

   std::cout << "\nPartial inclusive inplace scan results (only even segments):\n";
   std::cout << TNL::Algorithms::Segments::print(
      segments,
      [ = ] __cuda_callable__( Index globalIdx ) -> Value
      {
         return data_view[ globalIdx ];
      } ) << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running example on Host:\n";
   scanExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << "\nRunning example on Cuda:\n";
   scanExample< TNL::Devices::Cuda >();
#endif

   return 0;
}
Output
Running example on Host:
Original data in segments:
Segment 0: [ 1 ]
Segment 1: [ 2, 2 ]
Segment 2: [ 3, 3, 3 ]
Segment 3: [ 4, 4, 4, 4 ]
Segment 4: [ 5, 5, 5, 5, 5 ]
Inclusive scan results:
Segment 0: [ 1 ]
Segment 1: [ 2, 4 ]
Segment 2: [ 3, 6, 9 ]
Segment 3: [ 4, 8, 12, 16 ]
Segment 4: [ 5, 10, 15, 20, 25 ]
Exclusive scan results:
Segment 0: [ 0 ]
Segment 1: [ 0, 2 ]
Segment 2: [ 0, 3, 6 ]
Segment 3: [ 0, 4, 8, 12 ]
Segment 4: [ 0, 5, 10, 15, 20 ]
Partial inclusive inplace scan results (only segments 1 and 3):
Segment 0: [ 1 ]
Segment 1: [ 2, 4 ]
Segment 2: [ 3, 3, 3 ]
Segment 3: [ 4, 8, 12, 16 ]
Segment 4: [ 5, 5, 5, 5, 5 ]
Partial inclusive inplace scan results (only even segments):
Segment 0: [ 1 ]
Segment 1: [ 2, 4 ]
Segment 2: [ 3, 6, 9 ]
Segment 3: [ 4, 8, 12, 16 ]
Segment 4: [ 5, 10, 15, 20, 25 ]
Running example on Cuda:
Original data in segments:
Segment 0: [ 1 ]
Segment 1: [ 2, 2 ]
Segment 2: [ 3, 3, 3 ]
Segment 3: [ 4, 4, 4, 4 ]
Segment 4: [ 5, 5, 5, 5, 5 ]
Inclusive scan results:
Segment 0: [ 1 ]
Segment 1: [ 2, 4 ]
Segment 2: [ 3, 6, 9 ]
Segment 3: [ 4, 8, 12, 16 ]
Segment 4: [ 5, 10, 15, 20, 25 ]
Exclusive scan results:
Segment 0: [ 0 ]
Segment 1: [ 0, 2 ]
Segment 2: [ 0, 3, 6 ]
Segment 3: [ 0, 4, 8, 12 ]
Segment 4: [ 0, 5, 10, 15, 20 ]
Partial inclusive inplace scan results (only segments 1 and 3):
Segment 0: [ 1 ]
Segment 1: [ 2, 4 ]
Segment 2: [ 3, 3, 3 ]
Segment 3: [ 4, 8, 12, 16 ]
Segment 4: [ 5, 5, 5, 5, 5 ]
Partial inclusive inplace scan results (only even segments):
Segment 0: [ 1 ]
Segment 1: [ 2, 4 ]
Segment 2: [ 3, 6, 9 ]
Segment 3: [ 4, 8, 12, 16 ]
Segment 4: [ 5, 10, 15, 20, 25 ]

◆ exclusiveScanSegment()

template<typename SegmentView, typename Fetch, typename Reduce, typename Write>
__cuda_callable__ void TNL::Algorithms::Segments::exclusiveScanSegment ( SegmentView & segment,
Fetch && fetch,
Reduce && reduce,
Write && write )

Computes an exclusive scan (or prefix sum) within a segment.

Template Parameters
SegmentView  Type of the segment view.
Fetch        Type of the fetch function.
Reduce       Type of the function performing the reduction.
Write        Type of the write function.
Parameters
segment  The segment view to scan.
fetch    Function to fetch the element value at a given position. See Fetch Lambda.
reduce   Function object performing the reduction. See Reduction Function Object.
write    Function to write the result at a given position. See Write Lambda.
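
To make the roles of fetch, reduce and write concrete, the following is a minimal sequential model in standard C++. It is a sketch under stated assumptions and not the TNL implementation: `exclusiveScanSegmentModel`, the explicit `identity` argument, the representation of a segment as a global index range, and the helper `scanSegmentTwo` are all hypothetical. The lambda signatures mirror the fetch and write lambdas used in the scan example on this page.

```cpp
#include <cassert>
#include <vector>

// Plain C++ model, not TNL code: exclusive scan over one segment that
// occupies the global index range [ first, last ).
template< typename Index, typename Value, typename Fetch, typename Reduce, typename Write >
void
exclusiveScanSegmentModel( Index segmentIdx, Index first, Index last, Fetch fetch, Reduce reduce, Value identity, Write write )
{
   assert( first <= last );
   Value running = identity;  // value written to the first element
   for( Index globalIdx = first; globalIdx < last; globalIdx++ ) {
      // Fetch before writing so that the scan also works in place.
      const Value current = fetch( segmentIdx, globalIdx - first, globalIdx );
      write( globalIdx, running );           // result excludes the current element
      running = reduce( running, current );  // accumulate for the next element
   }
}

// Hypothetical helper: scan segment 2 of the example data, i.e. the values
// [ 3, 3, 3 ] stored at global indexes 3, 4 and 5.
inline std::vector< double >
scanSegmentTwo()
{
   std::vector< double > data{ 1, 2, 2, 3, 3, 3 };
   std::vector< double > result( data.size(), 0.0 );
   auto fetch = [ & ]( int, int, int globalIdx ) { return data[ globalIdx ]; };
   auto write = [ & ]( int globalIdx, double value ) { result[ globalIdx ] = value; };
   exclusiveScanSegmentModel( 2, 3, 6, fetch, []( double a, double b ) { return a + b; }, 0.0, write );
   return result;
}
```

In this model the three elements of segment 2 become 0, 3, 6, which matches the "Exclusive scan results" shown for that segment in the example output; an inclusive scan would write after the reduction instead of before it.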

◆ exclusiveScanSegments() [1/2]

template<typename Segments, typename Array, typename Fetch, typename Reduce, typename Write>
void TNL::Algorithms::Segments::exclusiveScanSegments ( const Segments & segments,
const Array & segmentIndexes,
Fetch && fetch,
Reduce && reduce,
Write && write,
LaunchConfiguration launchConfig = LaunchConfiguration() )

Compute exclusive prefix-sum (scan) within all segments specified by a segment index array.

This is a convenience function that computes the exclusive prefix-sum in all segments specified by the segmentIndexes array. It internally calls exclusiveScanSegments with the full range of the segment index array.

Template Parameters
Segments  Type of the segments container.
Array     Type of the segment indexes array.
Fetch     Type of the fetch function.
Reduce    Type of the function object performing the reduction. See Function objects for reduction operations.
Write     Type of the write function.
Parameters
segments        The segments container.
segmentIndexes  Array containing indices of segments to scan.
fetch           Function to fetch the element value at a given position. See Fetch Lambda.
reduce          Function object performing the reduction. See Reduction Function Object.
write           Function to write the result at a given position. See Write Lambda.
launchConfig    Configuration for parallel execution.
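
The effect of restricting the scan to an index array can be sketched in standard C++ as follows. This is an illustrative model and not TNL code: the function name `exclusiveScanListedSegments`, the CSR-style `offsets` array, the fixed use of addition with identity 0, and the choice to copy the input so that unlisted segments keep their values are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Plain C++ model, not TNL code: exclusive scan (sum) in the segments listed
// in segmentIndexes. Segment s owns the global indexes
// [ offsets[ s ], offsets[ s + 1 ] ).
inline std::vector< double >
exclusiveScanListedSegments( const std::vector< int >& offsets,
                             const std::vector< int >& segmentIndexes,
                             const std::vector< double >& data )
{
   std::vector< double > result = data;  // unlisted segments keep their input values
   for( int s : segmentIndexes ) {
      assert( s >= 0 && static_cast< std::size_t >( s ) + 1 < offsets.size() );
      double running = 0.0;  // identity of addition
      for( int g = offsets[ s ]; g < offsets[ s + 1 ]; g++ ) {
         result[ g ] = running;
         running += data[ g ];
      }
   }
   return result;
}
```

For segment sizes 1, 2, 3, 4, 5 (offsets 0, 1, 3, 6, 10, 15) and the index list { 1, 3 }, only segments 1 and 3 change: they become [ 0, 2 ] and [ 0, 4, 8, 12 ].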
Example
See the scan example and its output listed under exclusiveScanAllSegmentsIf() above; the same listing applies here.

◆ exclusiveScanSegments() [2/2]

template<typename Segments, typename IndexBegin, typename IndexEnd, typename Fetch, typename Reduce, typename Write, typename T = std::enable_if_t< std::is_integral_v< IndexBegin > && std::is_integral_v< IndexEnd > >>
void TNL::Algorithms::Segments::exclusiveScanSegments ( const Segments & segments,
IndexBegin begin,
IndexEnd end,
Fetch && fetch,
Reduce && reduce,
Write && write,
LaunchConfiguration launchConfig = LaunchConfiguration() )

Compute exclusive prefix-sum (scan) within specified segments in a range.

This function computes the exclusive prefix-sum within segments in the range [ begin, end ). Each segment is processed independently using a sequential scan. The scan operation is performed based on the provided fetch, reduce, and write functions.

Template Parameters
Segments    Type of the segments container.
IndexBegin  Type of the begin index.
IndexEnd    Type of the end index.
Fetch       Type of the fetch function.
Reduce      Type of the function object performing the reduction. See Function objects for reduction operations.
Write       Type of the write function.
Parameters
segments      The segments container.
begin         The beginning of the range of segments to scan.
end           The end of the range of segments to scan.
fetch         Function to fetch the element value at a given position. See Fetch Lambda.
reduce        Function object performing the reduction. See Reduction Function Object.
write         Function to write the result at a given position. See Write Lambda.
launchConfig  Configuration for parallel execution.
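
The half-open range and the role of the reduction can be illustrated with a sequential model in standard C++. The sketch below is not TNL code: `exclusiveScanSegmentRange`, `scanLastTwoSegments`, the CSR-style `offsets` array and the explicitly passed `identity` are assumptions made for the illustration, and the standard function objects `std::plus` and `std::multiplies` stand in for a reduction function object.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Plain C++ model, not TNL code: in-place exclusive scan of the segments
// begin, begin + 1, ..., end - 1. Segment s owns the global indexes
// [ offsets[ s ], offsets[ s + 1 ] ). Returns the modified copy of data.
template< typename Reduce >
std::vector< int >
exclusiveScanSegmentRange( const std::vector< int >& offsets, int begin, int end, std::vector< int > data, Reduce reduce, int identity )
{
   assert( 0 <= begin && begin <= end && end < static_cast< int >( offsets.size() ) );
   for( int s = begin; s < end; s++ ) {
      int running = identity;  // every segment restarts from the identity
      for( int g = offsets[ s ]; g < offsets[ s + 1 ]; g++ ) {
         const int current = data[ g ];
         data[ g ] = running;
         running = reduce( running, current );
      }
   }
   return data;
}

// Hypothetical helper: sum-scan segments 1 and 2 of three segments with sizes 1, 2, 3.
inline std::vector< int >
scanLastTwoSegments()
{
   return exclusiveScanSegmentRange( { 0, 1, 3, 6 }, 1, 3, { 1, 2, 2, 3, 3, 3 }, std::plus< int >{}, 0 );
}
```

Because the range is half-open, `begin == end` scans nothing. Swapping the reduction changes the result but not the traversal: with `std::multiplies< int >{}` and identity 1, the segment [ 3, 3, 3 ] becomes [ 1, 3, 9 ].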
Example
See the scan example and its output listed under exclusiveScanAllSegmentsIf() above; the same listing applies here.

◆ exclusiveScanSegmentsIf()

template<typename Segments, typename IndexBegin, typename IndexEnd, typename Condition, typename Fetch, typename Reduce, typename Write, typename T = typename std::enable_if_t< std::is_integral_v< IndexBegin > && std::is_integral_v< IndexEnd > >>
void TNL::Algorithms::Segments::exclusiveScanSegmentsIf ( const Segments & segments,
IndexBegin begin,
IndexEnd end,
Condition && condition,
Fetch && fetch,
Reduce && reduce,
Write && write,
LaunchConfiguration launchConfig = LaunchConfiguration() )

Compute exclusive conditional prefix-sum (scan) within specified segments in a range.

This function computes the exclusive prefix-sum within segments in the range [ begin, end ), but only in those segments for which the given condition returns true. Each segment is processed independently using a sequential scan.

Template Parameters
Segments    Type of the segments container.
IndexBegin  Type of the begin index.
IndexEnd    Type of the end index.
Condition   Type of the condition function.
Fetch       Type of the fetch function.
Reduce      Type of the function object performing the reduction. See Function objects for reduction operations.
Write       Type of the write function.
Parameters
segments      The segments container.
begin         The beginning of the range of segments to scan.
end           The end of the range of segments to scan.
condition     Function that returns true for segments to include in the scan. See Condition Lambda.
fetch         Function to fetch the element value at a given position. See Fetch Lambda.
reduce        Function object performing the reduction. See Reduction Function Object.
write         Function to write the result at a given position. See Write Lambda.
launchConfig  Configuration for parallel execution.
Example
1#include <iostream>
2#include <TNL/Algorithms/Segments/CSR.h>
3#include <TNL/Algorithms/Segments/scan.h>
4#include <TNL/Algorithms/Segments/print.h>
5#include <TNL/Containers/Vector.h>
6#include <TNL/Devices/Host.h>
7
8template< typename Device, typename Value = double, typename Index = int >
9void
10scanExample()
11{
12 // Create segments with different sizes
13 TNL::Containers::Vector< Index, Device > segmentsSizes{ 1, 2, 3, 4, 5 };
14 auto segmentsSizesView = segmentsSizes.getConstView();
16
17 // Create data to be scanned within segments
18 TNL::Containers::Vector< Value, Device, Index > data( segments.getStorageSize() );
19 TNL::Containers::Vector< Value, Device, Index > inclusive_result( segments.getStorageSize() );
20 TNL::Containers::Vector< Value, Device, Index > exclusive_result( segments.getStorageSize() );
21 auto inclusive_result_view = inclusive_result.getView();
22 auto exclusive_result_view = exclusive_result.getView();
23
24 // Initialize data with segment index + 1
25 auto data_view = data.getView();
27 segments,
28 [ = ] __cuda_callable__( Index segmentIdx, Index localIdx, Index globalIdx ) mutable
29 {
30 data_view[ globalIdx ] = segmentIdx + 1;
31 } );
32
33 // Print original data
34 std::cout << "Original data in segments:\n";
36 segments,
37 [ = ] __cuda_callable__( Index globalIdx ) -> Value
38 {
39 return data_view[ globalIdx ];
40 } ) << '\n';
41
43 // Define fetch, reduce and write functions
44 auto fetch = [ = ] __cuda_callable__( Index segmentIdx, Index localIdx, Index globalIdx ) -> Value
45 {
46 if( localIdx < segmentsSizesView[ segmentIdx ] )
47 return data_view[ globalIdx ];
48 else
49 return 0; // Return 0 for padding elements
50 };
51 auto write_inclusive = [ = ] __cuda_callable__( Index globalIdx, Value value ) mutable
52 {
53 inclusive_result_view[ globalIdx ] = value;
54 };
55
56 auto write_exclusive = [ = ] __cuda_callable__( Index globalIdx, Value value ) mutable
57 {
58 exclusive_result_view[ globalIdx ] = value;
59 };
60
61 // Perform inclusive scan
62 TNL::Algorithms::Segments::inclusiveScanAllSegments( segments, fetch, TNL::Plus{}, write_inclusive );
63
64 // Perform exclusive scan
65 TNL::Algorithms::Segments::exclusiveScanAllSegments( segments, fetch, TNL::Plus{}, write_exclusive );
67
68 // Print results
69 std::cout << "\nInclusive scan results:\n";
71 segments,
72 [ = ] __cuda_callable__( Index globalIdx ) -> Value
73 {
74 return inclusive_result_view[ globalIdx ];
75 } ) << '\n';
76
77 std::cout << "\nExclusive scan results:\n";
79 segments,
80 [ = ] __cuda_callable__( Index globalIdx ) -> Value
81 {
82 return exclusive_result_view[ globalIdx ];
83 } ) << '\n';
84
85 // Example of scanning only specific segments
86 TNL::Containers::Vector< Index, Device > segmentIndexes{ 1, 3 }; // Scan only segments 1 and 3
87
88 auto write_partial = [ = ] __cuda_callable__( Index globalIdx, Value value ) mutable
89 {
90 data_view[ globalIdx ] = value; // All segment scan algorithms may work even as inplace scan
91 };
92
93 // Perform inclusive scan on selected segments
94 TNL::Algorithms::Segments::inclusiveScanSegments( segments, segmentIndexes, fetch, TNL::Plus{}, write_partial );
95
96 std::cout << "\nPartial inclusive inplace scan results (only segments 1 and 3):\n";
98 segments,
99 [ = ] __cuda_callable__( Index globalIdx ) -> Value
100 {
101 return data_view[ globalIdx ];
102 } ) << '\n';
103
104 // Scanning the rest of segments using condition
105 auto condition = [ = ] __cuda_callable__( Index segmentIdx ) mutable -> bool
106 {
107 return segmentIdx % 2 == 0; // Only scan even segments
108 };
109
110 // Perform inclusive scan on selected segments
111 TNL::Algorithms::Segments::inclusiveScanAllSegmentsIf( segments, condition, fetch, TNL::Plus{}, write_partial );
112
113 std::cout << "\nPartial inclusive inplace scan results (only even segments):\n";
115 segments,
116 [ = ] __cuda_callable__( Index globalIdx ) -> Value
117 {
118 return data_view[ globalIdx ];
119 } ) << '\n';
120}
121
122int
123main( int argc, char* argv[] )
124{
125 std::cout << "Running example on Host:\n";
126 scanExample< TNL::Devices::Host >();
127
128#ifdef __CUDACC__
129 std::cout << "\nRunning example on Cuda:\n";
130 scanExample< TNL::Devices::Cuda >();
131#endif
132
133 return 0;
134}
Output
Running example on Host:
Original data in segments:
Segment 0: [ 1 ]
Segment 1: [ 2, 2 ]
Segment 2: [ 3, 3, 3 ]
Segment 3: [ 4, 4, 4, 4 ]
Segment 4: [ 5, 5, 5, 5, 5 ]
Inclusive scan results:
Segment 0: [ 1 ]
Segment 1: [ 2, 4 ]
Segment 2: [ 3, 6, 9 ]
Segment 3: [ 4, 8, 12, 16 ]
Segment 4: [ 5, 10, 15, 20, 25 ]
Exclusive scan results:
Segment 0: [ 0 ]
Segment 1: [ 0, 2 ]
Segment 2: [ 0, 3, 6 ]
Segment 3: [ 0, 4, 8, 12 ]
Segment 4: [ 0, 5, 10, 15, 20 ]
Partial inclusive inplace scan results (only segments 1 and 3):
Segment 0: [ 1 ]
Segment 1: [ 2, 4 ]
Segment 2: [ 3, 3, 3 ]
Segment 3: [ 4, 8, 12, 16 ]
Segment 4: [ 5, 5, 5, 5, 5 ]
Partial inclusive inplace scan results (only even segments):
Segment 0: [ 1 ]
Segment 1: [ 2, 4 ]
Segment 2: [ 3, 6, 9 ]
Segment 3: [ 4, 8, 12, 16 ]
Segment 4: [ 5, 10, 15, 20, 25 ]
Running example on Cuda:
Original data in segments:
Segment 0: [ 1 ]
Segment 1: [ 2, 2 ]
Segment 2: [ 3, 3, 3 ]
Segment 3: [ 4, 4, 4, 4 ]
Segment 4: [ 5, 5, 5, 5, 5 ]
Inclusive scan results:
Segment 0: [ 1 ]
Segment 1: [ 2, 4 ]
Segment 2: [ 3, 6, 9 ]
Segment 3: [ 4, 8, 12, 16 ]
Segment 4: [ 5, 10, 15, 20, 25 ]
Exclusive scan results:
Segment 0: [ 0 ]
Segment 1: [ 0, 2 ]
Segment 2: [ 0, 3, 6 ]
Segment 3: [ 0, 4, 8, 12 ]
Segment 4: [ 0, 5, 10, 15, 20 ]
Partial inclusive inplace scan results (only segments 1 and 3):
Segment 0: [ 1 ]
Segment 1: [ 2, 4 ]
Segment 2: [ 3, 3, 3 ]
Segment 3: [ 4, 8, 12, 16 ]
Segment 4: [ 5, 5, 5, 5, 5 ]
Partial inclusive inplace scan results (only even segments):
Segment 0: [ 1 ]
Segment 1: [ 2, 4 ]
Segment 2: [ 3, 6, 9 ]
Segment 3: [ 4, 8, 12, 16 ]
Segment 4: [ 5, 10, 15, 20, 25 ]
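
The inclusive and exclusive results above differ only in whether an element's own value is part of its running total, and the total restarts in every segment. The following is a minimal sequential sketch of that behaviour for addition, written in standard C++ over CSR-style offsets. The names segmentedScan and offsets are illustrative assumptions; this is not TNL's implementation or API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative helper (not part of TNL): scans each segment independently.
// Segment i occupies data[ offsets[ i ] ] .. data[ offsets[ i + 1 ] - 1 ].
std::vector< int >
segmentedScan( const std::vector< std::size_t >& offsets, const std::vector< int >& data, bool inclusive )
{
   assert( ! offsets.empty() && offsets.back() == data.size() );
   std::vector< int > result( data.size(), 0 );
   for( std::size_t seg = 0; seg + 1 < offsets.size(); seg++ ) {
      int running = 0;  // identity of addition; the total restarts in every segment
      for( std::size_t i = offsets[ seg ]; i < offsets[ seg + 1 ]; i++ ) {
         if( inclusive ) {
            running += data[ i ];
            result[ i ] = running;  // total including the current element
         }
         else {
            result[ i ] = running;  // total of the preceding elements only
            running += data[ i ];
         }
      }
   }
   return result;
}
```

With offsets { 0, 1, 3, 6 } and data { 1, 2, 2, 3, 3, 3 }, the inclusive variant yields { 1, 2, 4, 3, 6, 9 } and the exclusive variant yields { 0, 0, 2, 0, 3, 6 }, which agrees with segments 0 to 2 of the output above.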

◆ forAllElements()

template<typename Segments, typename Function>
void TNL::Algorithms::Segments::forAllElements ( const Segments & segments,
Function && function,
LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Iterates in parallel over all elements of all segments and applies the specified lambda function.

See also: Overview of Segment Traversal Functions

Template Parameters
Segments: The type of the segments.
Function: The type of the lambda function to be applied to each element.
Parameters
segments: The segments whose elements will be processed using the lambda function.
function: The lambda function to be applied to each element. See Full Form (With All Parameters) or Brief Form (Without Local Index).
launchConfig: The configuration of the launch - see TNL::Algorithms::Segments::LaunchConfiguration.

Example
#include <cstdlib>
#include <iostream>
#include <TNL/Containers/Vector.h>
#include <TNL/Algorithms/Segments/CSR.h>
#include <TNL/Algorithms/Segments/Ellpack.h>
#include <TNL/Algorithms/Segments/traverse.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Segments >
void
SegmentsExample()
{
   using Device = typename Segments::DeviceType;

   /***
    * Create segments with the given segment sizes.
    */
   Segments segments{ 1, 2, 3, 4, 5 };

   /***
    * Allocate an array for the data stored in the segments.
    */
   TNL::Containers::Array< double, Device > data( segments.getStorageSize(), 0.0 );

   /***
    * Insert data into all segments with no check, so padding elements are written too.
    */
   auto data_view = data.getView();
   TNL::Algorithms::Segments::forAllElements(
      segments,
      [ = ] __cuda_callable__( int segmentIdx, int localIdx, int globalIdx ) mutable
      {
         data_view[ globalIdx ] = segmentIdx;
      } );

   /***
    * Print the data managed by the segments.
    */
   std::cout << "Data setup with no check ...\n";
   std::cout << "Array: " << data << '\n';
   auto fetch = [ = ] __cuda_callable__( int globalIdx ) -> double
   {
      return data_view[ globalIdx ];
   };
   std::cout << TNL::Algorithms::Segments::print( segments, fetch ) << '\n';

   /***
    * Insert data again, this time skipping padding elements.
    */
   data = 0.0;
   TNL::Algorithms::Segments::forAllElements(
      segments,
      [ = ] __cuda_callable__( int segmentIdx, int localIdx, int globalIdx ) mutable
      {
         if( localIdx <= segmentIdx )
            data_view[ globalIdx ] = segmentIdx;
      } );

   /***
    * Print the data managed by the segments.
    */
   std::cout << "Data setup with check for padding elements...\n";
   std::cout << "Array: " << data << '\n';
   std::cout << TNL::Algorithms::Segments::print( segments, fetch ) << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Example of CSR segments on host:\n";
   SegmentsExample< TNL::Algorithms::Segments::CSR< TNL::Devices::Host, int > >();

   std::cout << "Example of Ellpack segments on host:\n";
   SegmentsExample< TNL::Algorithms::Segments::Ellpack< TNL::Devices::Host, int > >();

#ifdef __CUDACC__
   std::cout << "Example of CSR segments on CUDA GPU:\n";
   SegmentsExample< TNL::Algorithms::Segments::CSR< TNL::Devices::Cuda, int > >();

   std::cout << "Example of Ellpack segments on CUDA GPU:\n";
   SegmentsExample< TNL::Algorithms::Segments::Ellpack< TNL::Devices::Cuda, int > >();
#endif
   return EXIT_SUCCESS;
}
Output
Example of CSR segments on host:
Data setup with no check ...
Array: [ 0, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 4 ]
Segment 0: [ 0 ]
Segment 1: [ 1, 1 ]
Segment 2: [ 2, 2, 2 ]
Segment 3: [ 3, 3, 3, 3 ]
Segment 4: [ 4, 4, 4, 4, 4 ]
Data setup with check for padding elements...
Array: [ 0, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 4 ]
Segment 0: [ 0 ]
Segment 1: [ 1, 1 ]
Segment 2: [ 2, 2, 2 ]
Segment 3: [ 3, 3, 3, 3 ]
Segment 4: [ 4, 4, 4, 4, 4 ]
Example of Ellpack segments on host:
Data setup with no check ...
Array: [ 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4 ]
Segment 0: [ 0, 0, 0, 0, 0 ]
Segment 1: [ 1, 1, 1, 1, 1 ]
Segment 2: [ 2, 2, 2, 2, 2 ]
Segment 3: [ 3, 3, 3, 3, 3 ]
Segment 4: [ 4, 4, 4, 4, 4 ]
Data setup with check for padding elements...
Array: [ 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 2, 2, 2, 0, 0, 3, 3, 3, 3, 0, 4, 4, 4, 4, 4 ]
Segment 0: [ 0, 0, 0, 0, 0 ]
Segment 1: [ 1, 1, 0, 0, 0 ]
Segment 2: [ 2, 2, 2, 0, 0 ]
Segment 3: [ 3, 3, 3, 3, 0 ]
Segment 4: [ 4, 4, 4, 4, 4 ]
Example of CSR segments on CUDA GPU:
Data setup with no check ...
Array: [ 0, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 4 ]
Segment 0: [ 0 ]
Segment 1: [ 1, 1 ]
Segment 2: [ 2, 2, 2 ]
Segment 3: [ 3, 3, 3, 3 ]
Segment 4: [ 4, 4, 4, 4, 4 ]
Data setup with check for padding elements...
Array: [ 0, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 4 ]
Segment 0: [ 0 ]
Segment 1: [ 1, 1 ]
Segment 2: [ 2, 2, 2 ]
Segment 3: [ 3, 3, 3, 3 ]
Segment 4: [ 4, 4, 4, 4, 4 ]
Example of Ellpack segments on CUDA GPU:
Data setup with no check ...
Array: [ 0, 1, 2, 3, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 3, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 3, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 3, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 3, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ]
Segment 0: [ 0, 0, 0, 0, 0 ]
Segment 1: [ 1, 1, 1, 1, 1 ]
Segment 2: [ 2, 2, 2, 2, 2 ]
Segment 3: [ 3, 3, 3, 3, 3 ]
Segment 4: [ 4, 4, 4, 4, 4 ]
Data setup with check for padding elements...
Array: [ 0, 1, 2, 3, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 3, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 3, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ]
Segment 0: [ 0, 0, 0, 0, 0 ]
Segment 1: [ 1, 1, 0, 0, 0 ]
Segment 2: [ 2, 2, 2, 0, 0 ]
Segment 3: [ 3, 3, 3, 3, 0 ]
Segment 4: [ 4, 4, 4, 4, 4 ]
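
The Ellpack arrays printed above suggest why the second traversal tests localIdx <= segmentIdx: every segment is padded to the same width, so the traversal also visits slots that hold no user data. The host output is consistent with a row-major layout, and the CUDA output with a column-major layout in which the number of segments appears to be padded to 32. The sketch below models both index mappings in standard C++. EllpackLayout, globalIndex, fillWithPaddingCheck, and the padded count are assumptions inferred from the printed arrays, not TNL's API.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of an Ellpack layout (not TNL's class).
struct EllpackLayout
{
   int segmentCount;    // number of logical segments
   int width;           // slots per segment, including padding
   int paddedSegments;  // segment count rounded up (assumed to be 32 in the CUDA output)
   bool rowMajor;       // true: the slots of one segment are contiguous

   int
   storageSize() const
   {
      return rowMajor ? segmentCount * width : paddedSegments * width;
   }

   int
   globalIndex( int segmentIdx, int localIdx ) const
   {
      assert( segmentIdx >= 0 && segmentIdx < segmentCount && localIdx >= 0 && localIdx < width );
      return rowMajor ? segmentIdx * width + localIdx : localIdx * paddedSegments + segmentIdx;
   }
};

// Fills the storage the way the second traversal of the example does:
// slot localIdx of segment segmentIdx is written only if localIdx <= segmentIdx.
std::vector< double >
fillWithPaddingCheck( const EllpackLayout& layout )
{
   std::vector< double > data( layout.storageSize(), 0.0 );
   for( int seg = 0; seg < layout.segmentCount; seg++ )
      for( int local = 0; local < layout.width; local++ )
         if( local <= seg )
            data[ layout.globalIndex( seg, local ) ] = seg;
   return data;
}
```

Under these assumptions, EllpackLayout{ 5, 5, 5, true } gives a 25-element array in which segment 1 occupies indexes 5 to 9, while EllpackLayout{ 5, 5, 32, false } gives a 160-element array in which slot 1 of segment 1 sits at index 33.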

◆ forAllElementsIf()

template<typename Segments, typename Condition, typename Function>
void TNL::Algorithms::Segments::forAllElementsIf ( const Segments & segments,
Condition condition,
Function function,
LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Iterates in parallel over all elements of all segments based on a condition.

See also: Overview of Segment Traversal Functions

For each segment, a condition lambda function is evaluated based on the segment index. If the condition lambda function returns true, all elements of the segment are traversed, and the specified lambda function is applied to each element. If the condition lambda function returns false, the segment is skipped.

Template Parameters
Segments: The type of the segments.
Condition: The type of the condition lambda function.
Function: The type of the lambda function to be applied to each element.
Parameters
segments: The segments whose elements will be processed using the lambda function.
condition: Lambda function for condition checking. See Condition Check.
function: The lambda function to be applied to each element. See Full Form (With All Parameters) or Brief Form (Without Local Index).
launchConfig: The configuration of the launch - see TNL::Algorithms::Segments::LaunchConfiguration.
Example
#include <cstdlib>
#include <iostream>
#include <TNL/Containers/Vector.h>
#include <TNL/Algorithms/Segments/CSR.h>
#include <TNL/Algorithms/Segments/Ellpack.h>
#include <TNL/Algorithms/Segments/traverse.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Segments >
void
SegmentsExample()
{
   using Device = typename Segments::DeviceType;

   /***
    * Create segments with the given segment sizes.
    */
   Segments segments{ 1, 2, 3, 4, 5 };

   /***
    * Allocate an array for the data stored in the segments.
    */
   TNL::Containers::Array< double, Device > data( segments.getStorageSize(), 0.0 );

   /***
    * Insert data into the even-indexed segments, skipping padding elements.
    */
   auto data_view = data.getView();
   TNL::Algorithms::Segments::forAllElementsIf(
      segments,
      [ = ] __cuda_callable__( int segmentIdx ) -> bool
      {
         return segmentIdx % 2 == 0;  // Iterate only over even-indexed segments.
      },
      [ = ] __cuda_callable__( int segmentIdx, int localIdx, int globalIdx ) mutable
      {
         if( localIdx <= segmentIdx )
            data_view[ globalIdx ] = segmentIdx;
      } );

   /***
    * Print the data managed by the segments.
    */
   std::cout << "Array: " << data << '\n';
   auto fetch = [ = ] __cuda_callable__( int globalIdx ) -> double
   {
      return data_view[ globalIdx ];
   };
   std::cout << TNL::Algorithms::Segments::print( segments, fetch ) << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Example of CSR segments on host:\n";
   SegmentsExample< TNL::Algorithms::Segments::CSR< TNL::Devices::Host, int > >();

   std::cout << "Example of Ellpack segments on host:\n";
   SegmentsExample< TNL::Algorithms::Segments::Ellpack< TNL::Devices::Host, int > >();

#ifdef __CUDACC__
   std::cout << "Example of CSR segments on CUDA GPU:\n";
   SegmentsExample< TNL::Algorithms::Segments::CSR< TNL::Devices::Cuda, int > >();

   std::cout << "Example of Ellpack segments on CUDA GPU:\n";
   SegmentsExample< TNL::Algorithms::Segments::Ellpack< TNL::Devices::Cuda, int > >();
#endif
   return EXIT_SUCCESS;
}
Output
Example of CSR segments on host:
Array: [ 0, 0, 0, 2, 2, 2, 0, 0, 0, 0, 4, 4, 4, 4, 4 ]
Segment 0: [ 0 ]
Segment 1: [ 0, 0 ]
Segment 2: [ 2, 2, 2 ]
Segment 3: [ 0, 0, 0, 0 ]
Segment 4: [ 4, 4, 4, 4, 4 ]
Example of Ellpack segments on host:
Array: [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 2, 0, 0, 0, 0, 0, 0, 0, 4, 4, 4, 4, 4 ]
Segment 0: [ 0, 0, 0, 0, 0 ]
Segment 1: [ 0, 0, 0, 0, 0 ]
Segment 2: [ 2, 2, 2, 0, 0 ]
Segment 3: [ 0, 0, 0, 0, 0 ]
Segment 4: [ 4, 4, 4, 4, 4 ]
Example of CSR segments on CUDA GPU:
Array: [ 0, 0, 0, 2, 2, 2, 0, 0, 0, 0, 4, 4, 4, 4, 4 ]
Segment 0: [ 0 ]
Segment 1: [ 0, 0 ]
Segment 2: [ 2, 2, 2 ]
Segment 3: [ 0, 0, 0, 0 ]
Segment 4: [ 4, 4, 4, 4, 4 ]
Example of Ellpack segments on CUDA GPU:
Array: [ 0, 0, 2, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ]
Segment 0: [ 0, 0, 0, 0, 0 ]
Segment 1: [ 0, 0, 0, 0, 0 ]
Segment 2: [ 2, 2, 2, 0, 0 ]
Segment 3: [ 0, 0, 0, 0, 0 ]
Segment 4: [ 4, 4, 4, 4, 4 ]
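
The condition is evaluated once per segment, and a false result skips every element of that segment. A minimal sequential sketch of this rule over CSR-style offsets follows. The namespace sketch and the helper fillEvenSegments are hypothetical names used only for illustration; the real function runs in parallel and supports other segment formats.

```cpp
#include <cassert>
#include <vector>

namespace sketch {

// Sequential stand-in for a conditional element traversal (not TNL's implementation).
// offsets[ i ] and offsets[ i + 1 ] delimit segment i, as in a CSR layout.
template< typename Condition, typename Function >
void
forAllElementsIf( const std::vector< int >& offsets, Condition condition, Function function )
{
   assert( ! offsets.empty() );
   const int segmentCount = static_cast< int >( offsets.size() ) - 1;
   for( int segmentIdx = 0; segmentIdx < segmentCount; segmentIdx++ ) {
      if( ! condition( segmentIdx ) )
         continue;  // the whole segment is skipped
      for( int globalIdx = offsets[ segmentIdx ]; globalIdx < offsets[ segmentIdx + 1 ]; globalIdx++ )
         function( segmentIdx, globalIdx - offsets[ segmentIdx ], globalIdx );
   }
}

}  // namespace sketch

// Mirrors the CSR case of the example: only even-indexed segments are written.
std::vector< double >
fillEvenSegments( const std::vector< int >& offsets )
{
   assert( ! offsets.empty() );
   std::vector< double > data( offsets.back(), 0.0 );
   sketch::forAllElementsIf(
      offsets,
      []( int segmentIdx ) { return segmentIdx % 2 == 0; },
      [ &data ]( int segmentIdx, int localIdx, int globalIdx ) {
         if( localIdx <= segmentIdx )
            data[ globalIdx ] = segmentIdx;
      } );
   return data;
}
```

Calling fillEvenSegments with the offsets { 0, 1, 3, 6, 10, 15 } returns the same array as the CSR output above.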

◆ forAllSegments()

template<typename Segments, typename Function>
void TNL::Algorithms::Segments::forAllSegments ( const Segments & segments,
Function && function,
LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Iterates in parallel over all segments and applies the given lambda function to each segment.

See also: Overview of Segment Traversal Functions

Template Parameters
Segments: The type of the segments.
Function: The type of the lambda function to be executed on each segment.
Parameters
segments: The segments on which the lambda function will be applied.
function: The lambda function to be applied to each segment. See Segment View Lambda.
launchConfig: The configuration of the launch - see TNL::Algorithms::Segments::LaunchConfiguration.
Example
#include <cstdlib>
#include <iostream>
#include <TNL/Containers/Vector.h>
#include <TNL/Algorithms/Segments/CSR.h>
#include <TNL/Algorithms/Segments/Ellpack.h>
#include <TNL/Algorithms/Segments/traverse.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Segments >
void
SegmentsExample()
{
   using Device = typename Segments::DeviceType;

   /***
    * Create segments with the given segment sizes.
    */
   Segments segments{ 1, 2, 3, 4, 5 };

   /***
    * Allocate an array for the data stored in the segments.
    */
   TNL::Containers::Array< double, Device > data( segments.getStorageSize(), 0.0 );

   /***
    * Store cumulative sums in each segment, skipping padding elements.
    */
   auto data_view = data.getView();
   using SegmentViewType = typename Segments::SegmentViewType;
   TNL::Algorithms::Segments::forAllSegments(
      segments,
      [ = ] __cuda_callable__( const SegmentViewType& segment ) mutable
      {
         double sum( 0.0 );
         for( auto element : segment )
            if( element.localIndex() <= element.segmentIndex() ) {
               sum += element.localIndex() + 1;
               data_view[ element.globalIndex() ] = sum;
            }
      } );

   /***
    * Print the data managed by the segments.
    */
   auto fetch = [ = ] __cuda_callable__( int globalIdx ) -> double
   {
      return data_view[ globalIdx ];
   };
   std::cout << TNL::Algorithms::Segments::print( segments, fetch ) << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Example of CSR segments on host:\n";
   SegmentsExample< TNL::Algorithms::Segments::CSR< TNL::Devices::Host, int > >();

   std::cout << "Example of Ellpack segments on host:\n";
   SegmentsExample< TNL::Algorithms::Segments::Ellpack< TNL::Devices::Host, int > >();

#ifdef __CUDACC__
   std::cout << "Example of CSR segments on CUDA GPU:\n";
   SegmentsExample< TNL::Algorithms::Segments::CSR< TNL::Devices::Cuda, int > >();

   std::cout << "Example of Ellpack segments on CUDA GPU:\n";
   SegmentsExample< TNL::Algorithms::Segments::Ellpack< TNL::Devices::Cuda, int > >();
#endif
   return EXIT_SUCCESS;
}
Output
Example of CSR segments on host:
Segment 0: [ 1 ]
Segment 1: [ 1, 3 ]
Segment 2: [ 1, 3, 6 ]
Segment 3: [ 1, 3, 6, 10 ]
Segment 4: [ 1, 3, 6, 10, 15 ]
Example of Ellpack segments on host:
Segment 0: [ 1, 0, 0, 0, 0 ]
Segment 1: [ 1, 3, 0, 0, 0 ]
Segment 2: [ 1, 3, 6, 0, 0 ]
Segment 3: [ 1, 3, 6, 10, 0 ]
Segment 4: [ 1, 3, 6, 10, 15 ]
Example of CSR segments on CUDA GPU:
Segment 0: [ 1 ]
Segment 1: [ 1, 3 ]
Segment 2: [ 1, 3, 6 ]
Segment 3: [ 1, 3, 6, 10 ]
Segment 4: [ 1, 3, 6, 10, 15 ]
Example of Ellpack segments on CUDA GPU:
Segment 0: [ 1, 0, 0, 0, 0 ]
Segment 1: [ 1, 3, 0, 0, 0 ]
Segment 2: [ 1, 3, 6, 0, 0 ]
Segment 3: [ 1, 3, 6, 10, 0 ]
Segment 4: [ 1, 3, 6, 10, 15 ]
#include <cstdlib>
#include <iostream>
#include <TNL/Containers/Vector.h>
#include <TNL/Algorithms/Segments/CSR.h>
#include <TNL/Algorithms/Segments/traverse.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
SegmentsExample()
{
   using SegmentsType = typename TNL::Algorithms::Segments::CSR< Device, int >;

   /***
    * Create segments with the given segment sizes.
    */
   SegmentsType segments{ 1, 2, 3, 4, 5 };

   /***
    * Allocate an array for the data stored in the segments.
    */
   TNL::Containers::Array< double, Device > data( segments.getStorageSize(), 0.0 );

   /***
    * Insert data into particular segments.
    */
   auto data_view = data.getView();
   TNL::Algorithms::Segments::forAllElements(
      segments,
      [ = ] __cuda_callable__( int segmentIdx, int localIdx, int globalIdx ) mutable
      {
         data_view[ globalIdx ] = localIdx + 1;
      } );

   /***
    * Print the data segment by segment.
    */
   std::cout << "Values of elements after initial setup:\n";
   auto fetch = [ = ] __cuda_callable__( int globalIdx ) -> double
   {
      return data_view[ globalIdx ];
   };
   std::cout << TNL::Algorithms::Segments::print( segments, fetch ) << '\n';

   /***
    * Divide the elements in each segment by the sum of all elements in the segment.
    */
   using SegmentViewType = typename SegmentsType::SegmentViewType;
   TNL::Algorithms::Segments::forAllSegments(
      segments,
      [ = ] __cuda_callable__( const SegmentViewType& segment ) mutable
      {
         // Compute the sum first ...
         double sum = 0.0;
         for( auto element : segment )
            if( element.localIndex() <= element.segmentIndex() )
               sum += data_view[ element.globalIndex() ];
         // ... then divide all elements by it.
         for( auto element : segment )
            if( element.localIndex() <= element.segmentIndex() )
               data_view[ element.globalIndex() ] /= sum;
      } );

   /***
    * Print the data managed by the segments.
    */
   std::cout << "Value of elements after dividing by sum in each segment:\n";
   std::cout << TNL::Algorithms::Segments::print( segments, fetch ) << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Example of CSR segments on host:\n";
   SegmentsExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << "Example of CSR segments on CUDA GPU:\n";
   SegmentsExample< TNL::Devices::Cuda >();
#endif
   return EXIT_SUCCESS;
}
Output
Example of CSR segments on host:
Values of elements after initial setup:
Segment 0: [ 1 ]
Segment 1: [ 1, 2 ]
Segment 2: [ 1, 2, 3 ]
Segment 3: [ 1, 2, 3, 4 ]
Segment 4: [ 1, 2, 3, 4, 5 ]
Value of elements after dividing by sum in each segment:
Segment 0: [ 1 ]
Segment 1: [ 0.333333, 0.666667 ]
Segment 2: [ 0.166667, 0.333333, 0.5 ]
Segment 3: [ 0.1, 0.2, 0.3, 0.4 ]
Segment 4: [ 0.0666667, 0.133333, 0.2, 0.266667, 0.333333 ]
Example of CSR segments on CUDA GPU:
Values of elements after initial setup:
Segment 0: [ 1 ]
Segment 1: [ 1, 2 ]
Segment 2: [ 1, 2, 3 ]
Segment 3: [ 1, 2, 3, 4 ]
Segment 4: [ 1, 2, 3, 4, 5 ]
Value of elements after dividing by sum in each segment:
Segment 0: [ 1 ]
Segment 1: [ 0.333333, 0.666667 ]
Segment 2: [ 0.166667, 0.333333, 0.5 ]
Segment 3: [ 0.1, 0.2, 0.3, 0.4 ]
Segment 4: [ 0.0666667, 0.133333, 0.2, 0.266667, 0.333333 ]
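
In contrast to the element traversals, the lambda here is called once per segment and may make several passes over that segment, which is what the normalization above relies on. A minimal sequential sketch over CSR-style offsets is given below. SegmentViewSketch, forAllSegmentsSketch, and normalizeSegments are hypothetical names; TNL's SegmentViewType has a richer interface, and the zero-sum guard is an addition of this sketch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a segment view (not TNL's SegmentViewType).
struct SegmentViewSketch
{
   int segmentIdx;
   int begin;  // global index of the first element
   int end;    // one past the global index of the last element
};

// Sequential model: the function is called once per segment, not once per element.
template< typename Function >
void
forAllSegmentsSketch( const std::vector< int >& offsets, Function function )
{
   assert( ! offsets.empty() );
   for( int seg = 0; seg + 1 < static_cast< int >( offsets.size() ); seg++ )
      function( SegmentViewSketch{ seg, offsets[ seg ], offsets[ seg + 1 ] } );
}

// Divides every element by the sum of its segment, as in the example above.
void
normalizeSegments( const std::vector< int >& offsets, std::vector< double >& data )
{
   forAllSegmentsSketch( offsets, [ &data ]( const SegmentViewSketch& segment ) {
      // First pass: accumulate the sum of the segment.
      double sum = 0.0;
      for( int i = segment.begin; i < segment.end; i++ )
         sum += data[ i ];
      if( sum == 0.0 )
         return;  // guard added in this sketch; the original example has no such check
      // Second pass: divide all elements by the sum.
      for( int i = segment.begin; i < segment.end; i++ )
         data[ i ] /= sum;
   } );
}
```

For offsets { 0, 1, 3, 6 } and data { 1, 1, 2, 1, 2, 3 }, the data becomes approximately { 1, 0.333333, 0.666667, 0.166667, 0.333333, 0.5 }, in line with segments 0 to 2 of the output above.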

◆ forAllSegmentsIf()

template<typename Segments, typename SegmentCondition, typename Function>
void TNL::Algorithms::Segments::forAllSegmentsIf ( const Segments & segments,
SegmentCondition && segmentCondition,
Function && function,
LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Iterates in parallel over all segments, applying a condition to determine whether each segment should be processed.

See also: Overview of Segment Traversal Functions

For each segment, a condition lambda function is evaluated based on the segment index. If the condition lambda function returns true, the specified lambda function is executed for the segment. If the condition lambda function returns false, the segment is skipped.

Template Parameters
Segments: The type of the segments.
SegmentCondition: The type of the condition lambda function.
Function: The type of the lambda function to be executed on each segment.
Parameters
segments: The segments on which the lambda function will be applied.
segmentCondition: Lambda function for condition checking. See Condition Check.
function: The lambda function to be applied to each segment. See Segment View Lambda.
launchConfig: The configuration of the launch - see TNL::Algorithms::Segments::LaunchConfiguration.
Example
#include <cstdlib>
#include <iostream>
#include <TNL/Containers/Vector.h>
#include <TNL/Algorithms/Segments/CSR.h>
#include <TNL/Algorithms/Segments/Ellpack.h>
#include <TNL/Algorithms/Segments/traverse.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Segments >
void
SegmentsExample()
{
   using Device = typename Segments::DeviceType;

   /***
    * Create segments with the given segment sizes.
    */
   Segments segments{ 1, 2, 3, 4, 5 };

   /***
    * Allocate an array for the data stored in the segments.
    */
   TNL::Containers::Array< double, Device > data( segments.getStorageSize(), 0.0 );

   /***
    * Insert data into particular segments.
    */
   auto data_view = data.getView();
   TNL::Algorithms::Segments::forAllElements(
      segments,
      [ = ] __cuda_callable__( int segmentIdx, int localIdx, int globalIdx ) mutable
      {
         data_view[ globalIdx ] = localIdx + 1;
      } );

   /***
    * Print the data segment by segment.
    */
   std::cout << "Values of elements after initial setup:\n";
   auto fetch = [ = ] __cuda_callable__( int globalIdx ) -> double
   {
      return data_view[ globalIdx ];
   };
   std::cout << TNL::Algorithms::Segments::print( segments, fetch ) << '\n';

   /***
    * Compute cumulative sums in the even-indexed segments.
    */
   using SegmentViewType = typename Segments::SegmentViewType;
   TNL::Algorithms::Segments::forAllSegmentsIf(
      segments,
      [ = ] __cuda_callable__( const int segmentIdx ) -> bool
      {
         return segmentIdx % 2 == 0;  // Iterate only over even-indexed segments.
      },
      [ = ] __cuda_callable__( const SegmentViewType& segment ) mutable
      {
         double sum( 0.0 );
         for( auto element : segment )
            if( element.localIndex() <= element.segmentIndex() ) {
               sum += element.localIndex() + 1;
               data_view[ element.globalIndex() ] = sum;
            }
      } );

   /***
    * Print the data managed by the segments.
    */
   std::cout << TNL::Algorithms::Segments::print( segments, fetch ) << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Example of CSR segments on host:\n";
   SegmentsExample< TNL::Algorithms::Segments::CSR< TNL::Devices::Host, int > >();

   std::cout << "Example of Ellpack segments on host:\n";
   SegmentsExample< TNL::Algorithms::Segments::Ellpack< TNL::Devices::Host, int > >();

#ifdef __CUDACC__
   std::cout << "Example of CSR segments on CUDA GPU:\n";
   SegmentsExample< TNL::Algorithms::Segments::CSR< TNL::Devices::Cuda, int > >();

   std::cout << "Example of Ellpack segments on CUDA GPU:\n";
   SegmentsExample< TNL::Algorithms::Segments::Ellpack< TNL::Devices::Cuda, int > >();
#endif
   return EXIT_SUCCESS;
}
Output
Example of CSR segments on host:
Segment 0: [ 1 ]
Segment 1: [ 1, 3 ]
Segment 2: [ 1, 3, 6 ]
Segment 3: [ 1, 3, 6, 10 ]
Segment 4: [ 1, 3, 6, 10, 15 ]
Example of Ellpack segments on host:
Segment 0: [ 1, 0, 0, 0, 0 ]
Segment 1: [ 1, 3, 0, 0, 0 ]
Segment 2: [ 1, 3, 6, 0, 0 ]
Segment 3: [ 1, 3, 6, 10, 0 ]
Segment 4: [ 1, 3, 6, 10, 15 ]
Example of CSR segments on CUDA GPU:
Segment 0: [ 1 ]
Segment 1: [ 1, 3 ]
Segment 2: [ 1, 3, 6 ]
Segment 3: [ 1, 3, 6, 10 ]
Segment 4: [ 1, 3, 6, 10, 15 ]
Example of Ellpack segments on CUDA GPU:
Segment 0: [ 1, 0, 0, 0, 0 ]
Segment 1: [ 1, 3, 0, 0, 0 ]
Segment 2: [ 1, 3, 6, 0, 0 ]
Segment 3: [ 1, 3, 6, 10, 0 ]
Segment 4: [ 1, 3, 6, 10, 15 ]
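
According to the description, a segment for which the condition returns false is skipped entirely, so it keeps whatever values it held before the call. The sketch below models that rule sequentially for CSR-style offsets and follows the steps of the listing: initial values localIdx + 1, then running sums in even-indexed segments only. The names forAllSegmentsIfSketch and cumulativeSumsInEvenSegments are hypothetical, and the code is a reading of the described semantics, not TNL's implementation.

```cpp
#include <cassert>
#include <vector>

// Sequential model: visit segment i only if condition( i ) holds.
// The callback receives the segment index and its half-open range of global indexes.
template< typename Condition, typename Function >
void
forAllSegmentsIfSketch( const std::vector< int >& offsets, Condition condition, Function function )
{
   assert( ! offsets.empty() );
   for( int seg = 0; seg + 1 < static_cast< int >( offsets.size() ); seg++ )
      if( condition( seg ) )
         function( seg, offsets[ seg ], offsets[ seg + 1 ] );
}

// Follows the listing: initial value localIdx + 1, then running sums in even segments.
std::vector< double >
cumulativeSumsInEvenSegments( const std::vector< int >& offsets )
{
   assert( ! offsets.empty() );
   std::vector< double > data( offsets.back(), 0.0 );
   for( int seg = 0; seg + 1 < static_cast< int >( offsets.size() ); seg++ )
      for( int i = offsets[ seg ]; i < offsets[ seg + 1 ]; i++ )
         data[ i ] = i - offsets[ seg ] + 1;

   forAllSegmentsIfSketch(
      offsets,
      []( int seg ) { return seg % 2 == 0; },
      [ &data ]( int /*seg*/, int begin, int end ) {
         double sum = 0.0;
         for( int i = begin; i < end; i++ ) {
            sum += i - begin + 1;
            data[ i ] = sum;
         }
      } );
   return data;
}
```

For offsets { 0, 1, 3, 6, 10 } this helper returns { 1, 1, 2, 1, 3, 6, 1, 2, 3, 4 }: in this model segments 1 and 3 retain their initial values, while segments 0 and 2 hold running sums.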

◆ forElements() [1/2]

template<typename Segments, typename Array, typename Function>
void TNL::Algorithms::Segments::forElements ( const Segments & segments,
const Array & segmentIndexes,
Function function,
LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Iterates in parallel over all elements of segments with the given indexes and applies the specified lambda function.

See also: Overview of Segment Traversal Functions

Template Parameters
Segments: The type of the segments.
Array: The type of the array containing the indexes of the segments to iterate over. This can be containers such as TNL::Containers::Array, TNL::Containers::ArrayView, TNL::Containers::Vector, or TNL::Containers::VectorView.
Function: The type of the lambda function to be applied to each element.
Parameters
segments: The segments whose elements will be processed using the lambda function.
segmentIndexes: The array containing the indexes of the segments to iterate over.
function: The lambda function to be applied to each element. See Full Form (With All Parameters) or Brief Form (Without Local Index).
launchConfig: The configuration of the launch - see TNL::Algorithms::Segments::LaunchConfiguration.
Example
#include <cstdlib>
#include <iostream>
#include <TNL/Containers/Vector.h>
#include <TNL/Algorithms/Segments/CSR.h>
#include <TNL/Algorithms/Segments/Ellpack.h>
#include <TNL/Algorithms/Segments/traverse.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Segments >
void
SegmentsExample()
{
   using Device = typename Segments::DeviceType;

   /***
    * Create segments with the given segment sizes.
    */
   Segments segments{ 1, 2, 3, 4, 5 };

   /***
    * Allocate an array for the data stored in the segments.
    */
   TNL::Containers::Array< double, Device > data( segments.getStorageSize(), 0.0 );

   /***
    * Create an array with the indexes of the segments we want to iterate over.
    */
   TNL::Containers::Array< int, Device > segmentIndexes{ 0, 2, 4 };

   /***
    * Insert data into the selected segments, skipping padding elements.
    */
   auto data_view = data.getView();
   TNL::Algorithms::Segments::forElements(
      segments,
      segmentIndexes,
      [ = ] __cuda_callable__( int segmentIdx, int localIdx, int globalIdx ) mutable
      {
         if( localIdx <= segmentIdx )
            data_view[ globalIdx ] = segmentIdx;
      } );

   /***
    * Print the data managed by the segments.
    */
   std::cout << "Array: " << data << '\n';
   auto fetch = [ = ] __cuda_callable__( int globalIdx ) -> double
   {
      return data_view[ globalIdx ];
   };
   std::cout << TNL::Algorithms::Segments::print( segments, fetch ) << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Example of CSR segments on host:\n";
   SegmentsExample< TNL::Algorithms::Segments::CSR< TNL::Devices::Host, int > >();

   std::cout << "Example of Ellpack segments on host:\n";
   SegmentsExample< TNL::Algorithms::Segments::Ellpack< TNL::Devices::Host, int > >();

#ifdef __CUDACC__
   std::cout << "Example of CSR segments on CUDA GPU:\n";
   SegmentsExample< TNL::Algorithms::Segments::CSR< TNL::Devices::Cuda, int > >();

   std::cout << "Example of Ellpack segments on CUDA GPU:\n";
   SegmentsExample< TNL::Algorithms::Segments::Ellpack< TNL::Devices::Cuda, int > >();
#endif
   return EXIT_SUCCESS;
}
Output
Example of CSR segments on host:
Array: [ 0, 0, 0, 2, 2, 2, 0, 0, 0, 0, 4, 4, 4, 4, 4 ]
Segment 0: [ 0 ]
Segment 1: [ 0, 0 ]
Segment 2: [ 2, 2, 2 ]
Segment 3: [ 0, 0, 0, 0 ]
Segment 4: [ 4, 4, 4, 4, 4 ]
Example of Ellpack segments on host:
Array: [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 2, 0, 0, 0, 0, 0, 0, 0, 4, 4, 4, 4, 4 ]
Segment 0: [ 0, 0, 0, 0, 0 ]
Segment 1: [ 0, 0, 0, 0, 0 ]
Segment 2: [ 2, 2, 2, 0, 0 ]
Segment 3: [ 0, 0, 0, 0, 0 ]
Segment 4: [ 4, 4, 4, 4, 4 ]
Example of CSR segments on CUDA GPU:
Array: [ 0, 0, 0, 2, 2, 2, 0, 0, 0, 0, 4, 4, 4, 4, 4 ]
Segment 0: [ 0 ]
Segment 1: [ 0, 0 ]
Segment 2: [ 2, 2, 2 ]
Segment 3: [ 0, 0, 0, 0 ]
Segment 4: [ 4, 4, 4, 4, 4 ]
Example of Ellpack segments on CUDA GPU:
Array: [ 0, 0, 2, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ]
Segment 0: [ 0, 0, 0, 0, 0 ]
Segment 1: [ 0, 0, 0, 0, 0 ]
Segment 2: [ 2, 2, 2, 0, 0 ]
Segment 3: [ 0, 0, 0, 0, 0 ]
Segment 4: [ 4, 4, 4, 4, 4 ]
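
Where the conditional traversals decide per segment with a predicate, this overload takes the segments to visit from an explicit index array. A minimal sequential sketch over CSR-style offsets follows. The names forElementsOfSegments and markListedSegments are hypothetical, and the sketch makes no claim about how TNL schedules the listed segments in parallel.

```cpp
#include <cassert>
#include <vector>

// Sequential model (not TNL's implementation) of traversing only the listed segments.
template< typename Function >
void
forElementsOfSegments( const std::vector< int >& offsets, const std::vector< int >& segmentIndexes, Function function )
{
   const int segmentCount = static_cast< int >( offsets.size() ) - 1;
   (void) segmentCount;  // only used by the assertion below
   for( int segmentIdx : segmentIndexes ) {
      assert( segmentIdx >= 0 && segmentIdx < segmentCount );
      for( int globalIdx = offsets[ segmentIdx ]; globalIdx < offsets[ segmentIdx + 1 ]; globalIdx++ )
         function( segmentIdx, globalIdx - offsets[ segmentIdx ], globalIdx );
   }
}

// Writes the segment index into every element of the listed segments.
std::vector< double >
markListedSegments( const std::vector< int >& offsets, const std::vector< int >& segmentIndexes )
{
   assert( ! offsets.empty() );
   std::vector< double > data( offsets.back(), 0.0 );
   forElementsOfSegments( offsets, segmentIndexes, [ &data ]( int segmentIdx, int /*localIdx*/, int globalIdx ) {
      data[ globalIdx ] = segmentIdx;
   } );
   return data;
}
```

With offsets { 0, 1, 3, 6, 10, 15 } and the index list { 0, 2, 4 }, markListedSegments returns the array shown in the CSR output above. In this sketch the list does not have to be sorted.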

◆ forElements() [2/2]

template<typename Segments, typename IndexBegin, typename IndexEnd, typename Function>
void TNL::Algorithms::Segments::forElements ( const Segments & segments,
IndexBegin begin,
IndexEnd end,
Function && function,
LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Iterates in parallel over all elements in the given range of segments and applies the specified lambda function.

See also: Overview of Segment Traversal Functions

Template Parameters
Segments: The type of the segments.
IndexBegin: The type of the index defining the beginning of the interval [ begin, end ) of segments whose elements we want to process using the lambda function.
IndexEnd: The type of the index defining the end of the interval [ begin, end ) of segments whose elements we want to process using the lambda function.
Function: The type of the lambda function to be applied to each element.
Parameters
segments: The segments whose elements will be processed using the lambda function.
begin: The beginning of the interval [ begin, end ) of segments whose elements will be processed using the lambda function.
end: The end of the interval [ begin, end ) of segments whose elements will be processed using the lambda function.
function: The lambda function to be applied to each element. See Full Form (With All Parameters) or Brief Form (Without Local Index).
launchConfig: The configuration of the launch - see TNL::Algorithms::Segments::LaunchConfiguration.
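
The interval [ begin, end ) is half-open: segment begin is processed, segment end is not, and begin equal to end selects nothing. A minimal sequential sketch of this convention over CSR-style offsets is shown below. The name forElementsInRange is hypothetical and the code is not TNL's implementation.

```cpp
#include <cassert>
#include <vector>

// Sequential model of a traversal restricted to segments begin, ..., end - 1.
// offsets[ i ] and offsets[ i + 1 ] delimit segment i, as in a CSR layout.
template< typename Function >
void
forElementsInRange( const std::vector< int >& offsets, int begin, int end, Function function )
{
   const int segmentCount = static_cast< int >( offsets.size() ) - 1;
   (void) segmentCount;  // only used by the assertion below
   assert( 0 <= begin && begin <= end && end <= segmentCount );
   for( int segmentIdx = begin; segmentIdx < end; segmentIdx++ )
      for( int globalIdx = offsets[ segmentIdx ]; globalIdx < offsets[ segmentIdx + 1 ]; globalIdx++ )
         function( segmentIdx, globalIdx - offsets[ segmentIdx ], globalIdx );
}
```

For offsets { 0, 1, 3, 6, 10, 15 }, the range [ 1, 3 ) visits the global indexes 1 to 5, that is, all elements of segments 1 and 2, and passing 0 and the number of segments visits every element.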
Example
#include <cstdlib>
#include <iostream>
#include <TNL/Containers/Vector.h>
#include <TNL/Algorithms/Segments/CSR.h>
#include <TNL/Algorithms/Segments/Ellpack.h>
#include <TNL/Algorithms/Segments/traverse.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Segments >
void
SegmentsExample()
{
   using Device = typename Segments::DeviceType;

   /***
    * Create segments with the given segment sizes.
    */
   Segments segments{ 1, 2, 3, 4, 5 };

   /***
    * Allocate an array for the data stored in the segments.
    */
   TNL::Containers::Array< double, Device > data( segments.getStorageSize(), 0.0 );

   /***
    * Insert data into all segments with no check, so padding elements are written too.
    */
   auto data_view = data.getView();
   TNL::Algorithms::Segments::forAllElements(
      segments,
      [ = ] __cuda_callable__( int segmentIdx, int localIdx, int globalIdx ) mutable
      {
         data_view[ globalIdx ] = segmentIdx;
      } );

   /***
    * Print the data managed by the segments.
    */
   std::cout << "Data setup with no check ...\n";
   std::cout << "Array: " << data << '\n';
   auto fetch = [ = ] __cuda_callable__( int globalIdx ) -> double
   {
      return data_view[ globalIdx ];
   };
   std::cout << TNL::Algorithms::Segments::print( segments, fetch ) << '\n';

   /***
    * Insert data again, this time skipping padding elements.
    */
   data = 0.0;
   TNL::Algorithms::Segments::forAllElements(
      segments,
      [ = ] __cuda_callable__( int segmentIdx, int localIdx, int globalIdx ) mutable
      {
         if( localIdx <= segmentIdx )
            data_view[ globalIdx ] = segmentIdx;
      } );

   /***
    * Print the data managed by the segments.
    */
   std::cout << "Data setup with check for padding elements...\n";
   std::cout << "Array: " << data << '\n';
   std::cout << TNL::Algorithms::Segments::print( segments, fetch ) << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Example of CSR segments on host:\n";
   SegmentsExample< TNL::Algorithms::Segments::CSR< TNL::Devices::Host, int > >();

   std::cout << "Example of Ellpack segments on host:\n";
   SegmentsExample< TNL::Algorithms::Segments::Ellpack< TNL::Devices::Host, int > >();

#ifdef __CUDACC__
   std::cout << "Example of CSR segments on CUDA GPU:\n";
   SegmentsExample< TNL::Algorithms::Segments::CSR< TNL::Devices::Cuda, int > >();

   std::cout << "Example of Ellpack segments on CUDA GPU:\n";
   SegmentsExample< TNL::Algorithms::Segments::Ellpack< TNL::Devices::Cuda, int > >();
#endif
   return EXIT_SUCCESS;
}
Output
Example of CSR segments on host:
Data setup with no check ...
Array: [ 0, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 4 ]
Segment 0: [ 0 ]
Segment 1: [ 1, 1 ]
Segment 2: [ 2, 2, 2 ]
Segment 3: [ 3, 3, 3, 3 ]
Segment 4: [ 4, 4, 4, 4, 4 ]
Data setup with check for padding elements...
Array: [ 0, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 4 ]
Segment 0: [ 0 ]
Segment 1: [ 1, 1 ]
Segment 2: [ 2, 2, 2 ]
Segment 3: [ 3, 3, 3, 3 ]
Segment 4: [ 4, 4, 4, 4, 4 ]
Example of Ellpack segments on host:
Data setup with no check ...
Array: [ 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4 ]
Segment 0: [ 0, 0, 0, 0, 0 ]
Segment 1: [ 1, 1, 1, 1, 1 ]
Segment 2: [ 2, 2, 2, 2, 2 ]
Segment 3: [ 3, 3, 3, 3, 3 ]
Segment 4: [ 4, 4, 4, 4, 4 ]
Data setup with check for padding elements...
Array: [ 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 2, 2, 2, 0, 0, 3, 3, 3, 3, 0, 4, 4, 4, 4, 4 ]
Segment 0: [ 0, 0, 0, 0, 0 ]
Segment 1: [ 1, 1, 0, 0, 0 ]
Segment 2: [ 2, 2, 2, 0, 0 ]
Segment 3: [ 3, 3, 3, 3, 0 ]
Segment 4: [ 4, 4, 4, 4, 4 ]
Example of CSR segments on CUDA GPU:
Data setup with no check ...
Array: [ 0, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 4 ]
Segment 0: [ 0 ]
Segment 1: [ 1, 1 ]
Segment 2: [ 2, 2, 2 ]
Segment 3: [ 3, 3, 3, 3 ]
Segment 4: [ 4, 4, 4, 4, 4 ]
Data setup with check for padding elements...
Array: [ 0, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 4 ]
Segment 0: [ 0 ]
Segment 1: [ 1, 1 ]
Segment 2: [ 2, 2, 2 ]
Segment 3: [ 3, 3, 3, 3 ]
Segment 4: [ 4, 4, 4, 4, 4 ]
Example of Ellpack segments on CUDA GPU:
Data setup with no check ...
Array: [ 0, 1, 2, 3, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 3, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 3, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 3, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 3, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ]
Segment 0: [ 0, 0, 0, 0, 0 ]
Segment 1: [ 1, 1, 1, 1, 1 ]
Segment 2: [ 2, 2, 2, 2, 2 ]
Segment 3: [ 3, 3, 3, 3, 3 ]
Segment 4: [ 4, 4, 4, 4, 4 ]
Data setup with check for padding elements...
Array: [ 0, 1, 2, 3, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 3, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 3, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ]
Segment 0: [ 0, 0, 0, 0, 0 ]
Segment 1: [ 1, 1, 0, 0, 0 ]
Segment 2: [ 2, 2, 2, 0, 0 ]
Segment 3: [ 3, 3, 3, 3, 0 ]
Segment 4: [ 4, 4, 4, 4, 4 ]
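
The example above traverses all segments. To illustrate what the interval [ begin, end ) and the three lambda arguments ( segmentIdx, localIdx, globalIdx ) mean, the following is a minimal serial sketch in plain C++. It does not use TNL: CsrModel, forElementsModel and fillRange are hypothetical names introduced here, and the offsets-based layout is an assumption that mirrors the CSR output printed above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical serial stand-in for CSR segments (not a TNL type):
// offsets[ i ] is the global index of the first element of segment i,
// offsets.back() is the total storage size.
struct CsrModel
{
   std::vector< int > offsets;

   explicit CsrModel( const std::vector< int >& sizes )
   : offsets( sizes.size() + 1, 0 )
   {
      for( std::size_t i = 0; i < sizes.size(); i++ )
         offsets[ i + 1 ] = offsets[ i ] + sizes[ i ];
   }

   int getSegmentsCount() const { return static_cast< int >( offsets.size() ) - 1; }
   int getStorageSize() const { return offsets.back(); }
};

// Visits every element of the segments in [ begin, end ) and passes
// ( segmentIdx, localIdx, globalIdx ) to f, one element at a time.
template< typename Function >
void
forElementsModel( const CsrModel& segments, int begin, int end, Function&& f )
{
   assert( 0 <= begin && begin <= end && end <= segments.getSegmentsCount() );
   for( int segmentIdx = begin; segmentIdx < end; segmentIdx++ ) {
      const int first = segments.offsets[ segmentIdx ];
      const int size = segments.offsets[ segmentIdx + 1 ] - first;
      for( int localIdx = 0; localIdx < size; localIdx++ )
         f( segmentIdx, localIdx, first + localIdx );
   }
}

// Writes segmentIdx into every element of the segments in [ begin, end ).
std::vector< double >
fillRange( const std::vector< int >& sizes, int begin, int end )
{
   CsrModel segments( sizes );
   std::vector< double > data( segments.getStorageSize(), 0.0 );
   forElementsModel( segments, begin, end,
      [ &data ]( int segmentIdx, int /* localIdx */, int globalIdx )
      {
         data[ globalIdx ] = segmentIdx;
      } );
   return data;
}
```

Calling fillRange( { 1, 2, 3, 4, 5 }, 0, 5 ) reproduces the CSR array shown in the output, while fillRange( { 1, 2, 3, 4, 5 }, 1, 3 ) touches only segments 1 and 2 and leaves the remaining entries at zero. The real function runs the lambda in parallel, so the visiting order of this sketch should not be relied upon.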

◆ forElementsIf()

template<typename Segments, typename IndexBegin, typename IndexEnd, typename Condition, typename Function>
void TNL::Algorithms::Segments::forElementsIf ( const Segments & segments,
IndexBegin begin,
IndexEnd end,
Condition condition,
Function function,
LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Iterates in parallel over all elements in a given range of segments based on a condition.

See also: Overview of Segment Traversal Functions

For each segment, a condition lambda function is evaluated based on the segment index. If the condition lambda function returns true, all elements of the segment are traversed, and the specified lambda function is applied to each element. If the condition lambda function returns false, the segment is skipped.

Template Parameters
Segments  The type of the segments.
IndexBegin  The type of the index defining the beginning of the interval [ begin, end ) of segments whose elements will be processed using the lambda function.
IndexEnd  The type of the index defining the end of the interval [ begin, end ) of segments whose elements will be processed using the lambda function.
Condition  The type of the condition lambda function.
Function  The type of the lambda function to be applied to each element.
Parameters
segments  The segments whose elements will be processed using the lambda function.
begin  The beginning of the interval [ begin, end ) of segments whose elements will be processed using the lambda function.
end  The end of the interval [ begin, end ) of segments whose elements will be processed using the lambda function.
condition  The lambda function for condition checking. See Condition Check.
function  The lambda function to be applied to each element. See Full Form (With All Parameters) or Brief Form (Without Local Index).
launchConfig  The configuration of the launch. See TNL::Algorithms::Segments::LaunchConfiguration.
Example
#include <cstdlib>
#include <iostream>
#include <TNL/Containers/Vector.h>
#include <TNL/Algorithms/Segments/CSR.h>
#include <TNL/Algorithms/Segments/Ellpack.h>
#include <TNL/Algorithms/Segments/traverse.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Segments >
void
SegmentsExample()
{
   using Device = typename Segments::DeviceType;

   /***
    * Create segments with the given segment sizes.
    */
   Segments segments{ 1, 2, 3, 4, 5 };

   /***
    * Allocate an array for the data stored in the segments.
    */
   TNL::Containers::Array< double, Device > data( segments.getStorageSize(), 0.0 );

   /***
    * Insert data into even-indexed segments only, skipping padding elements.
    */
   auto data_view = data.getView();
   TNL::Algorithms::Segments::forAllElementsIf(
      segments,
      [ = ] __cuda_callable__( int segmentIdx ) -> bool
      {
         return segmentIdx % 2 == 0;  // Iterate only over even-indexed segments.
      },
      [ = ] __cuda_callable__( int segmentIdx, int localIdx, int globalIdx ) mutable
      {
         if( localIdx <= segmentIdx )
            data_view[ globalIdx ] = segmentIdx;
      } );

   /***
    * Print the data managed by the segments.
    */
   std::cout << "Array: " << data << '\n';
   auto fetch = [ = ] __cuda_callable__( int globalIdx ) -> double
   {
      return data_view[ globalIdx ];
   };
   std::cout << TNL::Algorithms::Segments::print( segments, fetch ) << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Example of CSR segments on host:\n";
   SegmentsExample< TNL::Algorithms::Segments::CSR< TNL::Devices::Host, int > >();

   std::cout << "Example of Ellpack segments on host:\n";
   SegmentsExample< TNL::Algorithms::Segments::Ellpack< TNL::Devices::Host, int > >();

#ifdef __CUDACC__
   std::cout << "Example of CSR segments on CUDA GPU:\n";
   SegmentsExample< TNL::Algorithms::Segments::CSR< TNL::Devices::Cuda, int > >();

   std::cout << "Example of Ellpack segments on CUDA GPU:\n";
   SegmentsExample< TNL::Algorithms::Segments::Ellpack< TNL::Devices::Cuda, int > >();
#endif
   return EXIT_SUCCESS;
}
Output
Example of CSR segments on host:
Array: [ 0, 0, 0, 2, 2, 2, 0, 0, 0, 0, 4, 4, 4, 4, 4 ]
Segment 0: [ 0 ]
Segment 1: [ 0, 0 ]
Segment 2: [ 2, 2, 2 ]
Segment 3: [ 0, 0, 0, 0 ]
Segment 4: [ 4, 4, 4, 4, 4 ]
Example of Ellpack segments on host:
Array: [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 2, 0, 0, 0, 0, 0, 0, 0, 4, 4, 4, 4, 4 ]
Segment 0: [ 0, 0, 0, 0, 0 ]
Segment 1: [ 0, 0, 0, 0, 0 ]
Segment 2: [ 2, 2, 2, 0, 0 ]
Segment 3: [ 0, 0, 0, 0, 0 ]
Segment 4: [ 4, 4, 4, 4, 4 ]
Example of CSR segments on CUDA GPU:
Array: [ 0, 0, 0, 2, 2, 2, 0, 0, 0, 0, 4, 4, 4, 4, 4 ]
Segment 0: [ 0 ]
Segment 1: [ 0, 0 ]
Segment 2: [ 2, 2, 2 ]
Segment 3: [ 0, 0, 0, 0 ]
Segment 4: [ 4, 4, 4, 4, 4 ]
Example of Ellpack segments on CUDA GPU:
Array: [ 0, 0, 2, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ]
Segment 0: [ 0, 0, 0, 0, 0 ]
Segment 1: [ 0, 0, 0, 0, 0 ]
Segment 2: [ 2, 2, 2, 0, 0 ]
Segment 3: [ 0, 0, 0, 0, 0 ]
Segment 4: [ 4, 4, 4, 4, 4 ]
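
In the output above, the per-segment view is the same for Ellpack on host and on CUDA, yet the raw arrays differ: 25 entries on host versus a much longer, interleaved array on the GPU. The printed arrays are consistent with a row-major layout on host and a column-major layout padded to 32 segments on CUDA. The following plain C++ sketch models that reading together with the condition check. It is an inference from the printed data, not a description of TNL internals; EllpackModel, forElementsIfModel, fillEvenSegments and the alignment value 32 are assumptions introduced for illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical serial stand-in for Ellpack segments (not a TNL type):
// every segment owns `width` slots, padding included.
struct EllpackModel
{
   int segmentsCount;
   int width;         // slots per segment, i.e. the size of the longest segment
   int alignedCount;  // segmentsCount rounded up, used by the column-major layout
   bool rowMajor;

   int getStorageSize() const
   {
      return ( rowMajor ? segmentsCount : alignedCount ) * width;
   }

   int getGlobalIndex( int segmentIdx, int localIdx ) const
   {
      if( rowMajor )
         return segmentIdx * width + localIdx;    // a segment is contiguous
      return localIdx * alignedCount + segmentIdx;  // segments are interleaved
   }
};

// Evaluates condition( segmentIdx ) once per segment in [ begin, end ) and
// traverses all slots of the segment only when it returns true.
template< typename Condition, typename Function >
void
forElementsIfModel( const EllpackModel& segments, int begin, int end, Condition&& condition, Function&& f )
{
   assert( 0 <= begin && begin <= end && end <= segments.segmentsCount );
   for( int segmentIdx = begin; segmentIdx < end; segmentIdx++ ) {
      if( ! condition( segmentIdx ) )
         continue;
      for( int localIdx = 0; localIdx < segments.width; localIdx++ )
         f( segmentIdx, localIdx, segments.getGlobalIndex( segmentIdx, localIdx ) );
   }
}

// Same logic as the example: even segments only, padding slots skipped.
std::vector< double >
fillEvenSegments( const EllpackModel& segments )
{
   std::vector< double > data( segments.getStorageSize(), 0.0 );
   forElementsIfModel( segments, 0, segments.segmentsCount,
      []( int segmentIdx )
      {
         return segmentIdx % 2 == 0;
      },
      [ &data ]( int segmentIdx, int localIdx, int globalIdx )
      {
         if( localIdx <= segmentIdx )
            data[ globalIdx ] = segmentIdx;
      } );
   return data;
}
```

With EllpackModel{ 5, 5, 5, true } the sketch yields the 25-entry host array from the output. With EllpackModel{ 5, 5, 32, false } it yields 160 entries in which segment 2 appears at global indexes 2, 34 and 66, matching the pattern of the CUDA array. Code that uses globalIdx as passed to the lambda stays independent of such layout differences.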

◆ forSegments() [1/2]

template<typename Segments, typename Array, typename Function, typename T = std::enable_if_t< IsArrayType< Array >::value >>
void TNL::Algorithms::Segments::forSegments ( const Segments & segments,
const Array & segmentIndexes,
Function && function,
LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Iterates in parallel over segments with the given indexes and applies the specified lambda function to each segment.

See also: Overview of Segment Traversal Functions

Template Parameters
Segments  The type of the segments.
Array  The type of the array containing the indexes of the segments to iterate over. This can be a container such as TNL::Containers::Array, TNL::Containers::ArrayView, TNL::Containers::Vector, or TNL::Containers::VectorView.
Function  The type of the lambda function to be executed on each segment.
Parameters
segments  The segments on which the lambda function will be applied.
segmentIndexes  The array containing the indexes of the segments to iterate over.
function  The lambda function to be applied to each segment. See Segment View Lambda.
launchConfig  The configuration of the launch. See TNL::Algorithms::Segments::LaunchConfiguration.
Example
#include <cstdlib>
#include <iostream>
#include <TNL/Containers/Vector.h>
#include <TNL/Algorithms/Segments/CSR.h>
#include <TNL/Algorithms/Segments/Ellpack.h>
#include <TNL/Algorithms/Segments/traverse.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Segments >
void
SegmentsExample()
{
   using Device = typename Segments::DeviceType;

   /***
    * Create segments with the given segment sizes.
    */
   Segments segments{ 1, 2, 3, 4, 5 };

   /***
    * Allocate an array for the data stored in the segments.
    */
   TNL::Containers::Array< double, Device > data( segments.getStorageSize(), 0.0 );

   /***
    * Initialize every element (padding included) with localIdx + 1.
    */
   auto data_view = data.getView();
   TNL::Algorithms::Segments::forAllElements(
      segments,
      [ = ] __cuda_callable__( int segmentIdx, int localIdx, int globalIdx ) mutable
      {
         data_view[ globalIdx ] = localIdx + 1;
      } );

   /***
    * Print the data segment by segment.
    */
   std::cout << "Values of elements after initial setup:\n";
   auto fetch = [ = ] __cuda_callable__( int globalIdx ) -> double
   {
      return data_view[ globalIdx ];
   };
   std::cout << TNL::Algorithms::Segments::print( segments, fetch ) << '\n';

   /***
    * Create an array with the indexes of the segments we want to iterate over.
    */
   TNL::Containers::Array< int, Device > segmentIndexes{ 0, 2, 4 };

   /***
    * Compute cumulative sums in the selected segments, skipping padding elements.
    */
   using SegmentViewType = typename Segments::SegmentViewType;
   TNL::Algorithms::Segments::forSegments(
      segments,
      segmentIndexes,
      [ = ] __cuda_callable__( const SegmentViewType& segment ) mutable
      {
         double sum( 0.0 );
         for( auto element : segment )
            if( element.localIndex() <= element.segmentIndex() ) {
               sum += element.localIndex() + 1;
               data_view[ element.globalIndex() ] = sum;
            }
      } );

   /***
    * Print the data managed by the segments.
    */
   std::cout << TNL::Algorithms::Segments::print( segments, fetch ) << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Example of CSR segments on host:\n";
   SegmentsExample< TNL::Algorithms::Segments::CSR< TNL::Devices::Host, int > >();

   std::cout << "Example of Ellpack segments on host:\n";
   SegmentsExample< TNL::Algorithms::Segments::Ellpack< TNL::Devices::Host, int > >();

#ifdef __CUDACC__
   std::cout << "Example of CSR segments on CUDA GPU:\n";
   SegmentsExample< TNL::Algorithms::Segments::CSR< TNL::Devices::Cuda, int > >();

   std::cout << "Example of Ellpack segments on CUDA GPU:\n";
   SegmentsExample< TNL::Algorithms::Segments::Ellpack< TNL::Devices::Cuda, int > >();
#endif
   return EXIT_SUCCESS;
}
Output
Example of CSR segments on host:
Values of elements after initial setup:
Segment 0: [ 1 ]
Segment 1: [ 1, 2 ]
Segment 2: [ 1, 2, 3 ]
Segment 3: [ 1, 2, 3, 4 ]
Segment 4: [ 1, 2, 3, 4, 5 ]
Segment 0: [ 1 ]
Segment 1: [ 1, 2 ]
Segment 2: [ 1, 3, 6 ]
Segment 3: [ 1, 2, 3, 4 ]
Segment 4: [ 1, 3, 6, 10, 15 ]
Example of Ellpack segments on host:
Values of elements after initial setup:
Segment 0: [ 1, 2, 3, 4, 5 ]
Segment 1: [ 1, 2, 3, 4, 5 ]
Segment 2: [ 1, 2, 3, 4, 5 ]
Segment 3: [ 1, 2, 3, 4, 5 ]
Segment 4: [ 1, 2, 3, 4, 5 ]
Segment 0: [ 1, 2, 3, 4, 5 ]
Segment 1: [ 1, 2, 3, 4, 5 ]
Segment 2: [ 1, 3, 6, 4, 5 ]
Segment 3: [ 1, 2, 3, 4, 5 ]
Segment 4: [ 1, 3, 6, 10, 15 ]
Example of CSR segments on CUDA GPU:
Values of elements after initial setup:
Segment 0: [ 1 ]
Segment 1: [ 1, 2 ]
Segment 2: [ 1, 2, 3 ]
Segment 3: [ 1, 2, 3, 4 ]
Segment 4: [ 1, 2, 3, 4, 5 ]
Segment 0: [ 1 ]
Segment 1: [ 1, 2 ]
Segment 2: [ 1, 3, 6 ]
Segment 3: [ 1, 2, 3, 4 ]
Segment 4: [ 1, 3, 6, 10, 15 ]
Example of Ellpack segments on CUDA GPU:
Values of elements after initial setup:
Segment 0: [ 1, 2, 3, 4, 5 ]
Segment 1: [ 1, 2, 3, 4, 5 ]
Segment 2: [ 1, 2, 3, 4, 5 ]
Segment 3: [ 1, 2, 3, 4, 5 ]
Segment 4: [ 1, 2, 3, 4, 5 ]
Segment 0: [ 1, 2, 3, 4, 5 ]
Segment 1: [ 1, 2, 3, 4, 5 ]
Segment 2: [ 1, 3, 6, 4, 5 ]
Segment 3: [ 1, 2, 3, 4, 5 ]
Segment 4: [ 1, 3, 6, 10, 15 ]
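
Unlike the element-wise traversal, the lambda passed to forSegments receives a whole segment and decides itself how to walk through the elements, which is what makes a running sum possible. The following serial sketch in plain C++ imitates that pattern for a CSR-like layout. ElementModel, SegmentViewModel, forSegmentsModel and cumulativeSums are hypothetical names that only mimic the accessors used in the example (segmentIndex, localIndex, globalIndex); they are not TNL types.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical element type exposing the three accessors used in the example.
struct ElementModel
{
   int segment, local, global;

   int segmentIndex() const { return segment; }
   int localIndex() const { return local; }
   int globalIndex() const { return global; }
};

// Hypothetical segment view: a contiguous run of `size` elements
// starting at global index `offset`.
struct SegmentViewModel
{
   int segment, offset, size;

   // Materializes the elements so that a range-based for loop can be used.
   std::vector< ElementModel > elements() const
   {
      std::vector< ElementModel > result;
      for( int local = 0; local < size; local++ )
         result.push_back( { segment, local, offset + local } );
      return result;
   }
};

// Calls f once for each segment listed in segmentIndexes.
template< typename Function >
void
forSegmentsModel( const std::vector< int >& sizes, const std::vector< int >& segmentIndexes, Function&& f )
{
   std::vector< int > offsets( sizes.size() + 1, 0 );
   for( std::size_t i = 0; i < sizes.size(); i++ )
      offsets[ i + 1 ] = offsets[ i ] + sizes[ i ];
   for( int segmentIdx : segmentIndexes )
      f( SegmentViewModel{ segmentIdx, offsets[ segmentIdx ], sizes[ segmentIdx ] } );
}

// Initializes the data with localIdx + 1 and then replaces the listed
// segments by their cumulative sums.
std::vector< double >
cumulativeSums( const std::vector< int >& sizes, const std::vector< int >& segmentIndexes )
{
   std::vector< double > data;
   for( int size : sizes )
      for( int local = 0; local < size; local++ )
         data.push_back( local + 1 );

   forSegmentsModel( sizes, segmentIndexes,
      [ &data ]( const SegmentViewModel& segment )
      {
         double sum = 0.0;
         for( const ElementModel& element : segment.elements() ) {
            sum += element.localIndex() + 1;
            data[ element.globalIndex() ] = sum;
         }
      } );
   return data;
}
```

For sizes { 1, 2, 3, 4, 5 } and indexes { 0, 2, 4 } the sketch gives the same per-segment values as the CSR part of the output: segments 1 and 3 keep their initial values, while segments 2 and 4 hold running sums. The sketch needs no padding check because the CSR-like model has no padding elements.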

◆ forSegments() [2/2]

template<typename Segments, typename IndexBegin, typename IndexEnd, typename Function, typename T = std::enable_if_t< std::is_integral_v< IndexBegin > && std::is_integral_v< IndexEnd > >>
void TNL::Algorithms::Segments::forSegments ( const Segments & segments,
IndexBegin begin,
IndexEnd end,
Function && function,
LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Iterates in parallel over segments within the specified range of segment indexes and applies the given lambda function to each segment.

See also: Overview of Segment Traversal Functions

Template Parameters
Segments  The type of the segments.
IndexBegin  The type of the index defining the beginning of the interval [ begin, end ) of segments on which the lambda function will be applied.
IndexEnd  The type of the index defining the end of the interval [ begin, end ) of segments on which the lambda function will be applied.
Function  The type of the lambda function to be executed on each segment.
Parameters
segments  The segments on which the lambda function will be applied.
begin  The beginning of the interval [ begin, end ) of segments that will be processed using the lambda function.
end  The end of the interval [ begin, end ) of segments that will be processed using the lambda function.
function  The lambda function to be applied to each segment. See Segment View Lambda.
launchConfig  The configuration of the launch. See TNL::Algorithms::Segments::LaunchConfiguration.
Example
#include <cstdlib>
#include <iostream>
#include <TNL/Containers/Vector.h>
#include <TNL/Algorithms/Segments/CSR.h>
#include <TNL/Algorithms/Segments/Ellpack.h>
#include <TNL/Algorithms/Segments/traverse.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Segments >
void
SegmentsExample()
{
   using Device = typename Segments::DeviceType;

   /***
    * Create segments with the given segment sizes.
    */
   Segments segments{ 1, 2, 3, 4, 5 };

   /***
    * Allocate an array for the data stored in the segments.
    */
   TNL::Containers::Array< double, Device > data( segments.getStorageSize(), 0.0 );

   /***
    * Store cumulative sums of localIdx + 1 in every segment, skipping padding elements.
    */
   auto data_view = data.getView();
   using SegmentViewType = typename Segments::SegmentViewType;
   TNL::Algorithms::Segments::forAllSegments(
      segments,
      [ = ] __cuda_callable__( const SegmentViewType& segment ) mutable
      {
         double sum( 0.0 );
         for( auto element : segment )
            if( element.localIndex() <= element.segmentIndex() ) {
               sum += element.localIndex() + 1;
               data_view[ element.globalIndex() ] = sum;
            }
      } );

   /***
    * Print the data managed by the segments.
    */
   auto fetch = [ = ] __cuda_callable__( int globalIdx ) -> double
   {
      return data_view[ globalIdx ];
   };
   std::cout << TNL::Algorithms::Segments::print( segments, fetch ) << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Example of CSR segments on host:\n";
   SegmentsExample< TNL::Algorithms::Segments::CSR< TNL::Devices::Host, int > >();

   std::cout << "Example of Ellpack segments on host:\n";
   SegmentsExample< TNL::Algorithms::Segments::Ellpack< TNL::Devices::Host, int > >();

#ifdef __CUDACC__
   std::cout << "Example of CSR segments on CUDA GPU:\n";
   SegmentsExample< TNL::Algorithms::Segments::CSR< TNL::Devices::Cuda, int > >();

   std::cout << "Example of Ellpack segments on CUDA GPU:\n";
   SegmentsExample< TNL::Algorithms::Segments::Ellpack< TNL::Devices::Cuda, int > >();
#endif
   return EXIT_SUCCESS;
}
Output
Example of CSR segments on host:
Segment 0: [ 1 ]
Segment 1: [ 1, 3 ]
Segment 2: [ 1, 3, 6 ]
Segment 3: [ 1, 3, 6, 10 ]
Segment 4: [ 1, 3, 6, 10, 15 ]
Example of Ellpack segments on host:
Segment 0: [ 1, 0, 0, 0, 0 ]
Segment 1: [ 1, 3, 0, 0, 0 ]
Segment 2: [ 1, 3, 6, 0, 0 ]
Segment 3: [ 1, 3, 6, 10, 0 ]
Segment 4: [ 1, 3, 6, 10, 15 ]
Example of CSR segments on CUDA GPU:
Segment 0: [ 1 ]
Segment 1: [ 1, 3 ]
Segment 2: [ 1, 3, 6 ]
Segment 3: [ 1, 3, 6, 10 ]
Segment 4: [ 1, 3, 6, 10, 15 ]
Example of Ellpack segments on CUDA GPU:
Segment 0: [ 1, 0, 0, 0, 0 ]
Segment 1: [ 1, 3, 0, 0, 0 ]
Segment 2: [ 1, 3, 6, 0, 0 ]
Segment 3: [ 1, 3, 6, 10, 0 ]
Segment 4: [ 1, 3, 6, 10, 15 ]
#include <cstdlib>
#include <iostream>
#include <TNL/Containers/Vector.h>
#include <TNL/Algorithms/Segments/CSR.h>
#include <TNL/Algorithms/Segments/traverse.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device >
void
SegmentsExample()
{
   using SegmentsType = typename TNL::Algorithms::Segments::CSR< Device, int >;

   /***
    * Create segments with the given segment sizes.
    */
   SegmentsType segments{ 1, 2, 3, 4, 5 };

   /***
    * Allocate an array for the data stored in the segments.
    */
   TNL::Containers::Array< double, Device > data( segments.getStorageSize(), 0.0 );

   /***
    * Initialize every element with localIdx + 1.
    */
   auto data_view = data.getView();
   TNL::Algorithms::Segments::forAllElements(
      segments,
      [ = ] __cuda_callable__( int segmentIdx, int localIdx, int globalIdx ) mutable
      {
         data_view[ globalIdx ] = localIdx + 1;
      } );

   /***
    * Print the data segment by segment.
    */
   std::cout << "Values of elements after initial setup:\n";
   auto fetch = [ = ] __cuda_callable__( int globalIdx ) -> double
   {
      return data_view[ globalIdx ];
   };
   std::cout << TNL::Algorithms::Segments::print( segments, fetch ) << '\n';

   /***
    * Divide the elements in each segment by the sum of all elements in that segment.
    */
   using SegmentViewType = typename SegmentsType::SegmentViewType;
   TNL::Algorithms::Segments::forAllSegments(
      segments,
      [ = ] __cuda_callable__( const SegmentViewType& segment ) mutable
      {
         // Compute the sum first ...
         double sum = 0.0;
         for( auto element : segment )
            if( element.localIndex() <= element.segmentIndex() )
               sum += data_view[ element.globalIndex() ];
         // ... then divide all elements by it.
         for( auto element : segment )
            if( element.localIndex() <= element.segmentIndex() )
               data_view[ element.globalIndex() ] /= sum;
      } );

   /***
    * Print the data managed by the segments.
    */
   std::cout << "Value of elements after dividing by sum in each segment:\n";
   std::cout << TNL::Algorithms::Segments::print( segments, fetch ) << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Example of CSR segments on host:\n";
   SegmentsExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << "Example of CSR segments on CUDA GPU:\n";
   SegmentsExample< TNL::Devices::Cuda >();
#endif
   return EXIT_SUCCESS;
}
Output
Example of CSR segments on host:
Values of elements after initial setup:
Segment 0: [ 1 ]
Segment 1: [ 1, 2 ]
Segment 2: [ 1, 2, 3 ]
Segment 3: [ 1, 2, 3, 4 ]
Segment 4: [ 1, 2, 3, 4, 5 ]
Value of elements after dividing by sum in each segment:
Segment 0: [ 1 ]
Segment 1: [ 0.333333, 0.666667 ]
Segment 2: [ 0.166667, 0.333333, 0.5 ]
Segment 3: [ 0.1, 0.2, 0.3, 0.4 ]
Segment 4: [ 0.0666667, 0.133333, 0.2, 0.266667, 0.333333 ]
Example of CSR segments on CUDA GPU:
Values of elements after initial setup:
Segment 0: [ 1 ]
Segment 1: [ 1, 2 ]
Segment 2: [ 1, 2, 3 ]
Segment 3: [ 1, 2, 3, 4 ]
Segment 4: [ 1, 2, 3, 4, 5 ]
Value of elements after dividing by sum in each segment:
Segment 0: [ 1 ]
Segment 1: [ 0.333333, 0.666667 ]
Segment 2: [ 0.166667, 0.333333, 0.5 ]
Segment 3: [ 0.1, 0.2, 0.3, 0.4 ]
Segment 4: [ 0.0666667, 0.133333, 0.2, 0.266667, 0.333333 ]
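
The normalization example above relies on the segment lambda being able to pass over the same segment twice: once to accumulate the sum and once to divide. The following is a minimal serial sketch of that two-pass idea in plain C++. It assumes a CSR-like layout in which segment i occupies sizes[ i ] consecutive entries, normalizeSegments is a hypothetical helper, and the guard against a zero sum is an addition that the example above does not contain.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Divides every element by the sum of its segment. `sizes` describes a
// CSR-like layout: segment i occupies sizes[ i ] consecutive entries of data.
std::vector< double >
normalizeSegments( const std::vector< int >& sizes, std::vector< double > data )
{
   std::size_t offset = 0;
   for( int size : sizes ) {
      assert( size >= 0 );
      const std::size_t end = offset + static_cast< std::size_t >( size );
      assert( end <= data.size() );

      // First pass: accumulate the sum of the segment.
      double sum = 0.0;
      for( std::size_t i = offset; i < end; i++ )
         sum += data[ i ];

      // Second pass: divide, unless the sum is zero (added safeguard).
      if( sum != 0.0 )
         for( std::size_t i = offset; i < end; i++ )
            data[ i ] /= sum;

      offset = end;
   }
   return data;
}
```

For sizes { 1, 2, 3 } and data { 1, 1, 2, 1, 2, 3 } the segments become [ 1 ], [ 1/3, 2/3 ] and [ 1/6, 1/3, 1/2 ], in agreement with the first three segments of the output. An element-wise traversal cannot express this directly, because each invocation of its lambda sees a single element only.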

◆ forSegmentsIf()

template<typename Segments, typename IndexBegin, typename IndexEnd, typename SegmentCondition, typename Function, typename T = std::enable_if_t< std::is_integral_v< IndexBegin > && std::is_integral_v< IndexEnd > >>
void TNL::Algorithms::Segments::forSegmentsIf ( const Segments & segments,
IndexBegin begin,
IndexEnd end,
SegmentCondition && segmentCondition,
Function && function,
LaunchConfiguration launchConfig = Algorithms::Segments::LaunchConfiguration() )

Iterates in parallel over segments within the given range of segment indexes, applying a condition to determine whether each segment should be processed.

See also: Overview of Segment Traversal Functions

For each segment, a condition lambda function is evaluated based on the segment index. If the condition lambda function returns true, the specified lambda function is executed for the segment. If the condition lambda function returns false, the segment is skipped.

Template Parameters
Segments  The type of the segments.
IndexBegin  The type of the index defining the beginning of the interval [ begin, end ) of segments on which the lambda function will be applied.
IndexEnd  The type of the index defining the end of the interval [ begin, end ) of segments on which the lambda function will be applied.
SegmentCondition  The type of the condition lambda function.
Function  The type of the lambda function to be executed on each segment.
Parameters
segments  The segments on which the lambda function will be applied.
begin  The beginning of the interval [ begin, end ) of segment indexes whose corresponding segments will be processed using the lambda function.
end  The end of the interval [ begin, end ) of segment indexes whose corresponding segments will be processed using the lambda function.
segmentCondition  The lambda function for condition checking. See Condition Check.
function  The lambda function to be applied to each segment. See Segment View Lambda.
launchConfig  The configuration of the launch. See TNL::Algorithms::Segments::LaunchConfiguration.
Example
#include <cstdlib>
#include <iostream>
#include <TNL/Containers/Vector.h>
#include <TNL/Algorithms/Segments/CSR.h>
#include <TNL/Algorithms/Segments/Ellpack.h>
#include <TNL/Algorithms/Segments/traverse.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Segments >
void
SegmentsExample()
{
   using Device = typename Segments::DeviceType;

   /***
    * Create segments with the given segment sizes.
    */
   Segments segments{ 1, 2, 3, 4, 5 };

   /***
    * Allocate an array for the data stored in the segments.
    */
   TNL::Containers::Array< double, Device > data( segments.getStorageSize(), 0.0 );

   /***
    * Initialize every element (padding included) with localIdx + 1.
    */
   auto data_view = data.getView();
   TNL::Algorithms::Segments::forAllElements(
      segments,
      [ = ] __cuda_callable__( int segmentIdx, int localIdx, int globalIdx ) mutable
      {
         data_view[ globalIdx ] = localIdx + 1;
      } );

   /***
    * Print the data segment by segment.
    */
   std::cout << "Values of elements after initial setup:\n";
   auto fetch = [ = ] __cuda_callable__( int globalIdx ) -> double
   {
      return data_view[ globalIdx ];
   };
   std::cout << TNL::Algorithms::Segments::print( segments, fetch ) << '\n';

   /***
    * Compute cumulative sums in even-indexed segments, skipping padding elements.
    */
   using SegmentViewType = typename Segments::SegmentViewType;
   TNL::Algorithms::Segments::forAllSegmentsIf(
      segments,
      [ = ] __cuda_callable__( const int segmentIdx ) -> bool
      {
         return segmentIdx % 2 == 0;  // Iterate only over even-indexed segments.
      },
      [ = ] __cuda_callable__( const SegmentViewType& segment ) mutable
      {
         double sum( 0.0 );
         for( auto element : segment )
            if( element.localIndex() <= element.segmentIndex() ) {
               sum += element.localIndex() + 1;
               data_view[ element.globalIndex() ] = sum;
            }
      } );

   /***
    * Print the data managed by the segments.
    */
   std::cout << TNL::Algorithms::Segments::print( segments, fetch ) << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Example of CSR segments on host:\n";
   SegmentsExample< TNL::Algorithms::Segments::CSR< TNL::Devices::Host, int > >();

   std::cout << "Example of Ellpack segments on host:\n";
   SegmentsExample< TNL::Algorithms::Segments::Ellpack< TNL::Devices::Host, int > >();

#ifdef __CUDACC__
   std::cout << "Example of CSR segments on CUDA GPU:\n";
   SegmentsExample< TNL::Algorithms::Segments::CSR< TNL::Devices::Cuda, int > >();

   std::cout << "Example of Ellpack segments on CUDA GPU:\n";
   SegmentsExample< TNL::Algorithms::Segments::Ellpack< TNL::Devices::Cuda, int > >();
#endif
   return EXIT_SUCCESS;
}
Output
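Example of CSR segments on host:
Values of elements after initial setup:
Segment 0: [ 1 ]
Segment 1: [ 1, 2 ]
Segment 2: [ 1, 2, 3 ]
Segment 3: [ 1, 2, 3, 4 ]
Segment 4: [ 1, 2, 3, 4, 5 ]
Segment 0: [ 1 ]
Segment 1: [ 1, 2 ]
Segment 2: [ 1, 3, 6 ]
Segment 3: [ 1, 2, 3, 4 ]
Segment 4: [ 1, 3, 6, 10, 15 ]
Example of Ellpack segments on host:
Values of elements after initial setup:
Segment 0: [ 1, 2, 3, 4, 5 ]
Segment 1: [ 1, 2, 3, 4, 5 ]
Segment 2: [ 1, 2, 3, 4, 5 ]
Segment 3: [ 1, 2, 3, 4, 5 ]
Segment 4: [ 1, 2, 3, 4, 5 ]
Segment 0: [ 1, 2, 3, 4, 5 ]
Segment 1: [ 1, 2, 3, 4, 5 ]
Segment 2: [ 1, 3, 6, 4, 5 ]
Segment 3: [ 1, 2, 3, 4, 5 ]
Segment 4: [ 1, 3, 6, 10, 15 ]
Example of CSR segments on CUDA GPU:
Values of elements after initial setup:
Segment 0: [ 1 ]
Segment 1: [ 1, 2 ]
Segment 2: [ 1, 2, 3 ]
Segment 3: [ 1, 2, 3, 4 ]
Segment 4: [ 1, 2, 3, 4, 5 ]
Segment 0: [ 1 ]
Segment 1: [ 1, 2 ]
Segment 2: [ 1, 3, 6 ]
Segment 3: [ 1, 2, 3, 4 ]
Segment 4: [ 1, 3, 6, 10, 15 ]
Example of Ellpack segments on CUDA GPU:
Values of elements after initial setup:
Segment 0: [ 1, 2, 3, 4, 5 ]
Segment 1: [ 1, 2, 3, 4, 5 ]
Segment 2: [ 1, 2, 3, 4, 5 ]
Segment 3: [ 1, 2, 3, 4, 5 ]
Segment 4: [ 1, 2, 3, 4, 5 ]
Segment 0: [ 1, 2, 3, 4, 5 ]
Segment 1: [ 1, 2, 3, 4, 5 ]
Segment 2: [ 1, 3, 6, 4, 5 ]
Segment 3: [ 1, 2, 3, 4, 5 ]
Segment 4: [ 1, 3, 6, 10, 15 ]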

◆ inclusiveScanAllSegments()

template<typename Segments, typename Fetch, typename Reduce, typename Write>
void TNL::Algorithms::Segments::inclusiveScanAllSegments ( const Segments & segments,
Fetch && fetch,
Reduce && reduce,
Write && write,
LaunchConfiguration launchConfig = LaunchConfiguration() )

Computes an inclusive prefix sum (scan) within all segments.

This is a convenience function that computes inclusive prefix-sum in all segments. It internally calls inclusiveScanSegments with the full range of segments.

Template Parameters
Segments  Type of the segments container.
Fetch  Type of the fetch function.
Reduce  Type of the function object performing the reduction, for example one of the Function objects for reduction operations.
Write  Type of the write function.
Parameters
segments  The segments container.
fetch  Function to fetch the element value at a given position. See Fetch Lambda.
reduce  Function object performing the reduction. See Reduction Function Object.
write  Function to write the result at a given position. See Write Lambda.
launchConfig  Configuration for parallel execution.
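
To make the roles of fetch, reduce and write concrete, the following is a minimal serial sketch of a per-segment scan in plain C++. It is a reference model, not the TNL implementation: scanSegmentsModel and scanDemo are hypothetical names, the layout is assumed to be CSR-like, and the identity value and the inclusive flag are explicit arguments added for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Scans each segment independently. fetch( segmentIdx, localIdx, globalIdx )
// reads a value, reduce( a, b ) combines two values and
// write( globalIdx, value ) stores one result.
template< typename Fetch, typename Reduce, typename Write, typename Value >
void
scanSegmentsModel( const std::vector< int >& sizes, Fetch&& fetch, Reduce&& reduce, Write&& write, Value identity, bool inclusive )
{
   int globalIdx = 0;
   for( std::size_t s = 0; s < sizes.size(); s++ ) {
      const int segmentIdx = static_cast< int >( s );
      Value running = identity;  // the running value restarts in every segment
      for( int localIdx = 0; localIdx < sizes[ s ]; localIdx++, globalIdx++ ) {
         // Fetch before writing so that an in-place scan reads the old value.
         const Value x = fetch( segmentIdx, localIdx, globalIdx );
         if( inclusive ) {
            running = reduce( running, x );
            write( globalIdx, running );
         }
         else {
            write( globalIdx, running );
            running = reduce( running, x );
         }
      }
   }
}

// Fills segment i with the value i + 1 and returns its inclusive or
// exclusive prefix sums.
std::vector< double >
scanDemo( const std::vector< int >& sizes, bool inclusive )
{
   std::vector< double > data;
   for( std::size_t s = 0; s < sizes.size(); s++ )
      for( int i = 0; i < sizes[ s ]; i++ )
         data.push_back( static_cast< double >( s + 1 ) );

   std::vector< double > result( data.size(), 0.0 );
   scanSegmentsModel(
      sizes,
      [ &data ]( int /* segmentIdx */, int /* localIdx */, int globalIdx )
      {
         return data[ globalIdx ];
      },
      std::plus< double >{},
      [ &result ]( int globalIdx, double value )
      {
         result[ globalIdx ] = value;
      },
      0.0,
      inclusive );
   return result;
}
```

For sizes { 1, 2, 3 }, scanDemo returns [ 1 ], [ 2, 4 ], [ 3, 6, 9 ] for the inclusive scan and [ 0 ], [ 0, 2 ], [ 0, 3, 6 ] for the exclusive one. The difference is only whether the current element is combined before or after the write.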
Example
#include <iostream>
#include <TNL/Algorithms/Segments/CSR.h>
#include <TNL/Algorithms/Segments/scan.h>
#include <TNL/Algorithms/Segments/print.h>
#include <TNL/Algorithms/Segments/traverse.h>
#include <TNL/Containers/Vector.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Device, typename Value = double, typename Index = int >
void
scanExample()
{
   // Create segments with different sizes
   TNL::Containers::Vector< Index, Device > segmentsSizes{ 1, 2, 3, 4, 5 };
   auto segmentsSizesView = segmentsSizes.getConstView();
   TNL::Algorithms::Segments::CSR< Device, Index > segments( segmentsSizes );

   // Create data to be scanned within segments
   TNL::Containers::Vector< Value, Device, Index > data( segments.getStorageSize() );
   TNL::Containers::Vector< Value, Device, Index > inclusive_result( segments.getStorageSize() );
   TNL::Containers::Vector< Value, Device, Index > exclusive_result( segments.getStorageSize() );
   auto inclusive_result_view = inclusive_result.getView();
   auto exclusive_result_view = exclusive_result.getView();

   // Initialize data with segment index + 1
   auto data_view = data.getView();
   TNL::Algorithms::Segments::forAllElements(
      segments,
      [ = ] __cuda_callable__( Index segmentIdx, Index localIdx, Index globalIdx ) mutable
      {
         data_view[ globalIdx ] = segmentIdx + 1;
      } );

   // Print original data
   std::cout << "Original data in segments:\n";
   std::cout << TNL::Algorithms::Segments::print(
      segments,
      [ = ] __cuda_callable__( Index globalIdx ) -> Value
      {
         return data_view[ globalIdx ];
      } ) << '\n';

   // Define fetch and write functions
   auto fetch = [ = ] __cuda_callable__( Index segmentIdx, Index localIdx, Index globalIdx ) -> Value
   {
      if( localIdx < segmentsSizesView[ segmentIdx ] )
         return data_view[ globalIdx ];
      else
         return 0;  // Return 0 for padding elements
   };
   auto write_inclusive = [ = ] __cuda_callable__( Index globalIdx, Value value ) mutable
   {
      inclusive_result_view[ globalIdx ] = value;
   };

   auto write_exclusive = [ = ] __cuda_callable__( Index globalIdx, Value value ) mutable
   {
      exclusive_result_view[ globalIdx ] = value;
   };

   // Perform inclusive scan
   TNL::Algorithms::Segments::inclusiveScanAllSegments( segments, fetch, TNL::Plus{}, write_inclusive );

   // Perform exclusive scan
   TNL::Algorithms::Segments::exclusiveScanAllSegments( segments, fetch, TNL::Plus{}, write_exclusive );

   // Print results
   std::cout << "\nInclusive scan results:\n";
   std::cout << TNL::Algorithms::Segments::print(
      segments,
      [ = ] __cuda_callable__( Index globalIdx ) -> Value
      {
         return inclusive_result_view[ globalIdx ];
      } ) << '\n';

   std::cout << "\nExclusive scan results:\n";
   std::cout << TNL::Algorithms::Segments::print(
      segments,
      [ = ] __cuda_callable__( Index globalIdx ) -> Value
      {
         return exclusive_result_view[ globalIdx ];
      } ) << '\n';

   // Example of scanning only specific segments
   TNL::Containers::Vector< Index, Device > segmentIndexes{ 1, 3 };  // Scan only segments 1 and 3

   auto write_partial = [ = ] __cuda_callable__( Index globalIdx, Value value ) mutable
   {
      data_view[ globalIdx ] = value;  // Writing back to the input makes the scan in-place
   };

   // Perform inclusive scan on the selected segments
   TNL::Algorithms::Segments::inclusiveScanSegments( segments, segmentIndexes, fetch, TNL::Plus{}, write_partial );

   std::cout << "\nPartial inclusive inplace scan results (only segments 1 and 3):\n";
   std::cout << TNL::Algorithms::Segments::print(
      segments,
      [ = ] __cuda_callable__( Index globalIdx ) -> Value
      {
         return data_view[ globalIdx ];
      } ) << '\n';

   // Scan the remaining segments by selecting them with a condition
   auto condition = [ = ] __cuda_callable__( Index segmentIdx ) mutable -> bool
   {
      return segmentIdx % 2 == 0;  // Only scan even segments
   };

   // Perform inclusive scan on the segments satisfying the condition
   TNL::Algorithms::Segments::inclusiveScanAllSegmentsIf( segments, condition, fetch, TNL::Plus{}, write_partial );

   std::cout << "\nPartial inclusive inplace scan results (only even segments):\n";
   std::cout << TNL::Algorithms::Segments::print(
      segments,
      [ = ] __cuda_callable__( Index globalIdx ) -> Value
      {
         return data_view[ globalIdx ];
      } ) << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running example on Host:\n";
   scanExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << "\nRunning example on Cuda:\n";
   scanExample< TNL::Devices::Cuda >();
#endif

   return 0;
}
Output
Running example on Host:
Original data in segments:
Segment 0: [ 1 ]
Segment 1: [ 2, 2 ]
Segment 2: [ 3, 3, 3 ]
Segment 3: [ 4, 4, 4, 4 ]
Segment 4: [ 5, 5, 5, 5, 5 ]
Inclusive scan results:
Segment 0: [ 1 ]
Segment 1: [ 2, 4 ]
Segment 2: [ 3, 6, 9 ]
Segment 3: [ 4, 8, 12, 16 ]
Segment 4: [ 5, 10, 15, 20, 25 ]
Exclusive scan results:
Segment 0: [ 0 ]
Segment 1: [ 0, 2 ]
Segment 2: [ 0, 3, 6 ]
Segment 3: [ 0, 4, 8, 12 ]
Segment 4: [ 0, 5, 10, 15, 20 ]
Partial inclusive inplace scan results (only segments 1 and 3):
Segment 0: [ 1 ]
Segment 1: [ 2, 4 ]
Segment 2: [ 3, 3, 3 ]
Segment 3: [ 4, 8, 12, 16 ]
Segment 4: [ 5, 5, 5, 5, 5 ]
Partial inclusive inplace scan results (only even segments):
Segment 0: [ 1 ]
Segment 1: [ 2, 4 ]
Segment 2: [ 3, 6, 9 ]
Segment 3: [ 4, 8, 12, 16 ]
Segment 4: [ 5, 10, 15, 20, 25 ]
Running example on Cuda:
Original data in segments:
Segment 0: [ 1 ]
Segment 1: [ 2, 2 ]
Segment 2: [ 3, 3, 3 ]
Segment 3: [ 4, 4, 4, 4 ]
Segment 4: [ 5, 5, 5, 5, 5 ]
Inclusive scan results:
Segment 0: [ 1 ]
Segment 1: [ 2, 4 ]
Segment 2: [ 3, 6, 9 ]
Segment 3: [ 4, 8, 12, 16 ]
Segment 4: [ 5, 10, 15, 20, 25 ]
Exclusive scan results:
Segment 0: [ 0 ]
Segment 1: [ 0, 2 ]
Segment 2: [ 0, 3, 6 ]
Segment 3: [ 0, 4, 8, 12 ]
Segment 4: [ 0, 5, 10, 15, 20 ]
Partial inclusive inplace scan results (only segments 1 and 3):
Segment 0: [ 1 ]
Segment 1: [ 2, 4 ]
Segment 2: [ 3, 3, 3 ]
Segment 3: [ 4, 8, 12, 16 ]
Segment 4: [ 5, 5, 5, 5, 5 ]
Partial inclusive inplace scan results (only even segments):
Segment 0: [ 1 ]
Segment 1: [ 2, 4 ]
Segment 2: [ 3, 6, 9 ]
Segment 3: [ 4, 8, 12, 16 ]
Segment 4: [ 5, 10, 15, 20, 25 ]
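The difference between the inclusive and exclusive results above can be reproduced for a single segment in plain C++. The sketch below is not TNL code: the helper names inclusiveScan and exclusiveScan are hypothetical, and addition with identity 0 is assumed as the reduction.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Inclusive scan: element i of the result is the sum of inputs 0..i.
std::vector< double >
inclusiveScan( const std::vector< double >& in )
{
   std::vector< double > out( in.size() );
   std::partial_sum( in.begin(), in.end(), out.begin() );
   return out;
}

// Exclusive scan: element i of the result is the sum of inputs 0..i-1,
// so the first element is the identity of the reduction (0 for addition).
std::vector< double >
exclusiveScan( const std::vector< double >& in, double identity = 0.0 )
{
   std::vector< double > out( in.size() );
   double running = identity;
   for( std::size_t i = 0; i < in.size(); i++ ) {
      out[ i ] = running;   // store the sum of everything before element i
      running += in[ i ];
   }
   return out;
}
```

Applied to segment 3 of the example, [ 4, 4, 4, 4 ], the first helper yields [ 4, 8, 12, 16 ] and the second [ 0, 4, 8, 12 ], matching the printed rows.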

◆ inclusiveScanAllSegmentsIf()

template<typename Segments, typename Condition, typename Fetch, typename Reduce, typename Write>
void TNL::Algorithms::Segments::inclusiveScanAllSegmentsIf ( const Segments & segments,
Condition && condition,
Fetch && fetch,
Reduce && reduce,
Write && write,
LaunchConfiguration launchConfig = LaunchConfiguration() )

Compute inclusive conditional prefix-sum (scan) within all segments.

This is a convenience function that computes the inclusive prefix-sum in all segments, but only in the segments that satisfy the given condition. It internally calls inclusiveScanSegmentsIf with the full range of segments.
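The semantics can be modelled on the host with a few lines of plain C++. This is only a sketch under stated assumptions, not TNL's implementation: the name inclusiveScanAllSegmentsIfRef and the CSR-style offsets vector are hypothetical, and the callable signatures are copied from the lambdas used in the example on this page (the condition receives the segment index).

```cpp
#include <cassert>
#include <vector>

// Reference model of a conditional inclusive scan over CSR-like segments.
// Segment s owns the global indexes offsets[ s ] .. offsets[ s + 1 ] - 1.
// Assumed callable signatures (mirroring the example lambdas):
//   condition( segmentIdx ) -> bool
//   fetch( segmentIdx, localIdx, globalIdx ) -> Value
//   reduce( a, b ) -> Value
//   write( globalIdx, value )
template< typename Value, typename Condition, typename Fetch, typename Reduce, typename Write >
void
inclusiveScanAllSegmentsIfRef( const std::vector< int >& offsets, Condition condition, Fetch fetch, Reduce reduce, Write write )
{
   assert( ! offsets.empty() );
   const int segmentsCount = static_cast< int >( offsets.size() ) - 1;
   for( int segmentIdx = 0; segmentIdx < segmentsCount; segmentIdx++ ) {
      if( ! condition( segmentIdx ) )
         continue;  // segments rejected by the condition are left untouched
      const int begin = offsets[ segmentIdx ];
      const int end = offsets[ segmentIdx + 1 ];
      if( begin == end )
         continue;  // empty segment
      Value running = fetch( segmentIdx, 0, begin );
      write( begin, running );
      for( int globalIdx = begin + 1; globalIdx < end; globalIdx++ ) {
         // each value is fetched before its position is written, so an in-place scan is safe here
         running = reduce( running, fetch( segmentIdx, globalIdx - begin, globalIdx ) );
         write( globalIdx, running );
      }
   }
}

// Segments of sizes 1..5 filled with segmentIdx + 1, scanned in place in even segments only.
std::vector< double >
scanEvenSegments()
{
   const std::vector< int > offsets{ 0, 1, 3, 6, 10, 15 };
   std::vector< double > data{ 1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 5 };
   inclusiveScanAllSegmentsIfRef< double >(
      offsets,
      []( int segmentIdx ) { return segmentIdx % 2 == 0; },
      [ &data ]( int, int, int globalIdx ) { return data[ globalIdx ]; },
      []( double a, double b ) { return a + b; },
      [ &data ]( int globalIdx, double value ) { data[ globalIdx ] = value; } );
   return data;
}
```

Unlike the example program, this sketch starts from the original data, so segments 1 and 3 keep their unscanned values [ 2, 2 ] and [ 4, 4, 4, 4 ].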

Template Parameters
Segments      Type of the segments container.
Condition     Type of the condition function.
Fetch         Type of the fetch function.
Reduce        Type of the function object performing the reduction; see Function objects for reduction operations.
Write         Type of the write function.
Parameters
segments      The segments container.
condition     Function that returns true for the segments to include in the scan. See Condition Lambda.
fetch         Function to fetch the element value at a given position. See Fetch Lambda.
reduce        Function object performing the reduction. See Reduction Function Object.
write         Function to write the result at a given position. See Write Lambda.
launchConfig  Configuration for parallel execution.
Example
#include <iostream>
#include <TNL/Algorithms/Segments/CSR.h>
#include <TNL/Algorithms/Segments/scan.h>
#include <TNL/Algorithms/Segments/print.h>
#include <TNL/Containers/Vector.h>
#include <TNL/Devices/Host.h>

template< typename Device, typename Value = double, typename Index = int >
void
scanExample()
{
   // Create segments with different sizes
   TNL::Containers::Vector< Index, Device > segmentsSizes{ 1, 2, 3, 4, 5 };
   auto segmentsSizesView = segmentsSizes.getConstView();
   // Assumed declaration (this line is absent from the rendered listing): CSR segments built from the sizes
   TNL::Algorithms::Segments::CSR< Device, Index > segments( segmentsSizes );

   // Create data to be scanned within segments
   TNL::Containers::Vector< Value, Device, Index > data( segments.getStorageSize() );
   TNL::Containers::Vector< Value, Device, Index > inclusive_result( segments.getStorageSize() );
   TNL::Containers::Vector< Value, Device, Index > exclusive_result( segments.getStorageSize() );
   auto inclusive_result_view = inclusive_result.getView();
   auto exclusive_result_view = exclusive_result.getView();

   // Initialize data with segment index + 1
   auto data_view = data.getView();
   // Assumed call name (absent from the rendered listing): traversal of all elements of all segments
   TNL::Algorithms::Segments::forAllElements(
      segments,
      [ = ] __cuda_callable__( Index segmentIdx, Index localIdx, Index globalIdx ) mutable
      {
         data_view[ globalIdx ] = segmentIdx + 1;
      } );

   // Print original data
   std::cout << "Original data in segments:\n";
   std::cout << TNL::Algorithms::Segments::print(
      segments,
      [ = ] __cuda_callable__( Index globalIdx ) -> Value
      {
         return data_view[ globalIdx ];
      } ) << '\n';

   // Define fetch, reduce and write functions
   auto fetch = [ = ] __cuda_callable__( Index segmentIdx, Index localIdx, Index globalIdx ) -> Value
   {
      if( localIdx < segmentsSizesView[ segmentIdx ] )
         return data_view[ globalIdx ];
      else
         return 0;  // Return 0 for padding elements
   };
   auto write_inclusive = [ = ] __cuda_callable__( Index globalIdx, Value value ) mutable
   {
      inclusive_result_view[ globalIdx ] = value;
   };

   auto write_exclusive = [ = ] __cuda_callable__( Index globalIdx, Value value ) mutable
   {
      exclusive_result_view[ globalIdx ] = value;
   };

   // Perform inclusive scan
   TNL::Algorithms::Segments::inclusiveScanAllSegments( segments, fetch, TNL::Plus{}, write_inclusive );

   // Perform exclusive scan
   TNL::Algorithms::Segments::exclusiveScanAllSegments( segments, fetch, TNL::Plus{}, write_exclusive );

   // Print results
   std::cout << "\nInclusive scan results:\n";
   std::cout << TNL::Algorithms::Segments::print(
      segments,
      [ = ] __cuda_callable__( Index globalIdx ) -> Value
      {
         return inclusive_result_view[ globalIdx ];
      } ) << '\n';

   std::cout << "\nExclusive scan results:\n";
   std::cout << TNL::Algorithms::Segments::print(
      segments,
      [ = ] __cuda_callable__( Index globalIdx ) -> Value
      {
         return exclusive_result_view[ globalIdx ];
      } ) << '\n';

   // Example of scanning only specific segments
   TNL::Containers::Vector< Index, Device > segmentIndexes{ 1, 3 };  // Scan only segments 1 and 3

   auto write_partial = [ = ] __cuda_callable__( Index globalIdx, Value value ) mutable
   {
      data_view[ globalIdx ] = value;  // All segment scan algorithms can also work in place
   };

   // Perform inclusive scan on selected segments
   TNL::Algorithms::Segments::inclusiveScanSegments( segments, segmentIndexes, fetch, TNL::Plus{}, write_partial );

   std::cout << "\nPartial inclusive inplace scan results (only segments 1 and 3):\n";
   std::cout << TNL::Algorithms::Segments::print(
      segments,
      [ = ] __cuda_callable__( Index globalIdx ) -> Value
      {
         return data_view[ globalIdx ];
      } ) << '\n';

   // Scan the remaining segments using a condition
   auto condition = [ = ] __cuda_callable__( Index segmentIdx ) mutable -> bool
   {
      return segmentIdx % 2 == 0;  // Only scan even segments
   };

   // Perform inclusive scan on the segments selected by the condition
   TNL::Algorithms::Segments::inclusiveScanAllSegmentsIf( segments, condition, fetch, TNL::Plus{}, write_partial );

   std::cout << "\nPartial inclusive inplace scan results (only even segments):\n";
   std::cout << TNL::Algorithms::Segments::print(
      segments,
      [ = ] __cuda_callable__( Index globalIdx ) -> Value
      {
         return data_view[ globalIdx ];
      } ) << '\n';
}

int
main( int argc, char* argv[] )
{
   std::cout << "Running example on Host:\n";
   scanExample< TNL::Devices::Host >();

#ifdef __CUDACC__
   std::cout << "\nRunning example on Cuda:\n";
   scanExample< TNL::Devices::Cuda >();
#endif

   return 0;
}
Output
Running example on Host:
Original data in segments:
Segment 0: [ 1 ]
Segment 1: [ 2, 2 ]
Segment 2: [ 3, 3, 3 ]
Segment 3: [ 4, 4, 4, 4 ]
Segment 4: [ 5, 5, 5, 5, 5 ]
Inclusive scan results:
Segment 0: [ 1 ]
Segment 1: [ 2, 4 ]
Segment 2: [ 3, 6, 9 ]
Segment 3: [ 4, 8, 12, 16 ]
Segment 4: [ 5, 10, 15, 20, 25 ]
Exclusive scan results:
Segment 0: [ 0 ]
Segment 1: [ 0, 2 ]
Segment 2: [ 0, 3, 6 ]
Segment 3: [ 0, 4, 8, 12 ]
Segment 4: [ 0, 5, 10, 15, 20 ]
Partial inclusive inplace scan results (only segments 1 and 3):
Segment 0: [ 1 ]
Segment 1: [ 2, 4 ]
Segment 2: [ 3, 3, 3 ]
Segment 3: [ 4, 8, 12, 16 ]
Segment 4: [ 5, 5, 5, 5, 5 ]
Partial inclusive inplace scan results (only even segments):
Segment 0: [ 1 ]
Segment 1: [ 2, 4 ]
Segment 2: [ 3, 6, 9 ]
Segment 3: [ 4, 8, 12, 16 ]
Segment 4: [ 5, 10, 15, 20, 25 ]
Running example on Cuda:
Original data in segments:
Segment 0: [ 1 ]
Segment 1: [ 2, 2 ]
Segment 2: [ 3, 3, 3 ]
Segment 3: [ 4, 4, 4, 4 ]
Segment 4: [ 5, 5, 5, 5, 5 ]
Inclusive scan results:
Segment 0: [ 1 ]
Segment 1: [ 2, 4 ]
Segment 2: [ 3, 6, 9 ]
Segment 3: [ 4, 8, 12, 16 ]
Segment 4: [ 5, 10, 15, 20, 25 ]
Exclusive scan results:
Segment 0: [ 0 ]
Segment 1: [ 0, 2 ]
Segment 2: [ 0, 3, 6 ]
Segment 3: [ 0, 4, 8, 12 ]
Segment 4: [ 0, 5, 10, 15, 20 ]
Partial inclusive inplace scan results (only segments 1 and 3):
Segment 0: [ 1 ]
Segment 1: [ 2, 4 ]
Segment 2: [ 3, 3, 3 ]
Segment 3: [ 4, 8, 12, 16 ]
Segment 4: [ 5, 5, 5, 5, 5 ]
Partial inclusive inplace scan results (only even segments):
Segment 0: [ 1 ]
Segment 1: [ 2, 4 ]
Segment 2: [ 3, 6, 9 ]
Segment 3: [ 4, 8, 12, 16 ]
Segment 4: [ 5, 10, 15, 20, 25 ]

◆ inclusiveScanSegment()

template<typename SegmentView, typename Fetch, typename Reduce, typename Write>
__cuda_callable__ void TNL::Algorithms::Segments::inclusiveScanSegment ( SegmentView & segment,
Fetch && fetch,
Reduce && reduce,
Write && write )

Computes an inclusive scan (prefix sum) within a single segment.

Template Parameters
SegmentView  Type of the segment view.
Fetch        Type of the fetch function.
Reduce       Type of the function performing the reduction.
Write        Type of the write function.
Parameters
segment      The segment view to scan.
fetch        Function to fetch the element value at a given position. See Fetch Lambda.
reduce       Function object performing the reduction. See Reduction Function Object.
write        Function to write the result at a given position. See Write Lambda.
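The reduction does not have to be addition. The following host-only sketch scans one segment with a running maximum; it is an illustration, not the TNL implementation. SegmentRef is a hypothetical stand-in for a segment view, inclusiveScanSegmentRef is a hypothetical name, and the fetch and write signatures are assumed to match the lambdas used in the examples on this page.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for a segment view: the segment index and the
// contiguous range of global indexes that the segment occupies.
struct SegmentRef
{
   int segmentIdx;
   int begin;
   int size;
};

// Sequential inclusive scan of one segment with a user-supplied reduction.
template< typename Value, typename Fetch, typename Reduce, typename Write >
void
inclusiveScanSegmentRef( const SegmentRef& segment, Fetch fetch, Reduce reduce, Write write )
{
   if( segment.size == 0 )
      return;
   Value running = fetch( segment.segmentIdx, 0, segment.begin );
   write( segment.begin, running );
   for( int localIdx = 1; localIdx < segment.size; localIdx++ ) {
      const int globalIdx = segment.begin + localIdx;
      running = reduce( running, fetch( segment.segmentIdx, localIdx, globalIdx ) );
      write( globalIdx, running );
   }
}

// Running maximum of one segment embedded in a larger array.
std::vector< int >
runningMaximum()
{
   const std::vector< int > data{ 9, 3, 1, 4, 1, 5, 9 };
   std::vector< int > result( data.size(), 0 );
   const SegmentRef segment{ 0, 1, 5 };  // covers data[ 1 ] .. data[ 5 ]
   inclusiveScanSegmentRef< int >(
      segment,
      [ &data ]( int, int, int globalIdx ) { return data[ globalIdx ]; },
      []( int a, int b ) { return std::max( a, b ); },
      [ &result ]( int globalIdx, int value ) { result[ globalIdx ] = value; } );
   return result;
}
```

Only positions 1 to 5 are written; the elements outside the segment keep their initial value 0 in the result.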

◆ inclusiveScanSegments() [1/2]

template<typename Segments, typename Array, typename Fetch, typename Reduce, typename Write>
void TNL::Algorithms::Segments::inclusiveScanSegments ( const Segments & segments,
const Array & segmentIndexes,
Fetch && fetch,
Reduce && reduce,
Write && write,
LaunchConfiguration launchConfig = LaunchConfiguration() )

Compute inclusive prefix-sum (scan) within segments specified by a segment index array.

This is a convenience function that computes the inclusive prefix-sum in the segments listed in the segmentIndexes array. It internally calls inclusiveScanSegments with the full range of the segment index array.

Template Parameters
Segments        Type of the segments container.
Array           Type of the segment indexes array.
Fetch           Type of the fetch function.
Reduce          Type of the function object performing the reduction; see Function objects for reduction operations.
Write           Type of the write function.
Parameters
segments        The segments container.
segmentIndexes  Array containing the indexes of the segments to scan.
fetch           Function to fetch the element value at a given position. See Fetch Lambda.
reduce          Function object performing the reduction. See Reduction Function Object.
write           Function to write the result at a given position. See Write Lambda.
launchConfig    Configuration for parallel execution.
Example
The example program and its output are identical to those listed under inclusiveScanAllSegmentsIf() above. The call relevant here is inclusiveScanSegments( segments, segmentIndexes, fetch, TNL::Plus{}, write_partial ), which scans only segments 1 and 3.

◆ inclusiveScanSegments() [2/2]

template<typename Segments, typename IndexBegin, typename IndexEnd, typename Fetch, typename Reduce, typename Write, typename T = std::enable_if_t< std::is_integral_v< IndexBegin > && std::is_integral_v< IndexEnd > >>
void TNL::Algorithms::Segments::inclusiveScanSegments ( const Segments & segments,
IndexBegin begin,
IndexEnd end,
Fetch && fetch,
Reduce && reduce,
Write && write,
LaunchConfiguration launchConfig = LaunchConfiguration() )

Compute inclusive prefix-sum (scan) within segments in a given range.

This function computes the inclusive prefix-sum within the segments in the range [begin, end). Each segment is processed independently using a sequential scan. The scan is driven by the provided fetch, reduce, and write functions.
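The half-open segment range can be illustrated with a small host-only model. This is a sketch under stated assumptions rather than the library code: the helpers offsetsFromSizes and inclusiveScanRange are hypothetical, segments are assumed to be stored contiguously (CSR-style), and addition is hard-wired as the reduction.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Builds CSR-style offsets from segment sizes: offsets[ s ] is the global
// index of the first element of segment s (an exclusive scan of the sizes).
std::vector< int >
offsetsFromSizes( const std::vector< int >& sizes )
{
   std::vector< int > offsets( sizes.size() + 1, 0 );
   for( std::size_t s = 0; s < sizes.size(); s++ )
      offsets[ s + 1 ] = offsets[ s ] + sizes[ s ];
   return offsets;
}

// Inclusive prefix sum restarted in every segment of the half-open range
// [ begin, end ); elements of all other segments are copied unchanged.
std::vector< double >
inclusiveScanRange( const std::vector< int >& sizes, const std::vector< double >& data, int begin, int end )
{
   const std::vector< int > offsets = offsetsFromSizes( sizes );
   assert( 0 <= begin && begin <= end && end <= static_cast< int >( sizes.size() ) );
   assert( static_cast< int >( data.size() ) == offsets.back() );
   std::vector< double > result = data;
   for( int segmentIdx = begin; segmentIdx < end; segmentIdx++ ) {
      double running = 0;  // the running sum starts from zero in each segment
      for( int globalIdx = offsets[ segmentIdx ]; globalIdx < offsets[ segmentIdx + 1 ]; globalIdx++ ) {
         running += data[ globalIdx ];
         result[ globalIdx ] = running;
      }
   }
   return result;
}
```

With sizes { 1, 2, 3, 4, 5 }, data filled with segment index + 1, begin = 1 and end = 4, segments 1, 2 and 3 are scanned while segments 0 and 4 are returned unchanged.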

Template Parameters
Segments      Type of the segments container.
IndexBegin    Type of the begin index.
IndexEnd      Type of the end index.
Fetch         Type of the fetch function.
Reduce        Type of the function object performing the reduction; see Function objects for reduction operations.
Write         Type of the write function.
Parameters
segments      The segments container.
begin         The beginning of the range of segments to scan.
end           The end of the range of segments to scan.
fetch         Function to fetch the element value at a given position. See Fetch Lambda.
reduce        Function object performing the reduction. See Reduction Function Object.
write         Function to write the result at a given position. See Write Lambda.
launchConfig  Configuration for parallel execution.
Example
The example program and its output are identical to those listed under inclusiveScanAllSegmentsIf() above.

◆ inclusiveScanSegmentsIf()

template<typename Segments, typename IndexBegin, typename IndexEnd, typename Condition, typename Fetch, typename Reduce, typename Write, typename T = typename std::enable_if_t< std::is_integral_v< IndexBegin > && std::is_integral_v< IndexEnd > >>
void TNL::Algorithms::Segments::inclusiveScanSegmentsIf ( const Segments & segments,
IndexBegin begin,
IndexEnd end,
Condition && condition,
Fetch && fetch,
Reduce && reduce,
Write && write,
LaunchConfiguration launchConfig = LaunchConfiguration() )

Compute inclusive conditional prefix-sum (scan) within segments in a given range.

This function computes the inclusive prefix-sum within the segments in the range [begin, end), but only in the segments that satisfy the given condition. Each segment is processed independently using a sequential scan.

Template Parameters
Segments      Type of the segments container.
IndexBegin    Type of the begin index.
IndexEnd      Type of the end index.
Condition     Type of the condition function.
Fetch         Type of the fetch function.
Reduce        Type of the function object performing the reduction; see Function objects for reduction operations.
Write         Type of the write function.
Parameters
segments      The segments container.
begin         The beginning of the range of segments to scan.
end           The end of the range of segments to scan.
condition     Function that returns true for the segments to include in the scan. See Condition Lambda.
fetch         Function to fetch the element value at a given position. See Fetch Lambda.
reduce        Function object performing the reduction. See Reduction Function Object.
write         Function to write the result at a given position. See Write Lambda.
launchConfig  Configuration for parallel execution.
Example
1#include <iostream>
2#include <TNL/Algorithms/Segments/CSR.h>
3#include <TNL/Algorithms/Segments/scan.h>
4#include <TNL/Algorithms/Segments/print.h>
5#include <TNL/Containers/Vector.h>
6#include <TNL/Devices/Host.h>
7
8template< typename Device, typename Value = double, typename Index = int >
9void
10scanExample()
11{
12 // Create segments with different sizes
13 TNL::Containers::Vector< Index, Device > segmentsSizes{ 1, 2, 3, 4, 5 };
14 auto segmentsSizesView = segmentsSizes.getConstView();
16
17 // Create data to be scanned within segments
18 TNL::Containers::Vector< Value, Device, Index > data( segments.getStorageSize() );
19 TNL::Containers::Vector< Value, Device, Index > inclusive_result( segments.getStorageSize() );
20 TNL::Containers::Vector< Value, Device, Index > exclusive_result( segments.getStorageSize() );
21 auto inclusive_result_view = inclusive_result.getView();
22 auto exclusive_result_view = exclusive_result.getView();
23
24 // Initialize data with segment index + 1
25 auto data_view = data.getView();
27 segments,
28 [ = ] __cuda_callable__( Index segmentIdx, Index localIdx, Index globalIdx ) mutable
29 {
30 data_view[ globalIdx ] = segmentIdx + 1;
31 } );
32
33 // Print original data
34 std::cout << "Original data in segments:\n";
36 segments,
37 [ = ] __cuda_callable__( Index globalIdx ) -> Value
38 {
39 return data_view[ globalIdx ];
40 } ) << '\n';
41
43 // Define fetch, reduce and write functions
44 auto fetch = [ = ] __cuda_callable__( Index segmentIdx, Index localIdx, Index globalIdx ) -> Value
45 {
46 if( localIdx < segmentsSizesView[ segmentIdx ] )
47 return data_view[ globalIdx ];
48 else
49 return 0; // Return 0 for padding elements
50 };
51 auto write_inclusive = [ = ] __cuda_callable__( Index globalIdx, Value value ) mutable
52 {
53 inclusive_result_view[ globalIdx ] = value;
54 };
55
56 auto write_exclusive = [ = ] __cuda_callable__( Index globalIdx, Value value ) mutable
57 {
58 exclusive_result_view[ globalIdx ] = value;
59 };
60
61 // Perform inclusive scan
62 TNL::Algorithms::Segments::inclusiveScanAllSegments( segments, fetch, TNL::Plus{}, write_inclusive );
63
64 // Perform exclusive scan
65 TNL::Algorithms::Segments::exclusiveScanAllSegments( segments, fetch, TNL::Plus{}, write_exclusive );
67
68 // Print results
69 std::cout << "\nInclusive scan results:\n";
71 segments,
72 [ = ] __cuda_callable__( Index globalIdx ) -> Value
73 {
74 return inclusive_result_view[ globalIdx ];
75 } ) << '\n';
76
77 std::cout << "\nExclusive scan results:\n";
79 segments,
80 [ = ] __cuda_callable__( Index globalIdx ) -> Value
81 {
82 return exclusive_result_view[ globalIdx ];
83 } ) << '\n';
84
85 // Example of scanning only specific segments
86 TNL::Containers::Vector< Index, Device > segmentIndexes{ 1, 3 }; // Scan only segments 1 and 3
87
88 auto write_partial = [ = ] __cuda_callable__( Index globalIdx, Value value ) mutable
89 {
90 data_view[ globalIdx ] = value; // All segment scan algorithms may work even as inplace scan
91 };
92
93 // Perform inclusive scan on selected segments
94 TNL::Algorithms::Segments::inclusiveScanSegments( segments, segmentIndexes, fetch, TNL::Plus{}, write_partial );
95
96 std::cout << "\nPartial inclusive inplace scan results (only segments 1 and 3):\n";
98 segments,
99 [ = ] __cuda_callable__( Index globalIdx ) -> Value
100 {
101 return data_view[ globalIdx ];
102 } ) << '\n';
103
104 // Scanning the rest of segments using condition
105 auto condition = [ = ] __cuda_callable__( Index segmentIdx ) mutable -> bool
106 {
107 return segmentIdx % 2 == 0; // Only scan even segments
108 };
109
110 // Perform inclusive scan on selected segments
111 TNL::Algorithms::Segments::inclusiveScanAllSegmentsIf( segments, condition, fetch, TNL::Plus{}, write_partial );
112
113 std::cout << "\nPartial inclusive inplace scan results (only even segments):\n";
115 segments,
116 [ = ] __cuda_callable__( Index globalIdx ) -> Value
117 {
118 return data_view[ globalIdx ];
119 } ) << '\n';
120}
121
122int
123main( int argc, char* argv[] )
124{
125 std::cout << "Running example on Host:\n";
126 scanExample< TNL::Devices::Host >();
127
128#ifdef __CUDACC__
129 std::cout << "\nRunning example on Cuda:\n";
130 scanExample< TNL::Devices::Cuda >();
131#endif
132
133 return 0;
134}
Output
Running example on Host:
Original data in segments:
Segment 0: [ 1 ]
Segment 1: [ 2, 2 ]
Segment 2: [ 3, 3, 3 ]
Segment 3: [ 4, 4, 4, 4 ]
Segment 4: [ 5, 5, 5, 5, 5 ]
Inclusive scan results:
Segment 0: [ 1 ]
Segment 1: [ 2, 4 ]
Segment 2: [ 3, 6, 9 ]
Segment 3: [ 4, 8, 12, 16 ]
Segment 4: [ 5, 10, 15, 20, 25 ]
Exclusive scan results:
Segment 0: [ 0 ]
Segment 1: [ 0, 2 ]
Segment 2: [ 0, 3, 6 ]
Segment 3: [ 0, 4, 8, 12 ]
Segment 4: [ 0, 5, 10, 15, 20 ]
Partial inclusive inplace scan results (only segments 1 and 3):
Segment 0: [ 1 ]
Segment 1: [ 2, 4 ]
Segment 2: [ 3, 3, 3 ]
Segment 3: [ 4, 8, 12, 16 ]
Segment 4: [ 5, 5, 5, 5, 5 ]
Partial inclusive inplace scan results (only even segments):
Segment 0: [ 1 ]
Segment 1: [ 2, 4 ]
Segment 2: [ 3, 6, 9 ]
Segment 3: [ 4, 8, 12, 16 ]
Segment 4: [ 5, 10, 15, 20, 25 ]
Running example on Cuda:
Original data in segments:
Segment 0: [ 1 ]
Segment 1: [ 2, 2 ]
Segment 2: [ 3, 3, 3 ]
Segment 3: [ 4, 4, 4, 4 ]
Segment 4: [ 5, 5, 5, 5, 5 ]
Inclusive scan results:
Segment 0: [ 1 ]
Segment 1: [ 2, 4 ]
Segment 2: [ 3, 6, 9 ]
Segment 3: [ 4, 8, 12, 16 ]
Segment 4: [ 5, 10, 15, 20, 25 ]
Exclusive scan results:
Segment 0: [ 0 ]
Segment 1: [ 0, 2 ]
Segment 2: [ 0, 3, 6 ]
Segment 3: [ 0, 4, 8, 12 ]
Segment 4: [ 0, 5, 10, 15, 20 ]
Partial inclusive inplace scan results (only segments 1 and 3):
Segment 0: [ 1 ]
Segment 1: [ 2, 4 ]
Segment 2: [ 3, 3, 3 ]
Segment 3: [ 4, 8, 12, 16 ]
Segment 4: [ 5, 5, 5, 5, 5 ]
Partial inclusive inplace scan results (only even segments):
Segment 0: [ 1 ]
Segment 1: [ 2, 4 ]
Segment 2: [ 3, 6, 9 ]
Segment 3: [ 4, 8, 12, 16 ]
Segment 4: [ 5, 10, 15, 20, 25 ]
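
The numbers in the output above can be reproduced on the host with a plain loop. The following is a minimal sketch, not the TNL implementation: it assumes a CSR-like layout described by an offsets array, and the helper name scanSegmentsIf and its parameters are hypothetical. It illustrates that the running sum restarts in every segment and that segments rejected by the condition keep their original values.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of TNL. Segment i covers the global indexes
// offsets[ i ] .. offsets[ i + 1 ] - 1 (CSR-like layout).
template< typename Value, typename Condition >
std::vector< Value >
scanSegmentsIf( const std::vector< std::size_t >& offsets,
                const std::vector< Value >& data,
                Condition condition,
                bool inclusive )
{
   assert( ! offsets.empty() && offsets.back() == data.size() );
   std::vector< Value > result = data;
   for( std::size_t segmentIdx = 0; segmentIdx + 1 < offsets.size(); segmentIdx++ ) {
      if( ! condition( segmentIdx ) )
         continue;  // skipped segments keep their original values
      Value sum = Value{};  // identity of addition, reset for every segment
      for( std::size_t globalIdx = offsets[ segmentIdx ]; globalIdx < offsets[ segmentIdx + 1 ]; globalIdx++ ) {
         const Value current = data[ globalIdx ];
         // Inclusive scan includes the current element, exclusive scan does not.
         result[ globalIdx ] = inclusive ? sum + current : sum;
         sum = sum + current;
      }
   }
   return result;
}
```

Calling the helper with the offsets { 0, 1, 3, 6, 10, 15 } and a condition that accepts only segments 1 and 3 corresponds to the first partial scan in the example.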

◆ operator<<()

template<typename Segments, typename T = std::enable_if_t< isSegments_v< Segments > >>
std::ostream & TNL::Algorithms::Segments::operator<< ( std::ostream & str,
const Segments & segments )

Insertion operator of segments to output stream.

Template Parameters
Segments is the type of the source segments.
T is a helper parameter that enables this operator only for types satisfying isSegments_v.
Parameters
str is the output stream.
segments are the source segments.
Returns
reference to the output stream.
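
The defaulted template parameter in the signature above restricts the operator to segments types. The following self-contained sketch shows the same technique with hypothetical stand-ins (ToySegments, isToySegments_v, toString); it makes no claim about how isSegments_v itself is defined in the library.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical stand-ins for a segments type and the library's trait.
struct ToySegments
{
   std::vector< int > sizes;
};

template< typename T >
struct isToySegments : std::false_type
{};

template<>
struct isToySegments< ToySegments > : std::true_type
{};

template< typename T >
inline constexpr bool isToySegments_v = isToySegments< T >::value;

// The defaulted second parameter is ill-formed for non-segments types, so
// this overload silently drops out of overload resolution for them (SFINAE).
template< typename Segments, typename T = std::enable_if_t< isToySegments_v< Segments > > >
std::ostream&
operator<<( std::ostream& str, const Segments& segments )
{
   str << "[ ";
   for( std::size_t i = 0; i < segments.sizes.size(); i++ )
      str << ( i > 0 ? ", " : "" ) << segments.sizes[ i ];
   return str << " ]";
}

// Small usage helper: formats the segments into a string.
std::string
toString( const ToySegments& segments )
{
   std::ostringstream out;
   out << segments;
   return out.str();
}
```

Without the constraint, a template taking any const Segments& would compete with the standard insertion operators for every streamed type.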

◆ print()

template<typename Segments, typename Fetch, std::enable_if_t< isSegments_v< Segments >, bool > = true>
SegmentsPrinter< typename Segments::ConstViewType, Fetch > TNL::Algorithms::Segments::print ( const Segments & segments,
Fetch fetch )

Creates a printer of the data stored in the segments. The value of each element is obtained by calling the fetch lambda function.

Template Parameters
Segments is the type of the segments.
Fetch is the type of the lambda function for fetching data.
Parameters
segments is an instance of segments.
fetch is a lambda function for fetching data.
Returns
SegmentsPrinter object that can be inserted into an output stream.
Example
#include <cstdlib>
#include <iostream>
#include <TNL/Containers/Vector.h>
#include <TNL/Algorithms/Segments/CSR.h>
#include <TNL/Algorithms/Segments/Ellpack.h>
#include <TNL/Algorithms/Segments/ChunkedEllpack.h>
#include <TNL/Algorithms/Segments/BiEllpack.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Segments >
void
SegmentsExample()
{
   /***
    * Create segments with given segments sizes and print their setup.
    */
   Segments segments{ 1, 2, 3, 4, 5 };
   std::cout << "Segments sizes are: " << segments << "\n\n";
}

int
main( int argc, char* argv[] )
{
   std::cout << "Example of CSR segments on host:\n";
   SegmentsExample< TNL::Algorithms::Segments::CSR< TNL::Devices::Host, int > >();
   std::cout << "Example of Ellpack segments on host:\n";
   SegmentsExample< TNL::Algorithms::Segments::Ellpack< TNL::Devices::Host, int > >();
   std::cout << "Example of ChunkedEllpack segments on host:\n";
   SegmentsExample< TNL::Algorithms::Segments::ChunkedEllpack< TNL::Devices::Host, int > >();
   std::cout << "Example of BiEllpack segments on host:\n";
   SegmentsExample< TNL::Algorithms::Segments::BiEllpack< TNL::Devices::Host, int > >();
#ifdef __CUDACC__
   // The same formats allocated on the CUDA device
   std::cout << "Example of CSR segments on CUDA:\n";
   SegmentsExample< TNL::Algorithms::Segments::CSR< TNL::Devices::Cuda, int > >();
   std::cout << "Example of Ellpack segments on CUDA:\n";
   SegmentsExample< TNL::Algorithms::Segments::Ellpack< TNL::Devices::Cuda, int > >();
   std::cout << "Example of ChunkedEllpack segments on CUDA:\n";
   SegmentsExample< TNL::Algorithms::Segments::ChunkedEllpack< TNL::Devices::Cuda, int > >();
   std::cout << "Example of BiEllpack segments on CUDA:\n";
   SegmentsExample< TNL::Algorithms::Segments::BiEllpack< TNL::Devices::Cuda, int > >();
#endif
   return EXIT_SUCCESS;
}
Output
Example of CSR segments on host:
Segments sizes are: [ 1, 2, 3, 4, 5 ]
Example of Ellpack segments on host:
Segments sizes are: [ 5, 5, 5, 5, 5 ]
Example of ChunkedEllpack segments on host:
Segments sizes are: [ 17, 34, 51, 67, 87 ]
Example of BiEllpack segments on host:
Segments sizes are: [ 2, 4, 4, 5, 5 ]
Example of CSR segments on CUDA:
Segments sizes are: [ 1, 2, 3, 4, 5 ]
Example of Ellpack segments on CUDA:
Segments sizes are: [ 5, 5, 5, 5, 5 ]
Example of ChunkedEllpack segments on CUDA:
Segments sizes are: [ 17, 34, 51, 67, 87 ]
Example of BiEllpack segments on CUDA:
Segments sizes are: [ 2, 4, 4, 5, 5 ]
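
The signature of print() returns a printer object rather than writing to a stream directly. The sketch below illustrates that general pattern under stated assumptions: ToyPrinter and toyPrint are hypothetical names, the segments are modelled by a CSR-like offsets array, and the output format merely imitates the "Segment i: [ ... ]" lines shown in the scan example. It is not the SegmentsPrinter implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <utility>
#include <vector>

// Hypothetical proxy: stores what is needed for printing and is formatted
// only when it is inserted into a stream.
template< typename Fetch >
struct ToyPrinter
{
   std::vector< std::size_t > offsets;  // segment i covers [ offsets[ i ], offsets[ i + 1 ] )
   Fetch fetch;                         // maps a global index to the value to print
};

template< typename Fetch >
ToyPrinter< Fetch >
toyPrint( std::vector< std::size_t > offsets, Fetch fetch )
{
   assert( ! offsets.empty() );
   return { std::move( offsets ), std::move( fetch ) };
}

template< typename Fetch >
std::ostream&
operator<<( std::ostream& str, const ToyPrinter< Fetch >& printer )
{
   for( std::size_t seg = 0; seg + 1 < printer.offsets.size(); seg++ ) {
      str << "Segment " << seg << ": [ ";
      for( std::size_t globalIdx = printer.offsets[ seg ]; globalIdx < printer.offsets[ seg + 1 ]; globalIdx++ )
         str << ( globalIdx > printer.offsets[ seg ] ? ", " : "" ) << printer.fetch( globalIdx );
      str << " ]\n";
   }
   return str;
}
```

Returning a proxy lets the call be chained inside a longer insertion expression, for example followed by further text on the same stream.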

◆ printSegments()

template<typename Segments>
std::ostream & TNL::Algorithms::Segments::printSegments ( std::ostream & str,
const Segments & segments )

Prints the segments sizes, i.e. the segments setup, to the given output stream.

Template Parameters
Segments is the type of the segments.
Parameters
str is the output stream.
segments is an instance of segments.
Returns
Reference to the output stream.
Example
#include <cstdlib>
#include <iostream>
#include <TNL/Containers/Vector.h>
#include <TNL/Algorithms/Segments/CSR.h>
#include <TNL/Algorithms/Segments/Ellpack.h>
#include <TNL/Algorithms/Segments/ChunkedEllpack.h>
#include <TNL/Algorithms/Segments/BiEllpack.h>
#include <TNL/Devices/Host.h>
#include <TNL/Devices/Cuda.h>

template< typename Segments >
void
SegmentsExample()
{
   /***
    * Create segments with given segments sizes and print their setup.
    */
   Segments segments{ 1, 2, 3, 4, 5 };
   std::cout << "Segments sizes are: " << segments << "\n\n";
}

int
main( int argc, char* argv[] )
{
   std::cout << "Example of CSR segments on host:\n";
   SegmentsExample< TNL::Algorithms::Segments::CSR< TNL::Devices::Host, int > >();
   std::cout << "Example of Ellpack segments on host:\n";
   SegmentsExample< TNL::Algorithms::Segments::Ellpack< TNL::Devices::Host, int > >();
   std::cout << "Example of ChunkedEllpack segments on host:\n";
   SegmentsExample< TNL::Algorithms::Segments::ChunkedEllpack< TNL::Devices::Host, int > >();
   std::cout << "Example of BiEllpack segments on host:\n";
   SegmentsExample< TNL::Algorithms::Segments::BiEllpack< TNL::Devices::Host, int > >();
#ifdef __CUDACC__
   // The same formats allocated on the CUDA device
   std::cout << "Example of CSR segments on CUDA:\n";
   SegmentsExample< TNL::Algorithms::Segments::CSR< TNL::Devices::Cuda, int > >();
   std::cout << "Example of Ellpack segments on CUDA:\n";
   SegmentsExample< TNL::Algorithms::Segments::Ellpack< TNL::Devices::Cuda, int > >();
   std::cout << "Example of ChunkedEllpack segments on CUDA:\n";
   SegmentsExample< TNL::Algorithms::Segments::ChunkedEllpack< TNL::Devices::Cuda, int > >();
   std::cout << "Example of BiEllpack segments on CUDA:\n";
   SegmentsExample< TNL::Algorithms::Segments::BiEllpack< TNL::Devices::Cuda, int > >();
#endif
   return EXIT_SUCCESS;
}
Output
Example of CSR segments on host:
Segments sizes are: [ 1, 2, 3, 4, 5 ]
Example of Ellpack segments on host:
Segments sizes are: [ 5, 5, 5, 5, 5 ]
Example of ChunkedEllpack segments on host:
Segments sizes are: [ 17, 34, 51, 67, 87 ]
Example of BiEllpack segments on host:
Segments sizes are: [ 2, 4, 4, 5, 5 ]
Example of CSR segments on CUDA:
Segments sizes are: [ 1, 2, 3, 4, 5 ]
Example of Ellpack segments on CUDA:
Segments sizes are: [ 5, 5, 5, 5, 5 ]
Example of ChunkedEllpack segments on CUDA:
Segments sizes are: [ 17, 34, 51, 67, 87 ]
Example of BiEllpack segments on CUDA:
Segments sizes are: [ 2, 4, 4, 5, 5 ]
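
The output shows that the sizes reported by a format may be larger than the requested sizes 1 to 5. The Ellpack line is consistent with every segment being padded to the size of the largest one. The sketch below reproduces only that case and assumes no additional alignment; the function name is hypothetical, and the ChunkedEllpack and BiEllpack numbers depend on format-specific parameters that are not modelled here.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Illustrative only: Ellpack-style padding where all segments share the
// size of the largest requested segment (no extra alignment assumed).
std::vector< int >
ellpackSegmentSizes( const std::vector< int >& requested )
{
   assert( ! requested.empty() );
   const int widest = *std::max_element( requested.begin(), requested.end() );
   return std::vector< int >( requested.size(), widest );
}
```

Under this assumption the requested sizes 1, 2, 3, 4, 5 occupy 25 slots in total, compared with 15 for CSR, which stores the segments without padding.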

◆ segmentInsertionSort()

template<typename SegmentView, typename Fetch, typename Compare, typename Swap>
__cuda_callable__ void TNL::Algorithms::Segments::segmentInsertionSort ( SegmentView segment,
Fetch && fetch,
Compare && compare,
Swap && swap )

Sorts the elements of a segment using the insertion sort algorithm.

Template Parameters
SegmentView Type of the segment view.
Fetch Type of the fetch function.
Compare Type of the comparison function.
Swap Type of the swap function.
Parameters
segment The segment view to be sorted.
fetch Function to fetch the element value at a given position. See Fetch Lambda.
compare Function to compare two elements. See Compare Lambda.
swap Function to swap two elements. See Swap Lambda.

This function performs an in-place sort of the segment using the insertion sort algorithm. The sorting is done in ascending order based on the comparison function provided.
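
A minimal sketch of insertion sort driven by the three callbacks follows. The callback signatures used here are assumptions made for illustration (the actual ones are described under Fetch Lambda, Compare Lambda and Swap Lambda), and the segment is represented only by its size instead of a segment view.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Assumed callback signatures (hypothetical, the library's may differ):
//   fetch( i )       returns the value at local position i
//   compare( a, b )  returns true if a should precede b
//   swap( i, j )     exchanges the elements at local positions i and j
template< typename Fetch, typename Compare, typename Swap >
void
insertionSortSketch( int size, Fetch&& fetch, Compare&& compare, Swap&& swap )
{
   assert( size >= 0 );
   for( int i = 1; i < size; i++ )
      // Move element i to the left while it should precede its left neighbour.
      for( int j = i; j > 0 && compare( fetch( j ), fetch( j - 1 ) ); j-- )
         swap( j - 1, j );
}
```

With a compare callback that returns a < b the result is in ascending order. Routing all reads through fetch and all writes through swap lets the caller keep several arrays, such as values and column indexes, permuted consistently.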

◆ sequentialForAllSegments()

template<typename Segments, typename Function>
void TNL::Algorithms::Segments::sequentialForAllSegments ( const Segments & segments,
Function && function )

Iterates sequentially over all segments and calls the given lambda function for each segment.

See also: Overview of Segment Traversal Functions

This function is just a sequential variant of TNL::Algorithms::Segments::forAllSegments.

◆ sequentialForSegments()

template<typename Segments, typename IndexBegin, typename IndexEnd, typename Function>
void TNL::Algorithms::Segments::sequentialForSegments ( const Segments & segments,
IndexBegin begin,
IndexEnd end,
Function && function )

Iterates sequentially over segments in the given range of segment indexes and calls the given lambda function for each segment.

This function is just a sequential variant of TNL::Algorithms::Segments::forSegments.
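
The following sketch shows what sequential traversal over a range of segments amounts to. It is an illustration under assumptions, not the library code: ToySegmentView and sequentialForSegmentsSketch are hypothetical names, and the segments are modelled by a CSR-like offsets array.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a segment view: the segment index and the
// half-open range of global element indexes the segment covers.
struct ToySegmentView
{
   std::size_t segmentIdx;
   std::size_t begin;
   std::size_t end;

   std::size_t
   getSize() const
   {
      return end - begin;
   }
};

// Visits segments begin, begin + 1, ..., end - 1 one after another.
template< typename Function >
void
sequentialForSegmentsSketch( const std::vector< std::size_t >& offsets,
                             std::size_t begin,
                             std::size_t end,
                             Function&& function )
{
   // offsets holds one more entry than there are segments
   assert( begin <= end && end < offsets.size() );
   for( std::size_t i = begin; i < end; i++ )
      function( ToySegmentView{ i, offsets[ i ], offsets[ i + 1 ] } );
}
```

Because the segments are visited in order by a single thread, the callback may print or append to a container without synchronization, which is convenient for debugging.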

Variable Documentation

◆ isAdaptiveCSRSegments_v

template<typename Segments>
bool TNL::Algorithms::Segments::isAdaptiveCSRSegments_v = isAdaptiveCSRSegments< Segments >::value
inline constexpr

Returns true if the given type is AdaptiveCSR segments.

Template Parameters
Segments The type of the segments.

◆ isSortedSegments_v

template<typename Segments>
bool TNL::Algorithms::Segments::isSortedSegments_v = isSortedSegments< Segments >::value
inline constexpr

Returns true if the given type is sorted segments.

Template Parameters
Segments The type of the segments.
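
Variable templates of this kind are typically used to select code paths at compile time. The sketch below mirrors the declaration pattern shown above with hypothetical stand-in types (PlainToySegments, SortedToySegments, isSortedToySegments); it does not reproduce how the library specializes its own traits.

```cpp
#include <string_view>
#include <type_traits>

// Hypothetical segments types used only for this illustration.
struct PlainToySegments
{};
struct SortedToySegments
{};

// Primary template: a type is not considered sorted segments by default.
template< typename Segments >
struct isSortedToySegments : std::false_type
{};

// Specialization marking the one type that is.
template<>
struct isSortedToySegments< SortedToySegments > : std::true_type
{};

// Same shape as the variables documented above.
template< typename Segments >
inline constexpr bool isSortedToySegments_v = isSortedToySegments< Segments >::value;

// Compile-time dispatch: only the selected branch is instantiated.
template< typename Segments >
constexpr std::string_view
describe()
{
   if constexpr( isSortedToySegments_v< Segments > )
      return "sorted";
   else
      return "plain";
}
```

The sketch requires C++17 for inline variables and if constexpr.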