Template Numerical Library version main:0b2c40f
TNL::Benchmarks::Benchmark Class Reference

Base class for running benchmarks with timing and logging support. More...

#include <TNL/Benchmarks/Benchmark.h>


Public Types

using MetadataColumns = Logging::MetadataColumns
using MetadataElement = Logging::MetadataElement
using SolverMonitorType = Solvers::IterativeSolverMonitor< double >

Public Member Functions

void addErrorMessage (const std::string &message)
 Logs an error message through all configured loggers.
void addLogger (std::unique_ptr< Logging > logger)
 Adds a logger for outputting benchmark results.
double getBaseTime () const
 Returns the base time used for speedup calculations.
bool getCatchExceptions () const
 Returns whether exceptions are caught during timing.
SolverMonitorType & getMonitor ()
 Returns reference to the solver monitor.
void setCatchExceptions (bool catchExceptions)
 Sets whether to catch exceptions during timing of computations.
void setDatasetSize (double datasetSize=0.0, double baseTime=0.0)
 Sets dataset size and base time for derived metrics.
void setLoops (std::size_t loops)
 Sets the number of iterations for each measurement.
void setMetadataColumns (const MetadataColumns &metadata)
 Sets metadata columns for all subsequent result rows.
void setMetadataElement (const typename MetadataColumns::value_type &element)
 Updates or adds a single metadata element.
void setMinTime (double minTime)
 Sets the minimum runtime for measurements.
void setOperation (const std::string &operation, double datasetSize=0.0, double baseTime=0.0)
 Sets the current operation name and optionally overrides dataset size/base time.
void setOperationsPerLoop (std::size_t operationsPerLoop)
 Sets the number of operations performed per loop iteration.
void setup (const Config::ParameterContainer &parameters, const std::string &programName="")
 Initializes the benchmark from parsed parameters.
void setWarmupLoops (std::size_t warmupLoops)
 Sets the number of warmup iterations.
void setWarmupMinTime (double warmupMinTime)
 Sets the minimum warmup runtime.
template<typename Device, typename ComputeFunction>
BenchmarkResult time (const std::string &performer, ComputeFunction &compute)
 Times a compute function without explicit reset (returns result).
template<typename Device, typename ComputeFunction>
void time (const std::string &performer, ComputeFunction &compute, BenchmarkResult &result)
 Times a compute function without explicit reset.
template<typename Device, typename ResetFunction, typename ComputeFunction>
BenchmarkResult time (ResetFunction reset, const std::string &performer, ComputeFunction &compute)
 Times a compute function with reset between iterations (returns result).
template<typename Device, typename ResetFunction, typename ComputeFunction>
void time (ResetFunction reset, const std::string &performer, ComputeFunction &compute, BenchmarkResult &result)
 Times a compute function with reset between iterations.

Static Public Member Functions

static void configSetup (Config::ConfigDescription &config)
 Configures benchmark-related command line options.

Protected Types

using BenchmarkLoggers = std::vector< std::unique_ptr< Logging > >

Protected Attributes

double baseTime = 0.0
bool catchExceptions = true
double datasetSize = 0.0
std::ofstream logFile
BenchmarkLoggers loggers
std::size_t loops = 10
double minTime = 0.0
SolverMonitorType monitor
std::size_t operations_per_loop = 0
std::size_t warmupLoops = 1
double warmupMinTime = 0.0

Detailed Description

Base class for running benchmarks with timing and logging support.

The Benchmark class provides a unified interface for measuring performance of computational kernels across different devices (CPU, GPU). It supports:

  • Multiple iterations with automatic loop count determination
  • Minimum runtime specification for statistical significance
  • Automatic warmup iterations before timing begins (configurable count and minimum time)
  • Configurable output logging
  • Metadata tracking (device, operation, performer, etc.)
  • Bandwidth and speedup calculations
  • CPU cycle counting (host devices only)

Example usage:

// Configure benchmark
Config::ConfigDescription config;
Benchmark::configSetup( config );
// Parse command line parameters
auto parameters = config.parseCommandLine();
Benchmark benchmark;
benchmark.setup( parameters, argv[ 0 ] );
// Set up operation to benchmark
benchmark.setMetadataColumns(
   Benchmark::MetadataColumns( {
      { "precision", getType< Real >() },
      { "size", std::to_string( size ) },
   } ) );
double datasetSize = size * sizeof( Real ) / oneGB;
benchmark.setOperation( "operation-name", datasetSize );
// Define reset and compute functions
auto reset = []() { ... };
auto compute = []() { ... };
// Run benchmark
benchmark.time< Device >( reset, "performer-name", compute );

Member Function Documentation

◆ addErrorMessage()

void TNL::Benchmarks::Benchmark::addErrorMessage ( const std::string & message)

Logs an error message through all configured loggers.

Should be called when the time method cannot be executed due to errors like memory allocation failures.

Parameters
message: Error description

◆ addLogger()

void TNL::Benchmarks::Benchmark::addLogger ( std::unique_ptr< Logging > logger)

Adds a logger for outputting benchmark results.

Multiple loggers can be added (e.g., both JsonLogging and TerminalLogger).

Parameters
logger: Unique pointer to a logger object
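
The fan-out to several loggers can be pictured with the following minimal sketch. It is not TNL's implementation: SketchLogger, MemoryLogger and LoggerSet are hypothetical names introduced only for illustration, and the real Logging class has a different interface. The sketch only shows the ownership pattern suggested by the signatures above, namely a vector of unique pointers to which every message is forwarded.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical logger interface (assumption; not the TNL Logging class).
struct SketchLogger
{
   virtual ~SketchLogger() = default;
   virtual void writeError( const std::string& message ) = 0;
};

// Hypothetical logger that appends every message to an external vector.
struct MemoryLogger : SketchLogger
{
   explicit MemoryLogger( std::vector< std::string >& target ) : sink( target ) {}

   void writeError( const std::string& message ) override
   {
      sink.push_back( message );
   }

   std::vector< std::string >& sink;
};

// Owns any number of loggers and forwards each message to all of them.
class LoggerSet
{
public:
   void addLogger( std::unique_ptr< SketchLogger > logger )
   {
      loggers.push_back( std::move( logger ) );
   }

   void addErrorMessage( const std::string& message )
   {
      for( auto& logger : loggers )
         logger->writeError( message );
   }

   std::size_t size() const
   {
      return loggers.size();
   }

private:
   std::vector< std::unique_ptr< SketchLogger > > loggers;
};
```

Passing the logger by std::unique_ptr makes the transfer of ownership explicit at the call site, which is presumably why addLogger() takes its argument by value.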

◆ configSetup()

void TNL::Benchmarks::Benchmark::configSetup ( Config::ConfigDescription & config)
static

Configures benchmark-related command line options.

Must be called before parsing command line arguments. Adds the following configuration entries:

  • log-file: Path to JSONL output file (default: <program>.log)
  • output-mode: "append" or "overwrite"
  • loops: Number of iterations (default: 10)
  • min-time: Minimum runtime in seconds (default: 0.0)
  • warmup-loops: Number of warmup iterations (default: 1)
  • warmup-min-time: Minimum warmup runtime in seconds (default: 0.0)
  • verbose: Verbosity level (default: 1)
  • catch-exceptions: Catch exceptions during timing (default: true)
Parameters
config: Reference to configuration description object

◆ getBaseTime()

double TNL::Benchmarks::Benchmark::getBaseTime ( ) const
nodiscard

Returns the base time used for speedup calculations.

Returns
Current base time value

◆ getCatchExceptions()

bool TNL::Benchmarks::Benchmark::getCatchExceptions ( ) const
nodiscard

Returns whether exceptions are caught during timing.

Returns
true if exceptions are caught, false if they propagate

◆ getMonitor()

SolverMonitorType & TNL::Benchmarks::Benchmark::getMonitor ( )
nodiscard

Returns reference to the solver monitor.

The monitor tracks iterative solver convergence during benchmarking.

Returns
Reference to SolverMonitorType instance

◆ setCatchExceptions()

void TNL::Benchmarks::Benchmark::setCatchExceptions ( bool catchExceptions)

Sets whether to catch exceptions during timing of computations.

When enabled (default), exceptions thrown during benchmark execution are caught and logged as error messages. When disabled, exceptions propagate up to the caller, allowing for stricter error handling in automated tests.

Parameters
catchExceptions: true to catch exceptions (default), false to propagate them
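
The two modes can be illustrated with a small stand-alone sketch. The helper runGuarded and its error vector are hypothetical and stand in for the benchmark's internal handling, which may differ in detail (for example in which exception types are caught).

```cpp
#include <exception>
#include <functional>
#include <stdexcept>
#include <string>
#include <vector>

// Runs `compute` and returns true if it finished without throwing.
// With catchExceptions == true, a std::exception is recorded in `errors`
// and false is returned; otherwise the exception propagates to the caller.
// (Hypothetical helper, not part of TNL.)
bool runGuarded( const std::function< void() >& compute,
                 bool catchExceptions,
                 std::vector< std::string >& errors )
{
   if( ! catchExceptions ) {
      compute();  // any exception leaves this function
      return true;
   }
   try {
      compute();
      return true;
   }
   catch( const std::exception& e ) {
      errors.push_back( e.what() );  // logged instead of propagated
      return false;
   }
}
```

Disabling the catch is the stricter choice for automated tests, because a failing kernel then aborts the test instead of producing only a log entry.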

◆ setDatasetSize()

void TNL::Benchmarks::Benchmark::setDatasetSize ( double datasetSize = 0.0,
double baseTime = 0.0 )

Sets dataset size and base time for derived metrics.

Parameters
datasetSize: Dataset size in GB
baseTime: Baseline time for speedup calculation
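
As a rough illustration of what the derived metrics could look like, the sketch below assumes the conventional definitions bandwidth = datasetSize / time and speedup = baseTime / time, and treats a value of 0.0 as "not set" (matching the default arguments). These formulas and the names DerivedMetrics and deriveMetrics are assumptions, not taken from the TNL sources.

```cpp
// Hypothetical container for the derived metrics (not a TNL type).
struct DerivedMetrics
{
   double bandwidth;  // GB/s, 0.0 if it cannot be computed
   double speedup;    // dimensionless, 0.0 if no baseline is set
};

// Assumed formulas: bandwidth = datasetSize / time, speedup = baseTime / time.
DerivedMetrics deriveMetrics( double datasetSizeGB, double baseTime, double time )
{
   DerivedMetrics metrics{ 0.0, 0.0 };
   if( time > 0.0 ) {
      metrics.bandwidth = datasetSizeGB / time;
      if( baseTime > 0.0 )
         metrics.speedup = baseTime / time;
   }
   return metrics;
}
```

Under these assumptions, a 2 GB dataset processed in 0.5 s with a 1 s baseline gives a bandwidth of 4 GB/s and a speedup of 2.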

◆ setLoops()

void TNL::Benchmarks::Benchmark::setLoops ( std::size_t loops)

Sets the number of iterations for each measurement.

Parameters
loops: Number of loops to execute

◆ setMetadataColumns()

void TNL::Benchmarks::Benchmark::setMetadataColumns ( const MetadataColumns & metadata)

Sets metadata columns for all subsequent result rows.

Metadata values persist until explicitly changed. Common metadata includes device type, problem size, algorithm variant, etc.

Parameters
metadata: Vector of key-value pairs to set as metadata

◆ setMetadataElement()

void TNL::Benchmarks::Benchmark::setMetadataElement ( const typename MetadataColumns::value_type & element)

Updates or adds a single metadata element.

Useful for incrementally building metadata when running multiple related benchmarks.

Parameters
element: Key-value pair to set
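
The "update or add" behaviour can be sketched on a plain vector of string pairs. The alias SketchColumns and the function setElement are hypothetical; the actual MetadataColumns type is defined by Logging and is only assumed here to be a vector of key-value pairs, as the setMetadataColumns() documentation suggests.

```cpp
#include <string>
#include <utility>
#include <vector>

// Assumed shape of the metadata container (hypothetical alias).
using SketchColumns = std::vector< std::pair< std::string, std::string > >;

// Overwrites the value if the key already exists, otherwise appends the pair.
void setElement( SketchColumns& columns, const SketchColumns::value_type& element )
{
   for( auto& column : columns )
      if( column.first == element.first ) {
         column.second = element.second;  // update in place, order is kept
         return;
      }
   columns.push_back( element );  // key not found: add a new column
}
```

In a loop over problem sizes, for instance, only the "size" element would need to be set again while the remaining columns stay untouched.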

◆ setMinTime()

void TNL::Benchmarks::Benchmark::setMinTime ( double minTime)

Sets the minimum runtime for measurements.

If specified, the benchmark will continue iterating until at least this much real time has elapsed, regardless of the loop count.

Parameters
minTime: Minimum time in seconds
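
One plausible reading of this stopping rule is that the loop count acts as a lower bound and iteration continues until the minimum time has also elapsed. The sketch below implements that reading with std::chrono; LoopStats and runTimed are hypothetical names, and the actual interplay of loops and minTime in TNL may differ.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>

// Hypothetical result of a timed run (not a TNL type).
struct LoopStats
{
   std::size_t loops;  // iterations actually executed
   double elapsed;     // wall-clock seconds spent in the loop
};

// Repeats `compute` until at least `loops` iterations have run AND at least
// `minTime` seconds have elapsed (assumed combination of the two criteria).
LoopStats runTimed( const std::function< void() >& compute, std::size_t loops, double minTime )
{
   using Clock = std::chrono::steady_clock;
   const auto start = Clock::now();
   std::size_t done = 0;
   double elapsed = 0.0;
   while( done < loops || elapsed < minTime ) {
      compute();
      ++done;
      elapsed = std::chrono::duration< double >( Clock::now() - start ).count();
   }
   return { done, elapsed };
}
```

With minTime left at 0.0 the sketch executes exactly the requested number of loops; a positive minTime lets very fast kernels run more iterations so that the measured interval is not dominated by timer resolution.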

◆ setOperation()

void TNL::Benchmarks::Benchmark::setOperation ( const std::string & operation,
double datasetSize = 0.0,
double baseTime = 0.0 )

Sets the current operation name and optionally overrides dataset size/base time.

Operations create vertical divisions in result tables. The baseTime parameter can be used to establish a new baseline for subsequent speedup calculations.

Parameters
operation: Name of the current operation
datasetSize: Optional dataset size override in GB
baseTime: Optional baseline time override

◆ setOperationsPerLoop()

void TNL::Benchmarks::Benchmark::setOperationsPerLoop ( std::size_t operationsPerLoop)

Sets the number of operations performed per loop iteration.

Used to calculate cycles per operation metric.

Parameters
operationsPerLoop: Number of operations per loop

◆ setup()

void TNL::Benchmarks::Benchmark::setup ( const Config::ParameterContainer & parameters,
const std::string & programName = "" )

Initializes the benchmark from parsed parameters.

Extracts benchmark settings from the parameter container and initializes loggers (JSON and/or terminal; only on rank 0 in MPI configurations).

If the log-file parameter is empty, the log file name defaults to <programName>.log, where programName is typically derived from argv[0].

Parameters
parameters: Parsed configuration parameters
programName: Program name used as default log file base name
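
The fallback for the log file name described above amounts to the following one-liner. The function resolveLogFile is a hypothetical helper written for illustration; setup() performs the equivalent step internally.

```cpp
#include <string>

// Returns the explicit log file if one was given, otherwise "<programName>.log".
// (Hypothetical helper mirroring the documented default.)
std::string resolveLogFile( const std::string& logFile, const std::string& programName )
{
   return logFile.empty() ? programName + ".log" : logFile;
}
```

Note that if argv[0] is passed unchanged as programName, any directory prefix it contains becomes part of the default log file path.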

◆ setWarmupLoops()

void TNL::Benchmarks::Benchmark::setWarmupLoops ( std::size_t warmupLoops)

Sets the number of warmup iterations.

Warmup iterations are executed before timing begins to stabilize thermal and frequency states and amortize one-time costs such as CUDA JIT compilation.

Parameters
warmupLoops: Number of warmup loops (0 to disable)

◆ setWarmupMinTime()

void TNL::Benchmarks::Benchmark::setWarmupMinTime ( double warmupMinTime)

Sets the minimum warmup runtime.

If specified, warmup will continue until at least this much real time has elapsed, regardless of the warmup loop count.

Parameters
warmupMinTime: Minimum warmup time in seconds

◆ time() [1/4]

template<typename Device, typename ComputeFunction>
BenchmarkResult TNL::Benchmarks::Benchmark::time ( const std::string & performer,
ComputeFunction & compute )

Times a compute function without explicit reset (returns result).

Equivalent to calling time with an empty reset function.

Template Parameters
Device: Device type
ComputeFunction: Compute function type
Parameters
performer: Name identifying the implementation
compute: Function to benchmark
Returns
BenchmarkResult containing timing data

◆ time() [2/4]

template<typename Device, typename ComputeFunction>
void TNL::Benchmarks::Benchmark::time ( const std::string & performer,
ComputeFunction & compute,
BenchmarkResult & result )

Times a compute function without explicit reset.

Equivalent to calling time with an empty reset function.

Template Parameters
Device: Device type
ComputeFunction: Compute function type
Parameters
performer: Name identifying the implementation
compute: Function to benchmark
result: Output structure for benchmark results

◆ time() [3/4]

template<typename Device, typename ResetFunction, typename ComputeFunction>
BenchmarkResult TNL::Benchmarks::Benchmark::time ( ResetFunction reset,
const std::string & performer,
ComputeFunction & compute )

Times a compute function with reset between iterations (returns result).

Convenience overload that creates and returns a BenchmarkResult object.

Template Parameters
Device: Device type
ResetFunction: Reset function type
ComputeFunction: Compute function type
Parameters
reset: Function called before each compute iteration
performer: Name identifying the implementation
compute: Function to benchmark
Returns
BenchmarkResult containing timing data

◆ time() [4/4]

template<typename Device, typename ResetFunction, typename ComputeFunction>
void TNL::Benchmarks::Benchmark::time ( ResetFunction reset,
const std::string & performer,
ComputeFunction & compute,
BenchmarkResult & result )

Times a compute function with reset between iterations.

Executes the compute function multiple times, calling reset() before each iteration. Results are logged through configured loggers.

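Untimed warmup iterations (reset + compute) are performed automatically before the timed loop begins, to stabilize thermal and frequency states and amortize one-time costs such as CUDA JIT compilation. By default there is one warmup iteration; the count and minimum warmup time can be changed with setWarmupLoops() and setWarmupMinTime().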

Template Parameters
Device: Device type (e.g., TNL::Devices::Host, TNL::Devices::Cuda)
ResetFunction: Callable that resets state before each iteration
ComputeFunction: Callable containing the code to benchmark
Parameters
reset: Function called before each compute iteration
performer: Name identifying the implementation being tested
compute: Function to benchmark
result: Output structure for benchmark results
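
The order of warmup, reset and compute calls can be sketched without any TNL dependency as follows. The function timeWithReset is hypothetical; in particular, excluding reset() from the timed interval and returning the mean time per loop are assumptions made for this illustration, not statements about the library's internals.

```cpp
#include <chrono>
#include <cstddef>

// Runs `warmupLoops` untimed (reset + compute) iterations, then `loops`
// iterations in which only compute() is timed. Returns the mean time per
// loop in seconds. (Hypothetical sketch, not the TNL implementation.)
template< typename ResetFunction, typename ComputeFunction >
double timeWithReset( ResetFunction reset,
                      ComputeFunction& compute,
                      std::size_t warmupLoops,
                      std::size_t loops )
{
   using Clock = std::chrono::steady_clock;

   // Warmup: same call sequence as the measurement, but not timed.
   for( std::size_t i = 0; i < warmupLoops; i++ ) {
      reset();
      compute();
   }

   double total = 0.0;
   for( std::size_t i = 0; i < loops; i++ ) {
      reset();  // restore the initial state (assumed to be outside the timed interval)
      const auto start = Clock::now();
      compute();
      total += std::chrono::duration< double >( Clock::now() - start ).count();
   }
   return loops > 0 ? total / static_cast< double >( loops ) : 0.0;
}
```

A reset function is useful whenever compute() modifies its input, for example an in-place sort: without it, every iteration after the first would measure the already-sorted case.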

The documentation for this class was generated from the following file: