Introduction
In this section, we describe solvers of ordinary differential equations (ODE) characterized by the following equation:
\[ \frac{\mathrm{d} \vec u(t)}{\mathrm{d}t} = \vec f( t, \vec u(t)) \text{ on } (0,T), \]
and the initial condition
\[ \vec u( 0 ) = \vec u_{ini}, \]
where \( T>0 \). This class of problems can be solved by TNL::Solvers::ODE::ODESolver which incorporates the following template parameters:
- Method - specifies the numerical method used for solving the ODE.
- Vector - denotes a container used to represent the vector \( \vec u(t) \).
- SolverMonitor - is a tool for tracking the progress of the ODE solver.
Method
TNL provides several methods for ODE solution, categorized based on their order of accuracy:
1-st order accuracy methods:
- TNL::Solvers::ODE::Methods::Euler or TNL::Solvers::ODE::Methods::Matlab::ode1 - the forward Euler method.
- TNL::Solvers::ODE::Methods::Midpoint - the explicit midpoint method.
2-nd order accuracy methods:
- TNL::Solvers::ODE::Methods::Heun2 or TNL::Solvers::ODE::Methods::Matlab::ode2 - the Heun method with adaptive choice of the time step.
- TNL::Solvers::ODE::Methods::Ralston2 - the Ralston method.
- TNL::Solvers::ODE::Methods::Fehlberg2 - the Fehlberg method with adaptive choice of the time step.
3-rd order accuracy methods:
- TNL::Solvers::ODE::Methods::Kutta - the Kutta method.
- TNL::Solvers::ODE::Methods::Heun3 - the Heun method.
- TNL::Solvers::ODE::Methods::VanDerHouwenWray - the Van der Houwen/Wray method.
- TNL::Solvers::ODE::Methods::Ralston3 - the Ralston method.
- TNL::Solvers::ODE::Methods::SSPRK3 - the Strong Stability Preserving Runge-Kutta method.
- TNL::Solvers::ODE::Methods::BogackiShampin or TNL::Solvers::ODE::Methods::Matlab::ode23 - Bogacki-Shampin method with adaptive choice of the time step.
4-th order accuracy methods:
- TNL::Solvers::ODE::Methods::OriginalRungeKutta - the "original" Runge-Kutta method.
- TNL::Solvers::ODE::Methods::Rule38 - 3/8 rule method.
- TNL::Solvers::ODE::Methods::Ralston4 - the Ralston method.
- TNL::Solvers::ODE::Methods::KuttaMerson - the Runge-Kutta-Merson method with adaptive choice of the time step.
5-th order accuracy methods:
- TNL::Solvers::ODE::Methods::CashKarp - the Cash-Karp method with adaptive choice of the time step.
- TNL::Solvers::ODE::Methods::DormandPrince or TNL::Solvers::ODE::Methods::Matlab::ode45 - the Dormand-Prince with adaptive choice of the time step.
- TNL::Solvers::ODE::Methods::Fehlberg5 - the Fehlberg method with adaptive choice of the time step.
Vector
The vector \( \vec u(t) \) in ODE solvers can be represented using different types of containers, depending on the size and nature of the ODE system:
- Static vectors (TNL::Containers::StaticVector): This is suitable for small systems of ODEs with a fixed number of unknowns. Utilizing StaticVector allows the ODE solver to be executed within GPU kernels. This capability is particularly useful for scenarios like running multiple sequential solvers in parallel, as in the case of TNL::Algorithms::parallelFor.
- Dynamic vectors (TNL::Containers::Vector or TNL::Containers::VectorView): These are preferred when dealing with large systems of ODEs, such as those arising in the solution of parabolic partial differential equations using the method of lines. In these instances, the solver typically handles a single, large-scale problem that can be executed in parallel internally.
Static ODE solvers
Scalar problem
Static solvers are primarily intended for scenarios where \( x \in R \) is scalar or \( \vec x \in R^n \) is vector with a relatively small dimension. We will demonstrate this through a scalar problem defined as follows:
\[ \frac{\mathrm{d}u}{\mathrm{d}t} = t \sin ( t ) \text{ on } (0,T), \]
with the initial condition
\[ u( 0 ) = 0. \]
First, we define the Real type to represent floating-point arithmetic, here chosen as double.
Next we define the main function of the solver:
10int
11main( int argc, char* argv[] )
12{
18
20 const Real final_t = 10.0;
21 const Real tau = 0.001;
22 const Real output_time_step = 0.25;
24
26 ODESolver solver;
27 solver.setTau( tau );
28 solver.setTime( 0.0 );
29 Vector u = 0.0;
31
33 while( solver.getTime() < final_t ) {
34 solver.setStopTime(
TNL::min( solver.getTime() + output_time_step, final_t ) );
35 auto f = []( const Real& t, const Real& tau, const Vector& u, Vector& fu )
36 {
38 };
39 solver.solve( u, f );
41 }
43}
We begin the main function by defining necessary types:
We set Vector as StaticVector with a size of one. We also choose the basic Euler method as the Method. Finally, we construct the ODESolver type.
Next we define the time-related variables:
20 const Real final_t = 10.0;
21 const Real tau = 0.001;
22 const Real output_time_step = 0.25;
Here:
- final_t represents the size of the time interval \( (0,T)\).
- tau is the integration time step.
- output_time_step denotes checkpoints in which we will print value of the solution \( u(t)\).
Next, we initialize the solver:
26 ODESolver solver;
27 solver.setTau( tau );
28 solver.setTime( 0.0 );
29 Vector u = 0.0;
We create an instance of the ODESolver, set the integration time step (using setTau) and the initial time of the solver with setTime. Then, we initialize the variable u (representing the ODE solution) to the initial state \( u(0) = 0\).
We proceed to the main loop iterating over the time interval \( (0,T) \):
33 while( solver.getTime() < final_t ) {
34 solver.setStopTime(
TNL::min( solver.getTime() + output_time_step, final_t ) );
35 auto f = []( const Real& t, const Real& tau, const Vector& u, Vector& fu )
36 {
38 };
39 solver.solve( u, f );
41 }
We iterate with the time variable \( t \) (represented by getTime) until the time \( T \) (represented by final_t) with step given by the frequency of the checkpoints (represented by output_time_steps). We let the solver to iterate until the next checkpoint or the end of the interval \((0,T) \) depending on what occurs first (it is expressed by TNL::min(solver.getTime()+output_time_step,final_t)). The lambda function f represents the right-hand side \( f \) of the ODE being solved. The lambda function receives the following arguments:
- t is the current value of the time variable \( t \in (0,T)\),
- tau is the current integration time step,
- u is the current value of the solution \( u(t)\),
- fu is a reference on a variable into which we evaluate the right-hand side \( f(u,t) \) on the ODE.
The lambda function is supposed to compute just the value of fu. It is fu=t*sin(t) in our case. Finally we call the ODE solver (solver.solve(u,f)). As parameters, we pass the variable u representing the solution \( u(t)\) and a lambda function representing the right-hand side of the ODE. At the end, we print values of the solution at given checkpoints.
The complete example looks as:
1#include <iostream>
2#include <TNL/Solvers/ODE/ODESolver.h>
3#include <TNL/Solvers/ODE/Methods/Euler.h>
4
6using Real = double;
8
10int
11main( int argc, char* argv[] )
12{
18
20 const Real final_t = 10.0;
21 const Real tau = 0.001;
22 const Real output_time_step = 0.25;
24
26 ODESolver solver;
27 solver.setTau( tau );
28 solver.setTime( 0.0 );
29 Vector u = 0.0;
31
33 while( solver.getTime() < final_t ) {
34 solver.setStopTime(
TNL::min( solver.getTime() + output_time_step, final_t ) );
35 auto f = []( const Real& t, const Real& tau, const Vector& u, Vector& fu )
36 {
37 fu = t * sin( t );
38 };
39 solver.solve( u, f );
41 }
43}
Vector with constant size.
Definition StaticVector.h:19
constexpr ResultType min(const T1 &a, const T2 &b)
This function returns minimum of two numbers.
Definition Math.h:24
First order Euler method.
Definition Euler.h:15
Integrator or solver of systems of ordinary differential equations.
Definition ODESolver.h:62
The output is as follows:
0.25 0.00514497
0.5 0.0405145
0.75 0.132617
1 0.300748
1.25 0.554239
1.5 0.890641
1.75 1.29506
2 1.74068
2.25 2.19059
2.5 2.60058
2.75 2.92297
3 3.11089
3.25 3.1229
3.5 2.92743
3.75 2.50661
4 1.85929
4.25 1.00278
4.5 -0.0267497
4.75 -1.17553
5 -2.37484
5.25 -3.54513
5.5 -4.60128
5.75 -5.45867
6 -6.0396
6.25 -6.27963
6.5 -6.1334
6.75 -5.57927
7 -4.62263
7.25 -3.29735
7.5 -1.66528
7.75 0.186339
8 2.1494
8.25 4.10122
8.5 5.91219
8.75 7.45439
9 8.61044
9.25 9.28216
9.5 9.39834
9.75 8.92094
10 7.84941
These results can be visualized using several methods. One option is to use Gnuplot. The Gnuplot command to plot the data is:
plot 'StaticODESolver-SineExample.out' with lines
Alternatively, the data can be processed and visualized using the following Python script, which employs Matplotlib for graphing.
1
2
3import sys
4import matplotlib.pyplot as plt
5import numpy as np
6
7
9plt.rcParams["text.usetex"] = True
10
11
13f = open(sys.argv[1], "r")
14x_lst = []
15y_lst = []
16for line in f:
17 line = line.strip()
18 a = line.split()
19 x_lst.append(float(a[0]))
20 y_lst.append(float(a[1]))
21
22
24x = np.array(x_lst)
25y = np.array(y_lst)
26
27
29fig, ax = plt.subplots()
30ax.set_xlim([0, 10])
31ax.set_ylim([-10, 10])
32ax.plot(x, y, linewidth=2.0)
33ax.set_xlabel("$t$")
34ax.set_ylabel("$u(t)$")
35plt.savefig(sys.argv[2])
36plt.close(fig)
The graph depicting the solution of the scalar ODE problem is illustrated below:

Lorenz system
In this example, we demonstrate the application of the static ODE solver in solving a system of ODEs, specifically the Lorenz system. The Lorenz system is a set of three coupled, nonlinear differential equations defined as follows:
\[ \frac{\mathrm{d}x}{\mathrm{d}t} = \sigma( x - y), \text{ on } (0,T), \]
\[ \frac{\mathrm{d}y}{\mathrm{d}t} = x(\rho - z ) - y, \text{ on } (0,T), \]
\[ \frac{\mathrm{d}z}{\mathrm{d}t} = xy - \beta z, \text{ on } (0,T), \]
with the initial condition
\[ \vec u(0) = (x(0),y(0),z(0)) = \vec u_{ini}. \]
Here, \( \sigma, \rho \) and \( \beta \) are given constants. The solution of the system, \( \vec u(t) = (x(t), y(t), z(t)) \in R^3 \) is represented by three-dimensional static vector (TNL::Containers::StaticVector).
The implementation of the solver for the Lorenz system is outlined below:
1#include <iostream>
2#include <TNL/Containers/StaticVector.h>
3#include <TNL/Solvers/ODE/ODESolver.h>
4#include <TNL/Solvers/ODE/Methods/Euler.h>
5
7using Real = double;
9
11int
12main( int argc, char* argv[] )
13{
19
21 const Real final_t = 25.0;
22 const Real tau = 0.001;
23 const Real output_time_step = 0.01;
25
27 const Real sigma = 10.0;
28 const Real rho = 28.0;
29 const Real beta = 8.0 / 3.0;
31
33 ODESolver solver;
34 solver.setTau( tau );
35 solver.setTime( 0.0 );
37
39 Vector u( 1.0, 2.0, 3.0 );
41
43 while( solver.getTime() < final_t ) {
44 solver.setStopTime(
TNL::min( solver.getTime() + output_time_step, final_t ) );
46 auto f = [ = ]( const Real& t, const Real& tau, const Vector& u, Vector& fu )
47 {
48 const Real& x = u[ 0 ];
49 const Real& y = u[ 1 ];
50 const Real& z = u[ 2 ];
51 fu[ 0 ] = sigma * ( y - x );
52 fu[ 1 ] = rho * x - y - x * z;
53 fu[ 2 ] = -beta * z + x * y;
54 };
56 solver.solve( u, f );
58 }
60}
This code shares similarities with the previous example, with the following key differences:
- We define the type Vector of the variable u representing the solution \( \vec u(t) \) as TNL::Containers::StaticVector< 3, Real >, i.e. static vector with size of three.
- Alongside the solver parameters (final_t, tau and output_time_step) we define the Lorenz system's parameters (sigma, rho and beta).
27 const Real sigma = 10.0;
28 const Real rho = 28.0;
29 const Real beta = 8.0 / 3.0;
- The initial condition is \( \vec u(0) = (1,2,3) \).
39 Vector u( 1.0, 2.0, 3.0 );
- In the lambda function, which represents the right-hand side of the Lorenz system, auxiliary aliases x, y, and z are defined for readability. The main computation for the right-hand side of the system is implemented subsequently.
46 auto f = [ = ]( const Real& t, const Real& tau, const Vector& u, Vector& fu )
47 {
48 const Real& x = u[ 0 ];
49 const Real& y = u[ 1 ];
50 const Real& z = u[ 2 ];
51 fu[ 0 ] = sigma * ( y - x );
52 fu[ 1 ] = rho * x - y - x * z;
53 fu[ 2 ] = -beta * z + x * y;
54 };
- In the remaining part of the time loop, we simply execute the solver, allowing it to evolve the solution until the next snapshot time. At this point, we also print the current state of the solution to the terminal.
The solver generates a file containing the solution values \( (\sigma(i \tau), \rho( i \tau), \beta( i \tau )) \) for each time step, where \( i = 0, 1, \ldots N \). These values are recorded on separate lines. The content of the output file is structured as follows:
sigma[ 0 ] rho[ 0 ] beta[ 0 ]
sigma[ 1 ] rho[ 1 ] beta[ 1 ]
sigma[ 2 ] rho[ 2 ] beta[ 2 ]
...
The output file generated by the solver can be visualized in various ways. One effective method is to use Gnuplot, which allows for interactive 3D visualization. The Gnuplot command for plotting the data in 3D is:
splot 'StaticODESolver-LorenzExample.out' with lines
Alternatively, the data can also be processed using the following Python script:
1
2
3import sys
4import matplotlib.pyplot as plt
5import numpy as np
6
7
9plt.rcParams["text.usetex"] = True
10
11
13f = open(sys.argv[1], "r")
14x_lst = []
15y_lst = []
16z_lst = []
17for line in f:
18 line = line.strip()
19 a = line.split()
20 x_lst.append(float(a[0]))
21 y_lst.append(float(a[1]))
22 z_lst.append(float(a[2]))
23
24
26x = np.array(x_lst)
27y = np.array(y_lst)
28z = np.array(z_lst)
29
30
32fig, ax = plt.subplots()
33ax.plot(x, y, label="Lorenz attractor")
34ax.legend()
35plt.savefig(sys.argv[2])
36plt.close(fig)
This script is structured similarly to the one in the previous example. It processes the output data and creates a visual representation of the solution.
The resultant visualization of the Lorenz problem is shown below:

Combining static ODE solvers with parallel for
Static solvers can be effectively utilized within lambda functions in conjunction with TNL::Algorithms::parallelFor. This approach is particularly beneficial when there's a need to solve a large number of independent ODE problems, such as in parametric analysis scenarios. We will demonstrate this application using the two examples previously described.
Solving scalar problems in parallel
The first example addresses an ODE defined by the following equation
\[ \frac{\mathrm{d}u}{\mathrm{d}t} = t \sin ( c t ) \text{ on } (0,T), \]
and the initial condition
\[ u( 0 ) = 0, \]
where \( c \) is a constant. We aim to solve this ODE in parallel for a range of values \( c \in \langle c_{min}, c_{max} \rangle \). The exact solution for this equation is available here. The implementation for this parallel computation is detailed in the following code:
1#include <iostream>
2#include <fstream>
3#include <TNL/Solvers/ODE/ODESolver.h>
4#include <TNL/Solvers/ODE/Methods/Euler.h>
5#include <TNL/Containers/Vector.h>
6#include <TNL/Algorithms/parallelFor.h>
7
8using Real = double;
9
10template< typename Device >
11void
12solveParallelODEs( const char* file_name )
13{
20
22 const Real final_t = 10.0;
23 const Real tau = 0.001;
24 const Real output_time_step = 0.1;
26
28 const Real c_min = 1.0;
29 const Real c_max = 5.0;
30 const int c_vals = 5.0;
31 const Real c_step = ( c_max - c_min ) / ( c_vals - 1 );
33
35 const int output_time_steps = ceil( final_t / output_time_step ) + 1;
36 Vector results( output_time_steps * c_vals, 0.0 );
37 auto results_view = results.getView();
39
41 auto f = [ = ]
__cuda_callable__(
const Real& t,
const Real& tau,
const StaticVector& u, StaticVector& fu,
const Real& c )
42 {
43 fu = t * sin( c * t );
44 };
46
49 {
51 const Real c = c_min + idx * c_step;
52 ODESolver solver;
53 solver.setTau( tau );
54 solver.setTime( 0.0 );
55 StaticVector u = 0.0;
56 int time_step( 1 );
57 results_view[ idx ] = u;
59
61 while( time_step < output_time_steps ) {
62 solver.setStopTime(
TNL::min( solver.getTime() + output_time_step, final_t ) );
63 solver.solve( u, f, c );
64 results_view[ time_step++ * c_vals + idx ] = u;
65 }
67 };
70
73 file.
open( file_name, std::ios::out );
74 for( int i = 0; i < c_vals; i++ ) {
75 file <<
"# c = " << c_min + i * c_step <<
std::endl;
76 for( int k = 0; k < output_time_steps; k++ )
77 file << k * output_time_step <<
" " << results.getElement( k * c_vals + i )[ 0 ] <<
std::endl;
79 }
81}
82
83int
84main( int argc, char* argv[] )
85{
86 if( argc != 2 ) {
88 return EXIT_FAILURE;
89 }
91 file_name += "/StaticODESolver-SineParallelExample-result.out";
92 solveParallelODEs< TNL::Devices::Host >( file_name.getString() );
93#ifdef __CUDACC__
94 solveParallelODEs< TNL::Devices::Cuda >( file_name.getString() );
95#endif
96}
#define __cuda_callable__
Definition Macros.h:49
Vector extends Array with algebraic operations.
Definition Vector.h:36
Class for managing strings.
Definition String.h:31
std::enable_if_t< std::is_integral_v< Begin > &&std::is_integral_v< End > > parallelFor(const Begin &begin, const End &end, typename Device::LaunchConfiguration launch_config, Function f, FunctionArgs... args)
Parallel for-loop function for 1D range specified with integral values.
Definition parallelFor.h:41
In this example, we demonstrate how to execute the ODE solver on a GPU. To facilitate this, the main solver logic has been encapsulated within a separate function, solveParallelODEs, which accepts a template parameter Device indicating the target device for execution. The results from individual ODE solutions are stored in memory and eventually written to a file named file_name. The variable \( u \), being scalar, is represented by the type TNL::Containers::StaticVector<1,Real> within the solver.
Next, we define the parameters of the ODE solver (final_t, tau and output_time_step) as shown:
22 const Real final_t = 10.0;
23 const Real tau = 0.001;
24 const Real output_time_step = 0.1;
The interval for the ODE parameter \( c \in \langle c_{min}, c_{max} \rangle \) ( c_min, c_max, ) is established, along with the number of values c_vals distributed equidistantly in the interval, determined by the step size c_step.
28 const Real c_min = 1.0;
29 const Real c_max = 5.0;
30 const int c_vals = 5.0;
31 const Real c_step = ( c_max - c_min ) / ( c_vals - 1 );
We use the range of different c values as the range for the parallelFor loop. This loop processes the lambda function solve. Before diving into the main lambda function, we allocate the vector results for storing the ODE problem results at different time levels. As data cannot be directly written from GPU to an output file, results serves as intermediary storage. Additionally, to enable GPU access, we prepare the vector view results_view.
35 const int output_time_steps =
ceil( final_t / output_time_step ) + 1;
36 Vector results( output_time_steps * c_vals, 0.0 );
37 auto results_view = results.getView();
Proceeding to the main lambda function solve:
49 {
51 const Real c = c_min + idx * c_step;
52 ODESolver solver;
53 solver.setTau( tau );
54 solver.setTime( 0.0 );
55 StaticVector u = 0.0;
56 int time_step( 1 );
57 results_view[ idx ] = u;
59
61 while( time_step < output_time_steps ) {
62 solver.setStopTime(
TNL::min( solver.getTime() + output_time_step, final_t ) );
63 solver.solve( u, f, c );
64 results_view[ time_step++ * c_vals + idx ] = u;
65 }
67 };
This function receives idx, the index of the value of the parameter c. After calculating c, we create the ODE solver solver and set its parameters using setTau and setTime. We also set the initial condition of the ODE and define the variable time_step to count checkpoints, which are stored in memory using results_view.
51 const Real c = c_min + idx * c_step;
52 ODESolver solver;
53 solver.setTau( tau );
54 solver.setTime( 0.0 );
55 StaticVector u = 0.0;
56 int time_step( 1 );
57 results_view[ idx ] = u;
In the time loop, we iterate over the interval \( (0, T) \), setting the solver’s stop time with setStopTime and running the solver with solve.
61 while( time_step < output_time_steps ) {
62 solver.setStopTime(
TNL::min( solver.getTime() + output_time_step, final_t ) );
63 solver.solve( u, f, c );
64 results_view[ time_step++ * c_vals + idx ] = u;
65 }
Each checkpoint's result is then stored in the results_view.
73 file.
open( file_name, std::ios::out );
74 for( int i = 0; i < c_vals; i++ ) {
75 file <<
"# c = " << c_min + i * c_step <<
std::endl;
76 for( int k = 0; k < output_time_steps; k++ )
77 file << k * output_time_step <<
" " << results.getElement( k * c_vals + i )[ 0 ] <<
std::endl;
79 }
It's important to note how the parameter c is passed to the lambda function f. The solve method of ODE solvers accepts user-defined parameters through variadic templates. This means additional parameters like c can be included alongside u and the right-hand side f, and are accessible within the lambda function f.
41 auto f = [ = ]
__cuda_callable__(
const Real& t,
const Real& tau,
const StaticVector& u, StaticVector& fu,
const Real& c )
42 {
43 fu = t *
sin( c * t );
44 };
Due to limitations of the nvcc compiler, which does not accept lambda functions defined within another lambda function, the f lambda function cannot be defined inside the solve lambda function. Therefore, c, defined in solve, cannot be captured by f.
The solver outputs a file in the following format:
# c = c[ 0 ]
x[ 0 ] u( c[ 0 ], x[ 0 ] )
x[ 1 ] u( c[ 0 ], x[ 1 ] )
....
# c = c[ 1 ]
x[ 0 ] u( c[ 1 ], x[ 0 ] )
x[ 1 ] u( c[ 1 ], x[ 1 ] )
...
The file can be visuallized using Gnuplot as follows
splot 'StaticODESolver-SineParallelExample-result.out' with lines
or it can be processed by the following Python script:
1
2
3import sys
4import matplotlib.pyplot as plt
5import numpy as np
6
7
9plt.rcParams["text.usetex"] = True
10
11
13f = open(sys.argv[1], "r")
14current_c = 0.0
15x_lst = []
16u_lst = []
17x_data = []
18u_data = []
19parameters = []
20
21for line in f:
22 line = line.strip()
23 a = line.split()
24 if not a:
25 continue
26 if a[0] == "#":
27 if x_lst:
28 parameters.append(current_c)
29 u_data.append(np.array(u_lst))
30 if not x_data:
31 x_data.append(np.array(x_lst))
32 u_lst.clear()
33
34 current_c = float(a[3])
35 else:
36 if not x_data:
37 x_lst.append(float(a[0]))
38 u_lst.append(float(a[1]))
39parameters.append(current_c)
40u_data.append(np.array(u_lst))
41
42
44n = len(parameters)
45fig, ax = plt.subplots(1, n, figsize=(15, 3), sharey=True)
46idx = 0
47for u in u_data:
48 ax[idx].plot(x_data[0], u, linewidth=2.0, label=f"$c={parameters[idx]}$")
49 ax[idx].set_xlabel("t")
50 ax[idx].set_ylabel("u(t)")
51 idx = idx + 1
52plt.savefig(sys.argv[2])
53plt.close(fig)
The result of this example looks as follows:

Solving the Lorenz system in parallel
The second example is a parametric analysis of the Lorenz model
\[ \frac{\mathrm{d}x}{\mathrm{d}t} = \sigma( x - y), \text{ on } (0,T) \]
\[ \frac{\mathrm{d}y}{\mathrm{d}t} = x(\rho - z ) - y, \text{ on } (0,T) \]
\[ \frac{\mathrm{d}z}{\mathrm{d}t} = xy - \beta z, \text{ on } (0,T) \]
with the initial condition
\[ \vec u(0) = (x(0),y(0),z(0)) = \vec u_{ini}. \]
We solve it for different values of the model parameters:
\[ \sigma_i = \sigma_{min} + i \Delta \sigma, \]
\[ \rho_j = \rho_{min} + j \Delta \rho, \]
\[ \beta_k = \beta_{min} + k \Delta \beta, \]
where we set \(( \Delta \sigma = \Delta \rho = \Delta \beta = l / (p-1) \)) and \(( i,j,k = 0, 1, \ldots, p - 1 \)). The code of the solver looks as follows:
1#include <iostream>
2#include <fstream>
3#include <TNL/Containers/Vector.h>
4#include <TNL/Algorithms/parallelFor.h>
5#include <TNL/Containers/StaticVector.h>
6#include <TNL/Solvers/ODE/ODESolver.h>
7#include <TNL/Solvers/ODE/Methods/Euler.h>
8
11using Real = double;
13
14template< typename Device >
15void
16solveParallelODEs( const char* file_name )
17{
23
25 const Real final_t = 50.0;
26 const Real tau = 0.001;
27 const Real output_time_step = 0.005;
29
31 const Real sigma_min( 10.0 ), rho_min( 15.0 ), beta_min( 1.0 );
32 const int parametric_steps = 4;
33 const Real sigma_step = 30.0 / ( parametric_steps - 1 );
34 const Real rho_step = 21.0 / ( parametric_steps - 1 );
35 const Real beta_step = 15.0 / ( parametric_steps - 1 );
36 const int output_time_steps = ceil( final_t / output_time_step ) + 1;
38
40 const int results_size( output_time_steps * parametric_steps * parametric_steps * parametric_steps );
42 auto results_view = results.getView();
44
47 const Real& tau,
48 const StaticVector& u,
49 StaticVector& fu,
50 const Real& sigma_i,
51 const Real& rho_j,
52 const Real& beta_k )
53 {
54 const Real& x = u[ 0 ];
55 const Real& y = u[ 1 ];
56 const Real& z = u[ 2 ];
57 fu[ 0 ] = sigma_i * ( y - x );
58 fu[ 1 ] = rho_j * x - y - x * z;
59 fu[ 2 ] = -beta_k * z + x * y;
60 };
62
65 {
67 const Real sigma_i = sigma_min + idx[ 0 ] * sigma_step;
68 const Real rho_j = rho_min + idx[ 1 ] * rho_step;
69 const Real beta_k = beta_min + idx[ 2 ] * beta_step;
71
73 ODESolver solver;
74 solver.setTau( tau );
75 solver.setTime( 0.0 );
76 StaticVector u( 1.0, 1.0, 1.0 );
77 int time_step( 1 );
78 results_view[ ( idx[ 0 ] * parametric_steps + idx[ 1 ] ) * parametric_steps + idx[ 2 ] ] = u;
80
82 while( time_step < output_time_steps ) {
83 solver.setStopTime(
TNL::min( solver.getTime() + output_time_step, final_t ) );
85 solver.solve( u, f, sigma_i, rho_j, beta_k );
87 const int k =
88 ( ( time_step++ * parametric_steps + idx[ 0 ] ) * parametric_steps + idx[ 1 ] ) * parametric_steps + idx[ 2 ];
89 results_view[ k ] = u;
90 }
92 };
94
96 const MultiIndex
begin = { 0, 0, 0 };
97 const MultiIndex
end = { parametric_steps, parametric_steps, parametric_steps };
100
103 file.
open( file_name, std::ios::out );
104 for( int sigma_idx = 0; sigma_idx < parametric_steps; sigma_idx++ )
105 for( int rho_idx = 0; rho_idx < parametric_steps; rho_idx++ )
106 for( int beta_idx = 0; beta_idx < parametric_steps; beta_idx++ ) {
107 Real sigma = sigma_min + sigma_idx * sigma_step;
108 Real rho = rho_min + rho_idx * rho_step;
109 Real beta = beta_min + beta_idx * beta_step;
110 file <<
"# sigma " << sigma <<
" rho " << rho <<
" beta " << beta <<
std::endl;
111 for( int i = 0; i < output_time_steps - 1; i++ ) {
112 int offset = ( ( i * parametric_steps + sigma_idx ) * parametric_steps + rho_idx ) * parametric_steps + beta_idx;
113 auto u = results.getElement( offset );
114 file << u[ 0 ] <<
" " << u[ 1 ] <<
" " << u[ 2 ] <<
std::endl;
115 }
117 }
119}
120
121int
122main( int argc, char* argv[] )
123{
124 if( argc != 2 ) {
126 return EXIT_FAILURE;
127 }
129 file_name += "/StaticODESolver-LorenzParallelExample-result.out";
130
131 solveParallelODEs< TNL::Devices::Host >( file_name.getString() );
132#ifdef __CUDACC__
133 solveParallelODEs< TNL::Devices::Cuda >( file_name.getString() );
134#endif
135}
Array with constant size.
Definition StaticArray.h:21
It is very similar to the previous one. There are just the following changes:
Since we are analysing dependence on three parameters, we involve a three-dimensional parallelFor. For this, we introduce a type for a three-dimensional multiindex.
We define minimal values for the parameters \( \sigma \in [10,40], \beta \in [15, 36] \) and \( \rho \in [1,16]\) and set the number of different values for each parameter by parametric_steps. The size of equidistant steps (sigma_steps, rho_steps and beta_steps) in the parameter variations is also set.
31 const Real sigma_min( 10.0 ), rho_min( 15.0 ), beta_min( 1.0 );
32 const int parametric_steps = 4;
33 const Real sigma_step = 30.0 / ( parametric_steps - 1 );
34 const Real rho_step = 21.0 / ( parametric_steps - 1 );
35 const Real beta_step = 15.0 / ( parametric_steps - 1 );
36 const int output_time_steps =
ceil( final_t / output_time_step ) + 1;
Next, we allocate vector results for storing the solution of the Lorenz problem for various parameters.
40 const int results_size( output_time_steps * parametric_steps * parametric_steps * parametric_steps );
42 auto results_view = results.getView();
We define the lambda function f for the right-hand side of the Lorenz problem and the lambda function solve for the ODE solver, with a specific setup of the parameters.
47 const Real& tau,
48 const StaticVector& u,
49 StaticVector& fu,
50 const Real& sigma_i,
51 const Real& rho_j,
52 const Real& beta_k )
53 {
54 const Real& x = u[ 0 ];
55 const Real& y = u[ 1 ];
56 const Real& z = u[ 2 ];
57 fu[ 0 ] = sigma_i * ( y - x );
58 fu[ 1 ] = rho_j * x - y - x * z;
59 fu[ 2 ] = -beta_k * z + x * y;
60 };
and the lambda function solve representing the ODE solver for the Lorenz problem
65 {
67 const Real sigma_i = sigma_min + idx[ 0 ] * sigma_step;
68 const Real rho_j = rho_min + idx[ 1 ] * rho_step;
69 const Real beta_k = beta_min + idx[ 2 ] * beta_step;
71
73 ODESolver solver;
74 solver.setTau( tau );
75 solver.setTime( 0.0 );
76 StaticVector u( 1.0, 1.0, 1.0 );
77 int time_step( 1 );
78 results_view[ ( idx[ 0 ] * parametric_steps + idx[ 1 ] ) * parametric_steps + idx[ 2 ] ] = u;
80
82 while( time_step < output_time_steps ) {
83 solver.setStopTime(
TNL::min( solver.getTime() + output_time_step, final_t ) );
85 solver.solve( u, f, sigma_i, rho_j, beta_k );
87 const int k =
88 ( ( time_step++ * parametric_steps + idx[ 0 ] ) * parametric_steps + idx[ 1 ] ) * parametric_steps + idx[ 2 ];
89 results_view[ k ] = u;
90 }
92 };
with setup of the parameters
73 ODESolver solver;
74 solver.setTau( tau );
75 solver.setTime( 0.0 );
76 StaticVector u( 1.0, 1.0, 1.0 );
77 int time_step( 1 );
78 results_view[ ( idx[ 0 ] * parametric_steps + idx[ 1 ] ) * parametric_steps + idx[ 2 ] ] = u;
The solve lambda function is executed using a three-dimensional parallelFor (TNL::Algorithms::parallelFor). We define multi-indexes begin and end for this purpose:
96 const MultiIndex
begin = { 0, 0, 0 };
97 const MultiIndex
end = { parametric_steps, parametric_steps, parametric_steps };
- The lambda function solve takes a multi-index idx, which is used to compute specific values for \( \sigma_i, \rho_j, \beta_k \), denoted as sigma_i, rho_j, and beta_k:
67 const Real sigma_i = sigma_min + idx[ 0 ] * sigma_step;
68 const Real rho_j = rho_min + idx[ 1 ] * rho_step;
69 const Real beta_k = beta_min + idx[ 2 ] * beta_step;
These parameters must be explicitly passed to the lambda function f. This necessity arises due to the nvcc compiler's limitation of not accepting a lambda function defined within another lambda function, as mentioned before: 85 solver.solve( u, f, sigma_i, rho_j, beta_k );
The initial condition for the Lorenz problem is set to vector \( (1,1,1) \):
73 ODESolver solver;
74 solver.setTau( tau );
75 solver.setTime( 0.0 );
76 StaticVector u( 1.0, 1.0, 1.0 );
77 int time_step( 1 );
78 results_view[ ( idx[ 0 ] * parametric_steps + idx[ 1 ] ) * parametric_steps + idx[ 2 ] ] = u;
Subsequently, we initiate the time loop. Within this loop, we store the state of the solution in the vector view results_view at intervals defined by output_time_step:
82 while( time_step < output_time_steps ) {
83 solver.setStopTime(
TNL::min( solver.getTime() + output_time_step, final_t ) );
85 solver.solve( u, f, sigma_i, rho_j, beta_k );
87 const int k =
88 ( ( time_step++ * parametric_steps + idx[ 0 ] ) * parametric_steps + idx[ 1 ] ) * parametric_steps + idx[ 2 ];
89 results_view[ k ] = u;
90 }
Upon solving all ODEs, we transfer all solutions from the vector results to an output file:
103 file.
open( file_name, std::ios::out );
104 for( int sigma_idx = 0; sigma_idx < parametric_steps; sigma_idx++ )
105 for( int rho_idx = 0; rho_idx < parametric_steps; rho_idx++ )
106 for( int beta_idx = 0; beta_idx < parametric_steps; beta_idx++ ) {
107 Real sigma = sigma_min + sigma_idx * sigma_step;
108 Real rho = rho_min + rho_idx * rho_step;
109 Real beta = beta_min + beta_idx * beta_step;
110 file <<
"# sigma " << sigma <<
" rho " << rho <<
" beta " << beta <<
std::endl;
111 for( int i = 0; i < output_time_steps - 1; i++ ) {
112 int offset = ( ( i * parametric_steps + sigma_idx ) * parametric_steps + rho_idx ) * parametric_steps + beta_idx;
113 auto u = results.getElement( offset );
114 file << u[ 0 ] <<
" " << u[ 1 ] <<
" " << u[ 2 ] <<
std::endl;
115 }
117 }
The output file has the following format:
# sigma = c[ 0 ] rho = rho[ 0 ] beta = beta[ 0 ]
x[ 0 ] u( sigma[ 0 ], rho[ 0 ], beta[ 0 ], x[ 0 ] )
x[ 1 ] u( sigma[ 0 ], rho[ 0 ], beta[ 0 ], x[ 1 ] )
....
# sigma = c[ 1 ] rho = rho[ 1 ] beta = beta[ 1 ]
x[ 0 ] u( sigma[ 1 ], rho[ 1 ], beta[ 1 ], x[ 0 ] )
x[ 1 ] u( sigma[ 1 ], rho[ 1 ], beta[ 1 ], x[ 1 ] )
...
It can be processed by the following Python script:
1
2
3import sys
4import matplotlib.pyplot as plt
5import numpy as np
6from mpl_toolkits.mplot3d import axes3d, Axes3D
7
8plt.rcParams["text.usetex"] = True
9
10
12f = open(sys.argv[1], "r")
13current_sigma = 0.0
14current_rho = 0.0
15current_beta = 0.0
16sigma_lst = []
17rho_lst = []
18beta_lst = []
19x_lst = []
20y_lst = []
21z_lst = []
22x_data = []
23y_data = []
24z_data = []
25parameters = []
26data = {}
27size = 0
28for line in f:
29 line = line.strip()
30 a = line.split()
31 if not a:
32 continue
33 if a[0] == "#":
34 if x_lst:
35 parameters_tuple = (current_sigma, current_rho, current_beta)
36 parameters.append(parameters_tuple)
37 data_tuple = (np.array(x_lst), np.array(y_lst), np.array(z_lst))
38 data[parameters_tuple] = data_tuple
39 x_lst.clear()
40 y_lst.clear()
41 z_lst.clear()
42
43 current_sigma = float(a[2])
44 current_rho = float(a[4])
45 current_beta = float(a[6])
46 if current_sigma not in sigma_lst:
47 sigma_lst.append(current_sigma)
48 if current_rho not in rho_lst:
49 rho_lst.append(current_rho)
50 if current_beta not in beta_lst:
51 beta_lst.append(current_beta)
52 else:
53 x_lst.append(float(a[0]))
54 y_lst.append(float(a[1]))
55 z_lst.append(float(a[2]))
56parameters_tuple = (current_sigma, current_rho, current_beta)
57parameters.append(parameters_tuple)
58data_tuple = (np.array(x_lst), np.array(y_lst), np.array(z_lst))
59data[parameters_tuple] = data_tuple
60
61
63sigma_n = len(sigma_lst)
64sigma_idx = 1
65for sigma in sigma_lst:
66 rho_n = len(rho_lst)
67 beta_n = len(beta_lst)
68
69 fig, ax = plt.subplots(rho_n, beta_n, figsize=(8, 8), sharey=True, sharex=True)
70 fig.suptitle(rf"$\sigma={sigma}$")
71
72 rho_idx = 0
73 beta_idx = 0
74 for rho in rho_lst:
75 for beta in beta_lst:
76 parameters_tuple = (sigma, rho, beta)
77 data_tuple = data[parameters_tuple]
78 ax[rho_idx, beta_idx].plot(data_tuple[1], data_tuple[2], linewidth=1.0)
79 if beta_idx == 0:
80 ax[rho_idx, beta_idx].set_ylabel(rf"$\rho={rho}$")
81 if rho_idx == rho_n - 1:
82 ax[rho_idx, beta_idx].set_xlabel(rf"$\beta={beta}$")
83 beta_idx = beta_idx + 1
84 beta_idx = 0
85 rho_idx = rho_idx + 1
86
87 plt.savefig(f"{sys.argv[2]}-{sigma_idx}.png")
88 sigma_idx = sigma_idx + 1
89 plt.close(fig)
The results are visualized in the following images:


Dynamic ODE Solvers
In this section, we demonstrate how to solve the simple 1D heat equation, a parabolic partial differential equation expressed as:
\[\frac{\partial u(t,x)}{\partial t} - \frac{\partial^2 u(t,x)}{\partial^2 x} = 0 \text{ on } (0,T) \times (0,1),
\]
with boundary conditions
\[u(t,0) = 0,
\]
\[u(t,0) = 1,
\]
and initial condition
\[u(0,x) = u_{ini}(x) \text{ on } [0,1].
\]
We discretize the equation by the finite difference method for numerical approximation. First, we define set of nodes \( x_i = ih \) for \(i=0,\ldots n-1 \) where \(h = 1 / (n-1) \) (adopting C++ indexing for consistency). Employing the method of lines and approximating the second derivative by the central finite difference
\[\frac{\partial^2 u(t,x)}{\partial^2 x} \approx \frac{u_{i-1} - 2 u_i + u_{i+1}}{h^2},
\]
we derive system of ODEs:
\[\frac{\mathrm{d} u_i(t)}{\mathrm{d}t} = \frac{u_{i-1} - 2 u_i + u_{i+1}}{h^2} \text{ for } i = 1, \ldots, n-2,
\]
where \( u_i(t) = u(t,ih) \) represents the function value at node \( i \) and \( h \) is the spatial step between two adjacent nodes of the numerical mesh. The boundary conditions are set as:
\[u_0(t) = u_{n-1}(t) = 0 \text{ on } [0,T].
\]
What are the main differences compared to the Lorenz model?
- System Size and Vector Representation:
- The Lorenz model has a fixed size of three, representing the parameters \( (x, y, z) \) in \( R^3 \). This small, fixed-size vector can be efficiently represented using a static vector, specifically TNL::Containers::StaticVector< 3, Real >.
- In contrast, the size of the ODE system for the heat equation, as determined by the method of lines, varies based on the desired accuracy. The greater the value of \( n \), the more accurate the numerical approximation. The number of nodes \( n \) for spatial discretization defines the number of parameters or degrees of freedom, DOFs. Consequently, the system size can be large, necessitating the use of a dynamic vector, TNL::Containers::Vector, for solutions.
- Evaluation Approach and Parallelization:
- Due to its small size, the Lorenz model's right-hand side can be evaluated sequentially by a single thread.
- However, the ODE system for the heat equation can be very large, requiring parallel computation to efficiently evaluate its right-hand side.
- Data Allocation and Execution Environment:
- The dynamic vector TNL::Containers::Vector allocates data dynamically, which precludes its creation within a GPU kernel. As a result, ODE solvers cannot be instantiated within a GPU kernel.
- Therefore, the lambda function f, which evaluates the right-hand side of the ODE system, is executed on the host. It utilizes TNL::Algorithms::parallelFor to facilitate the parallel evaluation of the system's right-hand side.
Basic setup
The implementation of the solver for the heat equation is detailed in the following way:
1#include <iostream>
2#include <TNL/Containers/Vector.h>
3#include <TNL/Solvers/ODE/ODESolver.h>
4#include <TNL/Solvers/ODE/Methods/Euler.h>
5#include "write.h"
6
7using Real = double;
8using Index = int;
9
10template< typename Device >
11void
12solveHeatEquation( const char* file_name )
13{
16 using VectorView = typename Vector::ViewType;
20
21
22
23
25 const Real final_t = 0.05;
26 const Real output_time_step = 0.005;
27 const Index n = 41;
28 const Real h = 1.0 / ( n - 1 );
29 const Real tau = 0.1 * h * h;
30 const Real h_sqr_inv = 1.0 / ( h * h );
32
33
34
35
37 Vector u( n );
38 u.forAllElements(
40 {
41 const Real x = i * h;
42 if( x >= 0.4 && x <= 0.6 )
43 value = 1.0;
44 else
45 value = 0.0;
46 } );
48 file.
open( file_name, std::ios::out );
49 write( file, u, n, h, (Real) 0.0 );
51
52
53
54
56 ODESolver solver;
57 solver.setTau( tau );
58 solver.setTime( 0.0 );
60
61
62
63
65 while( solver.getTime() < final_t ) {
66 solver.setStopTime(
TNL::min( solver.getTime() + output_time_step, final_t ) );
68 auto f = [ = ]
__cuda_callable__( Index i,
const VectorView& u, VectorView& fu )
mutable
69 {
70 if( i == 0 || i == n - 1 )
71 fu[ i ] = 0.0;
72 else
73 fu[ i ] = h_sqr_inv * ( u[ i - 1 ] - 2.0 * u[ i ] + u[ i + 1 ] );
74 };
77 auto time_stepping = [ = ]( const Real& t, const Real& tau, const VectorView& u, VectorView& fu )
78 {
82 };
84 solver.solve( u, time_stepping );
85 write( file, u, n, h, solver.getTime() );
86 }
88}
89
90int
91main( int argc, char* argv[] )
92{
93 if( argc != 2 ) {
95 return EXIT_FAILURE;
96 }
98 file_name += "/ODESolver-HeatEquationExample-result.out";
99
100 solveHeatEquation< TNL::Devices::Host >( file_name.getString() );
101#ifdef __CUDACC__
102 solveHeatEquation< TNL::Devices::Cuda >( file_name.getString() );
103#endif
104}
The solver is encapsulated within the function solveHeatEquation, which includes a template parameter Device. This parameter specifies the device (e.g., CPU or GPU) on which the solver will execute. The implementation begins with defining necessary types:
16 using VectorView = typename Vector::ViewType;
- Vector is alias for TNL::Containers::Vector. The choice of a dynamic vector (rather than a static vector like TNL::Containers::StaticVector) is due to the potentially large number of degrees of freedom (DOFs) that need to be stored in a resizable and dynamically allocated vector.
- VectorView is designated for creating vector view, which is essential for accessing data within lambda functions, especially when these functions are executed on the GPU.
- Method defines the numerical method for time integration.
- ODESolver is the type of the ODE solver that will be used for calculating the time evolution in the time-dependent heat equation.
After defining these types, the next step in the implementation is to establish the parameters of the discretization:
25 const Real final_t = 0.05;
26 const Real output_time_step = 0.005;
27 const Index n = 41;
28 const Real h = 1.0 / ( n - 1 );
29 const Real tau = 0.1 * h * h;
30 const Real h_sqr_inv = 1.0 / ( h * h );
- final_t represents the length of the time interval \( [0,T] \).
- output_time_step defines the time intervals at which the solution \( u \) will be written into a file.
- n stands for the number of DOFs, i.e. number of nodes used for the finite difference method.
- h is the space step, meaning the distance between two consecutive nodes.
- tau represents the time step used for integration by the ODE solver. For a second order parabolic problem like this, the time step should be proportional to \( h^2 \).
- h_sqr_inv is an auxiliary constant equal to \( 1/h^2 \). It is used later in the finite difference method for approximating the second derivative.
The initial condition \( u_{ini} \) is set as:
\[u_{ini}(x) = \left\{
\begin{array}{rl}
0 & \text{ for } x < 0.4, \\
1 & \text{ for } 0.4 \leq x \leq 0.6, \\
0 & \text{ for } x > 0. \\
\end{array}
\right.
\]
This initial condition defines the state of the system at the beginning of the simulation. It specifies a region (between 0.4 and 0.6) where the temperature (or the value of \( u \)) is set to 1, and outside this region, the temperature is 0.
37 Vector u( n );
38 u.forAllElements(
40 {
41 const Real x = i * h;
42 if( x >= 0.4 && x <= 0.6 )
43 value = 1.0;
44 else
45 value = 0.0;
46 } );
48 file.
open( file_name, std::ios::out );
49 write( file, u, n, h, (Real) 0.0 );
After setting the initial condition, the next step is to write it to a file. This is done using the write function, which is detailed later.
Next, we create an instance of the ODE solver solver and set the integration time step tau of the solver (TNL::Solvers::ODE::ExplicitSolver::setTau ) The initial time is set to zero with (TNL::Solvers::ODE::ExplicitSolver::setTime).
56 ODESolver solver;
57 solver.setTau( tau );
58 solver.setTime( 0.0 );
Finally, we proceed to the time loop:
65 while( solver.getTime() < final_t ) {
66 solver.setStopTime(
TNL::min( solver.getTime() + output_time_step, final_t ) );
68 auto f = [ = ]
__cuda_callable__( Index i,
const VectorView& u, VectorView& fu )
mutable
69 {
70 if( i == 0 || i == n - 1 )
71 fu[ i ] = 0.0;
72 else
73 fu[ i ] = h_sqr_inv * ( u[ i - 1 ] - 2.0 * u[ i ] + u[ i + 1 ] );
74 };
77 auto time_stepping = [ = ]( const Real& t, const Real& tau, const VectorView& u, VectorView& fu )
78 {
82 };
84 solver.solve( u, time_stepping );
85 write( file, u, n, h, solver.getTime() );
86 }
The time loop is executed, utilizing the time variable from the ODE solver (TNL::Solvers::ODE::ExplicitSolver::getTime). The loop iterates until reaching the end of the specified time interval \( [0, T] \), defined by the variable final_t. The stop time of the ODE solver is set with TNL::Solvers::ODE::ExplicitSolver::setStopTime, either to the next checkpoint for storing the state of the system or to the end of the time interval, depending on which comes first.
The lambda function f is defined to express the discretization of the second derivative of \( u \) using the central finite difference and to incorporate the boundary conditions.
68 auto f = [ = ]
__cuda_callable__( Index i,
const VectorView& u, VectorView& fu )
mutable
69 {
70 if( i == 0 || i == n - 1 )
71 fu[ i ] = 0.0;
72 else
73 fu[ i ] = h_sqr_inv * ( u[ i - 1 ] - 2.0 * u[ i ] + u[ i + 1 ] );
74 };
The function receives the following parameters:
- i represents the index of the node, corresponding to the individual ODE derived from the method of lines. It is essential for evaluating the update of \( u_i^k \) to reach the next time level \( u_i^{k+1} \).
- u is a vector view representing the state \( u_i^k \) of the heat equation on the \( k-\)th time level.
- fu is a vector that stores updates or time derivatives in the method of lines which will transition \( u \) to the next time level.
In this implementation, the lambda function f plays a crucial role in applying the finite difference method to approximate the second derivative and in enforcing the boundary conditions, thereby driving the evolution of the system in time according to the heat equation.
In the lambda function f, the solution \( u \) remains unchanged on the boundary nodes. Therefore, the function returns zero for these boundary nodes. For the interior nodes, f evaluates the central difference to approximate the second derivative.
The lambda function time_stepping is responsible for computing updates for all nodes \( i = 0, \ldots, n-1 \). This computation is performed using TNL::Algorithms::parallelFor, which iterates over all the nodes and calls the function f for each one. Since the nvcc compiler does not support lambda functions defined within another lambda function, f is defined separately, and the parameters u and fu are explicitly passed to it.
The ODE solver is executed with solve. The current state of the heat equation u and the lambda function f (controlling the time evolution) are passed to the solve method. After each iteration, the current state is saved to a file using the write function.
1#include <fstream>
2
3template< typename Vector, typename Real = typename Vector::RealType, typename Index = typename Vector::IndexType >
4void
5write(
std::fstream& file,
const Vector& u,
const Index n,
const Real& h,
const Real& time )
6{
8 for( Index i = 0; i < n; i++ )
9 file << i * h <<
" " << u.getElement( i ) <<
std::endl;
11}
12
13template< typename Vector, typename Real = typename Vector::RealType >
14void
15write(
std::fstream& file,
const Vector& u,
const Real& h,
const Real& time )
16{
18 const auto localRange = u.getLocalRange();
19 for( auto i = localRange.getBegin(); i < localRange.getEnd(); i++ )
20 file << i * h <<
" " << u.getElement( i ) <<
std::endl;
22}
The write function is used to write the solution of the heat equation to a file. The specifics of this function, including its parameters and functionality, are detailed in the following:
- file specifies the file into which the solution will be stored.
- u represents the solution of the heat equation at a given time. It can be a vector or vector view.
- n indicates the number of nodes used for approximating the solution.
- h refers to the space step, i.e., the distance between two consecutive nodes.
- time represents the current time of the evolution being computed.
The solver writes the results in a structured format, making it convenient for visualization and analysis:
# time = t[ 0 ]
x[ 0 ] u( t[ 0 ], x[ 0 ] )
x[ 1 ] u( t[ 0 ], x[ 1 ] )
x[ 2 ] u( t[ 0 ], x[ 2 ] )
...
# time = t[ 1 ]
x[ 0 ] u( t[ 1 ], x[ 0 ] )
x[ 1 ] u( t[ 1 ], x[ 1 ] )
x[ 2 ] u( t[ 1 ], x[ 2 ] )
...
This format records the solution of the heat equation at various time steps and spatial nodes, providing a comprehensive view of the system's evolution over time.
The solution can be visualised with Gnuplot using the command:
plot 'ODESolver-HeatEquationExample-result.out' with lines
This command plots the solution as a line graph, offering a visual representation of how the heat equation's solution evolves. The solution can also be parsed and processed in Python, using Matplotlib for visualization. The specifics of this process are detailed in the following script:
1
2
3import sys
4import matplotlib.pyplot as plt
5import numpy as np
6from mpl_toolkits.mplot3d import axes3d, Axes3D
7
8plt.rcParams["text.usetex"] = True
9
10f = open(sys.argv[1], "r")
11current_time = 0.0
12time_lst = []
13x_lst = []
14u_lst = []
15x_data = []
16u_data = []
17size = 0
18for line in f:
19 line = line.strip()
20 a = line.split()
21 if not a:
22 continue
23 if a[0] == "#":
24 if u_lst:
25 time_lst.append(current_time)
26 u_data.append(np.array(u_lst))
27 if not x_data:
28 x_data.append(np.array(x_lst))
29 x_lst.clear()
30 u_lst.clear()
31
32 current_time = float(a[3])
33 time_lst.append(current_time)
34 else:
35 x_lst.append(float(a[0]))
36 u_lst.append(float(a[1]))
37if u_lst:
38 time_lst.append(current_time)
39 u_data.append(np.array(u_lst))
40 if not x_data:
41 x_data.append(np.array(x_lst))
42 x_lst.clear()
43 u_lst.clear()
44
45
46fig, ax = plt.subplots(1, 1, figsize=(4, 4))
47for u in u_data:
48 ax.plot(x_data[0], u, linewidth=2.0)
49
50ax.set_ylabel(f"$ u(x) $")
51ax.set_xlabel(f"$x $")
52plt.savefig(sys.argv[2])
53plt.close(fig)
The outcome of the solver, once visualized, is shown as follows:

Setup with a solver monitor
In this section, we'll discuss how to integrate an ODE solver with a solver monitor, as demonstrated in the example:
1#include <iostream>
2#include <TNL/Containers/Vector.h>
3#include <TNL/Solvers/ODE/ODESolver.h>
4#include <TNL/Solvers/ODE/Methods/Euler.h>
5#include <TNL/Solvers/IterativeSolverMonitor.h>
6#include "write.h"
7
8using Real = double;
9using Index = int;
10
11template< typename Device >
12void
13solveHeatEquation( const char* file_name )
14{
16 using VectorView = typename Vector::ViewType;
19
20
21
22
23 const Real final_t = 0.05;
24 const Real output_time_step = 0.005;
25 const Index n = 41;
26 const Real h = 1.0 / ( n - 1 );
27 const Real tau = 0.001 * h * h;
28 const Real h_sqr_inv = 1.0 / ( h * h );
29
30
31
32
33 Vector u( n );
34 u.forAllElements(
36 {
37 const Real x = i * h;
38 if( x >= 0.4 && x <= 0.6 )
39 value = 1.0;
40 else
41 value = 0.0;
42 } );
44 file.
open( file_name, std::ios::out );
45 write( file, u, n, h, (Real) 0.0 );
46
47
48
49
52 IterativeSolverMonitorType monitor;
54 monitor.setRefreshRate( 10 );
55 monitor.setVerbose( 1 );
56 monitor.setStage( "ODE solver stage:" );
58
59
60
61
62 ODESolver solver;
63 solver.setTau( tau );
64 solver.setTime( 0.0 );
65 solver.setSolverMonitor( monitor );
66
67
68
69
70 while( solver.getTime() < final_t ) {
71 solver.setStopTime(
TNL::min( solver.getTime() + output_time_step, final_t ) );
72 auto f = [ = ]
__cuda_callable__( Index i,
const VectorView& u, VectorView& fu )
mutable
73 {
74 if( i == 0 || i == n - 1 )
75 fu[ i ] = 0.0;
76 else
77 fu[ i ] = h_sqr_inv * ( u[ i - 1 ] - 2.0 * u[ i ] + u[ i + 1 ] );
78 };
79 auto time_stepping = [ = ]( const Real& t, const Real& tau, const VectorView& u, VectorView& fu )
80 {
82 };
83 solver.solve( u, time_stepping );
84 write( file, u, n, h, solver.getTime() );
85 }
86 monitor.stopMainLoop();
87}
88
89int
90main( int argc, char* argv[] )
91{
92 if( argc != 2 ) {
94 return EXIT_FAILURE;
95 }
97 file_name += "/ODESolver-HeatEquationExampleWithMonitor-result.out";
98
99 solveHeatEquation< TNL::Devices::Host >( file_name.getString() );
100#ifdef __CUDACC__
101 solveHeatEquation< TNL::Devices::Cuda >( file_name.getString() );
102#endif
103}
Object for monitoring convergence of iterative solvers.
Definition IterativeSolverMonitor.h:36
A RAII wrapper for launching the SolverMonitor's main loop in a separate thread.
Definition SolverMonitor.h:133
This setup incorporates a solver monitor into the ODE solver framework, which differs from the previous example in several key ways:
The first difference is the inclusion of a header file TNL/Solvers/IterativeSolverMonitor.h for the iterative solver monitor. This step is essential for enabling monitoring capabilities within the solver. We have to setup the solver monitor:
52 IterativeSolverMonitorType monitor;
54 monitor.setRefreshRate( 10 );
55 monitor.setVerbose( 1 );
56 monitor.setStage( "ODE solver stage:" );
First, we define the monitor type IterativeSolverMonitorType and we create an instance of the monitor. A separate thread (monitorThread) is created for the monitor. The refresh rate of the monitor is set to 10 milliseconds with setRefreshRate and verbose mode is enabled with setVerbose for detailed monitoring. The solver stage name is specified with setStage. The monitor is connected to the solver using TNL::Solvers::IterativeSolver::setSolverMonitor. Subsequently, the numerical computation is performed and after it finishes, the monitor is stopped by calling TNL::Solvers::IterativeSolverMonitor::stopMainLoop.
Distributed ODE Solvers
Distributed ODE solvers in TNL extend the capabilities of single-node dynamic ODE solvers to handle problems that are too large for single-memory systems by distributing the computational workload across multiple processes or nodes in a cluster.
Key differences from dynamic ODE solvers
The two approaches differ fundamentally in their approach to memory management and parallelization:
Dynamic ODE Solvers:
- Local memory: All data resides in the memory space of a single process
- Limited scale: Constrained by available RAM on the host system
- No data exchange: No inter-process communication required for data exchange
Distributed ODE Solvers:
- Memory Distribution: The solution vector is distributed across multiple processes using MPI (Message Passing Interface)
- Scalability: Can handle problems that exceed the memory capacity of individual nodes
- Communication overhead: Requires explicit synchronization and communication between processes
Distributed ODE Solver Example
The distributed implementation requires several key modifications compared to the standard approach. The following example illustrates these modifications:
1#include <iostream>
2#include <TNL/Containers/BlockPartitioning.h>
3#include <TNL/Containers/DistributedArraySynchronizer.h>
4#include <TNL/Containers/DistributedVector.h>
5#include <TNL/MPI/ScopedInitializer.h>
6#include <TNL/Solvers/ODE/ODESolver.h>
7#include <TNL/Solvers/ODE/Methods/Euler.h>
8#include <string>
9#include "write.h"
10
11using Real = double;
12using Index = int;
13
14template< typename Device >
15void
16solveHeatEquation( const char* file_name )
17{
20 using VectorView = typename Vector::ViewType;
21 using LocalVectorView = typename Vector::LocalViewType;
22 using ConstLocalVectorView = typename Vector::ConstLocalViewType;
23 using LocalRangeType = typename Vector::LocalRangeType;
28
30 const Real final_t = 0.05;
31 const Real output_time_step = 0.005;
32 const Index n = 41;
33 const Real h = 1.0 / ( n - 1 );
34 const Real tau = 0.1 * h * h;
35 const Real h_sqr_inv = 1.0 / ( h * h );
37
41 const Index ghosts = 2;
42
43 Vector u( localRange, ghosts, n, communicator );
46
48 u.forElements( 0,
49 n,
51 {
52 const Real x = i * h;
53 if( x >= 0.4 && x <= 0.6 )
54 value = 1.0;
55 else
56 value = 0.0;
57 } );
59 file.
open( file_name, std::ios::out );
60 write( file, u, h, (Real) 0.0 );
62
64 ODESolver solver;
65 solver.setTau( tau );
66 solver.setTime( 0.0 );
68
70 while( solver.getTime() < final_t ) {
71 solver.setStopTime(
TNL::min( solver.getTime() + output_time_step, final_t ) );
73 auto f = [ = ]
__cuda_callable__( Index gi,
const ConstLocalVectorView& u, LocalVectorView& fu )
74 {
75
76 const Index i = localRange.getLocalIndex( gi );
77
78 if( gi == 0 || gi == n - 1 ) {
79
80 fu[ i ] = 0.0;
81 }
82 else {
83
84 const Index
left = localRange.isLocal( gi - 1 ) ? i - 1 : localRange.getSize();
85 const Index
right = localRange.isLocal( gi + 1 ) ? i + 1 : localRange.getSize() + 1;
86
87
88 fu[ i ] = h_sqr_inv * ( u[
left ] - 2.0 * u[ i ] + u[
right ] );
89 }
90 };
93 auto time_stepping = [ = ]( const Real& t, const Real& tau, const VectorView& u, VectorView& fu ) mutable
94 {
96 const_cast< VectorView& >( u ).startSynchronization();
97 u.waitForSynchronization();
99 localRange.getBegin(), localRange.getEnd(), f, u.getConstLocalViewWithGhosts(), fu.getLocalView() );
101 };
103 solver.solve( u, time_stepping );
104 write( file, u, h, solver.getTime() );
105 }
107}
108
109int
110main( int argc, char* argv[] )
111{
113
114 if( argc != 2 ) {
116 return EXIT_FAILURE;
117 }
119 file_name += "/DistributedODESolver-HeatEquationExample-result-";
121 file_name += ".out";
122
123 solveHeatEquation< TNL::Devices::Host >( file_name.getString() );
124#ifdef __CUDACC__
125 TNL::MPI::Barrier();
126 solveHeatEquation< TNL::Devices::Cuda >( file_name.getString() );
127#endif
128}
Definition DistributedArraySynchronizer.h:18
DistributedVector extends DistributedArray with algebraic operations.
Definition DistributedVector.h:24
An RAII wrapper for custom MPI communicators.
Definition Comm.h:63
Subrange< Index > splitRange(Index rangeBegin, Index rangeEnd, int rank, int num_subintervals)
A helper function which splits a one-dimensional range.
Definition BlockPartitioning.h:27
Definition ScopedInitializer.h:63
The example demonstrates how to set up a distributed ODE solver for solving the heat equation. Below we explain the key differences from the standard approach.
MPI Initialization:
First, the example uses TNL::MPI::ScopedInitializer to initialize MPI correctly at program startup, ensuring all processes are properly synchronized for distributed execution. This is demonstrated in the main function with the line
Distributed Data Structures:
The core difference lies in the vector type: TNL::Containers::DistributedVector replaces the standard TNL::Containers::Vector. This new type manages memory distribution across processes and includes ghost zones for data communicated from neighboring processes. We also define a type alias for the TNL::Containers::DistributedArraySynchronizer that will handle communication between processes during synchronization.
20 using VectorView = typename Vector::ViewType;
21 using LocalVectorView = typename Vector::LocalViewType;
22 using ConstLocalVectorView = typename Vector::ConstLocalViewType;
23 using LocalRangeType = typename Vector::LocalRangeType;
Domain Decomposition:
The global vector is split across processes using TNL::Containers::splitRange, which computes each process's local range. The distributed vector is initialized using the local range, number of ghost elements, global size n, and the MPI communicator. After initialization, an instance of the Synchronizer is set to the vector.
41 const Index ghosts = 2;
42
43 Vector u( localRange, ghosts, n, communicator );
The initial condition and solver setup are the same as in the standard approach.
Time Loop:
The main difference lies in how boundary conditions are handled and ensuring data synchronization between processes for consistency during the time loop.
First, the lambda function f is modified to handle access to boundary elements that may reside on different processes. In case of subdomain interface, the element is retrieved from the neighboring process using the synchronizer and stored locally in the ghost zone that is allocated after the local elements of the array. In the 1D example, the element from the left neighbor is stored at position localRange.getSize() and the element from the right neighbor is stored at position localRange.getSize() + 1.
73 auto f = [ = ]
__cuda_callable__( Index gi,
const ConstLocalVectorView& u, LocalVectorView& fu )
74 {
75
76 const Index i = localRange.getLocalIndex( gi );
77
78 if( gi == 0 || gi == n - 1 ) {
79
80 fu[ i ] = 0.0;
81 }
82 else {
83
84 const Index
left = localRange.isLocal( gi - 1 ) ? i - 1 : localRange.getSize();
85 const Index
right = localRange.isLocal( gi + 1 ) ? i + 1 : localRange.getSize() + 1;
86
87
88 fu[ i ] = h_sqr_inv * ( u[
left ] - 2.0 * u[ i ] + u[
right ] );
89 }
90 };
Secondly, explicit synchronization is added to ensure data consistency across subdomain boundaries during the computation.
93 auto time_stepping = [ = ]( const Real& t, const Real& tau, const VectorView& u, VectorView& fu ) mutable
94 {
96 const_cast< VectorView& >( u ).startSynchronization();
97 u.waitForSynchronization();
99 localRange.getBegin(), localRange.getEnd(), f, u.getConstLocalViewWithGhosts(), fu.getLocalView() );
101 };
2D/3D Considerations:
While this 1D example handles boundary communication through ghost zones, 2D/3D implementations would require additional complexity in domain decomposition and synchronization patterns (e.g., halo exchanges across multiple dimensions). TNL provides advanced functionality in TNL::Meshes::DistributedMeshes::DistributedMeshSynchronizer that can be used for mapping structured as well as unstructured meshes across processes using the same principle of distributed vectors with ghost zones.
Use of the iterate method
The ODE solvers in TNL provide an iterate method for performing just one iteration. This is particularly useful when there is a need for enhanced control over the time loop, or when developing a hybrid solver that combines multiple integration methods. The usage of this method is demonstrated in the following example:
1#include <iostream>
2#include <TNL/Solvers/ODE/ODESolver.h>
3#include <TNL/Solvers/ODE/Methods/Euler.h>
4
6using Real = double;
8
10int
11main( int argc, char* argv[] )
12{
18
20 const Real final_time = 10.0;
21 const Real output_time_step = 0.25;
22 Real next_output_time =
TNL::min( output_time_step, final_time );
23 Real tau = 0.001;
24 Real time = 0.0;
25 Real current_tau = tau;
27
29 Vector u = 0.0;
30 ODESolver solver;
31 solver.init( u );
33
35 while( time < final_time ) {
36 auto f = []( const Real& t, const Real& current_tau, const Vector& u, Vector& fu )
37 {
38 fu = t * sin( t );
39 };
40 solver.iterate( u, time, current_tau, f );
41 if( time >= next_output_time ) {
43 next_output_time += output_time_step;
44 if( next_output_time > final_time )
45 next_output_time = final_time;
46 }
47 if( time + current_tau > next_output_time )
48 current_tau = next_output_time -
time;
49 else
50 current_tau = tau;
51 }
53}
For simplicity, we demonstrate the use of iterate with a static solver, but the process is similar for dynamic solvers. There are two main differences compared to using the solve method:
- Initialization: Before calling iterate, it's necessary to initialize the solver using the init method. This step sets up auxiliary vectors within the solver. For ODE solvers with dynamic vectors, the internal vectors of the solver are allocated based on the size of the vector u.
29 Vector u = 0.0;
30 ODESolver solver;
31 solver.init( u );
- Time Loop: Within the time loop, the iterate method is called. It requires the vector u, the right-hand side function f of the ODE, and also the variables time and tau, representing the current time \( t \) and the integration time step, respectively. The variable time is incremented by tau with each iteration. Additionally, tau can be adjusted if the solver performs adaptive time step selection. It's important to adjust tau to ensure that the next_output_time is reached exactly.
35 while( time < final_time ) {
36 auto f = []( const Real& t, const Real& current_tau, const Vector& u, Vector& fu )
37 {
39 };
40 solver.iterate( u, time, current_tau, f );
41 if( time >= next_output_time ) {
43 next_output_time += output_time_step;
44 if( next_output_time > final_time )
45 next_output_time = final_time;
46 }
47 if( time + current_tau > next_output_time )
48 current_tau = next_output_time -
time;
49 else
50 current_tau = tau;
51 }
User defined methods
The Runge-Kutta methods used for solving ODEs can generally be expressed as follows:
\[ k_1 = f(t, \vec u) \]
\[ k_2 = f(t + c_2, \vec u + \tau(a_{21} k_1)) \]
\[ k_3 = f(t + c_3, \vec u + \tau(a_{31} k_1 + a_{32} k_2)) \]
\[ \vdots \]
\[ k_s = f(t + c_s, \vec u + \tau( \sum_{j=1}^{s-1} a_{si} k_i ) )\]
\[ \vec u_{n+1} = \vec u_n + \tau \sum_{i=1}^s b_i k_i\]
\[ \vec u^\ast_{n+1} = \vec u_n + \tau \sum_{i=1}^s b^\ast_i k_i\]
\[ \vec e_{n+1} = \vec u_{n+1} - \vec u^\ast_{n+1} = \vec u_n + \tau \sum_{i=1}^s (b_i - b^\ast_i) k_i \]
where \(s\) denotes the number of stages, the vector \(\vec e_{n+1} \) is an error estimate and a basis for the adaptive choice of the integration step. Each such method can be expressed in the form of a Butcher tableau having the following form:
| | | | |
\( c_1 \) | | | | |
\( c_2 \) | \( a_{21} \) | | | |
\( c_3 \) | \( a_{31} \) | \( a_{32} \) | | |
\( \vdots \) | \( \vdots \) | \( \vdots \) | \( \ddots \) | |
\( c_s \) | \( a_{s1} \) | \( a_{s2} \) | \( \ldots \) | \( a_{s,s-1} \) |
| \( b_1 \) | \( b_2 \) | \( \ldots \) | \( b_s \) |
| \( b^\ast_1 \) | \( b^\ast_2 \) | \( \ldots \) | \( b^\ast_s \) |
For example, the the Fehlberg RK1(2) method can be expressed as:
| | | |
\( 0 \) | | | |
\( 1/2 \) | \( 1/2 \) | | |
\( 1 \) | \( 1/256 \) | \( 1/256 \) | |
| \( 1/512 \) | \( 255/256 \) | \( 1/521 \) |
| \( 1/256 \) | \( 255/256 \) | |
TNL allows the implementation of new Runge-Kutta methods simply by specifying the Butcher tableau. The following is an example of the implementation of the Fehlberg RK1(2):
1#pragma once
2
3#include <array>
4#include <cstddef>
5#include <TNL/Cuda/CudaCallable.h>
6#include <TNL/Math.h>
7
8template< typename Value = double >
9struct Fehlberg2
10{
11 using ValueType = Value;
12
14 static constexpr size_t Stages = 3;
16
17 static constexpr size_t
18 getStages()
19 {
20 return Stages;
21 }
22
24 static constexpr bool
25 isAdaptive()
26 {
27 return true;
28 }
30
31 static constexpr ValueType
32 getCoefficient( const size_t stage, const size_t i )
33 {
34 return k_coefficients[ stage ][ i ];
35 }
36
37 static constexpr ValueType
38 getTimeCoefficient( size_t i )
39 {
40 return time_coefficients[ i ];
41 }
42
43 static constexpr ValueType
44 getUpdateCoefficient( size_t i )
45 {
46 return higher_order_update_coefficients[ i ];
47 }
48
50 static constexpr ValueType
51 getErrorCoefficient( size_t i )
52 {
53 return higher_order_update_coefficients[ i ] - lower_order_update_coefficients[ i ];
54 }
56
57protected:
58
60 static constexpr std::array< std::array< Value, Stages >, Stages > k_coefficients{
61 std::array< Value, Stages >{ 0.0, 0.0, 0.0 },
62 std::array< Value, Stages >{ 1.0/2.0, 0.0, 0.0 },
63 std::array< Value, Stages >{ 1.0/256.0, 255.0/256.0, 0.0 }
64 };
66
68 static constexpr std::array< Value, Stages > time_coefficients{ 0.0, 1.0/2.0, 1.0 };
70
72 static constexpr std::array< Value, Stages > higher_order_update_coefficients{ 1.0/512.0, 255.0/256.0, 1.0/512.0 };
73 static constexpr std::array< Value, Stages > lower_order_update_coefficients { 1.0/256.0, 255.0/256.0, 0.0 };
75
76};
The method is a templated structure accepting a template parameter Value indicating the numeric type of the method. To implement a new method, we need to do the following:
Set the Stages of the Method: This is \( s \) in the definition of the Runge-Kutta method and equals the number of vectors \( \vec k_i \).
14 static constexpr size_t Stages = 3;
Determine Adaptivity: If the Runge-Kutta method allows an adaptive choice of the time step, the method isAdaptive shall return true, and false otherwise.
24 static constexpr bool
25 isAdaptive()
26 {
27 return true;
28 }
Define Method for Error Estimation: Adaptive methods need coefficients for error estimation. Typically, these are the differences between coefficients for updates of higher and lower order of accuracy, as seen in getErrorCoefficients. However, this method can be altered if there's a different formula for error estimation.
50 static constexpr ValueType
51 getErrorCoefficient( size_t i )
52 {
53 return higher_order_update_coefficients[ i ] - lower_order_update_coefficients[ i ];
54 }
Define Coefficients for Evaluating Vectors \( \vec k_i \): The coefficients for the evaluation of the vectors \( \vec k_i \) need to be defined.
60 static constexpr std::array< std::array< Value, Stages >, Stages > k_coefficients{
61 std::array< Value, Stages >{ 0.0, 0.0, 0.0 },
62 std::array< Value, Stages >{ 1.0/2.0, 0.0, 0.0 },
63 std::array< Value, Stages >{ 1.0/256.0, 255.0/256.0, 0.0 }
64 };
and
68 static constexpr std::array< Value, Stages > time_coefficients{ 0.0, 1.0/2.0, 1.0 };
Zero coefficients are omitted in the generated formulas at compile time, ensuring no performance drop.
Define Update Coefficients: Finally, the update coefficients must be defined.
72 static constexpr std::array< Value, Stages > higher_order_update_coefficients{ 1.0/512.0, 255.0/256.0, 1.0/512.0 };
73 static constexpr std::array< Value, Stages > lower_order_update_coefficients { 1.0/256.0, 255.0/256.0, 0.0 };
Such structure can be substituted to the TNL::Solvers::ODE::ODESolver as the template parameter Method.