Introduction
In this part, we describe solvers of ordinary differential equations having the following form:
\[ \frac{d \vec u}{dt} = \vec f( t, \vec u) \text{ on } (0,T), \]
\[ \vec u( 0 ) = \vec u_{ini}. \]
TNL offers the following ODE solvers:
- TNL::Solvers::ODE::Euler - the Euler method with the 1-st order of accuracy.
- TNL::Solvers::ODE::Merson - the Runge-Kutta-Merson solver with the 4-th order of accuracy and adaptive choice of the time step.
Each solver has its static counterpart which can be run even in the GPU kernels which means that it can be combined with TNL::Algorithms::parallelFor for example. The static ODE solvers are the following:
- TNL::Solvers::ODE::StaticEuler - the Euler method with the 1-st order of accuracy.
- TNL::Solvers::ODE::StaticMerson - the Runge-Kutta-Merson solver with the 4-th order of accuracy and adaptive choice of the time step.
Static ODE solvers
Static solvers are supposed to be used mainly when \( x \in R \) is scalar or \( \vec x \in R^n \) is vector where \( n \) is small. Firstly, we show example of scalar problem of the following form:
\[ \frac{d u}{dt} = t \sin ( t )\ \rm{ on }\ (0,T), \]
\[ u( 0 ) = 0. \]
This problem can be solved as follows:
2#include <TNL/Solvers/ODE/StaticEuler.h>
6int main(
int argc,
char* argv[] )
9 const Real final_t = 10.0;
10 const Real tau = 0.001;
11 const Real output_time_step = 0.25;
15 solver.setTime( 0.0 );
17 while( solver.getTime() < final_t )
19 solver.setStopTime( TNL::min( solver.getTime() + output_time_step, final_t ) );
20 auto f = [] (
const Real& t,
const Real& tau,
const Real& u, Real& fu ) {
Solver of ODEs with the first order of accuracy.
Definition StaticEuler.h:55
__cuda_callable__ void setTau(const RealType &tau)
Setter of the time step used for the computation.
Definition StaticExplicitSolver.hpp:52
We first define the type Real
representing the floating-point arithmetics, which is double
in this case (line 4). In the main function, we define the type of the ODE solver (ODESolver
, line 8). We choose TNL::Solvers::ODE::StaticEuler. We define the variable final_t
(line 9) representing the size of the time interval \( (0,T)\), next we define the integration time step tau
(line 10) and the variable output_time_step
(line 11) representing checkpoints in which we will print value of the solution \( x(t)\). On the line 13, we create an instance of the ODESolver
and set the integration time step (line 14) and the initial time of the solver (line 15). Next we create variable u
representing the solution of the given ODE and we initiate it with the initial condition \( u(0) = 0\) (line 16). On the lines 17-25, we iterate over the interval \( (0,T) \) with step given by the frequency of the checkpoints given by the variable output_time_steps
. On the line 19, we let the solver to iterate until the next checkpoint or the end of the interval \((0,T) \) depending on what occurs first (it is expressed by TNL::min( solver.getTime() + output_time_step, final_t )
). On the lines 20-22, we define the lambda function f
representing the right-hand side of the ODE being solved. The lambda function receives the following arguments:
t
is the current value of the time variable \( t \in (0,T)\),
tau
is the current integration time step,
u
is the current value of the solution \( u(t)\),
fu
is a reference on a variable into which we evaluate the right-hand side \( f(u,t) \) on the ODE.
The lambda function is supposed to compute just the value of fu
. It is t * sin(t)
on our case. Finally we call the ODE solver (line 23). As parameters, we pass the variable u
representing the solution \( u(t)\) and a lambda function representing the right-hand side of the ODE. On the line 23, we print values of the solution at given checkpoints. The result looks as follows:
0.001 0
0.002 1e-09
0.252 0.00526916
0.502 0.0409948
0.752 0.13364
1.002 0.302432
1.252 0.556612
1.502 0.893635
1.752 1.2985
2.002 1.74432
2.252 2.19409
2.502 2.60357
2.752 2.92506
3.002 3.11173
3.252 3.1222
3.502 2.92497
3.752 2.50232
4.002 1.85323
4.252 0.995174
4.502 -0.0355494
4.752 -1.18502
5.002 -2.38443
5.252 -3.55415
5.502 -4.60904
5.752 -5.46451
6.002 -6.04295
6.252 -6.28004
6.502 -6.1306
6.752 -5.57319
7.002 -4.61342
7.252 -3.28541
7.502 -1.65121
7.752 0.201757
8.002 2.16523
8.252 4.11644
8.502 5.92576
8.752 7.46532
9.002 8.61785
9.252 9.28537
9.502 9.3969
9.752 8.9147
10 7.84941
Such data can by visualized using Gnuplot as follows
plot 'StaticODESolver-SineExample.out' with lines
or it can be processed by the following Python script which draws graph of the function \( u(t) \) using Matplotlib.
4import matplotlib.pyplot
as plt
9plt.rcParams[
'text.usetex'] =
True
13f = open( sys.argv[1],
'r' )
19 x_lst.append( float( a[ 0 ] ) )
20 y_lst.append( float( a[ 1 ] ) )
29fig, ax = plt.subplots()
31ax.set_ylim( [-10,10] )
32ax.plot(x, y, linewidth=2.0)
34ax.set_ylabel(
"$u(t)$" )
35plt.savefig( sys.argv[2] )
We first parse the input file (lines 13-20) and convert the data into NumPy arrays (lines 24-25). Finaly, these arrays are drawn into a graph (lines 29-36). The graph of the solution looks as follows:

In the next example, we demonstrate use of the static ODE solver to solve a system of ODEs, namely the Lorenz system which reads as:
\[ \frac{dx}{dt} = \sigma( x - y),\ \rm{ on }\ (0,T) \]
\[ \frac{dy}{dt} = x(\rho - z ) - y,\ \rm{ on }\ (0,T) \]
\[ \frac{dz}{dt} = xy - \beta z,\ \rm{ on }\ (0,T) \]
\[ \vec u(0) = (x(0),y(0),z(0)) = \vec u_{ini} \]
for given constants \( \sigma, \rho \) and \( \beta \). The solution \( \vec u(t) = (x(t), y(t), z(t)) \in R^3 \) is represented by three-dimensional static vector (TNL::Containers::StaticVector). The solver looks as follows:
2#include <TNL/Containers/StaticVector.h>
3#include <TNL/Solvers/ODE/StaticEuler.h>
7int main(
int argc,
char* argv[] )
11 const Real final_t = 25.0;
12 const Real tau = 0.001;
13 const Real output_time_step = 0.01;
14 const Real sigma = 10.0;
15 const Real rho = 28.0;
16 const Real beta = 8.0 / 3.0;
20 solver.setTime( 0.0 );
21 Vector u( 1.0, 2.0, 3.0 );
22 while( solver.getTime() < final_t )
24 solver.setStopTime( TNL::min( solver.getTime() + output_time_step, final_t ) );
25 auto f = [=] (
const Real& t,
const Real& tau,
const Vector& u, Vector& fu ) {
26 const Real& x = u[ 0 ];
27 const Real& y = u[ 1 ];
28 const Real& z = u[ 2 ];
29 fu[ 0 ] = sigma * (y - x );
30 fu[ 1 ] = rho * x - y - x * z;
31 fu[ 2 ] = -beta * z + x * y;
The code is very similar to the previous example. There are the following differences:
- We define the type of the variable
u
representing the solution \( \vec u(t) \) as TNL::Containers::StaticVector< 3, Real > (line 9) which is reflected even in the definition of the ODE solver (TNL::Solvers::ODE::StaticEuler< Vector > , line 10) and the variable u
(line 21).
- In addition to the parameters of the solver (
final_t
, tau
and output_time_step
, lines 14-16) we define parameters of the Lorenz system (sigma
, rho
and beta
, lines 14-16).
- The initial condition \( \vec u(0) = (1,2,3) \) is set on the line 21.
- In the lambda function representing the right-hand side of the Lorenz system (lines 25-32), we first define auxiliary aliases
x
,y
and z
(lines 26-28) to make the code easier to read. The main right-hand side of the Lorenz system is implemented on the lines (29-31).
- In the line 3, we print all components of the vector
u
.
The solver creates file with the solution \( (\sigma(i \tau), \rho( i \tau), \beta( i \tau )) \) for ' \( i = 0, 1, \ldots N \) on separate lines. It looks as follows:
sigma[ 0 ] rho[ 0 ] beta[ 0 ]
sigma[ 1 ] rho[ 1 ] beta[ 1 ]
sigma[ 2 ] rho[ 2 ] beta[ 2 ]
...
Such file can visualized using Gnuplot interactively in 3D as follows
splot 'StaticODESolver-LorenzExample.out' with lines
or it can be processed by the following Python script:
4import matplotlib.pyplot
as plt
5from mpl_toolkits.mplot3d
import axes3d, Axes3D
10plt.rcParams[
'text.usetex'] =
True
14f = open( sys.argv[1],
'r' )
21 x_lst.append( float( a[ 0 ] ) )
22 y_lst.append( float( a[ 1 ] ) )
23 z_lst.append( float( a[ 2 ] ) )
35theta = np.linspace(-4 * np.pi, 4 * np.pi, 100)
36ax.plot(x, y, z, label=
'Lorenz attractor')
38plt.savefig( sys.argv[2] )
The script has very similar structure as in the previous example. The result looks as follows:

Combining static ODE solvers with parallel for
The static solvers can be used inside of lambda functions for TNL::Algorithms::parallelFor for example. This can be useful when we need to solve large number of independent ODE problems, for example for parametric analysis. We demonstrate it on the two examples we have described above.
Solving scalar problems in parallel
The first example solves ODE given by
\[ \frac{d u}{dt} = t \sin ( c t )\ \rm{ on }\ (0,T), \]
\[ u( 0 ) = 0, \]
where \( c \) is a constant. We will solve it in parallel ODEs with different values \( c \in \langle c_{min}, c_{max} \rangle \). The exact solution can be found here. The code reads as follows:
3#include <TNL/Solvers/ODE/StaticEuler.h>
4#include <TNL/Containers/Vector.h>
5#include <TNL/Algorithms/parallelFor.h>
9template<
typename Device >
10void solveParallelODEs(
const char* file_name )
14 const Real final_t = 10.0;
15 const Real tau = 0.001;
16 const Real output_time_step = 0.1;
17 const Real c_min = 1.0;
18 const Real c_max = 5.0;
19 const int c_vals = 5.0;
20 const Real c_step = ( c_max - c_min ) / ( c_vals - 1 );
21 const int output_time_steps = ceil( final_t / output_time_step ) + 1;
23 Vector results( output_time_steps * c_vals, 0.0 );
24 auto results_view = results.getView();
25 auto f = [=]
__cuda_callable__ (
const Real& t,
const Real& tau,
const Real& u, Real& fu,
const Real& c ) {
26 fu = t * sin( c * t );
29 const Real c = c_min + idx * c_step;
32 solver.setTime( 0.0 );
35 results_view[ idx ] = u;
36 while( time_step < output_time_steps )
38 solver.setStopTime( TNL::min( solver.getTime() + output_time_step, final_t ) );
39 solver.solve( u, f, c );
40 results_view[ time_step++ * c_vals + idx ] = u;
46 file.
open( file_name, std::ios::out );
47 for(
int i = 0; i < c_vals; i++ )
49 file <<
"# c = " << c_min + i * c_step <<
std::endl;
50 for(
int k = 0; k < output_time_steps;k++ )
51 file << k * output_time_step <<
" " << results.getElement( k * c_vals + i ) <<
std::endl;
56int main(
int argc,
char* argv[] )
59 file_name +=
"/StaticODESolver-SineParallelExample-result.out";
60 solveParallelODEs< TNL::Devices::Host >( file_name.getString() );
62 solveParallelODEs< TNL::Devices::Cuda >( file_name.getString() );
#define __cuda_callable__
Definition CudaCallable.h:22
Vector extends Array with algebraic operations.
Definition Vector.h:39
Class for managing strings.
Definition String.h:33
std::enable_if_t< std::is_integral_v< Begin > &&std::is_integral_v< End > > parallelFor(const Begin &begin, const End &end, typename Device::LaunchConfiguration launch_config, Function f, FunctionArgs... args)
Parallel for-loop function for 1D range specified with integral values.
Definition parallelFor.h:44
In this example we also show, how to run it on GPU. Therefore we moved the main solver to separate function solveParallelODEs
which has one template parameter Device
telling on what device it is supposed to run. The results of particular ODEs are stored in a memory and at the end they are copied to a file with given filename file_name
. The variable \( u \) is scalar therefore we represent it by the type Real
in the solver (line 12). Next we define parameters of the ODE solver (final_t
, tau
and output_time_step
, lines 14-16), interval for the parameter of the ODE \( c \in \langle c_{min}, c_{max} \rangle \) ( c_min
, c_max
, lines 17-18) and number of values c_vals
(line 19) distributed equidistantly in the interval with step c_step
(line 20). We use the number of different values of the parameter c
as a range for the parallel for on the line 43. This parallel for processes the lambda function solve
which is defined on the lines 28-42. It receives a parameter idx
which is index of the value of the parameter c
. We compute its value on the line 29. Next we create the ODE solver (line 30) and setup its parameters (lines 31-32). We set initial condition of the given ODE and we define variable time_step
which counts checkpoints which we store in the memory in vector results
allocated in the line 23 and accessed in the lambda function via vector view results_view
(defined on the line 24). We iterate over the interval \( (0, T) \) in the while loop starting on the line 36. We set the stop time of the ODE solver (line 38) and we run the solver (line 39). Finally we store the result at given checkpoint into vector view results_view
. If the solver runs on the GPU, it cannot write the checkpoints into a file. This is done in postprocessing in the lines 45-53.
Note, how we pass the value of parameter c
to the lambda function f
. The method solve
of the ODE solvers(TNL::Solvers::ODE::StaticEuler::solve, for example) accepts user defined parameters via variadic templates. It means that in addition to the variable u
and the right-hand side f
we can add any other parameters like c
in this example (line 39). This parameter appears in the lambda function f
(line 25). The reason for this is that the nvcc
compiler (version 10.1) does not accept lambda function defined within another lambda function. If such a construction is accepted by a compiler, the lambda function f
which can be defined within the lambda function solve
and the variable c
defined in the lambda function solve
could be captured by f
.
The solver generates file of the following format:
# c = c[ 0 ]
x[ 0 ] u( c[ 0 ], x[ 0 ] )
x[ 1 ] u( c[ 0 ], x[ 1 ] )
....
# c = c[ 1 ]
x[ 0 ] u( c[ 1 ], x[ 0 ] )
x[ 1 ] u( c[ 1 ], x[ 1 ] )
...
The file an visualized using Gnuplot as follows
splot 'StaticODESolver-SineParallelExample-result.out' with lines
or it can be processed by the following Python script:
4import matplotlib.pyplot
as plt
9plt.rcParams[
'text.usetex'] =
True
13f = open( sys.argv[1],
'r' )
28 parameters.append( current_c )
29 u_data.append( np.array( u_lst ) )
31 x_data.append( np.array( x_lst ) )
34 current_c = float( a [ 3 ] )
37 x_lst.append( float( a[ 0 ] ) )
38 u_lst.append( float( a[ 1 ] ) )
39parameters.append( current_c )
40u_data.append( np.array( u_lst ) )
45fig, ax = plt.subplots( 1, n, figsize=(15, 3), sharey=
True )
48 ax[ idx ].plot( x_data[0], u, linewidth=2.0, label=f
"$c={parameters[idx]}$" )
49 ax[ idx ].set_xlabel(
"t" )
50 ax[ idx ].set_ylabel(
"u(t)" )
52plt.savefig( sys.argv[2] )
The result of this example looks as follows:

Solving the Lorenz system in parallel
The second example is a parametric analysis of the Lorenz model
\[ \frac{dx}{dt} = \sigma( x - y),\ \rm{ on }\ (0,T) \]
\[ \frac{dy}{dt} = x(\rho - z ) - y,\ \rm{ on }\ (0,T) \]
\[ \frac{dz}{dt} = xy - \beta z,\ \rm{ on }\ (0,T) \]
\[ \vec u(0) = (x(0),y(0),z(0)) = \vec u_{ini} \]
which we solve for different values of the model parameters
\[ \sigma_i = \sigma_{min} + i \Delta \sigma, \]
\[ \rho_j = \rho_{min} + j \Delta \rho, \]
\[ \beta_k = \beta_{min} + k \Delta \beta, \]
where we set \( \Delta \sigma = \Delta \rho = \Delta \beta = l / (p-1) \) and where \( i,j,k = 0, 1, \ldots, p - 1 \). The code of the solver looks as follows:
3#include <TNL/Solvers/ODE/StaticEuler.h>
4#include <TNL/Containers/Vector.h>
5#include <TNL/Algorithms/parallelFor.h>
6#include <TNL/Containers/StaticArray.h>
11template<
typename Device >
12void solveParallelODEs(
const char* file_name )
16 const Real final_t = 50.0;
17 const Real tau = 0.001;
18 const Real output_time_step = 0.005;
19 const Real sigma_min( 10.0 ), rho_min( 15.0 ), beta_min( 1.0 );
20 const int parametric_steps = 4;
21 const Real sigma_step = 30.0 / ( parametric_steps - 1 );
22 const Real rho_step = 21.0 / ( parametric_steps - 1 );
23 const Real beta_step = 15.0 / ( parametric_steps - 1 );
24 const int output_time_steps = ceil( final_t / output_time_step ) + 1;
26 const int results_size( output_time_steps * parametric_steps * parametric_steps * parametric_steps );
28 auto results_view = results.getView();
29 auto f = [=]
__cuda_callable__ (
const Real& t,
const Real& tau,
const Vector& u, Vector& fu,
30 const Real& sigma_i,
const Real& rho_j,
const Real& beta_k ) {
31 const Real& x = u[ 0 ];
32 const Real& y = u[ 1 ];
33 const Real& z = u[ 2 ];
34 fu[ 0 ] = sigma_i * (y - x );
35 fu[ 1 ] = rho_j * x - y - x * z;
36 fu[ 2 ] = -beta_k * z + x * y;
39 const Real sigma_i = sigma_min + i[ 0 ] * sigma_step;
40 const Real rho_j = rho_min + i[ 1 ] * rho_step;
41 const Real beta_k = beta_min + i[ 2 ] * beta_step;
45 solver.setTime( 0.0 );
46 Vector u( 1.0, 1.0, 1.0 );
48 results_view[ ( i[ 0 ] * parametric_steps + i[ 1 ] ) * parametric_steps + i[ 2 ] ] = u;
49 while( time_step < output_time_steps )
51 solver.setStopTime( TNL::min( solver.getTime() + output_time_step, final_t ) );
52 solver.solve( u, f, sigma_i, rho_j, beta_k );
53 const int idx = ( ( time_step++ * parametric_steps + i[ 0 ] ) * parametric_steps + i[ 1 ] ) * parametric_steps + i[ 2 ];
54 results_view[ idx ] = u;
57 const MultiIndex
begin = { 0, 0, 0 };
58 const MultiIndex
end = { parametric_steps, parametric_steps, parametric_steps };
62 file.
open( file_name, std::ios::out );
63 for(
int sigma_idx = 0; sigma_idx < parametric_steps; sigma_idx++ )
64 for(
int rho_idx = 0; rho_idx < parametric_steps; rho_idx++ )
65 for(
int beta_idx = 0; beta_idx < parametric_steps; beta_idx++ )
67 Real sigma = sigma_min + sigma_idx * sigma_step;
68 Real rho = rho_min + rho_idx * rho_step;
69 Real beta = beta_min + beta_idx * beta_step;
70 file <<
"# sigma " << sigma <<
" rho " << rho <<
" beta " << beta <<
std::endl;
71 for(
int i = 0; i < output_time_steps - 1; i++ )
73 int offset = ( ( i * parametric_steps + sigma_idx ) * parametric_steps + rho_idx ) * parametric_steps + beta_idx;
74 auto u = results.getElement( offset );
75 file << u[ 0 ] <<
" " << u[ 1 ] <<
" " << u[ 2 ] <<
std::endl;
81int main(
int argc,
char* argv[] )
84 file_name +=
"/StaticODESolver-LorenzParallelExample-result.out";
86 solveParallelODEs< TNL::Devices::Host >( file_name.getString() );
88 solveParallelODEs< TNL::Devices::Cuda >( file_name.getString() );
Array with constant size.
Definition StaticArray.h:23
It is very similar to the previous one. There are just the following changes:
- On the line 17, we define minimal values for the parameters \( \sigma, \beta \) and \( \rho \). On the line 18, we define how many different values we will consider for each parameter. The size of equidistant steps in the parameter variations is defined on the line 19. The interval for parameters variations is set to \( [0,30] \) (line 19 as well).
- On the line 23, we allocate vector
results
into which we will store the solution of the Lorenz problem for various parameters.
- Next we define the lambda function
f
representing the right-hand side of the Lorenz problem (lines 25-33) and the lambda function solve
representing the ODE solver for the Lorenz problem with particular setup of the parameters (lines 34-52). This lambda function is processed by TNL::Algorithms::parallelFor called on the line 53. Therefore the lambda function solve
receives three indexes i
, j
and k
which are used to compute particular values of the parameters \( \sigma_i, \rho_j, \beta_k \) which are represented by variables sigma_i
, rho_j
and beta_k
(lines 35-37). These parameters must be passed to the lambda function f
explicitly (line 48). The reason is the same as in the previous example - nvcc (version 10.1) does not accept a lambda function defined within another lambda function.
- The initial condition for the Lorenz problem is set to vector \( (1,1,1) \) (line 42). Finally, we start the time loop (lines 45-51) and we store the state of the solution into the vector
results
using related vector view results_view
in the time intervals given by the variable output_time_step
.
- When all ODEs ares solved, we copy all the solutions from the vector
results
into an output file.
The files has the following format:
# sigma = c[ 0 ] rho = rho[ 0 ] beta = beta[ 0 ]
x[ 0 ] u( sigma[ 0 ], rho[ 0 ], beta[ 0 ], x[ 0 ] )
x[ 1 ] u( sigma[ 0 ], rho[ 0 ], beta[ 0 ], x[ 1 ] )
....
# sigma = c[ 1 ] rho = rho[ 1 ] beta = beta[ 1 ]
x[ 0 ] u( sigma[ 1 ], rho[ 1 ], beta[ 1 ], x[ 0 ] )
x[ 1 ] u( sigma[ 1 ], rho[ 1 ], beta[ 1 ], x[ 1 ] )
...
The file can be processed by the following Python script:
4import matplotlib.pyplot
as plt
6from mpl_toolkits.mplot3d
import axes3d, Axes3D
8plt.rcParams[
'text.usetex'] =
True
12f = open( sys.argv[1],
'r' )
35 parameters_tuple = ( current_sigma, current_rho, current_beta )
36 parameters.append( parameters_tuple )
37 data_tuple = ( np.array( x_lst ), np.array( y_lst ), np.array( z_lst ) )
38 data[ parameters_tuple ] = data_tuple
43 current_sigma = float( a[ 2 ] )
44 current_rho = float( a[ 4 ] )
45 current_beta = float( a [ 6 ] )
46 if current_sigma
not in sigma_lst:
47 sigma_lst.append( current_sigma )
48 if current_rho
not in rho_lst:
49 rho_lst.append( current_rho )
50 if current_beta
not in beta_lst:
51 beta_lst.append( current_beta )
53 x_lst.append( float( a[ 0 ] ) )
54 y_lst.append( float( a[ 1 ] ) )
55 z_lst.append( float( a[ 2 ] ) )
56parameters_tuple = ( current_sigma, current_rho, current_beta )
57parameters.append( parameters_tuple )
58data_tuple = ( np.array( x_lst ), np.array( y_lst ), np.array( z_lst ) )
59data[ parameters_tuple ] = data_tuple
63sigma_n = len( sigma_lst )
65for sigma
in sigma_lst:
66 rho_n = len( rho_lst )
67 beta_n = len( beta_lst )
69 fig, ax = plt.subplots( rho_n, beta_n, figsize=(8, 8), sharey=
True, sharex=
True )
70 fig.suptitle( f
'$\sigma={sigma}$')
76 parameters_tuple = ( sigma, rho, beta )
77 data_tuple = data[ parameters_tuple ]
78 ax[ rho_idx, beta_idx ].plot( data_tuple[ 1 ], data_tuple[ 2 ], linewidth=1.0 )
80 ax[ rho_idx, beta_idx ].set_ylabel( f
'$\\rho={rho}$' )
81 if rho_idx == rho_n-1:
82 ax[ rho_idx, beta_idx ].set_xlabel( f
'$\\beta={beta}$' )
83 beta_idx = beta_idx + 1
87 plt.savefig( f
"{sys.argv[2]}-{sigma_idx}.png" )
88 sigma_idx = sigma_idx + 1
The result looks as follows:


Non-static ODE Solvers
In this section, we will show how to solve simple heat equation in 1D. The heat equation is a parabolic partial differential equation which reads as follows:
\[
\frac{\partial u(t,x)}{\partial t} - \frac{\partial^2 u(t,x)}{\partial^2 x} = 0\ \rm{on}\ (0,T) \times (0,1),
\]
\[
u(t,0) = 0,
\]
\[
u(t,0) = 1,
\]
\[
u(0,x) = u_{ini}(x)\ \rm{on}\ [0,1].
\]
We discretize it by the finite difference method to get numerical approximation. We first define set of nodes \( x_i = ih \) for \(i=0,\ldots n-1 \) where \(h = 1 / (n-1) \) (we use C++ indexing for the consistency). Using the method of lines and approximating the second derivative by the central finite diference ( \( \frac{\partial^2 u(t,x)}{\partial^2 x} \approx \frac{u_{i-1} - 2 u_i + u_{i+1}}{h^2} \)), we obtain system of ODEs of the following form
\[
\frac{\rm{d}u_i(t)}{\rm{d} t} = \frac{u_{i-1} - 2 u_i + u_{i+1}}{h^2}\ \rm{for}\ i = 1, \ldots, n-2,
\]
where \( u_i(t) = u(t,ih) \) and \( h \) space step between two adjacent nodes of the numerical mesh. We also set
\[
u_0(t) = u_{n-1}(t) = 0\ \rm{on}\ [0,T].
\]
What are the main differences compared to the Lorenz model?
- The size of the Lorenz model is fixed. It is equal to three since \( (\sigma, \rho, \beta) \in R^3 \) which is small vector of fixed size and it can be represented by the static vector TNL::Containers::StaticVector< 3, Real >. On the other hand, the size of the ODE system arising in the discretization by the method of lines depends not on the problem we solve but on the desired accuracy - the larger \( n \) the more accurate numerical approximation we get. The number of nodes \( n \) used for the space discretisation defines the number of parameters defining the mesh function. These parameters are also referred as degrees of freedom, DOFs. Therefore the size of the system can be large and it is better to employ dynamic vector TNL::Containers::Vector for the solution.
- The size of the Lorenz model is small and so the evaluation of its right-hand side can be done sequentially by one thread. The size of the ODE system can be very large and evaluating the right-hand side asks for being performed in parallel.
- The dynamic vector TNL::Containers::Vector allocates data dynamically and therefore it cannot be created within a GPU kernel which means the ODE solvers cannot be created in the GPU kernel either. For this reason, the lambda function
f
evaluating the right-hand side of the ODE system is always executed on the host and it calls TNL::Algorithms::parallelFor to evaluate the right-hand side of the ODE system.
Basic setup
The implementation of the solver looks as follows:
2#include <TNL/FileName.h>
3#include <TNL/Containers/Vector.h>
4#include <TNL/Solvers/ODE/Euler.h>
10template<
typename Device >
11void solveHeatEquation(
const char* file_name )
14 using VectorView =
typename Vector::ViewType;
20 const Real final_t = 0.05;
21 const Real output_time_step = 0.005;
23 const Real h = 1.0 / ( n - 1 );
24 const Real tau = 0.1 * h * h;
25 const Real h_sqr_inv = 1.0 / ( h * h );
33 if( x >= 0.4 && x <= 0.6 )
38 file.
open( file_name, std::ios::out );
39 write( file, u, n, h, ( Real ) 0.0 );
46 solver.setTime( 0.0 );
51 while( solver.getTime() < final_t )
53 solver.setStopTime( TNL::min( solver.getTime() + output_time_step, final_t ) );
54 auto f = [=]
__cuda_callable__ ( Index i,
const VectorView& u, VectorView& fu )
mutable {
55 if( i == 0 || i == n-1 )
58 fu[ i ] = h_sqr_inv * ( u[ i - 1 ] - 2.0 * u[ i ] + u[ i + 1 ] );
60 auto time_stepping = [=] (
const Real& t,
const Real& tau,
const VectorView& u, VectorView& fu ) {
62 solver.solve( u, time_stepping );
63 write( file, u, n, h, solver.getTime() );
67int main(
int argc,
char* argv[] )
70 file_name +=
"/ODESolver-HeatEquationExample-result.out";
72 solveHeatEquation< TNL::Devices::Host >( file_name.getString() );
74 solveHeatEquation< TNL::Devices::Cuda >( file_name.getString() );
Solver of ODEs with the first order of accuracy.
Definition Euler.h:45
The solver is implemented in separate function solveHeatEquation
(line 11) having one template parameter Device
(line 10) defining on what device the solver is supposed to be executed. We first define necessary types:
Vector
(line 13) is alias for TNL::Containers::Vector. The number of DOFs can be large and therefore they are stored in the resizable dynamically allocated vector TNL::Containers::Vector rather then in static vector - TNL::Containers::StaticVector.
VectorView
(line 14) will be used for accessing DOFs within the lambda functions.
ODESolver
(line 15) is type of the ODE solver which we will use for the computation of the time evolution in the time dependent heat equation.
Next we define parameters of the discretization:
final_t
(line 20) represents the size of the time interval \( [0,T] \).
output_time_step
(line 21) defines time intervals in which we will write the solution \( u \) into a file.
n
(line 22) is the number of DOFs, i.e. number of nodes used for the finite difference method.
h
(line 23) is the space step, i.e. distance between two consecutive nodes.
tau
(line 24) is the time step used for the integration by the ODE solver. Since we solve the second order parabolic problem, the time step should be proportional to \( h^2 \).
h_sqr_inv
(line 25) is auxiliary constant equal to \( 1/h^2 \) which we use later in finite difference for approximation of the second derivative.
Next we set the initial condition \( u_{ini} \) (lines ) which is given as:
\[
u_{ini}(x) = \left\{
\begin{array}{rl}
0 & \rm{for}\ x < 0.4, \\
1 & \rm{for}\ 0.4 \leq x \leq 0.6, \\
0 & \rm{for}\ x > 0. \\
\end{array}
\right.
\]
Next we write the initial condition to a file (lines 37-39) using the function write
which we will describe later. On the lines (44-46) we create instance of the ODE solver solver
, we set the integration time step tau
of the solver (TNL::Solvers::ODE::ExplicitSolver::setTau ) and we set the initial time to zero (TNL::Solvers::ODE::ExplicitSolver::setTime).
Finally, we proceed to the time loop (lines 51-64) but before we prepare counter of the states to be written into files (output_idx
). The time loop uses the time variable within the ODE solver (TNL::Solvers::ODE::ExplicitSolver::getTime ) and it iterates until we reach the end of the time interval \( [0, T] \) given by the variable final_t
. On the line 53, we set the stop time of the ODE solver (TNL::Solvers::ODE::ExplicitSolver::setStopTime ) to the next checkpoint for storing the state of the heat equation or the end of the time interval depending on what comes first. Next we define the lambda function f
expressing the discretization of the second derivative of \( u \) by the central finite difference and the boundary conditions. The function receives the following parameters:
i
is the index of the node and the related ODE arising from the method of lines. In fact, we have to evaluate the update of \( u_i^k \) to get to the next time level \( u_i^{k+1} \).
u
is vector view representing the state \( u_i^k \) of the heat equation on the \( k- \) time level.
fu
is vector of updates or time derivatives in the method of lines which will bring \( u \) to the next time level.
As we mentioned above, since nvcc
does not accept lambda functions defined within another lambda function, we have to define f
separately and pass the parameters u
and fu
explicitly (see the line 61).
Now look at the code of the lambda function f
. Since the solution \( u \) does not change on the boundaries, we return zero on the boundary nodes (lines 55-56) and we evaluate the central difference for approximation of the second derivative on the interior nodes (line 58).
Next we define the lambda function time_stepping
(lines ) which is responsible for computing of the updates for all nodes \( i = 0, \ldots n-1 \). It is done by means of TNL::Algorithms::parallelFor which iterates over all the nodes and calling the function f
on each of them. It passes the vector views u
and fu
explicitly to f
for the reasons we have mentioned above.
Finally, we run the ODE solver (TNL::Solvers::ODE::Euler::solve ) (line 62) and we pass u
as the current state of the heat equation and f
the lambda function controlling the time evolution to the method solve
. On the line 63, we store the current state to a file.
The function write
which we use for writing the solution of the heat equation reads as follows:
3template<
typename Vector,
4 typename Real =
typename Vector::RealType,
5 typename Index =
typename Vector::IndexType >
6void write(
std::fstream& file,
const Vector& u,
const Index n,
const Real& h,
const Real& time )
9 for( Index i = 0; i < n; i++ )
10 file << i*h <<
" " << u.getElement( i ) <<
std::endl;
The function accepts the following parameters:
file
is the file into which we want to store the solution.
u
is a vector or vector view representing solution of the heat equation at given time.
n
is number of nodes the we use for the approximation of the solution.
h
is space step, i.e. distance between two consecutive nodes.
time
is the current time of the evolution being computed.
The solver writes the results in the following format:
# time = t[ 0 ]
x[ 0 ] u( t[ 0 ], x[ 0 ] )
x[ 1 ] u( t[ 0 ], x[ 1 ] )
x[ 2 ] u( t[ 0 ], x[ 2 ] )
...
# time = t[ 1 ]
x[ 0 ] u( t[ 1 ], x[ 0 ] )
x[ 1 ] u( t[ 1 ], x[ 1 ] )
x[ 2 ] u( t[ 1 ], x[ 2 ] )
...
The solution can be visualised with Gnuplot as follows:
plot 'ODESolver-HeatEquationExample-result.out' with lines
or it can be easily parsed in Python and processes by Matplotlib using the following script:
4import matplotlib.pyplot
as plt
6from mpl_toolkits.mplot3d
import axes3d, Axes3D
8plt.rcParams[
'text.usetex'] =
True
10f = open( sys.argv[1],
'r' )
25 time_lst.append( current_time )
26 u_data.append( np.array( u_lst ) )
28 x_data.append( np.array( x_lst ) )
32 current_time = float( a[ 3 ] )
33 time_lst.append( current_time )
35 x_lst.append( float( a[ 0 ] ) )
36 u_lst.append( float( a[ 1 ] ) )
38 time_lst.append( current_time )
39 u_data.append( np.array( u_lst ) )
41 x_data.append( np.array( x_lst ) )
46fig, ax = plt.subplots( 1, 1, figsize=(4, 4) )
48 ax.plot( x_data[ 0 ], u, linewidth=2.0 )
50ax.set_ylabel( f
'$ u(x) $' )
51ax.set_xlabel( f
'$x $' )
52plt.savefig( sys.argv[2] )
The result looks as follows:

Setup with a solver monitor
In this section we will show how to connect ODE solver with the solver monitor. Look at the following example:
2#include <TNL/FileName.h>
3#include <TNL/Containers/Vector.h>
4#include <TNL/Solvers/ODE/Euler.h>
5#include <TNL/Solvers/IterativeSolverMonitor.h>
11template<
typename Device >
12void solveHeatEquation(
const char* file_name )
15 using VectorView =
typename Vector::ViewType;
21 const Real final_t = 0.05;
22 const Real output_time_step = 0.005;
24 const Real h = 1.0 / ( n - 1 );
25 const Real tau = 0.001 * h * h;
26 const Real h_sqr_inv = 1.0 / ( h * h );
34 if( x >= 0.4 && x <= 0.6 )
39 file.
open( file_name, std::ios::out );
40 write( file, u, n, h, ( Real ) 0.0 );
46 IterativeSolverMonitorType monitor;
48 monitor.setRefreshRate(10);
49 monitor.setVerbose(1);
50 monitor.setStage(
"ODE solver stage:" );
57 solver.setTime( 0.0 );
58 solver.setSolverMonitor( monitor );
63 while( solver.getTime() < final_t )
65 solver.setStopTime( TNL::min( solver.getTime() + output_time_step, final_t ) );
66 auto f = [=]
__cuda_callable__ ( Index i,
const VectorView& u, VectorView& fu )
mutable {
67 if( i == 0 || i == n-1 )
70 fu[ i ] = h_sqr_inv * ( u[ i - 1 ] - 2.0 * u[ i ] + u[ i + 1 ] );
72 auto time_stepping = [=] (
const Real& t,
const Real& tau,
const VectorView& u, VectorView& fu ) {
74 solver.solve( u, time_stepping );
75 write( file, u, n, h, solver.getTime() );
77 monitor.stopMainLoop();
80int main(
int argc,
char* argv[] )
83 file_name +=
"/ODESolver-HeatEquationExampleWithMonitor-result.out";
85 solveHeatEquation< TNL::Devices::Host >( file_name.getString() );
87 solveHeatEquation< TNL::Devices::Cuda >( file_name.getString() );
A RAII wrapper for launching the SolverMonitor's main loop in a separate thread.
Definition SolverMonitor.h:136
There are the following differences compared to the previous example:
- We have to include a header file with the iterative solver monitor (line 5).
- We have to setup the solver monitor (lines 45-50). First, we define the monitor type (line 45) and we create an instance of the monitor (line 46). Next we create a separate thread for the monitor (line 47), set the refresh rate to 10 milliseconds (line 48), turn on the verbose mode (line 49) and set the solver stage name (line 50). On the line 58, we connect the monitor with the solver using method TNL::Solvers::IterativeSolver::setSolverMonitor. We stop the monitor after the ODE solver finishes by calling TNL::Solvers::IterativeSolverMonitor::stopMainLoop in the line 77.
The result looks as follows:
ODE solver stage: ITER: 6581 RES: 15.86
ODE solver stage: ITER: 7179 RES: 8.2742
ODE solver stage: ITER: 4058 RES: 4.1229
ODE solver stage: ITER: 4888 RES: 2.7442
ODE solver stage: ITER: 5705 RES: 1.9642
ODE solver stage: ITER: 6154 RES: 1.5876
ODE solver stage: ITER: 1948 RES: 1.4427
ODE solver stage: ITER: 7076 RES: 1.4002
ODE solver stage: ITER: 2668 RES: 27.575
ODE solver stage: ITER: 7041 RES: 15.198
ODE solver stage: ITER: 3375 RES: 10.602
ODE solver stage: ITER: 6987 RES: 8.3727
ODE solver stage: ITER: 1852 RES: 7.0295
ODE solver stage: ITER: 4750 RES: 6.0064
ODE solver stage: ITER: 7805 RES: 5.1036
ODE solver stage: ITER: 4208 RES: 4.0931
ODE solver stage: ITER: 7488 RES: 3.4914
ODE solver stage: ITER: 3727 RES: 2.8853
ODE solver stage: ITER: 146 RES: 2.3951
ODE solver stage: ITER: 4281 RES: 2.0585
ODE solver stage: ITER: 518 RES: 1.8056
ODE solver stage: ITER: 4906 RES: 1.6265
ODE solver stage: ITER: 1213 RES: 1.5174
ODE solver stage: ITER: 5590 RES: 1.4644
ODE solver stage: ITER: 785 RES: 1.4466
ODE solver stage: ITER: 3372 RES: 1.4394
ODE solver stage: ITER: 6724 RES: 1.4322
ODE solver stage: ITER: 1443 RES: 1.4237
ODE solver stage: ITER: 4941 RES: 1.4099
ODE solver stage: ITER: 6685 RES: 1.402