Introduction
In this section, we describe solvers of ordinary differential equations (ODE) characterized by the following equation:
\[ \frac{d \vec u(t)}{dt} = \vec f( t, \vec u(t)) \text{ on } (0,T), \]
and the initial condition
\[ \vec u( 0 ) = \vec u_{ini}, \]
where \( T>0 \). This class of problems can be solved by TNL::Solvers::ODE::ODESolver which incorporates the following template parameters:
Method
- specifies the numerical method used for solving the ODE.
Vector
- denotes a container used to represent the vector \( \vec u(t) \).
SolverMonitor
- is a tool for tracking the progress of the ODE solver.
Method
TNL provides several methods for ODE solution, categorized based on their order of accuracy:
1-st order accuracy methods:
- TNL::Solvers::ODE::Methods::Euler or TNL::Solvers::ODE::Methods::Matlab::ode1 - the forward Euler method.
- TNL::Solvers::ODE::Methods::Midpoint - the explicit midpoint method.
2-nd order accuracy methods:
- TNL::Solvers::ODE::Methods::Heun2 or TNL::Solvers::ODE::Methods::Matlab::ode2 - the Heun method with adaptive choice of the time step.
- TNL::Solvers::ODE::Methods::Ralston2 - the Ralston method.
- TNL::Solvers::ODE::Methods::Fehlberg2 - the Fehlberg method with adaptive choice of the time step.
3-rd order accuracy methods:
- TNL::Solvers::ODE::Methods::Kutta - the Kutta method.
- TNL::Solvers::ODE::Methods::Heun3 - the Heun method.
- TNL::Solvers::ODE::Methods::VanDerHouwenWray - the Van der Houwen/Wray method.
- TNL::Solvers::ODE::Methods::Ralston3 - the Ralston method.
- TNL::Solvers::ODE::Methods::SSPRK3 - the Strong Stability Preserving Runge-Kutta method.
- TNL::Solvers::ODE::Methods::BogackiShampin or TNL::Solvers::ODE::Methods::Matlab::ode23 - Bogacki-Shampin method with adaptive choice of the time step.
4-th order accuracy methods:
- TNL::Solvers::ODE::Methods::OriginalRungeKutta - the "original" Runge-Kutta method.
- TNL::Solvers::ODE::Methods::Rule38 - 3/8 rule method.
- TNL::Solvers::ODE::Methods::Ralston4 - the Ralston method.
- TNL::Solvers::ODE::Methods::KuttaMerson - the Runge-Kutta-Merson method with adaptive choice of the time step.
5-th order accuracy methods:
- TNL::Solvers::ODE::Methods::CashKarp - the Cash-Karp method with adaptive choice of the time step.
- TNL::Solvers::ODE::Methods::DormandPrince or TNL::Solvers::ODE::Methods::Matlab::ode45 - the Dormand-Prince with adaptive choice of the time step.
- TNL::Solvers::ODE::Methods::Fehlberg5 - the Fehlberg method with adaptive choice of the time step.
Vector
The vector \( \vec u(t) \) in ODE solvers can be represented using different types of containers, depending on the size and nature of the ODE system:
- Static vectors (TNL::Containers::StaticVector): This is suitable for small systems of ODEs with a fixed number of unknowns. Utilizing
StaticVector
allows the ODE solver to be executed within GPU kernels. This capability is particularly useful for scenarios like running multiple sequential solvers in parallel, as in the case of TNL::Algorithms::parallelFor.
- Dynamic vectors (TNL::Containers::Vector or TNL::Containers::VectorView): These are preferred when dealing with large systems of ODEs, such as those arising in the solution of parabolic partial differential equations using the method of lines. In these instances, the solver typically handles a single, large-scale problem that can be executed in parallel internally.
Static ODE solvers
Scalar problem
Static solvers are primarily intended for scenarios where \( x \in R \) is scalar or \( \vec x \in R^n \) is vector with a relatively small dimension. We will demonstrate this through a scalar problem defined as follows:
\[ \frac{d u}{dt} = t \sin ( t )\ \rm{ on }\ (0,T), \]
with the intial condition
\[ u( 0 ) = 0. \]
First, we define the Real
type to represent floating-point arithmetic, here chosen as double
.
Next we define the main function of the solver:
10int
11main( int argc, char* argv[] )
12{
18
20 const Real final_t = 10.0;
21 const Real tau = 0.001;
22 const Real output_time_step = 0.25;
24
26 ODESolver solver;
27 solver.setTau( tau );
28 solver.setTime( 0.0 );
29 Vector u = 0.0;
31
33 while( solver.getTime() < final_t ) {
34 solver.setStopTime( TNL::min( solver.getTime() + output_time_step, final_t ) );
35 auto f = []( const Real& t, const Real& tau, const Vector& u, Vector& fu )
36 {
37 fu = t * sin( t );
38 };
39 solver.solve( u, f );
41 }
43}
Vector with constant size.
Definition StaticVector.h:19
First order Euler method.
Definition Euler.h:15
Integrator or solver of systems of ordinary differential equations.
Definition ODESolver.h:61
We begin the main function by defining necessary types:
We set Vector
as StaticVector
with a size of one. We also choose the basic Euler method as the Method
. Finally, we construct the ODESolver
type.
Next we define the time-related variables:
20 const Real final_t = 10.0;
21 const Real tau = 0.001;
22 const Real output_time_step = 0.25;
Here:
final_t
represents the size of the time interval \( (0,T)\).
tau
is the integration time step.
output_time_step
denotes checkpoints in which we will print value of the solution \( u(t)\).
Next, we initialize the solver:
26 ODESolver solver;
27 solver.setTau( tau );
28 solver.setTime( 0.0 );
29 Vector u = 0.0;
We create an instance of the ODESolver
, set the integration time step (using setTau
) and the initial time of the solver with setTime
. Then, we initialize the variable u
(representing the ODE solution) to the initial state \( u(0) = 0\).
We proceed to the main loop iterating over the time interval \( (0,T) \):
33 while( solver.getTime() < final_t ) {
34 solver.setStopTime( TNL::min( solver.getTime() + output_time_step, final_t ) );
35 auto f = []( const Real& t, const Real& tau, const Vector& u, Vector& fu )
36 {
37 fu = t * sin( t );
38 };
39 solver.solve( u, f );
41 }
We iterate with the time variable \( t \) (represented by getTime
) until the time \( T \) (represented by final_t
) with step given by the frequency of the checkpoints (represented by output_time_steps
). We let the solver to iterate until the next checkpoint or the end of the interval \((0,T) \) depending on what occurs first (it is expressed by TNL::min(solver.getTime()+output_time_step,final_t)
). The lambda function f
represents the right-hand side \( f \) of the ODE being solved. The lambda function receives the following arguments:
t
is the current value of the time variable \( t \in (0,T)\),
tau
is the current integration time step,
u
is the current value of the solution \( u(t)\),
fu
is a reference on a variable into which we evaluate the right-hand side \( f(u,t) \) on the ODE.
The lambda function is supposed to compute just the value of fu
. It is fu=t*sin(t)
in our case. Finally we call the ODE solver (solver.solve(u,f)
). As parameters, we pass the variable u
representing the solution \( u(t)\) and a lambda function representing the right-hand side of the ODE. At the end, we print values of the solution at given checkpoints.
The complete example looks as:
2#include <TNL/Solvers/ODE/ODESolver.h>
3#include <TNL/Solvers/ODE/Methods/Euler.h>
11main(
int argc,
char* argv[] )
20 const Real final_t = 10.0;
21 const Real tau = 0.001;
22 const Real output_time_step = 0.25;
28 solver.setTime( 0.0 );
33 while( solver.getTime() < final_t ) {
34 solver.setStopTime( TNL::min( solver.getTime() + output_time_step, final_t ) );
35 auto f = [](
const Real& t,
const Real& tau,
const Vector& u, Vector& fu )
Vector with constant size.
Definition StaticVector.h:19
First order Euler method.
Definition Euler.h:15
Integrator or solver of systems of ordinary differential equations.
Definition ODESolver.h:61
The output is as follows:
0.25 0.00514497
0.5 0.0405145
0.75 0.132617
1 0.300748
1.25 0.554239
1.5 0.890641
1.75 1.29506
2 1.74068
2.25 2.19059
2.5 2.60058
2.75 2.92297
3 3.11089
3.25 3.1229
3.5 2.92743
3.75 2.50661
4 1.85929
4.25 1.00278
4.5 -0.0267497
4.75 -1.17553
5 -2.37484
5.25 -3.54513
5.5 -4.60128
5.75 -5.45867
6 -6.0396
6.25 -6.27963
6.5 -6.1334
6.75 -5.57927
7 -4.62263
7.25 -3.29735
7.5 -1.66528
7.75 0.186339
8 2.1494
8.25 4.10122
8.5 5.91219
8.75 7.45439
9 8.61044
9.25 9.28216
9.5 9.39834
9.75 8.92094
10 7.84941
These results can be visualized using several methods. One option is to use Gnuplot. The Gnuplot command to plot the data is:
plot 'StaticODESolver-SineExample.out' with lines
Alternatively, the data can be processed and visualized using the following Python script, which employs Matplotlib for graphing.
4import matplotlib.pyplot
as plt
9plt.rcParams[
"text.usetex"] =
True
13f = open(sys.argv[1],
"r")
19 x_lst.append(float(a[0]))
20 y_lst.append(float(a[1]))
29fig, ax = plt.subplots()
32ax.plot(x, y, linewidth=2.0)
34ax.set_ylabel(
"$u(t)$")
35plt.savefig(sys.argv[2])
The graph depicting the solution of the scalar ODE problem is illustrated below:
Lorenz system
In this example, we demonstrate the application of the static ODE solver in solving a system of ODEs, specifically the Lorenz system. The Lorenz system is a set of three coupled, nonlinear differential equations defined as follows:
\[ \frac{dx}{dt} = \sigma( x - y),\ \rm{ on }\ (0,T), \]
\[ \frac{dy}{dt} = x(\rho - z ) - y,\ \rm{ on }\ (0,T), \]
\[ \frac{dz}{dt} = xy - \beta z,\ \rm{ on }\ (0,T), \]
with the inital condition
\[ \vec u(0) = (x(0),y(0),z(0)) = \vec u_{ini}. \]
Here, \( \sigma, \rho \) and \( \beta \) are given constants. The solution of the system, \( \vec u(t) = (x(t), y(t), z(t)) \in R^3 \) is represented by three-dimensional static vector (TNL::Containers::StaticVector).
The implementation of the solver for the Lorenz system is outlined below:
2#include <TNL/Containers/StaticVector.h>
3#include <TNL/Solvers/ODE/ODESolver.h>
4#include <TNL/Solvers/ODE/Methods/Euler.h>
12main(
int argc,
char* argv[] )
21 const Real final_t = 25.0;
22 const Real tau = 0.001;
23 const Real output_time_step = 0.01;
27 const Real sigma = 10.0;
28 const Real rho = 28.0;
29 const Real beta = 8.0 / 3.0;
35 solver.setTime( 0.0 );
39 Vector u( 1.0, 2.0, 3.0 );
43 while( solver.getTime() < final_t ) {
44 solver.setStopTime( TNL::min( solver.getTime() + output_time_step, final_t ) );
46 auto f = [ = ](
const Real& t,
const Real& tau,
const Vector& u, Vector& fu )
48 const Real& x = u[ 0 ];
49 const Real& y = u[ 1 ];
50 const Real& z = u[ 2 ];
51 fu[ 0 ] = sigma * ( y - x );
52 fu[ 1 ] = rho * x - y - x * z;
53 fu[ 2 ] = -beta * z + x * y;
This code shares similarities with the previous example, with the following key differences:
- We define the type
Vector
of the variable u
representing the solution \( \vec u(t) \) as TNL::Containers::StaticVector< 3, Real >, i.e. static vector with size of three.
- Alongside the solver parameters (
final_t
, tau
and output_time_step
) we define the Lorenz system's parameters (sigma
, rho
and beta
). 27 const Real sigma = 10.0;
28 const Real rho = 28.0;
29 const Real beta = 8.0 / 3.0;
- The initial condition is \( \vec u(0) = (1,2,3) \).
39 Vector u( 1.0, 2.0, 3.0 );
- In the lambda function, which represents the right-hand side of the Lorenz system, auxiliary aliases
x
, y
, and z
are defined for readability. The main computation for the right-hand side of the system is implemented subsequently. 46 auto f = [ = ]( const Real& t, const Real& tau, const Vector& u, Vector& fu )
47 {
48 const Real& x = u[ 0 ];
49 const Real& y = u[ 1 ];
50 const Real& z = u[ 2 ];
51 fu[ 0 ] = sigma * ( y - x );
52 fu[ 1 ] = rho * x - y - x * z;
53 fu[ 2 ] = -beta * z + x * y;
54 };
- In the remaining part of the time loop, we simply execute the solver, allowing it to evolve the solution until the next snapshot time. At this point, we also print the current state of the solution to the terminal.
The solver generates a file containing the solution values \( (\sigma(i \tau), \rho( i \tau), \beta( i \tau )) \) for each time step, where \( i = 0, 1, \ldots N \). These values are recorded on separate lines. The content of the output file is structured as follows:
sigma[ 0 ] rho[ 0 ] beta[ 0 ]
sigma[ 1 ] rho[ 1 ] beta[ 1 ]
sigma[ 2 ] rho[ 2 ] beta[ 2 ]
...
The output file generated by the solver can be visualized in various ways. One effective method is to use Gnuplot, which allows for interactive 3D visualization. The Gnuplot command for plotting the data in 3D is:
splot 'StaticODESolver-LorenzExample.out' with lines
Alternatively, the data can also be processed using the following Python script:
4import matplotlib.pyplot
as plt
9plt.rcParams[
"text.usetex"] =
True
13f = open(sys.argv[1],
"r")
20 x_lst.append(float(a[0]))
21 y_lst.append(float(a[1]))
22 z_lst.append(float(a[2]))
32fig, ax = plt.subplots()
33ax.plot(x, y, label=
"Lorenz attractor")
35plt.savefig(sys.argv[2])
This script is structured similarly to the one in the previous example. It processes the output data and creates a visual representation of the solution.
The resultant visualization of the Lorenz problem is shown below:
Combining static ODE solvers with parallel for
Static solvers can be effectively utilized within lambda functions in conjunction with TNL::Algorithms::parallelFor. This approach is particularly beneficial when there's a need to solve a large number of independent ODE problems, such as in parametric analysis scenarios. We will demonstrate this application using the two examples previously described.
Solving scalar problems in parallel
The first example addresses an ODE defined by the following equation
\[ \frac{d u}{dt} = t \sin ( c t )\ \rm{ on }\ (0,T), \]
and the initial condition
\[ u( 0 ) = 0, \]
where \( c \) is a constant. We aim to solve this ODE in parallel for a range of values \( c \in \langle c_{min}, c_{max} \rangle \). The exact solution for this equation is available here. The implementation for this parallel computation is detailed in the following code:
3#include <TNL/Solvers/ODE/ODESolver.h>
4#include <TNL/Solvers/ODE/Methods/Euler.h>
5#include <TNL/Containers/Vector.h>
6#include <TNL/Algorithms/parallelFor.h>
10template<
typename Device >
12solveParallelODEs(
const char* file_name )
22 const Real final_t = 10.0;
23 const Real tau = 0.001;
24 const Real output_time_step = 0.1;
28 const Real c_min = 1.0;
29 const Real c_max = 5.0;
30 const int c_vals = 5.0;
31 const Real c_step = ( c_max - c_min ) / ( c_vals - 1 );
35 const int output_time_steps = ceil( final_t / output_time_step ) + 1;
36 Vector results( output_time_steps * c_vals, 0.0 );
37 auto results_view = results.getView();
41 auto f = [ = ]
__cuda_callable__(
const Real& t,
const Real& tau,
const StaticVector& u, StaticVector& fu,
const Real& c )
43 fu = t * sin( c * t );
51 const Real c = c_min + idx * c_step;
54 solver.setTime( 0.0 );
57 results_view[ idx ] = u;
61 while( time_step < output_time_steps ) {
62 solver.setStopTime( TNL::min( solver.getTime() + output_time_step, final_t ) );
63 solver.solve( u, f, c );
64 results_view[ time_step++ * c_vals + idx ] = u;
73 file.
open( file_name, std::ios::out );
74 for(
int i = 0; i < c_vals; i++ ) {
75 file <<
"# c = " << c_min + i * c_step <<
std::endl;
76 for(
int k = 0; k < output_time_steps; k++ )
77 file << k * output_time_step <<
" " << results.getElement( k * c_vals + i )[ 0 ] <<
std::endl;
84main(
int argc,
char* argv[] )
91 file_name +=
"/StaticODESolver-SineParallelExample-result.out";
92 solveParallelODEs< TNL::Devices::Host >( file_name.getString() );
94 solveParallelODEs< TNL::Devices::Cuda >( file_name.getString() );
#define __cuda_callable__
Definition Macros.h:49
Vector extends Array with algebraic operations.
Definition Vector.h:36
Class for managing strings.
Definition String.h:30
std::enable_if_t< std::is_integral_v< Begin > &&std::is_integral_v< End > > parallelFor(const Begin &begin, const End &end, typename Device::LaunchConfiguration launch_config, Function f, FunctionArgs... args)
Parallel for-loop function for 1D range specified with integral values.
Definition parallelFor.h:41
In this example, we demonstrate how to execute the ODE solver on a GPU. To facilitate this, the main solver logic has been encapsulated within a separate function, solveParallelODEs
, which accepts a template parameter Device
indicating the target device for execution. The results from individual ODE solutions are stored in memory and eventually written to a file named file_name
. The variable \( u \), being scalar, is represented by the type TNL::Containers::StaticVector<1,Real>
within the solver.
Next, we define the parameters of the ODE solver (final_t
, tau
and output_time_step
) as shown:
22 const Real final_t = 10.0;
23 const Real tau = 0.001;
24 const Real output_time_step = 0.1;
The interval for the ODE parameter \( c \in \langle c_{min}, c_{max} \rangle \) ( c_min
, c_max
, ) is established, along with the number of values c_vals
distributed equidistantly in the interval, determined by the step size c_step
.
28 const Real c_min = 1.0;
29 const Real c_max = 5.0;
30 const int c_vals = 5.0;
31 const Real c_step = ( c_max - c_min ) / ( c_vals - 1 );
We use the range of different c values
as the range for the parallelFor
loop. This loop processes the lambda function solve
. Before diving into the main lambda function, we allocate the vector results
for storing the ODE problem results at different time levels. As data cannot be directly written from GPU to an output file, results
serves as intermediary storage. Additionally, to enable GPU access, we prepare the vector view results_view
.
35 const int output_time_steps = ceil( final_t / output_time_step ) + 1;
36 Vector results( output_time_steps * c_vals, 0.0 );
37 auto results_view = results.getView();
Proceeding to the main lambda function solve
:
49 {
51 const Real c = c_min + idx * c_step;
52 ODESolver solver;
53 solver.setTau( tau );
54 solver.setTime( 0.0 );
55 StaticVector u = 0.0;
56 int time_step( 1 );
57 results_view[ idx ] = u;
59
61 while( time_step < output_time_steps ) {
62 solver.setStopTime( TNL::min( solver.getTime() + output_time_step, final_t ) );
63 solver.solve( u, f, c );
64 results_view[ time_step++ * c_vals + idx ] = u;
65 }
67 };
This function receives idx
, the index of the value of the parameter c
. After calculating c
, we create the ODE solver solver
and set its parameters using setTau
and setTime
. We also set the initial condition of the ODE and define the variable time_step
to count checkpoints, which are stored in memory using results_view
.
51 const Real c = c_min + idx * c_step;
52 ODESolver solver;
53 solver.setTau( tau );
54 solver.setTime( 0.0 );
55 StaticVector u = 0.0;
56 int time_step( 1 );
57 results_view[ idx ] = u;
In the time loop, we iterate over the interval \( (0, T) \), setting the solver’s stop time with setStopTime
and running the solver with solve
.
61 while( time_step < output_time_steps ) {
62 solver.setStopTime( TNL::min( solver.getTime() + output_time_step, final_t ) );
63 solver.solve( u, f, c );
64 results_view[ time_step++ * c_vals + idx ] = u;
65 }
Each checkpoint's result is then stored in the results_view
.
73 file.
open( file_name, std::ios::out );
74 for( int i = 0; i < c_vals; i++ ) {
75 file <<
"# c = " << c_min + i * c_step <<
std::endl;
76 for( int k = 0; k < output_time_steps; k++ )
77 file << k * output_time_step <<
" " << results.getElement( k * c_vals + i )[ 0 ] <<
std::endl;
79 }
It's important to note how the parameter c
is passed to the lambda function f
. The solve
method of ODE solvers accepts user-defined parameters through variadic templates. This means additional parameters like c
can be included alongside u
and the right-hand side f
, and are accessible within the lambda function f
.
41 auto f = [ = ]
__cuda_callable__(
const Real& t,
const Real& tau,
const StaticVector& u, StaticVector& fu,
const Real& c )
42 {
43 fu = t * sin( c * t );
44 };
Due to limitations of the nvcc
compiler, which does not accept lambda functions defined within another lambda function, the f
lambda function cannot be defined inside the solve
lambda function. Therefore, c
, defined in solve
, cannot be captured by f
.
The solver outputs a file in the following format:
# c = c[ 0 ]
x[ 0 ] u( c[ 0 ], x[ 0 ] )
x[ 1 ] u( c[ 0 ], x[ 1 ] )
....
# c = c[ 1 ]
x[ 0 ] u( c[ 1 ], x[ 0 ] )
x[ 1 ] u( c[ 1 ], x[ 1 ] )
...
The file can be visuallized using Gnuplot as follows
splot 'StaticODESolver-SineParallelExample-result.out' with lines
or it can be processed by the following Python script:
4import matplotlib.pyplot
as plt
9plt.rcParams[
"text.usetex"] =
True
13f = open(sys.argv[1],
"r")
28 parameters.append(current_c)
29 u_data.append(np.array(u_lst))
31 x_data.append(np.array(x_lst))
34 current_c = float(a[3])
37 x_lst.append(float(a[0]))
38 u_lst.append(float(a[1]))
39parameters.append(current_c)
40u_data.append(np.array(u_lst))
45fig, ax = plt.subplots(1, n, figsize=(15, 3), sharey=
True)
48 ax[idx].plot(x_data[0], u, linewidth=2.0, label=f
"$c={parameters[idx]}$")
49 ax[idx].set_xlabel(
"t")
50 ax[idx].set_ylabel(
"u(t)")
52plt.savefig(sys.argv[2])
The result of this example looks as follows:
Solving the Lorenz system in parallel
The second example is a parametric analysis of the Lorenz model
\[ \frac{dx}{dt} = \sigma( x - y),\ \rm{ on }\ (0,T) \]
\[ \frac{dy}{dt} = x(\rho - z ) - y,\ \rm{ on }\ (0,T) \]
\[ \frac{dz}{dt} = xy - \beta z,\ \rm{ on }\ (0,T) \]
with the initial condition
\[ \vec u(0) = (x(0),y(0),z(0)) = \vec u_{ini}. \]
We solve it for different values of the model parameters:
\[ \sigma_i = \sigma_{min} + i \Delta \sigma, \]
\[ \rho_j = \rho_{min} + j \Delta \rho, \]
\[ \beta_k = \beta_{min} + k \Delta \beta, \]
where we set \(( \Delta \sigma = \Delta \rho = \Delta \beta = l / (p-1) \)) and \(( i,j,k = 0, 1, \ldots, p - 1 \)). The code of the solver looks as follows:
3#include <TNL/Containers/Vector.h>
4#include <TNL/Algorithms/parallelFor.h>
5#include <TNL/Containers/StaticVector.h>
6#include <TNL/Solvers/ODE/ODESolver.h>
7#include <TNL/Solvers/ODE/Methods/Euler.h>
14template<
typename Device >
16solveParallelODEs(
const char* file_name )
25 const Real final_t = 50.0;
26 const Real tau = 0.001;
27 const Real output_time_step = 0.005;
31 const Real sigma_min( 10.0 ), rho_min( 15.0 ), beta_min( 1.0 );
32 const int parametric_steps = 4;
33 const Real sigma_step = 30.0 / ( parametric_steps - 1 );
34 const Real rho_step = 21.0 / ( parametric_steps - 1 );
35 const Real beta_step = 15.0 / ( parametric_steps - 1 );
36 const int output_time_steps = ceil( final_t / output_time_step ) + 1;
40 const int results_size( output_time_steps * parametric_steps * parametric_steps * parametric_steps );
42 auto results_view = results.getView();
48 const StaticVector& u,
54 const Real& x = u[ 0 ];
55 const Real& y = u[ 1 ];
56 const Real& z = u[ 2 ];
57 fu[ 0 ] = sigma_i * ( y - x );
58 fu[ 1 ] = rho_j * x - y - x * z;
59 fu[ 2 ] = -beta_k * z + x * y;
67 const Real sigma_i = sigma_min + idx[ 0 ] * sigma_step;
68 const Real rho_j = rho_min + idx[ 1 ] * rho_step;
69 const Real beta_k = beta_min + idx[ 2 ] * beta_step;
75 solver.setTime( 0.0 );
76 StaticVector u( 1.0, 1.0, 1.0 );
78 results_view[ ( idx[ 0 ] * parametric_steps + idx[ 1 ] ) * parametric_steps + idx[ 2 ] ] = u;
82 while( time_step < output_time_steps ) {
83 solver.setStopTime( TNL::min( solver.getTime() + output_time_step, final_t ) );
85 solver.solve( u, f, sigma_i, rho_j, beta_k );
88 ( ( time_step++ * parametric_steps + idx[ 0 ] ) * parametric_steps + idx[ 1 ] ) * parametric_steps + idx[ 2 ];
89 results_view[ k ] = u;
96 const MultiIndex
begin = { 0, 0, 0 };
97 const MultiIndex
end = { parametric_steps, parametric_steps, parametric_steps };
103 file.
open( file_name, std::ios::out );
104 for(
int sigma_idx = 0; sigma_idx < parametric_steps; sigma_idx++ )
105 for(
int rho_idx = 0; rho_idx < parametric_steps; rho_idx++ )
106 for(
int beta_idx = 0; beta_idx < parametric_steps; beta_idx++ ) {
107 Real sigma = sigma_min + sigma_idx * sigma_step;
108 Real rho = rho_min + rho_idx * rho_step;
109 Real beta = beta_min + beta_idx * beta_step;
110 file <<
"# sigma " << sigma <<
" rho " << rho <<
" beta " << beta <<
std::endl;
111 for(
int i = 0; i < output_time_steps - 1; i++ ) {
112 int offset = ( ( i * parametric_steps + sigma_idx ) * parametric_steps + rho_idx ) * parametric_steps + beta_idx;
113 auto u = results.getElement( offset );
114 file << u[ 0 ] <<
" " << u[ 1 ] <<
" " << u[ 2 ] <<
std::endl;
122main(
int argc,
char* argv[] )
129 file_name +=
"/StaticODESolver-LorenzParallelExample-result.out";
131 solveParallelODEs< TNL::Devices::Host >( file_name.getString() );
133 solveParallelODEs< TNL::Devices::Cuda >( file_name.getString() );
Array with constant size.
Definition StaticArray.h:20
It is very similar to the previous one. There are just the following changes:
Since we are analysing dependence on three parameters, we involve a three-dimensional parallelFor
. For this, we introduce a type for a three-dimensional multiindex.
We define minimal values for the parameters \( \sigma \in [10,40], \beta \in [15, 36] \) and \( \rho \in [1,16]\) and set the number of different values for each parameter by parametric_steps
. The size of equidistant steps (sigma_steps
, rho_steps
and beta_steps
) in the parameter variations is also set.
31 const Real sigma_min( 10.0 ), rho_min( 15.0 ), beta_min( 1.0 );
32 const int parametric_steps = 4;
33 const Real sigma_step = 30.0 / ( parametric_steps - 1 );
34 const Real rho_step = 21.0 / ( parametric_steps - 1 );
35 const Real beta_step = 15.0 / ( parametric_steps - 1 );
36 const int output_time_steps = ceil( final_t / output_time_step ) + 1;
Next, we allocate vector results
for storing the solution of the Lorenz problem for various parameters.
40 const int results_size( output_time_steps * parametric_steps * parametric_steps * parametric_steps );
42 auto results_view = results.getView();
We define the lambda function f
for the right-hand side of the Lorenz problem and the lambda function solve
for the ODE solver, with a specific setup of the parameters.
47 const Real& tau,
48 const StaticVector& u,
49 StaticVector& fu,
50 const Real& sigma_i,
51 const Real& rho_j,
52 const Real& beta_k )
53 {
54 const Real& x = u[ 0 ];
55 const Real& y = u[ 1 ];
56 const Real& z = u[ 2 ];
57 fu[ 0 ] = sigma_i * ( y - x );
58 fu[ 1 ] = rho_j * x - y - x * z;
59 fu[ 2 ] = -beta_k * z + x * y;
60 };
and the lambda function solve
representing the ODE solver for the Lorenz problem
65 {
67 const Real sigma_i = sigma_min + idx[ 0 ] * sigma_step;
68 const Real rho_j = rho_min + idx[ 1 ] * rho_step;
69 const Real beta_k = beta_min + idx[ 2 ] * beta_step;
71
73 ODESolver solver;
74 solver.setTau( tau );
75 solver.setTime( 0.0 );
76 StaticVector u( 1.0, 1.0, 1.0 );
77 int time_step( 1 );
78 results_view[ ( idx[ 0 ] * parametric_steps + idx[ 1 ] ) * parametric_steps + idx[ 2 ] ] = u;
80
82 while( time_step < output_time_steps ) {
83 solver.setStopTime( TNL::min( solver.getTime() + output_time_step, final_t ) );
85 solver.solve( u, f, sigma_i, rho_j, beta_k );
87 const int k =
88 ( ( time_step++ * parametric_steps + idx[ 0 ] ) * parametric_steps + idx[ 1 ] ) * parametric_steps + idx[ 2 ];
89 results_view[ k ] = u;
90 }
92 };
with setup of the parameters
73 ODESolver solver;
74 solver.setTau( tau );
75 solver.setTime( 0.0 );
76 StaticVector u( 1.0, 1.0, 1.0 );
77 int time_step( 1 );
78 results_view[ ( idx[ 0 ] * parametric_steps + idx[ 1 ] ) * parametric_steps + idx[ 2 ] ] = u;
The solve
lambda function is executed using a three-dimensional parallelFor
(TNL::Algorithms::parallelFor). We define multi-indexes begin
and end
for this purpose:
96 const MultiIndex
begin = { 0, 0, 0 };
97 const MultiIndex
end = { parametric_steps, parametric_steps, parametric_steps };
- The lambda function
solve
takes a multi-index idx
, which is used to compute specific values for \( \sigma_i, \rho_j, \beta_k \), denoted as sigma_i, rho_j, and beta_k: 67 const Real sigma_i = sigma_min + idx[ 0 ] * sigma_step;
68 const Real rho_j = rho_min + idx[ 1 ] * rho_step;
69 const Real beta_k = beta_min + idx[ 2 ] * beta_step;
These parameters must be explicitly passed to the lambda function f
. This necessity arises due to the nvcc
compiler's limitation of not accepting a lambda function defined within another lambda function, as mentioned before: 85 solver.solve( u, f, sigma_i, rho_j, beta_k );
The initial condition for the Lorenz problem is set to vector \( (1,1,1) \):
73 ODESolver solver;
74 solver.setTau( tau );
75 solver.setTime( 0.0 );
76 StaticVector u( 1.0, 1.0, 1.0 );
77 int time_step( 1 );
78 results_view[ ( idx[ 0 ] * parametric_steps + idx[ 1 ] ) * parametric_steps + idx[ 2 ] ] = u;
Subsequently, we initiate the time loop. Within this loop, we store the state of the solution in the vector view results_view
at intervals defined by output_time_step
:
82 while( time_step < output_time_steps ) {
83 solver.setStopTime( TNL::min( solver.getTime() + output_time_step, final_t ) );
85 solver.solve( u, f, sigma_i, rho_j, beta_k );
87 const int k =
88 ( ( time_step++ * parametric_steps + idx[ 0 ] ) * parametric_steps + idx[ 1 ] ) * parametric_steps + idx[ 2 ];
89 results_view[ k ] = u;
90 }
Upon solving all ODEs, we transfer all solutions from the vector results
to an output file:
103 file.
open( file_name, std::ios::out );
104 for( int sigma_idx = 0; sigma_idx < parametric_steps; sigma_idx++ )
105 for( int rho_idx = 0; rho_idx < parametric_steps; rho_idx++ )
106 for( int beta_idx = 0; beta_idx < parametric_steps; beta_idx++ ) {
107 Real sigma = sigma_min + sigma_idx * sigma_step;
108 Real rho = rho_min + rho_idx * rho_step;
109 Real beta = beta_min + beta_idx * beta_step;
110 file <<
"# sigma " << sigma <<
" rho " << rho <<
" beta " << beta <<
std::endl;
111 for( int i = 0; i < output_time_steps - 1; i++ ) {
112 int offset = ( ( i * parametric_steps + sigma_idx ) * parametric_steps + rho_idx ) * parametric_steps + beta_idx;
113 auto u = results.getElement( offset );
114 file << u[ 0 ] <<
" " << u[ 1 ] <<
" " << u[ 2 ] <<
std::endl;
115 }
117 }
The output file has the following format:
# sigma = c[ 0 ] rho = rho[ 0 ] beta = beta[ 0 ]
x[ 0 ] u( sigma[ 0 ], rho[ 0 ], beta[ 0 ], x[ 0 ] )
x[ 1 ] u( sigma[ 0 ], rho[ 0 ], beta[ 0 ], x[ 1 ] )
....
# sigma = c[ 1 ] rho = rho[ 1 ] beta = beta[ 1 ]
x[ 0 ] u( sigma[ 1 ], rho[ 1 ], beta[ 1 ], x[ 0 ] )
x[ 1 ] u( sigma[ 1 ], rho[ 1 ], beta[ 1 ], x[ 1 ] )
...
It can be processed by the following Python script:
4import matplotlib.pyplot
as plt
6from mpl_toolkits.mplot3d
import axes3d, Axes3D
8plt.rcParams[
"text.usetex"] =
True
12f = open(sys.argv[1],
"r")
35 parameters_tuple = (current_sigma, current_rho, current_beta)
36 parameters.append(parameters_tuple)
37 data_tuple = (np.array(x_lst), np.array(y_lst), np.array(z_lst))
38 data[parameters_tuple] = data_tuple
43 current_sigma = float(a[2])
44 current_rho = float(a[4])
45 current_beta = float(a[6])
46 if current_sigma
not in sigma_lst:
47 sigma_lst.append(current_sigma)
48 if current_rho
not in rho_lst:
49 rho_lst.append(current_rho)
50 if current_beta
not in beta_lst:
51 beta_lst.append(current_beta)
53 x_lst.append(float(a[0]))
54 y_lst.append(float(a[1]))
55 z_lst.append(float(a[2]))
56parameters_tuple = (current_sigma, current_rho, current_beta)
57parameters.append(parameters_tuple)
58data_tuple = (np.array(x_lst), np.array(y_lst), np.array(z_lst))
59data[parameters_tuple] = data_tuple
63sigma_n = len(sigma_lst)
65for sigma
in sigma_lst:
67 beta_n = len(beta_lst)
69 fig, ax = plt.subplots(rho_n, beta_n, figsize=(8, 8), sharey=
True, sharex=
True)
70 fig.suptitle(f
"$\sigma={sigma}$")
76 parameters_tuple = (sigma, rho, beta)
77 data_tuple = data[parameters_tuple]
78 ax[rho_idx, beta_idx].plot(data_tuple[1], data_tuple[2], linewidth=1.0)
80 ax[rho_idx, beta_idx].set_ylabel(f
"$\\rho={rho}$")
81 if rho_idx == rho_n - 1:
82 ax[rho_idx, beta_idx].set_xlabel(f
"$\\beta={beta}$")
83 beta_idx = beta_idx + 1
87 plt.savefig(f
"{sys.argv[2]}-{sigma_idx}.png")
88 sigma_idx = sigma_idx + 1
The results are visualized in the following images:
Dynamic ODE Solvers
In this section, we demonstrate how to solve the simple 1D heat equation, a parabolic partial differential equation expressed as:
\[
\frac{\partial u(t,x)}{\partial t} - \frac{\partial^2 u(t,x)}{\partial^2 x} = 0\ \rm{on}\ (0,T) \times (0,1),
\]
with boundary conditions
\[
u(t,0) = 0,
\]
\[
u(t,0) = 1,
\]
and initial condition
\[
u(0,x) = u_{ini}(x)\ \rm{on}\ [0,1].
\]
We discretize the equation by the finite difference method for numerical approximation. First, we define set of nodes \( x_i = ih \) for \(i=0,\ldots n-1 \) where \(h = 1 / (n-1) \) (adopting C++ indexing for consistency). Employing the method of lines and approximating the second derivative by the central finite diference
\[ \frac{\partial^2 u(t,x)}{\partial^2 x} \approx \frac{u_{i-1} - 2 u_i + u_{i+1}}{h^2}, \]
,
we derive system of ODEs:
\[
\frac{\rm{d}u_i(t)}{\rm{d} t} = \frac{u_{i-1} - 2 u_i + u_{i+1}}{h^2}\ \rm{for}\ i = 1, \ldots, n-2,
\]
where \( u_i(t) = u(t,ih) \) represents the function value at node \( i \) and \( h \) is the spatial step between two adjacent nodes of the numerical mesh. The boundary conditions are set as:
\[
u_0(t) = u_{n-1}(t) = 0\ \rm{on}\ [0,T].
\]
What are the main differences compared to the Lorenz model?
- System Size and Vector Representation:
- The Lorenz model has a fixed size of three, representing the parameters \( (x, y, z) \) in \( R^3 \). This small, fixed-size vector can be efficiently represented using a static vector, specifically TNL::Containers::StaticVector< 3, Real >.
- In contrast, the size of the ODE system for the heat equation, as determined by the method of lines, varies based on the desired accuracy. The greater the value of \( n \), the more accurate the numerical approximation. The number of nodes \( n \) for spatial discretization defines the number of parameters or degrees of freedom, DOFs. Consequently, the system size can be large, necessitating the use of a dynamic vector, TNL::Containers::Vector, for solutions.
- Evaluation Approach and Parallelization:
- Due to its small size, the Lorenz model's right-hand side can be evaluated sequentially by a single thread.
- However, the ODE system for the heat equation can be very large, requiring parallel computation to efficiently evaluate its right-hand side.
- Data Allocation and Execution Environment:
- The dynamic vector TNL::Containers::Vector allocates data dynamically, which precludes its creation within a GPU kernel. As a result, ODE solvers cannot be instantiated within a GPU kernel.
- Therefore, the lambda function
f
, which evaluates the right-hand side of the ODE system, is executed on the host. It utilizes TNL::Algorithms::parallelFor to facilitate the parallel evaluation of the system's right-hand side.
Basic setup
The implementation of the solver for the heat equation is detailed in the following way:
2#include <TNL/FileName.h>
3#include <TNL/Containers/Vector.h>
4#include <TNL/Solvers/ODE/ODESolver.h>
5#include <TNL/Solvers/ODE/Methods/Euler.h>
11template<
typename Device >
13solveHeatEquation(
const char* file_name )
17 using VectorView =
typename Vector::ViewType;
26 const Real final_t = 0.05;
27 const Real output_time_step = 0.005;
29 const Real h = 1.0 / ( n - 1 );
30 const Real tau = 0.1 * h * h;
31 const Real h_sqr_inv = 1.0 / ( h * h );
43 if( x >= 0.4 && x <= 0.6 )
49 file.
open( file_name, std::ios::out );
50 write( file, u, n, h, (Real) 0.0 );
59 solver.setTime( 0.0 );
66 while( solver.getTime() < final_t ) {
67 solver.setStopTime( TNL::min( solver.getTime() + output_time_step, final_t ) );
69 auto f = [ = ]
__cuda_callable__( Index i,
const VectorView& u, VectorView& fu )
mutable
71 if( i == 0 || i == n - 1 )
74 fu[ i ] = h_sqr_inv * ( u[ i - 1 ] - 2.0 * u[ i ] + u[ i + 1 ] );
78 auto time_stepping = [ = ](
const Real& t,
const Real& tau,
const VectorView& u, VectorView& fu )
85 solver.solve( u, time_stepping );
86 write( file, u, n, h, solver.getTime() );
92main(
int argc,
char* argv[] )
99 file_name +=
"/ODESolver-HeatEquationExample-result.out";
101 solveHeatEquation< TNL::Devices::Host >( file_name.getString() );
103 solveHeatEquation< TNL::Devices::Cuda >( file_name.getString() );
The solver is encapsulated within the function solveHeatEquation
, which includes a template parameter Device
. This parameter specifies the device (e.g., CPU or GPU) on which the solver will execute. The implementation begins with defining necessary types:
17 using VectorView = typename Vector::ViewType;
Vector
is alias for TNL::Containers::Vector. The choice of a dynamic vector (rather than a static vector like TNL::Containers::StaticVector) is due to the potentially large number of degrees of freedom (DOFs) that need to be stored in a resizable and dynamically allocated vector.
VectorView
is designated for creating vector view, which is essential for accessing data within lambda functions, especially when these functions are executed on the GPU.
Method
defines the numerical method for time integration.
ODESolver
is the type of the ODE solver that will be used for calculating the time evolution in the time-dependent heat equation.
After defining these types, the next step in the implementation is to establish the parameters of the discretization:
26 const Real final_t = 0.05;
27 const Real output_time_step = 0.005;
28 const Index n = 41;
29 const Real h = 1.0 / ( n - 1 );
30 const Real tau = 0.1 * h * h;
31 const Real h_sqr_inv = 1.0 / ( h * h );
final_t
represents the length of the time interval \( [0,T] \).
output_time_step
defines the time intervals at which the solution \( u \) will be written into a file.
n
stands for the number of DOFs, i.e. number of nodes used for the finite difference method.
h
is the space step, meaning the distance between two consecutive nodes.
tau
represents the time step used for integration by the ODE solver. For a second order parabolic problem like this, the time step should be proportional to \( h^2 \).
h_sqr_inv
is an auxiliary constant equal to \( 1/h^2 \). It is used later in the finite difference method for approximating the second derivative.
The initial condition \( u_{ini} \) is set as:
\[
u_{ini}(x) = \left\{
\begin{array}{rl}
0 & \rm{for}\ x < 0.4, \\
1 & \rm{for}\ 0.4 \leq x \leq 0.6, \\
0 & \rm{for}\ x > 0. \\
\end{array}
\right.
\]
This initial condition defines the state of the system at the beginning of the simulation. It specifies a region (between 0.4 and 0.6) where the temperature (or the value of \( u \)) is set to 1, and outside this region, the temperature is 0.
38 Vector u( n );
39 u.forAllElements(
41 {
42 const Real x = i * h;
43 if( x >= 0.4 && x <= 0.6 )
44 value = 1.0;
45 else
46 value = 0.0;
47 } );
49 file.
open( file_name, std::ios::out );
50 write( file, u, n, h, (Real) 0.0 );
After setting the initial condition, the next step is to write it to a file. This is done using the write
function, which is detailed later.
Next, we create an instance of the ODE solver solver
and set the integration time step tau
of the solver (TNL::Solvers::ODE::ExplicitSolver::setTau ) The initial time is set to zero with (TNL::Solvers::ODE::ExplicitSolver::setTime).
57 ODESolver solver;
58 solver.setTau( tau );
59 solver.setTime( 0.0 );
Finally, we proceed to the time loop:
66 while( solver.getTime() < final_t ) {
67 solver.setStopTime( TNL::min( solver.getTime() + output_time_step, final_t ) );
69 auto f = [ = ]
__cuda_callable__( Index i,
const VectorView& u, VectorView& fu )
mutable
70 {
71 if( i == 0 || i == n - 1 )
72 fu[ i ] = 0.0;
73 else
74 fu[ i ] = h_sqr_inv * ( u[ i - 1 ] - 2.0 * u[ i ] + u[ i + 1 ] );
75 };
78 auto time_stepping = [ = ]( const Real& t, const Real& tau, const VectorView& u, VectorView& fu )
79 {
83 };
85 solver.solve( u, time_stepping );
86 write( file, u, n, h, solver.getTime() );
87 }
The time loop is executed, utilizing the time variable from the ODE solver (TNL::Solvers::ODE::ExplicitSolver::getTime). The loop iterates until reaching the end of the specified time interval \( [0, T] \), defined by the variable final_t
. The stop time of the ODE solver is set with TNL::Solvers::ODE::ExplicitSolver::setStopTime, either to the next checkpoint for storing the state of the system or to the end of the time interval, depending on which comes first.
The lambda function f
is defined to express the discretization of the second derivative of \( u \) using the central finite difference and to incorporate the boundary conditions.
69 auto f = [ = ]
__cuda_callable__( Index i,
const VectorView& u, VectorView& fu )
mutable
70 {
71 if( i == 0 || i == n - 1 )
72 fu[ i ] = 0.0;
73 else
74 fu[ i ] = h_sqr_inv * ( u[ i - 1 ] - 2.0 * u[ i ] + u[ i + 1 ] );
75 };
The function receives the following parameters:
i
represents the index of the node, corresponding to the individual ODE derived from the method of lines. It is essential for evaluating the update of \( u_i^k \) to reach the next time level \( u_i^{k+1} \).
u
is a vector view representing the state \( u_i^k \) of the heat equation on the \( k-\)th time level.
fu
is a vector that stores updates or time derivatives in the method of lines which will transition \( u \) to the next time level.
In this implementation, the lambda function f
plays a crucial role in applying the finite difference method to approximate the second derivative and in enforcing the boundary conditions, thereby driving the evolution of the system in time according to the heat equation.
In the lambda function f
, the solution \( u \) remains unchanged on the boundary nodes. Therefore, the function returns zero for these boundary nodes. For the interior nodes, f
evaluates the central difference to approximate the second derivative.
The lambda function time_stepping
is responsible for computing updates for all nodes \( i = 0, \ldots, n-1 \). This computation is performed using TNL::Algorithms::parallelFor, which iterates over all the nodes and calls the function f
for each one. Since the nvcc
compiler does not support lambda functions defined within another lambda function, f
is defined separately, and the parameters u
and fu
are explicitly passed to it.
The ODE solver is executed with solve
. The current state of the heat equation u
and the lambda function f
(controlling the time evolution) are passed to the solve
method. After each iteration, the current state is saved to a file using the write
function.
3template<
typename Vector,
typename Real =
typename Vector::RealType,
typename Index =
typename Vector::IndexType >
5write(
std::fstream& file,
const Vector& u,
const Index n,
const Real& h,
const Real& time )
8 for( Index i = 0; i < n; i++ )
9 file << i * h <<
" " << u.getElement( i ) <<
std::endl;
The write
function is used to write the solution of the heat equation to a file. The specifics of this function, including its parameters and functionality, are detailed in the following:
file
specifies the file into which the solution will be stored.
u
represents the solution of the heat equation at a given time. It can be a vector or vector view.
n
indicates the number of nodes used for approximating the solution.
h
refers to the space step, i.e., the distance between two consecutive nodes.
time
represents the current time of the evolution being computed.
The solver writes the results in a structured format, making it convenient for visualization and analysis:
# time = t[ 0 ]
x[ 0 ] u( t[ 0 ], x[ 0 ] )
x[ 1 ] u( t[ 0 ], x[ 1 ] )
x[ 2 ] u( t[ 0 ], x[ 2 ] )
...
# time = t[ 1 ]
x[ 0 ] u( t[ 1 ], x[ 0 ] )
x[ 1 ] u( t[ 1 ], x[ 1 ] )
x[ 2 ] u( t[ 1 ], x[ 2 ] )
...
This format records the solution of the heat equation at various time steps and spatial nodes, providing a comprehensive view of the system's evolution over time.
The solution can be visualised with Gnuplot using the command:
plot 'ODESolver-HeatEquationExample-result.out' with lines
This command plots the solution as a line graph, offering a visual representation of how the heat equation's solution evolves. The solution can also be parsed and processed in Python, using Matplotlib for visualization. The specifics of this process are detailed in the following script:
4import matplotlib.pyplot
as plt
6from mpl_toolkits.mplot3d
import axes3d, Axes3D
8plt.rcParams[
"text.usetex"] =
True
10f = open(sys.argv[1],
"r")
25 time_lst.append(current_time)
26 u_data.append(np.array(u_lst))
28 x_data.append(np.array(x_lst))
32 current_time = float(a[3])
33 time_lst.append(current_time)
35 x_lst.append(float(a[0]))
36 u_lst.append(float(a[1]))
38 time_lst.append(current_time)
39 u_data.append(np.array(u_lst))
41 x_data.append(np.array(x_lst))
46fig, ax = plt.subplots(1, 1, figsize=(4, 4))
48 ax.plot(x_data[0], u, linewidth=2.0)
50ax.set_ylabel(f
"$ u(x) $")
52plt.savefig(sys.argv[2])
The outcome of the solver, once visualized, is shown as follows:
Setup with a solver monitor
In this section, we'll discuss how to integrate an ODE solver with a solver monitor, as demonstrated in the example:
2#include <TNL/FileName.h>
3#include <TNL/Containers/Vector.h>
4#include <TNL/Solvers/ODE/ODESolver.h>
5#include <TNL/Solvers/ODE/Methods/Euler.h>
6#include <TNL/Solvers/IterativeSolverMonitor.h>
12template<
typename Device >
14solveHeatEquation(
const char* file_name )
17 using VectorView =
typename Vector::ViewType;
24 const Real final_t = 0.05;
25 const Real output_time_step = 0.005;
27 const Real h = 1.0 / ( n - 1 );
28 const Real tau = 0.001 * h * h;
29 const Real h_sqr_inv = 1.0 / ( h * h );
39 if( x >= 0.4 && x <= 0.6 )
45 file.
open( file_name, std::ios::out );
46 write( file, u, n, h, (Real) 0.0 );
53 IterativeSolverMonitorType monitor;
55 monitor.setRefreshRate( 10 );
56 monitor.setVerbose( 1 );
57 monitor.setStage(
"ODE solver stage:" );
65 solver.setTime( 0.0 );
66 solver.setSolverMonitor( monitor );
71 while( solver.getTime() < final_t ) {
72 solver.setStopTime( TNL::min( solver.getTime() + output_time_step, final_t ) );
73 auto f = [ = ]
__cuda_callable__( Index i,
const VectorView& u, VectorView& fu )
mutable
75 if( i == 0 || i == n - 1 )
78 fu[ i ] = h_sqr_inv * ( u[ i - 1 ] - 2.0 * u[ i ] + u[ i + 1 ] );
80 auto time_stepping = [ = ](
const Real& t,
const Real& tau,
const VectorView& u, VectorView& fu )
84 solver.solve( u, time_stepping );
85 write( file, u, n, h, solver.getTime() );
87 monitor.stopMainLoop();
91main(
int argc,
char* argv[] )
98 file_name +=
"/ODESolver-HeatEquationExampleWithMonitor-result.out";
100 solveHeatEquation< TNL::Devices::Host >( file_name.getString() );
102 solveHeatEquation< TNL::Devices::Cuda >( file_name.getString() );
A RAII wrapper for launching the SolverMonitor's main loop in a separate thread.
Definition SolverMonitor.h:133
This setup incorporates a solver monitor into the ODE solver framework, which differs from the previous example in several key ways:
The first difference is the inclusion of a header file TNL/Solvers/IterativeSolverMonitor.h
for the iterative solver monitor. This step is essential for enabling monitoring capabilities within the solver. We have to setup the solver monitor:
53 IterativeSolverMonitorType monitor;
55 monitor.setRefreshRate( 10 );
56 monitor.setVerbose( 1 );
57 monitor.setStage( "ODE solver stage:" );
First, we define the monitor type IterativeSolverMonitorType
and we create an instance of the monitor. A separate thread (monitorThread
) is created for the monitor. The refresh rate of the monitor is set to 10 milliseconds with setRefreshRate
and verbose mode is enabled with setVerbose
for detailed monitoring. The solver stage name is specified with setStage
. The monitor is connected to the solver using TNL::Solvers::IterativeSolver::setSolverMonitor. Subsequently, the numerical computation is performed and after it finishes, the monitor is stopped by calling TNL::Solvers::IterativeSolverMonitor::stopMainLoop.
Use of the iterate method
The ODE solvers in TNL provide an iterate
method for performing just one iteration. This is particularly useful when there is a need for enhanced control over the time loop, or when developing a hybrid solver that combines multiple integration methods. The usage of this method is demonstrated in the following example:
2#include <TNL/Solvers/ODE/ODESolver.h>
3#include <TNL/Solvers/ODE/Methods/Euler.h>
11main(
int argc,
char* argv[] )
20 const Real final_time = 10.0;
21 const Real output_time_step = 0.25;
22 Real next_output_time =
TNL::min( output_time_step, final_time );
25 Real current_tau = tau;
35 while( time < final_time ) {
36 auto f = [](
const Real& t,
const Real& current_tau,
const Vector& u, Vector& fu )
40 solver.iterate( u, time, current_tau, f );
41 if( time >= next_output_time ) {
43 next_output_time += output_time_step;
44 if( next_output_time > final_time )
45 next_output_time = final_time;
47 if( time + current_tau > next_output_time )
48 current_tau = next_output_time -
time;
constexpr ResultType min(const T1 &a, const T2 &b)
This function returns minimum of two numbers.
Definition Math.h:23
For simplicity, we demonstrate the use of iterate
with a static solver, but the process is similar for dynamic solvers. There are two main differences compared to using the solve method:
- Initialization: Before calling
iterate
, it's necessary to initialize the solver using the init
method. This step sets up auxiliary vectors within the solver. For ODE solvers with dynamic vectors, the internal vectors of the solver are allocated based on the size of the vector u
.
29 Vector u = 0.0;
30 ODESolver solver;
31 solver.init( u );
- Time Loop: Within the time loop, the
iterate
method is called. It requires the vector u
, the right-hand side function f
of the ODE, and also the variables time
and tau
, representing the current time \( t \) and the integration time step, respectively. The variable time
is incremented by tau
with each iteration. Additionally, tau
can be adjusted if the solver performs adaptive time step selection. It's important to adjust tau
to ensure that the next_output_time
is reached exactly.
35 while( time < final_time ) {
36 auto f = []( const Real& t, const Real& current_tau, const Vector& u, Vector& fu )
37 {
38 fu = t * sin( t );
39 };
40 solver.iterate( u, time, current_tau, f );
41 if( time >= next_output_time ) {
43 next_output_time += output_time_step;
44 if( next_output_time > final_time )
45 next_output_time = final_time;
46 }
47 if( time + current_tau > next_output_time )
48 current_tau = next_output_time -
time;
49 else
50 current_tau = tau;
51 }
User defined methods
The Runge-Kutta methods used for solving ODEs can generally be expressed as follows:
\[ k_1 = f(t, \vec u) \]
\[ k_2 = f(t + c_2, \vec u + \tau(a_{21} k_1)) \]
\[ k_3 = f(t + c_3, \vec u + \tau(a_{31} k_1 + a_{32} k_2)) \]
\[ \vdots \]
\[ k_s = f(t + c_s, \vec u + \tau( \sum_{j=1}^{s-1} a_{si} k_i ) )\]
\[ \vec u_{n+1} = \vec u_n + \tau \sum_{i=1}^s b_i k_i\]
\[ \vec u^\ast_{n+1} = \vec u_n + \tau \sum_{i=1}^s b^\ast_i k_i\]
\[ \vec e_{n+1} = \vec u_{n+1} - \vec u^\ast_{n+1} = \vec u_n + \tau \sum_{i=1}^s (b_i - b^\ast_i) k_i \]
where \(s\) denotes the number of stages, the vector \(\vec e_{n+1} \) is an error estimate and a basis for the adaptive choice of the integration step. Each such method can be expressed in the form of a Butcher tableau having the following form:
| | | | |
\( c_1 \) | | | | |
\( c_2 \) | \( a_{21} \) | | | |
\( c_3 \) | \( a_{31} \) | \( a_{32} \) | | |
\( \vdots \) | \( \vdots \) | \( \vdots \) | \( \ddots \) | |
\( c_s \) | \( a_{s1} \) | \( a_{s2} \) | \( \ldots \) | \( a_{s,s-1} \) |
| \( b_1 \) | \( b_2 \) | \( \ldots \) | \( b_s \) |
| \( b^\ast_1 \) | \( b^\ast_2 \) | \( \ldots \) | \( b^\ast_s \) |
For example, the the Fehlberg RK1(2) method can be expressed as:
| | | |
\( 0 \) | | | |
\( 1/2 \) | \( 1/2 \) | | |
\( 1 \) | \( 1/256 \) | \( 1/256 \) | |
| \( 1/512 \) | \( 255/256 \) | \( 1/521 \) |
| \( 1/256 \) | \( 255/256 \) | |
TNL allows the implementation of new Runge-Kutta methods simply by specifying the Butcher tableau. The following is an example of the implementation of the Fehlberg RK1(2):
5#include <TNL/Cuda/CudaCallable.h>
8template<
typename Value =
double >
11 using ValueType = Value;
14 static constexpr size_t Stages = 3;
17 static constexpr size_t
31 static constexpr ValueType
32 getCoefficient(
const size_t stage,
const size_t i )
34 return k_coefficients[ stage ][ i ];
37 static constexpr ValueType
38 getTimeCoefficient(
size_t i )
40 return time_coefficients[ i ];
43 static constexpr ValueType
44 getUpdateCoefficient(
size_t i )
46 return higher_order_update_coefficients[ i ];
50 static constexpr ValueType
51 getErrorCoefficient(
size_t i )
53 return higher_order_update_coefficients[ i ] - lower_order_update_coefficients[ i ];
The method is a templated structure accepting a tempate parameter Value
indicating the numeric type of the method. To implement a new method, we need to do the following:
Set the Stages of the Method: This is \( s \) in the definition of the Runge-Kutta method and equals the number of vectors \( \vec k_i \).
14 static constexpr size_t Stages = 3;
Determine Adaptivity: If the Runge-Kutta method allows an adaptive choice of the time step, the method isAdaptive
shall return true
, and false
otherwise.
24 static constexpr bool
25 isAdaptive()
26 {
27 return true;
28 }
Define Method for Error Estimation: Adaptive methods need coefficients for error estimation. Typically, these are the differences between coefficients for updates of higher and lower order of accuracy, as seen in getErrorCoefficients
. However, this method can be altered if there's a different formula for error estimation.
50 static constexpr ValueType
51 getErrorCoefficient( size_t i )
52 {
53 return higher_order_update_coefficients[ i ] - lower_order_update_coefficients[ i ];
54 }
Define Coefficients for Evaluating Vectors \( \vec k_i \): The coefficients for the evaluation of the vectors \( \vec k_i \) need to be defined.
and
Zero coefficients are omitted in the generated formulas at compile time, ensuring no performance drop.
Define Update Coefficients: Finaly, the update coefficients must be defined.
Such structure can be substituted to the TNL::Solvers::ODE::ODESolver as the template parameter Method
.