Merge pull request #205 from firedrakeproject/asq_manuscript_examples
Scripts for asQ manuscript
JHopeCollins authored Sep 30, 2024
2 parents b1a4cf9 + 9fd55d3 commit 7bc3f82
Showing 12 changed files with 2,303 additions and 4 deletions.
8 changes: 4 additions & 4 deletions README.md
@@ -8,10 +8,11 @@ asQ is designed to allow fast prototyping of new ParaDiag methods, while still b
This is achieved using the Firedrake and PETSc libraries.
The finite element models are defined by specifying the weak form using [Firedrake, "*an automated system for the portable solution of partial differential equations using the finite element method*](https://www.firedrakeproject.org/)", and the linear and nonlinear solvers required are provided by [PETSc, "*the Portable, Extensible Toolkit for Scientific Computation*"](https://petsc.org/release/).

+See the arXiv paper https://arxiv.org/abs/2409.18792 for a description and demonstration of asQ. The code used for this paper can be found in the [asq_manuscript_examples](https://github.com/firedrakeproject/asQ/tree/master/asq_manuscript_examples) directory.

## ParaDiag

-ParaDiag is a parallel-in-time method, meaning that is solves for multiple timesteps of a timeseries simultaneously, rather than one at a time like traditional serial-in-time methods.
+ParaDiag is a parallel-in-time method, meaning that it solves for multiple timesteps of a timeseries simultaneously, rather than one at a time like traditional serial-in-time methods.
This [review article](https://arxiv.org/abs/2005.09158) provides a good introduction to the method.
asQ implements the ParaDiag-II family of methods based on creating a block-circulant approximation to the all-at-once system which can be block-diagonalised with the FFT and solved efficiently in parallel.
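
A minimal NumPy sketch (illustrative only, not asQ code) of the alpha-circulant diagonalisation behind this idea: inverting an alpha-circulant matrix reduces to one FFT, a set of independent "block" solves (scalar divisions here), and one inverse FFT.

```python
# Illustrative sketch: C_alpha = D^{-1} F^{-1} diag(fft(D c)) F D,
# with D = diag(alpha**(j/n)), so solving C_alpha x = b costs one FFT,
# n independent solves in frequency space, and one inverse FFT.
import numpy as np

n, alpha = 8, 1e-4
rng = np.random.default_rng(0)
c = rng.standard_normal(n)    # first column of the alpha-circulant matrix
b = rng.standard_normal(n)    # right-hand side

# build C_alpha explicitly, only to check the result at the end
C = np.array([[c[(i - j) % n]*(alpha if i < j else 1.0) for j in range(n)]
              for i in range(n)])

gamma = alpha**(np.arange(n)/n)       # diagonal scaling D
eigvals = np.fft.fft(gamma*c)         # eigenvalues of the scaled circulant

y = np.fft.fft(gamma*b)               # (a) transform the right-hand side
y /= eigvals                          # (b) independent "block" solves
x = np.fft.ifft(y).real/gamma         # (c) transform back

print(np.allclose(C @ x, b))          # True: x solves C_alpha x = b
```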

@@ -22,10 +23,9 @@ To install asQ, pass the arguments `--install asQ` to the `firedrake-install` sc

## Getting started

-The best place to start is the [examples directory](https://github.com/firedrakeproject/asQ/tree/master/examples).
-Annotated scripts for the linear advection equation and the heat equation show how to set up a problem with asQ and solve the timeseries using ParaDiag.
+The best place to start is the arXiv paper and associated examples in [this directory](https://github.com/firedrakeproject/asQ/tree/master/asq_manuscript_examples).

-More advanced scripts can be found in the [case studies directory](https://github.com/firedrakeproject/asQ/tree/master/case_studies), including scripts for the shallow water equations and a model for stratigraphic evolution of the sea floor.
+Other examples can be found in the [examples directory](https://github.com/firedrakeproject/asQ/tree/master/examples) and more advanced scripts can be found in the [case studies directory](https://github.com/firedrakeproject/asQ/tree/master/case_studies), including scripts for the shallow water equations and a model for stratigraphic evolution of the sea floor.

## Help and support

44 changes: 44 additions & 0 deletions asq_manuscript_examples/README.md
@@ -0,0 +1,44 @@
# Example scripts for "asQ: parallel-in-time finite element simulations using ParaDiag for geoscientific models and beyond"

These are the python scripts used to generate the data for the asQ library paper https://arxiv.org/abs/2409.18792.

## Section 3.2 "Heat equation example" script
`heat.py` is the example in Section 3.2 "A heat equation example".
It explicitly uses the `AllAtOnce*` objects to create each component of the all-at-once system.
Because it is hard-coded to four ensemble ranks (i.e. four MPI ranks in time), it must be run with a multiple of 4 MPI ranks, e.g.

```mpiexec -np 4 python heat.py```

will run with 4 ranks in time and serially in space, whereas

```mpiexec -np 8 python heat.py```

will run with 4 ranks in time, and 2 ranks in each spatial communicator.
To change the time-parallelism, change the `time_partition` list in the script.
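
As a purely illustrative sketch (the actual `time_partition` values in `heat.py` may differ), the rank counts above decompose as follows:

```python
# Illustrative only: how MPI ranks are split between time and space.
time_partition = [2, 2, 2, 2]           # one entry per time-slice (ensemble rank)

nslices = len(time_partition)           # 4 ranks in time
total_ranks = 8                         # e.g. from `mpiexec -np 8 python heat.py`
spatial_ranks = total_ranks // nslices  # 2 ranks in each spatial communicator

assert total_ranks % nslices == 0       # total ranks must be a multiple of 4
print(nslices, spatial_ranks)           # -> 4 2
```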

## Section 4 "Numerical Examples" scripts

All scripts use `argparse` to process command line arguments, so they will print out information on how to use them if run with the `-h` flag. They do not have to be run in parallel to do this.

`python <script_name.py> -h`

All scripts also accept a `--show_args` argument, which will print the values of all argparse arguments at the beginning of the script.
The default arguments for the `*_paradiag.py` scripts do not use time-parallelism, so they can be run in serial.
To specify the time-parallelism, see the help for the `--nslices` and `--slice_length` command line arguments.
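
For example, a hypothetical parallel run of the advection script with 4 time-slices of 4 timesteps each (these argument values are illustrative, not the ones used for the paper):

```mpiexec -np 8 python advection_paradiag.py --nslices 4 --slice_length 4 --show_args```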

- The data in Section 4.1 "Advection equation" was generated with
- `advection_serial.py` for the serial-in-time results.
- `advection_paradiag.py` for the parallel-in-time results.
- The data in Section 4.2 "Linear shallow water equations" was generated with
- `linear_shallow_water_serial.py` for the serial-in-time results.
- `linear_shallow_water_paradiag.py` for the parallel-in-time results.
- The data in Section 4.3 "Nonlinear shallow water equations" was generated with
- `nonlinear_shallow_water_serial.py` for the serial-in-time results.
- `nonlinear_shallow_water_paradiag.py` for the parallel-in-time results.
- The data in Section 4.4 "Compressible Euler equations" was generated with
- `vertical_slice_serial.py` for the serial-in-time results.
- `vertical_slice_paradiag.py` for the parallel-in-time results.

The `*_serial.py` scripts all use the `SerialMiniApp` class to run the serial-in-time method.
The parallel-in-time shallow water equation scripts use the `ShallowWaterMiniApp` to set up the all-at-once system specifically for the shallow water equations.
The parallel-in-time advection and vertical slice scripts use the `Paradiag` class to construct the all-at-once system without having to manually create each `AllAtOnce*` object.
253 changes: 253 additions & 0 deletions asq_manuscript_examples/advection_paradiag.py
@@ -0,0 +1,253 @@
from math import pi, cos, sin
from utils.timing import SolverTimer
import firedrake as fd
from firedrake.petsc import PETSc
import asQ
from argparse import ArgumentParser
from argparse_formatter import DefaultsAndRawTextFormatter

parser = ArgumentParser(
    description='ParaDiag timestepping for scalar advection of a Gaussian bump in a periodic square with DG in space and implicit-theta in time.',
    epilog="""\
Optional PETSc command line arguments:
   -circulant_alpha :float: The circulant parameter to use in the preconditioner. Default 1e-4.
   -ksp_rtol :float: The relative residual drop required for convergence. Default 1e-11.
      See https://petsc.org/release/manualpages/KSP/KSPSetTolerances/
   -ksp_type :str: The Krylov method to use for the all-at-once iterations. Default 'richardson'.
      Alternatives include gmres or fgmres.
      See https://petsc.org/release/manualpages/KSP/KSPSetType/
""",
    formatter_class=DefaultsAndRawTextFormatter
)
parser.add_argument('--nx', type=int, default=16, help='Number of cells along each side of the square.')
parser.add_argument('--cfl', type=float, default=0.8, help='Convective CFL number.')
parser.add_argument('--angle', type=float, default=pi/6, help='Angle of the convective velocity to the horizontal.')
parser.add_argument('--degree', type=int, default=1, help='Degree of the scalar space.')
parser.add_argument('--theta', type=float, default=0.5, help='Parameter for the implicit theta timestepping method.')
parser.add_argument('--width', type=float, default=0.2, help='Width of the Gaussian bump.')
parser.add_argument('--nwindows', type=int, default=1, help='Number of time-windows to solve.')
parser.add_argument('--nslices', type=int, default=1, help='Number of time-slices in the all-at-once system. Must divide the number of MPI ranks exactly.')
parser.add_argument('--slice_length', type=int, default=4, help='Number of timesteps per time-slice. Total number of timesteps in the all-at-once system is nslices*slice_length.')
parser.add_argument('--metrics_dir', type=str, default='metrics/advection', help='Directory to save paradiag output metrics to.')
parser.add_argument('--show_args', action='store_true', help='Print all the arguments when the script starts.')

args = parser.parse_known_args()
args = args[0]

if args.show_args:
    PETSc.Sys.Print(args)

# The time partition describes how many timesteps
# are included on each time-slice of the ensemble

time_partition = tuple(args.slice_length for _ in range(args.nslices))
window_length = sum(time_partition)
nsteps = args.nwindows*window_length

# Calculate the timestep size dt from the CFL number
umax = 1.
dx = 1./args.nx
dt = args.cfl*dx/umax

# The Ensemble with the spatial and time communicators
ensemble = asQ.create_ensemble(time_partition)

# # # === --- domain --- === # # #

# The mesh needs to be created with the spatial communicator
mesh = fd.PeriodicUnitSquareMesh(args.nx, args.nx, quadrilateral=True, comm=ensemble.comm)

# We use a discontinuous Galerkin space for the advected scalar
V = fd.FunctionSpace(mesh, "DQ", args.degree)

# # # === --- initial conditions --- === # # #

x, y = fd.SpatialCoordinate(mesh)


# The scalar initial condition is a Gaussian bump centred at (0.5, 0.5)
def radius(x, y):
    return fd.sqrt(pow(x-0.5, 2) + pow(y-0.5, 2))


def gaussian(x, y):
    return fd.exp(-0.5*pow(radius(x, y)/args.width, 2))


q0 = fd.Function(V, name="scalar_initial")
q0.interpolate(1 + gaussian(x, y))

# The advecting velocity field is constant and directed at an angle to the x-axis
u = fd.Constant(fd.as_vector((umax*cos(args.angle), umax*sin(args.angle))))

# # # === --- finite element forms --- === # # #


# The time-derivative mass form for the scalar advection equation.
# asQ assumes that the mass form is linear so here
# q is a TrialFunction and phi is a TestFunction
def form_mass(q, phi):
    return phi*q*fd.dx


# The DG advection form for the scalar advection equation.
# asQ assumes that the function form may be nonlinear so
# here q is a Function and phi is a TestFunction
def form_function(q, phi, t):
    # upwind switch
    n = fd.FacetNormal(mesh)
    un = fd.Constant(0.5)*(fd.dot(u, n) + abs(fd.dot(u, n)))

    # integration over element volume
    int_cell = q*fd.div(phi*u)*fd.dx

    # integration over internal facets
    int_facet = (phi('+') - phi('-'))*(un('+')*q('+') - un('-')*q('-'))*fd.dS

    return int_facet - int_cell


# # # === --- PETSc solver parameters --- === # # #

# The PETSc solver parameters used to solve the
# blocks in step (b) of inverting the ParaDiag matrix.
# MUMPS is a parallel direct solver so spatial parallelism can be used
block_parameters = {
    'ksp_type': 'preonly',
    'pc_type': 'lu',
    'pc_factor_mat_solver_type': 'mumps',
}

# The PETSc solver parameters for solving the all-at-once system.
# The python preconditioner 'asQ.CirculantPC' applies the ParaDiag matrix.
#
# The equation is linear so we can either:
# a) Solve it using a preconditioned Krylov method:
# P^{-1}Au = P^{-1}b
# The solver option for this is:
# -ksp_type gmres
# b) Solve it with stationary iterations:
# Pu_{k+1} = (P - A)u_{k} + b
# The solver option for this is:
# -ksp_type richardson

paradiag_parameters = {
    'snes_type': 'ksponly', # only solve 1 "Newton iteration" per window (i.e. a linear problem)
    'ksp_type': 'richardson', # stationary iterations
    'ksp': {
        'monitor': None, # show the residual at every iteration
        'converged_rate': None, # show the contraction rate once the linear solve has converged
        'rtol': 1e-11, # relative residual tolerance
    },
    'pc_type': 'python',
    'pc_python_type': 'asQ.CirculantPC', # the alpha-circulant preconditioner
    'circulant_alpha': 1e-4, # set other values from command line using: -circulant_alpha <value>
    'circulant_block': block_parameters, # options dictionary for the inner solve
    'circulant_state': 'linear', # system is linear so don't update the preconditioner reference state
    'aaos_jacobian_state': 'linear', # system is linear so don't update the jacobian reference state
}


# # # === --- Setup ParaDiag --- === # # #

# Give everything to the Paradiag object which will build the all-at-once system.
paradiag = asQ.Paradiag(ensemble=ensemble,
                        form_function=form_function,
                        form_mass=form_mass,
                        ics=q0, dt=dt, theta=args.theta,
                        time_partition=time_partition,
                        solver_parameters=paradiag_parameters)

# create a timer to profile the calculations
timer = SolverTimer()


# This function will be called before paradiag solves each time-window. We can use
# this to make the output a bit easier to read, and to time the window calculation
def window_preproc(paradiag, wndw, rhs):
    PETSc.Sys.Print(f'### === --- Calculating time-window {wndw} --- === ###')
    PETSc.Sys.Print('')
    # for now we are interested in timing only the solve, this
    # makes sure we don't time any synchronisation after prints.
    with PETSc.Log.Event("window_preproc.Coll_Barrier"):
        paradiag.ensemble.ensemble_comm.Barrier()
    timer.start_timing()


# This function will be called after paradiag solves each time-window. We can use
# this to finish the window calculation timing and print the result.
def window_postproc(paradiag, wndw, rhs):
    timer.stop_timing()
    PETSc.Sys.Print('')
    PETSc.Sys.Print(f'Window solution time: {round(timer.times[-1], 5)}')
    PETSc.Sys.Print('')


# Setup all solver objects. The firedrake DM and options management
# makes it difficult to setup some preconditioners without actually
# calling `solve`, so we just run once to set everything up.
PETSc.Sys.Print('')
PETSc.Sys.Print('### === --- Setting up solver and prefactoring --- === ###')
PETSc.Sys.Print('')
with PETSc.Log.Event("warmup_solve"):
    paradiag.solve(1)
PETSc.Sys.Print('')

# reset solution and iteration counts for the timed solve
paradiag.reset_diagnostics()
aaofunc = paradiag.solver.aaofunc
aaofunc.bcast_field(-1, aaofunc.initial_condition)
aaofunc.assign(aaofunc.initial_condition)

PETSc.Sys.Print('### === --- Solving timeseries --- === ###')
PETSc.Sys.Print('')

# Solve nwindows of the all-at-once system
with PETSc.Log.Event("timed_solves"):
    paradiag.solve(args.nwindows,
                   preproc=window_preproc,
                   postproc=window_postproc)

# # # === --- Solver diagnostics --- === # # #

PETSc.Sys.Print('### === --- Iteration and timing results --- === ###')
PETSc.Sys.Print('')

asQ.write_paradiag_metrics(paradiag, directory=args.metrics_dir)

nw = paradiag.total_windows
nt = paradiag.total_timesteps
PETSc.Sys.Print(f'Total windows: {nw}')
PETSc.Sys.Print(f'Total timesteps: {nt}')
PETSc.Sys.Print('')

# Show the parallel partition sizes.
PETSc.Sys.Print(f'Total DoFs per window: {V.dim()*window_length}')
PETSc.Sys.Print(f'DoFs per timestep: {V.dim()}')
PETSc.Sys.Print(f'Total number of MPI ranks: {ensemble.global_comm.size}')
PETSc.Sys.Print(f'Number of MPI ranks per timestep: {mesh.comm.size}')
PETSc.Sys.Print(f'DoFs/rank: {V.dim()/mesh.comm.size}')
PETSc.Sys.Print(f'Complex block DoFs/rank: {2*V.dim()/mesh.comm.size}')
PETSc.Sys.Print('')

# paradiag collects a few iteration counts for us
lits = paradiag.linear_iterations
nlits = paradiag.nonlinear_iterations
blits = paradiag.block_iterations.data()

# Number of nonlinear iterations will be 1 per window for linear problems
PETSc.Sys.Print(f'Nonlinear iterations: {str(nlits).rjust(5)} | Iterations per window: {str(nlits/nw).rjust(5)}')

# Number of linear iterations of the all-at-once system, total and per window.
PETSc.Sys.Print(f'Linear iterations: {str(lits).rjust(5)} | Iterations per window: {str(lits/nw).rjust(5)}')

# Number of iterations needed for each block in step-(b), total and per block solve
PETSc.Sys.Print(f'Total block linear iterations: {blits}')
PETSc.Sys.Print(f'Iterations per block solve: {blits/lits}')
PETSc.Sys.Print('')

# Timing measurements
PETSc.Sys.Print(timer.string(timesteps_per_solve=window_length,
                             total_iterations=paradiag.linear_iterations, ndigits=5))
PETSc.Sys.Print('')