Numba-accelerated Pythonic implementation of MPDATA with Jupyter examples

These details have not been verified by PyPI

GitHub Statistics

View statistics for this project via Libraries.io, or by using our public dataset on Google BigQuery

Project description

PyMPDATA

PyMPDATA is a high-performance Numba-accelerated Pythonic implementation of the MPDATA algorithm of Smolarkiewicz et al. for numerically solving generalised transport equations - partial differential equations used to model conservation/balance laws, scalar-transport problems, convection-diffusion phenomena (in geophysical fluid dynamics and beyond). As of the current version, PyMPDATA supports homogeneous transport in 1D, 2D and 3D (work in progress) using structured meshes, optionally generalised by employment of a Jacobian of coordinate transformation. PyMPDATA includes implementation of a set of MPDATA variants including the non-oscillatory option, infinite-gauge, divergent-flow, double-pass donor cell (DPDC) and third-order-terms options. It also features support for integration of Fickian-terms in advection-diffusion problems using the pseudo-transport velocity approach. In 2D and 3D simulations, domain-decomposition is used for multi-threaded parallelism.

PyMPDATA is engineered purely in Python targeting both performance and usability, the latter encompassing research users', developers' and maintainers' perspectives. From researcher's perspective, PyMPDATA offers hassle-free installation on multitude of platforms including Linux, OSX and Windows, and eliminates compilation stage from the perspective of the user. From developers' and maintainers' perspective, PyMPDATA offers a suite of unit tests, multi-platform continuous integration setup, seamless integration with Python development aids including debuggers and profilers.

PyMPDATA design features a custom-built multi-dimensional Arakawa-C grid layer allowing to concisely represent multi-dimensional stencil operations. The grid layer is built on top of NumPy's ndarrays (using "C" ordering) using Numba's @njit functionality for high-performance array traversals. It enables one to code once for multiple dimensions, and automatically handles (and hides from the user) any halo-filling logic related with boundary conditions.

PyMPDATA ships with a set of examples/demos offered as github-hosted Jupyer notebooks offering single-click deployment in the cloud using mybinder.org or using colab.research.google.com. The examples/demos reproduce results from several published works on MPDATA and its applications, and provide a validation of the implementation and its performance.

Dependencies and installation

PyMPDATA depends on NumPy, Numba and SciPy which are all listed as project dependencies within the project's setup.py file.

To install PyMPDATA, one may use: pip install --pre git+https://github.com/atmos-cloud-sim-uj/PyMPDATA.git

Running the tests shipped with the package requires additional packages listed in the test-time-requirements.txt file.

PyMPDATA examples listed below are hosted in a separate repository and constitute the PyMPDATA_examples package. The examples have additional dependencies listed in PyMPDATA_examples package setup.py file. Running the examples requires the PyMPDATA_examples package to be installed. Since the examples package includes Jupyter notebooks (and their execution requires write access), the suggested install and launch steps are:

git clone https://github.com/atmos-cloud-sim-uj/PyMPDATA-examples.git
cd PyMPDATA-examples
pip install -e .
jupyter-notebook

Examples:

PyMPDATA ships with several demos that reproduce results from literature, including:

Smolarkiewicz 2006 Figs 3,4,10,11 & 12

(1D homogeneous cases depicting infinite-gauge and flux-corrected transport cases)
Arabas & Farhat 2020 Figs 1-3 & Tab. 1

(1D advection-diffusion example based on Black-Scholes equation)
Olesik et al. 2020

(1D particle population condensational growth problem with coordinate transformations)
Molenkamp 2D solid-body rotation test (as in Jaruga et al. 2015, Fig. 12)
1D advection-diffusion example with animation
2D advection on a sphere (upwind-only setup depicting coordinate transformation based on Williamson and Rasch 1989)

Package structure and API:

In short, PyMPDATA numerically solves the following equation:

$\partial_t (G \psi) + \nabla \cdot (Gu \psi) = 0$

where scalar field $\psi$ is referred to as the advectee, vector field u is referred to as advector, and the G factor corresponds to optional coordinate transformation.

The key classes constituting the PyMPDATA interface are summarised below with code snippets exemplifying usage of PyMPDATA from Python, Julia and Matlab.

A pdoc-generated documentation of PyMPDATA public API is maintained at: https://atmos-cloud-sim-uj.github.io/PyMPDATA

Options class

The Options class groups both algorithm variant options as well as some implementation-related flags that need to be set at the first place. All are set at the time of instantiation using the following keyword arguments of the constructor (all having default values indicated below):

n_iters: int = 2: number of iterations (2 means upwind + one corrective iteration)
infinite_gauge: bool = False: flag enabling the infinite-gauge option (does not maintain sign of the advected field, thus in practice implies switching flux corrected transport on)
divergent_flow: bool = False: flag enabling divergent-flow terms when calculating antidiffusive velocity
flux_corrected_transport: bool = False: flag enabling flux-corrected transport (FCT) logic (a.k.a. non-oscillatory or monotone variant)
third_order_terms: bool = False: flag enabling third-order terms
epsilon: float = 1e-15: value added to potentially zero-valued denominators
non_zero_mu_coeff: bool = False: flag indicating if code for handling the Fickian term is to be optimised out
DPDC: bool = False: flag enabling double-pass donor cell option (recursive pseudovelocities)
dtype: np.floating = np.float64: floating point precision

For a discussion of the above options, see e.g., Smolarkiewicz & Margolin 1998, Jaruga, Arabas et al. 2015 and Olesik, Arabas et al. 2020 (the last with examples using PyMPDATA).

In most use cases of PyMPDATA, the first thing to do is to instantiate the Options class with arguments suiting the problem at hand, e.g.:

Julia (click to expand)

using Pkg
Pkg.add("PyCall")
using PyCall
Options = pyimport("PyMPDATA").Options
options = Options(n_iters=3, infinite_gauge=true, flux_corrected_transport=true)

Matlab (click to expand)

Options = py.importlib.import_module('PyMPDATA').Options;
options = Options(pyargs(...
  'n_iters', 3, ...
  'infinite_gauge', true, ...
  'flux_corrected_transport', true ...
));

Python

from PyMPDATA import Options
options = Options(n_iters=3, infinite_gauge=True, flux_corrected_transport=True)

Arakawa-C grid layer and boundary conditions

The arakawa_c subpackage contains modules implementing the Arakawa-C staggered grid in which:

scalar fields are discretised onto cell-center points,
vector fields are discretised onto cell-boundary points.

In PyMPDATA, the solution domain is assumed to extend from the first cell's boundary to the last cell's boundary (thus first scalar field value is at $[\Delta x/2, \Delta y/2]$ ).

From the user perspective, the two key classes with their init methods are:

The data parameters are expected to be Numpy arrays or tuples of Numpy arrays, respectively. The halo parameter is the extent of ghost-cell region that will surround the data and will be used to implement boundary conditions. Its value (in practice 1 or 2) is dependent on maximal stencil extent for the MPDATA variant used and can be easily obtained using the Options.n_halo property.

As an example, the code below shows how to instantiate a scalar and a vector field given a 2D constant-velocity problem, using a grid of 100x100 points and cyclic boundary conditions (with all values set to zero):

Julia (click to expand)

ScalarField = pyimport("PyMPDATA").ScalarField
VectorField = pyimport("PyMPDATA").VectorField
PeriodicBoundaryCondition = pyimport("PyMPDATA").PeriodicBoundaryCondition

nx, ny = 100, 100
halo = options.n_halo
advectee = ScalarField(
    data=zeros((nx, ny)), 
    halo=halo, 
    boundary_conditions=(PeriodicBoundaryCondition(), PeriodicBoundaryCondition())
)
advector = VectorField(
    data=(zeros((nx+1, ny)), zeros((nx, ny+1))),
    halo=halo,
    boundary_conditions=(PeriodicBoundaryCondition(), PeriodicBoundaryCondition())    
)

Matlab (click to expand)

ScalarField = py.importlib.import_module('PyMPDATA').ScalarField;
VectorField = py.importlib.import_module('PyMPDATA').VectorField;
PeriodicBoundaryCondition = py.importlib.import_module('PyMPDATA').PeriodicBoundaryCondition;

nx = int32(100);
ny = int32(100);
halo = options.n_halo;
advectee = ScalarField(pyargs(...
    'data', py.numpy.zeros(int32([nx ny])), ... 
    'halo', halo, ...
    'boundary_conditions', py.tuple({PeriodicBoundaryCondition(), PeriodicBoundaryCondition()}) ...
));
advector = VectorField(pyargs(...
    'data', py.tuple({py.numpy.zeros(int32([nx+1 ny])), py.numpy.zeros(int32([nx ny+1]))}), ...
    'halo', halo, ...
    'boundary_conditions', py.tuple({PeriodicBoundaryCondition(), PeriodicBoundaryCondition()}) ...
));

Python

from PyMPDATA import ScalarField
from PyMPDATA import VectorField
from PyMPDATA import PeriodicBoundaryCondition
import numpy as np

nx, ny = 100, 100
halo = options.n_halo
advectee = ScalarField(
    data=np.zeros((nx, ny)), 
    halo=halo, 
    boundary_conditions=(PeriodicBoundaryCondition(), PeriodicBoundaryCondition())
)
advector = VectorField(
    data=(np.zeros((nx+1, ny)), np.zeros((nx, ny+1))),
    halo=halo,
    boundary_conditions=(PeriodicBoundaryCondition(), PeriodicBoundaryCondition())    
)

Note that the shapes of arrays representing components of the velocity field are different than the shape of the scalar field array due to employment of the staggered grid.

Besides the exemplified PeriodicBoundaryCondition class representing periodic boundary conditions, PyMPDATA supports ExtrapolatedBoundaryCondition, ConstantBoundaryCondition and PolarBoundaryCondition.

Stepper

The logic of the MPDATA iterative solver is represented in PyMPDATA by the Stepper class.

When instantiating the Stepper, the user has a choice of either supplying just the number of dimensions or specialising the stepper for a given grid:

Julia (click to expand)

Stepper = pyimport("PyMPDATA").Stepper

stepper = Stepper(options=options, n_dims=2)

Matlab (click to expand)

Stepper = py.importlib.import_module('PyMPDATA').Stepper;

steper = Stepper(pyargs(...
  'options', options, ...
  'n_dims', int32(2) ...
));

Python

from PyMPDATA import Stepper

stepper = Stepper(options=options, n_dims=2)

Julia (click to expand)

stepper = Stepper(options=options, grid=(nx, ny))

Matlab (click to expand)

stepper = Stepper(pyargs(...
  'options', options, ...
  'grid', py.tuple({nx, ny}) ...
))

Python

stepper = Stepper(options=options, grid=(nx, ny))

In the latter case, noticeably faster execution can be expected, however the resultant stepper is less versatile as bound to the given grid size. If number of dimensions is supplied only, the integration will take longer, yet same instance of the stepper can be used for different grids.

Since creating an instance of the Stepper class involves time consuming compilation of the algorithm code, the class is equipped with a cache logic - subsequent calls with same arguments return references to previously instantiated objects. Instances of Stepper contain no mutable data and are (thread-)safe to be reused.

The init method of Stepper has an optional non_unit_g_factor argument which is a Boolean flag enabling handling of the G factor term which can be used to represent coordinate transformations and/or variable fluid density.

Optionally, the number of threads to use for domain decomposition in first (non-contiguous) dimension during 2D and 3D calculations may be specified using the optional n_threads argument with a default value of numba.get_num_threads(). The multi-threaded logic of PyMPDATA depends thus on settings of numba, namely on the selected threading layer (either via NUMBA_THREADING_LAYER env var or via numba.config.THREADING_LAYER) and the selected size of the thread pool (NUMBA_NUM_THREADS env var or numba.config.NUMBA_NUM_THREADS).

Solver

Instances of the Solver class are used to control the integration and access solution data. During instantiation, additional memory required by the solver is allocated according to the options provided.

The only method of the Solver class besides the init is advance(self, nt: int, mu_coeff: float = 0) which advances the solution by nt timesteps, optionally taking into account a given value of diffusion coefficient.

Solution state is accessible through the Solver.advectee property.

Continuing with the above code snippets, instantiating a solver and making one integration step looks as follows:

Julia (click to expand)

Solver = pyimport("PyMPDATA").Solver
solver = Solver(stepper=stepper, advectee=advectee, advector=advector)
solver.advance(nt=1)
state = solver.advectee.get()

Matlab (click to expand)

Solver = py.importlib.import_module('PyMPDATA').Solver;
solver = Solver(pyargs('stepper', stepper, 'advectee', advectee, 'advector', advector));
solver.advance(pyargs('nt', 1));
state = solver.advectee.get();

Python

from PyMPDATA import Solver
solver = Solver(stepper=stepper, advectee=advectee, advector=advector)
solver.advance(nt=1)
state = solver.advectee.get()

Debugging

PyMPDATA relies heavily on Numba to provide high-performance number crunching operations. Arguably, one of the key advantage of embracing Numba is that it can be easily switched off. This brings multiple-order-of-magnitude drop in performance, yet it also make the entire code of the library susceptible to interactive debugging, one way of enabling it is by setting the following environment variable before importing PyMPDATA:

Julia (click to expand)

ENV["NUMBA_DISABLE_JIT"] = "1"

Matlab (click to expand)

setenv('NUMBA_DISABLE_JIT', '1');

Python

import os
os.environ["NUMBA_DISABLE_JIT"] = "1"

Credits:

Development of PyMPDATA is supported by the EU through a grant of the Foundation for Polish Science (POIR.04.04.00-00-5E1C/18).

Other open-source MPDATA implementations:

mpdat_2d in babyEULAG (FORTRAN) https://github.com/igfuw/bE_SDs/blob/master/babyEULAG.SDs.for#L741
mpdata-oop (C++, Fortran, Python) https://github.com/igfuw/mpdata-oop
apc-llc/mpdata (C++) https://github.com/apc-llc/mpdata
libmpdata++ (C++): https://github.com/igfuw/libmpdataxx
AtmosFOAM: https://github.com/AtmosFOAM/AtmosFOAM/tree/947b192f69d973ea4a7cfab077eb5c6c6fa8b0cf/applications/solvers/advection/MPDATAadvectionFoam

Project details

These details have not been verified by PyPI

GitHub Statistics

View statistics for this project via Libraries.io, or by using our public dataset on Google BigQuery

Release history Release notifications | RSS feed

1.0.16

Jan 9, 2024

1.0.15

Dec 4, 2023

1.0.14

Aug 26, 2023

1.0.13

Jul 21, 2023

1.0.12

Jun 24, 2023

1.0.11

Apr 25, 2023

1.0.10

Mar 23, 2023

1.0.9

Feb 23, 2023

1.0.8

Feb 12, 2023

1.0.7

Jan 17, 2023

1.0.6

Nov 26, 2022

1.0.5

Sep 15, 2022

1.0.4

Aug 16, 2022

1.0.3

Jul 29, 2022

1.0.2

Jun 18, 2022

1.0.1

Apr 20, 2022

1.0

Mar 4, 2022

0.12

Jan 17, 2022

0.11

Jan 14, 2022

0.10

Oct 31, 2021

0.9

Oct 11, 2021

0.8

Oct 10, 2021

0.7

Oct 8, 2021

0.6

Oct 5, 2021

0.5

Sep 13, 2021

0.4

Sep 11, 2021

0.3

Aug 4, 2021

This version

0.2

Jul 13, 2021

0.1

Jul 8, 2021

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

PyMPDATA-0.2.tar.gz (69.9 kB view hashes)

Uploaded Jul 13, 2021 Source

Built Distribution

PyMPDATA-0.2-py3-none-any.whl (46.1 kB view hashes)

Uploaded Jul 13, 2021 Python 3

Hashes for PyMPDATA-0.2.tar.gz

Hashes for PyMPDATA-0.2.tar.gz
Algorithm	Hash digest
SHA256	`d3ea92a68fba5d2c3c4fb8eb27cc04350f068a512ef811adc87010130d457cf9`
MD5	`01aba7e5512996681b3808cfd442d811`
BLAKE2b-256	`fd5600592591f94a52808001346e06f55fa3876e046d4b343fe3ad6c417dbac9`

Hashes for PyMPDATA-0.2-py3-none-any.whl

Hashes for PyMPDATA-0.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`edd07ad9e71b38b4e1a2088c2dd9be7fd793546ef38a0bdd58ef81072d867de2`
MD5	`41cc1ed1c392261d56d50a3a93d97083`
BLAKE2b-256	`923a7dc3a3986fceafd835a8b774d3555eb907136c8dd537578e801cfc653d2e`