Add normalization with statistics to many statistics preprocessors (#2189)

Co-authored-by: Valeriu Predoi <[email protected]>
schlunma and valeriupredoi authored Feb 5, 2024
1 parent 3700cc9 commit 22e31c6
Showing 6 changed files with 339 additions and 22 deletions.
62 changes: 53 additions & 9 deletions doc/recipe/preprocessor.rst
@@ -1962,11 +1962,15 @@ along the longitude coordinate.
Parameters:
* `operator`: Operation to apply.
See :ref:`stat_preprocs` for more details on supported statistics.
* `normalize`: If given, do not return the statistics cube itself, but
rather, the input cube, normalized with the statistics cube. Can either
be `subtract` (statistics cube is subtracted from the input cube) or
`divide` (input cube is divided by the statistics cube).
* Other parameters are directly passed to the `operator` as keyword
arguments.
See :ref:`stat_preprocs` for more details.

See also :func:`esmvalcore.preprocessor.zonal_means`.
See also :func:`esmvalcore.preprocessor.zonal_statistics`.
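
To illustrate what the `normalize` option changes, the following minimal sketch mimics the behaviour on a plain nested list instead of an Iris cube. It is not the ESMValCore implementation: the helper name `zonal_normalize` and the list-of-rows layout (one row per latitude, one value per longitude) are assumptions made purely for illustration, and the operator is fixed to `mean`.

```python
from statistics import fmean


def zonal_normalize(rows, normalize=None):
    """Hypothetical helper: rows[i][j] is the value at latitude i, longitude j."""
    # The "statistics" are the means along the longitude axis.
    zonal_means = [fmean(row) for row in rows]
    if normalize is None:
        # Without `normalize`, the statistics themselves are returned.
        return zonal_means
    if normalize == "subtract":
        # Statistics are subtracted from the input (anomalies).
        return [[v - m for v in row] for row, m in zip(rows, zonal_means)]
    if normalize == "divide":
        # Input is divided by the statistics (relative values).
        return [[v / m for v in row] for row, m in zip(rows, zonal_means)]
    raise ValueError(f"Expected 'subtract' or 'divide', got '{normalize}'")


field = [[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]]
print(zonal_normalize(field))              # [2.0, 20.0]
print(zonal_normalize(field, "subtract"))  # [[-1.0, 0.0, 1.0], [-10.0, 0.0, 10.0]]
```

Note that the normalized result keeps the shape of the input, whereas the plain statistics drop the longitude dimension.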


``meridional_statistics``
@@ -1979,22 +1983,22 @@ argument:
Parameters:
* `operator`: Operation to apply.
See :ref:`stat_preprocs` for more details on supported statistics.
* `normalize`: If given, do not return the statistics cube itself, but
rather, the input cube, normalized with the statistics cube. Can either
be `subtract` (statistics cube is subtracted from the input cube) or
`divide` (input cube is divided by the statistics cube).
* Other parameters are directly passed to the `operator` as keyword
arguments.
See :ref:`stat_preprocs` for more details.

See also :func:`esmvalcore.preprocessor.meridional_means`.
See also :func:`esmvalcore.preprocessor.meridional_statistics`.


.. _area_statistics:

``area_statistics``
-------------------

This function calculates statistics over a region.
It takes one argument, ``operator``, which is the name of the operation to
apply.

This function can be used to apply several different operations in the
horizontal plane: for example, mean, sum, standard deviation, median, variance,
minimum, maximum and root mean square.
@@ -2013,6 +2017,33 @@ The required supplementary variable, either ``areacella`` for atmospheric
variables or ``areacello`` for ocean variables, can be attached to the main
dataset as described in :ref:`supplementary_variables`.

Parameters:
* `operator`: Operation to apply.
See :ref:`stat_preprocs` for more details on supported statistics.
* `normalize`: If given, do not return the statistics cube itself, but
rather, the input cube, normalized with the statistics cube. Can either
be `subtract` (statistics cube is subtracted from the input cube) or
`divide` (input cube is divided by the statistics cube).
* Other parameters are directly passed to the `operator` as keyword
arguments.
See :ref:`stat_preprocs` for more details.

Examples:
  * Calculate global mean:

    .. code-block:: yaml

      area_statistics:
        operator: mean

  * Subtract global mean from dataset:

    .. code-block:: yaml

      area_statistics:
        operator: mean
        normalize: subtract

See also :func:`esmvalcore.preprocessor.area_statistics`.
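
The effect of the two recipe examples above can be sketched in plain Python. This is only an illustration under simplifying assumptions: the grid is flattened to a list of cell values, the cell areas stand in for ``areacella``/``areacello``, and the function names `area_mean` and `normalize_by_area_mean` are hypothetical, not part of ESMValCore.

```python
def area_mean(values, areas):
    """Area-weighted mean of cell values (both given as flat lists)."""
    total_area = sum(areas)
    return sum(v * a for v, a in zip(values, areas)) / total_area


def normalize_by_area_mean(values, areas, normalize):
    """Return the input values normalized with their area-weighted mean."""
    mean = area_mean(values, areas)
    if normalize == "subtract":
        return [v - mean for v in values]
    if normalize == "divide":
        return [v / mean for v in values]
    raise ValueError(f"Expected 'subtract' or 'divide', got '{normalize}'")


values = [280.0, 290.0, 300.0]  # e.g. temperatures in K
areas = [1.0, 2.0, 1.0]         # relative grid cell areas
print(area_mean(values, areas))                           # 290.0
print(normalize_by_area_mean(values, areas, "subtract"))  # [-10.0, 0.0, 10.0]
```

With `normalize: subtract` the result is an anomaly field on the original grid rather than a single global value.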


@@ -2100,9 +2131,6 @@ See also :func:`esmvalcore.preprocessor.extract_volume`.
This function calculates the volume-weighted average across three dimensions,
but maintains the time dimension.

This function takes the argument: `operator`, which defines the operation to
apply over the volume.
At the moment, only `mean` is supported.
By default, the `mean` operation is weighted by the grid cell volumes.

For weighted statistics, this function requires a cell volume `cell measure`_,
@@ -2113,6 +2141,18 @@ dataset as described in :ref:`supplementary_variables`.

No depth coordinate is required as this is determined by Iris.

Parameters:
* `operator`: Operation to apply.
At the moment, only `mean` is supported.
See :ref:`stat_preprocs` for more details on supported statistics.
* `normalize`: If given, do not return the statistics cube itself, but
rather, the input cube, normalized with the statistics cube. Can either
be `subtract` (statistics cube is subtracted from the input cube) or
`divide` (input cube is divided by the statistics cube).
* Other parameters are directly passed to the `operator` as keyword
arguments.
See :ref:`stat_preprocs` for more details.

See also :func:`esmvalcore.preprocessor.volume_statistics`.

.. _axis_statistics:
@@ -2128,6 +2168,10 @@ Takes arguments:
Possible values for the axis are `x`, `y`, `z`, `t`.
* `operator`: Operation to apply.
See :ref:`stat_preprocs` for more details on supported statistics.
* `normalize`: If given, do not return the statistics cube itself, but
rather, the input cube, normalized with the statistics cube. Can either
be `subtract` (statistics cube is subtracted from the input cube) or
`divide` (input cube is divided by the statistics cube).
* Other parameters are directly passed to the `operator` as keyword
arguments.
See :ref:`stat_preprocs` for more details.
50 changes: 40 additions & 10 deletions esmvalcore/preprocessor/_area.py
@@ -8,7 +8,7 @@
import logging
import warnings
from pathlib import Path
from typing import TYPE_CHECKING, Iterable, Optional
from typing import TYPE_CHECKING, Iterable, Literal, Optional

import fiona
import iris
@@ -20,8 +20,13 @@
from iris.cube import Cube, CubeList
from iris.exceptions import CoordinateMultiDimError, CoordinateNotFoundError

from ._shared import get_iris_aggregator, guess_bounds, update_weights_kwargs
from ._supplementary_vars import (
from esmvalcore.preprocessor._shared import (
get_iris_aggregator,
get_normalized_cube,
guess_bounds,
update_weights_kwargs,
)
from esmvalcore.preprocessor._supplementary_vars import (
add_ancillary_variable,
add_cell_measure,
register_supplementaries,
@@ -187,6 +192,7 @@ def _extract_irregular_region(cube, start_longitude, end_longitude,
def zonal_statistics(
cube: Cube,
operator: str,
normalize: Optional[Literal['subtract', 'divide']] = None,
**operator_kwargs
) -> Cube:
"""Compute zonal statistics.
@@ -199,14 +205,20 @@ def zonal_statistics(
The operation. Used to determine the :class:`iris.analysis.Aggregator`
object used to calculate the statistics. Allowed options are given in
:ref:`this table <supported_stat_operator>`.
normalize:
If given, do not return the statistics cube itself, but rather, the
input cube, normalized with the statistics cube. Can either be
`subtract` (statistics cube is subtracted from the input cube) or
`divide` (input cube is divided by the statistics cube).
**operator_kwargs:
Optional keyword arguments for the :class:`iris.analysis.Aggregator`
object defined by `operator`.
Returns
-------
iris.cube.Cube
Zonal statistics cube.
Zonal statistics cube or input cube normalized by statistics cube (see
`normalize`).
Raises
------
@@ -220,14 +232,17 @@ def zonal_statistics(
"Zonal statistics on irregular grids not yet implemented"
)
(agg, agg_kwargs) = get_iris_aggregator(operator, **operator_kwargs)
cube = cube.collapsed('longitude', agg, **agg_kwargs)
cube.data = cube.core_data().astype(np.float32, casting='same_kind')
return cube
result = cube.collapsed('longitude', agg, **agg_kwargs)
if normalize is not None:
result = get_normalized_cube(cube, result, normalize)
result.data = result.core_data().astype(np.float32, casting='same_kind')
return result


def meridional_statistics(
cube: Cube,
operator: str,
normalize: Optional[Literal['subtract', 'divide']] = None,
**operator_kwargs,
) -> Cube:
"""Compute meridional statistics.
@@ -240,6 +255,11 @@ def meridional_statistics(
The operation. Used to determine the :class:`iris.analysis.Aggregator`
object used to calculate the statistics. Allowed options are given in
:ref:`this table <supported_stat_operator>`.
normalize:
If given, do not return the statistics cube itself, but rather, the
input cube, normalized with the statistics cube. Can either be
`subtract` (statistics cube is subtracted from the input cube) or
`divide` (input cube is divided by the statistics cube).
**operator_kwargs:
Optional keyword arguments for the :class:`iris.analysis.Aggregator`
object defined by `operator`.
@@ -261,9 +281,11 @@ def meridional_statistics(
"Meridional statistics on irregular grids not yet implemented"
)
(agg, agg_kwargs) = get_iris_aggregator(operator, **operator_kwargs)
cube = cube.collapsed('latitude', agg, **agg_kwargs)
cube.data = cube.core_data().astype(np.float32, casting='same_kind')
return cube
result = cube.collapsed('latitude', agg, **agg_kwargs)
if normalize is not None:
result = get_normalized_cube(cube, result, normalize)
result.data = result.core_data().astype(np.float32, casting='same_kind')
return result


def compute_area_weights(cube):
@@ -348,6 +370,7 @@ def _try_adding_calculated_cell_area(cube: Cube) -> None:
def area_statistics(
cube: Cube,
operator: str,
normalize: Optional[Literal['subtract', 'divide']] = None,
**operator_kwargs,
) -> Cube:
"""Apply a statistical operator in the horizontal plane.
@@ -370,6 +393,11 @@ def area_statistics(
The operation. Used to determine the :class:`iris.analysis.Aggregator`
object used to calculate the statistics. Allowed options are given in
:ref:`this table <supported_stat_operator>`.
normalize:
If given, do not return the statistics cube itself, but rather, the
input cube, normalized with the statistics cube. Can either be
`subtract` (statistics cube is subtracted from the input cube) or
`divide` (input cube is divided by the statistics cube).
**operator_kwargs:
Optional keyword arguments for the :class:`iris.analysis.Aggregator`
object defined by `operator`.
@@ -396,6 +424,8 @@ def area_statistics(
)

result = cube.collapsed(['latitude', 'longitude'], agg, **agg_kwargs)
if normalize is not None:
result = get_normalized_cube(cube, result, normalize)

# Make sure to preserve dtype
new_dtype = result.dtype
59 changes: 58 additions & 1 deletion esmvalcore/preprocessor/_shared.py
@@ -7,8 +7,9 @@
import re
import warnings
from collections.abc import Callable
from typing import Any, Optional
from typing import Any, Literal, Optional

import dask.array as da
import iris.analysis
import numpy as np
from iris.coords import DimCoord
@@ -181,3 +182,59 @@ def update_weights_kwargs(
else:
kwargs.pop('weights', None)
return kwargs


def get_normalized_cube(
    cube: Cube,
    statistics_cube: Cube,
    normalize: Literal['subtract', 'divide'],
) -> Cube:
    """Get cube normalized with statistics cube.

    Parameters
    ----------
    cube:
        Input cube that will be normalized.
    statistics_cube:
        Cube that is used to normalize the input cube. Needs to be
        broadcastable to the input cube's shape according to iris' rich
        broadcasting rules enabled by the use of named dimensions (see also
        https://scitools-iris.readthedocs.io/en/latest/userguide/cube_maths.
        html#calculating-a-cube-anomaly). This is usually ensured by using
        :meth:`iris.cube.Cube.collapsed` to calculate the statistics cube.
    normalize:
        Normalization operation. Can either be `subtract` (statistics cube is
        subtracted from the input cube) or `divide` (input cube is divided by
        the statistics cube).

    Returns
    -------
    Cube
        Input cube normalized with statistics cube.

    """
    if normalize == 'subtract':
        normalized_cube = cube - statistics_cube

    elif normalize == 'divide':
        normalized_cube = cube / statistics_cube

        # Iris sometimes masks zero-divisions, sometimes not
        # (https://github.com/SciTools/iris/issues/5523). Make sure to
        # consistently mask them here.
        normalized_cube.data = da.ma.masked_invalid(
            normalized_cube.core_data()
        )

    else:
        raise ValueError(
            f"Expected 'subtract' or 'divide' for `normalize`, got "
            f"'{normalize}'"
        )

    # Keep old metadata except for units
    new_units = normalized_cube.units
    normalized_cube.metadata = cube.metadata
    normalized_cube.units = new_units

    return normalized_cube
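
The reason for masking after `divide` can be sketched without Iris or Dask. The committed code relies on `da.ma.masked_invalid`. The stand-in below is a hypothetical pure-Python helper (`divide_masked` is not part of ESMValCore) that uses `None` in place of a masked element and assumes element-wise pairs of input and statistics values.

```python
import math


def divide_masked(values, stats):
    """Divide element-wise and mask (None) every non-finite result."""
    result = []
    for value, stat in zip(values, stats):
        # Plain Python raises ZeroDivisionError, whereas array libraries
        # yield inf or nan; emulate the latter with nan here.
        quotient = value / stat if stat != 0 else math.nan
        # Emulate masked_invalid: nan and +/-inf become masked entries.
        result.append(quotient if math.isfinite(quotient) else None)
    return result


print(divide_masked([1.0, 0.0, 4.0], [2.0, 0.0, 0.0]))  # [0.5, None, None]
```

Masking both `0/0` and `x/0` in the same way gives downstream steps a consistent view of invalid points, which is the stated intent of the comment in the function above.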