improver.ensemble_copula_coupling.utilities module

This module defines the utilities required for Ensemble Copula Coupling plugins.

choose_set_of_percentiles(no_of_percentiles, sampling='quantile')[source]

Function to create percentiles.

Parameters:

no_of_percentiles (int) – Number of percentiles.
sampling (str) –
Type of sampling of the distribution to produce a set of percentiles e.g. quantile or random.

Accepted options for sampling are:
- quantile: A regular set of equally spaced percentiles that
  divide a Cumulative Distribution Function into blocks of equal probability.
- random: A random set of ordered percentiles.

Return type:

List[float]

Returns:

Percentiles calculated using the sampling technique specified.

Raises:

ValueError – if the sampling option is not one of the accepted options.

References

For further details, see:

Flowerdew, J., 2014. Calibrating ensemble reliability whilst preserving spatial structure. Tellus, Series A: Dynamic Meteorology and Oceanography, 66(1), pp.1-20.

Schefzik, R., Thorarinsdottir, T.L. & Gneiting, T., 2013. Uncertainty Quantification in Complex Simulation Models Using Ensemble Copula Coupling. Statistical Science, 28(4), pp.616-640.
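
As an illustration of the two sampling options, the following is a minimal sketch rather than the IMPROVER source. It assumes that the n percentiles are placed at i / (n + 1) for i = 1..n, so that they bound n + 1 blocks of equal probability, and that the random option draws from the same range. The name sketch_percentiles and the seed argument are hypothetical.

```python
import numpy as np


def sketch_percentiles(no_of_percentiles, sampling="quantile", seed=None):
    """Illustrative stand-in for choose_set_of_percentiles (not the real code)."""
    n = no_of_percentiles
    # Assumed range: avoid the 0th and 100th percentiles themselves.
    low, high = 1 / (n + 1), n / (n + 1)
    if sampling == "quantile":
        # n equally spaced fractions bounding n + 1 equal-probability blocks.
        fractions = np.linspace(low, high, n)
    elif sampling == "random":
        # Assumption: uniform draws over the same range, sorted into order.
        rng = np.random.default_rng(seed)
        fractions = np.sort(rng.uniform(low, high, n))
    else:
        raise ValueError(f"Unrecognised sampling option: {sampling!r}")
    # Convert fractions to percentiles and return plain Python floats.
    return [float(p) for p in fractions * 100]


quantile_percentiles = sketch_percentiles(3)  # 25, 50 and 75
random_percentiles = sketch_percentiles(5, sampling="random", seed=0)
```

Under this assumption, three quantile-sampled percentiles split the distribution into four blocks of 25% probability each.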

concatenate_2d_array_with_2d_array_endpoints(array_2d, low_endpoint, high_endpoint)[source]

For a 2d array, add a 2d array as the lower and upper endpoints. The concatenation that adds the lower and upper endpoints to the 2d array is performed along the second (index 1) dimension.

Parameters:

array_2d (ndarray) – 2d array of values
low_endpoint (float) – Number used to create a 2d array of a constant value as the lower endpoint.
high_endpoint (float) – Number used to create a 2d array of a constant value as the upper endpoint.

Return type:

ndarray

Returns:

2d array of values after padding with the low_endpoint and high_endpoint.
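
The padding described above can be sketched with plain NumPy. This is an illustrative version, not the IMPROVER implementation. It assumes that each endpoint array is a single column with one entry per row of array_2d. The name sketch_concatenate_endpoints is hypothetical.

```python
import numpy as np


def sketch_concatenate_endpoints(array_2d, low_endpoint, high_endpoint):
    """Pad every row of array_2d with a constant lower and upper endpoint."""
    n_rows = array_2d.shape[0]
    # Assumption: each endpoint becomes one constant-valued column.
    low = np.full((n_rows, 1), low_endpoint, dtype=array_2d.dtype)
    high = np.full((n_rows, 1), high_endpoint, dtype=array_2d.dtype)
    # Concatenate along the second (index 1) dimension.
    return np.concatenate((low, array_2d, high), axis=1)


values = np.array([[1.0, 2.0], [3.0, 4.0]])
padded = sketch_concatenate_endpoints(values, 0.0, 10.0)
# padded has shape (2, 4): each row gains 0.0 at the start and 10.0 at the end.
```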

create_cube_with_percentiles(percentiles, template_cube, cube_data, cube_unit=None)[source]

Create a cube with a percentile coordinate based on a template cube. The resulting cube will have an extra percentile coordinate compared with the template cube. The shape of the cube_data should be the shape of the desired output cube.

Parameters:

percentiles (Union[List[float], ndarray]) – Ensemble percentiles. There should be the same number of percentiles as the first dimension of cube_data.
template_cube (Cube) – Cube to copy metadata from.
cube_data (ndarray) – Data to insert into the template cube. The shape of the cube_data, excluding the dimension associated with the percentile coordinate, should be the same as the shape of template_cube. For example, template_cube shape is (3, 3, 3), whilst the cube_data is (10, 3, 3, 3), where there are 10 percentiles.
cube_unit (Union[Unit, str, None]) – The units of the data within the cube, if different from those of the template_cube.

Return type:

Cube

Returns:

Cube containing a percentile coordinate as the leading dimension (or scalar percentile coordinate if single-valued)

get_bounds_of_distribution(bounds_pairing_key, desired_units)[source]

Gets the bounds of the distribution and converts the units of the bounds_pairing to the desired_units.

This method gets the bounds values and units from the imported dictionaries: BOUNDS_FOR_ECDF and units_of_BOUNDS_FOR_ECDF. The units of the bounds are converted to be the desired units.

Parameters:

bounds_pairing_key (str) – Name of key to be used for the BOUNDS_FOR_ECDF dictionary, in order to get the desired bounds_pairing.
desired_units (Unit) – Units to which the bounds_pairing will be converted.

Return type:

ndarray

Returns:

Lower and upper bound to be used as the ends of the empirical cumulative distribution function, converted to have the desired units.

Raises:

KeyError – If the bounds_pairing_key is not within the BOUNDS_FOR_ECDF dictionary.

insert_lower_and_upper_endpoint_to_1d_array(array_1d, low_endpoint, high_endpoint)[source]

For a 1d array, add a lower and upper endpoint.

Parameters:

array_1d (ndarray) – 1d array of values
low_endpoint (float) – Number to use as the lower endpoint.
high_endpoint (float) – Number to use as the upper endpoint.

Return type:

ndarray

Returns:

1d array of values padded with the low_endpoint and high_endpoint.
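
The 1d case is simpler still. The sketch below shows one way to express it in NumPy; the name sketch_insert_endpoints is hypothetical and the real function may be implemented differently.

```python
import numpy as np


def sketch_insert_endpoints(array_1d, low_endpoint, high_endpoint):
    """Return array_1d with low_endpoint prepended and high_endpoint appended."""
    return np.concatenate(([low_endpoint], array_1d, [high_endpoint]))


padded_1d = sketch_insert_endpoints(np.array([25.0, 50.0, 75.0]), 0.0, 100.0)
# padded_1d holds 0, 25, 50, 75, 100
```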

interpolate_multiple_rows_same_x(*args)[source]

For each row i of fp, do the equivalent of np.interp(x, xp, fp[i, :]).

Calls a fast numba implementation where numba is available (see improver.ensemble_copula_coupling.numba_utilities.fast_interp_same_x) and calls the native Python implementation otherwise (see slow_interp_same_x()).

Parameters:

x – 1-D array
xp – 1-D array, sorted in non-decreasing order
fp – 2-D array with len(xp) columns

Returns:

2-D array with shape (len(fp), len(x)), with each row i equal to: np.interp(x, xp, fp[i, :])
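
The documented behaviour can be reproduced with a plain loop over np.interp. The sketch below is a reference illustration of the stated contract, not the IMPROVER code; the name sketch_interp_same_x is hypothetical.

```python
import numpy as np


def sketch_interp_same_x(x, xp, fp):
    """Interpolate every row of fp at the shared points x, given shared xp."""
    result = np.empty((fp.shape[0], len(x)), dtype=np.float64)
    for i in range(fp.shape[0]):
        # All rows share the same x and xp; only the fp row changes.
        result[i] = np.interp(x, xp, fp[i, :])
    return result


x = np.array([0.5, 1.5])
xp = np.array([0.0, 1.0, 2.0])
fp = np.array([[0.0, 10.0, 20.0], [0.0, 1.0, 4.0]])
rows = sketch_interp_same_x(x, xp, fp)
# rows has shape (len(fp), len(x)) == (2, 2)
```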

interpolate_multiple_rows_same_y(*args)[source]

For each row i of xp, do the equivalent of np.interp(x, xp[i], fp).

Calls a fast numba implementation where numba is available (see improver.ensemble_copula_coupling.numba_utilities.fast_interp_same_y) and calls the native Python implementation otherwise (see slow_interp_same_y()).

Parameters:

x – 1-d array
xp – n * m array, each row must be in non-decreasing order
fp – 1-d array with length m

Returns:

n * len(x) array where each row i is equal to np.interp(x, xp[i], fp)
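
Here the roles are swapped: fp is shared and each row of xp supplies its own x-coordinates. The loop below is a reference illustration of that contract, not the IMPROVER code; the name sketch_interp_same_y is hypothetical.

```python
import numpy as np


def sketch_interp_same_y(x, xp, fp):
    """Interpolate the shared values fp at x, once per row of xp."""
    result = np.empty((xp.shape[0], len(x)), dtype=np.float64)
    for i in range(xp.shape[0]):
        # Each row of xp is a separate, non-decreasing set of x-coordinates.
        result[i] = np.interp(x, xp[i, :], fp)
    return result


x = np.array([1.0, 3.0])
xp = np.array([[0.0, 2.0, 4.0], [0.0, 1.0, 6.0]])
fp = np.array([0.0, 50.0, 100.0])
rows = sketch_interp_same_y(x, xp, fp)
# rows has shape (n, len(x)) == (2, 2)
```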

restore_non_percentile_dimensions(array_to_reshape, original_cube, n_percentiles)[source]

Reshape a 2d array, so that it has the dimensions of the original cube, whilst ensuring that the probabilistic dimension is the first dimension.

Parameters:

array_to_reshape (ndarray) – The array that requires reshaping. This has dimensions “percentiles” by “points”, where “points” is a flattened array of all the other original dimensions that needs reshaping.
original_cube (Cube) – Cube slice whose shape, apart from the probabilistic dimension, is the shape that the array will be reshaped to. This would typically be expected to be either [time, y, x] or [y, x].
n_percentiles (int) – Length of the required probabilistic dimension (“percentiles”).

Return type:

ndarray

Returns:

The array after reshaping.

Raises:

ValueError – If the probabilistic dimension is not the first on the original_cube.
CoordinateNotFoundError – If the input_probabilistic_dimension_name is not a coordinate on the original_cube.
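
The core of the reshape can be sketched in NumPy alone. This illustration replaces the Cube argument with a plain shape tuple, so it omits the coordinate checks that raise the errors listed above. The name sketch_restore_dimensions and the original_shape argument are hypothetical.

```python
import numpy as np


def sketch_restore_dimensions(array_to_reshape, original_shape, n_percentiles):
    """Reshape (percentiles, points) to (percentiles, *original_shape).

    original_shape stands in for the non-probabilistic shape of the cube,
    for example (time, y, x) or (y, x).
    """
    return array_to_reshape.reshape((n_percentiles,) + tuple(original_shape))


original_shape = (2, 3, 4)  # for example time, y, x
n_percentiles = 5
flat = np.arange(n_percentiles * 24, dtype=np.float32).reshape(n_percentiles, -1)
restored = sketch_restore_dimensions(flat, original_shape, n_percentiles)
# restored has shape (5, 2, 3, 4), with percentiles as the leading dimension.
```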

slow_interp_same_x(x, xp, fp)[source]

For each row i of fp, calculate np.interp(x, xp, fp[i, :]).

Parameters:

x (ndarray) – 1-D array
xp (ndarray) – 1-D array, sorted in non-decreasing order
fp (ndarray) – 2-D array with len(xp) columns

Return type:

ndarray

Returns:

2-D array with shape (len(fp), len(x)), with each row i equal to: np.interp(x, xp, fp[i, :])

slow_interp_same_y(x, xp, fp)[source]

For each row i of xp, do the equivalent of np.interp(x, xp[i], fp).

Parameters:

x (ndarray) – 1-d array
xp (ndarray) – n * m array, each row must be in non-decreasing order
fp (ndarray) – 1-d array with length m

Return type:

ndarray

Returns:

n * len(x) array where each row i is equal to np.interp(x, xp[i], fp)