improver.nbhood.nbhood module

Module containing neighbourhood processing utilities.

class BaseNeighbourhoodProcessing(radii, lead_times=None)[source]

Bases: PostProcessingPlugin

A base class used to set up neighbourhood radii for a given cube based on the forecast period of that cube if required.

__init__(radii, lead_times=None)[source]

Create a base neighbourhood processing plugin that processes radii-related arguments.

radii:
The radii in metres of the neighbourhood to apply. Rounded up to convert into integer number of grid points east and north, based on the characteristic spacing at the zero indices of the cube projection-x and y coords.

lead_times:
List of lead times or forecast periods, at which the radii within ‘radii’ are defined. The lead times are expected in hours.
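As a rough illustration of the rounding described for radii, the sketch below converts a radius in metres into a whole number of grid cells. The function name `radius_to_grid_cells` and the `grid_spacing_m` argument are hypothetical and not part of this module; the sketch only assumes a uniform grid spacing in metres.

```python
import math


def radius_to_grid_cells(radius_m, grid_spacing_m):
    """Hypothetical helper: round a radius in metres up to whole grid cells."""
    if grid_spacing_m <= 0:
        raise ValueError("grid_spacing_m must be positive")
    # Rounding up follows the description of the radii argument above.
    return int(math.ceil(radius_m / grid_spacing_m))


cells = radius_to_grid_cells(5000.0, 2000.0)  # 2.5 cells rounds up to 3
```

With a 2 km grid, a 5 km radius therefore covers three grid cells either side of the central point under this assumption.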

_find_radii(cube_lead_times=None)[source]

Revise radius or radii for found lead times. If cube_lead_times is None, no automatic adjustment of the radii will take place. Otherwise it will interpolate to find the radius at each cube lead time as required.

Parameters: cube_lead_times (Optional[ndarray]) – Array of forecast times found in cube.
Return type: Union[float, ndarray]
Returns: Required neighbourhood sizes.
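The interpolation described above can be sketched with `numpy.interp`. This is a minimal standalone illustration, not the plugin's code: the function name `find_radii_sketch` is hypothetical, and linear interpolation is an assumption, since the text only says that the radii are interpolated.

```python
import numpy as np


def find_radii_sketch(radii, lead_times=None, cube_lead_times=None):
    """Hypothetical stand-in for the radius lookup described above."""
    if cube_lead_times is None:
        # No lead times found in the cube: return the radii unchanged.
        return radii
    # Linearly interpolate the radii defined at lead_times (hours)
    # onto the lead times found in the cube.
    return np.interp(cube_lead_times, lead_times, radii)


radii_at_cube_times = find_radii_sketch(
    radii=[10000.0, 20000.0],
    lead_times=[0, 6],
    cube_lead_times=np.array([0, 3, 6]),
)
```

Note that `numpy.interp` holds the end values constant outside the range of `lead_times`; whether the plugin does the same is not stated here.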

process(cube)[source]

Supply a cube with a forecast period coordinate in order to set the correct radius for use in neighbourhood processing.

Also checks there are no unmasked NaNs in the input cube.

Parameters: cube (Cube) – Cube to which a neighbourhood processing method will be applied.
Return type: Cube
Returns: The unaltered input cube.
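A check for unmasked NaNs of the kind mentioned above might look like the following sketch. It works on a plain or masked NumPy array rather than a cube, and the function name `has_unmasked_nans` is hypothetical.

```python
import numpy as np


def has_unmasked_nans(data):
    """Return True if any NaN is present at an unmasked point."""
    # compressed() drops masked points; for a plain ndarray nothing is
    # masked, so every value is checked.
    unmasked_values = np.ma.asarray(data).compressed()
    return bool(np.isnan(unmasked_values).any())


masked = np.ma.masked_array([1.0, np.nan], mask=[False, True])
plain = np.array([1.0, np.nan])
```

Here `has_unmasked_nans(masked)` is False because the only NaN is hidden by the mask, whereas `has_unmasked_nans(plain)` is True.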

class GeneratePercentilesFromANeighbourhood(radii, lead_times=None, percentiles=(0, 5, 10, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 95, 100))[source]

Bases: BaseNeighbourhoodProcessing

Class for generating percentiles from a circular neighbourhood.

__init__(radii, lead_times=None, percentiles=(0, 5, 10, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 95, 100))[source]

Create a neighbourhood processing subclass that generates percentiles from a 2D circular neighbourhood. A maximum kernel radius of 500 grid cells is imposed in order to avoid computational inefficiency and possible memory errors.

Parameters:

radii (Union[float, List[float]]) – The radii in metres of the neighbourhood to apply. Rounded up to convert into integer number of grid points east and north, based on the characteristic spacing at the zero indices of the cube projection-x and y coords.
lead_times (Optional[List]) – List of lead times or forecast periods, at which the radii within ‘radii’ are defined. The lead times are expected in hours.
percentiles (List) – Percentile values at which to calculate; if not provided uses DEFAULT_PERCENTILES.

make_percentile_cube(cube)[source]

Returns a cube with the same metadata as the sample cube but with an added percentile dimension.

Parameters: cube (Cube) – Cube to copy metadata from.
Return type: Cube
Returns: Cube like the input but with an added percentile coordinate. Each slice along this coordinate is identical.

pad_and_unpad_cube(slice_2d, kernel)[source]

Method to pad and unpad a two dimensional cube. The input array is padded and percentiles are calculated using a neighbourhood around each point. The resulting percentile data are unpadded and put into a cube.

Parameters:

slice_2d (Cube) – 2d cube to be padded with a halo.
kernel (ndarray) – Kernel used to specify the neighbourhood to consider when calculating the percentiles within a neighbourhood.

Return type:

Cube

Returns:

A cube containing percentiles generated from a neighbourhood.

Examples

Take the input slice_2d cube with the following data, where 1 is an occurrence and 0 is a non-occurrence:
```
[[1., 1., 1.],
 [1., 0., 1.],
 [1., 1., 1.]]
```
Define a kernel. This kernel is effectively placed over each point within the input data. Note that the input data is padded prior to placing the kernel over each point, so that the kernel does not exceed the bounds of the padded data:
```
[[ 0.,  0.,  1.,  0.,  0.],
 [ 0.,  1.,  1.,  1.,  0.],
 [ 1.,  1.,  1.,  1.,  1.],
 [ 0.,  1.,  1.,  1.,  0.],
 [ 0.,  0.,  1.,  0.,  0.]]
```

Pad the input data. The extent of the padding is given by the shape of the kernel. The number of values included within the calculation of the mean is determined by the size of the kernel:

[[ 0.75,  0.75,  1.  ,  0.5 ,  1.  ,  0.75,  0.75],
 [ 0.75,  0.75,  1.  ,  0.5 ,  1.  ,  0.75,  0.75],
 [ 1.  ,  1.  ,  1.  ,  1.  ,  1.  ,  1.  ,  1.  ],
 [ 0.5 ,  0.5 ,  1.  ,  0.  ,  1.  ,  0.5 ,  0.5 ],
 [ 1.  ,  1.  ,  1.  ,  1.  ,  1.  ,  1.  ,  1.  ],
 [ 0.75,  0.75,  1.  ,  0.5 ,  1.  ,  0.75,  0.75],
 [ 0.75,  0.75,  1.  ,  0.5 ,  1.  ,  0.75,  0.75]]

Calculate the values at the percentiles: [10]. For example, take the point in the upper left corner of the original input data:

[[->1.<-, 1., 1.],
 [  1.,   0., 1.],
 [  1.,   1., 1.]]

When the kernel is placed over this point within the padded data, then the following points are included:

[[   0.75,    0.75,  ->1.<-,  0.5 ,  1.  ,  0.75,  0.75],
 [   0.75,  ->0.75,    1.  ,  0.5<-, 1.  ,  0.75,  0.75],
 [ ->1.  ,    1.  ,    1.  ,  1.  ,  1.<-,  1.  ,  1.  ],
 [   0.5 ,  ->0.5 ,    1.  ,  0.<-,  1.  ,  0.5 ,  0.5 ],
 [   1.  ,    1.  ,  ->1.<-,  1.  ,  1.  ,  1.  ,  1.  ],
 [   0.75,    0.75,    1.  ,  0.5 ,  1.  ,  0.75,  0.75],
 [   0.75,    0.75,    1.  ,  0.5 ,  1.  ,  0.75,  0.75]]

This gives:

[0, 0.5, 0.5, 0.75, 1., 1., 1., 1., 1., 1., 1., 1., 1.]

As there are 13 points within the kernel, this gives the following relationship between percentiles and values.

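| Values | Percentile |
|--------|------------|
| 0      | 0          |
| 0.5    | 8.33       |
| 0.5    | 16.67      |
| 0.75   | 25.0       |
| 1.     | 33.33      |
| 1.     | 41.67      |
| 1.     | 50.0       |
| 1.     | 58.33      |
| 1.     | 66.67      |
| 1.     | 75.0       |
| 1.     | 83.33      |
| 1.     | 91.67      |
| 1.     | 100.0      |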

Therefore, at the 10th percentile, the value returned for the point in the upper left corner of the original input data is 0.5. When this process is applied to every point within the original input data, the result is:

[[[ 0.75,  0.75,  0.5 ,  0.5 ,  0.5 ,  0.75,  0.75],
  [ 0.75,  0.55,  0.55,  0.5 ,  0.55,  0.55,  0.55],
  [ 0.55,  0.55,  0.5 ,  0.5 ,  0.5 ,  0.5 ,  0.5 ],
  [ 0.5 ,  0.5 ,  0.5 ,  0.5 ,  0.5 ,  0.5 ,  0.5 ],
  [ 0.5 ,  0.5 ,  0.5 ,  0.5 ,  0.5 ,  0.55,  0.55],
  [ 0.55,  0.55,  0.55,  0.5 ,  0.55,  0.55,  0.75],
  [ 0.75,  0.75,  0.5 ,  0.5 ,  0.5 ,  0.75,  0.75]]]

The padding is then removed to give:

[[[ 0.5,  0.5,  0.5],
  [ 0.5,  0.5,  0.5],
  [ 0.5,  0.5,  0.5]]]
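The worked example above can be reproduced with plain NumPy, as in the sketch below. It is an illustration only, not the method's implementation: the helper `nbhood_percentile` is hypothetical, and `stat_length=ranges` is an assumption chosen because it reproduces the padded values shown above. For brevity the sketch evaluates the kernel only at positions that correspond to original grid points, so it arrives at the unpadded result directly.

```python
import numpy as np

data = np.array([[1.0, 1.0, 1.0],
                 [1.0, 0.0, 1.0],
                 [1.0, 1.0, 1.0]])

kernel = np.array([[0.0, 0.0, 1.0, 0.0, 0.0],
                   [0.0, 1.0, 1.0, 1.0, 0.0],
                   [1.0, 1.0, 1.0, 1.0, 1.0],
                   [0.0, 1.0, 1.0, 1.0, 0.0],
                   [0.0, 0.0, 1.0, 0.0, 0.0]])

# Half-width of the kernel in grid cells (2 for a 5 x 5 kernel).
ranges = kernel.shape[0] // 2

# Pad each edge with the mean of the nearest `ranges` values.
padded = np.pad(data, ranges, mode="mean", stat_length=ranges)


def nbhood_percentile(padded_data, kernel, percentile):
    """Percentile of the kernel-selected values around each original point."""
    size = kernel.shape[0]
    n_rows = padded_data.shape[0] - size + 1
    n_cols = padded_data.shape[1] - size + 1
    result = np.empty((n_rows, n_cols))
    for i in range(n_rows):
        for j in range(n_cols):
            window = padded_data[i:i + size, j:j + size]
            # Keep only the 13 points that fall inside the circular kernel.
            result[i, j] = np.percentile(window[kernel > 0], percentile)
    return result


tenth_percentile = nbhood_percentile(padded, kernel, 10)
```

With these inputs every element of `tenth_percentile` is 0.5, matching the unpadded result above.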

process(cube)[source]

Method to apply a circular kernel to the data within the input cube in order to derive percentiles over the kernel.

Parameters: cube (Cube) – Cube containing array to apply processing to. Usually ensemble realizations.
Return type: Cube
Returns: Cube containing the percentile fields. Has percentile as an added dimension.

class NeighbourhoodProcessing(neighbourhood_method, radii, lead_times=None, weighted_mode=False, sum_only=False, re_mask=True)[source]

Bases: BaseNeighbourhoodProcessing

Class for applying neighbourhood processing to produce a smoothed field within the chosen neighbourhood.

__init__(neighbourhood_method, radii, lead_times=None, weighted_mode=False, sum_only=False, re_mask=True)[source]

Initialise class.

Parameters:

neighbourhood_method (str) – Name of the neighbourhood method to use. Options: ‘circular’, ‘square’.
radii (Union[float, List[float]]) – The radii in metres of the neighbourhood to apply. Rounded up to convert into integer number of grid points east and north, based on the characteristic spacing at the zero indices of the cube projection-x and y coords.
lead_times (Optional[List]) – List of lead times or forecast periods, at which the radii within ‘radii’ are defined. The lead times are expected in hours.
weighted_mode (bool) – If True, use a circle for neighbourhood kernel with weighting decreasing with radius. If False, use a circle with constant weighting.
sum_only (bool) – If True, return the neighbourhood sum instead of the mean.
re_mask (bool) – If re_mask is True, the original un-neighbourhood processed mask is applied to mask out the neighbourhood processed cube. If re_mask is False, the original un-neighbourhood processed mask is not applied. Therefore, the neighbourhood processing may result in values being present in areas that were originally masked.

Raises:

ValueError – If the neighbourhood_method is not either “square” or “circular”.
ValueError – If the weighted_mode is used with a neighbourhood_method that is not “circular”.

_calculate_neighbourhood(data, mask=None)[source]

Apply neighbourhood processing. Ensures that masked data does not contribute to the neighbourhood result. Masked data is either data that is masked in the input data array or that corresponds to zeros in the input mask.

Parameters:

data (ndarray) – Input data array.
mask (Optional[ndarray]) – Mask of valid input data elements.

Return type:

Union[ndarray, MaskedArray]

Returns:

Array containing the smoothed field after the neighbourhood method has been applied.

_do_nbhood_sum(data, max_extreme=None)[source]

Calculate the sum-in-area from an array. As this can be expensive, the method first checks for the extreme cases where the data are:

- All zeros (the result will be all zeros too).
- All ones (the result will be max_extreme, if supplied).
- Bounded by outer rows or columns that are completely zero or completely one. These rows and columns are trimmed before calculating the area sum, and their contents will be as for the appropriate all-zeros or all-ones case above.

Parameters:

data (ndarray) – Input data array where any masking has already been replaced with zeroes.
max_extreme (Optional[ndarray]) – Used as the result for any large areas of data that are all ones, allowing an optimisation to be used. If not supplied, the optimisation will only be used for large areas of zeroes, where a return of zero can be safely predicted.

Return type:

ndarray

Returns:

Array containing the sum of data within the usable neighbourhood of each point.
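The two whole-array shortcuts described above can be sketched as follows. This is a simplified illustration under stated assumptions: the names `nbhood_sum_with_shortcuts` and `full_sum` are hypothetical, `full_sum` stands in for whatever expensive sum-in-area routine is used, and the trimming of uniform outer rows and columns is deliberately left out.

```python
import numpy as np


def nbhood_sum_with_shortcuts(data, full_sum, max_extreme=None):
    """Skip the expensive sum when the answer can be predicted."""
    if not data.any():
        # All zeros: the neighbourhood sum is zero everywhere.
        return np.zeros(data.shape, dtype=float)
    if max_extreme is not None and np.all(data == 1):
        # All ones: the caller has supplied the expected result.
        return np.array(max_extreme, dtype=float)
    # General case: fall back to the full calculation.
    return full_sum(data)
```

A caller would pass its own summing function as `full_sum`; it is only invoked when neither shortcut applies.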

process(cube, mask_cube=None)[source]

Call the methods required to apply neighbourhood processing to a cube.

Applies neighbourhood processing to each 2D x-y-slice of the input cube.

If the input cube is masked the neighbourhood sum is calculated from the total of the unmasked data in the neighbourhood around each grid point. The neighbourhood mean is then calculated by dividing the neighbourhood sum at each grid point by the total number of valid grid points that contributed to that sum. If a mask_cube is provided then this is used to mask each x-y-slice prior to the neighbourhood sum or mean being calculated.

Parameters:

cube (Cube) – Cube containing the array to which the neighbourhood processing will be applied. Usually thresholded data.
mask_cube (Optional[Cube]) – Cube containing the array to be used as a mask. Zero values in this array are taken as points to be masked.

Return type:

Cube

Returns:

Cube containing the smoothed field after the neighbourhood method has been applied.
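The masked neighbourhood mean described above (the sum of valid data divided by the number of valid points) can be sketched with NumPy for a square neighbourhood. The function names are hypothetical, the sketch operates on arrays rather than cubes, and it assumes that points outside the domain are treated as invalid and that points with no valid neighbours are returned as NaN.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def square_nbhood_sum(data, cells):
    """Sum over a square window extending `cells` points either side."""
    # Zero padding means points outside the domain add nothing to the sum.
    padded = np.pad(data, cells, mode="constant", constant_values=0.0)
    width = 2 * cells + 1
    windows = sliding_window_view(padded, (width, width))
    return windows.sum(axis=(-2, -1))


def masked_nbhood_mean(data, mask, cells):
    """Neighbourhood mean using only points where mask is non-zero."""
    valid = np.asarray(mask) != 0
    total = square_nbhood_sum(np.where(valid, data, 0.0), cells)
    count = square_nbhood_sum(valid.astype(float), cells)
    # Avoid dividing by zero where no valid points contributed.
    safe_count = np.where(count > 0, count, 1.0)
    return np.where(count > 0, total / safe_count, np.nan)


data = np.array([[1.0, 0.0, 1.0],
                 [0.0, 1.0, 0.0],
                 [1.0, 0.0, 1.0]])
mask = np.ones((3, 3))
mask[1, 1] = 0  # zero marks a point to be masked
smoothed = masked_nbhood_mean(data, mask, cells=1)
```

At the central point the eight valid neighbours sum to 4, giving a mean of 0.5; at a corner only three valid points contribute, giving 1/3.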

check_radius_against_distance(cube, radius)[source]

Check that the required radius is not greater than half the size of the domain.

Parameters:

cube (Cube) – The cube to check.
radius (float) – The radius, which cannot be more than half of the size of the domain.

Return type:

None
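A check of this kind might be sketched as below. The function name and arguments are hypothetical, it takes coordinate point arrays rather than a cube, and it assumes that the size of the domain is measured along its diagonal, which is not stated here.

```python
import numpy as np


def check_radius_against_domain(x_points, y_points, radius):
    """Raise ValueError if radius exceeds half the domain diagonal."""
    x_extent = float(np.max(x_points) - np.min(x_points))
    y_extent = float(np.max(y_points) - np.min(y_points))
    # Assumption: domain size is the length of the diagonal, in the same
    # units as the radius (metres).
    max_allowed = 0.5 * float(np.hypot(x_extent, y_extent))
    if radius > max_allowed:
        raise ValueError(
            f"Radius of {radius}m exceeds the maximum allowed "
            f"distance of {max_allowed}m"
        )


x = np.arange(0.0, 10000.0, 2000.0)
y = np.arange(0.0, 10000.0, 2000.0)
check_radius_against_domain(x, y, 5000.0)  # within the limit, returns None
```

For this 8 km by 8 km domain the limit under the diagonal assumption is roughly 5.66 km, so a 6 km radius would raise a ValueError.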

circular_kernel(ranges, weighted_mode)[source]

Method to create a circular kernel.

Parameters:

ranges (int) – Number of grid cells in the x and y direction used to create the kernel.
weighted_mode (bool) – If True, use a circle for neighbourhood kernel with weighting decreasing with radius. If False, use a circle with constant weighting.

Return type:

ndarray

Returns:

Array containing the circular smoothing kernel. The kernel is two-dimensional, covering ranges grid cells either side of the central point in the x and y directions.
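One way to build such a kernel is sketched below. The function name `circular_kernel_sketch` is hypothetical. The unweighted branch reproduces the 5 x 5 kernel used in the percentile example earlier on this page; the weighted branch uses a weight that falls off with squared distance, which is only one plausible reading of "weighting decreasing with radius" and should be treated as an assumption.

```python
import numpy as np


def circular_kernel_sketch(ranges, weighted_mode):
    """Build a (2 * ranges + 1)-square array holding a circular kernel."""
    if ranges < 1:
        raise ValueError("ranges must be at least 1")
    offsets = np.arange(-ranges, ranges + 1)
    y, x = np.meshgrid(offsets, offsets, indexing="ij")
    dist_sq = (x ** 2 + y ** 2).astype(float)
    radius_sq = float(ranges * ranges)
    if weighted_mode:
        # Assumed weighting: 1 at the centre, falling to 0 at the radius.
        return np.clip((radius_sq - dist_sq) / radius_sq, 0.0, None)
    # Constant weighting: 1 inside or on the circle, 0 outside.
    return (dist_sq <= radius_sq).astype(float)


flat_kernel = circular_kernel_sketch(2, weighted_mode=False)
weighted_kernel = circular_kernel_sketch(2, weighted_mode=True)
```

For `ranges=2` the unweighted kernel selects 13 points, the same count used in the percentile example.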