improver.threshold module

Module containing thresholding classes.

class LatitudeDependentThreshold(threshold_function, threshold_units=None, comparison_operator='>')[source]

Bases: Threshold

Apply a latitude-dependent threshold truth criterion to a cube.

Calculates the threshold truth values based on the threshold function provided. A cube will be returned with a new threshold dimension auxillary coordinate on the latitude axis.

Can operate on multiple time sequences within a cube.

__init__(threshold_function, threshold_units=None, comparison_operator='>')[source]

Sets up latitude-dependent threshold class

Parameters:

threshold_function (callable) – A function which takes a latitude value (in degrees) and returns the desired threshold.
threshold_units (Optional[str]) – Units of the threshold values. If not provided the units are assumed to be the same as those of the input cube.
comparison_operator (str) – Indicates the comparison_operator to use with the threshold. e.g. ‘ge’ or ‘>=’ to evaluate ‘data >= threshold’ or ‘<’ to evaluate ‘data < threshold’. When using fuzzy thresholds, there is no difference between < and <= or > and >=. Valid choices: > >= < <= gt ge lt le.

_abc_impl = <_abc_data object>

_add_latitude_threshold_coord(cube, threshold)[source]

Add a 1D threshold-type coordinate with correct name and units to a 2D slice containing thresholded data. Assumes latitude coordinate is always the penultimate one (which standardise will have enforced)

Parameters:

cube (Cube) – Cube containing thresholded data (1s and 0s)
threshold (ndarray) – Values at which the data has been thresholded (matches cube’s y-axis)

Return type:

None

process(input_cube)[source]

Convert each point to a truth value based on provided threshold, fuzzy bound, and vicinity values. If the plugin has a “threshold_units” member, this is used to convert a copy of the input_cube into the units specified.

Parameters:

input_cube (Cube) – Cube to threshold. Must have a latitude coordinate.

Return type:

Cube

Returns:

Cube after a threshold has been applied. The data within this cube will contain values between 0 and 1 to indicate whether a given threshold has been exceeded or not.

The cube meta-data will contain: * Input_cube name prepended with probability_of_X_above(or below)_threshold (where X is the diagnostic under consideration) * Threshold dimension coordinate with same units as input_cube * Threshold attribute (“greater_than”, “greater_than_or_equal_to”, “less_than”, or less_than_or_equal_to” depending on the operator) * Cube units set to (1).

Raises:

ValueError – if a np.nan value is detected within the input cube.

class Threshold(threshold_values=None, threshold_config=None, fuzzy_factor=None, threshold_units=None, comparison_operator='>', collapse_coord=None, vicinity=None, fill_masked=None)[source]

Bases: PostProcessingPlugin

Apply a threshold truth criterion to a cube.

Calculate the threshold truth values based on a linear membership function around the threshold values provided. A cube will be returned with a new threshold dimension coordinate.

Can operate on multiple time sequences within a cube.

__init__(threshold_values=None, threshold_config=None, fuzzy_factor=None, threshold_units=None, comparison_operator='>', collapse_coord=None, vicinity=None, fill_masked=None)[source]

Set up for processing an in-or-out of threshold field, including the generation of fuzzy_bounds which are required to threshold an input cube (through self.process(cube)). If fuzzy_factor is not None, fuzzy bounds are calculated using the threshold value in the units in which it is provided.

The usage of fuzzy_factor is exemplified as follows:

For a 6 mm/hr threshold with a 0.75 fuzzy factor, a range of 25% around this threshold (between (6*0.75=) 4.5 and (6*(2-0.75)=) 7.5) would be generated. The probabilities of exceeding values within this range are scaled linearly, so that 4.5 mm/hr yields a thresholded value of 0 and 7.5 mm/hr yields a thresholded value of 1. Therefore, in this case, the thresholded exceedance probabilities between 4.5 mm/hr and 7.5 mm/hr would follow the pattern:

Data value  | Probability
------------|-------------
5     |   0
0     |   0.167
5     |   0.333
0     |   0.5
5     |   0.667
0     |   0.833
5     |   1.0

Parameters:

threshold_values (Union[float, List[float], None]) – Threshold value or values (e.g. 270K, 300K) to use when calculating the probability of the input relative to the threshold value(s). The units of these values, e.g. K in the example can be defined using the threshold_units argument or are otherwise assumed to match the units of the diagnostic being thresholded. threshold_values and and threshold_config are mutually exclusive arguments, defining both will lead to an exception.
threshold_config (dict) – Threshold configuration containing threshold values and (optionally) fuzzy bounds. Best used in combination with ‘threshold_units’. It should contain a dictionary of strings that can be interpreted as floats with the structure: “THRESHOLD_VALUE”: [LOWER_BOUND, UPPER_BOUND] e.g: {“280.0”: [278.0, 282.0], “290.0”: [288.0, 292.0]}, or with structure “THRESHOLD_VALUE”: “None” (no fuzzy bounds). Repeated thresholds with different bounds are ignored; only the last duplicate will be used. threshold_values and and threshold_config are mutually exclusive arguments, defining both will lead to an exception.
fuzzy_factor (Optional[float]) – Optional: specifies lower bound for fuzzy membership value when multiplied by each threshold. Upper bound is equivalent linear distance above threshold.
threshold_units (Optional[str]) – Units of the threshold values. If not provided the units are assumed to be the same as those of the input cube.
comparison_operator (str) – Indicates the comparison_operator to use with the threshold. e.g. ‘ge’ or ‘>=’ to evaluate ‘data >= threshold’ or ‘<’ to evaluate ‘data < threshold’. When using fuzzy thresholds, there is no difference between < and <= or > and >=. Valid choices: > >= < <= gt ge lt le.
collapse_coord (Optional[str]) – A coordinate over which an average is calculated, collapsing this coordinate. The only supported options are “realization” or “percentile”. If “percentile” is requested, the percentile coordinate will be rebadged as a realization coordinate prior to collapse. The percentile coordinate needs to be evenly spaced around the 50th percentile to allow successful conversion from percentiles to realizations and subsequent collapsing over the realization coordinate.
fill_masked (Optional[float]) – If provided all masked points in cube will be replaced with the provided value.
vicinity (Union[float, List[float], None]) – A list of vicinity radii to use to calculate maximum in vicinity thresholded values. This must be done prior to realization collapse.

Raises:

ValueError – If threshold_config and threshold_values are both set
ValueError – If neither threshold_config or threshold_values are set
ValueError – If both fuzzy_factor and bounds within the threshold_config are set.
ValueError – If the fuzzy_factor is not strictly between 0 and 1.
ValueError – If using a fuzzy factor with a threshold of 0.0.
ValueError – Can only collapse over a realization coordinate or a percentile coordinate that has been rebadged as a realization coordinate.

_abc_impl = <_abc_data object>

_add_threshold_coord(cube, threshold)[source]

Add a scalar threshold-type coordinate with correct name and units to a 2D slice containing thresholded data.

The ‘threshold’ coordinate will be float64 to avoid rounding errors during possible unit conversion.

Parameters:

cube (Cube) – Cube containing thresholded data (1s and 0s)
threshold (float) – Value at which the data has been thresholded

Return type:

None

_calculate_truth_value(cube, threshold, bounds)[source]

Compares the diagnostic values to the threshold value, converting units and applying fuzzy bounds as required. Returns the truth value.

Parameters:

cube (Cube) – A cube containing the diagnostic values. The cube rather than array is passed in to allow for unit conversion.
threshold (float) – A single threshold value against which to compare the diagnostic values.
bounds (Tuple[float, float]) – The fuzzy bounds used for applying fuzzy thresholding.

Returns:

An array of truth values given by the comparison of the diagnostic data to the threshold value. This is returned at the default float precision.

Return type:

truth_value

_check_fuzzy_bounds()[source]

If fuzzy bounds have been set from the command line, check they are consistent with the required thresholds

Return type:: None

_create_threshold_cube(cube)[source]

Create a cube with suitable metadata and zeroed data array for storing the thresholded diagnostic data.

Parameters:: cube (Cube) – A template cube from which to take the data shape.
Returns:: A cube of a suitable form for storing and describing the thresholded data.
Return type:: thresholded_cube

_decode_comparison_operator_string()[source]

Sets self.comparison_operator based on self.comparison_operator_string. This is a dict containing the keys ‘function’ and ‘spp_string’. Raises errors if invalid options are found.

Raises:: ValueError – If self.comparison_operator_string does not match a defined method.
Return type:: None

_generate_fuzzy_bounds(fuzzy_factor_loc)[source]

Construct fuzzy bounds from a fuzzy factor. If the fuzzy factor is 1, the fuzzy bounds match the threshold values for basic thresholding.

Return type:: List[Tuple[float, float]]

static _set_thresholds(threshold_values, threshold_config)[source]

Interprets a threshold_config dictionary if provided, or ensures that a list of thresholds has suitable precision.

Parameters:

threshold_values (Union[float, List[float], None]) – A list of threshold values or a single threshold value.
threshold_config (Optional[dict]) – A dictionary defining threshold values and optionally upper and lower bounds for those values to apply fuzzy thresholding.

Returns:

thresholds:: A list of threshold values as float64 type.
fuzzy_bounds:: A list of tuples that define the upper and lower bounds associated with each threshold value, these also as float64 type. If these are not set, None is returned instead.

Return type:

A tuple containing

_update_metadata(cube)[source]

Rename the cube and add attributes to the threshold coordinate after merging

Parameters:: cube (Cube) – Cube containing thresholded data
Return type:: None

_vicinity_processing(thresholded_cube, truth_value, unmasked, landmask, grid_point_radii, index)[source]

Apply max in vicinity processing to the thresholded values. The resulting modified threshold values are changed in place in thresholded_cube.

Parameters:

thresholded_cube (Cube) – The cube into which the resulting values are added.
truth_value (ndarray) – An array of thresholded values prior to the application of vicinity processing.
unmasked (ndarray) – Array identifying unmasked data points that should be updated.
landmask (ndarray) – A binary grid of the same size as truth_value that differentiates between land and sea points to allow the different surface types to be processed independently.
grid_point_radii (List[int]) – The vicinity radius to apply expressed as a number of grid cells.
index (int) – Index corresponding to the threshold coordinate to identify which array we are summing the contribution into.

process(input_cube, landmask=None)[source]

Convert each point to a truth value based on provided threshold values. The truth value may or may not be fuzzy depending upon if fuzzy_bounds are supplied. If the plugin has a “threshold_units” member, this is used to convert both thresholds and fuzzy bounds into the units of the input cube.

Parameters:

input_cube (Cube) – Cube to threshold. The code is dimension-agnostic.
landmask (Optional[Cube]) – Cube containing a landmask. Used with vicinity processing only.

Return type:

Cube

Returns:

Cube after a threshold has been applied, possibly using fuzzy bounds, and / or with vicinity processing applied to return a maximum in vicinity value. The data within this cube will contain values between 0 and 1 to indicate whether a given threshold has been exceeded or not.

The cube meta-data will contain: * Input_cube name prepended with probability_of_X_above(or below)_threshold (where X is the diagnostic under consideration) * The cube name will be suffixed with _in_vicinity if vicinity processing has been applied. * Threshold dimension coordinate with same units as input_cube * Threshold attribute (“greater_than”, “greater_than_or_equal_to”, “less_than”, or less_than_or_equal_to” depending on the operator) * Cube units set to (1).

Raises:

ValueError – Cannot apply land-mask cube without in-vicinity processing.
ValueError – if a np.nan value is detected within the input cube.