improver.categorical.utilities module

This module defines the utilities required for the decision tree plugin.

_check_diagnostic_lists_consistency(query)[source]

Checks whether specific input lists have the same nested list structure, e.g. ['item'] != [['item']].

Parameters:

query (Dict[str, Any]) – Dictionary of categorical decision-making information.

Return type:

bool

_check_nested_list_consistency(query)[source]

Return True if all input lists have the same nested list structure, e.g. ['item'] != [['item']].

Parameters:

query (List[List[Any]]) – Nested lists to check for consistency.

Return type:

bool

Returns:

True if the diagnostic query lists have the same nested list structure, else False.
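The check described above can be illustrated with a minimal sketch. This is not the module's implementation; the helper names `nesting_shape` and `same_nested_structure` are hypothetical, and the sketch assumes that "same structure" means every list nests to the same depth in the same positions.

```python
from typing import Any, List


def nesting_shape(item: Any) -> Any:
    """Describe nesting: None for a non-list, else a list of the child shapes."""
    if isinstance(item, list):
        return [nesting_shape(child) for child in item]
    return None


def same_nested_structure(query: List[List[Any]]) -> bool:
    """Return True if every list in query has the same nesting shape."""
    shapes = [nesting_shape(item) for item in query]
    return all(shape == shapes[0] for shape in shapes)


print(same_nested_structure([["item"], ["other"]]))   # True
print(same_nested_structure([["item"], [["item"]]]))  # False
```

Comparing shapes rather than values means two lists with different contents but identical nesting are treated as consistent.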

categorical_attributes(decision_tree, name)[source]

Extracts leaf items from decision_tree and creates cube attributes from them.

Parameters:
  • decision_tree (Dict) – Decision tree definition, provided as a dictionary.

  • name (str) – Name of the categorical variable.

Return type:

Dict[str, Any]

Returns:

Attributes defining category meanings.

check_tree(decision_tree, target_period=None)[source]

Perform some checks to ensure the provided decision tree is valid.

Parameters:
  • decision_tree (Dict[str, Dict[str, Any]]) – Decision tree definition, provided as a dictionary.

  • target_period (Optional[int]) – The period in seconds that the categorical data being produced should represent. This should correspond with any period diagnostics, e.g. precipitation accumulation, being used as input. This is used to scale any threshold values that are defined with an associated period in the decision tree.

Return type:

str

Returns:

A string listing the problems found in the decision tree or, if none are found, the required input diagnostics.

Raises:

ValueError – If decision_tree is not a dictionary.

day_night_map(decision_tree)[source]

Returns a dict showing which night values are linked to which day values.

Parameters:

decision_tree (Dict[str, Dict[str, Union[str, List]]]) – Decision tree definition, provided as a dictionary.

Return type:

Dict

Returns:

Dict showing which night categories (values) are linked to which day categories (keys).
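As a rough illustration of the mapping this function returns, the sketch below builds a day-to-night dictionary from a toy tree. It is not the module's implementation. The node names and the `if_night` key, which links a day leaf to its night counterpart, are assumptions made for this example; only the `leaf` and `meta` keys are mentioned elsewhere in this reference.

```python
from typing import Any, Dict

# Hypothetical tree: node names and the "if_night" key are assumptions.
tree: Dict[str, Dict[str, Any]] = {
    "meta": {"name": "weather_code"},
    "Sunny_Day": {"leaf": 1, "if_night": "Clear_Night"},
    "Clear_Night": {"leaf": 0},
    "Cloudy": {"leaf": 7},
}


def sketch_day_night_map(decision_tree: Dict[str, Dict[str, Any]]) -> Dict[Any, Any]:
    """Map each day category (key) to its linked night category (value)."""
    return {
        node["leaf"]: decision_tree[node["if_night"]]["leaf"]
        for node in decision_tree.values()
        if "if_night" in node
    }


print(sketch_day_night_map(tree))  # {1: 0}
```

Categories with no night counterpart, such as the hypothetical `Cloudy` leaf, do not appear in the mapping.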

expand_nested_lists(query, key)[source]

Produce flat lists from list and nested lists.

Parameters:
  • query (Dict[str, Any]) – A single query from the decision tree.

  • key (str) – A string denoting the field to be taken from the dict.

Return type:

List[Any]

Returns:

A 1D list containing all the values for a given key.
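The flattening can be sketched as follows. This is an illustrative version, not the module's code: the function name `flatten_field` and the `diagnostic_fields` key used in the sample query are hypothetical, and only one level of nesting is assumed.

```python
from typing import Any, Dict, List


def flatten_field(query: Dict[str, Any], key: str) -> List[Any]:
    """Return a 1D list of the values stored under key, expanding nested lists."""
    items: List[Any] = []
    for entry in query[key]:
        if isinstance(entry, list):
            items.extend(entry)  # nested list: add its members individually
        else:
            items.append(entry)  # plain value: add as is
    return items


# Hypothetical query with a mix of plain values and nested lists.
query = {"diagnostic_fields": ["rain_rate", ["snow_rate", "-", "sleet_rate"]]}
print(flatten_field(query, "diagnostic_fields"))
# ['rain_rate', 'snow_rate', '-', 'sleet_rate']
```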

get_parameter_names(diagnostic_fields)[source]

For diagnostic fields that can contain operators and values, strips out just the parameter names.

Parameters:

diagnostic_fields (List[List[str]]) – Diagnostic fields, which may contain operators and values as well as parameter names.

Return type:

List[List[str]]

Returns:

The parameter names, with operators and values removed.
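A minimal sketch of the stripping step is shown below. It is not the module's implementation. The helper `looks_like_variable` follows the rule documented for is_variable in this reference (arithmetic operators and anything float() accepts are not variables). The function names and the sample field names are hypothetical.

```python
from typing import List

OPERATORS = ["+", "-", "*", "/"]


def looks_like_variable(thing: str) -> bool:
    """Return False for arithmetic operators and numeric strings, else True."""
    if thing in OPERATORS:
        return False
    try:
        float(thing)
    except ValueError:
        return True  # not a number, so treat it as a variable name
    return False


def parameter_names_only(diagnostic_fields: List[List[str]]) -> List[List[str]]:
    """Keep only the entries of each inner list that look like variable names."""
    return [
        [entry for entry in field if looks_like_variable(entry)]
        for field in diagnostic_fields
    ]


fields = [["rain_rate", "+", "snow_rate", "*", "0.5"], ["cloud_fraction"]]
print(parameter_names_only(fields))
# [['rain_rate', 'snow_rate'], ['cloud_fraction']]
```

The nested structure of the input is preserved, so each inner list of the result corresponds to one inner list of the input.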

interrogate_decision_tree(decision_tree)[source]

Obtain a list of necessary inputs from the decision tree as it is currently defined. Return a formatted string that contains the diagnostic names, the thresholds needed, and whether they are thresholded above or below these values. If the required diagnostic is deterministic, only the diagnostic name is output. This output is used with the --check-tree option in the CLI to inform the user of the necessary inputs for a provided decision tree.

Parameters:

decision_tree (Dict[str, Dict[str, Any]]) – The decision tree that is to be interrogated.

Return type:

str

Returns:

A formatted string describing the diagnostics required, including threshold details.

is_decision_node(key, query)[source]

Determine whether a given node is a decision node. The meta node has a key of “meta”, leaf nodes have a query key of “leaf”, and everything else is a decision node.

Parameters:
  • key (str) – Decision name (“meta” indicates a non-decision node)

  • query (Dict[str, Any]) – Dict where key “leaf” indicates a non-decision node

Return type:

bool

Returns:

True if query represents a decision node
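The rule stated above translates into a short sketch. The function name `sketch_is_decision_node` is hypothetical, and the node names and the `if_true` and `if_false` keys in the sample tree are assumptions made for illustration only.

```python
from typing import Any, Dict


def sketch_is_decision_node(key: str, query: Dict[str, Any]) -> bool:
    """Return True unless the node is the meta node or a leaf node."""
    return key != "meta" and "leaf" not in query


# Hypothetical tree contents; only "meta" and "leaf" come from the text above.
tree = {
    "meta": {"name": "weather_code"},
    "heavy_rain": {"if_true": "Rain", "if_false": "Dry"},
    "Rain": {"leaf": 15},
    "Dry": {"leaf": 1},
}

for key, query in tree.items():
    print(key, sketch_is_decision_node(key, query))
```

In this toy tree only `heavy_rain` is reported as a decision node.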

is_variable(thing)[source]

Identify whether given string is likely to be a variable name by identifying the exceptions.

Parameters:

thing (str) – The string to test.

Return type:

bool

Returns:

False if thing is one of ["+", "-", "*", "/"] or if float(thing) does not raise a ValueError, else True.

update_daynight(cube, day_night)[source]

Update category depending on whether it is day or night.

Parameters:
  • cube (Cube) – Cube containing only daytime categories.

  • day_night (Dict) – Dictionary of day codes (keys) and matching night codes (values)

Return type:

Cube

Returns:

Cube containing day and night categories

Raises:

CoordinateNotFoundError – If cube does not have a time coordinate.
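The substitution step can be sketched without Iris by working on plain lists. This is a simplified illustration, not the module's implementation: the function name `apply_day_night` is hypothetical, and the night mask is assumed to be supplied by the caller, whereas the real function works from the cube and its time coordinate.

```python
from typing import Dict, List


def apply_day_night(
    categories: List[int], is_night: List[bool], day_night: Dict[int, int]
) -> List[int]:
    """Swap day codes for their night codes wherever the night mask is True."""
    return [
        day_night.get(code, code) if night else code
        for code, night in zip(categories, is_night)
    ]


day_night = {1: 0}                 # day code 1 maps to night code 0
categories = [1, 7, 1]             # daytime-only categories
is_night = [False, True, True]     # assumed night mask, one flag per point
print(apply_day_night(categories, is_night, day_night))  # [1, 7, 0]
```

Codes without an entry in the mapping, such as 7 here, are left unchanged at night.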

update_tree_thresholds(tree, target_period=None)[source]

Replaces value / unit pairs from tree definition with an Iris AuxCoord that encodes the same information. Also scales any threshold values that have an associated period (e.g. accumulation in 3600 seconds) by a factor to reflect the target period (e.g. a 3-hour, 10800 second, weather symbol).

Parameters:
  • tree (Dict[str, Dict[str, Any]]) – Decision tree.

  • target_period (Optional[int]) – The period in seconds that the categories being produced should represent. This should correspond with any period diagnostics, e.g. precipitation accumulation, being used as input. This is used to scale any threshold values that are defined with an associated period in the decision tree.

Return type:

Dict[str, Dict[str, Any]]

Returns:

The tree now containing AuxCoords instead of value / unit pairs, with period diagnostic threshold values scaled appropriately.

Raises:

ValueError – If thresholds are defined with an associated period and no target_period is provided.
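The period scaling can be illustrated with simple arithmetic. The sketch below is not the module's implementation: it omits the creation of Iris AuxCoords entirely, the function name `scale_threshold` is hypothetical, and the scaling factor is assumed to be the ratio of the target period to the threshold's own period.

```python
from typing import Optional


def scale_threshold(
    value: float, period: Optional[int], target_period: Optional[int]
) -> float:
    """Scale a threshold defined over period seconds to target_period seconds."""
    if period is None:
        return value  # no associated period, so nothing to scale
    if target_period is None:
        raise ValueError(
            "The threshold has an associated period but no target_period was given."
        )
    # Assumed linear scaling by the ratio of the two periods.
    return value * target_period / period


# A threshold of 1.0 per 3600 seconds, scaled to a 10800 second (3-hour) period.
print(scale_threshold(1.0, 3600, 10800))  # 3.0
```

Under this assumption a threshold without an associated period passes through unchanged, which mirrors the documented behaviour that only period-based thresholds are scaled.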