improver.spotdata.neighbour_finding module

Neighbour finding for the Improver site specific process chain.

class NeighbourSelection(land_constraint=False, minimum_dz=False, search_radius=10000.0, site_coordinate_system=<Projected CRS: +proj=eqc +ellps=WGS84 +a=6378137.0 +lon_0=0.0 +to ...>, site_x_coordinate='longitude', site_y_coordinate='latitude', node_limit=36, unique_site_id_key=None)[source]

Bases: BasePlugin

For the selection of a grid point near an arbitrary coordinate, where the selection may be the nearest point, or a point that fulfils other imposed constraints.

Constraints available for determining the neighbours are:

  1. land_constraint, which requires the selected point to be on land.

  2. minimum_dz, which minimises the vertical displacement between the given coordinate (when an altitude is provided) and the grid point, whose altitude is taken from the relevant model or high-resolution orography. Spot coordinates provided without an altitude are given the altitude of the nearest grid point, taken from the orography cube.

  3. A combination of the above, where the land constraint takes priority: of the available land points, the one with the minimal vertical displacement is chosen.

__init__(land_constraint=False, minimum_dz=False, search_radius=10000.0, site_coordinate_system=<Projected CRS: +proj=eqc +ellps=WGS84 +a=6378137.0 +lon_0=0.0 +to ...>, site_x_coordinate='longitude', site_y_coordinate='latitude', node_limit=36, unique_site_id_key=None)[source]
Parameters:
  • land_constraint (bool) – If True the selected neighbouring grid point must be on land, where this is determined using a land_mask.

  • minimum_dz (bool) – If True the selected neighbouring grid point must be chosen to minimise the vertical displacement compared to the site altitude.

  • search_radius (float) – The radius in metres from a spot site within which to search for a grid point neighbour.

  • site_coordinate_system (CRS) – The coordinate system of the site list coordinates that will be provided. Defaults to a latitude/longitude grid (a PlateCarree projection).

  • site_x_coordinate (str) – The key that identifies site x coordinates in the provided site dictionary. Defaults to longitude.

  • site_y_coordinate (str) – The key that identifies site y coordinates in the provided site dictionary. Defaults to latitude.

  • node_limit (int) – The upper limit for the number of nearest neighbours to return when querying the tree for a selection of neighbours from which one matching the minimum_dz constraint will be picked.

  • unique_site_id_key (Optional[str]) – Key in the provided site list that corresponds to a unique numerical ID for every site (up to 8 digits). If this optional key is provided, such an identifier must exist for every site. This key will also be used to name the resulting unique ID coordinate on the constructed cube. Values in this coordinate will be recorded as strings, with all numbers zero-padded to 8 digits, e.g. “00012345”.

_transform_sites_coordinate_system(x_points, y_points, target_crs)[source]

Function to convert coordinate pairs that specify spot sites into the coordinate system of the model from which data will be extracted. Note that the cartopy functionality also returns a z-coordinate, which is not wanted in this case, so only the first two columns are returned.

Parameters:
  • x_points (ndarray) – An array of x coordinates to be transformed in conjunction with the corresponding y coordinates.

  • y_points (ndarray) – An array of y coordinates to be transformed in conjunction with the corresponding x coordinates.

  • target_crs (CRS) – Coordinate system to which the site coordinates should be transformed. This should be the coordinate system of the model from which data will be spot extracted.

Return type:

ndarray

Returns:

An array containing the x and y coordinates of the spot sites in the target coordinate system, shaped as (n_sites, 2). The z coordinate column is excluded from the return.

build_KDTree(land_mask)[source]

Build a KDTree for extracting the nearest point or points to a site. The tree can be built with a constrained set of grid points, e.g. only land points, if required.

Parameters:

land_mask (Cube) – A land mask cube for the model/grid from which grid point neighbours are being selected.

Return type:

Tuple[cKDTree, ndarray]

Returns:

  • A KDTree containing the required nodes, built using the scipy cKDTree method.

  • An array of shape (n_nodes, 2) that contains the x and y indices that correspond to the selected node, e.g. node=100 -> x_coord_index=10, y_coord_index=300, index_nodes[100] = [10, 300]
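
The link between tree nodes and grid indices can be illustrated with a minimal sketch. It assumes a land mask held as a nested list indexed [x][y], and the helper name build_node_index is hypothetical. It shows only the index bookkeeping for a full or land-only tree and does not build a scipy cKDTree.

```python
from typing import List, Tuple


def build_node_index(
    land_mask: List[List[int]], land_only: bool
) -> List[Tuple[int, int]]:
    """Return the (x_index, y_index) pair for every node kept in the tree.

    Hypothetical helper: land_mask[x][y] is 1 for land and 0 for sea.
    """
    index_nodes = []
    for x_index, column in enumerate(land_mask):
        for y_index, is_land in enumerate(column):
            if land_only and not is_land:
                continue  # sea points are left out of a land-only tree
            index_nodes.append((x_index, y_index))
    return index_nodes


mask = [[1, 0, 1],
        [0, 0, 1]]
all_nodes = build_node_index(mask, land_only=False)
land_nodes = build_node_index(mask, land_only=True)
```

With a land-only tree, node numbers no longer match a flattened grid position, which is why the index array is returned alongside the tree.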

check_sites_are_within_domain(sites, site_coords, site_x_coords, site_y_coords, cube)[source]

A function to remove sites from consideration if they fall outside the domain of the provided model cube. A warning is raised and the details of each rejected site are printed.

Parameters:
  • sites (List[Dict[str, Any]]) –

    A list of dictionaries defining the spot sites for which neighbours are to be found. e.g.:

    [{'altitude': 11.0, 'latitude': 57.867000579833984,
      'longitude': -5.632999897003174, 'wmo_id': 3034}]

  • site_coords (ndarray) – An array of shape (n_sites, 2) that contains the spot site coordinates in the coordinate system of the model cube.

  • site_x_coords (ndarray) – The x coordinates of the spot sites in their original coordinate system, from which invalid sites must be removed.

  • site_y_coords (ndarray) – The y coordinates of the spot sites in their original coordinate system, from which invalid sites must be removed.

  • cube (Cube) – A cube that is representative of the model/grid from which spot data will be extracted.

Return type:

Tuple[ndarray, ndarray, ndarray, ndarray]

Returns:

  • The sites modified to filter out the sites falling outside the grid domain of the cube.

  • The site_coords modified to filter out the sites falling outside the grid domain of the cube.

  • The site_x_coords modified to filter out the sites falling outside the grid domain of the cube.

  • The site_y_coords modified to filter out the sites falling outside the grid domain of the cube.
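
The filtering step can be sketched as follows. This is a simplified illustration, not the plugin's code: it assumes a rectangular domain described by hypothetical x_limits and y_limits tuples rather than a cube, uses made-up site positions, and returns the rejected sites instead of raising a warning.

```python
from typing import Any, Dict, List, Sequence, Tuple


def filter_sites_to_domain(
    sites: List[Dict[str, Any]],
    site_coords: Sequence[Tuple[float, float]],
    x_limits: Tuple[float, float],
    y_limits: Tuple[float, float],
):
    """Split sites into those inside and outside a rectangular domain."""
    kept_sites, kept_coords, rejected = [], [], []
    for site, (x, y) in zip(sites, site_coords):
        inside = (x_limits[0] <= x <= x_limits[1]
                  and y_limits[0] <= y <= y_limits[1])
        if inside:
            kept_sites.append(site)
            kept_coords.append((x, y))
        else:
            rejected.append(site)
    return kept_sites, kept_coords, rejected


sites = [
    {"altitude": 11.0, "latitude": 57.867, "longitude": -5.633, "wmo_id": 3034},
    {"altitude": 5.0, "latitude": 10.0, "longitude": 100.0, "wmo_id": 99999},
]
# Hypothetical site positions already transformed to the model grid (metres).
site_coords = [(-150000.0, 900000.0), (9.0e6, -4.0e6)]
kept_sites, kept_coords, rejected = filter_sites_to_domain(
    sites, site_coords, x_limits=(-1.0e6, 1.0e6), y_limits=(-1.0e6, 1.5e6)
)
```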

static geocentric_cartesian(cube, x_coords, y_coords)[source]

A function to convert a global (lat/lon) coordinate system into a geocentric (3D trigonometric) system. This function ignores orographic height differences between coordinates, giving a 2D projected neighbourhood akin to selecting a neighbourhood of grid points about a point without considering their vertical displacement.

Parameters:
  • cube (Cube) – A cube from which is taken the globe for which the geocentric coordinates are being calculated.

  • x_coords (ndarray) – An array of x coordinates that will represent one axis of the mesh of coordinates to be transformed.

  • y_coords (ndarray) – An array of y coordinates that will represent one axis of the mesh of coordinates to be transformed.

Return type:

ndarray

Returns:

An array of all the xyz combinations that describe the nodes of the grid, now in 3D geocentric cartesian coordinates. The shape of the array is (n_nodes, 3), order x[:, 0], y[:, 1], z[:, 2].
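
As a rough illustration of the conversion, the sketch below maps a longitude/latitude mesh onto a sphere. It rests on assumptions that are not taken from the plugin: a perfectly spherical Earth with an illustrative default radius, a hypothetical function name, and a node ordering that loops over latitude first. The plugin itself takes the globe from the cube.

```python
import math
from typing import List, Sequence, Tuple


def spherical_to_cartesian(
    lons_deg: Sequence[float],
    lats_deg: Sequence[float],
    radius: float = 6371229.0,  # assumed spherical radius in metres
) -> List[Tuple[float, float, float]]:
    """Return (x, y, z) for every lon/lat combination on a sphere."""
    nodes = []
    for lat in lats_deg:
        for lon in lons_deg:
            lam = math.radians(lon)
            phi = math.radians(lat)
            nodes.append((
                radius * math.cos(phi) * math.cos(lam),
                radius * math.cos(phi) * math.sin(lam),
                radius * math.sin(phi),
            ))
    return nodes


# Unit sphere: two longitudes on the equator, then the same at the pole.
unit_nodes = spherical_to_cartesian([0.0, 90.0], [0.0, 90.0], radius=1.0)
```

Straight-line distances between such points avoid the distortion that a plain lat/lon distance suffers near the poles and across the dateline.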

static get_nearest_indices(site_coords, cube)[source]

Uses the iris cube method nearest_neighbour_index to find the nearest grid points to a site.

Parameters:
  • site_coords (ndarray) – An array of shape (n_sites, 2) that contains the x and y coordinates of the sites.

  • cube (Cube) – Cube containing a representative grid.

Return type:

ndarray

Returns:

An array of shape (n_sites, 2) that contains the x and y indices of the nearest grid points to the sites.
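
The idea of an independent nearest-index lookup along each axis can be sketched in plain Python. This is a stand-in for the iris lookup, assuming one-dimensional x and y coordinate points; the function names and grid values are hypothetical.

```python
from typing import List, Sequence, Tuple


def nearest_index(points: Sequence[float], value: float) -> int:
    """Index of the coordinate point closest to value (first one on ties)."""
    return min(range(len(points)), key=lambda i: abs(points[i] - value))


def nearest_grid_indices(
    site_coords: Sequence[Tuple[float, float]],
    x_points: Sequence[float],
    y_points: Sequence[float],
) -> List[Tuple[int, int]]:
    """Return the (x_index, y_index) of the nearest grid point per site."""
    return [
        (nearest_index(x_points, x), nearest_index(y_points, y))
        for x, y in site_coords
    ]


x_points = [0.0, 1000.0, 2000.0]
y_points = [0.0, 500.0, 1000.0, 1500.0]
nearest = nearest_grid_indices([(900.0, 1400.0), (100.0, 100.0)],
                               x_points, y_points)
```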

neighbour_finding_method_name()[source]

Create a name to describe the neighbour method based on the constraints provided.

Return type:

str

Returns:

A string that describes the neighbour finding method employed. This is essentially a concatenation of the options.
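
One way such a name could be assembled is sketched below. The exact strings the plugin produces are not stated here, so the labels and the function name in this example are purely illustrative assumptions.

```python
def describe_method(land_constraint: bool, minimum_dz: bool) -> str:
    """Join a base label with one suffix per active constraint.

    The labels used here are hypothetical, chosen only for illustration.
    """
    parts = ["nearest"]
    if land_constraint:
        parts.append("land")
    if minimum_dz:
        parts.append("minimum_dz")
    return "_".join(parts)


plain_name = describe_method(False, False)
combined_name = describe_method(True, True)
```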

process(sites, orography, land_mask)[source]

Using the constraints provided, find the nearest grid point neighbours to the given spot sites for the model/grid given by the input cubes. Returned is a cube that contains the defining characteristics of the spot sites (e.g. x coordinate, y coordinate, altitude) and the indices of the selected grid point neighbour.

Parameters:
  • sites (List[Dict[str, Any]]) –

    A list of dictionaries defining the spot sites for which neighbours are to be found. e.g.:

    [{'altitude': 11.0, 'latitude': 57.867000579833984,
      'longitude': -5.632999897003174, 'wmo_id': 3034}]

  • orography (Cube) – A cube of orography, used to obtain the grid point altitudes.

  • land_mask (Cube) – A land mask cube for the model/grid from which grid point neighbours are being selected, with land points set to one and sea points set to zero.

Return type:

Cube

Returns:

A cube containing both the spot site information and for each the grid point indices of its nearest neighbour as per the imposed constraints.

Raises:
  • KeyError – If a unique_site_id_key is in use but the corresponding ID is not available for every site in sites.

  • ValueError – If a unique_site_id_key is in use but the corresponding ID is not unique for every site.

  • ValueError – If any unique IDs are longer than 8 digits.
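
The three checks above, together with the 8-digit zero padding described for unique_site_id_key, can be sketched as follows. The helper name and error messages are hypothetical, and the sketch assumes non-negative integer IDs.

```python
from typing import Any, Dict, List


def padded_site_ids(sites: List[Dict[str, Any]], id_key: str) -> List[str]:
    """Validate unique site IDs and return them as 8-character strings."""
    # A missing key raises KeyError for the offending site.
    ids = [int(site[id_key]) for site in sites]
    if len(set(ids)) != len(ids):
        raise ValueError(f"Values of '{id_key}' are not unique.")
    if any(len(str(site_id)) > 8 for site_id in ids):
        raise ValueError(f"Values of '{id_key}' exceed 8 digits.")
    return [f"{site_id:08d}" for site_id in ids]


padded = padded_site_ids([{"wmo_id": 3034}, {"wmo_id": 12345}], "wmo_id")
```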

select_minimum_dz(orography, site_altitude, index_nodes, distance, indices)[source]

Given a selection of nearest neighbours to a given site, this function calculates the absolute vertical displacement between the site and the neighbours. It then returns grid indices of the neighbour with the minimum vertical displacement (i.e. at the most similar altitude). The number of neighbours to consider is a maximum of node_limit, but these may be limited by the imposed search_radius, or this limit may be insufficient to reach the search radius, in which case a warning is raised.

Parameters:
  • orography (Cube) – A cube of orography, used to obtain the grid point altitudes.

  • site_altitude (float) – The altitude of the spot site being considered.

  • index_nodes (ndarray) – An array of shape (n_nodes, 2) that contains the x and y indices that correspond to the selected node.

  • distance (ndarray) – An array that contains the distances from the spot site to each grid point neighbour being considered. A value may be np.inf if the neighbour is beyond the search_radius.

  • indices (ndarray) – An array of tree node indices identifying the neighbouring grid points, the list corresponding to the array of distances.

Return type:

Optional[ndarray]

Returns:

A 2-element array giving the x and y indices of the chosen grid point neighbour. Returns None if no valid neighbours were found in the tree query.
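
The selection logic can be sketched in plain Python. This is an illustration under stated assumptions rather than the plugin's code: grid altitudes are held in a nested list indexed [x][y] instead of an orography cube, the function name is hypothetical, and the warning for an insufficient node_limit is omitted.

```python
import math
from typing import List, Optional, Sequence, Tuple


def pick_minimum_dz(
    grid_altitudes: List[List[float]],
    site_altitude: float,
    index_nodes: Sequence[Tuple[int, int]],
    distances: Sequence[float],
    indices: Sequence[int],
) -> Optional[Tuple[int, int]]:
    """Return grid indices of the in-range neighbour closest in altitude."""
    best, best_dz = None, math.inf
    for distance, node in zip(distances, indices):
        # A radius-limited scipy tree query marks missing neighbours with an
        # infinite distance and an out-of-range node index, so skip them
        # before looking the node up.
        if math.isinf(distance):
            continue
        x_index, y_index = index_nodes[node]
        dz = abs(grid_altitudes[x_index][y_index] - site_altitude)
        if dz < best_dz:
            best, best_dz = (x_index, y_index), dz
    return best


grid_altitudes = [[0.0, 50.0], [120.0, 10.0]]
index_nodes = [(0, 0), (0, 1), (1, 0), (1, 1)]
chosen = pick_minimum_dz(grid_altitudes, 40.0, index_nodes,
                         distances=[100.0, 300.0, 500.0, math.inf],
                         indices=[0, 1, 2, 4])
nothing = pick_minimum_dz(grid_altitudes, 40.0, index_nodes,
                          distances=[math.inf], indices=[4])
```

In this example the second neighbour is chosen although it is not the closest horizontally, because its altitude of 50 m is nearest to the site altitude of 40 m.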