improver.utilities.temporal_interpolation module

Class for Temporal Interpolation calculations.

class DurationSubdivision(target_period, fidelity=None, night_mask=True, day_mask=False)[source]#

Bases: object

Subdivide a duration diagnostic, e.g. sunshine duration, into shorter periods, optionally applying a day or night mask so that quantities defined only in the day or only at night are not spread into night or day periods respectively.

This is a very simple approach. In the case of sunshine duration, the duration is divided evenly across the short periods defined by the fidelity argument. These are then optionally masked to zero for the chosen periods (day or night). Values in the non-zeroed periods are then renormalised relative to the original period total, so that the total across the whole period should equal the original.

This is not always possible, because the night mask applied is simpler than, e.g., the impact of the radiation scheme on 3D orography. The renormalisation could therefore yield durations longer than the fidelity period in each non-zeroed period, as it tries to allocate e.g. 5 hours of sunlight across 4 non-zeroed hours. This is not physical, so the renormalisation is partnered with a clip that limits the duration allocated to each renormalised period to the length of that period. As a result, the original sunshine durations cannot be recovered at affected points: the accuracy is limited by the calculated night mask, which is the cost of making the subdivision possible.

Note that this method cannot account for any weather impacts e.g. cloud that is affecting the sunshine duration in a period. If a 6-hour period is split into three 2-hour periods the split will be even regardless of when thick cloud might occur.
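The divide, mask, renormalise and clip sequence described above can be sketched with plain NumPy. This is only an illustration for a single grid point and is not the plugin's implementation: the function name subdivide_duration and the boolean unmasked argument (True where a fidelity period is kept) are assumptions made for this example, whereas the plugin works on cubes with a calculated day or night mask.

```python
import numpy as np


def subdivide_duration(total, period, fidelity, unmasked):
    """Spread a duration (seconds) over fidelity periods, mask, renormalise, clip."""
    if period % fidelity != 0:
        raise ValueError("fidelity must be a factor of the period")
    n_intervals = period // fidelity
    unmasked = np.asarray(unmasked, dtype=bool)
    if unmasked.size != n_intervals:
        raise ValueError("one mask value is needed per fidelity period")

    # Even split, then zero the masked fidelity periods.
    allocated = np.where(unmasked, total / n_intervals, 0.0)

    # Renormalise so the retained periods sum to the original total.
    retained = allocated.sum()
    if retained > 0:
        allocated = allocated * (total / retained)

    # A fidelity period cannot hold more duration than its own length.
    return np.clip(allocated, 0.0, fidelity)


# 5 hours of sunshine in a 6-hour period, hourly fidelity, 4 daylight hours.
daylight = [False, True, True, True, True, False]
result = subdivide_duration(5 * 3600, 6 * 3600, 3600, daylight)
```

In this example each daylight hour would receive 1.25 hours after renormalisation, so the clip limits it to 1 hour and the total drops from 5 hours to 4 hours, which is the loss of accuracy described above.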

__init__(target_period, fidelity=None, night_mask=True, day_mask=False)[source]#

Define the length of the target periods to be constructed and the intermediate fidelity. This fidelity is the length of the shorter periods into which the data is split and from which the target periods are constructed. A shorter fidelity period allows the time dependent day or night masks to be applied more accurately.

Parameters:
  • target_period (int) – The time period described by the output cubes in seconds. The data will be reconstructed into non-overlapping periods. The target_period must be a factor of the original period.

  • fidelity (Optional[int]) – If provided, the shortest increment in seconds into which the input periods are divided and to which the night mask is applied. The target periods are reconstructed from these shorter periods. Shorter fidelity periods better capture where the day / night discriminator falls. Setting fidelity either to None or equal to target_period will result in a simple subdivision of the original period into the specified target periods with no intermediate fidelity period processing.

  • night_mask (bool) – If true, points that fall at night are zeroed and duration reallocated to day time periods as much as possible.

  • day_mask (bool) – If true, points that fall in the day time are zeroed and duration reallocated to night time periods as much as possible.

Raises:
  • ValueError – If target_period and / or fidelity are not positive integers.

  • ValueError – If day and night mask options are both set True.

_compute_renormalisation_factor(cube, period)[source]#

Compute the renormalisation factor by streaming through all fidelity periods with masking applied, without storing all fidelity cubes simultaneously.

This is used to compute the factor needed to renormalise the fidelity period data so that the total across all fidelity periods matches the original period total after masking.

Parameters:
  • cube (Cube) – The original period cube of duration data.

  • period (int) – The period of the input cube in seconds.

Returns:

An array of renormalisation factors.

Return type:

ndarray

_make_fidelity_cube(cube, interval_data, interval_start, interval_end)[source]#

Create a single fidelity period cube with masking applied.

Parameters:
  • cube (Cube) – The original period cube, used as a template for metadata.

  • interval_data (ndarray) – The data array already divided by the total number of fidelity intervals.

  • interval_start (int) – The start time of the fidelity interval in seconds since epoch.

  • interval_end (int) – The end time of the fidelity interval in seconds since epoch.

Return type:

Cube

Returns:

A single fidelity period cube with the time coordinate set to the interval bounds and any day or night masking applied.

_process_target_period(cube, period, n_target_periods, target_start, target_end, factor)[source]#

Process a single target period, constructing, masking, renormalising, and collapsing the fidelity cubes into a single target period cube.

Parameters:
  • cube (Cube) – The original duration diagnostic cube.

  • period (int) – The period of the input cube in seconds.

  • n_target_periods (int) – The total number of target periods.

  • target_start (int) – The start time of the target period in seconds since epoch.

  • target_end (int) – The end time of the target period in seconds since epoch.

  • factor (ndarray) – An array of renormalisation factors.

Return type:

Cube

Returns:

A single cube representing the target period.

allocate_data_for_target_period(cube, period, target_start)[source]#

Allocate fractions of the original cube duration diagnostic to the fidelity periods within a single target period, optionally applying a day or night mask to zero out the appropriate periods.

By processing one target period at a time, only the fidelity cubes for that target period are held in memory simultaneously, reducing peak memory usage.

Parameters:
  • cube (Cube) – The original period cube from which duration data will be taken and divided up.

  • period (int) – The period of the input cube in seconds.

  • target_start (int) – The start time of the target period in seconds since epoch.

Return type:

CubeList

Returns:

A CubeList of fidelity period cubes for this target period, with the duration data evenly allocated across fidelity periods and any day or night masking applied.

static cube_period(cube)[source]#

Return the time period of the cube in seconds.

Parameters:

cube (Cube) – The cube for which the period is to be returned.

Returns:

Period of cube time coordinate in seconds.

Return type:

int

process(cube)[source]#

Create target period duration diagnostics from the original duration diagnostic data.

Rather than constructing all fidelity period cubes upfront and storing them in memory, this method pipelines the fidelity construction and collapse steps. For each target period, the fidelity cubes are constructed, masked, renormalised, clipped, and immediately collapsed into a single target period cube before moving on to the next target period. This significantly reduces peak memory usage.

Parameters:

cube (Cube) – The original duration diagnostic cube.

Return type:

Cube

Returns:

A cube containing the target period data with a time dimension with an entry for each period. These periods combined span the original cube’s period.

Raises:
  • ValueError – The target period is not a factor of the input period.

  • ValueError – The fidelity period is supplied but is not less than or equal to the target period.

class ForecastTrajectoryGapFiller(interval_in_minutes=None, interpolation_method='linear', cluster_sources_attribute=None, interpolation_window_in_minutes=None, model_path=None, scaling='minmax', clipping_bounds=None, clip_in_scaled_space=True, clip_to_physical_bounds=False, max_batch=1, parallel_backend=None, n_workers=1, model_loader=None, **kwargs)[source]#

Bases: BasePlugin

Fill gaps in the forecast trajectory using temporal interpolation.

This plugin identifies gaps in a sequence of validity times (i.e. the forecast trajectory from a fixed forecast reference time) and fills them using temporal interpolation. When cluster_sources are configured, it can also identify forecast periods from a fixed forecast reference time that should be regenerated (e.g. when transitioning between forecast sources) even if they exist in the input forecast.

The plugin will:

  1. Sort input cubes by validity time.
  2. Identify missing validity times (gaps).
  3. Optionally identify times to regenerate based on cluster sources.
  4. Use TemporalInterpolation to fill gaps.
  5. Return a Cube with all validity times.

__init__(interval_in_minutes=None, interpolation_method='linear', cluster_sources_attribute=None, interpolation_window_in_minutes=None, model_path=None, scaling='minmax', clipping_bounds=None, clip_in_scaled_space=True, clip_to_physical_bounds=False, max_batch=1, parallel_backend=None, n_workers=1, model_loader=None, **kwargs)[source]#

Initialise the plugin.

Parameters:
  • interval_in_minutes (Optional[int]) – The expected interval between validity times in minutes. Used to identify gaps in the sequence.

  • interpolation_method (str) – Method of interpolation to use. Options: linear, solar, daynight, google_film.

  • cluster_sources_attribute (Optional[str]) – Name of cube attribute containing cluster sources dictionary. The cluster_sources dictionary has a format like: {realization_index: {source_name: [periods]}}. When provided with interpolation_window_in_minutes, enables identification of validity times to regenerate at source transitions.

  • interpolation_window_in_minutes (Optional[int]) – Time window (in minutes) as +/- range around forecast source transitions.

  • model_path (Optional[str]) – Path to TensorFlow Hub module for Google FILM model (if using google_film).

  • scaling (str) – Scaling method for google_film interpolation (log10 or minmax).

  • clipping_bounds (Union[List[float], Tuple[float, float], None]) – Bounds for clipping google_film interpolated data. Can be a tuple or list of two floats.

  • clip_in_scaled_space (bool) – If True, clipping_bounds are applied in scaled space for google_film interpolation.

  • clip_to_physical_bounds (bool) – If True, interpolated data is clipped to physical bounds after inverse scaling for google_film interpolation.

  • max_batch (Optional[int]) – Maximum number of samples to process in a single batch when using the “google_film” interpolation method. This allows memory-efficient chunked inference. If None, all samples are processed at once.

  • parallel_backend (Optional[str]) – If specified, the parallelisation backend to use when performing google_film interpolation. The only option currently available is the “loky” backend provided by the joblib package. Default is None, which results in no parallelisation.

  • n_workers (Optional[int]) – If using parallel_backend, the number of workers to use for parallel processing. Default is 1; a value of None also results in the use of 1 core.

  • model_loader (Any) – Optional callable to load the TensorFlow model. This is mainly intended for use in testing where a mock model loader can be supplied. If None, the default model loader will be used.

  • **kwargs – Additional arguments passed to TemporalInterpolation.

_assemble_final_cubelist(sorted_cubelist, result_cubes, periods_to_exclude)[source]#

Assemble the final cubelist by combining interpolated and original cubes.

Parameters:
  • sorted_cubelist (CubeList) – Original sorted list of cubes.

  • result_cubes (CubeList) – CubeList of interpolated cubes.

  • periods_to_exclude (set) – Set of forecast periods to exclude from originals.

Return type:

CubeList

Returns:

Final sorted CubeList with all forecast periods.

_calculate_target_time(cube_t0, target_period, t0_period)[source]#

Calculate the target time for interpolation.

Parameters:
  • cube_t0 (Cube) – The cube at the earlier forecast period.

  • target_period (int) – The target forecast period in minutes.

  • t0_period (int) – The earlier forecast period in minutes.

Return type:

datetime

Returns:

The target time as a datetime object.
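As a rough illustration of the calculation described here, the target time can be obtained by offsetting the validity time of the earlier cube by the difference between the two forecast periods. The function below is a sketch that takes a datetime directly rather than a cube; its name and signature are assumptions for this example.

```python
from datetime import datetime, timedelta


def target_time(t0_validity_time, target_period, t0_period):
    """Offset the earlier validity time by the forecast period difference (minutes)."""
    return t0_validity_time + timedelta(minutes=target_period - t0_period)


# Earlier cube valid at 03Z with a 180 minute forecast period; target is 240 minutes.
example_time = target_time(datetime(2025, 1, 1, 3), 240, 180)
```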

_create_gap_filling_tasks(missing_periods, sorted_cubelist)[source]#

Create interpolation tasks for missing forecast periods.

Parameters:
  • missing_periods (List[int]) – List of forecast periods (in minutes) that are missing.

  • sorted_cubelist (CubeList) – Sorted list of cubes by forecast period.

Return type:

List[Tuple[str, int, int, int]]

Returns:

List of tuples (task_type, target_period, t0_period, t1_period) for gap filling tasks.
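One way to picture the task tuples is to bracket each missing forecast period with the nearest available periods on either side. The sketch below does this with the standard library; the "fill" label is a hypothetical task_type value, and taking the nearest neighbours as t0 and t1 is an assumption about how the brackets are chosen.

```python
from bisect import bisect_left


def gap_filling_tasks(missing_periods, available_periods):
    """Pair each missing period (minutes) with its bracketing available periods."""
    available = sorted(available_periods)
    tasks = []
    for target in sorted(missing_periods):
        idx = bisect_left(available, target)
        if idx == 0 or idx == len(available):
            # No available period on one side, so interpolation is not possible.
            continue
        tasks.append(("fill", target, available[idx - 1], available[idx]))
    return tasks


example_tasks = gap_filling_tasks([120, 180], [60, 240])
```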

_create_regeneration_tasks(periods_to_regenerate, sorted_cubelist)[source]#

Create interpolation tasks for periods to regenerate.

Parameters:
  • periods_to_regenerate (List[Tuple[int, int, int]]) – List of tuples (transition_period, expected_t0, expected_t1).

  • sorted_cubelist (CubeList) – Sorted list of cubes by forecast period.

Return type:

List[Tuple[str, int, int, int]]

Returns:

List of tuples (task_type, target_period, t0_period, t1_period) for regeneration tasks.

_extract_cube_for_period(cubelist, period)[source]#

Extract a cube for a specific forecast period (in minutes).

Parameters:
  • cubelist (CubeList) – List of cubes to extract from.

  • period (int) – Forecast period in minutes.

Return type:

Cube

Returns:

Cube corresponding to the specified forecast period.

_get_forecast_periods(cubelist)[source]#

Extract forecast periods from cubes in minutes since the reference time.

Parameters:

cubelist (CubeList) – List of cubes to extract forecast periods from.

Return type:

List[int]

Returns:

Sorted list of unique forecast periods in minutes.

_identify_gaps(cubelist)[source]#

Identify missing forecast periods that need filling.

Parameters:

cubelist (CubeList) – List of input cubes.

Return type:

List[int]

Returns:

List of forecast_periods (in minutes) that are missing.

Raises:

ValueError – If interval_in_minutes is not set.
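Gap identification can be illustrated as a comparison between the forecast periods that are present and the regular sequence implied by interval_in_minutes. This is a minimal sketch operating on a list of integers rather than a CubeList; the function name and the error message are assumptions.

```python
def identify_gaps(forecast_periods, interval_in_minutes):
    """Return the forecast periods (minutes) missing from a regular sequence."""
    if interval_in_minutes is None:
        raise ValueError("interval_in_minutes must be set to identify gaps")
    present = sorted(set(forecast_periods))
    if not present:
        return []
    # The expected sequence runs from the first to the last available period.
    expected = range(present[0], present[-1] + 1, interval_in_minutes)
    present_set = set(present)
    return [period for period in expected if period not in present_set]


example_gaps = identify_gaps([60, 120, 240, 360], 60)
```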

_identify_periods_to_regenerate(cubelist)[source]#

Identify periods to regenerate based on cluster source transitions.

Parameters:

cubelist (CubeList) – List of input cubes.

Return type:

List[Tuple[int, int, int]]

Returns:

List of tuples (transition_period, expected_t0, expected_t1) where transition_period is the forecast period at the source transition, expected_t0 is (transition - window), and expected_t1 is (transition + window).

_identify_source_transitions(cluster_sources, realization_index)[source]#

Identify forecast source transitions for a given realization.

Parameters:
  • cluster_sources (dict) – Dictionary mapping realization indices to their forecast sources and periods.

  • realization_index (int) – The realization index to check for transitions.

Return type:

List[int]

Returns:

List of forecast periods immediately before a source transition. Only includes transitions where the source actually changes.
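The transition search, and the (transition_period, expected_t0, expected_t1) tuples built from it, can be sketched as follows. The example assumes the documented {realization_index: {source_name: [periods]}} layout, assumes the periods and the window share the same units, and uses hypothetical source names and function names.

```python
def source_transitions(cluster_sources, realization_index):
    """Return the periods immediately before the forecast source changes."""
    sources = cluster_sources.get(realization_index, {})
    # Flatten to a time-ordered list of (period, source_name) pairs.
    timeline = sorted(
        (period, name) for name, periods in sources.items() for period in periods
    )
    return [
        period
        for (period, name), (_, next_name) in zip(timeline, timeline[1:])
        if name != next_name
    ]


def regeneration_windows(transitions, window):
    """Build (transition_period, transition - window, transition + window) tuples."""
    return [(period, period - window, period + window) for period in transitions]


example_sources = {0: {"model_a": [60, 120, 180], "model_b": [240, 300]}}
transitions = source_transitions(example_sources, 0)
windows = regeneration_windows(transitions, 60)
```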

_interpolate_batch_periods(interpolator, sorted_cubelist, target_periods, t0_period, t1_period)[source]#

Interpolate multiple forecast periods between t0_period and t1_period in one batch.

Parameters:
  • interpolator (TemporalInterpolation) – The TemporalInterpolation plugin to use.

  • sorted_cubelist (CubeList) – Sorted list of cubes by forecast period.

  • target_periods (list) – List of target forecast periods (in minutes).

  • t0_period (int) – The earlier forecast period in minutes.

  • t1_period (int) – The later forecast period in minutes.

Return type:

CubeList

Returns:

CubeList of interpolated cubes for the target periods.

_parse_cluster_sources(cube)[source]#

Parse the cluster sources dictionary from a cube attribute.

Parameters:

cube (Cube) – A cube containing the cluster sources attribute.

Return type:

dict

Returns:

Dictionary mapping realization indices to their forecast sources and periods. Format: {realization_index: {source_name: [periods]}}

Raises:
  • ValueError – If the cluster sources attribute is not a dictionary.

  • ValueError – If the cluster sources JSON string cannot be parsed.

  • ValueError – If the sources for a realization are not a dictionary.

  • ValueError – If the periods for a source are not a list.
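The validation steps listed above can be sketched with the standard json module. The function below accepts either a dictionary or a JSON string and raises ValueError in each of the documented cases. Converting realization keys to int is an assumption made here because JSON object keys are always strings; the function name and messages are also illustrative.

```python
import json


def parse_cluster_sources(raw):
    """Validate a cluster sources attribute given as a dict or a JSON string."""
    if isinstance(raw, str):
        try:
            raw = json.loads(raw)
        except json.JSONDecodeError as err:
            raise ValueError(f"Cannot parse cluster sources JSON: {err}") from err
    if not isinstance(raw, dict):
        raise ValueError("Cluster sources must be a dictionary")

    parsed = {}
    for realization, sources in raw.items():
        if not isinstance(sources, dict):
            raise ValueError(f"Sources for realization {realization} must be a dict")
        for name, periods in sources.items():
            if not isinstance(periods, list):
                raise ValueError(f"Periods for source {name} must be a list")
        # Assumption: realization keys are integer-like.
        parsed[int(realization)] = sources
    return parsed


example_parsed = parse_cluster_sources('{"0": {"model_a": [60, 120]}}')
```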

_validate_input(cubelist)[source]#

Validate that the input cubelist meets requirements.

Parameters:

cubelist (CubeList) – List of cubes to validate.

Raises:
  • ValueError – If cubelist is empty or has fewer than 2 cubes.

  • ValueError – If any cube is missing required coordinates (forecast_period, time).

  • ValueError – If cubes do not have multiple, different forecast_periods and times.

  • ValueError – If cubes do not all have the same forecast_reference_time.

Return type:

None

process(*cubes)[source]#

Fill gaps in the forecast trajectory, i.e. gaps in the validity time sequence, or equivalently forecast period sequence for a fixed forecast reference time.

Parameters:

cubes (Union[Cube, CubeList]) –

One or more cubes with potentially missing validity times. Can be:

  • A single Cube with a forecast_period or time dimension

    (will be sliced)

  • Multiple Cube arguments representing different validity times

  • A single CubeList containing multiple validity times

All cubes should have the same validity time coordinate structure and dimensions (except for forecast_period and time), and are expected to all have the same forecast_reference_time.

Return type:

Cube

Returns:

A single merged Cube with gaps filled using temporal interpolation. The cube will have time as a dimension coordinate.

Raises:

TypeError – If input is not Cube or CubeList.

class GoogleFilmInterpolation(model_path, scaling='minmax', clipping_bounds=None, clip_in_scaled_space=False, clip_to_physical_bounds=False, cluster_sources_attribute=None, interpolation_window_in_minutes=None, max_batch=1, parallel_backend=None, n_workers=1, model_loader=None)[source]#

Bases: BasePlugin

Class to perform temporal interpolation using the Google FILM model.

The model is expected to be a TensorFlow Hub module that takes as input two images and a time point given as a fraction between 0 at t0 and 1 at t1, and outputs an interpolated image.

The input cubes are expected to have the same spatial dimensions and coordinate system. The output cube will have the same metadata as cube1.

__init__(model_path, scaling='minmax', clipping_bounds=None, clip_in_scaled_space=False, clip_to_physical_bounds=False, cluster_sources_attribute=None, interpolation_window_in_minutes=None, max_batch=1, parallel_backend=None, n_workers=1, model_loader=None)[source]#

Initialise the plugin.

Parameters:
  • model_path (str) – Path to the TensorFlow Hub module for the Google FILM model.

  • scaling (str) – Scaling method to apply to the data before interpolation. Supported methods are “log10” and “minmax”.

  • clipping_bounds (Optional[Tuple[float, float]]) – A tuple specifying the (min, max) bounds to which to clip the interpolated data. Default is None.

  • clip_in_scaled_space (bool) – Whether to apply clipping in the scaled data space. Default is False.

  • clip_to_physical_bounds (bool) – Whether to apply clipping to physical bounds after interpolation. Default is False.

  • cluster_sources_attribute (Optional[str]) – Name of cube attribute containing cluster sources dictionary. The cluster_sources dictionary has a format like: {realization_index: {source_name: [periods]}}. When provided with interpolation_window_in_minutes, enables identification of validity times to regenerate at source transitions.

  • interpolation_window_in_minutes (Optional[int]) – Time window (in minutes) as +/- range around forecast source transitions.

  • max_batch (Optional[int]) – If using google_film interpolation, the maximum batch size for model inference. This limits memory usage by processing the data in smaller chunks. Default is 1 (no batching).

  • parallel_backend (Optional[str]) – If specified, the parallelisation backend to use when performing google_film interpolation. The only option currently available is the “loky” backend provided by the joblib package. Default is None, which results in no parallelisation.

  • n_workers (Optional[int]) – If using parallel_backend, the number of workers to use for parallel processing. Default is 1; a value of None also results in the use of 1 core.

  • model_loader (Any) – Optional callable to load the TensorFlow model. This is mainly intended for use in testing where a mock model loader can be supplied. If None, the default model loader will be used.

Raises:

ValueError – If an unsupported scaling method is provided.

_apply_clipping(interpolated, cube1, cube2)[source]#

Clip the interpolated cube data to within the provided clipping bounds, if provided. Otherwise, clip within the bounds of the input cubes if either clip_to_physical_bounds or clip_in_scaled_space is True. If neither is set, no clipping is applied.

Parameters:

interpolated (Cube) – The interpolated cube.

Return type:

None
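The order of precedence described for clipping can be sketched on NumPy arrays. This is an illustration only: the real method modifies the cube in place and returns None, whereas this sketch returns a new array. Using the overall minimum and maximum of the two inputs as "the bounds of the input cubes" is an assumption, as are the function and argument names.

```python
import numpy as np


def clip_interpolated(interpolated, arr1, arr2, clipping_bounds=None, clip_to_inputs=False):
    """Clip to explicit bounds if given, else optionally to the input data range."""
    if clipping_bounds is not None:
        lower, upper = clipping_bounds
    elif clip_to_inputs:
        # Assumption: the input bounds are the combined min and max of both inputs.
        lower = min(arr1.min(), arr2.min())
        upper = max(arr1.max(), arr2.max())
    else:
        # Neither option is set, so the data are returned unchanged.
        return interpolated
    return np.clip(interpolated, lower, upper)


values = np.array([-1.0, 0.5, 2.0])
first = np.array([0.0, 0.2])
second = np.array([0.4, 1.0])
clipped_to_inputs = clip_interpolated(values, first, second, clip_to_inputs=True)
```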

_apply_scaling(cube1, cube2, scaling)[source]#

Apply scaling to the input cubes before interpolation.

Parameters:
  • cube1 (Cube) – The first input cube.

  • cube2 (Cube) – The second input cube.

  • scaling (str) – Scaling method to apply. Supported methods are “log10” and “minmax”.

Return type:

None

_finalise_interpolated_cube(cube, cube1, cube2, cube1_orig, cube2_orig)[source]#

Apply clipping and reverse scaling to an interpolated cube.

Parameters:
  • cube (Cube) – The interpolated cube to finalise (in scaled space).

  • cube1 (Cube) – The first input cube (scaled, for clipping in scaled space).

  • cube2 (Cube) – The second input cube (scaled, for clipping in scaled space).

  • cube1_orig (Cube) – The first input cube before scaling (for reverse scaling and physical clipping).

  • cube2_orig (Cube) – The second input cube before scaling (for reverse scaling and clipping).

Return type:

Cube

Returns:

The finalised interpolated cube, with scaling reversed and clipping applied as configured.

_interpolate_no_extra_dim(cube1, cube2, template_slices, time_fractions, model, cube1_orig, cube2_orig)[source]#

Helper method to handle interpolation when there is no extra dimension.

Parameters:
  • cube1 (Cube) – The first input cube (scaled).

  • cube2 (Cube) – The second input cube (scaled).

  • template_slices (list) – List of template slices over time.

  • time_fractions (list) – List of time fractions for interpolation.

  • model (Any) – The loaded TensorFlow Hub model.

  • cube1_orig (Cube) – The first input cube before scaling.

  • cube2_orig (Cube) – The second input cube before scaling.

Return type:

CubeList

Returns:

CubeList of interpolated cubes for each time point.

_interpolate_with_extra_dim(cube1, cube2, template_slices, time_fractions, model, extra_dim, cube1_orig, cube2_orig)[source]#

Helper method to handle interpolation when an extra dimension (e.g. realization, percentile) is present.

Parameters:
  • cube1 (Cube) – The first input cube (scaled).

  • cube2 (Cube) – The second input cube (scaled).

  • template_slices (list) – List of template slices over time.

  • time_fractions (list) – List of time fractions for interpolation.

  • model (Any) – The loaded TensorFlow Hub model.

  • extra_dim (str) – The name of the extra dimension.

  • cube1_orig (Cube) – The first input cube before scaling.

  • cube2_orig (Cube) – The second input cube before scaling.

Return type:

CubeList

Returns:

CubeList of interpolated cubes for each time and extra_dim value.

_reverse_scaling(cube, cube1, cube2, scaling)[source]#

Reverse scaling on the interpolated cube after interpolation.

Parameters:
  • cube (Cube) – The interpolated cube.

  • cube1 (Cube) – The first input cube.

  • cube2 (Cube) – The second input cube.

  • scaling (str) – Scaling method to reverse. Supported methods are “log10” and “minmax”.

Return type:

None
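To illustrate how scaling and its reversal fit around the interpolation, the sketch below implements a "minmax" round trip on NumPy arrays. It assumes the minimum and maximum are taken across both inputs together, which this page does not state, and it leaves out "log10" because its handling of zero or negative values is not described here. The function names are hypothetical, and the simple average stands in for the model output.

```python
import numpy as np


def minmax_scale(arr1, arr2):
    """Scale two arrays to the range 0 to 1 using their combined min and max."""
    lower = min(arr1.min(), arr2.min())
    upper = max(arr1.max(), arr2.max())
    # Guard against a zero range when all values are identical.
    span = upper - lower if upper > lower else 1.0
    return (arr1 - lower) / span, (arr2 - lower) / span, (lower, span)


def minmax_unscale(arr, params):
    """Return scaled data to physical units using the stored parameters."""
    lower, span = params
    return arr * span + lower


first = np.array([0.0, 5.0])
second = np.array([10.0, 20.0])
scaled1, scaled2, params = minmax_scale(first, second)
# Stand-in for the interpolation step: the midpoint in scaled space.
midpoint = minmax_unscale(0.5 * (scaled1 + scaled2), params)
```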

_run_google_film(arr1, arr2, model, time_points)[source]#

Run the Google FILM model to interpolate between two arrays at multiple time points. The input arrays can be 2D (H, W) or 3D (N, H, W), where N is the number of pairs to process. The output will be a 3D array (N, H, W) of interpolated data. Each input array is treated as a grayscale image, expanded to 3 channels for the model. The number of pairs N should match the length of time_points. The dimension N could represent e.g. different realizations or multiple time points, or these items stacked together.

Parameters:
  • arr1 (ndarray) – The first input array.

  • arr2 (ndarray) – The second input array.

  • model (Any) – The loaded TensorFlow Hub model.

  • time_points (List[float]) – A list of floats between 0 and 1 indicating the interpolation points.

Return type:

ndarray

Returns:

Numpy array of interpolated data for each time point, shape (N, H, W)
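The array handling described above (promoting 2D input to a batch, repeating the grayscale field across 3 channels, and returning an (N, H, W) array) can be sketched without TensorFlow. In the example below a linear blend stands in for the FILM model, so it demonstrates only the shapes and not the model behaviour. All function names, and taking the first channel as the output, are assumptions.

```python
import numpy as np


def to_model_batch(arr):
    """Convert (H, W) or (N, H, W) data to a 3-channel batch (N, H, W, 3)."""
    arr = np.asarray(arr, dtype=np.float32)
    if arr.ndim == 2:
        arr = arr[np.newaxis, ...]
    if arr.ndim != 3:
        raise ValueError("input must be 2D (H, W) or 3D (N, H, W)")
    return np.repeat(arr[..., np.newaxis], 3, axis=-1)


def blend_stand_in(x0, x1, time_points):
    """Hypothetical stand-in for the model: a per-pair linear blend."""
    t = np.asarray(time_points, dtype=np.float32).reshape(-1, 1, 1, 1)
    return (1.0 - t) * x0 + t * x1


def interpolate_pairs(arr1, arr2, time_points):
    """Return interpolated data of shape (N, H, W), one slice per time point."""
    x0, x1 = to_model_batch(arr1), to_model_batch(arr2)
    if x0.shape[0] != len(time_points):
        raise ValueError("the number of pairs must match len(time_points)")
    output = blend_stand_in(x0, x1, time_points)
    # Assumption: the channels are identical, so the first one is returned.
    return output[..., 0]


frames = interpolate_pairs(np.zeros((2, 3, 4)), np.ones((2, 3, 4)), [0.25, 0.75])
```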

process(cube1, cube2, template_interpolated_cube)[source]#

Perform temporal interpolation between two cubes using the Google FILM model.

Parameters:
  • cube1 (Cube) – The first input cube (at time t=0).

  • cube2 (Cube) – The second input cube (at time t=1).

  • template_interpolated_cube (Cube) – A cube containing the interpolated data with the correct metadata for the output times.

Return type:

CubeList

Returns:

A CubeList containing the interpolated cubes at the specified times.

Raises:
  • ValueError – If cube1 or cube2 do not have realization coordinates.

  • ValueError – If cube1 and cube2 have different numbers of realizations.

class TemporalInterpolation(interval_in_minutes=None, times=None, interpolation_method='linear', accumulation=False, max=False, min=False, model_path=None, scaling='minmax', clipping_bounds=None, clip_in_scaled_space=False, clip_to_physical_bounds=False, max_batch=1, parallel_backend=None, n_workers=1, model_loader=None)[source]#

Bases: BasePlugin

Interpolate data to intermediate times between the validity times of two cubes. This can be used to fill in missing data (e.g. for radar fields) or to ensure data is available at the required intervals when model data is not available at these times.

The plugin will return the interpolated times and the later of the two input times. This allows us to modify the input diagnostics if they represent accumulations.

The IMPROVER convention is that period diagnostics have their time coordinate point at the end of the period. The later of the two inputs therefore covers the period that has been broken down into shorter periods by the interpolation and, if working with accumulations, must itself be modified. The result of this approach is that in a long run of lead-times, e.g. T+0 to T+120, all the lead-times will be available except T+0.

If working with period maximums and minimums we cannot return values in the new periods that do not adhere to the inputs. For example, we might have a 3-hour maximum of 5 ms-1 between 03-06Z. The period before it might have a maximum of 11 ms-1. Upon splitting the 3-hour period into 1-hour periods the gradient might give us the following results:

Inputs: 00-03Z: 11 ms-1, 03-06Z: 5 ms-1
Outputs: 03-04Z: 9 ms-1, 04-05Z: 7 ms-1, 05-06Z: 5 ms-1

However these outputs are not in agreement with the original 3-hour period maximum of 5 ms-1 over the period 03-06Z. We enforce the maximum from the original period which results in:

Inputs: 00-03Z: 11 ms-1, 03-06Z: 5 ms-1
Outputs: 03-04Z: 5 ms-1, 04-05Z: 5 ms-1, 05-06Z: 5 ms-1

If instead the preceding period maximum was 2 ms-1 we would use the trend to produce lower maximums in the interpolated 1-hour periods, becoming:

Inputs: 00-03Z: 2 ms-1, 03-06Z: 5 ms-1
Outputs: 03-04Z: 3 ms-1, 04-05Z: 4 ms-1, 05-06Z: 5 ms-1

This interpretation of the gradient information is retained in the output as it is consistent with the original period maximum of 5 ms-1 between 03-06Z. As such we can impart increasing trends into maximums over periods but not decreasing trends. The converse applies when interpolating period minimums, where only decreasing trends can be introduced.
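The worked wind speed examples above can be reproduced with a short NumPy sketch: interpolate linearly between the two period maximums, then cap each value at the maximum of the period being split. The function name and its arguments are assumptions made for this illustration and are not the plugin's interface.

```python
import numpy as np


def interpolate_period_max(previous_max, period_max, n_periods):
    """Linearly interpolate, then enforce the maximum of the period being split."""
    fractions = np.arange(1, n_periods + 1) / n_periods
    trend = previous_max + fractions * (period_max - previous_max)
    # No shorter period may exceed the maximum of the original period.
    return np.minimum(trend, period_max)


decreasing = interpolate_period_max(11.0, 5.0, 3)
increasing = interpolate_period_max(2.0, 5.0, 3)
```

For a period minimum the same sketch would use np.maximum in place of np.minimum, so that only decreasing trends survive.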

We could use the cell methods to determine whether we are working with accumulations, maximums, or minimums. This should be denoted as a cell method associated with the time coordinate, e.g. for an accumulation it would be time: sum, whilst a maximum would have time: max. However we cannot guarantee these cell methods are present. As such the interpolation of periods here relies on the user supplying a suitable keyword argument that denotes the type of period being processed.

__init__(interval_in_minutes=None, times=None, interpolation_method='linear', accumulation=False, max=False, min=False, model_path=None, scaling='minmax', clipping_bounds=None, clip_in_scaled_space=False, clip_to_physical_bounds=False, max_batch=1, parallel_backend=None, n_workers=1, model_loader=None)[source]#

Initialise class.

Parameters:
  • interval_in_minutes (Optional[int]) –

    Specifies the interval in minutes at which to interpolate between the two input cubes. A number of minutes which does not divide up the interval equally will raise an exception.

    e.g. cube_t0 valid at 03Z, cube_t1 valid at 06Z,
    interval_in_minutes = 60 –> interpolate to 04Z and 05Z.

  • times (Optional[List[datetime]]) – A list of datetime objects specifying the times to which to interpolate.

  • interpolation_method (str) – Method of interpolation to use. Default is linear. Only methods in known_interpolation_methods can be used.

  • accumulation (bool) – Set True if the diagnostic being temporally interpolated is a period accumulation. The output will be renormalised to ensure that the total across the period constructed from the shorter intervals matches the total across the period from the coarser intervals.

  • max (bool) – Set True if the diagnostic being temporally interpolated is a period maximum. Trends between adjacent input periods will be used to provide variation across the interpolated periods where these are consistent with the inputs.

  • min (bool) – Set True if the diagnostic being temporally interpolated is a period minimum. Trends between adjacent input periods will be used to provide variation across the interpolated periods where these are consistent with the inputs.

  • model_path (Optional[str]) – Path to the TensorFlow Hub module for the Google FILM model. Required if interpolation_method is “google_film”.

  • scaling (str) – Scaling method to apply to the data before interpolation when using “google_film” method. Supported methods are “log10” and “minmax”. Default is “minmax”.

  • clipping_bounds (Optional[Tuple[float, float]]) – A tuple specifying the (min, max) bounds to which to clip the interpolated data when using “google_film” method. Default is None.

  • clip_in_scaled_space (bool) – Whether to apply clipping in the scaled data space when using “google_film” method. Default is False.

  • clip_to_physical_bounds (bool) – Whether to apply clipping to physical bounds after interpolation when using “google_film” method. Default is False.

  • max_batch (Optional[int]) – If using google_film interpolation, the maximum batch size for model inference. This limits memory usage by processing the data in smaller chunks. Default is 1 (no batching).

  • parallel_backend (Optional[str]) – If specified, the parallelisation backend to use when performing google_film interpolation. The only option currently available is the “loky” backend provided by the joblib package. Default is None, which results in no parallelisation.

  • n_workers (Optional[int]) – If using parallel_backend, the number of workers to use for parallel processing. Default is 1; a value of None also results in the use of 1 core.

  • model_loader (Any) – Optional callable to load the TensorFlow model. This is mainly intended for use in testing where a mock model loader can be supplied. If None, the default model loader will be used.

Raises:
  • ValueError – If neither interval_in_minutes nor times are set.

  • ValueError – If both interval_in_minutes and times are set.

  • ValueError – If interpolation method not in known list.

  • ValueError – If interpolation_method is “google_film” but model_path is not provided.

  • ValueError – If multiple period diagnostic kwargs are set True.

  • ValueError – A period diagnostic is being interpolated with a method not found in the period_interpolation_methods list.
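
The checks on interval_in_minutes and times can be illustrated with a minimal sketch. It assumes that exactly one of the two arguments may be supplied. The helper name check_time_arguments and the error messages are hypothetical and are not taken from the IMPROVER source.

```python
from datetime import datetime
from typing import List, Optional


def check_time_arguments(
    interval_in_minutes: Optional[int] = None,
    times: Optional[List[datetime]] = None,
) -> None:
    """Hypothetical helper: require exactly one of the two arguments."""
    if interval_in_minutes is None and times is None:
        raise ValueError("One of interval_in_minutes or times must be set.")
    if interval_in_minutes is not None and times is not None:
        raise ValueError("Only one of interval_in_minutes or times may be set.")


# Supplying only an interval passes the check without raising.
check_time_arguments(interval_in_minutes=60)
```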

static _calculate_accumulation(cube_t0, period_reference, interpolated_cube)[source]#

If the input is an accumulation we use the trapezium rule to calculate a new accumulation for each output period from the rates we converted the accumulations to prior to interpolating. We then renormalise to ensure the total accumulation across the period is unchanged by expressing it as a series of shorter periods.

The interpolated cube is modified in place.

Parameters:
  • cube_t0 (Cube) – The input cube corresponding to the earlier time.

  • period_reference (Cube) – The input cube corresponding to the later time, with the values prior to conversion to rates.

  • interpolated_cube (Cube) – The cube containing the interpolated times, which includes the data corresponding to the time of the later of the two input cubes.
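
The trapezium rule and renormalisation described above can be sketched in plain Python. This is an illustration under stated assumptions, not the IMPROVER code: the function name trapezium_accumulations is hypothetical, the output periods are assumed to be of equal length, and scalar values stand in for cube data.

```python
from typing import List


def trapezium_accumulations(
    rate_t0: float, rates: List[float], period_seconds: float, total: float
) -> List[float]:
    """Accumulate interpolated rates over equal periods and renormalise.

    rate_t0: rate at the start of the first output period.
    rates: interpolated rates at the end of each output period.
    period_seconds: length of each output period (assumed equal).
    total: the original accumulation over the whole input period.
    """
    starts = [rate_t0] + rates[:-1]
    # Trapezium rule: mean of the two bounding rates times the period length.
    raw = [0.5 * (a + b) * period_seconds for a, b in zip(starts, rates)]
    raw_total = sum(raw)
    if raw_total == 0:
        # Nothing to rescale; return the (all zero) accumulations unchanged.
        return raw
    # Rescale so the shorter periods sum to the original accumulation.
    return [value * total / raw_total for value in raw]


# Three 1-hour periods reconstructed from a 3-hour accumulation of 6 units.
parts = trapezium_accumulations(0.0, [1e-3, 2e-3, 3e-3], 3600.0, 6.0)
```

After renormalisation the shorter periods sum to the original total, while their relative sizes follow the interpolated rates.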

static add_bounds(cube_t0, interpolated_cube)[source]#

Calculate bounds using the interpolated times and the time taken from cube_t0. This function is used rather than iris’s guess bounds method as we want to use the earlier time cube to inform the lowest bound. The coordinates on interpolated_cube are modified in place.

Parameters:
  • cube_t0 (Cube) – The input cube corresponding to the earlier time.

  • interpolated_cube (Cube) – The cube containing the interpolated times, which includes the data corresponding to the time of the later of the two input cubes.

Raises:

CoordinateNotFoundError – if time or forecast_period coordinates are not present on the input cubes.
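
The idea of using the earlier time as the lowest bound can be sketched with plain datetime objects. The function name bounds_from_times is hypothetical, and the real method operates on iris coordinates rather than lists.

```python
from datetime import datetime
from typing import List, Tuple


def bounds_from_times(
    t0: datetime, times: List[datetime]
) -> List[Tuple[datetime, datetime]]:
    """Pair each interpolated time with the preceding time, starting from t0."""
    lower = [t0] + times[:-1]
    return list(zip(lower, times))


t0 = datetime(2023, 1, 1, 0)
times = [datetime(2023, 1, 1, hour) for hour in (1, 2, 3)]
bounds = bounds_from_times(t0, times)
```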

static calc_lats_lons(cube)[source]#

Calculate the latitude and longitude of each point for a cube that is not on a latitude-longitude grid. If the input cube already has latitude and longitude coordinates, these are returned as 2d arrays.

Parameters:

cube (Cube) – Cube containing x and y axes.

Return type:

Tuple[ndarray, ndarray]

Returns:

  • 2d Array of latitudes for each point.

  • 2d Array of longitudes for each point.
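
For the simpler case of a cube that already has latitude and longitude coordinates, expanding the 1d coordinate points into 2d arrays can be sketched as below. The function name lat_lon_grids is hypothetical, nested lists stand in for numpy arrays, and latitude is assumed to be the leading dimension.

```python
from typing import List, Tuple

Grid = List[List[float]]


def lat_lon_grids(lats: List[float], lons: List[float]) -> Tuple[Grid, Grid]:
    """Expand 1d latitude and longitude points into two 2d grids."""
    # Each row holds one latitude repeated across every longitude.
    lat_grid = [[lat for _ in lons] for lat in lats]
    # Each row holds the full set of longitudes.
    lon_grid = [[lon for lon in lons] for _ in lats]
    return lat_grid, lon_grid


lat_grid, lon_grid = lat_lon_grids([50.0, 51.0], [-1.0, 0.0, 1.0])
```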

static calc_sin_phi(dtval, lats, lons)[source]#

Calculate the sine of the solar elevation.

Parameters:
  • dtval (datetime) – Date and time.

  • lats (ndarray) – 2d array of latitudes for each point.

  • lons (ndarray) – 2d array of longitudes for each point.

Return type:

ndarray

Returns:

Array of sine of solar elevation at each point
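
For a single point, the sine of the solar elevation can be approximated with the standard relation sin(phi) = sin(lat) sin(dec) + cos(lat) cos(dec) cos(h). The sketch below is not the IMPROVER implementation. It assumes Cooper's approximation for the solar declination, ignores the equation of time, and treats dtval as UTC. The function name sin_solar_elevation is hypothetical.

```python
import math
from datetime import datetime


def sin_solar_elevation(dtval: datetime, lat: float, lon: float) -> float:
    """Approximate sine of solar elevation for one point (inputs in degrees)."""
    day_of_year = dtval.timetuple().tm_yday
    # Cooper's approximation to the solar declination, in radians.
    declination = math.radians(23.45) * math.sin(
        2.0 * math.pi * (284 + day_of_year) / 365.0
    )
    # Hour angle from UTC time and longitude, ignoring the equation of time.
    utc_hours = dtval.hour + dtval.minute / 60.0 + dtval.second / 3600.0
    hour_angle = math.radians(15.0 * (utc_hours - 12.0) + lon)
    lat_rad = math.radians(lat)
    return math.sin(lat_rad) * math.sin(declination) + math.cos(
        lat_rad
    ) * math.cos(declination) * math.cos(hour_angle)


# Near the March equinox the sun is almost overhead at the equator at noon.
noon = sin_solar_elevation(datetime(2023, 3, 22, 12), 0.0, 0.0)
midnight = sin_solar_elevation(datetime(2023, 3, 22, 0), 0.0, 0.0)
```

Negative values indicate that the sun is below the horizon.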

construct_time_list(initial_time, final_time)[source]#

A function to construct a list of datetime objects formatted appropriately for use by iris’ interpolation method.

Parameters:
  • initial_time (datetime) – The start of the period over which a time list is to be constructed.

  • final_time (datetime) – The end of the period over which a time list is to be constructed.

Return type:

List[Tuple[str, List[datetime]]]

Returns:

A list containing a tuple that specifies the coordinate and a list of points along that coordinate to which to interpolate, as required by the iris interpolation method, e.g.:

[('time', [<datetime object 0>,
           <datetime object 1>])]

Raises:
  • ValueError – If list of times provided falls outside the range specified by the initial and final times.

  • ValueError – If the interval_in_minutes does not divide the time range up equally.
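
The interval-based case can be sketched as follows. The function name build_time_list is hypothetical. The sketch assumes the returned times exclude the initial time and include the final time, and it does not cover the case where an explicit list of times is supplied.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def build_time_list(
    initial_time: datetime, final_time: datetime, interval_in_minutes: int
) -> List[Tuple[str, List[datetime]]]:
    """Build evenly spaced target times in the form iris interpolation expects."""
    if interval_in_minutes <= 0:
        raise ValueError("interval_in_minutes must be positive.")
    step = timedelta(minutes=interval_in_minutes)
    # The interval must divide the time range into equal parts.
    if (final_time - initial_time) % step != timedelta(0):
        raise ValueError("interval_in_minutes does not divide the time range equally.")
    times = []
    current = initial_time + step
    while current <= final_time:
        times.append(current)
        current += step
    return [("time", times)]


start = datetime(2023, 1, 1, 0)
end = datetime(2023, 1, 1, 3)
time_list = build_time_list(start, end, 60)
```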

static daynight_interpolate(interpolated_cube)[source]#

Set linearly interpolated data to zero for parameters (e.g. solar radiation parameters) which are zero if the sun is below the horizon.

Parameters:

interpolated_cube (Cube) – Cube containing the linearly interpolated data at the interpolation times in time_list.

Return type:

CubeList

Returns:

A list of cubes interpolated to the desired times.
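
The masking step can be sketched for a flat list of points. The function name mask_night is hypothetical, and the sketch assumes the sine of the solar elevation has already been calculated for each point at the interpolated time.

```python
from typing import List


def mask_night(values: List[float], sin_phi: List[float]) -> List[float]:
    """Zero interpolated values wherever the sun is below the horizon."""
    return [0.0 if s < 0 else v for v, s in zip(values, sin_phi)]


masked = mask_night([120.0, 80.0, 15.0], [0.6, 0.1, -0.2])
```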

static enforce_time_coords_dtype(cube)[source]#

Enforce the data type of the time, forecast_reference_time and forecast_period within the cube, so that time coordinates do not become misrepresented. The units of the time and forecast_reference_time are enforced to be “seconds since 1970-01-01 00:00:00” with a datatype of int64. The units of forecast_period are enforced to be seconds with a datatype of int32. This function modifies the cube in place.

Parameters:

cube (Cube) – The cube that will have the datatype and units for the time, forecast_reference_time and forecast_period coordinates enforced.

Return type:

Cube

Returns:

Cube where the datatype and units for the time, forecast_reference_time and forecast_period coordinates have been enforced.

process(cube_t0, cube_t1)[source]#

Interpolate data to intermediate times between validity times of cube_t0 and cube_t1.

Parameters:
  • cube_t0 (Cube) – A diagnostic cube valid at the beginning of the period within which interpolation is to be permitted.

  • cube_t1 (Cube) – A diagnostic cube valid at the end of the period within which interpolation is to be permitted.

Return type:

CubeList

Returns:

A list of cubes interpolated to the desired times.

Raises:
  • TypeError – If cube_t0 and cube_t1 are not of type iris.cube.Cube.

  • ValueError – A mix of instantaneous and period diagnostics have been used as inputs.

  • ValueError – A period type has been declared but inputs are not period diagnostics.

  • ValueError – Period diagnostics with overlapping periods.

  • CoordinateNotFoundError – The input cubes contain no time coordinate.

  • ValueError – Cubes contain multiple validity times.

  • ValueError – The input cubes are ordered such that the initial time cube has a later validity time than the final cube.

solar_interpolate(diag_cube, interpolated_cube)[source]#

Temporally interpolate using solar elevation. This is intended for parameters (e.g. solar radiation parameters like Downward Shortwave (SW) radiation or UV index) which are zero if the sun is below the horizon and scaled by the sine of the solar elevation angle if the sun is above the horizon.

Parameters:
  • diag_cube (Cube) – Cube containing diagnostic data valid at the beginning of the period and at the end of the period.

  • interpolated_cube (Cube) – Cube containing the linear interpolation of diag_cube at the interpolation times in time_list.

Return type:

CubeList

Returns:

A list of cubes interpolated to the desired times.
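
One plausible reading of scaling by the sine of the solar elevation is sketched below for a single point. This is an assumption, not the confirmed IMPROVER algorithm: the end-point values are normalised by their own sine of solar elevation, the normalised values are interpolated linearly, and the result is rescaled by the sine of solar elevation at the target time. The function name solar_scaled_value and all of its arguments are hypothetical.

```python
def solar_scaled_value(
    v0: float, v1: float, sin0: float, sin1: float, sin_t: float, fraction: float
) -> float:
    """Interpolate a solar parameter at one point (assumed scheme).

    v0, v1: values at the start and end of the period.
    sin0, sin1, sin_t: sine of solar elevation at the start, end and target time.
    fraction: position of the target time within the period, from 0 to 1.
    """
    if sin_t <= 0:
        # Sun on or below the horizon at the target time.
        return 0.0
    # Normalise each end point only where the sun is above the horizon.
    n0 = v0 / sin0 if sin0 > 0 else None
    n1 = v1 / sin1 if sin1 > 0 else None
    if n0 is None and n1 is None:
        return 0.0
    # If one end point is in darkness, fall back to the other end's ratio.
    n0 = n1 if n0 is None else n0
    n1 = n0 if n1 is None else n1
    return (n0 + fraction * (n1 - n0)) * sin_t


midpoint = solar_scaled_value(100.0, 300.0, 0.5, 1.0, 0.8, 0.5)
night = solar_scaled_value(100.0, 300.0, 0.5, 1.0, -0.1, 0.5)
```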

_as_tuple_if_list(bounds)[source]#

Convert a list to a tuple, or return as is if already a tuple or None.

Parameters:

bounds (Union[List[float], Tuple[float, float], None]) – The bounds to convert. Can be a list or tuple of two floats, or None.

Return type:

Optional[Tuple[float, float]]

Returns:

A tuple of two floats if bounds is a list or tuple, or None if bounds is None.

Raises:

TypeError – If bounds is not a list, tuple, or None.
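
The behaviour described above can be sketched in a few lines. The standalone name as_tuple_if_list and the error message are illustrative only.

```python
from typing import List, Optional, Tuple, Union


def as_tuple_if_list(
    bounds: Union[List[float], Tuple[float, float], None],
) -> Optional[Tuple[float, ...]]:
    """Return bounds as a tuple, passing None through unchanged."""
    if bounds is None:
        return None
    if isinstance(bounds, (list, tuple)):
        return tuple(bounds)
    raise TypeError("bounds must be a list, tuple or None.")


converted = as_tuple_if_list([0.0, 1.0])
```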

_run_film_chunk(arr1, arr2, times, model, start, end)[source]#

Run the Google FILM model for a chunk of data from start to end indices. Defined outside of the GoogleFilmInterpolation class to allow multiprocessing workers to call it.

Parameters:
  • arr1 (ndarray) – The first input array.

  • arr2 (ndarray) – The second input array.

  • times (ndarray) – Array of time points for interpolation.

  • model (Any) – The loaded TensorFlow Hub model.

  • start (int) – Start index for the chunk.

  • end (int) – End index for the chunk.

Return type:

ndarray

Returns:

Numpy array of interpolated data for the chunk.
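
The start and end indices passed to each chunk could be generated along these lines. The function name chunk_ranges is hypothetical, and the sketch assumes end indices are exclusive, as in Python slicing. How IMPROVER derives the chunks from max_batch is not shown in this documentation.

```python
from typing import List, Tuple


def chunk_ranges(n_items: int, max_batch: int) -> List[Tuple[int, int]]:
    """Split n_items into consecutive (start, end) ranges of at most max_batch."""
    if max_batch < 1:
        raise ValueError("max_batch must be at least 1.")
    return [
        (start, min(start + max_batch, n_items))
        for start in range(0, n_items, max_batch)
    ]


ranges = chunk_ranges(5, 2)
```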

_run_film_chunk_mp(args)[source]#

Run a chunk of data through the Google FILM model in a multiprocessing worker.

Parameters:

args – Tuple containing (arr1, arr2, times, model_path, start, end).

Returns:

Numpy array of interpolated data for the chunk.

load_model(model_path)[source]#

Load the TensorFlow Hub model. This is a standalone function to allow multiprocessing workers to load the model independently from the GoogleFilmInterpolation class.

Parameters:

model_path (str) – Path to the TensorFlow Hub module for the Google FILM model.

Return type:

Any

Returns:

The loaded TensorFlow Hub model.