improver.calibration.rainforest_training module#
RainForests model training plugin.
- class TrainRainForestsModel(model_config_dict, training_data, observation_column, training_columns, lightgbm_params=None)[source]#
Bases: BasePlugin
- __init__(model_config_dict, training_data, observation_column, training_columns, lightgbm_params=None)[source]#
Initialise the options used when training models.
- Parameters:
  - model_config_dict (dict[int, dict[str, dict[str, str]]]) – Dictionary describing the high-level RainForests model structure:
    - the top-level key describes the lead hour,
    - the next-level key describes the threshold,
    - the corresponding values locate the associated model files.
  - training_data (DataFrame) – Combined data set used to train the models.
  - observation_column (str) – The column in the data set that the models are trained to predict.
  - training_columns (list[str]) – The columns in the data set that the models are trained from.
  - lightgbm_params (dict | None) – Optional. Parameters passed to the training library. Any parameters given here override the default parameters.
The model_config_dict dictionary is of the format:
    {
        "24": {
            "0.000010": {
                "lightgbm_model": "<path_to_lightgbm_model_object>",
                "treelite_model": "<path_to_treelite_model_object>"
            },
            "0.000050": {
                "lightgbm_model": "<path_to_lightgbm_model_object>",
                "treelite_model": "<path_to_treelite_model_object>"
            },
            "0.000100": {
                "lightgbm_model": "<path_to_lightgbm_model_object>",
                "treelite_model": "<path_to_treelite_model_object>"
            },
        }
    }
The keys specify the lead times and model threshold values, while the associated values are the paths to the corresponding tree-model objects for that lead time and threshold.
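A nested dictionary of this shape can be generated rather than written by hand. The following is a minimal sketch using only the standard library; the helper name `build_model_config`, the directory, the lead times, and the file-naming scheme are all hypothetical and chosen only for illustration.

```python
from pathlib import PurePosixPath

# Hypothetical output directory and values, chosen only for illustration.
MODEL_DIR = PurePosixPath("/path/to/models")
LEAD_TIMES = [24, 48]
THRESHOLDS = [0.00001, 0.00005, 0.0001]


def build_model_config(model_dir, lead_times, thresholds):
    """Return a nested dict keyed by lead time, then by threshold."""
    config = {}
    for lead_time in lead_times:
        per_threshold = {}
        for threshold in thresholds:
            # Six decimal places matches the key style shown above ("0.000010").
            key = f"{threshold:.6f}"
            # The file stem and extensions are assumptions, not a required convention.
            stem = f"lead_{lead_time:03d}_threshold_{key}"
            per_threshold[key] = {
                "lightgbm_model": str(model_dir / f"{stem}.txt"),
                "treelite_model": str(model_dir / f"{stem}.so"),
            }
        config[str(lead_time)] = per_threshold
    return config


model_config_dict = build_model_config(MODEL_DIR, LEAD_TIMES, THRESHOLDS)
```

Formatting the threshold with a fixed number of decimal places keeps the keys consistent, so the same threshold always maps to the same entry.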
- _abc_impl = <_abc._abc_data object>#
- _train_model(threshold, model_path)[source]#
Train a model for a particular threshold and save it to disk.
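The default `'objective': 'binary'` setting on this page suggests that each threshold model is a binary classifier. The sketch below shows one way a binary target could be derived from observed values for a single threshold. It is an illustration under stated assumptions, not the plugin's code: the helper name `exceedance_labels`, the sample values, and the choice of `>=` as the comparison are all assumptions.

```python
def exceedance_labels(observations, threshold):
    """Return 1 where an observation meets or exceeds the threshold, else 0."""
    # Assumption: ">=" defines exceedance; the plugin may use a different comparison.
    return [int(value >= threshold) for value in observations]


# Hypothetical observed values, in the same units as the threshold.
observations = [0.0, 0.00002, 0.00007, 0.0003]
labels = exceedance_labels(observations, 0.00005)
```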
- params = {'num_leaves': 5, 'objective': 'binary', 'seed': 0}#
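The `lightgbm_params` argument is documented as overriding these defaults. A minimal sketch of such an override as a plain dictionary merge follows; the helper name `merge_params` is hypothetical, and the plugin's own merging logic is not shown on this page, so treat this as an assumption about the behaviour rather than the implementation.

```python
# Defaults copied from the class attribute documented above.
DEFAULT_PARAMS = {"num_leaves": 5, "objective": "binary", "seed": 0}


def merge_params(defaults, overrides=None):
    """Return a new dict in which user overrides replace matching defaults."""
    merged = dict(defaults)  # copy so the defaults are left untouched
    if overrides:
        merged.update(overrides)
    return merged


# "num_leaves" is overridden and "learning_rate" is added; the rest keep their defaults.
effective = merge_params(DEFAULT_PARAMS, {"num_leaves": 10, "learning_rate": 0.05})
```

Copying the defaults before updating avoids mutating a shared class-level dictionary, which would otherwise leak one instance's overrides into the next.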