YAHPO Gym Module Handbook

Subpackages

Submodules

yahpo_gym.benchmark_set module

class yahpo_gym.benchmark_set.BenchmarkSet(scenario: str | None = None, instance: str | None = None, active_session: bool = True, session: InferenceSession | None = None, multithread: bool = True, check: bool = True, noisy: bool = False)

Bases: object

Interface for a benchmark scenario. Initialized with a valid key for a valid scenario and optionally an onnxruntime.InferenceSession.

Parameters

scenario: str

(Required) A key for ConfigDict pertaining to a valid benchmark scenario (e.g. lcbench).

instance: str

(Optional) A key for ConfigDict pertaining to a valid instance (e.g. 3945). See BenchmarkSet(<key>).instances for a list of available instances.

active_session: bool

Should the benchmark run in an active onnxruntime.InferenceSession? Initialized to True.

session: onnxruntime.InferenceSession

An ONNX session to use for inference. Overwrites active_session and sets the provided onnxruntime.InferenceSession as the active session. Initialized to None.

multithread: bool

Should the ONNX session be allowed to leverage multithreading capabilities? Initialized to True, but on some HPC clusters it may be necessary to set this to False, depending on your setup. Only relevant if no session is given.

check: bool

Should input to objective_function be checked for validity? Initialized to True, but can be disabled for speedups.

noisy: bool

Use stochastic surrogate models? Initialized to False.
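A minimal construction sketch, assuming the yahpo_gym package is installed and its data path has been configured. It uses only constructor arguments documented above; the helper name `make_benchmark` is hypothetical, and the import is deferred so the function can be defined without the package present.

```python
def make_benchmark(scenario, instance=None, check=True, noisy=False):
    """Build a BenchmarkSet from the documented constructor arguments.

    `make_benchmark` is a hypothetical helper, not part of yahpo_gym.
    """
    # Deferred import: yahpo_gym is only required when the helper is called.
    from yahpo_gym.benchmark_set import BenchmarkSet

    return BenchmarkSet(
        scenario=scenario,  # key of a valid scenario, e.g. "lcbench"
        instance=instance,  # optional instance key, e.g. "3945"
        check=check,        # validate inputs to objective_function
        noisy=noisy,        # True requests the stochastic surrogate models
    )
```

For example, `make_benchmark("lcbench", instance="3945")` would construct the benchmark for the scenario and instance used as examples in the parameter descriptions.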

get_fidelity_space(seed: int | None = None)

Get the fidelity space to be optimized for.

Parameters

seed: int

Seed for the ConfigSpace. Optional, initialized to None.

get_opt_space(drop_fidelity_params: bool = False, seed: int | None = None)

Get the search space to be optimized. Sets ‘instance’ as a constant instance and removes all fidelity parameters if ‘drop_fidelity_params = True’.

Parameters

drop_fidelity_params: bool

Should fidelity params be dropped from the opt_space? Defaults to False.

seed: int

Seed for the ConfigSpace. Optional, initialized to None.
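The following sketch shows one way to draw configurations from the search space. It assumes that `get_opt_space` returns a ConfigSpace configuration space whose `sample_configuration()` yields a mapping-like object; the helper name `sample_configurations` is hypothetical, and `bench` stands for any already constructed BenchmarkSet.

```python
def sample_configurations(bench, n, seed=None, drop_fidelity_params=False):
    """Draw `n` configurations from the benchmark's search space as dicts.

    Hypothetical helper; `bench` is expected to expose get_opt_space()
    with the signature documented above.
    """
    space = bench.get_opt_space(
        drop_fidelity_params=drop_fidelity_params,
        seed=seed,
    )
    # Assumption: each sampled configuration behaves like a mapping,
    # so dict(...) turns it into a plain dictionary.
    return [dict(space.sample_configuration()) for _ in range(n)]
```

If fidelity parameters are dropped here, they presumably have to be added back to each dict before it is passed to `objective_function`.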

property instance

The selected instance. Returns None if no instance was selected.

property instances

A list of valid instances for the scenario. An instance usually refers to the dataset on which the selected ML algorithm (= `scenario`) is fit. Note: The rbv2_*, lcbench, and iaml_* scenarios contain instances based on OpenML datasets. Parameters of the ConfigSpace correspond to the dataset identifier, not the task identifier, even though the instance_name is OpenMLTaskId.

objective_function(configuration: Dict | List[Dict], seed: int | None = None, logging: bool = False, multithread: bool = True)

Evaluate the surrogate for (a) given configuration(s).

Parameters

configuration: Dict

A valid dict or list of dicts containing hyperparameters to be evaluated. Attention: configuration is not checked for internal validity for speed purposes.

logging: bool

Should the evaluation be logged in the archive? Initialized to False.

multithread: bool

Should the ONNX session be allowed to leverage multithreading capabilities? Initialized to True, but on some HPC clusters it may be necessary to set this to False, depending on your setup. Only relevant if no active session has been set.

objective_function_timed(configuration: Dict | List[Dict], seed: int | None = None, logging: bool = False, multithread: bool = True)

Evaluate the surrogate for (a) given configuration(s) and sleep for ‘self.quant’ * predicted runtime(s). The quantity ‘self.quant’ is automatically inferred if it is not set manually. If configuration is a list of dicts, the sleep happens once, after all evaluations. Note that this assumes the predicted runtime is in seconds.

Parameters

configuration: Dict

A valid dict or list of dicts containing hyperparameters to be evaluated. Attention: configuration is not checked for internal validity for speed purposes.

logging: bool

Should the evaluation be logged in the archive? Initialized to False.

multithread: bool

Should the ONNX session be allowed to leverage multithreading capabilities? Initialized to True, but on some HPC clusters it may be necessary to set this to False, depending on your setup. Only relevant if no active session has been set.
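The timing behaviour described for objective_function_timed can be illustrated with a small standalone sketch. This is not the library's implementation: `predict` is a stand-in for the surrogate, the result key `"time"` is a hypothetical name for the predicted runtime, and summing the runtimes of a batch is an assumption, since the description only states that the sleep happens after all evaluations.

```python
import time


def evaluate_timed(predict, configurations, quant, runtime_key="time", sleep=time.sleep):
    """Evaluate every configuration, then sleep once for quant * predicted runtime.

    predict: stand-in for the surrogate; maps a configuration dict to a result dict.
    runtime_key: hypothetical name of the predicted runtime (seconds) in each result.
    sleep: injectable so the waiting can be observed or skipped in tests.
    """
    # Accept a single dict or a list of dicts, as the documented method does.
    if isinstance(configurations, dict):
        configurations = [configurations]
    results = [predict(c) for c in configurations]
    # Assumption: predicted runtimes of a batch are summed before scaling.
    total_runtime = sum(r[runtime_key] for r in results)
    sleep(quant * total_runtime)
    return results
```

Passing a recording function as `sleep` makes it possible to check the scaled waiting time without actually waiting.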

property properties

List of properties of the benchmark scenario: Describes the type of the search space: (‘continuous’, ‘mixed’, ‘categorical’, ‘hierarchical’) and availability of metadata (e.g. ‘memory’: memory measurements are available). Note that the instance name or id variable is not counted as a categorical hyperparameter.

set_constant(param: str, value=None)

Set a given hyperparameter to a constant.

Parameters

param: str

A valid parameter name.

value: int | str | any

A valid value for the parameter param.

set_instance(value)

Set an instance.

Parameters

value: int | str | any

A valid value for the parameter pertaining to the configuration. See instances.

set_session(session: InferenceSession | None = None, multithread: bool = True)

Set the session for inference on the surrogate model.

Parameters

session: onnxruntime.InferenceSession

An ONNX session to use for inference. Overwrites active_session and sets the provided onnxruntime.InferenceSession as the active session. Initialized to None.

multithread: bool

Should the ONNX session be allowed to leverage multithreading capabilities? Initialized to True, but on some HPC clusters it may be necessary to set this to False, depending on your setup. Only relevant if no session is given.

property target_stats

Empirical minimum and maximum of the target. Obtained through random search with 0.5M points.

Returns a data.frame with the following columns:
  • metric :: target

  • statistic :: min or max

  • value :: value of minimum/maximum

  • scenario :: the scenario

  • instance :: the instance

If no instance is set, all instances for a given scenario are returned.
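One possible use of these statistics is rescaling a target value relative to its empirical range. The sketch below represents the table as a list of row dicts with the columns listed above; the helper name `rescale_target` and any numbers passed to it are illustrative only. If the returned table is a pandas DataFrame, `df.to_dict("records")` yields rows in this form.

```python
def rescale_target(rows, metric, value, instance=None):
    """Rescale `value` relative to the empirical min/max of `metric`.

    rows: list of dicts with keys metric, statistic, value, scenario, instance.
    Hypothetical helper, not part of yahpo_gym.
    """
    selected = [
        r for r in rows
        if r["metric"] == metric and (instance is None or r["instance"] == instance)
    ]
    lowest = min(r["value"] for r in selected if r["statistic"] == "min")
    highest = max(r["value"] for r in selected if r["statistic"] == "max")
    if highest == lowest:
        return 0.0  # degenerate range: nothing to rescale
    # The bounds are empirical, so results can fall slightly outside [0, 1].
    return (value - lowest) / (highest - lowest)
```

Because the minimum and maximum come from random search rather than from the true range, the rescaled value is only approximately bounded.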

property targets

A list of available targets for the scenario.

yahpo_gym.configuration module

class yahpo_gym.configuration.ConfigDict

Bases: object

Dictionary of available benchmark scenarios (configurations). This provides a thin wrapper allowing for easy updating and retrieving of configurations pertaining to a specific benchmark scenario.

get_item(key: str, **kwargs)

Instantiate a given Configuration.

Parameters

key: str

The key of the configuration to retrieve

update(config_dict: Dict)

Add new or update existing benchmark scenario configuration.

Parameters

config_dict: dict

A dictionary of settings required for a given configuration.

class yahpo_gym.configuration.Configuration(config_dict: Dict)

Bases: object

Interface for benchmark scenario meta information. Abstract base class used to instantiate configurations that contain all relevant meta-information about a specific benchmark scenario.

Parameters

config_dict: dict

A dictionary of settings required for a given configuration.

property config_path

property data

get_path(key: str)

property hp_names

yahpo_gym.configuration.cfg(key: str | None = None, **kwargs)

Shorthand access to ‘ConfigDict’.

Parameters

key: str

The key of the configuration to retrieve. If None, prints available keys.

yahpo_gym.configuration.list_scenarios()

List available scenarios.

Returns:

List: the available scenarios.

yahpo_gym.local_config module

class yahpo_gym.local_config.LocalConfiguration(settings_path: str | None = None)

Bases: object

Interface for setting up a local configuration. This reads from and writes to a configuration file in the YAML format, which stores the paths to the underlying data and models required for inference on the fitted surrogates.

Parameters

settings_path: str

Path to the local configuration file. The default is “~/.config/yahpo_gym”.

property config

The stored settings dictionary (cached).

property data_path

Path where metadata and surrogate models for inference are stored.

init_config(data_path: str = '')

Initialize a new local configuration.

This writes a local configuration file to the specified ‘settings_path’. It is currently used to globally store the following information: ‘data_path’, a path to the metadata required for inference.

Parameters

data_path: str

Path to the directory where surrogate models and metadata are saved.

set_data_path(data_path: str)

Set path to directory where required models and metadata are stored.

Parameters

data_path: str

Path to the directory where surrogate models and metadata are saved.
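A short setup sketch, assuming the yahpo_gym package is installed. It uses only the constructor and `init_config` signature documented above; the helper name `configure_data_path` is hypothetical, and the import is deferred so the function can be defined without the package present.

```python
def configure_data_path(data_path, settings_path=None):
    """Write a local configuration that points at `data_path`.

    Hypothetical helper, not part of yahpo_gym.
    """
    # Deferred import: yahpo_gym is only required when the helper is called.
    from yahpo_gym.local_config import LocalConfiguration

    # settings_path=None keeps the documented default location.
    local_config = LocalConfiguration(settings_path=settings_path)
    local_config.init_config(data_path=data_path)
    return local_config
```

The path can be changed later on the returned object with `set_data_path`.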

Module contents