YAHPO Gym Module Handbook

Subpackages

yahpo_gym.benchmarks package

Submodules

yahpo_gym.benchmark_set module

class yahpo_gym.benchmark_set.BenchmarkSet(scenario: str | None = None, instance: str | None = None, active_session: bool = True, session: InferenceSession | None = None, multithread: bool = True, check: bool = True, noisy: bool = False)

Bases: object

Interface for a benchmark scenario. Initialized with a valid key for a valid scenario and optionally an onnxruntime.InferenceSession.

Parameters

scenario: str: (Required) A key for ConfigDict pertaining to a valid benchmark scenario (e.g. lcbench).
instance: str: (Optional) A key for ConfigDict pertaining to a valid instance (e.g. 3945). See BenchmarkSet(<key>).instances for a list of available instances.
active_session: bool: Should the benchmark run in an active onnxruntime.InferenceSession? Initialized to True.
session: onnxruntime.InferenceSession: An ONNX session to use for inference. Overwrites active_session and sets the provided onnxruntime.InferenceSession as the active session. Initialized to None.
multithread: bool: Should the ONNX session be allowed to leverage multithreading capabilities? Initialized to True, but on some HPC clusters you may need to set this to False, depending on your setup. Only relevant if no session is given.
check: bool: Should input to objective_function be checked for validity? Initialized to True, but can be disabled for speedups.
noisy: bool: Use stochastic surrogate models? Initialized to False.
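The constructor flags above can be bundled in a small helper. The following is a minimal sketch, not part of YAHPO Gym: `benchmark_kwargs` and `make_benchmark` are hypothetical names, and only the `BenchmarkSet` signature documented above is assumed. The import is done lazily because constructing a benchmark requires the package and its data to be set up.

```python
from typing import Any, Dict, Optional


def benchmark_kwargs(on_hpc: bool = False, skip_checks: bool = False,
                     noisy: bool = False) -> Dict[str, Any]:
    """Translate coarse switches into BenchmarkSet keyword arguments (hypothetical helper)."""
    return {
        "multithread": not on_hpc,   # some HPC clusters need multithread=False
        "check": not skip_checks,    # disabling input checks trades safety for speed
        "noisy": noisy,              # True selects the stochastic surrogate models
    }


def make_benchmark(scenario: str, instance: Optional[str] = None, **kwargs: Any):
    """Construct a BenchmarkSet; requires yahpo_gym and its data to be set up."""
    # Imported lazily so this module loads even where yahpo_gym is not installed.
    from yahpo_gym.benchmark_set import BenchmarkSet
    return BenchmarkSet(scenario, instance=instance, **kwargs)
```

With the example keys from the parameter descriptions, a call could look like `make_benchmark("lcbench", instance="3945", **benchmark_kwargs(on_hpc=True))`.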

get_fidelity_space(seed: int | None = None)

Get the fidelity space to be optimized for.

Parameters

seed: int: Seed for the ConfigSpace. Optional, initialized to None.

get_opt_space(drop_fidelity_params: bool = False, seed: int | None = None)

Get the search space to be optimized. Sets ‘instance’ as a constant instance and removes all fidelity parameters if ‘drop_fidelity_params = True’.

Parameters

drop_fidelity_params: bool: Should fidelity params be dropped from the opt_space? Defaults to False.
seed: int: Seed for the ConfigSpace. Optional, initialized to None.
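As an illustration of how the search space might be used, the sketch below draws configurations from the object returned by `get_opt_space`. The helper name `sample_configs` is hypothetical. It assumes the returned space offers a `sample_configuration()` method (as ConfigSpace objects do) and that a sampled configuration can be converted with `dict(...)`; older ConfigSpace releases may need a different conversion.

```python
from typing import Any, Dict, List


def sample_configs(bench: Any, n: int, seed: int = 0,
                   drop_fidelity_params: bool = False) -> List[Dict[str, Any]]:
    """Draw n configurations from the benchmark's search space as plain dicts."""
    space = bench.get_opt_space(drop_fidelity_params=drop_fidelity_params, seed=seed)
    # Called without arguments so that exactly one configuration is returned per call.
    # dict(...) assumes the sampled object behaves like a mapping.
    return [dict(space.sample_configuration()) for _ in range(n)]
```

Passing a fixed `seed` is meant to make the draws reproducible across runs.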

property instance: The selected instance. Returns none if no instance was selected.

property instances: A list of valid instances for the scenario. This usually refers to the dataset the selected ML algorithm (=`scenario`) the model should be fit on. Note: The rbv2_*, lcbench, and iaml_* scenarios contain instances based on OpenML datasets. Parameters of the ConfigSpace correspond to the dataset identifier, not the task identifier even though the instance_name is OpenMLTaskId.

objective_function(configuration: Dict | List[Dict], seed: int | None = None, logging: bool = False, multithread: bool = True)

Evaluate the surrogate for (a) given configuration(s).

Parameters

configuration: Dict: A valid dict or list of dicts containing hyperparameters to be evaluated. Attention: configuration is not checked for internal validity for speed purposes.
logging: bool: Should the evaluation be logged in the archive? Initialized to False.
multithread: bool: Should the ONNX session be allowed to leverage multithreading capabilities? Initialized to True, but on some HPC clusters you may need to set this to False, depending on your setup. Only relevant if no active session has been set.
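Because `objective_function` accepts a list of dicts, several configurations can be evaluated in one call. The sketch below picks the best one for a given target. The helper `best_configuration` is hypothetical, and it assumes that the call returns one result dict per configuration, keyed by target name; the return type is not stated in this handbook, so check it against your installed version.

```python
from typing import Any, Dict, List, Tuple


def best_configuration(bench: Any, configs: List[Dict[str, Any]], target: str,
                       minimize: bool = True) -> Tuple[Dict[str, Any], float]:
    """Evaluate all configs in one call and return the best (config, value) pair."""
    if not configs:
        raise ValueError("configs must not be empty")
    # One batched call; assumed to return one result dict per configuration.
    results = bench.objective_function(configs)
    values = [float(r[target]) for r in results]
    pick = min if minimize else max
    idx = pick(range(len(values)), key=values.__getitem__)
    return configs[idx], values[idx]
```

Valid names for `target` are listed by the `targets` property.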

objective_function_timed(configuration: Dict | List[Dict], seed: int | None = None, logging: bool = False, multithread: bool = True)

Evaluate the surrogate for (a) given configuration(s) and sleep for ‘self.quant’ * predicted runtime(s). The quantity ‘self.quant’ is automatically inferred if it is not set manually. If configuration is a list of dicts, sleep is done after all evaluations. Note that this assumes the predicted runtime is in seconds.

Parameters

configuration: Dict: A valid dict or list of dicts containing hyperparameters to be evaluated. Attention: configuration is not checked for internal validity for speed purposes.
logging: bool: Should the evaluation be logged in the archive? Initialized to False.
multithread: bool: Should the ONNX session be allowed to leverage multithreading capabilities? Initialized to True, but on some HPC clusters you may need to set this to False, depending on your setup. Only relevant if no active session has been set.
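The timing rule described above (sleep for ‘self.quant’ * predicted runtime) can be written out as simple arithmetic. This is a sketch of the rule only, not the library's implementation: `simulated_sleep_seconds` is a hypothetical name, and summing the runtimes of a batch is an assumption, since the text only states that the sleep happens after all evaluations.

```python
from typing import Iterable


def simulated_sleep_seconds(predicted_runtimes: Iterable[float], quant: float) -> float:
    """Return quant * total predicted runtime, with runtimes given in seconds."""
    if quant < 0:
        raise ValueError("quant must be non-negative")
    # Assumption: for a batch, the single sleep covers the summed runtimes.
    return quant * sum(predicted_runtimes)
```

For example, two predicted runtimes of 120 s and 60 s with a quant of 0.5 would give a 90 s sleep under this assumption.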

property properties: List of properties of the benchmark scenario: Describes the type of the search space: (‘continuous’, ‘mixed’, ‘categorical’, ‘hierarchical’) and availability of metadata (e.g. ‘memory’: memory measurements are available). Note that the instance name or id variable is not counted as a categorical hyperparameter.

set_constant(param: str, value=None)

Set a given hyperparameter to a constant.

Parameters

param: str: A valid parameter name.
value: int | str | any: A valid value for the parameter param.

set_instance(value)

Set an instance.

Parameters

value: int | str | any: A valid value for the parameter pertaining to the configuration. See instances.
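`set_instance` and `set_constant` are typically called before requesting the search space. The following sketch combines them with a membership check against `instances`. The helper `configure_benchmark` is hypothetical; comparing stringified values is an assumption made so that `3945` and `"3945"` are treated alike.

```python
from typing import Any, Dict, Optional


def configure_benchmark(bench: Any, instance: Any,
                        constants: Optional[Dict[str, Any]] = None) -> Any:
    """Select an instance and fix hyperparameters on a BenchmarkSet-like object."""
    # Compare as strings so that int and str spellings of an instance both match.
    valid = {str(i) for i in bench.instances}
    if str(instance) not in valid:
        raise ValueError(f"unknown instance {instance!r}; see bench.instances")
    bench.set_instance(instance)
    for param, value in (constants or {}).items():
        bench.set_constant(param, value)
    return bench
```

After this call, `get_opt_space` treats the selected instance as a constant, as described for that method.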

set_session(session: InferenceSession | None = None, multithread: bool = True)

Set the session for inference on the surrogate model.

Parameters

session: onnxruntime.InferenceSession: An ONNX session to use for inference. Overwrites active_session and sets the provided onnxruntime.InferenceSession as the active session. Initialized to None.
multithread: bool: Should the ONNX session be allowed to leverage multithreading capabilities? Initialized to True, but on some HPC clusters you may need to set this to False, depending on your setup. Only relevant if no session is given.

property target_stats

Empirical minimum and maximum of the target. Obtained through random search with 0.5M points.

Returns a data.frame with the following columns:

metric :: target
statistic :: min or max
value :: value of minimum/maximum
scenario :: the scenario
instance :: the instance

If no instance is set, all instances for a given scenario are returned.
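One common use of these statistics is rescaling a target value to the unit interval. The sketch below works on plain row dicts with the columns listed above; `target_range` and `normalize` are hypothetical helpers. If `target_stats` is a pandas DataFrame (an assumption, since the text calls it a data.frame), such rows could be obtained with `bench.target_stats.to_dict("records")`.

```python
from typing import Any, Dict, Iterable, Optional, Tuple


def target_range(rows: Iterable[Dict[str, Any]], metric: str,
                 instance: Optional[str] = None) -> Tuple[float, float]:
    """Look up the empirical (min, max) of a target from target_stats-like rows."""
    lo: Optional[float] = None
    hi: Optional[float] = None
    for row in rows:
        if row["metric"] != metric:
            continue
        if instance is not None and str(row["instance"]) != str(instance):
            continue
        value = float(row["value"])
        # Without an instance filter, take the widest range over all instances.
        if row["statistic"] == "min":
            lo = value if lo is None else min(lo, value)
        elif row["statistic"] == "max":
            hi = value if hi is None else max(hi, value)
    if lo is None or hi is None:
        raise KeyError(f"no min/max entry for metric {metric!r}")
    return lo, hi


def normalize(value: float, lo: float, hi: float) -> float:
    """Rescale value to [0, 1] relative to the empirical range."""
    if hi == lo:
        return 0.0  # degenerate range: avoid division by zero
    return (value - lo) / (hi - lo)
```

Because the range is empirical (obtained through random search), a normalized value can fall slightly outside [0, 1].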

property targets: A list of available targets for the scenario.

yahpo_gym.configuration module

class yahpo_gym.configuration.ConfigDict

Bases: object

Dictionary of available benchmark scenarios (configurations). This provides a thin wrapper allowing for easy updating and retrieving of configurations pertaining to a specific benchmark scenario.

get_item(key: str, **kwargs)

Instantiate a given Configuration.

Parameters

key: str: The key of the configuration to retrieve

update(config_dict: Dict)

Add new or update existing benchmark scenario configuration.

Parameters

config_dict: dict: A dictionary of settings required for a given configuration.

class yahpo_gym.configuration.Configuration(config_dict: Dict)

Bases: object

Interface for benchmark scenario meta information. Abstract base class used to instantiate configurations that contain all relevant meta-information about a specific benchmark scenario.

Parameters

config_dict: dict: A dictionary of settings required for a given configuration.

property config_path

property data

get_path(key: str)

property hp_names

yahpo_gym.configuration.cfg(key: str | None = None, **kwargs)

Shorthand access to ‘ConfigDict’.

Parameters

key: str: The key of the configuration to retrieve. If None, prints available keys.
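A configuration retrieved via `cfg` exposes the properties listed for `Configuration` above. The sketch below collects two of them into a plain dict for inspection. `summarize_configuration` is a hypothetical helper, and it assumes that `hp_names` is an iterable of names and that `config_path` can be converted to a string.

```python
from typing import Any, Dict


def summarize_configuration(config: Any) -> Dict[str, Any]:
    """Collect basic meta-information from a Configuration-like object."""
    hp_names = list(config.hp_names)  # assumed to be an iterable of parameter names
    return {
        "config_path": str(config.config_path),
        "n_hyperparameters": len(hp_names),
        "hp_names": hp_names,
    }
```

With YAHPO Gym installed and set up, this could be called as `summarize_configuration(cfg("lcbench"))` after `from yahpo_gym.configuration import cfg`.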

yahpo_gym.configuration.list_scenarios()

List available scenarios.

Returns: List

yahpo_gym.local_config module

class yahpo_gym.local_config.LocalConfiguration(settings_path: str | None = None)

Bases: object

Interface for setting up a local configuration. This reads from and writes to a configuration file in the YAML format, allowing to store paths to the underlying data and models required for inference on the fitted surrogates.

Parameters

settings_path: str: Path to the local configuration file. The default is “~/.config/yahpo_gym”.

property config: The stored settings dictionary (cached).

property data_path: Path where metadata and surrogate models for inference are stored.

init_config(data_path: str = '')

Initialize a new local configuration.

This writes a local configuration file to the specified ‘settings_path’. It is currently used to globally store the following information: ‘data_path’: A path to the metadata required for inference.

Parameters

data_path: str: Path to the directory where surrogate models and metadata are saved.

set_data_path(data_path: str)

Set path to directory where required models and metadata are stored.

Parameters

data_path: str: Path to the directory where surrogate models and metadata are saved.
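A one-time setup with `LocalConfiguration` might look like the sketch below. `resolve_data_path` and `setup_local_config` are hypothetical helpers; expanding `~` and making the path absolute before storing it is a design choice of this sketch, not documented library behaviour. The import is lazy so that the helpers can be loaded without the package installed.

```python
import os
from typing import Any, Optional


def resolve_data_path(data_path: str) -> str:
    """Expand '~' and return an absolute path."""
    return os.path.abspath(os.path.expanduser(data_path))


def setup_local_config(data_path: str, local_config: Optional[Any] = None) -> Any:
    """Write the data path into a LocalConfiguration-like object and return it."""
    if local_config is None:
        # Imported lazily; uses the default settings_path of LocalConfiguration.
        from yahpo_gym.local_config import LocalConfiguration
        local_config = LocalConfiguration()
    local_config.init_config(data_path=resolve_data_path(data_path))
    return local_config
```

To change the path later without re-initializing, `set_data_path` takes the same kind of argument.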
