YAHPO Gym Module Handbook
Subpackages
Submodules
yahpo_gym.benchmark_set module
- class yahpo_gym.benchmark_set.BenchmarkSet(scenario: str | None = None, instance: str | None = None, active_session: bool = True, session: InferenceSession | None = None, multithread: bool = True, check: bool = True, noisy: bool = False)
Bases:
object
Interface for a benchmark scenario. Initialized with a valid key for a valid scenario and optionally an onnxruntime.InferenceSession.
Parameters
- scenario: str
(Required) A key for ConfigDict pertaining to a valid benchmark scenario (e.g. lcbench).
- instance: str
(Optional) A key for ConfigDict pertaining to a valid instance (e.g. 3945). See BenchmarkSet(<key>).instances for a list of available instances.
- active_session: bool
Should the benchmark run in an active onnxruntime.InferenceSession? Initialized to True.
- session: onnxruntime.InferenceSession
An ONNX session to use for inference. Overrides active_session and sets the provided onnxruntime.InferenceSession as the active session. Initialized to None.
- multithread: bool
Should the ONNX session be allowed to leverage multithreading capabilities? Initialized to True, but on some HPC clusters it may need to be set to False, depending on your setup. Only relevant if no session is given.
- check: bool
Should input to objective_function be checked for validity? Initialized to True, but can be disabled for speedups.
- noisy: bool
Use stochastic surrogate models? Initialized to False.
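The constructor parameters above can be exercised with a small helper. This is a minimal sketch, not part of the package: `make_benchmark` and its `factory` argument are hypothetical names introduced here so the call pattern can be tried without yahpo_gym and its surrogate data installed. When `factory` is omitted, the documented `BenchmarkSet` class is used.

```python
def make_benchmark(scenario, instance=None, factory=None, **options):
    """Create a benchmark for `scenario`, optionally pinned to `instance`.

    `factory` is a hypothetical hook for this sketch only; when omitted,
    the documented BenchmarkSet class is imported and used.
    """
    if factory is None:
        # Lazy import: needs yahpo_gym and its surrogate data to be set up.
        from yahpo_gym.benchmark_set import BenchmarkSet
        factory = BenchmarkSet
    # Remaining keyword options (check, noisy, multithread, ...) pass through.
    return factory(scenario=scenario, instance=instance, **options)
```

With the package set up, `make_benchmark("lcbench", instance="3945", check=False)` would build a benchmark for the example keys mentioned in the parameter descriptions, with input checking disabled.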
- get_fidelity_space(seed: int | None = None)
Get the fidelity space to be optimized for.
Parameters
- seed: int
Seed for the ConfigSpace. Optional, initialized to None.
- get_opt_space(drop_fidelity_params: bool = False, seed: int | None = None)
Get the search space to be optimized. Sets ‘instance’ as a constant instance and removes all fidelity parameters if ‘drop_fidelity_params = True’.
Parameters
- drop_fidelity_params: bool
Should fidelity params be dropped from the opt_space? Defaults to False.
- seed: int
Seed for the ConfigSpace. Optional, initialized to None.
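A search space returned by get_opt_space can be sampled to obtain a configuration. The sketch below assumes the returned object is a ConfigSpace configuration space whose `sample_configuration()` returns a single configuration when called without arguments, and that the configuration is mapping-like (as in recent ConfigSpace releases; older ones expose `get_dictionary()` instead). `draw_config` is a hypothetical helper name.

```python
def draw_config(bench, seed=None, drop_fidelity_params=False):
    """Draw one configuration from the benchmark's search space as a dict."""
    space = bench.get_opt_space(
        drop_fidelity_params=drop_fidelity_params, seed=seed
    )
    # Called without arguments, the space yields a single configuration.
    config = space.sample_configuration()
    # Assumption: the configuration behaves like a mapping.
    return dict(config)
```

Passing a fixed `seed` makes the draw reproducible, which is useful when comparing optimizers on the same starting points.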
- property instance
The selected instance. Returns None if no instance was selected.
- property instances
A list of valid instances for the scenario. An instance usually refers to the dataset on which the selected ML algorithm (= `scenario`) is fit. Note: The rbv2_*, lcbench, and iaml_* scenarios contain instances based on OpenML datasets. Parameters of the ConfigSpace correspond to the dataset identifier, not the task identifier, even though the instance_name is OpenMLTaskId.
- objective_function(configuration: Dict | List[Dict], seed: int | None = None, logging: bool = False, multithread: bool = True)
Evaluate the surrogate for (a) given configuration(s).
Parameters
- configuration: Dict | List[Dict]
A valid dict or list of dicts containing hyperparameters to be evaluated. Attention: configuration is not checked for internal validity for speed purposes.
- logging: bool
Should the evaluation be logged in the archive? Initialized to False.
- multithread: bool
Should the ONNX session be allowed to leverage multithreading capabilities? Initialized to True, but on some HPC clusters it may need to be set to False, depending on your setup. Only relevant if no active session has been set.
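Because objective_function accepts a list of dicts, several configurations can be evaluated in one call and then compared. The following sketch assumes the call returns one dict of target values per configuration, in input order; that return shape is not stated above and is an assumption, as is the helper name `best_configuration`.

```python
def best_configuration(bench, configs, target, maximize=True):
    """Evaluate `configs` in one batch and return the best (config, result) pair."""
    # Assumption: one result dict per configuration, in the same order.
    results = bench.objective_function(configs)
    pairs = list(zip(configs, results))
    pick = max if maximize else min
    # Rank the pairs by the chosen target value.
    return pick(pairs, key=lambda pair: pair[1][target])
```

Valid names for `target` are listed by the targets property of the benchmark.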
- objective_function_timed(configuration: Dict | List[Dict], seed: int | None = None, logging: bool = False, multithread: bool = True)
Evaluate the surrogate for (a) given configuration(s) and sleep for ‘self.quant’ * predicted runtime(s). The quantity ‘self.quant’ is automatically inferred if it is not set manually. If configuration is a list of dicts, the sleep happens after all evaluations. Note that this assumes the predicted runtime is in seconds.
Parameters
- configuration: Dict | List[Dict]
A valid dict or list of dicts containing hyperparameters to be evaluated. Attention: configuration is not checked for internal validity for speed purposes.
- logging: bool
Should the evaluation be logged in the archive? Initialized to False.
- multithread: bool
Should the ONNX session be allowed to leverage multithreading capabilities? Initialized to True, but on some HPC clusters it may need to be set to False, depending on your setup. Only relevant if no active session has been set.
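The timing behaviour described for objective_function_timed can be illustrated in isolation. This is a sketch of the idea, not the library's implementation: it assumes each result is a dict holding a predicted runtime in seconds under a key (here the hypothetical `runtime_key`), and it sleeps once for the summed runtimes scaled by `quant`, which is one reading of "sleep is done after all evaluations".

```python
import time


def timed_evaluate(evaluate, configs, quant, runtime_key="time", sleep=time.sleep):
    """Evaluate all configs, then sleep for quant * total predicted runtime."""
    results = evaluate(configs)
    # Assumption: each result reports its predicted runtime in seconds.
    total_runtime = sum(result[runtime_key] for result in results)
    # A single sleep after the whole batch, scaled down by `quant`.
    sleep(quant * total_runtime)
    return results
```

A small `quant` (for example 0.01) keeps the relative cost of configurations visible to an optimizer while shortening wall-clock time.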
- property properties
List of properties of the benchmark scenario. These describe the type of the search space (‘continuous’, ‘mixed’, ‘categorical’, ‘hierarchical’) and the availability of metadata (e.g. ‘memory’: memory measurements are available). Note that the instance name or id variable is not counted as a categorical hyperparameter.
- set_constant(param: str, value=None)
Set a given hyperparameter to a constant.
Parameters
- param: str
A valid parameter name.
- value: int | str | any
A valid value for the parameter param.
- set_instance(value)
Set an instance.
Parameters
- value: int | str | any
A valid value for the parameter pertaining to the configuration. See instances.
- set_session(session: InferenceSession | None = None, multithread: bool = True)
Set the session for inference on the surrogate model.
Parameters
- session: onnxruntime.InferenceSession
An ONNX session to use for inference. Overrides active_session and sets the provided onnxruntime.InferenceSession as the active session. Initialized to None.
- multithread: bool
Should the ONNX session be allowed to leverage multithreading capabilities? Initialized to True, but on some HPC clusters it may need to be set to False, depending on your setup. Only relevant if no session is given.
- property target_stats
Empirical minimum and maximum of the target. Obtained through random search with 0.5M points.
- Returns a data.frame with the following columns:
metric :: target
statistic :: min or max
value :: value of minimum/maximum
scenario :: the scenario
instance :: the instance
If no instance is set, all instances for a given scenario are returned.
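One common use of empirical minima and maxima is to rescale a target to the unit interval. The sketch below works on plain row dicts with the columns listed above (metric, statistic, value, scenario, instance); the helper names are hypothetical, and converting the returned table to such rows (for a pandas DataFrame, `df.to_dict("records")`) is an assumption about the return type.

```python
def target_range(rows, metric, instance):
    """Return (min, max) of `metric` for `instance` from target_stats-style rows."""
    found = {}
    for row in rows:
        if row["metric"] == metric and row["instance"] == instance:
            # The statistic column holds either "min" or "max".
            found[row["statistic"]] = row["value"]
    return found["min"], found["max"]


def normalise(value, low, high):
    """Rescale `value` to [0, 1] given the empirical range."""
    if high == low:
        return 0.0  # Degenerate range: avoid division by zero.
    return (value - low) / (high - low)
```

Since the range comes from random search, a surrogate prediction can fall slightly outside it, so normalised values are not guaranteed to lie strictly within [0, 1].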
- property targets
A list of available targets for the scenario.
yahpo_gym.configuration module
- class yahpo_gym.configuration.ConfigDict
Bases:
object
Dictionary of available benchmark scenarios (configurations). This provides a thin wrapper allowing for easy updating and retrieving of configurations pertaining to a specific benchmark scenario.
- class yahpo_gym.configuration.Configuration(config_dict: Dict)
Bases:
object
Interface for benchmark scenario meta information. Abstract base class used to instantiate configurations that contain all relevant meta-information about a specific benchmark scenario.
Parameters
- config_dict: dict
A dictionary of settings required for a given configuration.
- property config_path
- property data
- property hp_names
- yahpo_gym.configuration.cfg(key: str | None = None, **kwargs)
Shorthand access to ‘ConfigDict’.
Parameters
- key: str
The key of the configuration to retrieve. If None, prints available keys.
- yahpo_gym.configuration.list_scenarios()
List available scenarios.
- Returns:
List: The available scenarios.
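A scenario key can be validated against list_scenarios before a benchmark is built. This is a minimal sketch: `check_scenario` and its `list_fn` argument are hypothetical, with `list_fn` allowing the check to run without yahpo_gym installed; it assumes list_scenarios returns an iterable of string keys.

```python
def check_scenario(key, list_fn=None):
    """Return `key` if it names an available scenario, else raise KeyError."""
    if list_fn is None:
        # Lazy import of the documented helper.
        from yahpo_gym.configuration import list_scenarios
        list_fn = list_scenarios
    available = list(list_fn())
    if key not in available:
        raise KeyError(f"Unknown scenario {key!r}; available: {sorted(available)}")
    return key
```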
yahpo_gym.local_config module
- class yahpo_gym.local_config.LocalConfiguration(settings_path: str | None = None)
Bases:
object
Interface for setting up a local configuration. This reads from and writes to a configuration file in the YAML format, which stores the paths to the underlying data and models required for inference on the fitted surrogates.
Parameters
- settings_path: str
Path to the local configuration file. The default is “~/.config/yahpo_gym”.
- property config
The stored settings dictionary (cached).
- property data_path
Path where metadata and surrogate models for inference are stored.
- init_config(data_path: str = '')
Initialize a new local configuration.
This writes a local configuration file to the specified ‘settings_path’. It is currently used to globally store the following information: ‘data_path’, a path to the metadata required for inference.
Parameters
- data_path: str
Path to the directory where surrogate models and metadata are saved.
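First-time setup can be wrapped in a few lines. The sketch below uses only the constructor and init_config signatures documented above; `init_local_config` and its `config_cls` argument are hypothetical and exist so the call sequence can be tried without the package. Expanding `~` in the path is a convenience added here, not documented behaviour.

```python
import os


def init_local_config(data_path, settings_path=None, config_cls=None):
    """Write a local configuration that points at `data_path`."""
    if config_cls is None:
        # Lazy import of the documented class.
        from yahpo_gym.local_config import LocalConfiguration
        config_cls = LocalConfiguration
    config = config_cls(settings_path=settings_path)
    # Expand "~" so the stored path is absolute for the current user.
    config.init_config(data_path=os.path.expanduser(data_path))
    return config
```

After this runs, the data_path property of the returned object should report the directory holding the metadata and surrogate models.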