This plugin enables Hydra applications to use Optuna to optimize experiment parameters. In contrast to the original Optuna Sweeper plugin, this plugin has the following advantages:
- Pruning in the objective function:

  ```python
  from hydra_plugins.hydra_optuna_pruning_sweeper import trial_provider

  ...
  trial = trial_provider.trial
  ...
  trial.report(score, step)
  if trial.should_prune():
      ...
  ...
  ```
- Manual specification of hyperparameters which are used at the beginning of the hyperparameter search (see https://optuna.readthedocs.io/en/stable/tutorial/20_recipes/008_specify_params.html).
- Simple parallelization across processes with `optuna.integration.DaskStorage` by specifying `n_jobs>1`.
- Parallelization across nodes with `optuna.integration.DaskStorage` by specifying a Dask `Client`.
- More powerful custom search spaces. For example, a variable-length list of values can be suggested.
- It internally uses `optuna.study.Study.optimize`, which can be freely configured.
- It uses `optuna>=3.1.0`. This is only an advantage until keisuke-umezawa's Pull Request is merged.
Before using this plugin, consider the following disadvantages:
- This plugin is supposed to be used with the `BasicLauncher`. Other launchers, such as the Submitit Launcher plugin, are not supported.
- The deprecated parameter `search_space` from the original has been removed.
- This plugin internally uses hydra_zen for configuration.
- It requires importing `optuna` and `dask.distributed`. Both might be imported even if this plugin is not used; thus, once installed, it can slow down starting Hydra.
- This plugin does not seek to improve the architecture of Hydra, nor does it allow the usage of hyperparameter search libraries other than Optuna.
- Fewer people use this plugin. Therefore, it might not be as robust as the original.
This plugin requires `hydra-core>=1.2.0`. Please install it with the following command:

```bash
pip install hydra-core --upgrade
```
This plugin has not yet been added to PyPI. Install it by cloning this repository and executing

```bash
pip install PATH-TO-CLONED-REPOSITORY
```

preferably in an activated virtual environment (`pip install -e` does not work).
The `examples` package includes the adapted examples provided with the original plugin. A more complicated deep-learning example can be found at lightning_hydra_optuna_pruning_example.
This plugin can mostly be used like the original; however, it has more options. For more information, please take a look at the doc-strings of `optuna_pruning_sweeper.OptunaPruningSweeper` and `custom_search_space.CustomSearchSpace`.
Please set `hydra/sweeper` to `OptunaPruningSweeper` in your config file:

```yaml
defaults:
  - override hydra/sweeper: OptunaPruningSweeper
```

Alternatively, add the `hydra/sweeper=OptunaPruningSweeper` option to your command line.
The default configuration simply consists of the default values of the `OptunaPruningSweeper` class in the `optuna_pruning_sweeper.py` module.
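Individual options can then be overridden under `hydra.sweeper`. The following is a hedged sketch: the option names below are assumed by analogy to the original sweeper; consult the doc-strings mentioned above for the actual ones.

```yaml
hydra:
  sweeper:
    # Assumed option names, analogous to the original Optuna sweeper.
    n_trials: 20  # total number of trials
    n_jobs: 1     # parallel executions (see the parallelization section below)
```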
Like the original, this plugin uses Hydra's override grammar. As a rule of thumb, use the `choice` override instead of `suggest_categorical` (e.g. `x: choice(false, true)`), the `range` override instead of `suggest_int` (e.g. `x: range(1, 4)`), and the `interval` override instead of `suggest_float` (e.g. `x: interval(1, 4)`). In the case of `range` and `interval`, add the tag `log` for a logarithmic search space (e.g. `x: tag(log, range(1, 4))` and `x: tag(log, interval(1, 4))`).
Manual values can be specified as tags, e.g. `x: tag("1", log, interval(1, 4))`. Unfortunately, Hydra throws an error if a tag can be converted to something other than a string. Therefore, numbers, booleans, and `null` have to be surrounded by quotes. If multiple manual values are specified, they have to be prefixed by their index, as Hydra does not respect the order, e.g. `x: tag(0:1, 1:3, log, interval(1, 4))` (first `x=1` is tried, then `x=3`, and later values are sampled from [1, 4]). As Hydra can't convert `0:1` to a number, no quotes are required in this case.
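Putting these overrides together, a sweep might be configured as follows (a sketch assuming the `params` key behaves as in the original Optuna sweeper):

```yaml
hydra:
  sweeper:
    params:
      x: choice(false, true)            # categorical
      y: tag(log, range(1, 4))          # log-scaled integer
      z: tag("0", log, interval(1, 4))  # log-scaled float, z=0 tried first
```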
Override the class

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List
from omegaconf import DictConfig
from optuna.trial import Trial

class CustomSearchSpace(ABC):
    def manual_values(self) -> Dict[str, List[Any]]:
        return dict()

    @abstractmethod
    def suggest(self, cfg: DictConfig, trial: Trial) -> Dict[str, Any]:
        pass
```

to get access to the `trial` object, which can be used to dynamically create the search space.
In contrast to the original Optuna sweeper, the suggested values should be returned together with their names in a dictionary. This allows the suggestion of values which in turn consist of suggested values. For example, a variable-length list of values can be suggested; this is already implemented as the `ListSearchSpace`, and a sketch of a similar subclass follows below. A full example can be found at `examples/custom_search_space`.
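For illustration, such a subclass might look like the following. This is a minimal sketch: the class name, parameter names, ranges, the import path, and the assumed format of `manual_values` are made up; only the `CustomSearchSpace` interface above is given.

```python
from typing import Any, Dict, List

from omegaconf import DictConfig
from optuna.trial import Trial

# Import path assumed from the module names mentioned above.
from hydra_plugins.hydra_optuna_pruning_sweeper.custom_search_space import (
    CustomSearchSpace,
)


class HiddenLayersSearchSpace(CustomSearchSpace):
    """Suggests a variable-length list of hidden layer sizes."""

    def manual_values(self) -> Dict[str, List[Any]]:
        # Assumed format: configurations to try before sampling starts.
        return {"model.hidden_sizes": [[64, 64]]}

    def suggest(self, cfg: DictConfig, trial: Trial) -> Dict[str, Any]:
        # The length of the list is itself a suggested value ...
        n_layers = trial.suggest_int("n_layers", 1, 3)
        # ... and each element is suggested based on it.
        sizes = [
            trial.suggest_int(f"hidden_size_{i}", 16, 256, log=True)
            for i in range(n_layers)
        ]
        return {"model.hidden_sizes": sizes}
```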
I want the objective function to be able to access the `Trial` object so that the contents of `optuna.integration` can be reused. Since the Hydra launcher interface does not allow this, I simply created the module `trial_provider.py`, which contains only a single variable through which the `Trial` object can be passed (similar to the singleton design pattern).
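For example, an objective could fetch the current trial and hand it to one of Optuna's integration callbacks. This is a sketch assuming PyTorch Lightning is used; the task function `task` and its config handling are placeholders.

```python
from hydra_plugins.hydra_optuna_pruning_sweeper import trial_provider
from optuna.integration import PyTorchLightningPruningCallback


def task(cfg):
    # Fetch the current Trial from the provider module ...
    trial = trial_provider.trial
    # ... and reuse an existing Optuna integration with it.
    pruning_callback = PyTorchLightningPruningCallback(trial, monitor="val_loss")
    ...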
Unlike the original Optuna sweeper, I use the `optuna.study.Study.optimize` method to carry out the sweep. This should improve the robustness of the code and allow for the usage of further Optuna features, like garbage collection after each trial and more customization options concerning how many trials are run.
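For reference, this is roughly what `Study.optimize` offers in plain Optuna (a standalone sketch, not the sweeper's actual internals):

```python
import optuna


def objective(trial: optuna.Trial) -> float:
    x = trial.suggest_float("x", -10, 10)
    return x ** 2


study = optuna.create_study(direction="minimize")
# n_trials and gc_after_trial are examples of the Study.optimize
# options that become configurable through the sweeper.
study.optimize(objective, n_trials=20, gc_after_trial=True)
```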
Two methods can be used to parallelize the hyperparameter search. First, if `n_jobs>1`, the storage is wrapped in `optuna.integration.DaskStorage`, which is used to start `n_jobs` parallel executions of `optuna.study.Study.optimize` on the specified Dask cluster. Libraries like Dask-Jobqueue or Dask-MPI can be used to set up a Dask cluster over multiple nodes by passing a callable to the argument `dask_client` (see `examples/sphere_custom_dask_client` for an example; a sketch of such a callable follows below).
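Such a callable might look like this (a sketch using a local cluster; the function name `make_client` is made up, and with Dask-Jobqueue or Dask-MPI the `Client` would be created from the corresponding cluster object instead):

```python
from dask.distributed import Client


def make_client() -> Client:
    # Local cluster with four worker processes; swap in a
    # Dask-Jobqueue or Dask-MPI cluster for multi-node setups.
    return Client(n_workers=4)
```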
Another way to parallelize the hyperparameter search is to set up an RDB backend or `optuna.storages.JournalStorage` that each process can access, and then start multiple processes from the command line, each carrying out the hyperparameter search, as described here.
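For example, the same sweep could be started in two shells against a shared SQLite storage. This is a sketch: the `hydra.sweeper.storage` override and the application name follow the original sweeper's conventions and may differ here.

```bash
# shell 1
python my_app.py --multirun hydra.sweeper.storage=sqlite:///example.db
# shell 2: a second process joins the same study
python my_app.py --multirun hydra.sweeper.storage=sqlite:///example.db
```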
To improve robustness, I mainly rely on `optuna.study.Study.optimize` and `optuna.integration.DaskStorage`. Further, some code has been adapted from the original Optuna Sweeper plugin and keisuke-umezawa's Pull Request.
- When `DaskStorage` is used in combination with an RDB backend, some internal exceptions get thrown. The hyperparameter search, however, works as expected.
- When `DaskStorage` is used, some logging information is not propagated correctly.