Beyond manual trial and error, there are several primary approaches to finding the best set of hyperparameter values:
- Grid Search involves defining a set of candidate values for each hyperparameter, training the model with every possible combination of those values, and selecting the combination that yields the best results. Grid search relies on the practitioner's judgment, since the values to be tried are set manually.
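As a minimal sketch of grid search with scikit-learn's `GridSearchCV` (the estimator, dataset, and parameter values below are illustrative choices, not prescriptions):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# A hand-picked grid: 3 x 3 = 9 combinations will be evaluated.
param_grid = {"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]}

search = GridSearchCV(SVC(), param_grid, cv=3)
search.fit(X, y)

# Best combination found and its mean cross-validated score.
print(search.best_params_, search.best_score_)
```

Note that the cost grows multiplicatively with each added hyperparameter, which is the main drawback of exhaustive grid search.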
- Halving Grid Search is an improved form of Grid Search for hyperparameter optimization. It searches a given set of hyperparameters using successive halving: all candidates are first evaluated on a small sample of the data, and the top candidates are then iteratively re-evaluated on progressively larger samples.
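In scikit-learn, successive halving is available as the experimental `HalvingGridSearchCV`, which must be enabled explicitly. A small sketch with an illustrative grid (the dataset and values are placeholders):

```python
from sklearn.datasets import make_classification
# HalvingGridSearchCV is experimental and requires this enabling import.
from sklearn.experimental import enable_halving_search_cv  # noqa: F401
from sklearn.model_selection import HalvingGridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=400, random_state=0)
param_grid = {"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]}

# factor=3: each round keeps roughly the top third of candidates
# and triples the number of training samples they are evaluated on.
search = HalvingGridSearchCV(SVC(), param_grid, factor=3, random_state=0)
search.fit(X, y)
print(search.best_params_)
```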
- Random Search draws random combinations of hyperparameter values from specified statistical distributions to identify the best set. Its advantage over grid search is that it can explore a broader range of values without increasing the number of iterations.
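A sketch of random search with `RandomizedSearchCV`, sampling from continuous distributions rather than a fixed grid (the distributions and iteration budget are illustrative):

```python
from scipy.stats import loguniform
from sklearn.datasets import load_iris
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Sample C and gamma from log-uniform distributions instead of a grid,
# so any value in the range can be tried.
param_distributions = {
    "C": loguniform(1e-2, 1e2),
    "gamma": loguniform(1e-3, 1e0),
}

search = RandomizedSearchCV(
    SVC(), param_distributions, n_iter=20, cv=3, random_state=0
)
search.fit(X, y)
print(search.best_params_)
```

Here the budget (`n_iter=20`) is fixed regardless of how many hyperparameters or how wide the ranges are, which is exactly the benefit described above.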
- Halving Randomized Search uses the same successive-halving strategy but is further optimized than Halving Grid Search: instead of evaluating every possible combination of hyperparameters, it randomly samples a subset of combinations.
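scikit-learn implements this as the experimental `HalvingRandomSearchCV`, which combines sampled candidates with successive halving. A minimal sketch (distributions and dataset are again illustrative):

```python
from scipy.stats import loguniform
from sklearn.datasets import make_classification
# Experimental, same enabling import as HalvingGridSearchCV.
from sklearn.experimental import enable_halving_search_cv  # noqa: F401
from sklearn.model_selection import HalvingRandomSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=400, random_state=0)

# Candidates are sampled from distributions, then winnowed by halving.
param_distributions = {
    "C": loguniform(1e-2, 1e2),
    "gamma": loguniform(1e-3, 1e0),
}

search = HalvingRandomSearchCV(
    SVC(), param_distributions, factor=3, random_state=0
)
search.fit(X, y)
print(search.best_params_)
```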
- Hyperopt is an open-source Python library for Bayesian Optimization, designed for large-scale optimization of models with dozens of hyperparameters. It can scale hyperparameter optimization across multiple CPU cores.
Hyperopt-Sklearn is an extension library that enables AutoML-style hyperparameter optimization and automated search over Machine Learning algorithms for regression and classification problems.
- Bayesian Search is a sequential strategy that uses the results of earlier hyperparameter evaluations to guide the subsequent search. This reduces the time required for optimization, particularly for models trained on large amounts of data.
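The sequential idea can be sketched from scratch: fit a surrogate model (here a Gaussian process) to the evaluations so far, then pick the next candidate by an acquisition function such as expected improvement. Everything below (the toy objective, search range, and iteration counts) is an illustrative assumption, not a specific library's API:

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

# Toy stand-in for a validation loss over one hyperparameter; min at 0.5.
def objective(x):
    return (x - 0.5) ** 2

rng = np.random.default_rng(0)
X_obs = rng.uniform(-2, 2, size=3).reshape(-1, 1)   # initial evaluations
y_obs = objective(X_obs).ravel()
candidates = np.linspace(-2, 2, 200).reshape(-1, 1)  # discretized search space

for _ in range(10):
    # Surrogate: Gaussian process fit to all results so far.
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    gp.fit(X_obs, y_obs)
    mu, sigma = gp.predict(candidates, return_std=True)
    sigma = np.maximum(sigma, 1e-9)

    # Expected improvement over the best loss observed so far.
    imp = y_obs.min() - mu
    z = imp / sigma
    ei = imp * norm.cdf(z) + sigma * norm.pdf(z)

    # Evaluate the most promising candidate and add it to the history.
    x_next = candidates[np.argmax(ei)]
    X_obs = np.vstack([X_obs, x_next])
    y_obs = np.append(y_obs, objective(x_next[0]))

best_x = X_obs[np.argmin(y_obs), 0]
print(best_x)
```

Each new evaluation refines the surrogate, so the search concentrates on promising regions instead of sampling blindly; that feedback loop is what distinguishes Bayesian search from grid or random search.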