1. Grid Search¶
A module for grid search of hyperparameters of NEORL algorithms.
Original paper: Bergstra, J., & Bengio, Y. (2012). Random search for hyper-parameter optimization. Journal of Machine Learning Research, 13(2).
Grid Search is an exhaustive search for selecting an optimal set of algorithm hyperparameters. In Grid Search, the analyst sets up a grid of hyperparameter values. A multi-dimensional full grid of all hyperparameters is constructed, which contains all possible combinations of hyperparameter values. Afterwards, every combination is tested in serial/parallel, and the optimization score (e.g. fitness) is estimated for each. Grid search can be very expensive for fine grids as well as for a large number of hyperparameters to tune.
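The full grid described above is simply the Cartesian product of the per-hyperparameter value lists. A minimal sketch of that enumeration (independent of NEORL, using only the standard library) with an illustrative grid:

```python
from itertools import product

# Example grid: three hyperparameters with 2, 2, and 4 candidate values
param_grid = {'cxpb': [0.2, 0.4],
              'mutpb': [0.05, 0.1],
              'alpha': [0.1, 0.2, 0.3, 0.4]}

names = list(param_grid)
# Cartesian product of the value lists -> all 2 * 2 * 4 = 16 combinations
cases = [dict(zip(names, values)) for values in product(*param_grid.values())]

print(len(cases))   # 16
print(cases[0])     # {'cxpb': 0.2, 'mutpb': 0.05, 'alpha': 0.1}
```

The combination count is the product of the grid sizes, which is why the cost grows quickly with finer grids and more hyperparameters.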
1.1. What can you use?¶
Multi processing: ✔️
Discrete/Continuous/Mixed spaces: ✔️
Reinforcement Learning Algorithms: ✔️
Evolutionary Algorithms: ✔️
Hybrid Neuroevolution Algorithms: ✔️
1.2. Parameters¶
class neorl.tune.gridtune.GRIDTUNE(param_grid, fit)[source]¶

A module for grid search for hyperparameter tuning.
- Parameters
param_grid – (dict) the grid (list of possible values) for each hyperparameter, provided in dictionary form. Example: {'x1': [40, 50, 60, 80, 100], 'x2': [0.2, 0.4, 0.8], 'x3': ['blend', 'cx2point']}
fit – (function) the user-defined fitness function that takes the hyperparameters as input and returns the algorithm score as output
tune(ncores=1, csvname=None, verbose=True)[source]¶

This function starts the tuning process with the specified number of processors.
- Parameters
ncores – (int) number of parallel processors (see the Notes section below for an important note about parallel execution)
csvname – (str) the name of the CSV file in which to save the tuning results (useful for expensive cases, as the file is updated as soon as each case finishes)
verbose – (bool) whether to print updates to the screen or not
1.3. Example¶
Example of using grid search to tune three ES hyperparameters for solving the 5-d Sphere function
from neorl.tune import GRIDTUNE
from neorl import ES

#**********************************************************
# Part I: Original Problem Settings
#**********************************************************

#Define the fitness function (for original optimisation)
def sphere(individual):
    y = sum(x**2 for x in individual)
    return y

#*************************************************************
# Part II: Define fitness function for hyperparameter tuning
#*************************************************************
def tune_fit(cxpb, mutpb, alpha):

    #--setup the parameter space
    nx = 5
    BOUNDS = {}
    for i in range(1, nx+1):
        BOUNDS['x'+str(i)] = ['float', -100, 100]

    #--setup the ES algorithm
    es = ES(mode='min', bounds=BOUNDS, fit=sphere, lambda_=80, mu=40, mutpb=mutpb, alpha=alpha,
            cxmode='blend', cxpb=cxpb, ncores=1, seed=1)

    #--Evolute the ES object and obtain y_best
    #--turn off verbose for less algorithm print-out when tuning
    x_best, y_best, es_hist = es.evolute(ngen=100, verbose=0)

    return y_best  #return the best score

#*************************************************************
# Part III: Tuning
#*************************************************************
param_grid = {
    #def tune_fit(cxpb, mutpb, alpha):
    'cxpb': [0.2, 0.4],             #cxpb is first
    'mutpb': [0.05, 0.1],           #mutpb is second
    'alpha': [0.1, 0.2, 0.3, 0.4]}  #alpha is third

#setup a grid tune object
gtune = GRIDTUNE(param_grid=param_grid, fit=tune_fit)
#view the generated cases before running them
print(gtune.hyperparameter_cases)
#tune the parameters with method .tune
gridres = gtune.tune(ncores=1, csvname='tune.csv')
print(gridres)
1.4. Notes¶
For ncores > 1, the parallel tuning engine starts. Make sure to run your Python script from the terminal, NOT from an IDE (e.g. Spyder, Jupyter Notebook). IDEs are not robust when running parallel problems with packages like joblib or multiprocessing. For ncores = 1, IDEs seem to work fine.

If there is a large number of hyperparameters to tune (large \(d\)), try a nested grid search: first run a grid search on a few parameters, then fix them to their best values, and start another grid search for the next group of hyperparameters, and so on.

Always start with a coarse grid for all hyperparameters (small \(k_i\)) to get an impression of their sensitivity. Then refine the grids for the hyperparameters with more impact, and execute a more detailed grid search.

Grid search is ideal when the analyst has prior experience of the feasible range of each hyperparameter and of which hyperparameters are most important to tune.
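The nested, coarse-to-fine strategy described in the notes above can be sketched in plain Python. This is an illustrative toy, not NEORL code: a cheap quadratic `score` stands in for an expensive tuning fitness, and all names are hypothetical.

```python
from itertools import product

def score(cxpb, mutpb):
    # toy stand-in for an expensive tuning fitness (to be minimised);
    # its true optimum is at cxpb=0.45, mutpb=0.12
    return (cxpb - 0.45)**2 + (mutpb - 0.12)**2

def grid_min(fit, grid):
    """Exhaustively evaluate a grid and return the best (lowest-score) case."""
    names = list(grid)
    cases = [dict(zip(names, v)) for v in product(*grid.values())]
    return min(cases, key=lambda c: fit(**c))

# Stage 1: coarse grid over both hyperparameters (small k_i)
coarse = grid_min(score, {'cxpb': [0.2, 0.4, 0.6, 0.8],
                          'mutpb': [0.05, 0.1, 0.2]})

# Stage 2: refine only around the best coarse values
fine = grid_min(score, {'cxpb': [coarse['cxpb'] - 0.05, coarse['cxpb'], coarse['cxpb'] + 0.05],
                        'mutpb': [coarse['mutpb'] - 0.02, coarse['mutpb'], coarse['mutpb'] + 0.02]})

print(coarse)  # best case from the coarse stage
print(fine)    # refined case around the coarse optimum
```

Each stage evaluates only a small grid, so two stages of \(4 \times 3\) and \(3 \times 3\) cases (21 evaluations) can approach the resolution of a single much finer grid.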