Neural Genetic Algorithms (NGA)

A module for surrogate-based genetic algorithms, where the surrogate model is trained with an offline data-driven tri-training approach. The surrogate model used is a radial basis function network (RBFN).

Original paper: Huang, P., Wang, H., & Jin, Y. (2021). Offline data-driven evolutionary optimization based on tri-training. Swarm and Evolutionary Computation, 60, 100800.

What can you use?

  • Multi processing: ❌

  • Discrete spaces: ❌

  • Continuous spaces: ✔️

  • Mixed Discrete/Continuous spaces: ❌

Parameters

class neorl.hybrid.nga.NGA(mode, bounds, fit, npop, num_warmups=None, hidden_shape=None, kernel='gaussian', ncores=1, seed=None)[source]

Neural Genetic Algorithm

Parameters
  • mode – (str) problem type, either “min” for a minimization problem or “max” for a maximization problem

  • bounds – (dict) input parameter type and lower/upper bounds in dictionary form. Since NGA supports continuous spaces only, all variables should be of 'float' type. Example: bounds={'x1': ['float', 0.1, 0.8], 'x2': ['float', 2.2, 6.2], 'x3': ['float', -1.0, 1.0]}

  • fit – (function) the fitness function

  • npop – (int) population size of the genetic algorithm

  • num_warmups – (int) number of warmup samples used to train the surrogate, all of which are evaluated with the real fitness function fit (if None, num_warmups=20*len(bounds))

  • hidden_shape – (int) number of neurons in the hidden layer of the RBFN network (if None, hidden_shape=int(sqrt(int(num_warmups/3))); both defaults are illustrated in the short sketch after this parameter list)

  • kernel – (str) kernel type for the RBFN network (choose from gaussian, reflect, mul, inmul)

  • ncores – (int) number of parallel processors (currently only ncores=1 is supported)

  • seed – (int) random seed for sampling
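
To make the two defaults above concrete, here is a minimal sketch of how they follow from the stated formulas (the library computes these internally; the bounds dictionary below is only an example):

import math

#example bounds with three design variables
bounds = {'x1': ['float', -100, 100],
          'x2': ['float', -100, 100],
          'x3': ['float', -100, 100]}

#default warmup budget: 20 samples per design variable
num_warmups = 20 * len(bounds)                        # -> 60

#default number of neurons in the RBFN hidden layer
hidden_shape = int(math.sqrt(int(num_warmups / 3)))   # -> int(sqrt(20)) = 4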

evolute(ngen, verbose=False)[source]

This function evolutes the NGA algorithm for a given number of generations.

Parameters
  • ngen – (int) number of generations to evolute

  • verbose – (bool) print statistics to screen

Returns

(tuple) (list of best individuals, list of their corresponding surrogate fitness values)

Example

from neorl import NGA

#Define the fitness function
def FIT(individual):
    """Sphere test objective function.
    F(x) = sum_{i=1}^d xi^2
    d = 1,2,3,...
    Range: [-100, 100]
    Minima: 0
    """
    y=sum(x**2 for x in individual)
    return y

#Setup the parameter space (d=5)
nx=5
BOUNDS={}
for i in range(1,nx+1):
    BOUNDS['x'+str(i)]=['float', -100, 100]

nga = NGA(mode='min', bounds=BOUNDS, fit=FIT, npop=40, num_warmups=200, 
          hidden_shape=10, seed=1)
individuals, surrogate_fit = nga.evolute(ngen=100, verbose=False)

#make evaluation of the best individuals using the real fitness function
real_fit=[FIT(item) for item in individuals]

#print the best individuals/fitness found
min_index=real_fit.index(min(real_fit))
print('------------------------ Final Summary --------------------------')
print('Best real individual:', individuals[min_index])
print('Best real fitness:', real_fit[min_index])
print('-----------------------------------------------------------------')
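
Note that evolute returns surrogate-predicted fitness values (surrogate_fit above), not real evaluations, which is why the example re-evaluates the returned individuals with the real fitness function FIT before reporting the final best result.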

Notes

  • The tri-training concept uses semi-supervised learning to leverage surrogate models that approximate the real fitness function, accelerating optimization when the fitness function is expensive. Three RBFN models are trained on the warmup data; at each generation, they jointly determine the best individual, which is then added to the training data to retrain the three surrogates. The real fitness function fit is ONLY used to evaluate the num_warmups samples; afterwards, the three RBFN models guide the genetic algorithm optimizer (a schematic sketch is given after these notes).

  • For num_warmups, choose a value large enough to accommodate the number of design variables x in your problem. If None, the default number of warmup samples is 20 times the number of design variables.

  • For hidden_shape, a large number of hidden neurons can slow down surrogate training, while a small number can lead to underfitting. If None, the default value of hidden_shape is \(int(\sqrt{int(num_{warmups}/3)})\).

  • The kernel can play a significant role in surrogate training. Four options are available: Gaussian function (gaussian), Reflected function (reflect), Multiquadric function (mul), and Inverse multiquadric function (inmul); standard forms of these kernels are sketched after these notes.

  • Total number of cost evaluations via the real fitness function fit for NGA is num_warmups.

  • Total number of cost evaluations via the surrogate model for NGA is npop * ngen. For the example above (num_warmups=200, npop=40, ngen=100), this amounts to 200 real evaluations and 40 * 100 = 4000 surrogate evaluations.
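
For readers who want a concrete picture of the tri-training loop described in the first note, below is a toy, self-contained Python sketch. It is NOT NEORL's internal implementation: the minimal Gaussian-kernel RBFN class, the bootstrap resampling, and the crude select-and-mutate step standing in for the full genetic algorithm are all illustrative assumptions.

import numpy as np

rng = np.random.default_rng(1)

def real_fit(x):
    """Real (expensive) fitness: sphere function, used ONLY for warmup samples."""
    return float(np.sum(x**2))

class RBFN:
    """Minimal RBF network: random centers, gaussian kernel, least-squares
    output weights. Illustrative only -- not NEORL's RBFN."""
    def __init__(self, hidden_shape):
        self.h = hidden_shape
    def _phi(self, X):
        d = np.linalg.norm(X[:, None, :] - self.centers[None, :, :], axis=2)
        return np.exp(-(d / self.sigma)**2)           # gaussian kernel
    def train(self, X, y):
        idx = rng.choice(len(X), size=self.h, replace=False)
        self.centers = X[idx]                         # centers picked from data
        dc = np.linalg.norm(self.centers[:, None] - self.centers[None], axis=2)
        self.sigma = dc.max() / np.sqrt(2.0 * self.h) + 1e-12   # width heuristic
        self.w, *_ = np.linalg.lstsq(self._phi(X), y, rcond=None)
        return self
    def predict(self, X):
        return self._phi(X) @ self.w

# --- offline warmup: the ONLY evaluations of the real fitness function ---
nx, num_warmups = 3, 60
X = rng.uniform(-5.0, 5.0, size=(num_warmups, nx))
y = np.array([real_fit(x) for x in X])

# three surrogates, each trained on a bootstrap resample of the warmup data
models = []
for _ in range(3):
    b = rng.integers(0, num_warmups, size=num_warmups)   # bootstrap indices
    models.append(RBFN(hidden_shape=5).train(X[b], y[b]))

# --- surrogate-guided search: a crude stand-in for the genetic algorithm ---
pop = rng.uniform(-5.0, 5.0, size=(20, nx))
for gen in range(30):
    scores = np.mean([m.predict(pop) for m in models], axis=0)  # ensemble score
    order = np.argsort(scores)                                  # 'min' mode
    best, best_score = pop[order[0]], scores[order[0]]
    parents = pop[order[:10]]                                   # keep better half
    pop = np.vstack([parents, parents + rng.normal(0.0, 0.5, parents.shape)])
    # pseudo-label the best individual with the ensemble prediction and
    # retrain all three surrogates on the augmented data set
    X, y = np.vstack([X, best]), np.append(y, best_score)
    for m in models:
        m.train(X, y)

print('surrogate-estimated best:', best, 'predicted fitness:', best_score)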
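
As a reference for the kernel note, the standard textbook forms of three of the four kernels are shown below. These are generic definitions with an assumed shape parameter eps; NEORL's exact parameterization may differ, and the reflected function (reflect) is omitted because its precise form is implementation-specific.

import numpy as np

# r: distance from an input point to an RBF center; eps: assumed shape parameter
def gaussian(r, eps=1.0):                # 'gaussian'
    return np.exp(-(eps * r)**2)         # local response, decays with distance

def multiquadric(r, eps=1.0):            # 'mul'
    return np.sqrt(1.0 + (eps * r)**2)   # global response, grows with distance

def inverse_multiquadric(r, eps=1.0):    # 'inmul'
    return 1.0 / np.sqrt(1.0 + (eps * r)**2)

# the reflected function ('reflect') is implementation-specific and not shown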