Modern PESA (PESA2)

A module for the parallel hybrid PESA2 algorithm with prioritized experience replay from reinforcement learning. Modern PESA2 combines GWO, WOA, and DE modules in NEORL.

Original paper: Radaideh, M. I., & Shirvan, K. (2022). PESA: Prioritized experience replay for parallel hybrid evolutionary and swarm algorithms-Application to nuclear fuel. Nuclear Engineering and Technology.

https://doi.org/10.1016/j.net.2022.05.001

What can you use?

  • Multi processing: ✔️

  • Discrete spaces: ✔️

  • Continuous spaces: ✔️

  • Mixed Discrete/Continuous spaces: ✔️

Parameters

class neorl.hybrid.pesa2.PESA2(mode, bounds, fit, R_frac=0.5, memory_size=None, alpha_init=0.1, alpha_end=1, nwolves=5, npop=50, CR=0.7, F=0.5, nwhales=10, int_transform='nearest_int', ncores=1, seed=None)[source]

Prioritized replay for Evolutionary Swarm Algorithms: PESA2 (Modern Version), a hybrid algorithm of GWO, DE, and WOA.

PESA2 Major Parameters

Parameters
  • mode – (str) problem type, either “min” for minimization problem or “max” for maximization

  • bounds – (dict) input parameter type and lower/upper bounds in dictionary form. Example: bounds={'x1': ['int', 1, 4], 'x2': ['float', 0.1, 0.8], 'x3': ['float', 2.2, 6.2]}

  • fit – (function) the fitness function

  • R_frac – (float) fraction of npop, nwolves, and nwhales that survives to the next generation; R_frac also determines the number of individuals to replay from the memory (see the constructor sketch after these parameter lists)

  • memory_size – (int) max size of the replay memory (if None, memory_size is built to accommodate all samples during search)

  • alpha_init – (float) initial value of the prioritized replay coefficient (See Notes below)

  • alpha_end – (float) final value of the prioritized replay coefficient (See Notes below)

PESA2 Auxiliary Parameters (for the internal algorithms)

Parameters
  • npop – (int) for DE, total number of individuals in the DE population

  • CR – (float) for DE, crossover probability between [0,1]

  • F – (float) for DE, differential/mutation weight between [0,2]

  • nwolves – (int) for GWO, number of wolves in the GWO population

  • nwhales – (int) for WOA, number of whales in the WOA population

PESA2 Misc. Parameters

Parameters
  • int_transform – (str) method of handling int/discrete variables, choose from: nearest_int, sigmoid, minmax

  • ncores – (int) number of parallel processors

  • seed – (int) random seed for sampling
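As a rough illustration of these parameters, a PESA2 constructor call may look like the sketch below. The bounds, fitness function, and specific values are arbitrary choices for this sketch only, not recommended settings.

from neorl import PESA2

#Illustrative sphere fitness over a mixed int/float space (chosen for this sketch only)
def fit_sphere(individual):
    return sum(x**2 for x in individual)

bounds = {'x1': ['int', -5, 5],
          'x2': ['float', -5.0, 5.0],
          'x3': ['float', -5.0, 5.0]}

pesa2 = PESA2(mode='min', bounds=bounds, fit=fit_sphere,
              R_frac=0.5,          #half of each population survives; the same fraction is replayed from memory
              memory_size=5000,    #cap the replay memory (None lets it grow to ngen*npop*3)
              alpha_init=0.05,     #small initial alpha -> more random replay early (exploration)
              alpha_end=0.95,      #large final alpha -> greedier replay late (exploitation)
              npop=60, nwolves=6, nwhales=10, CR=0.7, F=0.5,
              int_transform='nearest_int', ncores=1, seed=1)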

evolute(ngen, x0=None, replay_every=1, warmup=100, verbose=0)[source]

This function evolutes the PESA2 algorithm for a number of generations.

Parameters
  • ngen – (int) number of generations to evolute

  • x0 – (list of lists) initial samples to start the replay memory (len(x0) must be equal to or greater than npop); see the sketch after the example below

  • replay_every – (int) perform memory replay every number of generations, default: replay after every generation

  • warmup – (int) number of random warmup samples to initialize the replay memory; must be equal to or greater than npop (only used if x0=None)

  • verbose – (int) print statistics to screen, 0: no print, 1: PESA print, 2: detailed print

Returns

(tuple) (best individual, best fitness, and a list of fitness history)

Example

from neorl import PESA2

#Define the fitness function
def FIT(individual):
    """Sphere test objective function.
    F(x) = sum_{i=1}^d xi^2
    d=1,2,3,...
    Range: [-100,100]
    Minima: 0
    """
    y = sum(x**2 for x in individual)
    return y

#Setup the parameter space (d=5)
nx = 5
BOUNDS = {}
for i in range(1, nx+1):
    BOUNDS['x'+str(i)] = ['float', -100, 100]

pesa2 = PESA2(mode='min', bounds=BOUNDS, fit=FIT, npop=50, nwolves=5, nwhales=5, ncores=1)
x_best, y_best, pesa2_hist = pesa2.evolute(ngen=50, replay_every=2, verbose=2)
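Continuing from the example above (reusing nx and the pesa2 object), the following sketch shows how the replay memory can be seeded with user-provided samples via x0 instead of random warmup sampling; the samples below are arbitrary, and len(x0) must be at least npop=50:

import random
random.seed(1)

#60 illustrative candidate solutions of dimension nx=5, drawn uniformly within the bounds
x0 = [[random.uniform(-100, 100) for _ in range(nx)] for _ in range(60)]

#alternative to the evolute call in the example above: x0 replaces the random warmup samples
x_best, y_best, pesa2_hist = pesa2.evolute(ngen=50, x0=x0, replay_every=2, verbose=1)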

Notes

  • PESA2 is symmetric, meaning the fitness-evaluation budget per generation is equal between DE, WOA, and GWO, which helps ensure that all algorithms update the memory with similar computing time. Since GWO/WOA typically have smaller populations than DE, i.e. nwolves < npop and nwhales < npop, PESA2 adjusts the number of internal generations for GWO/WOA to ensure a similar number of fitness evaluations per individual algorithm.

  • For example, if the user sets npop=60 for DE, nwolves=6 for GWO, and nwhales=10 for WOA, then GWO and WOA are executed internally for 10 and 6 generations, respectively, to have a total of 60 evaluations per individual algorithm, as illustrated in the worked sketch after these notes.

  • R_frac defines the fraction of individuals from npop, nwolves, and nwhales to survive to the next generation, and also the number of samples to replay from the memory. For example, if the user sets R_frac=0.5, npop=60 for DE, nwolves=6 for GWO, and nwhales=10 for WOA, then after every generation, the top 30 individuals in DE, the best 3 wolves, and the best 5 whales survive to the next generation. The replay memory then feeds 30 new individuals to DE, 3 new wolves to GWO, and 5 new whales to WOA.

  • For complex problems and limited memory, we recommend setting memory_size ~ 5000. When the memory gets full, old samples are overwritten by new ones. Allowing a large memory for complex problems may slow down PESA2, as handling large memories is computationally expensive. If memory_size=None, the memory size is set to the maximum value of ngen*npop*3 to accommodate all samples during the search.

  • For parallel computing of PESA2, pick ncores divisible by 3 (e.g. 6, 18, 30) to ensure equal computing power across the internal algorithms.

  • If ncores=1, serial calculation of PESA2 is executed.

  • Check the sections of GWO, WOA, and DE for notes on the internal algorithms and the auxiliary parameters of PESA2.

  • Start the prioritized replay with a small value, alpha_init < 0.1, to increase randomness early on and improve PESA2 exploration. Choose a large value, alpha_end > 0.9, to increase exploitation by the end of evolution.

  • Look for an optimal balance between npop and ngen; it is recommended to minimize the population size to allow for more generations.

  • Total number of cost evaluations for PESA2 is ngen*npop*3 + warmup.
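To make the bookkeeping in the notes above concrete, here is a small illustrative calculation in plain Python (no NEORL API is used), with npop=60, nwolves=6, nwhales=10, R_frac=0.5, ngen=50, and the default warmup=100:

#Symmetry: GWO/WOA run extra internal generations so each algorithm performs ~npop evaluations per PESA2 generation
npop, nwolves, nwhales = 60, 6, 10
R_frac = 0.5
ngen, warmup = 50, 100

gwo_internal_gens = npop // nwolves       #10 internal GWO generations -> 60 evaluations
woa_internal_gens = npop // nwhales       #6 internal WOA generations -> 60 evaluations

#R_frac: survivors per algorithm (the same counts are fed back from the replay memory)
de_survivors = int(R_frac*npop)           #30 DE individuals
gwo_survivors = int(R_frac*nwolves)       #3 wolves
woa_survivors = int(R_frac*nwhales)       #5 whales

#Default memory size and total cost evaluations
default_memory_size = ngen*npop*3         #9000 samples if memory_size=None
total_evaluations = ngen*npop*3 + warmup  #9100 fitness evaluations in total

The exact annealing schedule of the replay coefficient is not specified here; assuming a simple linear interpolation between alpha_init and alpha_end across generations, the idea behind the note on alpha_init/alpha_end can be sketched as:

#Assumed linear annealing of alpha (illustrative only; NEORL's internal schedule may differ)
def alpha_schedule(gen, ngen, alpha_init=0.1, alpha_end=1.0):
    frac = gen/max(ngen - 1, 1)
    return alpha_init + frac*(alpha_end - alpha_init)   #small alpha early (exploration), large alpha late (exploitation)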