Modern PESA (PESA2)¶
A module for the parallel hybrid PESA2 algorithm with prioritized experience replay from reinforcement learning. Modern PESA2 combines GWO, WOA, and DE modules in NEORL.
Original paper: Radaideh, M. I., & Shirvan, K. (2022). PESA: Prioritized experience replay for parallel hybrid evolutionary and swarm algorithms-Application to nuclear fuel. Nuclear Engineering and Technology.
https://doi.org/10.1016/j.net.2022.05.001
What can you use?¶
Multi processing: ✔️
Discrete spaces: ✔️
Continuous spaces: ✔️
Mixed Discrete/Continuous spaces: ✔️
Parameters¶
class neorl.hybrid.pesa2.PESA2(mode, bounds, fit, R_frac=0.5, memory_size=None, alpha_init=0.1, alpha_end=1, nwolves=5, npop=50, CR=0.7, F=0.5, nwhales=10, int_transform='nearest_int', ncores=1, seed=None)[source]¶

Prioritized replay for Evolutionary Swarm Algorithms: PESA 2 (Modern Version). A hybrid algorithm of GWO, DE, and WOA.
PESA2 Major Parameters

- Parameters
  mode – (str) problem type, either “min” for a minimization problem or “max” for maximization
  bounds – (dict) input parameter type and lower/upper bounds in dictionary form. Example: bounds={'x1': ['int', 1, 4], 'x2': ['float', 0.1, 0.8], 'x3': ['float', 2.2, 6.2]}
  fit – (function) the fitness function
  R_frac – (float) fraction of npop, nwolves, and nwhales that survive to the next generation. R_frac also determines the number of individuals to replay from the memory
  memory_size – (int) max size of the replay memory (if None, memory_size is built to accommodate all samples during the search)
  alpha_init – (float) initial value of the prioritized replay coefficient (see Notes below)
  alpha_end – (float) final value of the prioritized replay coefficient (see Notes below)
PESA2 Auxiliary Parameters (for the internal algorithms)

- Parameters
  npop – (int) for DE, total number of individuals in the DE population
  CR – (float) for DE, crossover probability between [0,1]
  F – (float) for DE, differential/mutation weight between [0,2]
  nwolves – (int) for GWO, number of wolves in the GWO population
  nwhales – (int) for WOA, number of whales in the WOA population
PESA2 Misc. Parameters

- Parameters
  int_transform – (str) method of handling int/discrete variables, choose from: nearest_int, sigmoid, minmax
  ncores – (int) number of parallel processors
  seed – (int) random seed for sampling
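The nearest_int option for int_transform maps a continuous internal candidate to a valid discrete value. A minimal sketch of that idea, rounding to the closest integer inside the variable's bounds (this illustrates the strategy only and is not NEORL's internal implementation):

```python
def nearest_int(x, low, high):
    """Sketch of the 'nearest_int' discrete-variable strategy:
    round a continuous candidate to the nearest integer, then clip
    it to the variable's [low, high] bounds. Illustrative only,
    not NEORL's internal code."""
    return int(min(max(round(x), low), high))

# For a discrete variable declared as ['int', 1, 4]:
nearest_int(2.7, 1, 4)   # -> 3
nearest_int(5.2, 1, 4)   # -> 4 (clipped to the upper bound)
```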
evolute(ngen, x0=None, replay_every=1, warmup=100, verbose=0)[source]¶

This function evolutes the PESA2 algorithm for a number of generations.

- Parameters
  ngen – (int) number of generations to evolute
  x0 – (list of lists) initial samples to start the replay memory (len(x0) must be equal to or greater than npop)
  replay_every – (int) perform memory replay every this number of generations, default: replay after every generation
  warmup – (int) number of random warmup samples to initialize the replay memory, must be equal to or greater than npop (only used if x0=None)
  verbose – (int) print statistics to screen, 0: no print, 1: PESA print, 2: detailed print

- Returns
  (tuple) (best individual, best fitness, and a list of fitness history)
Example¶
from neorl import PESA2

#Define the fitness function
def FIT(individual):
    """Sphere test objective function.
            F(x) = sum_{i=1}^d xi^2
            d=1,2,3,...
            Range: [-100,100]
            Minima: 0
    """
    y = sum(x**2 for x in individual)
    return y

#Setup the parameter space (d=5)
nx = 5
BOUNDS = {}
for i in range(1, nx+1):
    BOUNDS['x'+str(i)] = ['float', -100, 100]

pesa2 = PESA2(mode='min', bounds=BOUNDS, fit=FIT, npop=50, nwolves=5, nwhales=5, ncores=1)
x_best, y_best, pesa2_hist = pesa2.evolute(ngen=50, replay_every=2, verbose=2)
Notes¶
- PESA2 is symmetric, meaning the population size is equal between DE, WOA, and GWO, which helps ensure that all algorithms can update the memory with similar computing time. Since GWO/WOA typically have smaller populations than DE, i.e. nwolves < npop and nwhales < npop, PESA2 adjusts the number of internal generations for GWO/WOA to ensure a similar number of fitness calculations per individual algorithm. For example, if the user sets npop=60 for DE, nwolves=6 for GWO, and nwhales=10 for WOA, then GWO and WOA are executed internally for 10 and 6 generations, respectively, to reach a total of 60 evaluations per individual algorithm.
- R_frac defines the fraction of individuals from npop, nwolves, and nwhales that survive to the next generation, and also the fraction of samples to replay from the memory. For example, if the user sets R_frac=0.5, npop=60 for DE, nwolves=6 for GWO, and nwhales=10 for WOA, then after every generation, the top 30 individuals in DE, the best 3 wolves, and the best 5 whales survive to the next generation. The replay memory then feeds 30 individuals to DE, 3 new wolves to GWO, and 5 new whales to WOA.
- For complex problems and limited memory, we recommend setting memory_size ~ 5000. When the memory gets full, old samples are overwritten by new ones. Allowing a large memory for complex problems may slow down PESA2, since handling large memories is computationally expensive. If memory_size=None, the memory size is set to the maximum value of ngen*npop*3.
- For parallel computing of PESA2, pick ncores divisible by 3 (e.g. 6, 18, 30) to ensure equal computing power across the internal algorithms.
- If ncores=1, a serial calculation of PESA2 is executed.
- Check the sections of GWO, WOA, and DE for notes on the internal algorithms and the auxiliary parameters of PESA2.
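The symmetric budget and the R_frac accounting described above reduce to simple arithmetic. A sketch using the same numbers as the notes (this mirrors the description only, not NEORL's internal code):

```python
# Illustrative arithmetic for PESA2's symmetric budget, using the
# values from the notes above (not NEORL's internal code).
npop, nwolves, nwhales = 60, 6, 10
R_frac = 0.5

# GWO/WOA internal generations are scaled so each algorithm performs
# roughly npop fitness evaluations per PESA2 generation.
gwo_gens = npop // nwolves   # 60 / 6  -> 10 internal GWO generations
woa_gens = npop // nwhales   # 60 / 10 -> 6 internal WOA generations

# Survivors per algorithm, which also equal the number of samples
# each algorithm receives back from the replay memory.
de_survivors  = int(R_frac * npop)     # 30 DE individuals survive
gwo_survivors = int(R_frac * nwolves)  # 3 wolves survive
woa_survivors = int(R_frac * nwhales)  # 5 whales survive
```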
- Start the prioritized replay with a small fraction, alpha_init < 0.1, to increase randomness early in the search and improve PESA2 exploration. Choose a high fraction, alpha_end > 0.9, to increase exploitation toward the end of the evolution.
- Look for an optimal balance between npop and ngen; it is recommended to minimize the population size to allow for more generations.
- The total number of cost evaluations for PESA2 is ngen*npop*3 + warmup.
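The evaluation budget and the default memory size from the notes can be checked with a quick calculation; npop=60 follows the example in the notes above, and the numbers are illustrative only:

```python
# Quick budget check for PESA2 (illustrative; mirrors the notes above).
ngen, npop, warmup = 50, 60, 100

# Default replay memory size when memory_size=None: ngen*npop*3,
# i.e. large enough to hold every sample generated during the search.
default_memory_size = ngen * npop * 3      # 9000

# Total fitness evaluations: three internal algorithms, each roughly
# npop evaluations per generation, plus the initial warmup samples.
total_evals = ngen * npop * 3 + warmup     # 9100
```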