To understand how the Adaptive Simulated Annealing algorithm works, it helps to visualize the problems presented by such complex systems as a geographical terrain. For example, consider a mountain range, with two “parameters,” such as along the North-South and East-West directions. We want to find the lowest valley in this terrain. The ASA algorithm approaches this problem similar to using a ball that can bounce over mountains from valley to valley. We start at a high “temperature,” where the temperature is an ASA parameter that mimics the effect of a fast moving particle in a hot object like a hot molten metal; the ball is permitted to make very high bounces and is able to bounce over any mountain to access any valley, given enough bounces. As the temperature is made relatively colder, the ball cannot bounce so high, and it can settle, becoming trapped in relatively smaller ranges of valleys. We imagine that our mountain range is aptly described by a “cost function” (the ObjectiveandPenalty parameter in Isight). We define probability distributions of the two directional parameters, called generating distributions, because they generate possible valleys or states we are to explore. We define another distribution, called the acceptance distribution, which depends on the difference of cost functions of the present generated valley we are to explore and the last saved lowest valley. The acceptance distribution decides probabilistically whether to stay in a new lower valley or to bounce out of it. All the generating and acceptance distributions depend on temperatures. In a D-dimensional parameter space with parameters ${p}^{i}$ having ranges $[{A}_{i},{B}_{i}]$, about the $k$’th last saved point (e.g., a local optima),${p}_{k}^{i}$, a new point is generated using a distribution defined by the product of distributions for each parameter,${g}^{i}({y}^{i},{T}_{i})$, in terms of random variables ${y}^{i}$ in $[-1,1]$, where ${p}_{k}+{1}^{i}={p}_{k}^{i}+{y}^{i}({B}_{i}-{A}_{i})$, and “temperatures” ${T}_{i},{g}^{i}({y}^{i};{T}_{i})=1/\left[2\left(\left|{y}^{i}\right|+{T}_{i}\right)(1+1/{T}_{i})\right]$. The cost functions, $C({p}_{k}+1)-C\left({p}_{k}\right)$, are compared using a uniform random generator, $U$ in $(0,1)$, in a “Boltzmann” test: If $\mathrm{exp}[-\left(C({p}_{k}+1)-C\left({p}_{k}\right)\right)/{T}_{\mathrm{cos}t}]>U$, where ${T}_{\mathrm{cos}t}$ is the “temperature” used for this test, the new point is accepted as the new saved point for the next iteration. Otherwise, the last saved point is retained. The annealing schedule for each parameter temperature, ${T}_{i}$, from a starting temperature ${T}_{i}0$, is as follows: ${T}_{i}\left({k}_{i}\right)={T}_{0i}\mathrm{exp}(-{c}_{i}{k}_{i}^{1/D})$. The annealing schedule for the cost temperature is developed similarly to the parameter temperatures. However, the index for reannealing the cost function, ${k}_{\mathrm{cos}t}$, is determined by the number of accepted points, instead of the number of generated points as used for the parameters. ${T}_{\mathrm{cos}t}\left({k}_{\mathrm{cos}t}\right)={T}_{0\mathrm{cos}t}\mathrm{exp}(-{c}_{\mathrm{cos}t}{k}_{\mathrm{cos}t}^{1/D})$. As determined by the technique options selected, the parameter “temperatures” may be periodically adaptively reannealed or increased relative to their previous values, using their relative first derivatives with respect to the cost function, to guide the search “fairly” among the parameters. As determined by the technique options selected, the reannealing of
the cost temperature resets the scale of the annealing of the cost acceptance
criteria using the new ${T}_{0\mathrm{cos}t}$ value. The new
${T}_{0\mathrm{cos}t}$ value is taken to be the minimum of
the current initial cost temperature and the maximum of the absolute
values of the best and last cost functions and their difference. The
new ${k}_{\mathrm{cos}t}$ is calculated taking ${T}_{\mathrm{cos}t}$T |