Simulated Annealing with Polynomial Regression

Simulated annealing

Simulated annealing (SA) is a stochastic optimization method that searches for the global optimum of an objective function. It is inspired by the annealing of metals, in which the metal is heated to a very high temperature and then cooled in a controlled manner. The SA algorithm starts from a randomly selected point. At each iteration, a new point is proposed by making a small change to the current point. The objective function evaluates the new point to produce a score (energy, cost) that can be compared with the score of the current point. If the new point has a better score, it is accepted; otherwise, it is accepted with a certain probability. This acceptance probability is controlled by a parameter called the “temperature”, which decreases gradually at each iteration.

The idea behind the temperature parameter is that at higher temperatures, the algorithm is more likely to accept solutions that are worse than the current solution. This allows the algorithm to explore a wider range of solutions and avoid getting stuck in local minima. As the temperature decreases, the algorithm becomes more conservative and is less likely to accept worse solutions, which helps it converge towards the global optimum.
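The loop described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the d3VIEW implementation: the function name, the Gaussian proposal, the cooling schedule T0 / (i + 1) and the Metropolis acceptance rule exp(-delta / T) are all assumptions chosen because they are common textbook choices.

```python
import math
import random


def simulated_annealing(objective, bounds, iterations=1000, step_size=0.1,
                        initial_temp=10.0, seed=0):
    """Minimize `objective` over a box given as a list of (low, high) pairs."""
    rng = random.Random(seed)
    # Start from a random point inside the domain.
    current = [rng.uniform(lo, hi) for lo, hi in bounds]
    current_score = objective(current)
    best, best_score = list(current), current_score

    for i in range(iterations):
        # Assumed cooling schedule: the temperature decays as T0 / (i + 1).
        temp = initial_temp / (i + 1)
        # Propose a nearby point with a small Gaussian step, clipped to the box.
        candidate = [
            min(max(x + rng.gauss(0.0, step_size * (hi - lo)), lo), hi)
            for x, (lo, hi) in zip(current, bounds)
        ]
        candidate_score = objective(candidate)
        delta = candidate_score - current_score
        # Always accept improvements; accept a worse point with probability
        # exp(-delta / temp) (Metropolis criterion, assumed here).
        if delta < 0 or rng.random() < math.exp(-delta / temp):
            current, current_score = candidate, candidate_score
        # Track the best point seen so far.
        if current_score < best_score:
            best, best_score = list(current), current_score
    return best, best_score
```

At a high temperature the exponential is close to 1, so worse points are accepted often; as the temperature falls, the same score difference is accepted less and less frequently.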

Polynomial Regression

The SA algorithm needs an objective function to evaluate each point. When only a dataset is provided, we can first build a model from the data and then use that model as the objective function.

Polynomial regression fits a polynomial curve of a chosen degree to the given data points. When the degree is one, it reduces to linear regression. Polynomial regression is useful when the relationship between the independent and dependent variables is not linear, which is usually the case. A higher-degree polynomial can capture more complex relationships between the variables.
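As a minimal sketch of turning data into an objective function, the snippet below fits a polynomial with NumPy and wraps it as a callable. It assumes a single input variable and uses made-up synthetic data; the helper name fit_polynomial_objective is hypothetical, and the actual worker may build its model differently (for example, with several input columns).

```python
import numpy as np


def fit_polynomial_objective(x, y, order=2):
    """Fit a polynomial of the given order and return it as a callable."""
    coeffs = np.polyfit(x, y, deg=order)  # least-squares polynomial fit
    model = np.poly1d(coeffs)
    # The returned function maps an input value to the predicted target value.
    return lambda value: float(model(value))


# Synthetic data for illustration only: y = (x - 2)^2 sampled on [0, 4].
x = np.linspace(0.0, 4.0, 21)
y = (x - 2.0) ** 2
objective = fit_polynomial_objective(x, y, order=2)
```

The resulting objective can then be passed to any optimizer that only needs to evaluate points, which is all SA requires.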

Adaptive model-fitting

At every 100th iteration, we stop and “zoom in” to the region centered at the current best point and run the SA algorithm there locally. We compare the optimum found in this local region with the global best point and update the global best point when necessary. This procedure is active when the “shrink_factor” is between 0 and 1 (exclusive). It shrinks the range of each variable from the original domain to a fraction of its width (the value of the parameter shrink_factor), centered at the current global best point. Once the reduced domain is determined, we run the SA algorithm on this region to find the optimum. This fits a new polynomial regression model with the same parameters on the smaller domain. The new model depicts the local surface more accurately and is used as the objective function for the local run.
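The domain reduction can be illustrated with a short helper. This is a sketch under stated assumptions: the function name shrink_domain is hypothetical, and clipping the reduced box to the original domain is an assumption, since the text does not say what happens when the best point lies near a boundary.

```python
def shrink_domain(bounds, best_point, shrink_factor):
    """Return a reduced box centered at best_point.

    Each variable's range is scaled to shrink_factor times its original
    width. Clipping to the original domain is an assumption of this sketch.
    """
    if not 0.0 < shrink_factor < 1.0:
        return list(bounds)  # the procedure is inactive outside (0, 1)
    reduced = []
    for (lo, hi), center in zip(bounds, best_point):
        half_width = 0.5 * shrink_factor * (hi - lo)
        reduced.append((max(lo, center - half_width),
                        min(hi, center + half_width)))
    return reduced
```

For example, with an original range of 0 to 10, a best point at 5 and a shrink_factor of 0.2, the reduced range has a width of 2 and is centered at 5.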

SA optimization on d3VIEW

In d3VIEW Workflow, there is a worker “dataset_simulated_annealing_optimizer” to perform the SA optimization. First, we upload the dataset and select the input and target variables (X and Y columns).

The parameters “iterations”, “step_size” and “initial_temp” control the SA process. Their values may need careful tuning, because the performance of the SA algorithm varies across datasets.

The default normalization method is “minmax”, which scales each variable to the range from 0 to 1. Standardization (“znorm”) is also available. To skip normalization, choose “false”.
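The two normalization options can be sketched with NumPy as follows. The function names are hypothetical and the worker's internals are not documented here; the sketch assumes “znorm” means subtracting the column mean and dividing by the column standard deviation. An inverse transform is included because results are reported on the original scale.

```python
import numpy as np


def normalize(X, method="minmax"):
    """Normalize the columns of X; return the result plus offset and scale."""
    X = np.asarray(X, dtype=float)
    if method == "minmax":
        # Map each column to the range [0, 1].
        offset, scale = X.min(axis=0), X.max(axis=0) - X.min(axis=0)
    elif method == "znorm":
        # Zero mean and unit standard deviation per column (assumed meaning).
        offset, scale = X.mean(axis=0), X.std(axis=0)
    else:
        # Any other value (e.g. "false") leaves the data unchanged.
        offset, scale = np.zeros(X.shape[1]), np.ones(X.shape[1])
    # Note: a constant column gives scale 0 and is not handled in this sketch.
    return (X - offset) / scale, offset, scale


def denormalize(Z, offset, scale):
    """Map normalized values back to the original scale."""
    return np.asarray(Z, dtype=float) * scale + offset
```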

By default, the worker returns the optimum point found by the optimization process. By selecting the “return_type”, we can instead have it return the domain information (the reduced domain if the shrink_factor is between 0 and 1) or the tracking history of the best point at each iteration. When normalization is specified, the output is still reported in the original, unnormalized scale.

Sample output for different “return_type” using HIC dataset

The “target_value” parameter sets the value toward which the optimization converges. In many applications, the feature we are interested in optimizing is bounded (e.g., greater than or equal to zero). When we assess the “best point” found during the optimization, we use the absolute value of the difference between the value we found and the “target_value”. In this way, we can use the “target_value” parameter to steer the search so that the optimized target Y value will be close to the “target_value”. Therefore, when we set “target_value” to 0, we can ignore the negative signs in the output.
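The comparison described above amounts to scoring each point by its distance from the target. The following is a minimal sketch with hypothetical function names, not the worker's actual code:

```python
def target_score(predicted_y, target_value=0.0):
    """Distance of a predicted Y value from the target (lower is better)."""
    return abs(predicted_y - target_value)


def closer_to_target(candidate_y, best_y, target_value=0.0):
    """True if the candidate is closer to the target than the current best."""
    return target_score(candidate_y, target_value) < target_score(best_y, target_value)
```

With a target of 0, a predicted value of -1 scores better than a predicted value of 2, because only the magnitude of the difference is compared.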


Finally, we can choose the order of the polynomial regression.

Example

The following dataset provides the parameters used to generate the force-deflection curves. The DTW value measures how far each simulation curve is from the test curve.