Multi-objective optimization with Pareto front

Optimization

Decisions are made based on preferences. Whether it is a person on a tight budget minimizing monthly expenses or a restaurant owner maximizing revenue, we need a preference or criterion (minimize or maximize) to form a solution. This process is called optimization, and each quantity we optimize according to a preference is an objective. When we focus on a single objective, we can easily sort the candidates and pick the best solution. In reality, however, we frequently need to take multiple objectives into consideration, and more often than not the objectives conflict with each other. For example, a car that offers the most comfort and the best performance is likely to cost a fortune in fuel. To solve such multi-objective optimization problems, we can use the Pareto front.

Pareto Dominance and Pareto Front

Assume that there is a set of solutions for a scenario where our objectives are to maximize X and minimize Y. These solutions are illustrated in the graph below, where each point represents one of the available solutions. We can see that, in general, solutions with a higher X value also have a higher Y value. In other words, our objectives (maximizing X and minimizing Y) conflict with each other.

Illustration of available solutions of a scenario with objectives to maximize X and minimize Y.

If we look at solutions A and C, we can see that solution C has a lower Y value and a higher X value, so solution C is preferable to A. The same can be said of solution F compared to solution E: both have the same X value, but solution F has a lower Y value, so we prefer F to E. When one solution is no worse than another in every objective and strictly better in at least one, we say that it dominates the other, or that the other solution is dominated by it. In our example, solution C dominates A and solution F dominates E, so A and E are dominated solutions. We cannot draw the same conclusion about solution F. By observation, no other solution dominates it: solution G has a higher X value but also a higher Y value, and solution D has a lower Y value but also a lower X value. Since we can’t improve either X or Y without hurting the other, we say solution F is non-dominated. The set of all non-dominated solutions is the Pareto front.

Non-dominated solutions that form the Pareto front are highlighted in red. The green curve illustrates the Pareto front.
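The dominance rule can be written as a short check. The following is a minimal sketch, not d3VIEW code: the function name `dominates` and the coordinates are made up, chosen only so that the points reproduce the relations described for solutions A to G.

```python
from typing import Dict, Tuple

Point = Tuple[float, float]  # (X, Y): X is maximized, Y is minimized


def dominates(p: Point, q: Point) -> bool:
    """Return True if p dominates q when maximizing X and minimizing Y."""
    # p must be no worse than q in every objective ...
    no_worse = p[0] >= q[0] and p[1] <= q[1]
    # ... and strictly better in at least one of them.
    strictly_better = p[0] > q[0] or p[1] < q[1]
    return no_worse and strictly_better


# Hypothetical coordinates; the real values behind the figure are not given.
solutions: Dict[str, Point] = {
    "A": (1.0, 2.0),
    "B": (3.0, 4.0),
    "C": (2.0, 1.0),
    "D": (1.5, 2.5),
    "E": (5.0, 4.5),
    "F": (5.0, 3.0),
    "G": (7.0, 6.0),
}

print(dominates(solutions["C"], solutions["A"]))  # True
print(dominates(solutions["F"], solutions["E"]))  # True
print(dominates(solutions["G"], solutions["F"]))  # False
print(dominates(solutions["D"], solutions["F"]))  # False
```

Neither G nor D dominates F, because each improves one objective only by giving up the other.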

From our example, we notice that the Pareto front consists of solutions C, F, and G. Every other solution is dominated by at least one solution in the Pareto front, and the solutions in the Pareto front are themselves non-dominated. If we connect the Pareto front solutions, we get a visual representation of the Pareto front formed by the given solutions. Any solution that lies on this line will be a non-dominated solution. For aesthetic reasons, we extend the Pareto front at both ends so that it covers the full range.
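A direct way to extract the Pareto front is to keep every solution that no other solution dominates. The sketch below does this with a plain pairwise comparison. The names `dominates` and `pareto_front` and the coordinates are hypothetical, picked so that the result matches the front C, F, and G from the example.

```python
from typing import Dict, Tuple

Point = Tuple[float, float]  # (X, Y): X is maximized, Y is minimized


def dominates(p: Point, q: Point) -> bool:
    """Return True if p dominates q when maximizing X and minimizing Y."""
    no_worse = p[0] >= q[0] and p[1] <= q[1]
    strictly_better = p[0] > q[0] or p[1] < q[1]
    return no_worse and strictly_better


def pareto_front(points: Dict[str, Point]) -> Dict[str, Point]:
    """Keep each solution that is not dominated by any other solution."""
    return {
        name: p
        for name, p in points.items()
        if not any(dominates(q, p) for q in points.values())
    }


# Hypothetical coordinates; the real values behind the figure are not given.
solutions: Dict[str, Point] = {
    "A": (1.0, 2.0),
    "B": (3.0, 4.0),
    "C": (2.0, 1.0),
    "D": (1.5, 2.5),
    "E": (5.0, 4.5),
    "F": (5.0, 3.0),
    "G": (7.0, 6.0),
}

front = pareto_front(solutions)
print(sorted(front))  # ['C', 'F', 'G']
```

Every solution is compared against every other one, so the work grows quadratically with the number of solutions.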

Utopian point and optimal solution

The Pareto front helps us narrow the solutions down to a few, among which choosing one over another requires a trade-off. Is there a mechanism to identify a single best solution among the Pareto front solutions? Suppose we could take the X and Y values of each solution apart and recombine them with the X and Y values of other solutions. The best solution we could build under this assumption, pairing the best X value with the best Y value, is called the utopian point. As the name suggests, it is unrealistic, but it provides a good reference for how far each Pareto front solution is from this ideal solution.

The utopian point is highlighted in green, and the Pareto front solution F, which is closest to the utopian point, is the optimal solution.

Since we treat each preference (X or Y) as equally important, we can measure the distance from each Pareto front solution to the utopian point. The one with the shortest distance is the optimal solution. This makes sense because the solution closest to the ideal is either balanced across X and Y, or so close to the best possible value in one of them that it compensates for the worse value in the other.
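The selection step can be sketched as follows. This is an illustration under stated assumptions: the coordinates of C, F, and G are made up, the function names are hypothetical, and the distance is taken to be the plain Euclidean distance on unscaled values, since the text does not specify a metric or any scaling.

```python
import math
from typing import Dict, Tuple

Point = Tuple[float, float]  # (X, Y): X is maximized, Y is minimized

# Hypothetical coordinates for the Pareto front solutions C, F, and G.
front: Dict[str, Point] = {"C": (2.0, 1.0), "F": (5.0, 3.0), "G": (7.0, 6.0)}


def utopian_point(points: Dict[str, Point]) -> Point:
    """Pair the best X (largest) with the best Y (smallest)."""
    xs = [p[0] for p in points.values()]
    ys = [p[1] for p in points.values()]
    return (max(xs), min(ys))


def closest_to_utopia(points: Dict[str, Point]) -> Tuple[str, Dict[str, float]]:
    """Return the name of the nearest solution and all distances."""
    ideal = utopian_point(points)
    # Assumption: unweighted Euclidean distance on the raw objective values.
    distances = {name: math.dist(p, ideal) for name, p in points.items()}
    best = min(distances, key=distances.get)
    return best, distances


best, distances = closest_to_utopia(front)
print(utopian_point(front))  # (7.0, 1.0)
print(best)                  # F
```

With these numbers, C and G each sit 5.0 away from the utopian point because they are best in one objective and far off in the other, while the balanced solution F is about 2.83 away.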

Get the optimal solution on d3VIEW

As demonstrated above, to find all Pareto front solutions we need to compare every candidate against all other solutions to make sure that none dominates it. This can be computationally expensive. In d3VIEW, we adopt Kung’s method [1]. First, we sort the dataset by the first objective, from most optimal to least optimal. Then we divide the dataset into two (top half and bottom half) and find the non-dominated solutions in each sub-dataset. The non-dominated solutions from the top half remain non-dominated for the whole dataset. However, this is not necessarily the case for those from the bottom half, so we check whether any non-dominated solution from the bottom half is dominated by a non-dominated solution from the top half and discard those that are. Finally, we combine the remaining non-dominated solutions from the two sub-datasets. Applying this procedure recursively, we can find all non-dominated solutions of the given dataset with less effort.

Illustration of Kung’s method to find non-dominated solutions recursively.
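The recursive procedure can be sketched in a few lines. This is a simplified illustration of the divide-and-merge idea, not d3VIEW's implementation. It assumes every objective is minimized (a maximized objective such as X is negated first), and it sorts lexicographically so that ties in the first objective cannot let a bottom-half point dominate a top-half one. The function names and the sample points are hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]  # every component is an objective to minimize


def dominates(p: Vector, q: Vector) -> bool:
    """p dominates q: no worse in every objective, better in at least one."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def _front(sorted_pts: List[Vector]) -> List[Vector]:
    """Non-dominated points of a list already sorted best-first."""
    if len(sorted_pts) <= 1:
        return list(sorted_pts)
    mid = len(sorted_pts) // 2
    top = _front(sorted_pts[:mid])      # stays non-dominated overall
    bottom = _front(sorted_pts[mid:])   # must be checked against the top half
    survivors = [b for b in bottom if not any(dominates(t, b) for t in top)]
    return top + survivors


def kung_front(points: Sequence[Vector]) -> List[Vector]:
    """Sort by the first objective (ties broken by the rest), then recurse."""
    return _front(sorted(points))


# Hypothetical (X, Y) pairs with X maximized and Y minimized.
xy = [(1.0, 2.0), (3.0, 4.0), (2.0, 1.0), (1.5, 2.5),
      (5.0, 4.5), (5.0, 3.0), (7.0, 6.0)]
# Negate X so that both components are minimized.
front = kung_front([(-x, y) for x, y in xy])
print(sorted((-nx, y) for nx, y in front))  # [(2.0, 1.0), (5.0, 3.0), (7.0, 6.0)]
```

The merge here is a plain pairwise check between the two partial fronts, which keeps the sketch short. The published algorithm uses a more refined merge, so this sketch should not be read as achieving the complexity bounds given in [1].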

Here is an example of Pareto optimization using a sample HIC dataset. Each row of the dataset represents a solution to the problem, and we are interested in a solution that minimizes all three variables: “tbumper”, “thood”, and “HIC”.

By providing this dataset to the “Dataset Compute Pareto Front Optimal” worker on d3VIEW, we get the optimal solution in the output.

3D scatter plot of the HIC dataset. The green dot is the utopian point (ideal solution), and the circled point is the optimal solution.

Optimal solution from d3VIEW Workflow as a dataset with the distance to the ideal solution as a reference.

Alternatively, we can provide this dataset to the “Dataset Compute Pareto Front Sorter” worker, which returns all the points with their distance information sorted in ascending order, so the first point at the top is the optimal solution.

Output from the worker “Dataset Compute Pareto Front Sorter”. It includes all the points with distance information sorted in ascending order.
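For readers who want to reproduce this kind of sorted output outside d3VIEW, a rough sketch is shown below. It is not the worker's code. The column names follow the example, but the numbers are invented, the function name `sort_by_distance` is hypothetical, and the real worker may scale or weight the columns differently.

```python
import math
from typing import Dict, List, Sequence

Row = Dict[str, float]

# Made-up rows for illustration only; these are NOT values from the HIC dataset.
rows: List[Row] = [
    {"tbumper": 3.0, "thood": 1.0, "HIC": 5.0},
    {"tbumper": 1.0, "thood": 2.0, "HIC": 4.0},
    {"tbumper": 2.0, "thood": 2.0, "HIC": 2.0},
    {"tbumper": 4.0, "thood": 3.0, "HIC": 1.0},
]


def sort_by_distance(data: Sequence[Row], columns: Sequence[str]) -> List[Row]:
    """Add a 'distance' column and sort ascending; all columns are minimized."""
    # Ideal (utopian) point: the smallest value seen in each column.
    ideal = [min(row[c] for row in data) for c in columns]
    ranked = []
    for row in data:
        # Assumption: unweighted Euclidean distance on the raw values.
        d = math.dist([row[c] for c in columns], ideal)
        ranked.append({**row, "distance": d})
    return sorted(ranked, key=lambda r: r["distance"])


ranked = sort_by_distance(rows, ["tbumper", "thood", "HIC"])
for r in ranked:
    print(r)
```

The first row of `ranked` is the one closest to the ideal point, which here is the row with all three values equal to 2.0.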

Reference

[1] H. T. Kung, F. Luccio, and F. P. Preparata. 1975. On Finding the Maxima of a Set of Vectors. J. ACM 22, 4 (Oct. 1975), 469–476. https://doi.org/10.1145/321906.321910

September 23, 2022 | by Bing Li
