#### 1.4 Optimization Methods

##### 1.4.1 General Information About the Extremal Problems

To solve problems with the single criterion of optimality rigorous mathematical methods are developed.

Direct methods of the calculus of variations – one of the branches of the theory of extreme problems for functional – reduce the problem of finding the functional extremum to the optimization of functions.

There are analytical and numerical methods for finding optimal solutions. As a rule, the real problems are solved numerically, and only in some cases it is possible to obtain an analytical solution.

Functions optimization using differentiation

Finding the extremum of the function of one or more variables possible by means of differential calculus methods. It’s said that the   point gives to function f (x) local maximum, if there is a number Ɛ>0 at which from the inequality | x-x̂| < Ɛ the inequality f (x) ≤ f (x̂) comes after.

The function is called one-extremal (unimodal) if it has a single extremum and multi-extremal (multimodal), if it has more than one extremum. The point at which the function has a maximum or minimum value of all local extrema, called a point of the global extremum.

A necessary condition for an extremum of a differentiable function of one variable gives the famous Fermat’s theorem: let f  (x) – function of one variable, differentiable at the point x̂. If x̂ – local extreme  point, then f’ (x̂) = 0.

The points at which this relationship is satisfied, called stationary. The stationary points are not necessarily the point of extreme. Sufficient conditions for the maximum and minimum functions of one variable – respectively f” (x̂) <0,  f” (x̂) > 0.

Before proceeding to the necessary and sufficient conditions for extrema of functions of several variables, we introduce some definitions. The gradient of function f  (x) is a vector

The real symmetric matrix H is called positive (negative) defined if XT = Hx>0(<0) for every set of real numbers x1 , x2, …. xn, not all of which are zero.

The necessary conditions for that x̂ – the point of local extremum of n variables function f(x), x ƐE are as follows:

If the Hessian is positive (negative) defined for all xƐE n, it is a sufficient condition of unimodality of the function. To test matrix A definiteness, Sylvester criterion is applied, according to which the necessary and sufficient condition for positive certainty are the inequalities:

This case involves determining the extremum in an infinite change range of variables x1 , x2, …. xn.  If optimized function imposed additional conditions (restrictions), talk about the problem of conditional extremum. In general, you want to find extremum f(x), xƐE n under the constraints

To solve the problem (1.14) only with restrictions in the form of equations a method of Lagrange multipliers is used, which is based on the conduct of the Lagrange’s function:

where λj  – undetermined Lagrange multipliers. We write the necessary conditions for optimality in the problem of conditional extremum with equality constraints:

It is a system of n + m equations from which can be determined x i , i=1,…,n, λj, j=1,…,m. A rigorous proof of the Lagrange conditions set out in the specific manuals. Explain the meaning of the method as follows. On the one hand, for all of x which satisfy the constraints hj (x)=0, j=1,…,m, obviously L=(x, λ) = f(x).  On the other hand, the extreme point of the Lagrange function also satisfies these conditions (the second equation (1.14), and therefore, finding an  extremum L(x, λ), we simultaneously obtain a conditional f (x) extremum. To  address the issue of the presence of a stationary point to be a local extremum in the problem of conditional extremum, let us expand Lagrange function in a Taylor series with a subject to the satisfaction of relations hj(x)=0.

Optimization with constraints in the form of inequalities

Classical methods of finding the conditional and unconditional extrema of functions discussed above, in some cases, allow to solve problems with inequality constraints.

Let the task of finding the maximum of a function of one variable f(x) on the interval a≤x≤b. Using the necessary optimality conditions, we find the roots of f'(x) = 0 which lie in the interval [a, b]; We check the sufficient conditions for maximum  f” (x̂) <0 and choose the points corresponding to the maximum. Also, we compute the function values at the borders of a segment, where it can take higher values than the interval (Fig. 1.4). We turn now to the case of several variables and consider the optimization problem: find a maximum f(x), xƐE, subject to the constraints:

In the first stage of the solution by the method of Lagrange multipliers, we find all stationary points lying in the positive octant of n-dimensional space and isolate the maximum points on the basis of sufficient conditions for an extremum. Then we explore the positive octant boundary, in turn equating to zero in all sorts of combinations of 1, 2,…,n-m+1 variables, and each time solving the optimization problem with equality constraints. As a result of the computing process, the complexity of which is obvious, the largest of all the extrema should be selected.

A more general problem, find the maximum

For the problem can be written necessary optimality conditions (generalized Lagrange multiplier rule), however, it is rarely used because of the complexity of solving the resulting system of equations.

##### 1.4.2 Nonlinear Programming

Subject of nonlinear programming

Nonlinear programming – branch of applied mathematics dealing with finding the extremum of function of many variables in the presence of nonlinear constraints in the form of equalities and inequalities, i.e. solution of the problem (1.14), discussed in the previous section [3].

Classical methods of optimization are part of it, along with disciplines such as linear, quadratic, separable programming. However, of the greatest practical interest to us are the numerical or direct methods of nonlinear programming, especially intensively developed in recent years.

None of the proposed algorithms is absolutely the best, so the choice of a numerical method is dictated by the content of a specific problem, which must be solved. Computational methods are classified according to some peculiarity of the problem (no restrictions, with equality constraints, inequalities and so on), the nature of methods of solutions (e.g., with or without the use of derivatives), the type of computers, programming language, and so on.

Search of one variable function extremum

A number of methods of finding an extremum of function of many variables use as a part the procedure for the one-dimensional optimization. In the case then function of one variable is multi-extremal, the only correct method of finding the global extremum is a direct enumeration of a number of values with some step in its change.

Obviously, the function can vary sharply, the smaller should be chosen the grid. After a rough determination of the neighborhood of extremum, begin to search its exact value. For this purpose, one-dimensional algorithms for searching the extremes of unimodal functions in a given interval are used.

One of the most effective methods is the so-called golden section. Recall that if a segment divided into two parts, so that the ratio of the lengths at a greater relative length equal to the length of most of all segment, obtain the so-called golden ratio (is approximately 0.38: 0.62). Golden section method just based on the multiple division of uncertainty interval, i.e. the interval in which the
extremum enclosed in an appropriate ratio.

To reduce the range of uncertainty in the 100 times 11 calculations is required, in the 10000 times – 21 calculations. For comparison, the bisection method (dichotomy) leads to a corresponding narrowing of the range of 14 and 28 function evaluations.

The advantage of the golden section is that it works equally well for smooth and non-smooth functions. It was found that, in the case of smooth functions by a polynomial approximation possible to quickly determine the number of extreme at the same accuracy as that by the golden section.

If the optimized function is defined and unimodal on the entire real axis, there is no need to worry about selecting the initial uncertainty interval. For example, in the method of Davis, Sven and Campy (abbreviated as DSC), from a certain point, it becomes increasing steps until extremum is passed, and then made quadratic interpolation on the basis of information about the functions in the past three points is determined extremum of the interpolation polynomial.

The Powell’s algorithm of quadratic polynomial interpolation is carried out in three arbitrary points, approximate extremum is found, dropped one of the four points and the procedure is repeated until convergence. The most effective is a combination of the described algorithms, or the so-called method of DSC-Powell. In accordance with this first algorithm DSC sought interval in which the extremum, three points are selected and carried there through parabola. Approximate value at an extremum is calculated as in the method of Powell:

If the value of the function at the point x̂ of optimum values f(x1), f(x2), f(x3) differ by less than a predetermined accuracy, complete calculations,
otherwise discard the worst of the points x1, x2, x3,x̂ and carry out a new parabola. For functions that are sufficiently close to quadratic efficiency DSC-Powell is very high: as a rule, the decision to an accuracy 10-5…10-6 is achieved 6–8 calculations of the objective function.

Methods for unconstrained optimization

Consider the problem of finding the maximum of a function of several variables without restrictions. Find maximum f(x), xƐE n One of the most famous is the gradient methods to solve this problem. They are based on the fact that the promotion of the objective function to the extreme in the space E n made by the rule:

Vector sk sets another search direction and λj – the length of a step in this direction. Obviously,λj should be chosen so as to move as close as possible to the extreme. Various methods of selecting the direction of the search are used. The simplest of these is that the movement of the point xk is made in the direction of the greatest magnification of f(xk), i.e. in the direction of a gradient function at a given point.

According to this method, called the method of steepest descent,

The theory shows, and the practice of calculation confirms that the steepest descent method is not very effective in cases where the level curves of the objective function is strongly stretched, i.e. there are deep ravines while searching a minimum or ranges when searching maximum. The steepest descent direction is almost orthogonal to the best direction of the search, as a consequence, the optimal step reduced all the time, and the algorithm “get stuck” without reaching the extreme. The way out of this situation can be a scaling of variables, at which the level lines would get kind of close to the circle.

In order to reduce the amount of computations of the objective function, associated with a numerical definition of partial derivatives, sometimes used method of coordinate descent, which is also called a relaxation or Gauss-Seidel method. Let ei – axis xi unit vector, and x={x1…,xn – the starting point of the search. One international of coordinate descent is to take steps: xk-1=xkkek, k = 1,…,n.

Step as in the method of steepest descent is determined by the condition max f(xkksk The Gauss-Seidel method suffered from the same flaw as the steepest descent method, – a bad convergence in the presence of ravines.

One way to overcome the computational difficulties associated with the gully structure of the objective function involves the use of information not only on its first derivative, but also higher order, contained in the second partial derivatives. An arbitrary function can be represented by its quadratic expansion in a Taylor series in the neighborhood of point x:

Equations (1.26) or (1.27) are applied iteratively until the end calculation process criterion is reached, called Newton’s method. Difficulties of using Newton algorithm associated, firstly, with Hessian matrix inversion, and secondly, with the computation of the second partial derivatives, which restricts its practical use.

The methods of conjugate directions are without drawbacks of gradient methods and have the convergence rate close to Newton’s method. At the same time, they are the methods of the first order, as the gradient. Positive defined quadratic form of n variables is minimized conjugate gradient method for no more than n steps. The conjugate gradient method is suitable for minimization of non-quadratic functions, only when they are iterative.

There are different ways of constructing conjugate directions. In particular, Fletcher and Reeves proposed a method, called the conjugate gradients method, according to which the subsequent direction of the search is a linear combination of the direction of steepest descent and the previous direction, i.e.:

Some methods do not use the derivatives of functions, and the optimization direction in which tis determined only on the basis of successive calculations of the objective function. In cases where the determination of the objective function derivatives is difficult, search algorithms may be preferable. In the case of one-dimensional analogue of the search method is the method of golden section, and the method of using derivatives – DSC-Powell method.

Methods of optimization with constraints

In addition to the previously described method of Lagrange multipliers for finding the extremum of functions with restrictions a number of numerical  methods developed. The first approach to the construction of algorithms for constrained optimization is monotonous motion to the optimum of the objective function and at the same time striving to meet the exact or approximate limits.
Methods of this type are numerous, but the complexity, lack of flexibility and a large amount of computational work limit their use in practical calculations.

More elegant, easy to implement and effective the methods based on the reduction of problems with constraints to the solution of a sequence of
unconstrained optimization – the so-called penalty function methods. There are several variations of these methods.

Let’s begin their consideration with the interior point method for problems with inequality constraints:

The organization of numerical maximum search of (1.29) must be such that the point does not leave the feasible region. This shortage is deprived the external penalty function method, which for the problem of the form (1.33) involves the construction of the associated objective function.

It can be shown that the sequence of points x̂k converges to the solution of the problem (1.33), but in contrast to the interior point method to the extreme movement takes place outside the feasible set, and is taken from the name of the method of exterior penalty functions. This method is also applicable to the general problem of nonlinear programming (1.14), for which used attached objective function:

Algorithm of solution is the same as for the problem (1.34).

The solution of nonlinear programming problems with constraints using penalty function method is complicated by the fact that as the penalty function coefficient is increasing, (1.35) expressed gully structure. As previously shown, not all the methods of unconditional optimization solution can cope with such problems, and therefore the choice of the method of finding the extremum of the attached objective function is of fundamental importance.

An important role is also played the strategy of the penalty factor change, because if you choose it immediately large, constraints of the problem satisfied well, but the objective function does not improve. In contrast, if too small values of Λ k, motion occurs in the direction of improvement of the objective function, but practically does not take into account the constraints that can lead to failure in the E n  areas where the objective function and constraints are not defined.

For example, if in the objective function or in limitations members of the form xa are present, it is unacceptable entering the zone x≤0 Get rid of the
zone uncertainty, resulting in the computer calculations for emergencies can sometimes be the introduction of a suitable change of variables. In particular, to meet conditions x>0 the replacement x=e is suitable, which already zƐE1. If such a reception is impossible, it should be carefully selected constants of unconditional search methods as the length of the step in the direction of descent, change of this step in the process to find a one-dimensional vector of variables did not leave the area where the objective function and constraints of
the problem identified.

In conclusion, we consider the possibility of nonlinear optimization methods usage in order to solve systems of nonlinear equations. Suppose that in the problem (1.14) there are no restrictions in the form of inequalities, and the number of variables equal to the number of restrictions in the form of equations, i.e., in fact, the task of solving the system of m equations with m unknowns. We form the function

and find its maximum. If the system of equations  hj(x)=0, j=1,…m , has a solution, then, obviously, at the same time with the maximum of I* is the root of the system of equations. In particular, if the functions hj(x) are linear, function (1.36) is obtained quadratic and can be effectively solved by Newton’s and conjugate gradient method.

Replacement of the problem of systems of linear equations solution to extremum problems is justified in cases where the matrix of the system is ill-conditioned (e.g., in the problem of approximation by least squares) and can not be solved by conventional methods, in particular, by process of elimination.

The values hj(x) in (1.35) are called residuals, and the solution of nonlinear equations is replaced by minimizing the sum of squared residuals.

##### 1.4.3 Methods for Optimization of Hardly Computable Functions

In some problems, when the calculation of the value of the objective function may take minutes, hours or even days of the computer, the range of acceptable methods of optimization significantly narrowed

These problems, in particular, include aerodynamic optimization of turbine blades using CFD.

The Nelder-Mead method (Nelder A.-Mead R.), also known as the flexible polyhedron method or the simplex method is a method of unconditional optimization of functions of several variables. Without requiring computation of the gradient function, it is applicable to non-smooth, noisy functions, and is particularly effective in small (up to 6) number of variable parameters. Its essence lies in the follow-successive movement and deformation of the simplex around the point of extreme. The method is a local extremum and can “get stuck” in one of them. If you still need to find a global extremum, one can try to select other initial simplex.

A more developed approach to the exclusion of local extrema offered algorithm based on the Monte-Carlo method, as well as evolutionary algorithms.

The genetic algorithm (GA) – is a global search heuristic method, used to solve optimization problems and modeling, by random selection, combination and variation of the required parameters with the use of mechanisms that resemble biological evolution. GA usage assumes its careful adjustment on special test functions, which, however, does not guarantee the effectiveness of the algorithm and the accuracy of decisions of the function.

This algorithm is well suited to the study of noisy functions, but requires a large number of CFD – calculations and therefore more time on optimization. The last forcing researchers to use coarse meshes and not quite accurate, but easily calculated turbulence models, which will inevitably leads to loss of the numerical calculations precision.

Monte-Carlo (random search) methods allows you to find the extremes of multimodal and noisy functions; use various constraints during optimization; is particularly effective when a large number of variable parameters; requires careful adjustment for test functions; it is one of the most common methods of optimization and solution of various problems in mathematics, physics,
economics, etc. However, the method requires tens of thousands of the objective function computing and practically not applicable for direct optimization based on CFD – calculations. To improve the efficiency of random search used quasi-random sequence of numbers (LPτ [4] Sobol), Faure, Halton et al.). Increased efficiency is achieved by eliminating clustering that occurs in a random search that is by more even distribution of points in the search study area of the function extremum.

Recently, in the optimization algorithms the methods of experiment planning are widely used. Using the methods of the theory of experimental design
(Design of the Experiment – DOE), the original mathematical model can be approximated by a quadratic polynomial. One of the relevant planning schemes of the experiment described in Section 1.3. These quadratic polynomials can be used to further optimization with the use of a universal and reliable global search method using a quasi-random sequences.