Definition 2.14: Unconstrained Programs
An unconstrained program is a problem of the form to minimize (or maximize) $f(x)$ without any constraints on the variables $x$:
$$\min_{x \in \mathbb{R}^{n}} f(x), \qquad \text{with } f : \mathbb{R}^{n} \to \mathbb{R}.$$
Remark 2.15
Note that, the feasible domain of $x$ being unbounded, Theorem 2.1 does not apply. Thus, one does not know with certainty whether a minimum actually exists for that problem. (For unconstrained optimization problems, the existence of a minimum can actually be guaranteed if the objective function $f$ is such that $f(x) \to +\infty$ as $\|x\| \to +\infty$, i.e., $f$ is a $0$-coercive function.) Moreover, even if the objective function is convex, such a minimum may not exist (think of $f(x) = e^{x}$!). Hence, we shall proceed with the theoretically unattractive task of seeking minima and maxima of functions which need not have them!
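To make the existence caveat concrete, the short sketch below (not part of the original text) runs an off-the-shelf quasi-Newton solver on the convex function $f(x) = e^{x}$, which is bounded below by $0$ but attains no minimum on $\mathbb{R}$; the choice of solver and starting point is an arbitrary illustration.

```python
import numpy as np
from scipy.optimize import minimize

# f(x) = exp(x) is convex and bounded below on R, yet it has no minimizer:
# inf f = 0 is approached as x -> -infinity but never attained.
res = minimize(lambda x: np.exp(x[0]), x0=[0.0], method="BFGS")

# The solver typically stops at some very negative x where the gradient exp(x)
# falls below its tolerance, "succeeding" even though no minimum exists.
# Existence must be argued separately (e.g., 0-coercivity, or Theorem 2.1 on a
# compact feasible set).
print(res.x, res.fun)
```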
Given a point $\bar{x} \in \mathbb{R}^{n}$, necessary conditions help determine whether or not $\bar{x}$ is a local or a global minimum of a function $f : \mathbb{R}^{n} \to \mathbb{R}$. For this purpose, we are mostly interested in obtaining conditions that can be checked algebraically.
Definition 2.15: Descent Direction
Suppose that $f : \mathbb{R}^{n} \to \mathbb{R}$ is continuous at $\bar{x}$. A vector $d \in \mathbb{R}^{n}$ is said to be a descent direction, or an improving direction, for $f$ at $\bar{x}$ if
$$\exists \delta > 0 : \quad f(\bar{x} + \lambda d) < f(\bar{x}), \quad \forall \lambda \in (0, \delta).$$
Moreover, the cone of descent directions at $\bar{x}$, denoted by $\mathcal{F}(\bar{x})$, is given by
$$\mathcal{F}(\bar{x}) := \left\{ d : \exists \delta > 0 \text{ such that } f(\bar{x} + \lambda d) < f(\bar{x}), \ \forall \lambda \in (0, \delta) \right\}.$$
Definition 2.15 provides a geometrical characterization of a descent direction. Yet, an algebraic characterization would be more useful from a practical point of view. In response to this, let us assume that $f$ is differentiable at $\bar{x}$ and define the following set at $\bar{x}$:
$$\mathcal{F}_{0}(\bar{x}) := \left\{ d : \nabla f(\bar{x})^{\mathsf{T}} d < 0 \right\}.$$
This is illustrated in the accompanying figure, where the half-space $\{ d : \nabla f(\bar{x})^{\mathsf{T}} d < 0 \}$ and the gradient $\nabla f(\bar{x})$ are translated from the origin to $\bar{x}$ for convenience.
Lemma 2.1 proves that every element $d \in \mathcal{F}_{0}(\bar{x})$ is a descent direction for $f$ at $\bar{x}$.
Lemma 2.1: Algebraic Characterization of a Descent Direction
Suppose that $f : \mathbb{R}^{n} \to \mathbb{R}$ is differentiable at $\bar{x}$. If there exists a vector $d$ such that $\nabla f(\bar{x})^{\mathsf{T}} d < 0$, then $d$ is a descent direction for $f$ at $\bar{x}$. That is,
$$\mathcal{F}_{0}(\bar{x}) \subseteq \mathcal{F}(\bar{x}).$$
Proof. $f$ being differentiable at $\bar{x}$,
$$f(\bar{x} + \lambda d) = f(\bar{x}) + \lambda \nabla f(\bar{x})^{\mathsf{T}} d + \lambda \|d\| \, \alpha(\bar{x}, \lambda d),$$
where $\lim_{\lambda \to 0} \alpha(\bar{x}, \lambda d) = 0$. Rearranging the terms and dividing by $\lambda > 0$, we get
$$\frac{f(\bar{x} + \lambda d) - f(\bar{x})}{\lambda} = \nabla f(\bar{x})^{\mathsf{T}} d + \|d\| \, \alpha(\bar{x}, \lambda d).$$
Since $\nabla f(\bar{x})^{\mathsf{T}} d < 0$ and $\alpha(\bar{x}, \lambda d) \to 0$ as $\lambda \to 0$, there exists a $\delta > 0$ such that $\nabla f(\bar{x})^{\mathsf{T}} d + \|d\| \, \alpha(\bar{x}, \lambda d) < 0$ for all $\lambda \in (0, \delta)$; that is, $f(\bar{x} + \lambda d) < f(\bar{x})$ for all $\lambda \in (0, \delta)$. □
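As a quick illustration of Lemma 2.1 (not part of the original text), the following sketch checks the algebraic condition $\nabla f(\bar{x})^{\mathsf{T}} d < 0$ for a hypothetical quadratic $f$, and then confirms numerically that $f$ decreases along $d$ for small step sizes, as required by Definition 2.15.

```python
import numpy as np

# Hypothetical smooth function and its gradient (chosen only for illustration).
f = lambda x: x[0]**2 + 2.0 * x[1]**2
grad_f = lambda x: np.array([2.0 * x[0], 4.0 * x[1]])

x_bar = np.array([1.0, 1.0])
d = -grad_f(x_bar)                       # the steepest-descent direction at x_bar

# Algebraic test (Lemma 2.1): d is a descent direction if grad f(x_bar)^T d < 0.
print(grad_f(x_bar) @ d < 0)             # True

# Geometric definition (Definition 2.15): f(x_bar + lam*d) < f(x_bar) for small lam > 0.
for lam in (1e-1, 1e-2, 1e-3):
    print(lam, f(x_bar + lam * d) < f(x_bar))   # True for each of these step sizes
```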
We are now ready to derive a number of necessary conditions for a point to be a local minimum of an unconstrained optimization problem.
Theorem 2.3: First-Order Necessary Condition for a Local Minimum
Suppose that $f : \mathbb{R}^{n} \to \mathbb{R}$ is differentiable at $\bar{x}$. If $\bar{x}$ is a local minimum, then $\nabla f(\bar{x}) = 0$.
Proof. The proof proceeds by contraposition. Suppose that $\nabla f(\bar{x}) \neq 0$. Then, letting $d = -\nabla f(\bar{x})$, we get $\nabla f(\bar{x})^{\mathsf{T}} d = -\|\nabla f(\bar{x})\|^{2} < 0$. By Lemma 2.1,
$$\exists \delta > 0 : \quad f(\bar{x} + \lambda d) < f(\bar{x}), \quad \forall \lambda \in (0, \delta),$$
hence contradicting the assumption that $\bar{x}$ is a local minimum for $f$. □
Remark 2.16: Obtaining Candidate Solution Points
The above condition is called a first-order necessary condition because it uses the first-order derivatives of $f$. It indicates that the candidate solutions to an unconstrained optimization problem can be found by solving the system of algebraic (generally nonlinear) equations $\nabla f(x) = 0$. Points $\bar{x}$ such that $\nabla f(\bar{x}) = 0$ are known as stationary points. Yet, a stationary point need not be a local minimum; it could very well be a local maximum or even a saddle point.
Example 2.8. Consider the problem to minimize a (nonconvex) function $f$ of a single variable $x \in \mathbb{R}$. The derivative $f'(x)$ of the objective function has three distinct real roots, i.e., the problem has three stationary points. Among these, one gives the smallest objective value. Yet, we cannot declare it to be the global minimum, because we do not know whether a (global) minimum exists for this problem at all. Indeed, as shown in Figure 2.4, none of the stationary points is a global minimum, because $f$ is unbounded below and decreases to $-\infty$.
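Since the specific objective of Example 2.8 is not reproduced here, the sketch below (an illustration, not the original example) applies the procedure of Remark 2.16 to a hypothetical quartic with the same qualitative features: three stationary points, none of which is a global minimum because the function is unbounded below.

```python
import numpy as np

# Hypothetical objective f(x) = x^2 - 0.25*x^4 (coefficients in increasing degree).
f = np.polynomial.Polynomial([0.0, 0.0, 1.0, 0.0, -0.25])
grad = f.deriv()                          # f'(x) = 2x - x^3

# Candidate solutions (Remark 2.16): the real roots of grad f(x) = 0.
roots = grad.roots()
stationary = np.sort(roots[np.isreal(roots)].real)
for s in stationary:
    print(f"x = {s:+.4f},  f(x) = {f(s):+.4f}")   # stationary points near -1.41, 0, +1.41

# None of these stationary points is a global minimum: f is unbounded below,
# e.g. f(10) = 100 - 2500, which is far below f at every stationary point.
print(f(10.0))
```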
More restrictive necessary conditions can also be derived in terms of the Hessian matrix $\nabla^{2} f(\bar{x})$, whose elements are the second-order partial derivatives of $f$. One such second-order condition is given below.
Theorem 2.4: Second-Order Necessary Conditions for a Local Minimum
Suppose that $f : \mathbb{R}^{n} \to \mathbb{R}$ is twice differentiable at $\bar{x}$. If $\bar{x}$ is a local minimum, then $\nabla f(\bar{x}) = 0$ and $\nabla^{2} f(\bar{x})$ is positive semidefinite.
Proof. Consider an arbitrary direction $d \in \mathbb{R}^{n}$. Then, from the twice differentiability of $f$ at $\bar{x}$, we have
$$f(\bar{x} + \lambda d) = f(\bar{x}) + \lambda \nabla f(\bar{x})^{\mathsf{T}} d + \frac{\lambda^{2}}{2} d^{\mathsf{T}} \nabla^{2} f(\bar{x}) d + \lambda^{2} \|d\|^{2} \alpha(\bar{x}, \lambda d),$$
where $\lim_{\lambda \to 0} \alpha(\bar{x}, \lambda d) = 0$. Since $\bar{x}$ is a local minimum, $\nabla f(\bar{x}) = 0$ from Theorem 2.3. Rearranging the terms and dividing by $\lambda^{2} > 0$, we get
$$\frac{f(\bar{x} + \lambda d) - f(\bar{x})}{\lambda^{2}} = \frac{1}{2} d^{\mathsf{T}} \nabla^{2} f(\bar{x}) d + \|d\|^{2} \alpha(\bar{x}, \lambda d).$$
Since $\bar{x}$ is a local minimum, $f(\bar{x} + \lambda d) \geq f(\bar{x})$ for $\lambda$ sufficiently small. By taking the limit as $\lambda \to 0^{+}$, it follows that $d^{\mathsf{T}} \nabla^{2} f(\bar{x}) d \geq 0$. Since $d$ is arbitrary, $\nabla^{2} f(\bar{x})$ is therefore positive semidefinite. □
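The conditions of Theorems 2.3 and 2.4 can be verified numerically at a candidate point: the gradient should (approximately) vanish and the Hessian should have no significantly negative eigenvalue. The helper and the example function below are illustrative assumptions, not taken from the text.

```python
import numpy as np

def satisfies_necessary_conditions(grad, hess, x, tol=1e-8):
    """First- and second-order necessary conditions (Theorems 2.3 and 2.4) at x."""
    stationary = np.linalg.norm(grad(x)) <= tol                    # grad f(x) = 0
    psd = np.all(np.linalg.eigvalsh(np.asarray(hess(x))) >= -tol)  # Hessian PSD
    return stationary and psd

# Hypothetical example: f(x) = x1^2 + x2^4, whose only stationary point is the origin.
grad = lambda x: np.array([2.0 * x[0], 4.0 * x[1]**3])
hess = lambda x: np.array([[2.0, 0.0], [0.0, 12.0 * x[1]**2]])
print(satisfies_necessary_conditions(grad, hess, np.zeros(2)))     # True
```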
Example 2.9. Consider the problem to minimize a twice-differentiable function $f$ of two variables $x = (x_{1}, x_{2})^{\mathsf{T}}$. Setting the gradient vector of the objective function to zero shows that $f$ has a single stationary point $\bar{x}$ in $\mathbb{R}^{2}$. Now, consider the Hessian matrix $\nabla^{2} f(\bar{x})$ of the objective function at $\bar{x}$: it is easily checked that $\nabla^{2} f(\bar{x})$ is indefinite, i.e., it has both a strictly positive and a strictly negative eigenvalue. Therefore, by Theorem 2.4, the stationary point $\bar{x}$ is not a (local) minimum (nor is it a local maximum). Such stationary points are called saddle points (see Figure 2.5).
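Because the function of Example 2.9 is not reproduced above, the sketch below uses the classical stand-in $f(x) = x_{1}^{2} - x_{2}^{2}$, whose only stationary point is the origin; its indefinite Hessian there is exactly what flags a saddle point via the eigenvalue test.

```python
import numpy as np

def classify_stationary_point(hess, x, tol=1e-8):
    """Classify a stationary point from the eigenvalue signs of the Hessian."""
    eig = np.linalg.eigvalsh(np.asarray(hess(x)))
    if np.all(eig > tol):
        return "strict local minimum"      # Theorem 2.8
    if np.all(eig < -tol):
        return "strict local maximum"
    if np.any(eig > tol) and np.any(eig < -tol):
        return "saddle point"              # violates the necessary condition of Theorem 2.4
    return "inconclusive"                  # semidefinite case: higher-order terms decide

# Stand-in for Example 2.9: f(x) = x1^2 - x2^2, with Hessian diag(2, -2) at the origin.
hess = lambda x: np.array([[2.0, 0.0], [0.0, -2.0]])
print(classify_stationary_point(hess, np.zeros(2)))   # 'saddle point'
```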
The conditions presented in Theorems 2.3 and 2.4 are necessary conditions. That is, they must hold true at every locally optimal solution. Yet, a point satisfying these conditions need not be a local minimum. Theorem 2.5 gives sufficient conditions for a stationary point to be a global minimum point, provided the objective function is convex on $\mathbb{R}^{n}$.
Theorem 2.5: First-Order Sufficient Conditions for a Global Minimum of Convex Functions
Suppose that $f : \mathbb{R}^{n} \to \mathbb{R}$ is differentiable at $\bar{x}$ and convex on $\mathbb{R}^{n}$. If $\nabla f(\bar{x}) = 0$, then $\bar{x}$ is a global minimum of $f$ on $\mathbb{R}^{n}$.
Proof. $f$ being convex on $\mathbb{R}^{n}$ and differentiable at $\bar{x}$, by Theorem 2.6 we have
$$f(x) \geq f(\bar{x}) + \nabla f(\bar{x})^{\mathsf{T}} (x - \bar{x}), \quad \forall x \in \mathbb{R}^{n}.$$
But since $\bar{x}$ is a stationary point, $\nabla f(\bar{x}) = 0$, and therefore
$$f(x) \geq f(\bar{x}), \quad \forall x \in \mathbb{R}^{n}. \qquad \Box$$
Theorem 2.6: First-Order Condition of Convexity
Let $C$ be a convex set in $\mathbb{R}^{n}$ with a nonempty interior, and let $f : C \to \mathbb{R}$ be a function. Suppose that $f$ is continuous on $C$ and differentiable on $\operatorname{int}(C)$. Then $f$ is convex on $\operatorname{int}(C)$ if and only if
$$f(y) \geq f(x) + \nabla f(x)^{\mathsf{T}} (y - x)$$
holds for any two points $x, y \in \operatorname{int}(C)$.
Proof. ... □
Theorem 2.7: Second-Order Condition of Convexity
Let $C$ be a convex set in $\mathbb{R}^{n}$ with a nonempty interior, and let $f : C \to \mathbb{R}$ be a function. Suppose that $f$ is continuous on $C$ and twice differentiable on $\operatorname{int}(C)$. Then $f$ is convex on $\operatorname{int}(C)$ if and only if the Hessian matrix $\nabla^{2} f(x)$ is positive semidefinite at every point $x \in \operatorname{int}(C)$.
Proof. ... □
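The two characterizations can be probed numerically on sampled points. The sketch below does so for the log-sum-exp function (a standard convex function chosen here purely for illustration, not an example from the text), checking the first-order inequality of Theorem 2.6 and the positive semidefiniteness required by Theorem 2.7.

```python
import numpy as np

# Log-sum-exp in two variables: f(x) = log(exp(x1) + exp(x2)), known to be convex on R^2.
f = lambda x: np.log(np.exp(x).sum())
grad = lambda x: np.exp(x) / np.exp(x).sum()                  # the softmax of x
hess = lambda x: np.diag(grad(x)) - np.outer(grad(x), grad(x))

rng = np.random.default_rng(0)
for _ in range(1000):
    x, y = rng.normal(size=2), rng.normal(size=2)
    # Theorem 2.6: f(y) >= f(x) + grad f(x)^T (y - x) for all sampled x, y.
    assert f(y) >= f(x) + grad(x) @ (y - x) - 1e-10
    # Theorem 2.7: the Hessian is positive semidefinite at every sampled x.
    assert np.all(np.linalg.eigvalsh(hess(x)) >= -1e-10)
print("no violations of the convexity characterizations on the sampled points")
```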
Remark 2.17
The first-order condition of strict convexity is that
$$f(y) > f(x) + \nabla f(x)^{\mathsf{T}} (y - x), \quad \forall x, y \in \operatorname{int}(C), \ x \neq y,$$
while the second-order condition is that $\nabla^{2} f(x)$ be positive definite at every point $x \in \operatorname{int}(C)$. Note that, unlike in the convex case, this second-order condition is sufficient but not necessary for strict convexity (think of $f(x) = x^{4}$, whose second derivative vanishes at $x = 0$).
The convexity condition required by theorem 2.5 is actually very restrictive, in the sense that many practical problems are nonconvex. In theorem 2.8, we give sufficient conditions for characterizing a local minimum point, provided the objective function is strictly convex in a neighborhood of that point.
Theorem 2.8: Second-Order Sufficient Conditions for a Strict Local Minimum
Suppose that $f : \mathbb{R}^{n} \to \mathbb{R}$ is twice differentiable at $\bar{x}$. If $\nabla f(\bar{x}) = 0$ and $\nabla^{2} f(\bar{x})$ is positive definite, then $\bar{x}$ is a strict local minimum of $f$.
Proof. $f$ being twice differentiable at $\bar{x}$, we have
$$f(\bar{x} + d) = f(\bar{x}) + \nabla f(\bar{x})^{\mathsf{T}} d + \tfrac{1}{2} d^{\mathsf{T}} \nabla^{2} f(\bar{x}) d + \|d\|^{2} \alpha(\bar{x}, d),$$
for each $d \in \mathbb{R}^{n}$, where $\lim_{d \to 0} \alpha(\bar{x}, d) = 0$. Let $\lambda_{\min}$ denote the smallest eigenvalue of $\nabla^{2} f(\bar{x})$. Then, $\nabla^{2} f(\bar{x})$ being positive definite, we have $\lambda_{\min} > 0$ and $d^{\mathsf{T}} \nabla^{2} f(\bar{x}) d \geq \lambda_{\min} \|d\|^{2}$. Moreover, from $\nabla f(\bar{x}) = 0$, we get
$$f(\bar{x} + d) - f(\bar{x}) \geq \left( \tfrac{\lambda_{\min}}{2} + \alpha(\bar{x}, d) \right) \|d\|^{2}.$$
Since $\alpha(\bar{x}, d) \to 0$ as $d \to 0$, there exists an $\eta > 0$ such that $|\alpha(\bar{x}, d)| < \tfrac{\lambda_{\min}}{4}$ for all $d$ with $0 < \|d\| < \eta$, and finally,
$$f(\bar{x} + d) - f(\bar{x}) \geq \tfrac{\lambda_{\min}}{4} \|d\|^{2} > 0, \quad \forall d \text{ with } 0 < \|d\| < \eta,$$
i.e., $\bar{x}$ is a strict local minimum of $f$. □
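As a numerical companion to Theorem 2.8 (again an illustrative sketch with a made-up function, not an example from the text), the code below verifies at a candidate point that the gradient vanishes and the smallest Hessian eigenvalue is strictly positive, and then confirms the strict-local-minimum conclusion on small random perturbations.

```python
import numpy as np

# Hypothetical function with a strict local minimum at the origin.
f = lambda x: x[0]**2 + 0.5 * x[1]**2 + x[0]**2 * x[1]
grad = lambda x: np.array([2.0 * x[0] + 2.0 * x[0] * x[1], x[1] + x[0]**2])
hess = lambda x: np.array([[2.0 + 2.0 * x[1], 2.0 * x[0]], [2.0 * x[0], 1.0]])

x_bar = np.zeros(2)
lam_min = np.linalg.eigvalsh(hess(x_bar)).min()
# Hypotheses of Theorem 2.8: grad f(x_bar) = 0 and the Hessian is positive definite.
print(np.linalg.norm(grad(x_bar)) < 1e-12 and lam_min > 0)     # True

# Conclusion in practice: f(x_bar + d) > f(x_bar) for small nonzero perturbations d.
rng = np.random.default_rng(1)
perturbations = 1e-3 * rng.normal(size=(1000, 2))
print(all(f(x_bar + d) > f(x_bar) for d in perturbations))     # True
```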
Example 2.10. Consider the problem to minimize a twice-differentiable, convex function $f$, and let $\bar{x}$ be a stationary point of $f$. The gradient vector and Hessian matrix at $\bar{x}$ satisfy $\nabla f(\bar{x}) = 0$ and $\nabla^{2} f(\bar{x}) \succ 0$ (positive definite); note that positive definiteness of the Hessian is precisely the second-order characterization of strict convexity. Hence, by Theorem 2.8, $\bar{x}$ is a (strict) local minimum of $f$ ($\bar{x}$ is also a global minimum of $f$ on $\mathbb{R}^{n}$, since $f$ is convex). The objective function is pictured in Figure 2.6 below.
We close this subsection by re-emphasizing the fact that every local minimum of an unconstrained optimization problem is a global minimum if $f$ is a convex function on $\mathbb{R}^{n}$. Yet, convexity of $f$ is not a necessary condition for each local minimum to be a global minimum.