In practice, few problems can be formulated as unconstrained programs, because the feasible region is generally restricted by imposing constraints on the optimization variables. In this section, we first present theoretical results for the problem to minimize $f(x)$, subject to $x \in S$, for a general set $S \subseteq \mathbb{R}^n$ (geometric optimality conditions). Then, we let $S$ be more specifically defined as the feasible region of an inequality constrained program:
Definition 2.16: Inequality Constrained Program
The feasible region of the NLP is defined by a set of nonlinear inequalities,
\[ S := \{ x \in \mathbb{R}^n : g_i(x) \le 0, \ i = 1, \dots, m \}, \]
giving the inequality constrained program
\[ \min_{x} \ f(x) \quad \text{s.t.} \quad g_i(x) \le 0, \ i = 1, \dots, m. \]
For this program, we derive the Karush-Kuhn-Tucker (KKT) conditions of optimality in the following.
Definition 2.17: Feasible Direction
Let $S$ be a nonempty set in $\mathbb{R}^n$. A vector $d \in \mathbb{R}^n$, $d \ne 0$, is said to be a feasible direction at $\bar{x} \in \operatorname{cl}(S)$ if there exists $\delta > 0$ such that $\bar{x} + \eta d \in S$ for all $\eta \in (0, \delta)$.
Moreover, the cone of feasible directions at $\bar{x}$, denoted by $\mathcal{D}(\bar{x})$, is given by
\[ \mathcal{D}(\bar{x}) := \{ d \ne 0 : \bar{x} + \eta d \in S, \ \forall \eta \in (0, \delta), \ \text{for some } \delta > 0 \}. \]
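As a quick computational illustration of Definition 2.17, the sketch below tests whether a given direction $d$ is feasible at $\bar{x}$ by sampling small step sizes $\eta \in (0, \delta)$. The set $S$ (here, the unit disk), the point, and the directions are illustrative assumptions, not instances from the text.

```python
import numpy as np

# Hypothetical feasible set S = {x : x1^2 + x2^2 <= 1} (unit disk), for illustration.
def in_S(x):
    return x[0]**2 + x[1]**2 <= 1.0

def is_feasible_direction(x_bar, d, in_S, delta=1e-2, n_samples=50):
    """Check that x_bar + eta*d stays in S for all sampled eta in (0, delta)."""
    etas = np.linspace(delta / n_samples, delta, n_samples)
    return all(in_S(x_bar + eta * d) for eta in etas)

x_bar = np.array([1.0, 0.0])  # a boundary point of the disk
print(is_feasible_direction(x_bar, np.array([-1.0, 0.0]), in_S))  # True: points inward
print(is_feasible_direction(x_bar, np.array([ 1.0, 0.0]), in_S))  # False: points outward
```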
From Definition 2.17 and Lemma 2.1, it is clear that a small movement from $\bar{x}$ along a direction $d \in \mathcal{D}(\bar{x})$ leads to feasible points, whereas a similar movement along a direction $d \in \mathcal{F}_0(\bar{x})$ leads to solutions of improving objective value (see Definition 2.15). As shown in Theorem 2.9 below, a (geometric) necessary condition for local optimality is that "every improving direction is not a feasible direction". This fact is illustrated in the accompanying figure, where both the half-space $\mathcal{F}_0(\bar{x})$ and the cone $\mathcal{D}(\bar{x})$ (see Definition 2.3) are translated from the origin to $\bar{x}$ for clarity.
Theorem 2.9: Geometric Necessary Condition for a Local Minimum
Let $S$ be a nonempty set in $\mathbb{R}^n$ and let $f : \mathbb{R}^n \to \mathbb{R}$ be a differentiable function. Suppose that $x^*$ is a local minimizer of the problem to minimize $f(x)$ subject to $x \in S$. Then, $\mathcal{F}_0(x^*) \cap \mathcal{D}(x^*) = \emptyset$.
We now specify the feasible region as
\[ S := \{ x \in \mathbb{R}^n : g_i(x) \le 0, \ i = 1, \dots, m \}, \]
where $g_i : \mathbb{R}^n \to \mathbb{R}$, $i = 1, \dots, m$, are continuous functions. In the geometric optimality condition given by Theorem 2.9, $\mathcal{D}(x^*)$ is the cone of feasible directions. From a practical viewpoint, it is desirable to convert this geometric condition into a more usable condition involving algebraic equations. As Lemma 2.2 below indicates, we can define a cone $\mathcal{G}_0(\bar{x})$ in terms of the gradients of the constraints active at $\bar{x}$, such that $\mathcal{G}_0(\bar{x}) \subseteq \mathcal{D}(\bar{x})$. For this, we need the following:
Definition 2.18: Active Constraint, Active Set
Let $g_i : \mathbb{R}^n \to \mathbb{R}$, $i = 1, \dots, m$, and consider the set $S := \{ x : g_i(x) \le 0, \ i = 1, \dots, m \}$. Let $\bar{x} \in S$ be a feasible point. For each $i \in \{1, \dots, m\}$, the constraint $g_i$ is said to be active or binding at $\bar{x}$ if $g_i(\bar{x}) = 0$; it is said to be inactive at $\bar{x}$ if $g_i(\bar{x}) < 0$. Moreover,
\[ \mathcal{A}(\bar{x}) := \{ i : g_i(\bar{x}) = 0 \} \]
denotes the set of active constraints at $\bar{x}$.
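In floating-point computations, the active set can only be determined up to a tolerance, since $g_i(\bar{x}) = 0$ rarely holds exactly. A minimal sketch (the constraint functions used in the demo are assumptions for illustration):

```python
import numpy as np

def active_set(g_list, x_bar, tol=1e-8):
    """Indices i with g_i(x_bar) = 0 within tol; the g_i <= 0 define S."""
    return [i for i, g in enumerate(g_list) if abs(g(x_bar)) <= tol]

# Illustrative constraints: g1(x) = x1 + x2 - 1 <= 0 and g2(x) = -x2 <= 0.
g1 = lambda x: x[0] + x[1] - 1.0
g2 = lambda x: -x[1]
print(active_set([g1, g2], np.array([1.0, 0.0])))  # [0, 1]: both active at (1, 0)
```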
Lemma 2.2: Algebraic Characterization of a Feasible Direction
Let $g_i : \mathbb{R}^n \to \mathbb{R}$, $i = 1, \dots, m$, be differentiable functions, and consider the set $S := \{ x : g_i(x) \le 0, \ i = 1, \dots, m \}$. For any feasible point $\bar{x} \in S$, we have
\[ \mathcal{G}_0(\bar{x}) := \{ d : \nabla g_i(\bar{x})^\mathsf{T} d < 0, \ \forall i \in \mathcal{A}(\bar{x}) \} \subseteq \mathcal{D}(\bar{x}). \]
Remark 2.18
This lemma, together with Theorem 2.9, directly leads to the result that $\mathcal{F}_0(x^*) \cap \mathcal{G}_0(x^*) = \emptyset$ for any local solution point $x^*$, i.e.,
\[ \nexists d \in \mathbb{R}^n : \ \nabla f(x^*)^\mathsf{T} d < 0 \ \text{and} \ \nabla g_i(x^*)^\mathsf{T} d < 0, \ \forall i \in \mathcal{A}(x^*). \]
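The condition of Remark 2.18 can be checked computationally. An improving feasible direction with a uniform margin exists if and only if the optimal value $t^*$ of the linear program below is positive: maximize $t$ subject to $\nabla f(\bar{x})^\mathsf{T} d \le -t$, $\nabla g_i(\bar{x})^\mathsf{T} d \le -t$ for $i \in \mathcal{A}(\bar{x})$, and $\|d\|_\infty \le 1$. This normalization device, and the instance data in the demo, are assumptions made for illustration, not part of the text.

```python
import numpy as np
from scipy.optimize import linprog

def improving_feasible_direction(grad_f, grad_g_active):
    """Return (t_star, d): t_star > 0 iff some d satisfies grad_f @ d < 0
    and grad_g_i @ d < 0 for every active i (F0 and G0 intersect)."""
    n = len(grad_f)
    rows = [grad_f] + list(grad_g_active)
    # Variables z = (d, t); each row gives  grad @ d + t <= 0;  maximize t.
    A_ub = np.hstack([np.array(rows), np.ones((len(rows), 1))])
    c = np.zeros(n + 1); c[-1] = -1.0            # minimize -t
    bounds = [(-1, 1)] * n + [(0, None)]         # box-normalize d; t >= 0
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(len(rows)), bounds=bounds)
    return res.x[-1], res.x[:n]

# Illustrative data: f(x) = x1^2 + x2^2 at x_bar = (1, 0), with g(x) = 1 - x1 - x2 active.
t_star, d = improving_feasible_direction(np.array([2.0, 0.0]), [np.array([-1.0, -1.0])])
print(t_star > 0, d)  # True: an improving feasible direction exists at x_bar
```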
The foregoing geometric characterization of local solution points applies equally well to interior points $x^* \in \operatorname{int}(S)$ and to points at the boundary of the feasible domain. At an interior point, in particular, any direction is feasible, and the necessary condition reduces to $\nabla f(x^*) = 0$, which is the same condition as in unconstrained optimization (see Theorem 2.3).
Note also that there are several cases where the condition $\mathcal{F}_0(x^*) \cap \mathcal{G}_0(x^*) = \emptyset$ is satisfied by non-optimal points. In other words, this condition is necessary but not sufficient for a point $x^*$ to be a local minimum of $f$ on $S$. For instance, any point $\bar{x} \in S$ with $\nabla g_i(\bar{x}) = 0$ for some $i \in \mathcal{A}(\bar{x})$ trivially satisfies the condition $\mathcal{F}_0(\bar{x}) \cap \mathcal{G}_0(\bar{x}) = \emptyset$. Another example is given below.
Example 2.11. Consider the problem
\[ \min_x \ f(x) := x_1^2 + x_2^2 \quad \text{s.t.} \quad g_1(x) := x_2 \le 0, \quad g_2(x) := -x_2 \le 0, \]
whose feasible region is the line $\Delta := \{ x \in \mathbb{R}^2 : x_2 = 0 \}$. Clearly, this problem is convex and $x^* = (0, 0)^\mathsf{T}$ is the unique global minimum.
Now, let $\bar{x}$ be any point on the line $\Delta$. Both constraints $g_1$ and $g_2$ are active at $\bar{x}$, and we have $\nabla g_1(\bar{x}) = -\nabla g_2(\bar{x}) = (0, 1)^\mathsf{T}$. Therefore, no direction $d$ can be found such that $\nabla g_1(\bar{x})^\mathsf{T} d < 0$ and $\nabla g_2(\bar{x})^\mathsf{T} d < 0$ simultaneously, i.e., $\mathcal{G}_0(\bar{x}) = \emptyset$. In turn, this implies that $\mathcal{F}_0(\bar{x}) \cap \mathcal{G}_0(\bar{x}) = \emptyset$ is trivially satisfied for any point on $\Delta$.
On the other hand, observe that the condition $\mathcal{F}_0(x^*) \cap \mathcal{D}(x^*) = \emptyset$ in Theorem 2.9 excludes all the points on $\Delta$ but the origin, since at any $\bar{x} \in \Delta$, $\bar{x} \ne 0$, a feasible direction of improving objective value is given, e.g., by $d = -\bar{x}$.
We now reduce the geometric necessary optimality condition to a statement in terms of the gradients of the objective function and of the active constraints. The resulting first-order optimality conditions are known as the Karush-Kuhn-Tucker (KKT) necessary conditions. Beforehand, we introduce the important concepts of a regular point and of a KKT point.
Definition 2.19: Regular Point (for a Set of Inequality Constraints)
Let $g_i : \mathbb{R}^n \to \mathbb{R}$, $i = 1, \dots, m$, be differentiable functions on $\mathbb{R}^n$ and consider the set $S := \{ x \in \mathbb{R}^n : g_i(x) \le 0, \ i = 1, \dots, m \}$. A point $\bar{x} \in S$ is said to be a regular point if the gradient vectors of the active constraints, $\nabla g_i(\bar{x})$, $i \in \mathcal{A}(\bar{x})$, are linearly independent:
\[ \operatorname{rank}\big( \nabla g_i(\bar{x}), \ i \in \mathcal{A}(\bar{x}) \big) = |\mathcal{A}(\bar{x})|. \]
Definition 2.20: KKT Point
Let $f : \mathbb{R}^n \to \mathbb{R}$ and $g_i : \mathbb{R}^n \to \mathbb{R}$, $i = 1, \dots, m$, be differentiable functions. Consider the problem to minimize $f(x)$ subject to $g_i(x) \le 0$, $i = 1, \dots, m$. If a pair $(\bar{x}, \bar{\nu}) \in \mathbb{R}^n \times \mathbb{R}^m$ satisfies the conditions:
\begin{align}
\nabla f(\bar{x}) + \sum_{i=1}^{m} \bar{\nu}_i \nabla g_i(\bar{x}) &= 0 \tag{2.4a} \\
\bar{\nu}_i &\ge 0, \quad i = 1, \dots, m \tag{2.4b} \\
g_i(\bar{x}) &\le 0, \quad i = 1, \dots, m \tag{2.4c} \\
\bar{\nu}_i \, g_i(\bar{x}) &= 0, \quad i = 1, \dots, m \tag{2.4d}
\end{align}
then $(\bar{x}, \bar{\nu})$ is said to be a KKT point.
Remark 2.19
The scalars $\bar{\nu}_i$, $i = 1, \dots, m$, are called the Lagrange multipliers. The condition (2.4a) is called the stationarity condition. The condition (2.4b) is referred to as the dual feasibility (DF) condition; the condition (2.4c), i.e., the requirement that $\bar{x}$ be feasible, is called the primal feasibility (PF) condition; finally, the condition (2.4d) is called the complementarity slackness (CS) condition.
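As a computational companion to Definition 2.20, the sketch below verifies the four conditions (2.4a)-(2.4d) at a candidate pair $(\bar{x}, \bar{\nu})$ up to a tolerance, with gradients approximated by finite differences; the instance in the demo is an assumption.

```python
import numpy as np

def grad(fun, x, h=1e-6):
    """Forward-difference gradient of fun at x."""
    g = np.zeros_like(x)
    for j in range(len(x)):
        e = np.zeros_like(x); e[j] = h
        g[j] = (fun(x + e) - fun(x)) / h
    return g

def is_kkt_point(f, gs, x, nu, tol=1e-4):
    stat = grad(f, x) + sum(nu_i * grad(g, x) for nu_i, g in zip(nu, gs))
    return (np.linalg.norm(stat) <= tol          # (2.4a) stationarity
            and np.all(np.asarray(nu) >= -tol)   # (2.4b) dual feasibility
            and all(g(x) <= tol for g in gs)     # (2.4c) primal feasibility
            and all(abs(nu_i * g(x)) <= tol      # (2.4d) complementarity slackness
                    for nu_i, g in zip(nu, gs)))

# Illustrative instance: min x1^2 + x2^2  s.t.  1 - x1 - x2 <= 0;
# candidate x = (1/2, 1/2) with multiplier nu = (1,).
f = lambda x: x[0]**2 + x[1]**2
g1 = lambda x: 1.0 - x[0] - x[1]
print(is_kkt_point(f, [g1], np.array([0.5, 0.5]), [1.0]))  # True
```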
Theorem 2.10: KKT Necessary Conditions
Let $f : \mathbb{R}^n \to \mathbb{R}$ and $g_i : \mathbb{R}^n \to \mathbb{R}$, $i = 1, \dots, m$, be differentiable functions. Consider the problem to minimize $f(x)$ subject to $g_i(x) \le 0$, $i = 1, \dots, m$. If $x^*$ is a local minimum and a regular point of the constraints, then there exists a unique vector $\nu^*$ such that $(x^*, \nu^*)$ is a KKT point.
Proof. Since $x^*$ solves the problem, there exists no direction $d$ such that $\nabla f(x^*)^\mathsf{T} d < 0$ and $\nabla g_i(x^*)^\mathsf{T} d < 0$, $\forall i \in \mathcal{A}(x^*)$, simultaneously (see Remark 2.18). Let $A$ be the matrix whose rows are $\nabla f(x^*)^\mathsf{T}$ and $\nabla g_i(x^*)^\mathsf{T}$, $i \in \mathcal{A}(x^*)$. Clearly, the statement $\{ \exists d : A d < 0 \}$ is false and, by Corollary 2.1, there exists a nonzero vector $p \ge 0$ such that $A^\mathsf{T} p = 0$. Denoting the components of $p$ by $u_0$ and $u_i$ for $i \in \mathcal{A}(x^*)$, we get:
\[ u_0 \nabla f(x^*) + \sum_{i \in \mathcal{A}(x^*)} u_i \nabla g_i(x^*) = 0, \]
where $u_0 \ge 0$ and $u_i \ge 0$ for $i \in \mathcal{A}(x^*)$, and $(u_0, u_{\mathcal{A}}) \ne (0, 0)$ (here $u_{\mathcal{A}}$ is the vector whose components are the $u_i$'s for $i \in \mathcal{A}(x^*)$). Letting $u_i = 0$ for $i \notin \mathcal{A}(x^*)$, we then get the conditions:
\begin{align*}
u_0 \nabla f(x^*) + \sum_{i=1}^{m} u_i \nabla g_i(x^*) &= 0 \\
u_i \, g_i(x^*) &= 0, \quad i = 1, \dots, m \\
(u_0, u) &\ge 0, \quad (u_0, u) \ne (0, 0),
\end{align*}
where $u$ is the vector whose components are $u_i$ for $i = 1, \dots, m$. Note that $u_0 \ne 0$, since otherwise the assumption of linear independence of the active constraints at $x^*$ would be violated: indeed, if $u_0 = 0$, then by Corollary 2.1 we would have
\[ \sum_{i \in \mathcal{A}(x^*)} u_i \nabla g_i(x^*) = 0, \]
with at least one $u_i > 0$, contradicting the linear independence assumption. Then, letting $\nu_i^* = u_i / u_0$, $i = 1, \dots, m$, we obtain that $(x^*, \nu^*)$ is a KKT point. □
Remark 2.20
One of the major difficulties in applying the foregoing result is that we do not know a priori which constraints are active and which are inactive, i.e., the active set $\mathcal{A}(x^*)$ is unknown. Therefore, it is necessary to investigate all possible active sets to find candidate points satisfying the KKT conditions. This is illustrated in the example below.
Example 2.12 (Regular Case). Consider a problem to minimize $f(x)$ subject to two inequality constraints $g_1(x) \le 0$ and $g_2(x) \le 0$. Note that every feasible point is regular, since $\nabla g_1$ and $\nabla g_2$ are nonzero and linearly independent wherever the respective constraints are active, so a local minimum $x^*$ must satisfy the stationarity conditions (2.4a):
\[ \nabla f(x^*) + \nu_1^* \nabla g_1(x^*) + \nu_2^* \nabla g_2(x^*) = 0. \]
Four cases can be distinguished, according to the active set at the candidate point: $\mathcal{A}(x^*) = \emptyset$, $\{1\}$, $\{2\}$, or $\{1, 2\}$. In each case, the stationarity conditions are solved together with $g_i(x^*) = 0$ for $i \in \mathcal{A}(x^*)$ and $\nu_i^* = 0$ for $i \notin \mathcal{A}(x^*)$, and the solutions are screened for primal feasibility (2.4c) and dual feasibility (2.4b).
Overall, the enumeration leaves a unique candidate for a local minimum. Yet, it cannot be concluded whether this point is actually a global minimum, or even a local minimum, since the KKT conditions are merely necessary at this stage.
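For small $m$, this active-set enumeration can be carried out symbolically: for each subset $\mathcal{A} \subseteq \{1, \dots, m\}$, solve the stationarity equations together with $g_i(x) = 0$ for $i \in \mathcal{A}$ and $\nu_i = 0$ for $i \notin \mathcal{A}$, then screen the solutions for primal and dual feasibility. A sketch with sympy, on an assumed two-constraint instance (not the instance of the example above):

```python
import sympy as sp
from itertools import combinations

x1, x2, n1, n2 = sp.symbols('x1 x2 nu1 nu2', real=True)
x, nu = [x1, x2], [n1, n2]

# Assumed instance: min x1^2 + x2^2  s.t.  1 - x1 - x2 <= 0,  -x1 <= 0.
f = x1**2 + x2**2
gs = [1 - x1 - x2, -x1]

L = f + sum(n * g for n, g in zip(nu, gs))
for r in range(len(gs) + 1):
    for A in combinations(range(len(gs)), r):                 # candidate active set
        eqs = [sp.diff(L, v) for v in x]                      # stationarity (2.4a)
        eqs += [gs[i] for i in A]                             # active: g_i = 0
        eqs += [nu[i] for i in range(len(gs)) if i not in A]  # inactive: nu_i = 0
        for sol in sp.solve(eqs, x + nu, dict=True):
            feasible = all(sp.simplify(g.subs(sol)) <= 0 for g in gs)  # (2.4c)
            dual_ok = all(sol[n] >= 0 for n in nu)                     # (2.4b)
            if feasible and dual_ok:
                print('active set', A, '-> KKT point', sol)
```

On this instance, only the active set $\{1\}$ survives the screening, yielding the unique candidate $x^* = (1/2, 1/2)$ with $\nu^* = (1, 0)$.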
Remark 2.21: Constraint Qualification
It is very important to note that, for a local minimum to be a KKT point, an additional condition must be placed on the behaviour of the constraints; that is, not every local minimum is a KKT point. Such a condition is known as a constraint qualification. In Theorem 2.10, one possible constraint qualification is that $x^*$ be a regular point, which is the well-known linear independence constraint qualification (LICQ). A weaker constraint qualification (i.e., one implied by LICQ), known as the Mangasarian-Fromovitz constraint qualification (MFCQ), requires that there exist (at least) one direction $d$ such that $\nabla g_i(x^*)^\mathsf{T} d < 0$ for each $i \in \mathcal{A}(x^*)$, i.e., $\mathcal{G}_0(x^*) \ne \emptyset$. Note, however, that the Lagrange multipliers are guaranteed to be unique if LICQ holds (as stated in Theorem 2.10), while this uniqueness property may be lost under MFCQ.
The following example illustrates the necessity of having a constraint qualification for a local minimum point of an NLP to be a KKT point.
Example 2.13 (Non-Regular Case). Consider the problem
\[ \min_x \ f(x) := -x_1 \quad \text{s.t.} \quad g_1(x) := x_2 - (1 - x_1)^3 \le 0, \quad g_2(x) := -x_2 \le 0. \]
The feasible region is shown in Fig. 2.7 below. Note that a minimum point is $x^* = (1, 0)^\mathsf{T}$, where both constraints are active. The stationarity condition (2.4a) relative to the variable $x_1$, computed at $x^*$, reads:
\[ -1 + 3 \nu_1 (1 - x_1)^2 = 0, \]
which at $x_1 = 1$ reduces to $-1 = 0$. It is readily seen that this condition cannot be met at the local minimum point $x^*$. In other words, the KKT conditions are not necessary in this example. This is because no constraint qualification can hold at $x^*$. In particular, $x^*$ not being a regular point, since $\nabla g_1(x^*) = (0, 1)^\mathsf{T} = -\nabla g_2(x^*)$ are linearly dependent, LICQ does not hold. Moreover, the set $\mathcal{G}_0(x^*)$ being empty (the direction $d = (-1, 0)^\mathsf{T}$ gives $\nabla g_1(x^*)^\mathsf{T} d = \nabla g_2(x^*)^\mathsf{T} d = 0$, while any other direction induces a violation of either one of the constraints), MFCQ does not hold at $x^*$ either.
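Computationally, the failure of stationarity at $x^*$ can be exhibited by minimizing the norm of the left-hand side of (2.4a) over $\nu \ge 0$; a positive residual proves that no multipliers exist. The sketch below does this for the cusp instance above via non-negative least squares:

```python
import numpy as np
from scipy.optimize import nnls

# Stationarity at x* = (1, 0):  -grad_f = nu1 * grad_g1 + nu2 * grad_g2,  nu >= 0.
grad_f  = np.array([-1.0, 0.0])   # f(x) = -x1
grad_g1 = np.array([0.0, 1.0])    # g1 = x2 - (1 - x1)^3, gradient at (1, 0)
grad_g2 = np.array([0.0, -1.0])   # g2 = -x2
G = np.column_stack([grad_g1, grad_g2])
nu, residual = nnls(G, -grad_f)   # best nu >= 0 in the least-squares sense
print(residual)  # 1.0 > 0: stationarity (2.4a) cannot be satisfied at x*
```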
The following theorem provides a sufficient condition under which any KKT point of an inequality constrained NLP problem is guaranteed to be a global minimum of that problem.
Theorem 2.11: KKT Sufficient Conditions
Let $f : \mathbb{R}^n \to \mathbb{R}$ and $g_i : \mathbb{R}^n \to \mathbb{R}$, $i = 1, \dots, m$, be convex and differentiable functions. Consider the problem to minimize $f(x)$ subject to $g_i(x) \le 0$, $i = 1, \dots, m$. If $(\bar{x}, \bar{\nu})$ is a KKT point, then $\bar{x}$ is a global minimum of that problem.
Proof. Consider the function $L(x) := f(x) + \sum_{i=1}^{m} \bar{\nu}_i g_i(x)$. Since $f$ and $g_i$, $i = 1, \dots, m$, are convex functions and $\bar{\nu}_i \ge 0$, $L$ is also convex (the sum of convex functions is a convex function, and a nonnegative scalar times a convex function is again a convex function). Moreover, the stationarity condition (2.4a) imposes that $\nabla L(\bar{x}) = 0$. Hence, by Theorem 2.5, $\bar{x}$ is a global minimizer for $L$ on $\mathbb{R}^n$, i.e.,
\[ L(x) \ge L(\bar{x}), \quad \forall x \in \mathbb{R}^n. \]
In particular, for each $x$ such that $g_i(x) \le 0 = g_i(\bar{x})$, $i \in \mathcal{A}(\bar{x})$, we have
\[ f(x) \ \ge \ f(x) + \sum_{i \in \mathcal{A}(\bar{x})} \bar{\nu}_i g_i(x) \ \ge \ f(\bar{x}) + \sum_{i \in \mathcal{A}(\bar{x})} \bar{\nu}_i g_i(\bar{x}) \ = \ f(\bar{x}). \]
Noting that the set $\{ x : g_i(x) \le 0, \ i \in \mathcal{A}(\bar{x}) \}$ contains the feasible domain $S$, we therefore showed that $\bar{x}$ is a global minimizer for the problem. □
Example 2.14. Consider a problem to minimize a convex, differentiable objective $f(x)$ subject to convex inequality constraints $g_i(x) \le 0$, $i = 1, \dots, m$. The point $x^*$, together with multipliers $\nu^*$, being a KKT point, and both the objective function and the feasible set being convex, by Theorem 2.11, $x^*$ is a global minimum of that problem.
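As a numerical cross-check of Theorem 2.11, one can compare a KKT candidate against the solution returned by a general-purpose solver on an assumed convex instance (the same toy problem as in the earlier sketches, not the instance of Example 2.14):

```python
import numpy as np
from scipy.optimize import minimize

# Assumed convex instance: min x1^2 + x2^2  s.t.  x1 + x2 >= 1.
f = lambda x: x[0]**2 + x[1]**2
cons = [{'type': 'ineq', 'fun': lambda x: x[0] + x[1] - 1.0}]  # scipy expects >= 0
res = minimize(f, x0=np.array([3.0, -1.0]), constraints=cons)
print(res.x)  # ~ [0.5, 0.5]: matches the KKT point, a global minimum by convexity
```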
Both second-order necessary and sufficient conditions for inequality constrained NLP problems will be presented later on, in Section 2.8.