
2.6 Problems with inequality constraints

In practice, few problems can be formulated as unconstrained programs: the feasible region is generally restricted by imposing constraints on the optimization variables. In this section, we first present theoretical results for the problem of minimizing f(x) subject to x ∈ S, for a general set S (geometric optimality conditions). Then, we let S be defined more specifically as:

Definition 2.16: Inequality Constrained Program

minimize f(x)  subject to  g(x) ≤ 0

The feasible region of this NLP is defined by a set of nonlinear inequalities gi(x) ≤ 0, i = 1,…,ng. For the inequality constrained program, we derive the Karush-Kuhn-Tucker (KKT) conditions of optimality in the following.

2.6.1 Geometric Optimality Conditions

Definition 2.17: Feasible Direction

Let S be a nonempty set in ℝ^nx. A vector d ∈ ℝ^nx, d ≠ 0, is said to be a feasible direction at x̄ ∈ cl(S) (here, cl(S) denotes the closure of the set S, that is, the union of S with all of its limit points) if

∃δ > 0 such that x̄ + ηd ∈ S, ∀η ∈ (0, δ).

Moreover, the cone of feasible directions at x̄, denoted by 𝒟(x̄), is given by

𝒟(x̄) := {d ≠ 0 : ∃δ > 0 such that x̄ + ηd ∈ S, ∀η ∈ (0, δ)}.
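For intuition, feasibility of a direction can be probed numerically by sampling ever smaller steps. The following minimal sketch assumes S is given in the inequality form of Section 2.6.2 below, S = {x : g(x) ≤ 0}; the helper name, step grid, and example constraint are all illustrative choices, not part of the original text.

import numpy as np

# Illustrative helper: probe whether d behaves like a feasible direction at
# x_bar for S = {x : g(x) <= 0 componentwise} by sampling small steps eta.
# This is a numerical plausibility check, not a proof that d lies in D(x_bar).
def seems_feasible_direction(g, x_bar, d, etas=np.logspace(-8, -2, 25)):
    return all(np.all(g(x_bar + eta * d) <= 0.0) for eta in etas)

# Example: S = {x in R^2 : x1 <= 0}, with x_bar = (0, 0) on the boundary.
g = lambda x: np.array([x[0]])
print(seems_feasible_direction(g, np.zeros(2), np.array([-1.0, 1.0])))  # True
print(seems_feasible_direction(g, np.zeros(2), np.array([1.0, 0.0])))   # False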

From Definition 2.17 and Lemma 2.1, it is clear that a small movement from x̄ along a direction d ∈ 𝒟(x̄) leads to feasible points, whereas a similar movement along a direction d ∈ ℱ0(x̄) leads to points of improving objective value (see Definition 2.15). As shown in Theorem 2.9 below, a (geometric) necessary condition for local optimality is that "every improving direction is not a feasible direction". This fact is illustrated in Figure ??, where both the half-space ℱ0(x̄) and the cone 𝒟(x̄) (see Definition 2.3) are translated from the origin to x̄ for clarity.

Theorem 2.9: Geometric Necessary Condition for a Local Minimum

Let S be a nonempty set in ℝ^nx and let f : ℝ^nx → ℝ be a differentiable function. Suppose that x̄ is a local minimizer of the problem to minimize f(x) subject to x ∈ S. Then, ℱ0(x̄) ∩ 𝒟(x̄) = ∅.

Proof. By contradiction, suppose that there exists a vector d ∈ ℱ0(x̄) ∩ 𝒟(x̄), d ≠ 0. Then, by Lemma 2.1,

∃δ1 > 0 such that f(x̄ + ηd) < f(x̄), ∀η ∈ (0, δ1).

Moreover, by Definition 2.17,

∃δ2 > 0 such that x̄ + ηd ∈ S, ∀η ∈ (0, δ2).

Hence,

x̄ + ηd ∈ S with f(x̄ + ηd) < f(x̄)

for every η ∈ (0, min{δ1, δ2}), which contradicts the assumption that x̄ is a local minimum of f on S (see Definition 2.11). □
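The geometric condition of Theorem 2.9 is easy to test by sampling on a toy problem. In the sketch below, the problem, the point, and the sample size are all illustrative: we minimize f(x) = x1² + x2² over S = {x : x1 ≥ 1}, whose minimum is x̄ = (1, 0), and verify that no sampled direction is simultaneously feasible and improving.

import numpy as np

# Randomized sanity check of Theorem 2.9 on an illustrative problem:
# minimize x1^2 + x2^2 over S = {x : x1 >= 1}; the minimum is x_bar = (1, 0).
rng = np.random.default_rng(0)
x_bar = np.array([1.0, 0.0])
grad_f = 2.0 * x_bar                     # gradient of f at x_bar
for _ in range(1000):
    d = rng.standard_normal(2)
    feasible = d[0] >= 0.0               # x_bar + eta*d stays in S for small eta
    improving = grad_f @ d < 0.0         # d lies in F0(x_bar)
    assert not (feasible and improving)  # F0(x_bar) and D(x_bar) are disjoint
print("no sampled direction is both feasible and improving")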

2.6.2 KKT Conditions

We now specify the feasible region as

S := {x : gi(x) ≤ 0, i = 1,…,ng}

where gi : ℝ^nx → ℝ, i = 1,…,ng, are continuous functions. In the geometric optimality condition given by Theorem 2.9, 𝒟(x̄) is the cone of feasible directions. From a practical viewpoint, it is desirable to convert this geometric condition into a more usable condition involving algebraic equations. As Lemma 2.2 below indicates, we can define a cone 𝒟0(x̄) in terms of the gradients of the active constraints at x̄, such that 𝒟0(x̄) ⊆ 𝒟(x̄). For this, we need the following:

Definition 2.18: Active Constraint, Active Set

Let gi : ℝ^nx → ℝ, i = 1,…,ng, and consider the set S := {x : gi(x) ≤ 0, i = 1,…,ng}. Let x̄ ∈ S be a feasible point. For each i = 1,…,ng, the constraint gi is said to be active or binding at x̄ if gi(x̄) = 0; it is said to be inactive at x̄ if gi(x̄) < 0. Moreover,

𝒜(x̄) := {i : gi(x̄) = 0}

denotes the set of active constraints at x̄.
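In computations, membership in 𝒜(x̄) is decided only up to a tolerance, since gi(x̄) is rarely exactly zero in floating-point arithmetic. A minimal sketch (the helper name, tolerance, and example constraints are illustrative):

import numpy as np

# Sketch: compute the active set A(x_bar) = {i : g_i(x_bar) = 0}, testing
# "equal to zero" within a tolerance tol to account for rounding errors.
def active_set(g, x_bar, tol=1e-8):
    return [i for i, gi in enumerate(np.asarray(g(x_bar))) if abs(gi) <= tol]

# Illustrative constraints g1(x) = x1 + x2 - 1 <= 0 and g2(x) = -x1 <= 0:
g = lambda x: np.array([x[0] + x[1] - 1.0, -x[0]])
print(active_set(g, np.array([0.0, 0.5])))  # [1]: only g2 is active
print(active_set(g, np.array([0.0, 1.0])))  # [0, 1]: both constraints active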

Lemma 2.2: Algebraic Characterization of a Feasible Direction

Let gi : ℝ^nx → ℝ, i = 1,…,ng, be differentiable functions, and consider the set S := {x : gi(x) ≤ 0, i = 1,…,ng}. For any feasible point x̄ ∈ S, we have

𝒟0(x̄) := {d : ∇gi(x̄)ᵀd < 0, ∀i ∈ 𝒜(x̄)} ⊆ 𝒟(x̄).

Proof. Suppose 𝒟0(x̄) is nonempty, and let d ∈ 𝒟0(x̄). Since ∇gi(x̄)ᵀd < 0 for each i ∈ 𝒜(x̄), then by Lemma 2.3, d is a descent direction for gi at x̄, i.e.,

∃δ2 > 0 such that gi(x̄ + ηd) < gi(x̄) = 0, ∀η ∈ (0, δ2), ∀i ∈ 𝒜(x̄).

Moreover, since gi(x̄) < 0 and gi is continuous at x̄ (since it is differentiable) for each i ∉ 𝒜(x̄),

∃δ1 > 0 such that gi(x̄ + ηd) < 0, ∀η ∈ (0, δ1), ∀i ∉ 𝒜(x̄).

Overall, it is thus clear that the points x̄ + ηd are in S for all η ∈ (0, min{δ1, δ2}). Hence, by Definition 2.17, d ∈ 𝒟(x̄). □

Remark 2.18

This lemma, together with Theorem 2.9, directly leads to the result that ℱ0(x̄) ∩ 𝒟0(x̄) = ∅ for any local solution point x̄, i.e.,

arg min{f(x) : x ∈ S} ⊆ {x ∈ ℝ^nx : ℱ0(x) ∩ 𝒟0(x) = ∅}.

The foregoing geometric characterization of local solution points applies equally well to interior points int(S) := {x ∈ ℝ^nx : gi(x) < 0, i = 1,…,ng} and to points on the boundary of the feasible domain. At an interior point, in particular, any direction is feasible, and the necessary condition ℱ0(x̄) ∩ 𝒟0(x̄) = ∅ reduces to ∇f(x̄) = 0, which is the same condition as in unconstrained optimization (see Theorem 2.3).

Note also that there are several cases in which the condition ℱ0(x̄) ∩ 𝒟0(x̄) = ∅ is satisfied by non-optimal points. In other words, this condition is necessary but not sufficient for a point x̄ to be a local minimum of f on S. For instance, any point x̄ with ∇gi(x̄) = 0 for some i ∈ 𝒜(x̄) makes 𝒟0(x̄) empty and thus trivially satisfies the condition ℱ0(x̄) ∩ 𝒟0(x̄) = ∅. Another example is given below.

Example 2.11. Consider the problem

min_{x∈ℝ²} f(x) := x1² + x2²   s.t.  g1(x) := x1 ≤ 0,  g2(x) := −x1 ≤ 0

Clearly, this problem is convex and x* = (0, 0)ᵀ is the unique global minimum.

Now, let x̄ be any point on the line 𝒞 := {x : x1 = 0}. Both constraints g1 and g2 are active at x̄, and we have ∇g1(x̄) = −∇g2(x̄) = (1, 0)ᵀ. Therefore, no direction d ≠ 0 can be found such that ∇g1(x̄)ᵀd < 0 and ∇g2(x̄)ᵀd < 0 simultaneously, i.e., 𝒟0(x̄) = ∅. In turn, this implies that ℱ0(x̄) ∩ 𝒟0(x̄) = ∅ is trivially satisfied at any point on 𝒞.

On the other hand, observe that the condition ℱ0(x̄) ∩ 𝒟(x̄) = ∅ in Theorem 2.9 excludes all the points on 𝒞 except the origin, since a feasible direction at x̄ is given, e.g., by d = (0, 1)ᵀ.
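The emptiness of 𝒟0(x̄) in this example can be made concrete in a few lines of code; the membership test below is a small illustrative helper, not part of the original text.

import numpy as np

# Example 2.11: rows of J are the active constraint gradients at any x_bar
# on the line C, namely grad g1 = (1, 0) and grad g2 = (-1, 0).
J = np.array([[1.0, 0.0], [-1.0, 0.0]])

def in_D0(J, d):
    # d belongs to D0(x_bar) iff every active gradient has a strictly
    # negative inner product with d.
    return bool(np.all(J @ d < 0.0))

print(in_D0(J, np.array([-1.0, 0.0])))  # False: grad g2 . d = 1 > 0
print(in_D0(J, np.array([0.0, 1.0])))   # False, even though d is feasible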

We now reduce the geometric necessary optimality condition ℱ0(x̄) ∩ 𝒟0(x̄) = ∅ to a statement in terms of the gradients of the objective function and of the active constraints. The resulting first-order optimality conditions are known as the Karush-Kuhn-Tucker (KKT) necessary conditions. Beforehand, we introduce the important concepts of a regular point and of a KKT point.

Definition 2.19: Regular Point (for a Set of Inequality Constraints)

Let gi : ℝ^nx → ℝ, i = 1,…,ng, be differentiable functions on ℝ^nx and consider the set S := {x ∈ ℝ^nx : gi(x) ≤ 0, i = 1,…,ng}. A point x̄ ∈ S is said to be a regular point if the gradient vectors ∇gi(x̄), i ∈ 𝒜(x̄), are linearly independent:

rank(∇gi(x̄), i ∈ 𝒜(x̄)) = |𝒜(x̄)|.
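Regularity is just a rank condition, so it is straightforward to test numerically. In the sketch below, the helper name is illustrative; the two test matrices are the active constraint gradients of Examples 2.12 and 2.13 further down.

import numpy as np

# Sketch: test regularity (LICQ) at x_bar by checking that the active
# constraint gradients, stacked as rows of J_active, have full row rank.
def is_regular(J_active):
    J_active = np.atleast_2d(J_active)
    return np.linalg.matrix_rank(J_active) == J_active.shape[0]

# Example 2.12: grad g1 = (1,1,1) and grad g2 = (1,0,0) are independent.
print(is_regular(np.array([[1.0, 1.0, 1.0], [1.0, 0.0, 0.0]])))  # True
# Example 2.13: the active gradients (-2,0) and (2,0) at x* = (0,0) are not.
print(is_regular(np.array([[-2.0, 0.0], [2.0, 0.0]])))           # False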

Definition 2.20: KKT Point

Let f : ℝ^nx → ℝ and gi : ℝ^nx → ℝ, i = 1,…,ng, be differentiable functions. Consider the problem to minimize f(x) subject to gi(x) ≤ 0, i = 1,…,ng. If a point (x̄, ν̄) ∈ ℝ^nx × ℝ^ng satisfies the conditions:

∇f(x̄) + ∇g(x̄)ν̄ = ∇f(x̄) + Σ_{i=1}^{ng} ν̄i ∇gi(x̄) = 0   (2.4a)
ν̄ ≥ 0   (2.4b)
g(x̄) ≤ 0   (2.4c)
ν̄ᵀ g(x̄) = 0   (2.4d)

then (x̄, ν̄) is said to be a KKT point. (Note that ∇g = ∂g/∂x = [∇g1 ⋯ ∇gng].)

Remark 2.19

The scalars νi, i = 1,…,ng, are called the Lagrange multipliers. Condition (2.4a) is called the stationarity condition; condition (2.4b) is referred to as the dual feasibility (DF) condition; condition (2.4c), i.e., the requirement that x̄ be feasible, is called the primal feasibility (PF) condition; finally, condition (2.4d) is called the complementary slackness (CS) condition, which reads componentwise as

ν̄i gi(x̄) = 0 for i = 1,…,ng.
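Conditions (2.4a)-(2.4d) translate directly into a numerical test for a candidate pair (x̄, ν̄). The sketch below is illustrative: the function names and tolerance are assumptions, and jac_g is taken to return the Jacobian of g whose rows are the gradients ∇gi. It returns True for the KKT point of Example 2.12 further down.

import numpy as np

# Sketch: check conditions (2.4a)-(2.4d) for a candidate pair (x_bar, nu_bar).
def is_kkt_point(grad_f, g, jac_g, x_bar, nu_bar, tol=1e-8):
    stationarity = grad_f(x_bar) + jac_g(x_bar).T @ nu_bar        # (2.4a)
    dual_feas    = np.all(nu_bar >= -tol)                         # (2.4b)
    primal_feas  = np.all(g(x_bar) <= tol)                        # (2.4c)
    compl_slack  = np.abs(nu_bar * g(x_bar)).max() <= tol         # (2.4d)
    return bool(np.abs(stationarity).max() <= tol and dual_feas
                and primal_feas and compl_slack)

# Data of Example 2.12 below: f(x) = (x1^2+x2^2+x3^2)/2, g1(x) = x1+x2+x3+3,
# g2(x) = x1; the pair x_bar = (-1,-1,-1), nu_bar = (1,0) is a KKT point.
grad_f = lambda x: x
g      = lambda x: np.array([x[0] + x[1] + x[2] + 3.0, x[0]])
jac_g  = lambda x: np.array([[1.0, 1.0, 1.0], [1.0, 0.0, 0.0]])
print(is_kkt_point(grad_f, g, jac_g,
                   np.array([-1.0, -1.0, -1.0]), np.array([1.0, 0.0])))  # True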

Theorem 2.10: KKT Necessary Conditions

Let f : ℝ^nx → ℝ and gi : ℝ^nx → ℝ, i = 1,…,ng, be differentiable functions. Consider the problem to minimize f(x) subject to gi(x) ≤ 0, i = 1,…,ng. If x* is a local minimum and a regular point of the constraints, then there exists a unique vector ν* such that (x*, ν*) is a KKT point.

Proof. Since x* solves the problem, there exists no direction d ∈ ℝ^nx such that ∇f(x*)ᵀd < 0 and ∇gi(x*)ᵀd < 0, i ∈ 𝒜(x*), simultaneously (see Remark 2.18). Let A ∈ ℝ^{(|𝒜(x*)|+1)×nx} be the matrix whose rows are ∇f(x*)ᵀ and ∇gi(x*)ᵀ, i ∈ 𝒜(x*). Clearly, the system {d ∈ ℝ^nx : Ad < 0} is infeasible and, by Corollary 2.1, there exists a nonzero vector p ≥ 0 in ℝ^{|𝒜(x*)|+1} such that Aᵀp = 0. Denoting the components of p by u0 and ui for i ∈ 𝒜(x*), we get

u0 ∇f(x*) + Σ_{i∈𝒜(x*)} ui ∇gi(x*) = 0,

where u0 ≥ 0 and ui ≥ 0 for i ∈ 𝒜(x*), with (u0, u_𝒜(x*)) ≠ (0, 0) (here, u_𝒜(x*) is the vector whose components are the ui's for i ∈ 𝒜(x*)). Letting ui = 0 for i ∉ 𝒜(x*), we then get the conditions:

u0 ∇f(x*) + ∇g(x*)u = u0 ∇f(x*) + Σ_{i=1}^{ng} ui ∇gi(x*) = 0
uᵀ g(x*) = 0
(u0, u) ≥ (0, 0)
(u0, u) ≠ (0, 0)

where u is the vector whose components are ui for i = 1,…,ng. Note that u0 ≠ 0, since otherwise the assumption of linear independence of the active constraints at x* would be violated. (Indeed, if u0 = 0, then u ≥ 0, u ≠ 0, and by Corollary 2.1 we have Σ_{i∈𝒜(x*)} ui ∇gi(x*) = 0 with at least one ui ≠ 0, thus violating the linear independence assumption.) Then, letting ν* = (1/u0) u, we obtain that (x*, ν*) is a KKT point. □

Remark 2.20

One of the major difficulties in applying the foregoing result is that we do not know a priori which constraints are active and which are inactive at a solution, i.e., the active set is unknown. Therefore, it is necessary to investigate all possible active sets to find the candidate points satisfying the KKT conditions. This is illustrated in the example below.

Example 2.12 (Regular Case). Consider the problem

min_{x∈ℝ³} f(x) := ½(x1² + x2² + x3²)
s.t. g1(x) := x1 + x2 + x3 + 3 ≤ 0
     g2(x) := x1 ≤ 0

Note that every feasible point is regular, since ∇g1 = (1, 1, 1)ᵀ and ∇g2 = (1, 0, 0)ᵀ are linearly independent, so x* must satisfy the stationarity conditions:

x1 + ν1 + ν2 = 0
x2 + ν1 = 0
x3 + ν1 = 0

Four cases can be distinguished:

  • The constraints g1 and g2 are both inactive, i.e., x1 + x2 + x3 < −3, x1 < 0, and ν1 = ν2 = 0. From the latter together with the stationarity conditions, we get x1 = x2 = x3 = 0, hence contradicting the former primal feasibility condition.
  • The constraint g1 is inactive, while g2 is active, i.e., x1 + x2 + x3 < −3, x1 = 0, ν1 = 0, and ν2 ≥ 0. From the latter, together with the stationarity conditions, we get x2 = x3 = 0, hence contradicting the former once again.
  • The constraint g1 is active, while g2 is inactive, i.e., x1 + x2 + x3 = −3, x1 < 0, ν1 ≥ 0, and ν2 = 0. Then, the point (x*, ν*) such that x1* = x2* = x3* = −1, ν1* = 1, and ν2* = 0 is a KKT point.
  • The constraints g1 and g2 are both active, i.e., x1 + x2 + x3 = −3, x1 = 0, and ν1, ν2 ≥ 0. Then, we obtain x2 = x3 = −3/2, ν1 = 3/2, and ν2 = −3/2, hence contradicting the dual feasibility condition ν2 ≥ 0.

Overall, there is thus a unique candidate for a local minimum. Yet, at this stage it cannot be concluded whether this point is actually a global minimum, or even a local minimum.
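As a cross-check, the unique KKT candidate can be compared against the output of a numerical solver. The sketch below uses SciPy's SLSQP method; note that SciPy's 'ineq' constraints require fun(x) ≥ 0, so each gi(x) ≤ 0 is passed as −gi(x) ≥ 0.

import numpy as np
from scipy.optimize import minimize

# Numerical cross-check of Example 2.12.
f = lambda x: 0.5 * np.dot(x, x)
cons = [{'type': 'ineq', 'fun': lambda x: -(x[0] + x[1] + x[2] + 3.0)},  # g1 <= 0
        {'type': 'ineq', 'fun': lambda x: -x[0]}]                        # g2 <= 0
res = minimize(f, x0=np.zeros(3), method='SLSQP', constraints=cons)
print(res.x)  # approximately [-1, -1, -1], matching the KKT point found above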

Remark 2.21: Constraint Qualification

It is very important to note that, for a local minimum x* to be a KKT point, an additional condition must be placed on the behaviour of the constraints; that is, not every local minimum is a KKT point. Such a condition is known as a constraint qualification. In Theorem 2.10, it is shown that one possible constraint qualification is that x* be a regular point, which is the well-known linear independence constraint qualification (LICQ). A weaker constraint qualification (i.e., one implied by LICQ), known as the Mangasarian-Fromovitz constraint qualification (MFCQ), requires that there exist (at least) one direction d ∈ 𝒟0(x*), i.e., such that ∇gi(x*)ᵀd < 0 for each i ∈ 𝒜(x*). Note, however, that the Lagrange multipliers are guaranteed to be unique if LICQ holds (as stated in Theorem 2.10), while this uniqueness property may be lost under MFCQ.

The following example illustrates the necessity of a constraint qualification for a local minimum point of an NLP to be a KKT point.

Example 2.13 (Non-Regular Case). Consider the problem

min_{x∈ℝ²} f(x) := x1² + (x2 − 1)²   s.t.  g1(x) := x2³ − 2x1 ≤ 0,  g2(x) := x2³ + 2x1 ≤ 0

The feasible region is shown in Fig. 2.7 below. Note that a minimum point is x* = (0, 0)ᵀ. The stationarity condition relative to the variable x2 reads

2(x2 − 1) + 3x2²(ν1 + ν2) = 0,

which, computed at x*, reduces to −2 = 0. It is readily seen that this condition cannot be met at the local minimum point x*. In other words, the KKT conditions are not necessary conditions in this example. This is because no constraint qualification holds at x*. In particular, x* not being a regular point, LICQ does not hold. Moreover, the set 𝒟0(x*) being empty (the direction d = (0, 1)ᵀ gives ∇g1(x*)ᵀd = ∇g2(x*)ᵀd = 0, while any other direction induces a violation of either one of the constraints), MFCQ does not hold at x* either.
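A short numerical illustration of the failure of (2.4a) at x* follows; the gradients are computed from the constraint expressions given above, and the least-squares computation is purely illustrative.

import numpy as np

# Example 2.13 at x* = (0,0): gradients of the objective and constraints.
grad_f  = np.array([0.0, -2.0])   # gradient of f(x) = x1^2 + (x2-1)^2
grad_g1 = np.array([-2.0, 0.0])   # gradient of g1(x) = x2^3 - 2*x1
grad_g2 = np.array([ 2.0, 0.0])   # gradient of g2(x) = x2^3 + 2*x1

# Best least-squares attempt at stationarity (2.4a): solve A nu = -grad_f.
# Both constraint gradients have zero second component while grad_f has -2,
# so the residual cannot vanish for any multipliers (of any sign).
A = np.column_stack([grad_g1, grad_g2])
nu, *_ = np.linalg.lstsq(A, -grad_f, rcond=None)
print(A @ nu + grad_f)  # second component stays -2.0: x* is not a KKT point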

Figure 2.7: Solution of Example 2.13

The following theorem provides a sufficient condition under which any KKT point of an inequality constrained NLP problem is guaranteed to be a global minimum of that problem.

Theorem 2.11: KKT Sufficient Conditions

Let f : ℝ^nx → ℝ and gi : ℝ^nx → ℝ, i = 1,…,ng, be convex and differentiable functions. Consider the problem to minimize f(x) subject to g(x) ≤ 0. If (x*, ν*) is a KKT point, then x* is a global minimum of that problem.

Proof. Consider the function ℒ(x) := f(x) + Σ_{i=1}^{ng} νi* gi(x). Since f and gi, i = 1,…,ng, are convex functions and νi* ≥ 0, ℒ is also convex (the sum of convex functions is a convex function, and a nonnegative scalar times a convex function is again a convex function). Moreover, the stationarity condition imposes that ∇ℒ(x*) = 0. Hence, by Theorem 2.5, x* is a global minimizer of ℒ on ℝ^nx, i.e.,

ℒ(x) ≥ ℒ(x*), ∀x ∈ ℝ^nx.

In particular, for each x such that gi(x) ≤ gi(x*) = 0, i ∈ 𝒜(x*), and noting that νi* = 0 for each i ∉ 𝒜(x*) by complementary slackness, we have

f(x) − f(x*) ≥ −Σ_{i∈𝒜(x*)} νi* [gi(x) − gi(x*)] ≥ 0.

Noting that {x ∈ ℝ^nx : gi(x) ≤ 0, i ∈ 𝒜(x*)} contains the feasible domain {x ∈ ℝ^nx : gi(x) ≤ 0, i = 1,…,ng}, we have therefore shown that x* is a global minimizer for the problem. □

Example 2.14. Consider the problem

min_{x∈ℝ³} f(x) := ½(x1² + x2² + x3²)
s.t. g1(x) := x1 + x2 + x3 + 3 ≤ 0
     g2(x) := x1 ≤ 0

The point (x*, ν*) with x1* = x2* = x3* = −1, ν1* = 1, and ν2* = 0 being a KKT point, and both the objective function and the feasible set being convex, x* is a global minimum by Theorem 2.11.
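The argument of Theorem 2.11 can also be mimicked numerically: with the multipliers ν* = (1, 0), the convex function ℒ(x) = f(x) + ν1* g1(x) + ν2* g2(x) has zero gradient at x*, so x* minimizes ℒ over all of ℝ³ and hence f over the feasible region. A minimal sketch:

import numpy as np

# Example 2.14: gradient of L(x) = f(x) + nu1*g1(x) + nu2*g2(x) at x*.
nu1, nu2 = 1.0, 0.0
grad_L = lambda x: x + nu1 * np.ones(3) + nu2 * np.array([1.0, 0.0, 0.0])
print(grad_L(np.array([-1.0, -1.0, -1.0])))  # [0. 0. 0.]: stationary point of L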

Both second-order necessary and sufficient conditions for inequality constrained NLP problems will be presented later on, in Section 2.8.