We now turn our attention to problems of the calculus of variations in the presence of constraints. As in finite-dimensional optimization, it is convenient to use Lagrange multipliers in order to derive the necessary conditions of optimality associated with such problems; these considerations are discussed in the following for equality constraints. The method of Lagrange multipliers is then used to obtain necessary conditions of optimality for problems subject to (equality) terminal constraints and isoperimetric constraints, respectively. The Lagrange multiplier methodology is then extended to inequality constrained problems.
We start by considering a general class of equality constraints where the function space is defined by the level sets of one or more functionals .
Definition 3.11: Equality constrained calculus of variations problem
Consider the problem to minimize:
where .
The functionals are defined in as well.
Remark 3.19
Note that the basic problem of the calculus of variations we have addressed so far also falls into this broader category. Consider the functionals and , and recall that for the basic problem we had . We can define the same as .
Remark 3.20: Common types of constraint functionals
This class of problems allows us to treat more general settings such as endpoint-constrained problems and isoperimetric problems. In the endpoint-constrained case the constraint functionals take the form for . The constraint reduces to , so that the final state is forced to satisfy . Isoperimetric problems are a class of problems where the constraint functionals are defined as and the constraint is expressed as the level set , where . Constraints of this kind are found in minimum volume [area] problems with fixed area [perimeter], such as the catenary or Dido’s problem, which we will solve in the following sections.
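Written out in generic symbols (a sketch; the symbols Ψ, g, and C below are placeholders standing in for the notation of the surrounding text), the two constraint types read:

```latex
% Endpoint constraint: the final state is pinned to a zero level set
\Psi\bigl(x(t_1)\bigr) = 0 ,
% Isoperimetric constraint: an integral functional is fixed to a given value
\int_{t_0}^{t_1} g\bigl(t, x(t), \dot{x}(t)\bigr)\,\mathrm{d}t = C .
```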
The next theorem, which we state without proof, is instrumental in proving a fundamental lemma on the existence of minimizers of such equality constrained problems.
Theorem 3.11: Inverse Function Theorem
Let and . If a function has continuous first partial derivatives in each component, with nonvanishing Jacobian determinant at , then provides a continuously invertible mapping between and a region containing a full neighborhood of .
The following Lemma gives conditions under which a point in a normed linear space cannot be a (local) extremal of a functional , when constrained to the level set of another functional .
Lemma 3.5
Let and be functionals defined in a neighborhood of in a normed linear space , and let be the equality constraint defining . Suppose that there exist fixed directions such that the Gâteaux derivatives of and satisfy the Jacobian condition:
and are continuous in a neighborhood of (in each direction , ). Then, cannot be a local extremal for when constrained to .
Proof. Consider the auxiliary functions and . We have and , and by definition of the Gâteaux derivative , , and . Let us also define the function as:
The Jacobian determinant of at is:
By hypothesis we have , hence Theorem 3.11 is applicable. Therefore maps a neighborhood of onto a region containing a full neighborhood of the origin in the plane, as shown in Figure 3.12. This means that we can find and such that:
by definition of the functions and we have:
This means that is admissible (i.e., it satisfies the constraint ) and reduces the cost, that is ; hence cannot be a local extremal trajectory. □
With this preparation, it is easy to derive necessary conditions for a local extremal in the presence of equality or inequality constraints, and in particular the existence of the Lagrange multipliers.
Theorem 3.12: Existence of the Lagrange Multipliers
Let and be functionals defined in a neighborhood of in a normed linear space and having continuous Gâteaux derivatives in this neighborhood. Let also and suppose that is a (local) extremal for constrained to . Suppose further that for some direction . Then, there exists a scalar such that:
Proof. Since is a (local) extremal for constrained to , by Lemma 3.5 we must have that the determinant:
for any . Hence, by defining:
it follows that for each . □
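The mechanism of this proof can be sketched in generic notation (the symbols J, G, x*, ξ, ξ̄, and λ below stand in for those of the statement; this is not the text’s original display):

```latex
% Sketch: the 2x2 determinant of Gateaux derivatives vanishes for all xi;
% with \delta G(x^*;\bar\xi) \neq 0 this yields the multiplier lambda:
\det\begin{pmatrix}
\delta J(x^*;\xi) & \delta J(x^*;\bar\xi)\\
\delta G(x^*;\xi) & \delta G(x^*;\bar\xi)
\end{pmatrix} = 0
\quad\Longrightarrow\quad
\lambda \;:=\; -\,\frac{\delta J(x^*;\bar\xi)}{\delta G(x^*;\bar\xi)},
\qquad
\delta J(x^*;\xi) + \lambda\,\delta G(x^*;\xi) = 0 .
```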
Remark 3.21
As in the finite-dimensional case, the parameter in Theorem 3.12 is called a Lagrange multiplier. Using the terminology of directional derivatives appropriate to , the Lagrange condition says simply that the directional derivatives of are proportional to those of at . Thus, in general, Lagrange’s condition means that the level sets of both and share the same tangent hyperplane at , i.e., they meet tangentially. Note also that Lagrange’s condition can be written in the form
This is due to the linearity of the Gâteaux derivative: given two scalars and two functionals from to , we have , as shown in Remark 3.7. This suggests, as we will see in the following, considering the Lagrangian functional .
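By that linearity, the two proportional derivative conditions can be collected into stationarity of a single Lagrangian functional; a sketch in generic symbols (J, G, λ are placeholders):

```latex
% Lagrangian functional combining cost and constraint:
\mathcal{L}(x) \;:=\; J(x) + \lambda\, G(x),
\qquad
\delta\mathcal{L}(x^*;\xi) \;=\; \delta J(x^*;\xi) + \lambda\,\delta G(x^*;\xi) \;=\; 0
\quad\text{for every admissible direction } \xi .
```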
It is possible, albeit technical, to extend the method of Lagrange multipliers to problems involving any finite number of constraint functionals:
Theorem 3.13: Existence of the Lagrange Multipliers (Multiple Constraints)
Let and be functionals defined in a neighborhood of in a normed linear space and having continuous Gâteaux derivatives in this neighborhood. Let also and suppose that is a (local) extremal for constrained to . Suppose further that:
for (independent) directions . Then, there exists a constant vector such that:
Remark 3.22: Link to Nonlinear Optimization
The previous theorem is the generalization of Theorem 2.13 of Chapter 2 to optimization problems in normed linear spaces. Note, in particular, that the requirement that be a regular point for the Lagrange multipliers to exist is generalized by a non-singularity condition in terms of the Gâteaux derivatives of the constraint functionals . Yet, this condition is in general not sufficient to guarantee uniqueness of the Lagrange multipliers.
Remark 3.23: Hybrid Method of Admissible Directions and Lagrange Multipliers
If with a subset of a normed linear space and the -admissible directions form a linear subspace of (i.e., for every scalars ), then the conclusions of Theorem 3.12 remain valid when the continuity requirement for is further restricted to and only -admissible directions are considered. Said differently, Theorem 3.12 can be applied to determine (local) extremals of the functional constrained to . This extension leads to a more efficient, but admittedly hybrid, approach to certain problems involving multiple constraints: those constraints which determine a domain having a linear subspace of -admissible directions may be taken into account simply by restricting the set of admissible directions when applying the method of Lagrange multipliers to the remaining constraints.
Necessary conditions of optimality for problems with end-point constraints and isoperimetric constraints shall be obtained with this hybrid approach in the next subsections.
So far, we have only considered problems with either free or fixed end-time and end-points , . In this subsection, we shall consider problems having end-point constraints of the form , with being specified or not. In case is free, it shall be considered as an additional variable in the optimization problem. As in 3.3.1, we shall then define the functions by extension on a "sufficiently" large interval , and consider the linear space , supplied with the weak norm . In particular, Theorem 3.13 applies readily by specializing the normed linear space to and considering the Gâteaux derivative at in the direction . These considerations yield necessary conditions of optimality for problems with end-point constraints, as given in the following:
Theorem 3.14: Transversal Conditions
Consider the problem to minimize the functional
on with and . Suppose that gives a (local) minimum for on and at . Then, is a solution to the Euler equation of Theorem 3.3 satisfying both the end-point constraints and the transversal condition:
In the particular one-dimensional case (i.e., ), the transversal condition reduces to:
Proof. Observe first that, by fixing and varying in the -admissible direction such that , we show as in the proof of Theorem 3.3 that must be a solution to the Euler equation on . Observe also that the right end-point constraints may be expressed as the zero-level set of the functionals:
then simplifying the terms using the Euler equation as in the proof of Theorem 3.7 we obtain the first variations of the cost and constraint functionals as:
where the usual compressed notation is used. Based on the differentiability assumptions on and , it is clear that these Gâteaux derivatives exist and are continuous. Further, since the rank condition holds at , one can always find (independent) directions such that the regularity condition:
is satisfied. Now, consider the linear subspace
. Since gives a (local) minimum for on , by Theorem 3.13 and Remark 3.23 there exists a vector such that:
The above equation can be rewritten using the vectorial form as:
which must hold for every . In particular, we consider variations such that , so that the second term simplifies and we get:
which must hold for each admissible , and thus we get:
Now, considering variations such that , while is such that , we get for each :
and thus we have the equation:
for each . In vectorial notation we have the following dimensional system of equations:
which, following our definitions of and , becomes:
This is a linear system of equations expressed as , where . Note that is the Jacobian matrix, whose dimensions are , and by hypothesis its rank is . Therefore, by the rank–nullity theorem, . We can thus pick a dimensional vector and post-multiply the previous equation by it; noting that since , we get:
Note that if belongs to the kernel of the Jacobian at , then also does; hence it also holds that:
Substituting the definition of and we get:
Remark 3.24: One-dimensional case
For the one-dimensional case we have:
Note that reduces to a row vector in ; the kernel is then simply the subspace of vectors in such that:
By picking and , the transversality condition reduces to:
Remark 3.25
Note that ; therefore its Jacobian is a matrix of the form:
It is a matrix-valued function of , that is, . Its rank is at most , attained when all of its rows are independent.
Example 3.15 (Minimum Path Problem with Variable End-Point and End-Time). Consider the problem of minimizing the distance between a fixed point and the point in the -plane. We want to find the curve such that the functional is minimized, subject to the constraints and . Note that neither nor is explicitly given. Instead, they are implicitly given by the scalar end-point constraint , where is obviously a function. We stress the fact that (i.e., the zero level-set of the function ) is a straight line in . Therefore, we are looking for the minimum path from a fixed point to a point on a given straight line. For the sake of simplicity, let us consider ; from Theorem 3.14 we know that an extremal must satisfy the Euler equation in the interval . Note that is yet unknown. As we have already seen, the extremals of minimum path problems are straight lines. Since we have assigned a boundary condition in , we get:
which is a family of straight lines passing through the origin. In order to find and , we apply the necessary condition of Theorem 3.14 adapted to the one-dimensional case, that is:
Substituting , , , we get:
and upon simplification:
This means precisely that the extremal curve is perpendicular to the straight line defined by the constraint. In this simple case this is a global condition, but in general the transversality condition is a local condition at the end-point, relating the "gradient" of the constraint and the extremal trajectory. Note that we have also implicitly found as the intersection between the lines and , that is . A graphical representation of this problem is shown in Figure 3.13.
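As a numerical sanity check of the perpendicularity just derived, the sketch below picks a hypothetical instance of this example (the origin as the fixed point, and the straight line t + x = 1 as the constraint set; both are assumptions for illustration, not the text's data) and minimizes the distance over the family of extremals x = a·t:

```python
import numpy as np

# Hypothetical instance (assumed data): minimize the distance from the
# fixed point (0, 0) to a point on the line psi(t, x) = t + x - 1 = 0.
# The extremals are straight lines x = a*t through the origin; the
# transversality condition predicts the optimal extremal is perpendicular
# to the constraint line, i.e. slope a = 1.

line_dir = np.array([1.0, -1.0])  # direction vector of the line t + x = 1

def cost(a):
    """Distance from the origin to the intersection of x = a*t with the line."""
    t1 = 1.0 / (1.0 + a)          # intersection abscissa
    return np.hypot(t1, a * t1)

# Brute-force minimization over a grid of slopes a.
grid = np.linspace(-0.9, 10.0, 200001)
a_star = grid[np.argmin([cost(a) for a in grid])]

extremal_dir = np.array([1.0, a_star])
print(a_star)                     # close to 1
print(extremal_dir @ line_dir)    # close to 0: perpendicularity
```

The minimizing slope agrees with the transversality prediction, and the inner product of the two direction vectors vanishes, confirming orthogonality for this instance.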
An isoperimetric problem of the calculus of variations is a problem wherein one or more constraints involve a functional in the form of an integral over part or all of the integration horizon . Typically:
where is a given number. Note that this problem already fits the framework that we have treated so far since the constraint is in the form of a level-set of the functional . We will use the hybrid approach in Remark 3.23 to deal both with simpler constraints imposed by the function space (e.g. fixed end-points) and with the isoperimetric constraint .
Theorem 3.15: First-Order Necessary Conditions for Isoperimetric Problems
Consider the problem to minimize the functional:
on subject to the isoperimetric constraints:
with and . Suppose that gives a (local) minimum for this problem, and
for (independent) directions . Then, there exists a constant vector such that is a solution to the so-called Euler–Lagrange equation:
where:
Proof. Remark first that, from the differentiability assumptions on and the Gâteaux derivatives and , , exist and are continuous for every .
Since gives a (local) minimum for on constrained to , and is a regular point for the constraints, by Theorem 3.13 (and Remark 3.23 ), there exists a constant vector such that:
for each -admissible direction . Observe that this latter condition is equivalent to that of finding a minimizer to the functional:
on . The conclusion then directly follows upon applying Theorem 3.3. □
Remark 3.26: First Integrals
Similarly to free problems of the calculus of variations (see Remark 3.11), it is easy to show that the Hamiltonian function defined as
is constant along an extremal trajectory provided that does not depend on the independent variable . Recall the definition and that along an extremal trajectory the necessary condition of Theorem 3.15 holds, that is:
The total time derivative of the Hamiltonian is:
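Writing F for the (possibly multiplier-augmented) integrand and adopting the common convention H = ẋ F_ẋ − F (a sketch; the text's own sign convention may differ), the computation can be carried out as:

```latex
\frac{\mathrm{d}H}{\mathrm{d}t}
  = \ddot{x}\,F_{\dot{x}} + \dot{x}\,\frac{\mathrm{d}}{\mathrm{d}t}F_{\dot{x}}
    - F_x\,\dot{x} - F_{\dot{x}}\,\ddot{x} - F_t
  = \dot{x}\left(\frac{\mathrm{d}}{\mathrm{d}t}F_{\dot{x}} - F_x\right) - F_t
  = -\,F_t ,
```

with the bracket vanishing by the Euler–Lagrange equation; hence H is constant along an extremal whenever F has no explicit dependence on t.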
Example 3.16 (Solution to Dido’s Problem). We are finally ready to face the problem shown in Example 3.3. Dido’s problem is to minimize the functional:
subject to the isoperimetric constraint:
The subspace is . Now we build the function as:
Since is independent of , the Hamiltonian function is constant along an extremal trajectory, that is:
In this case we have , hence:
Manipulating the previous relation we get:
This is a separable nonlinear differential equation; upon squaring both sides and after some algebraic manipulation we get:
and by formally expressing and integrating both sides we have:
In order to solve the integral in we make the substitution , and hence:
In a similar way as for the brachistochrone problem, we get the extremal trajectory in parametric form as:
where we have defined . Note that these define a circular arc of radius centered at . This is easy to see considering that:
and thus:
However, we still need to compute the actual value of the radius and the center coordinates . Assume for the sake of simplicity that , and . Substituting the boundary conditions in the previous equation gives:
By subtracting these two conditions we have:
Therefore the center of this family of circles lies at , and the circles pass through and . We still need to compute the radius and the ordinate of the center ; indeed, these two parameters depend on the isoperimetric constraint (i.e., the given length of the arc). The length of an arc of a circle of radius between and (yet unknown) is ; this comes from the definition of radians. Obviously the same relation holds if we compute it analytically as:
and can be expressed in terms of by using the boundary conditions at and :
Since the arctangent is an odd function, we have , while from the Cartesian expression of the circle with = 0 at the point we get:
Finally, the expression for the length contains the only unknown :
This, given , can be solved numerically for , the ordinate of the center of the circle; then, using , we can retrieve the value of the Lagrange multiplier which, interestingly enough, is in this case the radius of the circle with the sign changed. as a function of is plotted in Figure 3.14. The extremal for the simple case , which gives and , is plotted in Figure 3.15.
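The numerical solution step just described can be sketched as follows, for a hypothetical instance (assumed data: endpoints (-1, 0) and (1, 0), prescribed arc length L = 2.5; the text's actual values were not preserved). For a circular arc through these endpoints, the half-angle alpha satisfies sin(alpha) = 1/R, so the length condition is L = 2·R·asin(1/R):

```python
from math import asin, sin, sqrt

# Hypothetical instance of Dido's problem: chord endpoints (-1, 0), (1, 0),
# prescribed arc length L in (2, pi) for an arc shallower than a semicircle.
L = 2.5

def arc_length(R):
    """Length of the circular arc of radius R over a chord of half-width 1."""
    return 2.0 * R * asin(1.0 / R)

# arc_length decreases in R (from pi at R = 1 towards 2 as R -> infinity),
# so arc_length(R) = L has a unique root; solve it by bisection.
lo, hi = 1.0, 100.0
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if arc_length(mid) > L:
        lo = mid
    else:
        hi = mid
R = 0.5 * (lo + hi)

# Ordinate of the centre (on the perpendicular bisector of the chord);
# negative, i.e. below the chord, for an arc shallower than a semicircle.
y_c = -sqrt(R * R - 1.0)

alpha = asin(1.0 / R)
x_end = R * sin(alpha)    # reproduces the chord half-width 1
print(R, y_c, x_end)
```

The bisection recovers the radius and centre ordinate from the length constraint alone, mirroring the numerical step described in the text.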
Example 3.17 (Problem of the Surface of Revolution of Minimum Area). Consider the problem to find the smooth curve having a fixed length joining two given points and and generating a surface of revolution around the -axis of minimum area. In mathematical terms, the problem consists of finding a minimizer of the functional:
on , subject to the isoperimetric constraint:
Let us drop the coefficient in and introduce the Lagrangian as
Since does not depend on the independent variable , we use the fact that must be constant along an extremal trajectory,
for some constant . That is:
which can be rearranged as:
which, with some simplification, gives:
Again we solve the integral by substituting and , and thus:
and by using the relation we obtain:
and assuming we have:
Finally, substituting back in , we obtain:
The constants , and are to be found from the boundary conditions and the isoperimetric constraint. Note that the extremal curves are a family of catenaries. For the sake of simplicity, let us reason inversely: we assign and compute the boundary conditions. Note that is a translation along the -axis, while is a translation along the -axis. With , and , we get the results of Figures 3.16 and 3.17, where the length of the curve is:
Note that the length does not depend directly on the Lagrange multiplier , while the surface area is:
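The independence of the length from the multiplier can be checked numerically. The sketch below assumes a hypothetical parametrisation x(t) = lam + a·cosh((t − b)/a) (the symbols a, b, lam are placeholders for the text's constants); then x′(t) = sinh((t − b)/a), the arc-length element is sqrt(1 + x′²) = cosh((t − b)/a), and the length over [t0, t1] is a·(sinh((t1 − b)/a) − sinh((t0 − b)/a)), in which lam drops out:

```python
from math import cosh, sinh

# Assumed catenary parameters (hypothetical values for illustration).
a, b, lam = 1.0, 0.5, -0.3
t0, t1 = 0.0, 1.0

# Closed-form arc length: lam does not appear.
length = a * (sinh((t1 - b) / a) - sinh((t0 - b) / a))

# Midpoint-rule quadrature of the arc-length integral as a cross-check;
# the integrand sqrt(1 + sinh^2) simplifies to cosh.
n = 100000
h = (t1 - t0) / n
num = sum(cosh((t0 + (i + 0.5) * h - b) / a) * h for i in range(n))

print(length, num)  # the two values agree; neither involves lam
```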
Example 3.18 (Solution to the hanging rope with fixed length). We give the solution to the problem shown in Example 3.2. Given a rope of length attached to two poles and , find the function that describes the shape assumed by the rope under the action of gravity. We want to minimize the functional expressing the potential energy, under the isoperimetric constraint expressing the fixed length. That is:
on subject to the isoperimetric constraint
This is exactly the same problem as the surface of minimum area. The rope assumes the shape of a catenary:
where again are to be found from the boundary conditions and the length .
The method of Lagrange multipliers can also be used to address problems of the calculus of variations having inequality constraints (or mixed equality and inequality constraints), as shown by the following:
Theorem 3.16: Existence and Uniqueness of the Lagrange Multipliers (Inequality Constraints)
Let and be functionals defined in a neighborhood of in a normed linear space and having continuous and linear Gâteaux derivatives in this neighborhood. Suppose that is a (local) minimum point for constrained to for some constant vector . Suppose further that constraints, say for simplicity, are active at and satisfy the regularity condition
for (independent) directions (the remaining constraints being inactive). Then, there exists a vector such that:
and
for
Proof. Since the inequality constraints are inactive, the nonnegativity and complementary slackness conditions are trivially satisfied by taking . On the other hand, since the inequality constraints are active and satisfy a regularity condition at , the conclusion that there exists a unique vector such that the "stationarity condition" holds follows from Theorem 3.12; moreover, the complementary slackness condition is trivially satisfied for , since if is active. Hence, it suffices to prove that the Lagrange multipliers cannot assume negative values when is a (local) minimum.
We show the result by contradiction. Without loss of generality, suppose that , and consider the matrix A defined by
By hypothesis, , hence the null space of has dimension lower than or equal to . But from the stationarity condition the nonzero vector . That is, has dimension equal to if and only if there exists such that . Because , there does not exist a in such that for every . Thus, by Gordan’s Theorem, there exists a nonzero vector in such that , or equivalently:
The Gâteaux derivatives of being linear (by assumption), we get:
In particular,
that is, being a local minimum of on ,
and we get
which contradicts the inequality obtained earlier. □