We now turn our attention to problems of the calculus of variations in the presence of constraints. As in finite-dimensional optimization, it is convenient to use Lagrange multipliers in order to derive the necessary conditions of optimality associated with such problems; these considerations are discussed in the following for equality constraints. The method of Lagrange multipliers is then used to obtain necessary conditions of optimality for problems subject to (equality) terminal constraints and isoperimetric constraints, respectively. The Lagrange multiplier methodology is then extended to inequality constrained problems.
We start by considering a general class of equality constraints where the function space is defined by the level sets of one or more functionals .
Definition 3.11: Equality constrained calculus of variations problem
Consider the problem to minimize:
where .
The functionals are defined in as well.
Remark 3.19
Note that the basic problem of the calculus of variations we have addressed so far also falls into this broader category. Consider the functionals and , and recall that for the basic problem we had . We can define the same as .
Remark 3.20: Common types of constraint functionals
This class of problems allows us to treat more general settings such as endpoint-constrained problems and isoperimetric problems. In the endpoint-constrained case the constraint functionals take the form for . The constraint reduces to , so that the final state is forced to satisfy . Isoperimetric problems are a class of problems where the constraint functionals are defined as and the constraint is expressed as the level set , where . Constraints of this kind are found in minimum volume [area] problems with fixed area [perimeter], such as the catenary or Dido’s problem, which we will solve in the following sections.
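Written out in generic symbols (a sketch; the symbols Ψ, g, and C below are placeholders standing in for the notation of the surrounding text), the two constraint types read:

```latex
% Endpoint constraint: the final state is pinned to a zero level set
\Psi\bigl(x(t_1)\bigr) = 0 ,
% Isoperimetric constraint: an integral functional is fixed to a given value
\int_{t_0}^{t_1} g\bigl(t, x(t), \dot{x}(t)\bigr)\,\mathrm{d}t = C .
```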
The next theorem, which we state without proof, is instrumental in proving a fundamental lemma on the existence of minimizers of such equality constrained problems.
Theorem 3.11: Inverse Function Theorem
Let and . If a function has continuous first partial derivatives in each component, with nonvanishing Jacobian determinant at , then provides a continuously invertible mapping between and a region containing a full neighborhood of .
The following Lemma gives conditions under which a point in a normed linear space cannot be a (local) extremal of a functional , when constrained to the level set of another functional .
Lemma 3.5
Let and be functionals defined in a neighborhood of in a normed linear space , and let be the equality constraint defining . Suppose that there exist fixed directions such that the Gâteaux derivatives of and satisfy the Jacobian condition:
and are continuous in a neighborhood of (in each direction , ). Then, cannot be a local extremal for when constrained to .
Proof. Consider the auxiliary functions and . We have and , and by definition of the Gâteaux derivative , , and . Let us also define the function as:
The Jacobian determinant of at is:
By hypothesis we have , hence Theorem 3.11 is applicable. Therefore maps a neighborhood of onto a region containing a full neighborhood of the origin in the plane, as shown in Figure 3.12. This means that we can find and such that:
by definition of the functions and we have:
This means that is admissible (i.e., it satisfies the constraint ) and reduces the cost, that is ; hence cannot be a local extremal trajectory. □
With this preparation, it is easy to derive necessary conditions for a local extremal in the presence of equality or inequality constraints, and in particular the existence of the Lagrange multipliers.
Theorem 3.12: Existence of the Lagrange Multipliers
Let and be functionals defined in a neighborhood of in a normed linear space and having continuous Gâteaux derivatives in this neighborhood. Let also and suppose that is a (local) extremal for constrained to . Suppose further that for some direction . Then, there exists a scalar such that:
Proof. Since is a (local) extremal for constrained to , by Lemma 3.5 we must have that the determinant:
for any . Hence, by defining:
it follows that for each . □
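The mechanism of this proof can be sketched in generic notation (the symbols J, G, x*, ξ, ξ̄, and λ below stand in for those of the statement; this is not the text’s original display):

```latex
% Sketch: the 2x2 determinant of Gateaux derivatives vanishes for all xi;
% with \delta G(x^*;\bar\xi) \neq 0 this yields the multiplier lambda:
\det\begin{pmatrix}
\delta J(x^*;\xi) & \delta J(x^*;\bar\xi)\\
\delta G(x^*;\xi) & \delta G(x^*;\bar\xi)
\end{pmatrix} = 0
\quad\Longrightarrow\quad
\lambda \;:=\; -\,\frac{\delta J(x^*;\bar\xi)}{\delta G(x^*;\bar\xi)},
\qquad
\delta J(x^*;\xi) + \lambda\,\delta G(x^*;\xi) = 0 .
```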
Remark 3.21
As in the finite-dimensional case, the parameter in Theorem 3.12 is called a Lagrange multiplier. Using the terminology of directional derivatives appropriate to , the Lagrange condition says simply that the directional derivatives of are proportional to those of at . Thus, in general, Lagrange’s condition means that the level sets of both and share the same tangent hyperplane at , i.e., they meet tangentially. Note also that Lagrange’s condition can be written in the form
This is due to the linearity of the Gâteaux derivative: given two scalars and two functionals from to , we have , as shown in Remark 3.7. This suggests, as we will see in the following, considering the Lagrangian functional .
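By that linearity, the two proportional derivative conditions can be collected into stationarity of a single Lagrangian functional; a sketch in generic symbols (J, G, λ are placeholders):

```latex
% Lagrangian functional combining cost and constraint:
\mathcal{L}(x) \;:=\; J(x) + \lambda\, G(x),
\qquad
\delta\mathcal{L}(x^*;\xi) \;=\; \delta J(x^*;\xi) + \lambda\,\delta G(x^*;\xi) \;=\; 0
\quad\text{for every admissible direction } \xi .
```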
It is possible, albeit technical, to extend the method of Lagrange multipliers to problems involving any finite number of constraint functionals:
Theorem 3.13: Existence of the Lagrange Multipliers (Multiple Constraints)
Let and be functionals defined in a neighborhood of in a normed linear space and having continuous Gâteaux derivatives in this neighborhood. Let also and suppose that is a (local) extremal for constrained to . Suppose further that:
for (independent) directions . Then, there exists a constant vector such that:
Remark 3.22: Link to Nonlinear Optimization
The previous theorem is the generalization of Theorem 2.13 of Chapter 2 to optimization problems in normed linear spaces. Note, in particular, that the requirement that be a regular point for the Lagrange multipliers to exist is generalized by a non-singularity condition in terms of the Gâteaux derivatives of the constraint functionals . Yet, this condition is in general not sufficient to guarantee uniqueness of the Lagrange multipliers.
Remark 3.23: Hybrid Method of Admissible Directions and Lagrange Multipliers
If with a subset of a normed linear space and the -admissible directions form a linear subspace of (i.e., for every scalars ), then the conclusions of Theorem 3.12 remain valid when the continuity requirement for is further restricted to and only -admissible directions are considered. Said differently, Theorem 3.12 can be applied to determine (local) extremals of the functional constrained to . This extension leads to a more efficient, but admittedly hybrid, approach to certain problems involving multiple constraints: those constraints which determine a domain having a linear subspace of -admissible directions may be taken into account simply by restricting the set of admissible directions when applying the method of Lagrange multipliers to the remaining constraints.
Necessary conditions of optimality for problems with end-point constraints and isoperimetric constraints shall be obtained with this hybrid approach in the next subsections.
So far, we have only considered problems with either free or fixed end-time and end-points , . In this subsection, we shall consider problems having end-point constraints of the form , with being specified or not. In case is free, it shall be considered as an additional variable in the optimization problem. As in 3.3.1, we shall then define the functions by extension on a "sufficiently" large interval , and consider the linear space , supplied with the weak norm . In particular, Theorem 3.13 applies readily by specializing the normed linear space to and considering the Gâteaux derivative at in the direction . These considerations yield necessary conditions of optimality for problems with end-point constraints, as given in the following:
Theorem 3.14: Transversal Conditions
Consider the problem to minimize the functional
on with and . Suppose that gives a (local) minimum for on and at . Then, is a solution to the Euler equation of Theorem 3.3 satisfying both the end-point constraints and the transversal condition:
In the particular one-dimensional case (i.e., ), the transversal condition reduces to:
Proof. Observe first that, by fixing and varying in the -admissible direction such that , we show as in the proof of Theorem 3.3 that must be a solution to the Euler equation on . Observe also that the right end-point constraints may be expressed as the zero-level set of the functionals:
then simplifying the terms using the Euler equation as in the proof of Theorem 3.7 we obtain the first variations of the cost and constraint functionals as:
where the usual compressed notation is used. Based on the differentiability assumptions on and , it is clear that these Gâteaux derivatives exist and are continuous. Further, since the rank condition holds at , one can always find (independent) directions such that the regularity condition:
is satisfied. Now, consider the linear subspace
. Since gives a (local) minimum for on , by Theorem 3.13 and Remark 3.23 there exists a vector such that:
The above equation can be rewritten using the vectorial form as:
which must hold for every . In particular, we consider variations such that , so that the second term simplifies and we get:
which must hold for each admissible , and thus we get:
Now, considering variations such that , while is such that , we get for each :
and thus we have the equation:
for each . In vectorial notation we have the following dimensional system of equations:
which, following our definitions of and , becomes:
This is a linear system of equations expressed as , where . Note that is the Jacobian matrix, whose dimensions are , and by hypothesis its rank is . Therefore, by the rank–nullity theorem, . We can thus pick a dimensional vector and post-multiply the previous equation by it; noting that since , we get:
Note that if belongs to the kernel of the Jacobian at , then also does; hence it also holds that:
Substituting the definition of and we get:
Remark 3.24: One-dimensional case
For the one-dimensional case we have:
Note that reduces to a row vector in ; the kernel is then simply the subspace of vectors in such that:
By picking and , the transversality condition reduces to:
Remark 3.25
Note that ; therefore its Jacobian is a matrix of the form:
It is a matrix-valued function of , that is, . Its rank is at most , attained when all of its rows are independent.
Example 3.15 (Minimum Path Problem with Variable End-Point and End-Time). Consider the problem of minimizing the distance between a fixed point and the point in the -plane. We want to find the curve such that the functional is minimized, subject to the constraints and . Note that neither nor is explicitly given. Instead, they are implicitly given by the scalar end-point constraint , where is obviously a function. We stress the fact that (i.e., the zero level-set of the function ) is a straight line in . Therefore, we are looking for the minimum path from a fixed point to a point on a given straight line. For the sake of simplicity, let us consider ; from Theorem 3.14 we know that an extremal must satisfy the Euler equation in the interval . Note that is yet unknown. As we have already seen, the extremals of minimum path problems are straight lines. Since we have assigned a boundary condition in , we get:
which is a family of straight lines passing through the origin. In order to find and , we apply the necessary condition of Theorem 3.14 adapted to the one-dimensional case, that is:
Substituting , , , we get:
and upon simplification:
This means precisely that the extremal curve is perpendicular to the straight line defined by the constraint. In this simple case this is a global condition, but in general the transversality condition is a local condition at the end-point, relating the "gradient" of the constraint and the extremal trajectory. Note that we have also implicitly found as the intersection between the lines and , that is . A graphical representation of this problem is shown in Figure 3.13.
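As a numerical sanity check of the perpendicularity just derived, the sketch below picks a hypothetical instance of this example (the origin as the fixed point, and the straight line t + x = 1 as the constraint set; both are assumptions for illustration, not the text's data) and minimizes the distance over the family of extremals x = a·t:

```python
import numpy as np

# Hypothetical instance (assumed data): minimize the distance from the
# fixed point (0, 0) to a point on the line psi(t, x) = t + x - 1 = 0.
# The extremals are straight lines x = a*t through the origin; the
# transversality condition predicts the optimal extremal is perpendicular
# to the constraint line, i.e. slope a = 1.

line_dir = np.array([1.0, -1.0])  # direction vector of the line t + x = 1

def cost(a):
    """Distance from the origin to the intersection of x = a*t with the line."""
    t1 = 1.0 / (1.0 + a)          # intersection abscissa
    return np.hypot(t1, a * t1)

# Brute-force minimization over a grid of slopes a.
grid = np.linspace(-0.9, 10.0, 200001)
a_star = grid[np.argmin([cost(a) for a in grid])]

extremal_dir = np.array([1.0, a_star])
print(a_star)                     # close to 1
print(extremal_dir @ line_dir)    # close to 0: perpendicularity
```

The minimizing slope agrees with the transversality prediction, and the inner product of the two direction vectors vanishes, confirming orthogonality for this instance.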
An isoperimetric problem of the calculus of variations is a problem wherein one or more constraints involve a functional in the form of an integral over part or all of the integration horizon . Typically:
where is a given number. Note that this problem already fits the framework that we have treated so far since the constraint is in the form of a level-set of the functional . We will use the hybrid approach in Remark 3.23 to deal both with simpler constraints imposed by the function space (e.g. fixed end-points) and with the isoperimetric constraint .
Theorem 3.15: First-Order Necessary Conditions for Isoperimetric Problems
Consider the problem to minimize the functional:
on subject to the isoperimetric constraints:
with and . Suppose that gives a (local) minimum for this problem, and
for (independent) directions . Then, there exists a constant vector such that is a solution to the so-called Euler–Lagrange equation:
where:
Proof. Remark first that, from the differentiability assumptions on and the Gâteaux derivatives and , , exist and are continuous for every .
Since gives a (local) minimum for on constrained to , and is a regular point for the constraints, by Theorem 3.13 (and Remark 3.23 ), there exists a constant vector such that:
for each -admissible direction . Observe that this latter condition is equivalent to that of finding a minimizer to the functional:
on . The conclusion then directly follows upon applying Theorem 3.3. □
Remark 3.26: First Integrals
Similarly to free problems of the calculus of variations (see Remark 3.11), it is easy to show that the Hamiltonian function defined as
is constant along an extremal trajectory provided that does not depend on the independent variable . Recall the definition and that along an extremal trajectory the necessary condition of Theorem 3.15 holds, that is:
The total time derivative of the Hamiltonian is:
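Writing F for the (possibly multiplier-augmented) integrand and adopting the common convention H = ẋ F_ẋ − F (a sketch; the text's own sign convention may differ), the computation can be carried out as:

```latex
\frac{\mathrm{d}H}{\mathrm{d}t}
  = \ddot{x}\,F_{\dot{x}} + \dot{x}\,\frac{\mathrm{d}}{\mathrm{d}t}F_{\dot{x}}
    - F_x\,\dot{x} - F_{\dot{x}}\,\ddot{x} - F_t
  = \dot{x}\left(\frac{\mathrm{d}}{\mathrm{d}t}F_{\dot{x}} - F_x\right) - F_t
  = -\,F_t ,
```

with the bracket vanishing by the Euler–Lagrange equation; hence H is constant along an extremal whenever F has no explicit dependence on t.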
Example 3.16 (Solution to Dido’s Problem). We are finally ready to face the problem shown in Example 3.3. Dido’s problem is to minimize the functional:
subject to the isoperimetric constraint:
The subspace is . Now we build the function as:
Since is independent of , the Hamiltonian function is constant along an extremal trajectory, that is:
In this case we have , hence:
Manipulating the previous relation we get:
This is a separable nonlinear differential equation; upon squaring both sides and after some algebraic manipulation we get:
and by formally expressing and integrating both sides we have:
In order to solve the integral in we make the substitution , and hence:
In a similar way as for the brachistochrone problem, we get the extremal trajectory in parametric form as:
where we have defined . Note that these define a circular arc of radius centered at . This is easy to see considering that:
and thus:
However, we still need to compute the actual value of the radius and the center coordinates . Assume for the sake of simplicity that , and . Substituting the boundary conditions in the previous equation gives:
By subtracting these two conditions we have:
Therefore the center of this family of circles lies at , and the circles pass through and . We still need to compute the radius and the ordinate of the center ; indeed, these two parameters depend on the isoperimetric constraint (i.e., the given length of the arc). The length of an arc of a circle of radius between and (yet unknown) is ; this comes from the definition of radians. Obviously the same relation holds if we compute it analytically as:
and can be expressed in terms of by using the boundary conditions at and :
Since the arctangent is an odd function, we have , while from the Cartesian expression of the circle with = 0 at the point we get:
Finally, the expression for the length contains the only unknown :
This, given , can be solved numerically for , the ordinate of the center of the circle; then, using , we can retrieve the value of the Lagrange multiplier which, interestingly enough, is in this case the radius of the circle with the sign changed. as a function of is plotted in Figure 3.14. The extremal for the simple case , which gives and , is plotted in Figure 3.15.
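The numerical solution step just described can be sketched as follows, for a hypothetical instance (assumed data: endpoints (-1, 0) and (1, 0), prescribed arc length L = 2.5; the text's actual values were not preserved). For a circular arc through these endpoints, the half-angle alpha satisfies sin(alpha) = 1/R, so the length condition is L = 2·R·asin(1/R):

```python
from math import asin, sin, sqrt

# Hypothetical instance of Dido's problem: chord endpoints (-1, 0), (1, 0),
# prescribed arc length L in (2, pi) for an arc shallower than a semicircle.
L = 2.5

def arc_length(R):
    """Length of the circular arc of radius R over a chord of half-width 1."""
    return 2.0 * R * asin(1.0 / R)

# arc_length decreases in R (from pi at R = 1 towards 2 as R -> infinity),
# so arc_length(R) = L has a unique root; solve it by bisection.
lo, hi = 1.0, 100.0
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if arc_length(mid) > L:
        lo = mid
    else:
        hi = mid
R = 0.5 * (lo + hi)

# Ordinate of the centre (on the perpendicular bisector of the chord);
# negative, i.e. below the chord, for an arc shallower than a semicircle.
y_c = -sqrt(R * R - 1.0)

alpha = asin(1.0 / R)
x_end = R * sin(alpha)    # reproduces the chord half-width 1
print(R, y_c, x_end)
```

The bisection recovers the radius and centre ordinate from the length constraint alone, mirroring the numerical step described in the text.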
Example 3.17 (Problem of the Surface of Revolution of Minimum Area). Consider the problem to find the smooth curve having a fixed length joining two given points and and generating a surface of revolution around the -axis of minimum area. In mathematical terms, the problem consists of finding a minimizer of the functional:
on , subject to the isoperimetric constraint:
Let us drop the coefficient in and introduce the Lagrangian as
Since does not depend on the independent variable , we use the fact that must be constant along an extremal trajectory,
for some constant . That is:
which can be rearranged as:
which, with some simplification, gives:
Again we solve the integral by substituting and , and thus:
and by using the relation we obtain:
and assuming we have:
Finally, substituting back in , we obtain:
The constants , and are to be found from the boundary conditions and the isoperimetric constraint. Note that the extremal curves are a family of catenaries. For the sake of simplicity, let us reason inversely: we assign and compute the boundary conditions. Note that is a translation along the -axis, while is a translation along the -axis. With , and , we get the results of Figures 3.16 and 3.17, where the length of the curve is:
Note that the length does not depend directly on the Lagrange multiplier , while the surface area is:
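The independence of the length from the multiplier can be checked numerically. The sketch below assumes a hypothetical parametrisation x(t) = lam + a·cosh((t − b)/a) (the symbols a, b, lam are placeholders for the text's constants); then x′(t) = sinh((t − b)/a), the arc-length element is sqrt(1 + x′²) = cosh((t − b)/a), and the length over [t0, t1] is a·(sinh((t1 − b)/a) − sinh((t0 − b)/a)), in which lam drops out:

```python
from math import cosh, sinh

# Assumed catenary parameters (hypothetical values for illustration).
a, b, lam = 1.0, 0.5, -0.3
t0, t1 = 0.0, 1.0

# Closed-form arc length: lam does not appear.
length = a * (sinh((t1 - b) / a) - sinh((t0 - b) / a))

# Midpoint-rule quadrature of the arc-length integral as a cross-check;
# the integrand sqrt(1 + sinh^2) simplifies to cosh.
n = 100000
h = (t1 - t0) / n
num = sum(cosh((t0 + (i + 0.5) * h - b) / a) * h for i in range(n))

print(length, num)  # the two values agree; neither involves lam
```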
Example 3.18 (Solution to the hanging rope with fixed length). We give the solution to the problem shown in Example 3.2. Given a rope of length attached to two poles and , find the function that describes the shape assumed by the rope under the action of gravity. We want to minimize the functional expressing the potential energy, under the isoperimetric constraint expressing the fixed length. That is:
on subject to the isoperimetric constraint
This is exactly the same problem as the surface of minimum area. The rope assumes the shape of a catenary:
where again are to be found from the boundary conditions and the length .
The method of Lagrange multipliers can also be used to address problems of the calculus of variations having inequality constraints (or mixed equality and inequality constraints), as shown by the following:
Theorem 3.16: Existence and Uniqueness of the Lagrange Multipliers (Inequality Constraints)
Let and be functionals defined in a neighborhood of in a normed linear space and having continuous and linear Gâteaux derivatives in this neighborhood. Suppose that is a (local) minimum point for constrained to for some constant vector . Suppose further that constraints, say for simplicity, are active at and satisfy the regularity condition
for (independent) directions (the remaining constraints being inactive). Then, there exists a vector such that:
and
for
Proof. Since the inequality constraints are inactive, the nonnegativity and complementary slackness conditions are trivially satisfied by taking . On the other hand, since the inequality constraints are active and satisfy a regularity condition at , the conclusion that there exists a unique vector such that the "stationarity condition" holds follows from Theorem 3.12; moreover, the complementary slackness condition is trivially satisfied for , since if is active. Hence, it suffices to prove that the Lagrange multipliers cannot assume negative values when is a (local) minimum.
We show the result by contradiction. Without loss of generality, suppose that , and consider the matrix A defined by
By hypothesis, , hence the null space of has dimension lower than or equal to . But from the stationarity condition the nonzero vector . That is, has dimension equal to if and only if there exists such that . Because , there does not exist a in such that for every . Thus, by Gordan’s Theorem, there exists a nonzero vector in such that , or equivalently:
The Gâteaux derivatives of being linear (by assumption), we get:
In particular,
that is, being a local minimum of on ,
and we get
which contradicts the inequality obtained earlier. □