Notes | Mechatronic Systems and Laboratory

3.4 Piecewise $𝒞^{1}$ extremal functions

In all the problems examined so far, the functions defining the class for optimization were required to be continuously differentiable, that is, $x \in 𝒞^{1} {[t_{1}, t_{2}]}^{n_{x}}$. Yet, it is natural to wonder whether cornered trajectories, i.e., trajectories represented by piecewise continuously differentiable functions, might yield improved results. Besides improvement, it is also natural to wonder whether those problems of the calculus of variations which do not have extremals in the class of $𝒞^{1}$ functions actually have extremals in the larger class of piecewise $𝒞^{1}$ functions.

Definition 3.10: Piecewise $𝒞^{1}$ functions

A real-valued function $\hat{x} \in 𝒞 [a, b]$ is said to be piecewise $𝒞^{1}$, denoted $\hat{x} \in {\hat{𝒞}}^{1} [a, b]$, if there is a finite (irreducible) partition $a = c_{0} < c_{1} < \dots < c_{N + 1} = b$ such that $\hat{x}$ may be regarded as a function in $𝒞^{1} [c_{k}, c_{k + 1}]$ for each $k = 0, 1, \dots, N$. When present, the interior points $c_{1}, \dots, c_{N}$ are called corner points of $\hat{x}$.

Figure 3.8: Illustration of a piecewise continuously differentiable function $\hat{x} \in {\hat{𝒞}}^{1} [a, b]$ (thick red line), and its derivative $\dot{\hat{x}}$ (dash-dotted red line); without corners, $\hat{x}$ may resemble the continuously differentiable function $x \in 𝒞^{1} [a, b]$ (dashed blue curve).

Some remarks are in order. Observe first that, when there are no corners, $\hat{x} \in 𝒞^{1} [a, b]$. Further, for any $\hat{x} \in {\hat{𝒞}}^{1} [a, b]$, $\dot{\hat{x}}$ is defined and continuous on $[a, b]$ except at its corner points $c_{1}, \dots, c_{N}$, where it has distinct limiting values $\dot{\hat{x}} (c_{k}^{\pm})$; such discontinuities are said to be simple, and $\dot{\hat{x}}$ is said to be piecewise continuous on $[a, b]$, denoted $\dot{\hat{x}} \in \hat{𝒞} [a, b]$. Figure 3.8 illustrates the effect of the discontinuities of $\dot{\hat{x}}$ in producing corners on the graph of $\hat{x}$. Without these corners, $\hat{x}$ might resemble the $𝒞^{1}$ function $x$ whose graph is presented for comparison. In particular, each piecewise $𝒞^{1}$ function is “almost” $𝒞^{1}$, in the sense that it is only necessary to round out the corners to produce the graph of a $𝒞^{1}$ function. These considerations are formalized by the following lemma:

Lemma 3.4: Smoothing of Piecewise $𝒞^{1}$ Functions

Let $\hat{x} \in {\hat{𝒞}}^{1} [a, b]$ . Then, for each $δ > 0,$ there exists $x \in 𝒞^{1} [a, b]$ such that $x \equiv \hat{x}$ except in a neighborhood $ℬ_{δ} (c_{k})$ of each corner point of $\hat{x}$ . Moreover, $∥ x - \hat{x} ∥_{\infty} \leq Â δ$ where $Â$ is a constant determined by $\hat{x}$ .
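As a concrete illustration of the lemma, the sketch below rounds the single corner of $\hat{x} (t) = | t |$ on $[- 1, 1]$ with a parabolic patch on $(- δ, δ)$. The test function, the patch, and the names `x_hat`, `x_smooth` and `sup_gap` are choices made here for illustration, not taken from the notes. For this particular construction the gap in the $∥ \cdot ∥_{\infty}$ norm equals $δ / 2$, which is consistent with a bound of the form $Â δ$.

```python
def x_hat(t):
    """Piecewise C^1 function with one corner at t = 0."""
    return abs(t)


def x_smooth(t, delta):
    """C^1 function that coincides with x_hat outside (-delta, delta).

    Inside, a parabola matches both the value and the slope of |t|
    at t = -delta and t = +delta, so the corner is rounded out.
    """
    if abs(t) >= delta:
        return abs(t)
    return t * t / (2.0 * delta) + delta / 2.0


def sup_gap(delta, n=20001):
    """Grid estimate of max |x_smooth - x_hat| on [-1, 1]."""
    grid = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]
    return max(abs(x_smooth(t, delta) - x_hat(t)) for t in grid)


for delta in (0.2, 0.1, 0.05):
    # The gap shrinks linearly with delta.
    print(delta, sup_gap(delta))
```

The largest deviation occurs at the corner itself, where the patch takes the value $δ / 2$ while $\hat{x} (0) = 0$.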

Likewise, we shall consider the class ${\hat{𝒞}}^{1} {[a, b]}^{n_{x}}$, the $n_{x}$-dimensional vector-valued analogue of ${\hat{𝒞}}^{1} [a, b]$, consisting of those functions $\hat{x}$ with components ${\hat{x}}_{j} \in {\hat{𝒞}}^{1} [a, b]$, $j = 1, \dots, n_{x}$. The corners of such $\hat{x}$ are by definition those of any one of its components ${\hat{x}}_{j}$. Note that the above lemma can be applied to each component of a given $\hat{x}$, and shows that $\hat{x}$ can be approximated by an $x \in 𝒞^{1} {[a, b]}^{n_{x}}$ which agrees with it except in prescribed neighborhoods of its corner points. Both real-valued and real vector-valued classes of piecewise $𝒞^{1}$ functions form linear spaces of which the subsets of $𝒞^{1}$ functions are subspaces. Indeed, a constant multiple of one of these functions is another of the same kind, and the sum of two such functions is piecewise continuously differentiable with respect to a suitable partition of the underlying interval $[a, b]$. Since ${\hat{𝒞}}^{1} [a, b] \subset 𝒞 [a, b]$, the expression

∥ x ∥_{\infty} : = \max_{a \leq t \leq b} | x (t) |

defines a norm on ${\hat{𝒞}}^{1} [a, b] .$ Moreover,

∥ x ∥_{1, \infty} : = \max_{a \leq t \leq b} | x (t) | + \sup_{t \in ⋃_{k = 0}^{N} (c_{k}, c_{k + 1})} | ẋ (t) |

can be shown to be another norm on ${\hat{𝒞}}^{1} [a, b]$, with $a = c_{0} < c_{1} < \dots < c_{N} < c_{N + 1} = b$ being a suitable partition for $\hat{x}$. (The space of vector-valued piecewise $𝒞^{1}$ functions ${\hat{𝒞}}^{1} {[a, b]}^{n_{x}}$ can also be endowed with the norms $∥ \cdot ∥_{\infty}$ and $∥ \cdot ∥_{1, \infty}$.) By analogy to the linear space of $𝒞^{1}$ functions, the maximum norms $∥ \cdot ∥_{\infty}$ and $∥ \cdot ∥_{1, \infty}$ are referred to as the strong norm and the weak norm, respectively; the functions which are locally extremal with respect to the former [latter] norm are said to be strong [weak] extremal functions.

3.4.1 The Weierstrass-Erdmann Corner Conditions

A natural question that arises when considering the class of ${\hat{𝒞}}^{1}$ functions is whether a (local) extremal point for a functional in the class of $𝒞^{1}$ functions also gives a (local) extremal point for this functional in the larger class of ${\hat{𝒞}}^{1}$ functions. We state the following Theorem:

Theorem 3.8: ${\hat{𝒞}}^{1}$ Extremals vs. $𝒞^{1}$ Extremals

If $x^{⋆}$ gives a [local] extremal point for the functional:

F [x] : = \int_{t_{1}}^{t_{2}} f (t, x (t), \dot{x} (t)) d t

on $𝒟 : = \{x \in 𝒞^{1} {[t_{1}, t_{2}]}^{n_{x}} : x (t_{1}) = x_{1}, x (t_{2}) = x_{2}\}$ with $f \in 𝒞 ([t_{1}, t_{2}] \times ℝ^{2 n_{x}})$, then $x^{⋆}$ also gives a [local] extremal point for $F$ on $\hat{𝒟} : = \{\hat{x} \in {\hat{𝒞}}^{1} {[t_{1}, t_{2}]}^{n_{x}} : \hat{x} (t_{1}) = x_{1}, \hat{x} (t_{2}) = x_{2}\}$ with respect to the same $∥ \cdot ∥_{\infty}$ or $∥ \cdot ∥_{1, \infty}$ norm.

On the other hand, a functional $F$ may not have $𝒞^{1}$ extremals, but be extremized by a ${\hat{𝒞}}^{1}$ function. We shall first seek weak (local) extremals ${\hat{x}}^{⋆} \in {\hat{𝒞}}^{1} {[t_{1}, t_{2}]}^{n_{x}}$, i.e., extremal trajectories with respect to some weak neighborhood $ℬ_{δ}^{1, \infty} ({\hat{x}}^{⋆})$.

Observe that $\hat{x} \in ℬ_{δ}^{1, \infty} ({\hat{x}}^{⋆})$ if and only if $\hat{x} = {\hat{x}}^{⋆} + α \hat{ξ}$ for $\hat{ξ} \in {\hat{𝒞}}^{1} {[t_{1}, t_{2}]}^{n_{x}}$ and a sufficiently small $α$ . In characterizing (weak) local extremals for the functional
$F [\hat{x}] : = \int_{t_{1}}^{t_{2}} f (t, \hat{x} (t), \dot{\hat{x}} (t)) d t$
on $\hat{𝒟} : = \{\hat{x} \in {\hat{𝒞}}^{1} {[t_{1}, t_{2}]}^{n_{x}} : \hat{x} (t_{1}) = x_{1}, \hat{x} (t_{2}) = x_{2}\}$, where $f$ and its partials $f_{x}, f_{\dot{x}}$ are continuous on $[t_{1}, t_{2}] \times ℝ^{2 n_{x}}$, one can therefore duplicate the analysis of the previous section. This is done by splitting the integral into a finite sum of integrals with continuously differentiable integrands, then differentiating each under the integral sign. Overall, it can be shown that a (weak, local) extremal ${\hat{x}}^{⋆} \in {\hat{𝒞}}^{1} {[t_{1}, t_{2}]}^{n_{x}}$ must be stationary in intervals excluding corner points, i.e., the Euler equation is satisfied on $[t_{1}, t_{2}]$ except at the corner points $c_{1}, \dots, c_{N}$ of ${\hat{x}}^{⋆}$.
Likewise, both the Legendre second-order necessary conditions and the convexity sufficient conditions can be shown to hold on intervals excluding corner points of a ${\hat{𝒞}}^{1}$ extremal.
Finally, transversality conditions corresponding to the various free end-point problems remain the same. To see this most easily, e.g., in the case where freedom is permitted only at the right end-point, suppose that ${\hat{x}}^{⋆} \in {\hat{𝒞}}^{1} {[t_{1}, t_{2}]}^{n_{x}}$ gives a local extremal for $F$ on $\hat{𝒟}$, and let $c_{N}$ be the right-most corner point of ${\hat{x}}^{⋆}$. Then, restricting comparison to those competing $\hat{x}$ having their right-most corner point at $c_{N}$ and satisfying $\hat{x} (c_{N}) = {\hat{x}}^{⋆} (c_{N})$, it is seen that the corresponding directions $(\hat{ξ}, τ)$ must utilize the end-point freedom exactly as for $𝒞^{1}$ functions, thus resulting in identical boundary conditions.

Besides necessary conditions of optimality on intervals excluding corner points $c_{1}, \dots, c_{N}$ of a local extremal ${\hat{x}}^{⋆} \in {\hat{𝒞}}^{1} {[t_{1}, t_{2}]}^{n_{x}},$ the discontinuities of ${\dot{\hat{x}}}^{⋆}$ which are permitted at each $c_{k}$ are restricted. These are the so-called first Weierstrass-Erdmann corner conditions.

Theorem 3.9: First Weierstrass-Erdmann Corner Condition

Let ${\hat{x}}^{⋆} (t)$ be a (weak) local extremal of the problem to minimize the functional

F [\hat{x}] : = \int_{t_{1}}^{t_{2}} f (t, \hat{x} (t), \dot{\hat{x}} (t)) d t

on $\hat{𝒟} : = \{\hat{x} \in {\hat{𝒞}}^{1} {[t_{1}, t_{2}]}^{n_{x}} : \hat{x} (t_{1}) = x_{1}, \hat{x} (t_{2}) = x_{2}\}$, where $f$ and its partials $f_{x}, f_{\dot{x}}$ are continuous on $[t_{1}, t_{2}] \times ℝ^{2 n_{x}}$. Then, at every (possible) corner point $c \in [t_{1}, t_{2}]$ of ${\hat{x}}^{⋆}$, we have:

{\frac{\partial f (c, {\hat{x}}^{⋆} (c), {\dot{\hat{x}}}^{⋆} (c^{-}))}{\partial \dot{x}}}^{⊤} = {\frac{\partial f (c, {\hat{x}}^{⋆} (c), {\dot{\hat{x}}}^{⋆} (c^{+}))}{\partial \dot{x}}}^{⊤}

where ${\dot{\hat{x}}}^{⋆} (c^{-})$ and ${\dot{\hat{x}}}^{⋆} (c^{+})$ denote the left and right time derivative of ${\hat{x}}^{⋆}$ at $c$ respectively.

Proof. Integrating both sides of the Euler equation between $t_{1}$ and $t$ for each component $i = 1, \dots, n_{x}$ gives:

\int_{t_{1}}^{t} \frac{d}{d t} \frac{\partial f}{\partial ẋ_{i}} d t = \int_{t_{1}}^{t} \frac{\partial f}{\partial x_{i}} d t \quad \Longrightarrow \quad \frac{\partial f}{\partial ẋ_{i}} = \int_{t_{1}}^{t} \frac{\partial f}{\partial x_{i}} d t + C_{i},

therefore the function $g (t) : = \frac{\partial f (t, {\hat{x}}^{⋆} (t), {\dot{\hat{x}}}^{⋆} (t))}{\partial ẋ_{i}}$, being the sum of an integral and a constant, is continuous at each $t \in (t_{1}, t_{2})$ even though ${\dot{\hat{x}}}^{⋆} (t)$ may be discontinuous at that point. That is, $g (c^{-}) = g (c^{+})$. Moreover, $\frac{\partial f}{\partial ẋ_{i}}$ being continuous in its $1 + 2 n_{x}$ arguments, ${\hat{x}}^{⋆} (t)$ being continuous at $c$, and ${\dot{\hat{x}}}^{⋆} (t)$ having finite limits ${\dot{\hat{x}}}^{⋆} (c^{\pm})$ at $c$, we get:

\frac{\partial f (c, \hat{x} (c), \dot{\hat{x}} (c^{-}))}{\partial ẋ_{i}} = \frac{\partial f (c, \hat{x} (c), \dot{\hat{x}} (c^{+}))}{\partial ẋ_{i}}

for each $i = 1, \dots, n_{x}$. □

Remark 3.16

The first Weierstrass-Erdmann condition of Theorem 3.9 shows that the discontinuities of $\dot{\hat{x}}$ which are permitted at corner points of a local extremal trajectory ${\hat{x}}^{⋆} \in {\hat{𝒞}}^{1} {[t_{1}, t_{2}]}^{n_{x}}$ are those which preserve the continuity of $f_{\dot{x}}$. Likewise, it can be shown that the continuity of the Hamiltonian $H : = f - \frac{\partial f}{\partial \dot{x}} \dot{x}$ must be preserved at corner points of ${\hat{x}}^{⋆}$, that is:

H (c, \hat{x} (c), \dot{\hat{x}} (c^{-})) = H (c, \hat{x} (c), \dot{\hat{x}} (c^{+}))

which yields the so-called second Weierstrass-Erdmann corner condition.

Example 3.12. Consider the problem to minimize the functional:

\begin{aligned} F [x] = \int_{- 1}^{1} x^{2} (t) {(1 - ẋ (t))}^{2} d t \\ \text{s.t.} \quad x \in 𝒟, \end{aligned}

where $𝒟 : = \{x \in 𝒞^{1} [- 1, 1] : x (- 1) = 0, x (1) = 1\}$. The Lagrangian $f = x^{2} (t) {(1 - ẋ (t))}^{2}$ does not depend explicitly on the independent variable $t$, so the Hamiltonian is constant along a stationary trajectory. Thus:

H = f - \frac{\partial f}{\partial ẋ} ẋ = x^{2} (t) {(1 - ẋ (t))}^{2} - [2 x {(t)}^{2} (ẋ (t) - 1)] ẋ (t) = c \quad \forall t \in [- 1, 1]

for some constant $c$. Upon simplification, we get:

x {(t)}^{2} (1 - ẋ {(t)}^{2}) = c \quad \forall t \in [- 1, 1] .

To solve this differential equation, we make the substitution $u (t) : = x {(t)}^{2}$, whence $\dot{u} (t) = 2 x (t) ẋ (t)$. This gives the somewhat simpler equation $\dot{u} {(t)}^{2} = 4 (u (t) - c)$, which can be solved by separation of variables and has general solution:

u (t) : = {(t + k)}^{2} + c

where $k$ is a constant of integration. In turn, substituting back $x^{2} (t) = u (t)$, we conclude that a stationary point $\bar{x}$ must be of the form:

\bar{x} {(t)}^{2} = {(t + k)}^{2} + c .

In particular, the boundary conditions $x (- 1) = 0$ and $x (1) = 1$ produce constants $c = - {(\frac{3}{4})}^{2}$ and $k = \frac{1}{4}$ . However, the resulting stationary function:

\bar{x} (t) = \sqrt{{(t + \frac{1}{4})}^{2} - {(\frac{3}{4})}^{2}} = \sqrt{(t + 1) (t - \frac{1}{2})}

is defined only for $t \geq \frac{1}{2}$ or $t \leq - 1$. Thus, there is no stationary function for the Lagrangian $f$ in $𝒟$. Next, we turn to the problem of minimizing $F$ in the larger set $\hat{𝒟} : = \{\hat{x} \in {\hat{𝒞}}^{1} [- 1, 1] : \hat{x} (- 1) = 0, \hat{x} (1) = 1\}$. Suppose that ${\hat{x}}^{⋆}$ is a local minimizer for $F$ on $\hat{𝒟}$. Then, by the first Weierstrass-Erdmann condition, at any corner point $c \in (- 1, 1)$ we must have:

\frac{\partial f (c, {\hat{x}}^{⋆} (c), {\dot{\hat{x}}}^{⋆} (c^{-}))}{\partial ẋ} = \frac{\partial f (c, {\hat{x}}^{⋆} (c), {\dot{\hat{x}}}^{⋆} (c^{+}))}{\partial ẋ}

- 2 {\hat{x}}^{⋆} {(c)}^{2} [1 - {\dot{\hat{x}}}^{⋆} (c^{-})] = - 2 {\hat{x}}^{⋆} {(c)}^{2} [1 - {\dot{\hat{x}}}^{⋆} (c^{+})]

which gives:

{\hat{x}}^{⋆} {(c)}^{2} [{\dot{\hat{x}}}^{⋆} (c^{+}) - {\dot{\hat{x}}}^{⋆} (c^{-})] = 0,

by definition of corner points, we must have ${\dot{\hat{x}}}^{⋆} (c^{+}) \neq {\dot{\hat{x}}}^{⋆} (c^{-})$; hence corner points are only allowed at those $c \in (- 1, 1)$ such that ${\hat{x}}^{⋆} (c) = 0$. Observe that $f = x^{2} (t) {(1 - ẋ (t))}^{2} \geq 0$, with equality only when $x = 0$ or $ẋ = 1$, and thus the functional is bounded below by $0$. This means that if we are able to find a function $\bar{x} \in \hat{𝒟}$ that satisfies the necessary conditions and achieves $F [\bar{x}] = 0$, then $\bar{x}$ is a global minimizer of the problem. From the boundary conditions we have $\bar{x} (- 1) = 0$ and $\bar{x} (1) = 1$, and from the Weierstrass-Erdmann condition we can only have corner points at time instants such that $\bar{x} (c) = 0$. We can then construct a function $\bar{x}$ that is $0$ from $- 1$ to $0$, where it has a corner point, and then grows linearly with constant derivative equal to $1$ up to $t = 1$, so that $\bar{x} (1) = 1$. More precisely:

\bar{x} (t) = \begin{cases} 0, & - 1 \leq t \leq 0 \\ t, & 0 < t \leq 1 \end{cases}

Such a function achieves the global minimum $F [\bar{x}] = 0$ and satisfies the necessary conditions (i.e., its only corner point is at $0$, where $\bar{x} (0) = 0$); hence it is a global minimum point for $F$ on $\hat{𝒟}$. It is moreover unique: $F [\hat{x}] = 0$ forces $\dot{\hat{x}} = 1$ wherever $\hat{x} \neq 0$, so a competitor can only leave zero along a line of unit slope, and reaching $\hat{x} (1) = 1$ requires leaving at $t = 0$. The global optimum is plotted in Figure 3.9.

Figure 3.9: Globally minimizing trajectory; note the corner point at $c = 0$.
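The conclusions of Example 3.12 can be checked numerically. The sketch below evaluates $F$ by a midpoint rule on the cornered candidate and on one smooth competitor, then tests both Weierstrass-Erdmann conditions at $c = 0$. The competitor $(t + 1)^{2} / 4$, the quadrature, and all function names are illustrative choices made here, not part of the notes.

```python
def f(x, xdot):
    """Lagrangian of Example 3.12: x^2 (1 - xdot)^2."""
    return x * x * (1.0 - xdot) ** 2


def f_xdot(x, xdot):
    """Partial derivative of f with respect to xdot."""
    return -2.0 * x * x * (1.0 - xdot)


def hamiltonian(x, xdot):
    """H = f - f_xdot * xdot, as defined in Remark 3.16."""
    return f(x, xdot) - f_xdot(x, xdot) * xdot


def cost(x, xdot, n=20000):
    """Midpoint-rule approximation of F[x] on [-1, 1]."""
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        t = -1.0 + (i + 0.5) * h
        total += f(x(t), xdot(t))
    return h * total


def corner(t):
    """Cornered candidate: zero up to t = 0, then unit slope."""
    return 0.0 if t <= 0.0 else t


def corner_dot(t):
    return 0.0 if t <= 0.0 else 1.0


def smooth(t):
    """A C^1 competitor with the same end conditions (illustrative)."""
    return (t + 1.0) ** 2 / 4.0


def smooth_dot(t):
    return (t + 1.0) / 2.0


F_corner = cost(corner, corner_dot)
F_smooth = cost(smooth, smooth_dot)
print("F[corner] =", F_corner, " F[smooth] =", F_smooth)

# Corner conditions at c = 0: x(0) = 0, left slope 0, right slope 1.
first_condition = f_xdot(0.0, 0.0) == f_xdot(0.0, 1.0)
second_condition = hamiltonian(0.0, 0.0) == hamiltonian(0.0, 1.0)
print(first_condition, second_condition)
```

The integrand vanishes identically along the cornered candidate, whereas the smooth competitor has a strictly positive cost (its exact value is $2 / 105$).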

Corollary 3.1: Absence of corner points

Consider the problem to minimize the functional:

F [\hat{x}] : = \int_{t_{1}}^{t_{2}} f (t, \hat{x} (t), \dot{\hat{x}} (t)) d t

on $\hat{𝒟} : = \{\hat{x} \in {\hat{𝒞}}^{1} {[t_{1}, t_{2}]}^{n_{x}} : \hat{x} (t_{1}) = x_{1}, \hat{x} (t_{2}) = x_{2}\}$. If $\frac{\partial f}{\partial \dot{x}} (t, y, z)$ is a strictly monotone function of $z \in ℝ^{n_{x}}$ (as is the case, for instance, when $f (t, y, z)$ is strictly convex in $z$ on $ℝ^{n_{x}}$), for each $(t, y) \in [t_{1}, t_{2}] \times ℝ^{n_{x}}$, then an extremal solution ${\hat{x}}^{⋆} (t)$ cannot have corner points.

Proof. By the first Weierstrass-Erdmann corner condition, the following holds at a corner point $c$:

{\frac{\partial f (c, {\hat{x}}^{⋆} (c), {\dot{\hat{x}}}^{⋆} (c^{-}))}{\partial \dot{x}}}^{⊤} = {\frac{\partial f (c, {\hat{x}}^{⋆} (c), {\dot{\hat{x}}}^{⋆} (c^{+}))}{\partial \dot{x}}}^{⊤} .

Define the vector-valued function $k (z) : = {\frac{\partial f (c, {\hat{x}}^{⋆} (c), z)}{\partial \dot{x}}}^{⊤}$. By hypothesis, $k$ is strictly monotone in $z$ and thus cannot take the same value twice. With $z_{1} : = {\dot{\hat{x}}}^{⋆} (c^{-})$ and $z_{2} : = {\dot{\hat{x}}}^{⋆} (c^{+})$, the Weierstrass-Erdmann condition can be written in terms of $k$ as:

k ({\dot{\hat{x}}}^{⋆} (c^{-})) = k ({\dot{\hat{x}}}^{⋆} (c^{+})) \quad \Longleftrightarrow \quad k (z_{1}) = k (z_{2})

but by definition of a corner point $z_{1} \neq z_{2}$, which contradicts the strict monotonicity of $k$. □

Example 3.13 (Minimum Path Problems). Consider again the problem in Example 3.8, that is, to minimize the distance between two fixed points, namely $A = (x_{1}, y_{1})$ and $B = (x_{2}, y_{2})$ in the $(x, y)$-plane. We have shown that extremal trajectories for this problem correspond to straight lines. But could we have extremal trajectories with corner points? The answer is no: the Lagrangian is $f = \sqrt{1 + ẋ^{2}}$, and $\frac{\partial f}{\partial ẋ} = \frac{ẋ}{\sqrt{1 + ẋ^{2}}}$ is a strictly monotone function of $ẋ$; hence we can apply Corollary 3.1 and conclude that extremal trajectories for this problem cannot have corner points.
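The monotonicity claim used in Example 3.13 can be spot-checked on a grid. The sketch below samples $z \mapsto z / \sqrt{1 + z^{2}}$ and verifies that consecutive values strictly increase; the sampling range and spacing are arbitrary choices made here, and a grid check is of course only a sanity test, not a proof (the derivative ${(1 + z^{2})}^{- 3 / 2} > 0$ settles the matter analytically).

```python
import math


def f_xdot(z):
    """Derivative of sqrt(1 + z^2) with respect to z."""
    return z / math.sqrt(1.0 + z * z)


# Uniform grid of slopes on [-50, 50] (illustrative range and spacing).
slopes = [-50.0 + 0.1 * i for i in range(1001)]
values = [f_xdot(z) for z in slopes]

# Strict monotonicity on the grid: every value exceeds its predecessor.
strictly_increasing = all(a < b for a, b in zip(values, values[1:]))
print(strictly_increasing)
```

Because the map is strictly increasing, two different slopes on either side of a candidate corner can never produce the same value of $\frac{\partial f}{\partial ẋ}$, which is exactly what Corollary 3.1 exploits.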

3.4.2 Weierstrass’ Necessary Conditions: Strong Minima

The Gâteaux derivatives of a functional are obtained by comparing its value at a point $x$ with those at points $x + α ξ$ in a weak norm neighborhood. In contrast to these (weak) variations, we now consider a new type of (strong) variations whose smallness does not imply that of their derivatives. In the scalar case⁶, we consider variations $Ŵ \in {\hat{𝒞}}^{1} [t_{1}, t_{2}]$ defined as:

Ŵ (t) : = \begin{cases} v (t - τ + δ) & \text{if } τ - δ \leq t \leq τ \\ v (- \sqrt{δ} (t - τ) + δ) & \text{if } τ \leq t < τ + \sqrt{δ} \\ 0 & \text{otherwise} \end{cases}

where $τ \in (t_{1}, t_{2})$, $v$ is a real coefficient, and $δ$ is a positive real coefficient such that $τ - δ > t_{1}$ and $τ + \sqrt{δ} < t_{2}$. Both nonzero branches take the value $v δ$ at $t = τ$, and the second one vanishes at $t = τ + \sqrt{δ}$, so that $Ŵ$ is continuous.

Figure 3.10: Strong variation $Ŵ (t)$ and its time derivative $\dot{Ŵ} (t)$

Note that strong variations of this kind depend on three parameters $v, δ, τ$ . Informally speaking $τ$ is the point at which the perturbation is centered while $δ$ determines the extension in time of the perturbation. Note that the conditions $τ - δ > t_{1}$ and $τ + \sqrt{δ} < t_{2}$ constrain the variation to lie within the open interval $(t_{1}, t_{2})$ . The parameter $v$ modulates the magnitude of the variation and of its derivative. We are now ready to state a set of necessary conditions for a strong local minimum, whose proof is based on the foregoing class of variations.
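The distinction between strong and weak smallness can be made concrete with a short sketch. It assumes that the ramp-down branch of $Ŵ$ is $v (- \sqrt{δ} (t - τ) + δ)$, which makes $Ŵ$ continuous at $τ$ and zero at $τ + \sqrt{δ}$; the interval $[0, 1]$, the values of $τ$ and $v$, and the function names are illustrative choices, not from the notes.

```python
import math


def W(t, tau, delta, v):
    """Strong variation: ramp up on [tau - delta, tau], then ramp down."""
    if tau - delta <= t <= tau:
        return v * (t - tau + delta)
    if tau < t < tau + math.sqrt(delta):
        # Assumed ramp-down branch; equals v*delta at tau and 0 at the end.
        return v * (-math.sqrt(delta) * (t - tau) + delta)
    return 0.0


def W_dot(t, tau, delta, v):
    """Derivative of W away from its corner points."""
    if tau - delta < t < tau:
        return v
    if tau < t < tau + math.sqrt(delta):
        return -v * math.sqrt(delta)
    return 0.0


def sup_norms(tau, delta, v, n=20000):
    """Grid estimates of max|W| and sup|W_dot| on [0, 1] (midpoint grid)."""
    h = 1.0 / n
    mids = [(i + 0.5) * h for i in range(n)]
    max_w = max(abs(W(t, tau, delta, v)) for t in mids)
    max_wdot = max(abs(W_dot(t, tau, delta, v)) for t in mids)
    return max_w, max_wdot


for delta in (1e-1, 1e-2, 1e-3):
    # max|W| shrinks like |v|*delta, while sup|W_dot| stays at |v|.
    print(delta, sup_norms(0.5, delta, 2.0))
```

As $δ \to 0$ the variation becomes arbitrarily small in the $∥ \cdot ∥_{\infty}$ norm, yet its derivative keeps magnitude $| v |$ on the ramp-up piece, so it does not become small in the $∥ \cdot ∥_{1, \infty}$ norm.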

Theorem 3.10: Weierstrass’ Necessary Condition

Consider the problem to minimize the functional:

F [\hat{x}] : = \int_{t_{1}}^{t_{2}} f (t, \hat{x} (t), \dot{\hat{x}} (t)) d t

on $\hat{𝒟} : = \{\hat{x} \in {\hat{𝒞}}^{1} {[t_{1}, t_{2}]}^{n_{x}} : \hat{x} (t_{1}) = x_{1}, \hat{x} (t_{2}) = x_{2}\}$. Suppose ${\hat{x}}^{⋆} (t)$, $t_{1} \leq t \leq t_{2}$, gives a strong (local) minimum for $F$ on $\hat{𝒟}$. Then,

ℰ (t, {\hat{x}}^{⋆}, {\dot{\hat{x}}}^{⋆}, v) : = f (t, {\hat{x}}^{⋆}, {\dot{\hat{x}}}^{⋆} + v) - f (t, {\hat{x}}^{⋆}, {\dot{\hat{x}}}^{⋆}) - \frac{\partial f (t, {\hat{x}}^{⋆}, {\dot{\hat{x}}}^{⋆})}{\partial \dot{x}} v \geq 0

at every $t \in [t_{1}, t_{2}]$ and for each $v \in ℝ^{n_{x}}$. ($ℰ$ is referred to as the excess function of Weierstrass.)

Proof. For the sake of clarity, we shall present and prove this condition for scalar functions $\hat{x} \in {\hat{𝒞}}^{1} [t_{1}, t_{2}]$ only. Let ${\hat{x}}_{δ} (t) : = {\hat{x}}^{⋆} (t) + Ŵ (t)$. Note that both $Ŵ$ and ${\hat{x}}^{⋆}$ being ${\hat{𝒞}}^{1}$ functions, so is ${\hat{x}}_{δ}$. These smoothness conditions are sufficient to calculate $F [{\hat{x}}_{δ}]$, as well as its derivative with respect to $δ$ at $δ = 0$. Note that ${\hat{x}}_{δ}$ and ${\hat{x}}^{⋆}$ differ only in the interval $[τ - δ, τ + \sqrt{δ}]$. Hence, by the definition of $Ŵ$, we have:

\begin{aligned} F [{\hat{x}}_{δ}] - F [{\hat{x}}^{⋆}] = \int_{τ - δ}^{τ + \sqrt{δ}} (f (t, {\hat{x}}_{δ} (t), {\dot{\hat{x}}}_{δ} (t)) - f (t, {\hat{x}}^{⋆} (t), {\dot{\hat{x}}}^{⋆} (t))) d t \\ = \int_{τ}^{τ + \sqrt{δ}} f (t, {\hat{x}}_{δ} (t), {\dot{\hat{x}}}_{δ} (t)) d t + \int_{τ - δ}^{τ} f (t, {\hat{x}}_{δ} (t), {\dot{\hat{x}}}_{δ} (t)) d t + \\ - \int_{τ - δ}^{τ + \sqrt{δ}} f (t, {\hat{x}}^{⋆} (t), {\dot{\hat{x}}}^{⋆} (t)) d t, \end{aligned}

note that we have split the integral at the points of discontinuity of the derivative of $Ŵ (t)$. The difference quotient is:

\begin{aligned} \frac{F [{\hat{x}}_{δ}] - F [{\hat{x}}^{⋆}]}{δ} = \\ \frac{1}{δ} \int_{τ - δ}^{τ} (f (t, {\hat{x}}^{⋆} (t) + v (t - τ + δ), {\dot{\hat{x}}}^{⋆} (t) + v) - f (t, {\hat{x}}^{⋆} (t), {\dot{\hat{x}}}^{⋆} (t))) d t + \\ \frac{1}{δ} \Big\{ \int_{τ}^{τ + \sqrt{δ}} f (t, {\hat{x}}^{⋆} (t) + v (- \sqrt{δ} (t - τ) + δ), {\dot{\hat{x}}}^{⋆} (t) - v \sqrt{δ}) d t \\ - \int_{τ}^{τ + \sqrt{δ}} f (t, {\hat{x}}^{⋆} (t), {\dot{\hat{x}}}^{⋆} (t)) d t \Big\} = I_{1}^{δ} + I_{2}^{δ} . \end{aligned}

The first term, $I_{1}^{δ}$, has the limit:

\begin{aligned} I_{1}^{0} = \lim_{δ \to 0} I_{1}^{δ} = \lim_{δ \to 0} \frac{1}{δ} \Big\{ - \int_{τ}^{τ - δ} f (t, {\hat{x}}^{⋆} (t) + v (t - τ + δ), {\dot{\hat{x}}}^{⋆} (t) + v) d t \\ + \int_{τ}^{τ - δ} f (t, {\hat{x}}^{⋆} (t), {\dot{\hat{x}}}^{⋆} (t)) d t \Big\} = \\ = f (τ, {\hat{x}}^{⋆} (τ), {\dot{\hat{x}}}^{⋆} (τ) + v) - f (τ, {\hat{x}}^{⋆} (τ), {\dot{\hat{x}}}^{⋆} (τ)), \end{aligned}

where we have applied Theorem 3.6. In order to analyse the second term we define $g (t) : = v (- (t - τ) + \sqrt{δ})$ , its time-derivative is $ġ (t) = - v$ and thus we have:

\begin{aligned} I_{2}^{0} = \lim_{δ \to 0} I_{2}^{δ} = \lim_{δ \to 0} \frac{1}{δ} \Big\{ \int_{τ}^{τ + \sqrt{δ}} f (t, {\hat{x}}^{⋆} (t) + \sqrt{δ} g (t), {\dot{\hat{x}}}^{⋆} (t) + \sqrt{δ} ġ (t)) d t \\ - \int_{τ}^{τ + \sqrt{δ}} f (t, {\hat{x}}^{⋆} (t), {\dot{\hat{x}}}^{⋆} (t)) d t \Big\}, \end{aligned}

using the first-order Taylor series expansion in $\sqrt{δ}$ under the integral sign we have:

\begin{aligned} f (t, {\hat{x}}^{⋆} (t) + \sqrt{δ} g (t), {\dot{\hat{x}}}^{⋆} (t) + \sqrt{δ} ġ) = \\ = f [{\hat{x}}^{⋆}] + \frac{\partial f [{\hat{x}}^{⋆}]}{\partial x} g (t) \sqrt{δ} + \frac{\partial f [{\hat{x}}^{⋆}]}{\partial ẋ} ġ (t) \sqrt{δ} + o (\sqrt{δ}) \end{aligned}

where the arguments have been compressed for notational simplicity. Therefore we have:

\lim_{δ \to 0} I_{2}^{δ} = \lim_{δ \to 0} \frac{1}{\sqrt{δ}} \int_{τ}^{τ + \sqrt{δ}} (\frac{\partial f [{\hat{x}}^{⋆}]}{\partial x} g (t) + \frac{\partial f [{\hat{x}}^{⋆}]}{\partial ẋ} ġ (t) + \frac{o (\sqrt{δ})}{\sqrt{δ}}) d t,

upon integration by parts of the term involving $ġ (t)$ we obtain:

\begin{aligned} \lim_{δ \to 0} I_{2}^{δ} = \lim_{δ \to 0} \Big\{ \frac{1}{\sqrt{δ}} \int_{τ}^{τ + \sqrt{δ}} (\frac{\partial f [{\hat{x}}^{⋆}]}{\partial x} - \frac{d}{d t} \frac{\partial f [{\hat{x}}^{⋆}]}{\partial ẋ}) g (t) d t \\ + \frac{1}{\sqrt{δ}} (\frac{\partial f [{\hat{x}}^{⋆}]}{\partial ẋ} g) |_{τ}^{τ + \sqrt{δ}} + \frac{o (\sqrt{δ})}{\sqrt{δ}} \Big\}, \end{aligned}

note that the term $\frac{\partial f [{\hat{x}}^{⋆}]}{\partial x} - \frac{d}{d t} \frac{\partial f [{\hat{x}}^{⋆}]}{\partial ẋ} = 0$, since the Euler equation is a necessary condition for optimality. Moreover, by definition of little-$o$, we have $\lim_{δ \to 0} \frac{o (\sqrt{δ})}{\sqrt{δ}} = 0$. Finally, the boundary term reduces to:

\frac{1}{\sqrt{δ}} (\frac{\partial f [{\hat{x}}^{⋆}]}{\partial ẋ} g) |_{τ}^{τ + \sqrt{δ}} = - \frac{\partial f [{\hat{x}}^{⋆} (τ)]}{\partial ẋ} v,

since $g (τ) = v \sqrt{δ}$ and $g (τ + \sqrt{δ}) = 0$. Since ${\hat{x}}^{⋆}$ is a local strong minimizer, we have $F [{\hat{x}}_{δ}] \geq F [{\hat{x}}^{⋆}]$ for all sufficiently small $δ$, and hence also in the limit

0 \leq \lim_{δ \to 0} \frac{F [{\hat{x}}_{δ}] - F [{\hat{x}}^{⋆}]}{δ} = I_{1}^{0} + I_{2}^{0},

and thus:

f (τ, {\hat{x}}^{⋆} (τ), {\dot{\hat{x}}}^{⋆} (τ) + v) - f (τ, {\hat{x}}^{⋆} (τ), {\dot{\hat{x}}}^{⋆} (τ)) - \frac{\partial f (τ, {\hat{x}}^{⋆} (τ), {\dot{\hat{x}}}^{⋆} (τ))}{\partial ẋ} v \geq 0

for every $τ \in (t_{1}, t_{2})$ and $v \in ℝ$. □
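The limit computed in the proof can be observed on a simple test case. The sketch below takes the Lagrangian $f = ẋ^{2}$ and the trajectory $x^{⋆} (t) = t$ (both illustrative choices made here, not from the notes), perturbs it with a strong variation whose slopes are $v$ on the ramp-up piece and $- v \sqrt{δ}$ on the ramp-down piece, and compares the difference quotient with the excess function, which equals $v^{2}$ for this Lagrangian.

```python
import math


def f(xdot):
    """Test Lagrangian f = xdot^2 (illustrative; depends on xdot only)."""
    return xdot * xdot


def excess(xdot, v):
    """Weierstrass excess function for f = xdot^2, using f_xdot = 2*xdot."""
    return f(xdot + v) - f(xdot) - 2.0 * xdot * v


def integrate(g, a, b, n=2000):
    """Midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


def quotient(tau, delta, v, xdot_star=1.0):
    """(F[x* + W] - F[x*]) / delta for x*(t) = t, i.e. slope 1 everywhere."""
    root = math.sqrt(delta)
    # Ramp-up piece [tau - delta, tau]: perturbed slope is xdot* + v.
    up = integrate(lambda t: f(xdot_star + v) - f(xdot_star),
                   tau - delta, tau)
    # Ramp-down piece [tau, tau + sqrt(delta)]: slope xdot* - v*sqrt(delta).
    down = integrate(lambda t: f(xdot_star - v * root) - f(xdot_star),
                     tau, tau + root)
    return (up + down) / delta


v = 0.7
for delta in (1e-2, 1e-4, 1e-6):
    # The quotient approaches the excess function as delta shrinks.
    print(delta, quotient(0.5, delta, v), excess(1.0, v))
```

For this Lagrangian the quotient works out to $v^{2} (1 + \sqrt{δ})$: the first-order contributions of the two pieces cancel, leaving the excess function in the limit, in agreement with $I_{1}^{0} + I_{2}^{0}$.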

Example 3.14 (Minimum path problem II). Consider again the problem in Example 3.8, that is, to minimize the distance between two fixed points, namely $A = (x_{1}, y_{1})$ and $B = (x_{2}, y_{2})$ in the $(x, y)$-plane. We have shown that extremal trajectories for this problem correspond to straight lines joining $A$ and $B$, that is:

y^{⋆} (x) = C_{1} x + C_{2}

where $C_{1} = \frac{y_{2} - y_{1}}{x_{2} - x_{1}}$ and $C_{2} = y_{1} - C_{1} x_{1}$. We now ask whether $y^{⋆} (x)$ is a strong minimum for that problem. The Weierstrass excess function is:

ℰ (x, y^{⋆}, ẏ^{⋆}, v) = \sqrt{1 + {(C_{1} + v)}^{2}} - \sqrt{1 + C_{1}^{2}} - \frac{C_{1} v}{\sqrt{1 + {(C_{1})}^{2}}},

note that in this simple case it can be regarded as a function of $C_{1}$ and $v$ only. The excess function is plotted as a function of $v$ for different values of $C_{1}$ in Figure 3.11. It is easily checked that it is always nonnegative; therefore $y^{⋆}$ satisfies the Weierstrass necessary condition for a strong local minimum. In fact, we note that the function $g (z) = \sqrt{1 + z^{2}}$ is convex⁷. The Weierstrass condition is equivalent to:

g (v + C_{1}) - g (C_{1}) - {\frac{d g}{d z} |}_{z = C_{1}} v \geq 0,

which holds true for every $C_{1}, v \in ℝ$, being precisely the first-order characterization of convexity of $g$.
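The nonnegativity of the excess function in Example 3.14 can be spot-checked numerically. The sketch below evaluates $ℰ$ over a grid of slopes $C_{1}$ and perturbations $v$; the grid values and the names `arc` and `excess` are arbitrary choices made here, and a small negative tolerance is used in the check to allow for floating-point rounding near $v = 0$.

```python
import math


def arc(z):
    """Arc-length integrand g(z) = sqrt(1 + z^2)."""
    return math.sqrt(1.0 + z * z)


def excess(c1, v):
    """Weierstrass excess function of Example 3.14 at slope c1."""
    return arc(c1 + v) - arc(c1) - c1 * v / arc(c1)


# Illustrative sample of slopes and perturbations.
slopes = [-3.0, -1.0, 0.0, 0.5, 2.0]
perturbations = [-10.0 + 0.05 * i for i in range(401)]

smallest = min(excess(c1, v) for c1 in slopes for v in perturbations)
print("smallest sampled excess:", smallest)
```

The sampled minimum sits at (or within rounding of) zero, attained for $v = 0$, as expected from the first-order characterization of convexity.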

The following corollary indicates that the Weierstrass condition is also useful to detect strong (local) minima in the class of $𝒞^{1}$ functions.

Corollary 3.2: Weierstrass’ Necessary Condition

Consider the problem to minimize the functional:

\begin{matrix} \min_{x (t)} & F [x] = \int_{t_{1}}^{t_{2}} f (t, x (t), \dot{x} (t)) d t \\ \text{s.t.} & x \in 𝒟 \end{matrix}

where the functional space is defined as $𝒟 = \{x \in 𝒞^{1} {[t_{1}, t_{2}]}^{n_{x}} : x (t_{1}) = x_{1}, x (t_{2}) = x_{2}\}$ and $f : ℝ \times ℝ^{n_{x}} \times ℝ^{n_{x}} \to ℝ$ is a continuously differentiable function. Suppose that $x^{⋆} \in 𝒟$ gives a strong (local) minimum for $F$ on $𝒟$.
Then:

ℰ (t, x^{⋆}, {\dot{x}}^{⋆}, v) : = f (t, x^{⋆}, {\dot{x}}^{⋆} + v) - f (t, x^{⋆}, {\dot{x}}^{⋆}) - \frac{\partial f (t, x^{⋆}, {\dot{x}}^{⋆})}{\partial \dot{x}} v \geq 0

at every $t \in [t_{1}, t_{2}]$ and for each $v \in ℝ^{n_{x}}$ .

Proof. By Theorem 3.8 we have that $x^{⋆}$ is a local strong minimizer on $\hat{𝒟}$ as well, therefore Theorem 3.10 applies. □

Remark 3.17: Weierstrass’ Condition and Convexity

It is readily seen that the Weierstrass condition of Theorem 3.10 is satisfied automatically when the Lagrangian function $f (t, y, z)$ is partially convex (and continuously differentiable) in $z \in ℝ^{n_{x}}$ , for each $(t, y) \in [t_{1}, t_{2}] \times ℝ^{n_{x}}$ .

Remark 3.18: Weierstrass’ Condition and Pontryagin’s Maximum Principle

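Interestingly enough, Weierstrass’ condition can be rewritten as:

\begin{aligned} f (t, x^{⋆} (t), {\dot{x}}^{⋆} (t) + v) - \frac{\partial f (t, x^{⋆} (t), {\dot{x}}^{⋆} (t))}{\partial \dot{x}} ({\dot{x}}^{⋆} (t) + v) \\ \geq f (t, x^{⋆} (t), {\dot{x}}^{⋆} (t)) - \frac{\partial f (t, x^{⋆} (t), {\dot{x}}^{⋆} (t))}{\partial \dot{x}} {\dot{x}}^{⋆} (t) \end{aligned}

which, with the multiplier $p^{⋆} (t) : = \frac{\partial f (t, x^{⋆} (t), {\dot{x}}^{⋆} (t))}{\partial \dot{x}}$ held fixed and the Hamiltonian taken in the sign convention of optimal control, $H (t, x, \dot{x}) : = p^{⋆} (t) \dot{x} - f (t, x, \dot{x})$ (the negative of the expression used in Remark 3.16), gives:

H (t, x^{⋆} (t), {\dot{x}}^{⋆} (t) + v) \leq H (t, x^{⋆} (t), {\dot{x}}^{⋆} (t))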

for each $t \in [t_{1}, t_{2}]$ and for each $v \in ℝ^{n_{x}}$ . This necessary condition prefigures Pontryagin’s Maximum Principle in optimal control theory.
