Notes | Mechatronic Systems and Laboratory

2.3 Definitions

A variety of different definitions of optimality are used in different contexts. It is important to understand fully each definition and the context within which it is appropriately used.

2.3.1 Infimum and Supremum

Let $S \subset ℝ$ be a nonempty set.

Definition 2.8: Infimum/Supremum

The infimum of $S$ , denoted as $\inf S$ , provided it exists, is the greatest lower bound for $S$ , i.e. a number $α$ satisfying:

$z \geq α \forall z \in S$ ,
$\forall \bar{α} > α, \exists z \in S$ such that $z < \bar{α}$ .

Similarly, the supremum of $S$ , denoted as $\sup S$ , provided it exists, is the least upper bound for $S$ , i.e. a number $α$ satisfying:

$z \leq α \forall z \in S$ ,
$\forall \bar{α} < α, \exists z \in S$ such that $z > \bar{α}$ .

The first question one may ask concerns the existence of infima and suprema in $ℝ$ . In particular, one cannot prove that in $ℝ$ every set bounded from above has a supremum and every set bounded from below has an infimum. This is an axiom, known as the axiom of completeness:

Axiom 2.1: of Completeness

If a nonempty subset of real numbers has an upper bound, then it has a least upper bound. If a nonempty subset of real numbers has a lower bound, it has a greatest lower bound.

Remark 2.6

It is important to note that the real number $\inf S$ (respectively $\sup S$ ), with $S$ a nonempty set in $ℝ$ bounded from below (respectively from above), although it exist, need not be an element of $S$ .

Example 2.2. Let $S = (0, + \infty) = {z \in ℝ : z > 0}$ . Clearly, $\inf S = 0$ and $0 \notin S$ .

Notation 2.1

Let $S : = f (x) : x \in D$ be the image of the feasible set $D \subset ℝ$ of an optimization problem under the objective function $f$ . Then, the notation

\begin{matrix} \inf_{x \in D} f (x) & or & \inf {f (x) : x \in D} \end{matrix}

refers to the number $\inf S$ . Likewise, the notation

\begin{matrix} \sup_{x \in D} f (x) & or & \sup {f (x) : x \in D} \end{matrix}

refers to $\sup S$ .

Clearly, the numbers $\inf S$ and $\sup S$ may not be attained by the value $f (x)$ at any $x \in D$ . This is illustrated in an example below.

Example 2.3. Clearly, $\inf {\exp (x) : x \in (0, + \infty)} = 1$ , but $\exp (x) > 1$ for all $x \in (0, + \infty)$ .

Remark 2.7

By convention, the infimum of an empty set is $+ \infty$ while the supremum of an empty set is $- \infty$ . That is, if the values $\pm \infty$ are allowed, then infima and suprema always exist.

2.3.2 Minimum and Maximum

Consider the standard problem formulation

\min_{x \in D} f (x)

where $D \subset ℝ^{n}$ denotes the feasible set. Any $x \in D$ is said to be a feasible point. Conversely, any $x \in ℝ^{n} ∖ D : = {x \in ℝ^{n} : x \notin D}$ is said to be an infeasible point.

Definition 2.9: (Global) Minimum, Strict (Global) Minimum

A point $x ⋆ \in D$ is said to be a (global) ^a minimum of $f$ on $D$ if

f (x) \geq f (x ⋆) \forall x \in D

(2.1)

i.e. a minimum is a feasible point whose objective function value is less than or equal to the objective function value of all other feasible points. It is said to be a strict (global) minimum of $f$ on $D$ if

f (x) > f (x ⋆) \forall x \in D, x \neq x ⋆

A (global) maximum is defined by reversing the inequality in the above definition:

Definition 2.10: (Global) Maximum, Strict (Global) Maximum

A point $x ⋆ \in D$ is said to be a (global) maximum of $f$ on $D$ if

f (x) \leq f (x ⋆) \forall x \in D

(2.2)

It is said to be a strict (global) maximum of $f$ on $D$ if

f (x) < f (x ⋆) \forall x \in D, x \neq x ⋆

Remark 2.8

The important distinction between minimum (respectively maximum) and infimum (respectively supremum) is that the value $\min {f (x) : x \in D}$ must be attained at one or more points $x \in D$ , whereas the value $\inf {f (x) : x \in D}$ does not necessarily have to be attained at any points $x \in D$ . Yet, if a minimum (respectively maximum) exists, then its optimal value will equal the infimum (respectively supremum).

Remark 2.9

Note also that if a minimum exists, it is not necessarily unique. That is, there may be multiple or even an infinite number of feasible points that satisfy the inequality 2.1 and are thus minima.

Notation 2.2

Since there is in general a set of points that are minima, the notation

\arg \min {f (x) : x \in D} : = {x \in D : f (x) = \inf {f (x) : x \in D}}

is introduced to denote the set of minima. This is a (possibly empty) set ^a ^a The notation $\bar{x} = \arg \min {f (x) : x \in D}$ is also used by some authors. In this case, $\arg \min {f (x) : x \in D}$ should be understood as a function returning a point $\bar{x}$ that minimizes $f$ on $D$ . in $ℝ^{n}$ .

Remark 2.10

A minimum $x ⋆$ is often referred to as an optimal solution, a global optimal solution, or simply a solution of the optimization problem. The real number $f (x ⋆)$ is known as the (global) optimal value or optimal solution value.

Notation 2.3

Regardless of the number of minima, there is always a unique real number that is the optimal value (if it exists). The notation $m i n {f (x) : x \in D}$ is used to refer to this real value.

Unless the objective function $f$ and the feasible set $D$ possess special properties (e.g. convexity), it is usually very hard to devise algorithms that are capable of locating or estimating a global minimum or a global maximum with certainty. This motivates the definition of local minima and maxima, which, by the nature of their definition in terms of local information, are much more convenient to locate with an algorithm.

Definition 2.11: Local Minimum, Strict Local Minimum

A point $x ⋆ \in D$ is said to be a local minimum of $f$ on $D$ if

\exists 𝜀 > 0 such that f (x) \geq f (x ⋆) \forall x \in ℬ_{𝜀} (x ⋆) \cap D

A point $x ⋆ \in D$ is said to be a strict local minimum of $f$ on $D$ if

\exists 𝜀 > 0 such that f (x) > f (x ⋆) \forall x \in ℬ_{𝜀} (x ⋆) \cap D

The qualifier ’local’ originates from the requirement that $x ⋆$ be a minimum only for those feasible points in a neighbourhood around the local minimum.

Remark 2.11

Trivially, the property of $x ⋆$ being a global minimum implies that $x ⋆$ is also a local minimum because a global minimum is local minimum with $𝜀$ set arbitrarily large.

Definition 2.12: Local Maximum, Strict Local Maximum

A point $x ⋆ \in D$ is said to be a local maximum of $f$ on $D$ if

\exists 𝜀 > 0 such that f (x) \leq f (x ⋆) \forall x \in ℬ_{𝜀} (x ⋆) \cap D

A point $x ⋆ \in D$ is said to be a strict local maximum of $f$ on $D$ if

\exists 𝜀 > 0 such that f (x) < f (x ⋆) \forall x \in ℬ_{𝜀} (x ⋆) \cap D

The qualifier ’local’ originates from the requirement that $x ⋆$ be a maximum only for those feasible points in a neighbourhood around the local maximum.

Remark 2.12

Trivially, the property of $x ⋆$ being a global maximum implies that $x ⋆$ is also a local maximum because a global maximum is local maximum with $𝜀$ set arbitrarily large.

Remark 2.13

It is important to note that the concept of a global minimum or a global maximum of a function on a set is defined without the notion of a distance (or a norm in the case of a vector space). In contrast, the definition of a local minimum or a local maximum requires that a distance be specified on the set of interest. In $ℝ_{x}^{n}$ , norms are equivalent, and it is readily shown that local minima (respectively maxima) in ( $ℝ_{x}^{n}$ , $∥ \cdot ∥_{α}$ ) match local minima (respectively maxima) in (( $ℝ_{x}^{n}$ , $∥ \cdot ∥_{β}$ ), for any two arbitrary norms $∥ \cdot ∥_{α}$ and $∥ \cdot ∥_{β}$ in $ℝ_{x}^{n}$ (e.g. the Euclidean norm $∥ \cdot ∥_{2}$ and the infinite norm $∥ \cdot ∥_{\infty}$ ). Yet, this nice property does not hold in linear functional spaces, as those encountered in problems of the calculus of variations and optimal control.

Figure ?? illustrates the various definitions of minima and maxima. Point $x_{1}$ is the unique global maximum; the objective value at this point is also the supremum. Points $a$ , $x_{2}$ , and $b$ are strict local minima because there exists a neighbourhood around each of these point for which $a$ , $x_{2}$ , or $b$ is the unique minimum (on the intersection of this neighbourhood with the feasible set $D$ ). Likewise, point $x_{3}$ is a strict local maximum. Point $x_{4}$ is the unique global minimum; the objective value at this point is also the infimum. Finally, point $x_{5}$ is simultaneously a local minimum and a local maximum because there are neighbourhoods for which the objective function remains constant over the entire neighbourhood; it is neither a strict local minimum, nor a strict local maximum.

Example 2.4. Consider the function

f (x) = {\begin{matrix} + 1 & if x < 0 \\ - 1 & otherwise \end{matrix}

The point $x^{⋆} = 0$ is a local minimum for

\min_{x \in [- 2, 2]} f (x)

with value $f (x^{⋆}) = - 1$ . The optimal value is $- 1$ and $\arg \min {f (x) : x \in [- 2, 2]} = [0, 2]$ .

2.3.3 Existence of Minima and Maxima

A crucial question when it comes to optimizing a function on a given set, is whether a minimizer or a maximizer exist for that function in that set. Strictly, a minimum or maximum should only be referred to when it is known to exist.

Figure ?? illustrates three instances where a minimum does not exist. In figure ??a, the infimum of $f$ over $S : = (a, b)$ is given by $f (b)$ , but since $S$ is not closed and, in particular, $b \notin S$ , a minimum does not exist. In figure ??b, the infimum of $f$ over $S : = [a, b]$ is given by the limit of $f (x)$ as $x$ approaches $c$ from the left, i.e. $\inf {f (x) : x \in S} = l i m_{x \to c^{-}} f (x)$ . However, because $f$ is discontinuous at $c$ , a minimizing solution does not exist. Finally, figure ??c illustrates a situation within which $f$ is unbounded over the unbounded set $S : = {x \in ℝ : x \geq a}$ .

We now formally state and prove the result that if $S$ is nonempty, closed, and bounded, and if $f$ is continuous on $S$ , then, unlike the various situations of figure ??, a minimum exists.

Theorem 2.1: Weierstrass’ Theorem

Let $S$ be a nonempty compact set and let $f : S \to ℝ$ be continuous on $S$ . Then, the problem $\min {f (x) : x \in S}$ attains its minimum, that is, there exists a minimizing solution to this problem.

Proof. Since $f$ is continuous on $S$ and $S$ is both closed and bounded, $f$ is bounded below on $S$ . Consequently, since $S \neq \emptyset$ , there exists a greatest lower bound $α : = \inf {f (x) : x \in S}$ (see previous axiom). Now, let $0 < 𝜀 < 1$ , and consider the set $S_{k} : = {x \in S : α \leq f (x) \leq α + 𝜀^{k}}$ for $k = 1, 2, . . .$ . By the definition of an infimum, $S_{k} \neq \emptyset$ for each $k$ , and so we may construct a sequence of points ${x_{k}} \subset S$ by selecting a point $x_{k}$ for each $k = 1, 2, . . .$ . Since $S$ is bounded, there exists a convergent subsequence ${x_{k}}_{𝒦} \subset S$ indexed by the set $𝒦 \subset ℕ$ . Let $\bar{x}$ denote its limit. By the closedness of $S$ , we have $\bar{x} \in S$ and by the continuity of $f$ on $S$ , since $α \leq f (x k) \leq α + 𝜀^{k}$ , we have $α = l i m_{k \to \infty, k \in 𝒦} f (x k) = f (\bar{x})$ . Hence, we have shown that there exist a solution $\bar{x} \in S$ such that $f (\bar{x}) = α = \inf {f (x) : x \in S}$ , i.e. $\bar{x}$ is a minimizing solution. □

Remark 2.14

The hypotheses of theorem 2.1 can be justified as follows:

the feasible set must be nonempty, otherwise there are no feasible points at which to attain the minimum;
the feasible set must contain its boundary points, which is ensured by assuming that the feasible set is closed;
the objective function must be continuous on the feasible set, otherwise the limit at a point may not exist or be different from the value of the function at that point;
the feasible set must be bounded because otherwise even a continuous function can be unbounded on the feasible set.

Example 2.5. Theorem 2.1 establishes that a minimum (and a maximum) of

\min_{x \in [- 1, 1]} x^{2}

exists, since $[- 1, 1]$ is a nonempty compact set and $x \mapsto x^{2}$ is a continuous function on $[- 1, 1]$ . On the other hand, minima can still exist even though the set is not compact or the function is not continuous, for theorem 2.1 only provides a sufficient condition. This is the case for the problem

\min_{x \in (- 1, 1)} x^{2}

which has a minimum at $x = 0$ .

Example 2.6. Consider the NLP problem

\begin{matrix} \min_{x} & {(x_{1} - 3)}^{2} + {(x_{2} - 2)}^{2} \\ s.t. & x_{1}^{2} - x_{2} - 3 \leq 0 \\ x_{2} - 1 \leq 0 \\ - x_{1} \leq 0 \end{matrix}

The objective function being continuous and the feasible region being nonempty, closed and bounded, the existence of a minimum to this problem directly follows from theorem 2.1 .