Convex Sets and Functions - Lecture Notes

Convex Sets

  • A set CRNC ∈ R^N is convex if, for any two points x,yCx, y ∈ C, the line segment joining xx and yy is contained in CC.

    • Mathematically, this means for any x,yCx, y ∈ C, the set  [x,y]:=tx+(1t)y:t[0,1]\ [x, y] := {tx + (1 − t)y : t ∈ [0, 1]} is a subset of CC.

Convex Sets in R

  • In the real number line R\mathbb{R}, convex sets are intervals. Examples include:

    • (a,b)(a, b)

    • [a,b)[a, b)

    • [a,b][a, b]

    • [a,+)[a, +∞)

    • (,a)(−∞, a)

Norm Balls

  • A norm ball with center x<em>cRnx<em>c ∈ R^n and radius r0r ≥ 0 is defined as the set  xRnxx</em>cr\ {x ∈ R^n | ∥x − x</em>c∥ ≤ r}.

    • Euclidean norm ball: B<em>2(x</em>c,r)=xxx<em>c</em>2r=x<em>i=1n(x</em>ixc,i)2rB<em>2(x</em>c ,r) = {x | ∥x−x<em>c∥</em>2 ≤ r} = { x | \sqrt{\sum<em>{i=1}^{n} (x</em>i − x_{c,i})^2} ≤ r }

    • <em>1\ell<em>1-norm ball: B</em>1(x<em>c,r)=xxx</em>c<em>1r=x</em>i=1nx<em>ix</em>c,irB</em>1(x<em>c ,r) = {x | ∥x−x</em>c∥<em>1 ≤ r} = {x | \sum</em>{i=1}^{n} |x<em>i − x</em>{c,i}| ≤ r}

    • <em>\ell<em>∞-norm ball: B</em>(x<em>c,r)=xxx</em>c<em>r=xmax</em>i=1,,nx<em>ix</em>c,irB</em>∞(x<em>c ,r) = {x | ∥x−x</em>c∥<em>∞ ≤ r} = {x | \max</em>{i=1,…,n} |x<em>i − x</em>{c,i}| ≤ r}

Norm Balls - Ellipsoids

  • An ellipsoid is a set of the form  x(xx<em>c)tA(xx</em>c)1\ {x|(x − x<em>c )^tA(x − x</em>c ) ≤ 1}, where AA is a symmetric positive definite matrix.

Affine Sets

  • Lines in R2R^2:  x=(x<em>1,x</em>2)R2a<em>1x</em>1+a<em>2x</em>2=b\ {x = (x<em>1, x</em>2) ∈ R^2 | a<em>1x</em>1 + a<em>2x</em>2 = b}, where a=(a<em>1,a</em>2)a = (a<em>1, a</em>2) is the normal to the line.

  • Planes in R3R^3:  x=(x<em>1,x</em>2,x<em>3)R3aTx=a</em>1x<em>1+a</em>2x<em>2+a</em>3x<em>3=b\ {x = (x<em>1, x</em>2, x<em>3) ∈ R^3 | a^T x = a</em>1x<em>1 + a</em>2x<em>2 + a</em>3x<em>3 = b}, where a=(a</em>1,a<em>2,a</em>3)a = (a</em>1, a<em>2, a</em>3) is the normal to the plane.

  • Lines in R3R^3 (as the intersection of two planes):  x=(x<em>1,x</em>2,x3)R3Ax=b\ {x = (x<em>1, x</em>2, x_3) ∈ R^3 | Ax = b}, where

    A=[a<em>11amp;a</em>12amp;a<em>13 a</em>21amp;a<em>22amp;a</em>23]andb=[b<em>1 b</em>1]A = \begin{bmatrix} a<em>{11} &amp; a</em>{12} &amp; a<em>{13} \ a</em>{21} &amp; a<em>{22} &amp; a</em>{23} \end{bmatrix} \text{and} b = \begin{bmatrix} b<em>1 \ b</em>1 \end{bmatrix}.

    • The rows a<em>1,a</em>2R3a<em>1, a</em>2 ∈ R^3 of AA are the normals to the planes.

  • The solution set of underdetermined linear equations:  xAx=b\ {x | Ax = b}.

  • The set of solutions of linear equations is of the form  x<em>p+x</em>h:x<em>hker(A)=x</em>p+ker(A)\ {x<em>p + x</em>h : x<em>h ∈ ker(A)} = x</em>p + ker(A), which are vector spaces shifted by adding a constant vector xpx_p.

Convex Cones

  • A convex cone is a set CC that is closed under positive linear combinations. If x,yCx, y ∈ C, then αx+βyC\alpha x + \beta y ∈ C for any α0,β0\alpha ≥ 0, \beta ≥ 0.

  • Given a set of vectors SS, the convex cone generated by SS is the set C<em>S=</em>i=1nα<em>ix</em>i:α<em>1,,α</em>n0,x<em>1,,x</em>nCC<em>S = { \sum</em>{i=1}^{n} \alpha<em>i x</em>i : \alpha<em>1, …, \alpha</em>n ≥ 0, x<em>1, …, x</em>n ∈ C}.

Convex Combination and Convex Hull

  • A convex combination of x<em>1,...,x</em>kx<em>1, . . . , x</em>k is any point xx of the form x=t<em>1x</em>1+t<em>2x</em>2++t<em>kx</em>kx = t<em>1x</em>1 + t<em>2x</em>2 + · · · + t<em>k x</em>k with t<em>1+t</em>2++t<em>k=1t<em>1 + t</em>2 + · · · + t<em>k = 1, and t</em>i0t</em>i ≥ 0.

  • The convex hull of a set SS is the set of all convex combinations of points in SS.

Hyperplanes and Half-Spaces

  • A hyperplane is a set of the form  xatx=b\ {x | a^tx = b}, where a0a ≠ 0.

  • A half-space is a set of the form  xatxb\ {x | a^tx ≤ b}, where a0a ≠ 0.

  • In both cases, aa is the normal vector.

Polytope

  • A polytope is a set of the form  xAxb\ {x | Ax ⪯ b}, where ARm×nA ∈ R^{m×n}.

  • The notation represents component-wise inequality: a<em>ixb</em>ia<em>ix ≤ b</em>i, where aia_i is the ii-th row of AA.

  • A polytope is an intersection of mm half-spaces.

Operations That Preserve Convexity

  • The intersection of any number of convex sets is convex.

  • If f:RnRnf : R^n → R^n is an affine function (i.e., it can be written as f(x)=Ax+bf (x) = Ax + b with AMn,m(R),bRmA ∈ M_{n,m}(R), b ∈ R^m), then:

    • If CRnC ⊆ R^n is convex, then f(C)=f(x)RmxCf (C) = {f (x) ∈ R^m | x ∈ C} is a convex set.

    • If CRmC ⊆ R^m is convex, then f1(C)=xRnf(x)Cf^{-1}(C) = {x ∈ R^n | f (x) ∈ C} is a convex set.

  • To verify if a set is convex, express it in terms of simpler known convex sets using operations that preserve convexity.

Convex Functions

  • A function f:CRf : C → R is convex if f(tx+(1t)y)tf(x)+(1t)f(y)f (tx + (1 − t)y) ≤ tf (x) + (1 − t)f (y), for any two points x,yCx, y ∈ C, and t[0,1]t ∈ [0, 1].

  • ff is concave if f-f is convex.

  • A function f:CRf : C → R is strictly convex if f (tx + (1 − t)y) < tf (x) + (1 − t)f (y), for any two points x,yCx, y ∈ C and any t[0,1]t ∈ [0, 1].

Examples of Convex and Concave Functions on R

  • Convex functions:

    • Quadratic functions: x2x^2 on R\mathbb{R}.

    • Exponential: eaxe^{ax} is convex on R\mathbb{R}, for any aRa ∈ R.

    • Powers: xαx^α for x > 0 and α1\alpha ≥ 1 or α0\alpha ≤ 0.

    • Powers of absolute value: xp|x|^p on R\mathbb{R}, for p1p ≥ 1.

    • Negative entropy: xlogxx \log x for x > 0.

    • Affine functions: ax+bax + b on R\mathbb{R}, for any a,bRa, b ∈ R.

  • Concave functions:

    • Affine functions: ax+bax + b on R\mathbb{R}, for any a,bRa, b ∈ R (Convex and concave!).

    • Powers: xαx^α for x > 0 and 0α10 ≤ \alpha ≤ 1.

    • Logarithm: logx\log x for x > 0.

Examples of Convex Functions on Rn

  • Affine functions: f(x)=x,b+cf (x) = ⟨x, b⟩ + c.

  • Norms: Every norm on RnR^n is convex. In particular, any <em>p\ell<em>p norm,  ∥x</em>p=(<em>i=1nx</em>ip)1/p\ ∥x∥</em>p = (\sum<em>{i=1}^{n} |x</em>i|^p)^{1/p} is convex.

  • Max function: f(x)=maxx<em>1,...,x</em>nf (x) = \max{x<em>1, . . . , x</em>n}.

  • Quadratic-over-linear function: f(x<em>1,x</em>2)=x<em>12x</em>2f (x<em>1, x</em>2) = \frac{x<em>1^2}{x</em>2}, with x_2 > 0.

Epigraph, Sublevel and Superlevel Sets

  • α\alpha-sublevel set of f:RnRf : R^n → R: Cα=xdom(f)f(x)αC_α = {x ∈ dom(f )| f (x) ≤ α}.

    • Sublevel sets of convex functions are convex.

    • Superlevel sets of concave functions are convex.

    • Functions with all their sub-level sets convex are not necessarily convex. These broader class of functions are called quasi-convex.

  • Epigraph of f:RnRf : R^n → R: epi(f)=(x,t)Rn+1xdom(f),f(x)tepi(f ) = {(x,t) ∈ R^{n+1} | x ∈ dom(f ), f (x) ≤ t}.

    • ff is a convex function if and only if its epigraph is a convex set.

Sufficient and Necessary Condition for Convexity

  • A function f:RnRf : R^n → R is convex if and only if the function g:RRg : R → R, g(t)=f(x+tv)g(t) = f (x + tv) is convex for any x,vRnx, v ∈ R^n.

  • Using this property the problem is reduced to check the convexity of function of one variable.

First Order Condition for Convexity

  • Let f:RnRmf : R^n → R^m be a differentiable function and let CC be a convex subset of RnR^n. Then:

    f is convex f(y)f(x)+f(x),yx,x,yC.f \text{ is convex } ⇐⇒ f (y) ≥ f (x) + ⟨∇f (x), y − x⟩, ∀x, y ∈ C.

    • The first order Taylor approximation of ff is a global underestimator.

    • Moreover, ff is strictly convex if and only if:

    f (y) > f (x) + ⟨∇f (x), y − x⟩, ∀x, y ∈ C.

Second Order Condition for Convexity

  • Let f:RnRf : R^n → R be a twice differentiable function and let CC be a convex subset of RnR^n.

    f is convex 2f(x) is positive semi-definite for all xC (all eigenvalues of 2f(x) are non-negative).f \text{ is convex } ⇐⇒ ∇^2f (x) \text{ is positive semi-definite for all } x ∈ C \text{ (all eigenvalues of } ∇^2f (x) \text{ are non-negative)}.

    f is strictly convex 2f(x) is positive definite for all xC (all eigenvalues are positive).f \text{ is strictly convex } ⇐⇒ ∇^2f (x) \text{ is positive definite for all } x ∈ C \text{ (all eigenvalues are positive)}.

Is a Quadratic Function a Convex Function?

  • Let f(x)=12x,Qxx,d+cf (x) = \frac{1}{2} ⟨x, Qx⟩ − ⟨x, d⟩ + c, with AA a symmetric matrix.

    • Then, f(x)=Qxd∇f (x) = Qx − d and 2f(x)=Q∇^2f (x) = Q.

    • f(x)f (x) is convex if the matrix AA is positive semi-definite.

    • A special case: f(x)=12Axb22=12x,AtAx2x,b+b,bf (x) = \frac{1}{2} ∥Ax − b∥_2^2 = \frac{1}{2} ⟨x, A^tAx⟩ − 2⟨x, b⟩ + ⟨b, b⟩.

      • Then, f(x)=At(Axb)∇f (x) = A^t(Ax − b) and 2f(x)=AtA∇^2f (x) = A^tA.

      • f(x)f (x) is convex for any matrix AA because AtAA^tA is symmetric and positive semidefinite.

Establishing the Convexity of a Function

  • In practice, for a general function ff, we either:

    1. Show that ff is obtained from simple convex functions and operations that preserve convexity.

    2. Verify the definition (restricted to a line).

    3. If the function is twice differentiable, show that 2f(x)∇^2f (x) is positive semidefinite.

    4. Verify the definition.

Operations That Preserve Convexity

  • Nonnegative multiple: αf\alpha f is convex if ff is convex and \alpha > 0.

  • Sum: f<em>1+f</em>2f<em>1 + f</em>2 is convex if f<em>1,f</em>2f<em>1, f</em>2 are convex. This is also true for infinite sums and for integrals.

  • Composition with affine function: Let f:RnRf : R^n → R be a convex function, AMn,m(R),bRnA ∈ M_{n,m}(R), b ∈ R^n. Then, f(Ax+b)f (Ax + b) is a convex function.

    • Example: any norm of an affine function f(x)=Ax+bf (x) = ∥Ax + b∥ is convex.

  • Pointwise maximum: If f<em>1,...,f</em>mf<em>1, . . . , f</em>m are convex, then f(x)=maxf<em>1(x),...,f</em>m(x)f (x) = \max{f<em>1(x), . . . , f</em>m(x)} is convex.

  • Pointwise supremum: If f(x,y)f (x, y) is convex in xx for all yy, and CC is an arbitrary set, then g(x)=supyCf(x,y)g(x) = \sup_{y∈C} f (x, y) is convex.

  • Pointwise infimum: If f(x,y)f (x, y) is convex in (x,y)(x, y) and CC is a convex set, then g(x)=infyCf(x,y)g(x) = \inf_{y∈C} f (x, y) is convex.

  • Extended-value extension: Let f:DRnRf : D ⊂ R^n → R, be a convex function defined on DD. We denote by f~\tilde{f} its extended-value extension such that

    f~(x)={f(x)amp;if xC +amp;if xC\tilde{f}(x) = \begin{cases} f (x) &amp; \text{if } x ∈ C \ +∞ &amp; \text{if } x ∉ C \end{cases}

    • The f~\tilde{f} is convex. (Concave functions are extended with -∞ to preserve concavity.)

  • Scalar composition: Let g:RnRg : R^n → R and h:RRh : R → R and their composition f=hg:RnRf = h ◦ g : R^n → R. (That is, f(x)=h(g(x))f (x) = h(g(x))).

    • ff is convex if h~\tilde{h} is convex and nondecreasing, and gg is convex.

    • ff is convex if h~\tilde{h} is convex and nonincreasing, and gg is concave.

    • ff is concave if h~\tilde{h} is concave and nondecreasing, and gg is concave.

    • ff is concave if h~\tilde{h} is concave and nonincreasing, and gg is convex.

Convex Problems You Have Already Seen

  • An image denoising energy: Given fRHWf ∈ R^{HW} a noisy image, recover uRHWu ∈ R^{HW} as the solution of

    min<em>uRHW</em>hu<em>22+λuf</em>22\min<em>{u∈R^{HW}} ∥∇</em>h u∥<em>2^2 + λ∥u − f∥</em>2^2

    • h\nabla_h is a 2HW×HW2HW × HW matrix that computes the discretized gradient with finite differences.

      1. The second term is: uTu2fTu+fTfu^T u − 2f^T u + f^T f. Quadratic function with Hessian II, which is positive definite. Thus it is strictly convex.

      2. The first term is: uT(<em>hT</em>h)uu^T (∇<em>h^T ∇</em>h)u. Quadratic function with Hessian <em>hT</em>h∇<em>h^T ∇</em>h, which is positive semi-definite. Thus first term is convex.

      3. Sum of convex and strictly convex functions is strictly convex.

Non-Differentiable Functions

  • Total variation for image denoising (Rudin-Osher-Fatemi): Given ff a noisy image, recover uu as the solution of

    min<em>u</em>hu<em>1+λuf</em>22\min<em>u ∥∇</em>h u∥<em>1 + λ∥u − f∥</em>2^2

    1. Second term is strictly positive.

    2. First term is: f(g(x))f (g(x)), where f(y)=y<em>1f (y) = ∥y∥<em>1 and g(u)=</em>hug(u) = ∇</em>hu. ff is convex (norms are convex functions) and gg is affine. Composition of convex with affine is convex.

    3. Sum of convex and strictly convex functions is strictly convex.