If a function has a Lipschitz gradient, i.e. for any and

then

If is differentiable, this follows from the representation of the gradient via the Hessian and the fact that all eigenvalues of the Hessian are upper-bounded by Otherwise express as the integral over put the gradient under the integral sign, and use the Cauchy-Schwarz inequality.

Note that if additionally is convex, then we have that the error of its first order approximation is bounded both from above and below:

As shown below, these bounds might be taken as an *equivalent definition* of -smooth convex function, i.e. if a smooth function satisfies these inequalities, then it is convex and -smooth. In fact, the trivial lower bound in may always be sharpened.

**Lemma (Sandwiching a smooth convex function)**

Function is convex and -smooth on its domain (i.e. differentiable with -Lipschitz gradient), if, and only if, holds. Moreover, is in fact equivalent to

for any

** Proof. **We must prove the following equivalence for a smooth function

We already have and (convexity follows from the lower bound of which is weaker than that of ). We are now to show Indeed, suppose holds, and pick so that

and hence ♦

The lower bound from the lemma which we have just proved has a useful consequence for the gradient of a smooth function.

**Lemma (Coercivity of the gradient)**

For a -smooth convex function, one has

**Proof. **Sum the lower bounds for the sandwiching lemma for the points and ♦

A [non necessarily smooth] function is called –*strongly convex *if the trivial convexity lower bound is improved by the quadratic term:

where is any subgradient of It is easy to see that is -strongly convex i.f.f. is convex. Another useful elementary property is that strongly convex functions admit unique minimizers.

As a result, if the function is -smooth and -strongly convex, the function is sandwiched between 2 quadratics:

**Sandwiching a smooth and strongly convex function:**

Another consequence is the improved coercivity of the gradient, proved by considering which is convex and -smooth:

**Lemma (****Coercivity of the gradient in the strongly convex case):**

### Like this:

Like Loading...

## About Dmitry Ostrovsky

PhD student with interests in learning theory, statistics, and optimization, working with Anatoli Juditsky and Zaid Harchaoui.

Pingback: Subgradient Projection method II — smooth objective | Look at the corners!

Pingback: Subgradient projection method III — Strong convexity | Look at the corners!