Strong convexity provides an improved lower bound for a convex function; therefore are in our right to expect the improvement of the rate of convergence of PGD. In this post we will find out how much we actually gain.
First let us analyze the non-smooth (Lipschitz) case.
Let be -strongly convex and -Lipschitz on PGD with attains
Proof. Beginning with the same analysis as in the non-strongly convex case, we get
Multiplying by gives
Summing over and using convexity, we obtain the result. ♦
We now turn ourselves to the smooth case, that is the function now is -smooth and -strongly convex. Remember that previously the “three point identity” proved to be useful. For the strongly convex case it has an improvement:
Lemma (Three point identity in the strongly convex case)
For -smooth and -strongly convex function,
The proof of this lemma in fact repeats that for the usual three-point identity, and we omit it. We now give the rate of PGD for strongly convex smooth functions in terms of condition number Surprisingly, the analysis is simpler than for the non-strongly convex case.
Let be -smooth and -strongly convex on PGD with attains the rate
Proof. First, by smoothness. Furthermore, it turns out that there is a linear rate of localization in the argument in the strongly convex case (note that the solution is unique):
and we obtain the claim. ♦