## Subexponential distributions II — A closer look at Bernstein’s inequality

In the previous post we considered subexponential distributions and concentration inequality, called Berstein inequality, for independent sums of subexponential random variables. In this post we will concentrate more thoroughly on the meaning of parameters  $(\sigma^2, b)$  and the concept of  deviation zones.

Recall that for subexponential distributions we have

Theorem 1

For $X$ $(\sigma^2, b)$-subexponential,

$\displaystyle P\left(X - \mu > t\right) \le\exp\left(-\frac{t^2}{2 \sigma^2}\right) \, \bigvee \, \exp\left(-\frac{t}{2 b}\right).$

The behaviour of the tail changes when $t = \sigma^2/b.$  The zones $[0, \sigma^2/b]$  and  $[\sigma^2/b, \infty]$ are called, correspondingly,  near  and  far zone.  In the near zone, the bound coincides with the (sub)gaussian tail bound with variance (proxy) parameter  $\sigma^2,$  whereas in the far zone the tails are those of exponential with ‘typical value’  $b.$

So, how these zones correspond to each other? A first step towards understanding what is going on in Theorem 1 is the following crude consequence:

Proposition

Let $X$ be $(\sigma^2, b)$-subexponential. Then, w.h.p.

$X - \mu = \displaystyle O\left(\sigma \vee b\right).$

Proof.  Depending on the comparison  $\sigma$  vs  $b,$  only two options are possible: either  $\displaystyle b \le \sigma \le \frac{\sigma^2}{b},$  and in this case the deviation is upper-bounded by  $O(1)\,\sigma,$  or  $\displaystyle \frac{\sigma^2}{b} \le \sigma \le b,$  and the deviation is bounded by  $O(1)\,b$. ♦

The observation that the ‘typical value’ of a subexponential distribution is determined by either  $\displaystyle \sigma$  or  $\displaystyle b,$  depending on whether  $\displaystyle \sigma$  is greater or less than  $\displaystyle b,$  suggests that we separately consider two extremes.

• Subgaussian regime:  the near zone is very big, and the far zone deviations are unlikely, i.e.

$\displaystyle b \ll \sigma \ll \frac{\sigma^2}{b}.$

A vivid example of this kind of behaviour is  $\displaystyle \chi_n^2$  for which $\displaystyle \sigma^2 = O(n)$  and  $\displaystyle b = O(1).$ Below is a typical plot for Subgaussian regime (the ratio  $\sigma/b$  is chosen not very big to see the whole picture).

• Exponential (or Poisson) regime:  the near zone is small, and the far zone deviations are likely, i.e.

$\displaystyle \frac{\sigma^2}{b} \ll \sigma \ll b.$

An important application of the Bernstein bound is sharpening Hoeffding inequality for a bounded random variable when it is additionally known that it has a small variance.

Proposition

A random variable  $X$  such that,  for  $\sigma \le b,$

$\displaystyle |X - \mu| \le b,$        $\displaystyle \text{Var}[X] = \sigma^2$

is  $(2\sigma^2, 2b)$-subexponential.

We won’t prove this fact. It is easy to see that resulting Bernstein bound is stronger than Hoeffding bound, which we would use instead if we didn’t know  $\sigma^2,$  for any  $t.$  For the ‘hardest’ confidence level, corresponding to the critical point $t_*=\sigma^2/b,$  Bernstein gives  $\displaystyle b/\sigma$  times smaller quantiles: