In the previous post we considered subexponential distributions and a concentration inequality, the Bernstein inequality, for sums of independent subexponential random variables. In this post we look more closely at the meaning of the parameters and at the concept of deviation zones.
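Recall that for a $(\sigma^2, b)$-subexponential random variable $X$ we have

$$\mathbb{P}(X - \mathbb{E}X \ge t) \le \begin{cases} \exp\left(-\dfrac{t^2}{2\sigma^2}\right), & 0 \le t \le \sigma^2/b, \\ \exp\left(-\dfrac{t}{2b}\right), & t > \sigma^2/b. \end{cases}$$

The behaviour of the tail changes when $t = \sigma^2/b.$ The zones $t \le \sigma^2/b$ and $t > \sigma^2/b$ are called, correspondingly, the near and the far zone. In the near zone, the bound coincides with the (sub)gaussian tail bound with variance (proxy) parameter $\sigma^2,$ whereas in the far zone the tails are those of an exponential distribution with ‘typical value’ $b.$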
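The two-zone shape of the bound is easy to tabulate. The sketch below is only an illustration under an assumed parametrization: a one-sided bound equal to $\exp(-t^2/(2\sigma^2))$ for $t \le \sigma^2/b$ and $\exp(-t/(2b))$ beyond it. The function names `critical_point` and `bernstein_tail` are made up for this example, and the constants may differ from the convention of the previous post.

```python
import math

def critical_point(sigma2, b):
    """Deviation level where the near zone ends (assumed to be sigma^2 / b)."""
    return sigma2 / b

def bernstein_tail(t, sigma2, b):
    """Assumed one-sided bound on P(X - E[X] >= t) for a (sigma2, b)-subexponential X."""
    if t < 0:
        raise ValueError("t must be non-negative")
    if t <= critical_point(sigma2, b):
        # Near zone: subgaussian-type decay with variance proxy sigma2.
        return math.exp(-t ** 2 / (2 * sigma2))
    # Far zone: exponential-type decay with scale b.
    return math.exp(-t / (2 * b))

if __name__ == "__main__":
    sigma2, b = 4.0, 1.0
    for t in (1.0, critical_point(sigma2, b), 8.0):
        print(f"t = {t:4.1f}  bound = {bernstein_tail(t, sigma2, b):.4f}")
```

At the critical point the two branches agree (both equal $\exp(-\sigma^2/(2b^2))$), so the bound is continuous in $t$.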
So, how do these zones relate to each other? A first step towards understanding what is going on in Theorem 1 is the following crude consequence:
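Let $X$ be $(\sigma^2, b)$-subexponential. Then, w.h.p., $|X - \mathbb{E}X| \lesssim \max(\sigma, b).$
Proof. Depending on how the deviation level $t$ compares with $\sigma^2/b,$ only two options are possible: either $t \le \sigma^2/b,$ and in this case the deviation is upper-bounded by a multiple of $\sigma,$ or $t > \sigma^2/b,$ and the deviation is bounded by a multiple of $b.$ ♦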
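To make the crude bound concrete, one can invert a two-zone tail bound at a failure probability $\delta$. The sketch below assumes the one-sided bound $\exp(-\min\{t^2/(2\sigma^2),\, t/(2b)\})$, which may differ in constants from the convention used in the post. Under that assumption the quantile is $\max\{\sigma\sqrt{2\log(1/\delta)},\ 2b\log(1/\delta)\}$. The names `tail_bound` and `deviation_quantile` are hypothetical.

```python
import math

def tail_bound(t, sigma2, b):
    """Assumed two-zone bound exp(-min(t^2 / (2 sigma2), t / (2 b)))."""
    return math.exp(-min(t ** 2 / (2 * sigma2), t / (2 * b)))

def deviation_quantile(delta, sigma2, b):
    """Smallest t at which the assumed bound drops to delta."""
    if not 0 < delta < 1:
        raise ValueError("delta must lie in (0, 1)")
    log_term = math.log(1.0 / delta)
    near = math.sqrt(2 * sigma2 * log_term)  # subgaussian branch, scales with sigma
    far = 2 * b * log_term                   # exponential branch, scales with b
    return max(near, far)

if __name__ == "__main__":
    sigma2, b = 4.0, 1.0
    for delta in (0.5, 0.05, 0.001):
        t = deviation_quantile(delta, sigma2, b)
        print(f"delta = {delta}: t = {t:.3f}, bound at t = {tail_bound(t, sigma2, b):.4f}")
```

For a fixed $\delta$ the logarithmic factors are constants, which is what the $\max(\sigma, b)$ statement hides. The subgaussian branch is the larger one exactly when $\log(1/\delta) \le \sigma^2/(2b^2)$.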
The observation that the ‘typical value’ of a subexponential distribution is determined by either the subgaussian scale $\sigma$ or the exponential scale $b,$ depending on whether $\sigma$ is greater or less than $b,$ suggests that we consider the two extremes separately.
- Subgaussian regime: the near zone is very big, and the far zone deviations are unlikely, i.e. $\sigma \gg b.$
A vivid example of this kind of behaviour is for which and Below is a typical plot for Subgaussian regime (the ratio is chosen not very big to see the whole picture).
- Exponential (or Poisson) regime: the near zone is small, and the far zone deviations are likely, i.e. $\sigma \ll b.$
An important application of the Bernstein bound is sharpening the Hoeffding inequality for a bounded random variable when it is additionally known to have a small variance.
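A random variable $X$ such that $|X - \mathbb{E}X| \le b$ almost surely is subexponential, with variance proxy of order $\sigma^2 = \operatorname{Var} X$ and scale parameter of order $b$ (up to absolute constants).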
We won’t prove this fact. It is easy to see that the resulting Bernstein bound is stronger than the Hoeffding bound, which we would use instead if we didn’t know $\sigma,$ for any $t.$ For the ‘hardest’ confidence level, corresponding to the critical point $t = \sigma^2/b,$ Bernstein gives quantiles that are $b/\sigma$ times smaller.
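The $b/\sigma$ factor can be checked numerically. The sketch below rests on two assumptions that may differ in constants from the post's convention: the Hoeffding bound for $|X - \mathbb{E}X| \le b$ is taken as $\exp(-t^2/(2b^2)),$ and the Bernstein bound as $\exp(-\min\{t^2/(2\sigma^2),\, t/(2b)\}).$ The function names are made up for the example.

```python
import math

def hoeffding_quantile(delta, b):
    """Quantile from the assumed Hoeffding bound exp(-t^2 / (2 b^2))."""
    return b * math.sqrt(2 * math.log(1.0 / delta))

def bernstein_quantile(delta, sigma2, b):
    """Quantile from the assumed bound exp(-min(t^2 / (2 sigma2), t / (2 b)))."""
    log_term = math.log(1.0 / delta)
    return max(math.sqrt(2 * sigma2 * log_term), 2 * b * log_term)

def critical_ratio(sigma2, b):
    """Hoeffding / Bernstein quantile ratio at the level where Bernstein's t = sigma2 / b."""
    # Failure probability given by the assumed Bernstein bound at t = sigma2 / b.
    delta_crit = math.exp(-sigma2 / (2 * b ** 2))
    return hoeffding_quantile(delta_crit, b) / bernstein_quantile(delta_crit, sigma2, b)

if __name__ == "__main__":
    sigma2, b = 0.01, 1.0  # small variance relative to the range
    print(critical_ratio(sigma2, b), b / math.sqrt(sigma2))
```

Under these assumptions the two printed numbers agree up to rounding: at the critical level the Hoeffding quantile equals $\sigma$ while the Bernstein quantile equals $\sigma^2/b.$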