In this post we will discuss subgaussian distributions. In a nutshell, these are the distributions whose tails are at least as light as the tails of the Gaussian distribution.

Let us start out with a simple (centered) Gaussian random variable $X \sim \mathcal{N}(0, \sigma^2)$. The well-known fact about it is that for any $t \ge 0$,

$$\mathbb{P}(X \ge t) \le \frac{1}{2}\, e^{-t^2/(2\sigma^2)}.$$

Probably, the easiest way to see this is as follows. Consider a complex random variable $Z = X + iY$, where $X, Y \sim \mathcal{N}(0, \sigma^2)$ i.i.d. (note that $|Z|^2 = X^2 + Y^2$ is exponentially distributed with mean $2\sigma^2$), and observe that

$$\mathbb{P}(|X| \ge t)^2 = \mathbb{P}(|X| \ge t,\, |Y| \ge t) \le \mathbb{P}(|Z|^2 \ge 2t^2) = e^{-t^2/\sigma^2};$$

the multiplier $\frac{1}{2}$ is due to the symmetry of the normal density, since $\mathbb{P}(X \ge t) = \frac{1}{2}\mathbb{P}(|X| \ge t)$. The bound is also tight in a certain sense (one may check it by estimates involving the integration of Gaussian moments by parts).
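As a quick numerical sanity check (a sketch using only the standard library; the helper names here are mine, not standard), one can compare the exact Gaussian tail, expressed via the complementary error function, with the bound above:

```python
import math

def gaussian_tail(t, sigma=1.0):
    """Exact P(X >= t) for X ~ N(0, sigma^2), via the complementary error function."""
    return 0.5 * math.erfc(t / (sigma * math.sqrt(2)))

def tail_bound(t, sigma=1.0):
    """The bound (1/2) exp(-t^2 / (2 sigma^2)) discussed above."""
    return 0.5 * math.exp(-t**2 / (2 * sigma**2))

for t in [0.5, 1.0, 2.0, 4.0]:
    exact, bound = gaussian_tail(t), tail_bound(t)
    assert exact <= bound  # the bound dominates the exact tail for t >= 0
```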

Another, more general (albeit maybe less beautiful) way to get this inequality is the so-called *Chernoff's technique*. The main observation is that for any $\lambda > 0$,

$$\mathbb{P}(X \ge t) = \mathbb{P}\big(e^{\lambda X} \ge e^{\lambda t}\big) \le e^{-\lambda t}\, \mathbb{E}\, e^{\lambda X},$$

where the last step (due to Markov's inequality) is true for any $\lambda$ for which the expectation $\mathbb{E}\, e^{\lambda X}$, called the *moment generating function* (MGF), exists. Since $\mathbb{E}\, e^{\lambda X} = e^{\lambda^2 \sigma^2/2}$ for any $\lambda \in \mathbb{R}$, we can minimize the right-hand side in $\lambda$ (the minimum is attained at $\lambda = t/\sigma^2$), obtaining

$$\mathbb{P}(X \ge t) \le e^{-t^2/(2\sigma^2)}.$$

As we see, such behaviour of the MGF is the only thing needed to get a tail bound, which motivates the following
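To see the minimization step concretely, here is a small numerical sketch (assuming the Gaussian MGF $\mathbb{E}\,e^{\lambda X} = e^{\lambda^2\sigma^2/2}$): a grid search over $\lambda$ recovers the optimized bound $e^{-t^2/(2\sigma^2)}$.

```python
import math

def chernoff_bound(t, sigma, lam):
    """e^{-lam t} * E e^{lam X} for X ~ N(0, sigma^2), whose MGF is e^{lam^2 sigma^2/2}."""
    return math.exp(lam**2 * sigma**2 / 2 - lam * t)

t, sigma = 2.0, 1.0
# Grid search over lambda; the minimum should be attained near lam = t / sigma^2 = 2.
best = min(chernoff_bound(t, sigma, i / 1000) for i in range(1, 5000))
optimal = math.exp(-t**2 / (2 * sigma**2))
assert abs(best - optimal) < 1e-6  # grid minimum matches the closed-form bound
```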

**Definition.** A random variable $X$ with $\mathbb{E} X = 0$ is *$\sigma^2$-subgaussian*, if for any $\lambda \in \mathbb{R}$

$$\mathbb{E}\, e^{\lambda X} \le e^{\lambda^2 \sigma^2 / 2}.$$

The parameter $\sigma^2$ is sometimes called the *variance proxy*. We just proved a bound on the tails of subgaussian distributions (we won't prove the second bound here; it follows from the isoperimetric inequality, see e.g. [Johnstone], p. 46):

**Theorem.** If $X$ is $\sigma^2$-subgaussian, then for any $t \ge 0$

$$\mathbb{P}(X \ge t) \le e^{-t^2/(2\sigma^2)}, \qquad \mathbb{P}(|X| \ge t) \le 2\, e^{-t^2/(2\sigma^2)}.$$

In other words, $X$ is upper-bounded either by $\sigma\sqrt{2\log(1/\delta)}$ with probability at least $1 - \delta$, or by $\sigma t$ with probability at least $1 - e^{-t^2/2}$ (pick the one you like the most). Another characterization of a subgaussian distribution, which I will give without a proof, is that its central absolute moments behave as those of a Gaussian, giving, for some absolute constant $C$,

$$\big(\mathbb{E} |X|^p\big)^{1/p} \le C \sigma \sqrt{p}, \qquad p \ge 1.$$
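To illustrate the moment growth in the Gaussian case itself (a sketch; the closed-form formula for Gaussian absolute moments is standard, and for the standard Gaussian the constant $C = 1$ already suffices):

```python
import math

def abs_moment(p, sigma=1.0):
    """E|X|^p for X ~ N(0, sigma^2): sigma^p * 2^(p/2) * Gamma((p+1)/2) / sqrt(pi)."""
    return sigma**p * 2 ** (p / 2) * math.gamma((p + 1) / 2) / math.sqrt(math.pi)

# Check (E|X|^p)^(1/p) <= sqrt(p) for the standard Gaussian, p = 1..50.
for p in range(1, 51):
    assert abs_moment(p) ** (1 / p) <= math.sqrt(p)
```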

It turns out that a bounded random variable is always subgaussian, with a variance proxy given by a simple formula.

**Hoeffding's lemma.** For any random variable $X$ taking values in $[a, b]$, the centered variable $X - \mathbb{E} X$ is $\sigma^2$-subgaussian with $\sigma^2 = \frac{(b-a)^2}{4}$.

**Proof.** Without loss of generality, let $\mathbb{E} X = 0$ (otherwise consider $X - \mathbb{E} X$ instead). Consider the *cumulant* $\psi(\lambda) = \log \mathbb{E}\, e^{\lambda X}$. Since $\psi(0) = 0$ and $\psi'(0) = \mathbb{E} X = 0$, the Taylor expansion of $\psi$ at $0$ is

$$\psi(\lambda) = \frac{\lambda^2}{2}\, \psi''(\theta)$$

for some $\theta \in [0, \lambda]$. We check that $\psi''(\theta) = \mathrm{Var}(X_\theta)$, where $X_\theta$ has density given by

$$f_\theta(x) = \frac{e^{\theta x} f(x)}{\mathbb{E}\, e^{\theta X}},$$

where $f$ is the density of $X$. Since $X_\theta$ also takes values in $[a, b]$,

$$\mathrm{Var}(X_\theta) \le \mathbb{E}\left(X_\theta - \frac{a+b}{2}\right)^2 \le \frac{(b-a)^2}{4},$$

so that $\psi(\lambda) \le \frac{\lambda^2}{2} \cdot \frac{(b-a)^2}{4}$. Noting that this holds for any $\lambda$, we obtain the claim. ♦
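For a concrete instance of the lemma (a sketch): a Rademacher variable takes values $\pm 1$, so $a = -1$, $b = 1$ and $\sigma^2 = 1$; its MGF is $\cosh\lambda$, and the lemma asserts $\cosh\lambda \le e^{\lambda^2/2}$, which we can verify numerically on a grid.

```python
import math

# Rademacher X in {-1, +1}: a = -1, b = 1, so sigma^2 = (b - a)^2 / 4 = 1.
# Its MGF is E e^{lam X} = cosh(lam); Hoeffding's lemma gives cosh(lam) <= e^{lam^2/2}.
for i in range(-100, 101):
    lam = i / 10
    assert math.cosh(lam) <= math.exp(lam**2 / 2)
```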

It is straightforward to see that independent subgaussian random variables admit a very simple algebra. Namely, if $X_1, \dots, X_n$ are independent subgaussians with parameters $\sigma_1^2, \dots, \sigma_n^2$ correspondingly, then $X_1 + \dots + X_n$ is subgaussian with $\sigma^2 = \sigma_1^2 + \dots + \sigma_n^2$: indeed, by independence $\mathbb{E}\, e^{\lambda \sum_i X_i} = \prod_i \mathbb{E}\, e^{\lambda X_i} \le e^{\lambda^2 \sum_i \sigma_i^2 / 2}$. As a consequence, we have the following

**Theorem (Subgaussian concentration).** For $X_1, \dots, X_n$ $\sigma^2$-subgaussian and independent, it holds

$$\mathbb{P}\left(\sum_{i=1}^n X_i \ge t\right) \le \exp\left(-\frac{t^2}{2 n \sigma^2}\right).$$

Its version corresponding to the case of bounded random variables is formulated below for completeness.

**Theorem (Hoeffding's inequality).** For $X_1, \dots, X_n$ independent with $X_i \in [a_i, b_i]$, it holds

$$\mathbb{P}\left(\sum_{i=1}^n (X_i - \mathbb{E} X_i) \ge t\right) \le \exp\left(-\frac{2 t^2}{\sum_{i=1}^n (b_i - a_i)^2}\right).$$

Both of these bounds are typical concentration inequalities, as they were described in the previous post. Indeed, let $X_1, \dots, X_n$ be i.i.d. with $\mathbb{E} X_i = \mu$ and variance proxy $\sigma^2$. The sum $\sum_{i=1}^n X_i$ deviates from its expectation, which is $n \mu$, only by $O(\sigma \sqrt{n})$. Putting it another way, we may normalize to $\frac{1}{n} \sum_{i=1}^n X_i$ and get: the arithmetic mean has the average of $\mu$ but deviates from it only by $O(\sigma / \sqrt{n})$.
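A Monte Carlo sketch of this behaviour (with a fixed seed; the parameters are mine, chosen for illustration): the mean of $n$ Rademacher variables is $\frac{1}{n}\sum_i X_i$ with each $X_i$ $1$-subgaussian, so $\mathbb{P}(|\text{mean}| \ge t) \le 2 e^{-n t^2/2}$, and deviations of order $t = 0.05 = 5/\sqrt{n}$ are essentially never observed.

```python
import math
import random

random.seed(0)

# Mean of n Rademacher variables (each 1-subgaussian): subgaussian concentration
# gives P(|mean| >= t) <= 2 exp(-n t^2 / 2), i.e. deviations are O(1 / sqrt(n)).
n, trials, t = 10_000, 200, 0.05
exceed = 0
for _ in range(trials):
    s = sum(random.choice((-1, 1)) for _ in range(n))
    if abs(s / n) >= t:
        exceed += 1

bound = 2 * math.exp(-n * t**2 / 2)      # tiny: 2 e^{-12.5}
assert exceed / trials <= bound + 0.05   # empirical frequency respects the bound
```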

Finally, I give (the proof is quite technical, so I omit it here) probably the most general result about subgaussian distributions:

**Theorem (Lipschitz functions are subgaussian).** Let $f : \mathbb{R}^n \to \mathbb{R}$ be $L$-Lipschitz with respect to the Euclidean norm, i.e.

$$|f(x) - f(y)| \le L \|x - y\|_2 \quad \text{for all } x, y \in \mathbb{R}^n.$$

If $X \sim \mathcal{N}(0, I_n)$, then $f(X) - \mathbb{E} f(X)$ is $L^2$-subgaussian.

This theorem readily provides a lot of subgaussian distributions. Let us give several examples:

- $\|X\|_2 - \mathbb{E}\|X\|_2$ with $X \sim \mathcal{N}(0, I_n)$ is $1$-subgaussian, since $x \mapsto \|x\|_2$ is $1$-Lipschitz (and hence $\|X\|_2 - \sqrt{n}$ concentrates as well, as $\mathbb{E}\|X\|_2 \le \sqrt{n}$).
- $\|A X\|_2 - \mathbb{E}\|A X\|_2$ with $X \sim \mathcal{N}(0, I_n)$ is $\|A\|^2$-subgaussian, where $\|A\|$ is the operator norm of $A$, since $x \mapsto \|A x\|_2$ is $\|A\|$-Lipschitz.
- Let $X$ be an $n \times n$ matrix with i.i.d. standard Gaussian entries. Then its centered Schatten norm $\|X\|_{S_p} - \mathbb{E}\|X\|_{S_p}$ is $1$-subgaussian for any $p \ge 2$, including the Hilbert–Schmidt norm ($p = 2$) and, most importantly, the operator norm ($p = \infty$). To see this, use rotational invariance of the Gaussian distribution: the Hilbert–Schmidt norm is the Euclidean norm of the entries, and $\|\cdot\|_{S_p}$ is $1$-Lipschitz with respect to it for $p \ge 2$.
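The first example above is easy to visualize numerically (a seeded Monte Carlo sketch; the parameter choices are mine): the norm of a standard Gaussian vector in $\mathbb{R}^n$ sits within $O(1)$ of $\sqrt{n}$, with fluctuations that do not grow with $n$.

```python
import math
import random

random.seed(1)

# ||X||_2 for X ~ N(0, I_n) concentrates around E||X||_2 <= sqrt(n)
# with O(1) fluctuations, since the norm is 1-Lipschitz (first example above).
n, trials = 1000, 500
norms = []
for _ in range(trials):
    x = [random.gauss(0, 1) for _ in range(n)]
    norms.append(math.sqrt(sum(v * v for v in x)))

mean = sum(norms) / trials
spread = math.sqrt(sum((v - mean) ** 2 for v in norms) / trials)
assert abs(mean - math.sqrt(n)) < 1.0  # mean is within O(1) of sqrt(n) ~ 31.6
assert spread < 1.5                    # fluctuations are O(1), not O(sqrt(n))
```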
