Thresholding for sparse signals. Square-Root Lasso

In this post we continue with soft thresholding. Our estimator \widehat \theta  is the optimal solution (unique by strong convexity) of

\displaystyle \min_{\theta} \|\theta - y\|_2^2 + \lambda \|\theta\|_1,

and the observation model is  y = \theta^* + z,  z \sim \sigma_n\,\mathcal{N}(0,I_n);  note that we absorbed the noise level  \sigma_n := \frac{\sigma}{\sqrt{n}}  into z.  We are interested in how the loss of  \widehat \theta  compares with that of ‘the oracle’ \theta^*,  whose loss is zero. We will assume that the signal is sparse; specifically, its support is  S \subset [n]  with  s := |S|.
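
As a concrete illustration (a minimal sketch, not taken from the post): the objective above is separable across coordinates, so its minimizer is the soft-thresholding operator applied to y; since the quadratic carries no 1/2 factor, the threshold is \lambda/2. The choice of \lambda below is a standard universal threshold in this scaling, used here only as an example.

import numpy as np

def soft_threshold(y, lam):
    # Coordinatewise minimizer of ||theta - y||_2^2 + lam * ||theta||_1;
    # the threshold is lam/2 because the quadratic has no 1/2 factor.
    return np.sign(y) * np.maximum(np.abs(y) - lam / 2, 0.0)

# Hypothetical usage on the model y = theta* + z with z ~ sigma_n * N(0, I_n):
rng = np.random.default_rng(0)
n, s, sigma = 1000, 10, 1.0
sigma_n = sigma / np.sqrt(n)
theta_star = np.zeros(n)
theta_star[:s] = 5 * sigma_n          # a few large coordinates on the support S
y = theta_star + sigma_n * rng.standard_normal(n)
theta_hat = soft_threshold(y, lam=2 * sigma_n * np.sqrt(2 * np.log(n)))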

Continue reading

Posted in Sparsity, Statistics, Uncategorized | 1 Comment

Thresholding over the l1-ball

I have long wanted to write a post with an elementary introduction to soft / hard thresholding. The time has come to pay my dues, so here we go.
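
Not part of the post's argument, but a closely related computation worth sketching here: the Euclidean projection onto the l1-ball is itself a soft thresholding with a data-dependent threshold (the sort-based algorithm of Duchi et al., 2008). A minimal sketch:

import numpy as np

def project_l1_ball(v, r=1.0):
    # Euclidean projection of v onto {x : ||x||_1 <= r}: soft thresholding
    # at the level tau chosen so that the result has l1-norm exactly r.
    if np.abs(v).sum() <= r:
        return v.copy()
    u = np.sort(np.abs(v))[::-1]          # magnitudes in decreasing order
    cssv = np.cumsum(u) - r
    idx = np.arange(1, len(u) + 1)
    rho = idx[u > cssv / idx][-1]         # number of surviving coordinates
    tau = cssv[rho - 1] / rho
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)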

Continue reading

Posted in Memos, Sparsity, Statistics | Leave a comment

Formulations of Mirror Descent

A very nice reference for this post is [this blog post].

Mirror Descent is a first-order algorithm for solving a convex optimization problem  \min_{x \in X} f(x).  In this post we will go over several equivalent formulations of this algorithm and discuss how they are related to each other.
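
As a teaser (a minimal sketch under standard assumptions, not the post's derivation), here is the best-known instance: mirror descent with the entropy mirror map on the probability simplex, which reduces to the exponentiated-gradient update.

import numpy as np

def mirror_descent_simplex(grad, x0, eta, n_iters):
    # Mirror descent for min_{x in simplex} f(x) with the entropy mirror map.
    # `grad` should return a (sub)gradient of f at x.
    x = x0.copy()
    for _ in range(n_iters):
        x = x * np.exp(-eta * grad(x))    # gradient step in the dual space
        x /= x.sum()                      # Bregman projection back onto the simplex
    return x

# Example: minimize f(x) = <c, x> over the simplex; the optimum puts all mass on argmin c.
c = np.array([3.0, 1.0, 2.0])
x = mirror_descent_simplex(lambda x: c, np.ones(3) / 3, eta=0.5, n_iters=200)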

Continue reading

Posted in Course on Convex Optimization, Memos | Leave a comment

Brunn-Minkowski, Prékopa-Leindler, and isoperimetry

Found a great overview of the subject with a lot of insight. Another brief intro is here.

Posted in Memos | Leave a comment

Lagrange duality via the Fenchel conjugate. The dual of ERM

A recurring trick in optimization is the characterization of the Lagrange dual problem in terms of the much simpler Fenchel conjugate. This trick is used in dual methods for machine learning and signal processing, in ADMM, in augmented Lagrangian methods, and so on.
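
As a minimal illustration (a standard computation, not the post's full derivation): for the linearly constrained problem  \min_x f(x)  s.t.  Ax = b,  minimizing the Lagrangian over x exposes the Fenchel conjugate  f^*:

\displaystyle g(y) = \inf_x \left[ f(x) + y^\top (Ax - b) \right] = -\sup_x \left[ (-A^\top y)^\top x - f(x) \right] - b^\top y = -f^*(-A^\top y) - b^\top y,

so the dual problem  \max_y g(y)  is expressed entirely through  f^*.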

Continue reading

Posted in Optimization tricks | 1 Comment

‘Shear’ convex polyhedra: solution

Here I describe the solution to the problem from the last post. Spoilers!

Continue reading

Posted in Pretty little things | Leave a comment

‘Shear’ convex polyhedra

I came up with what seems to be quite a beautiful geometric construction solving the problem described below. For a compact convex set  C \subset \mathbb{R}^n,  consider the function

\displaystyle w_1(x) = \max_{y \in C}\, y_1 - \min_{y \in C}\, y_1 \quad \text{s.t.} \quad y_2 = x_2,\, \ldots,\, y_n = x_n,

where we denote  x = (x_1,\, \ldots,\, x_n).  In other words, standing at the point  x,  we measure the “width” of the set along the first coordinate axis (note that  w_1(x)  does not actually depend on  x_1). In the same fashion we define the functions  w_2(x),\, \ldots,\, w_n(x),  where  w_i(x)  measures the “width” along the i-th axis. Now consider the following functionals of  C:

\displaystyle w^*(C) = \max_{x \in C} \left[\sum_{i=1}^n w_i(x)\right];

\displaystyle w_i^*(C) = \max_{x \in C}\, w_i(x);

\displaystyle W^*(C) = \sum_{i=1}^n w_i^*(C).
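
For the record, here is how one could evaluate these quantities numerically when C is a polyhedron  \{y : Ay \le b\}  (a sketch under that assumption, not part of the construction): each  w_i(x)  is the difference of two linear programs over the fiber through x.

import numpy as np
from scipy.optimize import linprog

def width(A, b, x, i):
    # w_i(x): width of C = {y : A y <= b} along axis i, with the other
    # coordinates of y pinned to those of x.
    n = len(x)
    c = np.zeros(n)
    c[i] = 1.0
    mask = np.arange(n) != i
    A_eq, b_eq = np.eye(n)[mask], np.asarray(x)[mask]   # y_j = x_j for j != i
    lo = linprog(c, A_ub=A, b_ub=b, A_eq=A_eq, b_eq=b_eq, bounds=(None, None))
    hi = linprog(-c, A_ub=A, b_ub=b, A_eq=A_eq, b_eq=b_eq, bounds=(None, None))
    return -hi.fun - lo.fun   # max_y y_i - min_y y_i over the fiber

# Example: the unit square [0, 1]^2 has w_1(x) = 1 at every interior point.
A = np.array([[1.0, 0], [-1, 0], [0, 1], [0, -1]])
b = np.array([1.0, 0, 1, 0])
print(width(A, b, x=[0.5, 0.5], i=0))   # -> 1.0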

Continue reading

Posted in Pretty little things | 1 Comment