Switching the objective and the constraint

The time has come to publish this post, still unfinished, about an optimization trick from last spring. Suppose we have the following family of convex optimization problems:


\displaystyle\, \min_x\, f(x)

\displaystyle\, \text{s.t.}\,\,\,\, g(x)\le\alpha


parametrized by  \alpha \ge 0  for fixed  f, g.  Denote by  P(\alpha)  the optimal value of the problem corresponding to  \alpha  (obviously  P(\alpha)  is a non-increasing function of  \alpha).  Given some  \alpha_0,  unknown in advance, we are asked to find

\alpha_- = \min\left\{\alpha \ge 0\,\middle|\,P(\alpha) = P_0 := P(\alpha_0)\right\}.

The question makes sense, for example, in the context of approximation. Specifically, consider the following family of fitting problems:

\displaystyle\, \min_x\, \|Ax-b\|_\infty

\displaystyle\, \text{s.t.}\, \|x\|_1 \le \alpha

for fixed  A \in \mathbb{R}^{m \times n}  and  b \in \mathbb{R}^m.  Here  \alpha  is the overall “budget”, and the columns of  A  are the vectors used to approximate  b.  First we are given the initial resource  \alpha_0,  which yields the error  P(\alpha_0).  Then we are asked how much of the initial budget we can actually spare without losing approximation quality.

Suppose that each particular problem (for a fixed  \alpha)  can be solved efficiently (this is the case, e.g., for the fitting problems above). Then we can adopt the following approach: we evaluate  P(\alpha)  at any given point  \alpha  by solving the problem instance corresponding to this  \alpha.  Since  P(\alpha)  can be evaluated fast for any  \alpha,  we can use binary search on the interval  [0, \alpha_0]  to find  \alpha_-  (each iteration halves the interval, i.e. gives us one binary digit of accuracy). The method yields the correct result because  P  is non-increasing: within  [0, \alpha_0]  we have  P(\alpha) = P_0  exactly for  \alpha \in [\alpha_-, \alpha_0]  and  P(\alpha) > P_0  for  \alpha < \alpha_-,  so the predicate “P(\alpha) = P_0” is monotone in  \alpha  and bisection converges to the threshold  \alpha_-.
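As a concrete illustration, here is a minimal sketch of the bisection approach for the \ell_\infty/\ell_1 fitting problem above, written with cvxpy. The helper names (P_value, alpha_minus), the solver defaults, and the tolerance handling are my own choices for the sketch, not part of the original recipe:

```python
import numpy as np
import cvxpy as cp

def P_value(A, b, alpha):
    """Optimal value of  min ||Ax - b||_inf  s.t.  ||x||_1 <= alpha."""
    x = cp.Variable(A.shape[1])
    prob = cp.Problem(cp.Minimize(cp.norm(A @ x - b, 'inf')),
                      [cp.norm(x, 1) <= alpha])
    prob.solve()
    return prob.value

def alpha_minus(A, b, alpha0, tol=1e-6):
    """Bisection on [0, alpha0] for the smallest alpha with P(alpha) = P(alpha0), up to tol."""
    P0 = P_value(A, b, alpha0)
    lo, hi = 0.0, alpha0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if P_value(A, b, mid) <= P0 + tol:
            hi = mid          # optimal value already attained: alpha_- <= mid
        else:
            lo = mid          # value strictly above P0: alpha_- > mid
    return hi

# Example usage on random data.
rng = np.random.default_rng(0)
A, b = rng.standard_normal((20, 5)), rng.standard_normal(20)
print(alpha_minus(A, b, alpha0=10.0))
```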

Discussion.  Can we do better (quadratic convergence) for the fitting problems? Note that we can almost surely obtain  P'(\alpha) = -\lambda(\alpha),  where  \lambda(\alpha)  is the optimal Lagrange multiplier, by solving the dual problem (see perturbation and sensitivity analysis in Stephen Boyd’s lectures). Besides,  P(\alpha)  is convex. So what comes to mind is applying Newton’s method or some other root-finding method (some of them have superlinear convergence) to the equation  P(\alpha) = P_0.  But to employ any of them, some regularity conditions need to be verified.
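To make the idea concrete, here is a sketch of such a Newton-type iteration, assuming  \lambda(\alpha)  is read off as the dual variable of the budget constraint. It deliberately glosses over exactly the regularity issues mentioned above (non-differentiability of  P,  and the fact that  \lambda(\alpha) = 0  once  \alpha  reaches the flat region), so it is an illustration rather than a vetted algorithm:

```python
import cvxpy as cp

def solve_instance(A, b, alpha):
    """Solve the fitting problem for fixed alpha; return P(alpha) and the multiplier
    lambda(alpha) of the budget constraint (so that P'(alpha) = -lambda(alpha))."""
    x = cp.Variable(A.shape[1])
    budget = cp.norm(x, 1) <= alpha
    prob = cp.Problem(cp.Minimize(cp.norm(A @ x - b, 'inf')), [budget])
    prob.solve()
    return prob.value, float(budget.dual_value)

def alpha_minus_newton(A, b, alpha0, tol=1e-8, max_iter=50):
    """Newton iterations on P(alpha) - P0 = 0, started from alpha = 0 (from the left).
    Since P is convex and non-increasing, the iterates increase monotonically towards alpha_-."""
    P0, _ = solve_instance(A, b, alpha0)
    alpha = 0.0
    for _ in range(max_iter):
        val, lam = solve_instance(A, b, alpha)
        if val - P0 <= tol or lam <= tol:   # flat region reached (there lambda = 0)
            break
        alpha += (val - P0) / lam           # Newton step: alpha - (P(alpha) - P0) / P'(alpha)
    return alpha
```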


