## Switching the objective and the constraint

The time has come to publish this post, still unfinished, about an optimization trick from last spring. Suppose we have the following family of convex optimization problems:

$\displaystyle\, \min_x\, f(x)$

$\displaystyle\, \text{s.t.}\,\,\,\, g(x)\le\alpha$

parametrized by  $\alpha \ge 0$  for fixed  $f, g.$  Denote by  $P(\alpha)$  the optimal value of the problem corresponding to  $\alpha$  (obviously  $P(\alpha)$  is a non-increasing function of  $\alpha$).  Given some  $\alpha_0,$  not known in advance, we are asked to find

$\alpha_- = \min\left\{\alpha \ge 0\,\middle|\,P(\alpha) = P_0 := P(\alpha_0)\right\}.$

The question makes sense, for example, in the context of approximation. Specifically, consider the following family of fitting problems:

$\displaystyle\, \min_x\, \|Ax-b\|_\infty$

$\displaystyle\, \text{s.t.}\, \|x\|_1 \le \alpha$

for fixed  $A \in \mathbb{R}^{m \times n}$  and  $b \in \mathbb{R}^m.$  Here,  $\alpha$  is the overall “budget”, and the columns of  $A$  are the vectors used to approximate  $b.$  First we are given the initial budget  $\alpha_0,$  which yields the error  $P(\alpha_0).$  Then we are asked how much of this budget we can actually spare without losing the quality of the approximation.
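For concreteness, here is one way to evaluate $P(\alpha)$ for the fitting problem above, casting it as a linear program for scipy. This is a sketch; the variable splitting $x = x_+ - x_-$ is a standard LP trick, and the helper name `P` is mine.

```python
import numpy as np
from scipy.optimize import linprog

def P(A, b, alpha):
    """Value of  min ||Ax - b||_inf  s.t.  ||x||_1 <= alpha,
    as an LP in variables (x+, x-, t) with x = x+ - x-, x+, x- >= 0."""
    m, n = A.shape
    # objective: minimize t (the epigraph variable for the inf-norm)
    c = np.concatenate([np.zeros(2 * n), [1.0]])
    ones = np.ones((m, 1))
    # |Ax - b| <= t, written as two one-sided inequalities:
    #   A x+ - A x- - t <= b   and   -A x+ + A x- - t <= -b
    A_ub = np.block([[A, -A, -ones],
                     [-A, A, -ones]])
    b_ub = np.concatenate([b, -b])
    # budget row: sum(x+) + sum(x-) <= alpha  (equals ||x||_1 at the optimum)
    A_ub = np.vstack([A_ub, np.concatenate([np.ones(2 * n), [0.0]])])
    b_ub = np.concatenate([b_ub, [alpha]])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(0, None))
    return res.fun
```

For instance, with $A = I_2$, $b = (1,1)$, and $\alpha = 1$, the budget forces $x = (1/2, 1/2)$ and $P(1) = 1/2$, while $\alpha = 2$ already admits $x = b$ with $P(2) = 0$.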

Suppose that each particular problem (for a fixed  $\alpha$) may be efficiently solved (this is the case, e.g., for the fitting problems above). Then we can adopt the following approach: we evaluate  $P(\alpha)$  at any point  $\alpha$  by solving the problem instance corresponding to this  $\alpha.$  Since  $P(\alpha)$  can be evaluated fast for any  $\alpha$  and is non-increasing, we can use binary search on  $[0, \alpha_0]$  to find  $\alpha_-$  (each iteration halves the interval of uncertainty, i.e. gives us one more bit of  $\alpha_-$, since we restricted ourselves in advance to  $\alpha \ge 0$). It is evident (why?) that the binary search will yield the correct result.
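The binary search can be sketched generically, with $P$ supplied as a black-box callable. The toy piecewise-linear $P$ in the test below and the function name `alpha_minus` are mine, for illustration only; in practice each call to `P` would solve one problem instance.

```python
def alpha_minus(P, alpha0, tol=1e-9):
    """Smallest alpha in [0, alpha0] with P(alpha) = P(alpha0), up to tol.
    Relies only on P being non-increasing on [0, alpha0]."""
    P0 = P(alpha0)
    lo, hi = 0.0, alpha0  # invariant: P(hi) <= P0 + tol
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if P(mid) <= P0 + tol:
            hi = mid  # mid still achieves the target value: shrink from above
        else:
            lo = mid  # mid loses accuracy: the answer lies to the right
    return hi
```

Note the equality test is relaxed to a tolerance: with a numerical solver behind `P`, exact equality of optimal values is not available.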

Discussion.  Can we do better (quadratic convergence) for the fitting problems? Note that we can almost surely obtain  $P'(\alpha) = -\lambda(\alpha),$  where  $\lambda(\alpha)$  is the optimal Lagrange multiplier of the budget constraint, obtained by solving the dual problem (see perturbation and sensitivity analysis in Stephen Boyd’s lectures). Besides,  $P(\alpha)$  is convex. So what comes to mind is using Newton’s method or some other root-finding method (some of them have superlinear convergence). But to employ any of them, some regularity conditions need to be verified.
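A minimal sketch of the Newton idea, assuming the derivative $P'(\alpha) = -\lambda(\alpha)$ is available as a callable (the names `newton_alpha` and `dP`, and the toy piecewise-linear $P$ in the test, are mine). Since $P$ is convex, the tangent at any $\alpha < \alpha_-$ lies below the graph, so each Newton step lands left of $\alpha_-$ and the iterates increase monotonically toward it; the guards below handle the flat part of $P$, where the derivative vanishes and regularity fails.

```python
def newton_alpha(P, dP, P0, alpha_start=0.0, tol=1e-12, max_iter=100):
    """Newton iteration for the root of P(alpha) - P0 = 0, started left of
    alpha_-. Assumes P convex and non-increasing, dP(a) = P'(a) = -lambda(a)."""
    a = alpha_start
    for _ in range(max_iter):
        h = P(a) - P0
        if h <= tol:
            break  # target value reached
        d = dP(a)  # negated optimal multiplier of the budget constraint
        if d >= 0.0:
            break  # flat part of P: the tangent carries no information
        a -= h / d  # tangent-line root; stays <= alpha_- by convexity
    return a
```

For the LP fitting problems $P$ is piecewise linear, so this iteration actually terminates after finitely many steps, one per linear piece it crosses.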