Farkas’ lemma

Farkas’ lemma is a simple (and very useful) statement in convex analysis.


Theorem (Farkas’ lemma)

For any matrix  A \in \mathbb{R}^{m \times n}  and vector  b \in \mathbb{R}^{m},  exactly one of the following holds:

(1)    the linear system  Ax = b  has a solution  x \in \mathbb{R}^n_+  in the nonnegative orthant;

(2)    there exists  y \in \mathbb{R}^m  such that  A^T y \in \mathbb{R}^n_+  and  b^T y < 0.
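As a sanity check, the alternative can be verified numerically. The sketch below (my own construction, using scipy’s linprog; the function name and the box bounds on  y  are choices made for the example, not part of the lemma) searches for a solution of (1) and, failing that, for a certificate (2):

```python
import numpy as np
from scipy.optimize import linprog

def farkas_alternative(A, b):
    """Return ('x', x) if Ax = b has a solution x >= 0 (case 1),
    or ('y', y) with A.T @ y >= 0 and b @ y < 0 (case 2)."""
    m, n = A.shape
    # Case (1): feasibility LP with a zero objective.
    res = linprog(c=np.zeros(n), A_eq=A, b_eq=b, bounds=[(0, None)] * n)
    if res.success:
        return 'x', res.x
    # Case (2): minimize b^T y subject to A^T y >= 0.  The box bounds
    # keep the LP bounded; any y with a negative objective certifies (2).
    res = linprog(c=b, A_ub=-A.T, b_ub=np.zeros(n), bounds=[(-1, 1)] * m)
    return 'y', res.x

A = np.array([[1.0, 0.0],
              [0.0, 1.0]])
case, cert = farkas_alternative(A, np.array([2.0, 3.0]))     # b in the cone
case2, cert2 = farkas_alternative(A, np.array([-1.0, 0.0]))  # b outside it
```

For the first right-hand side the function returns a nonnegative solution of  Ax = b;  for the second it returns a separating certificate  y.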


It has an elementary analogue known from linear algebra courses:


Theorem (Fredholm’s alternative)

Exactly one of the following holds:

(1′)    the linear system  Ax = b  has a solution  x \in \mathbb{R}^n;

(2′)    there exists  y \in \mathbb{R}^m  such that  A^Ty = 0  and  b^T y \ne 0.
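Fredholm’s alternative is easy to check numerically: the least-squares residual  r = b - Ax  is orthogonal to the range of  A,  so whenever it is nonzero it is itself a certificate  y.  A minimal sketch (the function name and tolerance are my own):

```python
import numpy as np

def fredholm_alternative(A, b, tol=1e-9):
    """Return ('x', x) if Ax = b is solvable; otherwise return ('y', y)
    with A.T @ y = 0 and b @ y != 0 (Fredholm's alternative)."""
    x, _, _, _ = np.linalg.lstsq(A, b, rcond=None)
    r = b - A @ x                  # residual, orthogonal to range(A)
    if np.linalg.norm(r) < tol:
        return 'x', x
    # A.T @ r = 0 and b @ r = ||r||**2 > 0, so r certifies case (2').
    return 'y', r

A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])
case, x = fredholm_alternative(A, np.array([1.0, 2.0, 0.0]))   # solvable
case2, y = fredholm_alternative(A, np.array([1.0, 2.0, 3.0]))  # not solvable
```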


Rather than giving a detailed self-contained proof, I will just provide the geometric intuition behind it. Note first that (2) implies the negation of (1): if some  x \in \mathbb{R}^n_+  solved  Ax = b,  then for  y  from (2) we would have  0 > b^T y = (Ax)^T y = x^T (A^T y) \ge 0,  a contradiction, since both  x  and  A^T y  are componentwise nonnegative. It remains to prove that the negation of (1) implies (2).

Let us start with the notion of a cone. A  cone  K \subseteq \mathbb{R}^m  is any set closed under multiplication by nonnegative scalars:  x \in K  implies  \alpha x \in K  for any  \alpha \ge 0. This definition aligns with the geometric notion of a (non-truncated) cone with apex at the origin. A cone need not be convex in general, but the cones we will work with below (conic hulls) are convex, i.e. together with any two points they contain the entire segment between them.

Further, for any set of vectors  \xi_1, ..., \xi_n,  its  conic hull  is defined as the minimal convex cone containing them. It is easy to check that it is parametrized as  \alpha_1 \cdot \xi_1 +\, ...\, + \alpha_n \cdot \xi_n  with  \alpha_i \ge 0.  If we write  \xi_i,  1 \le i \le n,  as the columns of a matrix  \Xi,  then their conic hull is the set  \{\Xi \alpha : \alpha \in \mathbb{R}^n_+\}.
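In this matrix form, membership in a conic hull is exactly a nonnegative least-squares problem, so it can be tested with scipy’s nnls (a sketch; the helper name and tolerance are my own choices):

```python
import numpy as np
from scipy.optimize import nnls

def in_conic_hull(Xi, b, tol=1e-9):
    """Test whether b lies in the conic hull of the columns of Xi by
    solving min_{alpha >= 0} ||Xi @ alpha - b||; zero residual means yes."""
    alpha, resid = nnls(Xi, b)
    return resid < tol, alpha

Xi = np.array([[1.0, 0.0],
               [1.0, 1.0]])   # columns (1, 1) and (0, 1)
inside, alpha = in_conic_hull(Xi, np.array([2.0, 3.0]))   # 2*(1,1) + 1*(0,1)
outside, _ = in_conic_hull(Xi, np.array([-1.0, 0.0]))
```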

Now we can see that, geometrically, (1) simply says that  b  lies in the conic hull of the columns  a_1,\, ...,\, a_n  of  A:  indeed, it means there exists a coefficient vector  x = (x_1,\, ..., \, x_n)  with nonnegative entries such that  b = x_1 \cdot a_1 + ... + x_n \cdot a_n.

So, the negation of (1) means that  b  does not belong to the cone  K = \{Ax : x \in \mathbb{R}^n_+\}.  Since  K  is convex and closed (a finitely generated cone is always closed), we can appeal to the separation theorem to draw a hyperplane strictly separating  K  and  b;  its normal vector  y  satisfies  A^T y \ge 0  and  b^T y < 0,  which is exactly (2).
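This separation argument is constructive. Projecting  b  onto  K  (here again via nonnegative least squares; the function name is my own) gives the projection  p = Ax^*,  and  y = p - b  is the normal of a separating hyperplane: the projection property gives  p^T y = 0  and  A^T y \ge 0,  while  b^T y = -\|y\|^2 < 0.  A sketch under these assumptions:

```python
import numpy as np
from scipy.optimize import nnls

def separating_normal(A, b, tol=1e-9):
    """Project b onto K = {Ax : x >= 0} by nonnegative least squares.
    If b lies outside K, return y = A @ x* - b, which satisfies
    A.T @ y >= 0 and b @ y < 0; if b lies in K, return None."""
    x, resid = nnls(A, b)
    return A @ x - b if resid > tol else None

A = np.array([[1.0, 0.0],
              [0.0, 1.0]])
y = separating_normal(A, np.array([-1.0, 2.0]))   # b outside the cone
```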

Remark.  Farkas’ lemma has an interesting consequence. It turns out that the  LP feasibility problem  (“determine whether (1) is feasible”) is essentially as difficult as the  separation problem  for a cone and a point (given a cone and a point, output a separating hyperplane if the point does not belong to the cone, and a negative answer if it does).


About Dmitry Ostrovsky

PhD student with interests in learning theory, statistics, and optimization, working with Anatoli Juditsky and Zaid Harchaoui.
This entry was posted in Convex Optimization, Memos, Optimization tricks.
