Lagrange Duality#

Primal Problem#

Consider a general optimization problem (called the primal problem):

(1)#\[\begin{split}\underset{x}{\min} & f(x) \\ \text { s.t. } & h_i(x) \leq 0, i=1, \cdots, m \\ & \ell_j(x)=0, j=1, \cdots, r\end{split}\]

We define its Lagrangian as:

(2)#\[L(x, u, v)=f(x)+\sum_{i=1}^m u_i h_i(x)+\sum_{j=1}^r v_j \ell_j(x)\]

Here \(u \in \mathbb{R}^m\) and \(v \in \mathbb{R}^r\) are the Lagrange multipliers.

Note

This expression may appear complex and difficult to understand at first glance. However, we will provide a detailed explanation of how it can be utilized to solve the constrained optimization problem presented in Problem Eq.1.
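As a concrete illustration, consider a toy problem (an assumption for this sketch, not taken from the text): minimize \(f(x)=x^2\) subject to the single inequality constraint \(h_1(x)=1-x\le 0\), i.e. \(x\ge 1\), with no equality constraints. Its Lagrangian follows directly from Eq.2 with \(m=1, r=0\):

```python
# Toy problem (illustrative assumption, not from the text):
#   min_x f(x) = x^2   s.t.  h_1(x) = 1 - x <= 0
def f(x):
    return x ** 2

def h1(x):
    return 1.0 - x

def lagrangian(x, u1):
    # L(x, u) = f(x) + u_1 * h_1(x), following Eq.(2) with m=1, r=0
    return f(x) + u1 * h1(x)

# At the primal optimum x* = 1 (where h_1(x*) = 0), the Lagrangian
# equals the optimal value f(x*) = 1 for any multiplier u_1.
print(lagrangian(1.0, 2.0))  # 1.0
```

This toy problem is reused in the numeric sketches below; none of it is specific to the general theory.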

Lemma 1

At each feasible \(x\), \(f(x)=\underset{u \geq 0, v}{\max} L(x, u, v)\), and the maximum is attained iff \(u \geq 0\) satisfies \(u_i h_i(x)=0, i=1, \cdots, m\).

Lemma 2

The optimal value of the primal problem, denoted \(f^*\), satisfies:

(3)#\[f^*=\underset{x}{\min} \theta_p(x)=\underset{x}{\min}\underset{u \geq 0, v}{\max} L(x, u, v)\]

Proof of Lemma 1

Define \(\theta_p(x)=\underset{u \geq 0, v}{\max} L(x, u, v)\), where the subscript \(p\) denotes the primal problem. If \(x\) is feasible, the conditions in Problem Eq.1 are satisfied, i.e. \(h_i(x)\le0\) and \(\ell_j(x)=0\). Then, for any \(u\ge 0\), \(L(x, u, v)=f(x)+\sum_{i=1}^m u_i h_i(x)+\sum_{j=1}^r v_j \ell_j(x)\le f(x)\), and the inequality becomes an equality iff \(u_ih_i(x)=0, i=1,...,m\). So, if \(x\) is feasible, we obtain \(f(x)=\theta_p(x)\).
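Lemma 1 can be checked numerically on a toy problem (an illustrative assumption: minimize \(x^2\) subject to \(1-x\le 0\)). At the feasible point \(x=2\) we have \(h_1(2)=-1<0\), so the maximum over \(u\ge 0\) should be attained at \(u_1=0\), where \(u_1 h_1(x)=0\), and equal \(f(2)=4\):

```python
# Numerical check of Lemma 1 on a toy problem (illustrative assumption):
#   min_x x^2  s.t.  h_1(x) = 1 - x <= 0
def lagrangian(x, u1):
    return x ** 2 + u1 * (1.0 - x)

x_feasible = 2.0  # h_1(2) = -1 <= 0, so this point is feasible
# theta_p(x) = max_{u >= 0} L(x, u); search over a grid of multipliers
theta_p = max(lagrangian(x_feasible, u / 10.0) for u in range(0, 1001))
print(theta_p)  # 4.0 = f(2), attained at u_1 = 0 so that u_1 * h_1(x) = 0
```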

Proof of Lemma 2

If \(x\) is infeasible, then \(h_i(x)>0\) for some \(i\) or \(\ell_j(x)\neq0\) for some \(j\). In the first case, letting \(u_i\rightarrow +\infty\) drives \(L(x,u,v)\rightarrow +\infty\); in the second, choosing \(v_j\) with the same sign as \(\ell_j(x)\) and letting \(|v_j|\rightarrow +\infty\) does the same. Hence \(\theta_p(x)=+\infty\) for every infeasible \(x\), so no infeasible point can attain the minimum of \(\theta_p\). Combined with Lemma 1, which gives \(\theta_p(x)=f(x)\) on the feasible set, we obtain \(f^*=\underset{x}{\min}\theta_p(x)\).
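The blow-up at infeasible points is easy to see on a toy problem (an illustrative assumption: minimize \(x^2\) subject to \(1-x\le 0\)). At \(x=0\) the constraint is violated, \(h_1(0)=1>0\), so \(L(0,u_1)=u_1\) grows without bound as \(u_1\) increases:

```python
# At an infeasible point, theta_p(x) = max_{u >= 0} L(x, u) = +infinity.
# Toy problem (illustrative):  min_x x^2  s.t.  h_1(x) = 1 - x <= 0.
def lagrangian(x, u1):
    return x ** 2 + u1 * (1.0 - x)

x_infeasible = 0.0  # h_1(0) = 1 > 0 violates the constraint
# L(0, u_1) = u_1 * h_1(0) = u_1, unbounded as u_1 grows
for u1 in [1.0, 1e3, 1e6]:
    print(lagrangian(x_infeasible, u1))  # 1.0, 1000.0, 1000000.0
```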

Dual Problem#

Given Lagrange multipliers \(u \geq 0\) and \(v\), we define the Lagrange dual function as:

(4)#\[\theta_d(u,v)=\underset{x}{\min} L(x,u,v)\]

where the subscript \(d\) denotes the dual problem. It is worth mentioning that the minimum here does not require \(x\) to lie in the feasible set.
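For the toy problem used above (an illustrative assumption: minimize \(x^2\) subject to \(1-x\le 0\)), the dual function has a closed form: setting the derivative \(2x-u_1\) of the Lagrangian to zero gives the unconstrained minimizer \(x=u_1/2\), hence \(\theta_d(u_1)=u_1-u_1^2/4\):

```python
# Dual function of the toy problem (illustrative):
#   min_x x^2  s.t.  1 - x <= 0
# theta_d(u) = min_x [x^2 + u(1 - x)]; the derivative 2x - u vanishes
# at x = u/2 (note: an unconstrained minimization over all of R),
# giving the closed form theta_d(u) = u - u^2/4.
def theta_d(u1):
    x = u1 / 2.0  # unconstrained minimizer of L(., u1)
    return x ** 2 + u1 * (1.0 - x)

print(theta_d(2.0))  # 1.0 = 2 - 2^2/4
```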

Given the primal problem Eq.1, we define its Lagrange dual problem as:

(5)#\[\begin{split}\begin{array}{rl} \underset{u,v}{\max}& \theta_d(u, v) \\ \text {s.t.} & u \geq 0 \end{array}\end{split}\]

From the definitions we immediately obtain that the optimal value of the dual problem, denoted \(g^*\), satisfies: \(g^*=\underset{u\ge0,v}{\max}\underset{x}{\min}L(x,u,v)\).
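Continuing the toy problem above (an illustrative assumption: minimize \(x^2\) subject to \(1-x\le 0\), whose dual function is \(\theta_d(u_1)=u_1-u_1^2/4\)), solving the dual problem by a grid search over \(u_1\ge 0\) recovers \(g^*\):

```python
# Solving the dual of the toy problem (illustrative):
#   max_{u >= 0} theta_d(u),  where theta_d(u) = u - u^2/4.
def theta_d(u1):
    return u1 - u1 ** 2 / 4.0

g_star = max(theta_d(u / 100.0) for u in range(0, 1001))  # grid on [0, 10]
print(g_star)  # 1.0, attained at u* = 2; matches the primal optimum f(1) = 1
```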

Lemma 3

The dual problem is a convex optimization problem.

Proof of Lemma 3

By definition, \(\theta_d(u,v)=\underset{x}{\min} L(x,u,v)\) is the pointwise minimum of functions that are affine in \(u\) and \(v\), and is therefore concave. The constraint \(u \geq 0\) is affine. Hence the dual problem maximizes a concave function over a convex set, which is a convex optimization problem.
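Concavity is easy to verify numerically for the toy dual function derived above (an illustrative assumption, \(\theta_d(u_1)=u_1-u_1^2/4\)): a concave function has non-positive second differences on any uniform grid.

```python
# Numeric concavity check of the toy dual function (illustrative):
# theta_d(u) = u - u^2/4; concavity <=> second differences <= 0.
def theta_d(u1):
    return u1 - u1 ** 2 / 4.0

us = [i * 0.1 for i in range(101)]  # uniform grid on [0, 10]
second_diffs = [theta_d(us[i - 1]) - 2 * theta_d(us[i]) + theta_d(us[i + 1])
                for i in range(1, 100)]
print(all(d <= 1e-9 for d in second_diffs))  # True
```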

Strong and Weak Duality#

In the previous section, we learned about the definition of primal and dual problems. You may have noticed that the dual problem has a useful property: it is convex.

Note

The natural question that arises is whether the solution to the primal problem can be obtained by solving the dual problem, since the latter is easier to solve.

To answer this question, we need to understand the concepts of weak and strong duality. These concepts will allow us to establish a connection between the primal and dual problems.

Introduction to Weak Duality

The Lagrangian dual problem yields a lower bound on the primal problem: it always holds that \(f^*\ge g^*\). This property is called weak duality.

Recall the definitions:

(6)#\[f^*=\underset{x}{\min}\underset{u \geq 0, v}{\max} L(x, u, v) \quad g^*=\underset{u\ge0,v}{\max}\underset{x}{\min} L(x,u,v)\]

Then:

(7)#\[\begin{split}\begin{aligned} g^*&=\underset{u\ge0,v}{\max}\underset{x}{\min} L(x,u,v)=\underset{x}{\min} L(x,u^*,v^*)\nonumber\\ &\le L(x^*,u^*,v^*)\le \underset{u\ge 0,v}{\max} L(x^*,u,v)\nonumber\\ &=\underset{x}{\min}\underset{u \geq 0, v}{\max} L(x, u, v)=f^*\nonumber \end{aligned}\end{split}\]

Weak duality is intuitive, since it follows in a few steps from the definitions. However, on its own it does not allow us to solve Problem Eq.1 through the dual, because the duality gap \(f^*-g^*\) may be strictly positive. So we will introduce strong duality and, luckily, with that we can obtain \(f^*=g^*\).
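A nonconvex example (an illustrative assumption, not from the text) shows that the gap can be strict. Consider minimizing \(-x^2\) subject to \(x-1\le 0\) and \(-x-1\le 0\), i.e. \(-1\le x\le 1\). The primal optimum is \(f^*=-1\) at \(x=\pm1\), but for any fixed \(u\ge 0\) the Lagrangian tends to \(-\infty\) as \(|x|\to\infty\), so \(\theta_d(u)=-\infty\) for every \(u\) and \(g^*=-\infty<f^*\):

```python
# A nonconvex example where weak duality is strict (illustrative):
#   min_x -x^2   s.t.  x - 1 <= 0  and  -x - 1 <= 0   (i.e. -1 <= x <= 1)
# Primal optimum: f* = -1 at x = +/-1.  For any u >= 0, the quadratic
# term -x^2 dominates, so min_x L(x, u) = -infinity and g* = -infinity.
def lagrangian(x, u1, u2):
    return -x ** 2 + u1 * (x - 1.0) + u2 * (-x - 1.0)

f_star = -1.0
for x in [10.0, 100.0, 1000.0]:
    print(lagrangian(x, u1=1.0, u2=1.0) < f_star)  # True, True, True
```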

Introduction to Strong Duality

In some problems, we actually have \(f^*=g^*\), which is called strong duality. In fact, for convex optimization problems strong duality nearly always holds, subject to some mild additional conditions. The most common of these is Slater's condition.

If the primal problem is convex, and there exists at least one strictly feasible point \(\tilde{x}\in \mathbb{R}^n\), i.e. Slater's condition holds:

(8)#\[\exists \tilde{x}, h_i(\tilde{x})<0, i=1, \ldots, m, \ell_j(\tilde{x})=0, j=1, \ldots r\]

Then strong duality holds.
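The toy problem used throughout (an illustrative assumption: minimize \(x^2\) subject to \(1-x\le 0\)) satisfies Slater's condition, since it is convex and \(\tilde{x}=2\) is strictly feasible (\(h_1(2)=-1<0\)). A grid search confirms the zero duality gap:

```python
# Checking strong duality on the convex toy problem (illustrative):
#   min_x x^2  s.t.  1 - x <= 0.  x_tilde = 2 is strictly feasible
# (h_1(2) = -1 < 0), so Slater's condition holds and f* = g* is expected.
f_star = min(x ** 2 for x in [1.0 + i / 1000.0 for i in range(10001)])
g_star = max(u - u ** 2 / 4.0 for u in [i / 1000.0 for i in range(10001)])
print(f_star, g_star)  # 1.0 1.0 -> zero duality gap
```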

Summary#

This section introduces the Lagrange method, a powerful tool that allows us to convert a constrained optimization problem into an unconstrained optimization problem. In addition, under certain conditions, the solution of a complex primal problem can be converted to a relatively simple solution of a dual problem. Safe RL algorithms are essentially solutions to constrained problems, so understanding the Lagrange method is crucial to understanding many of these algorithms.

References#