Arora et al. (2012)

1. Introduction

In this post I work through the main result in Arora, Barak, Brunnermeier and Ge (2012) which formulates a simple securitization setting with $I$ underlying assets (e.g., mortgages) and $J$ derivative assets (e.g., CDOs) where:

The seller can cheat and create non-random derivatives that will generate lemons cost for the buyer, and…
With limited computational power, the buyer will not be able to detect whether or not the seller has cheated.

For a finance-lite discussion of the results, see Dick Lipton’s excellent blog post. I’ve also created a short deck of slides (here).

2. Why Use Derivatives?

I begin by walking through the motivation for using derivatives in the first place via a simple model in the spirit of DeMarzo (2005). In this sort of model, derivatives like CDOs solve an asymmetric information problem between a seller (e.g., a investment bank) and a buyer (e.g., a pension fund).

Suppose that a securities seller owns $I \gg 1$ mortgages which each payout $\$y_i$ defined as:

(1) $\begin{align*} y_i &= x_i + z_i, \quad i \in \mathcal{I} \end{align*}$

where $x_i$ is an iid random variable such that $x_i = \$\bar{x}$ with probability $\rho$ and $x_i = \$0$ with probability $1-\rho$ while $z_i$ is Gaussian white noise with distribution $\mathtt{N}(0,\sigma_z^2)$ . For simplicity, I set $\bar{x} = \$1$ and $\rho = 1/2$ in the analysis below.

State space tree for x.

The model relies on two main assumptions:

The seller knows $\$x_i$ for each $i \in \mathcal{I}$ while the buyer does not.
The seller would prefer $\$\delta$ in cash to $\$1$ in mortgages with $\delta \in (0,1)$ .

The first assumption defines the information asymmetry. To make the problem interesting, it also has to be the case that the seller cannot credibly reveal his information prior to trading. e.g., he can’t just point out which mortgages are bad apples in an email. The second assumption gives the seller a good faith reason to get rid of the mortgages. This assumption can be interpreted as saying that the seller is less liquid than the buyer and would prefer to have more cash.

If a buyer considered purchasing a “random” collection of $D < I/2$ mortgages from the seller, he would be worried that the seller would put all of the bad mortgages in this collection so that the buyer would receive a collection of mortgages worth $\$0$ rather than a “fair value” of $\$D/2$ . In such a world, the buyer would never purchase mortgages from the seller and the seller would be forced to hold onto all of the mortgages which he values at only $\$\delta \cdot I / 2$ due to his liquidity concerns. This loss of $\$(1 - \delta) \cdot I / 2$ is the core of the asymmetric information problem.

However, a smart seller can eliminate the asymmetric information problem entirely via careful security design. Suppose that he (a) offers to sell a security with two tranches where the senior tranche has a face value of $\$I/2$ and (b) holds onto the junior tranche. After agreeing to accept the first $\$I/2$ in losses by holding onto the junior tranche, the seller essentially converts the remaining payout from the the bundle of mortgages into a riskless asset as $I \nearrow \infty$ since the white noise terms $z_i$ cancel out on average. Thus, by securitizing the mortgages the seller is able to get the “fair” value of his assets in spite of the asymmetric information problem:

(2) $\begin{align*} (1/2) \cdot \sum_{i=1}^I x_i &= I/2 = \mathtt{E} \left[ \sum_{i=1}^I y_i \right] \end{align*}$

3. New Framework

In Arora et al. (2012), the seller still wants to sell $I$ mortgages which each payoff $\bar{x} = 1$ with probability $\rho = 1/2$ . However, Arora et al. (2012) adjust this standard framework in three key ways:

The seller cannot hold the entire junior tranche.
The buyer and seller engage in a zero-sum transaction.
The seller creates $J$ binary derivative securities (henceforth, CDOs) from these $I$ underlying mortgages.

The first assumption means that the seller cannot fix the asymmetric information problem via security design in the same manner as DeMarzo (2012). The second assumption removes any possibility that derivative creation might be socially beneficial. In Arora et al. (2012), the seller gets utility from the amount of money he can extract from the buyer via clever security design. The third and final assumption is the key step that allows computational complexity to enter into model. Specifically, each CDO’s payout now depends on $D \leq I$ of the mortgages in the entire pool. Each of these $J$ new CDOs payout an amount $\$V_j$ where $V_j = \$\bar{v}$ if no more than $(D + \phi)/2$ of the underlying mortgages default and $V_j = \$0$ otherwise with:

(3) $\begin{align*} \bar{v} &= \left( \frac{D - \phi}{2 \cdot D} \right) \cdot \frac{I}{J} \end{align*}$

Given the second assumption, the seller’s assignment of the $I$ mortgages to $J$ CDOs then defines a bipartite graph as illustrated below:

Seller assignment bipartite graph.

Now, suppose that the seller has private information that a subset $\mathcal{L}$ of the $I$ mortgages are lemons and guaranteed to payout $0$ . Here, private information means that the seller knows which $L$ mortgages have a payout of $\$0$ while the buyer is aware that the seller knows this information. To increase his profit, the seller might carefully design this graph using his knowledge of which mortgages are lemons in order to reduce the value of the CDOs relative to their “fair” value of $D/2$ .

Definition (Lemons Cost): Let $\kappa_j(L)$ denote the lemons cost to the buyer if the seller knows a set $\mathcal{L}$ of the mortgages will payout $x_i = 0$ :

(4) $\begin{align*} \kappa_j(L) &= \mathtt{E}\left[ V_j \right] - \underset{\mathcal{L} \subseteq \mathcal{I}}{\mathtt{min}} \left\{ \mathtt{E}\left[ \ V_j \ \middle| \ x_i = 0, \ \forall i \in \mathcal{L} \ \right] \right\} \end{align*}$

where $\kappa_j$ has units of dollars.

Thus, the lemons cost can be thought of as the amount by which a CDO’s value could possibly decline if the seller knew that a set of $\mathcal{L}$ lemons existed in the entire mortgage pool and tried to cheat the buyer out of as much money as possible. For example, suppose that in the model from the section above the seller didn’t actually know each and every $x_i$ but instead just knew that a subset $\mathcal{L}$ of the $I$ mortgages would have $x_i = \$0$ . Then, the fair value of the entire bundle of mortgages would have been $(I - L)/2 = I/2 - L/2$ so that the lemons cost would have been $\kappa(L) = \$L/2$ .

The key insight of this paper is that trying to figure out whether or not the seller has randomly assigned mortgages to CDOs is equivalent to the densest subgraph problem which does not have a polynomial time solution. e.g., if the buyer faces any computational constraints whatsoever, then the seller can foist some lemons costs on the buyer without being detected. More formally, the buyer faces the following problem:

Definition (Densest Subgraph Problem): The densest subgraph problem is to distinguish between the following two distributions, $\mathtt{Rnd}$ and $\mathtt{Lmn}$ , over the bipartite graph above:

$\mathtt{Rnd}$ : The seller chooses $D$ random mortgages for every CDO.

$\mathtt{Lmn}$ : Let nature select a random set of $\mathcal{L}$ assets with $x_i = \$0$ payout. The seller then selects a subset of $\mathcal{B}$ CDOs (i.e., boobytrapped CDOs) and a number of lemons $\theta$ to plant in each boobytrapped CDO. For every non-boobytrapped CDO, the seller chooses $D$ random mortgages. For every boobytrapped CDO, the seller randomly chooses $D - \theta$ from the total population of mortgages and $\theta$ mortgages from set of $\mathcal{L}$ mortgages with known $x_i = \$0$ .

If $I = o(J \cdot D)$ and $(B \cdot \theta^2 / L) = o(J \cdot D^2 / I)$ , then there is no polynomial time algorithm to distinguish between the distributions $\mathtt{Rnd}$ and $\mathtt{Lmn}$ .

Roughly speaking, the densest subgraph problem is tantamount to figuring out whether or not the seller planted $\theta$ mortgages with no payout in $B$ of the $J$ CDOs he issued. The seller then has preferences for maximizing the total amount of lemons costs he can extract from the buyer, subject to the constraint that there is no polynomial time algorithm that the buyer can use to detect his cheating. Note that in the simple DeMarzo (2012) setting, if the seller could commit to issuing a CDO with a “random” collection of $D$ mortgages and published the collection of mortgage IDs, the buyer could check whether or not these $D$ mortgage IDs were in fact randomly selected using a polynomial time algorithm. The challenge in the Arora et al. (2012) setting comes from the fact that the same mortgage $i$ could show up in several CDOs.

Above, I use Landau notation:

Definition (Little- $o$ Notation): We say that “ $f$ is little- $o$ of $h$ as $x$ approaches $x_0$ ” and write $f(x) = o(h(x))$ as $x \to x_0$ to mean that:

(5) $\begin{align*} \lim_{x \to x_0} \frac{f(x)}{h(x)} &= 0 \end{align*}$

Some simple math reveals that the seller only needs to fix about $\theta \sim \sqrt{D}$ of the assets in each CDO to affect payout as the expected number of defaults would be $(D + \theta)/2$ while the standard deviation would be $(1/2) \cdot \sqrt{D - \theta}$ . Thus, the lemons cost for a boobytrapped CDO will be $\kappa_j(L) = \bar{v} \cdot \Delta_\theta$ where $\Delta_\theta = \mathtt{N}[ (\theta - \phi)/(2 \cdot \sqrt{D})]$ is the increase in the probability that a CDO does not payout due to the seller’s boobytrap and $\mathtt{N}[\cdot]$ denotes the standard normal distribution. The seller then chooses the number of CDOs to boobytrap, $B$ , and the number of lemons to plant in each boobytrapped CDO, $\theta$ , in order to maximize his utility as defined below:

(6) $\begin{align*} U &= \underset{\{B,\theta\}}{\mathtt{max}} \left\{ \ B \cdot \bar{v} \cdot \Delta_\theta \ \right\} \\ &\text{ s.t. no polynomial time algorithm\ldots} \end{align*}$

Since $\Delta_\theta$ is concave for $\theta < \phi$ and convex if $\theta \geq \phi$ , the optimal choice of $\theta$ is either $\theta = 0$ or $\theta = c \cdot \sqrt{D}$ for some small constant $c$ .

4. Main Result

The main result of the paper is to characterize the seller’s utility $U$ given this optimization program:

Theorem: If $\theta - \phi > 3 \cdot \sqrt{D}$ and $\theta/D \gg L/I$ , the seller will earn a utility:

(7) $\begin{align*} U &\geq \left(1 - 2 \cdot \mathtt{N}[-\phi/(2 \cdot \sqrt{D})] - o(1) \right) \cdot B \cdot \bar{v} \approx L \cdot \sqrt{I/J} \end{align*}$

Thus, if the seller issues roughly the same number of CDOs as there are mortgages, he can each roughly $\$1$ for each additional lemon that he discovers in the underlying pool of mortgages without the buyer being able to figure out whether or not he has cheated an boobytrapped some of the CDOs.

Proof: The strategy of the proof will be to first take any $B$ and $\theta$ that satisfy the theorem conditions and proceed to compute the lemons cost for a boobytrapped CDO. Then, use the link between the default rate on mortgages and the default rate on CDOs to compute an upper bound on the lemons cost of a non-boobytrapped CDO.

Suppose that $B$ and $\theta$ satisfy the non-polynomial time algorithm bounds. For each boobytrapped CDO $j \in \mathcal{B}$ , let $s_j$ denote the total number of defaulted mortgages. Since the CDO is boobytrapped $\theta$ mortgages will always default, so the expectation $\mathtt{E}[s_j]$ can be computed as:

(8) $\begin{align*} \mathtt{E}[s_j] = \frac{D - \theta}{2} + \theta \end{align*}$

Given that the number of assets in each CDO, $D$ , is large enough that the law of large numbers holds, we would then have that:

(9) $\begin{align*} \mathtt{Pr}\left[ \ s_j \geq \left(\frac{D + \phi}{2}\right) \ \right] &= 1 - \mathtt{N}\left[-\frac{\phi}{2 \cdot \sqrt{D}}\right] \end{align*}$

Next, I want to derive a bound on the expected payout of a non-boobytrapped CDO. Suppose that the expected number of defaulted assets for each non-boobytrapped CDO is $t$ . We know that $t$ has to satisfy the relationship:

(10) $\begin{align*} B \cdot \left( \frac{D + \theta}{2} \right) + \left( J - B \right) \cdot t &= \left( \frac{I + L}{2 \cdot L} \right) \cdot \frac{J \cdot D}{I} \end{align*}$

Both the left and right hand sides of the above equation are the proportion of mortgages that default. The left hand side of the above equation is the proportion of mortgages in the boobytrapped CDOs that default, $B \cdot (D + \theta)/2$ , plus the proportion of mortgages in the non-boobytrapped CDOs that default, $(J - B) \cdot t$ . The right hand side is the number of defaults per securitized mortgage times the average number of times each mortgage is securitized into a CDO. Thus, by solving for $t$ we can derive the inequality below:

(11) $\begin{align*} t &= \left( \frac{I + L}{2 \cdot L} \right) \cdot \frac{J \cdot D}{\left( J - B \right) \cdot I} - \frac{B}{J - B} \cdot \left( \frac{D + \theta}{2} \right) \\ &= \frac{D}{2} \cdot \left( \frac{(I + L) \cdot J}{(J - B) \cdot L \cdot I} \right) - \frac{B \cdot D}{2 \cdot (J - B)} - \frac{B \cdot \theta}{2 \cdot (J - B)} \\ &= \frac{D}{2} \cdot \left( \frac{I\cdot J}{(J - B) \cdot L \cdot I} \right) + \frac{D}{2} \cdot \left( \frac{L \cdot J}{(J - B) \cdot L \cdot I} \right) - \frac{B \cdot D}{2 \cdot (J - B)} - \frac{B \cdot \theta}{2 \cdot (J - B)} \\ &\geq \frac{D}{2} + \frac{L}{2 \cdot I} - \frac{B \cdot \theta}{2 \cdot (J - B)} \end{align*}$

Thus, the probability that the number of defaulted assets in a non-boobytrapped CDO is $(D+\phi)/2$ can be computed as:

(12) $\begin{align*} \mathtt{N}\left[ - 3 - \frac{B \cdot \theta}{2 \cdot (J - B) \cdot D} \right] &= \mathtt{N}\left[ - \frac{\phi}{2 \cdot \sqrt{D}} \right] - \left. \frac{d\mathtt{N}}{dx} \right|_{x = -3} \cdot \left( \frac{B \cdot \theta}{2 \cdot (J - B) \cdot D} \right) \\ &= \mathtt{N}\left[ - \frac{\phi}{2 \cdot \sqrt{D}} \right] - O \left( \frac{B \cdot \theta}{2 \cdot (J - B) \cdot D} \right) \end{align*}$

Thus, the expected number of securities (boobytrapped and non-boobytrapped) that yield no payout is at least $J \cdot \mathtt{N}[ - \phi/(2 \cdot \sqrt{D})] + (1 - 2 \cdot \mathtt{N}[ - \phi/(2 \cdot \sqrt{D})]) \cdot B - o(B)$ which is roughly $(1 - 2 \cdot \mathtt{N}[ - \phi/(2 \cdot \sqrt{D})])$ smaller than the case with no boobytrapping.