Dyson Brownian Motion

1. Introduction

I outline the construction of Dyson Brownian motion which governs the evolution of the eigen-values of an (N \times N)-dimensional stochastic process of Hermitian matrices. For instance, if A(t) is such a process, then:

(1)   \begin{align*} A(t+dt) \ &= \ A(t) \ + \ (dt)^{1/2} \cdot G, \end{align*}

where G in an (N \times N)-dimensional random Hermitian matrix drawn from the Gaussian Unitary Ensemble (\mathtt{GUE}_N).

Why study the eigen-values of a stream of Hermitian matrices? At first glance, this seems like a rather obscure mathematical object. Before I can answer this question, in Section 2 I first define what a Hermitian matrix is and discuss how I would select this random matrix G. Then, in Section 3 I can give some practical examples in which Dyson Brownian motion would be a useful construction. I also give an alternative interpretation of Dyson Brownian motion related to non-intersecting Brownian processes and explain the use of complex matrices in an economic context. Finally, in Section 4, I construct Dyson Brownian motion.

The main source for the material in this post is Terry Tao‘s set of lecture notes on Random Matrix Theory, though I also used Mehta (2004) and Anderson, Guinnet and Zeitouni (2009) as references.

2. Mathematical Foundation

First things first: “What is a Hermitian matrix?”

Definition (Hermitian Matrix): A square N \times N matrix A is called Hermitian if it is self-adjoint:

(2)   \begin{align*} A \ &= \ A^* \end{align*}

Each element (i,j) of a Hermitian matrix A is the complex conjugate of element (j,i) in A. Thus, the diagonal elements have to be real. Consider an example Hermitian matrix A':

(3)   \begin{align*} A' \ &= \ \begin{bmatrix} 1 & - i \\ i & 2 \end{bmatrix} \end{align*}

Hermitian matrices are just the complex extension of real symmetric matrices. For instance, the matrix A'' below is a real instance of a Hermitian matrix:

(4)   \begin{align*} A'' \ &= \ \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix} \end{align*}

Next, when I defined the stochastic process A(t) above, I characterized each lurch forward by the addition of a random matrix G scaled by the square root of the time interval. I pull this random matrix from the Gaussian Unitary Ensemble.

Definition (Gaussian Unitary Ensemble): The Gaussian Unitary Ensemble \mathtt{GUE}(N) is a probability space over the vector space of N \times N dimensional Hermitian matrices governed by the measure \mu defined as:

(5)   \begin{align*} \mu(A) \ &= \ \frac{1}{2^{\frac{N}{2}}\cdot \pi^{\frac{N^2}{2}}} \cdot e^{-\frac{N}{2} \cdot \mathtt{Tr}(A^2)} \end{align*}

So, each upper triangular element a_{i,j} will be drawn from \mathcal{N}(0,1)_{\mathbb{C}} while each element a_{i,i} along the diagonal will be drawn from \mathcal{N}(0,1)_{\mathbb{R}}. To get a feel for what this definition really means, consider a concrete example. Suppose that we want to know the probability \mu(A') for the example A' above. Using these paramters, I find that:

(6)   \begin{align*} \begin{split} (A')^2 \ &= \ \begin{bmatrix} 2 & 3 \cdot i \\ - 3 \cdot i & 5 \end{bmatrix} \\ \mathtt{Tr}[(A')^2] \ &= \ 2 \ + \ 5 \ = \ 7 \\ \mu(A') \ &= \ \frac{1}{2^{\frac{2}{2}} \cdot \pi^{\frac{2^2}{2}}} \cdot e^{-\frac{2}{2} \cdot 7} \\ &= \ \frac{1}{2 \cdot \pi^2} \cdot e^{-7} \\ &= \ 0.0000462 \end{split} \end{align*}

The analysis below will follow through if you consider only adding random matrices G drawn from an ensemble of real symmetric matrices. This ensemble is known as the Gaussian Orthogonal Ensemble.

Note that each of the elements of the matrices A(t) do not follow their own independent Brownian motion processes. In a loose sense, this would be “too little” structure. The main thrust of the Dyson Brownian motion construction is that the eigen-values of this process follow a Brownian motion plus a twist term. The eigen-values are an attractive summary statistic for 2 reasons. First, we know that they have a simple spectrum due to the fact that at each point in time A(t) is a Hermitian matrix. What’s more, the Ky Fan inequality tells us that the eigen-values should have a smooth transition function over time.

3. Motivating Examples

With these terms in hand, I can now ask: “Why worry about matrix valued stochastic processes?”

First, consider some financial applications. Financial theory is built around the \beta-pricing models which measure the correlation between the returns of assets and various risk factors. Models which assume that these \beta measures remain constant for long periods of time do poorly in empirical tests.1 It would be nice to characterize the evolution of these variance-covariance matrices.2 Alternatively, suppose that you were in the business of identifying the principle components of stock returns–either explicitly or by hunting for additional factors.3 Here, you might want to check to work out how likely it is that the largest principle component has moved by d\lambda in order to test your model.

In an entirely different context, Dyson Brownian motion can also be thought of as characterizing the evolution of N Brownian motions \begin{bmatrix} \lambda_1(t) & \ldots & \lambda_N(t) \end{bmatrix} that have been restricted to never intersect. The problem of modelling the eigen-values of Hermitian matrices and non-intersecting Brownian processes are not link a priori. However, constructing these non-intersecting processes is hard and an elegant solution emerges via solving the eigen-value process problem for Hermitian matrices which have a simple spectrum. For an example of the economic usefulness of such a trick, conside modelling the real option of a worker to switch jobs.4 At each point in time, he has a next best option but the exact nature of that next best option will change over time. Rather than keeping track of all possibilities, you could just model the evolution of the best option via Dyson Brownian motion.

Finally, I want to make a quick note about the use of complex valued rather than real matrices. Physicists declare that real symmetric matrices preserve time reversal symmetry while complex Hermitian matrices do not. For instance, in the financial application above where each of the entries in the variance-covariance matrix process have to be real, you can always undo the last step of the stochastic process by hitting A(t+dt) by a well chosen inverse. However, when using complex valued matrices, this inverse is no longer possible as complex numbers are periodic; i.e.,

(7)   \begin{align*} \begin{split} i \ &= \ \sqrt{-1} \\ i^2 \ &= \ -1 \\ i^3 \ &= \ - i \\ i^4 \ &= \ 1 \end{split} \end{align*}

To see the implications of this fact in a macroeconomic setting, consider a complex valued extension of a Leontif production model as follows. Suppose that prices are local5 and form the real part of each a_{i,j} entry while the magnitude of the transaction forms the complex part. So, for instance, a transaction of 3 tons of steel between a builder i and a steel maker j at a price of 5 dollars per ton would manifest itself as a_{i,j} = 5 + 3 \cdot i and \bar{a}_{i,j} = a_{j,i} = 5 - 3 \cdot i. Thus, by introducing an additional dimension to the Leontif matrix, the physical properties of the process it represents changes dramatically.

4. Construction

Finally, I actually get around to defining Brownian motion. Below, I state the result:

Theorem (Dyson Brownian Motion): Let t > 0, dt > 0 and \begin{bmatrix} \lambda_1(t) & \ldots & \lambda_N(t) \end{bmatrix} be the spectrum of eigen-values of the N \times N Hermitian matrix valued process \{A(t)\}_{t \geq 0}. Then, we have:

(8)   \begin{align*} d \lambda_i(t) \ &= \ d B_i(t) \ + \ \sum_{1 \leq j \leq N: j \neq i} \ \frac{d t}{\lambda_i(t) - \lambda_j(t)} \end{align*}

for all 1 \leq i \leq N, where d \lambda_i(t) = \lambda_i(t + dt) - \lambda_i(t) and \begin{bmatrix} dB_1 & \ldots & dB_N \end{bmatrix} are independent Brownian motion processes.

In words, this theorem says that the eigen-values of a stochastic process of Hermitian matrices behave like independent Brownian motions plus a repulsion force which is inversely proportional to the distances between any 2 eigen-values. What’s more, this repulsive force is not-localized. Each eigen-value \lambda_i(t) is pushed an pulled by \lambda_{i-1}(t) and \lambda_{i+1}(t) but also \lambda_1(t) and \lambda_N(t) as well as each and every eigen-level in between.

To formulate the construction, I use a Lemma from Hadamard given below:

Lemma (Hadamard Operator): The eigen-values of A have the following first and second derivatives with respect to time t:

(9)   \begin{align*} \dot{\lambda}_i \ &= \ u_i^* \ \dot{A} \ u_i \\ \ddot{\lambda}_i \ &= \ u^*_i \ \ddot{A} \ u_i \ + \ 2 \cdot \sum_{j \neq k} \ \frac{\left\vert u_i^* \ \dot{A} \ u_i \right\vert^2}{\lambda_j - \lambda_k} \end{align*}

Proof (Hadamard Operator):

(10)   \begin{align*} \begin{split} A \ u_i \ &= \lambda_i \ u_i \\ u_i^* \ u_i \ &= \ 1 \end{split} \end{align*}

(11)   \begin{align*} \begin{split} \dot{A} \ u_i \ + \ A \ \dot{u}_i \ &= \dot{\lambda}_i \cdot u_i \ + \ \lambda_i \cdot \dot{u}_i \\ \dot{u}_i^* \ u_i \ + \ u_i^* \ \dot{u}_i \ &= \ 0 \end{split} \end{align*}

(12)   \begin{align*} \dot{\lambda}_i \ &= \ u_i^* \ \dot{A} \ u_i \end{align*}

(13)   \begin{align*} \begin{split} \ddot{\lambda}_i \ &= \ \dot{u}^*_i \ \dot{A} \ u_i \ + \ u^*_i \ \ddot{A} \ u_i \ + \ u^*_i \ \dot{A} \ \dot{u}_i \\ 0 \ &= \ \dot{u}_j^* \ \dot{A} \ u_i \ + \ (\lambda_j - \lambda_i) \cdot u_j^* \ \dot{u}_i \end{split} \end{align*}

Finally, I use this lemma together with the properties of Hermitian matrices to finish the construction of Dyson Brownian motion.

Proof (Dyson Brownian Motion):

(14)   \begin{align*} A(t + dt) \ &= \ A(t) \ + \ (dt)^{1/2} \cdot G \end{align*}

(15)   \begin{align*} \lambda_i(t + dt) \ &= \ \lambda_i(t) \ + \ (dt)^{1/2} \cdot \nabla_G \lambda_i(t) \ + \ \frac{dt}{2} \cdot \nabla_G^2 \lambda_i(t) \ + \ \ldots \end{align*}

(16)   \begin{align*} \begin{split} \nabla_G \lambda_i(t) \ &= \ u_i^* \ G \ u_i \\ \nabla_G^2 \lambda_i(t) \ &= \ 2 \cdot \sum_{i \neq j} \ \frac{\left\vert u_j^* \ G \ u_i \right\vert^2}{\lambda_i(t) - \lambda_j(t)} \end{split} \end{align*}

(17)   \begin{align*} d \lambda_i(t) \ &= \ (dt)^{1/2} \cdot \left\{ \ u_i^* \ G \ u_i \ \right\} \ + \ dt \cdot \left\{ \ \sum_{i \neq j} \ \frac{\left\vert u_j^* \ G \ u_i \right\vert^2}{\lambda_i(t) - \lambda_j(t)} \ \right\} \end{align*}

(18)   \begin{align*} d \lambda_i(t) \ &= \ (dt)^{1/2} \cdot \varepsilon_{i,i} \ + \ dt \cdot \left\{ \ \sum_{i \neq j} \ \frac{\left\vert \varepsilon_{i,j} \right\vert^2}{\lambda_i(t) - \lambda_j(t)} \ \right\} \end{align*}