The empirical content of the discount factor view of asset pricing can all be derived from the equation below:

\[ 0 = \mathrm{E}[m_{t+1} \cdot r_{n,t+1}] \tag{1} \]
where \(m_{t+1}\) denotes the prevailing stochastic discount factor and \(r_{n,t+1}\) denotes asset \(n\)'s excess return. Equation (1) reads: “In the absence of margin requirements and transactions costs, it costs you $0 today to borrow at the riskless rate, buy a stock, and hold the position for 1 period.” The question is then why average excess returns, \(\mathrm{E}[r_{n,t+1}]\), vary across the \(N\) assets even though they all have the same price today by construction.
The answer hinges on the behavior of the stochastic discount factor, \(m_{t+1}\), in Equation (1). What is this thing? Everyone knows that it is better to have $1 today than tomorrow, and the present value of an asset that pays out $1 tomorrow is called the discount factor. Sometimes important stuff will happen in the next 24 hours that changes how awesome it is to have an additional $1 tomorrow. As a result, the realized discount factor is a random variable each period (i.e., it follows a stochastic process). e.g., if agents have power utility, \(u(c) = c^{1-\theta}/(1-\theta)\), with time-preference parameter \(\delta\), then the stochastic discount factor is \(m_{t+1} = \delta \cdot e^{-\theta \cdot \Delta \log c_{t+1}}\) and the stuff (i.e., risk factor) is changes in log consumption.
An asset pricing model is a machine which takes as inputs a) each agent’s preferences, b) each agent’s information, and c) a list of the relevant risk factors affecting how agents discount the future, and produces a stochastic discount factor as its output. In this post, I show how to test an asset pricing model using the cross-section of asset returns, i.e., by linking how average excess returns vary across assets to each asset’s exposure to the risk factors governing the behavior of the stochastic discount factor.
2. Theoretical Predictions
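The key to massaging Equation (1) into a form that can be taken to the data is noticing that for any random variables \(y\) and \(z\), the following identity holds:

\[ \mathrm{E}[y \cdot z] = \mathrm{Cov}[y, z] + \mathrm{E}[y] \cdot \mathrm{E}[z] \tag{2} \]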
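Thus, if I let \(m\) denote the stochastic discount factor and let \(r_n\) denote any of the \(N\) excess returns, I can link the expected excess return to holding an asset to its covariance with the stochastic discount factor:

\[ \mathrm{E}[r_n] = - \frac{\mathrm{Cov}[m, r_n]}{\mathrm{E}[m]} \tag{3} \]

\[ \mathrm{E}[r_n] = \frac{\mathrm{Cov}[m, r_n]}{\mathrm{Var}[m]} \times \left( - \frac{\mathrm{Var}[m]}{\mathrm{E}[m]} \right) \tag{4} \]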
The first term is dimensionless and represents the amount of exposure asset \(n\) has to the risk factor. The second term has the dimension of a return per period, is common across all assets, and represents the price of exposure to the risk factor since it has the same units as the expected return \(\mathrm{E}[r_n]\). Asset pricing theories say that each asset’s expected return should be proportional to the market-wide prices of risk where the constant of proportionality is the asset’s “exposure” to that risk factor.
What does “exposure” mean here? To answer this question I need to put a bit more structure on the stochastic discount factor, \(m\), and the excess return, \(r_n\). I remain agnostic about which asset pricing model actually governs returns and which risk factors affect discount rates, but to avoid writing out lots of messy matrices I do assume that there is only a single factor, \(x\), with \(\mathrm{E}[x] = \mu_x\) and \(\mathrm{Var}[x] = \sigma_x^2\). I then write the stochastic discount factor as the sum of a function of \(x\), \(\phi(x)\), and some noise, \(\epsilon\):

\[ m = \phi(x) + \epsilon \approx \mu_m + \gamma \cdot (x - \mu_x) + \epsilon \tag{5} \]
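where I use a Taylor expansion to linearize the function \(\phi\) around the point \(\mu_x\) and assume terms of order \(O((x - \mu_x)^2)\) are negligible so that \(\mu_m = \phi(\mu_x) = \mathrm{E}[m]\) and \(\gamma = \phi'(\mu_x)\). I take the noise \(\epsilon\) to be mean zero and uncorrelated with \(x\). When \(\gamma > 0\), this means that if the risk factor is larger than expected, \(x > \mu_x\), then agents value having an additional $1 tomorrow more than usual. Similarly, suppose each excess return is the sum of an asset-specific function of \(x\), \(\psi_n(x)\), and some asset-specific noise, \(\epsilon_n\):

\[ r_n = \psi_n(x) + \epsilon_n \approx \alpha_n + \beta_n \cdot (x - \mu_x) + \epsilon_n \tag{6} \]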
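where I use a Taylor expansion to linearize the function \(\psi_n\) around the point \(\mu_x\) and assume terms of order \(O((x - \mu_x)^2)\) are negligible so that \(\alpha_n = \psi_n(\mu_x) = \mathrm{E}[r_n]\) and \(\beta_n = \psi_n'(\mu_x)\). I take the noise \(\epsilon_n\) to be mean zero and uncorrelated with both \(x\) and \(\epsilon\). When \(\beta_n > 0\), this means that if the risk factor is larger than expected, \(x > \mu_x\), then asset \(n\)'s realized excess returns will be larger than average.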
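Each asset’s exposure to the risk factor is summarized by the coefficient \(\beta_n\). Assets which have higher realized returns when the risk factor is high (have a large \(\beta_n\)) will have lower average returns (high prices) since these assets are good hedges against the risk factor, i.e., these assets look like insurance. Plugging the linearized stochastic discount factor and excess return into Equation (3) gives \(\mathrm{Cov}[m, r_n] = \gamma \cdot \beta_n \cdot \sigma_x^2\), where \(\gamma\) is the slope of the stochastic discount factor on the risk factor, \(\sigma_x^2\) is the variance of the risk factor, and \(\mu_m = \mathrm{E}[m]\). Equation (1)’s empirical content is then that an asset’s average excess return, \(\mathrm{E}[r_n]\), is proportional to its exposure to the risk factor, \(\beta_n\), where the constant of proportionality is the same for all assets:

\[ \mathrm{E}[r_n] = \beta_n \cdot \lambda \qquad \text{with} \qquad \lambda = - \frac{\gamma \cdot \sigma_x^2}{\mu_m} \tag{7} \]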
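By letting \(\rho_n = \mathrm{Corr}[m, r_n]\) and writing \(\sigma[\cdot]\) for a standard deviation, we can interpret this relationship as a realization of the first Hansen-Jagannathan bound:

\[ \frac{|\mathrm{E}[r_n]|}{\sigma[r_n]} = |\rho_n| \cdot \frac{\sigma[m]}{\mathrm{E}[m]} \leq \frac{\sigma[m]}{\mathrm{E}[m]} \tag{8} \]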
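The proportionality between average excess returns and factor exposures can be sanity-checked by simulation. The sketch below is only an illustration under assumed values: the function name `simulate_pricing_check`, the parameter values, and the normal shocks are all hypothetical choices, not part of the original argument. It draws a single factor, builds a linear stochastic discount factor and linear excess returns, sets each asset's mean excess return to its exposure times a common price of risk, and confirms that the sample analogue of Equation (1) is close to zero.

```python
import numpy as np


def simulate_pricing_check(T=200_000, seed=0):
    """Simulate a one-factor linear model and check that E[m * r_n] is near 0.

    All numerical values below are assumptions chosen for illustration.
    """
    rng = np.random.default_rng(seed)
    mu_x, sigma_x = 0.0, 0.05          # factor mean and volatility (assumed)
    mu_m, gamma = 0.99, 2.0            # SDF mean and slope on the factor (assumed)
    betas = np.array([-1.0, 0.5, 2.0])  # factor exposures of three test assets

    # Price of risk implied by the linear SDF: lambda = -gamma * sigma_x^2 / mu_m
    lam = -gamma * sigma_x**2 / mu_m

    x = mu_x + sigma_x * rng.standard_normal(T)
    m = mu_m + gamma * (x - mu_x) + 0.1 * rng.standard_normal(T)

    # Excess returns: mean beta_n * lambda, loading beta_n on the factor surprise,
    # plus asset-specific noise that is independent of everything else.
    eps = 0.02 * rng.standard_normal((T, betas.size))
    r = betas * lam + np.outer(x - mu_x, betas) + eps

    pricing_errors = (m[:, None] * r).mean(axis=0)  # sample analogue of E[m * r_n]
    return lam, betas, r.mean(axis=0), pricing_errors


lam, betas, mean_r, pricing_errors = simulate_pricing_check()
print("price of risk:", lam)
print("average excess returns:", mean_r)
print("beta * lambda:", betas * lam)
print("sample E[m * r]:", pricing_errors)
```

With a positive slope on the factor the implied price of risk is negative, so the asset with the largest exposure earns the lowest average excess return, which is the insurance logic described above.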
3. Empirical Strategy
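To test Equation (7), an econometrician has to estimate \(2 \cdot N + 2\) unknown parameters:

\[ \Theta = \begin{bmatrix} \mu_x & \alpha_1 & \beta_1 & \cdots & \alpha_N & \beta_N & \lambda \end{bmatrix} \tag{9} \]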
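using \(T\) periods of observations, i.e., 2 parameters for each asset (its average excess return, \(\alpha_n\), and its factor exposure, \(\beta_n\)) as well as 2 market-wide parameters (the risk factor mean, \(\mu_x\), and the market price of risk, \(\lambda\)). There are \(3 \cdot N + 1\) moment equations with which to estimate these parameters via GMM, so the system is over-identified whenever there are \(N \geq 2\) assets:

\[
\begin{aligned}
0 &= \mathrm{E}[x_t - \mu_x] \\
0 &= \mathrm{E}[r_{n,t} - \alpha_n - \beta_n \cdot (x_t - \mu_x)] && n = 1, \ldots, N \\
0 &= \mathrm{E}[\{r_{n,t} - \alpha_n - \beta_n \cdot (x_t - \mu_x)\} \cdot (x_t - \mu_x)] && n = 1, \ldots, N \\
0 &= \mathrm{E}[r_{n,t} - \beta_n \cdot \lambda] && n = 1, \ldots, N
\end{aligned}
\tag{10}
\]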
The first equation pins down the mean of the factor, \(\mu_x\). The following \(2 \cdot N\) equations identify the parameters \(\alpha_n\) and \(\beta_n\) governing the relationship between the risk factor and each asset’s excess returns. The final \(N\) equations pin down the market price of risk, \(\lambda\), for exposure to the risk factor \(x\). A risk is “priced” if \(\lambda \neq 0\).
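To make the roles of the three blocks of moment conditions concrete, here is a minimal sketch that solves them sequentially rather than jointly. This is a simplification and not efficient GMM: the factor mean and the per-asset intercepts and slopes are chosen to set the first blocks of sample moments exactly to zero, and the price of risk is then chosen to minimize the sum of squared pricing errors with identity weighting. The function name `two_pass_estimate` and its argument layout are hypothetical.

```python
import numpy as np


def two_pass_estimate(r, x):
    """Sequential (not efficient-GMM) estimate of the cross-sectional model.

    r : array of shape (T, N), excess returns for N test assets
    x : array of shape (T,), realizations of the single risk factor
    Returns (mu_x, alpha, beta, lam, pricing_errors).
    """
    r = np.asarray(r, dtype=float)
    x = np.asarray(x, dtype=float)

    # Block 1: the factor mean sets the sample mean of (x_t - mu_x) to zero.
    mu_x = x.mean()
    xd = x - mu_x

    # Block 2: time-series regressions of each return on the demeaned factor.
    # Because xd has sample mean zero, the intercept is the average return.
    alpha = r.mean(axis=0)
    beta = xd @ (r - alpha) / (xd @ xd)

    # Block 3: N pricing moments but one free parameter, so pick the price of
    # risk that minimizes the sum of squared errors alpha_n - beta_n * lam.
    lam = (beta @ alpha) / (beta @ beta)
    pricing_errors = alpha - beta * lam
    return mu_x, alpha, beta, lam, pricing_errors
```

The leftover `pricing_errors` are what the over-identifying restrictions test: with a single asset they are zero by construction, and with two or more they are only jointly small if the model prices the cross-section.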
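Note that this empirical strategy doesn’t pin down every single one of the parameters governing the relationship between the stochastic discount factor and each asset’s excess returns. e.g., the parameter estimates \(\beta_n\) and \(\lambda\) are composites of several deep parameters:

\[ \beta_n = \psi_n'(\mu_x) \qquad \text{and} \qquad \lambda = - \frac{\phi'(\mu_x) \cdot \sigma_x^2}{\phi(\mu_x)} = - \frac{\gamma \cdot \sigma_x^2}{\mu_m} \tag{11} \]

The level, \(\mu_m\), and slope, \(\gamma\), of the stochastic discount factor, as well as the variance of its noise term, which never enters the moment conditions, are not separately identifiable from this approach. Only the ratio \(\gamma / \mu_m\) matters, so any rescaling of the stochastic discount factor by a constant \(c > 0\) leaves the estimates for \(\beta_n\) and \(\lambda\) unchanged:

\[ (\mu_m, \gamma) \mapsto (c \cdot \mu_m, \, c \cdot \gamma) \quad \Rightarrow \quad \lambda \mapsto - \frac{c \cdot \gamma \cdot \sigma_x^2}{c \cdot \mu_m} = \lambda \tag{12} \]

e.g., if you double \(\gamma\) and double \(\mu_m\), then the estimate of \(\lambda\) remains unchanged.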
4. Time Scale Considerations
There is a hidden assumption floating around behind the empirical strategy outlined in Section 3 above. Namely, that each asset’s factor exposure, \(\beta_n\), is constant and the market price of risk, \(\lambda\), is constant. In practice, this is surely not the case, as is documented in Jagannathan and Wang (1996) and Lewellen and Nagel (2006). OK… so constant factor exposures and prices of risk are an approximation. Fine. How good/bad an approximation is it? e.g., Fama and MacBeth (1973) use rolling 60 month windows to estimate each asset’s \(\beta_n\). Is this too long a window relative to how much factor exposures vary over time? Alternatively, should we be using a longer window to more accurately pin down these parameters? It turns out that the estimation strategy gives some guidance about the relationship between the optimal estimation window and parameter persistence, which I discuss below.
First, I model the evolution of the true parameters. To test an asset pricing model using the cross-section of excess returns, we are interested in knowing whether or not \(\lambda \neq 0\). Suppose the true market price of risk, \(\lambda_t\), follows a random walk:

\[ \lambda_{t+1} = \lambda_t + \sigma_\lambda \cdot \eta_{t+1} \tag{13} \]
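where \(\eta_{t+1} \overset{\text{iid}}{\sim} \mathrm{N}(0, 1)\) so that the final \(\lambda_T\) is a random variable with distribution:

\[ \lambda_T \mid \lambda_0 \sim \mathrm{N}(\lambda_0, \, T \cdot \sigma_\lambda^2) \tag{14} \]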
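Second, I note that the estimation strategy outlined in Section 3 above gives a signal, \(\hat{\lambda}\), about the average market price of risk with distribution:

\[ \hat{\lambda} \mid \lambda_0 \sim \mathrm{N}(\lambda_0, \, \sigma_s^2(T)) \tag{15} \]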
where \(\sigma_s^2(T)\) denotes the variance of the estimation error from the GMM procedure. There is an additional complication to consider. Namely, if the true market price of risk is floating around during the estimation period, it will add additional noise to the parameter estimates and increase \(\sigma_s^2(T)\). To keep things simple, suppose that nature sets the market price of risk to \(\lambda_0\) at the beginning of the estimation sample and it remains constant during the estimation period. Then, \(\lambda_T\) is revealed at the end of time \(T\) and prevails afterwards. This means that the derivations below are really inequalities because they underestimate \(\sigma_s^2(T)\).
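The cost side of the window-length trade-off comes from the random walk: the longer the sample, the further the price of risk prevailing at the end can have drifted from the one governing the sample. A short simulation makes the linear growth of that variance visible. This is a sketch under assumptions: the function name `simulate_terminal_lambda`, the Gaussian shocks, and the numerical values in the usage lines are illustrative choices only.

```python
import numpy as np


def simulate_terminal_lambda(lam0, sigma_lam, T, n_paths=20_000, seed=0):
    """Draw terminal values of a random walk started at lam0.

    Each path adds T independent N(0, sigma_lam^2) shocks to lam0, so the
    terminal value should have mean lam0 and variance T * sigma_lam^2.
    """
    rng = np.random.default_rng(seed)
    shocks = sigma_lam * rng.standard_normal((n_paths, T))
    return lam0 + shocks.sum(axis=1)


# Usage with assumed values: compare the simulated variance to T * sigma_lam^2.
lam_T = simulate_terminal_lambda(lam0=0.5, sigma_lam=0.01, T=60)
print("simulated variance:", lam_T.var())
print("T * sigma_lam^2:   ", 60 * 0.01**2)
```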
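What I really care about is the distance between the true \(\lambda_T\) at the end of the sample, which governs the market going forward, and the GMM estimate of \(\lambda_0\). Thus, I should choose the sample period length, \(T\), to minimize:

\[ \min_T \; \mathrm{E}\left[ \left( \lambda_T - \mathrm{E}[\lambda_0 \mid \hat{\lambda}] \right)^2 \right] \tag{16} \]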
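As a result, to find the optimal \(T\) I take the first order condition:

\[ 0 = \sigma_\lambda^2 + \left( \frac{\sigma_0^2}{\sigma_0^2 + \sigma_s^2(T)} \right)^2 \cdot \frac{d \sigma_s^2(T)}{dT} \tag{17} \]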
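where \(\sigma_0^2\) denotes the variance of my priors about the market price of risk governing the estimation sample, \(\lambda_0\). This condition follows because the drift in \(\lambda_t\) is independent of the estimation error, so the expected squared distance is the sum of the random-walk variance, \(T \cdot \sigma_\lambda^2\), and the posterior variance of \(\lambda_0\) given the signal, \(\sigma_0^2 \cdot \sigma_s^2(T) / (\sigma_0^2 + \sigma_s^2(T))\). The solution to this equation defines the window length, \(T^*\), which optimally trades off the benefit of getting a more precise estimate of \(\lambda_0\) with the cost of decreasing the relevance of this estimate due to the evolution of \(\lambda_t\).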
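To evaluate this condition I need to know how the GMM estimation error, \(\sigma_s^2(T)\), maps onto the parameters of the underlying model. To keep things simple, suppose there is only \(N = 1\) asset and treat its factor exposure, \(\beta\), as known, which leaves a single unknown parameter:

\[ \Theta = \begin{bmatrix} \lambda_0 \end{bmatrix} \tag{18} \]

so that the system of estimation equations reduces to:

\[ 0 = \mathrm{E}[r_t - \beta \cdot \lambda_0] \tag{19} \]

This assumption means that I don’t have to consider how learning about one asset affects my beliefs about another asset. In this world, if the pricing errors \(e_t = r_t - \beta \cdot \lambda_0\) are i.i.d. with variance \(\sigma_e^2\), then GMM reduces to OLS and \(\sigma_s^2(T) = \sigma_e^2 / (\beta^2 \cdot T)\) since:

\[ \hat{\lambda} = \frac{1}{\beta} \cdot \frac{1}{T} \sum_{t=1}^T r_t = \lambda_0 + \frac{1}{\beta \cdot T} \sum_{t=1}^T e_t \tag{20} \]

Evaluating the first order condition then gives:

\[ 0 = \sigma_\lambda^2 - \frac{\sigma_0^4 \cdot \beta^2 \cdot \sigma_e^2}{(\sigma_0^2 \cdot \beta^2 \cdot T + \sigma_e^2)^2} \tag{21} \]

Solving for \(T\) yields:

\[ T^* = \frac{\sigma_e}{|\beta| \cdot \sigma_\lambda} - \frac{\sigma_e^2}{\beta^2 \cdot \sigma_0^2} \tag{22} \]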
Let’s plug in some values to make sure this formula makes sense. First, notice that if the market price of risk is constant, \(\sigma_\lambda = 0\), then \(T^* \to \infty\) and you should pick \(T = \infty\) or as large as possible. Second, notice that if you already know the true \(\lambda_0\), then \(\sigma_0^2 = 0\), the formula for \(T^*\) is driven below zero, and you should pick \(T = 0\). Finally, notice that if the test asset has no exposure to the risk factor, \(\beta = 0\), then the equation is undefined since any window length gives you the same amount of information, i.e., none.
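These limiting cases can also be checked numerically. The sketch below is an illustration under stated assumptions rather than a definitive implementation: it assumes a single test asset with known exposure `beta`, a signal variance of `sigma_e**2 / (beta**2 * T)`, a normal prior on the sample's price of risk with standard deviation `sigma_0`, and a random walk with per-period volatility `sigma_lam`. The function names `expected_loss` and `optimal_window` and the numbers in the usage lines are hypothetical.

```python
import math

import numpy as np


def expected_loss(T, beta, sigma_e, sigma_lam, sigma_0):
    """Random-walk variance plus posterior variance of the sample price of risk.

    Assumes signal variance sigma_e^2 / (beta^2 * T); written so T = 0 is allowed.
    """
    posterior_var = sigma_0**2 * sigma_e**2 / (sigma_0**2 * beta**2 * T + sigma_e**2)
    return T * sigma_lam**2 + posterior_var


def optimal_window(beta, sigma_e, sigma_lam, sigma_0):
    """Window length minimizing expected_loss, truncated at zero."""
    if beta == 0:
        raise ValueError("undefined: an asset with no exposure is uninformative")
    if sigma_0 == 0:
        return 0.0       # the price of risk is already known, so use no data
    if sigma_lam == 0:
        return math.inf  # the price of risk never moves, so use all the data
    t_star = sigma_e / (abs(beta) * sigma_lam) - sigma_e**2 / (beta**2 * sigma_0**2)
    return max(t_star, 0.0)


# Usage with assumed values: compare the closed form to a brute-force grid search.
params = dict(beta=1.0, sigma_e=0.05, sigma_lam=0.001, sigma_0=0.02)
grid = np.arange(0, 500)
best_on_grid = grid[np.argmin(expected_loss(grid, **params))]
print("closed form:", optimal_window(**params))
print("grid search:", best_on_grid)
```

The loss is convex in the window length, so the integer grid minimizer should sit within one period of the closed-form value.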