Comparing “Explanations” for the iVol Puzzle

1. Motivation

A stock’s idiosyncratic-return volatility is the root-mean-squared error, $\mathit{ivol}_{n,t} = \sqrt{ \sfrac{1}{D_t} \cdot \sum_{d_t=1}^{D_t} \varepsilon_{n,d_t}^2}$ , from the daily regression

(1) $\begin{align*} r_{n,d_t} = \alpha + \beta_{\mathit{Mkt}} \cdot r_{\mathit{Mkt},d_t} + \beta_{\mathit{SmB}} \cdot r_{\mathit{SmB},d_t} + \beta_{\mathit{HmL}} \cdot r_{\mathit{HmL},d_t} + \varepsilon_{n,d_t}. \end{align*}$
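As a minimal sketch of this calculation, the snippet below fits the daily regression in Equation (1) for one stock-month and returns the root-mean-squared residual. The function name `idiosyncratic_vol`, the array layout, and the simulated inputs are all illustrative assumptions, not the code linked from this post.

```python
import numpy as np

def idiosyncratic_vol(r, factors):
    """Root-mean-squared residual from regressing one stock's daily returns
    on a constant and the daily factor returns, as in Equation (1).

    r       : array of shape (D,), the stock's daily returns in one month
    factors : array of shape (D, 3), daily Mkt, SmB, and HmL returns
    """
    X = np.column_stack([np.ones(len(r)), factors])   # add the intercept
    coef, *_ = np.linalg.lstsq(X, r, rcond=None)      # alpha and three betas
    resid = r - X @ coef
    return np.sqrt(np.mean(resid ** 2))               # divides by D, not D - 4

# Hypothetical month with 21 trading days of simulated data.
rng = np.random.default_rng(0)
factors = rng.normal(0.0, 0.01, size=(21, 3))
noise = rng.normal(0.0, 0.02, size=21)
r = 0.001 + factors @ np.array([1.0, 0.5, -0.3]) + noise
ivol = idiosyncratic_vol(r, factors)
```

Dividing by $D_t$ rather than by the residual degrees of freedom follows the definition of $\mathit{ivol}_{n,t}$ given above.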

Ang, Hodrick, Xing, and Zhang (2006) show that stocks with lots of idiosyncratic-return volatility in the previous month have extremely low returns in the current month. To quantify just how low these returns are, I run a cross-sectional regression each month,

(2) $\begin{align*} r_{n,t} &= \mu_{r,t} + \beta_t \cdot \widetilde{\mathit{ivol}}_{n,t-1} + \epsilon_{n,t} \end{align*}$

where $\widetilde{\mathit{ivol}}_{n,t} = \sfrac{(\mathit{ivol}_{n,t} - \mu_{\mathit{ivol},t})}{\sigma_{\mathit{ivol},t}}$ . Over the period from January 1965 to December 2012, I estimate that a trading strategy which is long the stocks with higher-than-average idiosyncratic-return volatility in the previous month and short the stocks with lower-than-average idiosyncratic-return volatility in the previous month,

(3) $\begin{align*} \beta_t &= \frac{1}{N \cdot \sigma_{\mathit{ivol},t-1}} \cdot \sum_n \left( \mathit{ivol}_{n,t-1} - \mu_{\mathit{ivol},t-1} \right) \cdot r_{n,t}, \end{align*}$

has an average excess return of $\langle \beta_t \rangle = -0.98{\scriptstyle \%}$ per month or $-10.05{\scriptstyle \%}$ per year.
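The equivalence between the regression slope in Equation (2) and the portfolio return in Equation (3) can be checked numerically. The sketch below uses simulated data for a single month; the helper names and the data-generating numbers are assumptions chosen only for illustration. It relies on standardizing with the population standard deviation (NumPy's default `ddof=0`), which is what makes the two quantities coincide exactly.

```python
import numpy as np

def standardize(v):
    """Cross-sectional z-score using the population standard deviation."""
    return (v - v.mean()) / v.std()

def ols_slope(r, ivol_lag):
    """Slope from regressing returns on a constant and standardized ivol."""
    z = standardize(ivol_lag)
    X = np.column_stack([np.ones(len(r)), z])
    coef, *_ = np.linalg.lstsq(X, r, rcond=None)
    return coef[1]

def long_short_return(r, ivol_lag):
    """Right-hand side of Equation (3): a zero-cost portfolio whose weights
    are proportional to each stock's demeaned ivol in the previous month."""
    n = len(r)
    weights = (ivol_lag - ivol_lag.mean()) / (n * ivol_lag.std())
    return np.sum(weights * r)

# Simulated cross-section of 2,000 hypothetical stocks in one month.
rng = np.random.default_rng(1)
ivol_lag = rng.lognormal(mean=-4.0, sigma=0.5, size=2000)
r = 0.01 - 0.5 * ivol_lag + rng.normal(0.0, 0.10, size=2000)

slope = ols_slope(r, ivol_lag)
portfolio = long_short_return(r, ivol_lag)
```

Because the portfolio weights sum to zero, the strategy is self-financing, which is why $\beta_t$ can be read as an excess return.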

This is puzzling on two levels. First, standard asset-pricing theory says that traders shouldn’t be compensated for holding diversifiable risk, so it’s surprising that idiosyncratic-return volatility is priced at all. “But, wait a second,” you say. “Maybe there’s some friction that makes it hard to diversify away some of this idiosyncratic risk.” Right. Here’s where the second level comes in. If this were the case, if what we’re calling idiosyncratic-return volatility were somehow non-diversifiable, then you’d expect stocks with lots of idiosyncratic-return volatility to trade at a discount, not a premium. You’d expect these stocks to earn higher returns to compensate traders for holding additional risk, not lower returns. The results in Ang, Hodrick, Xing, and Zhang (2006) are so interesting because they suggest that idiosyncratic-return volatility is not only priced but priced with the wrong sign. People don’t just care about idiosyncratic-return volatility; they covet it.

Since 2006, numerous papers have attempted to explain this puzzling result. But, how can we tell if an explanation is good? In this post, I show how to answer this question using techniques introduced in Hou and Loh (2014). You can find all the code here. The data comes from WRDS.

2. Standard Approach

The standard approach for testing whether a candidate variable explains the idiosyncratic-return-volatility puzzle involves two stages. In the first stage, people show that the candidate variable is a strong predictor of idiosyncratic-return volatility in a cross-sectional regression,

(4) $\begin{align*} \widetilde{\mathit{ivol}}_{n,t} &= \gamma_t \cdot \widetilde{x}_{n,t} + \xi_{n,t}, \end{align*}$

where $\widetilde{x}_{n,t} = \sfrac{(x_{n,t} - \mu_{x,t})}{\sigma_{x,t}}$ . So, for example, if the candidate variable is the maximum daily return in the previous month as suggested in Bali, Cakici, and Whitelaw (2011), then this first-stage regression verifies that the stocks with the highest maximum return in any given month also have the most idiosyncratic-return volatility.

In the second stage, people then run a horse-race regression to see if the candidate variable “drives out” the significance of idiosyncratic-return volatility when predicting monthly returns:

(5) $\begin{align*} r_{n,t} &= \mu_{r,t} + \beta_t \cdot \widetilde{\mathit{ivol}}_{n,t-1} + \delta_t \cdot \widetilde{x}_{n,t-1} + \epsilon_{n,t}. \end{align*}$

The idea behind this second-stage regression is simple. If the estimated $\langle \beta_t \rangle = 0$ and $\langle \delta_t \rangle < 0$ , then the candidate variable explains both a) which stocks had lots of idiosyncratic-return volatility in the previous month and b) which stocks realized very low returns in the current month.

But, what happens if $\langle \beta_t \rangle < 0$ and $\langle \delta_t \rangle < 0$ , meaning that the candidate variable doesn’t explain all of the idiosyncratic-return-volatility puzzle? How can we tell how much of the idiosyncratic-return-volatility puzzle is explained by a given candidate? This two-stage regression procedure can’t answer this question.
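The horse-race regression in Equation (5) can be sketched as a month-by-month cross-sectional regression whose coefficients are then averaged over time. Everything below is a hedged illustration: the function names, the simulated panel, and the plain time-series t-statistic (with no autocorrelation adjustment) are assumptions rather than the procedure used in any particular paper.

```python
import numpy as np

def standardize(v):
    """Cross-sectional z-score using the population standard deviation."""
    return (v - v.mean()) / v.std()

def horse_race(r, ivol_lag, x_lag):
    """One month's regression in Equation (5); returns (mu, beta, delta)."""
    X = np.column_stack([np.ones(len(r)),
                         standardize(ivol_lag),
                         standardize(x_lag)])
    coef, *_ = np.linalg.lstsq(X, r, rcond=None)
    return coef

def average_slopes(months):
    """Time-series means and simple t-statistics of the monthly coefficients.

    months : list of (r, ivol_lag, x_lag) tuples, one per month
    """
    coefs = np.array([horse_race(*m) for m in months])
    mean = coefs.mean(axis=0)
    t_stat = mean / (coefs.std(axis=0, ddof=1) / np.sqrt(len(coefs)))
    return mean, t_stat

# Simulated panel: the candidate x drives returns, and ivol is only
# correlated with returns through x.
rng = np.random.default_rng(2)
months = []
for _ in range(60):
    x = rng.normal(size=500)
    ivol = 0.8 * x + 0.6 * rng.normal(size=500)
    r = 0.01 - 0.01 * standardize(x) + rng.normal(0.0, 0.05, size=500)
    months.append((r, ivol, x))

mean_coef, t_stat = average_slopes(months)   # order: mu, beta, delta
```

In this simulated setup the candidate variable fully drives out idiosyncratic-return volatility by construction. With real data both slopes are typically nonzero, which is exactly the case this procedure cannot quantify.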

3. Alternative Strategy

To understand how much of the puzzle is explained by, say, a stock’s maximum daily return in the previous month, we need to decompose the excess returns to trading on idiosyncratic-return volatility into two components, namely, the part explained by a stock’s maximum return and everything else:

(6) $\begin{align*} \beta_t &= \mathrm{Cov}[ \, r_{n,t} , \, \widetilde{\mathit{ivol}}_{n,t-1} \, ] \\ &= \mathrm{Cov}\left[ \, r_{n,t}, \, \gamma_t \cdot \widetilde{x}_{n,t-1} + \xi_{n,t-1} \, \right] \\ &= \underbrace{\gamma_t \cdot \mathrm{Cov}\left[ \, r_{n,t}, \, \widetilde{x}_{n,t-1} \, \right]}_{\text{Explained}} + \underbrace{\mathrm{Cov}\left[ \, r_{n,t}, \, \xi_{n,t-1} \, \right]}_{\substack{\text{Everything} \\ \text{Else}}}. \end{align*}$

This explained part is just the returns to a trading strategy that is long the stocks with a higher-than-average value of the candidate variable in the previous month and short the stocks with a lower-than-average value of the candidate variable in the previous month,

(7) $\begin{align*} \beta_{E,t} &= \gamma_t \cdot \mathrm{Cov}\left[ \, r_{n,t}, \, \widetilde{x}_{n,t-1} \, \right] = \frac{\gamma_t}{N \cdot \sigma_{x,t-1}} \cdot \sum_n \left( x_{n,t-1} - \mu_{x,t-1} \right) \cdot r_{n,t}, \end{align*}$

with the portfolio position scaled by the predictive power of the candidate variable over idiosyncratic-return volatility, $\gamma_t$ . Put differently, a candidate variable explains a lot of the idiosyncratic-return-volatility puzzle if it is a good predictor of idiosyncratic-return volatility, $\gamma_t > 0$ , and it generates negative returns when you trade on it, $(N \cdot \sigma_{x,t-1})^{-1} \cdot \sum_n( x_{n,t-1} - \mu_{x,t-1}) \cdot r_{n,t} < 0$ .
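The decomposition in Equations (6) and (7) is an exact accounting identity within each month, which a short numerical check makes concrete. The sketch below is illustrative only: `decompose` and the simulated inputs are hypothetical, and covariances are computed as cross-sectional means, which is valid because the standardized variables have mean zero.

```python
import numpy as np

def standardize(v):
    """Cross-sectional z-score using the population standard deviation."""
    return (v - v.mean()) / v.std()

def decompose(r, ivol_lag, x_lag):
    """Split one month's ivol slope into explained and unexplained parts."""
    z_ivol, z_x = standardize(ivol_lag), standardize(x_lag)
    gamma = np.mean(z_ivol * z_x) / np.mean(z_x ** 2)  # slope in Equation (4)
    xi = z_ivol - gamma * z_x                          # residual in Equation (4)
    beta = np.mean(z_ivol * r)                         # total, Equation (3)
    explained = gamma * np.mean(z_x * r)               # Equation (7)
    everything_else = np.mean(xi * r)                  # last term of Equation (6)
    return beta, explained, everything_else

# Simulated month: the candidate x is correlated with ivol and with returns.
rng = np.random.default_rng(3)
x_lag = rng.normal(size=1500)
ivol_lag = 0.7 * x_lag + rng.normal(size=1500)
r = 0.01 - 0.008 * x_lag - 0.002 * ivol_lag + rng.normal(0.0, 0.08, size=1500)

beta, explained, everything_else = decompose(r, ivol_lag, x_lag)
```

The two pieces sum to the total slope up to floating-point error, so the share attributed to the candidate variable is simply `explained / beta`.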

If we have an expression for the explained component of the idiosyncratic-return-volatility puzzle, then we can use GMM to estimate it. Let $\mathbf{\Theta}_t$ be a vector of coefficients to be estimated in month $t$ ,

(8) $\begin{align*} \mathbf{\Theta}_t &= \begin{bmatrix} \mu_{r,t} & \beta_t & \gamma_t & \beta_{E,t} \end{bmatrix}, \end{align*}$

using the cross-section of all NYSE, AMEX, and NASDAQ stocks in WRDS. We can then estimate this $(1 \times 4)$ -dimensional vector of coefficients using the following moment conditions:

(9) $\begin{align*} g_N(\mathbf{\Theta}_t) &= \frac{1}{N} \cdot \sum_n \begin{pmatrix} r_{n,t} - \mu_{r,t} - \beta_t \cdot \widetilde{\mathit{ivol}}_{n,t-1} \\ \left( r_{n,t} - \mu_{r,t} - \beta_t \cdot \widetilde{\mathit{ivol}}_{n,t-1} \right) \cdot \widetilde{\mathit{ivol}}_{n,t-1} \\ \left( \widetilde{\mathit{ivol}}_{n,t-1} - \gamma_t \cdot \widetilde{x}_{n,t-1} \right) \cdot \widetilde{x}_{n,t-1} \\ \beta_{E,t} - \gamma_t \cdot r_{n,t} \cdot \widetilde{x}_{n,t-1} \end{pmatrix}, \end{align*}$

where the first two moments estimate the OLS regression in Equation (2), the third moment estimates the OLS regression in Equation (4), and the fourth moment estimates the explained component of the idiosyncratic-return-volatility puzzle as given in Equation (7). When I estimate these $4$ coefficients each month for the maximum return candidate explanation, I find that the explained return is $-0.84{\scriptstyle \%}$ per month, or about $\sfrac{0.84}{0.99} = 85{\scriptstyle \%}$ of the total puzzle.
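Because Equation (9) has four moment conditions and four unknowns, the system is exactly identified, so the GMM point estimates are the values that set $g_N(\mathbf{\Theta}_t)$ to zero regardless of the weighting matrix. The sketch below takes advantage of this and solves the moments directly rather than running a numerical GMM routine. The function names and simulated data are assumptions, and standard errors, which are the main reason to use GMM here, are not computed.

```python
import numpy as np

def standardize(v):
    """Cross-sectional z-score using the population standard deviation."""
    return (v - v.mean()) / v.std()

def moments(theta, r, z_ivol, z_x):
    """Sample moment vector g_N from Equation (9) for one month."""
    mu, beta, gamma, beta_e = theta
    u = r - mu - beta * z_ivol
    return np.array([
        np.mean(u),                              # intercept of Equation (2)
        np.mean(u * z_ivol),                     # slope of Equation (2)
        np.mean((z_ivol - gamma * z_x) * z_x),   # slope of Equation (4)
        np.mean(beta_e - gamma * r * z_x),       # explained part, Equation (7)
    ])

def solve_moments(r, z_ivol, z_x):
    """Closed-form solution of g_N(theta) = 0 in the exactly identified case."""
    A = np.array([[1.0, np.mean(z_ivol)],
                  [np.mean(z_ivol), np.mean(z_ivol ** 2)]])
    b = np.array([np.mean(r), np.mean(r * z_ivol)])
    mu, beta = np.linalg.solve(A, b)
    gamma = np.mean(z_ivol * z_x) / np.mean(z_x ** 2)
    beta_e = gamma * np.mean(r * z_x)
    return np.array([mu, beta, gamma, beta_e])

# Simulated month with hypothetical inputs.
rng = np.random.default_rng(4)
x_lag = rng.normal(size=1500)
ivol_lag = 0.7 * x_lag + rng.normal(size=1500)
r = 0.01 - 0.008 * x_lag + rng.normal(0.0, 0.08, size=1500)

z_ivol, z_x = standardize(ivol_lag), standardize(x_lag)
theta = solve_moments(r, z_ivol, z_x)
g = moments(theta, r, z_ivol, z_x)
fraction_explained = theta[3] / theta[1]   # beta_E / beta for this month
```

To mirror the calculation in the text, one would repeat this each month and take the ratio of the time-series averages of $\beta_{E,t}$ and $\beta_t$, rather than averaging the monthly ratios.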

[Figure: fraction of the idiosyncratic-return-volatility puzzle explained]

4. What Counts As An “Explanation”?

One of the nice things about setting the problem up this way, as opposed to using approximation methods like in the original Hou and Loh (2014) article, is that it makes really clear what an explanation is. A candidate variable is a good explanation of the idiosyncratic-return-volatility puzzle if you can’t make (or lose) money by trading on idiosyncratic-return volatility that isn’t predicted by the candidate variable:

(10) $\begin{align*} \text{``Everything Else'' term in Equation (6)} &= \frac{1}{N} \cdot \sum_n \left( \widetilde{\mathit{ivol}}_{n,t-1} - \gamma_t \cdot \widetilde{x}_{n,t-1} \right) \cdot r_{n,t}. \end{align*}$

Note that this is a very particular definition of “explained”. Usually, when you think about an explanation, you have in mind some causal mechanism. But, that’s not what’s meant here. It’s not as if you could give some stocks higher idiosyncratic-return volatility and lower future returns by randomly assigning them higher maximum returns in the current month and not changing any of their other properties. Clearly, there is some deeper mechanism at play that’s causing both higher idiosyncratic-return volatility and higher maximum returns. But, you can’t trade on the part of idiosyncratic-return volatility that’s not explained by the maximum return.