Neglecting The Madness Of Crowds

Motivation

This post is motivated by two stylized facts about bubbles and crashes. The first is that these events are often attributed to the madness of crowds. In popular accounts, they occur when a large number of inexperienced traders floods into the market and mob psychology takes over. For some examples, just think about day traders during the DotCom bubble, out-of-town buyers during the housing bubble, or first-time investors during the Chinese warrant bubble.

The second stylized fact is that, even though bubbles and crashes have a large impact on the market, traders seem to ignore the risk posed by the madness of crowds during normal times. Gripped by “new-era thinking”, they often insist on justifying market events with fundamentals until some sudden price movement forces them to reckon with the madness of crowds. This phenomenon is referred to as “neglected risk” in the asset-pricing literature.

With these two stylized facts in mind, this post investigates how hard it is for traders to learn about aggregate noise-trader demand when the number of noise traders can vary over several orders of magnitude—i.e., when there’s a possibility that the crowd’s gone mad. I find something surprising: it makes sense for existing traders to neglect the madness of crowds during normal times. Here’s the logic. Noise traders push prices away from fundamentals. So, if you don’t see a large unexpected price movement away from fundamentals, then there must not be very many noise traders in the market. And, if there aren’t very many noise traders, then they can’t affect the equilibrium price very much. But, this means that there’s no way for you to learn about aggregate noise-trader demand from the equilibrium price, which means that there’s no reason for you to revise your beliefs about aggregate noise-trader demand away from zero.

To illustrate this point, I’m going to make use of a happy mathematical coincidence. It turns out that, if you assume changes in the number of noise traders are governed by a stochastic version of the logistic growth model (see here, here, or here for examples), then the stationary distribution for the number of noise traders will be Exponential. And, the right way to learn about the mean of a Gaussian random variable whose variance is drawn from an Exponential distribution is to use the LASSO, a penalized regression which delivers point estimates that are precisely zero whenever the unpenalized estimate is sufficiently small.

Inference Problem

Here’s the inference problem I’m going to study. Suppose there’s a stock with price $P$ , and there are $N > 0$ noise traders present in this market. And, assume that this price is a linear function of three variables:

(1) $\begin{equation*} P = \alpha + \beta \cdot F + \gamma \cdot \{C - S\} \end{equation*}$

Above, $F$ denotes the stock’s fundamental value, $C \sim \mathrm{Normal}(0, \, N)$ denotes noise due to the madness of crowds, and $S \sim \mathrm{Normal}(0, \, \sigma^2)$ denotes noise due to random supply shocks. The negative sign on supply noise comes from the fact that more supply means lower prices. You can think about the supply noise as the result of hedging demand or rebalancing cascades. The source doesn’t matter. The key point is that this noise source has constant variance.

Crowd noise is different, though. Its variance is equal to the number of noise traders in the market, $N$ , and this population can change. Index the noise traders by $n = 1, \, \ldots, \, N$ , and suppose each individual trader in this crowd has demand that’s iid normal:

(2) $\begin{equation*} c_n \overset{\scriptscriptstyle \text{iid}}{\sim} \mathrm{Normal}(0, \, 1) \end{equation*}$

Then, the aggregate demand of the entire crowd of noise traders has distribution:

(3) $\begin{equation*} C \overset{\scriptscriptstyle \text{def}}{=} {\textstyle \sum_{n=1}^N} \, c_n \sim \mathrm{Normal}(0, \, N) \end{equation*}$

In this setting, the rescaled pricing error, $\tilde{P} \overset{\scriptscriptstyle \text{def}}{=} \sfrac{1}{\gamma} \cdot \{P - \alpha - \beta \cdot F\}$ , is a normally distributed signal about the aggregate demand coming from the crowd of noise traders:

(4) $\begin{equation*} \tilde{P} = C - S \end{equation*}$

When the price is above its fundamental value, $\{P - \alpha - \beta \cdot F\} > 0$ , it must be because either the crowd of noise traders has high demand, $C > 0$ , or there is unexpectedly low supply, $S < 0$ . The question I want to answer below is: how hard is it for traders to figure out which noise source is responsible?
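To make Equations (2) through (4) concrete, here is a minimal simulation sketch in Python. The values of $N$ and $\sigma$ , the seed, and the variable names are illustrative assumptions, not part of the model above. The sketch sums $N$ iid standard-normal demands to get $C$ , subtracts independent supply noise, and checks that the rescaled pricing error has variance close to $N + \sigma^2$ .

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

N = 50          # assumed number of noise traders (illustrative)
sigma = 2.0     # assumed std. dev. of supply noise (illustrative)
draws = 20_000  # number of simulated markets

# Equation (2): each noise trader's demand is iid standard normal.
c = rng.standard_normal((draws, N))

# Equation (3): aggregate crowd demand is the sum over traders.
C = c.sum(axis=1)

# Supply noise with constant variance sigma^2, independent of the crowd.
S = sigma * rng.standard_normal(draws)

# Equation (4): rescaled pricing error.
P_tilde = C - S

print("Var(C)       ~", C.var(), "(theory:", N, ")")
print("Var(P_tilde) ~", P_tilde.var(), "(theory:", N + sigma**2, ")")
```

Because both noise sources enter the same signal, a single observation of $\tilde{P}$ cannot separate them without some prior view on how large $N$ is likely to be.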

Bayes rule tells us how to learn about the crowd’s aggregate demand from the equilibrium pricing error:

(5) $\begin{equation*} \mathrm{Pr}(C|\tilde{P}) \propto \mathrm{Pr}(\tilde{P}|C) \times {\textstyle \int_0^{\infty}} \, \mathrm{Pr}(C|N) \cdot \mathrm{Pr}(N) \cdot \mathrm{d}N \end{equation*}$

Supply noise is normally distributed. This means we know how to calculate $\mathrm{Pr}(\tilde{P}|C)$ . So, if we knew the distribution of the number of noise traders in the market, then we could evaluate the remaining integral and solve for the most likely value for the crowd’s aggregate demand given the observed pricing error:

(6) $\begin{equation*} \hat{C}(\tilde{P}) \overset{\scriptscriptstyle \text{def}}{=} \underset{C \in \mathrm{R}}{\arg\min} \left\{ \, - \log \mathrm{Pr}(\tilde{P}|C) - \log {\textstyle \int_0^{\infty}} \, \mathrm{Pr}(C|N) \cdot \mathrm{Pr}(N) \cdot \mathrm{d}N \, \right\} \end{equation*}$

Population Size

There are many ways that you could model the size of the noise-trader crowd. One way to go would be to use a population-dynamics model from the ecology literature, such as the stochastic version of the logistic growth model. This model is specifically designed to explain the unexpected booms and busts that we see in wildlife populations. If we take this approach, then the number of noise traders, $N(t)$ , is governed by the following stochastic differential equation:

(7) $\begin{equation*} \mathrm{d}N = \theta \cdot \{ \mu - N \} \cdot N \cdot \mathrm{d}t - \delta \cdot N \cdot \mathrm{d}t + \varsigma \cdot N \cdot \mathrm{d}W \end{equation*}$

In the equation above, $\theta \cdot \{\mu - N\}$ denotes the per-capita rate at which the crowd of noise traders grows, $\delta > 0$ denotes the rate at which noise traders lose interest, $\mathrm{d}W$ is the increment of a Wiener process capturing random fluctuations in the number of noise traders in the crowd, and $\varsigma > 0$ denotes the volatility of these random fluctuations.

The key property of the logistic growth model is that it’s nonlinear. Population growth, $\theta \cdot \{ \mu - N \} \cdot N$ , is a quadratic function of the number of noise traders as shown in the figure below. This nonlinearity is what allows the model to generate population booms and busts. This nonlinearity will occur if existing noise traders try to recruit their friends to enter the market as well (see here, here, here, or here). $\mu > 0$ denotes the typical number of noise traders that could potentially be persuaded to join the crowd. And, $\theta > 0$ captures the intensity with which existing noise traders persuade their remaining $\{\mu - N\}$ friends to join.

Thus, when there are only a handful of noise traders, the crowd grows slowly because there aren’t many traders to do the recruiting. As the crowd gets larger, population growth increases. But, this growth eventually slows down again: when there are already lots of noise traders in the market, there aren’t many traders left to be recruited, $\{\mu - N\} \approx 0$ .

Because the logistic growth model has been studied for so long, the population-size distribution that it generates is well known. There are two regimes: $\theta \cdot \mu < \delta$ and $\theta \cdot \mu > \delta$ . If $\theta \cdot \mu < \delta$ , then the population of noise traders will eventually die out, $\lim_{t\to\infty}N = 0$ . To see why, think about how the system behaves as the crowd size gets small. When $N = \epsilon \approx 0$ , the crowd grows at an almost linear rate, $\theta \cdot \mu \cdot \epsilon +\mathcal{O}(\epsilon^2)$ . So, $\theta \cdot \mu < \delta$ means that, when the crowd gets small enough, existing noise traders lose interest faster than they can recruit their friends, which leads to the end of the crowd, $N=0$ . By contrast, if the growth rate is larger than the rate of decay when $N$ is small, $\theta \cdot \mu > \delta$ , then the population of noise traders will never get all the way to $N=0$ . And, if we rescale the units so that $\sfrac{\varsigma^2}{2} = \theta \cdot \mu - \delta$ , then we will find that the number of noise traders in the crowd will be governed by an Exponential distribution with rate parameter $\sfrac{\lambda^2}{2}$ :

(8) $\begin{equation*} N \sim \mathrm{Exponential}(\sfrac{\lambda^2}{2}) \qquad \text{where} \qquad \lambda = \sqrt{{\textstyle \frac{2}{\mu} \cdot \left\{ \frac{\theta \cdot \mu}{\theta \cdot \mu - \delta} \right\}}} \end{equation*}$

The figure above shows the probability-density function for an Exponential distribution. It illustrates how, when the rate parameter is larger, the size of the noise-trader crowd tends to be smaller. And, the functional form for $\lambda$ reveals that the rate parameter will be largest as $\theta \cdot \mu \searrow \delta$ —i.e., when the growth rate is barely larger than the decay rate when the crowd size is small. Makes sense.
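The stationary distribution in Equation (8) can be checked with a rough simulation of Equation (7). What follows is only a sketch under stated assumptions: the parameter values, step size, and number of paths are illustrative, and the noise term in Equation (7) is read in the Stratonovich sense, so that $\log N$ follows an SDE with additive noise and the ordinary chain rule applies. That is the reading under which the rescaling $\sfrac{\varsigma^2}{2} = \theta \cdot \mu - \delta$ gives an Exponential density.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative parameter values (assumptions, not taken from the post).
theta, mu, delta = 1.0, 2.0, 1.0
growth = theta * mu - delta        # net growth rate when N is small
varsigma = np.sqrt(2.0 * growth)   # rescaling: varsigma^2 / 2 = theta*mu - delta

# Rate parameter of the Exponential distribution in Equation (8).
lam = np.sqrt((2.0 / mu) * (theta * mu) / growth)
rate = lam**2 / 2.0

paths, dt, steps = 2_000, 0.01, 2_000
x = np.zeros(paths)                # x = log N, so every path starts at N = 1

# Euler-Maruyama on x = log N. Assumption: Equation (7) is a Stratonovich
# SDE, so dx = (theta*mu - delta - theta*exp(x)) dt + varsigma dW.
for _ in range(steps):
    drift = growth - theta * np.exp(x)
    x += drift * dt + varsigma * np.sqrt(dt) * rng.standard_normal(paths)

N_final = np.exp(x)
print("simulated mean of N:", N_final.mean())
print("Exponential mean   :", 1.0 / rate)
```

Working in $\log N$ keeps every simulated crowd size strictly positive, which a naive Euler step on $N$ itself would not guarantee.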

Neglected Risk

We now have our distribution for the number of noise traders in the market. So, we can return to our original inference problem and try to solve for the most likely value of the crowd’s aggregate demand given the observed pricing error, $\hat{C}(\tilde{P})$ . It turns out that, if the variance of the crowd’s aggregate demand is drawn from an Exponential distribution, then it’s easy to solve the integral in Equation (6). Andrews and Mallows show that:

(9) $\begin{equation*} \frac{\lambda}{2} \cdot e^{- \, \lambda \cdot |C|} = \int_0^\infty \, \underset{C|N \sim \mathrm{Normal}(0, \, N)}{ \left\{ \frac{1}{\sqrt{2 \cdot \pi \cdot N}} \cdot e^{-\, \frac{\{C-0\}^2}{2 \cdot N}} \right\} } \cdot \underset{N \sim \mathrm{Exponential}(\sfrac{\lambda^2}{2})}{ \left\{ \frac{\lambda^2}{2} \cdot e^{-\,\frac{\lambda^2}{2} \cdot N} \right\} } \cdot \mathrm{d}N \end{equation*}$
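The identity in Equation (9) is easy to check numerically. The sketch below compares the Laplace density on the left-hand side with a trapezoid-rule approximation of the integral on the right-hand side. The function names, the log-spaced grid, and the value of $\lambda$ are illustrative assumptions. The check skips $C = 0$ , where the integrand is singular as $N \to 0$ and this simple grid would be less accurate.

```python
import numpy as np

def laplace_density(C, lam):
    """Left-hand side of Equation (9)."""
    return 0.5 * lam * np.exp(-lam * np.abs(C))

def mixture_density(C, lam, grid_points=200_001):
    """Right-hand side of Equation (9): trapezoid rule on a log-spaced grid in N."""
    N = np.logspace(-8, 3, grid_points)
    normal_pdf = np.exp(-C**2 / (2.0 * N)) / np.sqrt(2.0 * np.pi * N)
    expo_pdf = 0.5 * lam**2 * np.exp(-0.5 * lam**2 * N)
    integrand = normal_pdf * expo_pdf
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(N)))

lam = np.sqrt(2.0)  # illustrative value
for C in (0.5, 1.0, 2.0):
    print(C, laplace_density(C, lam), mixture_density(C, lam))
```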

So, if we set $\sfrac{\lambda}{2} \cdot e^{-\lambda \cdot |C|} = {\textstyle \int_0^{\infty}} \, \mathrm{Pr}(C|N) \cdot \mathrm{Pr}(N) \cdot \mathrm{d}N$ , then the optimization problem in Equation (6) simplifies:

(10) $\begin{equation*} \hat{C}(\tilde{P}) = \underset{C \in \mathrm{R}}{\arg\min} \left\{ \, \frac{1}{2 \cdot \sigma^2} \cdot \{\tilde{P} - C\}^2 + \lambda \cdot | C | \, \right\} \end{equation*}$

This simplification is really cool because the optimization problem above is just the optimization problem for the LASSO with a penalty parameter of $\sigma^2 \cdot \lambda$ (see Park and Casella).

There’s something weird about this result, though. Using the LASSO implies that:

(11) $\begin{equation*} \hat{C}(\tilde{P}) = \mathrm{Sign}(\tilde{P}) \cdot \left\{ \, |\tilde{P}| - \sigma^2 \cdot \lambda \, \right\}_+ \end{equation*}$

Above, $\{ x \}_+ = x$ if $x > 0$ and $0$ otherwise. So, if the observed pricing error is relatively small, $|\tilde{P}| < \sigma^2 \cdot \lambda$ , then a fully-rational trader will walk away from the market believing that $\hat{C}(\tilde{P}) = 0$ . He will completely neglect the risk coming from the crowd of noise traders.
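As a sanity check, the soft-thresholding rule in Equation (11) can be compared with a brute-force minimization of the objective in Equation (10). This is a minimal sketch: the function names, the grid, and the values of $\sigma^2$ and $\lambda$ are illustrative assumptions.

```python
import numpy as np

def crowd_demand_estimate(p_tilde, sigma2, lam):
    """Soft-thresholding rule in Equation (11)."""
    threshold = sigma2 * lam
    shrunk = max(abs(p_tilde) - threshold, 0.0)
    return float(np.sign(p_tilde) * shrunk) if shrunk > 0 else 0.0

def brute_force_estimate(p_tilde, sigma2, lam):
    """Minimize the objective in Equation (10) on a fine grid (illustrative check)."""
    grid = np.linspace(-10.0, 10.0, 200_001)
    objective = (p_tilde - grid) ** 2 / (2.0 * sigma2) + lam * np.abs(grid)
    return float(grid[np.argmin(objective)])

sigma2, lam = 1.0, np.sqrt(2.0)  # illustrative values; threshold is sqrt(2)
for p_tilde in (-3.0, -1.0, 0.5, 1.0, 3.0):
    print(p_tilde,
          crowd_demand_estimate(p_tilde, sigma2, lam),
          brute_force_estimate(p_tilde, sigma2, lam))
```

With these values, any pricing error smaller in magnitude than $\sqrt{2}$ maps to an estimate of exactly zero, and larger errors are shrunk toward zero by that same amount.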

Here’s the logic behind this neglect. If the pricing error is small, then there must not be very many noise traders in the market. If there aren’t very many noise traders, then they can’t affect the equilibrium price very much. And, this means that there’s no way for traders to learn about aggregate noise-trader demand from the equilibrium price, which means that there’s no reason for them to revise their beliefs about noise-trader demand away from zero after seeing the price.

What’s more, this line of reasoning is consistent with the functional form for the LASSO’s penalty parameter, $\sigma^2 \cdot \lambda$ . This expression says that traders will ignore the madness of crowds in the face of more extreme pricing errors (larger values of $|\tilde{P}|$ ) when either the crowd of noise traders tends to be smaller ( $\lambda$ is larger) or it’s easier to explain pricing errors with supply noise ( $\sigma^2$ is larger). And, this basic intuition should carry over to other situations where the size of the crowd of noise traders has some other fat-tailed distribution rather than an Exponential distribution.
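The comparative statics above can be traced back to the primitives of the population model by plugging the expression for $\lambda$ from Equation (8) into the penalty $\sigma^2 \cdot \lambda$ . The helper below is a hypothetical illustration, and the parameter values are assumptions chosen only to show the direction of the effect.

```python
import numpy as np

def lasso_threshold(theta, mu, delta, sigma2):
    """Penalty sigma^2 * lambda, with lambda taken from Equation (8)."""
    if theta * mu <= delta:
        raise ValueError("Equation (8) needs theta * mu > delta")
    lam = np.sqrt((2.0 / mu) * (theta * mu) / (theta * mu - delta))
    return float(sigma2 * lam)

# As delta approaches theta * mu from below, crowds tend to be smaller,
# lambda rises, and a wider band of pricing errors is attributed to supply.
for delta in (1.0, 1.5, 1.9):
    print(delta, lasso_threshold(theta=1.0, mu=2.0, delta=delta, sigma2=1.0))
```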