Summary: Trading on Coincidences

1. Motivating Example

This post gives a non-technical summary of the results in my job market paper, Trading on Coincidences (2012). I start with a simple example. Suppose you see Apple among the $10$ stocks with the highest returns over the past quarter from October to December. Should you bother checking the tech industry more closely for a shock to fundamentals? Well, probably not. After all, there are always going to be $10$ stocks in the list of $10$ stocks with the highest returns. Investigating the tech industry would mean investigating $9$ or $10$ industries every single trading period.¹ What’s more, Apple’s extreme performance might not have anything to do with its industry. Perhaps a California-specific shock was the culprit? So, if you are going to investigate the tech industry, you are going to end up analyzing every single attribute of every single stock in the top $10$ returns. That’s a lot of work!
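As a rough check on the "$9$ or $10$ industries" figure, here is a back-of-the-envelope calculation using the $50$ industries mentioned in the first footnote. It assumes, purely for illustration, that each of the $10$ top stocks falls into one of the $50$ industries independently and with equal probability; the paper's own calculation may rest on different assumptions. Each industry is missed by all $10$ stocks with probability $(49/50)^{10}$, so the expected number of distinct industries is:

$$
\mathbb{E}[\text{distinct industries}] = 50 \left[ 1 - \left( \tfrac{49}{50} \right)^{10} \right] \approx 50 \times 0.183 \approx 9.15
$$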

However, if you see both Apple and Microsoft in the top $10$ returns, this calculus changes. Now, you might actually want to investigate the tech industry. Coincidences like this one—i.e., $2$ or more stocks with the same attribute in the top or bottom $10$ returns—are much less likely to occur by pure chance.² In addition, using $2$ firms allows you to narrow the list of possible explanations. For instance, you can rule out any California-specific explanation since Microsoft is headquartered in Washington. Pushing this logic further, if Apple, Microsoft, and Research in Motion all realize top $10$ returns, then you should definitely look for a deeper explanation. Multi-stock coincidences like this are very rare, and these $3$ companies only share a handful of common attributes.³

2. Outline of Results

First, I start from the assumption that traders face an attention allocation problem. I build a model with $N$ stocks that realize attribute-specific cash flow shocks. Traders can’t sort all $N$ stocks over and over again on each of the $H$ characteristics (e.g., industry, customer, country, accounting firm, etc.) to check for these attribute-specific shocks by brute force.⁴ They must use an attention allocation heuristic. Second, I propose one such heuristic, trading on coincidences: traders only update their beliefs about an attribute after observing a coincidence. I characterize the information content of a coincidence in this model and solve for asset prices. Third, I show that if traders use this heuristic, then stock returns will display post-coincidence comovement. That is, regardless of when the attribute-specific shock to fundamentals occurs, asset prices will only respond after traders observe a coincidence. Fourth, I take this prediction to the data. I find that post-coincidence comovement at the industry level generates an $11{\scriptstyle \%/\mathrm{yr}}$ excess return that is not explained by any of the canonical factor models, other boundedly rational stories, or well-known behavioral biases. Fifth and finally, I use computational complexity to give a physical basis for the scarcity of attention.

Roadmap:

Assume traders face an attention allocation problem and build a consistent model.

Propose that traders use coincidences to solve this problem.

Show that returns will display post-coincidence comovement as a result.

Give empirical evidence of post-coincidence comovement suggesting that traders (a) use coincidences to direct their attention, and (b) face an attention allocation problem with first-order asset pricing implications.

Give an interpretation of scarce attention using computational bounds.

3. Detailed Description

I now discuss the nuts and bolts of each of these $5$ steps.

3.1. Model

First, I propose that traders face an attention allocation problem. Here is how I model this problem:

Assets: I consider a discrete-time, infinite-horizon economy. There are $N$ stocks and each stock has $H$ different attributes. For example, Apple is headquartered in Cupertino, is in the technology industry, is a major customer of Gorilla Glass, etc. Stocks realize persistent, rare, attribute-specific cash flow shocks. For example, if an innovation suddenly makes Gorilla Glass less expensive to manufacture in October, then all tablet-making tech firms will realize a positive cash flow shock starting in October and lasting for several months (e.g., October, November, December, etc.). Firms pay out all of their earnings as dividends. Thus, the payout space is a giant $(N \times H)$-dimensional matrix with all the attributes of all the stocks.

Agents: Traders are risk neutral and have priors about whether or not any particular attribute has realized a cash flow shock. I refer to these priors as a “mental model.” Because there are so many different ways to sort and group stocks, I assume it is computationally infeasible for traders to check every single attribute-specific cluster of firms.⁵⁶ Instead, they must use some attention allocation heuristic.⁷

3.2. Heuristic

Second, as one way to solve this problem, I show that traders can use coincidences as an attention allocation device.⁸⁹ If agents trade on coincidences, then they only update their mental model about attribute-specific cash flows after they observe $2$ or more stocks with that attribute among the $10$ firms with the highest or lowest past returns. For instance, if the tech industry realizes a positive cash flow shock in October, most traders won’t immediately notice this change. There are too many possible stories explaining market returns for traders to investigate them all. Even though most traders haven’t noticed this industry-specific cash flow shock, all tech stocks will have slightly higher returns in October, November, December, etc… because of higher dividend payouts or because a few constrained specialists have started to incorporate this information into prices. However, most traders will only update their beliefs about the tech industry in December when Apple and Microsoft realize top $10$ returns and attract their attention. I solve for asset prices and give asymptotic expressions for the amount of information contained in coincidences.
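The coincidence rule described above can be sketched in a few lines of Python. This is only an illustration, not the paper's implementation: the function name `find_coincidences`, its parameters, and the toy returns and industry labels are all hypothetical, and the example uses the top and bottom $3$ of $8$ stocks rather than the top and bottom $10$ so that it stays small.

```python
from collections import Counter


def find_coincidences(returns, attribute, k=10, min_size=2):
    """Flag attributes shared by at least `min_size` of the top-k or bottom-k stocks.

    returns:   dict mapping ticker -> past-period return
    attribute: dict mapping ticker -> attribute value (e.g., industry)
    Note: with fewer than 2*k stocks, the top and bottom groups overlap.
    """
    ranked = sorted(returns, key=returns.get, reverse=True)  # best to worst
    groups = {"top": ranked[:k], "bottom": ranked[-k:]}
    hits = {}
    for side, tickers in groups.items():
        counts = Counter(attribute[t] for t in tickers)
        # Keep only attributes that appear min_size or more times in this group.
        hits[side] = {
            a: [t for t in tickers if attribute[t] == a]
            for a, n in counts.items()
            if n >= min_size
        }
    return hits


# Toy data (hypothetical numbers, for illustration only).
toy_returns = {
    "AAPL": 0.30, "MSFT": 0.25, "XOM": 0.20, "IBM": 0.02,
    "JNJ": 0.01, "KO": -0.05, "F": -0.15, "GM": -0.18,
}
toy_industry = {
    "AAPL": "tech", "MSFT": "tech", "IBM": "tech", "XOM": "energy",
    "JNJ": "health", "KO": "beverages", "F": "auto", "GM": "auto",
}

hits = find_coincidences(toy_returns, toy_industry, k=3)
print(hits)
```

In this toy example, the tech industry is flagged on the top side (Apple and Microsoft) and the auto industry on the bottom side (Ford and GM). A trader following the heuristic would then update beliefs only about those two industries and ignore the rest.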

3.3. Prediction

Third, I show that if traders use this heuristic, then stock returns will display post-coincidence comovement. For example, suppose again that the tech industry realizes a positive cash flow shock in October. Because there are so many things that could possibly affect stock returns at any one instant, most traders won’t immediately notice this event. The bomb has burst, but the blast wave hasn’t arrived yet. If people trade on coincidences, they will only notice this shock after Apple and Microsoft earn top $10$ returns in December. Thus, the prices of all tech stocks will rise in January after this coincidence, i.e., stock returns will display post-coincidence comovement. Crucially, this prediction holds “pointwise” across all characteristics that traders consider. For instance, suppose that traders look for both industry-specific coincidences and country-specific coincidences. Evidence of post-coincidence comovement at either the industry level or the country level is evidence that people are trading on coincidences.

3.4. Empirics

Fourth, I find that post-coincidence comovement at the industry level generates an $11{\scriptstyle \%/\mathrm{yr}}$ excess return:

Is It Tradable? Suppose that Apple and Microsoft both earn top $10$ returns in December while Ford, GM, and Toyota all realize bottom $10$ returns in December. I show that a trading strategy that is long all tech stocks except for Apple and Microsoft in January and short all auto stocks except for Ford, GM, and Toyota in January generates an $11{\scriptstyle \%/\mathrm{yr}}$ excess return with an annualized Sharpe ratio of $0.6$. The excess returns to this trading strategy are not explained by industry momentum or other canonical factor models.¹⁰ Post-coincidence comovement is a necessary but not a sufficient explanation for the trading strategy returns. In the theoretical model, traders should update their beliefs about the tech industry in the first nanosecond of January. Thus, there would be no scope for trading profits. Any trading profits are due to unmodeled trading frictions or limits to arbitrage.
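The portfolio construction in this example can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `post_coincidence_portfolio` is hypothetical, the equal weighting within each leg is my simplification (the post does not say how positions are weighted), and the small stock universe is made up.

```python
def post_coincidence_portfolio(attribute, top_hits, bottom_hits):
    """Build long/short legs for the period after a coincidence.

    attribute:   dict mapping ticker -> attribute value (e.g., industry)
    top_hits:    dict mapping attribute -> tickers in a top-10 coincidence
    bottom_hits: dict mapping attribute -> tickers in a bottom-10 coincidence
    Weights are equal within each leg (an assumption, not from the paper).
    """
    def leg(hits):
        # The stocks that formed the coincidence are excluded from the trade.
        excluded = {t for tickers in hits.values() for t in tickers}
        members = sorted(
            t for t, a in attribute.items() if a in hits and t not in excluded
        )
        return {t: 1.0 / len(members) for t in members} if members else {}

    long_leg = leg(top_hits)
    short_leg = {t: -w for t, w in leg(bottom_hits).items()}
    return long_leg, short_leg


# Hypothetical universe for illustration.
industry = {
    "AAPL": "tech", "MSFT": "tech", "IBM": "tech", "ORCL": "tech",
    "F": "auto", "GM": "auto", "TM": "auto", "HMC": "auto",
    "XOM": "energy",
}
long_leg, short_leg = post_coincidence_portfolio(
    industry,
    top_hits={"tech": ["AAPL", "MSFT"]},
    bottom_hits={"auto": ["F", "GM", "TM"]},
)
print(long_leg, short_leg)
```

Here the long leg holds the remaining tech stocks (IBM and Oracle) and the short leg holds the remaining auto stock (Honda). Leaving out the coincidence stocks themselves means the position bets on comovement among the other firms sharing the attribute, not on continuation in the stocks that triggered the signal.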

Is It Measurable? In response to this concern, I also use a panel regression specification. Specifically, I look at the returns to IBM in January after Apple and Microsoft realized top $10$ returns and then also in, say, August after Oracle and Cisco realized bottom $10$ returns. I then compute the average difference between these two numbers across all firms after taking out firm-specific and month-specific fixed effects. Again, I find a spread of $11{\scriptstyle \%/\mathrm{yr}}$, which is on the same order of magnitude as the trading strategy $\alpha$. This specification yields three additional results. First, I show that the size of this spread is largest after fresh coincidences. For instance, Apple and Microsoft might earn top $10$ returns in December, January, and February. I find that all of the post-coincidence comovement occurs in January, the month immediately after the first coincidence in December. This evidence is consistent with the idea that traders update their mental model after a coincidence attracts their attention. Next, I show that the size of this post-coincidence spread is increasing in the size of the coincidence. For instance, Apple and Microsoft might end up in the top $10$ returns in January by pure chance. However, if you see Apple, Microsoft, Research in Motion, Intel, and Cisco all in the top $10$, something must have happened to the tech industry. Thus, coincidences involving more firms from an attribute are better signals and should yield a larger price reaction. Finally, I show that the cumulative abnormal returns following a coincidence are persistent out to $12$ months. Perhaps traders see Apple and Microsoft in the top $10$, think to themselves, “Oh, wow! The tech industry must be doing fantastic,” and then bid up the price of all tech stocks way too high. If this were the case, then the cumulative abnormal returns following a coincidence should spike and then revert back. This is not what happens.

3.5. Complexity

Finally, I give a physical interpretation for why traders’ attention is so scarce using tools from computational complexity. Suppose that you want to check every attribute-specific cluster of firms for a cash flow shock. I show that with $7000$ stocks, this brute force search strategy would take over $22{\scriptstyle \mathrm{days}}$ to complete at $1{\scriptstyle \mathrm{MIPS}}$. By contrast, I find that by trading on coincidences people can dramatically reduce their time costs and still uncover a large fraction of all attribute-specific cash flow shocks. For instance, using the same parameters, I find that trading on coincidences requires less than $1{\scriptstyle \mathrm{min}}$ of processing time. There are good reasons to be uneasy about the absolute level of the $22{\scriptstyle \mathrm{days}}$ estimate. Nevertheless, this estimate does suggest that traders’ attention allocation problem is non-trivial. After all, there are only $21$ trading days in a month on average. What’s more, there is a gap of several orders of magnitude between the time cost of following the brute force inference strategy and that of trading on coincidences.
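To put the two time estimates on a common scale, the arithmetic below converts them to instruction counts at $1$ MIPS. It treats "over $22$ days" as exactly $22$ days and "less than $1$ min" as exactly $60$ seconds. Both are simplifications of mine, so the resulting ratio should be read as a lower bound on the gap, not as a figure from the paper.

```python
import math

INSTRUCTIONS_PER_SECOND = 1_000_000  # 1 MIPS = one million instructions per second

brute_force_seconds = 22 * 24 * 60 * 60  # 22 days, used as a lower bound
coincidence_seconds = 60                 # 1 minute, used as an upper bound

brute_force_instructions = brute_force_seconds * INSTRUCTIONS_PER_SECOND
coincidence_instructions = coincidence_seconds * INSTRUCTIONS_PER_SECOND

# Ratio of the two time costs and its size in orders of magnitude.
ratio = brute_force_seconds / coincidence_seconds
orders_of_magnitude = math.log10(ratio)

print(f"brute force:  {brute_force_instructions:.2e} instructions")
print(f"coincidences: {coincidence_instructions:.2e} instructions")
print(f"ratio: {ratio:,.0f} (about {orders_of_magnitude:.1f} orders of magnitude)")
```

Under these simplifications, the brute force search costs at least $31{,}680$ times as much processing time as the coincidence heuristic.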

4. Key Implications

Let me now briefly outline $3$ interesting takeaways from this paper. First, this paper highlights a completely new and empirically relevant layer to traders’ inference problem—i.e., how do traders direct their attention? After all, traders sit in front of $4+$ computer monitors for a reason. Finding the right inference problem to solve is hard. For another example, Warren Buffett justified Berkshire Hathaway’s cash holdings in his 1987 Annual Letter to shareholders by writing: “Our basic principle is that if you want to shoot rare, fast-moving elephants, you should always carry a loaded gun.” The lesson is clear: Pulling the trigger is easy. Finding the elephant is hard.

Second, this paper suggests that the innate pattern recognition skills that make people good doctors, lawyers, and engineers from 9-to-5 can be used to uncover subtle changes in the market. Pattern recognition skills are hardcoded and universal, and coincidences are just one particularly salient pattern. The machinery developed in this paper can be used to analyze traders’ reactions to streaks or regular cycles. For instance, traders might ask themselves: “What are the odds that gold futures prices would rise for $6$ straight months by pure chance? Perhaps I should investigate this contract further?”

Finally, this paper poses the question: “How often should we see extreme price patterns?” Lots of other papers have looked for explanations for particular extreme events. For example, “How could we possibly rationalize the tech boom of the late 1990s?”, or “Why did house prices rise so much in Las Vegas in 2004 but not in Austin or Albuquerque?” By contrast, this paper takes an entirely different approach and asks: “How often should traders see some asset with an extreme price path?” For instance: “How unlikely is it that traders cycled their attention from biotech stocks to junk bonds back to dotcom stocks then to housing and most recently to gold futures over the course of a $30$-year period?”

When there are $50$ industries, you should expect to observe $9.15$ distinct industries when looking at $10$ randomly selected stocks. ↩
When there are $50$ industries, you should expect to see a $2$-way coincidence in some industry $9$ out of every $10$ periods by pure chance. This is by no means rare, but it is an order of magnitude less frequent than when looking at every single industry represented in the top $10$ returns. ↩
When there are $50$ industries, you should expect to see a $3$-way coincidence in some industry once every $20$ periods by pure chance. ↩
i.e., traders face an “unsupervised learning problem” (see Hastie, Tibshirani, and Friedman (2009, Ch. 14)). They don’t just have to solve a really hard but well-defined inference problem; rather, they also have to figure out which inference problem to analyze in the first place. Punchline: Searching for the right problem to solve requires cognitive/computational resources. ↩
This difficulty exists regardless of how easy it is to solve the resulting inference problems. e.g., the Sunday New York Times crossword puzzle is hard because it is difficult to search through all the words you know to come up with reasonable solutions to each clue. It is actually really easy to verify that the solution posted on Monday is indeed correct. Punchline: Search is harder than verification. ↩
In Gabaix (2012), traders only pay a cost for thinking about the impact of Peruvian copper discoveries on Apple’s dividend if they actively include this particular factor in their predictive model. By contrast, in the current paper, the initial step of considering the impact of Peruvian copper discoveries on Apple’s future dividends and then deciding not to include this factor in a predictive model comes with a cost. Traders can’t do this preprocessing step for every single obscure factor that might possibly affect Apple’s future dividends. They have to limit their attention to a manageable subset of factors. ↩
Contrary to popular belief, computer chess programs don’t use brute force. This is infeasible. Instead, they mine a huge database of past chess games known as “the book.” This database is big, but nowhere near as large as the universe of all possible games. Human players actually have the advantage when games “go off book.” For instance, Garry Kasparov famously went off book early in his games against Deep Blue. Punchline: Computers are really fast, but search is really, really hard. ↩
This particular heuristic is motivated by anecdotal evidence. Open up any news site and you will find quotes like: “half of the top $10$ performers on the Nikkei 225 this year are domestic-oriented” (The Wall Street Journal, “Japan Commands New Respect,” Jun 15, 2004). ↩
The trading on coincidences heuristic that I propose is not optimized in any sense. If a trader wanted to optimize this heuristic, he would surely have to consider engineering details like variation in the speed of computers, the length of the trading period, and the dimensions of the market. However, my goal is not optimality. I simply propose a plausible heuristic, derive a unique empirical prediction, and then ask the data whether or not people actually take this approach. ↩
Why is this not the same as industry momentum? The timing is different. To illustrate, let’s think about this tech stock example and compare the post-coincidence comovement trading strategy to an industry momentum trading strategy that is long/short the industries with the highest/lowest returns over the last $6$ months. Tech stocks realize a positive cash flow shock in October. This shock raises the mean return of all tech stocks slightly, but not enough to push the tech stocks to the top of the industry return rankings. People trading on coincidences only notice this shock in December when Apple and Microsoft earn top $10$ returns. As a result, the price of all tech stocks jumps up in January. An industry momentum trading strategy is then going to be long the tech industry in February at the earliest. Post-coincidence comovement actually triggers inclusion in an industry momentum portfolio. ↩