WELCOME TO RUFFER

All-weather investing

Seeking consistent positive returns.

Come rain or shine.

Ruffer provides investment management services for institutions, pension funds, charities, financial planners and individual investors.
London
Ruffer LLP
80 Victoria Street
London SW1E 5JL
Paris
Ruffer S.A.
103 boulevard Haussmann
75008 Paris, France
New York
Ruffer LLC
300 Park Avenue
New York NY 10022
Edinburgh
Ruffer LLP
31 Charlotte Square
Edinburgh EH2 4ET

Fifty shades of beige

 

Henry Jolliffe
Research Director*

How can we translate the unstructured data of the Federal Reserve’s Beige Book economic updates into a tool with predictive power?

Use an eighteenth-century statistical theorem, of course – a theorem which also stopped the early internet being buried under a mountain of spam.

Like so much that is pure and decent in the world, our tale begins in Tunbridge Wells. Here, Thomas Bayes, a Presbyterian minister, first expounded Bayesian inference, which provides the statistical theorem for the intuitively simple but logically complex process of revising our beliefs in light of new evidence. For example, Bayesian logic allows us to re-evaluate the probability that it will rain on a clear summer’s day once a brooding storm cloud has appeared on the horizon.

Parking Bayes

As so often with works of genius, the true implications of Bayes’ insights were not appreciated until long after his lifetime. Following his death in 1761, his friend and collaborator Richard Price refined and developed Bayes’ work, presenting An Essay towards solving a Problem in the Doctrine of Chances at the Royal Society in 1763 – to no great acclaim.

It was not until 1812, when French polymath Pierre-Simon de Laplace independently addressed the problem in his Théorie analytique des probabilités, that the general version of the theorem was formulated, formally representing what is now known as Bayes’ theorem.
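
In modern notation the theorem is usually written as follows, where A is a hypothesis and B is the observed evidence (the symbols are the conventional textbook ones and are an assumption here, not a transcription of the article's own figure):

```latex
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}
```

Here P(A) is the prior belief in the hypothesis, P(B | A) is the probability of seeing the evidence if the hypothesis is true, and P(A | B) is the revised belief once the evidence is in.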

Yikes!

To better understand the intuition behind this equation, imagine two opaque jars, one containing 50 blue balls and 50 red balls and the other containing 99 blue balls and one red ball. You are given one of these jars at random and you pick one ball from it, also at random. If the ball you picked is red, what is the likelihood it came from the first jar?

Intuitively, we know that the much higher number of red balls in the first jar makes it far more likely the red ball is from the first jar. Bayes’ theorem enables us to put a numeric probability on this intuition. In this case, if we plug the probabilities into the equation, the answer is 98%.
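
The arithmetic behind that 98% can be checked in a few lines. This is a minimal sketch: the function name `posterior` and the variable names are illustrative and do not come from the article.

```python
from fractions import Fraction

def posterior(priors, likelihoods):
    """Apply Bayes' theorem: return P(hypothesis | evidence) for each hypothesis."""
    # Numerator of the theorem for each hypothesis: P(evidence | h) * P(h)
    joint = [p * l for p, l in zip(priors, likelihoods)]
    # Denominator: total probability of the evidence across all hypotheses
    evidence = sum(joint)
    return [j / evidence for j in joint]

# Each jar is equally likely to be handed over.
priors = [Fraction(1, 2), Fraction(1, 2)]
# Chance of drawing a red ball: 50 of 100 in the first jar, 1 of 100 in the second.
p_red = [Fraction(50, 100), Fraction(1, 100)]

jar_probs = posterior(priors, p_red)
print(float(jar_probs[0]))  # probability the red ball came from the first jar
```

The exact answer is 50/51, which rounds to the 98% quoted above.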

Despite its superficial simplicity, the significance of Bayes’ insight is hard to overstate.

The Bayesian Explosion

After Laplace’s formulation, Bayes’ theorem languished in relative obscurity. It was not until the mid-twentieth century that Bayes rocketed into the mainstream.

The dawn of the Cold War – following the wartime achievements of applied mathematicians, such as the breaking of the Enigma code by Alan Turing and his colleagues at Bletchley Park – caused an explosion in government funding for scientific and statistical research.

Bayesian methods were ideally suited to the problems of the nuclear age, which required assigning probabilities to events that had never before occurred, like the accidental detonation of a hydrogen bomb, or the reliability of new devices, such as intercontinental ballistic missiles. Orthodox statistics were mute in the face of such questions, unsatisfactorily surmising that events with no prior observations simply could not occur.

Outside the military-industrial complex, the Bayesian renaissance spread far and wide. Notable examples include the work of Jerome Cornfield, an American statistician, who in 1959 used Bayesian techniques to prove the then-surprising conclusion that smoking causes lung cancer.1

More recently, modern theories of the human mind have gone so far as to suggest that, rather than simply awaiting an environmental stimulus to generate a response, the brain is constantly generating hypotheses about its environment and using observed stimuli to update its prediction by applying (you guessed it) a process of Bayesian inference.

“We are not cognitive couch potatoes idly awaiting the next ‘input’, so much as proactive predictavores – nature’s own guessing machines forever trying to stay one step ahead by surfing the incoming waves of sensory stimulation.”2

Perhaps we are all Bayesians after all!

Have you got anything without spam?

One successful application of Bayes’ theorem led to the creation of junk email filters. It also gave us the idea for an applied use in macroeconomics (which we shall explore below).

The dawn of the new millennium brought the widespread adoption of email. But no sooner had humanity invented a new means of ultra-cheap mass communication than it was turned into a new avenue for the dissemination of misinformation, fraud and noise on an unprecedented scale.

In the early 2000s, unsolicited automated email (called ‘spam’ in homage to a 1970 Monty Python sketch) posed an existential threat to the nascent internet. A study commissioned by the Pew Research Center in 2003 found that over half of all email was spam, and 12% of personal email users were spending up to half an hour a day dealing with it. The study concluded that “spam is beginning to undermine the integrity of email and to degrade the online experience.”3

This echoed the Congressional testimony of the (unfortunately named) Orson Swindle, Commissioner of the Federal Trade Commission, earlier that year that “spam is about to kill the ‘killer app’ of the internet.”4

To the modern reader, these claims seem to verge on the hysterical. In the 15 years I have owned an email account, I haven’t had to spend any time dealing with spam. Yet spam still accounts for 45% of all electronic correspondence – as a quick check of your junk folder will reveal.5 So why is it no longer a problem?

Bayes to the rescue

During the race to develop and refine spam filtering software, one approach stood out. Proposed by the American computer scientist Paul Graham in 2002, this solution reframed the problem in Bayesian terms.6

Rather than two jars of blue and red balls, Graham used two bags of words. One bag contained the words from a corpus of genuine correspondence (ham), the other words from a corpus of spam. By comparing word frequencies between the bags, Graham was able to assign to each word the probability that it belonged in either a spam or a ham email. Unsurprisingly, terms like ‘click’, ‘subscribe’, ‘$$$’ and ‘Viagra’ were found far more often in spam.

Emails could be reviewed automatically against these probability tables, with the scores for the individual words combined into an aggregate score for the email, thus deriving the probability that it was spam.
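
The two steps described above (per-word probabilities, then an aggregate score) can be sketched as follows. The corpora are invented toy data, and the equal priors and add-one smoothing are simplifying assumptions rather than the exact weighting in Graham's essay. Only the combining rule, the product of the word probabilities divided by that product plus the product of their complements, follows the form he describes. All function names are hypothetical.

```python
from collections import Counter
from math import prod

# Toy corpora invented for illustration; a real filter trains on thousands of emails.
ham_emails = [
    "lunch meeting moved to friday",
    "please review the attached report",
    "see you at the meeting",
]
spam_emails = [
    "click here to subscribe",
    "click now cheap viagra",
    "subscribe now for cheap deals",
]

def word_counts(emails):
    """Tip every word of every email into one bag and count occurrences."""
    return Counter(word for email in emails for word in email.split())

ham_counts = word_counts(ham_emails)
spam_counts = word_counts(spam_emails)

def spam_probability_of_word(word):
    """P(spam | word), assuming equal priors and add-one smoothing (assumptions)."""
    spam_rate = (spam_counts[word] + 1) / (sum(spam_counts.values()) + 2)
    ham_rate = (ham_counts[word] + 1) / (sum(ham_counts.values()) + 2)
    return spam_rate / (spam_rate + ham_rate)

def spam_score(email):
    """Combine the per-word probabilities into one probability for the email."""
    probs = [spam_probability_of_word(w) for w in email.split()]
    spam_like = prod(probs)
    ham_like = prod(1 - p for p in probs)
    return spam_like / (spam_like + ham_like)

print(spam_score("click to subscribe now"))     # close to 1
print(spam_score("review the meeting report"))  # close to 0
```

A word seen equally often in both bags contributes a probability near 0.5 and barely moves the aggregate, which is why common words do little harm.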

Two years later, at the inaugural Conference on Email and Anti-Spam (and doubtless the hottest ticket in town), Tony Meyer and Brendon Whateley presented a software solution, SpamBayes, building on Graham’s original insights.7

Spam was cured.

Parsing the Beige Book

The Beige Book (officially, the Summary of Commentary on Current Economic Conditions) is a report published by the Federal Reserve (Fed) eight times a year, before the meetings of the Federal Open Market Committee. It contains anecdotal information from each of the Fed districts around the US on a broad range of economic conditions.

Since its first publication in 1970, the Beige Book has grown into a corpus of 455 books and over 5.5 million words (the equivalent of 300 hours of continuous reading), forming a unique historical record of the US economy.8

It also constitutes a prime example of unstructured data. Unlike structured data – that is, statistics diligently compiled into easily accessible numerical formats such as timeseries and tables – unstructured data is inherently harder to analyse quantitatively.

The challenge we set ourselves: could we use Graham’s Bayesian methods for dealing with spam to cut through this lack of structure and score Beige Books based on macroeconomic variables?

Specialised lexicons

Using the Beige Book corpus, we generated two bags of words. One contained words found in the Beige Books when unemployment rose in the six months after publication, while the other contained words found when unemployment fell in the next six months. These bags allowed us to score Beige Books with a probability that unemployment would subsequently fall or rise (Figure 1).
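
One way the two bags might be assembled is sketched below. The article does not describe its implementation, so everything here is an assumption: the report texts and unemployment figures are invented placeholders, the month indexing is hypothetical, and treating "no change" as "did not rise" is an arbitrary choice. Only the six-month horizon comes from the text.

```python
from collections import Counter

# Hypothetical inputs: (publication month index, report text) pairs and a monthly
# unemployment-rate series. All values are invented, not Beige Book or official data.
reports = [
    (0, "demand remained robust and hiring continued to strengthen"),
    (3, "contacts reported weak sales and a drop in orders"),
    (6, "activity was positive with robust consumer spending"),
    (9, "manufacturing softened across several districts"),
]
unemployment = [5.0, 4.9, 4.7, 4.6, 4.6, 4.7, 4.8, 4.9, 5.0, 5.0, 4.9, 4.7, 4.5]

HORIZON = 6  # months after publication, as in the article

rose_bag, fell_bag = Counter(), Counter()
for month, text in reports:
    later = month + HORIZON
    if later >= len(unemployment):
        continue  # outcome not yet known, so this report cannot be labelled
    change = unemployment[later] - unemployment[month]
    # Assumption: an unchanged rate is grouped with falls.
    bag = rose_bag if change > 0 else fell_bag
    bag.update(text.lower().split())

print(rose_bag.most_common(3))
print(fell_bag.most_common(3))
```

Once the bags exist, a new Beige Book can be scored against them in the same way an email is scored against the spam and ham bags.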

Somewhat to our surprise, this novel approach worked well. Our results clearly identified each of the six major economic cycles since the mid-1970s (with the possible exception of the 1981-1982 recession). The Beige Book may be an uninspiring read, but the homogeneous nature of the corpus, with similar phraseology used repeatedly in similar contexts, makes it an unexpectedly good subject for Bayesian techniques.

Reassuringly, looking inside our bags for words that correspond strongly with either an increase or a decrease in unemployment (Figure 2), we find words that make intuitive sense (‘problem’, ‘drop’ and ‘weak’ versus ‘robust’, ‘strengthen’ and ‘positive’).
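
Looking inside the bags amounts to ranking words by how lopsidedly they appear. The sketch below uses a smoothed log ratio of relative frequencies, which is one common choice and not necessarily the measure used in the article. The counts are invented and `log_odds` is a hypothetical helper.

```python
from collections import Counter
from math import log

# Invented toy bags standing in for the two word collections described above.
rose_bag = Counter({"weak": 8, "drop": 6, "problem": 5, "demand": 10, "robust": 1})
fell_bag = Counter({"robust": 9, "strengthen": 7, "positive": 6, "demand": 10, "weak": 1})

vocabulary = set(rose_bag) | set(fell_bag)

def log_odds(word, smoothing=1.0):
    """Smoothed log ratio of a word's relative frequency in the two bags.

    Positive values lean towards rising unemployment, negative towards falling.
    """
    rose_rate = (rose_bag[word] + smoothing) / (
        sum(rose_bag.values()) + smoothing * len(vocabulary)
    )
    fell_rate = (fell_bag[word] + smoothing) / (
        sum(fell_bag.values()) + smoothing * len(vocabulary)
    )
    return log(rose_rate / fell_rate)

# Most "rising unemployment" words first, most "falling unemployment" words last.
ranked = sorted(vocabulary, key=log_odds, reverse=True)
print(ranked)
```

Words such as 'demand' that occur about equally in both bags land near zero and carry little signal either way.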

By scoring Beige Books in this way across a range of macroeconomic variables, we can assess the underlying state of the US economy. In addition, we can produce a range of specialised lexicons to glean insights from other sources, including corporate filings and earnings call transcripts.

Among the converts

Given the success of our Beige Book experiment, we are exploring further areas where we can apply Bayesian techniques to aid our investment decision making. We can only echo the sentiments of the cult classic Bayesian Believer (to the tune of I’m a Believer by The Monkees).9

“Then I saw Tom Bayes, now I’m a believer, not a trace of doubt in my mind.”

  1. Cornfield, Haenszel, Hammond, Lilienfeld, Shimkin, Wynder (1959), Smoking and Lung Cancer: Recent Evidence and a Discussion of Some Questions
  2. Clark (2016), Surfing Uncertainty: Prediction, Action, and the Embodied Mind
  3. Pew Research Center (2003), Spam: How It Is Hurting Email and Degrading Life on the Internet
  4. The Washington Post (2003), FTC seeks broader powers to fight spam; Chicago Tribune
  5. Statista (2022), Spam: share of global email traffic 2014-2021
  6. Graham (2002), A Plan for Spam
  7. Meyer, Whateley (2004), SpamBayes: Effective open-source, Bayesian based, email classification system
  8. Beige Book Archive
  9. Carlin, Glickman, Rosenthal, Leslie, via Renfro (2008), International Society for Bayesian Analysis 2008 Bayesian Cabaret ‘Bayesian Believer’

*Henry worked at Ruffer until April 2023

This article first appeared in The Ruffer Review 2023.

Past performance is not a guide to future performance. The value of investments and the income derived therefrom can decrease as well as increase and you may not get back the full amount originally invested. Ruffer performance is shown after deduction of all fees and management charges, and on the basis of income being reinvested. The value of overseas investments will be influenced by the rate of exchange.

The views expressed in this article are not intended as an offer or solicitation for the purchase or sale of any investment or financial instrument, including interests in any of Ruffer’s funds. The information contained in the article is fact based and does not constitute investment research, investment advice or a personal recommendation, and should not be used as the basis for any investment decision. References to specific securities are included for the purposes of illustration only and should not be construed as a recommendation to buy or sell these securities. This document does not take account of any potential investor’s investment objectives, particular needs or financial situation. This article reflects Ruffer’s opinions at the date of publication only, the opinions are subject to change without notice and Ruffer shall bear no responsibility for the opinions offered. This financial promotion is issued by Ruffer LLP, which is authorised and regulated by the Financial Conduct Authority. Read the full disclaimer.
