Research Affiliates: Not so happy days for backtests

Scott Longley


In their latest piece of analysis of the world of smart beta, it might be said that the team at Research Affiliates (RA) come not to bury backtests but to, shall we say, put them in their place.

And that place is very much in the spotlight due to what the RA team, led by John West (pictured left) and Alex Pickard (pictured right), see as unsustainable – not to say fanciful – claims for smart beta performance that, in essence, do not stand up to much scrutiny.

In an extended analogy, the RA team reach for the now-infamous episode of Happy Days in which the Fonz jumps a caged shark as part of a waterskiing daredevil challenge. It is a classic example of overreach by a tired sitcom, and the phrase ‘jump the shark’ has since entered popular idiom as shorthand for particularly implausible moments or storylines.

So to backtests. West and Pickard say the confluence of shorter time horizons, increased competition, and recent underperformance lies behind what they term smart beta’s own ‘jump the shark’ moment and the implausible return outcomes that now exist within the space.

“It is our understanding that at least one factor strategy provider is claiming a 4% annualised excess return over the last 10 years, without incurring a single calendar year of underperformance versus the cap-weighted index,” they write. “Is this plausible?”

Well, no, and they proceed to demolish such claims by looking at past mutual fund performance and assessing just what would be a reasonable amount of outperformance for a smart beta fund to claim.

They start with a dataset of US mutual fund performance from 1979 through to December 2018, covering 4,463 funds – excluding index funds – that survived for at least one calendar year. The measure is simple: fund performance after fees against the S&P 500. This is the same benchmark used for the ‘shark-jumping’ claims; it gives a measure of absolute performance against the market rather than against a fund’s own benchmark (which can be ‘gamed’ if a manager wishes to beat an ‘easy’ benchmark); and, finally, it allows for an easy interpretation.

Having decided upon the dataset, West and Pickard suggest the historical performance of the active funds can then be analysed to assess what would be a plausible live performance. Crucially, live fund returns come without two specific drawbacks of backtests. One, they are net of transaction costs and management fees. Two, fund managers have no need to publish their methodologies, whereas smart beta providers have to be explicit about their methods (which often leads to post-publication declines in performance).

Analysing the performance

Over time periods of one year, three years and 10 years, the results were not surprising: most mutual funds underperform the market regardless of time period. Over one year, only 43% of funds beat the market. That dropped to 41% over three years, before rising slightly over 10 years.

Another key finding is that star performers – i.e. those with excess returns of over 4% a year – drop off over time. When we move from three to 10 years, the win rate falls to 9.19%, even with the survivorship bias.

“Heroic outperformance generally does not endure as market cycles progress,” they write. “Funds typically have exposure mandates, and as a result, the funds optimally positioned to take advantage of today’s popular asset class and factor exposures will not be exposed to tomorrow’s.

“In other words, when the value factor does well, most value funds do well, and when low volatility is the factor du jour, low-vol funds outperform, and so on. But factors and asset classes inevitably undergo periods of underperformance, and so do the funds exposed to them.”

West and Pickard explain funds will often outperform only in certain market conditions, or for a defined period, but not over longer running cycles. This is due, they suggest, to three interrelated features of fund management performance that don’t necessarily hold true forever.

One is obvious – luck. “With a starting universe of a few thousand equity securities, a strategy – especially a concentrated one – can get randomly lucky given the large standard deviation of returns of individual securities and industries,” they write.

Second, capacity and trading issues mean that fund size is often inversely related to performance. Third, when mean reversion takes hold, 10-year returns can often be lower than three-year or five-year returns.
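The luck point can be illustrated with a toy simulation. The sketch below is not the Research Affiliates methodology: it assumes a hypothetical universe of 5,000 funds with zero skill, whose yearly excess returns are independent draws from a normal distribution with a 6% standard deviation, and it uses a simple arithmetic average of those excess returns. All of these parameters are illustrative assumptions. It counts how many such funds would still clear a 4% a year hurdle over one, three and 10 years.

```python
import random


def share_of_lucky_funds(years, n_funds=5000, tracking_error=0.06,
                         hurdle=0.04, seed=0):
    """Share of zero-skill funds whose average yearly excess return beats `hurdle`.

    Every parameter here is an illustrative assumption, not a figure from the study.
    """
    rng = random.Random(seed)
    lucky = 0
    for _ in range(n_funds):
        # Zero expected excess return: any outperformance is pure chance.
        excess = [rng.gauss(0.0, tracking_error) for _ in range(years)]
        if sum(excess) / years > hurdle:
            lucky += 1
    return lucky / n_funds


shares = {years: share_of_lucky_funds(years) for years in (1, 3, 10)}
for years, share in shares.items():
    print(f"{years:>2}-year window: {share:.1%} of zero-skill funds clear 4% a year")
```

Under these assumptions, the share of funds that look like star performers shrinks as the window lengthens, which is the pattern one would expect if short-term outperformance owes a good deal to chance.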

The consistency challenge

Unsurprisingly, the conclusion is that consistency is hard to find. “The cyclicality of returns is a challenge for both asset managers and their clients. Clients want high excess returns with consistency. Smart beta providers are well aware of this concern and are increasingly emphasising multi-factor strategies to ostensibly alleviate wide performance swings associated with a particular investing style.”

West and Pickard ask at this point whether backtested multi-factor results – which they say are “too often represented as a reasonable basis for forecasting future returns” – are supported by the live track record of the best-performing mutual funds.

To get to their answer, they look at what percentage of funds beat the market by over 4% a year over a three-year span and also outperformed over each one-year period. Understatedly, West and Pickard say that “achieving this level of consistency is much more difficult to do.” In fact, only 3.7% managed this feat.
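A screen of this kind can be sketched in a few lines. The version below is only one reading of the test described above: it assumes yearly returns expressed as decimals, measures the excess as the difference between geometric annualised returns, and uses hypothetical function names and made-up figures. The article does not specify how the study computed its averages.

```python
from math import prod


def annualised(returns):
    """Geometric annualised return from a list of yearly returns (0.10 = 10%)."""
    growth = prod(1.0 + r for r in returns)
    return growth ** (1.0 / len(returns)) - 1.0


def passes_consistency_screen(fund, market, hurdle=0.04):
    """True if the fund beat the market by more than `hurdle` a year on an
    annualised basis AND outperformed in every single calendar year."""
    if len(fund) != len(market) or not fund:
        raise ValueError("need two equal-length, non-empty return series")
    excess = annualised(fund) - annualised(market)
    every_year = all(f > m for f, m in zip(fund, market))
    return excess > hurdle and every_year


# Made-up numbers for illustration only, not data from the study.
market = [0.10, -0.05, 0.08]
steady = [0.15, 0.00, 0.13]   # ahead of the market in every year
lumpy = [0.40, -0.10, 0.05]   # large average lead, but behind in two years

print(passes_consistency_screen(steady, market))  # True
print(passes_consistency_screen(lumpy, market))   # False
```

The second fund shows why the combined test is so much harder to pass: one exceptional year is enough to lift the average above the hurdle, yet the fund still fails because it lagged the market in the other two years.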

So, back to the ‘jump the shark’ claim: namely, that a smart beta backtest produced a 4% average annual excess return over the past 10 years while also outperforming in each individual year.

Well, if you are a proponent of such a thesis then you might want to look away now because such a feat has not been achieved once. Never. “In effect, any smart beta vendor who suggests that this is a reasonable expectation is laying claim to skill that no asset manager has ever exhibited before,” they conclude.

Even if the criteria are relaxed slightly to a 3% average annual excess return over the past 10 years, the success rate is vanishingly small – just twice in 23,740 observations. As West and Pickard say, citing work by their colleagues Arnott, Cornell, and Shepherd from 2018, a bubble in asset prices requires “implausible future return assumptions.”

So they ask: “Might we have reached a bubble in smart beta performance claims?”

The realms of reality

Well, what are plausible claims? West and Pickard put forward the “reasonable” assumption that the “best smart beta strategies can earn an annualised 10-year excess return of 1-2% net of transaction costs.” Taking this tack avoids unsupportable claims that smart beta is the “magic elixir”.

The final hurdle is the consistency of returns. As West and Pickard point out, the desire on the part of investors to avoid any periods of underperformance is “woefully unrealistic.” Most long-term outperformers earn excess returns in only five or six years out of 10.

“A smart beta strategy, indeed any strategy whose performance deviates (even successfully) from the market’s performance, is virtually guaranteed to have multiple years of underperformance over a 10-year holding period,” they add.

“Backtests, especially those optimised to maximise the backtest results and then presented in sample (spanning the very years that were used to develop the model), may create the illusion of seemingly massive excess returns and limited to few if any bouts of underperformance,” they conclude. “A long-term survey of live mutual fund returns reveals a very different picture.”

As they say, West and Pickard’s aim is not to trash backtests per se. Any empirical research depends on backtests. Rather, their concern is that the quant community uses backtests repeatedly to fine-tune backtests’ results. “This practice is exacerbated with smart beta index strategies because the cost of launching another index backfilled with a better track record is virtually nil. In our view, if a backtest is used, iteratively and repeatedly, to boost a strategy’s own backtested performance, the strategy probably should be discarded.”

And what of those who think that a performance of 1-2% a year is not enough? Well, as West and Pickard say, in a world where savers are penalised by low interest rates and generally high equity valuations, a return of upwards of 20% after 10 years does not look all that shabby. Indeed, they suggest “carefully selected allocations to the better smart beta strategies” is “one of the more effective ways to narrow the return expectations gap”.
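The arithmetic behind that 10-year figure is simple compounding. The sketch below assumes the excess return compounds once a year at a constant rate, which is a simplification (a real excess return compounds alongside the market return), and the function name is hypothetical.

```python
def cumulative_return(annual_rate, years=10):
    """Total return from compounding `annual_rate` once a year for `years` years."""
    return (1.0 + annual_rate) ** years - 1.0


for rate in (0.01, 0.02):
    print(f"{rate:.0%} a year for 10 years -> {cumulative_return(rate):.1%}")
# 1% a year for 10 years -> 10.5%
# 2% a year for 10 years -> 21.9%
```

On this simplified basis, the “upwards of 20%” figure corresponds to the top of the 1-2% range; the bottom of the range compounds to roughly half that.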

“And with many smart beta strategies, especially those linked to the value factor, trading at abnormally cheap relative valuations, we see happy days again for smart beta investors with reasonable expectations.” And all with no need for any sharks.
