Disclaimer. “The worth of an econometrics textbook tends to be inversely related to the technical material devoted to multicollinearity” – Williams, R. Economic Record 68, 80-1. (1992) via Kennedy, A Guide to Econometrics (6th edition).
If this quote does not interest you, then the rest of this research note is unlikely to be exciting, so perhaps best to skip this one.
Well, we are glad you made it, so let’s start talking about multicollinearity. The basic concept is that when we have several independent variables in a regression analysis, these should, as the name suggests, be independent of each other. If they are correlated, the regression struggles to separate their individual effects, which inflates the standard errors of the estimated coefficients and erodes the statistical significance of the results.
Regression analysis in finance is almost always plagued by multicollinearity. First, many seemingly independent asset classes are more correlated than frequently assumed, e.g. US equities and high yield bonds. Secondly, correlations are not stable and change frequently. Merger arbitrage is uncorrelated to equities when stock markets are calm, but correlations spike when stocks crash.
Fortunately, there are methodologies to test for and reduce multicollinearity. A variance inflation factor (VIF) can be used to measure it. A Lasso regression, which penalises the size of the coefficients, typically shrinks some of them to zero and thereby removes correlated variables. Another approach is to residualise the variables and make them more independent.
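As an illustration of the VIF, here is a minimal sketch in Python. It uses the textbook definition, VIF = 1 / (1 - R²), where R² comes from regressing one variable on all the others. The function name `variance_inflation_factors` and the synthetic return series are our own assumptions for illustration, not the data used in this note.

```python
import numpy as np

def variance_inflation_factors(X):
    """VIF per column of X (rows = observations, columns = regressors)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    vifs = np.empty(k)
    for j in range(k):
        y = X[:, j]
        # Regress column j on an intercept plus all the other columns
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, y, rcond=None)
        resid = y - others @ coef
        r2 = 1.0 - resid.var() / y.var()
        vifs[j] = 1.0 / (1.0 - r2)
    return vifs

# Synthetic returns (assumption): two series share a driver, the third is independent
rng = np.random.default_rng(0)
common = rng.normal(size=2000)
returns = np.column_stack([
    common + 0.3 * rng.normal(size=2000),
    common + 0.3 * rng.normal(size=2000),
    rng.normal(size=2000),
])
vifs = variance_inflation_factors(returns)
print(vifs)
```

A VIF close to 1 indicates that a variable is largely independent of the others, while larger values signal that much of its variance is explained by them.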
In this research note, we will explore residualising variables used in a regression analysis.
Residualising stock market indices
Residualising a variable simply means removing the effects of other variables. Ironically, we are going to use regression to do this as well. As a case study, we use the S&P 500 and residualise it against other equity indices, specifically European, Japanese, and emerging market (EM) stocks. First, we regress the S&P 500 on the other equity indices to obtain its betas to them, and then remove their contribution to the return of the S&P 500 each day. This results in a residualised S&P 500 that represents the performance of US stocks without the influence of other key stock markets. It could be viewed as the idiosyncratic performance of US companies.
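The two steps can be sketched as follows. This is a minimal illustration under our own assumptions: the betas are estimated once over the full sample with an intercept, only the beta contributions are subtracted (so any intercept stays in the residualised return), and the data are synthetic. The helper `residualise` is a hypothetical name.

```python
import numpy as np

def residualise(target, others):
    """Subtract from `target` the part explained by `others` (in-sample OLS betas)."""
    target = np.asarray(target, dtype=float)
    others = np.asarray(others, dtype=float)
    design = np.column_stack([np.ones(len(target)), others])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    betas = coef[1:]  # drop the intercept, keep one beta per index
    return target - others @ betas, betas

# Synthetic daily returns (assumption): a common driver plus index-specific noise
rng = np.random.default_rng(1)
n_days = 2500
world = rng.normal(0.0, 0.01, n_days)
other_indices = np.column_stack([
    0.9 * world + rng.normal(0.0, 0.006, n_days),  # stand-in for Europe
    0.7 * world + rng.normal(0.0, 0.007, n_days),  # stand-in for Japan
    1.1 * world + rng.normal(0.0, 0.008, n_days),  # stand-in for EM
])
us = world + rng.normal(0.0003, 0.005, n_days)     # stand-in for the S&P 500

us_resid, betas = residualise(us, other_indices)
print("betas:", betas.round(2))
print("volatility original vs residualised:", us.std(), us_resid.std())
```

By construction, the residualised series is uncorrelated with each of the other indices over the estimation sample and is less volatile than the original.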
We observe that the performance of the residualised S&P 500 was almost identical to that of the original S&P 500 in the period from 2011 to 2021. The key difference is that the residualised returns were smoother and the drawdowns during stock market crashes were less severe.
In contrast, residualising the European stock market index produces a performance that differs far more from the original index than was the case in the US. The two European indices followed similar trends between 2011 and 2016, but thereafter the residualised returns were consistently negative.
What does this mean?
We are not certain, but one interpretation is that without the animal spirits of the other stock markets, i.e. the US, Japan, and EM, the performance of European stocks would have been quite negative over the last decade. Europe’s economy has grown more slowly than the global one, but European stocks benefitted from a globally bullish environment for equities.
Next, we residualise the stock market indices of the US, Europe, Japan, and EM, and then calculate the correlations of the residualised to the original indices in the period from 2011 to 2021. The correlations were above 0.8 in the US and Japan, and higher than 0.6 in Europe and EM.
Overall, this highlights that the residualisation process does have a meaningful impact on these indices. Asset classes that have correlations of 0.6 are often viewed as diversifying, so most investors would likely categorise the residualised European and EM indices as substantially different from the original ones.
Benefits of residualising variables
The goal of residualising variables is to make them more independent of each other and reduce multicollinearity. We can demonstrate this by showing the 12-month rolling correlations of US and EM equities. Investing in emerging markets is often seen as diversifying, but the correlation to US stocks over the last decade was 0.5 on average and increased to 0.7 during crises such as March 2020.
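A rolling correlation of this kind can be sketched as below. We assume daily returns and a window of 252 trading days as a proxy for 12 months. The function `rolling_corr` and the synthetic US and EM series are hypothetical and only meant to show the mechanics.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def rolling_corr(x, y, window):
    """Pearson correlation of x and y over each trailing window."""
    xw = sliding_window_view(np.asarray(x, dtype=float), window)
    yw = sliding_window_view(np.asarray(y, dtype=float), window)
    xc = xw - xw.mean(axis=1, keepdims=True)
    yc = yw - yw.mean(axis=1, keepdims=True)
    return (xc * yc).sum(axis=1) / np.sqrt((xc ** 2).sum(axis=1) * (yc ** 2).sum(axis=1))

# Synthetic daily returns (assumption), built to be moderately correlated
rng = np.random.default_rng(2)
n_days = 2600
common = rng.normal(0.0, 0.01, n_days)
us = common + rng.normal(0.0, 0.008, n_days)
em = common + rng.normal(0.0, 0.012, n_days)

window = 252  # roughly 12 months of trading days
corr = rolling_corr(us, em, window)
print("average:", corr.mean(), "min:", corr.min(), "max:", corr.max())
```

The same function can be applied to a residualised series in place of `em` to compare the two correlation paths.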
We observe that the correlation drops to slightly below zero when the EM equity index is residualised. However, if both indices are residualised, then the average correlation decreases to -0.5. The trends in correlation are identical, but these results are not intuitive. Why would the correlation of the residualised indices be negative and not zero?
Again, we are not certain, but perhaps currencies explain the negative relationship. The USD has been relatively strong over the last decade, similar to the performance of the US stock market. In contrast, EM stocks have largely been flat and EM economies benefit from a depreciating USD. One stock market benefitted while the other suffered from a rising USD, which is reflected in negatively correlated idiosyncratic returns.
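A mechanical effect may also be at work. When each index is regressed on all of the others, including the one it is later compared with, the correlation of the two sets of OLS residuals equals the negative of the partial correlation between the two indices. Positively related markets would then show negatively correlated residuals by construction. The sketch below checks this on synthetic data; whether it accounts for the -0.5 observed here depends on how the residualisation was set up, which we only assume.

```python
import numpy as np

def ols_residuals(target, others):
    """Residuals of an OLS regression of target on others, with an intercept."""
    design = np.column_stack([np.ones(len(target)), others])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return target - design @ coef

# Synthetic returns (assumption): four indices loading on one common driver
rng = np.random.default_rng(3)
n_days = 2500
world = rng.normal(0.0, 0.01, n_days)
loadings = np.array([1.0, 0.9, 0.7, 1.1])  # stand-ins for US, Europe, Japan, EM
indices = world[:, None] * loadings + rng.normal(0.0, 0.006, (n_days, 4))

# Residualise US and EM, each against the three other indices
us_resid = ols_residuals(indices[:, 0], indices[:, 1:])
em_resid = ols_residuals(indices[:, 3], indices[:, :3])
resid_corr = np.corrcoef(us_resid, em_resid)[0, 1]

# Partial correlation of US and EM given the rest, from the precision matrix
precision = np.linalg.inv(np.cov(indices, rowvar=False))
partial_corr = -precision[0, 3] / np.sqrt(precision[0, 0] * precision[3, 3])

print("correlation of residuals:", resid_corr)
print("partial correlation:     ", partial_corr)
```

In this setup the two printed numbers have the same size and opposite signs, so a negative correlation between residualised indices need not reflect an economic relationship on its own.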
Using residualised variables in a factor exposure analysis
Finally, we explore using the residualised indices in a simple factor exposure analysis of a global equities portfolio. Utilising the original stock indices highlights that the majority of the portfolio's returns can be attributed to the US stock market and the minority to European stocks. There is no exposure to Japanese or EM stocks, and the R² is 0.96, which implies the explanatory power of the model is high.
However, using the residualised indices paints a completely different picture: European and EM stocks now contribute significantly to the performance of the global equities portfolio, although the R² is almost identical. The factor beta to US equities is also considerably higher, which is a reflection of the lower volatility of the residualised US equity index.
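The comparison can be sketched on synthetic data as follows. We assume each index is residualised in-sample against the other three and that the portfolio is a fixed-weight mix of the indices plus noise. The helpers `fit` and `residualise_all` and the weights are hypothetical. Under these assumptions the residualised indices span the same space as the original ones, so the R² is exactly the same while the betas change; an R² that is only almost identical could arise if the betas used for residualising are estimated differently, for instance on rolling windows.

```python
import numpy as np

def fit(y, X):
    """OLS with intercept; returns the slope coefficients and R²."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return coef[1:], 1.0 - resid.var() / y.var()

def residualise_all(X):
    """Residualise every column of X against all the other columns."""
    out = np.empty_like(X)
    for j in range(X.shape[1]):
        others = np.delete(X, j, axis=1)
        betas, _ = fit(X[:, j], others)
        out[:, j] = X[:, j] - others @ betas
    return out

# Synthetic returns (assumption): stand-ins for US, Europe, Japan, EM
rng = np.random.default_rng(4)
n_days = 2500
world = rng.normal(0.0, 0.01, n_days)
indices = world[:, None] * np.array([1.0, 0.9, 0.7, 1.1]) \
    + rng.normal(0.0, 0.006, (n_days, 4))
portfolio = indices @ np.array([0.6, 0.2, 0.1, 0.1]) + rng.normal(0.0, 0.002, n_days)

betas_orig, r2_orig = fit(portfolio, indices)
betas_resid, r2_resid = fit(portfolio, residualise_all(indices))

print("original betas:    ", betas_orig.round(2), "R²:", round(r2_orig, 4))
print("residualised betas:", betas_resid.round(2), "R²:", round(r2_resid, 4))
```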
One interpretation of these results is that using residualised variables paints a truer picture of the underlying exposure of the global equities portfolio. Although US stocks dominate the portfolio, most of these companies do business in Europe or EM, which is revealed by the residualised indices. For example, Apple is a US company, but China is its largest market, so the company has significant EM exposure.
Should investors residualise all variables for a factor exposure analysis?
Technically this can be achieved easily. A global multi-asset portfolio easily requires 20 different factors, e.g. equity indices, equity factors, fixed income indices, fixed income factors, currencies, and commodities. Fortunately, the residualisation process is relatively simple and can be conducted quickly for a large number of factors.
However, although this analysis highlights a case for using residualised variables in a factor exposure analysis, we only analysed equity indices. As a follow-on, we should look at residualising indices across asset classes and the impact on the factor exposure analysis of complex multi-asset portfolios. More work is required before calling a war for independence.
Nicolas Rabener is founder and CEO of FactorResearch