For those who were unable to attend, we have compiled the key takeaways from the webinar featuring Lasse Heje Pedersen, professor at Copenhagen Business School and NYU and a principal at AQR.
Pedersen’s presentation revisited a topic that has been widely discussed in academic circles and in the financial media: whether financial research is facing a so-called ‘replication crisis’. His research measures which studies in financial economics are replicable by addressing two important challenges: internal validity and external validity. Researchers have argued that some economic studies cannot be replicated using the same data or a slightly different methodology, making them internally invalid. Concerns about external validity run in parallel: even when a study replicates robustly on its original sample, the result may be spurious and not expected to hold in different samples or other time periods.
Professor Pedersen’s paper, titled “Is There a Replication Crisis in Finance?”, seeks to determine what percentage of factor research is replicable. Pedersen and colleagues use a large, new, replicable data set with 153 factors across 93 countries, combined with a theory-based Bayesian approach. “We combine the data set with a model that tells us a logical way to learn from this data. This is essentially a way to think about joint estimation of all these factors, in a way that embeds a multiple-testing correction,” Pedersen explains. “One of the issues here is that all these other papers have been treated in isolation, but we want to take into account that there are many tests being run at the same time.”
Midway through the presentation, Pedersen succinctly summarized the main findings of the study: “82% of factor research is, by our estimates, replicable in a way that is very robust. Our research uses a different method than all the original papers, one that is highly externally valid. This percentage may not be perfect – but it is surprisingly high.”
The professor subsequently elaborated on the measure of internal validity used in the research: “Each confidence interval marked in blue (see figure 1) corresponds to a replicated factor. This is because the confidence interval is entirely above 0, which means that we are confident that the alpha is positive and, simultaneously, the results from the original study also showed that these factors were significantly positive – so then we can say that this factor is replicated,” Pedersen explained.
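This replication criterion can be sketched in a few lines of code. This is a minimal illustration, not the paper’s actual implementation; the function name and the 1.96 critical value (a standard 95% interval) are our own assumptions.

```python
def is_replicated(posterior_alpha, posterior_se, original_significant, z=1.96):
    """Hypothetical sketch of the replication criterion described above:
    a factor counts as replicated when the confidence interval for its
    alpha lies entirely above zero AND the original study also reported
    a significantly positive alpha."""
    ci_lower = posterior_alpha - z * posterior_se  # lower end of the interval
    return ci_lower > 0 and original_significant

# Example with made-up numbers (monthly alpha in %, standard error):
print(is_replicated(0.5, 0.2, original_significant=True))   # interval above 0
print(is_replicated(0.3, 0.2, original_significant=True))   # interval crosses 0
```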
When comparing the research to the literature, Pedersen pointed out several differences and similarities: “Our hierarchical model uses a joint estimation, where we draw on the benefit of all the data that we have, whilst the literature just uses one factor at a time. Therefore, the literature always has an OLS (Ordinary Least Squares) estimate, whilst we have a shrunk posterior, meaning that we learn more about the true alpha from the joint use of all the data while being conservative and shrinking toward zero. Their confidence intervals widen when they use their multiple-testing method, while our confidence intervals actually contract, because we learn much more from the joint data and therefore have a more precise notion of what the true alpha is.”
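The contrast between a per-factor OLS estimate and a shrunk posterior can be illustrated with a simple empirical-Bayes calculation. This is a textbook normal-normal sketch under assumptions we are adding ourselves (a zero-mean normal prior on the true alphas, with its variance estimated from the cross-section by method of moments); it is not the paper’s hierarchical model, but it reproduces the two effects quoted above: estimates shrink toward zero, and posterior uncertainty is smaller than the per-factor standard error.

```python
import numpy as np

def shrunk_posterior(alpha_hat, se):
    """Empirical-Bayes sketch (our simplification, not the paper's model).
    Assumes true alpha_i ~ N(0, tau^2) and alpha_hat_i ~ N(alpha_i, se_i^2).
    tau^2 is estimated from the cross-section of all factors jointly."""
    alpha_hat = np.asarray(alpha_hat, dtype=float)
    se = np.asarray(se, dtype=float)
    # Method-of-moments estimate of the prior variance, floored at zero.
    tau2 = max(np.mean(alpha_hat**2) - np.mean(se**2), 0.0)
    w = tau2 / (tau2 + se**2)              # shrinkage weight toward zero
    post_mean = w * alpha_hat              # shrunk posterior mean
    post_sd = np.sqrt(tau2 * se**2 / (tau2 + se**2))  # narrower than se
    return post_mean, post_sd

# Made-up OLS alphas (monthly, %) and standard errors for four factors:
pm, ps = shrunk_posterior([0.6, 0.4, 0.5, 0.1], [0.2, 0.2, 0.2, 0.2])
```

Because every factor contributes to the estimate of the prior variance, each factor’s posterior borrows strength from all the others, which is the sense in which joint estimation makes the intervals contract rather than widen.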
To measure external validity, the professor discussed time series of the factors in the US and how the factors held up when analyzed before or after their original sample periods. According to Pedersen, there is evidence that some of these factors are affected by data mining and exogenous circumstances, but “there is certainly strong evidence that the majority of these factors are real.”
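The before/after comparison can be sketched as a split of a factor’s return series at its (hypothetical) publication date. For simplicity we use the t-statistic of the mean return as a stand-in for a regression alpha; the function names and the split index are our own illustration, not the paper’s procedure.

```python
import numpy as np

def alpha_tstat(returns):
    """t-statistic of the mean return (a simple stand-in for alpha)."""
    r = np.asarray(returns, dtype=float)
    return r.mean() / (r.std(ddof=1) / np.sqrt(len(r)))

def in_vs_out_of_sample(returns, publication_idx):
    """Split a factor's return series at a hypothetical publication date
    and report the t-stat in the original and post-publication periods."""
    return (alpha_tstat(returns[:publication_idx]),
            alpha_tstat(returns[publication_idx:]))

# 480 months of made-up returns, published after month 240:
r = [0.02, -0.01] * 240
t_in, t_out = in_vs_out_of_sample(r, 240)
```

A factor whose out-of-sample t-statistic remains significant, as the majority do in the study, survives this external-validity check.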
He elaborated on this statement by comparing his research with a critical paper by Harvey et al. (2016), who have argued that ‘most claimed research findings in financial economics are likely false’. Pedersen: “Harvey and colleagues argue that we should blow up standard errors and reject most factors as wrong. However, if you do that, you could be leaving money on the table as an investor. (…) What we try to do is take factors and evaluate: does our method say this factor is real? In other words, we look at the posterior mean, posterior volatility and the level of significance. If it is significant, we would say that this is an interesting factor to trade on.”
Pedersen observed that out-of-sample performance was consistent and that the findings were remarkably externally valid: “It is out-of-sample relative to the original papers, relative to Harvey’s paper, and even out-of-sample in terms of geography – and the performance is still significant. For us, this was remarkably persuasive evidence and actually quite surprising. These findings hold up after their discovery, both in the US where they were discovered and outside the US, so we have external validity both in time and in geography.”
Interested in learning more about the research? You can also download the slides here or watch the recording of the webinar: