Improved estimation of the covariance matrix of stock returns with an application to portfolio selection

https://doi.org/10.1016/S0927-5398(03)00007-0

Abstract

This paper proposes to estimate the covariance matrix of stock returns by an optimally weighted average of two existing estimators: the sample covariance matrix and single-index covariance matrix. This method is generally known as shrinkage, and it is standard in decision theory and in empirical Bayesian statistics. Our shrinkage estimator can be seen as a way to account for extra-market covariance without having to specify an arbitrary multifactor structure. For NYSE and AMEX stock returns from 1972 to 1995, it can be used to select portfolios with significantly lower out-of-sample variance than a set of existing estimators, including multifactor models.

Introduction

The objective of this paper is to estimate the covariance matrix of stock returns. This is a fundamental question in empirical finance with implications for portfolio selection and for tests of asset pricing models such as the CAPM.

The traditional estimator, the sample covariance matrix, is seldom used because it imposes too little structure. When the number of stocks N is of the same order of magnitude as the number of historical returns per stock T, the total number of parameters to estimate is of the same order as the total size of the data set, which is clearly problematic. When N is larger than T, the sample covariance matrix is always singular, even if the true covariance matrix is known to be non-singular.
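The singularity claim can be checked numerically. The following is a minimal sketch on simulated data; the sizes T = 24 and N = 60 and the normal return distribution are illustrative assumptions, not values taken from the paper. After demeaning, the sample covariance matrix has rank at most T − 1, so it cannot be inverted when N is at least T.

```python
import numpy as np

def sample_covariance(returns):
    """Sample covariance of a T x N array of returns (rows are dates)."""
    return np.cov(returns, rowvar=False)

# Simulated returns (an assumption for illustration, not real stock data).
rng = np.random.default_rng(0)
T, N = 24, 60  # fewer observations per stock than stocks
returns = rng.normal(loc=0.0, scale=0.05, size=(T, N))

S = sample_covariance(returns)
rank = np.linalg.matrix_rank(S)

# The N x N matrix S has rank at most T - 1, hence it is singular when N >= T.
print(f"shape = {S.shape}, rank = {rank}, full rank would be {N}")
```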

These severe problems may come as a surprise, since the sample covariance matrix has appealing properties, such as being maximum likelihood under normality. But this is to forget what maximum likelihood means. It means the most likely parameter values given the data. In other words: let the data speak (and only the data). This is a sound principle, provided that there is enough data to trust the data. Indeed, maximum likelihood is justified asymptotically as the number of observations per variable goes to infinity. It is a general drawback of maximum likelihood that it can perform poorly in small samples. For the covariance matrix, small-sample problems occur unless T is at least one order of magnitude larger than N.

The cure is to impose some structure on the estimator. Ideally, the particular form of the structure should be dictated by the problem at hand. In the case of stock returns, a low-dimensional factor structure seems natural. But this leaves two very important questions: How much structure should we impose? And what factors should we use?

To address these questions properly, we have to be more specific about how we impose a low-dimensional factor structure. One possible way is to specify a K-factor model with uncorrelated residuals. Then K controls how much structure we impose: the fewer the factors, the stronger the structure. The advantages of this approach are that it is quite familiar to the finance profession and that the factors sometimes have an economic interpretation. The disadvantages are that there is no consensus on the identity of the factors (except for the first one, which represents a market index) and that there is no consensus on the number of factors K either (Connor and Korajczyk, 1995). In other words, choosing between factor models is very ad hoc. This does not mean that none of them works well; it means that we do not know a priori which one works well. For example, if we are interested in selecting portfolios with low out-of-sample variance, in any given data set there may exist a factor model that performs well, but it may be a different one for every data set, and there is no way of telling which one works well without looking out-of-sample, which is cheating. The art of choosing a factor model adapted to a given data set without seeing its out-of-sample fit is just that: an art.

This is why, in this paper, we study another way of imposing factor structure. It is to take a weighted average of the sample covariance matrix with Sharpe's (1963) single-index model estimator. The weight α (between zero and one) assigned to the single-index model controls how much structure we impose: the heavier the weight, the stronger the structure. This is a well-known technique in Statistics called shrinkage dating back to Stein (1956): α is called the shrinkage intensity and the single-index model is our choice of shrinkage target. The advantages are that there is strong consensus on the nature of the single factor (a market index) and that there is a way of estimating the optimal shrinkage consistently. The estimation of α is the technically challenging part of this paper. It provides a rigorous answer to the question of how much structure we should impose. On any given data set, there will be a different optimal shrinkage intensity and our estimation technique will find it without having to look out-of-sample. This takes the ad-hockery out of the task of imposing structure on the covariance matrix of stock returns. It replaces the art of factor selection by a fully automatic procedure.
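The weighted average described above can be sketched in a few lines. This is a hedged illustration, not the paper's full procedure: the data are simulated, the market index is assumed to be an equal-weighted average of the stocks, the function names `single_index_target` and `shrink` are hypothetical, and α is fixed at an arbitrary 0.5 rather than estimated.

```python
import numpy as np

def single_index_target(returns, market):
    """Covariance matrix implied by a one-factor (market) model.

    returns: T x N array of stock returns; market: length-T array of index returns.
    """
    T = returns.shape[0]
    x = returns - returns.mean(axis=0)        # demeaned stock returns
    m = market - market.mean()                # demeaned market returns
    var_m = m @ m / T                         # market variance (1/T normalization)
    beta = (x.T @ m) / (T * var_m)            # OLS slope of each stock on the market
    resid = x - np.outer(m, beta)             # regression residuals
    resid_var = (resid ** 2).sum(axis=0) / T  # residual variance per stock
    # Factor part plus a diagonal residual part (residuals treated as uncorrelated).
    return var_m * np.outer(beta, beta) + np.diag(resid_var)

def shrink(sample_cov, target, alpha):
    """Weight alpha on the target and (1 - alpha) on the sample covariance."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * target + (1.0 - alpha) * sample_cov

# Simulated one-factor returns (illustrative assumption, not real data).
rng = np.random.default_rng(1)
T, N = 24, 60
factor = rng.normal(0.01, 0.04, size=T)
betas = rng.uniform(0.5, 1.5, size=N)
returns = np.outer(factor, betas) + rng.normal(0.0, 0.05, size=(T, N))
market = returns.mean(axis=1)                 # hypothetical equal-weighted index

S = np.cov(returns, rowvar=False, bias=True)  # same 1/T normalization as the target
F = single_index_target(returns, market)
sigma_hat = shrink(S, F, alpha=0.5)           # alpha chosen arbitrarily here

print("smallest eigenvalue of shrunk estimate:", np.linalg.eigvalsh(sigma_hat).min())
```

In this sketch the target has strictly positive residual variances on its diagonal, so any α above zero yields a positive definite estimate even though S itself is singular when N exceeds T.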

At this point, it is worth mentioning that the paper is solely concerned with the structure of risk in the stock market, not with the structure of expected returns. Multifactor models of the covariance matrix can still be very useful if economic arguments tie them up to the cross-section of expected returns, as in the Arbitrage Pricing Theory of Ross (1976). Any discussion of the relationship between risk factors and expected returns is outside the scope of the paper. There should be no ambiguity over whether we define “factors” in terms of the mean vector or of the covariance matrix of stock returns: it is always the latter.

Muirhead (1987) reviews the large literature on shrinkage estimators of the covariance matrix in finite-sample statistical decision theory. All these estimators suffer from at least two severe drawbacks, either of which is enough to make them ill-suited to stock returns: (i) they break down when N>T; (ii) they do not exploit the a priori knowledge that stock returns tend to be positively correlated with one another. Frost and Savarino (1986) show that the solution to the second problem is to use a shrinkage target that incorporates a market factor, but they ignore without justification the correlation between estimation error on the shrinkage target and on the covariance matrix, and they are still exposed to the first problem. A main contribution of our paper to the literature is to address the first problem by defining the optimal shrinkage intensity as the minimizer of a loss function that does not involve the inverse of the covariance matrix. Moreover, the technique is so general that it is applicable to other shrinkage targets as well.

A noteworthy innovation is that the optimal shrinkage intensity depends on the correlation between estimation error on the sample covariance matrix and on the shrinkage target. Intuitively, if the two of them are positively (negatively) correlated, then the benefit of combining the information that they contain is smaller (larger). The introduction of this correlation term resolves a deep logical inconsistency in earlier empirical Bayesian literature, where the prior is estimated from sample data, yet at the same time is assumed to be independent from sample data.
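The role of the correlation term can be made explicit with a short derivation. The sketch below assumes a quadratic (Frobenius-norm) loss, which is one loss function that avoids the inverse of the covariance matrix, and an unbiased sample covariance matrix; the symbols S, F, Σ, s_ij, f_ij, σ_ij and φ_ij are notation introduced here for illustration.

```latex
% Assumed notation: S = sample covariance (entries s_{ij}), F = shrinkage target (entries f_{ij}),
% \Sigma = true covariance (entries \sigma_{ij}), \phi_{ij} = E[f_{ij}], and E[s_{ij}] = \sigma_{ij}.
\begin{align*}
R(\alpha) &= \mathbb{E}\,\bigl\lVert \alpha F + (1-\alpha) S - \Sigma \bigr\rVert_F^2 \\
          &= \sum_{i,j} \Bigl[ \alpha^2 \operatorname{Var}(f_{ij})
             + (1-\alpha)^2 \operatorname{Var}(s_{ij})
             + 2\alpha(1-\alpha) \operatorname{Cov}(f_{ij}, s_{ij})
             + \alpha^2 (\phi_{ij} - \sigma_{ij})^2 \Bigr], \\
R'(\alpha) = 0 \;\Longrightarrow\; \alpha^{*}
          &= \frac{\sum_{i,j} \bigl[ \operatorname{Var}(s_{ij}) - \operatorname{Cov}(f_{ij}, s_{ij}) \bigr]}
                  {\sum_{i,j} \bigl[ \operatorname{Var}(f_{ij} - s_{ij}) + (\phi_{ij} - \sigma_{ij})^2 \bigr]}.
\end{align*}
```

The covariance term enters the numerator with a negative sign, so a positive correlation between the two estimation errors lowers the optimal intensity, in line with the intuition above. No matrix inverse appears anywhere in the expression, so it remains well defined when N>T.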

We test the performance of our shrinkage estimator on stock returns data for portfolio selection. Using NYSE and AMEX stocks from 1972 to 1995, we find that our estimator yields portfolios with significantly lower out-of-sample variance than a set of well-established competitors, including multifactor models.
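One common way to turn a covariance estimate into a portfolio is the global minimum variance portfolio, whose weights are proportional to the inverse covariance matrix applied to a vector of ones. The sketch below assumes that construction (the text above does not specify the portfolio rule) and uses a hypothetical two-asset covariance matrix; the function names are illustrative.

```python
import numpy as np

def min_variance_weights(cov):
    """Fully invested minimum variance weights: proportional to inv(cov) @ ones."""
    ones = np.ones(cov.shape[0])
    raw = np.linalg.solve(cov, ones)  # solves cov @ raw = ones without forming the inverse
    return raw / raw.sum()            # rescale so the weights sum to one

def portfolio_variance(weights, cov):
    """Variance of a portfolio with the given weights under covariance cov."""
    return float(weights @ cov @ weights)

# Hypothetical two-asset example with uncorrelated returns.
cov_estimate = np.array([[0.04, 0.00],
                         [0.00, 0.01]])
w = min_variance_weights(cov_estimate)

print("weights:", w)
print("variance:", portfolio_variance(w, cov_estimate))
```

To measure out-of-sample variance, the weights would be computed from a covariance matrix estimated on one period and `portfolio_variance` evaluated with returns from a later period. The linear solve requires an invertible estimate, which is why a singular sample covariance matrix cannot be used directly when N>T.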

The remainder of the paper is organized as follows. Section 2 presents our shrinkage estimator of the covariance matrix. Section 3 presents empirical evidence on its out-of-sample performance for portfolio selection. Finally, Section 4 concludes. All figures and tables appear at the end of the paper.

Shrinkage estimator of the covariance matrix

This section presents the covariance matrix estimator that we recommend for stock returns.

Empirical results

We present empirical evidence on the performance of the shrinkage estimator defined in Section 2.6. We compare it to existing estimators in terms of its ability to select portfolios of stocks with low out-of-sample variance.

Conclusion

We have developed a flexible method for imposing some structure on a large-dimensional estimation problem, namely the problem of estimating the covariance matrix of a large number of stock returns. The crux of the method is to shrink the unbiased but very variable sample covariance matrix towards the biased but less variable single-index model covariance matrix and to thereby obtain a more efficient estimator. In addition, the resulting estimator is invertible and well-conditioned, which is

Acknowledgements

We wish to thank Andrew Lo, John Heaton, Bin Zhou, Timothy Crack, Bruce Lehmann, Richard Michaud, Richard Roll, Pedro Santa-Clara and Jay Shanken for their feedback. Also, the paper has benefited from seminar participants at MIT, the NBER, UCLA, Washington University in Saint Louis, Yale, Chicago, Wharton, UBC, Pompeu Fabra and Universität Dortmund. All remaining errors are our own.

Research of the second author was supported by DGES grant BEC2001-1270.

References (25)

  • E.F. Fama et al. Industry costs of equity. Journal of Financial Economics (1997)
  • R. Roll. A critique of the asset pricing theory's test: Part I. On past and potential testability of the theory. Journal of Financial Economics (1977)
  • S.A. Ross. The arbitrage theory of capital asset pricing. Journal of Economic Theory (1976)
  • S. Bender et al. Arbitrage and the structure of risk: a mathematical analysis. Modern Finance (1997)
  • T.-R. Bollerslev et al. A capital asset pricing model with time varying covariances. Journal of Political Economy (1988)
  • G. Connor et al. A test for the number of factors in an approximate factor model. Journal of Finance (1993)
  • G. Connor et al. The arbitrage pricing theory and multifactor models of asset returns
  • B. Efron et al. Stein's paradox in statistics. Scientific American (1977)
  • E.J. Elton et al. Estimating the dependence structure of share prices. Journal of Finance (1973)
  • E.F. Fama et al. Risk, return and equilibrium: empirical tests. Journal of Political Economy (1973)
  • P.A. Frost et al. An empirical Bayes approach to portfolio selection. Journal of Financial and Quantitative Analysis (1986)
  • J.D. Jobson et al. Estimation for Markowitz efficient portfolios. Journal of the American Statistical Association (1980)