## Abstract

The concept of diversification is central in finance and has become even more so since the 2008 financial crisis. In this article, the authors introduce a new measure for diversification. The measure, referred to as “diversification delta,” is nonparametric, based on higher moments, easily interpretable due to its mathematical formulation, and incorporates the advantages of the present measures of diversification while extending them. The measure is applied to infrastructure returns data in order to understand the benefits of diversifying across various infrastructure classes, gaining useful insights for infrastructure fund managers and investors.

Investors diversify their portfolios in order to reduce their exposure to the unpriced risks of individual assets, and the correlation matrix of asset returns is typically regarded as the common indicator of diversification (Dopfel [2003]). The correlation matrix, however, is merely a quantification of the pairwise relations between stochastic processes and does not account for the variance of the individual assets and their effects on the variance of the portfolio (Statman and Scheid [2008]). Yet even when the variance of the portfolio is also considered, the level of diversification is measured using only the first two moments of the statistical distribution. And as Samuelson [1967] observed, the measurement of diversification through two moments may indeed be too restrictive and “crude.”

Over the years, several methods have been proposed to increase the accuracy of diversification measurement. An extensive review of the clustering-based methods can be found in the work of Brown and Goetzmann [2003] and Lhabitant [2004]. But in the context of this article, two methods are particularly significant for their innovative contributions: the Portfolio Diversification Index (PDI) by Rudin and Morgan [2006] and return gaps by Statman and Scheid [2008]. The PDI uses principal component analysis to quantify diversification, whereas return gaps use the variance to quantify the dispersion of the portfolio’s returns around the mean. Nevertheless, both measures fall short: they rely only on the first two moments of the return distributions, and neither accounts for both correlation and variance.

The purpose of this article is to introduce a new measure of diversification, which, while being based on all moments, also retains the ease of application and interpretation of the two described measures. The concept of Shannon entropy, or information entropy, is at the core of our measure, hereafter referred to as the diversification delta (DD). Shannon entropy can measure the uncertainty related to the entire statistical distribution, and in so doing, the diversification delta is a response to Samuelson’s criticism. Entropy captures the reduction in uncertainty as the portfolio of stocks becomes more diversified; in other words, increased diversification of a portfolio reduces uncertainty and lowers entropy in its final outcome.

The greater generality of information entropy places the DD somewhat outside the scope of traditional financial theory; therefore, a detailed definition of the measure and its testing is our main aim in this article. To further demonstrate the robustness of our approach, we apply the measure to a set of infrastructure indices, choosing the test sample to include both crisis and pre-crisis data. In doing so, the advantages of the DD over its main alternative, the correlation matrix, are clearly visible.

The remainder of the article is structured as follows. Next, we define the main characteristics of the diversification delta measurement. We then describe an application of the DD to measure the diversification of infrastructure investments. Our general conclusions round out the article.

**DEFINITION OF THE DIVERSIFICATION DELTA**

Differential entropy, that is, the continuous generalization of Shannon entropy [1948], is a measure of uncertainty of a random variable, and it is the concept which underpins the diversification delta. In our context, differential entropy represents the investor’s average uncertainty of the returns of an investment. Whereas variance quantifies the concentration of a return distribution around its mean, entropy measures concentration irrespective of its location in the distribution. For example, high levels of concentration around the tails of a distribution of asset returns will affect the entropy of the distribution, whereas variance will remain largely unaffected, as explained in Appendix A.

The entropy $H$ of a random variable $X$ with possible values $x \in \mathcal{X}$, $\mathcal{X} = \mathbb{R}$, can mathematically be defined as

$$H(X) = -\int_{\mathcal{X}} f(x) \ln f(x) \, dx \tag{1}$$

where $f(x)$ is the probability density function of $X$. $H$ is formulated as the negative expectation of the natural logarithm of the probability density function. Whenever areas of concentration occur and some outcomes are more likely than others, $f(x)$ increases in those areas as a consequence. Such areas of concentration decrease the uncertainty of the possible outcomes of a random draw. The entropy will therefore decrease when a distribution is more concentrated around a certain point.
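As a minimal sketch of this behavior, the snippet below integrates $-f(x)\ln f(x)$ numerically for Gaussian densities of decreasing spread and compares the result with the well-known closed form $\tfrac{1}{2}\ln(2\pi e\sigma^2)$. The function names, the midpoint rule, and the integration range are illustrative assumptions, not part of the article.

```python
import math

def gaussian_pdf(x, sigma):
    """Density of a zero-mean Gaussian with standard deviation sigma."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def differential_entropy(pdf, lo, hi, steps=20000):
    """Midpoint-rule approximation of -integral of f(x) ln f(x) over [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        fx = pdf(lo + (k + 0.5) * h)
        if fx > 0.0:  # treat 0 * ln 0 as 0
            total -= fx * math.log(fx) * h
    return total

def gaussian_entropy_numeric(sigma):
    """Numerical entropy of a Gaussian, integrated over +/- 12 standard deviations."""
    return differential_entropy(lambda x: gaussian_pdf(x, sigma), -12.0 * sigma, 12.0 * sigma)

def gaussian_entropy_closed(sigma):
    """Closed-form differential entropy of a Gaussian, in nats."""
    return 0.5 * math.log(2.0 * math.pi * math.e * sigma ** 2)

# A more concentrated density (smaller sigma) has lower entropy.
for sigma in (2.0, 1.0, 0.5):
    print(sigma, round(gaussian_entropy_numeric(sigma), 4), round(gaussian_entropy_closed(sigma), 4))
```

Halving the standard deviation lowers the entropy by $\ln 2$, which is the sense in which concentration reduces uncertainty.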

As can be observed from Equation (1), $H$ is exclusively related to probabilities in the discrete case and to the density function in the continuous case, abstracting completely from the nominal values taken by the associated random variable. Two distinct probability distributions could therefore exist with the same entropy but with different variances. Empirical financial data, however, belong to a subclass of distributions with a single maximum and monotonicity on both sides of the maximum; see Cont [2001] for a review of the properties of financial data. In practice, the previously mentioned case is not observed because it would require perfectly symmetrical anomalies in the distribution of returns.
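A small worked example may make the point about equal entropy and unequal variance concrete. The two densities below are hypothetical illustrations, not taken from the article; the second is deliberately bimodal, which is exactly the kind of shape that falls outside the unimodal subclass described above.

```latex
% Illustrative densities (assumed for this example only).
f_1(x) = \mathbf{1}_{[0,\,1]}(x), \qquad
f_2(x) = \mathbf{1}_{[0,\,0.5]}(x) + \mathbf{1}_{[10,\,10.5]}(x)

% Both densities equal 1 on a support of total length 1, so
H(f_1) = H(f_2) = -\int 1 \cdot \ln 1 \, dx = 0

% The variances, however, differ by a factor of roughly 300:
\operatorname{Var}(f_1) = \tfrac{1}{12} \approx 0.083, \qquad
\operatorname{Var}(f_2) = \tfrac{0.5^2}{12} + 5^2 \approx 25.02
```

The variance of $f_2$ is the within-block variance plus the squared distance of each block mean (0.25 and 10.25) from the overall mean of 5.25.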

At this stage, we can introduce the formal definition of the diversification delta. Let $X_1, X_2, \dots, X_N$ be risky assets of universe $U$. $P$ is a portfolio with portfolio weights $w_1, w_2, \dots, w_N$ and $\sum_{i=1}^{N} w_i = 1$. In order to facilitate its interpretation, we define the *Diversification Delta* $DD(P)$ as the following ratio:

$$DD(P) = \frac{\exp\left(\sum_{i=1}^{N} w_i H(X_i)\right) - \exp\left(H(P)\right)}{\exp\left(\sum_{i=1}^{N} w_i H(X_i)\right)} \tag{2}$$

$DD(P)$ is the ratio of the weighted average entropy of the assets, $\sum_{i=1}^{N} w_i H(X_i)$, minus the entropy of the portfolio $H(P)$, divided by the weighted average entropy of the assets, $\sum_{i=1}^{N} w_i H(X_i)$. The ratio measures the relative reduction in entropy, or put differently, by combining into a portfolio assets $X_i$ with weights $w_i$, the ratio evaluates the relative reduction in uncertainty. Campbell [1966] showed how the exponential value of entropy retains the sought-after characteristics of the measure while allowing us to avoid the case of a singularity when $\sum_{i=1}^{N} w_i H(X_i) = 0$.

The interpretation of the DD is rather straightforward. The DD is defined as a ratio varying between zero and one. A value of one indicates that only market risk remains in the portfolio and all idiosyncratic risk has been diversified. In such a case there is no longer any difference between the weighted average of the entropy of the individual assets and the entropy of the portfolio as a whole. This means that the weighted specific risk of the single stocks has no influence over the portfolio and what remains is the risk common to all, that is, the market risk.
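The ratio can be sketched in a few lines of code. The sketch assumes the exponential form of the ratio implied by the reference to Campbell [1966], and it takes the entropies as given inputs; the function names and the two-asset Gaussian example (uncorrelated assets with standard deviations 1 and 2) are illustrative assumptions rather than the authors' implementation.

```python
import math

def diversification_delta(weights, asset_entropies, portfolio_entropy):
    """Relative reduction in exponential entropy achieved by forming the portfolio."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to one")
    avg_entropy = sum(w * h for w, h in zip(weights, asset_entropies))
    return (math.exp(avg_entropy) - math.exp(portfolio_entropy)) / math.exp(avg_entropy)

def gaussian_entropy(sigma):
    """Closed-form differential entropy of a Gaussian, used here only as example input."""
    return 0.5 * math.log(2.0 * math.pi * math.e * sigma ** 2)

# Hypothetical example: two uncorrelated Gaussian assets, equally weighted.
sigma_1, sigma_2 = 1.0, 2.0
sigma_p = math.sqrt(0.25 * sigma_1 ** 2 + 0.25 * sigma_2 ** 2)
dd = diversification_delta(
    [0.5, 0.5],
    [gaussian_entropy(sigma_1), gaussian_entropy(sigma_2)],
    gaussian_entropy(sigma_p),
)
print(round(dd, 4))  # about 0.2094
```

Using the exponential of the entropies keeps the denominator strictly positive, so the ratio stays defined even when the weighted average entropy is zero or negative.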

In order to illustrate the DD concept, it is interesting to consider it in the context of an equally weighted portfolio of standardized Gaussian data; these data can be generated easily. The use of standardized data allows us to make an abstraction of the effects of variance so that we can most effectively understand how diversification affects the DD. We analyze two cases.

1. The DD and its relation to Pearson’s correlation coefficient, using an equally weighted two-asset portfolio of $X_1$, $X_2$.
2. The DD when the portfolio’s size increases from one to $N$ assets, in order to understand the effect of the pool size on the measure (Elton and Gruber [1997]).

For the estimation of the DD and the entropy, we use a recently developed nonparametric method known as k-d partitioning (Stowell and Plumbley [2009]); the method is described in detail in Appendix B.

**Case 1: Schematic Overview Using an Equally Weighted Two-Asset Portfolio of $X_1$, $X_2$**

We generate two sets of random data with an ex ante determined correlation coefficient varying between –1 and 1. The data are standardized Gaussian data. We subsequently compute the DD for each chosen value of the correlation coefficient and repeat this procedure for 1,000 iterations in order to avoid a selection bias. The result, averaged across the 1,000 iterations, is plotted in Exhibit 1.

The relationship between the DD and the correlation coefficient is nonlinear, as expected: the DD is nonlinear through its dependence on entropy, whereas the correlation coefficient is a linear measure of dependence. It is noteworthy that the decrease in the DD with respect to the correlation coefficient is more accentuated for negatively correlated stocks. The implication here is that, when portfolio selection or portfolio management applications are concerned, the DD will indicate a favorable bias toward uncorrelated stocks.
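For this particular setup the shape of the curve can also be derived in closed form. The derivation below is a sketch that assumes the two series are jointly Gaussian with unit variance and correlation $\rho$, and that the DD uses the exponential of the entropies; it describes the population value, not the simulated estimates.

```latex
% Gaussian entropy and its exponential
H(X) = \tfrac{1}{2}\ln\!\left(2\pi e\,\sigma^2\right)
\quad\Longrightarrow\quad
e^{H(X)} = \sigma\sqrt{2\pi e}

% Equally weighted portfolio of two standardized assets
\sigma_P^2 = \tfrac{1}{4}\left(1 + 1 + 2\rho\right) = \frac{1+\rho}{2}

% Diversification delta as a function of the correlation
DD(\rho) = 1 - \frac{e^{H(P)}}{e^{\frac{1}{2}H(X_1) + \frac{1}{2}H(X_2)}}
         = 1 - \sqrt{\frac{1+\rho}{2}}

% Slope: steeper as \rho approaches -1
\frac{d\,DD}{d\rho} = -\frac{1}{2\sqrt{2\,(1+\rho)}}
```

Under these assumptions $DD(1) = 0$, $DD(0) \approx 0.293$, and $DD(-1) = 1$, and the slope grows without bound as $\rho \to -1$, which is consistent with the more accentuated decrease for negatively correlated stocks.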

An overview of the various cases and conclusions that can be drawn from Case 1 is presented in Exhibit 2.

**Case 2: Increasing the Pool Size and the Effect on the *DD***

Our second case focuses on the most basic form of diversification, spreading wealth across assets. Following Elton and Gruber [1997], we use a set of *N* assets and systematically increase the portfolio’s size from 1 to *N,* monitoring the effect on the DD. In order to avoid a selection bias, we generate 1,000 sets of standardized Gaussian data, with each set consisting of 100 series. Each series contains randomly generated observations that follow a Gaussian distribution. The DD is measured for each set as *N* increases from 1 to 100. The result plotted in Exhibit 3 depicts an average across the 1,000 sets.

Consistent with the findings of Elton and Gruber, the DD makes most of its gains with the first 30 assets. After this point the entropy of the portfolio remains largely constant, whereas the average entropy of the individual assets could increase further. Each new asset adds new specific risk to the mix, which, due to the large number of assets, is averaged in the calculation of the portfolio entropy. The DD therefore increases slowly toward the value of one, as all idiosyncratic risk is systematically eliminated and only market risk remains.
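The pace of these gains can be checked against a closed form. The sketch below assumes that the $N$ standardized Gaussian series are mutually independent (the text does not state this explicitly) and that the DD uses the exponential of the entropies.

```latex
% Equally weighted portfolio of N independent standardized Gaussian assets
\sigma_P^2 = \sum_{i=1}^{N} \frac{1}{N^2} = \frac{1}{N}

% Each asset has e^{H(X_i)} = \sqrt{2\pi e}, the portfolio has e^{H(P)} = \sqrt{2\pi e / N}
DD(N) = 1 - \frac{1}{\sqrt{N}}

% Selected values
DD(1) = 0, \qquad DD(30) \approx 0.817, \qquad DD(100) = 0.9
```

Under these assumptions the first 30 assets account for about 0.82 of the measure, while the next 70 add only about 0.08.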

We can conclude by underlining the three distinct advantages of the DD. First, it is a nonparametric measure of diversification and includes higher moments of the distribution, thereby addressing Samuelson’s criticism. Second, it is related to both correlation and variance, and third, the DD is easily interpretable due to its mathematical formulation. In the next section, in order to illustrate these three advantages, we apply the DD measure to a sample of global infrastructure indices.

**APPLICATIONS IN INFRASTRUCTURE INVESTMENT**

Infrastructure is a generic term representing the grouping of sectors that operate using very distinct business models with varying sensitivities to the business cycle; see Tan [2011]. The selected data are a sample of the weekly returns for the Dow Jones Brookfield Global Infrastructure Index, expressed in U.S. dollars, from its inception in 2002 to 2010. The data are separated along eight infrastructure sectors: airports (DJBAR), ports (DJBPR), water (DJBWR), communication infrastructure (DJBCM), oil and gas transport and storage (DJBOS), electricity transmission (DJBTD), toll roads (DJBTR), and diversified operators (DJBDV).

Using the first 150 weeks as in-sample data, we consider the out-of-sample behavior of an equally weighted portfolio of infrastructure indices by sliding the 150-week sample forward a week at a time. Exhibit 4 plots the DD and the intra-portfolio correlation (i.e., the weighted average correlation) for the out-of-sample period between 2005 and 2010. The exhibit quite clearly shows a plot in two parts, in which the bankruptcy of Lehman Brothers represents the pivotal moment. As the sample moves closer to the start of the recession, the correlation increases and the DD decreases. But after the bankruptcy of Lehman Brothers, and a brief but noteworthy upswing in the DD, the correlation becomes flat while the DD continues a marked descent to reach very low levels.
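The sliding-window procedure can be sketched as follows. To keep the example short and self-contained, a Vasicek-type spacing estimator is swapped in for the k-d partitioning estimator used in the article, and synthetic independent Gaussian returns stand in for the eight sector indices; the function names, the window heuristic, and the data are all assumptions made for illustration.

```python
import math
import random

def vasicek_entropy(samples):
    """Spacing-based entropy estimate in nats (a stand-in for k-d partitioning)."""
    xs = sorted(samples)
    n = len(xs)
    m = max(1, round(math.sqrt(n)))  # spacing width: a common heuristic, assumed here
    total = 0.0
    for i in range(n):
        spacing = xs[min(i + m, n - 1)] - xs[max(i - m, 0)]
        total += math.log(n * spacing / (2 * m))
    return total / n

def diversification_delta(series, weights, entropy_fn=vasicek_entropy):
    """DD of a weighted portfolio; `series` holds one list of returns per asset."""
    n_obs = len(series[0])
    portfolio = [sum(w * s[t] for w, s in zip(weights, series)) for t in range(n_obs)]
    avg_entropy = sum(w * entropy_fn(s) for w, s in zip(weights, series))
    # (exp(avg) - exp(H_P)) / exp(avg) simplifies to 1 - exp(H_P - avg).
    return 1.0 - math.exp(entropy_fn(portfolio) - avg_entropy)

def rolling_dd(series, window=150):
    """Equally weighted DD on each window, sliding forward one observation at a time."""
    weights = [1.0 / len(series)] * len(series)
    n_windows = len(series[0]) - window + 1
    return [diversification_delta([s[t:t + window] for s in series], weights)
            for t in range(n_windows)]

# Synthetic stand-in for eight sector indices: independent Gaussian weekly returns.
random.seed(0)
sectors = [[random.gauss(0.0, 0.02) for _ in range(200)] for _ in range(8)]
path = rolling_dd(sectors, window=150)
print(len(path), round(min(path), 3), round(max(path), 3))
```

Passing a different `entropy_fn` lets the same loop run with another estimator; with real index data the returned path would correspond to the kind of DD series plotted against time.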

In order to clarify the analysis, we need to look at the characteristics of the data sample and the definition of the DD. After the bankruptcy, the level of non-Gaussianity in the sample increases significantly as expressed by the higher values of the Jarque–Bera normality test statistic (Exhibit 5). The sample was perhaps not Gaussian before the bankruptcy, yet the market conditions created an environment with relatively similar distributions for all sectors. Consequently, the difference between the entropy of a portfolio and the average entropy of the individual sectors became smaller.
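For readers who want to reproduce the normality check, the Jarque–Bera statistic combines sample skewness $S$ and kurtosis $K$ as $JB = \frac{n}{6}\left(S^2 + \frac{(K-3)^2}{4}\right)$. The sketch below uses the standard textbook definition with population moments; the function name is an assumption and the input is a toy sample, not the infrastructure data.

```python
def jarque_bera(samples):
    """Jarque-Bera statistic; larger values indicate stronger departure from normality."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((x - mean) ** 2 for x in samples) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in samples) / n  # third central moment
    m4 = sum((x - mean) ** 4 for x in samples) / n  # fourth central moment
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)

# Toy sample: symmetric (zero skewness) but flatter than a Gaussian (kurtosis 1.7).
print(round(jarque_bera([1.0, 2.0, 3.0, 4.0, 5.0]), 4))
```

Applied to each rolling window, a rising value of this statistic would indicate the increase in non-Gaussianity described above.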

After the bankruptcy, extreme returns and high volatility spread the distributions of the individual sectors thinner over a larger range. In fact, on average, co-movement in the market did not increase any more, as shown by the correlation, but the individual distributions no longer resemble each other so closely. The fact that individual distributions are more thinly spread implies that the distribution of an equally weighted portfolio will also be more thinly spread and that the DD will consequently be reduced further as the moving window slides along the sample.

The correlation coefficient and the DD therefore lead to very different conclusions. The DD shows that the general similarity between most sectors during the boom sample reduces the actual level of diversification obtained with the portfolio. During the bust part of the sample, the declining DD indicates how increased sector-specific risk further reduces the actual level of diversification. In both cases, inadequate but consciously chosen weights of the assets in the portfolio leave the portfolio exposed to sector-specific risk during both booms and busts. The correlation coefficient misses both realities. A portfolio manager may attribute the increased correlation to the general level of panic induced by the awaited crisis, yet the fact that the market reality has fundamentally changed cannot be deduced from the correlation alone.

The declining DD should, however, prompt the investor to act on the news of declining diversification by rebalancing the portfolio. In order to show what effect active portfolio management would have had on the DD, we construct a minimum-variance infrastructure portfolio through modern portfolio theory using the same dataset. Exhibit 6 plots the DD for the minimum-variance portfolio, which is rebalanced on a monthly basis.
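The article does not state which constraints were imposed on the minimum-variance optimization. As a minimal sketch, the unconstrained global minimum-variance weights $w = \Sigma^{-1}\mathbf{1} / (\mathbf{1}^{\top}\Sigma^{-1}\mathbf{1})$ can be computed as below; short positions are allowed in this version, and the function names and the example covariance matrix are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def min_variance_weights(cov):
    """Unconstrained global minimum-variance weights (they sum to one)."""
    z = solve_linear(cov, [1.0] * len(cov))
    total = sum(z)
    return [v / total for v in z]

# Hypothetical two-asset covariance matrix.
cov = [[0.04, 0.01],
       [0.01, 0.09]]
print([round(w, 4) for w in min_variance_weights(cov)])
```

Re-estimating the covariance matrix on each window and recomputing the weights monthly would give a rebalanced portfolio whose DD can then be tracked over time.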

The impact of the dynamic management of the portfolio is thus very relevant. As predicted, the level of diversification in the portfolio in general is much higher than for the equally weighted portfolio. Even though, with monthly rebalancing, we are probably overshooting the aim of optimizing the level of diversification, the example is a telling one. It allows us to conclude that, during the boom and bust sample, sector risk was adequately dealt with because the DD increased to stay high as sector risk became more prominent after the bankruptcy. The DD therefore gives the portfolio manager a much clearer picture of the reality in the market than the correlation coefficient.

**CONCLUSION**

We introduce a new measure for diversification of a portfolio or group of assets referred to as the diversification delta (DD). The diversification delta is based on information entropy, which is a measure of uncertainty associated with a random variable. The DD is nonparametric and uses all moments of the distribution, and it is easily interpretable.

Through several tests we show how the DD relates to traditional measures of diversification such as the intra-portfolio correlation and the portfolio variance. We demonstrate how the DD is a powerful tool for portfolio managers, because it is sensitive to the changing dynamics of the market as expressed by the changes in the return distributions of the assets. When the correlation coefficient fails to identify important shifts in sector-specific risk, the DD indicates further diversification is possible. We are confident that the DD represents an important addition to portfolio diversification analysis, which will allow for further innovative practical and theoretical developments.

**APPENDIX A**

**ENTROPY VERSUS VARIANCE**

The link between entropy and variance has not been described explicitly in the literature. Many authors have, however, either made use of the very apparent similarity between the two concepts or have provided a partial explanation for the existing link. One example that is particularly useful to our case is the explanation of Ebrahimi, Maasoumi, and Soofi [1999].

Ebrahimi, Maasoumi, and Soofi suggested the use of a Legendre expansion to approximate the probability density function by a series of polynomials, similar to the well-known Taylor expansion. A smooth density function *f*(*x*) can be well approximated as

where $G_i(x)$, $i = 1, \dots, N$, are Legendre polynomials, which are defined as

From these expressions it can be shown that the variance *V*(*x*) equals

Equation (A-4) indicates that the variance depends on $a_2$ and only increases when $a_2$ increases.
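The role of $a_2$ can be illustrated with a short derivation. The sketch below assumes that returns are rescaled to $[-1, 1]$ and that the density is expanded as $f(x) \approx \sum_{i=0}^{N} a_i G_i(x)$ with the standard Legendre polynomials; these conventions are assumptions, and the resulting expression may differ in form from the authors' Equation (A-4).

```latex
% Orthogonality of Legendre polynomials on [-1, 1]
\int_{-1}^{1} G_i(x)\,G_j(x)\,dx = \frac{2}{2i+1}\,\delta_{ij},
\qquad G_0 = 1,\; G_1 = x,\; G_2 = \tfrac{1}{2}\left(3x^2 - 1\right)

% Normalization and the first two moments
\int_{-1}^{1} f(x)\,dx = 2a_0 = 1 \;\Rightarrow\; a_0 = \tfrac{1}{2},
\qquad
E[X] = \int_{-1}^{1} G_1(x) f(x)\,dx = \tfrac{2}{3}\,a_1

x^2 = \tfrac{1}{3}\left(2G_2 + G_0\right)
\;\Rightarrow\;
E[X^2] = \tfrac{1}{3} + \tfrac{4}{15}\,a_2

% Variance: for a fixed a_1 it rises only with a_2
V(x) = \tfrac{1}{3} + \tfrac{4}{15}\,a_2 - \tfrac{4}{9}\,a_1^2
```

As a sanity check, the uniform density on $[-1, 1]$ has $a_1 = a_2 = 0$ and variance $1/3$.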

If we now consider the partial derivative of the entropy with respect to $a_2$, an expression may be found that relates the variation of entropy to the variation of the variance,

This expression suggests the dependence of entropy on higher-order moments. Entropy thus gives a closer representation of the true probability distribution.

**APPENDIX B**

**ESTIMATION OF DIFFERENTIAL ENTROPY THROUGH K-D PARTITIONING**

We use a new nonparametric method presented by Stowell and Plumbley [2009] to estimate entropy. The estimation technique rests on the partition of data into cells of equal probability mass.

If we consider the dataset $x$ and the partition $A$ of $\mathcal{X}$, the probability mass of $f(x)$ in each cell $A_j$ equals

$$p_j = \int_{A_j} f(x) \, dx, \tag{B-1}$$

which leads to the following approximation of the probability density function of the partitioning into cells, where $\mu(A_j)$ is the volume of the cell $A_j$,

$$f_A(x) = \frac{p_j}{\mu(A_j)}, \quad x \in A_j. \tag{B-2}$$

This approximation $f_A(x)$ has the same probability mass in each cell as $f(x)$, yet with a uniform density. Although the shape of $f(x)$ is usually unknown, we work directly from a series of data points of asset returns, $x_j$. If we consider $N$ such points and the fact that probability mass in each cell is equal, we can estimate $p_j$ by $n_j / N$, where $n_j$ is the number of points in cell $j$. An estimate of the probability density function then equals

$$\hat{f}_A(x) = \frac{n_j}{N \mu(A_j)}, \quad x \in A_j. \tag{B-3}$$

Consequently, given the definition of entropy, an estimator can be constructed using the following relation

$$\hat{H} = \sum_{j=1}^{m} \frac{n_j}{N} \ln\left(\frac{N \mu(A_j)}{n_j}\right), \tag{B-4}$$

where $m$ is the number of cells.

The key to estimating entropy now lies with the selection of a partitioning scheme to create the partitions $A_j$. Stowell and Plumbley solve this problem by using an adaptive partitioning method. This choice is made to guarantee a uniform distribution in each cell. The method they develop, referred to as k-d partitioning, is based on recursively splitting data into their quantiles using the median. The stopping criterion is, of course, a test for uniformity.

Apart from being the most recent development in nonparametric entropy estimators, the method also outperforms its rivals in terms of speed and computational complexity. Additionally, in low dimensionality, as in the case for which we use the estimator, its bias is as low as, or lower than, that of rival methods.
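A one-dimensional version of the procedure can be sketched as follows. This is a simplified reading of the method and not the authors' or Stowell and Plumbley's code: the uniformity statistic (how far the cell median sits from the cell center, scaled by $\sqrt{n_j}$), the 1.96 threshold, and the forced minimum splitting depth of $\tfrac{1}{2}\log_2 N$ are assumptions made for this illustration, as are the function names.

```python
import math
import random

def kd_entropy(samples, z_crit=1.96):
    """Estimate differential entropy (in nats) of 1-D data by recursive median splits."""
    xs = sorted(samples)
    n_total = len(xs)
    if n_total < 2 or xs[0] == xs[-1]:
        raise ValueError("need at least two distinct observations")
    # Assumed rule: force splits down to this depth before testing uniformity.
    min_level = math.ceil(0.5 * math.log2(n_total))

    def cell_entropy(lo, hi, left, right, level):
        # The points xs[lo:hi] fall in the cell [left, right].
        n = hi - lo
        if n >= 2:
            mid = lo + n // 2
            split = 0.5 * (xs[mid - 1] + xs[mid])  # (approximate) median of the cell
            # Assumed uniformity statistic: a uniform cell has its median near its center.
            z = math.sqrt(n) * (2.0 * split - left - right) / (right - left)
            if left < split < right and (level < min_level or abs(z) > z_crit):
                return (cell_entropy(lo, mid, left, split, level + 1)
                        + cell_entropy(mid, hi, split, right, level + 1))
        # Leaf cell: contribution (n_j / N) * ln(N * mu(A_j) / n_j).
        return (n / n_total) * math.log(n_total * (right - left) / n)

    return cell_entropy(0, n_total, xs[0], xs[-1], 0)

# Check against a case with known entropy: standard Gaussian, 0.5 * ln(2 * pi * e).
random.seed(1)
data = [random.gauss(0.0, 1.0) for _ in range(20000)]
estimate = kd_entropy(data)
exact = 0.5 * math.log(2.0 * math.pi * math.e)
print(round(estimate, 3), round(exact, 3))
```

Sorting once and recursing on index ranges keeps the cost close to that of the sort itself, and each leaf contributes one term of the partition-based entropy sum.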

## ENDNOTES

We would like to thank an anonymous referee for insightful comments and remarks regarding our paper.

Maximilian Vermorken and Francesca Medda have been sponsored by the EPSRC grant EP/H004505.

**Disclaimer** The opinions expressed in this article are the sole responsibility of the authors and do not necessarily reflect the views of the European Investment Bank.

- © 2012 Pageant Media Ltd