Jangho Yang, Torsten Heinrich, Julian Winkler, François Lafond, Pantelis Koutroumpis, J. Doyne Farmer, Measuring productivity dispersion: a parametric approach using the Lévy alpha-stable distribution, Industrial and Corporate Change, Volume 34, Issue 1, January 2025, Pages 79–117, https://doi.org/10.1093/icc/dtae021
Abstract
It is well known that value added (VA) per worker is extremely heterogeneous among firms, but relatively little has been done to characterize this heterogeneity more precisely. Here, we show that the distribution of VA per worker exhibits heavy tails, a very large support, and consistently features a proportion of negative values, which prevents log transformation. We propose to model the distribution of VA per worker using the four-parameter Lévy stable distribution, a natural candidate deriving from the generalized central limit theorem, and we show that it is a better fit than key alternatives. Fitting a distribution allows us to capture dispersion through the tail exponent and scale parameters separately. We show that these parametric measures of dispersion can be useful to characterize the evolution of dispersion in recent years.
1. Introduction
In the last two decades, the availability of micro-data has revealed the tremendous productivity differences between firms, even within sectors at a fairly detailed level (Bartelsman and Doms, 2000; Syverson, 2011). Measuring productivity dispersion is important in order to understand misallocation (Gopinath et al., 2017), innovation and diffusion (Berlingieri et al., 2017; Andrews et al., 2019), business dynamism (Foster et al., 2021), and ultimately the aggregate productivity slowdown (Goldin et al., 2021).

Figure 1. Measured standard deviation in sub-samples of firm LP. We construct a distribution by pooling together the productivity levels of firms in France, Italy, Germany, and Spain (which have the same currency), for all years, expressed in thousands. Then, for each subsample size N, we compute the standard deviation of 100,000 subsamples and report the average (dots) and 5th and 95th percentiles. In Appendix 7, we explain that on i.i.d. data with a power law tail, the theoretical scaling between the sample standard deviation and the sample size would be |$N^{\frac{1}{\alpha}-\frac{1}{2}}$|, where α is the tail exponent. The solid line shows that this scaling holds empirically here (we use our estimate of the tail parameter |$\hat{\alpha}=1.33$|, and the intercept is calculated by simulating 100,000 i.i.d. random samples of size 10 from a Lévy alpha-stable with the same parameters as those we estimate on empirical data). Data from Orbis Europe.
In this paper, we focus on the distribution of value added (VA) per worker at the firm level, a common measure of productivity in the literature.1 We employ a commercial dataset, Orbis Europe. We have access to a comprehensive version that includes around 23 million European firm-year observations for the period 2006–2017. The distribution of productivity levels, even within country-year pairs, exhibits heavy tails and, therefore, an infinite variance. In practice, this means that measuring the variance in a given sample is not meaningful, as the result is driven by the sample size rather than reflecting a true moment of the underlying population (Fig. 1).
A common solution to this problem is to compute the variance of the logarithm of productivity (instead of the variance of productivity). For variables that have positive support, such as revenue per worker, this is an acceptable solution since the distribution of the log values likely exhibits finite variance. However, this is problematic for VA per worker, because firm-level datasets often contain a substantial proportion of firms that have negative VA: typically only a few percent, but up to 23% for some country-year pairs in our dataset (Table 1). As far as measuring dispersion is concerned, removing the firms in the left tail is clearly problematic. The prevalence of firms that stay alive despite very poor performance should be a key indicator in studies of misallocation, creative destruction, and business dynamism.
Country | 2006 | 2007 | 2008 | 2009 | 2010 | 2011 | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Belgium | 2.58 | 1.33 | 1.56 | 1.62 | 1.49 | 1.44 | 1.55 | 1.59 | 1.53 | 1.49 | 1.53 | 1.58 |
Bulgaria | 9.33 | 4.02 | 5.79 | 5.86 | 7.45 | 7.60 | 7.02 | 6.35 | 6.10 | 5.47 | 5.13 | |
Croatia | 6.91 | 4.46 | 4.60 | 5.66 | 6.18 | 5.27 | 5.79 | 4.54 | 4.00 | 3.55 | 2.86 | |
Czech Republic | 7.10 | 4.89 | 5.32 | 6.49 | 6.26 | 6.35 | 6.33 | 6.19 | 5.48 | 4.52 | 4.30 | 3.69 |
Denmark | 2.51 | 2.98 | 2.83 | 2.51 | 2.91 | 3.04 | ||||||
Estonia | 2.90 | 4.09 | 6.09 | 4.65 | 3.25 | 3.11 | 3.12 | 3.37 | 3.38 | 3.30 | 2.75 | |
Finland | 2.09 | 1.60 | 1.83 | 1.99 | 1.84 | 2.03 | 1.91 | 1.95 | 1.94 | 2.08 | 2.08 | 1.90 |
France | 1.19 | 1.10 | 1.31 | 1.27 | 1.13 | 1.16 | 1.32 | 1.46 | 1.43 | 1.60 | 1.81 | 1.89 |
Germany | 2.56 | 1.96 | 2.48 | 2.43 | 2.14 | 2.17 | 2.16 | 2.00 | 2.39 | 2.48 | 2.31 | |
Hungary | 4.11 | 3.11 | 7.01 | 6.61 | 6.68 | 7.18 | 5.76 | 5.16 | 4.87 | 4.47 | 3.62 | |
Italy | 4.64 | 3.08 | 3.64 | 4.16 | 3.68 | 3.78 | 5.00 | 4.74 | 4.32 | 3.84 | 3.16 | 2.79 |
Poland | 3.77 | 4.01 | 4.78 | 6.11 | 3.72 | 3.95 | 8.12 | 7.42 | 2.49 | |||
Portugal | 8.62 | 6.06 | 6.51 | 6.12 | 5.81 | 6.87 | 8.34 | 7.87 | 7.86 | 6.47 | 5.57 | 4.80 |
Romania | 7.58 | 8.52 | 10.73 | 11.73 | 11.46 | 12.31 | 12.47 | 10.40 | 7.34 | 6.39 | ||
Slovakia | 4.44 | 4.55 | 7.64 | 7.11 | 7.23 | 7.20 | 7.13 | 6.05 | 4.10 | 4.21 | ||
Slovenia | 1.50 | 1.69 | 1.81 | 1.88 | 1.56 | 1.44 | 1.35 | 1.25 | 1.15 | |||
Spain | 3.78 | 2.92 | 3.60 | 3.90 | 3.65 | 4.04 | 4.41 | 4.05 | 3.33 | 2.94 | 2.57 | 2.07 |
Sweden | 3.82 | 2.16 | 2.41 | 2.27 | 2.13 | 2.18 | 2.10 | 2.10 | 2.07 | 2.08 | 2.15 | 2.35 |
UK | 20.98 | 21.72 | 23.49 | 11.31 | 4.14 | 4.04 | 3.84 | 3.52 | 3.57 | 3.73 | 3.92 |
Notes: This table records the proportion (%) of negative observations of VA per country-year for all 19 countries.
1.1 How can we measure dispersion in a dataset with negative values and heavy tails?
In this paper, we propose a transparent and straightforward approach: first, finding a good parametric model for the distribution of productivity, and second, using the fitted parameters to evaluate dispersion. For instance, if the data were normally distributed, an excellent way to evaluate how dispersion differs across time or countries would have been to estimate the scale parameter of the normal distribution, that is, the variance. Here, however, we find that the data exhibit heavy tails and substantial asymmetry, so we turn to more complex models. We find that the Lévy alpha-stable distribution gives an excellent fit to the data, and we argue that its parameters—particularly the scale and the tail parameters—provide appropriate metrics of dispersion.
The scale parameter of the Lévy distribution is similar to the standard deviation in a normal distribution. Roughly speaking, it provides information on the span of the support of the distribution where most of the data lies. In contrast, the tail exponent provides information on the prevalence of extremely productive or unproductive firms.
While the Lévy alpha-stable distribution elegantly addresses the issues of heavy tails and negative values, the problem with parametric measures is that the underlying model can be wrong. We address this in two ways. First, we show that the Lévy alpha-stable offers a surprisingly good fit to our productivity data and that it is clearly better than the Subbotin (asymmetric exponential power, AEP) distribution (Bottazzi and Secchi, 2011), and as good as or slightly better than the Asymmetric Student-t (AST) distribution (Zhu and Galbraith, 2010), both of which have one more parameter. We add credibility to the model by using independent tests to confirm that the second moment is infinite. Second, we argue that the Lévy alpha-stable model is plausible a priori simply because it is the result of the generalized central limit theorem (GCLT) (Nolan, 2020). If VA per worker can be thought of as the aggregation of several micro-level variables such as the productivity of individual employees, contracts, tasks, or routines, then under fairly general conditions, the GCLT predicts that firm-level productivity will follow a Lévy alpha-stable distribution.
Having established that the data are well fitted by a Lévy distribution, we can use the estimated parameters to evaluate dispersion. We summarize different approaches: parametric approaches based on the Lévy and the AST distributions, and non-parametric approaches using different ways of combining quantiles. We then proceed with a case study: the evolution of dispersion in recent years. Because we have only 10 years of data, it is difficult to extract clear patterns, but we can clearly conclude that using log-based indicators on truncated data provides a very different view of dispersion, compared to other metrics. To establish whether the parametric metrics can be useful, we focus on the UK, because it had an exceptionally high number of negative VA firms during the financial crisis (partly due to the high prevalence of non-profit organizations (NPOs) in Orbis UK). We find that the Lévy parameters are very helpful in describing the evolution of the distribution during and after the financial crisis. First, we find a large increase in skewness and a decline in “scale” (body) dispersion, as the mass of negative VA firms disappeared and shifted to the right. This pattern is picked up by suitably designed quantile-based measures of body dispersion but not by the common 90/10 interquantile ratio, which cannot be computed when more than 10% of firms have negative VA. Second, the Lévy fit indicates a substantial increase in tail dispersion, which is confirmed by other estimators of the tail exponent, but is not captured by a common (95/50) quantile-based estimate of the prevalence of “superstar” firms.
1.2 Literature
Our results relate to three strands of literature. The first is the large body of literature on heavy tails in economics and finance (Mitchell, 1915; Mandelbrot, 1960, 1963; Fama, 1965; Samuelson, 1967; Embrechts et al., 1997; Fagiolo et al., 2008; Gabaix, 2011; Acemoglu et al., 2017; Axtell, forthcoming), more specifically on the distributions of firm sizes (Ijiri and Simon, 1977; Axtell and Guerrero, 2001) and firm growth rates (Bottazzi and Secchi, 2003; Bottazzi et al., 2007; Bottazzi and Secchi, 2011; Schwarzkopf et al., 2010; Holly et al., 2013; Moran et al., 2020). Most related to our study is the work on estimating productivity distributions such as Aoyama et al. (2010), who estimate a generalized beta model for labor productivity (LP) levels,2 Souma et al. (2009), who find a power law tail for LP levels,3 and Gaffeo (2008, 2011), who estimates a Lévy alpha-stable distribution for the growth rates of sector-level total factor productivity. In the present paper, we focus on firm-level data, productivity levels, and productivity measured as value added per worker.
Second, our results on productivity levels are relevant to the literature that discusses how productivity dispersion may reflect the misallocation of factors of production (Hsieh and Klenow, 2009; Bartelsman et al., 2013; Foster et al., 2021; Gopinath et al., 2017; Haltiwanger et al., 2018). In this paper, we critically discuss measures of dispersion in detail, showing how some statistics may be misleading and what parts of the distribution can affect dispersion. We also suggest that theoretical models should aim to derive a Lévy alpha-stable distribution, where ideally the parameters would be interpreted in terms of misallocation or other sources.4
Third, our results relate to the work of statistical agencies around the world, who are developing procedures to make summary statistics of the micro-data publicly available, and in particular, industry-level productivity dispersion (Cunningham et al., 2021; Berlingieri et al., 2017). A key contribution of this paper is to make a suggestion regarding what statistics should be released, namely five quantiles necessary to perform McCulloch (1986)’s quantile estimation of the four Lévy alpha-stable parameters. Typically, readily available software packages use the 5th, 25th, 50th, 75th, and 95th quantiles, as in McCulloch (1986)’s original paper. A remarkable consequence of the quality of the Lévy fit and the availability of the McCulloch (1986) quantile estimator is that we can in principle obtain a good estimate of the tail parameter by using only five quantiles, the largest of which is only the 95th quantile. No substantial risk of disclosing any confidential data would result from this. This is important because estimators of the tail exponent generally rely on order statistics, which are by definition very disclosive and would not be made available by statistical offices. We suggest these five quantiles are good candidates, and could be supplemented by more quantiles which can then be used for testing the quality of the fit.
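To illustrate why five quantiles can carry this much information, the sketch below computes the two quantile ratios on which McCulloch (1986)’s estimator is built: an index of tail thickness and an index of asymmetry. The function name `mcculloch_indices` and the choice of Python’s `statistics.quantiles` are illustrative assumptions, and the data are simulated. Mapping the two indices to estimates of α and β requires McCulloch’s lookup tables, which are not reproduced here.

```python
import random
import statistics


def mcculloch_indices(sample):
    """Return (nu_alpha, nu_beta) from the 5th, 25th, 50th, 75th and 95th percentiles."""
    # statistics.quantiles(n=100) returns 99 cut points; element p - 1 is the p-th percentile.
    pct = statistics.quantiles(sample, n=100, method="inclusive")
    x05, x25, x50, x75, x95 = (pct[p - 1] for p in (5, 25, 50, 75, 95))
    nu_alpha = (x95 - x05) / (x75 - x25)             # grows as the tails get heavier
    nu_beta = (x95 + x05 - 2.0 * x50) / (x95 - x05)  # zero for a symmetric distribution
    return nu_alpha, nu_beta


# Illustrative check on simulated Gaussian data (not the Orbis sample).
rng = random.Random(1)
gaussian = [rng.gauss(0.0, 1.0) for _ in range(50_000)]
nu_alpha, nu_beta = mcculloch_indices(gaussian)
print(round(nu_alpha, 2), round(nu_beta, 2))
```

For Gaussian data the tail index should be close to its light-tailed benchmark of about 2.44, and heavier tails push it upward. Only these five quantiles are needed as inputs, which is what makes the approach compatible with disclosure constraints.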
The paper is organized as follows: Section 2 describes the data sources and the basic patterns, including the presence of heavy tails. Section 3 presents the Lévy alpha-stable distribution, the main alternative parametric models, our fitting methods and shows the quality of the fit. Section 4 then compares the quantile-based and Lévy-based measures of dispersion in practice, showing the evolution of dispersion over time. Section 5 concludes.
2. Data and descriptive statistics
2.1 Data sources
In this section, we present the dataset and some basic patterns of the empirical data. We use data from the Orbis Europe database, compiled by Bureau van Dijk, which includes the balance sheets and profit-loss statements for 7 million unique firms across Europe, yielding approximately 23 million firm-year observations from 2006 to 2017. We keep all types of organizations and use the word “firms” throughout to refer to all kinds of organizations in our dataset.
Unlike other widely used firm-level data such as Compustat and Worldscope, Orbis Europe records a large number of small- and medium-sized firms that are often not publicly traded.5 Table A2 in Appendix 1 reports the number of observations per country-year.
To avoid double counting, we only use unconsolidated data.6 We then remove duplicated firm-year pairs, and we exclude self-employed firms, where the distinction between wage and profit is ambiguous. Finally, we regard negative values for sales, wage, and employment as missing values. For a detailed discussion on data cleaning, see Appendix 1.
2.2 Construction of variables
For each firm i in industry j and year t, we define real VA (|$Y_{i,t}$|) as the sum of real labor income (|$W_{i,t}$|) and real capital income (|$\Pi_{i,t}$|), that is
|$Y_{i,t} = W_{i,t} + \Pi_{i,t} = \frac{\omega_{i,t} + \pi_{i,t}}{p_{j,c,t}^v},$| (1)
where ω and π are nominal wage and nominal profit, and |$p_{j,c,t}^v$| is the VA deflator of industry j in country c at time t. In Orbis, nominal wage and nominal profit variables are recorded in Cost of Employees (STAF) and Earnings Before Interest, Taxes, Depreciation & Amortization (EBITDA)7. The firms’ VA is deflated using the industry-level VA deflator from the EU KLEMS database (|$VA\_P$| in Jäger and The Conference Board (2018)). See Appendix 1.3 for more details.
Denoting the number of employees by |$L_{i,t}$| (EMPL in Orbis), our measure of firm-level LP (|$LP_{i,t}$|) is defined as the ratio of VA to the number of employees,
|$LP_{i,t} = \frac{Y_{i,t}}{L_{i,t}},$| (2)
and is expressed in units of currency of the country where the firm is located.
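As a minimal sketch of this construction, the snippet below computes LP for a single firm-year from the three Orbis items named above. The `FirmYear` container, the field `va_deflator`, and its normalization (base year equal to 1.0) are assumptions made for illustration; the numbers are invented.

```python
from dataclasses import dataclass


@dataclass
class FirmYear:
    staf: float         # Cost of Employees (nominal wage bill), Orbis item STAF
    ebitda: float       # nominal profit, Orbis item EBITDA
    empl: float         # number of employees, Orbis item EMPL
    va_deflator: float  # industry-country-year VA deflator (assumed normalized to 1.0 in the base year)


def labor_productivity(obs):
    """Real value added per worker: deflated (wages + profits) divided by employment."""
    real_va = (obs.staf + obs.ebitda) / obs.va_deflator
    return real_va / obs.empl


# A hypothetical loss-making firm: losses exceed the wage bill, so VA and LP are negative.
loss_maker = FirmYear(staf=500.0, ebitda=-800.0, empl=10.0, va_deflator=1.0)
print(labor_productivity(loss_maker))  # -30.0
```

The example makes explicit that nothing in the definition prevents LP from being negative, which is the case discussed next.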
2.2.1 Handling negative VA
Researchers in productivity analysis typically apply a data transformation: they analyze the natural logarithm of productivity (Bartelsman and Wolf, 2018). However, when output is measured as VA, firm-level LP can be negative. If a firm’s intermediate costs exceed its revenues in a given fiscal year, the firm has negative VA and hence negative productivity. From the income perspective of the definition of VA (Wages + Earnings), the sum of earnings and wages can be negative, since firms’ losses (negative profits) can exceed their wage bill.
Empirically, a sizeable share of firms has negative LP. Table 1 shows the proportion (%) of negative observations per country-year in our data. Overall, 5% of the firms in our sample have negative productivity. As a more extreme case, more than a fifth of UK firms had negative productivity in 2006–2008.8
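The consequence for log-based dispersion measures can be seen in a toy example. The sketch below, with invented LP values and hypothetical function names, computes the share of negative observations and shows that a log-based standard deviation can only be computed after silently dropping them.

```python
import math


def negative_share(lp_values):
    """Share of observations with strictly negative labor productivity."""
    return sum(1 for x in lp_values if x < 0) / len(lp_values)


def log_std_on_positive(lp_values):
    """Std. dev. of log LP on positive observations only, plus the number of observations dropped."""
    logs = [math.log(x) for x in lp_values if x > 0]
    mean = sum(logs) / len(logs)
    std = math.sqrt(sum((v - mean) ** 2 for v in logs) / len(logs))
    return std, len(lp_values) - len(logs)


# Hypothetical LP values (made up for illustration, not taken from Orbis).
lp = [-120.0, -5.0, 20.0, 35.0, 40.0, 55.0, 80.0, 150.0, 400.0, 2500.0]
share = negative_share(lp)
log_std, dropped = log_std_on_positive(lp)
print(share, dropped)
```

In this toy sample the log-based measure ignores a fifth of the firms, namely those in the left tail.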
2.3 Characteristic patterns of the distributions
To motivate Lévy alpha-stable distributions as a model for LP, we first show qualitatively that the distributions are heavy-tailed and asymmetric.
Figure 2 shows the distribution of LP for France for the years covered by the data on a semi-log scale.9 An important observation is that even though the chart shows 99.99% of observations, the tails appear relatively “well behaved.” Once the bins’ midpoints are chosen appropriately, there are no large fluctuations and one could draw a fairly smooth line through the dots. This suggests that the common practice of winsorizing or trimming (capping or removing, e.g., 1% of observations in each tail) alters or discards observations that are actually well behaved statistically rather than being unexpected or strange “outliers” (see e.g. Nolan (2020: 196–97) for a discussion).

Figure 2. Distribution of LP in France, 2006–2015. To display as many data points in the extreme tails as possible, we use logarithmic binning for the bottom and top 10% productivity. The plot shows 99.99% of the entire data. Units are in Euros, deflated to 2015.
We note the following five general characteristics of the empirical distribution of the LP. First, the distribution appears unimodal. Second, the support of the distribution is very large. Third, the distribution is asymmetric, with a pronounced right skewness. Fourth, the distribution exhibits slowly decaying tails. And fifth, comparing the results for different years (not shown here), we find that the shape of the distribution is very persistent, as one would expect.
We now focus on one of the most important of these features: heavy tails.
2.4 Testing for power law tails and infinite variance
We say that the tail of a distribution follows a power law if its cumulative distribution function F(x) takes the form
|$F(x) = 1 - L(x)\,x^{-\alpha},$| (3)
where L(x) is a slowly varying function, and α is the tail exponent. Intuitively, the smaller the tail exponent, the more slowly the frequency of large events decreases as we consider increasingly extreme values, so very extreme values are relatively more frequent than they would be with a higher tail exponent. In practice, α also determines which moments are finite: any moment of order greater than α is infinite, while any moment of order less than α is finite. Some authors use the term “heavy” tails for the case where α < 2, as in this case the variance is infinite (Nolan, 2020; Resnick, 2007).
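The link between the tail exponent and the finiteness of moments can be made explicit with a short calculation. The derivation below is a sketch under simplifying assumptions that are ours: X is non-negative, p > 0, and the tail is an exact power law beyond some point |$x_0$| (that is, the slowly varying function is a constant C there).

```latex
% Assumptions: X \ge 0, p > 0, and P(X > x) = C x^{-\alpha} exactly for x \ge x_0.
\begin{align*}
\mathbb{E}[X^p]
  &= \int_0^\infty p\,x^{p-1}\,P(X > x)\,dx \\
  &= \underbrace{\int_0^{x_0} p\,x^{p-1}\,P(X > x)\,dx}_{\le\, x_0^{p}\ \text{(finite)}}
     \;+\; pC \int_{x_0}^\infty x^{p-1-\alpha}\,dx , \\[1ex]
pC \int_{x_0}^\infty x^{p-1-\alpha}\,dx
  &= \begin{cases}
       \dfrac{pC\,x_0^{\,p-\alpha}}{\alpha - p}, & p < \alpha, \\[2ex]
       \infty, & p \ge \alpha .
     \end{cases}
\end{align*}
```

With p = 2, the second moment is therefore finite only if α > 2. For a general slowly varying L(x), the boundary case p = α can go either way, which is why the statement in the text is restricted to orders strictly above or below α.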
Here, we propose four methods to determine whether LP distributions have heavy tails: (modified) Q–Q plots, scaling of the sample standard deviation with sample size, direct estimates of the tail exponent based on extreme value distributions, and Trapani (2016)’s test for infinite moments.
2.4.1 Q–Q plots
Resnick (2007) shows that Q–Q (quantile–quantile) plots for location-scale families can be used not only to provide visual evidence on the compatibility of a specific functional form with the data but also to estimate the parameters. This is convenient because if X is a random variable drawn from a Pareto distribution,
|$P(X \gt x) = \left(\frac{x}{k^*}\right)^{-\alpha}, \quad x \geq k^*,$| (4)
where |$k^*$| is the threshold, and α is the tail exponent, the distribution of |$\log X$| is a location-scale family. Resnick (2007) shows that the values
|$\left\{\left(-\log\left(1-\frac{i}{N+1}\right),\ \log X_{(i)}\right),\ 1 \leq i \leq N\right\},$| (5)
where |$X_{(i)}$| is the ith-order statistic, would lie on a line with intercept |$\log k^*$| and slope |$1/\alpha$| under the null (Eq. 4). To implement this procedure, we need to choose a cutoff |$k^*$|, or, alternatively, the sample size N. If we believe that the data have a Pareto tail, but only as |$x \to \infty$|, we would prefer to choose the highest possible cutoff. However, this implies less data and thus more variance in the resulting estimates, so there is a trade-off.
Figure 3 shows the log of top and bottom 0.5% observations of firm labor productivities. The datapoints by country roughly follow straight lines, which is good evidence that the tails are Pareto. Moreover, the slopes appear to lie between 1 and 1/2, suggesting |$1\lt\alpha\lt2$|.

Figure 3. Q–Q plots for positive and negative LP tails, 2015. Quantiles of the LP for the top and bottom 0.5% of firms in 18 European countries are very close to the quantiles of a Pareto distribution. A Pareto tail with an exponent α would appear as a line with slope |$1/\alpha$| (Resnick, 2007). A line with a slope of one (solid) and a line with a slope of 1/2 (dashed) are plotted for visual comparison.
One point to note is that the slopes for negative productivity appear slightly steeper than for positive productivity, suggesting a degree of tail asymmetry that is not in line with the Lévy hypothesis, where the tail exponent is the same on each side (Section 3.1). This is a good motivation to use the five-parameter AST as a benchmark, as it allows left and right tails to have different parameters. However, Fig. A3 in the Appendix shows that the AST does not systematically outperform the Lévy distribution for fitting the tails.
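The logic of the Q–Q approach can be checked on simulated data. The sketch below draws a pure Pareto sample, plots (numerically) the log order statistics against exponential quantiles, and recovers the tail exponent from the least-squares slope. The plotting positions i/(N+1), the function name `pareto_qq_fit`, and the parameter values are illustrative assumptions, not the exact procedure used for Figure 3.

```python
import math
import random


def pareto_qq_fit(tail):
    """OLS slope and intercept of log order statistics against Exp(1) quantiles."""
    ys = [math.log(x) for x in sorted(tail)]                    # log of ascending order statistics
    n = len(ys)
    qs = [-math.log(1.0 - i / (n + 1)) for i in range(1, n + 1)]  # Exp(1) quantiles
    q_bar, y_bar = sum(qs) / n, sum(ys) / n
    slope = (sum((q - q_bar) * (y - y_bar) for q, y in zip(qs, ys))
             / sum((q - q_bar) ** 2 for q in qs))
    return slope, y_bar - slope * q_bar


rng = random.Random(0)
alpha_true, threshold = 1.5, 10.0
# Inverse-CDF sampling from a Pareto law; 1 - U lies in (0, 1].
sample = [threshold * (1.0 - rng.random()) ** (-1.0 / alpha_true) for _ in range(5_000)]
slope, intercept = pareto_qq_fit(sample)
alpha_hat = 1.0 / slope
print(round(alpha_hat, 2), round(math.exp(intercept), 1))
```

On such data the fitted slope should be close to 1/α and the intercept close to the log of the threshold. On real data the tail is only approximately Pareto, so the result depends on the cutoff, as discussed above.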
2.4.2 Scaling of the sample standard deviation with sample size
If the distribution has a power law tail with α < 2, the variance and higher-order moments are infinite. Due to this property, in finite samples, the larger the sample size, the higher the chance that an extreme event is drawn, leading to a larger sample standard deviation. More precisely, the sample standard deviation scales with sample size N as |$N^{\frac{1}{\alpha}-\frac{1}{2}}$| (see Appendix 7 for a heuristic derivation).
Figure 4 shows the scaling of the sample standard deviation of LP for different sample sizes. It is clear that for most countries, the sample standard deviation gets larger as the sample size increases. For four selected countries, we show the theoretical scaling derived by estimating the power law exponent as the tail parameter of a Lévy alpha-stable distribution. The scaling of the sample standard deviation holds well empirically, although with a possible plateauing as |$N \to \infty$| for some countries, such as Spain. Overall, this provides good evidence not only for power law tails but also for the idea that the Lévy alpha-stable distribution can be a good model to retrieve the value of the tail index.

Figure 4. Measured standard deviation in sub-samples of firm LP, country samples. We construct a distribution by pooling together the productivity levels of firms in each of 19 countries, for all years. Then, for each subsample size N, we compute the standard deviation of each of 1,000 subsamples. The plot on the left shows the average of standard deviations for each of all 19 countries, while the plot on the right shows the same scaling for four selected countries, with the linear scaling calculated as |$N^{\frac{1}{\hat{\alpha}}-\frac{1}{2}}$| (see Appendix 7), where |$\hat{\alpha}$| is estimated by fitting the Lévy distribution. The shaded area corresponds to the values of the sample standard deviation that fall between the 5th and 95th percentiles.
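The scaling discussed in this subsection can be reproduced on simulated data. The sketch below is an illustration under several assumptions of ours: it draws from a symmetric stable law (β = 0) using the Chambers-Mallows-Stuck method rather than from the fitted four-parameter distribution, it uses the value α = 1.33 quoted in the caption of Figure 1, and it summarizes replications by their median rather than their average, because the median is less noisy in small simulations.

```python
import math
import random


def symmetric_stable(alpha, rng):
    """One draw from a symmetric alpha-stable law (beta = 0, alpha != 1), Chambers-Mallows-Stuck method."""
    u = rng.uniform(-math.pi / 2, math.pi / 2)
    w = rng.expovariate(1.0)
    return (math.sin(alpha * u) / math.cos(u) ** (1 / alpha)
            * (math.cos((1 - alpha) * u) / w) ** ((1 - alpha) / alpha))


def sample_std(xs):
    """Population-style sample standard deviation."""
    mean = sum(xs) / len(xs)
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / len(xs))


def median_std(alpha, n, replications, rng):
    """Median, across independent samples of size n, of the sample standard deviation."""
    stds = sorted(sample_std([symmetric_stable(alpha, rng) for _ in range(n)])
                  for _ in range(replications))
    return stds[len(stds) // 2]


rng = random.Random(42)
alpha = 1.33
n_small, n_large = 100, 2500
ratio = median_std(alpha, n_large, 200, rng) / median_std(alpha, n_small, 200, rng)
empirical_exponent = math.log(ratio) / math.log(n_large / n_small)
theoretical_exponent = 1 / alpha - 1 / 2
print(round(empirical_exponent, 3), round(theoretical_exponent, 3))
```

The empirical exponent is a noisy estimate, but it should be positive and of the same order as the theoretical value of about 0.25, whereas for finite-variance data it would be close to zero.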
2.4.3 Estimating tail exponents
Another way to test for an infinite second moment is to estimate the tail exponent and see if it is lower than 2. Typically, one considers data from the tail only; that is, order statistics of up to order k, which makes it possible to estimate tail behavior and determine the finiteness of moments, independently of the behavior of the rest of the distribution. An important issue is that one has to choose a value |$k^*$|, which determines which data are used to estimate the tail parameter. Usually, the tails are influenced in a non-negligible way by the slowly varying function L(x) (Eq. 3), and therefore the choice of |$k^*$| may be difficult and lead to biased estimates of α.
An early and popular method for estimating tail exponents is the Hill estimator, but as noted in Resnick (2007), the Hill estimator provides very different estimates of the tail depending on which value of |$k^*$| we choose (“Hill Horror Plots”). Because of these well-known issues for regularly varying distributions (Eq. 3) that are somewhat far from pure power laws, several estimators of the tail exponents have been developed and tested for cases where the slowly varying function is non-negligible. Here, we use the estimators described and implemented by Voitalov et al. (2019). Voitalov et al. (2019)’s implementation includes an automatic double bootstrapping procedure for picking |$k^*$|. We use their package with default values. In addition, we also use the popular Hill estimator described in Clauset et al. (2009), which finds |$k^*$| by minimizing the Kolmogorov–Smirnov statistic (computed assuming that the true model is a pure power law). We add the estimate of α based on the Lévy distribution as well.
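For reference, the basic Hill estimator is simple to write down. The sketch below is a minimal version in which the number of tail observations k is fixed by hand, which is an assumption made for illustration; the packages cited above select the cutoff automatically and are not reproduced here. The data are a simulated pure Pareto sample, the case in which the Hill estimator works best.

```python
import math
import random


def hill_alpha(data, k):
    """Hill estimate of the tail exponent from the k largest observations."""
    xs = sorted(data, reverse=True)
    threshold = xs[k]  # (k + 1)-th largest value, used as the tail cutoff
    mean_log_excess = sum(math.log(x / threshold) for x in xs[:k]) / k
    return 1.0 / mean_log_excess


rng = random.Random(7)
alpha_true = 1.4
# Pure Pareto sample via inverse-CDF sampling; 1 - U lies in (0, 1].
sample = [(1.0 - rng.random()) ** (-1.0 / alpha_true) for _ in range(20_000)]
alpha_hat = hill_alpha(sample, k=2_000)
print(round(alpha_hat, 2))
```

Re-running `hill_alpha` with different values of k on data that are not pure power laws is a quick way to see the sensitivity to the cutoff described above.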
Figure 5 (left) shows the results using data from France for each year separately and strongly suggests that α < 2. Three out of five estimators suggest similar values of α ≈ 1.4 (dots for 2010 are missing for two estimators as the routines sometimes fail). The two estimators that suggest α > 2 for the earlier years are Hill-based estimators, which are known to produce poor results for Lévy distributed data (Resnick, 2007). To further confirm the plausibility of these estimates, for each year, we simulate one sample of Lévy distributed data using the empirical sample size and using parameters for the Lévy distribution estimated from the empirical data. In this case (right panel), five estimators behave almost exactly as in the empirical data, suggesting that the Lévy distribution is a plausible model and reinforcing the previous conclusion that the second moment is infinite.

Figure 5. Estimated values of the tail exponent |$\hat{\alpha}$|, using pooled data from France for each year separately and using six different estimators: the Hill estimator from Clauset et al. (2009) (“Hill (MLE+KS)”), three estimators implemented by Voitalov et al. (2019) (“Adj Hill,” “Moments,” and “Kernel”), and two estimates of α from the Lévy distribution, using either maximum likelihood (our preferred approach, see Appendix 3) or the quantile estimator from McCulloch (1986). The left chart shows |$\hat{\alpha}$| for the empirical data. The right chart shows |$\hat{\alpha}$| for data drawn from a Lévy-stable distribution with parameters equal to those estimated on the empirical data using the MLE method. The left chart shows that four estimators suggest infinite variance (|$\hat{\alpha}\lt2$|), while the two Hill-based estimators sometimes suggest finite variance on empirical data but would also fail to capture infinite variance if the data were indeed Lévy-distributed (right). We conclude that LP indeed has a power law tail with α < 2.
The left panel of Figure 5 also helps to make an important point: if the data is not Lévy distributed but is a regularly varying distribution with a tail exponent that is correctly estimated by the kernel or moments estimators, then estimating the exponent using simply five quantiles and assuming a Lévy distribution gives relatively good results in the case of our data. This is important, because statistical offices holding detailed micro-data may not want to release the order statistics that are necessary to run the kernel or moments estimators. They may, however, be willing to release five quantiles, since the lowest/highest quantiles (5th and 95th) are likely to be uninformative about the situation of specific firms (in contrast to, say, the 99.9th quantile).
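To make the five-quantile idea concrete, the sketch below computes the two summary statistics on which the quantile estimator of McCulloch (1986) is built: a tail statistic and a skewness statistic. The lookup tables that map these statistics to |$(\alpha, \beta)$| are not reproduced here, so this is only the first step of the estimator. The function name and the two test distributions are assumptions chosen because their quantiles have closed forms.

```python
import math
from statistics import NormalDist


def mcculloch_statistics(q05, q25, q50, q75, q95):
    """Tail and skewness statistics computed from five quantiles."""
    nu_alpha = (q95 - q05) / (q75 - q25)             # grows as the tails get heavier
    nu_beta = (q95 + q05 - 2.0 * q50) / (q95 - q05)  # zero for a symmetric distribution
    return nu_alpha, nu_beta


probs = (0.05, 0.25, 0.50, 0.75, 0.95)

# Exact quantiles of two symmetric stable laws with closed-form quantile functions.
gaussian_q = [NormalDist().inv_cdf(p) for p in probs]      # alpha = 2
cauchy_q = [math.tan(math.pi * (p - 0.5)) for p in probs]  # alpha = 1

nu_gauss, skew_gauss = mcculloch_statistics(*gaussian_q)
nu_cauchy, skew_cauchy = mcculloch_statistics(*cauchy_q)
print(round(nu_gauss, 3), round(nu_cauchy, 3))
```

The tail statistic is markedly larger for the Cauchy case than for the Gaussian case, which is what allows α to be recovered without any quantile beyond the 5th and 95th.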
2.4.4 Trapani’s test
As a final check, we implemented a more formal statistical test due to Trapani (2016). The testing procedure exploits the fact that non-finite sample moments diverge with sample size. A full overview of the testing procedure and the detailed results are available in Appendix 8.
In the vast majority of both the country-year and country-industry samples, we cannot reject the hypothesis that the second moment of LP is infinite. For example, for 84% of country-year samples, Trapani’s finite moment test failed to reject the null hypothesis of the infinite second moment at the 5% significance level.
3. Models and estimation
This section provides a detailed discussion of the key properties of the Lévy alpha-stable distribution and introduces the estimation methods. For details on the competing distributional models (the five-parameter AST distribution and the five-parameter Asymmetric Exponential Power (AEP) distribution), see Appendix 4 and Bottazzi and Secchi (2011).
3.1 Models
3.1.1 The Lévy alpha-stable distribution
Here, we briefly describe the key properties of the Lévy alpha-stable distribution and the GCLT (see Appendix 2 for details).
The Lévy alpha-stable distribution is a four-parameter distribution with the following parameters: the parameter |$\alpha \in (0,2]$| is also called the tail exponent because the tail of the distribution decays as a power law with exponent α, that is, |$P(X \gt x) \approx Cx^{-\alpha}$|. The lower α, the thicker the tail (the more “tail dispersion” there is). However, note that while a smaller α indicates more dispersion because of a heavier tail, a smaller α also leads to a higher concentration of values close to the mode, and thus, in this sense, to a lower dispersion in the body of the distribution.
The “skew” or “asymmetry” parameter |$\beta \in [-1,1]$| is such that for β = 0, the distribution is symmetric, for β > 0 right-skewed, and for β < 0 it is left-skewed. In practice, we will find that productivity distributions are right-skewed.
The higher the scale parameter γ, the wider the body of the distribution. The parameter γ is not dimensionless; it is expressed in the same units as the data.
Finally, the location parameter |$\delta \in (-\infty, +\infty)$| shifts the distribution, even though it is not in general exactly related to key quantities such as the mode, mean, or median.
The Lévy distribution arises from the GCLT, which is the appropriate generalization of the classical CLT when the distribution of the variables being summed up does not necessarily have a finite variance. Like for the CLT, there are extensions showing conditions where the GCLT still holds for non i.i.d. variables, although these results are more complicated.
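The role of the four parameters can be explored by simulation. The sketch below draws stable variates with the Chambers–Mallows–Stuck method, which is a standard simulation technique and not the procedure used in this paper. It assumes the common parameterization in which, for α > 1, the location δ equals the mean; this may differ from the parameterization in Appendix 2. It does not handle the special case α = 1. The function name `stable_rvs` and all parameter values are hypothetical.

```python
import math
import random


def stable_rvs(alpha, beta, gamma, delta, n, seed=0):
    """Draw n stable variates (Chambers-Mallows-Stuck method, alpha != 1)."""
    if not (0 < alpha <= 2) or alpha == 1 or not (-1 <= beta <= 1):
        raise ValueError("sketch covers 0 < alpha <= 2, alpha != 1, -1 <= beta <= 1")
    rng = random.Random(seed)
    zeta = -beta * math.tan(math.pi * alpha / 2)
    xi = math.atan(-zeta) / alpha
    norm = (1 + zeta**2) ** (1 / (2 * alpha))
    draws = []
    for _ in range(n):
        u = math.pi * (rng.random() - 0.5)  # uniform on (-pi/2, pi/2)
        w = rng.expovariate(1.0)            # unit exponential
        x = (norm * math.sin(alpha * (u + xi)) / math.cos(u) ** (1 / alpha)
             * (math.cos(u - alpha * (u + xi)) / w) ** ((1 - alpha) / alpha))
        draws.append(gamma * x + delta)     # apply scale and location
    return draws


# Lower alpha puts more mass far from the centre, for the same scale gamma.
heavy = stable_rvs(alpha=1.2, beta=0.0, gamma=1.0, delta=0.0, n=20_000)
light = stable_rvs(alpha=1.8, beta=0.0, gamma=1.0, delta=0.0, n=20_000)
share_heavy = sum(abs(x) > 10 for x in heavy) / len(heavy)
share_light = sum(abs(x) > 10 for x in light) / len(light)
print(share_heavy, share_light)
```

With α = 2 and β = 0, the same function returns Gaussian draws with variance |$2\gamma^2$|, which gives a simple sanity check of the sampler.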
3.1.2 Estimation method
To estimate the parameters of the Lévy alpha-stable distribution, we use maximum likelihood estimation (MLE), maximizing the likelihood with the limited-memory Broyden–Fletcher–Goldfarb–Shanno (L-BFGS-B) algorithm (Byrd et al., 1995). Despite its superior performance for Lévy parameter estimation (Nolan, 2020, Chapter 4), MLE is computationally very expensive: the density function of the Lévy alpha-stable distribution does not have a closed form, so each evaluation is slow. For this reason, when using the MLE, we estimate on a subsample of the data. As long as the subsample is large enough (|$\geq30,000$|), the estimation results remain robust, irrespective of which subsample is used. See Appendix 3 for a detailed discussion of the estimation method.
3.1.3 Benchmarks
There are two benchmark models used in this paper: the five-parameter AST distribution (Zhu and Galbraith, 2010) and the five-parameter AEP distribution (Bottazzi and Secchi, 2011). Similar to the Lévy alpha-stable distribution, these distributions generalize the Gaussian distribution to allow for skewness and more probability density in the tails. The AST distribution provides a particularly good benchmark as it has the power law tail similar to the Lévy alpha-stable distribution, but it allows for asymmetry in the tails with two distinct tail parameters (ν1 and ν2). Therefore, it can help us examine potentially different patterns in each of the left and right tails and determine whether accounting for asymmetry results in an overall better fit for the data. In contrast, the AEP distribution does not feature power law tails and can serve as an effective benchmark to demonstrate how crucial accounting for power-tails is in understanding the distribution of firm-level LP.
Both of these distributions have been suggested as models for quantities that are known to have heavier tails than Gaussians, such as return volatility (Zhu and Galbraith, 2011), changes in currency exchange rates (Ayebo and Kozubowski, 2003), and (logarithmic) firm growth rates (Bottazzi et al., 2007). Appendix 4 gives the functional forms of the five-parameter AST and five-parameter AEP models, the details of the fitting procedure, and goodness-of-fit comparisons to the Lévy alpha-stable model fit.
3.2 Distribution fit and model comparison
Figure 6 shows the empirical densities and the fits for the Lévy alpha-stable, AST, and AEP distributions10 for the France sample.11 The Lévy alpha-stable distribution is able to fit the entire domain of the empirical distribution extremely well,12 possibly with small shortcomings in dealing with asymmetric tail behavior (as noted in Section 2.4).

Distribution of LP in France with Lévy alpha-stable and AST and AEP fits. The solid line indicates the Lévy alpha-stable, the dashed blue line indicates the five-parameter AST fit, and the dashed red line indicates the five-parameter AEP fit
The AST is broadly able to fit the body of the empirical distribution but in the case of France is less effective in capturing the tail part of the distribution. The AEP distribution not only fails to fit the tails but also offers an unsatisfactory fit for the distribution’s body.13 In Appendix 4.2, Table A8 compares the log-likelihood of each model, and Fig. A3 shows the fit. While the AEP is generally inferior, the AST is sometimes an equal or better fit than the Lévy, typically for East European countries, which have smaller sample sizes.
As an additional validation, Fig. 7 shows the scaling of the estimated standard deviation in randomly generated samples from the Lévy alpha-stable, AST, and AEP models, respectively. This comparison uses a simulated sample of 2.4 million observations, equivalent to the size of our empirical dataset for France. The distributions were generated using the parameters estimated from the respective model based on our empirical data for France. In the Lévy random sample, the standard deviation scales as a power law of the sample size and aligns closely with the empirical data. By contrast, while the standard deviation of the AST random sample also scales with the sample size, it does not match well with the empirical data.14 Predictably, an AEP random sample fails to reproduce the scaling pattern due to its lack of power law tails.

Simulation of the scaling of the sample standard deviation with sample size for Lévy alpha-stable, AST, and AEP distributed data. We generate surrogate datasets that follow either the Lévy alpha-stable (left), the five-parameter AST (middle), or the five-parameter AEP distribution (right), using the same sample size as the empirical data for France (2.4 million), and the parameters estimated on the empirical data. We then estimate the standard deviation in subsamples of the surrogate datasets, using many subsamples for each subsample size. The figures show mean and 5%–95% quantiles for estimates of the standard deviation. The dots show the mean (across subsamples of a given size) of the sample standard deviation in the empirical dataset.
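The subsampling exercise behind this figure can be sketched in a few lines. The code below is an illustration under stated assumptions, not a reproduction of the figure: the pool is drawn from a Pareto distribution as a stand-in for heavy-tailed data (neither the Orbis data nor a fitted Lévy model), the tail exponent 1.33 is taken as an illustrative value, and the median across subsamples is reported instead of the mean because it is less noisy in a small simulation.

```python
import math
import random
import statistics


def sample_std(values):
    """Standard deviation of a list of numbers."""
    m = sum(values) / len(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))


rng = random.Random(42)
alpha = 1.33  # illustrative tail exponent below 2 (infinite variance)
pool = [rng.paretovariate(alpha) for _ in range(200_000)]  # stand-in dataset

sizes = (100, 1_000, 10_000)
typical_std = {}
for n in sizes:
    # Standard deviation in 200 random subsamples of size n.
    stds = [sample_std(rng.sample(pool, n)) for _ in range(200)]
    typical_std[n] = statistics.median(stds)

# Empirical scaling exponent, to compare with 1/alpha - 1/2.
slope = math.log(typical_std[10_000] / typical_std[100]) / math.log(10_000 / 100)
print(typical_std, slope, 1 / alpha - 0.5)
```

Because the second moment is infinite, the typical standard deviation keeps growing with the subsample size instead of converging, which is why it is not a meaningful dispersion metric for such data.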
4. Measuring productivity dispersion
To sum up from the earlier discussion, we cannot measure productivity dispersion using the empirical standard deviation because it is likely to be a measurement of a moment that is infinite, so its measured value depends on sample size and is not meaningful. We also cannot measure dispersion as the standard deviation of the log of VA per worker, because too many values are negative and would need to be removed when taking the log. As a result, we are left with only two serious families of contenders: non-parametric metrics based on quantiles and parametric metrics based on distributions with heavy tails and a support on both negative and positive values.
We first discuss our choices of indicators of dispersion in the body of the distribution and our indicators for tail fatness. Of course, these choices are somewhat arbitrary, as one can choose different quantiles and sometimes different parametrizations for a given distribution. Table 2 provides an overview of all our dispersion metrics.
Summary of dispersion metrics: this table compares common dispersion measures, in the first two columns, against parametric measures from our paper (Lévy) and the AST
 | Non param. | LogN | Lévy | AST |
---|---|---|---|---|
Location | |$Q_{50}$| | |$\mu_{LN}$| | δ | |$\mu_{AST}$| |
Abs. Scale | |$Q_{75} - Q_{25}$| | |$\sigma_{LN}$| | γ | |$\sigma_{AST}$| |
Rel. Scale | |$(Q_{75} - Q_{25}) / Q_{50}$| or |$Q_{90}/Q_{10}$| | |$\sigma_{LN} / \mu_{LN}$| | |$\gamma/\delta$| | |$\sigma_{AST}/\mu_{AST}$| |
Right tail | |$(Q_{95} - Q_{50}) / Q_{50}$| | |$Q_{95} / Q_{50}$| | α | |$\nu_2$| |
Left tail | |$(Q_{50} - Q_{05})/Q_{50}$| | |$Q_{05} / Q_{50}$| | α | |$\nu_1$| |
Skewness | |$\frac{(Q_{90}-Q_{50})-(Q_{50}-Q_{10})}{Q_{90}-Q_{10}}$| | - | β | |$\alpha_{AST}$| |
Comments | |$Q_{10}$| can be < 0 | Excludes non-positive values | | |
4.1 “Body” dispersion: absolute and relative measures of scale
Let us start by discussing interquantile ratios (IQRs), which are very popular in part because they are easy to interpret. Let us denote the value of the pth quantile for VA per worker by |$Q_p(V)$|. Considering for instance firms at the 90th and 10th quantiles, we define the IQR as the ratio
|$\text{IQR}_{90/10} \equiv \frac{Q_{90}(V)}{Q_{10}(V)},$| (6)
which tells us how many times more productive the firm at the 90th percentile is compared to the firm at the 10th percentile. Usually, the variables are initially log transformed. If all values are positive, the firm sitting at the pth quantile of VA per worker is also the firm sitting at the pth quantile of the log(VA per worker), so we have |$\log(Q_p(V))=Q_p(\log(V))$|, and therefore Eq. 6 can be rewritten as
|$\text{IQR}_{90/10} = \exp\left(Q_{90}(\log(V)) - Q_{10}(\log(V))\right).$|
In other words, the IQR is the exponential of the interquantile range of the log-transformed values. While one usually log transforms the data before computing an inter-quantile range, this is by no means necessary, because one could just directly compute an IQR.
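The identity |$\log(Q_p(V))=Q_p(\log(V))$| and its consequence for the IQR can be checked numerically. The sketch below uses a nearest-rank quantile, so that each quantile is an observed value and the identity holds exactly; with interpolated quantiles it would hold only approximately. The data are hypothetical values of VA per worker, all positive.

```python
import math


def quantile(values, p):
    """Nearest-rank quantile: always returns an observed value (no interpolation)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered)))
    return ordered[rank - 1]


# Hypothetical VA per worker (thousands of euros), all positive.
va = [12.0, 18.5, 25.0, 31.0, 40.0, 44.5, 52.0, 63.0, 80.0, 150.0]

# Ratio of quantiles computed directly on the levels.
ratio_90_10 = quantile(va, 0.90) / quantile(va, 0.10)

# Interquantile range of the logs, then exponentiated.
log_va = [math.log(v) for v in va]
log_range = quantile(log_va, 0.90) - quantile(log_va, 0.10)

print(ratio_90_10, math.exp(log_range))
```

Both routes give the same number, which is why the log transformation is not needed to compute an IQR.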
This makes a key advantage of IQRs clear: even when there are negative values, IQRs can still be computed, as long as the bottom quantile is positive. For example, Spain has roughly 3–4% of negative LP observations (Table 1), so we do not need to drop the negative values if we want to compare, say, the bottom 5% to the top 5%. Unfortunately, the practice of automatically taking logs as a first step of the analysis has sometimes led to removing non-positive values before computing IQRs, leading to a bias in the measurement of dispersion.15
An issue with IQRs is that they do not make sense when the lower quantile is negative, which can happen relatively frequently. In our sample of 206 country-year observations, this happens 10 times, almost 5%. An alternative is to compute the interquantile range |$\text{IQR}_{90-10} \equiv Q_{90}(V) - Q_{10}(V)$|, which faces no particular issues when |$Q_{10}(V)\lt0$|. This is an “absolute” measure of dispersion, expressed in the same units as the original data. A value of around 70,000 for the early years in France, for instance, means that the firm at the 90th quantile has a VA per worker that is €70,000 higher than the firm at the 10th quantile. This can be compared with the parametric absolute measures of scale, namely γ in the Lévy distribution, σAST in the AST, and σLN, the standard deviation of the log values (removing non-positive observations). In fact, when α is small and β is large, γ is very close to half the interquartile range (Nolan (2020), see Appendix 2). To facilitate the comparison between quantile-based and Lévy-based measures, we choose IQR|$_{75-25}$| as a measure of absolute dispersion.
Since IQR|$_{90/10}$| is a very popular and intuitive way to measure dispersion, however, we would like to devise similar relative measures of dispersion from the parameters of the Lévy distribution. A natural candidate is simply to divide the estimated scale parameter by the estimated location parameter, |$\hat{\gamma} / \hat{\delta}$|, since δ also depends on the units of the original data. We can devise a similar metric for the AST and for the standard deviation of the log-transformed data. Note, however, that the non-parametric “relative” scale dispersion |$\text{IQR}_{90/10}$| (Eq. 6) takes the 10th percentile as the basis, while the three parametric indicators above take a location parameter, presumably much closer to the mean or median than to a lower quantile such as Q10. This suggests that, to make better comparisons, a better non-parametric measure of relative scale dispersion would be taking an absolute metric of dispersion and dividing by a measure of central tendency. Thus, we choose |$(Q_{75} - Q_{25})/Q_{50}$|.
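A small numerical example shows why the choice of metric matters when the lower quantile is negative. The data below are hypothetical and constructed so that more than 10% of firms have negative VA per worker; the nearest-rank quantile function is an assumption made to keep the sketch self-contained.

```python
import math


def quantile(values, p):
    """Nearest-rank quantile: always returns an observed value (no interpolation)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered)))
    return ordered[rank - 1]


# Hypothetical sample where more than 10% of firms have negative VA per worker.
va = [-30.0, -8.0, 5.0, 14.0, 22.0, 30.0, 38.0, 47.0, 60.0, 95.0, 140.0, 400.0]

q10, q25, q50, q75, q90 = (quantile(va, p) for p in (0.10, 0.25, 0.50, 0.75, 0.90))

ratio_90_10 = q90 / q10             # negative here, so it has no "times more productive" reading
range_90_10 = q90 - q10             # absolute dispersion, in the units of the data
relative_scale = (q75 - q25) / q50  # relative dispersion, usable while the median is positive

print(ratio_90_10, range_90_10, relative_scale)
```

The ratio of the 90th to the 10th quantile turns negative, while the interquantile range and the median-normalized interquartile range remain interpretable.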
4.2 “Tail” dispersion
A key advantage of the Lévy model is that it provides an easy way to interpret the parameter for the tail, the tail exponent α. The AST is even better in this respect, as it allows a separate estimate for the left tail, ν1, and the right tail, ν2. We consider that a fatter tail (α closer to zero) indicates more dispersion, since fatter tails indicate that there is a higher mass of super-productive firms. In almost all cases, the skewness parameter β is substantially above 0, indicating right-skewness. So, as we will confirm, α tends to be an indicator of the fatness of the right tail. We often use |$-\alpha,-\nu_1,-\nu_2$| so that these indicators, like all the others, indicate more dispersion when they increase.
To compare these indicators of tail dispersion to non-parametric metrics, we follow the literature and focus on the 95th quantile (e.g. De Loecker et al. (2022); Andrews et al. (2019)), which we express as relative distance to the median, IQR|$_{95/50}=Q_{95}/Q_{50}$|. This measures the extent to which superstar firms deviate from a “typical” firm.16 Our choice is driven by previous work that has identified an increase in productivity dispersion mostly due to the divergence of superstar firms (Andrews et al., 2019; De Loecker et al., 2022). By symmetry, we could use |$Q_{05}/Q_{50}$| for the left tail (Oliveira-Cunha et al. (2021) use |$Q_{10}/Q_{50}$|), but Q05 is often negative during crisis years, so we use |$(Q_{50}-Q_{05})/Q_{50}$|.
We compute all these indicators at the level of 206 country-year pairs. To better understand the changing shapes of the distributions, we also report the skewness parameters of the Lévy and AST. As a non-parametric measure, we use Kelley’s skewness (Kelley, 1947; Guvenen et al., 2021).
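The quantile-based skewness measure listed in Table 2 is simple to compute. The sketch below implements that formula with a nearest-rank quantile (an assumption of this sketch) and applies it to two small hypothetical samples.

```python
import math


def quantile(values, p):
    """Nearest-rank quantile: always returns an observed value (no interpolation)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered)))
    return ordered[rank - 1]


def kelley_skewness(values):
    """Share of the Q90-Q10 spread above the median minus the share below it."""
    q10, q50, q90 = (quantile(values, p) for p in (0.10, 0.50, 0.90))
    return ((q90 - q50) - (q50 - q10)) / (q90 - q10)


symmetric = list(range(-10, 11))                        # mirror image around zero
right_skewed = [1, 2, 3, 4, 5, 6, 8, 12, 20, 50, 200]   # long upper tail

print(kelley_skewness(symmetric), kelley_skewness(right_skewed))
```

The measure lies between -1 and 1, equals zero for a symmetric sample, and is positive when the upper part of the distribution is more spread out than the lower part. It remains well defined when the 10th quantile is negative.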
4.3 Has productivity dispersion increased?
Time series patterns of productivity dispersion have been a central concern of the current literature on productivity, including reports by the Organisation for Economic Co-operation and Development (OECD) for advanced economies (Andrews et al., 2019; Berlingieri et al., 2017), Haldane (2017) and De Loecker et al. (2022) for the UK, Gopinath et al. (2017) for European economies, and Cette et al. (2018) for France. Our goal here is to propose alternative measures of dispersion and compare them with the existing literature.
In Fig. 8, we show the time series of all the indicators from Table 2, for all countries.

Time series patterns of metrics characterizing the distributions. See Table 2 for definition of metrics. |$\hat{\mu}_N$| is the sample mean. The first row shows indexes (normalized to 1 in 2006) of central tendency (or “location”), the second row shows indexes of absolute dispersion (“scale”), the third row shows metrics of relative dispersion (on a log y-axis), the fourth row shows metrics of tail heaviness (with quantile metrics on a log y-axis), and the last row shows metrics of skewness. A few extreme datapoints are removed for visualization purposes (5 for IQR|$_{90/10}$|, 2 for |$\sigma_{\text{AST}}/\mu_{\text{AST}}$|). The blue curve shows the UK, and the red curve shows Germany.
A first comment is that there is no evidence of a generalized increase in productivity dispersion. There tends to be more variability across countries than across time and heterogeneous trends across countries. This is not necessarily at odds with previous work. Some have found that the increase in productivity dispersion took place mostly before 2006 (typically with data starting in 1996), with productivity dispersion only weakly increasing or possibly stagnating after 2006 (Andrews et al. (2019, Fig. 1), De Loecker et al. (2022, Figs 3 and 4b) for the UK); De Loecker et al. (2022, Fig. 4a) find increasing dispersion mostly after 2006 using Orbis, but note that using administrative data, the patterns are weaker. Oliveira-Cunha et al. (2021, Figs 8 and 10) find broadly stable dispersion for the UK and OECD countries. Various studies use different datasets, different subsets (i.e. industries), different concepts (employment weighted or not), cleaning strategies (in particular, include or exclude negative VA), periods and base years, and ways to present the data.
To investigate whether our Lévy-based dispersion metric is useful in practice, we highlight two countries. We expect that parametric metrics would be more useful when traditional indicators fail, that is, when there are many negative values and when the share of negative values changes over time. We highlight the UK, which has had a high share of negative values and a dramatic decline in negative values, and Germany, which has a fairly small and roughly constant share of negative values (Table 1). The very high number of negative VA firms in our sample of UK firms, and its fall post crisis, is much more dramatic than in other studies and datasets where it is closer to 5–10% than to 20%, and where it does not fluctuate as much (Haldane, 2017; Office for National Statistics, 2022; De Loecker et al., 2022). In Appendix 5, we show that a large share of negative VA organizations in the sample is due to the high prevalence of non-profit and charitable organizations. Here, we take our sample at face value and focus on whether our Lévy-based metrics are able to describe the patterns in a parsimonious and insightful way.
A careful look at Fig. 8 allows us to make three points. First, generally speaking, the metrics based on the lognormal behave differently from the others. This can be seen visually for the UK, but we have also confirmed this by computing correlations between all metrics.
Second, for Germany, the patterns depicted by the different metrics are usually relatively similar, while for the UK, different metrics can have quite different patterns. This highlights the fact that a careful choice of dispersion metrics appears more important when there is a high and volatile ratio of negative VA firms.
Third, in the UK, there is a decline in body dispersion and an increase in tail dispersion,17 and only the Lévy-based indicators are able to capture both patterns. The decline in body dispersion is captured by the |$(Q_{75}-Q_{25})/Q_{50}$| metric but not by the |$Q_{90}/Q_{10}$| because |$Q_{10}\lt0$| in the early years. It is also not captured by the AST, because the estimated AST parameters indicate a decrease in the location parameter, which is at odds with all other measures of location (first row).
The increase in tail dispersion is certainly picked up by Lévy but not by the |$Q_{95}/Q_{50}$| metric. While one could argue that using higher quantiles may be able to pick up the fattening of the tail, as in fact most tail estimators use data from the very top quantiles, a remarkable property of the quantile estimator of the Lévy distribution is that it would be able to estimate α without any quantile higher than Q95. As a final note, the last row suggests that there is a vast increase in right-skewness, on which all indicators agree.
To sum up, in a country like the UK where there was an exceptionally high and volatile number of negative VA firms, it would be difficult to describe the change in dispersion using non-parametric metrics (where the choice of quantiles is arbitrary and especially difficult for the tails). While the AST would be a good candidate in principle, it turns out to work poorly in this specific instance. Instead, the Lévy parameters describe a strong increase in skewness and a strong reduction in dispersion in the body of the distribution, as the mass of low productivity firms collapses after the end of the global financial crisis (we do not know if this is due to recovery or exit, but it is due to the presence of NPOs in the UK sample). The Lévy fit also shows an increase in tail heaviness. We think this latter aspect makes a particularly compelling case for our approach, because tail exponents are arguably the best way to parsimoniously describe what happens in the tail of infinite variance distributions.
5. Discussion and conclusion
The distributions of firm-level VA per worker have extremely large support, are asymmetric, and are heavy-tailed. A major consequence of this is that measuring dispersion is not straightforward: standard deviations are poor metrics because second moments do not exist, and log transformations are not recommended due to the proportion of negative values.
We propose the Lévy alpha-stable distribution as a sensible distributional model for LP, motivated by empirical evidence of an infinite variance, and by the fact that the Lévy distribution should be fairly common, as it emerges from the GCLT. We show that the Lévy distribution provides a better fit to the data than the five-parameter AEP distribution and at least as good a fit as the less parsimonious AST distribution.
Good distributional models make it possible to offer a richer picture of dispersion. While the scale parameter captures the overall width of the distribution, the tail parameter captures the occurrence of the extreme events. These are qualitatively distinct aspects of dispersion. While existing research does attempt to distinguish between body and tail dispersion using quantile ratios, we argue that parametric measures make this distinction clearer and more objective, as there is no need to log-transform the data or to choose specific quantiles.
To illustrate this point, we have presented time series patterns in the UK, where our sample has a dramatic number of negative VA firms, making lower quantiles negative in crisis years. We have found that the Lévy parameters are useful to describe the strong time series patterns: an increase in skewness, a decline in body dispersion, and an increase in tail fatness.
To go further, the research on productivity dispersion needs administrative data, which typically cannot be made available. Our findings have implications for statistical offices releasing moments of micro-data. Because the Lévy alpha-stable distribution can be estimated using five quantiles (more conveniently the 5th, 25th, 50th, 75th, and 95th), if statistical offices release more than these five quantiles, then the Lévy alpha-stable can be estimated on five quantiles and the fit can be tested on the other quantiles. If the fit is good, researchers can then compute any other statistic about the distribution almost as if they had access to the raw data.
Finally, our paper establishes a number of facts that are useful for the burgeoning literature on misallocation and productivity dispersion, suggesting that models should be able to reproduce Lévy-distributed VA per worker, with tail parameters |$\alpha \in [1,1.5]$| (although possibly with a broader range when considering country-industry-year samples), substantially positive skewness (|$\beta \in [0.5,1]$|), and a “body” dispersion |$\gamma/\delta \in [0.2,1]$|.
Funding
We would like to acknowledge funding from Baillie Gifford, the Institute for New Economic Thinking at the Oxford Martin School, and the Rebuilding Macroeconomics project, which is funded by the Economic and Social Research Council (ESRC). We are grateful to Jean-Philippe Bouchaud and José Moran for very useful comments.
Conflict of Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Footnotes
See, for instance, Souma et al. (2009); Aoyama et al. (2015); Andrews et al. (2019); Berlingieri et al. (2017); Gu (2019); Ilzetzki and Simonelli (2017); Campbell et al. (2019); Gouin-Bonenfant (2022) and De Loecker et al. (2022). Baily et al. (1996) note that it is a “conceptually preferable measure of labor productivity” to gross output per worker, even though they have to remove negative VA establishments in some of their analyses. Oulton (2000) discards roughly 1% of his sample of UK firms, which have negative VA. Aradanaz-Badia et al. (2017) note that only a third of the UK firms with negative VA in 2010 had exited the market by 2015, which is only 10 percentage points higher than other firms. De Loecker et al. (2022) find a proportion of negative VA UK firms in both Orbis and administrative data. While issues with VA-based measures of productivity have been noted (Campbell et al., 2019; Cunningham et al., 2021), measures based on gross output are less comparable across different sectors, as intermediate consumption is naturally larger in some industries than in others, so that using gross output-based measures creates a need to use industry means to renormalize the values and compare them; this itself comes with its own issues, if only because classification systems are imperfect, lack granularity, and change over time.
Generalized beta models are defined only over a positive support and thus are not suitable for our data, which contain negative productivity firms.
See also Aoyama et al. (2009); Mizuno et al. (2012) and Aoyama et al. (2015).
There is an important theoretical literature on productivity dispersion, which recognizes that the right tail of the distribution of productivity levels is a power law and explains this mostly as the result of innovation and imitation processes, see Ghiglino (2012); Lucas and Moll (2014); Perla and Tonetti (2014) and König et al. (2016). The presence of extreme values in empirical data also implies that data cleaning choices may affect substantive research results, as shown for the misallocation literature by Rotemberg and White (2021).
Orbis Europe includes many different types of firms, including private limited company, joint-stock company, partnership, cooperative, consortium, foundation, and public agency/corporation. For a more detailed discussion on the advantages and drawbacks of the Orbis Europe database, see Kalemli-Ozcan et al. (2015).
For some corporate groups that consist of multiple firms, both consolidated and unconsolidated data are provided, i.e. data are listed for the entire group and again for each of the firms.
VA can also be calculated by subtracting intermediate costs (MATE in Orbis) from gross output (TURN in Orbis). This is the output-based approach to computing VA. However, material costs are less frequently observed compared to wages and earnings in our sample. About 27.67% of firms in our sample do not report material costs. See Appendix 6 for a detailed discussion of output-based LP and its comparison to the income-based productivity distribution.
Other authors find smaller percentages for the UK, but there is systematically at least a few percent of negative VA firms. This is close to 10% in Haldane (2017, Chart 17). In employment-weighted distributions, this is around 4–5% (Office for National Statistics, 2022). De Loecker et al. (2022) find only “a few” in Orbis but remove many sectors and firms with less than 10 employees.
Visualizing histograms for heavy-tailed data presents unique challenges. When all data are visualized, the expansive support can obscure meaningful patterns in high-density modal areas. Additionally, this can result in many missing bins in the tails. Simply truncating the tails in a visualization is not viable, as doing so contradicts the very essence of studying data with extreme values. To strike a balance and effectively visualize both the tail and body behavior, a dual binning approach is adopted. For the body of the data, specifically between the 10% and 90% quantiles, uniform binning is used to ensure equal-sized bins. However, for the bottom and top 10% of values, or the tails, we employ logarithmic binning. This method places data points in the tails into bins spaced at exponential distances, meaning that the extreme values of larger magnitude in the tails are grouped together into larger bins. We then use the middle value of each bin to visualize the histogram.
We use the R package SkewtDist (Xia, 2022) to estimate the AST and Bottazzi (2014)’s Subbotools package for the interval-constrained likelihood optimization of the five-parameter AEP.
We selected France as the most representative case given its significant country sample size, alongside a firm size composition that closely aligns with the sample mean. For a detailed overview of the number of observations and size composition by country, please refer to Table A3.
There is a mild discontinuity around zero, as is shown by a slight peak in the histogram between zero and approximately 10,000 Euros. Upon thorough examination, this discontinuity is observed in several country samples. We were not able to find a clear cause for this. For the France sample, the distortion around zero is predominantly due to the medium-sized firms even though small and large firms do have some degree of distortion around zero.
See Moran et al. (2020) for a discussion of the inability of the Laplace distribution to fit growth rates due to the sharp change of behavior around the mode.
The same result applies to most of the 19 countries in our sample. Notable exceptions are Romania and Slovenia, where the AST is a better fit than the Lévy.
If one wants to take the log but keep all the observations (to keep the ranks, at least for the positive values (Campbell et al., 2019)), one can code the log of negative values as some values strictly lower than the lowest value of the log values of the positive values available in the sample. Note that coding log LP as 0 would be incorrect as there could be values of VA per worker that are between 0 and 1 (particularly if expressed in thousands or millions of currency units), leading to log values that are negative.
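A small sketch of this coding rule, under stated assumptions: the function name and the size of the gap below the smallest log value are hypothetical, and zeros are treated like negative values because their log is also undefined.

```python
import math


def log_with_floor(values, gap=1.0):
    """Log of positive values; non-positive values are coded strictly below
    the smallest log of the positive values (gap is an arbitrary choice)."""
    positive_logs = [math.log(v) for v in values if v > 0]
    floor = min(positive_logs) - gap
    return [math.log(v) if v > 0 else floor for v in values]


# VA per worker in thousands: one negative value and one value between 0 and 1.
lp = [-12.0, 0.4, 3.0, 55.0]
coded = log_with_floor(lp)
```

Here the log of 0.4 is already negative, so coding the negative observation as 0 would wrongly rank it above a positive one; the floor keeps it at the bottom.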
Berlingieri et al. (2017) and Oliveira-Cunha et al. (2021) look at the evolution of IQR|$_{90/50}$| as a measure of “divergence of the upper tail.” As discussed in Section 2.4, tail estimators determine a threshold value below which the data are not used for estimation. Typically they make use of only a few percent of top observations. For instance, the moment and kernel estimators reported in Fig. 5 would typically use between 0.1% and 2.5% of observations but sometimes more (the precise numbers depend on the random seed, as the double bootstrap procedure to find an optimal threshold is stochastic).
In Appendix 5, we find that removing non-profit and charitable organizations changes some of these patterns: the trends in location, skewness, and tail parameters remain the same, but the scale parameter suggests a smooth increase in dispersion rather than an overall fall due to a dramatic fall in 2008–2010 observed in the full sample.
For 25 countries, there were insufficient data: Albania, Austria, Belarus, Bosnia and Herzegovina, Cyprus, Greece, Iceland, Ireland, Kosovo, Latvia, Liechtenstein, Lithuania, Luxembourg, Malta, Monaco, Montenegro, Netherlands, Norway, North Macedonia, Moldova, Russian Federation, Serbia, Switzerland, Turkey, and Ukraine.
For α > 2, the rescaled sum converges to a normal distribution. For |$0\lt\alpha\leq 1$|, the mean of X does not exist, so the expressions for centering the sum are different, but the sum still converges to an alpha stable distribution. Note also that we state the theorem in terms of convergence to Nolan’s |$\boldsymbol{S_1}$| parametrization, but our empirical work is conducted with the |$\boldsymbol{S_0}$| parametrization.
There exist many mechanisms generating power laws, see Mitzenmacher (2004) and Gabaix (2009). Although not focusing on tail dispersion, Ilzetzki and Simonelli (2017) provide evidence of substantial dispersion in vote counting productivity across polling stations.
In spatial and network econometrics, it is well known that spatial correlations are qualitatively different from time series correlations, as reflected for instance in the fact that one needs to decide whether to use “infill” or “increasing domain” asymptotics (Kelejian and Piras, 2017).
Generically, the smaller α dominates (see e.g. Gabaix (2009), Eq. 3), but convergence might be slow so that these theoretical results do not always work well in samples of even moderate size. See also Cohen et al. (2020) for the case of heterogeneous α (but focusing on |$0\lt\alpha\lt1$|).
This distribution is the standard Subbotin distribution and is obtained for |$a_\mathrm{l}=a_\mathrm{r}$| and |$b_\mathrm{l}=b_\mathrm{r}$| for the five-parameter AEP.
Sornette (2006) shows that this is the value of the maximum that is not exceeded with probability |$1/e \approx 37\%$|. Generalizing to the value of the maximum that is not exceeded with probability p implies the condition |$\frac{\ln(1/p)}{N}=P(X \gt X_{\text{max}})$|, which does not change the scaling of |$X_{\text{max}}$| with N. Newman (2005) shows that the expected value of the maximum also scales as |$N^{\frac{1}{\alpha}}$|.
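The scaling argument can be written out in a few lines. This is a sketch that assumes an exact power law tail |$P(X \gt x) = Cx^{-\alpha}$| above some threshold and large N; the constant C is introduced here for illustration.

```latex
% Probability that the maximum of N i.i.d. draws does not exceed x:
P(\max_i X_i \le x) = \left[1 - P(X > x)\right]^N = p
\;\Rightarrow\;
P(X > x) = 1 - p^{1/N} \approx \frac{\ln(1/p)}{N} \quad (N \text{ large}).

% Insert the power law tail and solve for x = X_{\max}:
C X_{\max}^{-\alpha} = \frac{\ln(1/p)}{N}
\;\Rightarrow\;
X_{\max} = \left(\frac{C N}{\ln(1/p)}\right)^{1/\alpha} \propto N^{1/\alpha}.

% The case p = 1/e gives \ln(1/p) = 1, i.e. the condition P(X > X_{\max}) = 1/N.
```

The choice of p only changes the prefactor, which is why the exponent |$1/\alpha$| is unaffected.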
This is without loss of generality, see (2.8) in Cohen et al. (2020).
Trapani suggests |$r=n^{0.8}$| where n is the number of observations used to compute Ap.
Appendix 1 Data appendix
1.1. Raw data
We use the Orbis Europe firm level database, which is part of the Orbis data provided by Bureau van Dijk. The database encompasses more than 21 million firms of all sizes from 44 different European countries. Around 23 million firm-year observations for 7 million unique firms in 19 countries are used in the present analysis; see Table A2. The variables used are listed in Table A1.
Variables from Orbis Europe used for the analysis. See Eq. 1 for the construction of VA and LP
Orbis code | Notation | Description |
---|---|---|
IDNR | | Firm’s identification number |
EBITDA | π | Nominal Earnings Before Interest, Taxes, Depreciation & Amortization |
STAF | ω | Nominal Wages (staff costs) |
EMPL | L | Employment |
NACE_PRIM_CODE | | Industrial classification code (NACE Rev. 2) |
CLOSDATE_year | | Year |
Country | 2006 | 2007 | 2008 | 2009 | 2010 | 2011 | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Belgium | 24,823 | 88,877 | 94,373 | 94,899 | 95,631 | 96,810 | 97,357 | 97,093 | 97,324 | 97,629 | 96,909 | 60,423 |
Bulgaria | 0 | 12,231 | 38,205 | 38,136 | 39,190 | 81,215 | 93,304 | 94,817 | 95,577 | 99,594 | 99,453 | 102,942 |
Croatia | 16,451 | 38,875 | 39,957 | 42,387 | 40,536 | 39,360 | 40,340 | 40,472 | 41,208 | 42,434 | 44,062 | 0 |
Czech Republic | 24,328 | 52,697 | 46,049 | 70,294 | 73,284 | 77,457 | 76,924 | 78,795 | 75,475 | 77,139 | 60,192 | 40,554 |
Denmark | 0 | 0 | 0 | 0 | 0 | 0 | 10,153 | 13,116 | 12,649 | 14,232 | 41,054 | 45,520 |
Estonia | 0 | 14,505 | 16,095 | 13,939 | 15,932 | 16,909 | 18,088 | 19,017 | 20,185 | 20,511 | 20,926 | 19,876 |
Finland | 14,585 | 31,271 | 30,235 | 32,541 | 32,344 | 38,403 | 38,749 | 39,827 | 40,825 | 37,529 | 35,797 | 36,760 |
France | 188,682 | 248,892 | 235,860 | 249,724 | 259,352 | 233,105 | 199,456 | 207,906 | 207,671 | 155,439 | 128,023 | 82,804 |
Germany | 19,537 | 36,256 | 37,810 | 38,440 | 39,498 | 42,164 | 50,222 | 73,045 | 40,261 | 37,871 | 33,155 | 0 |
Hungary | 0 | 65,589 | 16,220 | 109,059 | 91,033 | 88,773 | 120,705 | 121,833 | 129,449 | 132,779 | 133,472 | 129,107 |
Italy | 80,270 | 205,580 | 300,766 | 255,141 | 201,252 | 406,372 | 425,845 | 418,763 | 422,876 | 437,585 | 435,280 | 380,695 |
Poland | 13,548 | 25,968 | 31,798 | 54,341 | 18,129 | 14,024 | 15,311 | 13,514 | 0 | 0 | 0 | 25,350 |
Portugal | 97,015 | 202,418 | 208,151 | 201,864 | 197,862 | 192,167 | 187,614 | 184,607 | 187,208 | 191,268 | 193,563 | 193,080 |
Romania | 0 | 167,761 | 175,666 | 161,536 | 162,342 | 173,394 | 179,953 | 182,637 | 178,535 | 183,662 | 191,385 | 0 |
Slovakia | 0 | 16,158 | 22,700 | 33,137 | 32,910 | 35,012 | 33,821 | 39,242 | 43,622 | 34,241 | 34,956 | 0 |
Slovenia | 0 | 0 | 0 | 9,659 | 29,250 | 29,453 | 27,599 | 27,442 | 28,932 | 29,746 | 30,959 | 31,813 |
Spain | 270,505 | 424,877 | 468,998 | 461,257 | 436,631 | 424,314 | 404,536 | 390,900 | 392,478 | 403,412 | 405,003 | 345,054 |
Sweden | 18,283 | 83,593 | 100,106 | 102,341 | 104,274 | 106,361 | 107,975 | 108,888 | 110,179 | 112,690 | 114,515 | 100,681 |
UK | 16,128 | 49,873 | 53,686 | 51,906 | 50,259 | 49,457 | 49,832 | 51,090 | 52,126 | 52,793 | 52,818 | 0 |
1.2. Data cleaning
The first step in processing the data is to ensure that no observations have missing values for their reporting firms’ Identification Number (IDNR) or year. We only use unconsolidated accounts in order to avoid double counting. Since some accounts provide a default year for various reasons, observations where the reported year is not between 2006 and 2017 are removed. Negative sales, wages, and employment observations are regarded as missing. Furthermore, we remove firm-year observations where employment or VA are missing.
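The cleaning steps above can be sketched as a single pass over firm-year records. This is an illustration under assumptions, not the authors' pipeline: the `consolidated` flag is a hypothetical field standing in for however the consolidation status is encoded in the raw data, and VA is taken to be wages plus EBITDA, in line with the income-based definition referred to as Eq. 1.

```python
def clean(rows, first_year=2006, last_year=2017):
    """Apply the cleaning steps in order; returns new dicts with a 'VA' field."""
    kept = []
    for row in rows:
        # 1. Drop observations without a firm identifier or a year.
        if row.get("IDNR") is None or row.get("CLOSDATE_year") is None:
            continue
        # 2. Keep unconsolidated accounts only (hypothetical boolean flag).
        if row.get("consolidated", False):
            continue
        # 3. Drop default or out-of-range years.
        if not first_year <= row["CLOSDATE_year"] <= last_year:
            continue
        rec = dict(row)
        # 4. Negative sales, wages, and employment are regarded as missing.
        for key in ("TURN", "STAF", "EMPL"):
            if rec.get(key) is not None and rec[key] < 0:
                rec[key] = None
        # 5. Assumed income-based VA; it may be negative and is then kept.
        if rec.get("STAF") is None or rec.get("EBITDA") is None:
            rec["VA"] = None
        else:
            rec["VA"] = rec["STAF"] + rec["EBITDA"]
        # 6. Drop firm-years where employment or VA is missing.
        if rec["VA"] is None or rec.get("EMPL") is None:
            continue
        kept.append(rec)
    return kept


raw = [
    {"IDNR": "A", "CLOSDATE_year": 2010, "TURN": 400, "STAF": 100, "EBITDA": -150, "EMPL": 5},
    {"IDNR": None, "CLOSDATE_year": 2010, "TURN": 400, "STAF": 100, "EBITDA": 50, "EMPL": 5},
    {"IDNR": "C", "CLOSDATE_year": 1999, "TURN": 400, "STAF": 100, "EBITDA": 50, "EMPL": 5},
    {"IDNR": "D", "CLOSDATE_year": 2012, "TURN": 400, "STAF": 100, "EBITDA": 50, "EMPL": -3},
    {"IDNR": "E", "CLOSDATE_year": 2012, "TURN": 400, "STAF": 100, "EBITDA": 50, "EMPL": 5,
     "consolidated": True},
    {"IDNR": "F", "CLOSDATE_year": 2012, "TURN": 400, "STAF": 100, "EBITDA": None, "EMPL": 5},
]
cleaned = clean(raw)
```

Only firm A survives in this toy input, and it does so with a negative VA, which is the kind of observation the analysis deliberately retains.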
In addition, we discard country-year samples without at least 10,000 observations for 5 years or more. This reduces the list of countries considered from 44 to 19 countries (Table A2).18
1.3. Deflation
Country | Obs | Small (%) | Medium (%) | Large (%) | Very large (%) |
---|---|---|---|---|---|
Belgium | 1,042,148 | 72.18 | 20.78 | 5.85 | 1.19 |
Bulgaria | 794,664 | 74.19 | 22.75 | 2.62 | 0.44 |
Croatia | 426,082 | 81.30 | 16.07 | 2.22 | 0.41 |
Czech Republic | 753,188 | 59.15 | 32.07 | 7.78 | 1.00 |
Denmark | 136,724 | 62.38 | 27.21 | 7.29 | 3.12 |
Estonia | 195,983 | 83.05 | 14.91 | 1.92 | 0.12 |
Finland | 408,866 | 62.91 | 28.55 | 7.05 | 1.49 |
France | 2,396,914 | 69.34 | 22.89 | 6.55 | 1.22 |
Germany | 448,259 | 11.99 | 39.19 | 36.13 | 12.69 |
Hungary | 1,138,019 | 82.23 | 15.43 | 2.02 | 0.32 |
Italy | 3,970,425 | 56.54 | 35.40 | 7.04 | 1.03 |
Poland | 211,983 | 35.22 | 43.29 | 18.48 | 3.01 |
Portugal | 2,236,817 | 83.40 | 14.50 | 1.82 | 0.28 |
Romania | 1,756,871 | 84.33 | 13.71 | 1.66 | 0.30 |
Slovakia | 325,799 | 74.11 | 21.27 | 3.97 | 0.65 |
Slovenia | 244,853 | 80.79 | 16.38 | 2.48 | 0.35 |
Spain | 4,827,965 | 71.21 | 24.48 | 3.75 | 0.57 |
Sweden | 1,169,886 | 70.62 | 24.59 | 4.11 | 0.68 |
UK | 529,968 | 33.19 | 31.33 | 27.40 | 8.08 |
Mean | 1,211,338 | 65.69 | 24.46 | 7.90 | 1.94 |
The KLEMS database provides the most comprehensive data on deflation for the countries and industries covered by the Orbis Europe sample. In particular, tables Statistical_National-Accounts.rds and Statistical_Capital.rds provide VA and gross output for two-digit NACE Rev.2 industries for the countries in the Orbis Europe sample. Depending on its location, industry, and year of reporting, we deflate firms’ VA using the VA deflator.
1.4. Size proportion
Table A3 displays the number of observations and company size composition by country. Among the top three countries with the largest sample sizes (France, Italy, and Spain), France’s size composition closely aligns with the sample mean. Spain tends to have a relatively lower share of large and very large firms, while Italy has a lower share of small firms. For this reason, we showcase our main results regarding the distribution fit and model comparison (see Section 3.2) using the French sample in the paper.
Appendix 2 The Lévy alpha-stable distribution
The Lévy alpha-stable distribution is a natural candidate for many distributions exhibiting heavy tails, as it emerges from the generalized central limit theorem (GCLT). We first describe the key properties of the Lévy alpha-stable distribution and then discuss the GCLT.
2.1. Characteristics of the Lévy alpha-stable distribution
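The Lévy alpha-stable distribution is a four-parameter distribution with parameters α (tail exponent), β (skewness), γ (scale), and δ (location). The density function exists in closed form only in a few special cases such as α = 2 (Gaussian), |$\{\alpha=1, \beta=0\}$| (Cauchy), and |$\{\alpha=1/2, \beta=1\}$| (standard Lévy). Out of several alternatives, we use the |$\boldsymbol{S_0}$| parametrization of Nolan (1998), which has the characteristic function
|$\varphi(t) = \mathbb{E}\left[e^{itX}\right] = \begin{cases} \exp\left(-\gamma^{\alpha}|t|^{\alpha}\left[1 + i\beta\tan\left(\frac{\pi\alpha}{2}\right)\operatorname{sgn}(t)\left((\gamma|t|)^{1-\alpha}-1\right)\right] + i\delta t\right), & \alpha \neq 1, \\ \exp\left(-\gamma|t|\left[1 + i\beta\frac{2}{\pi}\operatorname{sgn}(t)\ln(\gamma|t|)\right] + i\delta t\right), & \alpha = 1, \end{cases}$|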
where i is the imaginary unit, |$t \in \mathbb{R}$| is the argument of the characteristic function, and |$\operatorname{sgn}(\cdot)$| is the sign function. This particular parametrization is useful for numerical work and statistical inference since the characteristic function is continuous in all four parameters (Nolan, 2020: 7).
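As a minimal numerical sketch, the standard form of Nolan's |$\boldsymbol{S_0}$| characteristic function can be evaluated directly and checked against the two closed-form special cases and against the continuity property at α = 1. The function name and the test values are choices made for this illustration.

```python
import cmath
import math


def stable_cf_s0(t, alpha, beta, gamma, delta):
    """Characteristic function of S0(alpha, beta, gamma, delta) at a real t."""
    if t == 0:
        return 1 + 0j
    sgn = 1.0 if t > 0 else -1.0
    u = gamma * abs(t)
    if alpha == 1:
        w = 1 + 1j * beta * (2 / math.pi) * sgn * math.log(u)
        return cmath.exp(-u * w + 1j * delta * t)
    w = 1 + 1j * beta * math.tan(math.pi * alpha / 2) * sgn * (u ** (1 - alpha) - 1)
    return cmath.exp(-(u ** alpha) * w + 1j * delta * t)


gamma, delta = 1.5, 0.3

# alpha = 2: Gaussian with mean delta and variance 2 * gamma**2 (beta plays no role).
gauss = stable_cf_s0(0.7, 2.0, 0.5, gamma, delta)
gauss_exact = cmath.exp(1j * delta * 0.7 - gamma ** 2 * 0.7 ** 2)

# alpha = 1, beta = 0: Cauchy with scale gamma and location delta.
cauchy = stable_cf_s0(-2.0, 1.0, 0.0, gamma, delta)
cauchy_exact = cmath.exp(-gamma * 2.0 + 1j * delta * (-2.0))

# Continuity in alpha at 1, even with non-zero skewness.
near_one = stable_cf_s0(0.7, 1.0 + 1e-7, 0.5, gamma, delta)
at_one = stable_cf_s0(0.7, 1.0, 0.5, gamma, delta)
```

The last comparison is the practical point of the parametrization: an optimizer can move α across 1 without the objective jumping.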

Probability densities of Lévy alpha-stable distributions for different parameter values. The top left plot varies the α parameter (tail exponent), which determines how heavy the tail is. The top right plot varies the β parameter (skew parameter). The bottom left plot varies the γ parameter, which determines the “scale” or “width” of the distribution. The last plot varies the δ parameter, which shifts the location of the modal value of the distribution
Fig. A1 shows the probability density function of the Lévy alpha-stable distribution, with three different values for each of the four parameters. Starting from the top left, the parameter |$\alpha \in (0,2]$| is also called the tail exponent because the tail of the distribution decays as a power law with exponent α, that is, |$P(X \gt x) \approx Cx^{-\alpha}$|. This implies that scaling the value of x by a factor h just scales the tail probability by |$h^{-\alpha}$|, since |$P(X \gt hx)\approx C(hx)^{-\alpha} = h^{-\alpha}Cx^{-\alpha}$|. For example, if α = 1, values that are twice as large (i.e., h = 2) as some reference |$\tilde{x}$| will be half as common, no less. As a result, with power law tails, extreme values are very common and can dominate certain moments of the distribution. Thus, the tail parameter α is an important indicator of dispersion. The lower α, the thicker the tail (the more “tail dispersion” there is). However, note that while a smaller α indicates more dispersion because of a heavier tail, a smaller α also leads to a higher concentration of values close to the mode and thus, in this sense, to a lower dispersion in the body of the distribution.
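The "twice as large, half as common" property can be checked on simulated data. The sketch below uses a pure Pareto distribution with α = 1 as a stand-in for the power law tail (an assumption made for simplicity; it is not a stable distribution), with an arbitrary reference value of 5.

```python
import random


def tail_fraction(sample, x):
    """Share of observations strictly above x."""
    return sum(v > x for v in sample) / len(sample)


rng = random.Random(1)
alpha = 1.0
# paretovariate draws from P(X > x) = x**(-alpha) for x >= 1.
sample = [rng.paretovariate(alpha) for _ in range(200_000)]

p_x = tail_fraction(sample, 5.0)    # reference value
p_2x = tail_fraction(sample, 10.0)  # twice the reference value, h = 2
ratio = p_2x / p_x                  # theory: h**(-alpha) = 0.5
```

With a Gaussian tail the same ratio would collapse towards zero as the reference value grows, which is the sense in which extremes remain common under a power law.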
The top-right panel shows the effects of the “skew” or “asymmetry” parameter |$\beta \in [-1,1]$|. For β = 0, the distribution is symmetric, for β > 0 right-skewed, and for β < 0 it is left-skewed (except when α = 2, as the skew parameter β vanishes in that case). In practice, we will find that productivity distributions are right-skewed.
The bottom-left panel shows the effect of the scale parameter |$\gamma \in (0, +\infty)$|. The higher the scale parameter γ, the wider the body of the distribution. The parameter γ is not dimensionless; it is expressed in the same units as the data. We can get more intuition on the role of γ by considering special cases (Nolan, 2020: 168). First, γ is highly related to the (25–75) interquartile range; in fact, when β is small and α is large (say, β = 0.5 and α = 1.3), the interquartile range divided by 2 is a good estimator of γ. Second, for large enough α, the re-centered data (so that δ = 0) can be used to estimate γ as the median of the absolute values.
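Both rules of thumb can be illustrated on the Cauchy case (α = 1, β = 0), where they hold exactly in the population: the quartiles sit at δ ± γ, so half the interquartile range and the median absolute deviation from δ both equal γ. The sketch below is only this special case, with arbitrary parameter values; for general stable laws the relation is approximate, as the text notes.

```python
import math
import random
import statistics

rng = random.Random(2)
gamma, delta = 3.0, 10.0

# Cauchy draws by inverting the CDF: delta + gamma * tan(pi * (U - 1/2)).
sample = [delta + gamma * math.tan(math.pi * (rng.random() - 0.5))
          for _ in range(20_000)]

q1, _, q3 = statistics.quantiles(sample, n=4)
gamma_iqr = (q3 - q1) / 2                                    # half the interquartile range
gamma_mad = statistics.median(abs(v - delta) for v in sample)  # median of re-centered |x|
```

Neither estimator requires a finite variance, which is why they remain usable when the sample standard deviation is not.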
Finally, the bottom-right panel shows the effect of the location parameter |$\delta \in (-\infty, +\infty)$|, which is fairly intuitive even though it is not in general exactly related to key quantities such as the mode, mean, or median. For symmetric stable distributions, the median is equal to the location, and for asymmetric distributions, the median remains close to the location as long as |$|\beta|$| is not too large and α is not too small (Nolan, 2020: 168). When α > 1, the sample mean is actually a consistent (but slowly converging) estimator of δ (Nolan, 2020: 174). Nolan (2020: 92) defines an S2 parametrization specifically so that δ is the mode.
We can gain further intuition by considering special cases: for α = 2, the distribution becomes a Gaussian with variance |$2\gamma^2$| and mean δ. For |$\alpha = 1, \beta = 0$|, the distribution becomes Cauchy with scale parameter γ and location parameter δ. The role of δ and γ as location and scale parameters can be seen clearly by writing the Cauchy density as |$f(x) \propto \Big[ 1 + \Big( \frac{x-\delta}{\gamma} \Big)^2\Big]^{-1}$|. More generally, if |$X \sim \textbf{S}_0(\alpha,\beta,\gamma,\delta)$|, then |$(X-\delta)/\gamma \sim \textbf{S}_0(\alpha,\beta,1,0)$|.
One of the key characteristics of the Lévy alpha-stable distribution is that γ and α measure qualitatively different aspects of dispersion, providing a richer perspective on observed empirical patterns.
2.2. The generalized central limit theorem
Consider the sum of i.i.d. variables (with a zero mean, without loss of generality). If they have a finite variance, then this sum divided by |$N^{1/2}$| tends in distribution to a Gaussian distribution—this is the CLT. If we are unwilling to assume that the i.i.d. variables have finite variance, the generalized CLT states that the sum should be normalized by |$N^{1/\alpha}$|, where α is the tail exponent of the i.i.d. variables (or α = 2 if they have finite variance), and it tends to the Lévy alpha-stable distribution (which is the normal distribution if α = 2).
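A small simulation makes the role of the normalization concrete. The sketch uses standard Cauchy draws (α = 1, symmetric, so no centering is needed) as an assumed example: dividing the sum by |$N^{1/\alpha}$| leaves its typical size unchanged as N grows, whereas dividing by |$N^{1/2}$| does not. The sample sizes and number of repetitions are arbitrary.

```python
import math
import random
import statistics


def cauchy(rng):
    """One standard Cauchy draw via the inverse CDF."""
    return math.tan(math.pi * (rng.random() - 0.5))


def median_abs_rescaled_sum(n, exponent, reps, rng):
    """Median of |S_n| / n**exponent over `reps` independent sums of n draws."""
    sums = (sum(cauchy(rng) for _ in range(n)) for _ in range(reps))
    return statistics.median(abs(s) / n ** exponent for s in sums)


rng = random.Random(3)
alpha = 1.0

# GCLT normalization N**(1/alpha): the scale stays near 1 for every N.
stable_scale = {n: median_abs_rescaled_sum(n, 1 / alpha, 1000, rng) for n in (25, 400)}

# Classical CLT normalization N**(1/2): the scale keeps growing with N.
clt_scale = {n: median_abs_rescaled_sum(n, 0.5, 1000, rng) for n in (25, 400)}
```

The median absolute value is used as the measure of scale because the variance of these sums does not exist.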
We take a more formal presentation from Nolan (2020), Theorem 3.12, and state it assuming |$1\lt\alpha\lt2$|, which is the relevant case here, and simplifies exposition.19
In plain English, the sum can be re-centered using the first moment (thanks to |$1\lt\alpha\lt2$|), the rescaling factor is proportional to |$N^{1/\alpha}$| (and a function of α and the tail balance parameters |$c^{+}$| and |$c^{-}$|), and the asymmetry parameter β depends on the relative difference between tail balance parameters |$c^{+}$| and |$c^{-}$|.
This theorem is remarkably general. For instance, assume that employees (or tasks, contracts, or routines) draw their productivity from a heavy-tailed distribution.20 Firms’ productivity being an average of the productivity of their employees, sufficiently large firms will have Lévy distributed productivity.
A legitimate concern with this justification for the alpha stable distribution is that employees are unlikely to draw their productivity from the same distribution, and even if they did, these draws are unlikely to be independent: the two “i”s of the i.i.d. assumption are violated.
In the case of finite variance variables, the CLT has been extended to cover independent but heterogeneous variables, non-independent but identically distributed variables, and non-independent, heterogeneous variables, see White (2000). There are similar attempts for infinite variance variables, but less is known and the conditions are typically more difficult to state, less intuitive, and harder to check empirically.
Starting with dependent variables, there is a growing literature, but it is generally focused on time series rather than cross-sectional correlations.21 That said, it is known that under “mild” time series dependence (e.g. m-dependent sequences), the GCLT still applies (see Bartkiewicz et al. (2011) for precise conditions and a history of this literature).
Regarding heterogeneous variables, it is important to realize that thanks to the property of stability, a sum of independent but heterogeneous stable distributions (with the same α, also called “index of stability”) is still a stable distribution, with parameters being an explicit function of the weights and parameters of underlying distributions (Nolan, 2020, Proposition 1.3). As a result, if we have a fixed number of heterogeneous distributions (say, two types of employees), we can let the number of copies go to infinity for each kind of distribution, obtain a stable distribution for each of the sums, and then sum up all the stable variables, which will give another stable distribution. A more precise statement appears in Shintani and Umeno (2018).
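For two independent components, the explicit form of this closure property is short enough to write down. The expressions below are the standard ones in the |$\boldsymbol{S_1}$| parametrization and are given as a sketch; in the |$\boldsymbol{S_0}$| parametrization used in the empirical work, the scale and skewness rules are the same but the location rule carries an additional correction term.

```latex
% Independent X_1 \sim S_1(\alpha, \beta_1, \gamma_1, \delta_1) and
% X_2 \sim S_1(\alpha, \beta_2, \gamma_2, \delta_2) with a common \alpha:
X_1 + X_2 \sim S_1(\alpha, \beta, \gamma, \delta), \qquad
\gamma^{\alpha} = \gamma_1^{\alpha} + \gamma_2^{\alpha}, \qquad
\beta = \frac{\beta_1 \gamma_1^{\alpha} + \beta_2 \gamma_2^{\alpha}}
             {\gamma_1^{\alpha} + \gamma_2^{\alpha}}, \qquad
\delta = \delta_1 + \delta_2 .
```

The skewness of the sum is thus a weighted average of the component skewness parameters, with weights given by the scales raised to the power α.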
While this discussion suggests that the basic GCLT has a broader scope than i.i.d. variables, there remain a number of limitations, such as a fixed α for heterogeneous variables,22 the relative lack of results on GCLT for cross-sectional dependence (but see Cohen et al. (2020) for progress on this and on heterogeneous variables), and limited results on large deviations. As a result, we refrain from attempting to provide a specific data-generating process (DGP) here. It is likely that several plausible DGPs can deliver a stable distribution, but in our view, evaluating which one is more plausible would require fitting its parameters, and simulating it using finite samples of sizes similar to the empirical data, as well as checking additional predictions of each model. We thus leave this for further research, and in this paper, we focus on establishing that the distribution of productivity is Lévy alpha-stable and discuss the values of the parameters for subsamples, such as different points in time, countries, and industries.
Appendix 3 Fitting procedure
3.1. Fitting methods
There are three primary estimation methods for the Lévy alpha-stable distribution in the context of asymmetric distributions: MLE, the quantile-based estimator of McCulloch (1986), and a method based on the empirical characteristic function (Nolan, 2020, Chapter 4). Among these, the quantile-based estimation is the fastest as it requires only a few quantiles to estimate the parameters. However, we noted that the quantile estimation can produce imprecise results, especially for countries where the empirical distribution is substantially right-skewed, such as Bulgaria, Estonia, Romania, Slovakia, and Slovenia. For instance, the quantile method occasionally returns a value of β = 0.95, the upper limit for β within this estimation method.
For this reason, we employ the MLE to obtain the most accurate estimates, utilizing the L-BFGS-B algorithm (Byrd et al., 1995). However, even though the MLE demonstrates superior performance for the Lévy parameter estimation (Nolan, 2020, Chapter 4), it is computationally very expensive. This is because the density function of the Lévy alpha-stable distribution does not have a closed form, making its evaluation time-consuming. To get around this problem, we opt for using a subsample of the data. To ensure that this subsampling does not introduce bias into our MLE estimates, we conduct a Monte Carlo simulation. Specifically, we generate synthetic data of size 1,000,000 from the Lévy alpha-stable distribution using parameters |$\alpha=1.2, \beta=0.55, \gamma=1, \delta=0$|. From these data, we draw random samples of sizes |$N=5000, 10,000, 15,000, 20,000$|, and |$30,000$|. For each sample size, we draw 100 samples and estimate the parameters using MLE. Subsequently, we compute the mean and both the 2.5th and 97.5th quantiles of these 100 observations.
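The design of this Monte Carlo check can be sketched as follows. This is a simplified stand-in rather than the procedure actually run: the pool and subsample sizes are much smaller, the draws use the Chambers-Mallows-Stuck method in the 1-parameterization (an assumption), and the sample median replaces the MLE as a cheap placeholder estimator. All function names are hypothetical.

```python
import math
import random
import statistics


def sample_stable(alpha, beta, gamma, delta, rng):
    """One stable draw via Chambers-Mallows-Stuck (alpha != 1, 1-parameterization)."""
    u = rng.uniform(-math.pi / 2, math.pi / 2)
    w = rng.expovariate(1.0)
    t = beta * math.tan(math.pi * alpha / 2)
    b = math.atan(t) / alpha
    s = (1 + t * t) ** (1 / (2 * alpha))
    x = (s * math.sin(alpha * (u + b)) / math.cos(u) ** (1 / alpha)
         * (math.cos(u - alpha * (u + b)) / w) ** ((1 - alpha) / alpha))
    return gamma * x + delta


def subsample_summary(pool, sizes, reps, estimator, rng):
    """Mean and 2.5th/97.5th percentiles of an estimator over repeated subsamples."""
    summary = {}
    for n in sizes:
        estimates = sorted(estimator(rng.sample(pool, n)) for _ in range(reps))
        lo = estimates[round(0.025 * (reps - 1))]
        hi = estimates[round(0.975 * (reps - 1))]
        summary[n] = (sum(estimates) / reps, lo, hi)
    return summary


rng = random.Random(0)
# Small synthetic pool with the parameter values used in the text.
pool = [sample_stable(1.2, 0.55, 1.0, 0.0, rng) for _ in range(20_000)]
# Placeholder estimator: the sample median (the paper uses the MLE instead).
summary = subsample_summary(pool, [500, 5_000], 100, statistics.median, rng)
for n, (mean, lo, hi) in summary.items():
    print(f"N={n}: mean={mean:.3f}, 95% band=({lo:.3f}, {hi:.3f})")
```

Swapping `statistics.median` for an MLE routine would reproduce the structure of the experiment, at a much higher computational cost.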

Figure A2. Performance of the MLE in subsampling. For each sample size, we sample 100 sets of i.i.d. values from the synthetic data of size 100,000 generated from the Lévy alpha-stable distribution using parameters |$\alpha=1.2, \beta=0.55, \gamma=1, \delta=0$|. The horizontal lines show the true values of the parameters. The dots show the average of the estimated parameters over the 100 repetitions, and the line shows the 2.5th and 97.5th percentiles
Figure A2 displays the results. The black lines show the true values of the parameters, while the red dots show the average of the estimated parameters over the 100 repetitions, and the red lines show the 2.5th and 97.5th percentiles. For all four parameters, the MLE accurately estimates the true value even with a very small subsample size. The error bar becomes smaller as the size of the subsample increases. When the size of the subsample reaches 30,000, the 95% error bar is only around 0.01 and 0.02 away from the true value for α and β, respectively. To conclude, we prefer to use the MLE on a single manageable subsample of the data rather than the quantile method on the full sample.
In using the MLE for Lévy parameter estimation in this paper, we employ subsamples of 30,000 observations for country-year estimates and 100,000 observations for country estimates.
3.2. Estimated parameters
Table A4 presents the estimated parameters of the Lévy alpha-stable distribution using the MLE. When estimating country aggregate samples, we use subsamples of 100,000 observations. For Croatia, the Czech Republic, Denmark, and Sweden, the built-in MLE function of the StableEstim package fails to operate properly. In these cases, we rely on the optim() function, which is the base optimization function in R, to obtain the results.
Table A4. The estimated parameters of the Lévy alpha-stable distribution for each country sample. The parameters are estimated using the MLE from subsamples of 100,000 observations. All MLE results are based on the L-BFGS-B algorithm (Byrd et al., 1995)
Country | |$\boldsymbol{\hat{\alpha}}$| | |$\boldsymbol{sd(\hat{\alpha})}$| | |$\boldsymbol{\hat{\beta}}$| | |$\boldsymbol{sd(\hat{\beta})}$| | |$\boldsymbol{\hat{\gamma}}$| | |$\boldsymbol{sd(\hat{\gamma})}$| | |$\boldsymbol{\hat{\delta}}$| | |$\boldsymbol{sd(\hat{\delta})}$| |
---|---|---|---|---|---|---|---|---|
Belgium | 1.310 | 0.004 | 0.798 | 0.005 | 17.84 | 0.06 | 48.68 | 0.10 |
Bulgaria | 1.027 | 0.003 | 0.769 | 0.004 | 4.95 | 0.02 | 7.45 | 0.03 |
Croatia | 1.220 | 0.004 | 0.747 | 0.005 | 42.01 | 0.06 | 77.83 | 0.09 |
Czech Republic | 1.195 | 0.004 | 0.791 | 0.005 | 190.96 | 0.80 | 306.84 | 1.00 |
Denmark | 1.282 | 0.004 | 0.587 | 0.006 | 155.13 | 0.28 | 443.85 | 0.80 |
Estonia | 1.164 | 0.004 | 0.855 | 0.004 | 5.86 | 0.02 | 10.13 | 0.03 |
Finland | 1.415 | 0.004 | 0.769 | 0.006 | 16.50 | 0.05 | 41.63 | 0.09 |
France | 1.341 | 0.004 | 0.830 | 0.005 | 15.29 | 0.05 | 40.33 | 0.08 |
Germany | 1.156 | 0.004 | 0.787 | 0.004 | 21.11 | 0.08 | 49.03 | 0.11 |
Hungary | 1.192 | 0.010 | 0.796 | 0.008 | 1312.80 | 36.55 | 2154.31 | 40.24 |
Italy | 1.390 | 0.004 | 0.752 | 0.006 | 15.65 | 0.05 | 30.78 | 0.09 |
Poland | 1.105 | 0.004 | 0.863 | 0.003 | 28.79 | 0.11 | 41.63 | 0.15 |
Portugal | 1.346 | 0.004 | 0.688 | 0.006 | 6.84 | 0.02 | 12.12 | 0.04 |
Romania | 1.148 | 0.004 | 0.768 | 0.004 | 12.69 | 0.05 | 15.50 | 0.07 |
Slovakia | 1.176 | 0.004 | 0.795 | 0.004 | 8.03 | 0.03 | 12.25 | 0.04 |
Slovenia | 1.291 | 0.004 | 0.801 | 0.005 | 7.56 | 0.03 | 21.50 | 0.04 |
Spain | 1.289 | 0.004 | 0.674 | 0.006 | 11.10 | 0.04 | 25.78 | 0.06 |
Sweden | 1.417 | 0.004 | 0.673 | 0.007 | 164.79 | 0.43 | 455.90 | 0.48 |
UK | 1.093 | 0.004 | 0.545 | 0.005 | 18.71 | 0.08 | 31.01 | 0.09 |
3.3. Parameter comparison: MLE vs. quantile method
Table A5 compares the parameters of the Lévy alpha-stable distribution estimated by the MLE (using a single sample of 100,000 observations) and by the quantile method of McCulloch (1986) (using the full sample). Only minimal differences are observed, predominantly for the value of β in Eastern European countries like Bulgaria, Croatia, the Czech Republic, Estonia, Poland, Romania, Slovakia, and Slovenia. The quantile method tends to overestimate the right skewness of the distribution for these countries. Conversely, the MLE method estimates a lower skewness and a slightly heavier tail (lower α).
Table A5. Parameter comparison between MLE and quantile methods. The table compares the parameters of the Lévy alpha-stable distribution estimated by the MLE and the quantile estimation method of McCulloch (1986)
Country | |$\boldsymbol{\hat{\alpha}}_{\textbf{mle}}$| | |$\boldsymbol{\hat{\alpha}}_{\textbf{q}}$| | |$\boldsymbol{\hat{\beta}}_{\textbf{mle}}$| | |$\boldsymbol{\hat{\beta}}_{\textbf{q}}$| | |$\boldsymbol{\hat{\gamma}}_{\textbf{mle}}$| | |$\boldsymbol{\hat{\gamma}}_{\textbf{q}}$| | |$\boldsymbol{\hat{\delta}}_{\textbf{mle}}$| | |$\boldsymbol{\hat{\delta}}_{\textbf{q}}$| |
---|---|---|---|---|---|---|---|---|
Belgium | 1.31 | 1.31 | 0.80 | 0.81 | 17.84 | 17.69 | 48.68 | 48.48 |
Bulgaria | 1.03 | 1.14 | 0.77 | 0.95 | 4.95 | 5.30 | 7.45 | 6.80 |
Croatia | 1.22 | 1.28 | 0.75 | 0.85 | 42.01 | 43.59 | 77.83 | 74.65 |
Czech Republic | 1.20 | 1.22 | 0.79 | 0.85 | 190.96 | 193.86 | 306.84 | 302.04 |
Denmark | 1.28 | 1.26 | 0.59 | 0.57 | 155.13 | 153.97 | 443.85 | 444.20 |
Estonia | 1.16 | 1.25 | 0.85 | 0.95 | 5.86 | 6.42 | 10.13 | 9.82 |
Finland | 1.41 | 1.34 | 0.77 | 0.66 | 16.50 | 15.96 | 41.63 | 42.20 |
France | 1.34 | 1.34 | 0.83 | 0.84 | 15.29 | 15.19 | 40.33 | 40.05 |
Germany | 1.16 | 1.09 | 0.79 | 0.72 | 21.11 | 20.11 | 49.03 | 49.65 |
Hungary | 1.19 | 1.25 | 0.80 | 0.91 | 1312.80 | 1370.24 | 2154.31 | 2093.89 |
Italy | 1.39 | 1.43 | 0.75 | 0.82 | 15.65 | 15.84 | 30.78 | 30.83 |
Poland | 1.11 | 1.14 | 0.86 | 0.95 | 28.79 | 28.52 | 41.63 | 40.83 |
Portugal | 1.35 | 1.31 | 0.69 | 0.65 | 6.84 | 6.59 | 12.12 | 12.08 |
Romania | 1.15 | 1.24 | 0.77 | 0.92 | 12.69 | 13.40 | 15.50 | 14.61 |
Slovakia | 1.18 | 1.24 | 0.79 | 0.91 | 8.03 | 8.25 | 12.25 | 11.82 |
Slovenia | 1.29 | 1.35 | 0.80 | 0.95 | 7.56 | 7.84 | 21.50 | 20.98 |
Spain | 1.29 | 1.29 | 0.67 | 0.67 | 11.10 | 11.13 | 25.78 | 25.75 |
Sweden | 1.42 | 1.40 | 0.67 | 0.65 | 164.79 | 163.67 | 455.90 | 454.25 |
UK | 1.09 | 1.11 | 0.55 | 0.56 | 18.71 | 19.14 | 31.01 | 30.89 |
Appendix 4 Reference model and comparison of the fit
As reference models, we use the five-parameter AST distribution of Zhu and Galbraith (2010) and the five-parameter AEP distribution of Bottazzi and Secchi (2011). In the following, we give the functional forms, details of the fitting procedures, and a comparison of the goodness of fit.
4.1. Functional forms of the AST and AEP models, fitting procedure, and estimated parameters
4.1.2 Five-parameter AEP
The probability density function of the five-parameter AEP distribution is (Bottazzi and Secchi, 2011)
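|$f_{\mathrm{AEP}}(x; b_\mathrm{l}, b_\mathrm{r}, a_\mathrm{l}, a_\mathrm{r}, \theta) = \begin{cases} \dfrac{1}{C} \exp\left(-\dfrac{1}{b_\mathrm{l}} \left|\dfrac{x-\theta}{a_\mathrm{l}}\right|^{b_\mathrm{l}}\right), & x \lt \theta, \\ \dfrac{1}{C} \exp\left(-\dfrac{1}{b_\mathrm{r}} \left|\dfrac{x-\theta}{a_\mathrm{r}}\right|^{b_\mathrm{r}}\right), & x \geq \theta, \end{cases}$|
with
|$C = a_\mathrm{l}\, b_\mathrm{l}^{1/b_\mathrm{l}}\, \Gamma(1 + 1/b_\mathrm{l}) + a_\mathrm{r}\, b_\mathrm{r}^{1/b_\mathrm{r}}\, \Gamma(1 + 1/b_\mathrm{r}),$|
where |$\Gamma()$| is the Gamma function.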
θ is the location parameter; the scale parameter is separate for the left side (|$a_\mathrm{l}$|) and right side (|$a_\mathrm{r}$|) of the distribution |$x\lt\theta$| and |$x\gt\theta$|. |$b_\mathrm{l}$| and |$b_\mathrm{r}$| are a left-shape and a right-shape parameter. The distribution recovers the normal distribution |$\mathcal{N}(\mu, \sigma)$| for parameter values |$\theta=\mu, a_\mathrm{l}=a_\mathrm{r}=\sigma, b_\mathrm{l}=b_\mathrm{r}=2$|.
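The normal special case mentioned above can be checked numerically. The sketch below assumes the standard AEP form of Bottazzi and Secchi (2011), in which each side of θ has density proportional to exp(−|(x − θ)/a|^b / b) and the normalizing constant is a_l b_l^{1/b_l} Γ(1 + 1/b_l) + a_r b_r^{1/b_r} Γ(1 + 1/b_r). The function names are hypothetical.

```python
import math


def aep_pdf(x, b_l, b_r, a_l, a_r, theta):
    """Five-parameter AEP density (assumed standard form, see lead-in)."""
    # Normalizing constant: sum of the left-side and right-side integrals.
    c = (a_l * b_l ** (1 / b_l) * math.gamma(1 + 1 / b_l)
         + a_r * b_r ** (1 / b_r) * math.gamma(1 + 1 / b_r))
    # Pick the scale and shape parameters for the side of theta that x falls on.
    a, b = (a_l, b_l) if x < theta else (a_r, b_r)
    return math.exp(-abs((x - theta) / a) ** b / b) / c


def normal_pdf(x, mu, sigma):
    """Density of the normal distribution with mean mu and standard deviation sigma."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


# With b_l = b_r = 2 and a_l = a_r = sigma, the AEP coincides with the normal density.
mu, sigma = 1.0, 2.0
for x in (-3.0, 0.0, 1.0, 2.5, 7.0):
    print(x, aep_pdf(x, 2.0, 2.0, sigma, sigma, mu), normal_pdf(x, mu, sigma))
```

Shape values below 1, as in the estimates reported for all countries on the right side, give tails that are heavier than those of the Laplace distribution (b = 1).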
The distribution can be fitted with interval-constrained likelihood optimization, which is, however, computationally expensive. We used the Subbotools package (Bottazzi, 2014). To speed up the computation, we first perform a three-parameter exponential power23 fit and then a five-parameter AEP fit that is initialized with the estimates from the three-parameter exponential power fit. Table A7 shows the estimated parameters of the five-parameter AEP distribution.
Table A7. The estimated parameters of the five-parameter AEP distribution for each country sample. The parameters are estimated using the interval-constrained likelihood optimization from the Subbotools package (Bottazzi, 2014)
Country | |$\boldsymbol{b_{\textbf{l}}}$| | |$\boldsymbol{b_{\textbf{r}}}$| | |$\boldsymbol{a_{\textbf{l}}}$| | |$\boldsymbol{a_{\textbf{r}}}$| | θ |
---|---|---|---|---|---|
Belgium | 0.65 | 0.55 | 14.37 | 34.02 | 43.66 |
Bulgaria | 0.88 | 0.36 | 12.12 | 15.58 | 17.44 |
Croatia | 0.85 | 0.48 | 50.94 | 85.02 | 96.00 |
Czech Republic | 1.00 | 0.40 | 355.56 | 460.68 | 575.00 |
Denmark | 0.51 | 0.46 | 140.60 | 269.71 | 413.18 |
Estonia | 0.46 | 0.60 | 3.45 | 13.39 | 4.65 |
Finland | 1.13 | 0.43 | 27.73 | 31.15 | 61.00 |
France | 0.83 | 0.49 | 16.88 | 27.96 | 45.86 |
Germany | 0.50 | 0.41 | 16.62 | 43.97 | 44.50 |
Hungary | 0.53 | 0.51 | 973.25 | 2711.20 | 1556.00 |
Italy | 0.65 | 0.63 | 14.05 | 28.49 | 25.32 |
Poland | 0.58 | 0.47 | 21.33 | 64.35 | 33.29 |
Portugal | 0.51 | 0.48 | 5.74 | 11.48 | 10.39 |
Romania | 0.46 | 0.58 | 9.12 | 28.26 | 5.35 |
Slovakia | 0.50 | 0.55 | 5.73 | 17.24 | 6.93 |
Slovenia | 0.65 | 0.67 | 5.01 | 15.82 | 16.37 |
Spain | 0.51 | 0.50 | 9.38 | 20.28 | 21.77 |
Sweden | 1.01 | 0.50 | 245.11 | 269.86 | 590.50 |
UK | 0.41 | 0.41 | 18.26 | 36.68 | 25.01 |
4.1.1 Five-parameter AST
The probability density function of the five-parameter AST distribution is (Zhu and Galbraith, 2010)
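|$f_{\mathrm{AST}}(x; \mu, \sigma, \alpha, \nu_1, \nu_2) = \begin{cases} \dfrac{1}{\sigma} \dfrac{\alpha}{\alpha^*} K(\nu_1) \left[1 + \dfrac{1}{\nu_1} \left(\dfrac{x-\mu}{2\alpha^*\sigma}\right)^2\right]^{-\frac{\nu_1+1}{2}}, & x \leq \mu, \\ \dfrac{1}{\sigma} \dfrac{1-\alpha}{1-\alpha^*} K(\nu_2) \left[1 + \dfrac{1}{\nu_2} \left(\dfrac{x-\mu}{2(1-\alpha^*)\sigma}\right)^2\right]^{-\frac{\nu_2+1}{2}}, & x \gt \mu, \end{cases}$|
with
|$K(\nu) = \dfrac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{\pi\nu}\,\Gamma\left(\frac{\nu}{2}\right)}, \qquad \alpha^* = \dfrac{\alpha K(\nu_1)}{\alpha K(\nu_1) + (1-\alpha) K(\nu_2)},$|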
where µ is the location parameter, σ is the scale parameter, |$\alpha \in (0,1)$| is the skewness parameter, and |$\nu_1 \gt 0$| and |$\nu_2 \gt 0$| are the left and right tail parameters. |$\Gamma()$| is the Gamma function. We fit the AST using the MLE. For the computation, we use the R package SkewtDist (Xia, 2022). Table A6 shows the estimated parameters of the five-parameter AST distribution.
Table A6. Estimated parameters of the five-parameter AST distribution for each country sample. The parameters are estimated using the MLE from the R package SkewtDist (Xia, 2022)
Country | µ | σ | α | |$\boldsymbol{\nu_{\textbf{1}}}$| | |$\boldsymbol{\nu_{\textbf{2}}}$| |
---|---|---|---|---|---|
Belgium | 39.57 | 59.64 | 0.28 | 2.22 | 1.75 |
Bulgaria | 3.91 | 16.58 | 0.20 | 1.03 | 1.34 |
Croatia | 51.97 | 141.04 | 0.25 | 1.57 | 1.72 |
Czech Republic | 182.78 | 648.84 | 0.24 | 1.61 | 1.61 |
Denmark | 392.56 | 510.91 | 0.34 | 1.92 | 1.54 |
Estonia | 4.82 | 20.22 | 0.16 | 1.22 | 1.87 |
Finland | 37.89 | 54.09 | 0.36 | 3.18 | 1.65 |
France | 32.24 | 51.21 | 0.28 | 2.45 | 1.84 |
Germany | 37.69 | 69.18 | 0.26 | 1.58 | 1.33 |
Hungary | 1365.14 | 4519.57 | 0.24 | 1.62 | 1.62 |
Italy | 21.98 | 53.46 | 0.29 | 2.05 | 2.16 |
Poland | 16.82 | 99.87 | 0.17 | 1.36 | 1.55 |
Portugal | 10.03 | 22.11 | 0.34 | 2.11 | 1.63 |
Romania | 5.67 | 43.53 | 0.20 | 1.23 | 1.68 |
Slovakia | 6.49 | 27.39 | 0.22 | 1.41 | 1.66 |
Slovenia | 16.27 | 25.72 | 0.23 | 1.89 | 1.99 |
Spain | 21.11 | 36.73 | 0.31 | 1.83 | 1.62 |
Sweden | 395.64 | 547.79 | 0.34 | 2.46 | 1.98 |
UK | 21.93 | 61.51 | 0.30 | 1.14 | 1.29 |
4.2. Comparison of the goodness of fit
Table A8 compares the log-likelihoods of the Lévy alpha-stable, the AST, and the AEP distributions for all country samples. Each log-likelihood value is normalized by dividing it by the number of observations for each country. “Mean” refers to the unweighted average normalized log-likelihood across all 19 countries. “Size-weighted mean” is the weighted average normalized log-likelihood across all 19 countries, where the weights are proportional to the sample size of each respective country. Despite having one fewer parameter, the Lévy alpha-stable model performs as well as the AST, and significantly better than the AEP5. Furthermore, as shown by the size-weighted mean, the Lévy alpha-stable model performs better for countries with larger sample sizes, such as France, Italy, Portugal, and Spain. As a result, the likelihood for the Lévy alpha-stable model has a higher value for the weighted mean compared to the unweighted mean.
Table A8. Comparison of log-likelihoods for Lévy alpha-stable, AST, and AEP5. For ease of reading, each log-likelihood value is normalized by dividing it by the number of observations per country. The mean refers to the average normalized log-likelihood across all 19 countries. The weighted mean is the normalized log-likelihood but weighted by the sample size of each respective country
Country | Obs. | Lévy | AST | AEP5 |
---|---|---|---|---|
Belgium | 1,042,148 | −5.040 | −5.044 | −5.111 |
Bulgaria | 794,664 | −4.001 | −3.998 | −4.313 |
Croatia | 426,082 | −5.961 | −5.957 | −6.049 |
Czech Republic | 753,188 | −7.504 | −7.506 | −7.686 |
Denmark | 136,724 | −7.254 | −7.254 | −7.343 |
Estonia | 195,983 | −4.009 | −4.001 | −4.055 |
Finland | 408,866 | −4.908 | −4.912 | −5.022 |
France | 2,396,914 | −4.858 | −4.862 | −4.947 |
Germany | 448,259 | −5.355 | −5.361 | −5.459 |
Hungary | 1,138,019 | −9.443 | −9.443 | −9.504 |
Italy | 3,970,425 | −4.865 | −4.873 | −4.948 |
Poland | 211,983 | −5.666 | −5.672 | −5.748 |
Portugal | 2,236,817 | −4.073 | −4.076 | −4.158 |
Romania | 1,756,871 | −4.825 | −4.820 | −4.874 |
Slovakia | 325,799 | −4.346 | −4.346 | −4.405 |
Slovenia | 244,853 | −4.181 | −4.176 | −4.215 |
Spain | 4,827,965 | −4.607 | −4.610 | −4.693 |
Sweden | 1,169,886 | −7.200 | −7.200 | −7.287 |
UK | 529,968 | −5.320 | −5.320 | −5.407 |
Mean | | −5.443 | −5.444 | −5.538 |
Size-weighted mean | | −5.180 | −5.183 | −5.272 |
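The two summary rows can be recomputed from the per-country values as a consistency check. The sketch below simply transcribes the rows of the table above; the variable and function names are hypothetical, and small discrepancies in the last digit can arise because the inputs are already rounded to three decimals.

```python
# (country, observations, Levy, AST, AEP5): normalized log-likelihoods per country
ROWS = [
    ("Belgium", 1042148, -5.040, -5.044, -5.111),
    ("Bulgaria", 794664, -4.001, -3.998, -4.313),
    ("Croatia", 426082, -5.961, -5.957, -6.049),
    ("Czech Republic", 753188, -7.504, -7.506, -7.686),
    ("Denmark", 136724, -7.254, -7.254, -7.343),
    ("Estonia", 195983, -4.009, -4.001, -4.055),
    ("Finland", 408866, -4.908, -4.912, -5.022),
    ("France", 2396914, -4.858, -4.862, -4.947),
    ("Germany", 448259, -5.355, -5.361, -5.459),
    ("Hungary", 1138019, -9.443, -9.443, -9.504),
    ("Italy", 3970425, -4.865, -4.873, -4.948),
    ("Poland", 211983, -5.666, -5.672, -5.748),
    ("Portugal", 2236817, -4.073, -4.076, -4.158),
    ("Romania", 1756871, -4.825, -4.820, -4.874),
    ("Slovakia", 325799, -4.346, -4.346, -4.405),
    ("Slovenia", 244853, -4.181, -4.176, -4.215),
    ("Spain", 4827965, -4.607, -4.610, -4.693),
    ("Sweden", 1169886, -7.200, -7.200, -7.287),
    ("UK", 529968, -5.320, -5.320, -5.407),
]


def column_means(rows, col):
    """Unweighted and size-weighted mean of one log-likelihood column."""
    values = [row[col] for row in rows]
    weights = [row[1] for row in rows]
    mean = sum(values) / len(values)
    weighted = sum(v * w for v, w in zip(values, weights)) / sum(weights)
    return mean, weighted


for name, col in (("Levy", 2), ("AST", 3), ("AEP5", 4)):
    mean, weighted = column_means(ROWS, col)
    print(f"{name}: mean={mean:.3f}, size-weighted mean={weighted:.3f}")
```

Because the size-weighted mean gives more weight to the largest samples (Spain, Italy, France, and Portugal), it is less negative than the unweighted mean, in which small samples with very negative values count equally.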
As shown in Fig. 6, the Lévy alpha-stable distribution is especially effective at modeling the tail part of the data. To emphasize this point, Fig. A3 displays the distribution of LP with Lévy alpha-stable and AST fits for all 19 countries. All distributions display 99.99% of observations. The Lévy alpha-stable distribution either outperforms the AST or performs equally well when modeling the tail parts for almost all country samples. Romania and Slovenia are notable exceptions where the Lévy model predicts a heavier tail.

Distribution of LP with Lévy alpha-stable and AST fits for all 19 countries. The solid line indicates the Lévy alpha-stable and the dashed line indicates the five-parameter AST fit. Each panel displays 99.99% of the observations for the country. The panels are intended only to show the fit and the shape, not to compare values between countries
Appendix 5 The disproportionate influence of UK non-profit and charitable organizations on the LP distribution
In the UK, NPOs and charitable organizations account for approximately 22% of the sample (Table A9), which is much higher than in other countries, where this figure is typically less than 0.1%. These organizations tend to have a much higher fraction of negative VA observations (about 18% for NPOs and about 95% for charitable organizations in Table A9, compared with about 5% for private limited companies), and therefore contribute disproportionately to the share of negative VA firms. Removing these organizations, the share of negative VA organizations falls from 8.75% to 5.38%.
Legal form . | Total obs. . | Prop.(%) . | Negative obs.(%) . | Weighted negative obs.(%) . |
---|---|---|---|---|
Charitable organization | 3796 | 0.72 | 94.92 | 0.68 |
Companies with unknown/unrecorded legal form | 22 | 0.00 | 9.09 | 0.00 |
Foreign companies | 189 | 0.04 | 17.46 | 0.01 |
Legal form unknown | 16 | 0.00 | 43.75 | 0.00 |
Limited company by guarantee | 733 | 0.14 | 54.30 | 0.07 |
Limited liability partnership—LLP | 157 | 0.03 | 3.82 | 0.00 |
NPOs | 114,802 | 21.66 | 17.99 | 3.90 |
Not companies act | 4 | 0.00 | 0.00 | 0.00 |
Other legal forms | 1158 | 0.22 | 7.51 | 0.02 |
Partnerships | 6869 | 1.30 | 3.01 | 0.04 |
Private limited companies | 370,489 | 69.91 | 5.24 | 3.66 |
Private limited company | 23,072 | 4.35 | 5.06 | 0.22 |
Public company AIM | 4 | 0.00 | 75.00 | 0.00 |
Public company not quoted | 630 | 0.12 | 10.00 | 0.01 |
Public company quoted | 15 | 0.00 | 46.67 | 0.00 |
Public company quoted OFEX | 4 | 0.00 | 0.00 | 0.00 |
Public limited companies | 7946 | 1.50 | 8.99 | 0.14 |
Unlimited company | 62 | 0.01 | 11.29 | 0.00 |
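The two shares quoted in the text (8.75% and 5.38%) can be recovered from the table itself. The sketch below does this under the assumption that the number of negative observations in each legal form equals its total observations times its reported negative percentage; the names `rows`, `EXCLUDED`, and `negative_share` are illustrative.

```python
# (legal form, total observations, negative observations in %), from the table above
rows = [
    ("Charitable organization", 3796, 94.92),
    ("Companies with unknown/unrecorded legal form", 22, 9.09),
    ("Foreign companies", 189, 17.46),
    ("Legal form unknown", 16, 43.75),
    ("Limited company by guarantee", 733, 54.30),
    ("Limited liability partnership - LLP", 157, 3.82),
    ("NPOs", 114_802, 17.99),
    ("Not companies act", 4, 0.00),
    ("Other legal forms", 1158, 7.51),
    ("Partnerships", 6869, 3.01),
    ("Private limited companies", 370_489, 5.24),
    ("Private limited company", 23_072, 5.06),
    ("Public company AIM", 4, 75.00),
    ("Public company not quoted", 630, 10.00),
    ("Public company quoted", 15, 46.67),
    ("Public company quoted OFEX", 4, 0.00),
    ("Public limited companies", 7946, 8.99),
    ("Unlimited company", 62, 11.29),
]

EXCLUDED = {"Charitable organization", "NPOs"}


def negative_share(rows):
    """Share (in %) of observations with negative VA in the given rows."""
    total = sum(n for _, n, _ in rows)
    negatives = sum(n * pct / 100 for _, n, pct in rows)
    return 100 * negatives / total


full_sample = negative_share(rows)
without_npos = negative_share([r for r in rows if r[0] not in EXCLUDED])
print(f"full sample: {full_sample:.2f}%")
print(f"excluding NPOs and charitable organizations: {without_npos:.2f}%")
```

The last column of the table, the weighted negative share, is the product of the two preceding columns, so summing it over all legal forms gives the full-sample figure.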
Here, we investigate how sample selection by legal form affects the results reported in Section 4.3. Figure A4 displays the time series of the Lévy parameters across three different groups: the entire sample (red dots, repeating from Fig. 8 in the main text), the sample excluding the NPOs & charitable companies (green triangles), and a set with only public and private companies (blue squares). The sample without NPOs & charitable organizations and the one with only private and public limited companies exhibit similar patterns, so we will focus on comparing the full sample and the sample that excludes only NPOs & charitable organizations.

Time series of Lévy parameters for different legal forms in the UK. The circle represents the entire sample. The triangle denotes the samples without “Charitable organizations” and “Non-profit organizations.” The square denotes the samples consisting only of “Private limited company,” “Private limited companies,” and “Public limited companies.” Note that Orbis categorizes “Private limited companies” into two separate groups. The unit of currency is the Pound sterling, £
The tail and skewness parameters |$-\alpha$| and β exhibit a more stable pattern in the sample excluding NPOs & charitable entities, though the overall qualitative results across both samples are consistent, with the distribution becoming more right-skewed and with a heavier tail. The location parameter δ is predictably higher in samples that omit NPOs & charitable organizations, as they tend to have lower LP, but the trend remains very similar. The most notable difference can be found in the scale parameter γ. The complete sample exhibits a decrease in γ due to an initially high level which then dropped dramatically between 2008 and 2010. In contrast, the sample without NPOs & charitable organizations begins with a lower γ that rises gradually. Both samples show an increase in γ between 2010 and 2016.
While sample selection affects the trends, the Lévy parameters appear to remain effective at capturing the dynamics.
Appendix 6 Output-based LP
VA can also be computed by deducting intermediate costs (MATE in Orbis) from gross output (TURN in Orbis). For each firm i in industry j and year t, we now define real VA (|$Y_{i,t}$|) as the difference between sales (|$X_{i,t}$|) and intermediate costs (|$M_{i,t}$|), that is
|$Y_{i,t} = X_{i,t} - M_{i,t} = \frac{\chi_{i,t} - m_{i,t}}{p_{j,c,t}^v},$|
where χ and m are nominal sales and nominal material costs, recorded in Orbis as TURN and MATE, respectively, and |$p_{j,c,t}^v$| is the VA deflator of industry j in country c at time t.
This method represents the output-based approach for calculating VA and thus LP. However, compared to wages and earnings, material costs are less frequently recorded in our sample. As shown in the first column of Table A10, on average across countries, 27.67% of observations do not report material costs. In some countries, such as the UK and Denmark, no firms report material costs at all.
The output approach to calculating VA does not eliminate negative values. The middle column of Table A10 indicates that, on average across countries, about 2% of observations have negative VA. Additionally, the output-based and income-based LP are strongly correlated, with an average correlation of about 0.7, as shown in the last column of Table A10.
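To make the output-based definition concrete, here is a minimal sketch that computes LP for a single firm-year. It assumes that nominal sales minus nominal material costs is deflated by a single VA deflator and then divided by the number of employees. The function name, its arguments, and the numbers are hypothetical, and a missing material-cost entry is represented by `None`.

```python
from typing import Optional


def output_based_lp(
    sales: float,
    materials: Optional[float],
    va_deflator: float,
    employees: float,
) -> Optional[float]:
    """Real value added per worker from nominal sales and material costs.

    Returns None when material costs are not reported, since output-based
    VA cannot be computed for that observation.
    """
    if materials is None:
        return None
    real_va = (sales - materials) / va_deflator
    return real_va / employees


# Hypothetical firm-years (monetary values in thousands)
print(output_based_lp(1200.0, 900.0, 1.25, 10))   # positive VA
print(output_based_lp(500.0, 650.0, 1.0, 5))      # material costs exceed sales
print(output_based_lp(800.0, None, 1.1, 8))       # material costs not reported
```

The second call shows why the output approach does not rule out negative values: whenever material costs exceed sales, VA and therefore LP are negative.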
Correlation between LP income and LP output. The correlation is absent for Denmark and the UK because there are no observations of LP output in these countries
Country . | Missing obs. (%) . | Negative obs. (%) . | Correlation . |
---|---|---|---|
Belgium | 87.18 | 2.60 | 0.44 |
Bulgaria | 6.85 | 1.86 | 0.61 |
Croatia | 0.30 | 6.47 | 0.81 |
Czech Republic | 0.64 | 4.54 | 0.76 |
Denmark | 100.00 | | |
Estonia | 5.41 | 2.02 | 0.76 |
Finland | 11.62 | 0.51 | 0.76 |
France | 17.01 | 0.20 | 0.76 |
Germany | 47.04 | 0.72 | 0.74 |
Hungary | 86.15 | 1.48 | 0.70 |
Italy | 4.07 | 2.10 | 0.70 |
Poland | 0.83 | 0.86 | 0.80 |
Portugal | 22.33 | 1.51 | 0.72 |
Romania | 2.76 | 3.52 | 0.60 |
Slovakia | 0.31 | 2.62 | 0.69 |
Slovenia | 0.51 | 0.29 | 0.72 |
Spain | 8.93 | 1.73 | 0.71 |
Sweden | 23.75 | 0.48 | 0.75 |
UK | 100.00 | | |
Mean | 27.67 | 1.97 | 0.71 |
6.1. Fit comparison
We examine the LP distributions in Fig. A5. The top graph illustrates France’s LP, while the bottom graph presents Italy’s. We selected France because it exemplifies a standard case in our study, even though it has 17% missing data, making its distribution a bit more irregular compared to the income-based LP shown in Fig. 6. Italy, on the other hand, was chosen due to its extensive sample size and minimal missing data, at just 4%.
Similar to the income-based LP, the output-based one also displays a unimodal, right-skewed, and heavy-tailed pattern. Although the left tail is lighter in the output-based LP compared to the income-based one, it still includes numerous observations below zero. The Lévy alpha-stable distribution fits well. It performs marginally better than the AST in terms of the size-weighted mean of the log-likelihood (Table A11).

Distribution of output-based LP with Lévy alpha-stable and AST fits for France and Italy. The solid line indicates the Lévy alpha-stable, and the dotted line indicates the five-parameter AST fit
Comparison of log-likelihoods for Lévy alpha-stable and AST based on output-based LP distribution. For ease of reading, each log-likelihood value is normalized by dividing it by the number of observations per country. The mean refers to the average normalized log-likelihood across the 17 countries with output-based LP data. The weighted mean is the normalized log-likelihood weighted by the sample size of each respective country. Denmark and the UK are missing because they have no observations of output-based LP
Country . | Obs. . | Lévy . | AST . |
---|---|---|---|
Belgium | 135,358 | −6.159 | −6.156 |
Bulgaria | 734,319 | −5.174 | −5.168 |
Croatia | 424,799 | −6.165 | −6.163 |
Czech Republic | 748,431 | −7.987 | −7.985 |
Denmark | | | |
Estonia | 185,510 | −4.353 | −4.347 |
Finland | 360,935 | −5.372 | −5.375 |
France | 2,000,243 | −5.314 | −5.318 |
Germany | 241,400 | −5.918 | −5.924 |
Hungary | 124,157 | −10.925 | −10.918 |
Italy | 3,811,418 | −5.672 | −5.676 |
Poland | 209,621 | −6.317 | −6.315 |
Portugal | 1,730,073 | −4.410 | −4.416 |
Romania | 1,707,778 | −5.491 | −5.484 |
Slovakia | 324,876 | −4.962 | −4.962 |
Slovenia | 243,159 | −4.896 | −4.894 |
Spain | 4,366,299 | −4.907 | −4.912 |
Sweden | 896,175 | −7.421 | −7.421 |
UK | | | |
Mean | | −5.967 | −5.967 |
Size-weighted Mean | | −5.493 | −5.495 |
Appendix 7 Derivation of the scaling of the sample standard deviation with sample size
Here, we provide a highly stylized, heuristic derivation for the scaling of the sample standard deviation with sample size in the case of a Lévy alpha-stable distributed random variable. The key to this phenomenon is that because the theoretical second moment is infinite, the larger the sample size, the higher is the chance that an extreme event is drawn. These extreme events are so extreme that they dominate the sum of squares from which the variance is computed. Thus, the larger the sample size, the larger the sample variance.
Sornette (2006) and Bouchaud and Potters (2003), for instance, provide more precise statements. Here, we expose the argument in the simplest, albeit non-rigorous way. We discuss the maximum, but analogous arguments apply to the minimum.
First, we note that for N large enough, the sample maximum will be dictated by the tail. A key characteristic of the Lévy alpha-stable distribution is that it has power law tails, that is, for large x (Nolan, 2020),
|$P(X \gt x) \sim x^{-\alpha}. \qquad \text{(A.3)}$|
Now, in a sample of size N, we would hardly expect to see an extreme value whose probability of occurring is less than |$1/N$|. Thus, we may define the “typical” value of the maximum as the value Xmax such that |$1/N=P(X \gt X_{\text{max}})$|. Using Eq. A.3, we have24 |$1/N \sim X_{\text{max}}^{-\alpha}$|, and solving for Xmax gives
|$X_{\text{max}} \sim N^{\frac{1}{\alpha}}. \qquad \text{(A.4)}$|
For simplicity, let us assume a mean of zero,25 so that the sample variance is just the average squared value,
|$\hat{\sigma}^2 = \frac{1}{N}\sum_{i=1}^{N} X_i^2. \qquad \text{(A.5)}$|
In a Lévy alpha-stable distribution with |$\alpha \lt 2$|, the square of the maximum (or minimum, if larger in absolute value) is so large that it dominates the entire sum of squares, such that we may approximate
|$\sum_{i=1}^{N} X_i^2 \approx X_{\text{max}}^2 \sim N^{\frac{2}{\alpha}}, \qquad \text{(A.6)}$|
where the last step uses Eq. A.4. Now, inserting Eq. A.6 into Eq. A.5 and taking the square root, we find that the standard deviation depends on the sample size as
|$\hat{\sigma} \sim N^{\frac{1}{\alpha}-\frac{1}{2}}.$|
Note that when α = 2, so that the distribution is Gaussian, the sample standard deviation does not increase with sample size, as expected.
See also the derivations leading to (4.52) in Sornette (2006) and the heuristic proof based on the rank-size rule in Gabaix (2009, Proof of Proposition 2). A rigorous proof (Gabaix, 2011; Cohen et al., 2020) simply uses the fact that since Xi is (by assumption) regularly varying with index α, it follows that |$X_i^2$| is regularly varying with index |$\alpha/2$| (Gabaix, 2009, e.g. Eq. 8), and we can apply the GCLT to the sum of |$X_i^2$| to obtain the right scaling. The rigorous proof makes it clear that we expect the sample standard deviation to exhibit substantial variations, since the convergence is in distribution to a Lévy stable variable.
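The scaling can be illustrated numerically. The sketch below is not the procedure used for the figure in the main text: it assumes a symmetric stable law (β = 0) simulated with the Chambers-Mallows-Stuck formula, uses the median rather than the average of the subsample standard deviations (the median is less noisy under heavy tails), and picks the sample sizes and number of repetitions arbitrarily. All function names are illustrative.

```python
import numpy as np


def symmetric_stable(alpha, size, rng):
    """Chambers-Mallows-Stuck draws from a symmetric (beta = 0) stable law, alpha != 1."""
    v = rng.uniform(-np.pi / 2, np.pi / 2, size)
    w = rng.exponential(1.0, size)
    return (
        np.sin(alpha * v) / np.cos(v) ** (1 / alpha)
        * (np.cos((1 - alpha) * v) / w) ** ((1 - alpha) / alpha)
    )


def std_scaling_exponent(alpha, sizes=(100, 1000, 10000), reps=200, seed=0):
    """Slope of log(median sample std) against log(N) across sample sizes."""
    rng = np.random.default_rng(seed)
    medians = []
    for n in sizes:
        samples = symmetric_stable(alpha, (reps, n), rng)
        medians.append(np.median(samples.std(axis=1)))
    slope, _intercept = np.polyfit(np.log(sizes), np.log(medians), 1)
    return slope


alpha = 1.33  # tail exponent estimated in the paper for the pooled sample
print("estimated exponent:", std_scaling_exponent(alpha))
print("predicted exponent 1/alpha - 1/2:", 1 / alpha - 0.5)
print("estimated exponent for alpha = 2:", std_scaling_exponent(2.0))
```

For α = 2 the formula reduces to a Gaussian draw, so the estimated exponent should be close to zero, in line with the remark above that the sample standard deviation does not grow with sample size in the Gaussian case.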
Appendix 8 Trapani’s (2016) procedure for testing for infinite moments
Trapani (2016) suggests a test for the divergence of arbitrary moments of order p, including fractional (non-integer) moments. Because the moment of order p may be infinite, it is not known whether the sample moment has a limiting distribution, so a test cannot be based on it directly. Following a randomized testing approach, the test instead adds artificial randomness to the sample moment of order p so that the resulting statistic has a known distribution if the moment is infinite. The test statistic can then be compared to this distribution to obtain a P-value for the null hypothesis that the moment of order p is infinite.
8.1. Test procedure
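The null hypothesis H0 is that the pth moment of the distribution is infinite, and the test statistic is derived under this hypothesis. For this, the approach starts with the absolute sample moment of order p, computed from a sample |$x_1, \dots, x_n$|,
|$A_p = \frac{1}{n}\sum_{k=1}^{n} |x_k|^p.$|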
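To make the resulting test statistic scale-invariant and comparable, the absolute moment must be rescaled,
|$A_p^\ast = \frac{A_p}{\left(A_{\psi}\right)^{p/\psi}} \times \frac{\left(A_{\psi}^{\mathcal{N}}\right)^{p/\psi}}{A_p^{\mathcal{N}}},$|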
with |$\psi \in (0,p)$|. |$A_p^{\mathcal{N}}$| denotes the pth absolute fractional lower-order moment of the standard normal distribution, |$A_{\psi}^{\mathcal{N}}$| the ψth absolute fractional lower-order moment of the standard normal, etc. Next, an artificial random sample ξ of size r26 is generated from a standard normal distribution and rescaled,
|$\varphi_r = \sqrt{e^{A_p^\ast}} \times \xi_r.$|
The intuition here is that φr follows a normal distribution with mean zero and a finite variance, as |$n \rightarrow \infty$|, if |$A_p^\ast$| is finite itself. Thus the problem has been reduced from testing for any moment p, to testing the existence of the variance of the transformed random variable φr.
The next step is to generate a sequence ζr, given by
|$\zeta_r(u) = I[\varphi_r \le u],$|
where |$I[\cdot]$| is an indicator function, and u ≠ 0 is any real number. Under H0, |$\zeta_r(u)$| will have a Bernoulli distribution with mean |$\frac{1}{2}$| and variance |$\frac{1}{4}$|. This is not the case under the alternative, where |$A_p^\ast \lt \infty$|, as |$e^{A_p^\ast}$| converges to a finite value.
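Values for u are picked from some density, but for simplicity they can be drawn from a uniform distribution |$U(-1,1)$|. As a result, the test statistic of interest is obtained as
|$\vartheta_{r}(u) = \frac{2}{\sqrt{r}}\sum_{j=1}^{r}\left[\zeta_j(u) - \frac{1}{2}\right], \qquad \Theta_{r} = \int \vartheta_{r}^2(u)\, dF(u),$|
where F is the distribution from which u is drawn.
Under the null hypothesis that moment p is infinite, |$\vartheta_{r}(u)$| is centered at zero, given that |$\zeta_j(u)$| has a mean of |$\frac{1}{2}$|. |$\Theta_{r}$| is thus shown to follow a χ2 distribution with df = 1 if moment p is infinite.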
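The steps of the procedure can be sketched in code as follows. This is an illustrative reading of the description above, not the implementation in the R package finity, and several choices are assumptions made only for the example: ψ = 0.5, r = 1000, an evenly spaced grid on [−1, 1] standing in for draws of u from U(−1, 1), and a cap on the exponent to avoid numerical overflow.

```python
import math

import numpy as np


def abs_moment_std_normal(q):
    """E|Z|^q for a standard normal Z."""
    return 2 ** (q / 2) * math.gamma((q + 1) / 2) / math.sqrt(math.pi)


def randomized_moment_test(x, p=2.0, psi=0.5, r=1000, n_u=20, seed=0):
    """Return (Theta, P-value) for H0: the moment of order p is infinite.

    psi, r and the grid of u values are illustrative assumptions.
    """
    x = np.abs(np.asarray(x, dtype=float))
    a_p = np.mean(x ** p)
    a_psi = np.mean(x ** psi)
    # Rescaled absolute moment A_p^*
    a_star = (
        a_p / a_psi ** (p / psi)
        * abs_moment_std_normal(psi) ** (p / psi) / abs_moment_std_normal(p)
    )
    # sqrt(exp(A_p^*)), with the exponent capped to avoid overflow (assumption)
    scale = math.exp(min(a_star / 2, 700.0))
    rng = np.random.default_rng(seed)
    phi = scale * rng.standard_normal(r)
    # Even number of grid points, so u = 0 is excluded
    u_grid = np.linspace(-1.0, 1.0, n_u)
    zeta = phi[:, None] <= u_grid[None, :]
    vartheta = 2 / math.sqrt(r) * (zeta.sum(axis=0) - r / 2)
    theta = float(np.mean(vartheta ** 2))
    # Survival function of a chi-squared variable with one degree of freedom
    p_value = math.erfc(math.sqrt(theta / 2))
    return theta, p_value


data_rng = np.random.default_rng(1)
heavy = data_rng.standard_cauchy(20_000)   # stable with alpha = 1: infinite variance
light = data_rng.standard_normal(20_000)   # finite variance
print("Cauchy sample:", randomized_moment_test(heavy))
print("Gaussian sample:", randomized_moment_test(light))
```

Because the test is randomized, the P-value for the heavy-tailed sample changes with the seed of the artificial sample, whereas the Gaussian sample should be rejected for essentially any seed.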

Performance test for Trapani’s finite moment test conducted using the R package finity. Distribution of χ2 values and P-values given by the test for the divergence of the second moment (k = 2) for samples of size |$N=100,000$| drawn from a Lévy alpha-stable distribution |$S(\alpha, 0.5, 1.0, 0.0; 0)$| for three different values of |$\alpha \in \{1.2, 1.6, 2.0\}$|
8.2. Performance of the test
Trapani’s test depends on a number of parameters; the choice of ψ, r, and u can influence the test results. We largely follow Trapani (2016) in our choices of the parameter values and choose
The test results should be expected to be somewhat noisy. In particular, for samples drawn from a Lévy alpha-stable distribution |$S(\alpha, \dots)$|, the test loses accuracy when the order k of the tested moment is close to the true tail parameter, |$k\approx \alpha$|.
For the tests performed here, we use the R package finity (Heinrich and Winkler, 2020). Figure A6 shows the distribution of χ2 statistics and P-values for a test of divergence of the second moment (k = 2), for samples of size |$N=100,000$| drawn from a Lévy alpha-stable distribution |$S(\alpha, 0.5, 1.0, 0.0; 0)$| for three different values of α. The results in Fig. A6 reflect a typical pattern of performance of our implementation of the test at relatively large sample sizes (N > 1000), where the test tends to seriously over-reject the null even for values of |$2\gt \alpha \gt 1.6$|, reinforcing the robustness of our conclusion that productivity has an infinite variance.
8.3. Finite moment test results for country-year sample
Testing for infinite moments of firm-level labor productivity. P-values of Trapani’s finite moment test for the nonexistence of the second moment of the distribution of LP, for country-year samples. In the vast majority of cases, the P-value is above any sensible threshold (e.g. 1%, 5%, or 10%), suggesting that the nonexistence of the second moment cannot be rejected. For example, 84% of the P-values are greater than 0.05, meaning that, for 84% of country-year samples, the finite moment test failed to reject the null hypothesis of the infinite second moment at the 5% significance level
Country . | 2006 . | 2007 . | 2008 . | 2009 . | 2010 . | 2011 . | 2012 . | 2013 . | 2014 . | 2015 . | 2016 . | 2017 . |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Belgium | 1.00 | 0.30 | 0.72 | 0.00 | 0.00 | 0.00 | 0.27 | 0.08 | 0.95 | 0.98 | 0.99 | 0.98 |
Bulgaria | 0.98 | 0.86 | 0.91 | 1.00 | 1.00 | 1.00 | 0.99 | 0.93 | 0.99 | 0.92 | 1.00 | |
Croatia | 1.00 | 1.00 | 1.00 | 1.00 | 0.99 | 0.99 | 0.98 | 0.99 | 0.92 | 0.00 | 0.32 | |
Czech Republic | 0.99 | 0.97 | 0.99 | 1.00 | 1.00 | 0.99 | 0.83 | 1.00 | 0.65 | 0.52 | 1.00 | 0.99 |
Denmark | 1.00 | 1.00 | 1.00 | – | 0.99 | 0.99 | ||||||
Estonia | 1.00 | 0.03 | 0.01 | 1.00 | 0.42 | 0.00 | 0.09 | 0.00 | 0.00 | 0.00 | 0.00 | |
Finland | 0.10 | 0.08 | 0.63 | 0.08 | 0.28 | 0.59 | 0.63 | 0.82 | 0.92 | 0.62 | 0.76 | 1.00 |
France | 0.08 | 0.32 | 1.00 | 0.99 | 1.00 | 1.00 | 0.99 | 0.99 | 0.24 | 0.99 | 0.99 | 1.00 |
Germany | 1.00 | 0.99 | 1.00 | 1.00 | 1.00 | 0.99 | 0.99 | 1.00 | – | – | 1.00 | |
Hungary | 1.00 | 0.98 | 1.00 | 0.99 | 0.99 | – | – | 0.99 | 0.99 | 1.00 | 0.99 | |
Italy | 0.99 | 0.44 | 0.93 | 0.99 | 0.67 | 0.51 | 1.00 | 0.00 | 0.00 | 0.00 | 0.87 | 0.00 |
Poland | 1.00 | 1.00 | 0.59 | 0.99 | 0.99 | 0.98 | 0.94 | 1.00 | 0.96 | |||
Portugal | 0.95 | 0.95 | 1.00 | 1.00 | 0.99 | 0.99 | 1.00 | 1.00 | 1.00 | 1.00 | 0.99 | 0.99 |
Romania | 1.00 | 0.99 | 0.99 | 0.90 | 0.35 | 1.00 | 1.00 | 0.74 | 0.99 | 1.00 | ||
Slovakia | 0.98 | 0.99 | 1.00 | 0.16 | 0.37 | 1.00 | 0.43 | 0.78 | 0.00 | 0.86 | ||
Slovenia | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.02 | 0.00 | ||||
Spain | 0.28 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
Sweden | 0.00 | 0.95 | 1.00 | 1.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.93 | 0.00 | 0.00 |
UK | 0.98 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | – | 0.99 | – | 1.00 | 0.99 |
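The share of P-values above a threshold, as quoted in the caption, can be recomputed directly from the table. The sketch below parses the rows as printed, skipping empty cells and cells marked “–”; the names `TABLE` and `parse_p_values` are illustrative.

```python
TABLE = """
Belgium | 1.00 | 0.30 | 0.72 | 0.00 | 0.00 | 0.00 | 0.27 | 0.08 | 0.95 | 0.98 | 0.99 | 0.98 |
Bulgaria | 0.98 | 0.86 | 0.91 | 1.00 | 1.00 | 1.00 | 0.99 | 0.93 | 0.99 | 0.92 | 1.00 | |
Croatia | 1.00 | 1.00 | 1.00 | 1.00 | 0.99 | 0.99 | 0.98 | 0.99 | 0.92 | 0.00 | 0.32 | |
Czech Republic | 0.99 | 0.97 | 0.99 | 1.00 | 1.00 | 0.99 | 0.83 | 1.00 | 0.65 | 0.52 | 1.00 | 0.99 |
Denmark | 1.00 | 1.00 | 1.00 | – | 0.99 | 0.99 | ||||||
Estonia | 1.00 | 0.03 | 0.01 | 1.00 | 0.42 | 0.00 | 0.09 | 0.00 | 0.00 | 0.00 | 0.00 | |
Finland | 0.10 | 0.08 | 0.63 | 0.08 | 0.28 | 0.59 | 0.63 | 0.82 | 0.92 | 0.62 | 0.76 | 1.00 |
France | 0.08 | 0.32 | 1.00 | 0.99 | 1.00 | 1.00 | 0.99 | 0.99 | 0.24 | 0.99 | 0.99 | 1.00 |
Germany | 1.00 | 0.99 | 1.00 | 1.00 | 1.00 | 0.99 | 0.99 | 1.00 | – | – | 1.00 | |
Hungary | 1.00 | 0.98 | 1.00 | 0.99 | 0.99 | – | – | 0.99 | 0.99 | 1.00 | 0.99 | |
Italy | 0.99 | 0.44 | 0.93 | 0.99 | 0.67 | 0.51 | 1.00 | 0.00 | 0.00 | 0.00 | 0.87 | 0.00 |
Poland | 1.00 | 1.00 | 0.59 | 0.99 | 0.99 | 0.98 | 0.94 | 1.00 | 0.96 | |||
Portugal | 0.95 | 0.95 | 1.00 | 1.00 | 0.99 | 0.99 | 1.00 | 1.00 | 1.00 | 1.00 | 0.99 | 0.99 |
Romania | 1.00 | 0.99 | 0.99 | 0.90 | 0.35 | 1.00 | 1.00 | 0.74 | 0.99 | 1.00 | ||
Slovakia | 0.98 | 0.99 | 1.00 | 0.16 | 0.37 | 1.00 | 0.43 | 0.78 | 0.00 | 0.86 | ||
Slovenia | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.02 | 0.00 | ||||
Spain | 0.28 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
Sweden | 0.00 | 0.95 | 1.00 | 1.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.93 | 0.00 | 0.00 |
UK | 0.98 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | – | 0.99 | – | 1.00 | 0.99 |
"""


def parse_p_values(table):
    """Collect all numeric cells, ignoring the country name, blanks and '–'."""
    values = []
    for line in table.strip().splitlines():
        cells = [cell.strip() for cell in line.split("|")[1:]]
        values.extend(float(cell) for cell in cells if cell not in ("", "–"))
    return values


p_values = parse_p_values(TABLE)
for threshold in (0.01, 0.05, 0.10):
    share = sum(p > threshold for p in p_values) / len(p_values)
    print(f"share of P-values above {threshold:.2f}: {share:.1%}")
```

Since the printed P-values are rounded to two decimals, the recomputed shares are approximate; at the 5% level the result should be close to the 84% reported in the caption.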
Country . | 2006 . | 2007 . | 2008 . | 2009 . | 2010 . | 2011 . | 2012 . | 2013 . | 2014 . | 2015 . | 2016 . | 2017 . |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Belgium | 1.00 | 0.30 | 0.72 | 0.00 | 0.00 | 0.00 | 0.27 | 0.08 | 0.95 | 0.98 | 0.99 | 0.98 |
Bulgaria | 0.98 | 0.86 | 0.91 | 1.00 | 1.00 | 1.00 | 0.99 | 0.93 | 0.99 | 0.92 | 1.00 | |
Croatia | 1.00 | 1.00 | 1.00 | 1.00 | 0.99 | 0.99 | 0.98 | 0.99 | 0.92 | 0.00 | 0.32 | |
Czech Republic | 0.99 | 0.97 | 0.99 | 1.00 | 1.00 | 0.99 | 0.83 | 1.00 | 0.65 | 0.52 | 1.00 | 0.99 |
Denmark | 1.00 | 1.00 | 1.00 | – | 0.99 | 0.99 | ||||||
Estonia | 1.00 | 0.03 | 0.01 | 1.00 | 0.42 | 0.00 | 0.09 | 0.00 | 0.00 | 0.00 | 0.00 | |
Finland | 0.10 | 0.08 | 0.63 | 0.08 | 0.28 | 0.59 | 0.63 | 0.82 | 0.92 | 0.62 | 0.76 | 1.00 |
France | 0.08 | 0.32 | 1.00 | 0.99 | 1.00 | 1.00 | 0.99 | 0.99 | 0.24 | 0.99 | 0.99 | 1.00 |
Germany | 1.00 | 0.99 | 1.00 | 1.00 | 1.00 | 0.99 | 0.99 | 1.00 | – | – | 1.00 | |
Hungary | 1.00 | 0.98 | 1.00 | 0.99 | 0.99 | – | – | 0.99 | 0.99 | 1.00 | 0.99 | |
Italy | 0.99 | 0.44 | 0.93 | 0.99 | 0.67 | 0.51 | 1.00 | 0.00 | 0.00 | 0.00 | 0.87 | 0.00 |
Poland | 1.00 | 1.00 | 0.59 | 0.99 | 0.99 | 0.98 | 0.94 | 1.00 | 0.96 | |||
Portugal | 0.95 | 0.95 | 1.00 | 1.00 | 0.99 | 0.99 | 1.00 | 1.00 | 1.00 | 1.00 | 0.99 | 0.99 |
Romania | 1.00 | 0.99 | 0.99 | 0.90 | 0.35 | 1.00 | 1.00 | 0.74 | 0.99 | 1.00 | ||
Slovakia | 0.98 | 0.99 | 1.00 | 0.16 | 0.37 | 1.00 | 0.43 | 0.78 | 0.00 | 0.86 | ||
Slovenia | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.02 | 0.00 | ||||
Spain | 0.28 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
Sweden | 0.00 | 0.95 | 1.00 | 1.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.93 | 0.00 | 0.00 |
UK | 0.98 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | – | 0.99 | – | 1.00 | 0.99 |
Testing for infinite moments of firm-level labor productivity. P-values of Trapani’s finite moment test for the nonexistence of the second moment of the distribution of LP, for country-year samples. In the vast majority of cases, the P-value is above any sensible threshold (e.g. 1%, 5%, or 10%), suggesting that the nonexistence of the second moment cannot be rejected. For example, 84% of the P-values are greater than 0.05, meaning that, for 84% of country-year samples, the finite moment test failed to reject the null hypothesis of the infinite second moment at the 5% significance level
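The share reported in the caption is a simple count: the fraction of country-year P-values that exceed the chosen significance level. The following is a minimal sketch of that calculation, assuming only three rows copied from the table (Finland, Slovenia, and Spain), so it does not reproduce the 84% figure for the full sample. The helper name `share_not_rejected` is hypothetical and is not taken from the authors' code.

```python
# P-values for three countries, copied from the table above.
# This is an illustrative subset, not the full set of country-year samples.
p_values = {
    "Finland": [0.10, 0.08, 0.63, 0.08, 0.28, 0.59,
                0.63, 0.82, 0.92, 0.62, 0.76, 1.00],
    "Slovenia": [0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.02, 0.00],
    "Spain": [0.28, 1.00, 1.00, 1.00, 1.00, 1.00,
              1.00, 1.00, 1.00, 1.00, 1.00, 1.00],
}


def share_not_rejected(values, level=0.05):
    """Fraction of samples whose P-value exceeds `level`.

    For these samples the null hypothesis of an infinite second
    moment is not rejected at the given significance level.
    """
    return sum(p > level for p in values) / len(values)


# Share per country, then the share over the pooled subset.
by_country = {c: share_not_rejected(v) for c, v in p_values.items()}
pooled = [p for row in p_values.values() for p in row]
pooled_share = share_not_rejected(pooled)

for country, share in by_country.items():
    print(f"{country}: {share:.2f}")
print(f"Pooled subset: {pooled_share:.2f}")
```

In this subset, Slovenia is the only country for which the null hypothesis is rejected in every year, whereas it is never rejected at the 5% level for Finland or Spain. Applying the same function to all rows of the table, with missing entries skipped, would give the pooled share discussed in the caption.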