Corporate R&D Intensity Decomposition: Different Data, Different Results?

Research and Development (R&D) indicators are used to facilitate international comparisons and as targets for research and innovation policy. An example of such an indicator is R&D intensity. The decomposition of the aggregate corporate R&D intensity is able to explain the differences in R&D intensity between countries by determining whether they are the result of firms’ underinvestment in R&D or of differences across sectors. Despite its importance, the literature on corporate R&D intensity decomposition has been developed only recently. This paper reviews for the first time the different methodological frameworks of corporate R&D intensity decomposition and how they are used in practice, shedding light on why empirical results sometimes seem to be contradictory. It inspects how the use of different data sources and analytical methods affects R&D intensity decomposition results, and what the analytical and policy implications are. The paper also provides methodological and analytical guidance to analysts and policymakers.


Introduction
Research and development (R&D) expenditures and intensity indicators have long been regarded as central for growth, productivity and competitiveness by both policy makers and innovation analysts (Schumpeter, 1949; Griliches, 1979; Romer, 1990; Coccia, 2008; Ugur et al., 2016).
R&D intensity targets are one of the main objectives of the EU's research and innovation policy agenda. The Barcelona Target (set in 2002 building on the Lisbon Strategy of 2000, and reaffirmed in the Europe 2020 strategy of 2014) states that the EU should spend 3% of GDP on R&D, two-thirds of which should come from the private sector.
To this end, in 2002 the European Commission set up a benchmarking exercise which revealed that the EU was not performing at the same level as its main competing economies, notably the US and Japan. In the EU, only 1.9% of GDP was being invested in R&D, compared with 2.7% in Japan and 2.98% in the US; in other words, there was an 'R&D intensity gap' (European Commission, 2003). As a result of this benchmarking exercise, a target for EU R&D intensity was set in an effort to close the gap (Sheehan and Wyckoff, 2003); unfortunately, the gap today is still similar to that of 2002.
The main reason for such interest in R&D intensity (and its gap) lies in the considerable contribution of R&D investment to productivity and competitiveness across firms, sectors, countries and types of technology. The ultimate aim is to meet long-term socio-economic needs such as sustainable growth and wellbeing. In fact, the literature suggests that, in accordance with endogenous Schumpeterian growth theory, productivity growth is positively influenced by R&D expenditure (Schumpeter, 1949; Griliches, 1994; Bartelsman et al., 2019). The increased efficiency due to higher productivity (as a result of an accelerated rate of technological change) allows for greater competitiveness of firms and of the economy as a whole (Ortega-Argilés et al., 2014; Montresor and Vezzani, 2015; Castellani et al., 2019).
One of the approaches that has been developed and used by scholars and policy analysts to investigate the EU R&D intensity gap is the 'decomposition' of the R&D intensity gap into differences between countries and sectors. Such decomposition was originally conceived to evaluate the extent to which changes in aggregate R&D intensity can be explained by changes in industrial structure (van Reenen, 1997).
For policy purposes, it is particularly important to determine whether the differences between countries/regions are intrinsic (for example, a result of firms' underinvestment in R&D) or structural (for example, attributable to the sector composition of an economy).
Despite the relevance of such an analytical exercise, the theoretical and methodological frameworks for decomposing countries' R&D intensity have been developed only recently and are still not extensively used in the rather limited literature, which often offers contradictory results (Becker and Hall, 2013). Indeed, the examination of firms' R&D intensity in different industries, and at different levels of aggregation, has led to results that are mixed and not completely understood (De Panizza and De Prato, 2009; Moncada-Paternò-Castello et al., 2010; Bjørnskov and Foss, 2016).
The use (and often misuse) of R&D intensity decomposition therefore has a considerable impact on economic analysis and policy assessment practices, and on the measures that follow from such assessments.
In reviewing and comparing the existing studies on the decomposition of business R&D intensity, this paper contributes to the literature on R&D policy by explaining whether differences in R&D intensity found between countries are intrinsic (e.g. due to firms' underinvestment in R&D, as in Erken, 2008; Gumbau-Albert and Maudos, 2013) or structural (e.g. due to differences in the sectors, as in Moncada-Paternò-Castello et al., 2010; Stančik and Biagi, 2015). This paper constitutes the first attempt to identify the main reasons why most studies, despite relying on very similar data and similar methodological approaches, obtain different results.
Furthermore, as a key contribution to the literature, this paper provides guidelines for the correct use of R&D intensity decomposition by suggesting which data and methods are best used, depending on the analytical aim. It also helps to ascertain the reliability of the results and provides recommendations for their analytical interpretation, including in terms of policy implications.
This paper is structured as follows. Section 2 surveys the literature on the main determinants of corporate R&D intensity, the methodologies used to decompose corporate R&D intensity and their main empirical results; section 3 discusses the main findings, including the reasons for the contrasting results, and illustrates the implications (impact) for the quality of the comparisons derived; section 4 provides the recommendations on data and methodology to decompose corporate R&D intensity, offering a practical guideline for analysts and policymakers; section 5 concludes by making some remarks relevant for analysts and policymakers, and suggesting potential avenues for future research in this area.

R&D intensity decomposition literature: theoretical framework, methodology and empirical findings
The R&D intensity decomposition methodology is based on a solid theoretical framework of the main determinants of corporate R&D intensity, built up by an extensive literature and complemented by empirical R&D intensity decomposition results.

Theoretical framework of the determinants of corporate R&D intensity
Economic theory indicates that knowledge development (Schumpeter, 1949) and technical change (Solow, 1957) are the major sources of productivity growth in the long term. R&D is a major source of technical change (Romer, 1990; Guellec and van Pottelsberghe de la Potterie, 2001), and this is recognised as a key element for increasing the knowledge base and, with it, the growth, productivity and competitiveness of an economy (Mowery and Rosenberg, 1989; Coccia, 2008; Ugur et al., 2016). The effects of 'micro-macro convergence' of private and public (social) drivers in the implementation and promotion of corporate R&D activities hold potential returns not only in productivity, but also in profitability, sales, market capitalisation, employment growth, competitiveness and socio-economic welfare (see, for example, Morbey and Reithner, 1990; Griliches, 1979, 1994; Cincera et al., 2009a; Hall et al., 2010). As regards the firm-level dimension, the theoretical framework of determinants of corporate R&D intensity indicates that the total corporate R&D intensity of a given economy (country) depends on both the structural (sector) composition effect and the intrinsic effect (Pakes and Schankerman, 1984; Erken, 2008; Gorg and Greenaway, 2003; Mathieu and van Pottelsberghe de la Potterie, 2010; Vivarelli, 2013; Becker and Hall, 2013).
Electronic copy available at: https://ssrn.com/abstract=3683943
They argue that the structural factors affecting an economy can be exogenous or endogenous. Endogenous factors are characteristics typical of a given industry sector(s), while exogenous factors are usually external to the sector(s) and the country's macro-economic system.
On the other hand, the intrinsic factors are those that determine the characteristics of the firm(s) and its behaviour, for example the firm's knowledge, financial capacity or strategy and its R&D investment.
However, structural endogenous factors are also, at least to some extent, dependent on intrinsic factors (Erken and van Es, 2007). In other words, the sectoral structure of a country depends not only on, for example, historical industrial footprints, but also (and especially) on the country's aggregate capacity to be successful in technological development or in competition for technology markets and on its collective capacity for R&D-led growth. We should add that structural factors can influence firm-intrinsic factors; for example, although firms' access to government funding for R&D depends on their strategy and their ability (intrinsic factors) to successfully obtain such funding, it is conditional on such public incentives (innovation and industrial policies) being available in the first place. Other (structural) factors, such as market characteristics (e.g. technology-related product and demand and the access to human capital), influence corporate R&D intensity (Erken, 2008; Mathieu and van Pottelsberghe de la Potterie, 2010).
The literature attempting to determine reasons for differences in R&D investment and intensity between economies is extensive. In sum, the main findings from this literature (e.g. Bartelsman et al., 2019; Capone et al., 2019; Ortega-Argilés et al., 2015; Becker and Hall, 2013) focus on three main arguments: (i) productivity as one of the key drivers linking structural and intrinsic factors, (ii) structural endogenous factors and (iii) the intrinsic factors determining corporate R&D intensity. For an extensive theoretical discussion of these main determinants of corporate R&D intensity we refer the reader to the work by Moncada-Paternò-Castello (2017; pp. 18-25).

Basic method to calculate the decomposition of R&D intensity
Therefore, much of the scientific literature on the determinants of the business R&D intensity gap focuses on whether differences in R&D intensity are due to differences in firms' investment (intrinsic effect), or whether they simply reflect differences in the sectoral composition of the economy (structural effect). To this end, several methodologies to decompose the R&D intensity gap have been developed for benchmarking purposes.
The methodological starting point of this literature is the seminal work of van Reenen (1997), which defined R&D gap decomposition as "a straightforward accounting exercise" (p. 497). Most recent studies on the R&D gap start from the same basic equation when decomposing R&D intensity:

$$RDI_X - RDI_Z = \sum_i RDI_{X,i}\,S_{X,i} - \sum_i RDI_{Z,i}\,S_{Z,i}$$

where $RDI$ is R&D intensity and $S_{X,i}$ (or $S_{Z,i}$) is the share of sales (or VA) of sector $i$ in country/region $X$ or country/region $Z$.

The following subsections review the empirical findings concerning the causes of the EU corporate R&D intensity gap by intrinsic (firm-level) and sectoral effects.
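As a hedged illustration of the accounting exercise described above, one common way to split the gap into a structural and an intrinsic term is sketched below. The notation ($RDI_{X,i}$ for the R&D intensity of sector $i$ in country/region $X$, $S_{X,i}$ for that sector's share of sales or VA) and the particular choice of weights are assumptions made for illustration; individual studies may weight the two terms differently or add an interaction term.

```latex
\begin{align}
RDI_X - RDI_Z
  &= \sum_i RDI_{X,i}\, S_{X,i} - \sum_i RDI_{Z,i}\, S_{Z,i} \\
  &= \underbrace{\sum_i RDI_{Z,i}\,\bigl(S_{X,i} - S_{Z,i}\bigr)}_{\text{structural effect}}
   + \underbrace{\sum_i S_{X,i}\,\bigl(RDI_{X,i} - RDI_{Z,i}\bigr)}_{\text{intrinsic effect}}
\end{align}
```

Expanding the two sums shows that the cross terms $\sum_i RDI_{Z,i} S_{X,i}$ cancel, so the identity holds exactly for any data. The first term is non-zero only when sector shares differ, the second only when sector-level intensities differ.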

Structural effects
A strand of researchers suggests that the R&D investment gap is mainly due to the structure of the economy (i.e. sectoral composition or structural effects). Scherer (1967) is one of the first empirical studies to show that most R&D intensity differences can be explained by industry fixed effects. His empirical findings are then confirmed by those of Cohen et al. (1987), whose study shows that sectoral composition explains half of the R&D intensity differences across firms.
Moreover, several scholars have investigated the reasons for the commonly observed EU-US R&D investment gap, and the prevailing opinion is that technological specialisation is one of the main factors underpinning the EU R&D investment gap. Pavitt and Soete (1982) find that one of the main factors underpinning this phenomenon was the high degree of international specialisation in individual EU Member States. Mathieu and van Pottelsberghe de la Potterie (2010) focus their analysis of the business R&D (BERD) intensity gap on 10 EU Member States and 20 manufacturing sectors in 1991-2002 and conclude that it is mainly driven by the degree of specialisation in R&D-intensive industries. Their findings suggest that specialisation in high R&D intensity sectors is the reason why R&D intensity is higher in some EU Member States than in others. Reinstaller and Unterlass (2012), also using BERD panel data, analysed the development of R&D intensity in the EU-27 countries and some relevant non-EU countries over the period 2004-2007. They find that changes in aggregate BERD figures are driven by structural changes and within-sector changes, with different speeds of change, depending on countries and sectors. In partial contrast to the above findings, Fagerberg et al. (2014) report evidence of convergence in the technological capabilities of US and EU countries during the 1998-2008 period. Their study also suggests that social capabilities, such as a well-developed public knowledge infrastructure, are a condition for the growth of such technological capabilities.
Van Ark et al. (2003) observe that, in the US, expenditure on R&D outside the manufacturing sector has been increasing since the mid-1990s and now accounts for about one-third of total R&D expenditure (from less than one-fifth in 1995). These authors also note that growth in services R&D has been slower in Europe than in the US and that the gap is probably explicable by the fact that ICT diffusion has been slower in Europe than in the US.
Empirical studies on the R&D intensity gap have used a plethora of data sources, often finding different results. For example, Gumbau-Albert and Maudos (2013) used the EU-KLEMS (EU-level analysis of capital K, labour L, energy E, materials M and service S inputs) database to calculate R&D capital stock (rather than expenditures) with the aim of investigating differences in the technological capital intensity of various industries in 11 EU countries and the US. Their study finds that the technological gap in favour of the US until 1995 is explained by the greater accumulation of technological capital in most of the sectors considered. However, from 1995 onwards, a change in productive specialisation occurred: a significant drop in the relative economic importance of low-tech industries in the EU-11 economy was accompanied by a drop in the relative importance of some medium-tech industries in the US. This resulted in a reduction in the technological gap between the EU and the US. Gumbau-Albert and Maudos (2013) also find that differences in the productive structure of European countries explain most of the differences in technological capital intensity. Another recent decomposition analysis (Foster-McGregor et al., 2013) finds that differences in R&D intensity among manufacturing firms in seven EU Member States, the US and Japan are mainly driven by the intensity effect. Industry structure (composition effect) plays a role in some EU Member States but is never the primary factor. However, the authors suggest that the relative importance of the composition effect and the intensity effect in a decomposition exercise depends on the level of aggregation of the industries, and they recognise that a more detailed industry breakdown would assign greater importance to the composition effect, assuming that companies in the same sub-sector are closer in terms of R&D intensity (in line with Pavitt, 1984).
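The dependence on the level of industry aggregation noted above can be illustrated with a minimal numerical sketch. All figures, sector names and the grouping are hypothetical, and the weighting inside `decompose` is one common convention assumed here, not necessarily the one used in any of the cited studies. The data are constructed so that both countries have identical R&D intensity within every sub-sector and differ only in their sub-sector mix.

```python
from collections import defaultdict

def decompose(x, z):
    """Split the R&D intensity gap (X minus Z) into structural and intrinsic parts.

    x and z map sector -> (sales, rd). The weighting is an assumption of this
    sketch: Z's intensities weight the share differences, and X's shares
    weight the intensity differences.
    """
    total_x = sum(sales for sales, _ in x.values())
    total_z = sum(sales for sales, _ in z.values())
    structural = intrinsic = 0.0
    for sector, (sales_x, rd_x) in x.items():
        sales_z, rd_z = z[sector]
        share_x, share_z = sales_x / total_x, sales_z / total_z
        rdi_x, rdi_z = rd_x / sales_x, rd_z / sales_z
        structural += rdi_z * (share_x - share_z)
        intrinsic += share_x * (rdi_x - rdi_z)
    return structural, intrinsic

def aggregate(data, groups):
    """Sum sales and R&D of sub-sectors into the broader sectors given by groups."""
    totals = defaultdict(lambda: [0.0, 0.0])
    for sub_sector, (sales, rd) in data.items():
        totals[groups[sub_sector]][0] += sales
        totals[groups[sub_sector]][1] += rd
    return {sector: tuple(values) for sector, values in totals.items()}

# Hypothetical data: sub-sector -> (sales, R&D).
country_x = {"pharma": (10, 1.5), "chemicals": (40, 1.2),
             "software": (5, 0.6), "retail": (45, 0.45)}
country_z = {"pharma": (30, 4.5), "chemicals": (20, 0.6),
             "software": (20, 2.4), "retail": (30, 0.3)}
# Hypothetical two-sector grouping of the four sub-sectors.
groups = {"pharma": "chem_pharma", "chemicals": "chem_pharma",
          "software": "services", "retail": "services"}

fine = decompose(country_x, country_z)
coarse = decompose(aggregate(country_x, groups), aggregate(country_z, groups))
print("fine   (structural, intrinsic):", fine)
print("coarse (structural, intrinsic):", coarse)
```

In this constructed case the total gap of about -4.05 percentage points is attributed entirely to the structural term at the four-sub-sector level and entirely to the intrinsic term at the two-sector level, which is an extreme version of the sensitivity the authors describe.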
Iorwerth (2005) undertakes a detailed decomposition following Diewert (2005) to shed light on differences between Canadian and US R&D intensities across industries. Using the OECD-STAN database for industrial analysis and the OECD R&D Expenditure in Industry database, his results indicate that, despite the high R&D intensity found in some sectors, their smaller relative size and the low R&D intensity in the motor vehicle and service sectors were responsible for the low aggregate R&D intensity in Canada compared with the US.
Many other scholars have focused on the comparison between the US and the EU (Ciupagea and Moncada-Paternò-Castello, 2006; O'Sullivan, 2007; Guellec and Sachwald, 2008), suggesting that the EU private R&D investment deficit is mainly due to a sectoral composition effect.
These studies find that the R&D intensity difference could be attributed to the fact that the ICT sector is smaller in the EU than in the US. This recurring empirical result confirms the findings of several other studies that base their analyses on samples from the EU R&D Scoreboard data (GFII, 2007; European Commission, 2007, 2008; Moncada-Paternò-Castello et al., 2010; Cincera and Veugelers, 2013). In particular, Moncada-Paternò-Castello et al. (2010) find that structural effects accounted for 85% of the gap between the EU and the US, while only 15% of the gap can be attributed to intrinsic effects. They also find that the majority of EU R&D investment is concentrated in a relatively small number of firms operating in sectors that are generally of lower R&D intensity than in the US. Cincera and Veugelers (2013) investigate the role of firms' age in the corporate R&D intensity gap between the EU and the US. They find that 55% of the EU gap is accounted for by greater R&D intensity in younger US firms, and this is almost entirely due to the different sectoral compositions of the two economies. Stančik and Biagi (2015), looking at between-sector and within-sector variation in R&D intensity, find that R&D intensity in the EU is lower than in the US, Japan and leading Asian countries due to structural effects, but it is higher than in the BRIC countries (Brazil, Russia, India and China) due to both intrinsic and structural effects. Hernandez et al. (2018) show with recent data that EU companies present a persistent and increasing R&D intensity gap vis-à-vis their US counterparts. The largest part of this gap is due to structural factors (sector composition effect), but over the years 2012-2017 the gap also increased in terms of intrinsic factors (R&D intensity differences sector by sector), in similar proportions to the structural gap.

Intrinsic effects
Many studies have confirmed the theoretical predictions and findings of a group of leading scholars (Dosi, 1997; Pianta, 2005) who consider the EU R&D deficit to be the result of companies' underinvestment in R&D (intrinsic effect). For example, Erken and van Es (2007) examine the differences in business R&D among 14 EU countries and the US in 36 sectors over a 17-year period using data from OECD-STAN (Organisation for Economic Co-operation and Development - STructural ANalysis Database) and ANBERD (Analytical Business Enterprise Research and Development database). Their study concludes that intrinsic effects may be responsible for the private-sector R&D gap, given that the contribution of sectoral effects to the R&D funding gap between the EU and the US is very low. They also find that, when considering only the manufacturing sector, corporate R&D intensity does not differ much between the US and the EU. Their study suggests that the R&D gap is mainly due to institutional differences, including a lower level of government support for research activities in the EU. More recently, Hernandez et al. (2018), using data on the world top 2500 R&D investors for the period 2012-2017, find that EU companies show higher R&D intensities than their Chinese counterparts.

Mixed effects
Other studies have found clear evidence of mixed (intrinsic and structural) effects. A recent study by Belitz et al. (2015), using OECD data at two-digit level, analyses the difference in private-sector R&D intensity between Germany and some other OECD countries. The study finds that structural effects and firm (intrinsic) effects play more or less equally important roles in explaining the differences between Germany and other OECD countries. The study also finds that both intrinsic and structural effects are strongly driven by a few research-intensive industries. Lindmark et al. (2010) compare two datasets (EU R&D Scoreboard micro-data and BERD statistics) to decompose EU and US R&D intensities. They conclude that about half of the overall R&D gap between the EU and the US lies in the ICT sector. More specifically, there are two aspects of the ICT R&D gap. On the one hand, BERD data suggest that the gap is largely intrinsic: R&D intensity is lower in the EU than in the US in several sub-sectors, even though ICT sector size and composition are quite similar. On the other hand, company data from the EU R&D Scoreboard suggest that the gap is instead structural: the sector size and composition of sub-sectors differ greatly, whereas R&D intensity is similar.
In this context, it should be emphasised that the high-tech sectors are important not only because high-tech companies invest more intensively in R&D, but also because, in such sectors, the link between R&D and productivity is generally stronger and more significant (Ortega-Argilés and Brandsma, 2010; Bloom et al., 2012; Brown et al., 2017). Furthermore, another important aspect is the different efficiency of R&D expenditure in terms of its productivity impact across sectors, countries and types of technologies. This aspect has been extensively analysed by the literature (Ugur et al., 2016; Bartelsman et al., 2019; Castellani et al., 2019), to which we refer the reader.

Uncovering key analytical issues
The review of the analyses of the EU R&D intensity gap of Section 2 clearly points to contradictory results. In this Section, we address the following main research questions:
• Why do these results differ from each other?
• How do results from R&D intensity decomposition change when using different data sources and methodological approaches, and what are their limitations?
To answer these questions, first we analytically compare the statistical norms and accounting practices (Section 3.1) and the data sources (Section 3.2). Second, in Section 3.3, we do a systematic analysis of 17 studies on corporate R&D intensity decomposition.

Statistical norms/accounting practices
The contradictory findings regarding the causes of the R&D intensity gap between companies in the EU and in the US (or in other competing economies) suggest that some methodological problems make it difficult to agree on generally accepted measures of structural and intrinsic effects.
In particular, the decomposition of the R&D deficit into two components (see Section 2) has been shown to be sensitive to the level of detail at which industries are compared (Jaumotte and Pain, 2005), to whether or not service sectors are taken into consideration together with manufacturing (Erken and van Es, 2007) and to the data used and methodologies adopted (Pianta, 2005).
When considering both manufacturing and service sectors, the results lack robustness because of the widely recognised problem of comparing service sector R&D data between, for example, the US and the EU, which are subject to very different statistical norms (Erken and van Es, 2007). This is confirmed by Duchêne et al. (2010), who find that, when it comes to the classification of multi-activity companies, the Frascati Manual (OECD, 2002) recommends using the principal activity of the firm as the classification criterion, but subdividing its R&D when the activities are heterogeneous, therefore using product field information (i.e. nature or use of the product for which the R&D is conducted) in order to redistribute the R&D activities to the manufacturing industries concerned. However, not all countries use product field data to the same extent to reclassify R&D: while in the US firms are classified by principal activity only, the majority of EU Member States use product field information to reallocate R&D expenditure.
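The effect of the two classification conventions described above can be sketched with a small example. The firms, sector names, R&D amounts and product-field shares below are entirely hypothetical; the sketch only shows how the same firm-level R&D can end up in different sectors depending on whether it is assigned by principal activity or redistributed by product field.

```python
from collections import defaultdict

# Hypothetical firms: principal activity, total R&D, and the share of that
# R&D devoted to each product field (shares sum to 1 for every firm).
firms = [
    {"principal": "wholesale trade", "rd": 100.0,
     "product_fields": {"pharmaceuticals": 0.7, "wholesale trade": 0.3}},
    {"principal": "computer services", "rd": 80.0,
     "product_fields": {"computing machinery": 0.5, "computer services": 0.5}},
    {"principal": "pharmaceuticals", "rd": 200.0,
     "product_fields": {"pharmaceuticals": 1.0}},
]

def by_principal_activity(firms):
    """Assign each firm's entire R&D to its principal activity."""
    totals = defaultdict(float)
    for firm in firms:
        totals[firm["principal"]] += firm["rd"]
    return dict(totals)

def by_product_field(firms):
    """Redistribute each firm's R&D across sectors using product-field shares."""
    totals = defaultdict(float)
    for firm in firms:
        for field, share in firm["product_fields"].items():
            totals[field] += firm["rd"] * share
    return dict(totals)

principal_view = by_principal_activity(firms)
product_field_view = by_product_field(firms)
print("principal activity:", principal_view)
print("product field:     ", product_field_view)
```

Total R&D is the same under both conventions, but in this made-up example the R&D recorded under the two service activities falls from 180 to 70 once product-field information is used, while manufacturing R&D rises correspondingly. Sector-level intensities, and hence any decomposition built on them, shift with the convention.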

Data sources
One of the most important causes of this discrepancy in corporate R&D intensity decomposition, according to the literature, is the nature of the data used, and especially the way in which data are collected.

[...] most frequently used in analyses of EU corporate R&D intensity decomposition. EU R&D Scoreboard data may be the most appropriate for the examination of corporate R&D investment, as they not only provide detailed information on companies' global R&D investments, but also offer data on variables that are related to the R&D investments, such as net sales, operating profits and employees.
Also, the focus is on the total R&D by each parent company, rather than parts of R&D by subsidiaries in different territories. Conversely, BERD and ANBERD data refer to all R&D activities performed by businesses within a particular territory (and therefore include small parts of many global businesses), regardless of the location of the business's headquarters, and regardless of its sources of finance. In summary, the distinction between Scoreboard and BERD/ANBERD/EU-KLEMS data can be seen overall as 'global corporate funding' versus 'activity within a geographical area'.
There are several studies that exhaustively discuss the detailed statistical differences between data, ranging from the definitions of R&D to the methodologies used to collect the information. As this survey focuses on the decomposition of R&D intensity, we invite the reader to refer to the appropriate literature for both explanations of statistical differences in data and exhaustive estimations of the extent to which these differences affect the quality of any comparison.
Examples of authors who have investigated the statistical characteristics in depth include Potì et al. (2007), who compared BERD and the Community Innovation Survey (CIS), and O'Mahony and Timmer (2009). These studies suggest that international comparison at sector level and micro-level is not always possible because of often significant methodological differences; however, different sources frequently provide extremely useful complementary information.

Monetary flows
EU R&D Scoreboard: All R&D financed by a particular company from its own funds (externally financed R&D is excluded), regardless of where that R&D activity is performed.
BERD: All R&D expenditures by those parts of companies located within the country, regardless of where the funds for that R&D activity come from. R&D data refer to in-house R&D expenditure only (excluding contracted-out R&D) and only to a company's R&D performed within the national territory (and not to all parts of a company located within the country).
ANBERD: As BERD database, but includes a number of estimations for missing data.
EU-KLEMS: R&D investments are considered as capital stock (and not as expenditure) and are incorporated in gross fixed capital formation (a); R&D is specifically considered to be a production asset.

Sample
EU R&D Scoreboard: Top R&D-investing companies (only firms' R&D investment that is reported in publicly available annual reports is collected).
BERD: Collected through a census of all R&D-performing companies in a country. Only some countries use stratified samples. It covers all large companies and a representative sample of smaller companies with no size threshold.
ANBERD: Completes BERD with information from national statistical offices and with estimations and sector reclassifications for internationally comparable data.
EU-KLEMS: Like ANBERD (STAN), this uses additional sources, such as national accounts, industry surveys, labour force surveys and capital formation surveys.

Statistical unit
EU R&D Scoreboard: Subsidiaries counted within the consolidated group; R&D systematically attributed to the registered offices.
BERD: Business enterprises' subsidiaries are counted separately; R&D is attributed to headquarters or registered offices. Statistics for enterprises are compiled at national level and for local units at regional statistics level (NUTS 2 level).
ANBERD: As BERD.
EU-KLEMS: At detailed industry level per country but also provides higher-level aggregates (e.g. total economy, total market, services and total goods production).

Data collection frameworks
EU R&D Scoreboard: International Accounting Standard (IAS) 38 and national accounting standards.

(a) A flow value, defined as the total value of a producer's acquisitions, less disposals of fixed assets.

To provide an example of the impact that the use of different data sources has on the final decomposition result, we report in Figure 2 the work by Hernandez et al. (2013), who investigated the EU-US R&D gap by analysing ANBERD statistics (national statistics of intramural business expenditure on R&D) and EU R&D Scoreboard data. The figure quantifies the discrepancies that derive from using these different data sources for decomposition of the R&D intensity gap between the EU and the US. These authors suggest that the main reason for such discrepant results is the nature of the data sources (see next sub-section for more insights).
More specifically, the figure reports for EU vs US the results of the decomposition (the total R&D intensity gap and its decomposition into intrinsic and structural effects). It shows the decomposition results by manufacturing and services R&D intensity groups and for the overall EU vs US gap, for each of the two decomposition calculations using ANBERD and EU R&D Scoreboard data (respectively, in the top and bottom graphical panels).
This figure is one of the key illustrations of the main message of this paper, as it demonstrates empirically that applying the same business R&D decomposition equation to the same statistical year (2012) and the same world regions/countries yields two different decomposition results when two different data sources are used. That is, the EU vis-à-vis the US shows an overall negative benchmarking result (gap) for business R&D intensity of 0.37% (top graphical panel, using ANBERD data) vs 1.78% (bottom graphical panel, using EU R&D Scoreboard data). Furthermore, when the gap is decomposed, its main determining factor turns out to be the opposite in the two calculations: intrinsic using ANBERD (top graphical panel) and structural using EU R&D Scoreboard data (bottom graphical panel). Nonetheless, both calculations indicate that, together with the service sector, the high-tech (high R&D intensity) manufacturing sector group is mainly responsible for the transatlantic business R&D intensity gap. For this manufacturing sector group, however, the main determining factor of the gap is intrinsic when using ANBERD data and structural when using EU R&D Scoreboard data.
Therefore, according to the national statistics (ANBERD data), industrial activities of firms located in the EU are much less R&D intensive than those of firms located in the US, especially in key high-tech sectors (e.g. ICT), i.e. the EU vs US gap is mostly due to an intrinsic effect; conversely, according to the EU R&D Scoreboard data, the activities of firms located in the EU are more R&D intensive than those of firms located in the US, but the EU sample has a much smaller high-tech sector than the US sample, meaning that the EU vs US gap is mostly structural.

Figure 2. Private R&D intensity gap between US and EU using ANBERD vs. EU R&D Scoreboard data sources, by groups of R&D intensity sectors (year 2012)
Source: Authors' calculations based on Hernandez et al. (2013)

International flows of R&D and business output
According to Lindmark et al. (2010), one factor that could explain the contradictory decomposition results is international flows of R&D and VA: companies tend to allocate a larger proportion of their VA and a smaller proportion of their R&D outside their home markets. In sub-sectors with a large number of large US companies, these flows are unbalanced, and (BERD) R&D intensities are thus higher in the US than in Europe. Similar results were obtained by Hernandez et al. (2013), who argue that the industrial activities (production and R&D) of foreign-controlled companies play a pivotal role in the discrepant results obtained using these two datasets. Therefore, we support the argument that the discrepancy in the nature of the EU-US R&D intensity gap found using national statistics (national intramural data on production and BERD) and using data on net sales and corporate R&D investment from the EU R&D Scoreboard is mainly due to the accounting practices for inward (or intramural) and outward (or extramural) activities of foreign-controlled firms.
To give a sense of this phenomenon and the impact it could have on overall R&D decomposition results, we report the calculations made by Hernandez et al. (2013) in Table 2. This table reports the percentage of production and R&D performed by foreign companies located in the US (intramural), and by US companies located abroad (extramural), in a few specific sectors in the year 2006, relative to total US domestic production and domestic R&D. For example, the 173% value in the table means that the "office, accounting & computing machinery" production of US companies located abroad (extramural) amounts to 173% of the production of companies within the US territory, i.e. they produce almost twice as much abroad as is produced domestically in the US. Similarly, the R&D that foreign companies located in the US perform within the US territory (intramural) corresponds to 14.5% of the total business R&D performed in the US territory by the pharmaceutical sector.
Two main messages arise from Table 2: (a) companies delocalise production and research facilities in different and considerable proportions, which may lead to substantial changes in the R&D intensity of both source and destination countries; (b) offshoring of activities varies significantly from sector to sector. These figures also explain why the net sales of the US Scoreboard companies in high-tech manufacturing sectors, especially in ICT manufacturing sectors, are much larger than the whole US production in these sectors.
Unfortunately, figures equivalent to those used in this table for the whole EU are not fully available to enable an EU-US comparison. However, according to Hernandez et al. (2013), data from some EU countries confirm that companies' inward and outward activities in the pharmaceutical and ICT sectors are likely to affect the comparison of R&D intensity between the EU and the US.
In order to give an idea of the magnitude of the impact that international flows of R&D could have on the measurement of business R&D intensity in a given country, we analysed the relationship between BERD intensity and the share of foreign-affiliated R&D activities (inward BERD) in total BERD. The result of our analysis by country is reported in Figure 3. It should be noted that the share of R&D by foreign affiliates in BERD is higher than 50% in some EU countries, such as Hungary, Ireland, Belgium, the Czech Republic, Great Britain and Austria. On the other hand, Japan and the US, which have a low proportion (< 20%) of R&D by foreign affiliates, show higher BERD intensity.

Figure 3. Proportion of BERD by foreign affiliates and BERD intensity in selected countries (2015)
Source: Authors' calculations based on OECD statistics (2015) 11 .

Accounting (or not) for countries' industrial structures
Even with a single data source in the same study (which, however, does not decompose R&D intensity into structural and intrinsic effects), business R&D intensities can differ depending on the approach adopted. For example, following one of the first examples by the French Ministry for Education and Research (Le Ru, 2012), it is only in recent editions of the Science, Technology and Industry Scoreboard (OECD, 2015) that the OECD has recognised the role of structural differences between countries in the calculation and comparison of their R&D intensities. It addresses this by adjusting R&D intensity using the OECD industrial structure (the sectoral shares of OECD VA for the given year, 2013) as common weights across all countries. The unadjusted measure of BERD intensity, by contrast, is an average based on each country's actual sector shares. The different results using the two measurements of R&D intensity are shown in Figure 4.
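
The difference between the unadjusted and the adjusted measure can be illustrated with a short sketch. The sector names, intensities and weights below are hypothetical and are not OECD figures; the function name is likewise an assumption made for this illustration.

```python
from typing import Dict


def aggregate_intensity(sector_intensity: Dict[str, float],
                        weights: Dict[str, float]) -> float:
    """Weighted average of sectoral R&D intensities (weights are normalised)."""
    total_weight = sum(weights.values())
    return sum(sector_intensity[s] * weights[s] / total_weight
               for s in sector_intensity)


# Hypothetical country: sectoral R&D intensities (R&D / VA)
intensity = {"high_tech": 0.10, "other": 0.01}

# The country's own VA shares versus a common (OECD-wide) structure, both hypothetical
own_shares = {"high_tech": 0.05, "other": 0.95}
common_shares = {"high_tech": 0.15, "other": 0.85}

unadjusted = aggregate_intensity(intensity, own_shares)    # own industrial structure
adjusted = aggregate_intensity(intensity, common_shares)   # common weights

print(f"unadjusted: {unadjusted:.4f}, adjusted: {adjusted:.4f}")
```

In this hypothetical case the country has a small high-tech sector, so its adjusted intensity is higher than its unadjusted one: holding the structure constant removes the structural component from the comparison.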

Different R&D intensity ratios
The definition of R&D intensity as an indicator of country or company performance is another important aspect to mention. First, the numerators and denominators could be different in nature. For example, the numerator is either firms' R&D investment or BERD; the former data are captured from firms' financial accounts and the latter from surveys.
BERD data are more accurate for territorial analysis of private R&D activities, although they are revealing only if the data components of inward and outward flows of R&D investment and production (VA) are available and taken into consideration. Overall, as indicated above, the focus on intramural R&D of BERD data is a key difference from the EU R&D Scoreboard data (which includes both intramural and extramural R&D). This difference complicates the comparison between the two databases. These aspects are enormously important when drawing correct policy implications 12 .
Furthermore, in statistical macro- or meso-analysis for policymakers, the denominator is either GDP or VA, whereas firms' sales or VA are used by corporate and financial analysts to benchmark their competitiveness with peers at corporate or product/service levels. The differences are substantial. Firms' sales are used by corporate and financial analysts to evaluate their level of financial effort (R&D investment) in relation to their market size (sales) and to compare this with the financial effort of their main competitors. Firms also use VA to measure the economic health of a company as a whole or of a given product/service and to identify differences (if any) from competitors. In contrast, GDP or VA is used in macro- or meso-analysis by policymakers and policy analysts as the denominator of the R&D intensity ratio to monitor territorial competitiveness.
12 For example, the reasons for the relevance of companies' cross-border activities in the evaluation of the EU-US intensity gap using BERD or the EU R&D Scoreboard data, and the apparent discrepancies in the results, are provided by the European Commission (2014).

Figure 4. Panels: unadjusted business R&D intensity; adjusted business R&D intensity
These differences could account for some mismatches in the results when comparing aggregate R&D intensities, depending on the definition of R&D intensity and the data used in the calculation.
An example is provided by Lindmark et al. (2010), who calculated the ICT R&D intensities of the EU and the US using both GDP and VA for ICT in 2005. The results of their calculations are included in Table 3 which shows that the US versus EU difference in the ratio of ICT R&D expenditure to VA (intensity 2) is proportionally smaller (10% versus 6%) than the difference in the ratio of ICT R&D expenditure to GDP contribution (intensity 1) (0.6% versus 0.3%).
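
The arithmetic behind this comparison can be checked in a few lines. The sketch below uses only the four percentages quoted above; it assumes that the first value in each pair refers to the US and the second to the EU, as the wording of the sentence implies.

```python
# Percentages quoted in the text from Lindmark et al. (2010), ICT, 2005.
# Assumption: in each pair the first value is the US and the second the EU.
rd_to_va = {"US": 10.0, "EU": 6.0}    # intensity 2: ICT R&D / ICT VA (%)
rd_to_gdp = {"US": 0.6, "EU": 0.3}    # intensity 1: ICT R&D / GDP (%)


def relative_gap(values: dict) -> float:
    """US intensity expressed as a multiple of EU intensity."""
    return values["US"] / values["EU"]


va_ratio = relative_gap(rd_to_va)
gdp_ratio = relative_gap(rd_to_gdp)

print(f"R&D/VA:  US is {va_ratio:.2f} times the EU")
print(f"R&D/GDP: US is {gdp_ratio:.2f} times the EU")
```

Measured against VA, the US intensity is about 1.7 times the EU one; measured against GDP, it is twice as high. The choice of denominator therefore changes the apparent size of the gap even though the underlying R&D expenditure is the same.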

Other micro- and macro-economic factors
A few more points about the use of R&D intensity as a statistical indicator need to be made. There are issues of a micro- and macro-economic nature concerning the interpretation of R&D intensity indicators over time, as countries enter or leave economic cycles at different points and grow at different but fluctuating rates (Meister and Verspagen, 2006). As we have seen in section 2.1, the aggregate R&D intensity indicator is affected not only by the industrial structure but also by the characteristics (demographics, business cycle) of the pool of firms that make up that structure, by firms' heterogeneous R&D efforts within the same sector, and by other structural and intrinsic factors (Leiponen and Drejer, 2007; Coad, 2019).
Moreover, something that the private-sector R&D intensity indicator does not consider is the complex but important issue of the efficiency and effectiveness of R&D investment (Cincera et al., 2009; Ortega-Argilés et al., 2014). GDP accounts for economic output, and the BERD to GDP ratio measures R&D effort (the part of private-sector economic activity devoted to R&D), not R&D efficiency or effectiveness (Godin, 2007; Brown, 2017).
Nor does private-sector R&D intensity take into account different companies' strategies, as is relevant in the case of some sub-sectors, such as the pharmaceutical and biotechnology sectors, which require firms to invest heavily in R&D but in which sales may be very low for several years until new products can be successfully introduced.
It is not surprising that economic changes could hamper or facilitate the capacity of one country relative to others to continue to invest in R&D; they could also result in a higher or lower intensity ratio simply because the value of the denominator has fallen or risen.
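
The denominator effect described above can be shown with a deliberately simple numerical sketch. All figures are hypothetical and the function name is an assumption made for illustration.

```python
def rd_intensity(rd_expenditure: float, output: float) -> float:
    """R&D intensity as the ratio of R&D expenditure to output (GDP or VA)."""
    return rd_expenditure / output


# Hypothetical economy: R&D spending is unchanged while output contracts by 5%
rd_spending = 2.0          # e.g. billions, constant in both years
gdp_before = 100.0
gdp_after = 95.0

before = rd_intensity(rd_spending, gdp_before)
after = rd_intensity(rd_spending, gdp_after)

print(f"before: {before:.4%}, after: {after:.4%}")
```

Here the intensity ratio rises although no additional R&D has been carried out, which is why changes in R&D intensity over the business cycle need to be interpreted alongside the movements of the numerator and the denominator separately.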

Systematic analysis of business R&D decomposition studies
Although the decomposition of R&D intensity is very broadly used by academics and practitioners, the literature which displays and elaborates on the methodological aspects of its calculations is limited. Therefore, given the impact this has in real-world policy applications, and in addition to the analysis of the literature that identifies the main reasons for contrasting results, we systematically collected key information from the studies on the decomposition of private R&D intensity that were found in the literature and analysed it. The objective of the analysis is to spot any additional methodological commonalities or differences that could influence the decomposition results; that is, to identify if results are consistent across studies as well as to identify the reason for the variations when the results vary from one study to the next.

Criteria for selecting the studies
To our knowledge, there are only 17 studies (Table 4) in the literature that simultaneously (i) focus on the decomposition of corporate R&D intensity, (ii) implement a comparative analysis of the determinants of R&D intensity, at least at country level, and (iii) apply and display in full the computational methods of such business R&D intensity decomposition. Thirteen of these compare the EU (seldom with the full number of Member States) with competing countries/regions such as the US, Japan, China, Canada, Australia, the BRIC countries (Brazil, India and China together), the Asian Tigers, South Korea and Switzerland. Indeed, these non-EU countries are mostly compared with the EU or EU countries and less frequently with one another (e.g. Canada vs US); nonetheless, this reflects what the present literature offers.

Analytical approach
For the analysis of the information collected, we are aware that an empirical meta-regression analysis, drawing on results from, for example, 40 or more studies using the different data sources, would have provided a robustness check. However, the sample of studies that implement a corporate R&D intensity decomposition analysis is too small for such a test to be sufficiently robust: there are too few in the present literature, and only a few of them analyse the EU and competing countries/regions comparatively. Furthermore, the country composition of the EU is heterogeneous across these studies. Therefore, in the following paragraphs we systematically analyse the key aspects that can be determined from the information collected.
The analytical results are the following.

1) Research questions
Because of the focus of this research and the selection criteria for these studies, the research objective common to the 17 papers is to analyse the effect of sector composition and intrinsic effects on private R&D expenditure in the EU or a given EU country, compared with a competing economy. Some of those studies further investigate the causes of the dominant intrinsic and/or structural factors that determine the R&D intensity gap (e.g. Erken, 2008); others focus their analysis on one group of sectors (e.g. ICT-related sectors in the case of Lindmark et al., 2010, or manufacturing sectors in the case of Foster-McGregor et al., 2013). Further research objectives include examining the EU R&D intensity gap from the perspective of the age of firms (Cincera and Veugelers, 2013).

2) Time span and geographical scope
The time period under investigation varies, from one year only to a range of years, all within the period 1974-2013. The geographical scope of the comparative R&D intensity decomposition studies also varies; some compare one country with another country, while others compare the EU with one or several other countries. It should be noted that the number of EU countries in the sample varies from 7 to 28. Nonetheless, across these two highly heterogeneous variables we cannot detect any common pattern of values that could be associated with the different decomposition results of the studies investigated.

3) R&D intensity ratios
Most studies that rely on BERD, ANBERD or EU-KLEMS use the R&D expenditure to VA ratio and some use the R&D expenditure to GDP ratio. The studies that rely on the EU R&D Scoreboard all use the R&D investment to net sales ratio. While for the first group of studies we cannot detect any commonality of values among the diverse R&D intensity ratios that could be associated with the different decomposition results of the studies investigated, in the case of the second group of studies both the intensity ratios and the results of the analyses are the same.

4) Data sources 13 and decomposition methodologies
From the analysis of the 17 studies reviewed, we found an additional key reason for the contrasting results: although the differences in the basic calculation equations used are not substantial, when relying on national statistics (BERD) or OECD ANBERD data, the result changes if the countries' industrial structures are taken into account in the calculations. In fact, the inclusion of this variable substantially affects rankings (e.g. Sandven and Smith, 1998; see also OECD, 2015) or the overall result, indicating that the EU R&D intensity gap vis-à-vis major competing economies is mainly determined by structural factors, the opposite result from studies that have not accounted for this variable.
Moreover, the analysis of the empirical literature surveyed also shows a general association between the use of firm-level data from the EU R&D Scoreboard and structural effects as the main determinant of the EU R&D intensity gap. These findings seem to be robust over the time span of the 17 studies, as in that period the aggregate industrial structure of the EU countries did not change markedly (Foray and Lhuillery, 2010;Janger et al., 2011).
The reason why the use of EU R&D Scoreboard data is always associated with dominant structural effects has never been investigated. One possible explanation could be that firms' R&D investment and sales are representative of the firms' global activities but not necessarily of their activities in the country of the companies' registered offices.

Recommendations on data and methodology for analysts and policymakers
The arguments presented in the previous sections regarding the main reasons for discrepant results on the decomposition of business R&D intensity found in the literature can be grouped as follows: (i) differing accounting practices and natures of data sources used; (ii) R&D intensity decomposition methodology, including possible adjustments for countries' industrial structures and the definition of R&D intensity used; and (iii) heterogeneity of countries and business structures and the timing of the economic cycles analysed.
But what is the best approach to be pursued in the decomposition analysis of corporate R&D intensity?
Global corporate R&D investment can best be analysed using EU R&D Scoreboard data, which make it possible to examine the global R&D performance and economic competitiveness of European multinationals at firm level. The advantage of this micro-data source is that it covers most private R&D worldwide (around 90%) (European Commission, 2014: p. 15, n. 3); the limitation is that the denominator (usually firms' sales) does not represent the overall structure of the economy in the country, especially for services. A further limitation is the sample selection (the top R&D-investing firms), although the bias is homogeneous over time and across geographical areas. EU R&D Scoreboard data are best used when similar companies are compared, and when R&D data are combined with patent data to overcome the technological strategy and localisation pitfalls of data that cover only companies' global R&D investment.
BERD data are more accurate for territorial analysis of private R&D activities, although they do not account for the outflow activities of the foreign-affiliated companies in a given country. This weakness could be compensated for, for example, by using Foreign AffiliaTes Statistics (FATS), where these data are available for the countries analysed. Furthermore, the R&D intensity ratio used in statistical macro-analyses by policymakers is BERD/ANBERD to GDP; however, as R&D intensity varies considerably across sectors and sub-sectors, it would be advisable to analyse sector-specific R&D intensities using VA as the denominator. Additionally, when analysts have to compare heterogeneous economies, they are advised to consider that countries with a high GDP tend to be more R&D intensive (Krafft et al., 2014).
Corporate and financial analysts use firms' sales or VA as the denominator of R&D intensity to benchmark their competitiveness against peers at the corporate or product/service level, and should be advised to account for the behaviour of multinational companies, which tend to allocate a higher share of VA and a smaller share of R&D outside their home market, resulting in distorted R&D intensity (R&D expenditure to VA) results for their home countries (Lindmark et al., 2010).
Overall, BERD and EU R&D Scoreboard data are complementary and, used together, can provide a better, more comprehensive view of business R&D intensity. Table 5 provides a guide for analysts and policymakers in their choice of methodological approaches to investigating business R&D intensity decomposition, showing the main limitations of each approach and offering some suggestions for interpretation that take into account the impact of those limitations on results.

Table 5. Surviving entries: control for age; patent data from PATSTAT to proxy R&D localisation
Source: authors' analysis, based on the 17 studies surveyed and the other literature already referred to in this study.

Conclusions
This study adds to the present literature in several ways. One of its key novel outcomes is (i) the identification of the main reason why some of these studies come up with different results, although most of them rely on very similar data and apply similar methods. The analysis reveals that, when using BERD or ANBERD data, sectoral composition (a structural effect) is the main determinant of the EU business R&D intensity gap if the industrial structure of the economies is taken into account; otherwise, the studies indicate intrinsic effects as the main determinant. Furthermore, it was found that, when using the EU R&D Scoreboard data, studies always show that structural effects are the main determinant.
This study also shows that (ii) both the scientific and policy communities too readily take for granted the robustness of these default decomposition methods, regardless of the data and variables used. Moreover, and importantly, (iii) this study adds to the existing literature by providing guidance to analysts and policymakers, indicating which data and methods are best suited to decomposing corporate R&D intensity and how to interpret the results correctly, including in terms of policy implications.
The relevance of the methodological approach and the interpretation of results suggest that policymakers and analysts should also use data from complementary sources when available, taking into account their appropriateness to the analytical aim as well as the robustness and limitations of the data sets.
Overall, we believe that a more accurate approach that could be explored would be comparing the corporate R&D intensity performances of similar companies in different jurisdictions, as well as countries with comparable sectoral structures and overall economic performances, the accuracy of which would increase as more and better-quality data became available.
Additionally, there is an overarching issue regarding the interpretation of corporate R&D intensity data. The counter-cyclical or cyclical behaviour of companies and countries depending on their level of competitiveness and distance from the technological frontier could well influence overall benchmarking results. Moreover, corporate R&D intensity does not capture the efficiency and effectiveness of R&D investment, or the business/technological characteristics or strategies of firms.
The findings of this first literature survey on corporate R&D intensity decomposition also indicate that further research should consider addressing the shortage of good-quality data (i.e. more complete micro-data that also allow homogeneous company and country comparability), the shortage of investigations relying on longer time series and on longitudinal (balanced) datasets, and the lack of complete data on inflows and outflows of national business R&D expenditures.
Other analytical aspects to inspect in depth are the reasons why calculations relying on EU R&D Scoreboard data always lead to the finding that structural effects are the main determinant of the EU R&D intensity gap, and the impact that a country's or region's tax regime has on R&D intensity. Furthermore, only a few studies include an encompassing set of control variables (which might explain more accurately the determinants of sector composition and of intrinsic effects) and investigate the development of more sophisticated statistical and econometric models.