International Trends in Technological Progress: Evidence from Patent Citations, 1980-2011

We analyze cross-country trends in several aspects of technological progress over the period 1980-2011 by examining citations data from almost 4 million utility patents granted by the US Patent and Trademark Office (USPTO). Our estimates of patent quality and citation lags relative to the US yield the following observations. The emerging Asian economies of Korea, Taiwan and China have achieved substantial catch-up. In the case of Korea and Taiwan, progress has been made in terms of patent quality as well as citation lags. Chinese patents are of higher quality now than before, but Chinese inventors have yet to reduce the citation lag relative to the frontier. In contrast, the advanced economies of Europe and Japan have displayed a steady decline in their patent quality. Finally, the US has strengthened its position in the international patent quality ladder.


Introduction
This paper documents comprehensive cross-country trends in several aspects of technological progress based on a newly updated US patent citations dataset for the period of 1980-2011. The purpose of our study is to provide a rigorous international comparison that reflects underlying innovation qualities and to help better understand the substance behind the rapidly rising volumes of patents granted by major patent offices in recent years. 1
Since the study of Trajtenberg (1990), citations data have offered an important source of patent quality measures: higher quality patents should generate more impact and hence more citations. The first part of our analysis studies the patent quality of different countries as measured by "average citations" received within two years. 2 Specifically, we develop a panel regression model with fixed effects to compare the citation rates of non-US inventors to those of US inventors in each of the three decades: the 1980s, 1990s and 2000s. We overcome the issue of country-wide heterogeneity by focusing on USPTO-granted patents, and address the time-varying nature of USPTO practices by adopting the method of difference-in-differences. 3
Our regression results yield several noteworthy findings. First, the patent quality of the emerging Asian economies of Korea and Taiwan rapidly caught up with the US during the 1990s, while a similar catch-up occurred for China during the 2000s. Therefore, our results suggest that the recent surge of Korea, Taiwan and China in the volume of patent production could also be substantiated by underlying quality improvement. In contrast, the advanced nations of Europe and Japan have experienced a steady decline in their patent quality. While they too produce substantially more patents now than before, when measured vis-à-vis the US, the patent quality of these countries has fallen across each of the three sample decades. The US has, on the other hand, strengthened its position in the international patent quality ladder despite the inroads made by a number of emerging economies. 4
The patent citations data offer another avenue to obtain an output indicator of innovation activity by different countries. In the second part of our paper, we take US inventors' patents as the frontier technology and implement the fixed-effects estimator of Griffith, Lee, and Van Reenen (2011) to measure the "citation lags," i.e. the speeds with which non-US inventors cite the frontier patents relative to US inventors. We find that citation lags for the advanced countries of Europe and Japan vis-à-vis the US did not change significantly over the last three decades, while the most significant gains in narrowing the citation lag were achieved by the emerging economies of Korea, Taiwan and Israel. These observations strengthen the case for overall technological progress achieved by the latter group of countries. China, for which our evidence on patent quality trends suggests a significant upgrade, did not register a similar level of improvement on the citation lag.
1 Table 1 below presents the total number of utility patents granted by the USPTO to each country during 1980-2011. Despite the apparent global success in patent production, serious concerns have been raised about its actual substance. See, for instance, Lerner (2004), Federal Trade Commission (2003) and Merrill, Levin, Myers, et al. (2004). For quality decline in patents from the European Patent Office (EPO), see Eaton, Kortum, and Lerner (2004), Guellec and de La Potterie (2007), Jones (2009), Miller (2011) and OECD (2011).
2 Several studies have also established a statistical correlation between citation rates and market value of a patent (e.g. Bloom and Van Reenen (2002), Hall, Jaffe, and Trajtenberg (2005), Gambardella, Harhoff, and Verspagen (2008), Belenzon (2012)).
3 Another potential drawback of using patent data is that the grant date does not accurately reflect when the invention took place. Our use of 10-year windows partially addresses this issue.
Our paper contributes above all to the large and extensive literature on investigating patents as economic indicators. 5 Within this literature, an early related study on using patent citations as a quality indicator was by Trajtenberg (1990), who estimated the effect of various research inputs on the citation-weighted patent counts. 6 An index of patent quality using citations information was developed by Lanjouw and Schankerman (2004).
Another area in which patent citations have been used extensively is the literature on knowledge diffusion. Using patent citation information as a direct measure of the transfer of knowledge, this empirical literature explores the role of distance, e.g. national boundaries. See Jaffe, Trajtenberg, and Henderson (1993), Jaffe and Trajtenberg (1999), Hu and Jaffe (2003), Thompson and Fox-Kean (2005), Henderson, Jaffe, and Trajtenberg (2005), Thompson (2006) and Griffith, Lee, and Van Reenen (2011), among many others. In contrast, this paper uses citations from an updated dataset to track the recent cross-country trends in technological progress.
4 Other countries that also demonstrate patent quality improvement, albeit with less striking patterns than Korea, Taiwan and China, include Israel and India.
5 See Griliches (1990), Jaffe and Trajtenberg (2002) and Hall and Harhoff (2012), among others.
6 For another early application of citation data, see Lieberman (1987).
The remainder of the paper proceeds as follows. In Section 2, we describe our data. In Section 3, we study cross-country trends in patent quality by employing difference-in-differences estimation against the US benchmark in different time periods. In Section 4, we measure distance to the knowledge frontier, using the fixed-effects estimator of Griffith, Lee, and Van Reenen (2011), and document the trends across countries. Section 5 offers some concluding remarks. Online Appendices provide further details on our dataset as well as additional analyses that are left out of the main text for expositional reasons.

Data
We use USPTO patent citation data from January 1980 to December 2011. The data up to 1999 are obtained from the National Bureau of Economic Research (NBER) U.S. Patent Citations Data Files, specifically PAT63_99 and CITE75_99. 7 Bronwyn Hall provides the corresponding data for an additional three years, 2000-2002, on her website. 8 For this paper, we have extended the dataset up to December 2011 by extracting related information (including the inventor location) from the bulk data provided by the USPTO. Online Appendix A presents the details of how this data extension was conducted.
Combining the two previous sets of data with our data extracts, we constructed a dataset consisting of all utility patents granted up to December 2011 and detailed information on the patents that cited other patents included in the dataset. The number of patents that we have added for the period of 2003-2011 amounts to over 1.5 million, equivalent to almost 40% of the sample size.
7 These datasets have been widely used in the economic analysis of knowledge spillovers.
Our analysis considers the top 15 countries in terms of the accumulated number of utility patents granted by the USPTO between 1980 and 2011 (see Table 1). 9 These 15 countries, ordered from the most granted to the least, are as follows: United States (US), Japan ... Former Soviet Union states (FSU). 12 Our estimation analyses below also consider patents produced by the Rest of the World (RW).

Patent Quality
3.1. Regression Results. The patent data offer a valuable source for exploring the innovation capacity of a country. In this section, we analyze cross-country trends in patent quality measured by citations. Specifically, we develop a panel regression model with fixed effects to compare the patent quality of non-US inventors to that of US inventors over the period of 1980-2011. Our model is given by
where 1{·} is the usual indicator function. 16 These regressors are vectors of indicator variables across (non-US) countries for each of the three sample decades. For instance, the first element of $X_{cst,80}$ has value one only for Japan in the 1980s, while its $j$-th element has value one for the country reported in the $(j+1)$-th row of Table 1. $X_{cst,90}$ and $X_{cst,00}$ are defined similarly.
9 Roughly 99% of all patents in the sample period originated from the 15 countries. Before 1980, there were very few patents from the emerging economies of China, India, Korea, Taiwan and others.
10 EU consists of the following 12 countries: Austria, Belgium, Denmark, Finland, Greece, Ireland, Italy, Luxembourg, Netherlands, Portugal, Spain and Sweden, as in Griffith, Lee, and Van Reenen (2011).
11 Hong Kong is excluded from China. In Online Appendix A, we explain how the location of the first inventor of a Chinese patent was checked in this regard.
12 For consistency throughout the periods, we regard all patents whose inventors were located in the 15 former Soviet Union member countries as "FSU (Former Soviet Union)" patents.
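As a reading aid, one specification consistent with the description in Section 3.1 is sketched below. The sub-category and grant-year fixed effects $\gamma_s$ and $\delta_t$, the error term $\varepsilon_{cst}$, the decade sets $\mathcal{D}_{80}, \mathcal{D}_{90}, \mathcal{D}_{00}$ and the use of $Y_{cst}$ for the (adjusted) average citations of cell $(c, s, t)$ are assumptions made for illustration; they are not taken from the text.

```latex
\log Y_{cst} = X_{cst,80}'\,\eta_1 + X_{cst,90}'\,\eta_2 + X_{cst,00}'\,\eta_3
               + \gamma_s + \delta_t + \varepsilon_{cst},
\qquad
X_{cst,80} = \bigl(1\{c = \mathrm{JP}\}, \ldots, 1\{c = \mathrm{RW}\}\bigr)' \cdot 1\{t \in \mathcal{D}_{80}\}.
```

Under this form the US is the omitted category in every decade, so each element of $\eta_1$, $\eta_2$ and $\eta_3$ reads as a log gap relative to the US in that decade, and the change in a country's coefficient between decades is the difference-in-differences comparison.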
While there are potentially many alternative methods of measuring patent quality from citations data, one natural measure is to simply count the number of citations that a patent receives. We consider the average number of citations that a patent granted to country c, sector s and year t accumulates within the first two years from the grant date. Our main dependent variable is the logarithm of this measure of "average citations." 17 The citations measure of patent quality may, however, admit the effects of size and home biases. In particular, US patents may receive more citations because of the large overall US share of patents combined with the fact that our data come from the USPTO. These biases may also be significant for countries that witnessed a rapid rise in patent numbers during the sample years.
To deal with this issue, we run a parallel regression on the logarithm of "adjusted average citations" constructed as follows. We first compute the (two-year) average citations excluding citations from the same country and sub-category combination and then renormalize it by one minus that country/sub-category's grant share. Specifically, for each (c, s, t), the adjusted average citations measure is defined as
$$\frac{\text{Average citations, excluding citations made by patents from } (c, s)}{1 - (\text{\# of patents in the } (c, s, t) \text{ cell}) / (\text{Total \# of patents in grant year } t)}.$$
16 We have also conducted DiD estimation across different time intervals (periods of 8, 9, 11 and 12 years), but the results are qualitatively unaffected.
17 For cells with no average citations, we replace the zeros with a small positive number and run weighted regressions. Our results are not sensitive, either qualitatively or quantitatively, to the choice of this parameter. Using left-censored Tobit regressions instead of OLS also does not affect the results.
Excluding citations from patents from the same country and sub-category combination completely eliminates home bias, but it requires some normalization since one also needs to adjust for the size of the set of "potentially citing" patents for each country/subcategory/grant year cell. This latter issue is complicated by the fact that the patent share changes over time, and the citations come from all future years. Our normalization divides the average citations measure by one minus the country/sub-category's patent share in the given year.
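The adjustment described above can be illustrated with a minimal sketch. The function name, the input layout (a dictionary of patents and a list of citing/cited pairs) and the use of grant-year differences to approximate the two-year window are all assumptions made for this illustration, not the procedure used to build the dataset.

```python
from collections import defaultdict

def adjusted_average_citations(patents, citations, window_years=2):
    """patents: patent_id -> (country, subcat, grant_year) (hypothetical layout).
    citations: iterable of (citing_id, cited_id) pairs.
    Returns (country, subcat, grant_year) -> adjusted average citations."""
    cell_size = defaultdict(int)    # patents per (country, subcat, year) cell
    year_total = defaultdict(int)   # all patents granted in a given year
    for country, subcat, year in patents.values():
        cell_size[(country, subcat, year)] += 1
        year_total[year] += 1

    kept = defaultdict(int)         # citations received, home citations excluded
    for citing, cited in citations:
        c_from, s_from, t_from = patents[citing]
        c_to, s_to, t_to = patents[cited]
        # Assumption: the window is approximated by grant-year differences.
        in_window = 0 <= t_from - t_to <= window_years
        same_origin = (c_from, s_from) == (c_to, s_to)
        if in_window and not same_origin:
            kept[(c_to, s_to, t_to)] += 1

    adjusted = {}
    for cell, n in cell_size.items():
        share = n / year_total[cell[2]]
        if share < 1.0:             # undefined when one cell holds every patent
            adjusted[cell] = (kept[cell] / n) / (1.0 - share)
    return adjusted

# Toy data, purely illustrative.
patents = {
    "A": ("US", "11", 1990), "B": ("US", "11", 1990), "C": ("KR", "11", 1990),
    "D": ("KR", "11", 1991), "E": ("US", "11", 1991), "F": ("JP", "12", 1991),
}
citations = [("D", "A"), ("E", "A"), ("F", "C"), ("D", "C"), ("F", "B")]
for cell, value in sorted(adjusted_average_citations(patents, citations).items()):
    print(cell, round(value, 3))
```

In the toy data the US cell holds two of the three patents granted in 1990, so dividing by one minus its grant share scales its home-excluded average up by a factor of three, whereas the Korean cell is scaled by 1.5.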
The parameters of interest are the coefficients $\eta_1$, $\eta_2$ and $\eta_3$, which capture the difference compared to the US in the 1980s, 1990s and 2000s, respectively, for each country. 18 To account for the different sample sizes of cohorts, we adopt weighted regression with the weights given by the number of patents in each cell. Standard errors are clustered by country and sub-category. Table 2 reports the estimation results on the sector-adjusted and grant-year-adjusted cross-country trends in patent quality represented by two different citation measures. The coefficient values are also illustrated via clustered bar charts in Figure 1.
We summarize our findings as follows.
the adjusted average citations were around 59% and 20% lower for Korea and Taiwan, respectively.
In the 1990s, however, this deficit in log average citations was reduced almost by a factor of 5 for Korea to -0.282, implying that the average citation of Korean patents had increased by about 50% relative to the US, while Taiwan achieved parity with the US. This rapid catch-up was also apparent after adjusting for the home bias. The negative coefficients in adjusted log average citations for the two countries became positive during the 1990s.
Although these countries' relative growth in patent quality faded away in the 2000s, notice that the corresponding coefficient values are still similar to or better than those of Japan ...
... Griffith and Miller (2011), who showed that the proportion of EPO patent applications with at least one Chinese inventor that are near science (more fundamental research and hence presumably of higher quality) was higher than that of all EPO patent applications for the period of 1995-2005. 20
Finding 2. Patent quality of the advanced economies of Europe and Japan declined relative to the US.
The second notable finding is the relative decline of the advanced nations of Europe and Japan. In the initial sample decade of the 1980s, the EU and the major European countries (Germany, France, Great Britain and Switzerland) had relatively small patent quality deficits against the US in terms of log average citations. For instance, the coefficient value was -0.150 for Germany and -0.151 for Great Britain, implying that the average citations of these two countries were about 14% lower than those of the US. Japanese patent quality in this measure was actually better, by about 18%, than that of the US. However, a significant decline occurred during the 1990s for all these countries, and this downward trend continued in the 2000s. For Germany, the coefficient dropped to -0.363 (≈ -30%) in the 1990s and -0.529 (≈ -41%) in the 2000s, and for Great Britain, it became -0.304 (≈ -26%) and -0.344 (≈ -29%), respectively, with all these estimates being significant at the 0.1% level. For Japan, the initial advantage had disappeared statistically by the 1990s and turned into a deficit in the subsequent decade.
The same downward trend is also evident in terms of adjusted log average citations. Here, this group of advanced countries all began with patent quality superior to that of the US, but a steady decline left only one of them (Great Britain) with a positive coefficient in the latest decade.
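The percentage figures quoted for Germany and Great Britain follow from the usual conversion of a coefficient on a log dependent variable, exp(coefficient) - 1. A minimal sketch, in which the helper name is hypothetical and the coefficients are the ones quoted in the text:

```python
import math

def log_gap_to_percent(coef):
    """Percentage difference implied by a coefficient on a log outcome."""
    return (math.exp(coef) - 1.0) * 100.0

# Coefficients on log average citations relative to the US, as quoted in the text.
quoted = {
    "Germany, 1980s": -0.150,
    "Germany, 1990s": -0.363,
    "Germany, 2000s": -0.529,
    "Great Britain, 1990s": -0.304,
    "Great Britain, 2000s": -0.344,
}
for label, coef in quoted.items():
    print(f"{label}: {coef:+.3f} log points -> {log_gap_to_percent(coef):+.1f}%")
```

For small coefficients the two scales nearly coincide, but the gap widens as the coefficient grows in magnitude, which is why -0.529 corresponds to roughly -41% rather than -53%.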
The remaining countries did not reveal patterns as striking or conclusive as those of the aforementioned countries. Canada and Australia showed some marginal gains relative to the US, but the changes were either small or inconsistent across the two measures of patent quality. Israel, India and the Former Soviet Union states, as well as the Rest of the World, did record consistent and steady improvements in their patent quality.
While this observation reinforces our first finding and paints a broad upward pattern for the emerging economies as a whole, the gains achieved by these nations fall somewhat short of the kind of rapid quality surges experienced by the patents from Korea, Taiwan and China. Finally, what can we learn from the results about US patents? If one considers average citations from all patents (home or otherwise), the US patent quality was higher than every country except Japan in the 1980s and this dominance was maintained throughout.
In particular, it is worth noting that, despite the gains made by certain economies, no country has a corresponding coefficient better than -0.218 in the 2000s (there were four such countries in the 1980s). In the final decade, the coefficient estimates range between -0.890 and -0.218, while in the 1980s the corresponding range is substantially more dispersed, spanning -2.088 to 0.203.
3.2. Hirsch Index. While average citations capture some aspect of a country's patent quality, such a measure may not explain overall technological innovation because it fails to take into account the total size or productivity of the innovation sector, i.e. the number of patents granted. One measure to capture citation-adjusted total research output is the Hirsch index, or simply H-index, which is widely used to measure a scholar's research performance. 21 In Online Appendix B, we report two sets of estimation results (Poisson and negative binomial models) with H-indices as dependent variables. These results broadly support our previous findings on average patent quality. With the quantity of patents also taken into account, we see that Korea and Taiwan actually continued to strengthen their worldwide positions during the 2000s; moreover, substantial technological gaps still separate the emerging economies from the advanced economies of Japan and Europe. In order to highlight these overall trends, Table 3 presents the international rankings based on H-index regression results for each of the three decades. A higher ranking means an H-index closer to that of the US in the corresponding decade. We observe that Japan, Germany and EU maintained the top 3 positions throughout the three decades. The rankings of Korea and Taiwan rose most substantially, while France and Switzerland showed notable declines.
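For readers unfamiliar with the H-index, a minimal sketch of its computation for one country, sub-category and year cell treated as a single "author" is given below. The function name and the citation counts are illustrative assumptions, not values from our data.

```python
def h_index(citation_counts):
    """Largest h such that at least h patents have at least h citations each."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank      # the top `rank` patents all have >= rank citations
        else:
            break         # counts only fall from here, so h cannot grow
    return h

# Hypothetical citation counts for the patents in one cell.
cell_citations = [10, 8, 5, 4, 3, 0]
print(h_index(cell_citations))
```

Because the index is capped by the number of patents in the cell, it rewards volume and citation impact jointly, which is the property that motivates its use alongside average citations.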

Citation Lags
The patent citations data enable us to explore another channel of measuring the trends in technological progress. Taking the US as the knowledge frontier, we next consider the speed with which US patents are cited by non-US patents.
4.1. Econometric Model. We adopt the fixed-effects estimator of Griffith, Lee, and Van Reenen (2011). There is a set of inventions $i = 1, \ldots, I$ and a set of inventors $j = 1, \ldots, J$. The inventors will learn of invention $i$ after a time period $T_{ij}$, and therefore $T_{ij}$ can be thought of as the "citation (or diffusion) lag" between invention $i$ and inventor $j$. Several factors determine the citation lag, including characteristics of the invention $i$, $Z_i$, characteristics of the inventor $j$, $Z_j$, and the joint characteristics of the invention-inventor match, $Z_{ij}$. Both non-geographical and geographical variables will influence the speed at which technology spillover occurs.
21 The H-index was proposed by Hirsch (2005) to analyze individuals' research output in physics. In economics, Ellison (2013) recently used Hirsch-like indices to examine how well different indices explain labor market outcomes for young, tenured economists at 50 US departments.
Table 3. Hirsch index rankings (columns: Country, 1980-1989, 1990-1999, 2000-2009)
Notes: These ranks are based on our regression model with the Hirsch index of each location as the dependent variable. We rank each country in descending order according to its coefficient estimate for each decade.
The hazard function of the citation lag is affected by a vector of explanatory variables $X_{ij}$, incorporating the empirically observable counterparts to $Z_{ij}$ and $Z_j$, and an unobservable fixed effect, $U_i$, which absorbs all the factors specific to the cited patent, $Z_i$. Unobserved heterogeneity $U_i$ includes patent quality among other things.
We take the set of US patents as the frontier and regress US (cited) patents on characteristics of citing patents from non-US inventors. Since higher quality patents are likely to be cited more quickly, it is crucial to control for unobserved patent quality, as emphasized by Griffith, Lee, and Van Reenen (2011). 22 As a consequence, our estimates would be robust even if there had been changing trends in the quality of US patents over time.
Specifically, Griffith, Lee, and Van Reenen (2011) consider a multiple-spell version of the mixed proportional hazards model. Their regression model can be written as
where $\beta$ is a vector of unknown parameters and $\lambda_i(\cdot)$ is a cited-patent specific baseline hazard. Then an estimate of $\beta$ can be obtained using a conditional likelihood approach, while accounting for right censoring.
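For orientation, a mixed proportional hazards model with a cited-patent specific baseline hazard is commonly written in the form below. This is a sketch of that standard form under the definitions of Section 4.1; the symbols $\theta$ and $t$ are notation introduced here, and the display is not copied from Griffith, Lee, and Van Reenen (2011).

```latex
\theta\bigl(t \mid X_{ij}, U_i\bigr) = \lambda_i(t)\,\exp\bigl(X_{ij}'\beta\bigr),
\qquad j = 1, \ldots, J,
```

Here $\theta(t \mid \cdot)$ denotes the hazard of the citation lag $T_{ij}$ at duration $t$, and the fixed effect $U_i$ is absorbed into $\lambda_i(\cdot)$. Because $\lambda_i(\cdot)$ is common to all spells of the same cited patent, comparing spells within a cited patent removes it, which is what makes a conditional likelihood for $\beta$ feasible without specifying the distribution of $U_i$.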
4.2. Regression Results. In our empirical analysis, we fix the "potentially cited" country to be the US and consider only US patents. As discussed in the previous subsection, this amounts to interpreting the US as the knowledge frontier. This simplifying assumption is a reasonable first-order approximation in view of our findings in Section 3. We split the sample period into 8 sub-periods, each lasting four years, and estimate for each sub-period the citation lag model with the first two citations (J = 2), as described in the previous subsection.
Included covariates for citing patents are the self citation indicator (whether a citation is from the identical assignee), the same sub-category dummy (whether a citation is from the same sub-category), the base cohort size (which is the number of patents in the citing country and technology sub-category for the citing year), the corporation dummy (whether the citing first assignee is a corporation or not), category dummies (six industry level dummies), and citing country dummies. Among these, the citing country dummies are covariates of interest. Their estimated coefficients represent how fast the non-US inventors cite US patents compared to US inventors and may also indicate how close the non-US inventors are to the technology frontier. Table 4 shows the estimation results. 23 For example, the first row of Table 4 displays the coefficient estimates for Japan through the 8 sub-periods. The estimate for the first sub-period in this row is -0.21 which means that inventors in Japan cited US patents about 21% slower than US inventors.
To assist exposition, Figure 2 displays the coefficients of each country dummy together with the corresponding coefficients for the Japanese dummy. In terms of citation lags, Japan is one of the countries that has maintained its proximity to the US throughout the sample period.
Let us summarize our main finding for this section below. Korea, Taiwan and Israel all began the sample period with citation lag substantially below that of Japan; for instance, during 1980-83, Taiwanese inventors cited US patents about 120% slower than US inventors, which amounted to 100% lag relative to Japan. By 1996-1999, however, all three countries caught up with Japan, and in the case of Korea and Israel, the deficits reversed during the 2000s. Although improved communication technology may have led to general decline in citation lag (i.e. "death of distance" as identified in Griffith, Lee, and Van Reenen (2011)), the asymmetric progress made by this league of countries points to real closing of the citation lag against the knowledge frontier.
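The percentages quoted in this section read the country-dummy coefficients as log points. Assuming the coefficients enter the hazard through an exponential link, as in a proportional hazards model, the exact multiplicative effect on the citation hazard is exp(coefficient). The sketch below makes the two readings explicit; the helper name is hypothetical, and the Taiwanese value of -1.20 is inferred from the phrase "about 120% slower" rather than read from Table 4.

```python
import math

def hazard_ratio(coef):
    """Multiplicative effect on the citation hazard relative to US inventors."""
    return math.exp(coef)

japan_1980_83 = -0.21    # quoted in the text
taiwan_1980_83 = -1.20   # assumption: implied by "about 120% slower"

# Gap between Taiwan and Japan in log points (the "100% lag relative to Japan").
gap_vs_japan = taiwan_1980_83 - japan_1980_83

print(f"Japan:  hazard ratio vs US = {hazard_ratio(japan_1980_83):.2f}")
print(f"Taiwan: hazard ratio vs US = {hazard_ratio(taiwan_1980_83):.2f}")
print(f"Taiwan vs Japan: {gap_vs_japan:+.2f} log points, "
      f"hazard ratio = {hazard_ratio(gap_vs_japan):.2f}")
```

The log-point reading is a close approximation for small coefficients such as Japan's, and a looser one for large gaps such as Taiwan's in the early 1980s.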
Note that China does not feature in this league. Although China closed the citation lag somewhat during the 1990s, the distance grew again in the last decade.
It is also interesting to observe that, in terms of citation speed, the advanced economies of Japan and Europe (as well as Canada, Australia and FSU) actually maintained their relatively close position to the US frontier throughout the sample periods. Japan, in particular, has been more or less on level terms with the US for the last three decades, ahead of all the other countries. Canadian inventors are slower than Japanese inventors despite their geographic proximity to the US.
23 We include China and India as part of the Rest of the World for the first two periods to avoid imprecise estimates due to the small sample sizes of such cohorts.
Table 4. Notes: 2. Our regression model controls for base cohort size (the number of patents in the citing country and technology sub-category for the citing year), self citations (citations between patents with identical assignees) and within-subcategory citations (citations between patents within the same sub-category). Corporation dummies (whether or not the first assignee is a corporation) and category dummies are also included. 3. We include CN and IN in RW for the first two periods to avoid diverging estimators due to the small sample sizes of such cohorts.
Figure 2. Graphical representation of estimation results in Table 4 (continued). Notes: Each graph plots the coefficient values for the country dummy. Japan is included in every graph as a benchmark.
Our findings are related to the literature on "absorptive capacity". Griffith, Redding, and Van Reenen (2003, 2004) found evidence that R&D is statistically and economically impor-

Conclusion
In this paper, we have analyzed cross-country trends in some aspects of technological progress over the period of 1980-2011 by considering an updated USPTO citations dataset.
Our estimation results reveal several noteworthy stylized facts. As widely expected, the ...
24 In Online Appendix C, we also report estimation results of the citation lag model based on the second and third citations (Table C.9) as well as on the third and fourth citations ...
... Last but not least, we also wait to see whether other countries will newly join the growth path of the aforementioned Asian economies in innovation. In this study, we focused on patent citations data from the USPTO. While this may have caused some bias in favor of US patents, the trends of all other countries were measured relative to those of the US. We have also considered adjusted average citation rates to account for the home bias. Nonetheless, it would also be worthwhile to study the corresponding trends using data from other major patent offices, especially from the EU.
Online Appendices to Kwon, Lee, and Lee (2015)
Notes: The sample period is divided into 8 equal-length sub-periods, and we treat each country in each sub-period as an "author" to calculate the H-index for each "author." An author with a total of n patents has H-index h if h of those patents have been cited at least h times each and the other n − h patents have been cited at most h times each.
We run Poisson regression and negative binomial regression with the H-indices as the dependent variable to examine the trends observed in Table B.1. We use the following model specification
where the variables on the right-hand side are defined as in (3.1). 4 Here, the dependent variable, $Y_{cst}$, is the H-index for a given country c, sector s and year t. For example, if $y_{JP,11,1980} = h$, this means that, treating the patents granted in 1980 in sub-category 11 to Japanese inventors as the patents of one big inventor, the H-index is h. Since this dependent variable is count data, we apply Poisson regression and negative binomial regression to estimate the coefficients. The standard errors are again clustered by country and sub-category.
The results are shown in Table B ...
Notes: The reported numbers are the estimated coefficient values for the country dummy estimated from the regression results above. 1. Standard errors are in parentheses. 2. Our regression model controls for base cohort size (the number of patents in the citing country and technology sub-category for the citing year), self citations (citations between patents with identical assignees) and within-subcategory citations (citations between patents within the same sub-category). Corporation dummies (whether or not the first assignee is a corporation) and category dummies are also included. 3. We include CN and IN in RW for the first two periods to avoid diverging estimators due to the small sample sizes of such cohorts.
Table C.9 (continued). Notes: Each graph plots the coefficient values for the country dummy. Japan is included in every graph as a benchmark.
Notes: The reported numbers are the estimated coefficient values for the country dummy estimated from the regression results above. 1. Standard errors are in parentheses. 2. Our regression model controls for base cohort size (the number of patents in the citing country and technology sub-category for the citing year), self citations (citations between patents with identical assignees) and within-subcategory citations (citations between patents within the same sub-category). Corporation dummies (whether or not the first assignee is a corporation) and category dummies are also included. 3. We include CN and IN in RW for the first two periods to avoid diverging estimators due to the small sample sizes of such cohorts.
Table C.10 (continued). Notes: Each graph plots the coefficient values for the country dummy. Japan is included in every graph as a benchmark.