MY PRECIOUS! THE LOCATION AND DIFFUSION OF SCIENTIFIC RESEARCH: EVIDENCE FROM THE SYNCHROTRON DIAMOND LIGHT SOURCE

We analyze the impact of the establishment of a GBP 380 million basic scientific research facility in the UK on the geographical distribution of related research. We investigate whether the siting of the Diamond Light Source, a third-generation synchrotron light source, in Oxfordshire induced a clustering of related research in its geographic proximity. To account for the potentially endogenous location choice of the synchrotron, we exploit the availability of a 'runner-up' site near Manchester. We use both academic publications and patent data to trace the geographical distribution of related knowledge and innovation. Our results suggest that the siting of the synchrotron in Oxfordshire created a highly localized cluster of related scientific research.

The derivative of (A.1) with respect to infrastructure S at is:

As of December 2010, there were 347 published scientific articles available on Diamond's website. These articles appeared in 121 scientific journals in various fields. While nearly all journals include an abstract of the article, only 54 report keywords. These publications list a total of 1,760 authors. We standardised author names because the way in which names are listed differs across journals. These authors are affiliated with 441 institutions all over the world. We also standardised the names of affiliations, as the way in which they were reported differed considerably across journals, and completed the addresses of affiliations whenever necessary by retrieving postal addresses from the relevant institutions' official websites.

B.2. Related Academic Publications
Thomson Reuters' Web of Science offers a tool that, for a given article, searches the entire Web of Science database for other articles that cite at least one of the same references as the original article. We used this tool to retrieve all articles that share at least one reference with our 347 'Diamond articles'. For each pair, we then computed a similarity score as the average of two shares: the number of shared references divided by the number of references in the 'Diamond article', and the number of shared references divided by the number of references in the article retrieved through Web of Science. Finally, for each of the 347 'Diamond articles' we retained the five most similar articles according to this score.
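The similarity score described above can be illustrated with a minimal sketch. The function names, the representation of reference lists as collections of identifiers, and the alphabetical tie-breaking rule are assumptions made for illustration; they are not taken from the paper or from the Web of Science tool.

```python
def similarity_score(diamond_refs, candidate_refs):
    """Average of the two shared-reference shares described in the text.

    Both arguments are collections of reference identifiers
    (hypothetical representation; any hashable id works).
    """
    diamond_refs, candidate_refs = set(diamond_refs), set(candidate_refs)
    if not diamond_refs or not candidate_refs:
        return 0.0  # no references on one side: nothing can be shared
    shared = len(diamond_refs & candidate_refs)
    share_of_diamond = shared / len(diamond_refs)
    share_of_candidate = shared / len(candidate_refs)
    return 0.5 * (share_of_diamond + share_of_candidate)


def top_related(diamond_refs, candidates, k=5):
    """Return the ids of the k most similar candidate articles.

    `candidates` maps a candidate article id to its reference list.
    Candidates sharing no reference are dropped; ties are broken
    alphabetically by id (an assumption, the text does not say how).
    """
    scored = [
        (similarity_score(diamond_refs, refs), cand_id)
        for cand_id, refs in candidates.items()
    ]
    scored = [(score, cand_id) for score, cand_id in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [cand_id for _, cand_id in scored[:k]]
```

For example, a 'Diamond article' with four references and a candidate with two references, both of which are shared, would score 0.5 × (2/4 + 2/2) = 0.75 under this definition.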
As an alternative, we also collected related articles imposing the additional restriction that articles have to be published in either a field journal that pertains to the same field as the original article (e.g. Crystal Growth & Design and Acta Crystallographica, which are both crystallography field journals) or a general interest journal (e.g. Science). In a next step, we recovered all author names and their affiliations from these similar articles. We standardised author names and affiliations and dropped all authors that report no affiliation with an entity in the UK. We then retrieved postcodes for all UK affiliations and matched them with Code-Point data to obtain the corresponding grid coordinates, which allow us to compute distances to Diamond and Daresbury. For robustness checks, we also compiled all articles published in the top 3 'Diamond journals'. That is, we ranked the journals in which Diamond articles have appeared by the number of Diamond articles published in each journal. We selected the top 3 journals, (1) Journal of Molecular Biology (29 Diamond articles), (2) Journal of Biological Chemistry (21 Diamond articles), and (3) Acta Crystallographica Section F (20 Diamond articles), and retained all articles authored by at least one UK-based scientist. Across the three journals combined, we obtained a total of 6,189 articles for the 2000-10 period. We then proceeded as with the other data and allocated all publications to the corresponding institutions and geo-coded them.
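As a rough illustration of the distance step, the sketch below computes straight-line distances from grid coordinates. It assumes that the matched coordinates are eastings and northings in metres on a common grid and that distance is measured as the Euclidean distance; the text does not state the exact distance measure used. The facility coordinates are placeholder values, not the real locations of Diamond or Daresbury.

```python
import math


def grid_distance_km(easting_a, northing_a, easting_b, northing_b):
    """Straight-line distance in km between two points whose
    coordinates are given in metres on the same planar grid."""
    return math.hypot(easting_a - easting_b, northing_a - northing_b) / 1000.0


# Placeholder coordinates for illustration only (NOT the real site locations).
FACILITIES = {
    "Diamond": (100000.0, 200000.0),
    "Daresbury": (130000.0, 240000.0),
}


def distances_to_facilities(easting, northing, facilities=FACILITIES):
    """Distance in km from one affiliation's grid point to each facility."""
    return {
        name: grid_distance_km(easting, northing, fac_e, fac_n)
        for name, (fac_e, fac_n) in facilities.items()
    }
```

Working on a planar grid keeps the calculation to a single Pythagorean step; with latitude and longitude instead, a great-circle formula would be needed.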

B.3. Research Assessment Exercise (RAE) Data
The Research Assessment Exercise (RAE) is a peer-review assessment of all UK universities that is carried out every five years by the Higher Education Funding Council for England (HEFCE). We use data for the 2001 RAE. While the RAE was carried out in 2001, the assessment period refers to 1 January 1996 to 31 December 2000. This means the RAE variables refer to research published in the period 1996-2000.
We construct the following variables from the 2001 RAE at the TTWA and LA level: the number of 5/5* (internationally excellent) researchers in biological sciences, the maximum grade achieved in biological sciences by any university within a TTWA or LA, the number of 5/5* researchers in chemistry, the maximum grade achieved in chemistry, the number of 5/5* researchers in physics and the maximum grade achieved in physics by any university within a TTWA or LA.
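The aggregation described above can be sketched as follows. The record layout (keys `area`, `subject`, `grade`, `researchers`) is hypothetical, and the sketch assumes that '5/5* researchers' means researchers submitted in units rated 5 or 5*. The grade ordering follows the 2001 RAE rating scale.

```python
# 2001 RAE rating scale, ordered from lowest to highest.
GRADE_ORDER = ["1", "2", "3b", "3a", "4", "5", "5*"]
GRADE_RANK = {grade: rank for rank, grade in enumerate(GRADE_ORDER)}


def aggregate_rae(submissions, subject):
    """Aggregate RAE submissions for one subject to the area level.

    `submissions` is an iterable of dicts with the hypothetical keys
    'area' (TTWA or LA code), 'subject', 'grade' and 'researchers'.
    Returns, per area, the number of researchers in units rated 5 or 5*
    and the maximum grade achieved by any unit in that area.
    """
    result = {}
    for row in submissions:
        if row["subject"] != subject:
            continue
        entry = result.setdefault(
            row["area"], {"n_top": 0.0, "max_grade": row["grade"]}
        )
        if row["grade"] in ("5", "5*"):
            entry["n_top"] += row["researchers"]
        if GRADE_RANK[row["grade"]] > GRADE_RANK[entry["max_grade"]]:
            entry["max_grade"] = row["grade"]
    return result
```

Grades are compared through an explicit rank table because the labels do not sort correctly as strings (for instance, 3a ranks above 3b).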

B.4. Funding by Wellcome Trust
The Wellcome Trust publishes information on all grants awarded, by year of award. We drop grants unrelated to the type of scientific research that is relevant for our analysis: grants awarded by the Biomedical Ethics Panel and the Funder Committee, grants for work on the history of medicine, arts awards, grants for population and public health studies, and grants to promote public engagement and broadcasting. These grants account for only a minor share (7%) of the total amount awarded by Wellcome. For all remaining grants, we extract information on the number of grants and the total amount awarded by institution and aggregate the variables at the TTWA and LA level.

B.5. Funding by Science & Technology Facilities Council
The Science & Technology Facilities Council publishes aggregate statistics on all grants awarded by year and institution. We use the available information on the total value of grant payments by institution and aggregate the variables at the TTWA and LA level.

B.6. UK Crystallographers
To collect the data on crystallographers in the UK, we proceeded as follows. We used a list of all universities in the UK to check each university's website for academic departments that employ potential crystallographers. The structure of departments varies across universities: some, for example, bundle various fields within a life sciences department, whereas others have separate departments, and some even have dedicated crystallography departments. We also included in our search research institutes, such as the Institute of Cancer Research or the MRC National Institute for Medical Research. Crystallographers were identified through the research interests listed on the departments' or researchers' websites and through their publication record. We then used the list of crystallographers to extract their entire publication record from Thomson Reuters' Web of Science. In a final step, we identified in our list of crystallographers those scientists that are also in our set of 'Diamond authors'. Note that we manually double-checked all 'Diamond authors' in our dataset through extensive web searches to ensure that they were correctly identified as crystallographers. To address concerns that our data contain only crystallographers that have (successfully) remained in academia until 2014, when we collected the data, we also collected data on crystallographers that are no longer in academia or in the UK by retrieving the lists of former members of departments and labs that a large number of webpages make available.

Appendix C. Relocation from Daresbury (SRS) to Oxford (Diamond)
We consider the most straightforward explanation for a positive proximity effect of Diamond: that research at Diamond simply replaced the research that was previously conducted at the SRS. Note, however, that in our main analysis we exclude all researchers employed by Diamond or the SRS, which means a simple shift of researchers employed by the SRS to Diamond cannot explain our results.
Assessing this explanation is difficult for at least four reasons. First, we have less than two years of data for Diamond output that occurs after the SRS closed in August 2008. Second, the SRS coexisted with Diamond for nearly two years in 2007 and 2008. Third, the SRS was part of the Daresbury Laboratory, which continues to be home to a large number of scientists (including specialists in accelerator science based at the Accelerator Science and Technology Centre (ASTeC) and the Cockcroft Institute), even after the closure of the SRS. Fourth, according to the official evaluation that followed the SRS's shutdown (STFC, 2010), there is no record of researchers that used the SRS. In order to evaluate a potential replacement effect, we collect information on scientific publications from Thomson Reuters' Web of Science by authors that list the SRS as their affiliation. Unfortunately, because a substantial number of authors list only the Daresbury Laboratory as their affiliation, identifying unambiguously whether a given scientific article has been produced using the SRS is not straightforward. With this caveat in mind, Figure D3 shows the time trends of the number of articles. The figure shows that the number of publications increased up until 2008, possibly due to scientists finishing up projects before the SRS closed. Consistent with this, when we only count articles in general interest journals (e.g. Science) and top field journals (e.g. Progress in Crystal Growth and Characterisation of Materials in the field of Crystallography) the replacement effect is weaker (as might happen if the impending shutdown of the SRS has some effect on article quality). The closure is also clearly visible in the data, with the number of publications dropping notably in 2009 and 2010. At the same time, as expected, the number of Diamond articles by scientists employed by Diamond rises steeply to surpass the number of SRS publications slightly by 2010.
This offers evidence for a replacement effect. In other words, research output that relies on synchrotron infrastructure has indeed shifted from Daresbury to Diamond.

Notes. Dependent variable in column (1) is unweighted publication count by TTWA and year, in column (2) the publication count weighted by the number of authors, in column (3) weighted by authors' byline position, in column (4) by the number of distinct affiliations, and in column (5) the article count is restricted to articles where the first author is included in our sample and in column (6) to authors with a single affiliation. Robust standard errors clustered at TTWA level. All regressions include a constant. Covariates include %NVQ4 and above, Labour force, no. Wellcome grants, £'000 awarded by Wellcome, and £'000 awarded by STFC. * Significant at 10%, ** at 5%, *** at 1%.

Notes. Logit estimates; marginal effects reported. Columns (1)-(4): dependent variable is equal to 1 if a Diamond article is in the same field as its corresponding related article (e.g. the Diamond article is published in Acta Crystallographica Section B: Structural Science, which belongs to the field of Crystallography, and the corresponding related article is published in the Journal of Crystal Growth, which also belongs to the field of Crystallography). In columns (3) and (4), articles published in general interest journals (e.g. Science, Nature, PLOS ONE, etc.) are always considered in the same field regardless of the other journal (e.g. the Diamond article was published in Science and its corresponding related article in Acta Crystallographica Section B: Structural Science; both journals are considered in the same field). Field definitions provided by Thomson Reuters Web of Science. Regressions include 487 related articles. Note that a given article can appear multiple times if its authors are located in different locations as measured by their postcode. Columns (5)-(10): dependent variable is equal to 1 if a given article is a related article. All other articles are articles randomly drawn from the same journal in the same publication year (and have at least one UK-based author) as a given related article. For example, if a given related article was published in 2009 in Acta Crystallographica Section B: Structural Science, we randomly selected three articles published in 2009 with at least one UK-based author from Acta Crystallographica Section B: Structural Science. The sample in columns (5) and (6) consists of all related articles; the sample in columns (7) and (8) consists only of 'Diamond journals' (journals in which at least one Diamond article has been published); the sample in columns (9) and (10) consists of all other journals. Robust standard errors clustered at the journal level. All regressions include a constant. * Significant at 10%, ** at 5%, *** at 1%.

Notes. Dependent variable is publication count by TTWA and year. In columns (1)-(3), the sample contains an alternative set of related articles constructed by restricting the set of related articles to articles published in either a field journal that is in the same field as the corresponding Diamond article or a general interest journal. In columns (4)-(6), the sample consists of all articles with at least one UK-based author over the period 2000-10 that appeared in the top 3 Diamond journals: Journal of Molecular Biology, Journal of Biological Chemistry, and Acta Crystallographica Section F, a total of 6,189 articles. Robust standard errors clustered at TTWA level. All regressions include a constant. Covariates include %NVQ4 and above, Labour force, no. Wellcome grants, £'000 awarded by Wellcome, and £'000 awarded by STFC. * Significant at 10%, ** at 5%, *** at 1%.