Program on Education Policy and Governance Working Papers Series

Does Additional Spending Help Urban Schools? An Evaluation Using Boundary Discontinuities

Improving the educational attainment of disadvantaged students in urban schools is a priority for policy worldwide, but existing research is equivocal about the effectiveness of additional funding for achieving this objective. This study exploits anomalies in the spatial dimension of school funding policy in England to provide new evidence on this question. An "area cost adjustment" and other aspects of the formula that allocates central grants to Local Authorities (school districts) mean that neighbouring schools with similar intakes, operating in the same labour market and facing the same prices for inputs, can receive very different incomes. We find that these funding disparities give rise to sizeable differences in pupil attainment in national tests at the end of primary school. This shows that school resources have an important role to play in improving educational attainment. The results have direct implications for the current "Pupil Premium" policy in England.


Introduction
Improving the educational attainment of poor children is a top priority in many countries. This is a particular problem in countries such as the UK and US, where there are long and sizeable tails at the bottom end of the adult distribution of basic literacy and numeracy skills (OECD, 1995). This bottom tail is heavily populated with people who have been disadvantaged since childhood, and many of these children live in inner-city urban areas. 1 An analysis for recent cohorts of school children in the UK finds that there is already a substantial attainment gap at school entry between pupils who are poor enough to be eligible for free school meals and the rest, and this gap widens over time (National Equality Panel, 2010). 2 Recent academic work on addressing this gap (and raising achievement more generally) has turned attention towards institutional structures and incentives, such as greater school autonomy and competition. However, educational policy still operates as if resources matter: in England, disadvantaged areas receive higher levels of funding, and one of the UK Coalition government's flagship policies is a "Pupil Premium" to compensate schools that enrol high proportions of poor children. Therefore, in this paper, we turn back to the central question of whether simply giving more money to schools results in higher achievement. Our empirical analysis uses a design that focusses on the effects of explicit differences in the grants paid to neighbouring city schools in adjacent school districts in England. This design combines discontinuities across district boundaries with instrumental variables derived from the national funding formulae.
The research design is rooted in a policy anomaly that means that neighbouring schools with similar pupil intakes can receive markedly different levels of core funding if they are in different education authorities in England. This happens partly because of rules in how funding is allocated to Local Authorities by central government, and there have been various local campaigns from authorities that have felt unfairly treated. 3 In brief, "area cost adjustments" are made that are intended to compensate for differences in labour costs between areas, whereas in reality teachers are drawn from the same labour market and are paid according to national pay scales. Consequently, schools which are close together but in different Local Authorities can get very different levels of funding, despite being otherwise very similar in their geographical location, catchment areas, student intakes and the prices they face for their inputs. Primary schools in urban areas are particularly exposed to this funding anomaly, since they tend to be close together and attract pupils from the local area in which they are located. These schools also have a relatively high percentage of poor children.
1 It has long been established that family background and early childhood experiences are the most important determinants of educational outcomes (Coleman, 1966). The relationship between family background and educational attainment is stronger in England than in any of the 54 countries included in the TIMSS study (Schuetz et al. 2005; Blanden, 2009).
2 Specifically, the proportion of poor children reaching the "expected level" at school entry (the "Foundation Stage") is 22 percentage points lower than others. This widens over time. For example, on leaving school only 13 per cent of pupils eligible to receive free school meals go on to higher education compared to 32 per cent of all others (NEP report, p. 341).
3 Local Authorities are the local government districts through which most schooling in England is organised. We are primarily interested in differences between local authorities in the funding received from central government. Another source of variation (where we do not have good information) is different rules in how Local Authorities allocate funding within areas.
This feature of the English funding system therefore allows us to evaluate the effect of expenditure on such schools using boundary discontinuity techniques. In the education literature, these techniques have often been used to look at the impact of school test scores on house prices as well as being used in other areas of economics (e.g. the taxation literature). 4 We show that schools on either side of Local Authority boundaries receive different levels of funding and that this is associated with a sizeable differential in pupil achievement at the end of primary school.
As discussed above, this investigation is important for two main reasons. Firstly, improving the attainment of children in disadvantaged urban areas is a top priority because of concerns about economic inequality (of which education is one aspect) and the heavy bottom tail of the educational distribution. It is important to identify the effects of expenditure on this population. The UK government's "pupil premium" is directing additional resources (£488 per pupil in 2011-12) at schools with pupils from deprived backgrounds, and our investigation gives us a good idea of the likely impact of such a policy. Secondly, there is an age-old debate in the academic literature on the causal effect of raising school expenditure on pupil attainment. The relationship is hard to identify because expenditure is often allocated to schools in ways that are correlated with pre-existing pupil advantages and disadvantages. In some contexts (including England) resources are centrally allocated partly on the basis of educational needs, which are negatively related to pupil attainment.
In other contexts (e.g. the US) expenditure derives from a local tax base which increases with parental wealth, which is in turn positively related to pupil attainment. Studies that identify the effect in a convincing way are relatively few and there are very different views on the overall interpretation of the literature from economists working in this field. Hanushek (2008) argues that accumulated research says that there is currently no clear, systematic relationship between resources and student outcomes. However, high quality studies that show some effect from resource-related factors include Angrist and Lavy's (1999) study on the effect of class size in Israel; studies on the experimental Tennessee STAR class size reduction (Krueger, 1999; Krueger and Whitmore, 2001); studies that have made use of student finance reforms (Guryan, 2001; Roy, 2004); and some of Hanushek's own work (Rivkin, Hanushek and Kain, 2005). There have also been a few recent papers in England that have found modest effects of increased school resources (Machin et al. 2010; Holmlund et al. 2010; Jenkins et al. 2006). Our research design is the first of which we are aware that applies the boundary discontinuity approach in order to provide credible causal estimates of the effects of school expenditure differentials.
4 With respect to the literature on the effect of test scores on house prices, papers that use regression discontinuity methods include Black (1999), Kane et al. (2005), Fack and Grenet (2008) and Gibbons et al. (2009). The method is used in many other areas of economics; for example, Cushing (1984) uses it to look at the effect of taxation on house prices. Duranton et al. (2011) look at the effect of taxation on firms using a combined discontinuity and instrumental variables methodology that is similar to ours.
To preview our results, we show that schools close to Local Authority boundaries that are well matched in terms of pupil characteristics do receive different levels of funding from central government and these differences in resources are associated with differentials in pupil performance. Specifically, we group schools close to Local Authority boundaries into neighbourhood clusters on the basis of proximity and the extent of disadvantage (as measured by the proportion of children eligible to receive free school meals). We instrument school expenditure using variables that capture cross-boundary variation in the funding formula.
Our results imply that an additional £400 per student per year (a 12.3% increase relative to the mean) could raise achievement by around 10 per cent of a standard deviation. These effects are, however, larger in schools that have higher proportions of disadvantaged students. The effects reported here are larger than those typically found in the literature and suggest that increasing school expenditure has an important part to play in raising educational attainment in disadvantaged urban areas. Although we cannot provide decisive evidence on the channels through which increased spending is effective, we provide some additional evidence on how school spending responded to the cross-boundary income differentials in these urban schools. We find that the additional income was spent disproportionately on learning resources, supplies and bought-in professional services, rather than teaching staff. These changes in the budget shares are, however, quite small.
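The headline figures can be unpacked with some simple arithmetic. The lines below are a sketch only: the implied mean expenditure is backed out from the two numbers quoted above, and the per-£1,000 figure assumes the effect scales linearly, which is an assumption made for illustration here and not a result reported in the text.

```latex
% Implied mean per-pupil expenditure, from "£400 is a 12.3% increase":
\bar{E} \approx \frac{400}{0.123} \approx 3{,}252 \ \text{(pounds per pupil per year)}

% Implied effect per pound, from "£400 raises achievement by about 0.10 SD":
\hat{\beta} \approx \frac{0.10}{400} = 0.00025 \ \text{SD per pound}

% Under the (assumed) linear extrapolation, per £1,000:
1000 \times \hat{\beta} \approx 0.25 \ \text{SD}
```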
The remainder of our paper is structured as follows: we discuss the institutional structure of schools in England and how funding is allocated (Section 2); data (Section 3); empirical strategy (Section 4); regression results (Section 5); and discussion and conclusions (Section 6).

Education in England: the Institutional Structure
In England, there is a National Curriculum and years of compulsory education are organised into four "Key Stages" (ending at the age of 7, 11, 14 and 16). At the end of primary school (end of Key Stage 2), all students in England undertake national tests in English, Maths and Science. These are national tests that are externally set and marked. They are important in the accountability system since they form the basis of School Performance Tables (or "league tables") at the end of primary school. Our outcome variable will be test scores at this stage of education (when children are aged 11). There is no grade repetition in the English system.
There are about 15,000 primary schools in England. Schooling is organised at the local level by Local Authorities, which are usually the same bodies as the local councils that control other aspects of local government. The majority of pupils (67%) attend "Community Schools". In this case, the Local Authority employs the school's staff, owns the school's land and buildings and has primary responsibility for deciding the arrangements for admitting pupils. In the case of oversubscription, the most commonly used criteria for admissions are a siblings rule and proximity to the school. 5 Most other primary schools are faith schools. In some cases, these schools have greater autonomy from the Local Authority and an obligation to raise part of the capital funding ("Voluntary Aided schools"). Also, oversubscription criteria include affiliation to the religious denomination of the school. We restrict our attention to children attending Community schools as they are more homogeneous in their funding, governance and admissions structure and thus easier to match across Local Authority boundaries.
Most funding to schools goes through Local Authorities (of which there are 150). Over the period relevant to this study, most funding is allocated to Local Authorities using a national formula, and Local Authorities then each use their own formula to allocate this funding to schools. 6 When the funding gets to schools, it is for the school to decide how to use it, although the bulk of expenditure is on teacher pay, which follows national pay scales. The broad allocation of spending is as follows: 60% on teachers; 20% on support staff or other staff; 6% on building and maintenance; 5% on learning resources/IT; and 8% on a residual category. This has changed little over time (Holmlund et al. 2010).
5 The School Admissions Code sets out rules for admissions criteria. Notably, student ability or family income cannot be used as a criterion.
6 There has been a recent move to give many more schools autonomy. In this case funding will come directly from central government rather than through the Local Authority. However, this initiative is very recent and does not affect most schools in our sample.
A key feature of national funding is that there is a basic allocation per pupil, with allowances made for area (the "area cost adjustment"), sparsity, additional educational needs and "high cost" pupils. 7 There have been some changes to the formulae over time (as documented in West, 2008). For example, in 2006/07 the funding formula changed to the "Dedicated Schools Grant". This was based on similar principles to the earlier formula (including adjustments for area and educational need) but introduced greater complexity, with additional funding strands to support national educational priorities.
We are mainly concerned with the aspect of the national formula relating to the "area cost adjustment". This reflects two kinds of difference between areas in costs: differences in labour costs (the main factor) and differences in business rates paid on local authority premises. The "labour cost adjustment" is based on the differences in wage costs between areas. The underlying rationale is that local authorities have to compete for staff with other employers and therefore need to pay the local "going rate". 8 This is worked out by applying regression analysis to the Annual Survey of Hours and Earnings, a national survey of employers which collects information on individual workers' hours and pay. An index of "labour cost adjustment" factors is then produced and used in the formula to allocate education resources from central government to local authorities. However, this extra funding does not necessarily get passed on to teachers, as they are paid according to national pay scales with very limited regional variation. 9 Unsurprisingly, this has provoked considerable controversy over time. For example, a recent newspaper article reports a review of the situation of a Local Authority in London (Haringey): "under the current system, the borough is treated as outer London even though the challenges its schools faces and its teachers pay are in line with the inner-city areas like Camden, Hackney and Islington. It means each pupil in Haringey received £1,300 less in funding per pupil…" 10 We make use of this funding anomaly to identify the effect of school expenditure on similar schools either side of an administrative boundary. Further detail on the mechanics of the "area cost adjustment" is described in the section below.
7 Some of the indicators used to measure additional education needs and "high cost" pupils have changed over time. Examples of what counts as "additional educational needs" are the proportion of children who do not speak English as a first language and measures of deprivation. Indicators used for "high cost" are the proportion of children with a low birth weight and the proportion of adults on income support in the Local Authority.
8 See http://www.local.odpm.gov.uk/finance/0708/acameth.pdf
9 There are four scales according to geography: Inner London; Outer London; "The Fringe" (i.e. a small number of areas that are within largely rural Local Authorities); and the rest of England and Wales. These differentials in teacher pay do not correspond to the "area cost adjustment". The latter is more refined (i.e. there are many areas) and much larger than differences in teacher pay across these regions. Nonetheless, differences in teacher pay across areas can be a cause of resentment (e.g. if they work in Inner London rather than Outer London) since teachers do not necessarily live in the Local Authority where they teach.

Data
Our study is based on the National Pupil Database (NPD, a census of all students in state schools) between academic years 2003/4 and 2008/9. The data set contains information on the national test scores of all 11-year-olds in England (i.e. at the end of Key Stage 2) in English, Maths and Science, as recorded in the Key Stage tests that are taken in May. As there is no grade repetition in the English system, all pupils are in the same year group when they take these tests. We use the average score across these subjects as our outcome variable. We also investigate the impact of school expenditure on each subject separately. The student census data in the NPD are available from 2001/2002, but we do not have full information on funding before 2002/3 and wish to include time-lagged funding data, so we restrict attention to 2003/4 onwards.
The National Pupil Database also has information on the prior attainment of each pupil: age 7 tests (i.e. at the end of Key Stage 1) in reading, maths and science. Demographic information included in the data set relates to gender, ethnicity, whether English is the pupil's first language, and whether the pupil is known to be eligible for Free School Meals (an important indicator of socio-economic disadvantage). This information is recorded in January of each year. Geographic information on the pupil's home residence is also available at Census "Output Area" level (i.e. small geographic clusters of households) [...] and the proportion of pupils who do not speak English as a first language.
The "Area Cost Adjustments" (ACAs) are fundamental to our empirical strategy. The ACAs are produced by the Department for Communities and Local Government (CLG), and the methodology is discussed in CLG (2007).
As discussed above, ACAs reflect two kinds of difference between areas in costs: differences in labour costs (the Labour Cost Adjustment, LCA) and differences in business rates paid on local authority premises (the Rates Cost Adjustment, RCA). The Labour Cost Adjustment component is derived from wage regressions estimated on a large national sample of employees, the Annual Survey of Hours and Earnings. Essentially, log wages are regressed on a set of individual characteristics (including occupational controls, age, gender, industry) and geographical area fixed effects. The LCAs are then estimated as wage indices from the area fixed effects. For determining the education ACA, the RCA and LCA are weighted according to the estimated contribution of labour (80%) and rates (between around 1% and 2%) to education costs, so the LCA is by far the most important factor and the rates adjustment is inconsequential. For example, the RCA for Inner London for the 2008/9 index was 1.63 and the LCA was 1.32, but the Inner London overall ACA is 1.271 (see CLG 2007, 2005). Oddly, a lower limit is applied such that the ACAs are lower-truncated at the mean. Areas with an average or lower than average wage index are given an ACA of 1. Areas with a higher than average wage index are assigned the actual estimated value (e.g. 1.1 if the index is 10% above the mean). The logic for this truncation is not completely clear, but the arguments appear to be political and have to do with not wanting to be seen to "penalise" low wage areas with lower central government funding allocations.
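The truncation and weighting steps described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the official CLG calculation: the combination rule (one plus the share-weighted uplifts) and the 1.5% rates share (the midpoint of the "between around 1% and 2%" range) are assumptions, and the function names are hypothetical. With the Inner London figures quoted above, the assumed rule gives roughly 1.27, which is close to but not identical to the published 1.271; the exact method is set out in CLG (2007).

```python
def truncate_at_mean(wage_index: float) -> float:
    """Lower-truncate a wage index at the mean, where the mean is normalised to 1."""
    return max(1.0, wage_index)


def combine_adjustments(lca: float, rca: float,
                        labour_share: float = 0.80,
                        rates_share: float = 0.015) -> float:
    """ASSUMED rule: uplift each cost share by its own index.

    labour_share is the 80% quoted in the text; rates_share is an
    assumed midpoint of the quoted 1%-2% range.
    """
    return 1.0 + labour_share * (lca - 1.0) + rates_share * (rca - 1.0)


# An area with a below-average wage index is given an adjustment of 1.
below_average = truncate_at_mean(0.95)
# An area 10% above the mean keeps its estimated value.
above_average = truncate_at_mean(1.10)
# Inner London figures quoted in the text (2008/9 index).
inner_london = combine_adjustments(lca=1.32, rca=1.63)
print(below_average, above_average, round(inner_london, 3))
```

The sketch makes the text's point visible: the labour term contributes almost all of the uplift, and the rates term moves the result only in the third decimal place.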
We have these education ACA data for every year. The ACAs have the following consequences for real per-pupil funding differences between neighbouring schools in adjacent Local Authorities (LAs). Firstly, the ACAs are derived from national wage data on the private and public sectors (the New Earnings Survey/Annual Survey of Hours and Earnings series), but teacher pay is highly regulated by union bargaining at the national level and so does not vary between labour markets in the same way as wages in general.
Secondly, the ACAs are defined for sub-regional geographical units that are aggregates of LAs, so neighbouring LAs can receive different levels of per-pupil funding simply because they have been allocated to different ACA regions. All these factors together can lead similar neighbouring schools in adjacent LAs to receive very different levels of per pupil funding, and it is these cross-boundary differences in LA funding and ACAs that we exploit in our empirical analysis.
To set up these data for our empirical analysis, we carry out a number of data manipulations using a Geographical Information System, computing distances between each school and its nearest neighbours based on the school postcode coordinates, and distances to Local Authority boundaries. We also derive a subset of LA boundaries that do not coincide with geographical features (major roads, motorways, railways) using feature data from the Ordnance Survey (these geographical data were obtained from the UKBORDERS and Digimap services at www.edina.ac.uk).
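As a rough illustration of the school-to-school distance step, the sketch below computes straight-line distances from projected coordinates and picks the nearest school in a different LA. It assumes that postcode coordinates are available as eastings and northings in metres; the school records and function names are hypothetical. Distances to LA boundaries require boundary geometries and GIS tooling, which are not covered here.

```python
import math

# Hypothetical school records: (school id, LA code, easting in m, northing in m).
schools = [
    ("A", "LA1", 530000.0, 180000.0),
    ("B", "LA2", 530900.0, 181200.0),
    ("C", "LA2", 534000.0, 183000.0),
]


def distance_km(p, q):
    """Straight-line distance in km between two school records."""
    return math.hypot(p[2] - q[2], p[3] - q[3]) / 1000.0


def nearest_in_other_la(origin, candidates):
    """Return (distance_km, school id) of the closest school in a different LA."""
    others = [(distance_km(origin, c), c[0]) for c in candidates if c[1] != origin[1]]
    return min(others) if others else None


nearest_to_a = nearest_in_other_la(schools[0], schools)
print(nearest_to_a)
```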

General principles
The central aim of the empirical research is to answer the question as to whether (and to what extent) additional school resources raise student achievement, with a particular focus on low-income, low achieving children in urban schools. All research that aims to answer this question has to address concerns that any estimated statistical association between resources and achievement is not causal. These concerns arise because the resources a school receives are dependent on the characteristics of the school, neighbourhood and its student intake, which are in turn correlated with student achievement.
To solve this identification problem, we employ a research design that combines elements of matching, regression discontinuity and instrumental variables. This design makes use of funding differentials that occur for similar schools located on opposite sides of Local Authority (school district) boundaries. These funding differentials arise because central government funding formulae pay out different per-pupil grants to Local Authorities (LAs), on the basis of average LA demographics and the wages in the labour market in which the LA is assumed to operate. In turn, LAs distribute these grants to schools, but not in ways that compensate for the specific circumstances of each school in their jurisdiction. Schools in adjacent LAs but close to the boundary will tend to be more similar to each other in terms of neighbourhood, intake and labour market than they are to the LA as a whole. On account of being located on either side of the boundary, they will receive differential funding from their respective LAs even though they operate in very similar contexts. As discussed above, this funding anomaly is particularly pertinent with respect to the Area Cost Adjustments (ACAs) that are used in central government formulae to compensate for wage differentials across labour markets, since neighbouring schools can receive very different per-pupil resources to compensate for inter-labour market wage differentials, even though close neighbouring schools are, self-evidently, in the same labour market and face the same prices for labour and other inputs. 12 Our method therefore uses these discontinuities in LA funding, and discontinuities in the ACA indices, as instruments for differences in school expenditure across LA boundaries.

A more formal exposition
Our empirical estimates centre on estimating the parameter $\beta$ in regression models of the form

$$Y_{isjt} = \beta E_{sjt} + \mu_{st} + \theta_g + \varepsilon_{isjt} \qquad (1)$$

where $Y_{isjt}$ is student $i$'s key stage 2 test score (an average across three subjects, Maths, Science and English) at the end of primary school (age 11), and $E_{sjt}$ is a measure of per-student, current expenditure in school $s$, located in neighbourhood $j$, in the years leading up to year $t$. 13 Optional control variables (e.g. for pupil background and prior achievements) can be included, but we suppress these in the notation for simplicity. Pupil achievement is, in part, determined by unobserved school effects ($\mu_{st}$), neighbourhood effects ($\theta_g$) and a standard random error term ($\varepsilon_{isjt}$). School expenditure is endogenous to pupil performance ($Y$), because it is correlated with these school and neighbourhood effects through central government and LA funding decisions, and because of schools' own fund raising and expenditure decisions. 14 So, the fundamental identification problem in estimating the coefficient $\beta$, interpreted as the causal linear effect of resources on achievement, is that school resources $E_{sjt}$ are correlated with $\mu_{st} + \theta_g$.
12 In some areas, the wages schools have to pay their teachers are higher in high-ACA areas due to the London weighting on pay scales as discussed in footnote 9, but in general it is up to school management to decide whether they use additional resources on teacher pay or other expenditure items.
13 We use means in the 4 preceding years, spanning the key stage 2 phase in primary education.
14 Note, our empirical analysis can allow that these neighbourhood effects vary by year, but we suppress this for notational simplicity.
To assist with understanding our empirical strategy, it is useful to write out a representation of the process determining school expenditure in terms of its essential components:

$$E_{sjt} = \alpha_1 f_{st} + \alpha_2 g_{jt} + \alpha_3 h_{lt} \qquad (2)$$

where $\alpha_1 f_{st}$ represents school fund raising (and school-level decisions about borrowing and saving), $\alpha_2 g_{jt}$ represents income allocated to the school by the LA in relation to its neighbourhood location and expected intake, and $\alpha_3 h_{lt}$ represents LA average per-pupil income from central government grant. Our estimation strategy for (1) is a differencing-based discontinuity design, combined with an instrumental variables (IV) approach using instruments explicit in equation (2). 15 This strategy uses cross-sectional differences in the funding formula over closely spaced schools, and changes in the funding formula over time, which (we argue) are uncorrelated with changes in factors affecting these schools. We firstly eliminate neighbourhood factors common to neighbouring schools using a within-groups fixed effect estimator to difference out $\theta_g$, in which the groups $j$ are defined by clusters of neighbouring schools (which we discuss in Section 4.3). 16 This yields differenced versions of (1) and (2):

$$DY_{isjt} = \beta\, DE_{sjt} + D\mu_{st} + D\varepsilon_{isjt} \qquad (4a)$$

$$DE_{sjt} = \alpha_1 Df_{st} + \alpha_2 Dg_{jt} + \alpha_3 Dh_{lt} \qquad (4b)$$

where $D$ represents the within-$j$ transformation. This is not an effective strategy for the full set of schools, because the differences in central government grants to LAs are zero by construction within LAs ($Dh_{lt}$ in 4b). In addition, neighbouring schools with similar characteristics within the same LA probably receive very similar levels of funding delegated from the LA. Therefore, a large proportion of the residual variation in funding differences between schools within the same LA in equation (4a) would be due to school-level decisions, or components of LA-delegated funding that relate to school attributes which are not controlled by spatial differencing. Both of these components are potentially correlated with the school-by-year effects ($f_{st}$).
15 One fixed-effect method would be to difference equation (1) over time in a standard panel data estimator. This is the approach used by Holmlund et al. (2010) using similar data to ours. However, there is very little variation over time in the ACAs used in this study, so time differencing is inappropriate in our context.
16 The more traditional boundary discontinuity method would involve specifying dummy variables indicating the nearest district boundary to each school, and including these dummy variables in the regression estimation of equation (1). This traditional method assumes that unobserved factors affecting school performance are constant along the boundary, or that the average on one side is the same as the average on the other, where the average is taken along the whole boundary length. This need not be the case when, as in our setting, the boundary is long and schools are not uniformly distributed along the length of the boundary on both sides. Differencing within neighbouring, matched school groups is more general in allowing the unobserved spatial effects on school performance to vary along the boundary (e.g. see Fack and Grenet 2010, Gibbons et al. 2009, Duranton et al. 2011).
However, building on the boundary regression discontinuity design literature (Black 1999 etc.), we can exploit the discontinuity in school funding between neighbouring schools across LA boundaries, arising from $Dh_{lt}$, for the subset of schools that share the same geographical neighbourhoods but are on opposite sides of the LA boundary. The idea is then to use these core differences in funding between LAs ($Dh_{lt}$) as a source of exogenous variation with which to identify $\beta$ in equation (4a).
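The within-cluster transformation, and the reason it only yields useful variation for clusters that straddle an LA boundary, can be illustrated with a small sketch. The transformation is implemented here as deviations from the cluster mean, which is the standard within-groups estimator; the grant figures and names are hypothetical.

```python
from collections import defaultdict


def within_transform(values, clusters):
    """Subtract the cluster mean from each value (the within-j transformation D)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, j in zip(values, clusters):
        sums[j] += value
        counts[j] += 1
    return [value - sums[j] / counts[j] for value, j in zip(values, clusters)]


# Hypothetical per-pupil LA grants for four schools in two clusters.
# Cluster j1 straddles an LA boundary; both schools in j2 are in the same LA.
clusters = ["j1", "j1", "j2", "j2"]
la_grant = [3000.0, 3400.0, 3200.0, 3200.0]

d_grant = within_transform(la_grant, clusters)
print(d_grant)
```

In the second cluster the differenced grant is zero for both schools, so only cross-boundary clusters such as the first contribute identifying variation.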
We will present a number of estimates based on this research design in our empirical results. Firstly, we present estimates of equation (4a). On its own, this is still ineffective, because there remain differences in school expenditure decisions which are correlated with the school-by-year fixed effects ($Df_{st}$) due to unobserved differences between schools that are not fully controlled by the discontinuity design. One solution is to replace school-level expenditure differences ($DE_{sjt}$) with LA-level average expenditure per-pupil differences, thus eliminating school-specific expenditure components. However, our estimate of $\beta$ then yields an estimate of the response of pupil achievement to LA-average expenditure, rather than school-specific expenditure. Our preferred strategy is to use the instruments explicit in the funding mechanism.
A second solution, therefore, would be to use LA-level income differences from central government ($Dh_{lt}$) as an instrument for school-level expenditure differences ($DE_{sjt}$). However, due to changes in the central government funding system, we do not directly observe a central government grant to primary schools after 2005/6. 17 Instead, we can use the mean income delegated by LAs to schools within their jurisdiction as a potential instrument, since this is free of school-specific components ($Df_{st}$) and determined, for the most part, directly by the grant from central government.
17 After this year, central government did not provide a ring-fenced grant to LAs for primary school spending but switched to a block grant to cover all types and phases of school (the Dedicated Schools Grant).
There is still some danger in using cross-boundary differences in LA funds delegated to schools as an instrument in this context, because this could in part indicate differences between LAs in terms of demographics, administrative effectiveness and strategic direction which are not effectively controlled for by the boundary discontinuity design. We will partly address this issue by matching schools according to a measure of disadvantage (i.e. the proportion of children eligible to receive Free School Meals in the school), as well as by geographical proximity when forming our neighbourhood clusters $j$. We can also control for the index of Additional Educational Needs (AEN) used in the formula that determines funding to LAs. However, an alternative solution is to use the differences in Area Cost Adjustments ($Daca_{lt}$) between LAs as instruments for the differences in school-level expenditure. The identifying assumption is that the differences in the ACAs between neighbouring schools, across LA boundaries, are correlated with differences in school expenditure, but uncorrelated with differences in the characteristics of schools and their students. This assumption seems plausible given that the ACAs are intended to compensate LAs for differences in labour costs, and yet closely neighbouring schools are self-evidently in the same labour market.
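To make the logic of the instrument concrete, the following sketch applies a just-identified IV estimator (one instrument, one endogenous regressor) to already-differenced data. All numbers are synthetic and constructed for illustration: the first-stage slope, the size of the unobserved component and the "true" effect of 0.00025 SD per pound are assumptions, not estimates from this study, and the function names are hypothetical. The full analysis would also involve control variables and appropriate standard errors, which are omitted here.

```python
def iv_estimate(d_y, d_e, d_z):
    """Just-identified IV: sum(Dz * DY) / sum(Dz * DE) on within-cluster differences."""
    numerator = sum(z * y for z, y in zip(d_z, d_y))
    denominator = sum(z * e for z, e in zip(d_z, d_e))
    return numerator / denominator


def ols_estimate(d_y, d_e):
    """Within-cluster OLS slope of DY on DE (no intercept needed after differencing)."""
    return sum(e * y for e, y in zip(d_e, d_y)) / sum(e * e for e in d_e)


# Synthetic, already-differenced data for two 2-school clusters.
d_aca = [-0.05, 0.05, -0.05, 0.05]           # instrument: ACA differences
unobserved = [-100.0, 100.0, 100.0, -100.0]  # school-specific need, unrelated to d_aca
true_beta = 0.00025                          # assumed effect, SD per pound

# Expenditure responds to the ACA and to the unobserved need.
d_exp = [4000.0 * z + u for z, u in zip(d_aca, unobserved)]
# The unobserved need also lowers test scores directly, biasing OLS.
d_score = [true_beta * e - 0.001 * u for e, u in zip(d_exp, unobserved)]

beta_iv = iv_estimate(d_score, d_exp, d_aca)
beta_ols = ols_estimate(d_score, d_exp)
print(beta_iv, beta_ols)
```

Because the synthetic instrument is constructed to be orthogonal to the unobserved component, the IV estimate recovers the assumed effect, while OLS is pulled towards zero by the negative correlation between need-driven spending and attainment.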

Defining matched k-school clusters
We now explain how we define school 'neighbourhood' clusters (j) and implement the fixed effects estimator in (4a/b). To create a matched school cluster of maximum size k, we take an 'origin' Community school and match it to its nearest k-1 neighbouring Community schools in an adjacent LA by year where these neighbours are within 2km straight line distance. This cluster is then restricted to the schools that fall within 5 percentiles of the origin school in the distribution of proportion of Free School Meal (FSM) students. The intention here is primarily to match schools in terms of neighbourhood j and basic school type, allowing us to eliminate unobserved neighbourhood and school-type fixed effects (including labour market effects).
However, additional matching by FSM also eliminates potential differences in FSM proportions, which may reflect LA-average FSM differentials and hence enter into the between-LA funding differences, or may result in differential funds being allocated to schools within LAs (e.g. if some LAs provide compensating resources to disadvantaged schools).
We do this matching for all Community schools, but exclude any cases in which there are zero FSM-matched schools within 2km. The maximal value of k we will use is 8 (implying we match each school to up to 7 nearest schools, although the mean number in the cluster will be less than this due to the second stage restriction on schools that are similar in terms of FSM). The minimum value of k we use is 2, implying we match each origin school to its nearest school across the LA boundary. These k schools are 'stacked' in a panel format, and students are assigned to their corresponding schools to create a student-level data set. So each student in an 'origin' school s in a k-school cluster becomes grouped with other students in the nearest, up to k-1, FSM-matched schools in adjacent LAs. This student may appear again in the dataset, because the 'origin' school s may appear as a matched school for another origin school s' in an adjacent LA. The origin school identifiers s, s', s'' etc. serve as identifiers for the school clusters j in the within-groups regression (4a/b). In addition, the same schools (but with different students and different expenditures) appear in our data in different years.
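The matching steps described above can be sketched as follows, for a single year of data. This is a minimal illustration under stated assumptions: the field names (`id`, `la`, `x_km`, `y_km`, `fsm_pct`) and the toy schools are hypothetical, coordinates are assumed to be planar and measured in kilometres, and any school in a different LA within the distance limit is treated as being in an adjacent LA.

```python
from math import hypot

def build_clusters(schools, k=4, max_km=2.0, fsm_band=5.0):
    """Return {origin_id: [origin_id, matched_id, ...]} for one year of data.

    schools: list of dicts with hypothetical keys
      id, la, x_km, y_km (planar coordinates in km), fsm_pct (FSM percentile).
    """
    clusters = {}
    for origin in schools:
        candidates = []
        for other in schools:
            if other["la"] == origin["la"]:
                continue  # only schools across an LA boundary qualify
            dist = hypot(other["x_km"] - origin["x_km"],
                         other["y_km"] - origin["y_km"])
            if dist <= max_km:
                candidates.append((dist, other["id"], other))
        # Step 1: keep the nearest k-1 neighbours across the boundary.
        candidates.sort(key=lambda c: (c[0], c[1]))
        nearest = [c[2] for c in candidates[: k - 1]]
        # Step 2: restrict to schools within the FSM percentile band.
        matched = [s["id"] for s in nearest
                   if abs(s["fsm_pct"] - origin["fsm_pct"]) <= fsm_band]
        # Origins with zero FSM-matched neighbours are excluded.
        if matched:
            clusters[origin["id"]] = [origin["id"]] + matched
    return clusters

# Toy example (made-up schools).
schools = [
    {"id": "A", "la": "X", "x_km": 0.0, "y_km": 0.0, "fsm_pct": 40},
    {"id": "B", "la": "Y", "x_km": 0.5, "y_km": 0.0, "fsm_pct": 42},
    {"id": "C", "la": "Y", "x_km": 1.0, "y_km": 0.0, "fsm_pct": 60},
    {"id": "D", "la": "Y", "x_km": 3.0, "y_km": 0.0, "fsm_pct": 41},
    {"id": "E", "la": "X", "x_km": 0.2, "y_km": 0.0, "fsm_pct": 39},
]
clusters = build_clusters(schools, k=4)
```

In this toy example school B appears both as an origin and as a matched school in the clusters of A and E, which is the repeated-observation structure described in the text.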
Clearly, this setup generates a complex data and error structure, with implications for the estimated standard errors on the regression coefficients. For this reason, we make our standard errors robust to arbitrary correlation in the unobservables within LA boundary groups, by standard 'clustered' standard error methods.
These LA boundary error clusters are groups of schools for which the same pair of LAs appears for either the 'origin' or 'matched' school. Clustering the standard errors in this way allows for error autocorrelation induced by the repeated observations in the data setup, caused by spatial autocorrelation along LA boundaries, or serial correlation within schools, over time.
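As a hedged illustration of how such boundary groups might be labelled in practice, the sketch below assigns each stacked observation an order-free key built from the pair of LAs involved. The function names and LA labels are hypothetical; the resulting key is the kind of grouping variable that would be passed to a clustered standard error routine.

```python
from collections import defaultdict

def boundary_group(origin_la, matched_la):
    """Order-free label for the pair of LAs on either side of a boundary."""
    return tuple(sorted((origin_la, matched_la)))

def group_rows(rows):
    """Map each boundary group to the indices of the rows that belong to it."""
    groups = defaultdict(list)
    for i, (origin_la, matched_la) in enumerate(rows):
        groups[boundary_group(origin_la, matched_la)].append(i)
    return dict(groups)

# Toy rows: (LA of origin school, LA of matched school). Labels are made up.
rows = [("LA_A", "LA_B"), ("LA_B", "LA_A"), ("LA_A", "LA_C")]
groups = group_rows(rows)
```

Sorting the pair means that an observation falls into the same error cluster whichever side of the boundary supplies the origin school.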
One important point to note is that this research design creates a selected sub-sample of schools and students: those Community schools that are located close to LA boundaries and have k-1 matchable Community schools within 2km. The schools in these boundary sub-samples are likely to be primarily urban (given the greater density of schools and LA boundaries within urban areas), with all this implies in terms of student demographics and school context. To the extent that the effects of expenditure are heterogenous across school and pupil types, the results we present are specific to schools and students of the type in our boundary sample, rather than the general population, which motivates our specific research focus on disadvantaged students in urban schools. This is an inevitable consequence of any research design that isolates specific non-random subgroups in the population in order to construct counterfactuals (including most regression discontinuity designs). Our additional results on heterogeneity by student and school type (see section 4.5 below) shed further light on the generalisability of the findings.

Evaluating the strategy and instruments
The identifying assumption in our preferred IV strategy is that the differences in the ACAs between neighbouring schools, across LA boundaries (and within boundaries over time 18) are correlated with differences in school expenditure, but uncorrelated with differences (and changes) in the characteristics of schools and their students. Our alternative IV strategy assumes that the difference between the average grant paid by an LA to its schools and the average grant paid by an adjacent LA to its schools is uncorrelated with the differences in characteristics between neighbouring schools in these adjacent LAs. We present a number of tests of these assumptions. Firstly, we look at how sensitive our estimates of β are to the inclusion of control variables for student demographics and prior achievements (namely test scores at age 7, key stage 1), and other components of the central government school funding formula (z_lt in equation 3a/b). Secondly, we present 'balancing' tests to show that the instruments are uncorrelated with differences in student characteristics across LA boundaries. These balancing tests involve testing for a correlation between a set of student and neighbourhood characteristics and our instruments. We do this in two ways. In the first, we simply re-estimate our main school cluster fixed effects regressions, where we instrument school expenditure with LA-income or the ACA, but replace student test scores with student characteristics as the dependent variable (and drop all control variables). In the second, we aggregate student characteristics to school-by-year level and regress these characteristics and some other time-varying school characteristics on our instruments, in a reduced-form regression with school cluster fixed effects. In the first case, we test for a zero coefficient on the (instrumented) expenditure variable. In the second case, we test for zero coefficients on the LA-income and ACA instruments.
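The second, reduced-form version of the balancing test can be illustrated with a short sketch: a school characteristic is regressed on the instrument after removing school-cluster means, and a slope close to zero indicates balance. The function name and the toy numbers are hypothetical and are not drawn from the data used in the paper; standard errors are omitted.

```python
import numpy as np

def within_cluster_slope(w, z, cluster):
    """OLS slope of characteristic w on instrument z with cluster fixed effects."""
    w = np.asarray(w, dtype=float)
    z = np.asarray(z, dtype=float)
    _, idx = np.unique(np.asarray(cluster), return_inverse=True)
    counts = np.bincount(idx)
    # Within-groups transformation: subtract cluster means from both variables.
    w_t = w - (np.bincount(idx, weights=w) / counts)[idx]
    z_t = z - (np.bincount(idx, weights=z) / counts)[idx]
    denom = z_t @ z_t
    if np.isclose(denom, 0.0):
        raise ValueError("instrument does not vary within clusters")
    return float((w_t @ z_t) / denom)

# Toy data (made up): two clusters of three schools.
cluster = np.array([0, 0, 0, 1, 1, 1])
z = np.array([1.00, 1.05, 1.10, 1.00, 1.02, 1.08])

# Balanced case: the characteristic differs between clusters but not
# with the instrument inside a cluster, so the slope is close to zero.
balanced = np.array([0.30, 0.30, 0.30, 0.10, 0.10, 0.10])
slope_balanced = within_cluster_slope(balanced, z, cluster)

# Unbalanced case: the characteristic moves with the instrument.
unbalanced = balanced + 2.0 * z
slope_unbalanced = within_cluster_slope(unbalanced, z, cluster)
```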
A further potential threat to our identification strategy, often raised as a criticism of studies that use administrative boundaries as a source of discontinuity, is that the administrative boundaries coincide with physical features such as roads and railways that bisect geographical areas into distinct communities, so that the neighbouring schools in adjacent LAs are not in practice in the same neighbourhoods, and the neighbourhoods may differ on unobservable dimensions. To assess this hypothesis, we re-estimate our main instrumental variable specifications using the sub-sample of schools that are separated by boundaries that do not coincide with railways, major roads, or motorways (our boundary sample already excludes schools separated by major coastal water features such as estuaries). 19

18 This constitutes only a small part of the variation in our data.

Extensions to the main methods
In addition to the baseline estimates of β in our LA boundary sub-sample we offer a number of extensions which potentially lead to additional insights into which students benefit and in what ways they benefit from additional funding. In particular, we are interested in whether additional funding is more effective for some students than others, and more effective in some school contexts than in others, and whether it has more impact on some subjects than others. To this end, we estimate regressions separately for students in different demographic categories (FSM, non-FSM, boys, girls, white, non-white, and high and low prior achievement as measured by ks1 scores). In all these cases, our estimates can only partially answer our questions because we do not have expenditure split by subject area, nor do we know on which students the money is being spent. Hence, the estimates depend on both the response of outcomes in a given category (subject, or student type) to expenditure in that category, and on the way that schools, on average, allocate their expenditure between these categories (i.e. how much of additional expenditure goes into maths teaching relative to English, or into lower achieving children relative to high achieving children). More concretely, we can answer questions about how achievements in schools in different contexts respond to increased expenditure by splitting our sample into different school types, estimating regressions separately for schools with above/below median proportions FSM, above/below median indices of students' residential neighbourhood deprivation (IDACI indices, see the data section), and above/below median average ks1 scores (i.e. test scores at age 7). Finally, we re-estimate our IV estimates of equation (4a/b) separately for ks2 Maths, Science and English tests.

Evidence on expenditure patterns
Using our methods, it is not possible to estimate what types of expenditure are most or least effective in raising achievements, because we do not have sufficient instruments to identify separate causal effects for different expenditure categories. We do, however, provide some insights by looking at how the overall funding differences affect spending in various categories using the detailed breakdown available in our school expenditure data. This is achieved by estimating a set of expenditure share equations similar to (4a/b) at school-by-year level, but replacing test scores with expenditure shares as the dependent variable, and using LA-income differences as instruments for school total expenditure. This approach is similar to that commonly used for estimating household consumption "Engel curves" in the consumption literature, where the equations would typically include additional controls for goods' prices. In our case we use the school-cluster fixed effects to control for prices: that is, we are comparing expenditures in closely spaced schools, which we assume face identical prices for their inputs.

Table 1 shows descriptive statistics for the full national sample and the boundary sample based on 4-school clusters, which will form the basis for most of our analysis (though we will report results for alternative sized clusters). Figure 1 maps the schools in this sub-sample. The full sample is not used in the empirical analysis and is shown only for comparison purposes. As we have discussed, our research design brings the focus on urban schools, the boundary sample being predominantly urban because of the greater density of boundaries and schools in urban areas. In fact, the sample is quite heavily weighted towards London schools, with schools close to boundaries in or on the periphery of London accounting for 60% of the sample (as compared to 14% in the population overall).
This urban sample has higher levels of per-student spending (£3689 compared to £3256 on average at 2009 prices), higher levels of income from the central government grant (£2889 compared to £2589), and a higher Area Cost Adjustment index. Children in the boundary schools are more likely to be on Free School Meals, less likely to speak English as a first language and less likely to be White British, reflecting their urban locations. The table also summarises the distances between our matched schools in the 4-school cluster boundary sub-sample. The schools are on average close to each other (less than 1.4km apart) and less than 500 metres from the LA boundary.

Description of the sample
The lower two panels of Table 1 show how the expenditure and income data look when we difference across LA boundaries, for schools in the 4-school cluster boundary sample. In the middle panel, the data is  Table 1. For reference purposes, the first pair (columns 1 and 2) presents simple OLS estimates on the boundary subsamples, but without school cluster fixed effects (i.e. the data is not differenced across boundaries as implied by equation 4a). Due to the needs-based resource allocation to schools (both from central government to LAs and from LAs to schools) the coefficients in the uncontrolled regressions (column 1) are negative and significant and cannot be interpreted as causal estimates. Column 2 adds in the set of control variables, which drives the coefficient towards zero (and insignificance), as we would expect since these variables at least partially control for the factors that jointly determine resource allocation and student achievement.

Regression results
The second pair (columns 3 and 4) relates to a k-school-cluster fixed-effect regression of ks2 scores on school-level expenditures, where expenditure is an average over the 4 years preceding the tests (i.e. equation 4a, with no instruments). The estimates in these specifications are negative, and, with no control variables, become significant as we move down the table to differences based on larger clusters. Controlling for student characteristics in column 4 renders all the estimates statistically insignificant. As discussed in section 4.2, these regression discontinuity design-based estimates use between-school variation in expenditure that is still potentially correlated with unobserved school characteristics, when these characteristics are not effectively controlled for by the discontinuity design. One reason for this failure in the discontinuity design is that schools will differ in their ability to attract additional funding from non-central sources (charities, events, special LA grants) for reasons that do not necessarily relate to geographical location, such as head teacher and staff motivation and effectiveness in fund raising, or random variation in student intakes that attract additional funding (e.g. children with diagnosed additional needs). A second reason is that the matched schools are not perfectly co-located and so potentially not perfectly matched on unobserved characteristics of their student intake. These estimates therefore cannot be interpreted as causal.
Columns 5 and 6 report the results when we use average LA primary school expenditure per student in place of school-level expenditure, thus mitigating the biases induced by school-specific unobservables. The coefficients become large, positive and statistically significant (except where we compare nearest school pairs in row 1). They are generally quite insensitive to the size of the school-cluster used, and whether or not control variables are included, although the coefficients are much more precise in the 4 to 8-school clusters, when we compare a school with more schools than just its nearest neighbour. The effect sizes imply that an increase of £1000 in average per-student spending in the LA as a whole is associated with between 0.10 and 0.18 of a standard deviation increase in student achievement at ks2. However, these estimates make it hard to judge the effect of additional spending at the school level, in that they do not adjust for the relationship between spending in the boundary schools and spending in the LA on average.

Table 2b provides the instrumental variables estimates of equations (4a/b), along with the first stage F-statistics for the IV results. All of the F-statistics are acceptable in terms of usual criteria for the strength of the instruments. Columns 1, 2, 5 and 6 use school-level mean income delegated to the school from the LA as the instrument (in our CFR school income and expenditure data this is category IO1, "funds delegated by the LA"). Columns 3, 4, 7 and 8 use the ACA index as an instrument. Columns 1 to 4 use the boundary subsample, but without cluster fixed effects, so do not exploit cross-boundary differences. We report these results in order to demonstrate that we need both differencing across boundaries and instrumental variables as a fully effective strategy.
To see this, note that the IV estimates without fixed effects and without any control variables are negative and significant, and not so different from the OLS estimates. This negative association occurs for similar reasons to the OLS estimates in Table 2a, because the LA funding-based instruments are correlated with the characteristics of schools that also determine school performance, due to needs-based funding rules. Once we include control variables to partially adjust the estimates for these school characteristics, the estimates become positive and significant. However, it remains difficult to judge to what extent simply controlling for school characteristics in this way is an effective strategy.
The IV estimates with school-cluster fixed effects (columns 5-8) are all considerably larger in magnitude than the IV estimates that do not rely solely on cross-boundary differences (columns 1-4). They are also, in most cases, higher than the estimates in Table 2a columns 5 and 6 that used cross-boundary differences, but no instruments. A crucial thing to note, both from Table 2a columns 5 and 6, and from Table 2b columns 5-8, is that the strategy of comparing funding differentials arising from LA-sources in closely spaced matched schools across LA boundaries seems to be effective in eliminating the biases induced by needs-based resourcing, because the estimates are much less sensitive to the inclusion of our set of control variables. The implication is that the LA-based funding instruments are uncorrelated with other factors determining pupil achievement (and we provide more evidence on this in the balancing tests below). In fact, for the LA-income instrumental variables estimates in columns 5 and 6 (where the instrument is the average grant paid from the LA to the schools within its control), the point estimates are almost identical with and without any control variables. The ACA-index based IV estimates are more sensitive, and the conditional estimates in column 8 are around 50% higher than the unconditional estimates in column 7, although this difference is less than 2 standard errors.
The estimates from the 4 to 8-school clusters are again much more precise than in the 2-school clusters, and range from around 0.16 to 0.32. All are statistically significant at the 5 percent level or better, except for the specification in the top row of column 7. Although the IV estimates based on the ACA indices are potentially preferable on theoretical grounds, given they isolate a specific source of variation in funding, the LA-income based IV estimates yield more stable and statistically significant estimates. These LA-income IV estimates are not highly sensitive to the choice of school cluster size, nor to the control variable set. They have higher first stage F-statistics, which is to be expected given the greater variation shown in Figure 2.
Overall, the IV results indicate that an additional £1000 per student paid to schools in these urban LA boundary settings raised student test scores at ks2 by around 0.25 standard deviations.
In these main specifications, identification of the expenditure effects comes from variation in expenditure across boundaries, and over time within the school-cluster. The point estimates are higher still if we control for school-cluster-by-year fixed effects such that we estimate using only the cross-sectional variation in expenditure, although the difference is less than one standard error; see Appendix Table A1.

Tables 3a and 3b report the balancing tests described in section 4.4. These results assess whether students and schools, that are in LAs with high income levels ( For the most part, these balancing tests show that schools along LA boundaries that are exposed to different LA-incomes and ACAs do not have markedly different characteristics. There is no association between these instruments and early school achievements (ks1 at age 7), age, gender, English as first language, ethnicity 20, or residential deprivation in the student-level or school-level regressions. There is no association with school size (student numbers) or the average of students' residential neighbourhood house prices in the school-level regression.

Evaluating the identification strategy: balancing tests
20 There is an association with ethnicity when we do not control for LA Additional Educational Needs (AEN) in the school-by-year level regressions.
The one obvious dimension on which the schools exposed to different LA-incomes and ACAs do not appear to be well balanced is FSM entitlement. In both the student and school-level regressions, the coefficient is small, but significant, in the regressions without the LA AEN control. The reason for this association is most likely that school funding formulae allocating funds to LAs (and potentially to schools within LAs) depend explicitly on the proportions of families on income support, which also determines FSM entitlement, and it is hard to break this link in the empirical analysis. Indeed, controlling for the LA Additional Educational Needs index (the index of families on income support that is used in the funding formula to LAs) in the second row in each panel reduces the size of the coefficient and renders it insignificant in the case of the LA-income instrument, and less significant in the case of the ACA instrument.
The question is whether this failure of balancing in the uncontrolled estimates is of any consequence for the interpretation of Table 2. The positive sign of the coefficient in the FSM specification in Table 3 immediately suggests differential FSM status cannot explain the performance advantages in high-ACA schools, since FSM entitlement is also associated with lower ks2 achievement. More specifically, consider that the coefficient from a simple regression of standardised ks2 scores on FSM entitlement at student level (with no other control variables) is around -0.5. From Table 3, column (4) it can be inferred that a 5-6 percentage point increase in the probability of a student being FSM-entitled is associated with a £1000 increase in total school expenditure per pupil. However, this translates into only a 0.05 × 0.5 = 0.025 standard deviation fall in ks2 scores. This is not a big effect relative to the 0.25 standard deviation increase in ks2 scores attributed to £1000 in total expenditure per pupil in Table 2b and is of little substantive importance for the main findings on the effects of expenditure on ks2 scores. 21

Another reason for potential imperfect balancing across the LA boundary is if LA funding, and consequent school funding differentials, encourage sorting of households of different types across the boundary. In practice, parents in England will find it difficult to observe school expenditure differences without considerable research effort, and will find it even harder to make a judgement about the potential benefits or otherwise of these funding differences, given that resources are targeted to compensate other school disadvantages. However, we present a number of tests of this hypothesis.
21 Note that repeating this exercise for either 2,6 or 8-school cluster sizes, tends to improve the balancing in terms of FSM entitlement, but we report the 'worst case' so that the reader can judge for themselves the scientific credibility of the results.
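The comparison of the FSM imbalance with the main expenditure effect can be reproduced with a few lines of arithmetic. The inputs are the figures quoted in the text (a 5-6 percentage point FSM difference per £1000, an FSM coefficient of about -0.5, and a main effect of about 0.25 standard deviations); the function and variable names are hypothetical.

```python
def implied_score_fall(fsm_gap, fsm_penalty=0.5):
    """Fall in ks2 scores (in s.d.) implied by a gap in FSM probability.

    fsm_gap: difference in the probability of FSM entitlement (0.05 = 5 points)
    fsm_penalty: absolute value of the ks2-on-FSM coefficient quoted in the text
    """
    return fsm_gap * fsm_penalty

MAIN_EFFECT = 0.25  # s.d. gain per £1000, as quoted from Table 2b

# Lower and upper ends of the 5-6 percentage point range quoted in the text.
fall_low = implied_score_fall(0.05)    # 0.025 s.d.
fall_high = implied_score_fall(0.06)   # 0.030 s.d.

# Size of the imbalance relative to the main effect.
share_low = fall_low / MAIN_EFFECT     # about one tenth
share_high = fall_high / MAIN_EFFECT   # about 12 percent
```

On these figures the FSM imbalance is roughly a tenth of the size of the estimated expenditure effect, and it works against the finding of a positive effect rather than in its favour.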
One way in which households may sort across boundaries is through residential choice. The existence of house price differentials across school catchment area boundaries has been demonstrated by other studies and used as a source of identification for the effects of school quality on house prices (Black 1999, Bayer et al 2007, Gibbons, Machin and Silva 2009), and such differentials potentially lead to this kind of sorting. Sorting of wealthier families into the neighbourhoods and schools on the side of the boundary with higher ACA-based funding could lead to amplification of the direct impacts of these resources. Given the scale of the effects in our results in Table 3a and 3b, we doubt that house-price related sorting is a major factor. To see this, consider that a one standard deviation increase in the ACA index is related to a £111 increase in per-student funding per year, which implies a 111/1000 × 0.25 = 0.028 standard deviation increase in student performance (where the s.d. is in the student distribution). Given the standard deviation in performance across schools is around 30% of the standard deviation in the student distribution, this £111 funding differential corresponds to a 0.08 standard deviation differential in school performance. A typical estimate from the schools and house prices literature puts the house price response to a 1 standard deviation increase in performance at around 3% (e.g. see Gibbons and Machin 2008, Black and Machin 2011). Therefore, a 1 standard deviation increase in the ACA index would raise house prices by only 0.08 × 3 = 0.24%, a price differential of about £480 on a £200000 property typical at this time. This magnitude of price differential seems unlikely to lead to substantial educationally-relevant residential sorting.
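The house price calculation above can be traced step by step in a short sketch. All inputs are the figures quoted in the text; in particular, the 0.08 school-level differential is taken as stated rather than recomputed. The function and variable names are hypothetical.

```python
def student_level_effect(funding_gbp, effect_per_1000=0.25):
    """Student-level s.d. gain implied by a funding difference in pounds."""
    return funding_gbp / 1000.0 * effect_per_1000

def house_price_differential(school_sd_diff, response_pct_per_sd, property_value):
    """Return (percentage price change, price change in pounds)."""
    pct = school_sd_diff * response_pct_per_sd
    return pct, pct / 100.0 * property_value

# £111 per student for a 1 s.d. increase in the ACA index (from the text).
student_effect = student_level_effect(111.0)   # 0.02775, reported as 0.028

# School-level differential of 0.08 s.d. and a 3% price response per s.d.,
# both as stated in the text, applied to a £200000 property.
price_pct, price_gbp = house_price_differential(0.08, 3.0, 200_000)
# price_pct is about 0.24 (percent); price_gbp is about 480 (pounds)
```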
It should also be noted that the balancing tests indicate no statistically significant association between the instruments and housing prices, nor any association with achievements at age 7, which we would expect also to be affected if the results were driven by residential sorting that affected educational achievement.
Evidence of sorting across LA boundaries may also appear in the flows of students from residential locations on one side of the boundary to schools on the other, rather than residential sorting, and we test for these flows in column (10) of Tables 3a and 3b. The first result in column 10 of Table 3a suggests there is some correlation between the proportion of students attending a school from outside the school's own LA, and expenditure in the school. A standard deviation increase in income from the LA is associated with a 3.2 percentage point decrease in the proportion of students attending from an adjacent LA. For comparison, the mean flows across the boundary in this boundary sample are 13.5%, so this is quite a large impact. The direction of the flow is, however, opposite from that which might be expected, with higher expenditure repelling students, a finding that is hard to square with the idea that higher expenditures attract higher ability or more motivated students leading to better performance. This result may arise because higher expenditures signal high-FSM schools that parents try to avoid, and because (as shown in column (4)) there is a residual positive correlation between expenditures and FSM intakes. This interpretation is borne out by the fact that when we control for the LA level additional educational needs index from the funding formula in the lower panel of Table 3a, or switch to using the ACA index instrument in Table 3b, the correlation between cross-LA inflows and school expenditure is eliminated. Ultimately, there is no strong evidence for student sorting across the boundaries into high-expenditure schools.
Residential sorting across LA boundaries could be especially sharp if the boundary coincides with geographical features, as discussed in section 4.4, so we re-estimate our main regressions using boundaries that do not coincide with major roads and railways. These results are shown in Table 4 (for the LA income instrument) and are not substantively any different from those in Table 2b, indicating that the coincidence of physical features and LA boundaries is of little relevance.
As a further test for a correlation between LA-level funding differences and unobserved school characteristics, we look at whether the funding a school receives from sources other than the LA grant is correlated with the funding it receives from its main LA grant. Schools in our sample receive around 79% of their resources from the main LA grant, around 8% from charitable and voluntary contributions, and the rest from various other grants from the LA and/or central government (e.g. grants for ethnic minority achievement and special educational needs). In particular, we are concerned that low funding from the LA might induce schools to raise more funds from alternative sources. While not necessarily compromising our IV strategy, there might be concerns that there is some general behavioural response by school leadership and staff to these funding challenges that has direct effects on achievement as well as increasing school resources. To test for these possibilities, Table 5 reports the coefficients and standard errors from a school-by-year level regression of alternative income sources on the LA average income per pupil paid to schools (i.e. the LA-income instrument used in the main analysis). As in the balancing tests above, we use the boundary sub-sample of schools, with 4-school cluster fixed effects. The results in Table 5 indicate no large or significant association between LA income and alternative funding streams in total, nor between LA income and voluntary/charitable contributions specifically, again supporting the identifying assumption that the cross-boundary funding differentials are uncorrelated with cross-boundary differences in school characteristics.

Table 6 reports on heterogeneity by school characteristics. The split by school characteristics is based on whether or not a school has above or below-mean proportions of various student demographic groups.
In these results, we use the LA-income instrument, because the ACA-index becomes too weak an instrument to give informative results for some of these subgroups, although the point estimates are similar (see Appendix A2). The overall story in Table 6 is that the effects of expenditure are considerably higher and more significant in schools with more "disadvantaged" students. Expenditure appears not to have had an impact in schools with higher proportions of whites than average, schools where pupils come from less disadvantaged neighbourhoods, nor schools where achievement at ks1 is above average. Evidently, expenditure has higher returns in schools where there are greater gains to be made at school level. Interestingly, these effects seem to be based on the type of school, not the type of student. Appendix A2 presents the breakdown by student type, rather than the school characteristics, and there appears to be relatively little difference. In other words, all types of students in the most disadvantaged schools appear to benefit from additional funding, not just the disadvantaged students, although it is hard to know what to conclude from this finding given we have no information on how additional resources were split within schools between different student types.

Heterogeneity by school characteristics and subject
To answer questions about the linearity of the relationship between school resource differences and achievement, we further split the sample into schools with below-median across boundary funding differences (between zero and £110 per pupil per year) and those with above median boundary funding differences (between £110 and £1060). These results (not tabulated) show, perhaps unsurprisingly, that all the effects estimated in Table 2b originate in the upper part of the funding differential distribution, and marginal changes in funding per pupil below £110 per pupil per year have little influence on student achievement.
Lastly, Table 7 splits the ks2 score into subject areas: maths, science and English. It turns out that the effects are fairly general across subjects, although the strongest effects on achievement arise through scores in maths and science, with English showing a more moderate, but still significant response. It is not clear why expenditure effects should vary across subject areas. However, Machin et al. (2010) find a similar result for a resource-based programme targeted at secondary schools in disadvantaged urban areas.

How was the additional money spent?
Although we can say nothing about the causal effects of different spending categories on achievement, we show how the expenditure patterns relate to additional income in Table 8, using the method set out in Section 4.6. Our CFR schools expenditure data has a fairly detailed breakdown of the expenditure shares in various categories. For presentational simplicity, we aggregate some of these categories into 9 groups, teaching expenditure (including temporary agency and "supply" staff), support staff (largely teaching assistants and specialist staff to assist with children with special needs), other staff (including administrative, catering and premises staff), personal development and training, premises (building and grounds maintenance, energy, cleaning, water and sewage, rates), learning and ICT resources, "bought in professional services" (which includes various types of consultancy, self employed music teachers, legal advice etc.), supplies (including catering and administrative supplies) and other costs (which include insurance costs, financial items such as loan interest and transfers to the capital budget).
The bottom row of Table 8 shows the mean expenditure shares in these categories (in the boundary subsample) over 2004-2009. More than half the budget goes on teaching staff, and just under 80% on direct staff costs in total. Non-staff items are each a relatively small share of the total. The coefficients in row 1 of the table are the effect of an additional £1000 in total school expenditure per student on the share of expenditure in each category. All these effects are quite small, with an additional £1000 per student reducing the share spent on teachers by 3.7 percentage points (from 56.4% to 52.7%). This is compensated for by an increase in the combined share spent on learning and ICT resources, professional services and supplies (from 11.5% to 15.2%). These results indicate that additional income tends to be spent disproportionately on items other than teaching costs, although the changes in the shares are small, so the overall impression is that additional income is spread across all categories. The indivisibility of teaching expenditures may also contribute to these empirical results: small expenditure differentials cannot easily be used to employ additional teachers (and are difficult to use to attract better teachers, given the lack of flexibility in teachers' pay) and so would have to be spent in other ways.
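The share arithmetic quoted above can be reproduced with a short sketch. It assumes the coefficients scale linearly with the amount of additional expenditure, which is an illustrative simplification rather than a claim made in the text; the function name `shifted_share` is hypothetical, and the +3.7 coefficient for the combined non-teaching group is inferred from the 11.5% to 15.2% change reported in the paragraph.

```python
def shifted_share(baseline_share, coef_per_1000, extra_per_pupil):
    """Predicted expenditure share (in percent) after extra spending per pupil (GBP).

    coef_per_1000 is the change in the share, in percentage points,
    per additional 1000 GBP of total expenditure per pupil.
    Linear scaling in extra_per_pupil is an assumption of this sketch.
    """
    return baseline_share + coef_per_1000 * (extra_per_pupil / 1000.0)


# Figures quoted in the text for an additional 1000 GBP per pupil.
teaching = shifted_share(56.4, -3.7, 1000)  # teaching staff share
resources = shifted_share(11.5, 3.7, 1000)  # learning/ICT, professional services, supplies

print(round(teaching, 1), round(resources, 1))
```

The two shifts offset each other exactly, which is what the text means by the fall in the teaching share being "compensated for" by the other three categories.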

Discussion and conclusions
Our findings indicate quite a strong role for general funding increases in raising achievement in urban state schools. Perhaps this should not be surprising. However, convincing evidence of an impact from putting more money into state schools has remained elusive, so our analysis is a useful addition to the international academic literature on the economics of schooling. Although we can say little about the channels through which money raises achievement, and cannot provide any guide to how the money should be spent when it reaches schools, the results are crucially important for higher-level policy making. However, FSM students are only 25% of the intake in the urban schools in our study, or 17% nationally.
Therefore, since the Pupil Premium is simply additional funding for schools, and is not necessarily used for resources targeted specifically at FSM children, it amounts to additional income of at best about £100 per student initially, rising to perhaps £400 by 2014-15 (again at 2009 prices). According to our estimates, an additional £400 per student per year could be expected to raise ks2 achievement, on average, by about 10% of a standard deviation (based on the status quo in terms of all other institutional arrangements). A few more back-of-the-envelope calculations (based on estimates in Table 4) indicate that, if used specifically for FSM students so that FSM students received an additional £2000 in resources, the Pupil Premium at its proposed

Table 5: Association between LA income sources and non-LA income sources. Each coefficient is from a separate regression using 4-school clusters.
Notes: as Table 2. Low age-7 score is below Level 2b in Reading, Writing and Maths; High score is Level 2a or above in Reading, Writing and Maths.

Table 7: Effects of school spending on student ks2 test scores at age 11. Each coefficient is from a separate regression using 4-school clusters.
Notes: refer to Table 2.

Notes: as Table 2. Low age-7 score is below Level 2b in Reading, Writing and Maths; High score is Level 2a or above in Reading, Writing and Maths.