Everybody needs good neighbours? Evidence from students' outcomes in England

There are large disparities between the achievements, behaviour and aspirations of children in different neighbourhoods - but does this mean that the place where you grow up determines your later life outcomes? Steve Gibbons, Olmo Silva and Felix Weinhardt outline the findings of a series of CEP studies of 'neighbourhood effects'.

There are substantial disparities between the achievements and behaviour of children living in different neighbourhoods (Lupton et al., 2009). These disparities have long been a centre of attention for researchers and policy makers concerned with socioeconomic inequality and its consequences. The underlying foundation for these concerns is the belief that children's outcomes are causally linked to the characteristics and behaviour of people who live around them. Area-based policies that are intended to address such inequalities are predicated on the existence of such causal links; see discussions in Currie (2006) for the US and Cheshire et al. (2008) for the UK. Interventions of this type include inclusionary zoning and desegregation policies, as well as regeneration and mixed-housing projects, such as 'Hope VI' in the US and the 'Mixed Communities Initiative' in England. The theories of 'social interactions', 'neighbourhood effects' and 'peer effects' (Jencks and Mayer, 1990; Manski, 2000; Durlauf, 1996) that underpin these policies have been drawn into economics from sociology and psychology, and economists have put substantial emphasis on role models (Akerlof, 1997; Glaeser and Scheinkman, 2001), social networks (Granovetter, 1995; Bayer et al., 2008) and conformism (Bernheim, 1994; Fehr and Falk, 2002).
Although these theories are compelling, convincing empirical evidence remains elusive and opinion on their policy relevance is divided.¹ The ambiguity in the evidence comes about for four reasons: (1) sorting: causality is difficult to establish because children's characteristics are linked to those of their parents, and in turn to those of their neighbours through common factors in residential choice; (2) correlated effects: while theories of 'social interactions' and 'neighbourhood effects' are about the effect of neighbours, the distinction between neighbours and neighbourhood factors such as good schools ('correlated effects'; Manski, 1993) is often blurred in empirical investigations; (3) defining neighbourhood groups: the correct geographical scale that should be used to delimit 'a neighbourhood' is a priori unknown; (4) heterogeneity: issues of equity and efficiency in neighbourhood-related policies hinge on questions about heterogeneity and non-linearity in neighbourhood effects, but the existing literature does not fully investigate these issues.
Bearing these four issues in mind, the main contributions of this study relative to previous work in the field are: (1) to use a research design which includes individual and neighbourhood-fixed effects and in which we directly observe the impact of residential movers on children who do not move, so as to identify the causal effects of neighbourhood composition rather than spurious sorting. This design is applied to a large administrative census of multiple cohorts of school children, spanning several years of childhood; (2) to use this design to estimate the effect of changes in the characteristics of neighbourhood peers, net of amenities and other 'correlated' effects (Manski, 1993); (3) to exploit the geographical detail of our data to provide alternative neighbourhood definitions and assess the correct spatial reference scale in the most flexible way. The level of detail in our data even allows us to distinguish between neighbours who attend the same or a different school, shedding further light on the actual extent of neighbourhoods; and (4) to use the size of our census data set to explore fully potential non-linearities and heterogeneities in neighbourhood effects, which is of paramount importance for residential mixing policies.
In essence, our empirical set-up involves regressing changes in test scores of students between ages 11 and 14 on changes in neighbourhood quality between ages 11 and 14. The measures of neighbourhood quality that we use are based on characteristics of the students in our English administrative data that are predetermined (at age 11), and we restrict our estimation sample to individuals who do not change neighbourhoods. This ensures that our identifying variation arises from changes in neighbourhood composition for residential stayers induced by movements of other residents in and out of the neighbourhood, and implies that we can control for individual and neighbourhood unobservables without requiring that treated individuals change residential neighbourhoods. The population movements that we exploit are sizeable, with over 425,000 students or around 25% of the neighbourhood group changing during the 3-year period over which we measure the development of academic achievements in our main specifications. This set-up, while unique in the neighbourhood literature, is related to Angrist and Lang (2004), who estimate peer effects from changes in peer composition due to students' mobility induced by desegregation programmes; to Gibbons and Telhaj (2011) and Hanushek et al. (2004), who study the effect of students' between-school mobility on students who do not change school; to Gould et al. (2011a), who investigate the effects of large inflows of immigrants into Israeli elementary schools on the long-term outcomes of native students; and to Moretti (2004), who studies social returns to education in cities by looking at compositional changes experienced by non-movers. However, our method differs from research on school peer effects that exploits cohort-to-cohort variation in group composition to control for time-fixed school unobservables (Hoxby, 2000; Hanushek et al., 2003; Gibbons and Telhaj, 2008; Lavy et al., 2012a).
These studies either do not control for individual-fixed effects and compare different students in different cohorts to control for school-fixed effects, or else require student mobility between schools (or grade repetition) to generate within-student variation in peer group and control for individual-fixed effects.
We address concerns about sample selection in our group of stayers by carrying out an intention-to-treat analysis that includes movers in the estimation sample but assigns them to the neighbourhoods in which they originate (thus fixing their neighbourhood assignment and avoiding problems induced by endogenous neighbourhood choices). Furthermore, we can account for unobservables that induce changes over time in movers' characteristics and stayers' outcomes within neighbourhoods by tracking several cohorts of students as they progress from primary through secondary education. This allows us to control for unobserved trends in neighbourhood quality (e.g. gentrification or deterioration in housing quality) and include school-by-grade-by-cohort effects to control for the effect of changes in school quality and composition as students move between one grade and the next. This is feasible, and necessary in our context, because students change school between grades and because there is not a one-to-one mapping between residential neighbourhood and school attended. This implies that different students in the same residential neighbourhood attend two to three different secondary schools and that secondary schools enrol students from around sixty different residential areas.
To preview our results, we find little evidence of a causal link between young people's test scores and neighbours' characteristics once we control for individual and neighbourhood-fixed effects by looking at changes in the neighbourhood peer composition over time. Our estimated regression coefficients are near-zero and precisely estimated. Furthermore, any remaining association is eliminated once we control for school-by-cohort effects and/or neighbourhood-specific time trends. Differentiating between the effects of neighbours in the same school and neighbours in different schools still yields no evidence that neighbourhood composition matters. Going beyond the simple linear-in-means specification of neighbour-peer effects on test scores, we uncover no evidence of important non-linearities, complementarities or threshold effects. In contrast, we find evidence that neighbourhood composition exerts a small effect on students' non-cognitive behavioural outcomes, such as attitudes towards schooling and anti-social behaviour, even using those stringent specifications which yielded zero effects of neighbourhood composition on cognitive outcomes. Interestingly, we find that the effect of neighbour-peers on non-cognitive outcomes is heterogeneous along the gender dimension. This is in line with a growing body of evidence showing that girls are more affected than boys by education inputs and interventions (Anderson, 2008; Angrist and Lavy, 2009; Lavy and Schlosser, 2011).
The rest of the article is structured as follows. The next Section fleshes out in more detail the four empirical challenges in neighbourhood effects research and the ways in which our design mitigates them. Section 2 describes our empirical strategy formally, and Section 3 discusses data that we use and the English institutional context. Next, Sections 4 and 5 discuss our findings on cognitive outcomes and robustness checks, while Sections 6 and 7 present our evidence on heterogeneity and complementarities, and on behavioural outcomes respectively. Finally, Section 8 provides some concluding remarks.

Empirical Issues in Neighbourhood Effects Estimation
The standard approach to estimating neighbourhood effects is based on the statistical association between children's outcomes and the socio-economic composition of their neighbourhood ('contextual effects'; Manski, 1993). As outlined in the Introduction, there are four main reasons to doubt the interpretation of these estimates as causal parameters, namely: sorting; 'correlated effects'; appropriate reference group; and heterogeneity. In this Section, we flesh out these problems and present our line of attack.
The first empirical challenge is posed by the fact that children's characteristics are linked to those of their parents and, in turn, to those of their neighbours through residential sorting. This implies that the causal influence of neighbours' characteristics is confounded by the simultaneous effects of children's and parents' own attributes. Studies have used a variety of approaches to address these biases, including instrumental variables (Cutler and Glaeser, 1997; Goux and Maurin, 2007); institutional arguments related to social renters with limited residential choice and mobility (Gibbons, 2002; Oreopoulos, 2003; Jacob, 2004; Goux and Maurin, 2007; Weinhardt, 2013); quasi-experimental placement policies for immigrants (Edin et al., 2003, 2011; Gould et al., 2011b); and fixed effects to partial out individual, family and aggregate unobservables (Aaronson, 1998; Bayer et al., 2008). Finally, there have been a number of experimental studies looking at randomised interventions, namely the Gautreaux and Moving to Opportunity (MTO) programmes (Rosenbaum, 1995; Kling et al., 2005, 2007; Sanbonmatsu et al., 2006).
Even if problems of sorting are solved, studies still need to disentangle correlation caused by neighbours' characteristics from common coincidental neighbourhood amenities ('correlated effects'; Manski, 1993). Indeed, neighbourhoods that differ in terms of their socio-economic composition probably differ along other dimensions, such as school quality and other local amenities, which are often unobserved in the data. This distinction between the effects of better neighbours and those of better neighbourhoods is often blurred in empirical work, and the importance of neighbourhood composition as opposed to local resources and amenities takes a back seat. Randomisation of children to neighbourhoods does not solve this problem because the neighbourhoods to which individuals are assigned potentially differ along many other dimensions. In this respect, most of the MTO-based studies (Kling et al., 2005, 2007; Sanbonmatsu et al., 2006) treat neighbourhoods as a 'black box' in terms of the specific causal channels, although recent work has started to unpick the contributory factors (Harding et al., 2010).² To isolate the causal influence of neighbours from the effects of neighbourhoods, Moffitt (2001) suggested 'reverse-engineering' the evaluation of programmes like the MTO or Gautreaux to study changes in the outcomes of the original residents of the areas receiving relocated households. For these people, neighbourhoods remain unchanged except in so far as their composition is affected by the influx of new families.
Following this intuition, our study mitigates problems of sorting and confounding neighbourhood attributes by exploiting changes in neighbourhood composition induced by migration of residential movers in a population of school-age families. We estimate the effect of these mover-induced compositional changes on cognitive and non-cognitive outcomes of stayers in England from age 11 (grade 6) up to age 16 (grade 11). This approach allows us to control for time-persistent neighbourhood amenities, such as local school quality and other localised infrastructures and amenities (the 'correlated effects' of Manski, 1993), and to identify separately the effects arising from changes in neighbourhood composition, which we label 'neighbourhood peer effects'.³ Although we cannot pin down the theoretical channels through which neighbours might matter (for example, conformism, social networks and role models), this limitation is common to the literature on peer effects in schools. Nevertheless, we claim that these reduced-form estimates are policy relevant as they shed light on the likely effect of desegregation policies and mixed-communities initiatives, which advocate changes to neighbourhood composition as a way to improve youths' outcomes.
The third challenge lies with defining the operational reference group for a child's neighbour-peer influences. Like most previous research, we have no information on friendship networks, which are in any case prone to problems of self-selection. However, we are not specifically interested in interactions within friendship groups. Instead, we want to investigate the influence of neighbourhood peer composition more broadly, including any effect which might arise from outside a child's friendship group. Of necessity, we must approximate the level at which these influences take place. However, unlike other research which is limited to large pre-defined groups (e.g. census tracts), we have precise geographical detail on residential location coupled with information on children's school attendance and age. This richness in our data allows us to define neighbourhoods at a very small scale (on average five students of the same age) but also to experiment with larger groupings of contiguous areas (similar to Bolster et al., 2007). We can further modify these groups to focus on students of different ages, capturing interactions within the same birth cohort and across adjacent birth cohorts, and split the reference groups into neighbours who attend the same school and neighbours who attend different schools, allowing us to separate peer effects in neighbourhoods from peer effects and other shared influences in schools.
² Most other studies do not control for the quality of local schools and other neighbourhood features in their analysis, or try to distinguish between school and neighbourhood-level variables (Goux and Maurin, 2007), although there are exceptions (Gould et al., 2004; Card and Rothstein, 2007).
³ Note that we are not trying to estimate Manski's (1993) 'endogenous' neighbourhood effects, that is, the effect of neighbours' behaviour. We therefore sidestep reflection problems that arise when the effects of neighbours' behaviour are not separately identified from the effects of neighbours' characteristics that give rise to those behaviours.
Fourth and finally, the existing literature does little to investigate heterogeneity and non-linearities in the effect of changes to neighbourhood composition, despite this being crucial to understanding the consequences of social mixing.⁴ Even if policies that promote integrated neighbourhoods succeed in reducing inequality, they will be inefficient if the losses to those who lose out from mixed neighbourhoods outweigh the gains to those who benefit.⁵ The literature on peer effects at school investigates these issues extensively (Hoxby and Weingarth, 2005; Gibbons and Telhaj, 2008; Lavy et al., 2012a, b) but there is much less evidence on heterogeneity in relation to residential neighbourhood effects. Although these concerns were prominent in the neighbourhoods literature long ago, both in theory (Jencks and Mayer, 1990) and empirically (Corcoran et al., 1989; Crane, 1991), recent empirical work has paid less attention, as the search for credible identification of causal effects has led to a focus on linear effects for homogeneous and narrowly defined groups. These groups include Blacks living in ghettos (Cutler and Glaeser, 1997); individuals in socially rented accommodation (Gibbons, 2002; Oreopoulos, 2003; Jacob, 2004; Goux and Maurin, 2007; Weinhardt, 2013); immigrants (Edin et al., 2003, 2011; Gould et al., 2011b); and families living in deprived neighbourhoods and relocated to better areas (the 'Gautreaux' and 'Moving to Opportunity' programmes cited above). This narrow focus aids identification but precludes investigation of heterogeneity and complementarities for two reasons.
First, sample sizes are often small, limiting the scope for further slicing the data into subgroups. Second, by focusing on the most disadvantaged individuals, these studies cannot investigate whether the effects of neighbourhood composition are homogenous or heterogeneous along the lines of students' background and ability. To examine these issues in detail poses huge data requirements for empirical research. Our data set provides us with a unique opportunity to investigate heterogeneity and non-linearities in these responses at a very detailed level. The next Section sets out our approach in greater detail.

General Identification Strategy: A Changes-in-changes Specification
Our empirical work estimates the effect of neighbourhood composition on students' educational and behavioural outcomes during secondary schooling. As outlined above, any attempt to estimate the causal influence of neighbourhood peers must eliminate biases that arise because of sorting. To address this issue, we use a changes-in-changes research design. The rest of this Section sets out our empirical model formally.
Assume that students' outcomes depend linearly on the characteristics of peers in the neighbourhood, other neighbourhood infrastructures and individual characteristics to give a reduced-form relationship:
$$y_{insct} = b\,z_{nct} + x_i' d\,t + \varepsilon_{insct}, \tag{1}$$
where $y_{insct}$ denotes the outcome of student $i$ living in neighbourhood $n$, attending school $s$, belonging to birth cohort $c$ and measured at grade or age $t$. Note that school grade is equivalent to age, as there is no grade repetition in England. In the empirical analysis, we look at academic outcomes, including test outcomes from grade 6 to grade 11 (ages 11-16) and some behavioural outcomes (e.g. attitudes to school, drug use) in grades 9 and 11, as discussed in Section 3. We observe students' test scores at grades 6, 9 and 11 (ages 11, 14 and 16), and attended school and place of residence for these grades as well as all those in between. In this specification, $z_{nct}$ is a variable measuring neighbour-peer composition, for example, mean prior achievements of peers in the neighbourhood or the proportion from low-income families, and $b$ is its coefficient. The definition of these neighbour-peers is set out in subsections 2.3 and 3.3 below. The vector $x_i$ contains time-fixed predetermined observable student characteristics, which we allow to have a time-trending effect captured by $d\,t$. Furthermore, we assume that the error term $\varepsilon_{insct}$ has the following components:
$$\varepsilon_{insct} = a_i + \varphi_n + \xi_n t + \vartheta_{sct} + e_{insct}, \tag{2}$$
where $a_i$ represents an unobserved individual-level fixed effect that captures all constant personal and family background characteristics; $\varphi_n$ represents unobserved time-fixed neighbourhood characteristics, such as access to a good public library and other infrastructures; and $\xi_n t$ represents neighbourhood unobserved trending factors such as gentrification dynamics. The term $\vartheta_{sct}$ is a school-by-cohort-by-grade shock. Among other things, this term is intended to capture variation in school resources, composition and quality of teaching that is common to students attending the same school $s$ in a given grade, e.g. grade 6 (age 11), and belonging to the same cohort $c$. Finally, the term $e_{insct}$ is assumed to be uncorrelated with all the right-hand side variables.
Endogeneity issues arise because the components $a_i$, $\varphi_n$, $\xi_n t$ and $\vartheta_{sct}$ in (2) are potentially correlated with $z_{nct}$ and $x_i$ in (1). To eliminate some of the unobserved components that could jointly determine neighbour-peer composition and students' outcomes, we exploit the fact that we observe students as they progress from primary through secondary education, and know their outcomes and the composition of the neighbourhood where they live at different school grades (ages). We can therefore take within-student differences between two grades and estimate the following equation:
$$(y_{insc1} - y_{insc0}) = b\,(z_{nc1} - z_{nc0}) + x_i' d + (\varepsilon_{insc1} - \varepsilon_{insc0}), \tag{3}$$
where the subscripts $t = 0$ and $t = 1$ indicate the initial and subsequent grade (e.g. grades 6 and 9), and the exact grade interval varies according to the outcome under consideration. Note that we restrict our estimation sample to students who do not move neighbourhood. This implies that neighbour-peer changes $(z_{nc1} - z_{nc0})$ depend on inflows and outflows of movers who are not in the estimation sample. The within-individual, between-grade differencing for stayers reduces the error term to:
$$(\varepsilon_{insc1} - \varepsilon_{insc0}) = \xi_n + (\vartheta_{sc1} - \vartheta_{sc0}) + m_{insct}, \tag{4}$$
where $m_{insct}$ is assumed random, and differencing eliminates both the individual ($a_i$) and the neighbourhood ($\varphi_n$) unobserved components that are fixed over time, including unobserved ability, family background and other forces driving sorting of families across different neighbourhoods. To allay concerns about the stayers being a selected sample, in one of our robustness checks, we include movers and stayers, and assign to movers the changes in the neighbour-peer quality they would have experienced had they not moved, providing 'intention-to-treat' estimates.
Note also that it is straightforward to generalise (3) to allow for heterogeneity and non-linearities in the effects of neighbour-peer composition, for example, by interacting students' characteristics $x_i$ with neighbourhood composition changes $(z_{nc1} - z_{nc0})$. Equation (4) shows that this grade-differenced specification does not control for changes in school quality between grades for a given student. The between-grade school quality change term $\vartheta_{sc1} - \vartheta_{sc0}$ in (4) is likely to be non-zero because students change schools over the grade intervals that we study (some of these changes are compulsory during the primary-to-secondary transition), or because of new school leadership, changes in the teaching body or variation in school resources. This possibility poses a threat to our identification strategy because school quality changes for students in neighbourhood $n$ might influence the inflow and outflow of students, as well as the characteristics of in/out-migrants into neighbourhood $n$, which would in turn affect changes in neighbourhood peer composition, $z_{nc1} - z_{nc0}$. We therefore further control for secondary-school-by-cohort effects or secondary-by-primary-school-by-cohort effects (effectively school-by-grade-by-cohort effects), thereby absorbing these sources of variation. We can in addition control for general unobserved neighbourhood-specific time trends $\xi_n$, such as gentrification or decline of some areas relative to others, by differencing from neighbourhood means across cohorts $c$.
Our identifying assumption in these models is that the remaining shocks to student outcomes (after eliminating student-fixed effects, neighbourhood-fixed effects, school-by-cohort effects and/or neighbourhood trends) are idiosyncratic and uncorrelated with the changes in neighbourhood composition experienced by student i as he/she stays in the residential neighbourhood between grades t = 0 and t = 1. Our results include a set of balancing regressions that support the empirical validity of this assumption. These show that changes in neighbour-peer composition are not related to time-fixed neighbourhood characteristics or time-fixed average characteristics of the students living in the neighbourhood, even before we allow for neighbourhood unobserved trends or school-by-cohort effects.
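The differencing logic described in this Section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the estimation code used in the study: the record keys (`y0`, `y1`, `z0`, `z1`, `cell`, `moved`) and the toy numbers are hypothetical, the student characteristics term is omitted, and school-by-cohort effects are absorbed by simple demeaning within each cell.

```python
from collections import defaultdict


def demean_by_group(values, groups):
    """Subtract the group mean from each value (the 'within' transformation)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        sums[g] += v
        counts[g] += 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]


def changes_in_changes_slope(records):
    """OLS slope of the change in outcome on the change in neighbour-peer
    composition for stayers, after demeaning within school-by-cohort cells.

    Each record is a dict with hypothetical keys:
      y0, y1 : outcome at the initial and the later grade
      z0, z1 : neighbour-peer composition at the two grades
      cell   : school-by-cohort identifier
      moved  : True if the student changed neighbourhood
    """
    stayers = [r for r in records if not r["moved"]]
    dy = [r["y1"] - r["y0"] for r in stayers]   # differencing removes fixed effects
    dz = [r["z1"] - r["z0"] for r in stayers]
    cells = [r["cell"] for r in stayers]
    dy_w = demean_by_group(dy, cells)           # absorbs school-by-cohort shocks
    dz_w = demean_by_group(dz, cells)
    sxx = sum(x * x for x in dz_w)
    sxy = sum(x * y for x, y in zip(dz_w, dy_w))
    return sxy / sxx


def make_record(dz, dy, cell, moved=False):
    """Build a toy record from chosen changes (illustrative values only)."""
    return {"y0": 50.0, "y1": 50.0 + dy, "z0": 10.0, "z1": 10.0 + dz,
            "cell": cell, "moved": moved}


# Toy data: outcome change = 0.5 * composition change + a cell-specific shift.
toy = [
    make_record(1.0, 5.5, "A"), make_record(2.0, 6.0, "A"), make_record(3.0, 6.5, "A"),
    make_record(-1.0, -3.5, "B"), make_record(0.0, -3.0, "B"), make_record(2.0, -2.0, "B"),
    make_record(4.0, 20.0, "A", moved=True),    # a mover, dropped from the sample
]
slope = changes_in_changes_slope(toy)
print(round(slope, 6))
```

In the toy data the cell-specific shifts would bias a pooled regression of changes on changes, while the within-cell slope recovers the coefficient used to build the data.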

Distinguishing Neighbourhood from School Peer Effects
In England, there is not a one-to-one link between neighbourhood and school attended; instead, students in a given neighbourhood attend a mixed group of local schools, their choices being influenced by travel costs and school admissions policies that tend to prioritise local residents (see subsection 3.1). On average, students in the same age group and living in the same small neighbourhood (hosting five such students) attend two to three different secondary schools. Therefore, we can separately identify the effect of changes in neighbourhood peer composition for neighbours who attend the same secondary school and for those who do not. More formally, we can estimate the following model that partitions neighbourhood peers into two groups, those who go to the same secondary school (same) as student $i$ and those who attend other secondary schools (other):
$$(y_{insc1} - y_{insc0}) = b\,(z_{nc1} - z_{nc0})^{same} + c\,(z_{nc1} - z_{nc0})^{other} + x_i' d + (\varepsilon_{insc1} - \varepsilon_{insc0}). \tag{5}$$
Most variables in (5) were defined above; the final term is the between-grade change in the error term. The variable $(z_{nc1} - z_{nc0})^{same}$ refers to changes in neighbour-peer composition driven by the mobility of peers who attend the same school as $i$ at grade $t = 1$ (e.g. at grade 9 at secondary school). These students are therefore peers both in the neighbourhood and at secondary school. Note that schools are attended by students from a large number of residential areas: in our sample, on average, secondary schools attract students from 60 different neighbourhoods. This implies that same-neighbourhood-same-school peers are only a small fraction of the peers that students interact with at school. On the other hand, the variable $(z_{nc1} - z_{nc0})^{other}$ captures changes in the neighbour-peer composition that are driven by neighbourhood peers who do not attend the same school as $i$. Any difference between the coefficients $b$ and $c$ in (5) sheds light on the relative contribution of school and neighbourhood peers.
Whereas peer effects ($b$) among neighbouring students who attend the same school might pick up interactions among students in schools, $c$ represents a 'pure' neighbour-peer effect among students who go to different schools. As before, we can difference (5) within neighbourhoods, across cohorts to eliminate neighbourhood trends, and can control for school-by-cohort effects.⁶
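As an illustration of the same-school versus other-school partition, the sketch below splits a student's neighbours by school and then fits a two-regressor least-squares line. It is a hedged toy example: the function names, the `(school, score)` tuples and the numbers are assumptions made for illustration, and the fixed effects and controls discussed in the text are left out.

```python
def group_mean_change(peers_t0, peers_t1, own_school, same):
    """Change in the mean predetermined score of neighbour-peers who attend
    (same=True) or do not attend (same=False) the student's own school.

    peers_t0, peers_t1 : lists of (school_id, score) for neighbours present
                         at the initial and the later grade (hypothetical layout)
    """
    def group_mean(peers):
        vals = [score for school, score in peers if (school == own_school) == same]
        if not vals:
            raise ValueError("no neighbour-peers in this group")
        return sum(vals) / len(vals)

    return group_mean(peers_t1) - group_mean(peers_t0)


def ols_two_regressors(y, x1, x2):
    """OLS of y on x1 and x2 with an intercept, via the 2x2 normal equations."""
    n = len(y)
    my, m1, m2 = sum(y) / n, sum(x1) / n, sum(x2) / n
    yc = [v - my for v in y]
    a = [v - m1 for v in x1]
    b = [v - m2 for v in x2]
    s11 = sum(u * u for u in a)
    s22 = sum(u * u for u in b)
    s12 = sum(u * v for u, v in zip(a, b))
    s1y = sum(u * v for u, v in zip(a, yc))
    s2y = sum(u * v for u, v in zip(b, yc))
    det = s11 * s22 - s12 * s12
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


# One student at school "S1": one same-school neighbour is replaced by a
# higher-scoring one, and a lower-scoring other-school neighbour arrives.
peers_t0 = [("S1", 40.0), ("S1", 60.0), ("S2", 50.0)]
peers_t1 = [("S1", 40.0), ("S1", 80.0), ("S2", 50.0), ("S2", 30.0)]
dz_same = group_mean_change(peers_t0, peers_t1, "S1", same=True)
dz_other = group_mean_change(peers_t0, peers_t1, "S1", same=False)

# Toy regression: outcome changes respond to same-school changes only.
same_changes = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
other_changes = [2.0, 1.0, 4.0, 3.0, 6.0, 4.0]
outcome_changes = [1.0 + 0.4 * s for s in same_changes]
coef_same, coef_other = ols_two_regressors(outcome_changes, same_changes, other_changes)
print(dz_same, dz_other, round(coef_same, 6), round(coef_other, 6))
```

Because the toy outcome changes depend only on the same-school regressor, the fitted coefficient on the other-school regressor is zero up to rounding, which mirrors how a gap between the two coefficients would be read.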

Defining Neighbourhood Geography
While all research on peer effects faces problems in defining group membership (see Ammermueller and Pischke, 2009, for school peer effects), this choice is particularly challenging for neighbourhood peer effects, where there are no natural boundaries (such as the grade or the class for school peer effects). Consequently, the neighbourhood group definitions adopted by previous studies vary greatly with respect to geographical size. Goux and Maurin (2007) argue that using large neighbourhood definitions (that is, US Census tracts containing on average 4,000 people) leads to an underestimate of interaction effects. However, overaggregation on its own will not necessarily attenuate regression estimates of neighbourhood effects, as aggregation reduces measurement error.⁷ Whether or not the level of aggregation matters in practice is an empirical question. The detail and coverage of our population-wide data permit experimentation with alternative geographical definitions, starting from a very small-scale unit, the Output Area (OA) from the 2001 British Census, which contains 125 households on average and approximately five students in the same age-group (e.g. 6th-grade/age-11 students). Given that our identification approach relies on neighbourhood-fixed effects and trends to control for unobserved neighbourhood factors, a small-scale neighbourhood definition is preferable because it is less likely that there are unobserved neighbourhood changes over time within streets than within regions. Nevertheless, we experiment with larger geographical areas based on this underlying OA geography.
⁶ School-by-cohort fixed effects can still be controlled for in (5) because students living in the same area attend a number of different schools and schools attract students from a large number of different neighbourhoods, so that the terms $(z_{nc1} - z_{nc0})^{same}$ and $(z_{nc1} - z_{nc0})^{other}$ in (5) are not collinear with the term $(\vartheta_{sc1} - \vartheta_{sc0})$.
⁷ This is because the reduction in the covariance between mean neighbours' characteristics and individual outcomes will be offset by a reduction in the variance of average neighbours' characteristics in a regression of individual outcomes on neighbours' characteristics.
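The offsetting argument about aggregation can be illustrated with a stylised calculation. The set-up is an assumption made purely for illustration and is not taken from the study: neighbours' characteristics $z_j$ are independent with common variance $\sigma^2$; the true reference group $g$ has $m$ members; outcomes follow $y_i = \theta\,\bar z_{g(i)} + u_i$ with $u_i$ unrelated to $z$; and the researcher instead uses a larger area $G$ made up of $k$ such groups, so that it has $M = km$ members.
$$\operatorname{Var}(\bar z_G) = \frac{\sigma^2}{M}, \qquad \operatorname{Cov}(y_i, \bar z_G) = \theta\,\operatorname{Cov}(\bar z_{g(i)}, \bar z_G) = \theta\,\frac{1}{k}\operatorname{Var}(\bar z_g) = \theta\,\frac{\sigma^2}{km} = \theta\,\frac{\sigma^2}{M},$$
$$\frac{\operatorname{Cov}(y_i, \bar z_G)}{\operatorname{Var}(\bar z_G)} = \theta.$$
Under these assumptions, the covariance and the variance shrink by the same factor, so the regression slope on the over-aggregated mean equals the slope on the true group mean. Whether this carries over to real data is, as noted in the text, an empirical question.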
Another advantage of our data is that they cover the population of English state-school children and we can measure neighbour-peer composition in a variety of school grades. As we are interested in peer effects in the neighbourhood, we begin by considering students of similar age and construct neighbour-peer variables using data from students who are either of the same school grade (i.e. grade 6/age 11 at the beginning of our observation window) or 1 year younger/older (grade 5/age 10 and grade 7/age 12). However, we perform a number of checks using different grade bands. Note that these variables are constructed from information on students' characteristics that pre-date the first period of our analysis, using a balanced panel of students with non-missing data in every year of the census. This implies that changes over time in neighbour-peer composition occur only when students within our sample move across neighbourhoods and not when students drop out/come into our sample, or when their characteristics change.
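The construction of neighbour-peer changes from a balanced panel can be sketched as follows. This is a minimal illustration, assuming a leave-one-out mean of a predetermined score as the composition measure and ignoring the grade bands; the function names, the dictionaries and the toy values are hypothetical. Setting `intention_to_treat=True` assigns movers the change in their origin neighbourhood, in the spirit of the robustness check described earlier.

```python
def mean_of_others(student, area, residence, baseline):
    """Mean predetermined score of the other students living in `area`.

    residence : {student_id: area_id} for one year
    baseline  : {student_id: score fixed before the observation window}
    Returns None when no other student lives in the area.
    """
    others = [baseline[s] for s, a in residence.items() if a == area and s != student]
    return sum(others) / len(others) if others else None


def composition_changes(res0, res1, baseline, intention_to_treat=False):
    """Change in neighbour-peer composition between two years.

    By default only stayers are kept. With intention_to_treat=True movers are
    included and assigned the change in their origin neighbourhood.
    res0 and res1 must cover the same students (a balanced panel).
    """
    changes = {}
    for student, origin in res0.items():
        moved = res1[student] != origin
        if moved and not intention_to_treat:
            continue
        z0 = mean_of_others(student, origin, res0, baseline)
        z1 = mean_of_others(student, origin, res1, baseline)
        if z0 is not None and z1 is not None:
            changes[student] = z1 - z0
    return changes


# Toy panel: C leaves area OA1 and D arrives; scores never change, so all
# variation in composition comes from moves.
baseline = {"A": 10.0, "B": 20.0, "C": 30.0, "D": 40.0}
res0 = {"A": "OA1", "B": "OA1", "C": "OA1", "D": "OA2"}
res1 = {"A": "OA1", "B": "OA1", "C": "OA2", "D": "OA1"}

stayer_changes = composition_changes(res0, res1, baseline)
itt_changes = composition_changes(res0, res1, baseline, intention_to_treat=True)
print(stayer_changes)
print(sorted(itt_changes))
```

Student D has no neighbours in the origin area in the first year, so no change is defined for D and the record is skipped; a real application would need an explicit rule for such cases.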
The complex data that we use to pursue this analysis and the exact definition of our neighbour-peer variables are described in the next Section alongside the English institutional background.

The English School System
Compulsory education in England is organised into five stages referred to as Key Stages (KS). In the primary phase, students enter school at grade 0 (age 4-5) in the Foundation Stage, then move on to KS1, spanning grades 1-2 (ages 5-7). At grade 3 (age 7-8), students move to KS2, sometimes (but not usually) with a change of school. At the end of KS2, in grade 6 (age 10-11), children leave the primary phase and go on to secondary school, where they progress through KS3, from grade 7 to 9, and KS4, from grade 10 to 11 (age 15-16), which marks the end of compulsory schooling. The vast majority of students change schools on transition from primary to secondary education between grades 6 and 7.
Students are assessed in standard national tests at the end of each Key Stage, generally in May, and progress through the phases is measured in terms of Key Stage Levels.⁸ KS1 assessments test knowledge in English (reading and writing) and mathematics only, and performance is recorded using a point system. On the other hand, at KS2 and KS3, students are tested in three core subjects, namely mathematics, science and English, and attainments are recorded in terms of the raw test scores. Finally, at the end of KS4, students are tested again in English, mathematics and science (and in a varying number of other subjects of their choice) and overall performance is measured using a point system (similar to a GPA), which ranges between 0 and 8.⁹
Admission to both primary and secondary schools is guided by the principle of parental choice and students can apply to a number of different schools. Various criteria are used by oversubscribed schools to prioritise applicants but preference is usually given first to children with special educational needs, to children with siblings in the school and to children who live closest. For Faith schools, regular attendance at local designated churches or other expressions of religious commitment is foremost. Because of these criteria, alongside the constraints of travel costs, residential-choice and school-choice decisions are linked; see some related evidence in Gibbons et al. (2013) and Allen et al. (2010). Even so, most households have a choice of more than one school from where they live. On average, students in the same age bracket (e.g. age-14 students) living in the same OA (that is, our smallest proxy for neighbourhoods, sampling on average five such students) attend two to three different secondary schools every year, and each secondary school on average samples students from around 60 different OAs (of more than 160,000 in England).
As already mentioned, this unique feature allows us to measure changes in neighbourhood peer composition for students who attend the same or a different school.

Main Data Source and Grade 6 (KS2) to Grade 9 (KS3) Tests
To estimate the empirical models specified in Section 2, we draw our data from the English National Pupil Database (NPD). This data set is a population-wide census of students maintained by the Department for Education and holding records on KS1, KS2, KS3 and KS4 test scores and schools attended for every state-school student from 1996 to the present day. Since 2002, the database has been integrated with a Pupil Level Annual School Census (PLASC, carried out in January), which holds records on students' background characteristics such as age, gender, ethnicity, special education needs and eligibility for free school meals. PLASC also records the home postcode of each student on an annual basis. A postcode typically corresponds to 17 contiguous housing units on one side of a street, and allows us to assign students to common residential neighbourhoods and to link them to other sources of geographical data. In particular, we use data from PLASC to map every student's postcode into the corresponding Census Output Area (OA, described above).
The main focus of our analysis will be the period spanning grade 6 (age 11, end of KS2) to grade 9 (age 14, end of KS3) but we report results for other time periods and outcomes (described later). The main advantage of concentrating on this interval is that the data provide comparable measures of performance in English, mathematics and science at grade 6 (KS2) and grade 9 (KS3). We exploit this feature to construct measures of students' test-score value-added which allow us to estimate the changes-in-changes specification spelled out in subsection 2.1. Operationally, we average each student's performance at KS2 and KS3 across the three subjects, then convert these means into percentiles of the cohort-specific national distribution, and finally create KS2-to-KS3 value-added by subtracting age-11 from age-14 percentiles. Note that we restrict our attention to students in schools that do not select students by academic ability (i.e. comprehensive schools).
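The three operational steps described above (average across subjects, convert to cohort-specific percentiles, subtract) can be sketched as follows. This is only an illustration on toy data: the record layout, the function names and the use of mid-rank percentiles for ties are assumptions made here, not details taken from the NPD or from the authors' code.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict
from statistics import mean


def percentile_ranks(scores):
    """Map each score to a mid-rank percentile (0-100) within the list.

    Mid-ranking of ties is an assumption of this sketch.
    """
    ordered = sorted(scores)
    n = len(ordered)
    ranks = []
    for s in scores:
        below = bisect_left(ordered, s)          # scores strictly below s
        ties = bisect_right(ordered, s) - below  # scores equal to s
        ranks.append(100.0 * (below + 0.5 * ties) / n)
    return ranks


def value_added(students):
    """Return {id: KS3 percentile - KS2 percentile}, computed within cohort.

    `students` is a list of dicts with hypothetical keys 'id', 'cohort',
    'ks2' and 'ks3'; the last two hold (English, maths, science) scores.
    """
    by_cohort = defaultdict(list)
    for s in students:
        by_cohort[s["cohort"]].append(s)

    result = {}
    for members in by_cohort.values():
        # Step 1: average across the three subjects.
        # Step 2: convert the averages to percentiles within the cohort.
        ks2_pct = percentile_ranks([mean(m["ks2"]) for m in members])
        ks3_pct = percentile_ranks([mean(m["ks3"]) for m in members])
        # Step 3: subtract the age-11 percentile from the age-14 percentile.
        for m, p2, p3 in zip(members, ks2_pct, ks3_pct):
            result[m["id"]] = p3 - p2
    return result


# Toy data: three students in one cohort.
toy = [
    {"id": "a", "cohort": 2002, "ks2": (50, 60, 70), "ks3": (40, 40, 40)},
    {"id": "b", "cohort": 2002, "ks2": (40, 40, 40), "ks3": (60, 60, 60)},
    {"id": "c", "cohort": 2002, "ks2": (80, 80, 80), "ks3": (90, 90, 90)},
]
va = value_added(toy)
```

In the toy data, student "c" stays at the top of the distribution in both tests and so has zero value-added, while "a" and "b" swap ranks and receive equal and opposite values.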
Given the time span of the NPD-PLASC integrated data set and our data requirements, we can track several birth cohorts of students as they progress through education. For our main analysis, we retain students in the four 'central' cohorts, namely students in grade 6 (taking KS2 tests) in academic years 2001/2, 2002/3, 2003/4 and 2004/5, who move on to grade 9 (KS3 tests) in the years 2004/5, 2005/6, 2006/7 and 2007/8. We use other cohorts to construct the neighbour-peer variables as described in subsection 3.3 below. Finally, we concentrate on students who live in the same OA over the period covering grade 6 (age 11) to grade 9 (age 14), which we label as stayers (we will address issues of selectivity caused by focusing on the stayers in our robustness checks). After applying these restrictions, we obtain a panel of approximately 1.3 million students spread over four cohorts.

Data on Neighbour-peer Composition
Using NPD/PLASC, we construct measures of neighbour-peer composition based on neighbourhood aggregates of student characteristics. These neighbour-peer characteristics are: (i) average grade 3 (KS1) score in English (reading and writing) and mathematics; (ii) share of students eligible for free school meals (FSM); (iii) share of students with special education needs (SEN); (iv) fraction of males.
We use KS1 scores to proxy students' early academic ability, FSM eligibility as an indicator of low family income and SEN as a proxy for learning difficulties and disabilities. FSM is a fairly good proxy for low income, as all families who are on unemployment and low-income state benefits are entitled to free school meals (Hobbs and Vignoles, 2010). SEN is based on students deemed by the school to have special educational needs, which includes those with official SEN statements from their local education authority. FSM and SEN status are based on students' information in the first year they appear in the data, so they do not change over time by construction. Finally, we consider the share of males as this has been highlighted as important in previous research on peer effects (Hoxby, 2000;Lavy and Schlosser, 2011). To construct these neighbour-peer aggregates, we use individual-level data from all students who live in the same OA and are either in the same grade (i.e. grade 6/age 11 at the beginning of our observation window) or in the school grade above or below (grade 5 and grade 7). 10 We keep OA neighbourhoods in our estimation sample only if there are at least five students in the OA in these grade/age categories. Moreover, we keep a panel of students with non-missing information in all years, so that neighbourhood quality changes are driven by the same students moving in and out of the area, and not by students joining in and dropping out of our sample. Given the quality of our data, this restriction amounts to excluding approximately 2% of the initial sample. Figure 1 provides a graphical representation of the time window in the data and the construction of the neighbour-peer groups. For example, Cohort 1 is the cohort of children in grade 6 and taking KS2 in 2002, who go on to secondary school in 2003 and take their KS3 in grade 9 in 2005. 
Neighbour-peer composition in 2002 for Cohort 1 is calculated from students in the OA who are in Cohort 1, plus those in grades 5 and 7. Neighbourhood composition in 2005 is calculated from Cohort 1 (then in grade 9), plus those in grades 8 and 10.
To check the validity of our basic neighbourhood definition, we construct two alternatives based on (i) students in the same OA and the same grade only; and (ii) students in the same and adjacent grades, but living in a set of contiguous OAs.
Specifically, for (ii), we create neighbourhoods that include students' own OA plus all contiguous OAs. These extended neighbourhoods include on average six to seven OAs, and approximately 80 students.
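The baseline neighbour-peer aggregate (same OA, same or adjacent grade, at least five students) can be sketched as below. The dictionary keys and the function name are hypothetical, and the sketch assumes that a student's own value is included in the OA mean, which the text does not state either way.

```python
from collections import defaultdict

MIN_GROUP_SIZE = 5  # threshold stated in the text


def neighbour_peer_means(students, central_grade, variable):
    """Mean of `variable` per OA over students in the central grade +/- 1.

    `students` is a list of dicts with hypothetical keys 'oa', 'grade' and
    the named variable. OAs with fewer than MIN_GROUP_SIZE qualifying
    students are dropped. Assumption: the mean includes every qualifying
    student, i.e. it is not a leave-one-out mean.
    """
    groups = defaultdict(list)
    for s in students:
        if abs(s["grade"] - central_grade) <= 1:
            groups[s["oa"]].append(s[variable])
    return {
        oa: sum(values) / len(values)
        for oa, values in groups.items()
        if len(values) >= MIN_GROUP_SIZE
    }


# Toy data: OA "A" has five students in grades 5-7 (two on FSM) and one
# grade-9 student who falls outside the window; OA "B" is too small.
toy = [
    {"oa": "A", "grade": 5, "fsm": 1},
    {"oa": "A", "grade": 6, "fsm": 0},
    {"oa": "A", "grade": 6, "fsm": 0},
    {"oa": "A", "grade": 7, "fsm": 1},
    {"oa": "A", "grade": 7, "fsm": 0},
    {"oa": "A", "grade": 9, "fsm": 1},
    {"oa": "B", "grade": 6, "fsm": 1},
    {"oa": "B", "grade": 6, "fsm": 0},
    {"oa": "B", "grade": 7, "fsm": 0},
]
fsm_share = neighbour_peer_means(toy, central_grade=6, variable="fsm")
```

The two alternative definitions would change only the grouping step: dropping the adjacent grades for definition (i), or pooling a student's own OA with its contiguous OAs for definition (ii).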

Data on Behaviour from the Longitudinal Study of Young People in England
The administrative data in PLASC/NPD provides outcome variables related to academic test scores. However, previous research in the field (Kling et al., 2005, 2007) suggests that behavioural outcomes (e.g. crime, educational aspirations, health, life-satisfaction and wellbeing) are more likely to be affected (sometimes perversely) by neighbours, even in contexts where test scores are not influenced (Sanbonmatsu et al., 2006). To investigate this issue, we use the Longitudinal Study of Young People in England (LSYPE), which sampled approximately 14,000 students in grade 9 (aged 14) in 2004 (one cohort only) in 600 schools, and followed them as they progressed through their secondary education up to grade 11 (age 16) and beyond. The survey covers students' experiences at school, at home and in their neighbourhood and contains a number of questions related to behavioural outcomes. These questions were asked in a confidential environment to encourage students to answer truthfully. Most of the questions involved a binary answer of the type 'Yes/No'. We follow Kling et al. (2007) and recombine some of the original variables to obtain four behavioural outcomes. Specifically, we construct the following four proxies: (iv) 'Anti-social behaviour' which is obtained as 'Did you put graffiti on walls last year (Yes = 1; No = 0)' plus 'Did you vandalise public property last year (Yes = 1; No = 0)' plus 'Did you shoplift last year (Yes = 1; No = 0)' plus 'Did you take part in fighting or a public disturbance last year (Yes = 1; No = 0)'.
The survey also contains precise information about students' place of residence, which means we can merge into this data the neighbour-peer characteristics that we have constructed using the population of students in the PLASC/NPD. Given the age of the students covered by the LSYPE, we consider the effect of neighbourhood changes on outcomes between grades 9 and 11. Moreover, as many older students drop out of education and thus out of our data set after grade 11 (the end of compulsory education), we construct neighbour-peer variables using students in the same OA and grade only. 11 Finally, grade 3/KS1 test scores for this cohort are not available, so we use mean KS2 test scores of neighbour-peers as a measure of their prior academic abilities.

Summary Statistics
Descriptive statistics for the main variables for the grade 6/KS2 to grade 9/KS3 data set are provided in Table 1. Panel (a) presents summary statistics for the characteristics of the stayers. The KS2 and KS3 scores are percentiles in the population in our database. The KS2 and KS3 percentiles average around 50, with a standard deviation of about 25 points, and mean value-added of 1.1. 12 We use figures from this Table to standardise all the results in the regression analysis that follows. About 15% of the students are eligible for FSM, 21% have SEN and 50% are male. Average secondary school size is around 1,080 students and the rates of annual inward and outward neighbourhood mobility are similar (they are based on mobility within a balanced panel) and close to 8%. Note that our estimation sample, which excludes movers and students in the smallest neighbourhoods, is representative of the population as a whole; see Table B1 in Appendix B.
Panel (b) of Table 1 presents the means and standard deviations of the neighbour-peer characteristics and their changes between grades 6 and 9 (age-11/KS2 to age-14/KS3). KS1 test scores at grade 2 are measured in points (not percentiles), and a score of 15 is in line with the national average. By construction (from our balanced panel), the levels of the shares of FSM, SEN and male students are very similar to those of the underlying population of students (see Panel (a)) and none of the neighbour-peer characteristic means changes much between grades (any change is due to the fact that the statistics report neighbour-group means and individuals are changing group membership). Our neighbourhoods have on average around five students in the same grade, and 14 students in the same or adjacent grades. This means that relative to most of the previous research in the field, we focus on small groups of close neighbour-peers.
11 Note that we cannot construct measures of the neighbourhood 'quality' by aggregating the characteristics of the LSYPE students as we have too few LSYPE students in each OA neighbourhood.
12 Mean value-added is not centred on zero, and the standard deviations of KS2 and KS3 percentiles are slightly smaller than theoretically expected, because the percentiles are constructed before: (i) dropping students with some missing observations (approximately 2% of the initial sample); (ii) disregarding students in small neighbourhoods (fewer than five students in the OA in the same grade and two adjacent cohorts); (iii) considering only students who do not change neighbourhood between grades 6 and 9 (the stayers).
An important point from Table 1 is the amount of variation we have in our neighbour-peer variables once we take differences to eliminate individual and neighbourhood-fixed effects. The standard deviation of KS1 scores is 1.76, while the standard deviation of the change in this variable between grades 6 and 9 is just over 0.86. Therefore, 24% of the variance in the average KS1 scores is within OA over time. The corresponding percentages for the shares of FSM, SEN and male students in the neighbourhood are 16%, 31% and 41% respectively. Figures 2(a) and (b) illustrate this point further by plotting the distributions of the neighbourhood mean variables in: (i) levels (top left panels); (ii) between-grade differences (top right panels); (iii) between-grade differences, after controlling for primary-by-secondary-by-cohort school effects (bottom left panels); (iv) between-grade, between-cohort differences netting out OA trends (bottom right panels).
Notes to Table 1. Descriptive statistics refer to: (i) students who do not change OA of residence in any period between grades 6 and 9; (ii) students in Output Areas with at least five students belonging to the 'central cohort' +1/-1 in every period between grades 6 and 9; (iii) students in the non-selective part of the education system. These restrictions were operated after computing OA aggregate information (see Panel B). Number of 'stayers': approximately 1,310,000 (evenly distributed over four cohorts). Number of Output Areas: approximately 134,000. Average inward mobility and outward mobility in neighbourhood refer to (cohort-specific) Output Area mobility rates averaged over the period grades 6-9. KS1 refers to the average test score in reading, writing and mathematics at the Key Stage 1 examinations (at age 7); FSM: free school meal eligibility; SEN: special education needs (with and without statements).
All these figures suggest that there is considerable variation over time in neighbour-peer characteristics, from which we can estimate our coefficients of interest, and that controlling for school-by-cohort or OA trends does not lead to a drastic reduction in this variation. It is worth reiterating that, on average, more than 8% of the neighbours move out and are replaced by new neighbours each year. Over 3 years, this means that more than one in four pupils in a student's neighbour-peer group is replaced, with a large part of this change occurring between grades 6 and 7, when mobility is highest. This is a substantial change, which we might expect to have real consequences.
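The within-OA variance share quoted for KS1 appears to follow from the ratio of the variance of the grade 6-to-9 change to the variance of the level. The short check below reproduces the 24% figure from the two standard deviations given in the text; reading the share as this simple ratio is an interpretation made here, not a formula stated by the authors.

```python
# Standard deviations as quoted in the text (Table 1).
sd_level = 1.76   # SD of neighbour-peer mean KS1 in levels
sd_change = 0.86  # SD of its change between grades 6 and 9 ("just over 0.86")

# Share of variance that is within OA over time, under the assumed reading
# share = Var(change) / Var(level).
share_within = sd_change ** 2 / sd_level ** 2
print(f"{100 * share_within:.0f}%")
```

The same ratio would apply to the FSM, SEN and male shares, whose standard deviations are not quoted in this section.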

Neighbours' Characteristics and Students' Test Score: Linear-in-means Estimates
Table 2 presents our regression results on the association between neighbour-peer characteristics and students' test scores for residential stayers. The Table reports standardised regression coefficients, with standard errors in parentheses (clustered at the OA level). As discussed in subsection 3.3, neighbour-peers are defined as students in the same OA and in the same or adjacent school grades, and we report the effect of: average grade 3 (KS1) point scores (Panel (a)); share of FSM students (Panel (b)); share of students with SEN status (Panel (c)); and share of male students (Panel (d)). Each coefficient is obtained from a separate regression. Some of these neighbour-peer characteristics are highly correlated with one another, but our aim is to look for the effects from any one of them (interpreted as an index of neighbour-peer quality) rather than the effect of each characteristic conditional on the others. Columns (1)-(4) present results from regressions that do not include control variables other than cohort dummies and/or other fixed effects as specified at the bottom of the Table. Columns (5)-(8) add control variables for students' own characteristics as described later in this Section. The note to the Table provides more details.
Column (1) shows the cross-sectional association between neighbour-peer characteristics and students' own KS3 scores. All four characteristics are strongly and significantly associated with students' KS3 scores. A one standard deviation increase in KS1 is associated with a 0.3 standard deviation increase in KS3, while a one standard deviation increase in FSM or SEN students is linked to a 0.2-0.3 standard deviation reduction in KS3. The fraction of males has a small positive relation with KS3 scores. These cross-sectional estimates are potentially biased by residential sorting and unobserved individual, school and neighbourhood factors. The results from the within-student, between-grade differenced specifications in (3)-(4) are shown in Column (2). Now, the associations between changes in neighbour-peer characteristics and KS2-to-KS3 value-added are driven down almost to zero and only significant in two out of the four panels. The coefficients are up to 100 times smaller than in Column (1). A one standard deviation change in neighbours' KS1 and in the FSM proportion over the 3-year interval is linked to a mere 0.3%-0.5% of a standard deviation change in students' test-score progression. Neighbours' SEN and male proportions are not significantly associated with students' KS2-to-KS3 value-added, with estimated effects close to zero. To control further for school-specific factors, Column (3) adds primary-by-secondary-by-cohort effects. Results from these specifications show that none of the neighbour-peer characteristics are now significantly related to students' KS2-to-KS3 value-added. The loss in significance is not due to a dramatic increase in the standard errors, but to the magnitude of the coefficients shrinking towards zero. This backs the intuition gathered from Figures 2(a) and (b) that in principle there is sufficient variation to identify significant associations between neighbourhood composition and students' achievements.
To control for neighbourhood-specific time trends, Column (4) adds OA-fixed effects in the value-added specification. The results are nearly identical to those in Column (3). 13 As shown in Appendix B Table B2, accounting for OA trends only, without school-by-cohort effects, yields virtually identical results.
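To illustrate what adding group fixed effects does to a differenced regression, the sketch below estimates a single slope after demeaning both variables within groups. It is a toy stand-in, not the authors' estimator: it has one regressor, no controls and no clustered standard errors, and the group label is a hypothetical placeholder for a primary-by-secondary-by-cohort cell.

```python
from collections import defaultdict


def within_group_slope(rows):
    """OLS slope of dy on dx after demeaning both within each group.

    `rows` is an iterable of (group, dx, dy), where dx stands for the change
    in a neighbour-peer characteristic and dy for a student's value-added.
    Demeaning within groups absorbs group-specific intercepts (fixed effects).
    """
    groups = defaultdict(list)
    for g, dx, dy in rows:
        groups[g].append((dx, dy))

    sxy = sxx = 0.0
    for obs in groups.values():
        mx = sum(x for x, _ in obs) / len(obs)
        my = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            sxy += (x - mx) * (y - my)
            sxx += (x - mx) ** 2
    return sxy / sxx


# Toy data: two school cells with different intercepts but a common
# within-cell slope of 0.5.
toy = [("s1", 0, 10), ("s1", 2, 11), ("s2", 4, 0), ("s2", 6, 1)]

within = within_group_slope(toy)
# Ignoring the cells (one pooled group) lets between-cell differences
# drive the estimate.
pooled = within_group_slope([("all", dx, dy) for _, dx, dy in toy])
```

In this toy example the pooled slope is negative while the within-cell slope is 0.5, which shows how school-level differences can dominate an estimate until they are absorbed.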
Columns (5)-(8) repeat the analysis of Columns (1)-(4) but add some control variables. These include students' own KS1 scores, FSM and SEN status and gender, plus school size, school type dummies and average rates of inward and outward mobility in the neighbourhood. Comparing Columns (1) and (5) suggests that the cross-sectional associations in Column (1) are severely biased by sorting and unobserved student characteristics: adding in the control variables reduces the coefficients substantially (by a factor of three). In contrast, once we eliminate student and neighbourhood-fixed effects as in Columns (2) and (6), adding in the control set does not significantly affect our results. The only case where there is a notable change is in the effect of neighbour-peer SEN, which becomes statistically significant (at the 5% level), even though the point estimate is unchanged. The similarity of the results in Columns (2)-(4) with those in Columns (6)-(8) is reassuring as it implies that changes in neighbour-peer composition are not strongly linked to students' background characteristics. This finding lends support to our identification strategy, which relies on changes in the treatment variables being 'as good as random' once we partial out student and neighbourhood-fixed effects. The next Section presents more formal evidence on this point.
One concern might be that the attenuation in the estimates that we observe once we difference the data within student and between grades is caused by measurement error in our neighbour-peer variables. Although our proxies are constructed from administrative data on the population of state-school children, they may still be noisy measures of the underlying neighbours' attributes that matter for students' achievements (which we cannot observe). This noise could be exacerbated by differencing the data, in particular as there is a high degree of serial correlation in the neighbour-peer characteristics within neighbourhoods. The standard errors in Table 2 suggest this is not the case. However, to assess this issue more systematically, we perform two robustness checks. First, we use teachers' assessment of students' performance during KS1 to construct instruments for neighbour-peer KS1 test scores on the grounds that the only common components of KS1 test scores and teacher assessments should be related to underlying neighbours' abilities. 14 Instrumental variable regressions confirm that the effect of changes in KS1 test scores of neighbour-peers is not a strong and significant predictor of students' KS2-to-KS3 value-added. In our second robustness check, we estimate a linear predictor of students' KS2 achievement by regressing students' own KS2 achievements on own KS1 test scores, FSM eligibility, SEN status and gender. The predictions from these regressions are then aggregated across neighbour-peers to create new measures of predicted neighbour-peer KS2 at grades 6 and 9. This new composite indicator should be less affected by measurement error in relation to the underlying neighbourhood quality that matters for students' achievements as it is based on the best linear combination of the individual characteristics that predicts KS2 test scores.
Using this measure as a proxy for neighbour-peer quality produces similar results to those in Table 2, with no evidence of any significant effect from neighbours on students' achievement. Finally, note that the reduction in the coefficients from Column (2) to (3) and from Column (6) to (7) is not due to the inclusion of a large number of fixed effects (around 190,000 primary-by-secondary-by-cohort groups). As shown in Appendix B Table B2, including only secondary school-fixed effects (around 3,200 groups) or secondary-by-cohort effects (approximately 12,000 groups) similarly drives our estimates to zero. 15 In summary, our baseline linear-in-means specifications indicate that the effects of neighbour-peers on student achievement are statistically insignificant and negligibly small. As controlling for unobserved neighbourhood trends does not affect our estimates once we have taken into account school-by-cohort effects, the analysis that follows considers only simple value-added specifications and specifications that further control for school cohort-specific effects.

Assessing Our Identification Strategy
The validity of our empirical method rests on the assumption that changes in neighbour-peer composition between grades are not related to the unobserved characteristics of students who stay in the neighbourhood, nor to other unobservable attributes of the neighbourhoods. We have already shown that the results of the between-grade within-individual value-added specifications are insensitive to the inclusion of additional control variables. In this Section, we tackle this issue more systematically by showing that our treatments are balanced with respect to student and neighbourhood characteristics.
14 For the students in our sample, KS1 achievement was assessed on the basis of externally moderated written tests, and using the teacher's own assessment based on their experience of the student.
15 As a further robustness check, we replaced school-fixed effects with school-level characteristics. For example, we replaced primary-by-secondary-by-cohort effects with actual cohort-specific changes in school-level characteristics on transition from primary to secondary school. These included student-to-teacher ratios, fraction of students of White ethnic origin, fractions of students eligible for FSM and with SEN status, number of full-time equivalent qualified teachers and numbers of support teachers for ethnic minorities and for SEN students. These specifications confirmed that neighbourhood composition is not strongly associated with students' value-added.
The neighbourhood characteristics that we consider come from the British 2001 Population Census at the OA level. Specifically, these are the proportions of: (i) households living in socially rented accommodation; (ii) owner-occupiers; (iii) adults in employment; (iv) adults with no qualifications; (v) lone parents.
Additional characteristics come from the NPD collapsed to the OA of residence at grade 6 (age 11), namely: KS1, FSM, SEN and gender, as well as the mean and the standard deviation of students' KS2 test scores. To check the balancing of our treatments, we carry out OA-level regressions of these neighbourhood characteristics on the OA-specific changes in the neighbour-peer characteristics used in Table 2 (i.e. grade 6-to-9 changes in neighbour-peer KS1 test scores and FSM, SEN and male proportions).
Standardised coefficients and standard errors from these regressions are reported in Table 3. Panel (a) shows the association between OA-mean student characteristics and the changes in neighbour-peer composition between grades 6 and 9. These regressions have no control variables other than the proportion of students in the neighbourhood from each cohort in our data and the proportions of students represented in different school types. 16 The only significant and meaningful associations are related to the changes in neighbour-peer FSM. These estimates show that neighbourhoods with low KS1, high FSM and high SEN experience increases in the fraction of neighbours who are FSM registered, which would imply upward biases in the estimates in Table 2, Columns (2)-(4). These associations are, however, very small in magnitude. Moreover, it should be noted that we have only imperfect controls for cohort and school effects in these balancing tests and that these factors are more effectively controlled for in the specifications in Table 2, which include school-by-cohort effects and neighbourhood trends. In Table 3 Panel (b), we regress OA-level KS2 statistics and census variables on the neighbour-peer change variables. These regressions include OA-level averages of the controls added in the specifications of Columns (5)-(8) of Table 2. The intuition for this approach is based on the idea of using census characteristics and OA KS2 statistics as proxies for additional unobservable factors in the regressions of Columns (5)-(8) of Table 2, and testing for their correlation with the changes in neighbour-peer characteristics. The results present a reassuring picture: nearly all the estimated coefficients are very small and insignificant.
Assuming that the correlation of neighbour-peer changes with observable characteristics provides a guide to the degree of correlation with the unobservables (as argued in Altonji et al., 2005), the balancing tests in Table 3 provide broad evidence that the near-zero neighbour-peer effect estimates in Table 2 are not biased by student or neighbourhood unobservables.
16 School types include: Community, Voluntary Aided, Voluntary Controlled, Foundation, City Technology College and Academy. The cohort and school-type proportions stand in for the cohort-by-school effects in our main student-level regressions, which we are unable to include in the aggregated OA-level regressions.
Regressions in the bottom panel include cohort effects; OA-averaged student KS1 test scores; OA-averaged student eligibility for FSM; OA-averaged student SEN status; OA-averaged student male gender; OA-averaged school size (refers to school attended in grade 7); school-type effects (refers to school attended in grade 7); OA-averaged rates of outward and inward mobility in neighbourhood. Standard errors clustered at the OA level in parentheses. ** 1% significant or better. * at least 5% significant.
Nevertheless, a sceptical reader could still argue that there might be unobserved shocks, conditional on school-by-cohort effects and neighbourhood trends, which simultaneously affect children's outcomes and the distribution of the characteristics of in-migrants and out-migrants. If families are moving in response to neighbourhood changes which affect student achievements, then our estimates are likely to be upward biased because neighbourhoods most likely experience a net outflow of rich students in response to shocks that have an adverse impact on student achievement (assuming that the neighbourhood factors affecting student achievement are normal goods in housing consumption). In other words, our near-zero estimates should be regarded as an upper bound of the effects of neighbourhood composition. Additional evidence from the British Household Panel Survey (BHPS), however, provides little support for the idea that residential migration occurs as a result of neighbourhood shocks. The BHPS is a longitudinal survey that has followed a representative sample of families in Britain since the early 1990s. The survey tracks residential movers and asks respondents open-ended questions about their reasons for moving. These responses are then coded up into the most common categories. Taking a subsample of 637 movers that corresponds to households with children for the years matching the PLASC/NPD data that we use in our analysis, we find that the main specific reasons for residential moves are: (a) size or other physical attributes of the home (22.6% are moves to larger accommodation, while 9.5% relate to other aspects of the home); (b) formation and dissolution of partnerships (16%); (c) changes of tenure status (7.6% relate to buying a home, while 5.4% are linked to eviction or home repossession); (d) job-related reasons (9.6%).
Neighbourhood-specific reasons (i.e. disliking the area, isolation, safety, unfriendliness and noise) are specified by just over 5% of those moving, although there is an ambiguous 16.2% coded as citing 'other' reasons or no reason for moving and a further 4% citing 'family reasons'. The figures are tabulated in Appendix B Table B3. In summary, between 75% and 95% of the moves occur for reasons not related to neighbourhoods and none of the responses cite neighbourhood changes or education issues. In conclusion, there is little reason to believe that our results are biased by neighbourhood shocks that directly affect students' educational achievements and cause changes in neighbour-peer composition.
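The 75%-95% range quoted above appears to follow from how the ambiguous response categories are treated. The arithmetic below is a reading of the reported shares, not a calculation given by the authors: the upper bound excludes only the explicitly neighbourhood-related moves, while the lower bound also treats the 'other/no reason' and 'family reasons' categories as potentially neighbourhood-related.

```python
# Shares of movers (in %) as quoted in the text; "just over 5%" is
# rounded to 5.0 here.
neighbourhood = 5.0    # disliking the area, isolation, safety, etc.
other_or_none = 16.2   # ambiguous: 'other' reasons or no reason given
family = 4.0           # ambiguous: 'family reasons'

# Upper bound: only explicit neighbourhood reasons count as
# neighbourhood-related.
upper = 100 - neighbourhood
# Lower bound: the ambiguous categories are counted as well.
lower = 100 - (neighbourhood + other_or_none + family)

print(f"{lower:.1f}% to {upper:.1f}% of moves unrelated to neighbourhoods")
```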

Peers at School or Peers in the Neighbourhood?
The analysis so far has not distinguished between neighbour-peers who attend the same secondary school and those who do not. This distinction could be important for at least two reasons. First, children who are at school for a large part of their day may not interact with neighbours, unless they know each other from school already, so neighbour-peers who attend a different school may exert little or no influence on students' outcomes. Second, distinguishing between school and neighbourhood peers is useful for uncovering an uncontaminated neighbourhood-level peer effect, net of school peer effects and other school factors that have not otherwise been effectively controlled for in our regressions. Table 4 presents evidence on this issue by tabulating results obtained from estimating (5), and including different levels of fixed effects as we move from Column (1) to Column (3). Results in Panel A show that neighbour-peer KS1 has an impact on a student's achievement only if these neighbours also attend that student's secondary school. However, this association vanishes as soon as we include secondary-by-cohort or primary-by-secondary-by-cohort effects. Next, results in Panel B show that FSM status of neighbour-peers matters irrespective of school attended, with a standardised coefficient of negative 0.003 (SE 0.001). However, as soon as we include school-by-cohort effects to control for school-related residential sorting during the transition between primary and secondary school, the estimated effects shrink and become insignificant. Finally, we find no evidence of neighbour-peer effects when looking at neighbours' SEN status and gender, irrespective of the school attended. All in all, this evidence indicates that residential neighbourhood peer effects are effectively zero, irrespective of whether neighbours attend the same school or not.
... Areas is driven by the restriction that Output Areas must have both a subset of students going to the same school and a subset of students going to different schools. Controls include student's own KS1 test scores; student is FSM; student is SEN; student is male; school size (refers to school attended in grade 7); average annual rate of outward mobility in neighbourhood; average annual rate of inward mobility in neighbourhood. Secondary-by-cohort effects: approximately 12,000 groups. Secondary-by-primary-by-cohort school effects: approximately 191,000 groups. Standard errors clustered at the OA level in parentheses. ** 1% significant or better; * at least 5% significant.

Robustness Checks I: Intention-to-treat Estimates and Other Definitions of Peers and Neighbourhoods
An important issue that we flagged in Section 2 is that focusing on a sample of children who do not move between grades 6 and 9 might induce sample selection biases. To circumvent this problem, we provide intention-to-treat estimates, using movers and stayers but assigning to movers the grade-9 characteristics of the neighbourhood in which they lived at grade 6 (as described in subsection 2.1). Table 5 presents our results for specifications without (Column (1)) and with (Column (2)) primary-by-secondary-by-cohort effects (both columns include control variables). The new results are almost identical to those reported in Table 2 for stayers only, allaying sample-selection concerns.
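The intention-to-treat assignment described above can be illustrated with a minimal sketch. Everything in it is hypothetical and for illustration only: the `Student` record, the `oa_means` lookup and the numbers are assumptions, not the data structures or values used in the study. The point is simply that, under intention-to-treat, a mover's grade-9 neighbourhood value is read from the Output Area occupied at grade 6.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Student:
    # Hypothetical record: Output Area (OA) of residence at grades 6 and 9.
    student_id: str
    oa_grade6: str
    oa_grade9: str


def neighbourhood_change(student, oa_means, intention_to_treat=True):
    """Change in a neighbour-peer characteristic between grades 6 and 9.

    oa_means maps (output_area, grade) to the mean characteristic of
    neighbour-peers. Under intention-to-treat, the grade-9 value comes
    from the grade-6 OA, whether or not the student has moved.
    """
    oa_end = student.oa_grade6 if intention_to_treat else student.oa_grade9
    return oa_means[(oa_end, 9)] - oa_means[(student.oa_grade6, 6)]


# Made-up shares of FSM neighbour-peers in two Output Areas.
oa_means = {("A", 6): 0.20, ("A", 9): 0.25, ("B", 6): 0.40, ("B", 9): 0.35}
stayer = Student("s1", "A", "A")
mover = Student("s2", "A", "B")

itt_stayer = neighbourhood_change(stayer, oa_means)
itt_mover = neighbourhood_change(mover, oa_means)
actual_mover = neighbourhood_change(mover, oa_means, intention_to_treat=False)
```

In this toy example the mover and the stayer receive the same intention-to-treat change, because both lived in area A at grade 6, whereas the change the mover actually experienced (from A to B) is different.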
As discussed in subsection 2.3, there are ambiguities about the correct neighbour-peer group definition. In Table 5, we experiment with different group definitions as discussed in subsection 3.3. Columns (3) and (4) consider neighbour-peers in the same OA and grade only, whereas Columns (5) and (6) change the neighbourhood definition to include, on average, six to seven adjacent OAs (on average 80 students). In general, these redefinitions make no substantive difference to the results. In some cases, previously insignificant coefficients become more precise, although all the effects remain very small in magnitude, and most are insignificant once we include school-by-cohort effects. Using aggregates computed over larger residential areas in Column (5) increases the precision and the size of our estimates. However, including school-by-cohort effects as in Column (6) brings our estimates close to zero and insignificant (with the exception of the changes in the share of males). This pattern might be explained by the fact that changes in larger neighbourhood aggregates are more likely to be contaminated by omitted time-varying neighbourhood factors (such as changes to neighbourhood infrastructure or household mobility dictated by school quality and access) than changes in smaller geographical units. This lends support to our earlier claim that small-scale geographical fixed effects minimise the risk from endogenous changes in neighbourhood quality. Finally, we experimented with alternative neighbour-peer variables based on the characteristics of the adult population in the neighbourhood (rather than students of similar ages). This type of information is not readily available from the education data sets used so far, but can be gathered using time-varying information from the Department for Work and Pensions (DWP).
From these data, we matched the students in our main data set to neighbourhood information on: (i) the number of working-age people claiming the 'Job Seeker Allowance' (JSA, i.e. unemployment benefits); (ii) the number of people aged 16-25 claiming JSA; and (iii) the number of lone parents on income support (a proxy for very low income usually among young, unmarried mothers).
Evidence from regressions analogous to those in Table 2, but using these adult-based indicators, gave coefficients close to zero and insignificant, implying no neighbour-peer effects related to the adult composition of the neighbourhood.

Robustness Checks II: Timing Issues and Alternative Time Windows
Up to this point, we have only investigated whether KS2-to-KS3 value-added is related to neighbourhood changes over the same period. However, students' educational progress could respond more to changes at different points over the grade 6-9 period. We therefore investigated whether there are heterogeneous effects from the three different grade-on-grade changes in neighbourhood composition, that is, grades 6-7, grades 7-8 and grades 8-9. The results (not tabulated) are in line with the other results so far, although there is a small negative effect of changes in neighbour-peers' average KS1 between grades 6 and 7, which is borderline significant with a p-value of 0.054. 17 To address timing issues further, we consider students' attainments at grade 11 (KS4) and analyse whether students' value-added between grade 6 (KS2) and grade 11 (KS4), and between grade 9 (KS3) and grade 11 (KS4), is affected by the corresponding changes in neighbour-peer characteristics. The data used to estimate these models are discussed in Appendix A and a selection of our results is presented in Appendix B Table B4. Results based on neighbourhood changes over up to 5 years confirm our previous findings: irrespective of the neighbour-peer proxy considered, there is no evidence that variation in neighbourhood composition affects the gains in achievement of students.
We also allowed for time lags in the process by studying whether grades 9-11 (KS3-to-KS4) value-added is affected by grades 6-9 or grades 8-10 changes in the neighbourhood composition. Furthermore, we looked at students' value-added in primary schools, replicating the analysis in Table 2 for the grades 2-6 (KS1-to-KS2) phase (results not tabulated). Once again, we found no evidence of neighbour-peer effects on students' test score progression. These results are available upon request.

Heterogeneity, Non-linearities and Complementarities
The results from the linear-in-means specifications presented so far show that, on average, changes in neighbour-peer composition do not influence students' test score gains. However, this headline result might mask heterogeneity and non-linearity along a number of dimensions. As discussed in the Introduction, these issues are relevant because 'mixed neighbourhoods' policies that aim to improve overall students' outcomes are predicated on strong assumptions about the second-order partial derivatives of the functions describing these neighbourhood effects. In this Section, we exploit the size and coverage of our census data to investigate heterogeneity, complementarities and non-linearities in neighbour-peer effects. Table 6, which runs across two pages, presents our first set of results, with Columns (1a)-(1b) to (4a)-(4b) exploring heterogeneity in pupils' response to neighbourhood changes according to whether the student: (i) has KS1 test scores above/below the sample median; (ii) is eligible for FSM; (iii) has SEN status; (iv) is male or female.
Next, Columns (5a)-(5b) to (8a)-(8b) present heterogeneity by neighbourhood type. Specifically, we separately consider areas with: (i) above/below median student numbers; (ii) above/below median population density; (iii) above/below median housing overcrowding 18 ; (iv) percentage of social housing tenants above/below 75%.
Table 6 Heterogeneity and Complementarities in Neighbourhood Effects, by Student and Neighbourhood Characteristics
Dependent variable/timing is: KS3-KS2 value-added/grades 6-9
Notes. The Table reports standardised coefficients and standard errors obtained from regressions pooling all students and interacting the individual or neighbourhood characteristic specified in the heading with one of the treatments (change in the neighbourhood characteristic). All regressions include controls as in Table 2 column (5) and following columns, plus secondary-by-primary-by-cohort-fixed effects. Number of observations approximately 1,310,000 in approximately 134,000 Output Areas. Secondary-by-primary-by-cohort effects: approximately 191,000 groups. Number of students above/below median KS1: approximately 582,000/726,000 respectively. Number of FSM/Non-FSM students: approximately 203,000/1,106,000 respectively. Number of SEN/Non-SEN students: approximately 279,000/1,031,000 respectively. Number of male/female students: approximately 665,500/643,700 respectively. Small and large neighbourhoods are defined using number of students in the 'central cohort' +1/-1 residing in the OA on average over the 4 years of the analysis. Number of students in large/small neighbourhoods: approximately 674,000/635,000 respectively. Population density, housing overcrowding and share of households socially renting derived from British Census 2001 at the OA level. Number of students in high/low-density neighbourhoods (above/below median): approximately 656,000 in both cases. Number of students in neighbourhoods with high/low residential overcrowding (above/below median): approximately 656,000 in both cases. Neighbourhoods with a high share of social renters are defined as those with at least 75% of households in socially rented accommodation. Number of students in neighbourhoods with high/low share of social housing: approximately 43,600/1,267,000 respectively. Standard errors clustered at the OA level in parentheses. **1% significant or better; *at least 5% significant.
Of the 64 estimates presented in the Table, only six are significant at conventional levels. These show that: (i) a larger fraction of SEN students negatively affects students with high KS1 achievements; (ii) a larger fraction of FSM students lowers non-SEN and female students' test scores; (iii) a larger fraction of boys improves other boys' achievements; and (iv) a larger fraction of neighbours with FSM and SEN status has a significantly adverse effect on the value-added of students living in high-density neighbourhoods.
Note that most of these estimates are only significant at the 5% level and that the effect sizes are very small. Importantly, the first two findings coupled with the remaining evidence emerging from the Table suggest that neighbourhood mixing might decrease overall achievements: while high-KS1 students and non-SEN students marginally lose out from interacting with more SEN and FSM neighbour-peers, students who are eligible for free meals or have SEN status are not significantly and positively affected by neighbour-peers with higher average KS1 grades or lower shares of SEN and FSM students in the neighbourhood. Similarly, female students marginally lose out from being surrounded by a larger share of FSM-eligible neighbours, but FSM pupils do not benefit from having a smaller share of male neighbour-peers. Our results also suggest that neighbour-peer effects are more pronounced for students in urban areas (captured by high population density; Column 6b), although we find no evidence of this in relation to urban disadvantage as measured by overcrowding (Column 7b) or concentrated social housing (Column 8b). 19 A number of checks in relation to non-linearities and threshold effects similarly failed to yield significant effects or notable patterns. (The findings are available upon request.) Specifically, we added changes in the quadratic and cubic polynomials of the neighbourhood composition variables, or quadratic and cubic powers of the changes in our neighbour-peer variables, to our regressions. We also allowed positive and negative neighbourhood composition changes to cause asymmetric effects but found little evidence of such heterogeneity, with the exception of the effect of average KS1 grades of neighbour-peers: while positive changes do not have a significant effect, negative changes have a perverse, positive but quantitatively negligible (at 0.0001) impact on students' value-added, borderline significant at the 5% level.
Finally, we allowed large-negative, negative, positive and large-positive changes to have heterogeneous effects on students' test-score value added but still failed to find evidence of any significant non-linearity.
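The regressors used in these non-linearity checks can be sketched as follows. This is only an illustration of how such terms might be constructed, not the code behind the estimates: the function names, the `cutoff` separating 'large' from ordinary changes, and the convention that a zero change is grouped with positive changes are all assumptions introduced here.

```python
def split_change(delta):
    """Split a composition change into its positive and negative parts,
    so that each part can enter a regression with its own coefficient."""
    return (max(delta, 0.0), min(delta, 0.0))


def polynomial_terms(delta, degree=3):
    """Linear, quadratic and cubic powers of a composition change."""
    return [delta ** k for k in range(1, degree + 1)]


def change_bin(delta, cutoff):
    """Classify a change into one of four bins.

    `cutoff` is a hypothetical threshold for a 'large' change; a zero
    change is grouped with 'positive' by assumption.
    """
    if delta <= -cutoff:
        return "large-negative"
    if delta < 0:
        return "negative"
    if delta < cutoff:
        return "positive"
    return "large-positive"


# Example: a fall of 0.2 in the share of FSM neighbour-peers.
pos_part, neg_part = split_change(-0.2)
powers = polynomial_terms(2.0)
label = change_bin(-0.2, cutoff=0.1)
```

Entering the positive and negative parts as separate regressors allows their coefficients to differ, which is one simple way to test for asymmetric responses; the four bins can be entered as dummies interacted with the change.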
As a last exercise, we investigate whether there are any distinctive effects from the very highest and the very lowest-ability neighbours. In the context of English secondary schools, Lavy et al. (2012b) [...] the ability distribution and heterogeneous effects from very 'good' peers at the top of the ability distribution. To replicate this design, we investigate whether changes in the shares of top/bottom 10% neighbour-peers (in the national KS1 student distribution) affect students' KS2-to-KS3 value-added. Even in this case, we find nothing to suggest that changes in the neighbourhood composition affect students' educational attainment. This is so irrespective of whether we pool all students, or separately study the effect of very bright and very weak neighbours on boys/girls, FSM/non-FSM students and SEN/non-SEN pupils, and on students with different levels of KS1 attainment.
Figure 3 presents some related results, where we look at the effect of the interaction between changes in the KS1 achievements of neighbour-peers and students' own KS1 test scores. This graphical analysis is in the spirit of Hoxby and Weingarth's (2005) analysis for school peer effects. Specifically, the plots show the estimated standardised effects (and associated 95% confidence intervals) of changes in neighbour-peers' KS1 attainment, measured as either average KS1 (Panel (a)) or the percentages of neighbours with KS1 scores in the top decile (Panel (b)) and bottom decile (Panel (c)), against students' own KS1 deciles. The graphs are obtained from one single regression of pupils' KS2-to-KS3 value-added on all three indicators of neighbour-peer KS1 interacted with dummies for students' own KS1 deciles. The empirical specification is comparable to the one in Column (7) of Table 2, although we control for students' own KS1 decile instead of their own average KS1 test score.
Notes. The plots present standardised regression coefficients and 95% confidence intervals from regressions simultaneously including the three proxies for neighbourhood composition. More details are provided in Section 6.
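The interaction of a neighbour-peer change with dummies for students' own KS1 deciles, as described above, can be sketched as follows. The decile cutpoints and function names are hypothetical, and the treatment of a score that falls exactly on a boundary (assigned to the higher decile) is an assumption made for this illustration only.

```python
from bisect import bisect_right


def decile_of(score, cutpoints):
    """Return the decile (1 to 10) of a score.

    `cutpoints` holds nine ascending decile boundaries; a score equal
    to a boundary is placed in the higher decile (an assumption).
    """
    return bisect_right(cutpoints, score) + 1


def interacted_regressors(delta_peer_ks1, own_decile, n_groups=10):
    """One regressor per own-KS1 decile: the change in neighbour-peer KS1
    in the position of the student's own decile, and zero elsewhere."""
    return [
        delta_peer_ks1 if d == own_decile else 0.0
        for d in range(1, n_groups + 1)
    ]


# Hypothetical decile boundaries for KS1 point scores.
cutpoints = [9, 11, 12, 13, 14, 15, 16, 17, 19]
own_decile = decile_of(15.0, cutpoints)
row = interacted_regressors(0.3, own_decile)
```

Each of the ten regressors then carries its own coefficient, so the estimated response to a change in neighbour-peer KS1 can be plotted against the student's own decile, as in the panels of Figure 3.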
With some imagination, we can detect a weak upward trend in the response of KS2-to-KS3 value-added to an increase in the proportion of top-10% KS1 neighbour-peers, and a weak downward trend in the response to an increase in the proportion of bottom-10% KS1 neighbour-peers. These results imply some vague positive complementarities between high-achieving neighbour-peers and high-achieving students, and some weak negative interactions between low-achieving neighbour-peers and low-achieving students. Mean neighbour-peer KS1 scores (conditional on the percentages in the top and bottom deciles) show no strong patterns across the distribution of students' own KS1 test scores, although the effects of neighbours' average KS1 achievements are positive for students in the central deciles. However, note that we cannot reject the null of joint equality of the coefficients in any of these settings, or the null that the coefficients are jointly equal to zero.
In conclusion, this more detailed analysis does not reveal any patterns that were not evident in the linear-in-means estimates. Overall, there is little evidence of any significant neighbour-peer effects or of complementarities, which would justify mixing neighbourhoods as a policy to improve overall student achievements. Similarly, we find no sign of 'bad apple' neighbour-peer effects from the lowest achievers.

Neighbourhood Characteristics and Behavioural Outcomes: Evidence from the Longitudinal Study of Young People in England
To consider potentially more interesting effects of neighbour-peer composition on behaviour, we next use information collected in the LSYPE linked to the NPD-based neighbour-peer variables used so far. Given the time window covered by the LSYPE, we consider the effect of neighbourhood changes on outcomes between grades 9 and 11. As KS1 test scores are not available for the LSYPE cohort, we use KS2 scores of neighbouring students as a proxy for neighbour-peer achievements. Table 7 reports the results. Note that previous evidence in the literature has shown marked heterogeneity by gender, so we report estimates from separate regressions for boys and girls. 20 All models include the standard set of controls and secondary school-fixed effects. The construction of the behavioural outcome variables was discussed in subsection 3.4. Descriptive statistics for the LSYPE sample are provided in Appendix B Table B5, both for the behavioural variables and for the student and neighbour-peer characteristics. These figures suggest that, although the LSYPE sample is much smaller than the one previously considered, it is still representative of the student population and displays enough variation in the variables of interest.
[...] Table 2 Column (5) and secondary school-fixed effects. Sample includes one cohort of students interviewed in the LSYPE, aged 14 in 2004. Number of observations: approximately 3,700 for both male and female students, in about 500 schools and living in approximately 4,000 Output Areas. Peers are defined as students living in the same OA and of the same age. Regressions further consider only: (i) students who do not change OA of residence between grades 9 and 11; (ii) students in Output Areas with at least three students belonging to the same age group in grades 9 and 11; and (iii) students in the non-selective part of the education system. 'Attitudes towards schooling' is a composite variable obtained from three separate questions as follows: 'School is a worth going (Yes = [...] Table B5. Standard errors clustered at the OA level in parentheses. **1% significant or better; *at least 5% significant.
Columns (1) and (2) of Table 7 display the relation between neighbourhood changes and the composite variable 'Positive school attitude' for boys and girls, respectively. Starting from the top, we see that an improvement in KS2 achievements of neighbour-peers positively affects students' attitudes towards education and that this effect is significant and sizeable for boys: a one standard deviation change in the treatment corresponds to a change of 3.6% of a standard deviation in the dependent variable. Symmetrically, we find that a larger fraction of neighbours with learning difficulties and poor achievements (as captured by SEN status; see Panel C) negatively affects views about schooling, but this effect is more sizeable and significant for girls. In this case, a one standard deviation increase in the treatment would negatively affect female students' attitudes towards education by 6.4% of a standard deviation. On the other hand, neither the fraction of students in the neighbourhood who are eligible for FSM nor the share of males affects other students' views of education.
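Effect sizes stated as a percentage of a standard deviation follow from rescaling a raw regression coefficient by the standard deviations of the treatment and the outcome. The sketch below illustrates the arithmetic only; the raw coefficient and the data are made up so that the result equals 0.036, and none of these numbers is taken from the study.

```python
import statistics


def standardised_coefficient(raw_coef, x_values, y_values):
    """Rescale a raw coefficient so that it gives the change in the
    outcome, in standard deviations, for a one standard deviation
    change in the treatment."""
    return raw_coef * statistics.pstdev(x_values) / statistics.pstdev(y_values)


# Illustrative data: the outcome is twice as dispersed as the treatment.
treatment = [0.0, 1.0, 2.0, 3.0, 4.0]
outcome = [0.0, 2.0, 4.0, 6.0, 8.0]

beta_std = standardised_coefficient(0.072, treatment, outcome)
```

With these illustrative inputs, a raw coefficient of 0.072 corresponds to a standardised effect of 0.036, that is, 3.6% of a standard deviation of the outcome for a one standard deviation change in the treatment.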
The four central columns of the Table investigate the relation between neighbour-peer composition and students' absences from school ('Playing Truant'; see Columns (3) and (4)) and students' use of substances (this proxy includes smoking, drinking and using cannabis; see Columns (5) and (6)). None of the associations presented in the Table is significant at conventional levels, and often the signs are the opposite of what one would expect.
Finally, Columns (7) and (8) concentrate on the variable 'anti-social behaviour', which captures whether students got involved in graffiti, vandalism, shoplifting, fighting or a public disturbance. Our results show that, while neighbourhood composition in terms of KS2 achievements, share of males and proportion of SEN students does not significantly affect these behavioural outcomes, an interesting pattern emerges when looking at the proportion of neighbours from poor family backgrounds (FSM; see Panel B). A one standard deviation change in this treatment significantly increases male students' involvement in anti-social behaviour by 5% of a standard deviation, but this change would not affect young girls' behaviour.
20 Given the much smaller sample covered by the LSYPE, we are unable to split our results for FSM/non-FSM and SEN/non-SEN students convincingly. However, some exploratory analysis showed little heterogeneity in the effect of neighbourhood composition on behavioural outcomes along these dimensions.
To further explore these issues, we study whether the effects of neighbours' characteristics on boys' and girls' behavioural outcomes differ according to peers' gender. Our results (not tabulated) show that male peers' FSM eligibility has a larger effect than female peers' FSM status on male students' involvement in anti-social behaviour, although this difference is not statistically significant. These heterogeneous effects for boys and girls are not surprising. Kling et al. (2005, 2007) document similarly different effects for male and female youths re-assigned to better neighbourhoods by the MTO experiment. More broadly, a growing body of research shows that boys and girls respond differently to education-related interventions. Among others, Anderson (2008) finds that three well-known early-childhood interventions (namely, Abecedarian, Perry and the Early Training Project) had substantial short and long-term effects on girls but no effect on boys, while Lavy and Schlosser (2011) and Lavy et al. (2012b) find that peer quality in secondary schools affects boys and girls differently. Finally, Angrist and Lavy (2009) and [...] show a consistent pattern of stronger female response to financial incentives in education in a variety of settings.
In conclusion, and considering both the small number of students sampled by the LSYPE and the fact that we can only look at outcomes between grades 9 and 11, the results in Table 7 provide some support for the notion that the neighbour-peers can affect teenagers' behaviour. It is worth noting that in comparable specifications (i.e. Table 2, Column (7)), we found no effects on cognitive outcomes. This suggests that our evidence of significant effects on behavioural outcomes is not due to a less robust empirical specification of our models when using the LSYPE data. Nevertheless, all in all our evidence suggests that neighbour-peer effects are not a strong and pervasive determinant of students' outcomes on either the cognitive or the non-cognitive dimension.

Concluding Remarks
Our study has looked at the effect of the characteristics and prior achievements of neighbourhood peers on the educational achievements and behavioural outcomes of secondary school students in England. In our main administrative data set, we track four cohorts of over 1.3 million students through the first 3 years of their secondary schooling. The unique features of our population data set (i.e. coverage and density) have allowed us to make a number of important empirical contributions, besides presenting novel evidence on the effect of peers in the neighbourhood. First, we have drilled down to the effect of neighbourhood changes that are caused by movements of families in and out of small neighbourhoods. We have tracked these changes through information on the detailed residential addresses of our census of students. This is a new strategy to address the sorting problem in neighbourhood research. Second, exploiting the fact that we observe several cohorts of students experiencing changes in the composition of their neighbourhoods at the same time as they move through the education system, we have been able to partial out student and family-background unobservables, neighbourhood-fixed effects and time trends as well as school-by-cohort unobserved shocks. These methods get us close to pinning down an unbiased neighbourhood effect estimate stemming from changes in the mix of people in the residential neighbourhood (i.e. a 'contextual effect'; Manski, 1993) as originally advocated by Moffitt (2001). Third, by exploiting the detail and density of our data, we have been able to change our definitions of neighbourhoods and peers in the place of residence and thus address the inherent problem in the literature of pinning down the correct definition of what constitutes a neighbourhood.
The English institutional setting, where secondary school attendance is not tightly linked to place of residence, further allowed us to distinguish between neighbours who attend the same or a different school and to test for potential interactions between school and neighbourhood peer effects. Finally, the recent literature has focused on estimating linear effects for homogeneous and narrowly defined groups to aid identification. In contrast, our strategy and data set have provided us with a unique opportunity to investigate heterogeneity and non-linearities in these responses at an unprecedented level of detail.
In summary, our findings show that although there is a substantial cross-sectional correlation between students' test scores and the characteristics of their neighbours, there is no evidence that this association is causal. The effect of changes in peers in the neighbourhood on students' test-score gains between grades 6 (age 11) and 9 (age 14) is nil. Exploiting the density of our data, we have extended our empirical models to go beyond simple linear-in-means specifications, and studied non-linearities, complementarities and threshold effects. Even then, we failed to find evidence of significant neighbour-peer effects on students' achievements. From a policy perspective, the implication is that, on the educational dimension at least, programmes to promote socio-economic mixing in communities through residential relocation are unlikely to be effective. Student achievements and qualifications are evidently unaffected by changes in their neighbourhood composition induced by residential turnover, even when we look at changes occurring over a long 5-year interval that spans the whole of compulsory secondary schooling. In contrast, we uncover some evidence that non-cognitive and behavioural outcomes, such as attitudes towards school and anti-social behaviour, are affected by changes in neighbourhood composition, and that these effects are heterogeneous along the gender dimension. This suggests that future research on the effects of social interactions in neighbourhoods should focus on outcomes other than teenage educational attainments.
[...] 0.509 0.176
Share male, change grades 6-9 0.000 0.128
Number of students in Output Area, 'central cohort' +1/-1, grade 6 13.212 6.562
Number of students in Output Area, 'central cohort' +1/-1, grade 9 12.884 6.628
Notes. Descriptive statistics refer to students in the non-selective part of the education system. The data include students who change OA of residence between grades 6 and 9; students in Output Areas with fewer than five students belonging to the 'central cohort' +1/-1 in every period between grades 6 and 9.
Peers are defined as students living in the same OA and of the same age. Regressions further consider only: (i) students who do not change OA of residence between grades 6 and 11; (ii) students in Output Areas with at least three students belonging to the same age group in grades 6 and 11 (Columns (1)-(3)) and grades 9 and 11 (Columns (4)-(6)); and (iii) students in the non-selective part of the education system. Number of observations approximately 500,000 in approximately 102,000 Output Areas. All regressions include controls as in Table 3, Column (2) and following columns. Secondary school-fixed effects: approximately 3,100 groups (refer to school at grade 7 when student enters secondary education). Standard errors clustered at the OA level in parentheses. **1% significant or better; *at least 5% significant.
Notes. Descriptive statistics refer to the sample that includes one cohort of students interviewed in the LSYPE, aged 14 in 2004. Peers are defined as students living in the same OA and of the same age. The sample only includes: (i) students who do not change OA of residence between grades 9 and 11; (ii) students in Output Areas with at least three students belonging to the same age group in grades 9 and 11; and (iii) students in the non-selective part of the education system. 'Attitudes towards schooling' is a composite variable obtained from three separate questions as follows: 'School is a worth going (Yes