The Introduction of Academy Schools to England's Education

We study the origins of what has become one of the most radical and encompassing programmes of school reform seen in the recent past amongst advanced countries -- the introduction of academy schools to English secondary education. Academies are state schools that are allowed to run in an autonomous manner which is free from local authority control. Almost all academies are conversions from already existent state schools and so are school takeovers that enable more autonomy. Our analysis shows that this first round of academy conversions that took place in the 2000s generated significant improvements in the quality of pupil intake and in pupil performance. There is evidence of heterogeneity as improvements only occur for schools experiencing the largest increase in their school autonomy relative to their predecessor state. Analysis of mechanisms points to changes in headteachers and management structure as key factors underpinning these improvements in pupil outcomes.


School reforms that have taken place in many countries in the recent past, notably free schools in Sweden and charter schools in the US, have proven to be an important dimension of the changing education landscape. Change has occurred in the context of some reforming nations being innovative in their attempts to get closer to what they perceive to be the optimal school type. At the same time in other countries education policies have been pursued with little deviation from the orthodox model of the traditional local or community school. 4
The genesis of the English academies programme is what we study in this paper. The academy school model was initiated under the 1997-2010 Labour government when strong concerns were being expressed that schools in particular local authorities (usually serving urban inner-city disadvantaged neighbourhoods) were not delivering a good enough education to the children attending them. A widespread recognition emerged that something needed to be done, both to try to improve educational standards, and to confront significant behavioural problems, in these schools where it had been said that 'teachers had lost control of the corridors'. The proposed solution was to replace an existing school with a new type of state school to be run outside of local authority control and managed by a private team of independent co-sponsors. The sponsors of the new academy school delegate the management of the school to a largely self-appointed board of governors who have responsibility for employing all academy staff, agreeing levels of pay and conditions of service and deciding on the policies for staffing structure, career development, discipline and performance management.
1 They are different from most US charter schools which are typically, though not always, set up from scratch. A closer comparison in England to the typical charter school is the free schools, which are a recent addition to the education landscape and are new schools (often set up by parent or community groups). The closer US comparison to academies is 'in-district' charters where an already existent public school is converted to a charter as a school takeover; these are less commonplace than US charters as a whole, but there are places where conversions of public schools to charters have taken place (like Boston and New Orleans; see Abdulkadiroglu et al., 2014).
2 In England, secondary schooling takes place from ages 11-16 and primary schooling from ages 5-11.
3 Prior to the Act only secondary schools could become academies and to convert they were required to sign up a sponsor. Afterwards, primary schools were permitted to become academies, free schools were introduced and a sponsor was no longer required for conversion to take place. See Eyles, Hupkau and Machin (2015) for more details.
We study the causal impact of academy school conversion on pupil intake and pupil performance. This line of enquiry is aimed at working out how the Labour academy programme functioned and impacted on pupils affected by the policy. To do so we consider data on pupils in schools over the school years 2000/01 to 2008/09 since this facilitates a before/after analysis of the impact of academy conversion. 5 Of course, as the discussion has already made clear, it was pupils in disadvantaged schools that participated in academy conversion and so we need to define a credible control group of pupils attending schools that did not become academies in the sample period. We do so by comparing outcomes of interest for children enrolled in academy schools to pupils enrolled in a specific group of comparison schools, namely those state schools that go on to become academies after our sample period ends. We discuss the rationale for the empirical credibility of this and our methods in using this research design (together with threats to convincingly achieving identification) in more depth below. It turns out that this approach produces a well-balanced treatment and control group that differences out key observable and unobservable factors linked to conversion to academy status.
Because pupil composition may change before and after conversion to an academy, robust study of the causal impact of academy conversion on pupil performance needs to utilise an empirical strategy that is not contaminated by such change. The approach taken in this paper is to study performance effects for pupils who were already enrolled in the school prior to conversion and are then affected by academy conversion in subsequent years of their secondary schooling. Since the initial enrolment decision was made for the pre-conversion school, academy conversion should be exogenous to these students, and the analysis can be set up as an intention to treat empirical exercise, from which we can obtain a causal estimate of a local average treatment effect. In this setting, the intention to treat group is all pupils enrolled in the predecessor school who, pre-conversion, are in line to take their year 11 KS4 exams in the school, irrespective of whether they actually do so. The approach has similarities to that taken in Abdulkadiroglu et al. (2014), who study school takeovers in New Orleans, referring to pupils who stay in a converting school as 'grand-fathered' pupils.
Whilst we study a school transformation programme that is different in a number of dimensions to those that have been implemented elsewhere in the world, our work fits well with two strands of economics of education research. The first is a growing literature that presents empirical estimates of the impact of school types on pupil achievement. For example, US work on charter schools tends to find achievement gains associated with charter status, and with the 'injection' of charter school features to public schools. 6 In the UK, a small body of work has identified the impact of specific school types on educational and labour market outcomes. 7 The second is a bigger and by now fairly long established literature on school types in the US and elsewhere. These include many studies on Catholic schools, on voucher-subsidised private schools and many analyses of the impact of school types using international test score data. 8
In the next section, we discuss the structure of the secondary schooling system in England and document the rise of academies in the period we study. We also present a brief summary of related studies. Section 3 describes the data and the research designs we implement. Section 4 presents the main results on the effects of academy conversion on pupil intake and performance. We also report a number of robustness tests of our key findings. Section 5 homes in on mechanisms through studying the use of academy freedoms that underpin the reported results. We then offer conclusions to the paper in section 6.
6 This literature is not without its own controversy. Recent, typically small scale, experimental evaluations of charters in or near particular US cities (Boston and New York) find positive impacts on educational achievement (see Abdulkadiroglu et al. 2011, 2014; Angrist et al. 2013; Dobbie and Fryer 2011; Hoxby and Murarka 2009). Wider coverage non-experimental evaluations produce more mixed results (Center for Research on Education Outcomes, 2009). On the injection of charter school features to public schools in Houston, and their beneficial effects, see Fryer (2014).
7 See, for example, the Clark (2009) paper on schools becoming devolved from local authority control in the late 1980s and early 1990s or the work on private schools by Green et al. (2012).
8 See, for example, Altonji, Elder and Taber (2005), Neal (1997) or Evans and Schwab (1995) for analysis of US Catholic schools or Hsieh and Urquiola (2006) for an analysis of the private school voucher programme in Chile. For evidence on school effects using international test score data see OECD (2011) and Woessmann (2011, 2015).

2. Academy Schools
Academies were first introduced to English education in the early 2000s. In hindsight, their introduction can be viewed as a key development in the history of education in England. 9 This is firstly because changes in school type like those that have taken place for academies, and the scale of the academies programme, are rarely seen in education systems across the world.
Secondly, the academies programme has been promoted and pursued with almost evangelical fervour by advocates, and run down with an equal lack of enthusiasm and stark criticism by detractors. Lord Adonis, the key player in government in setting up the Labour academies programme, describes this eloquently in his book (Adonis, 2012). Together with the more sceptical lines from those who oppose academies, 10 it makes the controversial nature of the debate clear.
The first clutch of academies opened in the school year beginning in September 2002.
Academies are independent, non-selective, state-funded schools that fall outside the control of local authorities. In most cases, they are conversions of already existing predecessor schools.
Academies are managed by a private team of independent co-sponsors. The sponsors of the academy school delegate the management of the school to a largely self-appointed board of governors with responsibility for employing all academy staff, agreeing levels of pay and conditions of service and deciding on the policies for staffing structure, career development, discipline and performance management.

Secondary School Types in England and Academy Introductions
There are seven different school types that make up the English secondary education system.
In the time period we study prior to the Academies Act of 2010 (which altered the definition of academy status in some important ways), the main impetus of the programme was to replace failing schools with academies, with the aim of generating school improvement by moving away from the conventional school type that had populated the English secondary sector in the past. 11
The path to establish an academy school in a local authority involved a number of steps. The key feature was the need to sign up a sponsor, who worked with the local authority (LA) where the school operates, and to complete a formal expression of interest (this made the case that an academy in the proposed area was both needed and feasible). This phase is completed when the LA and sponsor send the expression of interest to the Secretary of State for Education for his or her ministerial approval. After approval the process moves on to the feasibility stage and beyond that to actual conversion of the already existing school to an academy.
In Table 3, we look in more detail at which types of school converted to academy status.
The upper panel of the  and is targeted at low income and minority students.
The smaller number of studies of conversions of already existing schools to charters (as in the study of school takeovers in Boston and New Orleans by Abdulkadiroglu et al., 2014), or the introduction of practices used in charters to US public schools (as in Houston schools studied by Fryer, 2014) are of more direct relevance to the English case of academies. Some of these report substantial improvements in test scores due to the use of methods of 'best practice'.
In our analysis, we look at mechanisms in the case of academies and, as we will show, some of these overlap with the successful features of innovation in these in-district charters and school conversions/takeovers.
Many of the US experimental and quasi-experimental studies are relatively small scale in that their treatment group is often a small sample of schools (or even a single school in the case of Angrist et al., 2010). Interestingly, they do find positive effects for lotteried-in pupils. Abdulkadiroglu et al. (2011) find that the lotteried-in pupils experience significant improvements in their English language arts (ELA) scores and math scores at both middle and high schools, with effects being larger for the latter. Hoxby and Murarka (2009) also find that lotteried-in pupils experience significant improvements in both their maths scores and reading scores between the third and eighth grade compared to the lotteried-out pupils who remain in traditional public schools. Angrist et al. (2010) find that lotteried-in students who attend KIPP Academy Lynn, a school that serves students in grades five through to eight, experience significant improvements in their maths scores and ELA scores. In a separate study, Dobbie and Fryer (2011) look at schools in Harlem in New York, with results being broadly similar to those of Angrist et al. (2010).
An issue with the experimental studies is that lotteries only occur in the schools that are oversubscribed. Given that successful schools are more likely to be oversubscribed, estimates that exploit the lottery process are likely to be upper bounds. As an alternative, some studies adopt non-experimental methods to appraise the charter school model. They tend to produce more mixed results. For example, the CREDO (2009) study uses propensity score matching methods, finding charter school performance to be no better (or worse) than neighbouring traditional public schools.
One problem with non-experimental methods is how well they deal with selection bias compared to the lottery-based estimates. An informative study that addresses this issue is by Hoxby and Murarka (2007). They estimate treatment effects for charter schools using both non-experimental and lottery-based methods, reporting positive urban charter school effects in both cases. However, Dobbie and Fryer (2013) report that observational estimates from New York schools give lower effect sizes than lottery estimates from the same sample of schools, suggesting that the use of matching and regression alone may lead to downward bias.
On academies themselves, there remains very little rigorous research work. There are early studies of small numbers of converters by Machin and Wilson (2008)
In our analysis reported on below we separately study intake and performance for this very reason, because we observe changes in the ability composition of pupils, in terms of their prior academic achievement, entering schools after they become academies. Thus we implement a research design studying performance effects only for children who were enrolled in the converting schools before they became academies (in the terminology of Abdulkadiroglu et al. (2014), who study school takeovers in New Orleans, these are 'grand-fathered' pupils). Since the initial enrolment decision was made for the pre-conversion school, academy conversion should be exogenous to these students, and therefore the study of pupil performance effects can be set up as an intention to treat empirical exercise, from which we can obtain a causal estimate of a local average treatment effect.
January collection because this collection is the most available and consistent over time. 16 In England, compulsory education is organised around four key stages for years of schooling from ages 5 to 16. These are key stage 1 (in years 1 and 2) and key stage 2 (years 3 to 6) in primary school; and key stage 3 (years 7 to 9) and key stage 4 (years 10 and 11) in secondary school. In studying academy conversion impacts, our two outcomes of interest are pupil intake and pupil performance. To study intake for pupils enrolling in secondary school in year 7, the first year of secondary school, we look at the key stage 2 (KS2) test scores that pupils take at the end of primary school (aged 10/11 at the end of year 6) before they make the transition to secondary school. To study performance in year 11, the final year of compulsory secondary schooling, we look at the key stage 4 (KS4) examinations that pupils take at the end of compulsory schooling (aged 15/16 at the end of year 11). These school leaving exams are known as GCSEs (standing for the General Certificate of Secondary Education).
The impact of academy conversion needs to be analysed at the pupil-level. This is because the underlying composition of students attending schools may change over time (as we show, pupil intake does change post-conversion). To study intake, we match each pupil entering year 7 of a secondary school over the 2001/02 to 2008/09 academic years to their KS2 results over the 2000/01 to 2007/08 academic years. It is important to note that we allow for this intake change when identifying the causal effect of academy attendance on KS4 performance by focusing on pupils already enrolled in an academy pre-conversion thus avoiding endogeneity of the post-conversion enrolment decision.
One further practical issue concerns the definition of schools that convert to academies.
There are a small number of examples where more than one predecessor school combines to create one academy school. Where this occurs, we create one hypothetical pre-academy school (see the discussion in the Data Appendix). This adopts hypothetical characteristics that are a weighted-average of the characteristics of the merged schools.
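The construction of a hypothetical pre-academy school can be illustrated with a minimal sketch. The weights are an assumption here: the example weights each predecessor by its pupil numbers, whereas the actual weighting is set out in the Data Appendix. The function name and the characteristic names (`fsm_share`, `ks2_mean`) are hypothetical.

```python
def merge_predecessors(schools, weight_key="pupils"):
    """Combine predecessor schools into one hypothetical pre-academy school.

    Each school is a dict of numeric characteristics plus a weight
    (assumed here to be pupil numbers). Characteristics are averaged
    using those weights; the weights themselves are summed.
    """
    total = sum(s[weight_key] for s in schools)
    keys = [k for k in schools[0] if k != weight_key]
    merged = {
        k: sum(s[k] * s[weight_key] for s in schools) / total
        for k in keys
    }
    merged[weight_key] = total
    return merged


# Two illustrative (made-up) predecessor schools merging into one academy.
example = merge_predecessors([
    {"pupils": 600, "fsm_share": 0.40, "ks2_mean": 25.0},
    {"pupils": 400, "fsm_share": 0.30, "ks2_mean": 27.0},
])
```

Weighting by pupil numbers makes the hypothetical school's averages match what would be observed if the pupils of the merging schools were pooled.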

Modelling Approach
A conversion event c is defined as occurring in the school year t that the academy school starts operating (i.e. event year E(t = c) is when it 'opens for business' and admits new pupils as an academy). We then use the academic year that the academy status is awarded (and the years after) as the base that we need to calculate the quasi-experimental before/after conversion effect on the pupil-level outcomes of interest.
We have two outcomes of interest. The first is to study the impact of academy school conversion on the quality of pupil intake, which we measure in terms of ability composition by the end of primary school standardised KS2 average points score 17 of pupils who enrol into year 7, the first year of secondary school. The second outcome, and the main outcome of interest in the paper, is the KS4 performance of pupils, measured as the standardised best 8 exams points score of individual year 11 students. We also consider robustness of the findings to different measures (the precise measures used for KS2 and KS4 are described in detail in the Data Appendix, together with additional performance results for a range of different KS4 measures).

Research Design -Quality of Pupil Intake
We begin by comparing what happens to pupil intake (measured by KS2 test scores of year 7 enrollers) before and after conversion for pupils attending schools that do and do not convert in the sample period. In the following equation for pupil i enrolled in year 7 in school s in year t, the key parameter of interest is the differences-in-differences coefficient δ:
In (1) A is a dummy variable equal to 1 if the secondary school attended in the entry year of secondary school is in the treatment group (i.e. will become or is an academy in the sample period) and equals 0 if the school is in the comparison group (schools that do not convert to an academy in the sample period, but convert after the sample period ends). Defining E as an event year, the dummy variable indicator I(E ≥ t = c) takes a value 1 if the pupil enrols in conversion year c or after and X denotes a set of control variables. Finally, α_s denotes school fixed effects, α_t denotes year effects and u_1 is an error term.
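Written out from the definitions just given, a differences-in-differences specification of this kind would take a form such as the following sketch. The outcome label KS2 and the coefficient vector β on the controls are notation assumed here for illustration; the main effect of A is absorbed by the school fixed effects.

```latex
KS2_{ist} = \alpha_s + \alpha_t
          + \delta \, A_{ist} \, I(E \ge t = c)
          + X_{ist}'\beta + u_{1ist}
```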
The specification in (1) imposes an average post-conversion effect across all post-conversion years. A more flexible specification estimates separate treatment effects for pre- and post-conversion years, in an event study setting, as:
We report event study estimates of four pre-conversion δ's (from E = c-4 to c-1) and four conversion year and post-conversion δ's (from E = c to c+3).
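An event study specification consistent with this description can be sketched as below, replacing the single post-conversion indicator with one indicator per event year. The index e, the coefficient vector β and the error label u_2 are assumed notation, and the reference period is taken to be the years before c-4.

```latex
KS2_{ist} = \alpha_s + \alpha_t
          + \sum_{e = c-4}^{c+3} \delta_e \, A_{ist} \, I(E = e)
          + X_{ist}'\beta + u_{2ist}
```

Under this form, the δ_e for e < c test for differential pre-conversion trends, while those for e ≥ c trace out how the intake effect evolves after conversion.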
We also allow for heterogeneous effects by recognising that academies with different forms of predecessor school gain different amounts of autonomy when they convert. We consider differences by 'autonomy distance' by allowing effects to vary with the type of predecessor school. To do so, we estimate separate versions of (2) for academy conversions from community schools and conversions from non-community schools. 18 The presumption underpinning this is that the autonomy distance is largest for conversions that take place from predecessor community schools (see the earlier discussion around Table 3).

Research Design -Pupil Performance
To study pupil performance effects we look at the Key Stage 4 (KS4) performance of year 11 students. There are important identification issues that need to be considered here that did not apply to the KS2 intake part of our study. Specifically, there are three important dimensions of our empirical strategy that enable us to identify a causal effect of academy conversion on pupil performance:
i) We consider children whose parents made their decision to enrol their children in the school before it converted to an academy. This ensures that academy conversion was exogenous to enrolment in secondary school.
Table A1 of the Data Appendix shows the structure of this treatment in more detail.
iii) Since the initial enrolment decision was made for the pre-conversion school, academy conversion should be exogenous to these students, and the analysis can be set up as an intention to treat (ITT) empirical exercise, from which we can obtain a causal estimate of a local average treatment effect (LATE). The ITT group is all pupils enrolled in the predecessor school who pre-conversion are lined up to take their year 11 KS4 exams in the school (i.e. year 7 students enrolled 4 years prior to conversion, year 8 students enrolled 3 years prior etc). The approach is similar to that taken in Abdulkadiroglu et al. (2014), who study school takeovers in New Orleans, referring to pupils who stay in a converting school as 'grand-fathered' pupils.
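The definition of the ITT group can be made concrete with a small sketch. It assumes school years are coded as integers (for example 2003 for 2002/03) and that a pupil progresses one year group per school year; the function names are hypothetical and the rule applies only to pupils in schools that convert.

```python
def expected_ks4_year(year_group, school_year):
    """School year in which a pupil observed in `year_group` during
    `school_year` is due to sit KS4 exams (taken in year 11)."""
    return school_year + (11 - year_group)


def is_itt(year_group, school_year, conversion_year):
    """True if the pupil was enrolled in the predecessor school before
    conversion and is due to sit KS4 in the conversion year or later."""
    enrolled_pre_conversion = school_year < conversion_year
    due_after_conversion = (
        expected_ks4_year(year_group, school_year) >= conversion_year
    )
    return enrolled_pre_conversion and due_after_conversion
```

With a conversion year c, a year 7 pupil enrolled in c-4 and a year 8 pupil enrolled in c-3 are both due to sit KS4 in year c, so both are flagged, whereas a pupil who first enrols in year c itself is not.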
As we are interested in the causal impact of academy conversion on KS4 results we can first operationalise our empirical analysis by means of the following value added equation:
In (3), the estimate of the θ_1 coefficient is analogous to the KS2 difference-in-difference set up above, but because we now restrict to pupils enrolled in the pre-conversion school there is a subtle difference. This is that not all pupils who end up taking their KS4 exam at a school that becomes an academy (A_ist = 1) were enrolled in the school pre-conversion. Conversely, not all students initially enrolled in a school that converted to an academy (ITT_ist = 1) remain in the school to take their KS4 exams. Thus, ordinary least squares estimates of θ_1 from (3) will not reflect a causal estimate.
Defining the variable indicating treatment by an academy conversion as Z_ist, we account for selection into and out of treatment by using intention to treat status (ITT_ist) as an instrument for Z_ist, to estimate a LATE as follows:
In the first stage in (4) the estimates of θ_2 show the proportion of the ITT group that stay in the academy and take KS4 exams there. These are the 'grandfathered' pupils that remain in the school. Equation (5) is the reduced form regression of KS4 results on the instrument. The instrumental variable (IV) estimate is the ratio of the reduced form coefficient to the first stage coefficient, θ_3/θ_2.
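The ratio form of the IV estimate can be illustrated on simulated data. This is a minimal sketch under simplifying assumptions, not the estimation used in the paper: there are no covariates or fixed effects, the instrument is binary, no pupil outside the ITT group is treated, and the stay rate (0.93) and true effect (0.08) are made-up illustrative values. The function name `wald_iv` is hypothetical.

```python
import numpy as np


def wald_iv(y, z, itt):
    """IV estimate as reduced form / first stage for a binary instrument.

    first_stage : difference in treatment rates by ITT status
    reduced_form: difference in mean outcomes by ITT status
    """
    y, z, itt = map(np.asarray, (y, z, itt))
    first_stage = z[itt == 1].mean() - z[itt == 0].mean()
    reduced_form = y[itt == 1].mean() - y[itt == 0].mean()
    return reduced_form / first_stage, first_stage, reduced_form


rng = np.random.default_rng(0)
n = 200_000
itt = rng.integers(0, 2, n)           # 1 if pupil is in the ITT group
stay = rng.random(n) < 0.93           # ITT pupils who sit KS4 in the school
z = itt * stay                        # treated only if ITT and stays
y = 0.08 * z + rng.normal(0, 1, n)    # standardised outcome, true effect 0.08

late, first_stage, reduced_form = wald_iv(y, z, itt)
```

Here the first stage recovers the share of ITT pupils who stay, and dividing the reduced form by it scales the ITT effect up to an effect on those actually treated.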
Extending this IV setting to the event study framework we are able to obtain separate estimates for the four years from conversion onwards (E = c to c+3) using four instruments for whether a pupil is ITT for event year c, event year c+1 and so on. 19 Further extending to estimate separately for community and non-community predecessor schools enables us to consider the impacts of autonomy distance associated with the conversion.

Comparison Schools
In Table 4, we compare average pre-treatment characteristics of academy schools and other types of maintained English secondary schools. It confirms that academies have significantly different characteristics from the other school types. This is true of pupil characteristics (like the proportion eligible for free school meals, the proportion white and the proportion with special educational needs) and of pupil performance (like the headline school leaving age measure of the proportion getting 5 or more A*-C GCSEs and equivalents and the Key Stage 2 primary school points score).
This is not at all surprising. The whole point of Labour's academy programme was to improve poorly performing schools. Thus, a naive comparison between academy schools and all other state-maintained schools is likely to suffer from significant selection bias. A related problem is that schools that go on to become academies may have common unobservable characteristics (e.g. they have a type of school ethos that is more in line with the academy model). Finally there is scope for mean reversion, as academies were badly performing schools in their predecessor state.
Looking in more detail within the group of academies it does, however, turn out that the schools that convert to academy status between 2002/03 and 2008/09 have very similar pre-treatment characteristics to the schools that later become academies. A set of balancing tests is given in the final row of the Table. One cannot reject the null hypothesis that the 106 academies that convert in the sample period and the 114 future academies have the same sets of characteristics. This partially legitimises our use of pupils attending future converters as a control group in the D-i-D setting. It is further legitimised in the empirical findings we describe below where there are no differential pre-conversion trends in the same school years, thus allaying any concerns of mean reversion.
Thus the data structure we use is a balanced panel of schools for the school years 2000/01 to 2008/09 with repeated cross-sections of enrolled year 7 (for intake) and year 11 (for performance) pupils. Time variation in the academy conversion programme means that we can set these up in the event study framework detailed above. Table A2 of the Data Appendix shows the sample sizes for the different cohorts of academy schools in the KS2 and KS4 analyses that we undertake.

Academies and Pupil Intake
In Table 5, we report results showing the effects of academy school conversion on the quality of pupil intake. The Table reports estimates from five different empirical specifications. We begin with the raw differences-in-differences estimate in column (1). We add time-varying controls in column (2). In column (3), we estimate heterogeneous effects in the event study setting, and in columns (4) and (5) we look at event study estimates for pupils in community and non-community predecessor schools respectively.
The estimated coefficients in the Table show that academies, post-conversion, attract pupils with significantly higher KS2 test scores than those schools that convert after our sample ends. Column (1) shows that, on average, pupils enrolling in an academy at year 7 have a KS2 mean points score that is 0.074 of a standard deviation (σ) higher than those attending schools yet to attain academy status. The average intake effect falls to 0.058σ with the addition of the controls in column (2).
The event study estimates in column (3) show there to be no pre-conversion differences in trends between pupils in the treatment and control schools. They show a conversion year impact (E = c) of 0.010σ. This gradually rises year on year post conversion, becoming strongly significant in statistical terms, before reaching 0.082σ by event year c+3. These results suggest that (on average) there was a change in the pupil intake of schools when they converted to academy status. On conversion, academies began admitting higher ability pupils.
As shown in Figure 1, this positive impact grows over time, suggesting important compositional changes in the academies' student body over time. Interestingly, the positive intake effects are only present for academies that convert from community predecessor schools (as shown in column (4)) where the (bigger) conversion year impact of 0.056σ is significant and rises to 0.200σ by E = c+3. Figure 2 plots the event study estimates by predecessor type and the clear difference is evident.

Academies and Pupil Performance
Columns (4) to (6) show estimates from value added specifications that net out end of primary school KS2 pupil performance and include controls while columns (7) to (9) extend the (4) to (6) specifications to the event study setting.
The first point to note is that the estimates are broadly similar regardless of estimation method. The columns (1) to (3) specifications show that being in an academy school increases pupils' KS4 standardised test scores by a statistically significant 0.082σ to 0.095σ. Adding the prior achievement measure (KS2) and control variables in columns (4) to (6) reduces this a little to 0.073σ to 0.080σ, which remains significant. Thus pupil achievement is significantly higher on average, and so is value added for pupils attending schools that converted to an academy.
The interpretation of the ITT estimate in column (5) of a significant 0.073σ improvement is that KS4 went up by 0.073σ more for children enrolled in a pre-conversion school as compared to children enrolled in control schools in the same school years. The IV estimate in column (6) corrects for the fact that not all ITT children sat their KS4 examinations in the school (in fact 93.2 percent did, as the highly significant first stage at the bottom of the Table shows) and the estimate rises to 0.079σ. This is the preferred baseline average impact estimate of academy conversion.
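As a rough check on how these numbers fit together, the ratio of the reduced form to the first stage can be computed from the rounded figures quoted above. This is back-of-envelope arithmetic, not a reproduction of the Table; the small gap to the reported 0.079σ is of a size that rounding of the published coefficients could account for.

```latex
\hat{\theta}_{IV} = \frac{\hat{\theta}_3}{\hat{\theta}_2}
                  \approx \frac{0.073}{0.932}
                  \approx 0.078
```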
Columns (7)-(9) of Table 6 show the event study D-i-D estimates. These show no discernible pre-treatment trends, but a significant positive, and rising over time, impact after conversion. In the IV estimates of column (9), conversion year test scores are 0.037σ higher (though statistically insignificant), and this rises to (a statistically significant) 0.184σ four years post-conversion. Figure 3 very clearly shows the significant upturn after treatment and the lack of pre-conversion differences. It also makes it clear that academy conversion raised pupil performance, according to the causal IV estimates.
In Table 7 and Figure 4 we show separate IV estimates for pupils attending academies that converted from community and non-community schools respectively. Significant, and sizable, effects are seen for the former, whilst there is no improvement for the latter. These results reveal an important finding in terms of the overall interpretation of our results. They suggest that pupils attending schools experiencing the largest increase in autonomy via conversion, those from predecessor community schools, were the only ones to experience performance improvements. The estimated effects are large: for example, the conversion year effect in the IV estimates is 0.097σ, and this reaches 0.388σ by c+3.
Conversions from community schools enabled a gain of responsibility for the majority of the curriculum of the school (except the core subjects: English, Maths, Science and IT); the structure and length of the school day; the school budget and all staffing decisions. In the next section of the paper we look at which of these underlying mechanisms may have been behind the observed performance improvements. Prior to that, however, we consider some empirical extensions and study the robustness of the key findings.

Extensions and Robustness
Recall that the treatment effect we are estimating is time-varying because academy conversions occur in different school years, 2002/03 through 2008/09. Thus one extension we have considered is to estimate the most detailed KS4 models separately by cohort. Figure 5 plots IV estimates from the models separately by cohort. 20 It is very clear that a null hypothesis of the same average effects across cohorts is not rejected by the data. The gradually rising positive performance effects are seen across the four cohorts of conversions shown in the Figure.

The event study estimates uncover a significant improvement in performance that grows with more years post-conversion. This is strongly connected to, though not quite the same as, the years of exposure to academy treatment that children receive. The reason is that a small number of pupils do not sit their KS4 exams in an academy school and are not intention to treat, but are nevertheless exposed to treatment (i.e. they may enter post-conversion and leave prior to examinations). We have therefore reformulated the estimated models in terms of years of exposure to being taught in an academy. This involves defining the ITT variable and the treatment variable as years of exposure. Table 8 shows the results both for continuous and for dummy variable ITT and treatment years of exposure variables. It is evident that more years of exposure produce a bigger impact on pupil performance, and one that is of sizable magnitude for four years of exposure, at 0.323σ in the academy conversions from predecessor community schools.
Next we consider a falsification test. This is a test of whether the estimated θ coefficients reflect pre-existing differences in the outcomes of interest for our treatment group compared to our control group. To do the falsification exercise, we altered the year in which each cohort of academy schools became an academy to that of an earlier time period. We then re-estimated our models, calculating the θ coefficients based on a 'fake' year (four years before) in which we pretended schools converted to academies. If the θ coefficients in this falsification exercise give similar results to those of our original specification, then we would worry that the results of our original specifications reflect pre-existing differences in the outcomes of interest. To avoid any contamination when pupils attend schools that actually have converted, as opposed to attending during the 'fake' conversion, it is necessary for there to be no overlap, at the school level, between fake post-academy years and actual post-academy years. This means that we have to shorten the post-treatment fake periods for the first three academy cohorts. Thus the sample size drops. We also lose 4 schools that do not have GCSE sittings for some of the earlier 1997/98-2000/01 period. 21 The falsification exercise was conducted over the seven year period between the 1997/98 and 2004/05 academic years. Column (1) of Table 8 shows the results for all conversions, and column (2) just for conversions from community schools. In both cases the estimated θ coefficients for the academy conversion are always close to zero and statistically insignificant.
This fake policy experiment does seem to rule out that our results are driven by pre-existing unobservables. However, as already noted, it was carried out on a slightly different sample and so in columns (3) and (4) of Tables 7, we report the original specifications for the same sample of schools. They are very similar to the main KS4 results of the paper. 22 The same is true when the value added specification adding in KS2 (which we are unable to do for the fake policy) is considered in columns (5) and (6).
We have also looked at other measures of KS4 performance. These are shown in Appendix Tables A2-A4. All models are comparable with those in Tables 6 and 7. If, rather than using the total points score, we consider the proportion getting 5 A*-C GCSEs (and their equivalents) and/or the proportion getting 5 A*-C GCSEs (and their equivalents) but including

Finally, we considered a different measure of whether academisation under the Labour programme resulted in improved school performance by looking at Ofsted inspections of schools before and after conversion, again relative to control schools. 23 Table 9 shows transition matrices for treatment and control schools in the 2000s. These transitions constitute movements in inspection rankings (of outstanding, good, satisfactory or inadequate) before and after academy conversion for academies in the early and late 2000s and the same for comparison schools. Not all schools were inspected twice in this period so we are forced to analyse a sub-set of schools.
The descriptive statistics in Table 9 show that academies were, on average, more likely to move up the rankings before and after conversion as compared to comparison schools.
Ordered probit estimates reported in Table 10 confirm this and show a statistically significant improvement in inspection rankings of academies. We take this as complementary and corroborative evidence in line with the KS4 performance gains we have already reported.

Mechanisms
The above results uncovered evidence of significant performance improvements for pupils treated by academy conversion. They also showed these improvements to be more pronounced for those attending schools that gained the greatest autonomy. We now address the question: what use of academy freedoms can account for these findings?
To begin the discussion of mechanisms, we first draw on the Department for Education's (2014) survey of academy schools 'Do Academies Make Use of Their Autonomy?'. This survey collected information on a wide array of changes that may have occurred following conversion. 24 These are summarised in Table 11 for 23 of the Labour academies we analyse in this paper, and for 148 academies (including the 23) overall. When asked what the most important change was, two answers dominate -'changed school leadership' (at 56 percent) and 'changed the curriculum you offer' (at 26 percent).
Furthermore, both of these were reported to be linked to improved outcomes (in 73 and 77 percent of cases respectively). Other changes that were notably linked to improved outcomes were 'Increased the length of the school day' (63 percent) and 'Collaborated with other schools in more formalised partnerships' (45 percent).
Looking at differences between treatment and control schools in the D-i-D event study offers further evidence. We can look at three of the important factors identified in Table 11: whether a new headteacher is taken on upon conversion; whether more pupils are enrolled; and whether more teachers are taken on. This is facilitated by the availability of school level data over time on each of these. year of conversion c as compared to the control schools. This seems to be a one off change, as the subsequent year treatment effects from c+1 to c+3 are all insignificantly different from zero. The rate of headteacher turnover is a little higher, at 62 percent, in conversions from predecessor community schools, but is also high, at 51 percent, in predecessor non-community schools, showing that changing headteacher is a general and widespread feature of academy conversions.
Thus a strong feature of academy conversions is to replace the headteacher. There is a more modest turnaround of the rank and file teaching staff, and much of this is due to a need to take on more teachers as more pupils enrol in academies post conversion. This is shown from the results reported in Table 13. The Table shows event study D-i-D estimates of the effect of academy conversion on the number of teachers, number of pupils and the teacher-pupil ratio.
Looking at columns (1)-(3) shows that the number of teachers rose gradually for event study years c+1 through c+3, although there was no significant effect in the year of conversion. This is because, as shown in columns (4)-(6), more pupils were enrolled as the academies were up and running, again with an insignificant change in the year of conversion, but with increases in pupil numbers by c+3. Finally, columns (7)-(9) show that the increase in the number of teachers was largely due to increased pupil enrolments (except in the conversion year, where the teacher-pupil ratio did rise, especially in conversions from predecessor community schools, because of a blip down in pupil enrolments that year). Overall, however, the Table shows less clear evidence of teacher turnover as compared to the very significant evidence of headteacher turnover shown in Table 12.

Conclusions
This paper focusses on what has become a high profile case of education policy -the introduction of academy schools into the English secondary school sector. We consider the impact of academy school conversion on pupil intake and performance. Academy conversion is seen to generate a significant improvement in the quality of pupil intake and significant improvements in pupil performance for those who attended schools treated by academy conversion. There is evidence of heterogeneity in the estimated performance effects as they occur only for schools experiencing the largest increase in their school autonomy relative to their predecessor state.
For children attending academies that converted from a community school we find that transformation to an academy raised their educational outcomes by 0.14σ on average, and by more for children receiving more years of treatment (rising to around 0.39 of a standard deviation three years post-conversion). These findings complement existing work from different settings, such as that on US charter schools (both those newly set up and those closer to takeovers of public schools), on whether different school types can affect pupil performance. It is noteworthy that a key feature distinguishing these new coalition academies is that, on average, they are not characterised by poor performance and disadvantage in their predecessor state like the sponsored academies introduced and approved under the previous Labour government which we analyse in this paper. 25 The way some of them are run is also different with, for example, some of the post May 2010 academies being run as chains of schools by major sponsors. It will be an important future research challenge to determine whether or not these new convertor and chain run academies are able to deliver the kinds of performance improvements for students enrolling in them that the Labour programme we study here seemed to do.

Figure 1: Event Study Estimates of Pupil Intake and Academy Conversion, Key Stage 2, Pupils Enrolled in Year 7
Pupil Performance and Academy Conversion, IV Estimates
Notes: From column (9) specification of Table 6.

Figure 4: Event Study Instrumental Variable Estimates of Pupil Performance and Academy Conversion, Key Stage 4, Year 11 Pupils By Predecessor Type

Pupil Performance and Academy Conversion, IV Estimates
Notes: From cohort specific estimates of column (9) specification of Table 6.

Notes:
a - Registered independent schools are independent of the local authority (LA), and are fee-charging.
b - Academy schools (prior to 2010/11): all ability independent specialist schools, which do not charge fees, and are not maintained by the local authority; established by sponsors from business, faith, HE institutions or voluntary groups, working in partnership with central government. Sponsors and the DfE provide the capital costs for the Academy. Running costs are met by the DfE in accordance with the number of pupils, at a similar level to that provided by local authorities for maintained schools serving similar catchment areas.
c - City Technology Colleges: all ability independent schools, which do not charge fees, and are not maintained by the local education authority. Their curriculum has a particular focus on science and technology education (see West and Bailey, 2013). They were established by sponsors from business, faith or voluntary groups. Sponsors and the DfE provided the capital costs for the CTC. Running costs are met by the DfE in accordance with the number of pupils, at a similar level to that provided by local authorities for maintained schools serving similar catchment areas.
d - Voluntary-aided schools are maintained by the local authority. The foundation (generally religious) appoints most of the governing body. The governing body is responsible for admissions and employing the school staff. Land at voluntary-aided schools is usually owned by trustees, although the local authority often owns any playing field land (DfE, 2012).
e - Foundation (formerly grant-maintained) schools are maintained by the local authority. The governing body is responsible for admissions, employing the school staff, and either the foundation or the governing body owns the school's land and buildings (DfE, 2013).
f - Voluntary-controlled schools are maintained by the local authority. These are mostly religious schools where the local authority continues to be the admission authority. Land at voluntary-controlled schools is usually owned by trustees, although the local authority often owns any playing field land (DfE, 2013).
g - Community schools are maintained by the local authority. The local authority is responsible for admissions, employing the school staff, and it also owns the school's land and buildings.

Notes: E denotes event year and c is the year of conversion. Robust standard errors (clustered at the school level) are reported in parentheses. Control variables are dummies for whether the pupil is male, the pupil's ethnicity group, whether they are eligible for free school meals and whether they have special educational needs.

Notes: E denotes event year and c is the year of conversion. Robust standard errors (clustered at the school level) are reported in parentheses. Control variables included are the same as from the Table 5 regressions, although in specifications including KS2 test scores we now additionally include a separate intercept for pupils for whom KS2 data is unavailable. For children who move out of treatment or control schools to take their KS4, school fixed effects (1714) for the school they move to are also included.

Tables 6 and 7 are because a small number of pupils do not sit their KS4 exams in an academy school and are not intention to treat but are nevertheless exposed to treatment i.e. they may enter post-conversion and leave prior to examinations. Restricting the sample to pupils appearing in both the exposure and event study samples gives us 362412 observations. Running the specifications in Tables 6 and 7 on this sample makes no difference to the reported results.

(1) and (2) are limited for the Fake Policy time period and comprise solely of gender.

Notes: The dependent variable is coded as 0 for a reduction in Ofsted rating, 1 for no change and 2 for an improvement. Robust standard errors in parentheses. The control variables included in specification (2) are proportion male, proportion white, proportion of pupils eligible for free school meals and the proportion of pupils with special educational needs, all measured in the year of first inspection. Year of inspection dummies are also included.

Notes: E denotes event year and c is the year of conversion. Robust standard errors (clustered at the school level) are reported in parentheses. Control variables are percentage of year 7 intake male, white-origin, free school meal status and special educational needs status. A pooled Academy x Post-Conversion (E = c to c+3) estimate and associated standard error (in parentheses) comparable to (1) for all schools is 0.345 (0.042).

Notes: E denotes event year and c is the year of conversion. Robust standard errors (clustered at the school level) are reported in parentheses. Control variables are percentage of year 7 intake male, white-origin, free school meal status and special educational needs status. A pooled Academy x Post-Conversion (E = c to c+3) estimate and associated standard error (in parentheses) comparable to (1) for all schools is 0.062 (0.028), for (4) for all schools is 0.039 (0.024) and for (7) for all schools is 0.023 (0.015).

Data on Academy Schools
We first identified all schools that became academies over the school years 2002/03 to 2010/11. Our sources for this are Department for Education extracts that give information on all academies that have opened or are in the process of doing so. The extract gives the opening date of the academy, its URN (a unique identifier for the school allowing us to identify it in various governmental data sources such as the National Pupil Database and the Pupil Level Annual Census data), DFE number (a second unique identifier combining school specific and local authority specific numbers) and the URN number of the predecessor school.
Using performance tables data from the Department for Education (DfE) we match in predecessor school types. The data gives 244 schools that became academies between the first 3 academy openings in 2002/03 and those that gained academy status by September 2010 (the beginning of the academic school year). We omit those that were previously independent schools due to pupils in these schools not having exam information at KS4. Similarly, we omit new schools as they have no predecessor school.
In order to have a balanced panel we focus on academies that have some form of predecessor school open from at least 1996 onwards. Any later and the school will not have KS4 results for 2001. In order for our sample to be balanced for intake we exclude academies who do not enrol pupils in year 7. The final sample contains 106 treatment schools (those that opened as academies prior to, or in, September 2008) and 114 control schools with observations ranging over the years 2000/01-2008/09. None of our control schools become academies during these sample years.

Pupil Level Data
We use data from PLASC (pupil level annual schools census) and the NPD (national pupil database). The NPD contains information on all key stage 2 (KS2) and key stage 4 (KS4) exams sat at the end of primary and secondary school respectively. Each pupil is identified by a unique reference number and the data gives the unique URN of the school in which they sat the exam. While the NPD reports on pupils in examination years, PLASC has a record for every pupil for each year that they are in the maintained school sector. PLASC data gives the pupil, year group and school as well as demographic variables such as ethnicity, gender, free school meal eligibility and special educational needs status. We can track pupils through secondary school using the unique pupil identifier. This identifier is common to the NPD, enabling us to merge NPD and PLASC data. This gives a panel of pupils with their demographic information, their KS2 and KS4 test results and the school(s) that they attended from year 7 (first year of compulsory secondary education) through to year 11 (final year of compulsory education). We then extract those pupils who attended the 220 treatment and control schools at some point over the sample period. We can now see which schools pupils attended in every compulsory secondary year of schooling, 26 their demographic information and their exam results at KS4 and KS2. Our intake analysis focuses on those who enter as a year 7 student in 2000/01-2008/09, while our results analysis focuses on those who sit exams, are ITT or receive exposure in one of our 106 treatment schools or sit exams in one of our 220 control schools over the same period.
The raw data contains a small number of duplicate observations for pupils. 27 Duplicates at the level of KS2 results are easy to deal with, as we randomly delete one entry when pupils' records are duplicated in all aspects apart from the primary school they attended (as primary school does not matter to us). When there are two entries with differing exam scores we keep the record with the most information (i.e. if one entry has the pupil missing most exams while the second has scores for these exams we keep the latter). 28 There are also a small number of multiple records of KS4 attainment for pupils. Our analysis focuses on the year 11 record, giving us a dataset of pupils who have completed their GCSEs (as opposed to those who have sat some exams early).
When multiple records exist in a single year we delete those whose scores are not included in national or school level calculations -often these are those who switch schools and so take exams in one school but are coded as attending another. 29 In a few cases pupils are flagged as not to be included in the school level calculations (so their attainment would not be used to calculate performance tables school level data) despite the fact that their information is not duplicated and nothing appears to be wrong with their attainment data. We include these pupils in the final dataset. However, all results are robust to omitting these pupils. 30 The sample sizes for year 7 and year 11 pupils are given in Table A1.
Finally it is worth noting that PLASC does not cover years prior to 2002. For our observations before then we do still have NPD data on KS2 and KS4 performance (we have these going back to 1997 for KS4 and 1996 for KS2). Therefore these observations are missing all demographic covariates with the exception of gender. Similarly in our fake policy results the only covariates, aside from year dummies and school fixed effects, are pupil gender. This is why, in Table 9, we reproduce our main specification without covariates so as to make the fake and actual policy results comparable.
For our intake analysis we assume that those identified as being in year 8 in 2002 in PLASC were year 7 pupils in the same school the previous year; we therefore retain demographic variables for these pupils.

26 Strictly speaking this is not true. Some pupils enter the schooling system either from another country or from independent schools. We observe when the pupils enter but not precisely where they came from. These pupils are retained in our analysis.
27 That is, multiple records in a single year.
28 This may be the case when a pupil misses exams through illness and retakes at a later date.
29 Variables in the NPD identify whether the pupil's achievements were used in school/national level calculations.
30 Unless otherwise stated all further robustness checks mentioned in this Appendix are available upon request from the authors.

Notes on Treatment and Clustering
Treatment for the pupil intake KS2 analysis is simple. A pupil is defined as being in the treatment group if they enrol in an academy school in their first year of secondary school (year 7) after conversion to an academy has occurred.
Intention to treat for the KS4 performance analysis is defined as follows. For an individual in pre-enrolment year c-i (where c denotes conversion year) and academic year group j an individual is expected to sit their exams in c-i + (11-j). A person is then ITT if the preceding term is equal to c, c+1, c+2 or c+3. To see why an individual cannot be ITT in year c+4 note that the 'biggest' pre-enrolment year is c-1 and the 'smallest' academic year group is 7 thus the preceding term cannot exceed c+3.
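The ITT rule above can be written out as a short sketch. The function and argument names are hypothetical and introduced here for illustration; `enrol_offset` is -i for a pupil observed in pre-conversion year c-i, and `year_group` is the academic year group j.

```python
def expected_exam_year(enrol_offset: int, year_group: int) -> int:
    """Event year, relative to conversion year c, in which KS4 is expected.

    Implements c-i + (11-j), with c normalised to 0 and enrol_offset = -i.
    """
    return enrol_offset + (11 - year_group)


def is_itt(enrol_offset: int, year_group: int) -> bool:
    """True if a pupil enrolled pre-conversion is expected to sit KS4 in c..c+3."""
    # Only pupils observed in a pre-conversion year (c-1, c-2, ...) and in a
    # secondary year group (7 to 11) can be intention to treat.
    if enrol_offset >= 0 or not 7 <= year_group <= 11:
        return False
    return 0 <= expected_exam_year(enrol_offset, year_group) <= 3


# A year 7 pupil in c-1 is expected to sit exams in c+3, the latest possible year.
latest = max(
    expected_exam_year(-i, j) for i in range(1, 6) for j in range(7, 12)
)
```

The last expression confirms the bound stated in the text: with pre-enrolment year at most c-1 and year group at least 7, the expected exam year cannot exceed c+3.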
The exposure variable for Table 8 is defined cumulatively: we simply sum the number of academic years an individual spends in an academy school post conversion. ITT is then defined as above. We limit this sample to those who spend at most 4 years in an academy post conversion so as to be consistent with Tables 6 and 7.

A final note relates to how we define 'school'. For each of our treatment and control schools we assign a unique number. It is possible that two pupils from different schools are given the same number should the two differing schools later become the same academy. We identify when schools merge by looking at linked schools in edubase (this is a Department for Education database of all open and closed maintained schools in England). In one case a single school becomes two separate academies (North Westminster Community School splits into Paddington Academy and Westminster Academy in 2006). Pupils attending the predecessor school are randomly assigned one of the two numbers given to the two academies that open later. Students who leave the sample but are ITT or receive exposure are given a unique number equal to the school that they sit their KS4 exams in. In estimated specifications, standard errors are clustered on this unique number, resulting in 1714 clusters in Tables 6 and 7 and 1720 in Table 8.
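The cumulative exposure measure described above can be sketched as a simple count. This is an illustration only: the data layout (a list of year and school pairs per pupil, and a lookup of conversion years) and all names are assumptions, not the authors' code.

```python
def years_of_exposure(attendance, conversion_year_by_school):
    """Count academic years a pupil spends in an academy after its conversion.

    attendance: list of (academic_year, school_id) pairs, one per year.
    conversion_year_by_school: dict mapping academy school_id to the first
        academic year in which it operates as an academy (hypothetical layout).
    """
    return sum(
        1
        for year, school in attendance
        if school in conversion_year_by_school
        and year >= conversion_year_by_school[school]
    )


def in_exposure_sample(attendance, conversion_year_by_school, max_years=4):
    """Keep pupils with at most four years of exposure, as in the text."""
    return years_of_exposure(attendance, conversion_year_by_school) <= max_years


# Hypothetical pupil: three years in school "A" (an academy from 2004),
# then two years in school "B", which never converts.
pupil = [(2003, "A"), (2004, "A"), (2005, "A"), (2006, "B"), (2007, "B")]
exposure = years_of_exposure(pupil, {"A": 2004})
```

In this example the pupil leaves before examinations but still accumulates two years of exposure, which is the case the exposure specification is designed to capture.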

Attainment Measures
The main variable in our analysis of intake is an average score across three subject-specific tests: English, Maths and Science. Test scores are reported in two ways: firstly, a level from 2-5 is awarded in each subject and secondly, a raw test score. The raw test score is out of 80 for Science and is the sum of two separate science papers each marked out of 40, while the English test score is marked out of 100 and is composed of the sum of two separate test scores, each marked out of 50, in reading and writing. Finally, Maths is composed of two marks out of 50 with one of the tests being in mental arithmetic. The levels are based upon these underlying test scores but are not always consistent. For instance, after an initial level is assigned after grading the test there may be a review of the pupil's test score resulting in a higher or lower level being awarded even if the underlying raw test mark is not altered. Similarly the mark required for any one level varies both between subjects and within subjects across years. For these reasons we use standardised raw test scores as our main dependent variable in KS2 regressions.
When pupils are not awarded a test mark or are performing at a level below the level of the test we award pupils a mark of 0. Those who miss the tests are excluded from our sample for the purposes of the KS2 regressions but are included in our KS4 regressions where we include a dummy for those who do not have a KS2 record or who miss KS2 exams. Our KS4 results are robust to re-running our regressions omitting those without a KS2 record and those whose scores are below test levels.
The main KS4 qualification in the UK is the GCSE (General Certificate of Secondary Education). GCSEs are graded A*-G. The current points score calculations give an A* a score of 58 and a G a score of 16, with grades in between going up in increments of 6. Prior to this an A* was given a score of 8 and a G a score of 1, with scores going up in increments of 1. The two scales are as follows:

Grade    Points    Old scale
A*       58        8
A        52        7
B        46        6
C        40        5
D        34        4
E        28        3
F        22        2
G        16        1

As well as GCSEs there are a wide range of equivalent qualifications focusing on more vocational subjects. These include GNVQs and BTECs. Depending upon the type of equivalent these are often worth multiple GCSEs and are often graded as a combination of GCSE grades, i.e. a distinction in an intermediate GNVQ is equivalent to gaining two GCSEs with one at grade A and the other at grade A*. 31 The points score given to the qualification reflects the underlying GCSE grades that it is based upon, so that under the new scoring system the aforementioned qualification would be given a score of 110.
The points system we use is as follows:

Grade    Points (scale used in the paper)

The points system we use addresses some of the concerns expressed pertaining to the 16-58 and 1-8 scales used over the course of our sample. 32 The non-linearity reflects the fact that it appears hardest to jump from grades D to C and from A to A*.
We cap points scores at best 8 qualifications. To do this we normalize raw point scores by their GCSE equivalent, i.e. a qualification worth 4 GCSEs and 208 points (under the 16-58 scale) is normalized to be worth 52 points. We then convert these points to our new measure and rank them highest to lowest. We then add up the grade weightings (in terms of GCSEs), taking fractions of qualifications if need be, until we reach 8. All those in the top 8 are then multiplied through by their weight and summed to give the points score.
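The capping procedure above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name and the `convert` hook are hypothetical, and `convert` defaults to the identity because the mapping from official points onto the paper's own scale is not reproduced here.

```python
def capped_points(qualifications, cap=8.0, convert=lambda points: points):
    """Best-8 capped points score.

    qualifications: iterable of (total_points, gcse_weight) pairs, where
        gcse_weight is the number of GCSEs the qualification is worth.
    convert: hypothetical hook mapping normalised points onto another scale.
    """
    # Normalise each qualification to points per GCSE, then convert.
    per_gcse = [(convert(points / weight), weight) for points, weight in qualifications]
    # Rank from highest to lowest points per GCSE.
    per_gcse.sort(key=lambda item: item[0], reverse=True)

    remaining = cap
    total = 0.0
    for points, weight in per_gcse:
        if remaining <= 0:
            break
        used = min(weight, remaining)  # take a fraction of a qualification if needed
        total += points * used
        remaining -= used
    return total


# The text's example: 208 points over 4 GCSEs normalises to 52 points per GCSE.
# Adding six single GCSEs at 40 points, only four of them fit under the cap.
score = capped_points([(208, 4)] + [(40, 1)] * 6)
```

Here the four-GCSE qualification contributes 4 x 52 and the best four single GCSEs contribute 4 x 40, so the remaining two GCSEs are ignored.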
Our decision to cap at 8 is motivated by two concerns. Total points scores have the problem that pupils can appear to do well by entering many exams and performing poorly in them. Similarly using, for instance, 5 best means that those who focus very narrowly on a small set of exams may appear better than those who perform well over a larger selection of subjects/qualifications. Our decision to cap at 8 balances these two concerns.
Finally, it is worth noting that our point measures create some notable discrepancies with the official method. For instance, an equivalent qualification worth two GCSEs graded CD is worth 74 points under the 16-58 scale, meaning that it is worth more than an A* at GCSE. Using our system such a qualification is worth 10 points (the sum of the points scores for grades of C and D), the equivalent of a GCSE at grade A*. A further example is a BTEC that is worth 76 points on the old scale and equivalent to 4 GCSEs. This is the same as achieving grades of 2 Fs and 2 Gs. In our system this is equivalent to a point score of 6. Thus our points mean the qualification is the same as getting a C at GCSE whereas the old measure means that the qualification is again worth more than an A*. In general our system reduces the relative points scores of equivalent qualifications compared to the official method. Despite this our results remain unchanged when using the (standardized) old (1-8) and new (16-58) points systems and when using total rather than capped scores.

31 Most equivalents are graded as pass, merit or distinction but the Department for Education equates these categories to combinations of A*-G grades.
32 We are grateful to Tim Leunig and Mike Treadaway for very helpful correspondence on this.
The threshold measures (results for which are reported in Tables A2-A4) are relatively simple. In these, an equivalent qualification is seen as being at least a C if its normalized points score is greater than or equal to that score given to a grade C at GCSE. Thus a qualification worth N GCSEs whose normalized point score is at least 6 equates to N qualifications of at least grade C.
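The threshold rule above can be illustrated with a short sketch. The function names and the input layout are hypothetical; points are assumed to be on the paper's own scale, on which the text states that a grade C corresponds to a normalized score of 6.

```python
def count_c_or_above(qualifications, c_threshold=6.0):
    """Number of GCSE-equivalents at grade C or better.

    qualifications: iterable of (total_points, gcse_weight) pairs on the
        paper's scale (assumed input layout).
    """
    return sum(
        weight
        for points, weight in qualifications
        if points / weight >= c_threshold  # normalised points per GCSE
    )


def has_five_a_star_to_c(qualifications):
    """Indicator for achieving at least 5 A*-C GCSEs or equivalents."""
    return count_c_or_above(qualifications) >= 5


# A qualification worth 4 GCSEs with 24 points normalises to 6 per GCSE,
# so it counts as 4 passes at grade C; one further GCSE at 7 points makes 5.
example = [(24, 4), (7, 1)]
passes = count_c_or_above(example)
```

By the same rule, the two-GCSE qualification graded CD from the text (10 points, so 5 per GCSE) would not count towards the threshold.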
We present results for all our main performance specifications in Tables A2-A4 using different dependent variables.

Ofsted Reports 33
Ofsted is a government department that carries out inspections of maintained schools in England and Wales and reports to Parliament. Inspectors give schools minimal prior warning of inspection and proceed to inspect the school based upon pre-set criteria before awarding the school an overall effectiveness rating. 34 Overall effectiveness is based upon many criteria such as the achievement of pupils, the effectiveness of management and the level of well-being and personal development of the pupils. Our main interest is whether schools converting to academies are more likely to improve their rating relative to the control schools.

To do this we use Ofsted ratings for the years 2000-2010. We limit the sample to the years 2000-2010 as post 2010 all the schools in our sample have converted to academies making any comparisons between converters and those yet to convert impossible.
For our estimates we use the first and last inspections for each school in our sample. For treatment schools the first inspection must be prior to conversion while the last must be post conversion. These restrictions result in our sample of treatment schools falling to 46, with the first three cohorts not represented in our sample at all. For control schools we omit those that only have a single inspection over the period, thus reducing our sample of control schools to 105. For this sample we define a variable equal to 0 if the school's last inspection is worse than its first, 1 if the inspections are the same and 2 if the last inspection is an improvement on the first. We use first and last inspections so that there is an equivalence in how we select relevant inspections for treatment and control schools. There are no cases when schools have multiple reports in the same year.
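The construction of the Ofsted outcome can be sketched as below, using the coding 0 for a fall in rating, 1 for no change and 2 for an improvement. The names, the numeric ranking of the four ratings and the treatment of an inspection in the conversion year as post conversion are assumptions made for illustration.

```python
# Lower rank means a better rating (assumed numeric encoding).
RANK = {"outstanding": 1, "good": 2, "satisfactory": 3, "inadequate": 4}


def first_and_last(inspections, conversion_year=None):
    """Return (first_rating, last_rating), or None if the school is excluded.

    inspections: list of (year, rating) pairs, at most one per year.
    conversion_year: set for treatment schools only; the first inspection must
        then precede conversion and the last must not (an assumed convention).
    """
    ordered = sorted(inspections)
    if len(ordered) < 2:
        return None  # schools with a single inspection are dropped
    (first_year, first), (last_year, last) = ordered[0], ordered[-1]
    if conversion_year is not None and not (first_year < conversion_year <= last_year):
        return None
    return first, last


def rating_change(first, last):
    """0 if the rating fell, 1 if unchanged, 2 if it improved."""
    if RANK[last] > RANK[first]:
        return 0
    if RANK[last] == RANK[first]:
        return 1
    return 2


# Hypothetical treatment school converting in 2006.
pair = first_and_last([(2003, "inadequate"), (2008, "good")], conversion_year=2006)
outcome = rating_change(*pair)
```

The resulting three-valued variable is the kind of ordered outcome that an ordered probit takes as its dependent variable.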
As a robustness check we replicate the results using the following two conversions for Ofsted scores: Conversion Our results prove robust to these changes.

Data on Mechanisms
As well as considering Ofsted reports we study mechanisms by looking at survey results from the Department for Education (2014), head teacher change and teacher turnover.
We collect data on head teachers using edubase and match a head teacher to each of our schools for each year (excluding 2001 for which data are not available) in our sample. For each year we define a binary variable equal to 1 if this year's head teacher is different from last year's. When two schools merge we set this variable to 1 only if the head is not the head of either of the predecessors. When two separate schools are defined as being the same (with respect to the clustering variable) we set this variable to 1 if either school changes its head teacher in that year. Controls in this linear model are the same as those reported in Table 10.
For the teacher and pupil analysis we use data from the annual schools census. The data gives us the number of qualified and unqualified teachers at all maintained secondary schools for the years 2001-2009. We weight the total number of teachers, at the school level, by the number of pupils of compulsory secondary schooling age (11-15) relative to the total number of pupils in the school. This prevents a potentially spurious relationship between the number of teachers and academy conversion caused by many schools opening 6th forms post-conversion. The weighted number of teachers, the total number of pupils in compulsory secondary schooling and the ratio of these two variables form the dependent variables in Table 14. Controls are the same as those reported in Table 11.
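The weighting described above amounts to scaling teacher numbers by the share of pupils aged 11-15. A minimal sketch, with hypothetical function and argument names:

```python
def weighted_teachers(teachers, pupils_11_15, pupils_total):
    """Teachers scaled by the share of pupils of compulsory secondary age."""
    if pupils_total <= 0:
        raise ValueError("pupils_total must be positive")
    return teachers * pupils_11_15 / pupils_total


def teacher_pupil_ratio(teachers, pupils_11_15, pupils_total):
    """Weighted teachers per pupil of compulsory secondary age."""
    if pupils_11_15 <= 0:
        raise ValueError("pupils_11_15 must be positive")
    return weighted_teachers(teachers, pupils_11_15, pupils_total) / pupils_11_15


# Hypothetical school: 60 teachers, 900 pupils aged 11-15 and 100 sixth formers.
w = weighted_teachers(60, 900, 1000)
r = teacher_pupil_ratio(60, 900, 1000)
```

In this example a school that adds a sixth form without changing staffing for 11-15 year olds sees its raw teacher count rise, while the weighted count attributes only the 11-15 share of teachers to the compulsory phase.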