The U.K. as a Technological Follower: Higher Education Expansion and the College Wage Premium

The proportion tripled between [...]. However, the trend in the college wage premium has been extraordinarily flat. We show that these patterns cannot be explained by composition changes. Instead, we present a model in which firms choose between centralized and decentralized organizational forms and demonstrate that it can explain the main patterns. We also show the model has implications that differentiate it from both the exogenous skill-biased technological change model and the endogenous invention model, and that U.K. data fit with those implications. The result is a consistent picture of the transformation of the U.K. labour market in the last two decades.


INTRODUCTION
In the period extending from the early 1990s to the present, the U.K. economy experienced a dramatic transformation in educational attainment. Specifically, in 1993, 11% of the population held a university degree. This percentage doubled by 2006 and tripled by 2016. In this article, we examine the impact of that increase on the U.K. labour market, using our findings as a basis for contributing to the ongoing discussion about the interaction of educational attainment and technological change.
The editor in charge of this paper was Dirk Krueger.

REVIEW OF ECONOMIC STUDIES
with the same technology can dictate quite different wage outcomes for two countries depending on whether they are leaders or followers in the adoption of that technology.
The article proceeds in six sections including the introduction. In the second section, we establish the core patterns for the U.K., relying largely on Labour Force Survey (LFS) data between 1993 and 2016. We show that, despite a rapid increase in the proportion of university graduates, the college wage premium is flat across our time period. We demonstrate the robustness of the two findings: they cannot be explained as, for example, declines in the actual wage differential that are masked by changes in composition. We consider compositional changes related to increases in the female participation rate, the shift toward more advanced university degrees over time, the difference between the public sector and the private sector, and the substantial increase in immigration. We also consider changes in unobserved abilities, using a bounding approach. None of these exercises alters the core result that the education wage differential was essentially unchanged during a period of rapid educational growth.
The combination of an increase in the supply of education and no change in the educational differential points to an offsetting relative demand shift favouring more educated workers. Such a shift has, of course, been the focus of considerable investigation, with a common conclusion that technological change associated with the IT revolution has been a key driving force. In the third section, we investigate competing models of technological change: the canonical model of exogenous skill biased technological change; models in which increases in education induce skill biased invention; and models of technological choice, in which firms choose among existing technologies. To test among the models, we employ wage and relative wage regressions derived from a general production function that nests all three possibilities. Based on estimates of those regressions, we argue that a model with exogenous technological change (either in its classic form or in a task based form) cannot explain the skill and wage patterns in the U.K. data. In particular, in the context of those models our estimates would imply that skilled and unskilled labour are nearly perfect substitutes and that there has been no exogenous skill biased demand shift, neither of which seems reasonable. This echoes previous papers that conclude that the canonical model also does not fit more recent U.S. data (Card and DiNardo, 2002; Beaudry and Green, 2005; Acemoglu and Autor, 2011). The implied substitution patterns are also relevant for empirical specifications derived from the endogenous invention model when holding technological change constant. For those specifications, as well, the findings of near perfect substitutability and no ongoing skill bias to demand shifts do not match the model. Implications from the endogenous invention model when not holding technological change constant also do not fit with the U.K. data patterns.
We also present evidence that, viewed through the lens of an endogenous invention model, the U.S. is a strong candidate for being the technological leader where the new skill biased inventions were made. It had both a higher level of education and a higher amount of investment in IT before any other developed economy. The U.K., on the other hand, was a laggard in educational attainment. We argue that once the U.K. did start to increase the educational level of its workforce, its firms could choose to pick up the technologies and organizational forms already developed in the U.S. In that sense, it is more natural to think of the U.K. in the context of the third type of model: a model of endogenous technological choice.
In the fourth section, we set out a model of endogenous technological choice which has the ability to capture the core data patterns and the results of our estimation. The model is a variant of models in Rosen (1978) and Borghans and ter Weel (2006) which focuses on the role of decentralization of decisions and information. It is also related to the model of endogenous technological choice in Beaudry and Green (2003). Firms use skilled and unskilled labour and choose between an older, centralized mode of operation and a newer, decentralized mode. The model endogenously generates an unchanging college wage premium. This was the point of using this type of model, and so that outcome provides no proof of the model's relevance. However, the model also generates testable added implications about the pattern of employment in manager positions for skilled and unskilled workers as the relative supply of skilled workers increases as well as strong implications about the form of the aggregate production function.
We examine these empirical implications of our model in section five. In that section, we also investigate further implications by examining the relationship between the educational composition of the workforce and the extent to which workers feel they control how they do their own work using matched worker-workplace survey data from the U.K. Workplace and Employer (WERS) data. We show that the areas where the increases in the BA proportion were largest had the greatest uptake of decentralized organizational forms. We establish that this is a causal relationship with an IV strategy based on a combination of parental education and the population share of the birth cohorts most affected by the educational increase, measured in 1995 (i.e. before the entry of the most affected cohorts into the labour force). We view this as a credible strategy since the validity of the instrument just requires that differences in fertility rates across areas were not driven by changes in firm organizational forms 20 years later. Thus, the data fit with a model in which increased educational attainment induces more and more firms to choose a decentralized organizational form. One interesting implication of the model that is confirmed in these data is that increases in education levels in an area induce larger increases in individual decision making among less educated than among more educated workers. This arises because under the old, centralized technology, more educated workers were disproportionately managers and were already making their own decisions. It is for the less educated that decentralization is a particularly big revolution.
In Section 5, we also briefly provide evidence that several other developed and developing economies in this period also experienced a combination of a rapid increase in educational attainment with little change in the education wage differential. That is, in our terms, the U.K. was not the only technological follower. The sixth section of the article contains conclusions.
We are not the first researchers to note the substantial increase in degree-holding in the U.K. For example, Carpentier (2004) documented the trend in student numbers from 1920 to 2002, showing that it increased sharply around the early 90s. He also showed a reduction in university expenditure per student around the same time. Many other studies have also documented the substantial increase in the share of graduates in the 1990s or across cohorts (O'Leary and Sloane, 2005; Walker and Zhu, 2008; Green and Zhu, 2010; Devereux and Fan, 2011).
Previous papers have also noted the lack of a reduction in the college wage premium over time or across recent U.K. cohorts (McIntosh, 2006; Machin and McNally, 2007; Walker and Zhu, 2008). However, those papers either appeal to offsetting relative demand shifts stemming from exogenous skill biased technical change or do not attempt to explain the lack of change in the relative wages at all. We add to the previous literature, in part, by providing an explanation that does not rely on exogenous skill biased demand shifts that just happen to be the right size to match the change in educational attainment across a range of years. Instead, we present a model in which this pattern arises endogenously, which has ramifications for how we think about the interactions of technological change, factor supplies, and factor demand. We also differ from earlier studies in our explicit emphasis on the firm organization part of the process; that is where our empirical work focuses. Combined, these give us new insights into how technological change affects economies. Overall, we view studying the U.K. as an opportunity to examine the impact of education policy on technological adoption and, through it, on wages in the situation that is likely relevant for most countries: being a technological follower.

Our main empirical work is based on the demographic, education, employment, wage, and occupation variables in the U.K. Labour Force Survey (LFS). The LFS is a representative quarterly survey of approximately 100,000 adults that is the basis for U.K. labour force statistics. It is similar in nature to the U.S. Current Population Survey (CPS), which we use as a comparison. We make use of U.K. LFS data running from the first quarter of 1993 to the last quarter of 2016.
Consistent definitions of education levels over time are obviously important in our investigations. The LFS asks respondents about their highest level of educational qualification, with the potential categories changing over time. We take advantage of detail in the potential responses to construct six more aggregate categories that are consistent over time. For our main discussion, we then further aggregate those categories into three broader groups: a university degree level or above; secondary or some tertiary education below a university degree level; and below secondary qualifications. We draw the bottom line of secondary education as Grade C in the General Certificate of Secondary Education (GCSE), which are exams that students take at age 16 after 11 years of formal schooling. The GCSEs mark the first major point of exit from education in England: around one fifth of the working-age population have GCSEs Grade C or above or equivalents as their highest level of qualification in 2016. We consider a grade of at least C to be equivalent to High School graduation (HS) in the U.S. because the proportion of people strictly below the threshold in the U.K. is close to the proportion of HS drop-outs in the U.S. Under UNESCO's International Standard Classification of Education (ISCED 2011), both the U.S. High School Diploma and the U.K.'s GCSE Grade C or above fall into ISCED level 3 "upper secondary education." We have investigated alternative definitions of education groups and they make little difference to our main results. We restrict our samples to people between ages 20 and 59 because the education qualification question was not asked of people over age 60 before 2007 unless they were working at the time of the survey.
Wages are surveyed in the first and fifth quarters an individual is in the survey. We use the hourly wage derived from the weekly wage in the main job and actual weekly hours. Our sample contains 30,000-75,000 wage observations per year. As we are interested in the real cost of labour to firms, we deflate wages by the GDP deflator. 4
4. https://stats.oecd.org/OECDStat_Metadata/ShowMetadata.ashx?Dataset=MEI_ARCHIVE&Coords=%5bVAR%5d.%5b108%5d&ShowOnWeb=true&Lang=en
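As a minimal sketch of this wage construction, the following Python function derives a real hourly wage from a weekly wage, actual weekly hours, and a deflator value. The function name, the argument names, and the base value of 100 for the deflator are illustrative assumptions, not details taken from the LFS or from the OECD series.

```python
def real_hourly_wage(weekly_wage, actual_weekly_hours, gdp_deflator, base=100.0):
    """Return the hourly wage in main job, deflated by a GDP deflator.

    All names and the base-100 convention are hypothetical, for illustration.
    """
    if actual_weekly_hours <= 0:
        raise ValueError("actual weekly hours must be positive")
    if gdp_deflator <= 0:
        raise ValueError("the deflator must be positive")
    # Nominal hourly wage: weekly wage divided by actual weekly hours.
    nominal_hourly = weekly_wage / actual_weekly_hours
    # Express the wage in the prices of the deflator's base period.
    return nominal_hourly * base / gdp_deflator


# Example with invented numbers: 400 per week, 40 hours, deflator of 80.
example = real_hourly_wage(400.0, 40.0, 80.0)
```

Observations with zero or missing hours are rejected here rather than silently dropped; how such cases are handled in the actual data is not stated in the text.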
In places, we use the U.S. CPS to form a comparison. We again use individuals aged 20 to 59. The data are from the Outgoing Rotation Group samples. Following Lemieux (2006), we do not use observations with imputed wages when calculating wage statistics. Wages and employment status refer to the week prior to the survey week, and we only use wage and occupation data for individuals who are employed in the reference week. We aggregate the U.S. workers into three education groups: high school drop-outs; high school graduates (which includes workers with some or completed post-secondary education below a Bachelor's degree); and university degree holders (Bachelors and higher).
BLUNDELL ET AL.

We begin with a figure showing the level of university attainment over time for the U.K., with the U.S. as a benchmark. We will use the shorthand of calling the group with university degrees BA's, even though it includes other types of Bachelors degrees and more advanced degrees. For both the U.S. and the U.K., we summarize the data by plotting year effects from an exercise in which we first calculate the BA proportion for the set of cells defined by year and 5-year wide age ranges, then regress those proportions on a complete set of year and age range dummies. We control for age in this way because we are concerned that the movement of the baby boom through the age structure will affect our BA proportion measure. Figure 1 contains plots of the year effects for the BA proportion for both the U.K. and the U.S. The figure includes year effects from the General Household Survey (GHS) for the U.K. for the years before 1993 along with the same proportions from the LFS starting in 1984. 5 The sample sizes for the GHS are small, especially for the more educated, so we do not use it in our main analysis, but it does provide longer term context for the LFS data patterns. For the overlapping years, both of the U.K. datasets show a gradually increasing trend, although the level differs. As shown in Figure 1, the BA proportion in the U.K. showed a gradual increase in the 1970s and 1980s but it was still only about 0.13 in 1990, half of the value for the U.S. in that year. Beginning around 1993, however, the U.K. proportion underwent a rapid acceleration. By 2010, it had surpassed the U.S. 6 The big increase in the U.K. proportion in the BA group starting in the mid-1990s corresponds to a rapid increase in higher education enrolment from 1988 to 1994.
This increase has been documented in many studies (O'Leary and Sloane, 2005; Carpentier, 2006; Walker and Zhu, 2008; Green and Zhu, 2010; Devereux and Fan, 2011) and has been used as an arguably exogenous source of variation in studies of the causal impact of education (Devereux and Fan, 2011). The expansion of higher education over these decades reflects a sequence of specific policy choices made by the U.K. government. Further details are provided in the online appendix.
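The year effects used for Figure 1 can be illustrated with a small sketch. It assumes the data have already been aggregated to cells of year by 5-year age band, and it uses NumPy's least squares routine. The function name, the dictionary layout, and the choice of omitted base categories are hypothetical and are not taken from the article.

```python
import numpy as np


def year_effects(cells, base_year, base_age):
    """OLS year effects from cell-level BA proportions.

    cells maps (year, age_band) to the BA proportion in that cell.
    The regression includes an intercept, year dummies, and age-band
    dummies, omitting base_year and base_age (an illustrative choice).
    """
    years = sorted({year for year, _ in cells})
    ages = sorted({age for _, age in cells})
    year_cols = [y for y in years if y != base_year]
    age_cols = [a for a in ages if a != base_age]

    rows, outcome = [], []
    for (year, age), proportion in cells.items():
        row = [1.0]
        row += [1.0 if year == y else 0.0 for y in year_cols]
        row += [1.0 if age == a else 0.0 for a in age_cols]
        rows.append(row)
        outcome.append(proportion)

    beta, *_ = np.linalg.lstsq(np.array(rows), np.array(outcome), rcond=None)

    # Year effects are measured relative to the omitted base year.
    effects = {base_year: 0.0}
    for i, y in enumerate(year_cols):
        effects[y] = float(beta[1 + i])
    return effects
```

Adding the base-cell level back to each effect, as the figure notes describe for the 30-34 age band, would turn these relative effects into proportions. Cell weights are ignored in this sketch.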

Changes in relative wages.
The second main pattern relates to wages. In Figure 2, we plot the ratio of BA to high school median hourly wage by year for the U.K. We will refer to this ratio as the college wage premium. As with the BA proportion, the plot corresponds to year effects from a regression in which age is held constant. 7 The striking point in this figure is its flatness. Over the span of years from 1993 to 2016, the wage ratio shows only minor fluctuations around a flat line. The absence of significant changes to the relative wages is consistent with previous studies which found the U.K. graduate wage premium to be stable in the 90s and early 2000s (Chevalier et al., 2004; McIntosh, 2006; Machin and Vignoles, 2006; Machin and McNally, 2007; Walker and Zhu, 2008). 8 The flatness of the ratio seems to us to be striking in light of the near tripling of the proportion of the working age population with a BA over this same period. Our goal in this article is to provide an explanation for this pair of patterns.

5. The LFS underwent significant changes in 1984. Before 1984, it was a bi-annual survey. From 1984 to 1991 [...]
6. The rate of increase and the catch-up to the U.S. is even clearer when the data is plotted by birth cohort (Appendix A).
7. In the Supplementary Appendix, we present the college wage premium over the life-cycle by birth cohort. The differential is increasing over age in a concave pattern for each cohort. Because of this life-cycle pattern, one would expect the education wage ratio for the economy as a whole to increase as the population in our 20-59 sample ages, due to the baby boomers getting older. Holding age constant allows us to look past these composition related changes to the underlying wage changes.
8. Two earlier papers (O'Leary and Sloane, 2005; Walker and Zhu, 2005), using data up to 2003, found the university premium to have fallen somewhat over the cohorts that experienced the higher education expansion. However, the authors later revised their cohort conclusions with more years of LFS data in Walker and Zhu (2008).

Figure 1
Notes: BA refers to individuals who have a bachelors or higher degree. We aggregate each dataset to the level of year and 5-year age band, and regress the BA proportion on year dummies and age-band dummies. The proportion BA numbers are year effects from these regressions plus the level in 1992 for the 30-34 age band. Source: Authors' calculation from the U.K. Labour Force Survey, the U.K. General Household Survey, and the U.S. Current Population Survey.

The effects of composition shifts on the core patterns
One possible explanation for why such substantial increases in educational attainment were associated with little or no change in educational wage differentials is that compositional shifts are obscuring the true patterns. To see this, it is helpful to think of workers as bundles of efficiency units of tasks. More able workers supply a larger number of efficiency units per hour worked, and, in a standard neoclassical model, their observed wages will reflect this. As a result, observed average wages can increase either because of increases in the market price per efficiency unit or because the composition of workers shifts in the direction of a higher average number of efficiency units per worker. Since our result is that the observed college wage premium has not fallen as we might expect, the scenario of greatest potential interest is one in which the price differential for BA versus HS tasks declines while the differential in average efficiency units between BA and HS workers increases.
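The efficiency-unit argument can be written as a simple decomposition. The notation below is introduced here for illustration and is not the article's: $p_{g,t}$ is the market price per efficiency unit for education group $g$ in year $t$, and $\bar{e}_{g,t}$ is the average number of efficiency units per worker in that group. The decomposition is stated for average wages, as in the paragraph above, whereas the plotted premium uses medians.

```latex
\begin{align}
\bar{W}_{g,t} &= p_{g,t}\,\bar{e}_{g,t}, \qquad g \in \{BA, HS\}, \\
\ln\frac{\bar{W}_{BA,t}}{\bar{W}_{HS,t}}
  &= \ln\frac{p_{BA,t}}{p_{HS,t}}
   + \ln\frac{\bar{e}_{BA,t}}{\bar{e}_{HS,t}} .
\end{align}
```

Under this sketch, a flat observed premium is compatible with a falling price ratio only if the second term, the relative average efficiency of BA workers, rises by an offsetting amount.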

Observable characteristic composition.
Perhaps the most obvious compositional shift in terms of observable worker characteristics is related to the increase in female labour force participation. If the added female entrants with BA's are successively more able (compared to the added HS females) then their entry could hide a decline in the education differential in prices per efficiency unit. However, even the most cursory glance at the data indicates that gender composition shifts are not a source of problems since the wage patterns are the same for males and females. In Figure 3, we plot the proportion of BA's and the college wage premium for males and females separately (again, obtaining year effects from regressions including age polynomials). For both genders, we see the dramatic increase in the BA proportion after 1993, with a faster increase for females. The wage differential remains flat over time for each gender, with each series showing nearly identical values for the differential in 1993 and 2016, and so a change in weighting between men and women would not alter the overall wage picture.

Figure 2
Ratio of BA median wage to that of high-school graduates 1993-2016, U.K.
Notes: Wage is hourly. The sample is 20-59 year olds in LFS 1993-2016. BA refers to individuals who have a bachelors or higher degree. We aggregate LFS to the level of year and 5-year age groups, and regress the log BA to HS median wage ratio on year dummies and age-band dummies. The figure plots the estimated year effects normalized to zero in 1993.

In the Supplementary Appendix, we present several further exercises. First, we consider the increase in the proportion of university degree holders with post-graduate degrees. We show that replotting the wage line in Figure 2 including and not including workers with post-graduate degrees among the BA's does not change the main pattern: both lines show nearly identical values in 1993 and 2016. This is a reflection of the fact that, while the proportion of workers with a postgraduate degree increased rapidly, the proportion of university graduates with these degrees was still small at the end of our period. Second, we consider immigration as another potential source of compositional change since the proportion of U.K. workers born outside the U.K. doubled over the past two decades and immigrant returns to education are lower than those of the native born (Dustmann et al., 2013). However, the combination of strong increases in education with no accompanying changes in the college wage premium is present even if we look at the U.K. nationals alone, implying that composition changes related to immigration are not driving our main patterns. We also break the data down into public versus private sector employment and wages. Over the sample period, the public sector's employment share has remained around 25%. Both sectors saw very large increases in the BA proportion, with somewhat faster increases in the private sector. Both sectors again experienced relatively flat movements in the college wage premium, though the private sector trend is slightly more negative (amounting to about a 3% decline over the period from 1994 to 2016, as shown in Figure 8 in the Supplementary Appendix). Overall, we conclude that shifts in composition with respect to observable worker characteristics cannot explain our main pattern of substantial education increases paired with an invariant education wage premium.

Unobservable characteristic composition.
It is still possible, of course, that the composition of unobservable characteristics has shifted across education groups in a way that could explain the wage patterns. As higher education expands, it draws in pupils from a wider and wider range of prior attainment and perhaps innate ability. The expansion of university education in the U.K. after 1988 came with a fall in per student resources and was accomplished in part by transforming polytechnic institutions into universities. Both of those changes might also have had a negative impact on the quality of courses and hence of graduates. Thus, it seems possible that the average quality of BA workers has declined across cohorts. It is important to note, however, that this does not necessarily imply that the observed college wage premium is biased one way or the other relative to the composition constant differential. The quality of HS-educated workers is also likely to fall if the more able individuals among those who would have stopped at a HS education level in earlier cohorts now go to university and if some of those who would have been HS dropouts previously now obtain secondary qualifications. Thus, it is theoretically ambiguous whether the ability-composition constant college wage premium is greater or smaller than the observed one.
The idea that BAs have a lower and wider range of quality after the higher education expansion has been advocated in O'Leary and Sloane (2005) and Walker and Zhu (2008). Both papers use quantile regressions to estimate the university wage premium across different periods or cohorts, and they report a greater decline in the premium at lower quantiles than at higher quantiles. While it is tempting to interpret such results as evidence of declining quality of BAs at the lower end of the BA wage distribution, examining the wage distributions for BA and HS workers separately suggests a different conclusion. Working with 5-year wide birth cohorts, in the Supplementary Appendix we show that the decline in the wage differential at lower quantiles is driven by relative increases in lower end wages for the HS-educated. The 50-10 differential of the BA wage distribution is unchanged across cohorts entering the labour market in our period. Thus, it is difficult to conclude that the fall of the graduate premium at lower quantiles is due to a greater deterioration in the quality of BAs than HS workers at their respective lower ends.
In the Supplementary Appendix, we also present a bounding exercise to examine the limits of the potential impact of shifts in the distribution of unobservable characteristics on the college wage premium. We work at the level of 5-year birth cohorts because any such shifts would be clearest in looking at different cohorts of potential university graduates. Our exercise follows Manski (1994), Blundell et al. (2007), and Lee (2009), and works directly from a bounding approach in a Roy Model context set out in Gottschalk et al. (2014).
Underlying our approach is an hierarchical model of ability. In this model, there is a single, unidimensional ability that is more productive the higher is an individual's education level. Under standard assumptions on costs, higher ability individuals sort to higher levels of education. In this situation, there is a set of individuals (or, more properly, ability levels) who choose to go to university even in the pre-expansion period when universities were more costly to access. With the expansion of the university system these "university stayers" continue to get a higher education but they are joined by a set of "university joiners" who have been induced to enter university by the declining costs. Thus, the pre-expansion wage distribution for BA's consists only of university stayer wages while the post-expansion BA distribution includes both stayers and joiners. We have no way of identifying who is a stayer and who is a joiner in the post-expansion distribution, but by making extreme assumptions on which workers are joiners, we can construct extreme bounds on the median wages for stayers. Comparing those bounded values to the median wage for BA's before the expansion (who, remember, consist only of university stayers), we get bounds on movements in the median university stayer wage. Since the stayers are a consistent group over time, these bounds reflect wage movements for a composition constant group.
We can make one of two extreme assumptions in order to form bounds. In the first, the "joiners" are the lowest wage earners in the post-expansion cohort wage distribution. Thus, the "stayers" wage distribution can be obtained by trimming from the lower tail of the observed wage distribution the proportion by which the set of university educated workers has expanded between the two cohorts (where the proportion is expressed as a proportion of the post-expansion set of BA workers). At the other extreme, the "joiners" would be better workers. But as Gottschalk et al. (2014) show, under a standard Roy model, the "joiners" can be at best as good as the "stayers." If they were better, then they would already have entered the BA sector. Thus, the other bound is the actual observed post-expansion distribution. Performing an analogous exercise with HS workers, we can form bounds on movements in the high school wage and on the college premium. 9 We present detailed results from the bounding exercise in the Supplementary Appendix. The nature of the exercise is such that the bounds are defined as movements relative to a base cohort: in our implementation, the 1965-69 cohort. That cohort entered university age just before the major policy generated university expansion that began in 1988 and had a university graduate proportion of 0.16. The following two 5-year birth cohorts (born 1970-4 and 1975-9) represent the main part of the increase in educational attainment. For the 1975-9 cohort, the proportion graduating university reached 0.34. The bounds on the change in the college wage premium between the 1965-9 and 1975-9 cohorts range between an upper bound of 0 and a lower bound of −0.05. That is, even under extreme assumptions, the movements in the relative wage distribution and the proportion of each cohort who graduated university fit with very small changes in the college wage premium. In the following cohorts, ones over which the proportion with a university degree increased at a much slower rate, the bounds move to around −0.15 for the 1985-9 cohort. Re-examining Figure 2 in light of this finding, it is possible to see a small (though statistically insignificant) decline in the college premium after 2010. To the extent this is true, it would suggest a decline in the premium that occurs after the main increases in the educational supply. We will return to that possibility later in our discussion. But, our overall conclusion from the bounding exercise is that, under this model of ability, selection on unobservables cannot explain why we do not see a large decline in the education wage differential for the cohorts with the largest increase in their education level.

9. In forming the ratio, we use the benchmark case where the upper bound scenarios for the BA and HS workers correspond to one another (i.e. the movements out of the top of the HS distribution become the movements into the bottom of the BA distribution). We can then obtain one bound on the movement in the university-high school wage differential by taking the difference between the upper bound on the movement in the university median and the upper bound on the movement in the high school median. The other bound is the actual change in the median wage ratios.
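The trimming logic behind the bounds on the stayer median can be illustrated with a short sketch. It is a simplified, unweighted version written for illustration only: the function name and arguments are hypothetical, the example wages are invented, and it omits the survey weights and the analogous high school calculation that the full exercise would require. Only the graduate proportions 0.16 and 0.34 come from the text.

```python
from statistics import median


def stayer_median_bounds(post_wages, pre_share, post_share):
    """Bounds on the median wage of university 'stayers' after an expansion.

    post_wages: wages of BA workers in the post-expansion cohort.
    pre_share, post_share: BA proportion of the cohort before and after.
    Returns (lower, upper) bounds on the stayers' median wage.
    """
    if not 0 < pre_share <= post_share:
        raise ValueError("shares must satisfy 0 < pre_share <= post_share")
    wages = sorted(post_wages)
    # Joiners as a proportion of the post-expansion set of BA workers.
    joiner_fraction = (post_share - pre_share) / post_share
    n_trim = int(round(joiner_fraction * len(wages)))
    # Extreme 1: joiners are the lowest earners, so trim the lower tail.
    upper = median(wages[n_trim:])
    # Extreme 2: joiners are as good as stayers, so use the observed distribution.
    lower = median(wages)
    return lower, upper


# Invented wages for ten post-expansion BA workers.
example_wages = [8, 9, 10, 11, 12, 13, 14, 15, 16, 17]
bounds = stayer_median_bounds(example_wages, pre_share=0.16, post_share=0.34)
```

Comparing each bound with the pre-expansion BA median would then give bounds on the movement in the stayer median, which is the composition constant quantity of interest.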

TECHNOLOGICAL LEADERSHIP AND MODELS OF TECHNOLOGICAL CHANGE
To this point, we have established that since the mid-1990s, the U.K. experienced a substantial upgrading in the education level of its workforce but virtually no change in the wage differential between university and high school educated workers. The obvious implication is that the increase in the relative supply of more educated workers was exactly offset by an increase in the relative demand for more educated workers. That type of skill biased demand shift is, of course, the focus of a very large literature in which much of the attention focuses on the role of technological change. We are convinced by papers such as Bresnahan et al. (2002), Caroli and Van Reenen (2001), and Bloom et al. (2014) which argue that the key technological change in recent decades is broader than just the use of computer hardware and software in specific tasks, taking in changes in organizational form that make use of newly invented IT features. For that reason, we will couch our investigations of the impact of technological change in that wider, organizational context.

One can think of the interaction of increased human capital attainment with technological change in terms of three main models. The first is one in which the technological change is exogenous: a new technology is introduced for an unspecified reason and is so dominant in terms of cost savings over existing technologies that it is adopted on a wide scale. Wage differentials are then determined by the interaction of relative demand shifts arising from this technological change (hinging on the skill bias of the technological change) and shifts in supply. Early versions of this model that focused directly on the college wage premium have generally been shown not to fit the data well (Card and DiNardo, 2002; Beaudry and Green, 2005; Acemoglu and Autor, 2011) but the more recent literature on polarizing changes in technology also has this broad form (e.g. Autor and Dorn, 2013).
In all of these models, wage differentials reflect the classic race between technological change and education, with wages in higher skilled groups (defined by education or occupation) rising less if educational policy generates increases in the supplied labour in that group.
The second model type is one in which the invention of new technologies is a function of movements in the relative factor endowments in an economy. Thus, an increase in the education level in an economy provides an incentive for inventors to create new technologies that are relatively intensive in the use of higher educated labour (Acemoglu, 1998; Kiley, 1999). In this case, the relative increase in demand for skills is actually induced by the increase in their supply. Acemoglu (2007) shows that in cases where innovation is created by government funded research or by monopolistic or oligopolistic firms, if the elasticity of substitution between skilled and unskilled labour is high enough then an increase in the relative supply of skilled workers can induce an increase in the relative wage of the skilled workers. In this sense, in the context of this model, attempts to combat inequality by increasing educational attainment could backfire.
The third type of model is one in which a set of technological options already exists and firms choose among them. These endogenous choice models have the structure of a 2 sector by n factor trade model, where the sectors correspond to different technologies, and inherit implications of that model. In particular, if n > 2 and all factors are inelastically supplied then these models can yield the same implications as the induced invention models, i.e., that increases in the relative supply of skill can generate increases in the skilled wage differential (Beaudry and Green, 2003; Beaudry et al., 2010). On the other hand, if all but two of the factors are perfectly elastically supplied (as one might expect if new organizational capital, for example, requires a one-time investment but widely accessible information thereafter) then even large increases in the relative supply of educated labour will leave skill group wages unchanged if the economy remains within a region in which both the new and old technologies are in use (the cone of diversification) (Beaudry and Green, 2003). 10 We believe this class of models fits with the spirit of the literature on decentralization and organizational form which, starting with the seminal contribution of Milgrom and Roberts (1990), often approaches organizational form as something firms optimally choose given existing options (e.g. Caroli and Van Reenen, 2001; Bresnahan et al., 2002). It also follows a line of reasoning dating back to Griliches (1958) which emphasizes endogenous adoption of technologies as the cost of adoption changes (see e.g. Doms et al. (1997) and Borghans and ter Weel (2007)). Deciding which of these models is relevant for an economy is important because, as we have just described, they can have quite different implications for the effect of education policy on inequality. But which model is relevant is potentially context contingent.
There may be technologies that are so superior that the exogenous technical change model is clearly relevant (though we suspect those situations are extremely rare). On the other hand, in economies that are technological leaders in time periods when new technological possibilities are opening up, the induced invention model may be more appropriate. However, for other, follower economies (and even in the technological leaders in periods after the initial invention is complete) the endogenous technological choice models, with firms choosing from an already invented set of options, may be the most relevant.
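The wage-invariance property of the endogenous choice model inside the cone of diversification can be illustrated with a deliberately stripped-down numerical sketch. It assumes two Leontief technologies that use only skilled and unskilled hours, with all other factors treated as perfectly elastically supplied and left out. The unit input requirements and the function name are hypothetical and are not taken from the paper.

```python
def two_technology_equilibrium(S, U, tech_c, tech_d, price=1.0):
    """Wages and outputs when both Leontief technologies are in use.

    tech_c and tech_d are (skilled_hours, unskilled_hours) needed per unit of
    output under each technology. Returns (w_s, w_u, y_c, y_d).
    """
    (a_sc, a_uc), (a_sd, a_ud) = tech_c, tech_d
    det = a_sc * a_ud - a_sd * a_uc
    if det == 0:
        raise ValueError("technologies must differ in skill intensity")
    # Zero-profit conditions, one per technology:
    #   a_s * w_s + a_u * w_u = price
    # Two equations pin down the two wages without reference to S or U.
    w_s = price * (a_ud - a_uc) / det
    w_u = price * (a_sc - a_sd) / det
    # Full-employment conditions determine how output is split:
    #   a_sc * y_c + a_sd * y_d = S,   a_uc * y_c + a_ud * y_d = U
    y_c = (S * a_ud - a_sd * U) / det
    y_d = (a_sc * U - a_uc * S) / det
    if y_c < 0 or y_d < 0:
        raise ValueError("endowments lie outside the cone of diversification")
    return w_s, w_u, y_c, y_d


# Hypothetical unit requirements: C is unskilled-intensive, D is skill-intensive.
TECH_C = (0.1, 1.0)
TECH_D = (0.5, 0.2)

low_skill = two_technology_equilibrium(30.0, 100.0, TECH_C, TECH_D)
high_skill = two_technology_equilibrium(60.0, 100.0, TECH_C, TECH_D)
print(low_skill[:2], high_skill[:2])   # identical wage pairs
print(low_skill[2:], high_skill[2:])   # output shifts toward technology D
```

In this sketch, doubling skilled hours changes only the mix of technologies in use, not the wages, as long as both outputs stay non-negative. Once the skill ratio leaves the range spanned by the two technologies' skill intensities, the function raises an error rather than modelling the corner case.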
Much of the theorizing about these different models has been done with the U.S. economy and U.S. stylized facts in mind. But the U.S. context may be atypical. In particular, we will argue that there are good reasons to believe that the U.S. has been a technological leader in the development of skill biased technologies and their associated organizational forms in recent decades. The U.K. and, potentially, other developed economies are, then, technological followers. In the remainder of the paper, we investigate the claim that the U.K. is a technological follower and that, as a result, endogenous technological choice models best describe the functioning of its economy.

Testing among models of technological change
In this section, we use an empirical specification derived from a relatively general production function with U.K. data to establish the claim that the exogenous technological change and endogenous innovation models do not match patterns in the U.K. labour market in recent decades. Given that, in subsequent sections, we set out a model of endogenous technological choice and investigate its implications, including for the wage specifications derived and implemented in this section.

To investigate the various models, we derive an empirical specification that nests all three models. Here, we provide a brief description of the derivation, with details in Appendix A. We adopt a specification set out in Beaudry and Green (2005) in which there is an aggregate production function given by $F(\theta_{st} S_t, \theta_{ut} U_t, K_t)$, where $S_t$ is skilled labour used in production, $U_t$ is unskilled labour, $K_t$ is capital, and $\theta_{st}$ and $\theta_{ut}$ are skilled and unskilled labour enhancing technological change parameters, respectively. Given the focus of the existing literature and to keep the discussion simple, we assume that technological change is labour enhancing, implying that our specification does not nest factor neutral technical change. We discuss the implications of using a form of factor neutral technical change in the Supplementary Appendix. We will also assume that $F(\cdot,\cdot,\cdot)$ is constant returns to scale. Apart from that, the production function is left purposefully general so that it can be seen as reflecting any of the three models of technological change. Because we are concerned that there could be age effects arising from the movement of different sized cohorts into the education system, we follow Card and Lemieux (2001) in assuming that both skilled and unskilled labour can be written as CES aggregates of labour supplied by different age groups. In these aggregates, $S_{jt}$ is the amount of skilled labour from age group $j$ that is employed in period $t$, $U_{jt}$ is defined analogously, each skill group has its own set of age specific factor augmenting parameters, and $\sigma_a$ is the elasticity of substitution between age groups within a skill group. In our estimation, we use over time variation within geographic sub-regions in the U.K. but in our initial exposition we will focus on a single region, suppressing the regional subscript.
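As a rough illustration of the age-group aggregation, the following sketch computes a standard CES aggregate with elasticity of substitution `sigma_a`. The functional form is the textbook CES, and the `weights` argument stands in for the age specific factor augmenting parameters. Both are assumptions made for illustration, not the paper's exact notation.

```python
def ces_aggregate(hours_by_age, weights, sigma_a):
    """Textbook CES aggregate of age-group hours.

    hours_by_age and weights are equal-length sequences; sigma_a is the
    elasticity of substitution between age groups (must not equal 1).
    """
    if sigma_a == 1:
        raise ValueError("sigma_a = 1 is the Cobb-Douglas limit, not handled here")
    rho = (sigma_a - 1.0) / sigma_a
    inner = sum(w * h ** rho for w, h in zip(weights, hours_by_age))
    return inner ** (1.0 / rho)


hours = [10.0, 20.0, 30.0]     # hypothetical hours for three age groups
unit_weights = [1.0, 1.0, 1.0]

print(ces_aggregate(hours, unit_weights, sigma_a=5.0))
print(ces_aggregate(hours, unit_weights, sigma_a=1e6))  # approaches the simple sum
```

With unit weights, the aggregate approaches the simple sum of hours as `sigma_a` grows, which is why simple sums are a close approximation when age groups are highly substitutable.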
Assuming competitive labour markets and employing a log linear approximation, we obtain log wage equations for skilled and unskilled workers, where $\ln \tilde{S}_{jt} = (\ln S_{jt} - \ln S_t)$ and $\ln \tilde{U}_{jt}$ is defined analogously. Concavity of the production function implies $\beta_1 - \beta_2 \le 0$ and $\alpha_1 + \alpha_2 \ge 0$. The difference between the two log wage expressions gives the relative wage equation, Equation (3), which is a generalization of the specification in Card and Lemieux (2001). In that paper, as in most papers in the skill biased technical change literature, only a relative wage equation is estimated. But there is relevant information in the underlying wage equations as well, and we will focus on the skilled wage equation along with the wage ratio equation. With estimates of those two, the unskilled wage equation is redundant.
In order to take the skilled wage equation and the relative wage equation to the data we need to address the fact that the productivity parameter ratio $\ln(\theta_{st}/\theta_{ut})$ and the $\theta_{ut}$ parameter that enter both equations are unobserved. We address these issues using the approach in Beaudry and Green (2005), capturing general productivity increases with measured TFP and allowing for exogenous skill-biased shifts using a quadratic function of time. This allows for a bit more flexibility than the common linear skill biased technical change assumption, which is obviously nested in this specification.
Based on this, we arrive at an estimable specification for the skilled wage equation similar to the one in Beaudry and Green (2005), whose error term (subscripted $1gjt$) contains approximation error and is assumed to be independent of the right hand side variables. We also obtain a relative wage specification, whose error term (subscripted $2gjt$) again corresponds to approximation error. Note that both equations include a complete set of age band effects. In addition, we have introduced a subscript, $g$, corresponding to geographic region. We include a complete set of region effects ($d_{0g}$) and, so, are using within-region and age group, over-time variation. We construct the wage and employment variables at the region by age group by time level, but it is important to highlight that neither the $TFP_t$ variable components nor $K_t$ have $g$ subscripts, i.e., the relevant values for both are assumed to be at the national level. For $TFP_t$, this reflects an assumption that technologies are available equally in all regions of the country. The same assumption underlies the lack of a $g$ subscript on the time trend coefficients. For $K_t$, the corresponding assumption is that the capital market is national. With capital and technology defined at the national level, we use regional level data to see how regional variation in skill supplies alters sub-national differences in technological adoption and, so, wages. We view differences in regional outcomes within a common capital market as a good scenario in which to examine implications of the relationship between skill supplies and wages. The detailed derivation of these equations in the Appendix provides the direct mapping of the $b$ and $d$ coefficients onto the underlying structural ($\alpha$, $\beta$, and $\sigma_a$) parameters. We also include there a discussion of the conditions under which our specification reduces to the Card and Lemieux (2001) version of the canonical specification, which does not include capital or TFP terms.
The exogenous technical change model and the induced innovation model have similar testable implications for the estimated coefficients in our model. In particular, in the canonical exogenous technical change model, the $d_2$ coefficient equals $-1/\sigma$, where $\sigma$ is the elasticity of substitution between skilled and unskilled labour, and must be negative (Card and Lemieux, 2001). Further, the coefficients on the time variables in the wage ratio equation should imply a positive and significant trend, representing the exogenous technological shift favouring skilled workers. Given our expanded specification, skill biased technical change could, alternatively, show up as a positive and significant coefficient on $\ln TFP_t/(s^u_t + s^s_t)$ in the wage ratio equation, implying that observed technological change favours skilled workers. In the endogenous innovation model, holding technology constant (as we do using the combination of the time trend and TFP), $d_2$ is also equal to $-1/\sigma$ and, so, faces the same restrictions as with the exogenous technological change model (Acemoglu, 2007, equation (18)). Further, if we estimate a specification in which we do not control for technology then the coefficient on $\ln(S_{gt}/U_{gt})$ in the wage ratio equation is an amalgam of the substitution effect and a potentially offsetting innovation effect that would raise the relative wage of skilled workers. The theory implies a connection between the estimated coefficients with and without controls for technology: if the elasticity of substitution estimated when controlling for technology is large then the effect of a shift in relative skill supply on the relative wage should be large and positive. Thus, in order to test the implications of the innovation model, we implement our full specification as well as a specification in which we do not include either time or TFP variables.
For comparison to previous estimates, we also estimate the Card and Lemieux (2001) variant of the canonical model for the wage ratio, i.e., a specification that includes all the variables in (5) except $\ln TFP_t$ and $\ln(K_t/U_t)$. In the appendix, we present further specifications in which we drop $\ln(K_t/U_t)$ and replace it with the log price of capital, $\ln r_t$. Our conclusions are robust to these variations.
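The sign restriction on the relative supply coefficient is easiest to see in the textbook two-factor CES case. The derivation below is a sketch under that assumption only: it drops capital and the age-group structure, so it is the canonical special case rather than the general $F$ used in the paper, and the symbol $\rho$ is introduced here purely for convenience.

```latex
% Canonical two-factor CES production function, with \rho = (\sigma - 1)/\sigma
Y_t = \left[ (\theta_{st} S_t)^{\rho} + (\theta_{ut} U_t)^{\rho} \right]^{1/\rho}

% Competitive wages equal marginal products:
w_{st} = Y_t^{1-\rho}\, \theta_{st}^{\rho}\, S_t^{\rho-1}, \qquad
w_{ut} = Y_t^{1-\rho}\, \theta_{ut}^{\rho}\, U_t^{\rho-1}

% Taking the log of the ratio and using \rho - 1 = -1/\sigma:
\ln\frac{w_{st}}{w_{ut}}
  = \frac{\sigma - 1}{\sigma}\,\ln\frac{\theta_{st}}{\theta_{ut}}
  - \frac{1}{\sigma}\,\ln\frac{S_t}{U_t}
```

In this special case the coefficient on the log relative supply is $-1/\sigma$, which is negative for any finite positive $\sigma$ and approaches zero only as the two skill groups become perfect substitutes.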
We use U.K. LFS data from 1993 to 2016, restricting our sample to 20-59 year olds for whom we observe wages and education. We aggregate to the level of cells defined by 5-year wide age groups and geographic regions, which allows us to control for compositional changes associated with the growing importance of London and other urban centres in our time period. For sample size reasons, we pool the data in 3 year groups. 11 Within each age × region cell, we obtain the median real log wage for BA and for HS workers. We take the difference of those to form our wage gap dependent variable. We measure $S_{gjt}$ and $U_{gjt}$ as the total number of hours worked by BA and HS workers, respectively, who are in region $g$, age-band $j$ and year $t$. We measure $S_{gt}$ and $U_{gt}$ as the simple sums of $S_{gjt}$ and $U_{gjt}$ across age groups within a region. 12 We scale $S_{gjt}$, $U_{gjt}$, $S_{gt}$, and $U_{gt}$ so that the aggregate hours supplied each year, $\sum_g (S_{gt} + U_{gt})$, matches the national time series from the Office for National Statistics (ONS). 13 We obtain aggregate TFP series, capital and aggregate hours from the ONS. 14 It is worth emphasizing that our estimates are based on variation within region × age cells over time. In Figure 4, we show the variation we are using by plotting long differences (between 1993-5 and 2014-6) in $\ln(w_{sgjt}/w_{ugjt})$ against long differences in $\ln(S_{gjt}/U_{gjt})$ for all our regions for one of our age groups (30-34 year olds). Plots for other age groups show the same pattern. In particular, there is considerable variation in changes in $\ln(S_{gjt}/U_{gjt})$ across regions, ranging from an increase of just over 1.1 log points over the 20 years in Northern Ireland to a high of over 1.5 log points in London and with an even spread in between. Matching that is little change in the within region/age group wage ratio, with most of the long term changes in the ratio being under 10% in absolute value. The correlation between the two series is only 0.15 and is not statistically significantly different from zero. When we put $\ln(S_{gt}/U_{gt})$ instead of $\ln(S_{gjt}/U_{gjt})$ on the x-axis, we get a similar pattern of weak correlations comparing only between regions. Thus, our data have considerable over-time variation in changes in employment ratios across regions matched with small changes and little variation in the change in the wage ratio. This core moment in the data is what is driving our estimate of the coefficient on $\log(S_{gt}/U_{gt})$ in Table 1.
11. The three year groups are 1993-5, 1996-8, etc. We use the LFS "Regions of Usual Residence" as our definition of geographic regions. There are 19 such regions including, for example, London, Rest of South East, Greater Manchester, and the Western Midlands. These regions are consistently defined over the whole of our sample period. Sample size issues related to the reporting of wages prevent us from using a more detailed geography such as the one used in the organizational forms exercise later in the paper. We treat our production function as being at the level of the region, implying that all of our variables now have a $g$, for geographic region, subscript. The only exceptions are the capital and TFP variables. We assume that both capital and technological ideas flow freely across the regions in the country, implying that the country-aggregate levels of those variables are relevant.
12. This deviates from the theory in which the aggregates are functions of $\sigma_a$ and the age specific factor augmenting parameters. We do this for simplicity and transparency so that we aren't forcing this element of our specification on the data. Since our estimates of $\sigma_a$ imply very high substitutability across age groups, the results change very little when using the CES aggregates with estimated parameters rather than simple sums.
13. The simple sum of hours in our sample every year would deviate from the true aggregate hours because education is missing to varying degrees over time and our sample selects 20-59 year olds only.
14. The TFP series is the annual series of multi-factor productivity from ONS' release "Multi-factor productivity estimates: Experimental estimates to Q2 2017." Our capital measure is the annual series called "Contribution of capital services to GVA growth (percentage points)" in the same ONS release. Aggregate hours is "labour hours" from the same ONS release. This ONS release can be found at https://www.ons.gov.uk/economy/economicoutputandproductivity/productivitymeasures/articles/multifactorproductivityestimates/experimentalestimatestoquarter2apriltojune2017.
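The cell construction and the long differences plotted in Figure 4 can be sketched as follows. This is a minimal illustration, assuming worker-level records with hypothetical field names (`region`, `age_band`, `period`, `degree`, `real_wage`, `hours`). It omits the rescaling of hours to the ONS national totals, which cancels out of the within-cell BA/HS ratio as long as both groups are scaled by a common factor.

```python
import math
from collections import defaultdict
from statistics import median


def build_cells(records):
    """Collapse worker records into region x age band x period cells.

    Each cell holds the BA-HS gap in median log real wages and the log ratio
    of total BA hours to total HS hours.
    """
    log_wages = defaultdict(list)
    hours = defaultdict(float)
    for r in records:
        key = (r["region"], r["age_band"], r["period"], r["degree"])
        log_wages[key].append(math.log(r["real_wage"]))
        hours[key] += r["hours"]

    cells = {}
    for (region, age_band, period, degree) in list(log_wages):
        if degree != "BA":
            continue
        ba = (region, age_band, period, "BA")
        hs = (region, age_band, period, "HS")
        if hs not in log_wages:
            continue  # a cell needs both education groups
        cells[(region, age_band, period)] = {
            "wage_gap": median(log_wages[ba]) - median(log_wages[hs]),
            "log_rel_supply": math.log(hours[ba] / hours[hs]),
        }
    return cells


def long_difference(cells, region, age_band, first, last):
    """Change in each cell variable between the first and last period."""
    start = cells[(region, age_band, first)]
    end = cells[(region, age_band, last)]
    return {name: end[name] - start[name] for name in start}
```

Applying `long_difference` to every region for a given age band yields the pairs of changes that a scatter plot like Figure 4 would display.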
We present the results from our specifications in Table 1. The first two columns contain estimates of the skilled wage equation and the wage ratio equation by OLS. The second two columns contain 2SLS estimates aimed at addressing the potential endogeneity of the employment levels of the inputs. We instrument for $\ln(S_{gt}/U_{gt})$ by using the education reform. In particular, we form a Bartik style instrument in which we interact the proportion of the population in a region in 1993 (the start of our data) who were born in 5-year-wide birth cohorts with the growth in the proportion of that cohort who obtained a BA at the national level. The idea behind this instrument is that regions with a higher proportion in the cohorts that were most directly affected by the education reforms (those born between 1970 and 1974 and between 1975 and 1979) would face a stronger increase in the relative supply of skilled labour for reasons that have to do with historical fertility patterns that are plausibly independent of later education trends. We also construct an instrument as the interaction of the proportion of the parental generation for the 1970-9 birth cohorts who themselves had a BA with the national growth rate in the proportion of workers with a BA. This is intended to capture the idea that children in locations with more educated parents were more likely to take advantage of the education reforms. Both instruments are strong predictors of $\ln(S_{gt}/U_{gt})$ in the first stage and do not suffer from weak instrument issues by any standard test. Using similar logic to the second instrument, for each birth cohort in each region, we construct the proportion of the "parental" cohort (the one born 25 years earlier) with a BA. We interact that proportion with the growth rate in the proportion with a BA for the specific child's cohort at the national level. Here too, the idea is that the growth in the BA share for an age group in a region will be related to the education level of the parents for that age group combined with the general increase in education level for their cohort. We use this as an instrument for $\ln(S_{gjt}/U_{gjt})$ but have to restrict our attention to 20 to 44 year olds because the first stage is weak when we include older individuals, since there is little variation in the proportion of the parents' generations with a BA for the older age groups. Finally, we instrument for $\ln(K_t/U_t)$ using the interest rate. The theory underlying our specifications implies several restrictions. The results reported in Table 1 have not imposed these restrictions; imposing them would make little difference to the key estimates and we will show them in Appendix A.
Notes to Table 1: All specifications include complete sets of age-band and region dummies. *** P < 0.01, ** P < 0.05, * P < 0.1.
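The first, Bartik style instrument can be sketched in a few lines. The text describes interacting each region's 1993 cohort shares with national growth in BA attainment for that cohort. Summing those interactions across cohorts into a single regional exposure measure is the usual shift-share form and is an assumption here, as are the function name and the illustrative numbers.

```python
def shift_share_instrument(cohort_shares_1993, national_ba_growth):
    """Regional exposure to the national expansion in BA attainment.

    cohort_shares_1993: {region: {birth_cohort: population share in 1993}}
    national_ba_growth: {birth_cohort: national growth in BA proportion}
    Returns {region: sum over cohorts of share * national growth}.
    """
    instrument = {}
    for region, shares in cohort_shares_1993.items():
        instrument[region] = sum(
            share * national_ba_growth.get(cohort, 0.0)
            for cohort, share in shares.items()
        )
    return instrument


# Hypothetical inputs: region A has twice the population share of region B
# in the two cohorts most affected by the reforms.
shares = {
    "A": {"1970-74": 0.10, "1975-79": 0.10},
    "B": {"1970-74": 0.05, "1975-79": 0.05},
}
growth = {"1970-74": 0.5, "1975-79": 1.0}
print(shift_share_instrument(shares, growth))
```

The identifying variation comes from the predetermined regional shares, since the national growth rates are common to all regions.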

Assessing the exogenous and endogenous skill biased technological change models
The estimates from our wage specifications do not fit with either the canonical exogenous SBTC model or the induced skill biased innovation model. The first strike against these models is the lack of any substantial effects of the skill supplies on the wage ratio. The estimated coefficients on logS t /U t in column (1) (OLS) and column (3) Katz and Murphy (1992)). Thus, even in a generous interpretation, the coefficient would imply very high and possibly perfect substitutability between skilled and unskilled labour. This is very problematic for both the exogenous and induced innovation models since changes in relative demand created by either exogenous or endogenous technical change cannot move relative wages if the skill groups are perfect substitutes. As a side point, the age-specific skill supply coefficient is also close to zero (−0.038 in the OLS, and the wrong sign in the IV), implying a huge substitution elasticity between age groups (above 25). By comparison, Card and Lemieux (2001) estimated this elasticity to be in the [4,6] range.
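The mapping from an estimated supply coefficient to an implied elasticity of substitution follows from the coefficient being minus the inverse of the elasticity. A small sketch, with a hypothetical helper name, makes the arithmetic behind the age-group comparison explicit.

```python
def implied_elasticity(coefficient):
    """Elasticity implied by a relative-supply coefficient equal to -1/elasticity.

    Returns None when the coefficient is zero or positive, since a wrong-signed
    coefficient has no finite positive elasticity counterpart.
    """
    if coefficient >= 0:
        return None
    return -1.0 / coefficient


# OLS age-specific supply coefficient reported in the text.
print(round(implied_elasticity(-0.038), 1))   # 26.3, i.e. above 25

# Coefficients that would correspond to the [4, 6] range in Card and Lemieux (2001).
print(-1.0 / 4, round(-1.0 / 6, 3))           # -0.25 -0.167
```

A coefficient as small as −0.038 is therefore several times closer to zero than the −0.17 to −0.25 range that elasticities of 4 to 6 would require.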
The second strike against the exogenous SBTC model is found in the coefficients on the time and time squared variables in the wage ratio equations. Recall that these are intended to capture the path of the ongoing skill biased technological changes. The IV estimates imply a negative trend while the OLS estimates imply technological change effects that are small and move from positive in early years to negative in later years. These results are robust to different specifications. Such a pattern, in which technical change is small and either against skilled labour from the outset (IV) or turning against it in later years (OLS) does not fit with the exogenous technical change model.
These conclusions are reinforced in the last column of the table, which contains estimates from the implementation of the classic Card and Lemieux (2001) specification that includes only the linear time trend, the overall skill supply ratio, the skill supply ratio at the age group level, and a complete set of age and region effects. From this specification, we can see that our estimates of the skill supply and time effects in our main specification are not being determined by the inclusion of the TFP and capital variables. The estimated coefficients on both the time trend and the relative supply variables are statistically insignificant and of the wrong sign. The coefficient on the age group specific relative supplies is also small and statistically insignificant. At best, using the extremes of the confidence intervals, these estimates imply that skilled and unskilled labour are close to perfect substitutes, different age groups are close to perfect substitutes, and there is little or no ongoing skill biased technical change. We view these data patterns as a repudiation of the exogenous skill biased technical change model for the U.K. in the period after 1992. 15
The estimated TFP effects provide further evidence against both the exogenous and endogenous skill biased technological change models. The TFP variable has a small and statistically insignificant effect on the wage ratio in column 3 of the table. Combined with the positive and statistically significant effect of TFP on the skilled wage in the estimates in column 4, the implication is that there is technological growth in this period but that skilled and unskilled workers benefit from it to an equal degree. Thus, the data do not fit with technology, as captured by TFP, being skill biased: the core feature of both of the first two technological change models. 16
All of these implications apply to both the exogenous and endogenous skill biased technological change models, but the endogenous technological change model has added implications. In particular, if we do not control for technological change then the impact of changes in the skill ratio on the wage ratio no longer has a determinate sign. The negative substitution effect that is estimated when controlling for technological change is combined with an effect on innovation that can generate offsetting, skill biased demand shifts. Under some circumstances, the latter effect dominates and the estimated coefficient in the relative wage regression without technology controls can be positive. In column 5, we present estimates of the wage ratio equation without the TFP and trend variables. We also drop $\ln(K_t/U_t)$ in order to obtain a specification similar to what is implied in Acemoglu (2007). The estimated coefficient on the relative skill supply variable is close to zero and statistically insignificant. For this to be the case, $\sigma$, the elasticity of substitution between skilled and unskilled labour, should be near 2 in the endogenous innovation models in Acemoglu (2007) and Acemoglu and Zilibotti (2001). Instead, our estimates in columns 1 and 3 have the opposite sign and even the lower bound of the estimates indicates much larger $\sigma$ values. The lower bounds of estimates in columns 1 and 3 would imply the effect of the relative skill supply in column 5 should be much larger. Either way, the data patterns are not consistent with the implications of the endogenous innovation model.
15. These patterns are robust to imposing the theoretical restrictions on coefficients in equations (4) and (A.6) and to excluding London, out of concerns that it is big enough to be driving the results on its own.
16. A referee pointed out that problems with the exogenous and endogenous SBTC models can also be demonstrated in a calibration exercise. In the Supplementary Appendix, we show that if one assumes typical values from the U.S. literature for $\sigma$ and $\sigma_a$ then the combination of the implied path for $\ln(\theta_{st}/\theta_{ut})$ and observed TFP implies a very strongly declining path for $\theta_{ut}$ that is unrealistic.
One possible response to our concerns about the model of exogenous SBTC of the type embodied in Card and Lemieux (2001) is that it is an older version of these models which has been supplanted by models of technological change and polarization. This has happened, in part, because other papers have similarly concluded that the exogenous skill biased technical change model does not fit even the U.S. data well (e.g. Card and DiNardo, 2002; Beaudry and Green, 2005; Acemoglu and Autor, 2011). To look further into the role of polarization in the U.K. wage and employment structure, Table 2 presents, for 30-34 year olds, average real wages (in the first column of the first panel) and proportions of employees (in the first column of the second panel) in each of 9 one digit occupations in 1993. The occupations are ranked by their average real wage. In the second columns in each panel we present the change in either wages or proportions between 1993 and 2016.
The second column for employment proportions shows an approximate U-shaped pattern, with growth in employment shares in the top three occupations, declines in the middle (largely routine) occupations and growth in personal services. The relationship is not perfect since the lowest-paid occupation ("elementary") shows a decline, but the pattern is broadly one of polarization. However, when we hold the education composition constant between the cohorts (in the last column), there are small declines in employment in the top three occupation groups and essentially no change in processing and skilled trades in the middle. There is some added evidence of relative growth at the bottom of the distribution. The main conclusion, however, is that the right branch of the U-shape in employment growth in the U.K. is entirely attributable to the education shifts. That is, occupation shifts appear to us to be of secondary importance relative to education shifts in determining the changes in the wage structure in the U.K. Given that, we do not believe that polarization/task based versions of the exogenous SBTC theory provide a useful lens through which to understand the specific wage and employment patterns we are examining.
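One common way to hold education composition constant is to reweight the later period's occupation shares within each education group by the base period's education shares. The text does not spell out the exact procedure used for Table 2, so the sketch below is an assumed, generic version of such a counterfactual with made-up numbers.

```python
from collections import defaultdict


def composition_constant_shares(base_educ_shares, later_occ_given_educ):
    """Counterfactual occupation shares with education fixed at base-period levels.

    base_educ_shares: {education group: share of workers in the base period}
    later_occ_given_educ: {education group: {occupation: share within group}}
    """
    shares = defaultdict(float)
    for educ, weight in base_educ_shares.items():
        for occupation, within_share in later_occ_given_educ[educ].items():
            shares[occupation] += weight * within_share
    return dict(shares)


# Hypothetical example with two education groups and two occupations.
base_education = {"BA": 0.2, "HS": 0.8}
later_conditional = {
    "BA": {"professional": 0.7, "routine": 0.3},
    "HS": {"professional": 0.1, "routine": 0.9},
}
print(composition_constant_shares(base_education, later_conditional))
```

Comparing these counterfactual shares with the base period's actual shares isolates the within-education change in the occupation structure from the part driven by rising educational attainment.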
Taken together, we view the patterns of changes in wage levels, wage ratios, skill ratios, TFP, and capital for the U.K. in the last two decades as firmly rejecting both the exogenous skill biased technological change model and the endogenous skill biased innovation model for the U.K. for this period. Our view is that the endogenous innovation model is better suited to explaining movements in economies that are technological leaders where the innovation is taking place and that this does not describe the U.K. in this period. We elaborate on this claim in the next section.

Induced technological change and technological leadership
Induced technological innovation models focus on the expansion of the technological frontier. As such, they are about countries which are the technological leaders and would seem to provide a better explanation for movements in leader than follower economies. Working within the induced innovation model, the country that is most likely to be the leader in skill biased technological innovation will be the one with the highest share of skilled workers. A high share provides an incentive for innovator firms to invent machines or forms of organization that complement skills. In 1980, on the cusp of the computer revolution, the U.S. was the leading developed economy in terms of education level. In that year, 22% of the U.S. population aged 25-64 had a tertiary education, which was by far the highest in the OECD (Lee and Lee, 2016). 17 Thus, incentives for innovators to generate human capital intensive technologies would have been highest in the U.S. Moreover, the U.S. has had the highest ratio of investment in ICT (Information and Communication Technology) capital to total non-residential gross fixed capital throughout the 1985-2010 period (OECD, 2017). The idea that the U.S. is the innovation leader is also supported by evidence in Bloom et al. (2012) showing that U.S. multinationals use a more decentralized structure relative to both domestic firms and multinationals from other countries even when all are observed operating in the same economy (the U.K.).
On the other side, there is also good reason to believe that the U.K. is a follower in the area of skill biased technologies and their associated organizational forms. 18 Certainly, the U.K. was well behind the U.S. in educational attainment at the beginning of the computer revolution. This can perhaps be most clearly seen in data organized by birth cohort. For the cohort born between 1955 and 1959 in the U.K. (and who would have turned 25 in the early 1980s, at the outset of the computer revolution), 12% held a university degree by age 30 compared to 24% for the same cohort in the U.S. 19 For the cohort born a decade later, the numbers were 16% for the U.K. and 27% for the U.S., so the U.K. was still a laggard. Thus, viewed through the lens of the theory of induced invention, we would not expect the U.K. to have been a leader in skill-biased innovation. However, because of the educational reforms described earlier, by the cohort born between 1975 and 1979 (who turned 25 in the early 2000s), the U.K. had surpassed the U.S. with 34% attaining a university degree in the U.K. compared to 32% in the U.S. That increase in the educational attainment of new labour market entrants in the U.K. could have provided the conditions for firms to adopt the technologies previously developed in the U.S. Interestingly, the proportion of investment that was in ICT capital shot up in this decade in the U.K., approximately doubling at the same time the proportion of new labour market entrants with a university education also doubled (OECD, 2017). 20 Further, the evidence in Bloom et al. (2012) about use of decentralized organizational forms also suggests that U.K. firms were following rather than leading. They argue that U.K. firms were laggards in adopting decentralized structures because of regulation based inflexibilities. We offer an alternative explanation: that at the time of the development of the new IT related structures, the lower education level in the U.K. implied it was less profitable for U.K. firms to adopt the new approach. Then, as the U.K. education level increased, the U.K. underwent a technological transformation. We think that these patterns fit most naturally with models of technological choice and we turn to a model of this form in the next section.
17. The next highest were Canada at 18% and Australia and New Zealand at about 15%, with the remainder of the OECD decidedly lower.
18. Classifying the U.K. as a technological follower could imply that we can analyse its wage patterns as the equivalent of a Southern economy in the analysis in Acemoglu and Zilibotti (2001). In their discussion, Northern economies innovate in response to relative skill changes in their workforces as described earlier. Southern countries, in contrast, do not innovate and take the technological level invented in the North as given. However, with no innovation response channel in the South, increases in the relative supply of skill in their workforces will necessarily induce a decline in the skilled-unskilled wage ratio. As we have seen, this does not fit with the wage patterns in the U.K. in recent decades.
19. These figures are computed from the U.K. LFS for the years 1992-2015 and the Outgoing Rotation Group sample from the U.S. Current Population Survey for the same years.

A MODEL OF EDUCATIONAL CHANGES, TECHNOLOGICAL CHANGE, AND DECENTRALIZATION
In this section, we set out a model of technological choice in a situation where newly invented technologies involve decentralized organizational forms made possible by IT innovations. We derive implications of the model at the macro level that we compare to our production function estimates and at the micro level that we investigate with workplace data in the following sections. The general framework we consider is one in which firms can choose to produce a single output either with a centralized (C) technology or a decentralized (D) technology. Having a single output is intended to emphasize the nature of these technologies as general purpose technologies that could be applied to the production of any product. Following Rosen (1978) and Borghans and ter Weel (2006), we will characterize production in engineering terms as having a Leontief form in which a continuum of tasks, $x$, defined on the unit interval is required to produce an output. 21 The amount of each task required to produce one unit of output is given by the continuous function $\alpha(x)$, $x \in [0,1]$. The tasks are performed by two types of workers: U (unskilled) and S (skilled). Total hours of work are inelastically supplied by each type of worker. Workers of each type are described by capacity functions, $\tau_l(x)$, which are continuous functions defined on $[0,1]$ determining the amount of time a worker of type $l = U, S$ needs to produce the amount of task $x$ required for one unit of output. Further, we assume that tasks are ordered from least to most complex and that S workers have comparative advantage in more complex tasks, i.e., $\tau_S(x)/\tau_U(x)$ is decreasing in $x$. Rosen (1978) shows that based on such a specification, one can derive a production function defined over $n_s$ and $n_u$ (the number of hours of S and U labour used, respectively) in which the firm allocates a given amount of S and U to each task in order to maximize output. In particular, firms will allocate skill groups according to their comparative advantage in the sense that there will be a task $\rho$ such that all tasks $0 \le x \le \rho$ are assigned to U workers and, conversely, all tasks $\rho < x \le 1$ are assigned to S workers. Further, $\rho$ is declining in $n_s/n_u$. Thus, if the relative number of S workers is small then they will only be assigned to the most complex tasks and as that relative number grows, they will be moved progressively further down the list of tasks ranked by complexity. The marginal rate of technical substitution between S and U equals $\tau_S(\rho(n_u,n_s))/\tau_U(\rho(n_u,n_s))$, where we have written $\rho$ as a function of $n_u$ and $n_s$. Thus, profit maximizing firms will hire numbers of hours of U and S labour to equate the marginal rate of technical substitution to the wage ratio, $w_S/w_U$ (where $w_S$ and $w_U$ are the skilled and unskilled hourly wages), allocating those hours optimally according to comparative advantage over the tasks required to produce. The result is a production function that reflects the efficiencies from taking account of the comparative advantage of the two types of workers and which is, itself, not necessarily Leontief in form. In this sense, the ultimate production function reflects more than just the engineering "recipes" since it includes the optimal allocation of workers across the task combinations specified in the recipes.
20. The proportion of total non-residential fixed capital investment in ICT increased by 88% in the U.K. between 1990 and 2000. Only Finland and South Korea had faster growth in this proportion in this decade. In comparison, the proportion grew by 37% in the U.S.
21. This general form for production has become somewhat common in models of technological change, tasks, and polarization. For example, a variant of it is used in Acemoglu and Zilibotti (2001), and Acemoglu and Autor (2011) use this approach to provide a framework for interpreting existing research on tasks and technological change. Our model differs in the way we introduce decentralization and in our assumption that firms can choose between two such technologies.
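The allocation rule described above can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: the linear capacity functions `tau_u` and `tau_s` are hypothetical choices that merely satisfy the comparative advantage assumption, and the threshold task is found as the point where the hours of each group support the same output.

```python
# Hypothetical capacity functions (assumptions for illustration only):
# hours a worker needs to supply task x for one unit of output.
def tau_u(x):
    return 1.0 + 2.0 * x   # unskilled workers slow down quickly on complex tasks

def tau_s(x):
    return 1.0 + 0.5 * x   # skilled workers slow down less, so tau_s/tau_u falls in x

def integral(f, a, b, steps=1000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    if b <= a:
        return 0.0
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

def threshold_task(n_u, n_s, tol=1e-10):
    """Bisect for rho such that U hours on [0, rho] and S hours on (rho, 1]
    support the same output: n_u / int_0^rho tau_u = n_s / int_rho^1 tau_s."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        rho = 0.5 * (lo + hi)
        # Cross-multiplied form of the equality above (avoids dividing by zero).
        gap = n_u * integral(tau_s, rho, 1.0) - n_s * integral(tau_u, 0.0, rho)
        if gap > 0:      # U hours could support more output: hand them more tasks
            lo = rho
        else:            # S hours could support more output: move the threshold down
            hi = rho
    return 0.5 * (lo + hi)

rho_few_skilled = threshold_task(n_u=100.0, n_s=20.0)
rho_many_skilled = threshold_task(n_u=100.0, n_s=100.0)
```

Under these assumed functions, `rho_many_skilled` is below `rho_few_skilled`: as the relative number of S hours grows, skilled workers are assigned further down the task ranking, which is the sense in which $\rho$ declines in $n_s/n_u$.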
As Rosen (1978) demonstrates, and as we draw in Figure 1, in the case with two types of workers (our case), the unit output isoquant intercepts both axes. The intercept on the $N_u$ (number of unskilled workers) axis equals $\int_0^1 \tau_U(x)\,dx$. As we move away from that intercept to the left, we begin to introduce S workers, replacing the U workers in the most complex tasks. Thus, the slope of the isoquant is given by $\tau_S(\rho(n_u,n_s))/\tau_U(\rho(n_u,n_s))$ and comparative advantage dictates the standard convex shape. The $N_s$ intercept is given by $\int_0^1 \tau_S(x)\,dx$. We will consider an economy with two possible "recipes" or technological forms. The first is centralized and takes the form as set out above, where we will now write the technological requirements function as $\alpha^C(x)$ and the amount of time a worker needs to complete the number of tasks needed for a unit of output as $\tau^C_l(x)$. In order to match patterns in the data, we delineate management tasks from other tasks. In the centralized technology, management tasks are necessary in order to co-ordinate the other tasks, and the producers of the other tasks just focus on production of their part of the process, leaving communication and co-ordination to the managers. We will arbitrarily denote tasks on the interval $[\theta,1]$ as management tasks. To keep the exposition simple, we will assume that the $\alpha$ and $\tau$ functions are continuous from above and below at $\theta$.
The alternative technological form is decentralized. Caroli and Van Reenen (2001) describe modern organizational forms as being "delayered" with "some decision-making being transferred downstream." Multi-tasking is also an important feature of this organizational form with the benefits that the firm becomes more flexible and managers have to spend less time monitoring and co-ordinating workers (Bloom et al., 2014). Thus, rather than having workers performing physical tasks without regard to others and having a manager who co-ordinates the outcome, in a decentralized form, workers both produce and co-ordinate with other task producers. As a result, less of the pure management task is needed. All of this is made possible by (i.e. is complementary with) IT technological change, which reduced the cost of diffuse information transfer.
We capture the differences in the decentralized form relative to the centralized form, first, by assuming that there is a lower requirement for the pure management tasks in the new form:
$$\alpha^D(x) = \lambda\,\alpha^C(x), \quad \forall x \ge \theta,$$
where the D superscript denotes the decentralized technology and $\lambda < 1$. For simplicity, we will assume that the requirements for the other tasks remain the same, i.e., $\alpha^D(x) = \alpha^C(x)$, $\forall x < \theta$.
Following much of the literature on technical change and the labour market, we also assume that skilled workers are better at working with the new organizational form (Caroli and Van Reenen, 2001; Bresnahan et al., 2002). We represent this by assuming that skilled workers are perfect multi-taskers and can perform each of the non-management tasks in the same amount of time as before, performing the new, associated communications while they are doing them without extra effort (thanks to IT). For unskilled workers, performing each non-managerial task now requires more time since working with the new IT is more difficult for them. Further, skilled workers are able to take advantage of the new technology in management tasks while unskilled workers are not. Thus,
$$\tau^D_S(x) = \tau^C_S(x), \qquad \tau^D_U(x) = \gamma\,\tau^C_U(x), \qquad \forall x < \theta,$$
and
$$\tau^D_S(x) = \lambda\,\tau^C_S(x), \qquad \tau^D_U(x) = \tau^C_U(x), \qquad \forall x \ge \theta,$$
with $\gamma > 1$. We view this specification as capturing the notion of decentralization in papers such as Lindbeck and Snower (1996), Caroli and Van Reenen (2001), Bresnahan et al. (2002), and Bloom et al. (2012): that it is an organizational form in which decision making and communications are spread throughout the firm rather than being done by a small cadre of managers. We could allow for decentralization forms in which communication and decision making are differentially allocated across tasks but elect for the simpler form in which they are essentially allocated evenly across the non-manager tasks for expository clarity.
The literature emphasizes that decentralization has been enabled by the advent of IT. Much of the recent work on IT and the labour market also emphasizes impacts of the new technology in replacing routine tasks that tend to lie in the middle of the wage distribution. Following Borghans and ter Weel (2006) and Acemoglu and Autor (2011), we can model this effect by having the α values in middle tasks substantially reduced under the new (D) technology. Essentially, the idea is that IT capital performs those tasks and, thus, less labour is required in them. As described in Acemoglu and Autor (2011), the result will be a polarization in employment, with relatively more employment in low and high complexity jobs compared to those in the middle. However, this will not alter our main points about movements in educational wage differentials set out below. For that reason, we will not explicitly include the reductions in middle α's in our analysis for simplicity.
Given this setup, if there were only U workers in the economy then all firms would use the C technology since it would be cheaper at any given unskilled wage. Conversely, if there were only S workers in the economy, firms would use only the D technology. But we will start by assuming that the endowment of S and U workers in the economy is such that both technologies are in use (returning to the conditions under which that is true momentarily). We also assume that these are general purpose technologies that can be used for producing any good. Thus, to simplify, we assume both are used to produce a good which is the numeraire. Assuming free entry of firms and that output is the numeraire with a price of 1, that implies two zero profit conditions:
$$1 = w_U \int_0^{\rho_C} \tau^C_U(x)\,dx + w_S \int_{\rho_C}^{1} \tau^C_S(x)\,dx \qquad (9)$$
$$1 = w_U \int_0^{\rho_D} \tau^D_U(x)\,dx + w_S \int_{\rho_D}^{1} \tau^D_S(x)\,dx \qquad (10)$$
where $w_U$ is the unskilled wage, $w_S$ is the skilled wage, $\rho_C$ is the task dividing the U from the S tasks for technology C and $\rho_D$ is the threshold task for the D technology.
Several key points follow from these two equations. First, together they imply a factor price invariance result as in standard trade theory. Because $\rho_C$ and $\rho_D$ are determined by the equality of the wage ratio to the marginal rate of technical substitution (MRTS) in profit maximizing firms and the MRTS is given by $\tau_S(\rho)/\tau_U(\rho)$ (i.e. is technologically determined), everything on the right hand side of both equations can be written as functions of $w_U$ and $w_S$. That, combined with the assumption that these are general purpose technologies and so are producing the same good with the same price, implies that we have two equations in two unknowns ($w_S$ and $w_U$). We show the solution diagrammatically in Diagram D1. The figure shows the unit output isoquants for the two technologies. The isoquant for the centralized technology intersects the number of unskilled workers ($N_u$) axis at $n^C_{u0} = \int_0^1 \tau^C_U(x)\,dx$, i.e., the total number of hours to produce one unit if only unskilled workers are being used. Similarly, its $N_s$ axis intercept is $n^C_{s0} = \int_0^1 \tau^C_S(x)\,dx$. The unit isoquant for the decentralized technology has a larger $N_u$ intercept because of our assumption that unskilled workers take longer to do non-managerial tasks because of the requirement to communicate as well as produce but get no advantage in terms of the time they require to perform management tasks. In contrast, under the decentralized technology, skilled workers require no extra time to do non-production tasks and can take advantage of IT to spend less time on managerial tasks. The result is an isoquant with a larger $N_u$ intercept, a smaller $N_s$ intercept and a lower slope at all values of $x$ than the C isoquant. 22 Given the continuity assumptions and the comparative advantage assumption, the isoquants will cross once. That, in turn, implies that there will be a single unit cost line that is just tangent to the two isoquants, i.e., a single pair of $w_S$ and $w_U$ values at which both technologies are in operation.
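To make the "two equations in two unknowns" argument concrete, the sketch below solves a pair of zero profit conditions numerically. It is an illustration under stated assumptions rather than the paper's calibration: the capacity functions and the values of `THETA`, `GAMMA`, and `LAM` are hypothetical, and the D technology is assumed to scale unskilled time up by `GAMMA` on non-management tasks and skilled time down by `LAM` on management tasks, with `LAM * GAMMA = 1`.

```python
THETA, GAMMA, LAM = 0.7, 1.25, 0.8   # hypothetical values with LAM * GAMMA = 1

def tau_c_u(x):
    return 1.0 + 2.0 * x             # assumed unskilled time, C technology

def tau_c_s(x):
    return 1.0 + 0.5 * x             # assumed skilled time, C technology

def tau_d_u(x):
    # Unskilled: slower on non-management tasks, unchanged on management tasks.
    return GAMMA * tau_c_u(x) if x < THETA else tau_c_u(x)

def tau_d_s(x):
    # Skilled: unchanged on non-management tasks, faster on management tasks.
    return tau_c_s(x) if x < THETA else LAM * tau_c_s(x)

def unit_cost(w_u, w_s, tau_u, tau_s, steps=2000):
    """Cost of one unit of output when each task is given to the cheaper
    worker type (midpoint rule over the unit interval of tasks)."""
    h = 1.0 / steps
    return h * sum(min(w_u * tau_u((i + 0.5) * h), w_s * tau_s((i + 0.5) * h))
                   for i in range(steps))

def solve_wages(lo=0.01, hi=100.0, iters=60):
    """Bisect on r = w_s / w_u until both technologies have equal unit cost,
    then scale the wages so that unit cost equals the output price of 1."""
    def gap(r):
        return (unit_cost(1.0, r, tau_c_u, tau_c_s)
                - unit_cost(1.0, r, tau_d_u, tau_d_s))
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0:   # D is cheaper at this ratio, so raise the skilled wage
            lo = mid
        else:              # C is cheaper at this ratio, so lower the skilled wage
            hi = mid
    r = 0.5 * (lo + hi)
    w_u = 1.0 / unit_cost(1.0, r, tau_c_u, tau_c_s)
    return w_u, r * w_u

w_u, w_s = solve_wages()
```

Nothing in `solve_wages` refers to the economy's endowments of S and U hours: the wage pair is pinned down by the two technologies alone, which is the factor price invariance property discussed in the text.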

Diagram D1: Wage Setting with Two Technologies
Diagram D1 is, of course, a standard trade diagram with two technologies instead of two sectors, and the same conclusions follow here as in the simple trade case. Our assumptions about the two technologies imply that the C technology will be relatively U intensive and, in an equilibrium in which both technologies are used, $n^{C*}_u$ and $n^{C*}_s$ hours of unskilled and skilled work, respectively, will be used to produce a unit of the output with this technology. Similarly, $n^{D*}_u$ and $n^{D*}_s$ hours of unskilled and skilled work will be used with the D technology. The rays from the origin with slopes $n^{C*}_s/n^{C*}_u$ and $n^{D*}_s/n^{D*}_u$ form the boundaries of the cone of diversification (the shaded area in Diagram D1). As long as the ratio of skilled to unskilled hours in the economy falls within that cone, both technologies will be in use. If, instead, $N_s/N_u < n^{C*}_s/n^{C*}_u$, then only the C technology will be used. This is simple to see in the figure since on rays with lower slope than $n^{C*}_s/n^{C*}_u$, the cost line that is just tangent to the C isoquant will lie below the D isoquant, implying that it is less costly to produce just with C. Conversely, if $N_s/N_u > n^{D*}_s/n^{D*}_u$, then only the D technology will be used.
22. To make the exposition simpler, we assume that $\lambda \cdot \gamma = 1$. This implies that the isoquant is smooth at task $\theta$. Without it, there would be a kink in the isoquant that would complicate the exposition but not the ultimate conclusions.
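The three regimes implied by the cone of diversification can be written as a small classifier. This is a sketch with hypothetical inputs: `c_mix` and `d_mix` stand for the per-unit hours (unskilled, skilled) used by each technology at the equilibrium wages, and the numbers below are invented for illustration.

```python
def technologies_in_use(n_s_total, n_u_total, c_mix, d_mix):
    """Return which technologies operate given economy-wide hours.

    c_mix and d_mix are (unskilled_hours, skilled_hours) per unit of output
    at the equilibrium wages; they define the two rays bounding the cone."""
    ratio = n_s_total / n_u_total
    lower = c_mix[1] / c_mix[0]   # slope of the ray through the C tangency point
    upper = d_mix[1] / d_mix[0]   # slope of the ray through the D tangency point
    if ratio < lower:
        return "C only"
    if ratio > upper:
        return "D only"
    return "C and D"

# Hypothetical input mixes: C is U intensive, D is S intensive.
C_MIX = (1.0, 0.5)
D_MIX = (0.4, 1.0)
regimes = [technologies_in_use(n_s, 100.0, C_MIX, D_MIX)
           for n_s in (30.0, 80.0, 300.0)]
```

With these assumed mixes the cone runs from a skill ratio of 0.5 to 2.5, so the three endowments fall below, inside, and above it respectively.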
What is of most interest to us is the implications for wage movements when there are increases in S relative to U. Given that equations (9) and (10) have a unique wage solution and are not functions of labour quantities, as long as both technologies are in use, changes in the amounts of S and U in the economy do not alter the individual wages or their ratio. This is the standard factor price invariance result from trade theory. Firms in the economy react to larger relative amounts of S labour not by increasing the amount they use with any one technology but by shifting toward the more S intensive technology (D). In fact, it is straightforward to show that a given increase in the ratio $N_s/N_u$ generates a more than proportionate increase in output from the D technology. 23 Following from this, the empirical implications from the model are as follows. First, if the two technologies are available and the skilled to unskilled labour ratio, $N_s/N_u$, is in the cone of diversification then increases in the ratio of skilled to unskilled labour do not alter the wage ratio, $w_S/w_U$, or the individual wages, $w_S$ and $w_U$. Second, if $N_s/N_u$ rises enough then eventually all firms will adopt the D technology and then subsequent increases in $N_s/N_u$ will generate decreases in $w_S/w_U$ as in the standard one technology case. Third, assume that there are unskilled managers in the C technology before the increase in the skills in the economy (as is the case in our sample period), i.e., $\rho_C > \theta$. In that case, the ratio of the number of unskilled managers to skilled managers will decline as $N_s/N_u$ increases. This happens because U workers form a larger fraction of managers under the C technology (indeed, they may not be managers at all in the D technology given the comparative advantage set up) while S workers form a larger fraction of managers under the D technology. As the number of skilled workers rises, there will be a disproportionate shift toward the D technology that will imply more S than U managers overall even though the proportion of each type of manager will stay the same within each technology. Fourth, as $N_s/N_u$ increases, the proportion of S workers who are managers decreases. This is somewhat surprising given that the economy is shifting toward a more S intensive technology where more of the management positions are held by S workers. However, there is actually a smaller proportion of S workers who are managers with the D technology (since all S workers are managers in the C technology if $\rho_C > \theta$) and so as the economy shifts toward the D technology the proportion of S workers who are managers will fall. This is a reflection of the fact that in the decentralized technology, where S workers can both produce and communicate at the same time, S workers are used farther down into the task structure than in the C technology in equilibrium. Fifth, as $N_s/N_u$ increases and the economy shifts toward the D technology, we should see more workers in all parts of the production structure making decisions and communicating not just to their managers.
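The claim that output from the D technology rises more than proportionately can be checked with a two-equation full-employment system. The sketch below uses hypothetical per-unit input mixes (the same kind of invented numbers as any textbook two-technology example) and is not drawn from the paper's estimates.

```python
def outputs_by_technology(n_u_total, n_s_total, c_mix, d_mix):
    """Solve the two full-employment conditions
        y_c * c_u + y_d * d_u = n_u_total
        y_c * c_s + y_d * d_s = n_s_total
    for the outputs (y_c, y_d) using Cramer's rule."""
    c_u, c_s = c_mix
    d_u, d_s = d_mix
    det = c_u * d_s - d_u * c_s
    y_c = (n_u_total * d_s - d_u * n_s_total) / det
    y_d = (c_u * n_s_total - c_s * n_u_total) / det
    if y_c < 0 or y_d < 0:
        raise ValueError("endowments lie outside the cone of diversification")
    return y_c, y_d

# Hypothetical per-unit hours (unskilled, skilled) at the equilibrium wages.
C_MIX = (1.0, 0.5)
D_MIX = (0.4, 1.0)

before = outputs_by_technology(100.0, 80.0, C_MIX, D_MIX)   # (85.0, 37.5)
after = outputs_by_technology(100.0, 88.0, C_MIX, D_MIX)    # (81.0, 47.5)
d_growth = after[1] / before[1] - 1.0                        # about 0.267
```

In this example a 10% rise in skilled hours, with wages and per-unit mixes unchanged, raises D output by roughly 27% and lowers C output, so the adjustment runs entirely through the reallocation of production between technologies.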
It is interesting to compare these implications to those from a more standard model with exogenous technical change. In the Online Appendix, we analyse a model in which one technology is in use at a time. The production function is expressed as a function of managerial and production labour with skilled workers having a comparative advantage in managerial tasks. We characterize skill biased technical change as a relative increase in the productivity of skilled workers as managers. This captures both that the technological change favours skilled workers and that it is related to managerial tasks. The technological change arrives exogenously, i.e., it alters the production function firms face without their making a choice over whether to adopt it. In this scenario, we show that the ratio of skilled to unskilled wages will remain constant only if the relative supply of skilled workers in managerial tasks increases by enough to offset the increase in their productivity in those tasks. This is the opposite of the implication from our endogenous technological choice model in which the expansion in S is accompanied by a decreasing proportion of S workers who are managers.
23. To see this, note that we can write the ratio of S to U hours employed in the economy, $N_s/N_u$, as a weighted average of the ratios employed in the two technologies, with weights that depend on $\phi_C$, the fraction of output generated using the C technology. If the economy is in the cone of diversification, as $N_s/N_u$ increases, the two technology specific ratios do not change but $\phi_C$ decreases. In fact, $\phi_C$ must decrease more than proportionally to maintain the equality.

Macro evidence
We begin our investigation of the relevance of our model of choice between a decentralized and a centralized organizational form by examining the model implications in relation to the wage and employment patterns documented in the earlier sections of the paper. The first implication of the model is that the substantial increase in the proportion of workers with a university degree should have no impact on either the college premium or skilled and unskilled wages individually. In Section 2, we showed that the college premium has not changed since 1992 even as educational attainment has soared and that this pattern cannot be explained as a result of compositional shifts in terms of observed or unobserved worker characteristics. This implication is borne out in our aggregate production function estimation where the coefficient on the relative skill supply variable in the wage ratio equations is small and never statistically significantly different from zero. As described earlier, the endogenous innovation model can also predict this zero effect but Acemoglu's description of the timing of the reaction of an economy to an increase in its relative skill supply involves an initial decline in the wage ratio followed by an increase as the effects of new inventions gradually take hold. In the figures in Section 2, instead, the relative wage stays constant throughout the period of greatest education expansion, a pattern that is predicted by the technical choice model. The technical choice model also has the stronger implication that underlying the lack of movement in the wage ratio should be a lack of response of the skilled and unskilled wages individually to the relative supply shift. As we discussed earlier, the estimated coefficients on $S_t/U_t$ in Table 1 imply that movements in the skill ratio have no effect on either skilled or unskilled wages. These zero effects fit with the picture of the isoquant in Diagram D1. 
In that figure the isoquant for the economy is formed as the envelope constructed using the Centralized isoquant to the right of $n^{C*}_u$, the straight line connecting the points $(n^{C*}_u, n^{C*}_s)$ and $(n^{D*}_u, n^{D*}_s)$, and the Decentralized isoquant to the left of $n^{D*}_u$. The flat portion of the isoquant matches a flat section of the aggregate production function corresponding to the range of factor employment values in which the economy is operating in the cone of diversification with both technologies in use. That section being flat corresponds to the effect of $S_t/U_t$ on the levels of both wages being zero, which we have just seen is true. It also implies that the determinant of the Hessian of the production function should equal zero. We can construct an estimate of that determinant as either $(b_2 \cdot d_4 - b_4 \cdot d_2)$ or $(b_2 \cdot (-d_3) - (1 - b_3) \cdot d_2)$. These take values of 0.007 and −0.033 from the OLS estimates and −0.16 and −0.12 from the IV estimates, none of which is statistically significantly different from zero at the 5% significance level and all but one of which are about the same size in absolute value as their associated standard errors. Thus, our production function based estimates fit with the model implication that the U.K. economy was operating in a region in which the production function had a flat spot in our time period. This is a very specific implication of our technological choice model. It is worth noting that one could allow ongoing technological changes in both the technologies in our model, which would be represented by inward shifts in the unit isoquants in Diagram D1. However, if the rates of technological change were different in the two technologies then the wages and their ratios would change over time and the estimated production function would not have a flat segment. An equal technological change in each would be captured in a TFP measure that did not affect the wage ratio, as is the case in our estimates.
Taken together, we see the evidence from the production function estimates as fitting with technological change affecting the labour market through two channels. The first is a general shift out in the production possibilities frontier that is captured in our TFP measure. The fact that TFP changes induce wage level changes but no change in the wage differential implies that this element of technological change is skill neutral. It may reflect forces affecting productivity other than the IT and skill related changes that we emphasize here. Controlling for movements in TFP, changes in the skill ratio have no impact on wage levels or the wage differential, fitting with our model of endogenous technological choice. Thus, our evidence suggests a non-biased general shift out in the frontier with skill related technological changes corresponding to changes in the choice of the point along a given frontier (i.e. holding TFP constant). The result that the economy is operating on a flat portion of the production function in this period is a key piece of evidence in favour of this view.
The other implications of the model at the aggregate level have to do with occupational composition. In particular, as the relative number of workers with BAs increases, management roles should be increasingly taken over by BA educated workers. Thus, the model predicts that the proportion of managers who have a BA should increase across cohorts. In the left graph of Figure 5, we plot the proportion of managers who have a BA over time. The plots are for 30 to 34 year olds in order to hold age composition constant. There is clear evidence of a large shift in the direction predicted by the model: approximately 25% of managers had a BA in the early 1990s compared to over 50% after 2010. 24 At the same time, the proportion of the BA educated workforce employed as managers should decline according to the model. In the right graph of Figure 5, we plot the proportion of BA workers employed in management jobs, again focusing on ages 30-34. We see that 23% of BA workers were managers in the early 1990s compared to 19% after 2010. We argued earlier that the pattern depicted in the two panels in Figure 5 fits with our model but does not match the predictions of a standard model with an exogenous technological change favouring educated workers.

Micro evidence
We turn, next, to using micro data to examine the main implication of the model: that firms in locations with larger increases in the relative number of educated workers make greater use of decentralized organizational forms. Our hypothesis is that in a more decentralized and de-layered organizational structure, workers will be given more autonomy and will report greater influence over their work. We are interested in whether an increase in the relative supply of education skills induces a shift toward a more decentralized organizational form as measured by this marker. We examine this question using the U.K. Workplace Employment Relations Survey (WERS). The WERS is a survey of workplaces that includes questionnaires both for the manager and for a subsample of employees. 25 We focus on employees' responses to three questions: "How much influence do you have about the following?" 1. "The range of tasks you do in your job," 2. "the pace at which you work" 3. "how you do your work." The responses for each question range from 1 "A lot" to 4 "None." These questions are included in the cross-sectional WERS surveys for 1998, 2004, and 2011. Rather than use these questions separately we implement a principal components analysis to compute an index of the ability of workers to influence their own work. We define the index as 4 minus the first principal component, so that the index is higher where more employees report having more influence. The index accounts for approximately 80% of the total covariance among the three questions. Finally, we normalize the influence index to have mean 0 and standard deviation 1 in the 3-wave-pooled sample. We view the answers of "A lot" to these questions as reflecting a decentralized workplace where decision making on what to do and how fast to do it has been devolved to workers. In this, we follow Bresnahan et al. (2002) and Bloom et al. (2012) who implement surveys of managers to capture organizational practices. Their decentralization measure is based, in part, on "individual decision authority" which reflects whether workers control their "pace of work" and "method of work." Table 3 lists the overall mean and standard deviation of the influence index by WERS wave and education of employees. Across all firms, there has been a nearly 0.6 standard deviation increase in the mean influence index value between 1998 and 2011. Thus, there is a clear general trend toward decentralization of decision making. We examine differences between more and less educated workers in the lower panels of the table, presenting weighted averages with the proportion of workers at a firm in the particular education group as the weights. Doing that indicates that the increases in the index value were particularly large at lower educated firms. This makes sense since those are the firms that would most likely have used a centralized structure in the past and that, as a result, would have had the most leeway for adjustment.
Notes (Figure 5): We define managers as the first major group under U.K. SOC2000. The occupation classification in the LFS changed from SOC90 in 2000 to SOC2000 in 2001, and then to SOC2010 in 2011. We map the other two classifications to SOC2000 in a probabilistic way, using a matrix from the ONS for the latter period, and a self-constructed matrix based on dual-coded data in 2000-1. The left graph shows the break points in the time series for when the classification changed.
Notes (Table 3): For each education group or for "all employees," we first calculate 4 minus the first principal component of the three influence scores (ranging from 1 to 4). We then normalize that variable to have mean 0 and standard deviation 1 in the 3-wave-pooled sample for the education group or for "all employees." Workplaces are weighted by the establishment's employment weight times the proportion of employees of that workplace in that education group. If a workplace has no employees of the labelled education group responding to the influence questions in the employee survey, the workplace is not counted in the sub-
25. The WERS surveys 25 employees per workplace. When there are fewer than 25 employees at the workplace, they are all given the questionnaire. The WERS is a representative survey and we incorporate its associated weight in all our calculations.
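The construction of the influence index can be sketched in a few lines. This is a simplified illustration, not the authors' code: it works at the respondent level without survey weights or workplace aggregation, the eight response rows are invented, and it assumes the first principal component is scored as a weighted average of the raw answers with weights proportional to the leading eigenvector of the covariance matrix.

```python
import math

def first_component_weights(rows, iters=500):
    """Leading eigenvector of the sample covariance matrix, found by power
    iteration and rescaled so that the weights sum to one (an assumed scaling)."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    cov = [[sum((r[a] - means[a]) * (r[b] - means[b]) for r in rows) / (n - 1)
            for b in range(k)] for a in range(k)]
    v = [1.0] * k
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(k)) for a in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    total = sum(v)
    return [x / total for x in v]

def influence_index(rows):
    """4 minus the first-component score, standardized to mean 0 and sd 1."""
    weights = first_component_weights(rows)
    raw = [4.0 - sum(w * x for w, x in zip(weights, r)) for r in rows]
    mean = sum(raw) / len(raw)
    sd = math.sqrt(sum((x - mean) ** 2 for x in raw) / (len(raw) - 1))
    return [(x - mean) / sd for x in raw]

# Invented answers to the three questions (1 = "A lot", 4 = "None").
responses = [(1, 1, 1), (1, 2, 1), (2, 2, 2), (2, 3, 2),
             (3, 3, 4), (4, 4, 4), (1, 1, 2), (3, 4, 3)]
index = influence_index(responses)
```

Because the answers are coded so that 1 means the most influence, subtracting the component score from 4 makes the respondent who answers "A lot" to all three questions the highest on the index and the one who answers "None" throughout the lowest.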
To investigate the role of skill supply in choice of organizational form, we examine the relationship between the local supply of workers with BAs and the influence index at the workplace level. "Local area" here refers to Travel To Work Areas (TTWA), which were developed to capture local labour markets using data on commuting flows in 1991. 26 There were around 300 such areas in the U.K. in the 1998 through 2011 period. We derive from the LFS the proportion of workers in the TTWA who have a BA or above for the two calendar years up to and including the WERS survey year. 27 Table 4 reports the results from OLS regressions of the influence index on the local BA proportion across a range of specifications. In all the specifications, we pool together the data from the three waves, and we weight by the size of the workplace. Given that our main variable of interest varies at the TTWA level, we cluster the standard errors at that level. In the first column, we report the results from an OLS regression with the proportion of BA's in the area and year dummy variables as the only regressors. The estimated year effects indicate a secular trend toward organizational forms with greater worker control. This may reflect a response to the general increase in the education level of the workforce but more direct evidence on whether such a relationship exists is found in the estimated effect of the proportion of workers with a BA. We estimate that a 10 percentage point increase in the proportion of BAs in an area is associated with a 0.09 standard deviation increase in the influence index. This result fits with the idea that firms in areas with a higher proportion of educated workers use more decentralized organizational forms. In the next set of columns, we check the robustness of this result across a series of specifications. In the second column, we condition on the current HS proportion in the area, and the coefficient on the BA proportion changes very little. 
Thus, what matters for decentralization is the proportion of higher educated workers, not whether there are more or fewer high school drop-outs among the less educated. In the third column, we introduce controls for industry, workplace size, and size of the organization. 28 Notably, the size and significance of the BA proportion coefficient remain very similar to what was observed in column 1. This implies that the association between the level of education of the population and the organizational form happens within industries (as one would expect with a General Purpose Technology) rather than through shifts in the industrial structure. In the fourth column, we further include interactions between industry and wave, and the key estimate remains essentially unchanged. Finally, we are concerned that our results are being driven primarily by London as a potential outlier which contains a large number of observations and has both high education and high use of more modern technologies. However, omitting London, in column 5, does not alter our results.
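The pooled, size-weighted specification described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the estimation code behind Table 4: the workplace records are synthetic, the helper names (`solve`, `wls`) are hypothetical, the middle wave is labelled 2004, and clustering of the standard errors at the TTWA level is omitted. The index is generated with a slope of 0.9 on the BA proportion purely so that the recovered coefficient can be checked.

```python
# Sketch: weighted least squares of an influence index on the local BA
# proportion plus wave dummies. All data are synthetic (hypothetical).

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def wls(X, y, w):
    """Weighted least squares: solve (X'WX) b = X'Wy."""
    k = len(X[0])
    xtwx = [[sum(wi * r[i] * r[j] for r, wi in zip(X, w)) for j in range(k)]
            for i in range(k)]
    xtwy = [sum(wi * r[i] * yi for r, yi, wi in zip(X, y, w)) for i in range(k)]
    return solve(xtwx, xtwy)


# Hypothetical workplaces: (survey wave, local BA proportion, employees).
sample = [
    (1998, 0.10, 30), (1998, 0.25, 400), (1998, 0.18, 150),
    (2004, 0.15, 80), (2004, 0.30, 1200), (2004, 0.22, 60),
    (2011, 0.20, 50), (2011, 0.40, 600), (2011, 0.28, 90),
]
ASSUMED_SLOPE = 0.9                                # illustrative only
WAVE_EFFECT = {1998: 0.0, 2004: 0.1, 2011: 0.2}    # illustrative only

# Regressors: constant, BA proportion, and dummies for the later waves.
X = [[1.0, ba, float(wave == 2004), float(wave == 2011)]
     for wave, ba, _ in sample]
y = [-0.5 + ASSUMED_SLOPE * ba + WAVE_EFFECT[wave] for wave, ba, _ in sample]
w = [float(size) for _, _, size in sample]         # weight by workplace size

coef = wls(X, y, w)
print(f"coefficient on BA proportion: {coef[1]:.3f}")
```

Because the synthetic index is an exact linear function of the regressors, the weights do not change the point estimate here; with real data they would, which is why the text reports results under more than one weighting scheme.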
28. More specifically, industry is measured by the first digit of Standard Industrial Classification 1992; we have 5 categories of workplace size: <25, 25-49, 50-249, 250-999, 1000+. Whereas workplace size refers to the number of employees at the specific site, the organization may have multiple sites and hence many more employees. We have 5 categories of organization size: <50, 50-249, 250-999, 1000-9999, 10000+.
In Table 5, we report results with the dependent variable generated either only from the responses of the BA employees or only from the non-BA employees' responses. The specification includes industry, size, and year effects as in column (3) of Table 2, and we try both weights based just on establishment size and weights based on employment in the specific education group. The results indicate that the positive correlation between BA proportion and employees' influence at the workplace observed in the earlier specifications is not a mechanical result from a combination of BAs having more influence than non-BAs and an increasing proportion of BAs. In fact, the influence over work decisions reported by non-BA employees in their workplace is even more positively correlated with the local supply of BAs than for BA employees. Again, this fits with the idea that under the older, centralized organizational form, BA employees would have had managerial or quasi-managerial roles and, thus, some control over decision making. It is the non-BAs who will experience the greatest change in the shift to a decentralized workplace. Whether the estimated association between the local BA proportion and the average influence index value in these regressions represents a causal effect of the level of education is unclear. More educated workers may migrate to areas where firms have more decentralized organizational structures, implying a reverse causality.
Alternatively, there could be a third unobserved factor prevalent in some areas that both increases the attractiveness of using a decentralized form and is attractive to more educated workers. We find it difficult to determine what form such a factor would take given that we are already controlling for industrial structure and firm size. In addition, the fact that our results hold up when we drop London (which is a strong candidate as a place where more educated workers migrate to with the aim of working for the most up-to-date firms) is weak evidence against the first endogeneity channel. Nonetheless, we are concerned that there is remaining endogeneity.
To address any remaining endogeneity, we adopt two approaches. 29 The first is to include the value of the dependent variable (the mean value of the influence index) in the first year for which we have it (1998). 30 One can interpret this variable as a parameterization of location fixed effects that uses only the part of the fixed effect that is correlated with the historic mean level of worker control over their workplace. 31 Thus, we compare two regions with the same initial level of use of decentralized organizational forms as a means of holding constant a general proclivity to use such forms for time-invariant reasons and ask whether the region that had a greater increase in the proportion of workers with a BA saw a larger proportion of firms increase the extent of their decentralization. The results without industry and firm size controls are given in column (1) of Table 6 and the results including those controls are given in column (2). The estimated effect of the proportion BA is again highly statistically significant and takes a value of about two-thirds of the comparable estimates in the first and third columns of Table 4. Thus, the proportion BA variable is picking up longer term differences in the extent of use of decentralized forms to a limited degree and not enough to overturn our conclusion that increases in the proportion BA induce a movement toward those forms. Interestingly, the historical use of decentralized forms itself has only a weak relationship with future use of those forms in a region.
29. We implement these approaches using data aggregated to the TTWA. Estimation using data at the firm level with clustered standard errors yields very similar results.
30. Since we have to drop the first year of our data, we are left with firm observations across only two years of data.
31. Direct fixed effect estimators yield erratic and ill-defined coefficient estimates which we interpret as arising from the shortness of our panel.
Our second approach is to implement an instrumental variables (IV) estimator. In particular, we make use of variation across areas that relates to the expansion of education. As instruments we use the proportion of the population born in the years 1970-4 and the proportion born between 1975 and 1979, measured in 1995-6. 32 The underlying idea is that the proportion of the population with a university degree expanded substantially for the 1970s cohorts. As a result, areas with a high concentration of people of university age at the time of the expansion in the higher education system would be predicted to have a more educated population later to the extent that people have some tendency to stay where they grew up. In addition, we use the educational composition of people in the generation who would likely be the parents of these cohorts (people born between 1945 and 1954). In particular, we construct the proportion of the parental generations who have a BA and the proportion who have a GCSE/O level, again measured in 1995-6. We also include the interaction of these parental education variables with the size of the 1970s birth cohorts in the area.
The idea behind the instruments is that areas that one would predict to have a large increase in the proportion of BAs in their workforce between the early 1990s and the early 2000s are ones where there is a local baby boom in those generations and where the parents' own education indicates that they would be interested in their children's education. For this set of instruments to be valid, we require that parents in the previous generation (and, in particular, more educated parents) did not have a tendency to have more children in areas which would later turn out to have more decentralized organizational structures. We also require that the parents did not locate in an area because it would undergo a shift toward a more decentralized organizational form several decades later, as part of a shift to a technology that did not even exist at the time at which most of them made their location choice. The fact that we control for industry and firm size effects in these regressions eliminates any concern that their location might have been related to persistent concentration in industries that would ultimately favour decentralization. We view the conditions under which this instrument set fails as very stringent. In particular, we find it hard to come up with situations in which differences in cohort sizes across areas are determined by the conditions that would affect the adoption of decentralized organizational forms decades later, especially after we control for industrial structure. The set of instruments is highly significant in the first stage, with p-values associated with the F-statistic for the test of their exclusion being effectively zero.
32. The denominator for the proportions is the population born between 1940 and 1979.
Column (3) in Table 6 contains the results from our IV specification. The estimated coefficient on the proportion BA is 1.08, which is very similar to the value estimated with OLS in column (3) of Table 4. This fits with our belief that endogeneity is not a substantial concern once we control for industry and firm size effects.
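The logic of the IV estimator can be illustrated with a small two-stage least squares sketch. Everything here is an assumption made for illustration: the data are simulated, the two instruments merely stand in for variables such as cohort size and parental education, the helper names are hypothetical, and the simulated endogeneity is deliberately strong so that the contrast with OLS is visible (unlike in Table 6, where OLS and IV are similar). Only point estimates are computed; second-stage standard errors from this two-step procedure would not be valid without correction.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


def two_stage_least_squares(x, y, instruments):
    """Return the 2SLS slope on x using the given instruments."""
    Z = [[1.0] + list(z) for z in instruments]
    gamma = ols(Z, x)                                  # first stage
    x_hat = [sum(g * v for g, v in zip(gamma, row)) for row in Z]
    return ols([[1.0, xh] for xh in x_hat], y)[1]      # second stage


rng = random.Random(0)
TRUE_EFFECT = 1.0          # assumed causal effect in the simulation
x, y, z = [], [], []
for _ in range(4000):
    z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)   # hypothetical instruments
    u = rng.gauss(0, 1)                         # unobserved area factor
    xi = 0.6 * z1 + 0.4 * z2 + 0.6 * u + rng.gauss(0, 0.5)
    x.append(xi)
    y.append(TRUE_EFFECT * xi + u + rng.gauss(0, 0.5))
    z.append((z1, z2))

beta_ols = ols([[1.0, xi] for xi in x], y)[1]
beta_iv = two_stage_least_squares(x, y, z)
print(f"OLS: {beta_ols:.3f}  2SLS: {beta_iv:.3f}  truth: {TRUE_EFFECT}")
```

In the simulation the unobserved factor `u` raises both the regressor and the outcome, so OLS overstates the effect while 2SLS, which uses only the variation in `x` predicted by the instruments, stays close to the assumed value.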
Our overall conclusion from our estimates is that an increase in the proportion of the working age population with advanced education in a region causes firms in that region to increase their use of decentralized technologies, with the effect being on the order of a 10 percentage point increase in the percentage of the working age population with a BA generating a 0.1 standard deviation increase in the extent to which workers feel they control their own work. This fits with results in Caroli and Van Reenen (2001) where they use U.K. and French data to show that a relative shortage of educated workers in a local labour market, as reflected in a higher education wage differential, implies that the firms in that market are less likely to implement organizational change. We view their results and ours as corroborating evidence for our model in which the large increase in the education level of new cohorts born after the late-1960s generated a shift in organizational structure toward a more decentralized structure in which workers had more control over their own tasks. As we have seen, in such a model, the technological shift can be accomplished without a change in the wage differential between more and less educated workers.
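The link between the reported coefficients and the "0.1 standard deviation" summary is simple arithmetic, sketched below. The sketch assumes that the BA proportion enters the regressions as a fraction between 0 and 1 and that the influence index is measured in standard deviation units; the function names are hypothetical.

```python
def implied_change(coefficient, delta_share):
    """Change in the index (SD units) for a change in the BA share (0-1 scale)."""
    return coefficient * delta_share


def implied_coefficient(index_change, delta_share):
    """Coefficient implied by a reported index change for a given share change."""
    return index_change / delta_share


# IV estimate from the text: 1.08. A 10 percentage point rise is 0.10.
iv_effect = implied_change(1.08, 0.10)

# OLS result from the text: 0.09 SD per 10 percentage points.
ols_coefficient = implied_coefficient(0.09, 0.10)

print(f"IV: {iv_effect:.3f} SD per 10 points; implied OLS coefficient: "
      f"{ols_coefficient:.2f}")
```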

Other technological followers
To this point, we have presented evidence for a claim that the U.K.'s combination of a rapid educational upgrading with no accompanying change in the education-wage differential can best be understood in the context of a model of technological choice in which the U.K. is a technological follower choosing among technologies developed elsewhere (most likely the U.S.). But the U.K. is not the only economy to undergo a substantial increase in its education level after the U.S., and it is worth asking whether other economies experiencing such an increase also have patterns fitting with them being technological followers.
To address this question, we use data from the OECD on the education levels and education-wage differentials for advanced economies between 1997 and 2010 (OECD, 2012). The data are from the labour force surveys for the member economies and are restricted to 25-64 year olds. The period is chosen both because it is one in which we can obtain consistent data and because it roughly matches the period of substantial growth in the U.K.'s education level. That is, it is a period in which other economies also experiencing such growth would face the same set of existing technological choices. In this period, 11 other OECD economies both started the period with a proportion of their population with a tertiary education that was lower than that of the U.S. in 1997 and experienced an increase in that proportion of at least 40%. 33 The country with the smallest increase among those meeting these requirements was Belgium (rising from 25% of its population having a tertiary education in 1997 to 35% in 2010) and the country with the largest was Poland (moving from 10% with a tertiary education in 1997 to 23% in 2010). The OECD data indicate a 65% increase for the U.K. from 1997 to 2010, which is very close to what we obtain using the U.K. LFS for the same period (71%).
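The growth criterion can be checked directly from the two pairs of figures quoted above. The sketch below uses only the Belgian and Polish shares given in the text; the function name is hypothetical, and the second criterion (starting below the U.S. level in 1997) is not implemented because the U.S. figure is not reported here.

```python
def pct_increase(start, end):
    """Percentage increase from start to end."""
    return 100.0 * (end - start) / start


# Tertiary education shares (%) in 1997 and 2010, as quoted in the text.
shares = {"Belgium": (25.0, 35.0), "Poland": (10.0, 23.0)}

MIN_GROWTH = 40.0  # threshold stated in the text: at least a 40% increase

growth = {country: pct_increase(s, e) for country, (s, e) in shares.items()}
qualifies = {country: g >= MIN_GROWTH for country, g in growth.items()}

for country in shares:
    print(f"{country}: {growth[country]:.0f}% increase, "
          f"meets threshold: {qualifies[country]}")
```

Belgium sits exactly at the 40% threshold, which is why it is the lowest-growth country in the group.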
We examine movements in the wage ratio between the mean annual earnings of all workers aged 25-64 with a tertiary education and the mean annual earnings of workers with an upper secondary education being their highest education level. We regress this ratio on a simple linear time trend to summarize the wage differential pattern that coincides with the rapid educational growth in these economies. Out of the 11 OECD economies meeting our education growth criteria, 34 the time trend coefficient is not statistically significantly different from zero at the 10% level or below in 7; two exhibit statistically significant positive trends; and two exhibit statistically significant negative trends. 35 According to the time coefficient regressions, for Poland, the country with the largest percentage increase in tertiary education, the wage ratio fell by 1.8 percentage points per decade (from a base of 170). At the other end, for Belgium, the country with the smallest educational increase, there was a 1.5 percentage point increase per decade in the wage ratio (on a base of 130). The other economies with statistically insignificant time trends for the wage ratio show negative and positive point estimates that are either smaller or somewhat bigger than these two examples. We present the full set of estimated coefficients in the Supplementary Appendix. In the online appendix, we also show how our estimates fit with previous results in Crivallero (2016), who estimates very small effects of increases in university attainment on the college wage premium in a pooled sample of 12 European economies, and Chen (2013), who shows that Taiwan also underwent a large increase in educational level with no accompanying change in its college premium.
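The country-by-country trend regressions amount to a bivariate OLS of the wage ratio on calendar year. The sketch below shows one way to compute the slope, its conventional standard error, and the t-statistic. The series is hypothetical (it is not the OECD data), the function name is an assumption, and the gaps in the years simply mimic the missing observations mentioned in the text.

```python
import math


def linear_trend(years, ratios):
    """OLS of ratio on year: return (slope, standard error, t-statistic)."""
    n = len(years)
    xbar = sum(years) / n
    ybar = sum(ratios) / n
    sxx = sum((x - xbar) ** 2 for x in years)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(years, ratios))
    slope = sxy / sxx
    # Residuals computed in centred form to limit rounding error.
    ssr = sum(((y - ybar) - slope * (x - xbar)) ** 2
              for x, y in zip(years, ratios))
    se = math.sqrt(ssr / (n - 2) / sxx)
    return slope, se, slope / se


# Hypothetical wage ratio series (upper secondary earnings = 100).
years = [2000, 2001, 2003, 2004, 2006, 2008, 2010]
ratios = [170.5, 169.8, 171.2, 169.9, 170.4, 169.5, 170.1]

slope, se, t_stat = linear_trend(years, ratios)
print(f"trend per decade: {10 * slope:.2f} points, t = {t_stat:.2f}")
```

With so few annual observations per country, the t-statistic has to be compared with a Student-t critical value on n - 2 degrees of freedom rather than a normal one, which is part of why the text treats these results as suggestive.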
Taken together, we believe the results in the OECD data and in other papers are consistent with our model for many economies undergoing substantial increases in their education levels. We make no claim that our discussion provides a complete analysis of the determinants of wage movements in these economies. The number of observations for each country in the OECD data is small and we do not investigate factors such as the level of decision making of workers, as we do for the U.K. Nonetheless, we think the fact that there are so many economies which both start our period behind the U.S. in their education levels and experience substantial educational growth but do not have statistically significant changes in their wage ratios indicates that it is plausible that other countries could also be described in terms of our model with educational catch-up driving endogenous technological choices. A more complete investigation of this hypothesis for other economies is beyond the scope of this paper.

34. This includes the U.K. We drop Australia because there are only three earnings observations in our period. For the other countries, the wage ratio data are for the years 2000-10, with some missing years in most economies.
35. The countries with flat wage ratio profiles are: Belgium, France, Ireland, New Zealand, Poland, Switzerland, and the U.K. The two countries with positive trends are: South Korea and Spain. And the two with negative trends are: Norway and Sweden. Regressions of the wage ratio on the proportion with a tertiary education generate the same pattern of insignificant, significantly positive, and significantly negative coefficients on the education variable.

CONCLUSION
In this article, we highlight two empirical patterns: first, the U.K. underwent a dramatic increase in the proportion of the working age population with a BA since 1993; second, the BA-to-HS wage differential was essentially unchanged over this period. The combination of increased educational supply and a lack of movement in the educational wage differential necessarily implies a skill biased demand shift over time. We consider three models of technological change that imply skill biased demand shifts: the canonical model in which the demand shift is exogenous; a model in which the increase in education induces new skill favouring inventions; and a model in which a variety of technologies already exist, with firms choosing which to implement. We argue that the core patterns in the U.K. data do not fit with exogenous technological change models, including those that incorporate tasks. Moreover, because the growth in educational attainment varies over time, the exogenous technological change models require that the rate of technological change has to speed up and slow down in just the right way to generate the pattern that we observe of an unchanging college premium throughout the post-1993 period. Of course, we cannot reject a claim that there just happened to be such a variation in the exogenous rate of technological change but we view it as improbable.
Of the remaining, endogenous technological change models, we believe that models of induced invention may be relevant for the U.S. in recent decades since it was in a position to be a technological leader in skill biased technologies by virtue of having a much more educated work force than other developed economies at the dawn of the computer era (Beaudry et al., 2006). In contrast, the U.K. underwent its educational expansion much later and, as a result, we believe it is plausible that it was a technological follower for this type of technology, following an induced technological adoption model rather than one of induced innovation.
More explicitly, we argue for a model for the U.K. in which firms in any sector can choose to produce using a centralized or a decentralized organizational structure as discussed in papers such as Caroli and Van Reenen (2001) and Bloom et al. (2012). In the decentralized structure, workers need to be able to take individual initiative and control their own work, characteristics that we view as fitting more with higher educated workers. The model has a similar construction to a classic trade model in that the economy responds to a shift in the relative supply of more educated workers by shifting toward greater use of the decentralized organizational structure. And, as in the trade model, there is no adjustment in terms of relative wages or wage levels. But the model also has further implications; most notably that the proportion of managers who have a BA should increase but the proportion of BAs who work as managers should decrease as the decentralized technology spreads. The latter is the opposite of the prediction from a standard skill biased demand model built around a nested CES production function. The model also implies strong restrictions on the shape of the aggregate production function that we show hold for the U.K. in this period. In addition, we show that areas in the U.K. which had more substantial increases in education levels are also areas where workers report having more control over their own work, something we see as a marker of a decentralized workplace. Importantly, this pattern occurs within industries, not because of shifts in the industrial structure, and is robust across a range of specifications. We develop an instrumental variable strategy in which we instrument for area specific educational changes using differences in fertility in previous decades and parental education.
We believe these instruments are very likely to be valid, based as they are on an assumption that parental decisions on fertility in the 1960s and 1970s did not arise from predictions of decentralized technologies coming to their areas in the 1990s and after. Again, it is important that we control for industry in all our specifications, implying that parents would have to make their guess about future technology use independently of the local industrial structure for our instrument to be invalid. The IV results indicate that increases in the education level in a local economy have a causal impact on the adoption of decentralized organizational forms by firms in that economy.
The key point we see as arising from this exercise is that the effects of technological change are not one size fits all. There are good reasons to believe the U.S. has been a technological leader and there has been considerable study of the interactions of technological change and educational supply shifts in the U.S. The question then becomes, can the experience of the U.S. be generalized to other countries? The U.K. provides an interesting case study to examine this question. Its large expansion in education happened quickly and well after the main expansion for the U.S. Because of that, we believe that the U.K. provides evidence on what happens to technological followers as their conditions shift toward favouring the technologies that the leader has developed. We argue that during the transition period for a follower economy, one could observe no real impact on skilled wage differentials even though the economy was being substantially transformed. Our evidence lines up with this interpretation. We believe this calls into question approaches in which technological change effects are identified from commonalities in wage and employment movements across countries, with remaining differences assigned to differences in institutions and differences in supply shifts. This does not mean that there are no commonalities across economies and that we should devolve to studying each economy in isolation. Instead, we view our results as indicating the need for a broader view of the impact of technological change, one which emphasizes the role of differences in movements in relative factor supplies in determining the point in the lifecycle of a technology at which each economy adopts it.