Daron Acemoglu, The simple macroeconomics of AI, Economic Policy, Volume 40, Issue 121, January 2025, Pages 13–58, https://doi.org/10.1093/epolic/eiae042
SUMMARY
This paper evaluates claims about the large macroeconomic implications of new advances in Artificial intelligence (AI). It starts from a task-based model of AI’s effects, working through automation and task complementarities. So long as AI’s microeconomic effects are driven by cost savings/productivity improvements at the task level, its macroeconomic consequences will be given by a version of Hulten’s theorem: Gross Domestic Product (GDP) and aggregate productivity gains can be estimated by what fraction of tasks are impacted and average task-level cost savings. Using existing estimates on exposure to AI and productivity improvements at the task level, these macroeconomic effects appear non-trivial but modest – no more than a 0.66% increase in total factor productivity (TFP) over 10 years. The paper then argues that even these estimates could be exaggerated, because early evidence is from easy-to-learn tasks, whereas some of the future effects will come from hard-to-learn tasks, where there are many context-dependent factors affecting decision-making and no objective outcome measures from which to learn successful performance. Consequently, predicted TFP gains over the next 10 years are even more modest and are predicted to be less than 0.53%. I also explore AI’s wage and inequality effects. I show theoretically that even when AI improves the productivity of low-skill workers in certain tasks (without creating new tasks for them), this may increase rather than reduce inequality. Empirically, I find that AI advances are unlikely to increase inequality as much as previous automation technologies because their impact is more equally distributed across demographic groups, but there is also no evidence that AI will reduce labour income inequality. Instead, AI is predicted to widen the gap between capital and labour income. 
Finally, some of the new tasks created by AI may have negative social value (such as the design of algorithms for online manipulation), and I discuss how to incorporate the macroeconomic effects of new tasks that may have negative social value.
1. INTRODUCTION
Artificial intelligence (AI) has captured imaginations. Promises of rapid, even unparalleled, productivity growth as well as new pathways for complementing humans have become commonplace. There is no doubt that recent developments in generative AI and large language models (LLMs), which produce text, information and images – and Shakespearean sonnets – in response to simple user prompts, are impressive and even spellbinding. ChatGPT, originally released on 30 November 2022, soon became the fastest-spreading tech platform in history, with an estimated 100 million monthly users only 2 months after its launch.
AI will have implications for the macroeconomy, productivity, wages and inequality, but all of them are very hard to predict. This has not stopped a series of forecasts over the last year, often centring on the productivity gains that AI will trigger. Some experts believe that truly transformative implications, including artificial general intelligence enabling AI to perform essentially all human tasks, could be around the corner.1 Other forecasters are more grounded, but still predict big effects on output. Goldman Sachs (2023) predicts a 7% increase in global Gross Domestic Product (GDP), equivalent to $7 trillion and a 1.5% per annum increase in US productivity growth over a 10-year period. Recent forecasts from the McKinsey Global Institute suggest that generative AI could offer a boost as large as $17.1 to $25.6 trillion to the global economy, on top of the earlier estimates of economic growth from increased work automation (Chui et al., 2023). They reckon that the overall impact of AI and other automation technologies could produce up to a 1.5–3.4 percentage point rise in average annual GDP growth in advanced economies over the coming decade.2
Are such large effects plausible? And if there are going to be productivity gains, who will be their beneficiary? With previous automation technologies, such as robotics, most gains accrued to firm owners and managers, while workers in impacted occupations experienced negative outcomes (e.g. Acemoglu and Restrepo, 2020a). Could it be different this time? Some experts and commentators are more optimistic. A few ‘proof-of-concept’ experimental studies document non-trivial productivity gains from generative AI, largely driven by improvements for less productive or lower-performing workers (e.g. Brynjolfsson et al., 2023; Noy and Zhang, 2023; Peng et al., 2023), and this has prompted some experts to be cautiously optimistic (Autor, 2024), while others are forecasting a ‘blue-collar bonanza’ (The Economist, 2023).
This paper uses the framework from Acemoglu and Restrepo (2018, 2019b, 2022) to provide some insights into these debates, especially relevant for the medium-term (about 10-year) macroeconomic effects of AI. I build a task-based model, where the production of a unique final good requires a series of tasks to be performed, and these tasks can be allocated to either capital or labour, which have different comparative advantages. Automation corresponds to the expansion of the set of tasks that are produced by capital (including digital tools and algorithms). In this framework, AI-based productivity gains – measured either as growth of average output per worker or as total factor productivity (TFP) growth – can come from a number of distinct channels (see Acemoglu and Restrepo, 2019a):
Automation (or more precisely extensive-margin automation) involves AI models taking over and reducing costs in certain tasks. In the case of generative AI, various mid-level clerical functions, text summary, data classification, advanced pattern recognition and computer vision tasks are among those that can be profitably automated.
Task complementarity can increase productivity in tasks that are not fully automated and may even raise the marginal product of labour. For example, workers performing certain tasks may have better information or access to other complementary inputs. Alternately, AI may automate some subtasks, while at the same time enabling workers to specialize and raise their productivity in other aspects of their job.
Deepening of automation can take place, increasing the productivity of capital in tasks that have already been automated. For example, an already-automated IT security task may be performed more successfully by generative AI.
New tasks may be created thanks to AI and these tasks may impact the productivity of the whole production process.3
In this paper, I focus on the first two channels, though I also discuss how new tasks enabled by AI can have positive or negative effects. I do not dwell on the deepening of automation, because the tasks impacted by (generative) AI are different from those automated by the previous wave of digital technologies, such as robotics, advanced manufacturing equipment and software systems.4 I also do not discuss how AI can have revolutionary effects by changing the process of science (a possibility illustrated by neural network-enabled advances in protein folding and new crystal structures discovered by the Google subsidiary DeepMind), because large-scale advances of this sort do not seem likely within the 10-year time frame, and many current discussions focus on automation and task complementarities.
I show that when AI’s microeconomic effects are driven by cost savings (equivalently, productivity improvements) at the task level – due to either automation or task complementarities – its macroeconomic consequences will be given by a version of Hulten’s theorem: GDP and aggregate productivity gains can be estimated by what fraction of tasks are impacted and average task-level cost savings. This equation disciplines any GDP and productivity effects from AI. Despite its simplicity, applying this equation is far from trivial, because there is huge uncertainty about which tasks will be automated or complemented, and what the cost savings will be.
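The relationship described in this paragraph can be written compactly. The notation below is assumed for illustration only and is not taken from the paper's own equations, which are not reproduced in this section: the first symbol stands for the GDP share of tasks impacted by AI and the second for the average cost savings in those tasks.

```latex
% Illustrative notation (assumed, not the paper's):
%   s_{AI}    : GDP share of tasks impacted by AI
%   \bar{\pi} : average task-level cost savings in impacted tasks
\Delta \ln \mathrm{TFP} \;\approx\; s_{AI} \times \bar{\pi}
```

Both inputs are uncertain, which is why the resulting numbers are best read as back-of-the-envelope bounds.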
Nevertheless, as an illustrative exercise, I use data from a number of recent studies, particularly Eloundou et al. (2024) and Svanberg et al. (2024), as well as the experimental studies mentioned above, to obtain some back-of-the-envelope numbers. Eloundou et al. (2024) provide the first systematic estimates of what tasks will be impacted by generative AI and computer vision technologies. Their methodology does not fully distinguish whether the impact will take the form of automation or task complementarities, and does not provide information on when we should expect these impacts to be realized and how large their cost savings will be.5 For computer vision technologies, Svanberg et al. (2024) provide estimates of what fraction of tasks that are potentially exposed to AI can be feasibly automated in different time frames.
I take Eloundou et al.’s estimates of tasks that are exposed to AI (without distinguishing automation versus task complementarities). I then aggregate this to the occupational level and weigh the importance of each occupation by its wage bill share in the US economy. This calculation implies that 20% of US labour tasks are exposed to AI. I then use Svanberg et al.’s estimate for computer vision tasks that, among all exposed tasks, 23% can be profitably performed by AI (for the rest, the authors estimate that the costs would exceed the benefits). I take the average labour cost savings to be 27% – the average of the estimates in Noy and Zhang (2023) and Brynjolfsson et al. (2023) – and turn this into overall cost savings using industry labour shares, which imply average overall cost savings of 14.4%.
This calculation implies that TFP effects within the next 10 years should be no more than 0.66% in total – or approximately a 0.064% increase in TFP growth annually. If we add bigger productivity gains from Peng et al. (2023), which are less likely to be broadly applicable, or incorporate further declines in graphics processing unit (GPU) costs, this number still remains around 0.9%.
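The arithmetic behind the baseline figure can be checked in a few lines. This is a minimal sketch using only the rounded numbers quoted in the text; the variable and function names are illustrative and not the paper's notation.

```python
# Back-of-the-envelope TFP bound from the rounded figures quoted in the text.
# All names are illustrative; the inputs are the text's rounded values.

exposed_share = 0.20          # share of US labour tasks exposed to AI (wage-bill weighted)
feasible_share = 0.23         # share of exposed tasks that AI can profitably perform
overall_cost_savings = 0.144  # average overall cost savings in those tasks


def tfp_gain(exposed: float, feasible: float, savings: float) -> float:
    """Share of tasks impacted multiplied by average cost savings."""
    return exposed * feasible * savings


baseline = tfp_gain(exposed_share, feasible_share, overall_cost_savings)
print(f"Share of tasks impacted: {exposed_share * feasible_share:.1%}")
print(f"10-year TFP gain:        {baseline:.2%}")
```

With these inputs, about 4.6% of tasks are impacted and the implied 10-year TFP gain is roughly 0.66%.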
To turn these numbers into GDP estimates, we need to know how much the capital stock will increase due to AI. I start with the benchmark of a rise in the capital stock proportional to the increase in TFP. This benchmark is consistent with the fact that generative AI does not seem to require huge investments by users (beyond those made by designers and trainers of the models). With these investment effects incorporated, GDP is also estimated to grow by 0.93–1.16% over the next 10 years. When I assume that the investment response will be similar to those for earlier automation technologies and use the full framework from Acemoglu and Restrepo (2022) to estimate the increase in the capital stock, the upper bound on GDP effects rises to around 1.4–1.56%. Nevertheless, my framework also clarifies that if the capital–output ratio increases in response to the TFP rise, this may increase GDP by more than TFP, but does not additionally contribute to welfare, because the extra investment comes out of consumption.6
I then argue that the numbers above may be overestimates of the aggregate productivity benefits from AI, because existing estimates of productivity gains and cost savings are in tasks that are ‘easy-to-learn’, which then makes them easy for AI. In contrast, some of the future effects will come from ‘hard-to-learn’ tasks, where there are many context-dependent factors affecting decision-making, and most learning is based on the behaviour of human agents performing similar tasks (rather than objective outcome measures). Productivity gains from AI in these hard tasks will be less – though, of course, it is challenging to determine exactly how much less. Using a range of (speculative) assumptions, I estimate an upper bound of 73% easy tasks among Eloundou et al.’s exposed tasks. I suppose that productivity gains in hard tasks will be approximately one-quarter of the easy ones. This leads to an updated, more modest increase in TFP and GDP in the next 10 years that can be upper bounded by 0.53% and 0.90%, respectively.
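A short sketch shows how the easy/hard split scales the baseline number. It assumes, as a simplification, that the baseline gain applies fully to easy tasks and at one-quarter strength to hard tasks; the names are illustrative and the paper's own expression is not reproduced here.

```python
# Easy-task / hard-task adjustment using the rounded figures from the text.
# Assumption for illustration: the baseline gain applies in full to easy tasks
# and at one-quarter strength to hard tasks.

baseline_tfp = 0.0066       # baseline 10-year TFP bound quoted in the text
easy_share = 0.73           # upper bound on the share of easy tasks among exposed tasks
hard_relative_gain = 0.25   # hard-task gains as a fraction of easy-task gains


def adjustment_factor(easy: float, hard_rel: float) -> float:
    """Weighted average gain relative to the all-easy baseline."""
    return easy + (1.0 - easy) * hard_rel


scale = adjustment_factor(easy_share, hard_relative_gain)
adjusted_tfp = baseline_tfp * scale
print(f"Scaling factor:    {scale:.4f}")
print(f"Adjusted TFP gain: {adjusted_tfp:.2%}")
print(f"Reduction:         {1.0 - scale:.0%}")
```

Under these assumptions the scaling factor is 0.7975, which brings the 0.66% bound down to roughly 0.53%.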
New tasks created with AI can more significantly boost productivity. But some of the new AI-generated tasks are manipulative and may have negative social value, such as deepfakes, misleading digital advertisements, addictive social media or AI-powered malicious computer attacks. While it is difficult to put numbers on good and bad new tasks, based on recent research, I suggest that the negative effects from new bad tasks could be sizable. I make a very speculative attempt using numbers on the negative welfare effects of social media from a recent paper by Bursztyn et al. (2023). These authors find that consumers have a positive willingness to pay for using social media (in particular, Instagram and TikTok) when others are using it, but they would prefer that neither themselves nor others use it. Roughly speaking, their estimates imply that revenue can increase by about $53 per user-month, but this has a negative impact on total GDP/welfare equivalent to $19 per user-month. Combining these numbers with an estimate of the fraction of activities that may generate negative social value (in practice, revenues from social media and spending on attack-defence arms races in IT security), I suggest that with more intensive use of AI, it is possible to have non-trivial increases in GDP. For example, AI may appear to increase GDP by 2%, while in reality reducing welfare by 0.72% (in consumption equivalent units).
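The closing example can be reproduced from the per-user figures. The sketch below is one reading under which the quoted numbers line up: it assumes that the welfare loss per dollar of bad-task revenue carries over to the apparent GDP gain. The names are illustrative, and the paper's own expression is not reproduced in this section.

```python
# Apparent GDP gain versus welfare change from new bad tasks.
# Assumption for illustration: each dollar of revenue from bad tasks carries
# the same welfare loss as in the social-media figures quoted in the text.

revenue_per_user_month = 53.0       # revenue gain, dollars per user-month
welfare_loss_per_user_month = 19.0  # welfare loss, dollars per user-month
apparent_gdp_gain = 0.02            # measured GDP increase from bad tasks

loss_per_revenue_dollar = welfare_loss_per_user_month / revenue_per_user_month
welfare_change = -apparent_gdp_gain * loss_per_revenue_dollar

print(f"Welfare loss per dollar of revenue: {loss_per_revenue_dollar:.3f}")
print(f"Welfare change: {welfare_change:.2%}")
```

Under this reading, a 2% measured rise in GDP corresponds to a welfare change of about -0.72% in consumption-equivalent terms.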
Finally, I explore AI’s wage and inequality effects. My framework implies that productivity gains from AI are unlikely to lead to sizable wage rises. Moreover, even if AI improves the productivity of low- and middle-performing workers (or workers with limited expertise in complex tasks), I argue that, theoretically, this may not translate into lower inequality. In fact, I show by means of a simple example how an increase in the productivity of low-skill workers in certain tasks can lead to higher rather than lower inequality. Adapting the general equilibrium estimates from Acemoglu and Restrepo (2022) to the setting of AI, I find that the more intensive use of AI is unlikely to lead to substantial wage declines for affected groups, because AI-exposed tasks are more evenly distributed across demographic groups than were the tasks exposed to earlier waves of automation. Nevertheless, I estimate that AI will not reduce inequality and is likely to have a negative effect on the real earnings of low-education women (especially white, native-born low-education women). My findings also suggest that AI will further expand the gap between capital and labour income as a whole.
In conclusion, I argue that as originally suggested in Acemoglu and Restrepo (2018), more favourable wage and inequality effects, as well as more sizable productivity benefits, will likely depend on the creation of new tasks for workers in general and for middle- and low-pay workers in particular. While this is feasible in theory and I have argued elsewhere how it could be achieved (Acemoglu, 2021; Acemoglu et al., 2023), I also discuss why this does not seem to be the focus of AI research at the moment.
In sum, it should be clear that forecasting AI’s effects on the macroeconomy is extremely difficult and will have to be based on a number of speculative assumptions. Nevertheless, the gist of this paper is that a simple framework can discipline our thinking and forecasts, and if we take this framework and existing estimates seriously, it is difficult to arrive at very large macroeconomic gains.
The rest of the paper is organized as follows. The next section outlines the conceptual framework I use throughout the paper and derives a number of theoretical insights on aggregate productivity gains, investment responses and wage and inequality effects. It also discusses the crucial distinction between easy-to-learn and hard-to-learn tasks and their productivity implications, and introduces the contrast between good and bad new tasks. Section 3 provides a preliminary quantitative analysis of new AI breakthroughs within this framework. It first presents a baseline (upper bound) estimate on the basis of the fraction of existing tasks that are likely to be impacted by AI within the next 10 years and existing estimates of cost savings (productivity improvements) from AI. It then refines this estimate by introducing the distinction between easy-to-learn and hard-to-learn tasks and undertakes a preliminary classification of AI-exposed tasks into the easy and hard categories. I also make an even more speculative attempt at incorporating the macroeconomic implications of bad new tasks into this framework. Finally, I report estimates on the wage and inequality implications of recent AI advances. Section 4 concludes with a general discussion, while the Supplementary Appendix includes additional information on how tasks are classified into exposed and non-exposed and easy-to-learn and hard-to-learn categories.
2. CONCEPTUAL FRAMEWORK
The model here builds on Acemoglu and Autor (2011) and Acemoglu and Restrepo (2018, 2019b, 2022), and I focus on the main elements of the framework, referring the reader to these papers for further details and refinements. The economy is static and involves the production of a unique final good, and all markets are competitive.7
Throughout, I assume that the productivity of labour relative to capital is increasing in z, so that labour has a comparative advantage in higher-indexed tasks. This implies that there exists a threshold I such that tasks z ≤ I are produced with capital and those above this threshold are produced with labour.
I normalize the total population to 1 and assume that different workers have different units of effective labour. To simplify the discussion, I assume that there are two types of labour, high-skill and low-skill, and there is no comparative advantage difference between these two types of labour (I generalize this later). The only difference is that high-skill workers, who make up a fraction of the population, have λH units of effective labour, while the remaining low-skill workers have only λL < λH units of effective labour. This specification ensures that both high-skill and low-skill workers could be performing some of the same tasks. It also implies that wage inequality is pinned down by the ratio λH/λL – a feature I relax later.
Finally, I assume that there exists a (non-satiated) representative household that consumes the final good (net of capital expenditures) and I denote the consumption of this household by C.
2.1. Equilibrium
I focus on a competitive equilibrium, which satisfies the following usual conditions:
- The allocation of tasks is cost-minimizing. That is, task z is produced by labour if and only if it is strictly cheaper to produce with labour than with capital.
- The amount of capital k(z) is chosen to maximize output Y net of capital costs, where Y is given as in Equation (1).
- The labour market clears. That is, Equation (2) holds.
Notice that the first condition imposes an innocuous tie-breaking rule that when indifferent, firms use capital for performing a task. Given this tie-breaking rule, all tasks z > I will be performed by labour (i.e. l(z) = 0 for all z ≤ I and k(z) = 0 for all z > I). Whether this is high- or low-skill labour is indeterminate in the baseline model, so I focus on the overall amount of effective labour units.
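The threshold structure of the task allocation can be illustrated numerically. This is a minimal sketch under made-up assumptions: the productivity schedules, the wage and the rental rate are all hypothetical, chosen only so that labour's relative productivity rises with the task index z. Ties go to capital, as in the tie-breaking rule above.

```python
import math

# Hypothetical productivity schedules (not from the paper): labour productivity
# rises with the task index z, capital productivity is flat.


def gamma_labour(z: float) -> float:
    """Hypothetical labour productivity in task z."""
    return math.exp(2.0 * z)


def gamma_capital(z: float) -> float:
    """Hypothetical capital productivity in task z."""
    return 1.0


def uses_labour(z: float, wage: float, rental_rate: float) -> bool:
    """Labour performs task z only if it is strictly cheaper; ties go to capital."""
    return wage / gamma_labour(z) < rental_rate / gamma_capital(z)


def automation_threshold(tasks, wage: float, rental_rate: float):
    """Highest-indexed task performed by capital, or None if there is none."""
    capital_tasks = [z for z in tasks if not uses_labour(z, wage, rental_rate)]
    return max(capital_tasks) if capital_tasks else None


tasks = [i / 100 for i in range(101)]  # task index z on a grid over [0, 1]
threshold = automation_threshold(tasks, wage=2.0, rental_rate=1.0)
print(f"Tasks up to z = {threshold} use capital; higher-indexed tasks use labour.")
```

With these inputs the threshold falls at z = 0.34. Lowering the rental rate or raising capital productivity in tasks just above the threshold moves it upward, which is the extensive-margin automation discussed in the text.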
2.2. How AI could affect production
Before completing the characterization of equilibrium, I discuss how AI could affect production in this economy.
AI enables further (extensive-margin) automation, increasing I. Such automation could be triggered either because AI reduces the cost of capital for some marginal tasks (i.e. tasks slightly above I) or increases the effectiveness of machinery or algorithms performing some marginal tasks, thus raising the productivity of capital for some z above I. Obvious examples of this type of automation include generative AI tools, such as LLMs, that are taking over simple writing, translation and classification tasks, as well as somewhat more complex tasks related to customer service and information provision, or computer vision technologies taking over image recognition and classification tasks.
AI can generate new task complementarities, raising the productivity of labour in tasks it is performing.9 For example, AI could provide better information to workers, directly increasing their productivity. This possibility could be modelled as AI reducing the cost of complementary capital in some tasks z > I in the more general formulation in footnote 8. Alternatively, AI could automate some subtasks (such as providing ready-made subroutines to computer programmers) and simultaneously enable humans to specialize in other subtasks, where their performance improves. This channel would require the explicit modelling of the range of subtasks making up each task. In this case, new AI technologies would perform some of these subtasks and do so with sufficiently high productivity, so that the subtask-level displacement would be weaker than the productivity gains, expanding the demand for labour and the marginal productivity of labour in these tasks. The possibility that the productivity effect could be larger than the displacement effect and expand labour demand is the same as in the basic models of automation, as exposited in Acemoglu and Restrepo (2018, 2019b). Even more interestingly, AI may enable workers to specialize in the non-automated subtasks and raise their expertise in these activities (e.g. when humans spend less time in writing standard subroutines, they can become better at other parts of programming). I represent task complementarities by an increase in the productivity of labour in some tasks z > I, or when they happen in all tasks, by an increase in AL.
AI could induce deepening of automation – meaning improving performance or reducing costs in some previously capital-intensive tasks (tasks z ≤ I). Examples include IT security, automated control of inventories and better automated quality control (see Acemoglu and Restrepo, 2019a).
AI can generate new labour-intensive products or tasks, which corresponds to an increase in N. As argued in Acemoglu and Restrepo (2020b), Acemoglu (2021) and Acemoglu et al. (2023), there are many pathways for such new tasks. Later I also discuss the case where some of these new products and tasks can be manipulative and have negative social value.
The effects of new AI tools will depend on the extent of each one of these effects, and I will try to provide more specificity on these possibilities later. In the rest of this section, I derive the consequences of different effects of AI.
2.3. Equilibrium wages and comparative statics
In general, this expression has an ambiguous sign, so automation can increase or reduce wages. More specifically, there are two opposing effects (Acemoglu and Restrepo, 2018, 2019b): (1) automation always produces a positive effect on wages (and labour demand) because it increases productivity (or equivalently, reduces costs), and this positive productivity effect is represented by the first term; and (2) automation displaces workers from the tasks they used to perform, and this negative displacement effect is represented by the second term. In the special case where R(K) is constant, it can be verified that automation increases wages. This is not the case, in general, when R(K) is increasing, as shown in Acemoglu and Restrepo (2018), because the displacement effect can be larger than the productivity gains.10
Overall, the impact of (extensive-margin) automation on the equilibrium wage is closely tied to its productivity effect, to which I next turn.
Before doing so, I also note that the effects of task complementarities may be a little more complex than typically assumed. Even though an increase in the productivity of labour raises the marginal physical product of labour, the equilibrium wage is determined by the value of the marginal product of labour, which depends on the adjustment of task prices. As tasks produced by labour become more abundant/easier to perform, these task prices are reduced, and in the empirically relevant case where the elasticity of substitution between tasks is below one, task prices decline more than the increase in physical productivity. The equilibrium wage may still increase because of productivity gains, but the benefits to labour may be limited overall. For example, holding I and N constant, an increase in AL will leave wages constant when this elasticity equals the capital share in national income (Acemoglu and Restrepo, 2018). When the elasticity is below the capital share, higher AL can actually reduce real wages. Given the current capital share in the US economy, Humlum’s estimates of this elasticity mentioned above, of about 0.5, imply that task complementarities or labour-augmenting technological improvements will not raise wages much. Even when they increase wages, these technological shifts reduce the labour share, just like automation does (Acemoglu and Restrepo, 2018, 2019b).
Finally, I note that deepening of automation (which is less likely to be relevant in the case of AI for the reasons discussed in Section 1) and new tasks always increase wages and the latter always raises the labour share of national income as well (and thus tends to narrow the gap between capital and labour income). I return to a discussion of new tasks later.
2.4. Hulten’s theorem
I discuss below how this share can be estimated from recent studies on which occupations and tasks will be impacted by AI.
2.5. Easy tasks and hard tasks
As noted above, the proof-of-concept studies by Peng et al. (2023), Noy and Zhang (2023) and Brynjolfsson et al. (2023) have focused on tasks where AI could have clear benefits and where there were already sophisticated applications (such as GitHub Copilot for the first study) or where some businesses were already making an effort to use generative AI (as in the customer service application of the third study). I now argue that estimates from tasks that are relatively easy for current AI technologies, even if reliable within their chosen context, cannot be directly extrapolated to the rest of the economy. More generally, it is useful to distinguish ‘easy-to-learn tasks’, where outside learning (and thus learning by AI models) is easy, from those where gaining expertise on the basis of outside observation is hard. Easy-to-learn tasks, which are relatively straightforward for (generative) AI to learn and implement, are defined by two characteristics:
- there is a reliable, observable outcome metric; and
- there is a simple (low-dimensional) mapping between action and the outcome metric.
How to boil an egg (or providing instructions for boiling an egg), the verification of the identity of somebody locked out of a system and the composition of some well-known programming subroutines are easy tasks. The desired outcome – an egg that is boiled to the desired level, allowing only authorized people to access the system, or whether the subroutine works or not – is clear. In none of these cases do the successful outcomes depend on the complex interaction of many dimensions of actions. With reliable, objective measures of success (well-boiled egg, no security breach given the ground truth of authorized people or a subroutine that does not crash), AI models can learn to perform well in a relatively straightforward manner. Beyond this, AI models can also learn from human actions, because there are expert humans who perform well in these tasks, such as expert programmers, and because objective measures of success are available, experts can be identified.14
‘Hard tasks’ typically do not have a simple mapping between action and desired outcome. In hard problems, what leads to the desired outcome in a given problem is typically not known and strongly depends on contextual factors, or the number of relevant contexts may be vast, or new problem-solving may be required. Additionally, there is typically not enough information for the AI system to learn or it is unclear exactly what needs to be learned. Diagnosing the cause of a persistent cough and proposing a course of treatment is a hard problem. There are many complex interactions between past events that may be the cause of the lingering cough and many rare conditions that should be considered. Moreover, there is no large, well-curated dataset of successful diagnoses and cures. In hard tasks, AI models can still learn from human decision-makers, but because there is no clear metric of success, identifying and learning from workers with the highest level of expertise will not be straightforward either. As a result, there will be a tendency for the performance of AI models to be similar to the average performance of human decision-makers, limiting the potential for large productivity improvements and cost savings.
AI productivity gains observed so far are from easy tasks. It is reasonable to expect that productivity gains in hard tasks are more limited, at least at first. Productivity gains in easy tasks result from AI models performing these tasks more or less at the same level as expert workers, and/or at lower cost than humans. For example, in the Noy and Zhang (2023) study, expert workers are those that are able to summarize and write reasonably well. The desired outcome in this case is straightforward to determine and does not require new problem-solving efforts (this may have been different if the writing tasks were more complex or required ‘creativity’). Generative AI models, such as GPT-4, are trained on vast amounts of this type of writing, making this a clear example of an easy task. Productivity gains in this case come from the fact that lower-expertise or less-skilled workers can be brought up to the level of more expert workers at little cost, and the majority of the workers are helped to perform their assigned tasks more quickly. In easy problems, there may even be additional productivity gains from AI models discovering action combinations that were not typically tried or known by expert humans.15
There are several barriers to productivity gains in hard tasks. First, the lack of a simple mapping between action and desired outcome will make it much more difficult to train AI models, and early automation efforts and even human-complementary use of AI may be hampered by this slow learning, delaying any productivity gains. Second, human complementarity may be much harder to achieve. When there is no good information on what the desired outcome is (e.g. what was the right diagnosis for the persistent cough?), most of the training data of AI models will come from how humans act in similar circumstances. As a result, learning from humans will not lead to better than human performance and is unlikely to generate new complementarities and reveal different insights than what humans are already doing.16
I also attempt to provide estimates using this updated TFP equation below, which will lead to gains that are about 20% less than those that do not take the distinction between easy and hard tasks into account.
2.6. Investment responses
To go from TFP responses to total GDP responses, we need to see how much capital increases. This is also relevant for understanding what magnitude of an investment boom generative AI may trigger.
In what follows, I suppose that B and AK do not change. If AI significantly improves the practice of science and invention and/or creates new high-productivity tasks, such changes may occur in the future. But I assume that they are constant within the next 10 years.
Notice, however, that when the elasticity of substitution between tasks is below one, and in particular for the empirically plausible value of about 0.5, the first term is negative – an increase in the productivity of AI-augmented capital in capital-intensive tasks reduces investment and the equilibrium capital stock. The last term is also non-positive, provided that total investment increases. Hence, a natural upper bound for the proportional increase in the capital stock of tasks already performed by capital is the proportional increase in output.
Additionally, if I increases, investment will jump from zero to a positive amount in the newly automated tasks.18 Because in the estimates I report below, only a modest fraction of tasks will be automated with AI, this increase may be small as well. Hence, I start with a first estimate that takes the proportional increase in the capital stock to be the same as the increase in aggregate productivity. I then provide another estimate incorporating the full structure of between-industry and between-task substitution, based on the framework and results of Acemoglu and Restrepo (2022). This will enable me to estimate implied changes in the capital share and then use the expressions in footnote 12 to derive updated investment response and GDP estimates.
I finally note that it is straightforward to show that consumer welfare (and consumption) in this economy is proportional to TFP divided by one minus the capital share (i.e. TFP/(1 − capital share)). In particular, if the capital–output ratio rises and the increase in GDP is more pronounced than that implied by Equation (13), then the GDP response would overstate the increase in consumer welfare, as investment comes at the expense of consumption (Acemoglu and Restrepo, 2022).
2.7. New good tasks
This is always positive and could be large. Note also that the wage and productivity impact of new tasks can be potentially larger than cost savings in existing tasks, and this is particularly likely to be the case when new tasks improve the entire production process (as captured by the B term), or when they add new sources of cost improvements or complementary functions. Despite new tasks’ central role in wage and productivity growth and in reducing labour income inequality (see Acemoglu and Restrepo, 2018; Autor et al., 2024), I will not focus on new good tasks generated by AI for the reasons discussed in detail in Section 4.
2.8. New bad tasks
Although it is difficult to ascertain the magnitude of such negative welfare effects, I argue, based on some recent studies, that they may be non-trivial. Specifically, I use estimates on the relative magnitudes of the two terms in (18) and proxy for the magnitude of the first term by the revenues of tasks where AI can produce new bad tasks or socially harmful activities.
2.9. Wage and inequality implications of AI
A number of commentators and experts are cautiously optimistic that advances in generative AI could be beneficial for labour or at the very least not impact workers as adversely as previous waves of digital technologies, such as robotics and software systems, which were predominantly used for automation. There are four potential pathways through which such optimism may be realized.
AI can enable productivity increases in tasks currently produced by labour. This is the task-complementarities channel and can be captured either by an increase in AL or an increase in for a subset of the tasks that are automated or by increases in λU and (which, recall, are the productivities of unskilled and highly skilled workers). However, when , these types of productivity improvements will reduce the labour share, and thus inequality between capital and labour will increase.
If AI generates large productivity gains, it may increase wages even though it reduces the labour share (Acemoglu and Restrepo, 2018, 2019b). This channel thus critically hinges on the magnitude of the productivity effects discussed above, but in any case, always increases inequality between capital and labour.
As already discussed, some early studies show that within narrow occupations, lower-performing or lower-expertise workers are the ones benefiting from generative AI. This raises the possibility that AI could be more complementary to lower-skill workers and may reduce labour income inequality. In my framework, this would be captured by an increase in λU relative to λH. Even in this case, however, inequality between capital and labour will rise (provided that σ < 1).
If AI created new (good) tasks, these would reduce inequality between capital and labour, and if enough new tasks were targeting lower-skill workers, this could also reduce labour income inequality (Acemoglu and Restrepo, 2018).
I now argue theoretically that there are several reasons why 1 and 2 listed above are unlikely to be major sources of wage growth or significant limits on inequality. First, consider a 1% increase in AL (or an equivalent increase in for labour-intensive tasks). As explained above, this may not increase wages at all or may lead to only small wage increases. In fact, from Equation (11), abstracting from the productivity effect, the direct impact on the equilibrium wage will be a change. In the plausible case, where , this is negative. The overall impact may still be an increase in wages because of the productivity effect, but as already noted, when σ is approximately equal to the share of capital in national income, the overall impact will be essentially zero. In conclusion, without the creation of a sufficient number of new tasks, inequality between capital and labour will increase and wage rises may be limited.
What about a reduction in inequality because lower-skill workers benefit more? In the model here, the earnings of high-skill workers relative to low-skill workers are always pinned down at λH/λU. So if new technologies reduce this ratio, they will reduce the gap between high-skill and low-skill workers. But even this conclusion needs to be qualified. Acemoglu and Restrepo (2022) show that in more general settings, with multiple skill groups, there will be ripple effects whereby impacted demographic groups can then compete for tasks previously performed by other groups. In such a situation, an overall increase in labour productivity of both high-skill and low-skill workers in some tasks can lead to their displacement from these tasks, and then the ripple effects can, in principle, affect low-skill workers even more adversely than high-skill workers.
While such adverse effects on low-skill workers are a general possibility in the framework of Acemoglu and Restrepo (2022), I am not aware of worked-out examples where an increase in the productivity of low-skill workers increases inequality. I now provide such an example.
2.10. How greater low-skill productivity can lead to higher inequality
Let me illustrate this possibility with a simple example, by relaxing the assumption that there are no comparative advantage differences between high-skill and low-skill workers. Suppose that the economy starts from an equilibrium in which tasks below some I are performed by capital, and tasks between I and N are performed by a combination of high-skill and low-skill workers. In particular, suppose that low-skill workers have a comparative advantage in tasks between I and , and denote their (constant) productivity in these tasks by λU, while the productivity of high-skill workers in these tasks is also constant and equal to λH.
Suppose also that the relative productivity of high-skill workers in tasks between and N is – indicating that high-skill workers have a comparative advantage in higher-indexed tasks. However, because there is no strict comparative advantage, the equilibrium may involve both types of workers performing some of the same tasks.19 Assume also that the elasticity of substitution between tasks is .
Let us start with an equilibrium in which both high-skill and low-skill workers perform tasks , while only high-skill workers perform tasks . In this initial equilibrium, the relative wage of skilled workers will be pinned down by the relative productivities of the two types of workers in the tasks they are both performing – that is, – and is given by .
Suppose that labour productivity in increases due to advances in AI, and this also is more helpful for lower-skilled workers, and so declines. I now show that these advances could boost inequality. Because , the prices of tasks will decline more than the increase in physical productivity of labour, and there will be less labour assigned to these tasks. If this effect is significant, all high-skill workers may be allocated away from these tasks, and the amount of labour demanded in these tasks may fall short of the supply of low-skill workers. In this case, the post-AI allocation may involve only low-skill workers performing tasks , while both low-skill and high-skill workers perform tasks . Then, regardless of how much declines, the relative wage of skilled workers will be determined by the tasks that both types of workers are performing, which are now those above , and thus will be equal to . Hence, inequality increases following the rise in the productivity of low-skill workers.
This example shows that even a reduced productivity gap between low-skill and high-skill workers in some tasks could lead to greater inequality. Hence, the overall inequality implications of AI cannot be directly deduced from its effects on the performance of workers of different skills in a given set of tasks and require a fuller empirical exploration. In the next section, I investigate AI’s inequality effects by adapting the framework and estimates of Acemoglu and Restrepo (2022) to the current setting.
3. A PRELIMINARY QUANTITATIVE EVALUATION
In this section, I provide a preliminary quantitative evaluation of the possible effects of recent breakthroughs in AI over a horizon of 10 years. The centrepiece will be the use of Hulten’s theorem and recent estimates of which tasks can be automated using AI and computer vision technologies and the cost savings thereof. Once I obtain these estimates from Hulten’s theorem for TFP growth, I convert them to GDP growth estimates using another series of assumptions on how the capital stock of the economy will respond. I also discuss how AI’s effects through new bad tasks could be estimated. Finally, I combine these numbers with the more detailed framework in Acemoglu and Restrepo (2022) to obtain even more speculative estimates for wage and inequality implications. Before moving to the estimates, I first describe the sources I use, further discuss existing productivity estimates and motivate various parameter choices.
3.1. Data sources and parameter choices
The centrepiece of the estimates in this paper is Equation (14), and its refinement to Equation (15). To implement this equation, two pieces of information are needed:
GDP share of tasks that are impacted by AI (inclusive of computer vision) within the next 10 years.
Average cost savings in these tasks due to AI.
It is impossible to have accurate estimates of either of these two quantities, and hence I repeat that the numbers here – and for that matter, other estimates in the literature and the public debate – should be interpreted with great caution, as suggestive only. Nevertheless, there are studies that have already shed light on these quantities. I now discuss what these are and how I use them.
3.1.1. GDP share of tasks impacted by AI
The most careful estimates of which tasks are exposed to recent AI and computer vision advances come from Eloundou et al. (2024). These authors use two related methodologies for classifying which tasks are exposed to AI and computer vision. Both of those start from O*NET task and Detailed Work Activity (DWA) descriptions. The authors ask GPT-4 to classify all 19,265 tasks and 2,087 DWAs. They also develop a coarser index by manually classifying DWAs and then cross-validating their GPT-based measure with this ‘human’ coding. Here, I focus on the GPT-based measure, which allows greater granularity, as I explain next. In addition, Eloundou et al. (2024) distinguish between two measures. The first is a direct exposure (α) measure, which is based on their assessment of what LLMs can achieve now. The second, more aggressive measure (their so-called β measure) allows ‘indirect’ exposure to a hypothetical LLM+. This includes tasks that will be (possibly) exposed to new software and other advances building on current LLM and computer vision technologies.20
The index reported by Eloundou et al. in the paper is based on a binary coding, while they also construct an ‘automation’ index; in this case, their β measure includes granular information about what fraction of the activities might be impacted by LLM+ (ranging across 0%, 25%, 50%, 75% and 100%). In what follows, I take their automation index to be able to use this granular information. This index still contains both automation and task complementarities, even if it emphasizes automation a little more than their other exposure indices (because they code the granular information on the basis of information about what activities can be automated). In particular, as the authors themselves note, the impact of generative AI may often involve automation of some subtasks, allowing human workers to focus on other activities. The granular information contained in this index is especially useful for my purposes, because it provides an assessment of which tasks/occupations are less likely to be impacted by AI. I set all tasks that the authors classify as having 50% or less of their activities impacted by AI and computer vision to zero, and I refer to the rest as ‘AI exposed tasks’.
There are several problems that need to be tackled to turn these estimates into the quantities I need.
Although Eloundou et al.’s automation index emphasizes automation, it still includes elements of ‘augmentation’ or task complementarity. This motivates my interpretation that combining their measure with Equation (14) will capture cost savings from both automation and task complementarities.
Eloundou et al.’s data need to be converted into GDP shares. To do this, I combine tasks into occupations, and then I aggregate across occupations using their wage bills, computed from the US Bureau of Labour Statistics (BLS) National Occupational Employment and Wage Estimates pooled across the years 2019–22. This procedure yields a wage bill-weighted share of exposed occupations equal to 20%. I interpret this number to be the same as the GDP share of tasks exposed to AI.
Eloundou et al.’s approach is to determine tasks that can be ultimately performed by generative AI and computer vision technologies (such as the technology already incorporated in Dall-E). Two things are missing from the information they provide. The first is how much of the task impact is likely to be realized within the next 10 years. The second is whether, in all of these cases, it is profitable to use AI (e.g. whether automating using AI is cost-effective). Svanberg et al. (2024) make an attempt to provide answers to these questions in the case of computer vision technologies, which are a subset of the technologies Eloundou et al. consider. I take Svanberg et al.’s estimates and extrapolate them to all of the tasks Eloundou et al. consider.21 Namely, Svanberg et al.’s base estimates imply that among computer vision-exposed tasks, 23% can be feasibly (and profitably) automated within 10 years.22 Applying this number to Eloundou et al.’s estimates, I arrive at the GDP share of tasks impacted by AI within the next 10 years as about 4.6% (0.23 × 0.20 ≈ 0.046) of all tasks (or occupations).
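As a rough check on the arithmetic in this step, the following Python sketch multiplies the two inputs quoted above. The function and variable names are illustrative and do not come from the paper; reading the wage bill-weighted share as a GDP share follows the interpretation stated in the text.

```python
# Illustrative names; the two inputs are the figures quoted in the text.
EXPOSED_WAGE_BILL_SHARE = 0.20  # wage bill-weighted share of AI-exposed occupations
FEASIBLE_WITHIN_10Y = 0.23      # Svanberg et al. base estimate, extrapolated to all exposed tasks


def gdp_share_impacted(exposed_share: float, feasible_fraction: float) -> float:
    """Share of GDP in tasks that are both exposed and feasibly automated."""
    return exposed_share * feasible_fraction


share_10y = gdp_share_impacted(EXPOSED_WAGE_BILL_SHARE, FEASIBLE_WITHIN_10Y)
print(f"GDP share impacted within 10 years: {share_10y:.1%}")
```

Keeping the feasible fraction as a parameter makes it easy to rerun the calculation under the more aggressive adoption scenarios discussed later.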
3.1.2. Cost savings from AI
I base my estimate of average cost savings on the experimental studies that have already provided ‘proof-of-concept’ estimates of productivity improvements or labour cost reductions due to AI. Three studies, which I have already mentioned, are particularly notable here, and I now describe each one of them.
Peng et al. (2023) design an experiment where freelance computer programmers are given access to and encouraged to use GitHub Copilot, which (at the time of the experiment) was powered by OpenAI Codex (GPT-3). Participants were given the task of implementing an HTTP server in JavaScript, a popular language for which resources and subroutines are readily available and on which GPT-3 had already been trained. They compare the experimental treatment group to a control group that is not given access to GitHub Copilot. They find that the treatment group performed the assigned tasks on average 55.8% faster than the control group. As an aside, like the other studies I mention here, Peng et al. find that these improvements come from otherwise less well-performing subjects.
Noy and Zhang (2023) design an online experiment where individuals from a range of white-collar occupations are recruited and presented with simple writing tasks (in particular, tasks like writing press releases, short reports and analyses that are designed to take 20 to 30 min and resemble real-life tasks in the participants’ occupations). The treatment group is given access to and encouraged to use ChatGPT-3.5, while the control group is not. They verify that there is very low usage of ChatGPT in the control group. They estimate that ChatGPT enables, on average, 40% faster completion of the task at hand. They also estimate an 18% improvement in quality scores, as judged by peers and ChatGPT-based scoring of the output. In their case, too, the gains come mostly from subjects that performed less well before the experiment.
Brynjolfsson et al. (2023) is the only study that I am aware of that looks at the use of generative AI tools in a real business setting with a careful experimental design. The business they focus on is a customer service provider, which uses a custom generative AI tool to help customer service associates. The rollout took the form of a treatment group getting access to this tool, while the control group did not. Brynjolfsson, Li and Raymond evaluate the impact of the rollout on cost savings, by focusing on how quickly tasks (open customer tickets) are resolved. They also look at self-reported customer satisfaction. They find a significant improvement in the speed with which tasks are completed by customer service associates – an effect of about 14% on average. However, they also estimate a slight and statistically insignificant decline in quality, as judged by the users themselves. Like the other studies, Brynjolfsson, Li and Raymond find that the gains are concentrated among the lower-performing, less expert employees. In fact, they estimate that the top quintile of associates experience no improvement at all.
I interpret all three studies as providing labour cost savings from AI, broadly construed – in particular, meaning that all three studies include both automation and task complementarity elements. For instance, for the GitHub Copilot users in Peng et al., the authors’ interpretation is that some of the subtasks previously performed by programmers, such as the writing of common routines, are now done by the Copilot. Along the lines of the discussion in Section 2.2, suppose that programming in JavaScript involves N subtasks, which need to be completed for a successful program. These include initial planning (which approach to adopt, how to organize the program, etc.), composition of subroutines, putting the subroutines together, debugging the subroutines, debugging the master program and then assessing whether the program achieves the planned aims. When all of these subtasks are performed, then the task at hand – the writing of a specific computer program – is completed. We can think of generative AI as taking over a subset of the composition of subroutine tasks. Then in line with the framework in Acemoglu and Restrepo (2018, 2019b), the use of this new technology will create displacement and productivity effects. Overall performance will increase because of productivity effects, provided that this technology is better than/faster than humans at composing some subroutines (but humans are still needed for the other subtasks, including planning, debugging and checking). Demand for human labour in this task may increase or decrease depending on the magnitude of the displacement and productivity effects and the elasticity of substitution between this task and other tasks (as well as the demand elasticity for the product that is ultimately being produced for the market). 
In addition, when simple subroutines are taken over by the Copilot, this may enable human workers to specialize in other subtasks, potentially generating task complementarities and further productivity gains.23
What I have just described for programming also applies to the other two studies. ChatGPT is doing some of the drafting, which human subjects can then take and incorporate into their writing with some verification and modification. The same applies for the customer service associates, who are allowed to copy and paste text suggested by the generative AI tool in the setting studied by Brynjolfsson, Li and Raymond.
I also assess that the tasks in these three studies are broadly comparable to the exposed tasks considered in Eloundou et al. (2024), even though I argue later that they are more likely to be in the easy-to-learn category. Hence, these studies are most informative about cost savings (productivity improvements) for the exposed tasks in Eloundou et al.
As a baseline, I ignore the quality effects (which, as noted, are not uniform between the three studies) and focus on the average increase in speed and interpret this as average cost savings. I return to the inequality implications below. For this baseline, I use the average of the estimates from Noy and Zhang (2023) and Brynjolfsson et al. (2023), and turn to the average of the three studies as a robustness check. The reason for this is that Peng et al.’s setting is less likely to be relevant to other tasks and occupations, since the task in question is a very well-defined one for which GitHub Copilot was extensively trained, and this has no direct equivalent in the other tasks we are focusing on. Under these assumptions, the average labour cost savings are 27% (the mean of 40% and 14%).
Recall that these numbers refer to declines in labour costs in occupations where many tasks involve the use of both capital and labour, and what is relevant in Equation (14) is overall cost savings. To convert labour cost savings into overall cost savings, I use the industry-level estimates from Eloundou et al. and combine them with industry labour shares from the Bureau of Economic Analysis (BEA), as described in the Supplementary Appendix.24 This gives an average (AI exposure-adjusted) labour share of 0.535, and thus the average (overall) cost savings from AI are about 0.27 × 0.535 ≈ 0.144.
If we add the numbers from Peng et al. to the mix, the average labour cost savings become 0.36, and thus the average overall cost savings come to 0.36 × 0.535 ≈ 0.193.
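The conversion from study-level speed gains to overall cost savings can be reproduced with a short Python sketch. The names are illustrative rather than taken from the paper. Note that the unrounded mean of the three studies is about 0.366, whereas the text works with 0.36; the sketch uses the text's figure for the robustness case.

```python
from statistics import mean

# Speed gains reported in the three studies (as fractions, quoted in the text).
SPEED_GAINS = {
    "Peng et al. (2023)": 0.558,
    "Noy and Zhang (2023)": 0.40,
    "Brynjolfsson et al. (2023)": 0.14,
}
AI_ADJUSTED_LABOUR_SHARE = 0.535  # exposure-adjusted labour share from the text


def overall_cost_savings(labour_cost_savings: float, labour_share: float) -> float:
    """Scale labour cost savings by the labour share to get overall cost savings."""
    return labour_cost_savings * labour_share


# Baseline: average of Noy and Zhang and Brynjolfsson et al.
baseline_labour = mean([SPEED_GAINS["Noy and Zhang (2023)"],
                        SPEED_GAINS["Brynjolfsson et al. (2023)"]])
baseline_overall = overall_cost_savings(baseline_labour, AI_ADJUSTED_LABOUR_SHARE)

# Robustness check: all three studies, using the 0.36 figure from the text.
three_study_mean = mean(list(SPEED_GAINS.values()))
robust_overall_text = overall_cost_savings(0.36, AI_ADJUSTED_LABOUR_SHARE)

print(round(baseline_labour, 2), round(baseline_overall, 3))
print(round(three_study_mean, 3), round(robust_overall_text, 3))
```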
3.2. Aggregate productivity gains: a first pass
In other words, according to this basic estimation strategy, TFP gains over the next 10 years from AI are about 0.66% – meaning that relative to the baseline without the current suite of AI and computer vision advances, TFP will be higher by 0.66 percentage points in 10 years, or annual TFP growth will be higher by around 0.064%. This is a non-trivial but modest effect, and certainly much less than both the revolutionary changes some are predicting and the less hyperbolic but still substantial improvements forecast by Goldman Sachs and the McKinsey Global Institute, which I discussed in Section 1.
If we were to consider the higher productivity numbers from Peng et al., 0.144 would be replaced by 0.193, and the 10-year TFP gains would be 0.89%, instead of 0.66%.
The only modification that would make a sizable difference to these numbers is to increase the fraction of tasks that will be impacted over the next 10 years. One way of doing this is to inflate the numbers from Svanberg et al. This could be because either the fraction of tasks that can be feasibly automated will be different for generative AI than for computer vision, or because within 10 years this fraction will increase significantly. For example, in Svanberg et al.’s scenarios where costs for computer vision decline very rapidly, such as 10% a year, the fraction of tasks that are feasibly automated may be as high as 30%. This would raise the GDP share impacted by AI to approximately 6%, and correspondingly increase the TFP gains over the next 10 years to about 0.9%. Note that even this number is quite modest, and moreover, 10% per annum cost declines is quite aggressive (since, as already noted above, even if GPU costs were to decrease by 10% or even 20%, this would not lead to a 10% decline in costs of performing computer vision tasks, given the presence of other inputs, such as programming and data, as well as the inherent limitations of the current generative AI architecture).
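The scenarios in this subsection all follow the same pattern: the GDP share of impacted tasks multiplied by average overall cost savings. The sketch below assumes that this product form is what Equation (14), which is not reproduced in this excerpt, reduces to; the function name and scenario labels are illustrative.

```python
def tfp_gain(gdp_share: float, cost_savings: float) -> float:
    """Hulten-style TFP gain: impacted GDP share times average overall cost savings."""
    return gdp_share * cost_savings


# (GDP share impacted, overall cost savings), all figures quoted in the text.
scenarios = {
    "baseline": (0.046, 0.144),
    "including Peng et al.": (0.046, 0.193),
    "faster adoption (30% feasible)": (0.06, 0.144),
}

for name, (share, savings) in scenarios.items():
    print(f"{name}: {tfp_gain(share, savings):.2%}")
```

Because the gain is a simple product, doubling either input doubles the estimate, which is why only a much larger impacted share would change the picture materially.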
Finally, I note three important considerations missing from these computations.
These adoption numbers ignore the fact that there is still very little investment in AI in the US corporate sector. Acemoglu et al. (2022) estimate that less than 1.5% of US businesses had any investment in AI in 2019, and this is particularly true beyond the very large companies in manufacturing, information services and business services. Since many of the tasks considered in Eloundou et al. (2024) are performed in small- and medium-sized enterprises, this is unlikely to change quickly. If generative AI tools become monopolized in the hands of a few companies, this might further slow down their adoption by small- and medium-sized firms. These considerations suggest that even the 4.6% number for the share of GDP impacted by AI may be a big overestimate, and the true numbers could be much smaller.
Any major technology creates adjustment costs when adopted at large scale, because other organizational aspects need to evolve as well and this is typically quite costly and slow. In the context of digital technologies, Greenwood and Yorukoglu (1997) and Brynjolfsson et al. (2021), among others, have argued that productivity gains will take a J-shaped pattern, and the former paper predicts that the flat part of the J-curve lasts no less than 20 years for digital technologies. If so, the 14.4% overall cost reductions may be a significant overestimate for the next 10 years.
As already discussed above, some of the tasks in the list of Eloundou et al. (2024) are hard-to-learn, where productivity gains may be significantly less than those based on the experimental studies that have focused on the easy-to-learn tasks.
In the next subsection, I make a preliminary attempt at incorporating the third possibility, but I will ignore the first two. Nevertheless, these considerations make me conclude that even the 0.66% increase in TFP within the next 10 years due to AI is likely to be an upper bound on this technology’s medium-run effects.
3.3. Aggregate productivity gains: incorporating hard tasks
I take the 27% labour cost savings, from Noy and Zhang (2023) and Brynjolfsson et al. (2023), to apply to easy-to-learn tasks. The discussion in the previous section suggests that most hard tasks have not been impacted or automated yet, and hence it is impossible to know what their cost savings will be. Here I take those productivity gains to be 7%. My reasoning is as follows. I consider the tasks involved in the Peng et al. (2023) study to be very easy-to-learn for generative AI for reasons explained above. Those studied in Noy and Zhang (2023) are also on the easy side, and led to cost savings of about 40%, which is about two-thirds of the cost savings in Peng et al., while the customer service tasks in Brynjolfsson et al. (2023) are already moving towards somewhat more complex tasks, and these had cost savings of only 14%. I imagine that many of the hard-to-learn tasks are more challenging for AI models than the simpler end of the customer service tasks to which Brynjolfsson et al.’s numbers refer. This motivates my choice of half of the cost savings of their study, 7% (which is also about a quarter of the baseline 27% cost reduction estimate I used in the previous subsection).
The cost-saving numbers for both easy and hard tasks are again multiplied by 0.535 to convert them into overall cost savings—0.144 and 0.037 for easy and hard tasks, respectively. To obtain the shares of easy and hard tasks, I start from Eloundou et al.’s data and methodology, and then develop a procedure, implemented using GPT-4 like they do, for sorting these into easy and hard tasks. The key characteristic for easy tasks is the presence of a well-observed outcome and a straightforward rule that links actions/recommendations to characteristics of the problem at hand.
To implement this procedure, I start from the 4,089 exposed tasks determined from the above procedure. Each one of these tasks has a statement on O*NET that includes at least one verb. Each task also belongs to the higher level of aggregation of DWAs and the (even coarser) 332 Intermediate Work Activities (IWAs). The procedure proceeds in four steps:
Classification of verbs: Each task statement includes at least one verb, which is located at the beginning and can be easily identified. These primary verbs describe much of what is ‘hard’ or ‘easy’ that needs to be learned about the task. We manually classified verbs into easy and hard categories.25 The full list of verbs is provided in the Supplementary Appendix. As an illustration, verbs for easy tasks include, among others: compute, resolve, count, draft, grade, transcribe, classify, standardize, write and record. Many of these verbs are associated with simple actions that follow a clear set of steps and also implicitly have a well-defined metric for success (such as accounting or grading). In contrast, verbs for hard tasks include: participate, advise, instruct, diagnose, educate, hire, represent, testify and care. The latter set of verbs describes more open-ended activities for which there is less likely to be a clear metric of success. Yet other verbs, such as analyse, maintain or inspect, do not fall into either of these categories and are coded as ‘uncertain’.
Classification of IWAs: IWAs provide additional context for verbs, especially for actions that verbs alone lack. We manually classify the 332 IWAs into easy and hard tasks. Some of the easy IWAs are: evaluate project feasibility; maintain sales or financial records; explain regulations, policies or procedures; issue documentation; and teach safety procedures or standards to others. Some of the hard ones include: monitor health conditions of humans or animals; evaluate scholarly work; evaluate the quality or accuracy of data; maintain safety or security; and train animals. These examples highlight the same principle mentioned in the discussion of verbs – easy-to-learn activities are those for which there is a clear metric of success and simple rules that can achieve this successful outcome, while these are absent in hard-to-learn activities.26
Latent Dirichlet Allocation (LDA) topic modelling: While tasks may share IWAs and verbs, each task is also worded uniquely to capture the subject and the more detailed description of activities. While ‘writing an audit report’ and ‘writing a letter of recommendation’ are both associated with the same verb and the same IWA, the content and skills required for these tasks are different. To better disambiguate the contexts of these tasks, we use LDA to allocate tasks into clusters. The LDA procedure is unsupervised and clusters tasks using the co-occurrence matrices of words extracted from each task statement. It assigns a probability to each task belonging to a topic cluster (Blei et al., 2003). We feed all 19,265 tasks into the algorithm and use LDA to identify 100 clusters and the probability that each task belongs to one of those 100 clusters.
Final assignment: The final step of our procedure is to derive a probability that each exposed task is easy or hard. We first manually classify a random sample of 500 exposed tasks as easy or hard. We then use the classification of verbs, classification of IWAs and the LDA-derived probabilities to train a gradient-boosted tree (Friedman, 2001) to match the manual classifications of the training sample of tasks. We finally obtain a probabilistic assignment of each one of the 4,089 exposed tasks into easy and hard task categories from this algorithm.27
Unsurprisingly, the resulting estimate (a TFP gain of about 0.53% over 10 years) is smaller than the estimate in the previous subsection. Since automating hard-to-learn tasks – and even more so, introducing task complementarities for them – will be more challenging, I view this estimate to be more reasonable. Either way, the TFP gains within the next 10 years appear quite modest.
3.4. From TFP to GDP
Using the capital share for the entire private business sector, 0.43, this implies that GDP gains will be equal to the TFP gains multiplied by 1.75 (1/(1 − 0.43) ≈ 1.75). Hence, taking the baseline estimate of an increase in TFP of 0.66%, I obtain a first estimate for GDP growth due to AI of 1.16% over 10 years, or, taking the presence of hard tasks into account, a lower estimate of 0.93%.
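The step from TFP to GDP can be checked with a few lines of Python. This is a minimal sketch, assuming that the factor of 1.75 is the reciprocal of one minus the capital share (which matches the reported figures); the function name `gdp_gain_from_tfp` is illustrative and not from the paper.

```python
def gdp_gain_from_tfp(tfp_gain_pct: float, capital_share: float = 0.43) -> float:
    """Scale a TFP gain (in %) by 1 / (1 - capital share) to get a GDP gain (in %).

    Assumption: the multiplier in the text is 1 / (1 - 0.43), roughly 1.75.
    """
    if not 0.0 <= capital_share < 1.0:
        raise ValueError("capital_share must lie in [0, 1)")
    return tfp_gain_pct / (1.0 - capital_share)


multiplier = 1.0 / (1.0 - 0.43)
baseline_gdp = gdp_gain_from_tfp(0.66)        # baseline TFP estimate
hard_adjusted_gdp = gdp_gain_from_tfp(0.53)   # estimate allowing for hard tasks

print(f"multiplier: {multiplier:.2f}")
print(f"baseline GDP gain over 10 years: {baseline_gdp:.2f}%")
print(f"hard-task-adjusted GDP gain over 10 years: {hard_adjusted_gdp:.2f}%")
```

Rounded to two decimals, the two calls reproduce the 1.16% and 0.93% figures quoted in the text.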
I update the GDP effects of AI advances in Section 3.6 when I model the between-task and between-industry substitution patterns and the resulting investment response.
3.5. Consequences of new bad tasks
The calculations so far leave out the effects of new tasks introduced thanks to AI (and equivalently, the system-wide adjustments that AI may enable in some businesses), as emphasized by Bresnahan (2019) and Agrawal et al. (2024). It is even more challenging to put numbers on the effects of new tasks. If AI helps create new tasks that increase productivity and especially contributes to the reinstatement of workers of different skill levels into the production process, its consequences can be much more positive.
Here my purpose is simply to point out that, in the case of AI, because some of the new tasks may be of the ‘bad’ type, there may be an overstatement of the welfare gains from AI when we look at GDP numbers. As an extremely preliminary attempt to argue that this could be important, I will draw on two sources of data.
The first is the recent study by Bursztyn et al. (2023), which provides suggestive estimates for the extent of this problem in the case of social media. Bursztyn et al. (2023) run an experiment to evaluate the welfare effects of social media for college students, the demographic group most engaged with social media. They ask TikTok and Instagram users how much they would need to be paid to not use the platform for a month. Users of these two platforms are willing to pay up to $59 and $47 a month, respectively, to continue to use social media – or an average of $53 per user-month. But they also find that users are simultaneously willing to pay $28 and $10, respectively, every month to get everyone from their social network off TikTok and Instagram – or an average of $19 per user-month. If non-users are also included in the analysis, the willingness to pay to stop everybody using social media increases to $47 for TikTok and $13 for Instagram. This suggests that while companies can profitably market AI-based social media, the net effect on welfare may be negative. Quantitatively, I ignore the non-users (since the population in the study is college students, who are more likely to be impacted by social media activity even when they do not use it than the adult population in the United States). I also start with a benchmark in which AI-powered platforms can capture the full (average) $53 willingness to pay per user (e.g. because they are effectively price discriminating by varying the intensity with which they are collecting and monetizing users’ data). Then taking the average between the two platforms, I conclude that for every $53 of revenue, there is a net negative effect on users of $19. Put differently, the total effect from this class of new bad tasks, the equivalent of (18) in theory, is −19/53 ≈ −0.36 in proportionate terms.
Note also that if I assumed instead that social media companies can capture less than the full $53 per user-month, then the denominator in this expression would be smaller, and thus the proportionate damage per dollar of revenue would be higher.
I combine the number −0.36 with estimates of (1) revenues from social media and digital ads and (2) spending on malicious IT attacks and IT security against these attacks to compute apparent and real gains from new bad tasks. Specifically, the total revenues in 2022 of Meta (Facebook plus Instagram), Alphabet (Google and YouTube), Snapchat, TikTok and Twitter (‘X’) come to $460 billion, or about 1.64% of US GDP,28 while a lower bound on IT security spending is $78 billion or 0.28% of US GDP.29 Assuming that spending by malicious actors on IT attacks is at least one-third of this and combining these three estimates, I arrive at 2% of US GDP. This number could be a significant underestimate, because it omits other manipulative activities enabled by new AI technologies. On the other hand, it could also overstate the problem, since only a fraction of digital ad revenues will come from such manipulative activities (and this may be particularly true for digital ad revenues from Google search). These considerations suggest that the numbers here should be taken as merely suggestive.
Taking the 2% of US GDP as revenues from new bad tasks and using the numbers from Bursztyn et al. (2023) suggests that the total negative effect of these manipulative activities is 0.36 × 2% = 0.72% of GDP (in consumption equivalent units). This number points to a sizable negative impact on welfare, and could be larger if manipulative uses of generative AI become more widespread. In contrast, if one were to simply count the revenue coming from these new tasks, one might conclude that they would increase GDP by 2%. This discussion thus suggests caution in interpreting all increases in GDP coming from the use of generative AI as a positive effect on consumer welfare.
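The back-of-the-envelope calculation in this subsection can be retraced as follows. This is only an illustrative sketch using the figures quoted in the text; the variable names are hypothetical, and the one-third ratio for attack spending is the assumption stated above.

```python
# Willingness-to-pay figures quoted in the text from Bursztyn et al. (2023),
# in dollars per user-month.
wtp_keep_using = {"TikTok": 59, "Instagram": 47}
wtp_remove_everyone = {"TikTok": 28, "Instagram": 10}

avg_keep = sum(wtp_keep_using.values()) / len(wtp_keep_using)
avg_remove = sum(wtp_remove_everyone.values()) / len(wtp_remove_everyone)

# Net damage per dollar of revenue, assuming platforms capture the full average
# willingness to pay (the benchmark used in the text).
damage_per_dollar = -avg_remove / avg_keep

# Revenue and spending figures as shares of US GDP (in %), as quoted in the text.
social_media_and_ads = 1.64
it_security = 0.28
it_attacks = it_security / 3  # text's assumption: at least one-third of security spending

bad_task_revenue = social_media_and_ads + it_security + it_attacks

# Welfare effect in % of GDP, before rounding the inputs.
welfare_effect = bad_task_revenue * damage_per_dollar

print(f"average WTP to keep using: ${avg_keep:.0f}")
print(f"average WTP to remove everyone: ${avg_remove:.0f}")
print(f"damage per dollar of revenue: {damage_per_dollar:.2f}")
print(f"revenue from new bad tasks: {bad_task_revenue:.2f}% of GDP")
print(f"welfare effect: {welfare_effect:.2f}% of GDP")
```

With the rounded inputs used in the text (2% of GDP and −0.36), the welfare effect is −0.72% of GDP; the unrounded inputs give essentially the same figure.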
3.6. Wage and inequality implications
Here, g refers to demographic group g, and following their paper, I will focus on 500 demographic groups, defined by education, age group, gender, ethnicity and native versus foreign-born status. In addition, is the change in GDP resulting from the technology change (capturing the productivity effect), σ is the elasticity of substitution between tasks and is the vector of induced industry shifts (e.g. because automation of tasks in one industry affects prices and causes a reallocation of spending across sectors). Most importantly, is the vector of demographic group-level displacement caused by the technology shock – namely, it is a column vector of 500 entries, and Θg is the gth-row vector of the propagation matrix, which summarizes how the displacement of other demographic groups affects demographic group g (this is the reason why it is pre-multiplying the vectors of industry shifts and displacements for all groups).
The propagation matrix represents the full ‘ripple effects’ – the impact of the displacement of one demographic group on others, as they leave the tasks they were previously performing and compete with other groups to be employed in other tasks. Such reallocations are the key channel through which direct productivity gains for a group may end up harming it (as my example in the previous section illustrated). They are also the mechanism through which the displacement of a demographic group may end up being more damaging to another demographic group.
The ripple effects estimated in Acemoglu and Restrepo (2022) may be context-specific – meaning that the magnitudes of these effects could be quite different for the tasks impacted by AI and automation technologies that my previous work focuses on. Nevertheless, since it is impossible to estimate these ripple effects for the future impact of generative AI technologies, I will use these existing estimates.
An additional issue is that I have so far interpreted AI exposure to include both automation and task complementarities. In this subsection, I ignore the task complementarities and presume that all of the AI exposure quantified so far will take the form of automation.30
The methodology in Acemoglu and Restrepo (2022) starts from the displacement vector, which is defined at the demographic group level. To construct an equivalent of this measure, I take the set of exposed occupations, and then use the wage bill shares of different demographic groups in these occupations to map the AI-generated displacement to the demographic group level. For example, if for demographic group g 5% of the wage bill share in 2019–22 were in fully-exposed occupations, then the displacement measure for that group would be 0.05. I also assign these occupations to industries using wage bill shares in order to compute industry-level impacts on costs and prices.31 I compute the induced sectoral reallocations in the same way as in Acemoglu and Restrepo (2022), using their parameterization of the inter-sector elasticity of substitution and the estimate of σ as in Humlum (2021), and I take the elasticity of substitution between sectors in consumption as in Buera et al. (2022), which was also imposed in Acemoglu and Restrepo (2022).
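The mapping from occupation-level exposure to a demographic group can be sketched as a wage-bill-weighted sum. This is a minimal illustration and not the paper's code: the function name, the occupation labels and the numbers are hypothetical, chosen only to reproduce the 5% example in the text.

```python
def group_displacement(wage_bill_shares: dict[str, float],
                       occupation_exposure: dict[str, float]) -> float:
    """Wage-bill-weighted exposure of one demographic group.

    wage_bill_shares: share of the group's wage bill earned in each occupation
        (should sum to 1 across occupations).
    occupation_exposure: fraction of each occupation treated as exposed
        (1.0 means fully exposed; occupations not listed count as 0.0).
    """
    total_share = sum(wage_bill_shares.values())
    if abs(total_share - 1.0) > 1e-9:
        raise ValueError("wage bill shares must sum to 1")
    return sum(share * occupation_exposure.get(occupation, 0.0)
               for occupation, share in wage_bill_shares.items())


# Hypothetical group: 5% of its wage bill is in a fully exposed occupation.
shares_g = {"fully_exposed_occupation": 0.05, "unexposed_occupation": 0.95}
exposure = {"fully_exposed_occupation": 1.0, "unexposed_occupation": 0.0}

print(group_displacement(shares_g, exposure))
```

The same weighting logic, with industry wage bill shares in place of group shares, would give the industry-level assignment mentioned above.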
Figure 1 is the equivalent of Figure 5b in Acemoglu and Restrepo (2022) and presents the distribution of AI exposure across demographic groups sorted by their hourly wage in 2018–22. It shows that AI exposure is much more equally distributed across demographic groups than pre-AI automation (which was based on robotics, dedicated advanced machinery and software systems).

Figure 1. Distribution of AI exposure across the wage distribution.
Notes: This figure depicts the AI exposure measure (both easy and hard tasks, combined) across 500 demographic groups. The horizontal axis gives the average hourly wage of each demographic group between 2018 and 2022, computed from the 5-year American Community Survey sample. Marker sizes are proportional to the average 2018–22 employment level of each group and different shades indicate the education level of the group.
Table 1 presents the main results. The seven rows are for the five education groups (aggregating demographic groups according to their education level), for the average of the workforce and for GDP. The first column gives the AI exposure for each one of the demographic groups, using our baseline exposure measure that does not distinguish easy and hard tasks. The second column presents the direct impact of AI exposure on each one of these groups, while the third column contains the full wage impact, taking into account induced substitution between industries and the ripple effects. The next four columns are similar but now refer to the AI exposure measure that separates easy and hard tasks – column 4 gives exposure to easy tasks, and column 5 is for exposure to hard tasks. Finally, column 8 shows the exposure measure from Acemoglu and Restrepo (2022) for comparison (but recall that this measure refers to a 36-year period, rather than the 10-year timescale here). The comparison of columns 1, 4 and 5 to column 8 confirms that AI exposure is much more equally distributed across demographic groups than pre-AI automation exposure. Workers with less than high school have the lowest exposure, followed by postgraduates, while workers with college degrees and those with associate degrees or some college have the highest exposure.
Table 1. Columns (1)–(3): baseline exposure. Columns (4)–(7): exposure adjusted for easy and hard tasks. Column (8): comparison measure.
Group | (1) Baseline AI exposure | (2) Direct effect of AI exposure | (3) Total wage effect of AI exposure | (4) Exposure to easy tasks | (5) Exposure to hard tasks | (6) Direct effect of exposure to easy and hard tasks | (7) Total wage effect of exposure to easy and hard tasks | (8) Direct task displacement 1980–2016 |
---|---|---|---|---|---|---|---|---|
Workers with less than high school degree | 0.0318 | −0.0323 | 0.0157 | 0.0235 | 0.0083 | −0.0356 | 0.0132 | 0.2690 |
Workers with high school degree | 0.0464 | −0.0617 | 0.0106 | 0.0337 | 0.0128 | −0.0649 | 0.0079 | 0.2706 |
Workers with some college | 0.0494 | −0.0676 | 0.0139 | 0.0357 | 0.0138 | −0.0709 | 0.0112 | 0.1886 |
Workers with Bachelor’s degree | 0.0523 | −0.0733 | 0.0060 | 0.0382 | 0.0140 | −0.0765 | 0.0040 | 0.0684 |
Workers with postgraduate degree | 0.0405 | −0.0498 | 0.0098 | 0.0299 | 0.0106 | −0.0530 | 0.0081 | 0.0343 |
Average worker | 0.0462 | −0.0612 | 0.0101 | 0.0336 | 0.0126 | −0.0644 | 0.0077 | 0.2107 |
GDP | | | 0.0156 | | | | 0.0140 | |
Notes: This table summarizes the effects of AI exposure on the real wages of different education groups and all workers. The first three columns are for the AI exposure measure that does not distinguish easy and hard tasks, while columns 4–7 are for the measure that introduces this distinction. Columns 1, 4 and 5 present levels of AI exposure for these different cuts. Columns 2 and 6 provide the direct effects of AI exposure, while columns 3 and 7 include the full equilibrium wage impacts, incorporating the ripple effects. Column 8, for comparison, includes the direct task displacement measure for 1980–2016 from Acemoglu and Restrepo (2022). The rows are for different education groups as well as for the average worker and for the overall GDP impact. The propagation matrix estimates are for 1980–2016, as reported in column 2 of Table 8 in Acemoglu and Restrepo (2022).
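As a reading aid, the exposure columns of the table can be cross-checked in a few lines. This sketch assumes that baseline exposure (column 1) should roughly equal the sum of easy-task and hard-task exposure (columns 4 and 5); the text does not state this identity, but it holds in the reported numbers up to rounding. The short group labels and the tolerance are illustrative choices.

```python
# (baseline, easy-task, hard-task) exposure: columns 1, 4 and 5 of the table.
TABLE_1 = {
    "Less than high school": (0.0318, 0.0235, 0.0083),
    "High school": (0.0464, 0.0337, 0.0128),
    "Some college": (0.0494, 0.0357, 0.0138),
    "Bachelor's": (0.0523, 0.0382, 0.0140),
    "Postgraduate": (0.0405, 0.0299, 0.0106),
    "Average worker": (0.0462, 0.0336, 0.0126),
}

# Entries are reported to four decimals, so allow a gap of about one unit
# in the last digit.
ROUNDING_TOLERANCE = 0.00015


def decomposition_gap(baseline: float, easy: float, hard: float) -> float:
    """Absolute difference between baseline exposure and easy + hard exposure."""
    return abs(baseline - (easy + hard))


for group, (baseline, easy, hard) in TABLE_1.items():
    gap = decomposition_gap(baseline, easy, hard)
    status = "ok" if gap < ROUNDING_TOLERANCE else "check"
    print(f"{group:<24} baseline={baseline:.4f} easy+hard={easy + hard:.4f} {status}")

# Rank the five education groups from lowest to highest baseline exposure.
education_groups = [g for g in TABLE_1 if g != "Average worker"]
ranked = sorted(education_groups, key=lambda g: TABLE_1[g][0])
print(ranked)
```

The ranking agrees with the description in the text: the lowest exposure is for workers with less than high school, followed by postgraduates, with the highest values for workers with Bachelor's degrees and some college.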
Consequently, the predicted wage changes do not appear to have a large effect on inequality between education groups. Focusing on the estimates in column 7, which incorporate easy and hard tasks, there is slightly higher wage growth for workers with less than a high school degree – about 1.3% within 10 years – but the gap between postgraduate workers and high school and college graduate workers also widens somewhat. In fact, the between-group standard deviation of log wages (weighted by employment) increases slightly, from 0.35 to 0.36. The results are quite similar in column 3 for the baseline AI exposure measure.
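The inequality statistic used here, the employment-weighted between-group standard deviation of log wages, can be computed as in the sketch below. The three groups, their wages, employment weights and wage changes are hypothetical placeholders, not the paper's 500 demographic groups, so the printed values do not correspond to the 0.35 and 0.36 reported above.

```python
import math


def weighted_std(values: list[float], weights: list[float]) -> float:
    """Weighted (population) standard deviation of values."""
    if not values or len(values) != len(weights):
        raise ValueError("values and weights must be non-empty and of equal length")
    total = float(sum(weights))
    mean = sum(w * v for v, w in zip(values, weights)) / total
    variance = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total
    return math.sqrt(variance)


# Hypothetical groups: average hourly wages, employment weights and predicted
# changes in log wages.
log_wages = [math.log(15.0), math.log(25.0), math.log(45.0)]
employment = [30.0, 50.0, 20.0]
log_wage_changes = [0.013, 0.008, 0.004]

sd_before = weighted_std(log_wages, employment)
sd_after = weighted_std(
    [lw + dw for lw, dw in zip(log_wages, log_wage_changes)], employment
)

print(f"between-group SD of log wages before: {sd_before:.4f}")
print(f"between-group SD of log wages after:  {sd_after:.4f}")
```

A uniform wage change across groups leaves this statistic unchanged, which is why the uniform productivity effect on its own does not move between-group inequality.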
Finally, row 7 presents the estimate of the impact on GDP, taking into account the equilibrium increase in the capital stock implied by the model. This is under the assumption that all of the capital stock adjustment will take place within 10 years (while in practice, it may take longer) and that the required rate of return on capital investments does not change (whereas with a sizable investment, we may expect an increase). This leads to an upper bound GDP impact of 1.4% in column 7, when I distinguish between easy and hard tasks (or 1.56% in column 3 when this distinction is not drawn). GDP therefore increases substantially more than average wages, and as a result, the capital share of national income increases by about 0.31 percentage points. This confirms that inequality between capital and labour is likely to rise as a result of the rollout of AI.32
Figure 2 performs the same exercise as Figure 7 in Acemoglu and Restrepo (2022), focusing on the exposure measure that distinguishes between easy and hard tasks. As in that paper, productivity effects are (by construction) uniform across groups, and there is also not much inequality generated by the cross-industry shifts shown in the second panel. The third panel confirms that AI’s direct effects are more equally distributed across demographic groups and throughout the wage distribution. In contrast to the findings in Acemoglu and Restrepo (2022), ripple effects do not change the inequality patterns by much, largely because the direct effects are already fairly equally distributed. Figure 2 also indicates that there is a lot of variability in the experience of low-education groups – and this is the reason why the between-group standard deviation of log wages increases, as noted above.
Figure 3 explores this issue further by depicting the real wage changes of finer groupings, distinguished by gender, by education and by white and native-born status versus the rest. It reveals that low-education women, especially white, native-born low-education women, are likely to experience declines in real wages as a result of AI.

Figure 2. Decomposition of productivity effects, industry shifts, direct displacement effects and ripple effects.
Notes: This figure is based on the estimates of the propagation matrix from Acemoglu and Restrepo (2022) and combines this with the measure of exposure to easy and hard AI tasks in this paper. The first panel includes just the productivity effect. The second panel adds the industry shifts induced by AI exposure. The third panel incorporates the direct displacement effect, while the final panel adds the ripple effects. The horizontal axis gives the average hourly wage for the relevant demographic group between 2018 and 2022, computed from the 5-year American Community Survey sample. Marker sizes are proportional to the average 2018–22 employment level of each group and different shades indicate the education level of the group. See text for details.

Figure 3. Total wage effect of exposure to AI, by gender.
Notes: This figure is based on the estimates of the propagation matrix from Acemoglu and Restrepo (2022) for 1980–2016 and combines this with the measure of exposure to easy and hard AI tasks in this paper. Each panel includes wage effect estimates for five education groups. Reported estimates are weighted averages of the estimates for the more detailed subgroups (using average employment 2018–22 as weights). The upper left panel is for native-born white men, the lower left is for all other men, the upper right is for native-born white women and the lower right is for all other women.
Overall, this exercise suggests that the inequality consequences of AI will not be as adverse as those of pre-AI automation, because AI exposure is more equally distributed across demographic groups. Nevertheless, there is no evidence that AI will reduce inequality, as some are forecasting. Rather, my analysis suggests that it may slightly increase overall (between-group) inequality and reduce the real earnings of low-education women. It will also further widen the gap between capital and labour income.
4. CONCLUDING REMARKS AND DISCUSSION
Following its release on 30 November 2022, ChatGPT became the fastest spreading tech platform in history, reaching approximately 100 million monthly users within just 2 months. Its impressive features, and the greater capabilities of the newer version of ChatGPT-4 released in March 2023, soon captured imaginations, both among the general public and economic commentators. Forecasts of large productivity gains have now become commonplace.
While there is no question that generative AI models, including ChatGPT, have impressive achievements and have great potential for beneficial economic effects, the extent of their macroeconomic consequences remains an open question.
There are four potential types of macro effects that AI technologies can have in the medium run:
1. They can quickly revolutionize every aspect of the economy, and lead to massive improvements in productivity, even taking us close to ‘singularity’. While this is a possibility that cannot be completely ruled out, there is so far no evidence of such revolutionary effects (Nordhaus, 2021), and these are not addressed in the current paper.
2. They can have more modest but still notable effects on the macroeconomy by improving productivity and reducing costs in a range of tasks. Some of the forecasts have focused on these types of improvements and still produced relatively large numbers, such as a percentage point per annum increase in economic growth within a 10-year horizon.
3. They can impact wages and inequality due to their automation effects or conversely, lead to large wage increases, especially for lower-pay workers, as forecast by The Economist (2023).
4. They can have macroeconomic effects by producing deepfakes, misinformation, manipulation and other ‘bads’.
In this paper, I use the task-based framework of Acemoglu and Restrepo (2018, 2019b, 2022) to evaluate the second and the third effects, and I also take some steps to formalize how the fourth set of effects might work out in a task-based macro framework.
I base my approach on existing experimental studies that estimate productivity gains and time savings from using generative AI tools in a number of settings. By building on these studies, I am explicitly taking on board the idea that generative AI will lead to productivity improvements. Nevertheless, combining these numbers with estimates of exposed tasks from Eloundou et al. (2024) and Svanberg et al. (2024) leads to much more modest productivity effects than most commentators and economists have claimed so far. These numbers become even smaller once we take into account that many of the tasks for which we have evidence of cost savings are relatively easy for AI, while in several other tasks the integration of AI will face more formidable difficulties – mostly because these are likely to involve more complex interactions between action and context and because they lack clear metrics for success that are observable, and hence necessitate AI models to learn from the (average) behaviour of humans previously performing the same tasks.
Taking these considerations into account, I estimate that TFP effects from AI advances within the next 10 years will be modest – an upper bound that does not take into account the distinction between hard and easy tasks would be about a 0.66% increase in total within 10 years, or about a 0.064% increase in annual TFP growth. When the presence of hard tasks among those that will be exposed to AI is recognized, this upper bound drops to about 0.53%. GDP effects will be somewhat larger than this because automation and task complementarities will also lead to greater investment. But my calculations suggest that the GDP boost within the next 10 years should also be modest, in the range of 0.93%–1.16% over 10 years in total, provided that the investment increase resulting from AI is modest, and in the range of 1.4%–1.56% in total, if there is a large investment boom.
If AI is used to create new tasks and products, these will also add to GDP and can boost productivity growth. Nevertheless, when we incorporate the possibility that new tasks generated by AI may be manipulative, the impact on welfare can be even smaller. Based on numbers from Bursztyn et al. (2023), which pertain to the negative effects of AI-powered social media, I provide an illustrative calculation for social media, digital ads and IT defence-attack spending. These could add to GDP by as much as 2%, but if we apply the numbers from Bursztyn et al. (2023), their impact on welfare may be −0.72%. This discussion suggests that it is important to consider the potential negative implications of AI-generated new tasks and products on welfare.
Finally, I borrow heavily from the estimates of Acemoglu and Restrepo (2022) on the economy-wide productivity, wage and inequality effects of pre-AI automation technologies to provide some guidance on what the impact of new AI advances will be. Because AI-exposed tasks are more equally distributed within the population than tasks exposed to pre-AI automation, I do not find substantial negative wage effects for any education group. Nevertheless, the estimates do not point to significant reductions in inequality either, and in fact, my findings suggest that low-education women may experience small wage declines, overall between-group inequality may increase slightly, and the gap between capital and labour income is likely to widen further.
These results should not be interpreted as arguing that there are no major benefits from AI. First, an increase of about 0.53%–0.66% in TFP within 10 years is modest but still far from trivial. Second and more importantly, there may be other ways in which AI can be used to generate much more notable benefits. I have suggested in previous work (Acemoglu and Restrepo, 2020b; Acemoglu, 2021) that if AI is used for generating new tasks for workers, it can have more beneficial productivity, wage and inequality consequences, and it can even increase wages. This may be doubly true for generative AI, which could be used for providing better information to workers and boosting their expertise, as argued in Acemoglu et al. (2023) and explained briefly here.
Many production workers today, including electricians, repair workers, plumbers, nurses, educators, clerical workers and increasingly many blue-collar workers in factories, are engaged in problem-solving tasks. These tasks require real-time, context-dependent and reliable information. For instance, an electrician dealing with the malfunctioning of advanced equipment or short-circuit on the electricity grid will be hampered from solving these problems because he or she does not have sufficient expertise and the appropriate information for troubleshooting. Reliable information that can be provided quickly by generative AI tools can lead to significant improvements in productivity. Similarly, generative AI in classrooms can lead to a major reorganization of how teaching takes place, with greater levels of personalization, as these tools help teachers identify specific aspects of the curriculum with which subgroups of students are having problems and propose new context-dependent teaching strategies. Reliability of AI models will be key for successfully creating such new tasks and delivering improvements in the quality of education.
Productivity improvements from new tasks and new products, which have been important for previous transformative technologies, such as electricity and the Internet, are not incorporated into my estimates. This is for three reasons. First and most parochially, this is much harder to measure and is not included in the types of exposure considered in Eloundou et al. (2024) and Svanberg et al. (2024). Second, and more importantly, I believe it is right not to include these in the likely macroeconomic effects, because these are not the areas receiving attention from the industry at the moment, as also argued in Acemoglu (2021), Acemoglu and Restrepo (2020b) and Acemoglu and Johnson (2023). Rather, areas of priority for the tech industry appear to be around automation and monetization of personal data, such as through search or social media digital ads. This makes it less likely that many new good tasks will be created quickly. Third, and relatedly, more beneficial outcomes may require new institutions, policies and regulations, as also suggested in Acemoglu and Johnson (2023) and Acemoglu et al. (2023).
My assessment is that there are indeed much bigger gains to be had from generative AI, which is a promising technology, but these gains will remain elusive unless there is a fundamental reorientation of the industry, including perhaps a major change in the architecture of the most common generative AI models, such as the LLMs, in order to focus on reliable information that can increase the marginal productivity of different kinds of workers, rather than prioritizing the development of general human-like conversational tools. The general-purpose nature of the current approach to generative AI could be ill-suited for providing such reliable information. To put it simply, it remains an open question whether we need foundation models (or the current kind of LLMs) that can engage in human-like conversations and write Shakespearean sonnets if what we want is reliable information useful for educators, healthcare professionals, electricians, plumbers and other skilled workers.
Footnotes
I am grateful to Can Yeşildere for phenomenal research assistance, and to Leonardo Bursztyn, Mert Demirer, Lauren Fahey, Roberto Galbiati, Isabelle Méjean, Shakked Noy, Sida Peng, Julia Regier, and Whitney Zhang as well as participants at the MIT Solow Memorial conference and the Economic Policy conference, for useful comments. I am especially grateful to my discussants, David Hémous and Benoît Coeuré, for their insightful comments and suggestions. I am also thankful to Pamela Mishkin and Daniel Rock for generously sharing their data on AI exposure. I am also heavily indebted to my collaborators on several projects related to these topics, David Autor, Simon Johnson and Pascual Restrepo, from whom I learned a great deal and who have also given me very useful comments on the current draft. Financial support from the Hewlett Foundation is gratefully acknowledged. All remaining errors are mine.
The Managing Editors in charge of this paper were Roberto Galbiati and Isabelle Méjean.
Korinek and Suh (2024) predict a ‘baseline’ GDP growth of 100% over the next 10 years, and also entertain the possibility of much higher ‘aggressive’ AGI growth rates, such as a 300% increase in GDP. Many others are seeing recent developments as a confirmation of the forecasts in Kurzweil (2005) about the impending arrival of ‘singularity’ and ‘explosive’ economic growth (Davidson, 2021).
Three caveats are in order. First, although most recent advances are in generative AI, the economic forces explored here apply to other types of AI, and the estimates of exposed tasks I use are based on anticipated improvements in a range of AI-related technologies, including computer vision and software building on large language models. Hence, I consider the numbers here to apply to all of AI and thus typically refer to ‘AI’, unless there is a reason to emphasize generative AI. Second, I focus on the US economy because much of the existing evidence on the microeconomic effects of AI and the prevalence of exposed tasks is from the United States. The impact on other industrialized nations should be similar, whereas the consequences for the developing world are harder to ascertain and require much more in-depth research. Third, some commentators use ‘productivity’ to refer to output per worker (or average labour productivity), while others mean TFP. Throughout, I distinguish between aggregate TFP and GDP (per capita/worker) effects, and I use productivity improvement at the micro/task level as synonymous with cost savings.
New tasks in this framework also capture the possibility of productivity-enhancing reorganization of production. The role of AI in enabling such reorganization is emphasized by, among others, Bresnahan (2019) and Agrawal et al. (2024).
Eloundou et al. (2024) report negative statistical associations between their measure of exposure to AI, which I use below, and measures of exposure to robots and manual routine tasks.
More specifically, I use the most granular information that Eloundou et al. (2024) present, which is their ‘automation index’, coded with help from GPT-4. This index provides information on how much of the activities involved in a task/occupation can be performed by AI. Although this index has a somewhat greater emphasis on automation, it does not systematically distinguish between automation and task complementarities. As I discuss further below and Eloundou et al. (2024) themselves note, their exposure measure often captures the possibility that generative AI and related digital technologies can perform some of the subtasks in an occupation, enabling workers to focus on and specialize in other activities, and thus contains both automation and task complementarity elements.
In addition, if AI models continue to increase their energy requirements, this would contribute to measured GDP, but would not be a beneficial change for welfare.
Acemoglu and Restrepo (2018) provide a dynamic version of this economy with capital accumulation and endogenous technological choices, while Acemoglu and Restrepo (2022) provide a generalization with multiple types of labour and multiple sectors, and Acemoglu and Restrepo (2024) consider a non-competitive version of this economy. Extending the framework in any of these directions does not materially affect the results I discuss here.
One important simplification is to assume that tasks assigned to labour do not require any capital or tools, which is clearly unrealistic. The online Appendix of Acemoglu and Restrepo (2018) shows that the results are very similar if the task production function is modified such that:
I refer to this channel as ‘task complementarities’, rather than ‘labor augmentation’, because it can alter the distribution of subtasks and complement worker productivity in some of these subtasks, and also because ‘augmentation’ is sometimes used to refer to the introduction of complementary activities for labour, such as new tasks, which have very distinct effects, as I explain.
In particular, for any differentiable constant returns to scale production function $F(K, L)$, Euler’s theorem implies: $F(K, L) = RK + wL$, where $R$ is the rental rate of capital and $w$ is the wage. Suppose $R$ is fixed. Then any change in technology that increases $F(K, L)$ at the initial factor supplies must increase $w$. This is not the case when $K$ is fixed, because $R$ can increase more than $F(K, L)$. In Acemoglu and Restrepo (2018, 2019b), the capital stock is taken as given, so a negative impact is possible. In neoclassical growth models with exponential discounting and time-separable preferences, $R$ is fixed in the long run. However, Acemoglu et al. (2024) show that with more general preferences, $R$ tends to increase following automation and other technological changes, so a negative wage effect is possible.
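The step from Euler’s theorem to the wage effect can be spelled out as follows. This is a sketch under the footnote’s own assumptions (constant returns, competitive factor prices equal to marginal products, and $R$ held fixed); the tilde notation for post-change values is introduced here for illustration and is not the paper’s.

```latex
\begin{align*}
F(K, L) &= F_K K + F_L L = RK + wL
  && \text{(Euler's theorem with competitive factor prices)} \\
w &= \frac{F(K, L) - RK}{L} \\
\tilde{w} - w &= \frac{\tilde{F}(K, L) - F(K, L)}{L} > 0
  && \text{(same } K,\ L \text{ and } R;\ \tilde{F}(K, L) > F(K, L)\text{)}
\end{align*}
```

When $K$ is fixed instead, $R$ is free to move, so the term $RK$ can absorb more than the whole increase in output and the sign of the wage change is no longer pinned down.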
Namely, change in real GDP can be expressed as the change in the value of the objective function of the social planner (e.g. the utility function of the representative household if one exists). Then the Lagrange multipliers on task-level resource constraints are equivalent to the prices and can be taken as given in evaluating the effects of small changes in parameters/technology on the maximized value of this objective function, by the envelope theorem. See Hulten (1978).
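The aggregation this envelope argument delivers can be sketched in a few lines of Python. To first order, the aggregate TFP gain is the sum over affected tasks of each task's GDP share times its cost savings. The function name and the input numbers are assumptions made for illustration; the inputs are not estimates taken from this section, and were picked only so that their product is close to the 0.66% upper bound quoted in the summary.

```python
def hulten_tfp_gain(tasks):
    """First-order aggregate TFP gain: sum of GDP share x cost savings.

    `tasks` is an iterable of (gdp_share, cost_saving) pairs, both
    expressed as fractions (0.05 means 5%). Prices are held at their
    initial values, as the envelope argument allows for small changes.
    """
    return sum(gdp_share * cost_saving for gdp_share, cost_saving in tasks)


# Illustrative inputs (assumptions): one block of impacted tasks worth
# 4.6% of GDP with average cost savings of 14.4%.
illustrative_tasks = [(0.046, 0.144)]
tfp_gain = hulten_tfp_gain(illustrative_tasks)
print(f"Implied TFP gain: {tfp_gain:.4%}")
```

Splitting the same GDP share across several tasks with different cost savings changes nothing in the method: only the share-weighted average of cost savings matters for the aggregate.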
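The contribution of capital to output growth can also be derived in the general case, just like Hulten’s theorem. In particular, for any constant returns to scale production function $Y = F(K, L; A)$, where $A$ is a technology shifter, we have, holding labour fixed, $d\ln Y = \frac{\partial \ln F}{\partial \ln A}\, d\ln A + \frac{\partial \ln F}{\partial \ln K}\, d\ln K$. The first term is $d\ln \mathrm{TFP}$. Competitive factor markets imply $\frac{\partial \ln F}{\partial \ln K} = s_K$, where $s_K$ is the capital share, and thus $d\ln Y = d\ln \mathrm{TFP} + s_K\, d\ln K$. Moreover, $d\ln K = d\ln Y + d\ln (K/Y)$, and hence $d\ln Y = \frac{1}{1 - s_K}\left(d\ln \mathrm{TFP} + s_K\, d\ln (K/Y)\right)$. When the capital–output ratio (or equivalently the capital share) remains constant in response to changes in technology, then $d\ln Y = d\ln \mathrm{TFP}/(1 - s_K)$, which is the first approximation I use. Later, I estimate by how much the capital–output ratio is predicted to increase due to automation and update this estimate.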
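A small Python sketch of this growth-accounting step may help. It uses the standard relation d ln Y = (d ln TFP + s_K · d ln(K/Y)) / (1 − s_K), which reduces to d ln TFP / (1 − s_K) when the capital–output ratio is constant. The function name is hypothetical, and the inputs are illustrative assumptions: the 0.66% TFP figure is the upper bound from the summary, and the capital share is taken as one minus the 0.576 unadjusted labour share mentioned in a later footnote, which need not be the value used in the paper's own calculation.

```python
def gdp_gain_from_tfp(tfp_gain, capital_share, capital_output_change=0.0):
    """Convert a TFP gain into a GDP gain under constant returns.

    All arguments are log changes or shares expressed as fractions.
    With capital_output_change = 0 this is the constant K/Y case.
    """
    if not 0.0 <= capital_share < 1.0:
        raise ValueError("capital_share must lie in [0, 1)")
    return (tfp_gain + capital_share * capital_output_change) / (1.0 - capital_share)


# Illustrative inputs (assumptions, see the lead-in above).
tfp_gain = 0.0066
capital_share = 1.0 - 0.576

baseline = gdp_gain_from_tfp(tfp_gain, capital_share)
print(f"GDP gain with constant K/Y: {baseline:.4%}")  # roughly 1.1%
```

A positive `capital_output_change` raises the GDP figure further, which is the direction of the later adjustment for automation-induced capital deepening.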
Peng et al. (2023) estimate even larger effects on how quickly certain programming tasks can be completed. But these refer to a very narrow set of tasks – subroutines that GitHub Copilot can write in common programming languages – and are thus less broadly applicable, and I consider them in robustness checks.
The hard versus easy distinction is different from the routine versus non-routine distinction introduced in Autor and Murnane (2003). Routine tasks are those that involve repeated performance of the same activities (such as knitting or switchboard operations) in stable, predictable environments. The use of digital technologies in these tasks involves step-by-step programming of relevant software and hardware. A significant number of routine tasks have already been automated, whereas AI’s promise is its ability to perform non-routine tasks, as also emphasized by Susskind (2020). Some non-routine tasks such as parts of computer programming or writing of simple text are easy-to-learn, while others that require more context-specific decisions and where metrics of successful performance are scarce would be much harder to learn by outside observation. Crucially, ‘easy tasks’ in this sense need not be easy for humans. In fact, cost savings from easy tasks will be most pronounced when they are expensive for humans to perform. Conversely, some hard tasks, such as those that are based on intuition, experience and judgement, may be relatively straightforward for humans because they do not learn to perform these tasks on the basis of outside observation alone. This is the reason why cost savings relative to human performance are likely to be limited in hard tasks.
For instance, suppose humans typically start boiling the water with the egg inside, and it reduces variability if the egg is placed inside the saucepan after the water has boiled, then this is something some AI models may discover. AlphaZero’s discovery of new effective chess moves can be viewed as an example of the same phenomenon, even if it is in the context of a much more complex and interesting problem. Looked at this way, AlphaFold’s big success in protein folding can be considered to have both easy and hard aspects. On the hard side, protein folding is a highly multidimensional problem, with no simple mapping between action and desired outcome. On the easy side, however, the desired outcome is observed, so with a large amount of computing power, it is possible for an AI model to do even better than humans, as AlphaFold managed to do.
Mathematically, the general learning problem can be formulated as follows. There is a mapping for outcomes specified as $f: X \times Z \to Y$, where $X$ is the space of actions, $Z$ denotes the space of ‘contexts’, and $Y$ is the set of possible outcomes. Observed outcomes are noisy versions of the true outcomes in $Y$, with some noise process which I do not need to specify here. The mapping $f$ is unknown. In principle, there are two ways in which AI can be trained in such a problem. First, with sufficient observations on $Y$ and combinations of inputs from $X \times Z$, the AI model may learn the entire function $f$ or some restricted version thereof. Second, if $Y$ is not observable, then the AI model may be trained from human choices, which can be summarized by a correspondence from $Z$ to $X$, specifying the possible actions that humans take when confronted with a context in $Z$. Either problem can be challenging if $X \times Z$ is very high dimensional. However, the latter type of learning is less likely to lead to something different and better than what humans were doing already. It can sometimes ensure performance at the level of expert humans, if these can be identified, but lack of reliable outcome data may also make it difficult to identify expertise.
The expression for the case in which labour-intensive tasks also use capital, as in footnote 8, is similar and leads to the same conclusion.
Around I, this amount is given by . If labour-intensive tasks were previously using capital as well, then the jump would be smaller but still positive.
This structure makes the example similar to the one that has a finite number of tasks – in this instance, three tasks. But I use the setting with a continuum of tasks for continuity with the rest of the paper. The ‘paradoxical’ result I am highlighting here does not depend on this stark structure or on having just two types of labour.
They also report a third, even more aggressive measure, which they refer to as ζ. This measure is more speculative and focuses on what can be ultimately performed by LLM+. In line with the authors’ interpretation, this is unlikely to be the case in the near future, so I will ignore this third measure.
This is not an innocuous step. Svanberg et al. focus on automation using computer vision, while Eloundou et al.’s exposed technologies include task complementarities as noted above and go beyond computer vision tasks. One could imagine that the share of tasks that can be profitably used with AI within 10 years is different between these two categories. Unfortunately, I do not have another source for such an estimate for non-computer vision AI.
Svanberg et al. also make further extrapolations assuming cost declines in computer vision technologies. They base these on estimates from Besiroglu and Hobbhahn (2022), who project a 22% yearly decrease in computation costs attributed to the expansion of compute power from GPUs. However, a doubling of GPU capacity will not necessarily translate into a 50% decline in costs, in general, due to diminishing returns and bottlenecks created by other inputs, and because of the limitation of the current architecture. I therefore do not include these in the baseline but return to them in the robustness discussion.
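The arithmetic behind this caveat can be sketched in Python. The first function compounds a constant yearly cost decline. The second shows how the cost decline from a capacity increase shrinks once costs respond less than one-for-one to capacity. The constant-elasticity form and the elasticity values are assumptions chosen purely for illustration; they do not come from Svanberg et al. or from Besiroglu and Hobbhahn (2022).

```python
def remaining_cost_fraction(annual_decline, years):
    """Fraction of the initial cost left after compounding a yearly decline."""
    return (1.0 - annual_decline) ** years


def cost_decline_from_capacity(capacity_multiple, elasticity):
    """Cost decline when cost is proportional to capacity ** (-elasticity).

    elasticity = 1 is the frictionless case; values below 1 stand in for
    diminishing returns and bottlenecks from other inputs (an assumption).
    """
    return 1.0 - capacity_multiple ** (-elasticity)


# A 22% yearly decrease compounded over 10 years leaves about 8% of the cost.
print(f"{remaining_cost_fraction(0.22, 10):.3f}")

# Doubling capacity: 50% decline only in the frictionless case.
for elasticity in (1.0, 0.5):
    decline = cost_decline_from_capacity(2.0, elasticity)
    print(f"elasticity={elasticity}: cost decline {decline:.1%}")
```

With an elasticity of 0.5, doubling capacity lowers costs by a little under 30% rather than 50%, which is the sense in which the extrapolated cost declines may be too optimistic.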
In this experimental setting, however, we do not see the displacement effects, because each of the treatment subjects is given the task, and there is no possibility of reducing the number of workers performing this task. Hence, from these experiments, we can only learn about the productivity gains, inclusive of any task complementarities – and not about the displacement effects. This interpretation clarifies why I am comfortable bundling automation and task complementarities together for the purposes of estimating productivity gains from AI.
Briefly, I follow Eloundou et al. and map exposed tasks to occupations, and then map occupations to 14 aggregated NIPA (National Income and Product Accounts) industries using the fractions of workers in each occupation employed across industries. I then compute the labour share of exposed tasks using industry labour shares weighted by the value-added shares of these industries in gross national income. In these calculations, the self-employed are assumed to have the same hourly wage rate as employees in the same industry. I use the 14 NIPA industries to be able to have information on hours worked. These data are for the years 2018–22 from the US BEA NIPA tables, and I provide a list of these 14 industries in the Supplementary Appendix. For reference, the average (value-added-weighted) labour share in 2018–22, without the exposure adjustment, is 0.576. This suggests that exposed tasks are, on average, in industries with slightly lower labour shares than the national average. This labour share estimate is similar to the one reported in Elsby et al. (2013), 0.583, for the private non-farm business sector for the years 2010–2012 using the same approach as here. Elsby et al. (2013) also report similar numbers under alternative methodologies for incorporating self-employment income.
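One way to read the weighting described in this footnote is sketched below in Python. Each industry's labour share is weighted by its value-added share, and for the exposure-adjusted figure the weight is further scaled by the share of the industry's tasks that are exposed. The exact weighting used in the paper may differ, and the two toy industries, their numbers and all field names are hypothetical, not BEA data.

```python
def weighted_labour_share(industries, exposure_adjusted=True):
    """Value-added-weighted average labour share across industries.

    Each industry is a dict with 'value_added_share', 'labour_share' and
    'exposed_share' (all fractions). With exposure_adjusted=True the weight
    is value-added share x exposed share; otherwise value-added share only.
    """
    numerator = 0.0
    denominator = 0.0
    for industry in industries:
        weight = industry["value_added_share"]
        if exposure_adjusted:
            weight *= industry["exposed_share"]
        numerator += weight * industry["labour_share"]
        denominator += weight
    if denominator == 0.0:
        raise ValueError("weights sum to zero")
    return numerator / denominator


# Hypothetical toy data: the more exposed industry has the lower labour share.
toy_industries = [
    {"value_added_share": 0.5, "labour_share": 0.6, "exposed_share": 0.1},
    {"value_added_share": 0.5, "labour_share": 0.5, "exposed_share": 0.3},
]

print(weighted_labour_share(toy_industries, exposure_adjusted=False))
print(weighted_labour_share(toy_industries, exposure_adjusted=True))
```

In the toy data the exposure-adjusted figure comes out below the unadjusted one, mirroring the pattern noted in the footnote that exposed tasks sit in industries with slightly lower labour shares.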
This step and other manual coding steps in this subsection were carried out by Can Yeşildere.
Of course, there is considerable ambiguity in some cases. For example, ‘Maintain safety and security’ also includes IT security activities, such as making sure that a new password can be issued to an authorized person, and this would be an easy-to-learn task. The third step is aimed at dealing with such ambiguities.
As examples, the algorithm assigns a probability of 0.98 that ‘Maintaining equipment service records’ is an easy task, but a probability of only 0.098 that ‘Interviewing credit card applicants by telephone or in person’ is an easy task. Because writing and drafting are classified as verbs associated with easy tasks, activities that involve writing are generally classified as easy. This may understate the difficulty of some writing-related tasks. For example, ‘Writing reports or academic papers to communicate findings of climate-related studies’ is classified as an easy task with probability 0.67, which is likely to understate how hard such a task will continue to be for AI in the foreseeable future.
The numbers for Meta ($130 billion), Alphabet ($307 billion) and Snapchat ($4.6 billion) are from these companies’ 2023 10K filings. Fortune reported on 12 December 2023 that X’s annual revenue was $2.5 billion in 2023, while Financial Times reported on 15 March 2024 that revenues from TikTok’s US business had reached $16 billion per annum (see Fortune and Reuters).
See Statista.
If there were task complementarities, this would increase the productivity of exposed demographic groups in the remaining subtasks, but would also reduce the automation-driven cost savings by a corresponding amount. If task complementarities and automation affected different demographic groups within occupations symmetrically, this would have minor effects on the conclusions: although the wage effects of productivity gains from task complementarities are a little different than the cost savings generated by automation (Acemoglu and Restrepo, 2022), the results would remain broadly similar. If, on the other hand, some groups benefited more from task complementarities (e.g. because they are overrepresented among middle-expertise workers that can benefit most from generative AI tools), the distributional consequences could be different. Since I do not have a way of distinguishing productivity effects from task complementarities and automation, I am unable to explore this issue further.
This exercise uses the 49 BEA industries as in Acemoglu and Restrepo (2022).
Recall, however, that it is the TFP gains that are the relevant numbers for consumer welfare, since the additional investment comes out of consumption and may involve additional heavy energy use, as remarked in footnote 6.
SUPPLEMENTARY DATA
Supplementary data are available at Economic Policy online.
CONFLICT OF INTEREST
The author is not aware of any conflict of interest.
FUNDING
Funding support for this article was provided by the Hewlett Foundation.