Raj Chetty, John N Friedman, Michael Stepner, Opportunity Insights Team , The Economic Impacts of COVID-19: Evidence from a New Public Database Built Using Private Sector Data, The Quarterly Journal of Economics, Volume 139, Issue 2, May 2024, Pages 829–889, https://doi.org/10.1093/qje/qjad048
Abstract
We build a publicly available database that tracks economic activity in the United States at a granular level in real time using anonymized data from private companies. We report weekly statistics on consumer spending, business revenues, job postings, and employment rates disaggregated by county, sector, and income group. Using the publicly available data, we show how the COVID-19 pandemic affected the economy by analyzing heterogeneity in its effects across subgroups. High-income individuals reduced spending sharply in March 2020, particularly in sectors that require in-person interaction. This reduction in spending greatly reduced the revenues of small businesses in affluent, dense areas. Those businesses laid off many of their employees, leading to widespread job losses, especially among low-wage workers in such areas. High-wage workers experienced a V-shaped recession that lasted a few weeks, whereas low-wage workers experienced much larger, more persistent job losses. Even though consumer spending and job postings had recovered fully by December 2021, employment rates in low-wage jobs remained depressed in areas that were initially hard hit, indicating that the temporary fall in labor demand led to a persistent reduction in labor supply. Building on this diagnostic analysis, we evaluate the effects of fiscal stimulus policies designed to stem the downward spiral in economic activity. Cash stimulus payments led to sharp increases in spending early in the pandemic, but much smaller responses later in the pandemic, especially for high-income households. Real-time estimates of marginal propensities to consume provided better forecasts of the impacts of subsequent rounds of stimulus payments than historical estimates. Overall, our findings suggest that fiscal policies can stem secondary declines in consumer spending and job losses, but cannot restore full employment when the initial shock to consumer spending arises from health concerns. 
More broadly, our analysis demonstrates how public statistics constructed from private sector data can support many research and real-time policy analyses, providing a new tool for empirical macroeconomics.
I. Introduction
Since Kuznets (1941), macroeconomic policy decisions have been made on the basis of publicly available statistics constructed from recurring surveys of households and businesses conducted by the federal government. Although such statistics have great value for understanding total economic activity, they have two limitations. First, survey-based data typically cannot be used to assess variation across geographies or subgroups: because sample sizes are relatively small, most statistics are reported only at the national or state level, and breakdowns for demographic subgroups or sectors are unavailable. Second, such statistics are typically available only at low frequencies, often with a significant time lag.1
In this article, we address these challenges by (i) building a public database that measures spending, employment, and other outcomes at a high-frequency, granular level using anonymized transaction data collected by companies in the private sector and (ii) demonstrating how this new database can be used to obtain insights into the effects of the coronavirus pandemic (COVID-19) and policy responses in near real time—within three weeks of the shock or policy change of interest.
We organize the article in three parts. First, we construct statistics on consumer spending, business revenues, employment rates, job postings, and other key indicators—disaggregated by geographic area (county or ZIP code), industry, and income level—by combining data from credit card processors, payroll firms, and financial services firms. The main challenge in using transactional data collected by private companies (which we refer to as “private sector data” in what follows) to measure economic activity is a tension between research value and privacy protection. For research, it is beneficial to use raw, disaggregated data—ideally down to the individual consumer or business level—to maximize precision and flexibility of research designs. But from a privacy perspective, it is preferable to aggregate and mask data to reduce the risk of disclosure of private information. To balance these conflicting interests, one must construct statistics that are sufficiently aggregated and masked to mitigate privacy concerns yet sufficiently granular to support research. Our goal is to demonstrate how one can produce public statistics that deliver insights analogous to those obtained from the underlying confidential microdata, thereby improving the transparency, timeliness, and reproducibility of empirical macroeconomic research (Miguel et al. 2014).
We construct publicly available series suitable for research from raw transactional data in a series of steps. We first develop algorithms to clean the raw data by removing data artifacts and smoothing seasonal fluctuations. Raw transactional data can exhibit sharp fluctuations and noise driven by changes in clientele, platform design, or exogenous events such as holidays (Leamer 2011; McElroy, Monsell, and Hutchinson 2018). We systematically examine each series for such artifacts and develop methods to address them. Next, we take steps to limit privacy loss by reporting only changes since January 2020 (rather than raw levels), masking small cells, and pooling data from multiple companies to comply with regulations governing the disclosure of material nonpublic information. After establishing these protocols, we report the final statistics using an automated pipeline that ingests data from businesses and publishes processed data, typically within a week after the relevant transactions occur.
The new data series we construct is a complement to, not a replacement for, existing public statistics obtained from representative surveys. The benefits of our data are their granularity and frequency—providing daily or weekly data for sectors and subgroups down to the county level. The drawback is that there are no ex ante guarantees that the data provide a representative picture of economic activity because any one company’s clients are not necessarily a representative sample of U.S. households or firms. We discuss these trade-offs in greater detail in Section II.C. To address these challenges, we benchmark each series to publicly available statistics from representative surveys and create series that track the survey-based measures closely, making ongoing adjustments to series that diverge from national statistics (e.g., because of changes in a data provider’s clients). Ultimately, the statistics we construct from transactional data provide an additional set of (imperfect) signals on economic activity that can in principle yield better statistical inferences when combined with existing survey-based statistics.2 Whether these data yield valuable new insights in practice is an empirical question.
In the second part of the article, we evaluate the empirical value of the new data by using them to analyze the economic effects of COVID-19, focusing on the period from March 2020 to December 2021—covering both the decline in economic activity and the recovery to baseline spending levels. To evaluate how far one can get solely with public statistics rather than confidential data, we deliberately conduct our empirical analysis using the aggregate statistics we release publicly.3
National accounts reveal that GDP fell in the second quarter of 2020 after the COVID-19 shock primarily because of a reduction in consumer spending. We find that spending fell primarily because high-income households started spending much less, using the median household income in the ZIP code where the cardholder lives as a proxy for household income.4 As of April 2020, 41% of the reduction in total spending since January 2020 came from households in ZIP codes with median income in the top quartile, while 12% came from households in ZIP codes with median income in the bottom quartile. This is because the rich account for a larger share of spending to begin with and because they cut spending more in percentage terms. Spending reductions were concentrated in services that require in-person physical interaction, such as hotels and restaurants, consistent with contemporaneous work by Alexander and Karger (2023) and Cox et al. (2020). These findings suggest that high-income households reduced spending primarily because of health concerns rather than a reduction in income or wealth.
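The decomposition behind these shares can be written out explicitly. As a sketch in notation that is ours rather than the article's, let $S_q$ denote the January 2020 spending of income quartile $q$, let $s_q = S_q / \sum_j S_j$ be its baseline share of spending, and let $\delta_q$ be its proportional decline in spending as of April 2020. The share of the total reduction attributable to quartile $q$ is then:

```latex
\[
\text{share}_q
  \;=\; \frac{S_q\,\delta_q}{\sum_{j=1}^{4} S_j\,\delta_j}
  \;=\; \frac{s_q\,\delta_q}{\sum_{j=1}^{4} s_j\,\delta_j}.
\]
```

A quartile's contribution therefore rises with both its baseline share $s_q$ and its percentage decline $\delta_q$, which are the two channels named in the paragraph above.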
Next, we leverage geographic variation in the demand shocks businesses face to identify the effects of the consumer spending shock on businesses. In-person services are typically produced by small businesses (such as restaurants) that serve customers in their local area. The revenues of those small businesses in high-income, dense areas (high-rent ZIP codes) fell by 61% between January and mid-April 2020, compared with 31% in the lowest-rent ZIP codes.
As businesses lost revenue, they passed the shock on to their employees, particularly low-wage workers. Postings for jobs with low skill requirements fell sharply in April 2020, with a much larger reduction in high-rent areas than in low-rent areas. Postings for jobs with high skill requirements fell much less and exhibit no cross-sectional gradient with respect to rent. As a result of the labor demand shock, employment rates fell by 39% for workers with wage rates in the bottom quartile of the pre-COVID wage distribution as of April 15, 2020 (the trough of the COVID recession), consistent with results first established using other confidential payroll data sources by Cajner et al. (2020). For those in the top wage quartile, employment rates fell by 14%. Low-wage people working at small businesses in affluent areas were especially likely to lose their jobs. At small businesses located in the highest-rent ZIP codes, 30% of workers were laid off within two weeks after the COVID crisis began; in the lowest-rent ZIP codes, 15% lost their jobs.
Employment levels for workers in the top wage quartile rebounded quickly, returning to pre-COVID levels by the end of June 2020. In contrast, employment recovered much more slowly for low-wage workers. The total number of jobs in the bottom quartile of the prepandemic wage distribution remained 12% below baseline even as of December 2021 (adjusting for wage growth). Why did employment rates for low-wage workers remain persistently lower? Unlike at the start of the pandemic, the source of lower employment rates at the end of 2021 was not a lack of labor demand: total consumer spending and low-skilled job postings were well above pre-COVID baseline levels throughout 2021. Furthermore, job postings for low-skilled workers were just as high in high-rent areas as they were in low-rent areas by December 2021. However, employment rates for low-wage workers continued to exhibit a sharp gradient with respect to rent, with employment levels (adjusted for wage growth) returning to pre-COVID baseline rates in the lowest-rent areas but remaining 23 percentage points below pre-COVID levels in the highest-rent areas. Employment rates in December 2021 were much more strongly related to the size of the initial shock to economic activity (for example, the change in employment rates as of April 2020) than to contemporaneous factors such as COVID case rates or unemployment benefit levels. In short, the initial labor demand shock induced by the reduction in aggregate demand in March 2020 led to a persistent reduction in labor supply among low-wage workers in the hardest-hit areas. As a result, business cycle dynamics during the COVID crisis were not symmetric: on the way down, spending and employment fell in lockstep, but on the way back, they did not rise together, echoing patterns documented in the Great Recession (Yagan 2019).
In the third part of the article, we examine the scope for stabilization policies to break the chain of events documented above. We focus on the effects of stimulus payments, whose goal was to mitigate reductions in economic activity by boosting aggregate spending. The federal government sent households stimulus checks at three points during the crisis: April 15, 2020; January 4, 2021; and March 17, 2021. Using an event study design, we find that the stimulus payments made in April 2020 increased spending uniformly across the household income distribution (again proxying for income based on ZIP code), with low- and high-income households increasing spending substantially in the days after they received checks, consistent with evidence from Baker et al. (forthcoming) and Cox et al. (2020) using individual-level administrative data.
In contrast, the January 2021 payments had highly heterogeneous effects across the income distribution: low-income households continued to spend a substantial fraction of their stimulus checks, but high-income households (those living in the top quartile of ZIP codes by median income) spent virtually none of the money they received. The impacts of the stimulus changed sharply over the course of the recession because of the heterogeneous spending dynamics documented above: high-income households cut spending sharply but did not lose much income, and as a result had built up considerable savings by January 2021, sharply reducing their marginal propensity to consume. Because our spending data are available with a short lag, we were able to establish this result three weeks after the second stimulus payment. These results were cited in policy debates regarding who should receive the March 2021 stimulus payments, which ultimately concluded with policy makers lowering the income threshold for eligibility relative to initial proposals.
Finally, as predicted based on effects of the January 2021 stimulus, we find that the March 2021 stimulus payments increased spending for low-income households, but had little impact on spending for high-income households who remained eligible. Hence, estimates of marginal propensities to consume in January 2021 provided much better forecasts of the effects of the March 2021 stimulus payments than historical estimates from prior recessions, which suggested there would be little heterogeneity in MPCs by income level (Sahm, Shapiro, and Slemrod 2012; Broda and Parker 2014), or even estimates from just months earlier in the same recession (April 2020). This example demonstrates how public statistics constructed from private sector data can support a “real time” approach to macroeconomic policy, where policies are adjusted based on current evidence on their effects rather than relying solely on historical predictions from other economic environments.
The data can analogously be used to analyze the effects of other policies (beyond stimulus payments) that were implemented during the COVID-19 crisis. We find that results from our publicly available data match those from studies that use confidential data sources closely. For instance, we find that state-ordered shutdowns and reopenings of economies had modest effects on economic activity (Goolsbee and Syverson 2021) and that loans to small businesses as part of the Paycheck Protection Program (PPP) had small effects on employment rates (Hubbard and Strain 2020; Autor et al. 2022a; Granja et al. 2022). In addition, several studies use the new data constructed here to evaluate the effects of many other policies, from the effects of unemployment benefit changes (Casado et al. 2020) to eviction moratoria (An, Gabriel, and Tzur-Ilan 2022).
We conclude by analyzing whether the combination of government policies was adequate to stem the decline in economic activity set off by the reduction in consumer spending. Consumer spending fell sharply in April 2020 in the dense, affluent areas where many low-wage workers lost their jobs, portending the start of a downward spiral of secondary effects stemming from the initial aggregate demand shock. However, the relationship between consumer spending and the rate of local job loss flattened sharply by July 2020. Spending recovered to baseline levels or above baseline, even in places where many low-wage workers lost their jobs—presumably because of the substantial infusion of income to such areas in the form of fiscal stimulus, unemployment benefits, and other programs that led to an increase in disposable income at the bottom of the distribution (Blanchet, Saez, and Zucman 2022).
Overall, our findings suggest that fiscal policies can be very valuable for limiting secondary declines in consumer spending arising from a loss of income as workers lose their jobs. However, fiscal policy itself does not have the capacity to restore full employment when the initial shock to consumer spending arises from health concerns (Guerrieri et al. 2022). Furthermore, even after health concerns have abated, changes in labor supply among those who lost their jobs may lead to persistent reductions in employment. It may therefore be useful to target reemployment policies to individuals who held low-wage jobs in places that suffered the largest job losses (Austin, Glaeser, and Summers 2018). Our data provide a way to monitor the areas and sectors in which job losses persist, information that can be used to target and evaluate such programs going forward.
Beyond showing how the COVID-19 pandemic affected economic activity, the broader contribution of this study is the construction of a new public database of granular macroeconomic statistics that opens new avenues for empirical macroeconomics, from finer analysis of heterogeneous effects across subgroups and areas to real-time policy fine-tuning. Importantly, such analyses can be conducted by many researchers and policy analysts, not just those who can secure access to confidential data and devote resources to cleaning and harmonizing it. In this sense, the data assembled here provide a prototype for a new system of real-time, granular national accounts that can be refined in future work, much as Kuznets (1941) and Summers and Heston (1984; 1991) developed prototypes for national accounts that were refined in subsequent work (e.g., Feenstra, Inklaar, and Timmer 2015). Going forward, our intention is to continue to maintain and refine this database in collaboration with researchers at government statistical agencies, with the ultimate goal of creating a complement to survey-based statistics that yield further detail on economic activity.
Our work builds on two literatures: a long-standing literature on macroeconomic measurement and a recent literature on the economics of pandemics. In the macroeconomic measurement literature, our work is most closely related to studies showing that private sector data sources can be used to forecast government statistics (see Abraham et al. 2019 for an overview of this work). In the COVID-19 pandemic literature, numerous papers have used confidential private sector data to analyze consumer spending; see Vavra (2021) and Brodeur et al. (2021) for surveys. The contribution of the present study is to present a comprehensive characterization of how COVID-19 and subsequent stabilization policies affected economic activity by disaggregating data across geographic areas and subgroups at a high frequency. We discuss specific connections with prior work in the context of presenting our results.
The rest of this article is organized as follows. The next section describes how we construct the data series we make public. In Section III, we analyze the effects of COVID-19 on spending, revenue, and employment. Section IV analyzes the effects of stimulus and other government policies enacted to mitigate COVID’s effects. Section V concludes. Technical details are available in the Online Appendix, and the data used to produce the results can be downloaded online.
II. Construction of the Public Database
We use anonymized data from several private companies to construct public indices of consumer spending, small-business revenue, job postings, and employment rates. All of the data series described below can be freely downloaded from the Economic Tracker website.
We release each data series at the highest available frequency using an automated pipeline that ingests data from data providers, constructs the relevant statistics, conducts quality control tests, and outputs the series publicly. Online Appendix A details the engineering of this pipeline.
II.A. Methods
We disaggregate each series by industrial sector, county, and income quartile wherever feasible. To systematize our approach and facilitate comparisons between series, we adopt the following four principles when constructing each series.
First, we remove artifacts in raw data that arise from changes in data provider coverage or systems. For instance, firms’ clients often change discretely, sometimes leading to discontinuous jumps in series, particularly in small cells. We systematically search for large jumps in series, study their root causes by consulting with the data provider, and address such discontinuities by imposing continuity using series-specific methods described below.
Second, we smooth low- and high-frequency fluctuations in the data. We address high-frequency fluctuations through aggregation, for example, by reporting seven-day moving averages to smooth fluctuations across days of the week. Certain series—most notably consumer spending and business revenue—exhibit strong lower-frequency seasonal fluctuations that are autocorrelated across years (e.g., a surge in spending around the holiday season). We deseasonalize such series by indexing each week’s value in 2020 relative to corresponding values for the same week in 2019.
Third, we take a series of steps to protect the confidentiality of businesses and their clients. Instead of reporting levels of each series, we report indexed values that show percentage changes relative to mean values in January 2020.5 We suppress small cells and exclude outliers to meet privacy and data protection requirements, with thresholds that vary across data sets as described below. For data obtained from publicly traded firms—whose ability to disclose data is restricted by Securities and Exchange Commission regulations governing the disclosure of material nonpublic information—we combine data from multiple firms so that the statistics we report do not reveal information about any single company’s activities.
Finally, we address the challenge that our data sources capture information about the customers each company serves rather than the general population. Instead of attempting to adjust for this nonrepresentative sampling, we characterize the portion of the economy that each series represents by comparing each sample we use to national benchmarks and label the sector and population subgroup that each series represents.
We follow these four broad principles to construct every public data series that we release, while adapting the data-processing methodology to the specific characteristics of each data source.
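The smoothing, deseasonalizing, indexing, and suppression steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the article's pipeline: the function names are hypothetical, the moving average is assumed to be trailing rather than centered, the deseasonalized ratio is assumed to be indexed to its own January mean, and `min_count` stands in for provider-specific suppression thresholds that the text does not state.

```python
from statistics import fmean


def moving_average_7d(daily):
    """Trailing seven-day mean (assumed); the first six days lack a full window."""
    return [None if i < 6 else fmean(daily[i - 6:i + 1]) for i in range(len(daily))]


def deseasonalize(values_2020, values_2019):
    """Express each 2020 period relative to the same period in 2019."""
    return [cur / prev for cur, prev in zip(values_2020, values_2019)]


def index_to_january(series, n_january_periods):
    """Proportional change relative to the mean of the January 2020 periods.

    n_january_periods is a hypothetical parameter: how many leading entries
    of the series fall in January 2020.
    """
    base = fmean(series[:n_january_periods])
    return [v / base - 1.0 for v in series]


def suppress_small_cells(series, cell_counts, min_count):
    """Mask values from cells with fewer than min_count underlying units.

    min_count is an illustrative placeholder, not a threshold from the article.
    """
    return [v if n >= min_count else None for v, n in zip(series, cell_counts)]
```

Chaining `deseasonalize`, `index_to_january`, and `suppress_small_cells` on a weekly series yields values that read as proportional changes since January 2020, with `None` marking masked cells. Reporting changes rather than levels is what keeps the underlying transaction volumes private.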
II.B. Data Series
This section provides an overview of how we produce each data series. We summarize the data sources and give an overview of our key processing steps in Online Appendix Table I, and provide summary statistics on sample sizes for each series in Online Appendix Table II.
1. Consumer Spending
We measure consumer spending using aggregated and anonymized data on credit and debit card spending collected by Affinity Solutions, a company that aggregates consumer credit and debit card spending information to support a variety of financial service products, such as loyalty programs for banks. Affinity Solutions captures nearly 10% of debit and credit card spending in the United States. We obtain raw data from Affinity Solutions disaggregated by county, quartile of ZIP code median income, industry, and day starting from January 1, 2019.
We process the raw Affinity data into an analytical series following the four steps above—removing artifacts and outliers, deseasonalizing, indexing, and benchmarking—and describe each step in detail in Online Appendix B. As an example of our data-processing methods, we detect and remove discontinuous breaks caused by entry or exit of card providers from the sample. Because these card providers have geographically concentrated customer bases, the number of active cards in a county exhibits a sharp upward or downward spike when the sample of local card providers changes (Online Appendix Figure I.A). We identify these sudden changes by analyzing the number of unique cards from each county with at least one transaction in each week, using a supremum Wald test for a structural break at an unknown break point. If we identify a structural break in week t, we impute spending changes in weeks {t − 1, t, t + 1} using the mean week-to-week percent change in spending excluding all counties with a structural break in the same state.
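To make the break-detection step concrete, the sketch below computes a supremum Wald statistic for a one-time shift in the mean of a weekly card-count series and then imputes the three weeks around a detected break. It is a simplified illustration, not the authors' implementation: it assumes a pure mean-shift model with a common error variance, a 15% trimming fraction, and hypothetical function names. The critical value for declaring a break is deliberately left out because it depends on those choices.

```python
from statistics import fmean


def sup_wald_mean_shift(y, trim=0.15):
    """Return (sup-Wald statistic, break index k) for a single mean shift.

    A break at k splits the series into y[:k] and y[k:]. The trimming
    fraction and the homoskedastic variance estimate are assumptions.
    """
    n = len(y)
    lo = max(2, int(n * trim))
    hi = min(n - 2, n - lo)
    best_stat, best_k = 0.0, None
    for k in range(lo, hi + 1):
        left, right = y[:k], y[k:]
        m1, m2 = fmean(left), fmean(right)
        ssr = sum((v - m1) ** 2 for v in left) + sum((v - m2) ** 2 for v in right)
        sigma2 = ssr / (n - 2)
        if sigma2 == 0.0:
            stat = float("inf") if m1 != m2 else 0.0
        else:
            # Wald statistic for H0: equal means before and after week k.
            stat = (m1 - m2) ** 2 / (sigma2 * (1.0 / k + 1.0 / (n - k)))
        if stat > best_stat:
            best_stat, best_k = stat, k
    return best_stat, best_k


def impute_around_break(pct_changes, break_weeks, county):
    """Replace a county's changes in weeks t-1, t, t+1 with the mean change
    across same-state counties that have no detected break.

    pct_changes maps county -> list of week-to-week percent changes for one
    state; break_weeks maps county -> break week t. Both are hypothetical
    data layouts chosen for this sketch.
    """
    t = break_weeks[county]
    clean = [c for c in pct_changes if c not in break_weeks]
    if not clean:
        raise ValueError("no counties without a break to impute from")
    out = list(pct_changes[county])
    for w in (t - 1, t, t + 1):
        if 0 <= w < len(out):
            out[w] = fmean([pct_changes[c][w] for c in clean])
    return out
```

In use, one would run `sup_wald_mean_shift` on each county's weekly count of active cards, compare the statistic with a critical value appropriate to the trimming fraction, and pass the flagged counties to `impute_around_break`.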
The Affinity series has broad coverage across industries but overrepresents categories in which credit and debit cards are used for purchases (see Online Appendix Figure II discussed in Online Appendix B). We therefore view the Affinity series as providing statistics that are representative of total card spending, but not total consumer spending.
2. Small-Business Revenue
We obtain data on small-business transactions and revenues from Womply, a company that aggregates data from several credit card processors to provide analytical insights to small businesses and other clients. Womply receives data from approximately 500,000 small businesses, which corresponds to more than 5% of small businesses with 1–499 employees in the United States in 2020 (U.S. SBA Office of Advocacy 2020). In contrast to the Affinity series on consumer spending, which is a cardholder-based panel covering total spending, Womply is a firm-based panel covering total revenues of small businesses disaggregated by county, sector, and week. Another key distinction is that Womply data measure the location of the business as opposed to where the cardholder lives.
We process these small-business data following the same four broad steps that we apply to the consumer spending data from Affinity, but we tailor the methodology to the structure of the Womply data, as detailed in Online Appendix C. To take one example, there are again discontinuous breaks in the number of observed small businesses due to churn in the observed sample of payment processors, analogous to the entry and exit of card providers in consumer spending data. However, unlike the repeated cross sections of consumer spending data, we can address such sample churn more directly using the panel data on small businesses. In each calendar year, we follow the sample of businesses operating during the first week of the year: no new businesses enter the panel midyear. We must still detect cases where a payment processor exits the sample, and to do so we adopt an approach similar to the one we applied to the consumer spending data: we look for sharp drops in the number of businesses operating at the state and national levels (Online Appendix Figure I.B).6 After adjusting for these discontinuous exits, we proceed with the rest of the steps described in Online Appendix C to construct a seasonally adjusted series for total small-business revenue.
Womply revenues are broadly distributed across sectors. A larger share of the Womply revenue data come from industries that have a larger share of small businesses, such as food services, professional services, and other services, as one would expect given that the Womply data only cover small businesses (Online Appendix Figure II).
3. Job Postings
We obtain data on job postings from 2007 to the present from Lightcast (formerly known as Burning Glass Technologies). Lightcast aggregates nearly all jobs posted online from approximately 40,000 online job boards in the United States. Lightcast then removes duplicate postings across sites and assigns attributes including geographic locations, required job qualifications, and industry.
We receive raw data from Lightcast on job postings disaggregated by industry, week, job qualifications, and county.7 We report job postings at the weekly level, expressed as changes in percentage terms relative to the first four complete weeks of 2020.
Lightcast provides a sample that is representative of private sector job postings in the United States. Online Appendix Figure III shows that the distribution of industries in the Lightcast data is well aligned with the Bureau of Labor Statistics’ Job Openings and Labor Turnover Survey (JOLTS), consistent with Carnevale, Jayasundera, and Repnikov (2014).
4. Employment
We use three data sources to obtain information on employment rates: payroll data from Paychex and Intuit and worker-level data from Earnin. We describe these data sources and then discuss how we construct a weekly series that is broadly representative of private nonfarm employment rates in the United States (see Online Appendix Tables III and IV).
i. Paychex and Intuit. Paychex provides payroll services to approximately 670,000 small and medium-sized businesses across the United States and pays 8% of U.S. private-sector workers (Paychex 2020). To track how employment changes vary across the wage distribution, we separate employees into four groups based on their hourly wage rates, annualized as if they worked full-time for the full year: below 100%, 100%–150%, 150%–250%, and above 250% of the federal poverty line (FPL). For convenience, we refer to these groups as “wage quartiles” because these thresholds group workers approximately into quartiles before the pandemic.8 This approach allows us to track the total number of jobs in different parts of the wage distribution, adjusting for inflation over time. We obtain aggregate weekly data on total employment for each hourly wage group by county, industry (two-digit NAICS), firm size bin, and pay frequency.
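The wage grouping can be illustrated with a short function. This is a sketch under assumptions the text does not pin down: full-time, full-year work is taken to be 2,080 hours (40 hours for 52 weeks), the applicable poverty-line figure is supplied by the caller, and a wage exactly at a threshold is assigned to the higher group. The function name is hypothetical.

```python
# Thresholds as multiples of the federal poverty line, from the text.
FPL_MULTIPLES = (1.0, 1.5, 2.5)


def wage_group(hourly_wage, fpl_annual, full_year_hours=2080):
    """Return 1-4 for the FPL-based wage group of an hourly wage.

    fpl_annual is the caller-supplied annual poverty line; full_year_hours
    (40 hours x 52 weeks) is an assumed definition of full-time work.
    """
    ratio = hourly_wage * full_year_hours / fpl_annual
    for group, multiple in enumerate(FPL_MULTIPLES, start=1):
        if ratio < multiple:
            return group
    return len(FPL_MULTIPLES) + 1
```

Because the cutoffs are tied to the poverty line rather than to fixed dollar amounts, updating `fpl_annual` each year keeps the groups comparable in real terms, which is the inflation adjustment the paragraph refers to.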
Intuit offers payroll services to businesses as part of its QuickBooks program, covering approximately 1 million businesses as of January 2020. Businesses that use QuickBooks tend to be very small (fewer than 20 employees). We obtain anonymized, aggregated data on month-on-month and year-on-year changes in total employment (the number of workers paid in the prior month) based on repeated cross sections. We construct a national series from population-weighted averages of state changes in each month.
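The national aggregation in the last sentence is a weighted mean. The sketch below shows the arithmetic for a single month; the function name and the dictionary layout are hypothetical, and the choice of population measure used as weights is left to the caller because the text does not specify it.

```python
def population_weighted_change(state_changes, state_populations):
    """National change as the population-weighted mean of state changes.

    state_changes maps state -> employment change for one month;
    state_populations maps state -> the population weight (assumed input).
    """
    states = [s for s in state_changes if s in state_populations]
    total_pop = sum(state_populations[s] for s in states)
    if total_pop <= 0:
        raise ValueError("population weights must sum to a positive number")
    return sum(state_changes[s] * state_populations[s] for s in states) / total_pop
```

Applying the function month by month produces the national series; states missing from either input are skipped rather than imputed.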
To protect business privacy and maximize precision, we combine Paychex and Intuit data to construct our primary employment series. We clean this series for analysis following the general principles (see Online Appendix E).9 We do not seasonally adjust our employment series because we have incomplete data in 2019; fortunately, seasonal fluctuations in employment are an order of magnitude smaller than those in spending (Online Appendix Figure IV) and hence are unlikely to affect our results.
ii. Earnin. Earnin is a financial management application that provides its members with access to their income as they earn it, in advance of their paychecks. Because its users tend to have lower income levels, Earnin primarily provides information on employment for low-wage workers. We obtain anonymized data from Earnin from January 2020 onward, describing the date a paycheck is received, workplace ZIP code, firm size, industry, and amount of pay. Earnin complements the firm-based payroll data sets by providing a worker-level sample with more granular ZIP-level geographic identifiers. However, because workers self-select into the sample when they enter or exit the Earnin customer base, the labor market disruptions of the pandemic generate substantial sample selection over time. We use the Earnin sample only to study the first six months of the COVID pandemic, from March to September 2020, when the sample is relatively stable. We convert the Earnin data into an employment series using an approach similar to that used to construct the combined Paychex and Intuit employment series (detailed in Online Appendix E).
5. Public Data Sources: UI Records, COVID-19 Incidence, and Google Mobility Reports
In addition to the new private sector data sources, we collect and use three sets of data from public sources to supplement our analysis: data on unemployment benefit claims obtained from the Department of Labor and state government agencies; data on COVID-19 cases and deaths obtained from the New York Times, Johns Hopkins, the Centers for Disease Control and Prevention (CDC), and the U.S. Department of Health and Human Services; and data on the amount of time people spend at home versus other locations obtained from Google’s COVID-19 Community Mobility Reports. More details on these data sources are provided in Online Appendices F to H.
II.C. Limitations
The rest of this article demonstrates how the database assembled here is valuable for uncovering the economic effects of COVID-19. However, these new data also have three important limitations that users should weigh, especially in future applications.
First, each data series we construct necessarily reflects the clientele of the data provider, and thus does not provide guarantees of population representativeness. We take several steps to verify that each data series is nationally representative: we compare the cross-sectional composition of each series against nationally representative statistics in this section, and compare trends in each series during the COVID-19 pandemic to data from publicly available benchmarks in the next section. But it is impossible to verify the representativeness of each level of disaggregation (e.g., county-level consumer spending), precisely because no existing public data sets provide similarly granular and high-frequency data—hence the value of these novel data sources. Given this limitation, it is valuable to verify empirical results using multiple different data series and triangulate findings against whatever data are available from representative surveys at coarser levels of aggregation, as we do in our analysis below.
Second, the series we construct have sampling error arising from idiosyncratic variation across firms and households and from changing client bases and business closures. The economic shocks associated with COVID-19 were especially large, making their effects easy to detect even in the presence of such errors. In Section III.C, we show that the data series are sufficiently reliable to detect moderate-sized changes in economic activity at local levels (e.g., employment rate changes of 4 percentage points at the commuting zone level for the 50 largest commuting zones). Smaller fluctuations, such as monthly innovations in employment rates during periods of normal economic growth, will not be distinguishable from sampling noise in these data sets.
Third, while our data cover certain sectors well—such as spending on items covered by credit and debit cards—they entirely exclude other sectors, such as spending on housing and durable goods such as vehicles. In the context of the COVID-19 pandemic, the data series we construct overlap with the sectors that exhibit the largest changes in economic activity (see Section III), but in other applications that may not be the case.
In light of these limitations, the data constructed here should be used to complement representative survey-based statistics, not as a substitute. Furthermore, the present version of the database is a prototype that can be improved over time. For example, noise in estimates of employment rates from changes in payroll firms’ clientele can be mitigated by chaining together estimates of employment changes from rotating panels of firms instead of relying on repeated cross sections. Adding additional data partners can address gaps in coverage, such as spending on housing. Such refinements could mitigate the limitations, though statistics from representative surveys will remain essential as benchmarks.
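To make the chaining idea concrete, the sketch below cumulates period-on-period employment changes into an index. It is only an illustration of the general technique under our own assumptions: the function name and the growth rates are hypothetical, and the sketch does not describe how any particular panel of firms would be selected.

```python
def chain_index(period_changes, base=1.0):
    """Cumulate period-on-period changes into a level index.

    period_changes: list of proportional changes, each estimated from a panel
    of firms observed in both adjacent periods (hypothetical inputs here).
    Returns the index level in every period, starting from `base`.
    """
    levels = [base]
    for change in period_changes:
        levels.append(levels[-1] * (1.0 + change))
    return levels


# Hypothetical month-on-month changes: +10%, then -50%.
index = chain_index([0.10, -0.50])
```

Because each link uses only firms present in both periods, a change in the client base between periods does not by itself move the chained index, in contrast to a comparison of repeated cross sections.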
III. Economic Impacts of COVID-19
According to the U.S. Bureau of Economic Analysis (2020), GDP fell by $1.61 trillion (an annualized rate of 29.9%) from the first quarter of 2020 to the second quarter of 2020, shown by the first bar in Online Appendix Figure V.A. GDP fell primarily because of a reduction in personal consumption expenditures (consumer spending), which fell by $1.20 trillion. Government purchases and net exports did not change significantly, while private investment fell by $0.53 trillion.10 We begin our analysis by studying the determinants of this sharp reduction in consumer spending. Then we turn to examine downstream effects of the reduction in consumer spending on business activity and the labor market.
III.A. Consumer Spending
We analyze consumer spending using data on aggregate credit and debit card spending. National accounts data show that spending that is well captured on credit and debit cards (essentially all spending excluding housing, health care, and motor vehicles) fell by approximately $0.90 trillion between the first quarter of 2020 and the second quarter of 2020, making up 75% of the total reduction in personal consumption expenditures.11
1. Benchmarking
Our card spending series is well aligned with the Advance Monthly Retail Trade Survey (MARTS), one of the main inputs used to construct the national accounts.12 Online Appendix Figure V.B plots the month-on-month changes in spending on retail services (excluding auto-related expenses) and food services: both series track each other before the pandemic, then food services spending drops rapidly in March and April 2020, while total retail spending fluctuates much less during the pandemic. The root mean square error (RMSE) of the Affinity series relative to the MARTS is 3 to 5 percentage points, which is small relative to the fluctuations induced by COVID, but calls for caution in evaluating smaller shocks. Online Appendix Figure VI.A expands this analysis to other categories by plotting the change in spending from January to April 2020 in the Affinity spending series against the decline in consumer spending as measured in the MARTS. Despite the fact that the MARTS category definitions are not perfectly aligned with those in the card spending data, the relative declines are generally well aligned across sectors, with a correlation of 0.92.13
2. Heterogeneity by Income
We begin by examining spending changes by household income. We do not directly observe cardholders’ incomes in our data; instead, we proxy for cardholders’ incomes using the median household income in the ZIP code in which they live (based on data from the 2014–18 American Community Survey). ZIP codes are strong predictors of income because of the degree of income segregation in most U.S. cities; however, they are not a perfect proxy for income and can be prone to bias in certain applications, particularly when studying tail outcomes (Chetty et al. 2020). To evaluate the accuracy of our ZIP code imputation procedure, we compare our estimates to those in contemporaneous work by Cox et al. (2020), who observe cardholder income directly based on checking account data for clients of JPMorgan Chase. Our estimates are closely aligned with those estimates, suggesting that the ZIP code proxy is reasonably accurate in this application.14
Figure I, Panel A plots a seven-day moving average of total daily card spending for households in the bottom versus top quartile of ZIP codes based on median household income. Spending fell sharply on March 15, 2020, when the national emergency was declared and the threat of COVID became widely discussed in the United States. Spending fell from $8.3 billion a day in February 2020 to $5.5 billion a day between March 25 and April 14, 2020 (a 34% reduction) for high-income households; the corresponding change for low-income households was $3.5 billion to $2.7 billion (a 23% reduction).
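The percentage reductions quoted above follow directly from the daily spending levels. The short check below reproduces that arithmetic; the function name is ours, and the inputs are the rounded figures from the text, so the results match the reported percentages only after rounding.

```python
def percent_decline(before, after):
    """Percentage decline from `before` to `after`."""
    return 100.0 * (before - after) / before


# Daily card spending in billions of dollars, as quoted in the text.
high_income_decline = percent_decline(8.3, 5.5)  # top-quartile ZIP codes
low_income_decline = percent_decline(3.5, 2.7)   # bottom-quartile ZIP codes

print(round(high_income_decline), round(low_income_decline))
```

The dollar decline is also much larger for high-income households (2.8 versus 0.8 billion dollars a day), which is why they account for a larger share of the aggregate decline.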

Changes in Consumer Spending during the COVID Pandemic
This figure disaggregates spending changes by income and sector using debit and credit card data from Affinity Solutions and national accounts (NIPA) data. Panel A plots daily spending levels for consumers in the highest and lowest quartiles of household income by combining total card spending in January 2020 (from NIPA Table 2.3.5) with our Affinity Solutions spending series. See the notes to Online Appendix Table V for details on this method. Panel B disaggregates the sectoral shares of seasonally adjusted spending changes (left bar) and pre-COVID spending levels (right bar). See Online Appendix B.3 for the definitions of the sectors plotted in Panel B. Panel C decomposes the change in personal consumption expenditures (PCE) in the Great Recession and the COVID-19 recession using NIPA Table 2.3.6. PCE is defined here as the sum of durable goods, nondurable goods, and services in seasonally adjusted, chained (2012) dollars. The peak to trough declines are calculated from December 2007 to June 2009 for the Great Recession and from January to April 2020 for the COVID-19 recession. Data sources: Affinity Solutions, NIPA.
Because high-income households cut spending more in percentage terms and accounted for a larger share of aggregate spending to begin with, they accounted for a much larger share of the decline in total spending in the United States than low-income households. In Online Appendix Table V, Panel A, column (2) we estimate that as of mid-April 2020, top-quartile households accounted for 41% of the aggregate spending decline after the COVID shock, while bottom-quartile households accounted for only 12% of the decline.
This gap in spending patterns by income grew even larger over time. By August 2020, spending had returned to 2019 levels among households in the bottom quartile, whereas spending among high-income households remained 8% below baseline levels. Spending then continued to rise gradually in subsequent months and began to exceed pre-COVID levels starting in 2021 for both low- and high-income groups. The degree of heterogeneity in spending changes by income is larger than that observed in previous recessions (Petev, Pistaferri, and Saporta-Eksten 2011, figure 6) and played a central role in the downstream impacts of COVID on businesses and the labor market, as we show below.
3. Heterogeneity across Sectors
Next we disaggregate the change in total card spending across categories to understand why households cut spending so rapidly. In particular, we seek to distinguish two channels: reductions in spending due to loss of income versus fears of contracting or spreading COVID.
The left bar in Figure I, Panel B plots the share of the total decline in spending from the pre-COVID period to mid-April 2020 accounted for by various categories. Fifty-seven percent of the reduction in spending came from reduced spending on goods or services that require in-person contact (and thereby carry a risk of COVID infection), such as hotels, transportation, and food services. This is particularly striking given that these goods accounted for less than one-third of total spending in January, as shown by the right bar in Figure I, Panel B. These gaps grew larger as the pandemic progressed, as consumer spending increased above prepandemic levels for goods and remote services by mid-August 2020, but remained sharply depressed for in-person services (Online Appendix Table V, Panel B). The fact that the spending reductions vary so sharply across goods in line with their health risks indicates that health concerns (either one’s own health or altruistic concerns about others’ health) rather than a lack of purchasing power drove spending reductions.
These patterns of spending reductions differ markedly from those observed in prior recessions. Figure I, Panel C compares the change in spending across categories in national accounts data in the COVID recession and the Great Recession in 2009–10. In the Great Recession, nearly all of the reduction in consumer spending came from a reduction in spending on goods; spending on services was almost unchanged. In the COVID recession, 71% of the reduction in total spending came from a reduction in spending on services.
4. Heterogeneity by COVID Incidence
To further evaluate the role of health concerns, we examine the association between COVID case rates across areas and changes in spending. Figure II shows that spending fell more in counties with higher rates of COVID infection, in both low- and high-income areas, during the trough in consumer spending from March 25 to April 14, 2020. However, there was a substantial reduction in spending even in areas without high rates of realized COVID infection, consistent with widespread concern about the disease even in areas where outbreaks were less prevalent. To examine the mechanism driving these spending reductions, Online Appendix Figure VII uses anonymized cell phone data from Google to present a binned scatter plot of the amount of time spent outside home versus COVID case rates, again separately for low- and high-income counties. As in Figure II, there is a strong negative relationship between time spent outside and COVID case rates, with a steeper slope in low-income counties. The reduction in spending on services that require physical, in-person interaction (e.g., restaurants) follows directly from this reduction in time spent outside.

Association between COVID-19 Incidence and Changes in Consumer Spending
This figure presents a county-level binned scatter plot. To construct it, we divide the data into 20 equal-sized bins, ranking by the x-axis variable and weighting by the county’s population, and plot the (population-weighted) means of the y-axis and x-axis variables in each bin. The y-axis presents the change in seasonally adjusted consumer spending from the base period (January 6–February 2, 2020) to the three-week period of March 25 to April 14, 2020 (see Section II.B and Online Appendix B for details on the construction of our consumer spending series). The x-axis variable is the log of the county’s cumulative COVID case rate per capita as of April 14, 2020; axis labels show the levels on a log scale. We plot values separately for counties in the top and bottom quartiles of median household income (measured using population-weighted 2014–2018 ACS data). Data sources: Affinity Solutions, New York Times.
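The binned scatter construction described in the figure notes can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes that "equal-sized" means bins holding equal shares of total population weight, and the function name and all inputs are hypothetical.

```python
import numpy as np


def binned_scatter(x, y, weights, n_bins=20):
    """Return weighted means of x and y within bins ranked by x.

    Assumption: bins are equal-sized in terms of total weight (population),
    so each bin holds roughly 1/n_bins of the weighted observations.
    """
    x, y, w = (np.asarray(a, dtype=float) for a in (x, y, weights))
    order = np.argsort(x, kind="stable")
    x, y, w = x[order], y[order], w[order]

    # Cumulative weight share evaluated at the midpoint of each observation.
    cum_share = (np.cumsum(w) - 0.5 * w) / w.sum()
    bins = np.minimum((cum_share * n_bins).astype(int), n_bins - 1)

    x_means, y_means = [], []
    for b in range(n_bins):
        in_bin = bins == b
        if in_bin.any():  # skip bins left empty by very heavy observations
            x_means.append(np.average(x[in_bin], weights=w[in_bin]))
            y_means.append(np.average(y[in_bin], weights=w[in_bin]))
    return np.array(x_means), np.array(y_means)
```

Plotting the returned pairs gives one point per bin. Applying the function separately to top-quartile and bottom-quartile counties would produce the two series shown in the figure.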
In sum, disaggregated data on consumer spending reveal that spending in the initial stages of the pandemic fell primarily because of health concerns rather than a loss of current or expected income—consistent with the mechanisms emphasized by Eichenbaum, Rebelo, and Trabandt (2021). Disposable income ultimately fell relatively little because few high-income individuals lost their jobs (as we show in Section III.C) and because the income losses of lower-income households who lost their jobs were more than offset by supplemental unemployment benefits, stimulus payments, and other transfers (Ganong, Noel, and Vavra 2020; Blanchet, Saez, and Zucman 2022). Next we turn to the effects of the spending reductions induced by these health concerns on businesses and the labor market.
III.B. Business Revenues
Services that are consumed in person (e.g., restaurants) are typically produced by small businesses who serve customers in their local area.15 The reduced in-person spending by high-income households documented above thus has heterogeneous effects across areas, with businesses located in more affluent areas facing larger spending shocks. We exploit this geographic heterogeneity to identify the effects of the reduction in consumer spending on businesses and their employees, starting by examining effects on small-business revenues.16
1. Benchmarking
We measure small-business revenues using data from Womply, which records revenues from credit card transactions for small businesses (as defined by the Small Business Administration) at the location where the sale occurs. Because there is no publicly available series on small-business revenues, we compare trends in the Womply data to the Affinity consumer spending data. These series are generally well aligned, especially in sectors with a large share of small businesses, such as food and accommodation services, where the RMSE of the Womply series relative to Affinity is 2.68 percentage points (Online Appendix Figure VIII). For retail, where large businesses have a larger market share, the RMSE is 6.72 percentage points.
2. National Trends
In the aggregate time series (plotted in Online Appendix Figure IX.A), small-business revenues fell by 48% when the pandemic began and then recovered to 11% below pre-COVID levels by July 2020. Small-business revenues then remained at that level until late 2020, reaching pre-COVID levels only in September 2021. The larger fall and slower recovery of small-business revenues relative to total consumer spending is consistent with evidence that consumer spending shifted toward large online retailers during the pandemic (Alexander and Karger 2023). Unfortunately, we lack data on revenues at large businesses, so we cannot examine these impacts directly.
3. Heterogeneity across Areas
To illustrate the data underlying our geographic analysis, we map the change in small-business revenues from January 2020 to the period immediately after the COVID shock (March 23–April 12, 2020) by ZIP code in New York City, Chicago, and San Francisco (Online Appendix Figure X).17 In all three cities, revenue losses were largest in the most affluent neighborhoods (Manhattan in New York and Lincoln Park in Chicago) and in the central business districts in each city. Even in predominantly residential areas, businesses located in more affluent neighborhoods suffered much larger revenue losses.
Figure III generalizes these examples by presenting a binned scatter plot of percent changes in small-business revenue versus median rents (for a two-bedroom apartment) by ZIP code.18 In the top ventile of ZIP codes by rent, small-business revenues fell by 61%, compared with 31% in the bottom ventile of ZIP codes by rent, consistent with the differences observed in the Affinity consumer spending data across areas.19

Changes in Small-Business Revenues versus Median Two-Bedroom Rent, by ZIP Code
This figure presents a binned scatter plot showing the relationship between changes in seasonally adjusted small-business revenue in Womply data versus rent at the ZIP code level. The binned scatter plot is constructed as described in Figure II. We measure changes in small-business revenue as the average value of our index at the ZIP code level between March 23 and April 12, 2020 (see Section II.B and Online Appendix C for details on the construction of our small-business revenue series). The x-axis variable is the ZIP code median rent for a two-bedroom apartment in the 2014–2018 ACS. Data sources: Womply, ACS.
The business revenue loss versus rent gradient is similar when we compare ZIP codes within the same county by regressing revenue changes on rent with county fixed effects (Table I, Panel A, column 2), or when comparing businesses within the same industry across ZIP codes using sector fixed effects (Online Appendix Figure XII.A). It also remains similar when controlling for the (pre-COVID) density of high-wage workers in a ZIP code to account for differences that may arise from shifts to remote work in business districts (Table I Panel A, column (3)).20
Dep. var.: change in small-business revenue (percentage points) from January to April 2020 | (1) | (2) | (3)
---|---|---|---
Panel A: Business Revenue | | |
Median two-bedroom rent (per thousand dollars) | −16.71 (0.90) | −15.95 (2.20) | −14.33 (2.21)
Log density of high-wage workers | | | −2.38 (0.34)
County FEs | | X | X
Observations | 9,917 | 9,917 | 9,917
Geographic unit | ZIP code | ZIP code | ZIP code
Dep. var.: change in low-wage employment (percentage points) from January to July 2020 | (1) | (2) | (3)
---|---|---|---
Panel B: Low-Wage Employment | | |
Median two-bedroom rent (per thousand dollars) | −6.82 (1.17) | −4.33 (1.54) | −5.63 (0.89)
Log density of high-wage workers | | −0.86 (0.36) | −0.74 (0.28)
County FEs | | | X
Observations | 949 | 949 | 11,223
Geographic unit | County | County | ZIP code
Data source | Paychex & Intuit | Paychex & Intuit | Earnin
Notes. This table presents estimates from population-weighted OLS regressions at the county and ZIP code level. We regress percentage changes in small-business revenue (using Womply data) and low-wage employment (using Paychex-Intuit and Earnin data) on median two-bedroom rent (as measured in the 2014–2018 ACS). Standard errors are reported in parentheses; county-level regressions use robust standard errors and ZIP-level regressions use standard errors clustered by county. The dependent variable is in percentage point units. The dependent variable in Panel A is the average change in small-business revenue measured from March 23 to April 12, 2020, relative to January 4 to 31, 2020. All regressions in Panel A are at the ZIP code level. The dependent variable in Panel B is the change in low-wage employment measured from June 27 to July 31, 2020, relative to January 4 to 31, 2020. In Panel B, columns (1) and (2) are at the county level using combined Paychex and Intuit data, while column (3) is at the ZIP code level using Earnin data. In both panels, column (1) shows the baseline regression without any controls: this specification corresponds to the estimated slope coefficient and standard error reported in Figure III (small-business revenue) and Figure IV, Panel C (low-wage employment). In Panel A, column (2) adds county fixed effects and column (3) further adds the log of the density of high-wage workers as a control (which is observed using the Census LODES for 92% of ZIP codes representing 99% of the U.S. population). In Panel B, column (2) adds the log of the density of high-wage workers as a control to the baseline county level regression, and column (3) switches to ZIP code–level data for a specification analogous to the one in column (3) of Panel A. Data sources: Womply, Paychex, Intuit, Earnin, ACS, Census LODES.
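The point estimates in specifications of this kind can be illustrated with a short sketch of a population-weighted regression slope, optionally with group (e.g., county) fixed effects absorbed by weighted within-group demeaning. This is our own minimal illustration, not the authors' code: the function name and inputs are hypothetical, and the sketch does not compute the robust or clustered standard errors reported in the table.

```python
import numpy as np


def weighted_slope(y, x, weights, groups=None):
    """Weighted least squares slope of y on x.

    If `groups` is given, group fixed effects are absorbed by subtracting
    weighted group means from y and x before computing the slope.
    With groups=None the regression simply includes a single intercept.
    """
    y, x, w = (np.asarray(a, dtype=float) for a in (y, x, weights))
    if groups is None:
        groups = np.zeros(len(y), dtype=int)
    groups = np.asarray(groups)

    y_dm, x_dm = y.copy(), x.copy()
    for g in np.unique(groups):
        in_group = groups == g
        y_dm[in_group] -= np.average(y[in_group], weights=w[in_group])
        x_dm[in_group] -= np.average(x[in_group], weights=w[in_group])

    return float(np.sum(w * x_dm * y_dm) / np.sum(w * x_dm ** 2))
```

In this sketch, `y` would be the ZIP-level change in revenue, `x` the median rent in thousands of dollars, `weights` the population, and `groups` a county identifier when fixed effects are included.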
In sum, businesses in dense, affluent areas lost the most revenue—consistent with the sharp reduction in spending on in-person goods and services by high-income households. Next, we examine how businesses reacted to this loss of revenue, focusing on the incidence of the shock on their employees.
III.C. Labor Market Effects
We begin by analyzing how the loss of revenues affected labor demand using data on job postings from Lightcast. Figure IV, Panel A presents a binned scatter plot of the change in job postings that require minimal education between January 2020 and the April 2020 trough versus median rents by county. Job postings with minimal educational requirements fell much more sharply in high-rent areas than in lower-rent areas (difference = 6.9 percentage points, or 22.8%), consistent with the larger shocks to revenue faced by firms located in high-rent areas. By contrast, postings for jobs that require higher levels of education, which are much more likely to be in tradeable sectors that are less influenced by local conditions (e.g., finance or professional services), exhibit no relationship with local rents (Figure IV, Panel B).

Changes in Job Postings and Employment Rates versus Rent
This figure shows binned scatter plots of the relationship between median rents and changes in job postings (Panels A and B) or changes in employment rates (Panel C). The binned scatter plots are constructed as described in Figure II. Solid lines are best-fit lines estimated using OLS. Each panel also displays the slope coefficient and standard error of the corresponding linear OLS regression. In each panel, the x-axis variable is the median rent in a county for a two-bedroom apartment in the 2014–2018 ACS. In Panel A, the y-axis variable is the average value of our job postings series for jobs requiring minimal or some education between March 25 and April 14, 2020 (see Section II.B and Online Appendix D for more detail on our job postings series). Panel B replicates Panel A with job postings for workers with moderate, considerable, or extensive education. In both Panels A and B, we winsorize our job postings series at the 99th percentile of the (population-weighted) county-level distribution within each level of required education. In Panel C, the y-axis variable is the average value of our bottom wage quartile employment series during July 2020 (see Section II.B and Online Appendix E for more detail on the construction of our employment series). Data sources: Paychex, Intuit, Lightcast, ACS.
Having established that the pandemic reduced labor demand, especially for lower-skilled workers working in affluent areas, we turn to examine its effects on employment rates using data from payroll companies.
1. Benchmarking
Our payroll-based employment series is broadly aligned with measures from nationally representative statistics. Online Appendix Figure XIII.A shows that month-on-month changes in employment rates for all workers estimated from combined Paychex and Intuit payroll data generally fall between estimates obtained from the Current Employment Statistics (CES; a survey of businesses) and Current Population Survey (a survey of households). Turning to specific sectors, Online Appendix Figure XIII.B focuses on month-on-month employment changes in two sectors that experienced very different trajectories: food services, where employment fell heavily, and professional services, where it did not. In both cases, our Paychex-Intuit series closely tracks data from the CES. Online Appendix Figure XIV.A shows more generally that changes in employment rates across private nonfarm sectors (two-digit NAICS) are very closely aligned in our series and the CES, with a correlation of 0.97 when looking at changes from January to July 2020.
Unlike with spending and business revenues, there are publicly available sources of data on employment rates that can be disaggregated geographically and used to evaluate the representativeness of our data across areas. Our employment series closely matches state-level variation in employment changes during the pandemic in the CES, with a population-weighted correlation of 0.73 when looking at changes from January to July 2020 (Online Appendix Figure XIV.B). Our estimates are also well aligned with commuting zone (CZ) level estimates from the Quarterly Census of Employment and Wages (QCEW) (Online Appendix Figure XIV.C). Similarly, disaggregating the national data by wage rate, we find that our estimates are closely aligned with estimates based on the Current Population Survey and estimates in Cajner et al. (2020) (Online Appendix Figure XV).
These comparisons indicate that our combined employment series provides representative estimates of changes in employment rates across wage groups and geographic areas during the COVID pandemic. A natural question going forward is how accurate our local area estimates will be in more typical periods, where the shocks of interest are likely to be far smaller than during the pandemic. To evaluate the accuracy of our data from this broader perspective, we calculate the population-weighted RMSE between our estimates of CZ-level changes in quarterly employment in January to September 2021 and corresponding statistics from the QCEW. We find an RMSE of 1.69 percentage points for the 50 largest CZs and 3.64 percentage points when including all CZs. Because the QCEW statistics are based on unemployment insurance records covering the entire population, the RMSE can be loosely interpreted as the average standard error of our estimate, accounting for noise arising from both sampling error and changes in nonrepresentative sampling. The relatively small RMSEs indicate that our data can identify employment shocks considerably smaller than those induced by the pandemic. For instance, the worst-hit quartile of CZs in the United States during the Great Recession had mean employment losses of 8.73 percentage points from 2007 to 2010, while the least-hit quartile of CZs had mean employment gains of 2.59 percentage points; our data would have been sufficiently precise to reliably differentiate those CZs. As another example, Aldy (2014) estimates that the 2010 Gulf oil spill decreased employment in non-Panhandle Gulf Coast Florida counties by 2.7 percentage points; since the population of this region is equivalent to the fourth-largest CZ (with population of 7 million), our payroll-based series would have been sufficiently precise to detect and monitor this effect in near real time as well.
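The accuracy measure used in this comparison, a population-weighted RMSE between our CZ-level estimates and a benchmark, can be sketched as follows. The function name and the inputs are hypothetical; the sketch only illustrates the formula and does not reproduce the reported values.

```python
import numpy as np


def weighted_rmse(estimates, benchmarks, populations):
    """Population-weighted root mean square error.

    estimates, benchmarks: changes in employment (percentage points) by area
    populations: area populations used as weights
    """
    est, bench, pop = (
        np.asarray(a, dtype=float) for a in (estimates, benchmarks, populations)
    )
    squared_errors = (est - bench) ** 2
    return float(np.sqrt(np.average(squared_errors, weights=pop)))


# Hypothetical example with two equally sized areas and errors of 1 and 3 points.
example_rmse = weighted_rmse([1.0, 3.0], [0.0, 0.0], [1.0, 1.0])
```

Weighting by population means that errors in large commuting zones count for more, which is consistent with the smaller RMSE reported for the 50 largest CZs than for all CZs.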
The key limitation of publicly available employment data is that existing data sources can be disaggregated by county or by wage level, but not by both at once. Our payroll-based data sources allow us to measure changes in employment by county and wage level jointly, which proves valuable in understanding the effects of the COVID shock.21
2. Heterogeneity by Wage Rates
Figure V, Panel A plots employment rates by real prepandemic wage quartile. Each series shows the change in the total number of workers employed in jobs with hourly wage rates that fall in the relevant quartile of the pre-COVID wage distribution (with thresholds adjusted over time for inflation as described in Section II.B) relative to the baseline level in January 2020.

Figure V. Changes in Employment by Wage Quartile
Panel A plots our combined Paychex-Intuit employment series from January 2020 through December 2021 for each wage quartile. We define moving wage quartile thresholds in each month based on 100%, 150%, and 250% of the federal poverty line for a family of four, adjusted for inflation, then converted into a full-time-equivalent hourly wage by dividing by 2,000 hours (50 weeks of work at 40 hours per week). In January 2020, the thresholds were $13.10, $19.65, and $32.75, and the four bins in ascending order by wage contained 23.4%, 27.4%, 25.7%, and 23.5% of CPS respondents. See Section II.B and Online Appendix E for details on the construction of this series. In Panel B, we reweight the county-by-industry (two-digit NAICS) distribution of bottom wage quartile employment to match the distribution for top wage quartile employment in January 2020. For each series in Panel B, we restrict the sample to county-by-industry cells with nonzero employment in all four wage quartiles in January 2020; this sample restriction excludes 2.5% of worker-days from the sample. Data sources: Paychex, Intuit.
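The threshold construction described in the caption can be sketched as follows. This is an illustration under stated assumptions: the annual poverty-line value of $26,200 is backed out from the reported $13.10 threshold (13.10 × 2,000), the function names are hypothetical, and the treatment of wages exactly at a threshold is an assumption that the caption does not specify.

```python
HOURS_PER_YEAR = 2_000  # 50 weeks of work at 40 hours per week


def wage_thresholds(poverty_line_family_of_four, multiples=(1.0, 1.5, 2.5)):
    """Convert poverty-line multiples into full-time-equivalent hourly wages."""
    return [
        round(poverty_line_family_of_four * m / HOURS_PER_YEAR, 2)
        for m in multiples
    ]


def wage_bin(hourly_wage, thresholds):
    """Return the bin (1 = lowest, 4 = highest) for an hourly wage.

    Assumption: a wage exactly equal to a threshold is placed in the higher bin.
    """
    return 1 + sum(hourly_wage >= t for t in thresholds)


# Assumption: $26,200 is implied by the reported $13.10 threshold for January 2020.
jan_2020_thresholds = wage_thresholds(26_200)
```

In later months the same function would be applied to the inflation-adjusted poverty line, so that the thresholds move over time as described in Section II.B.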
We find substantial heterogeneity in job losses by wage rate, consistent with the findings of Cajner et al. (2020) in prior work using ADP data. Employment rates fell by 39% around the trough of the recession (April 15, 2020) for workers in the bottom wage quartile (i.e., the total number of jobs paying less than $13.10/hour in January 2020 was 39% lower as of April 15). By contrast, employment rates fell by only 14% for those in the top wage quartile (those jobs paying more than $32.75/hour in January 2020) as of April 15.
High-wage workers not only were less likely to lose their jobs to begin with but also recovered their jobs much more quickly. By June 2020—just three months after the recession began—employment for high-wage workers had nearly returned to the pre-COVID baseline. Employment rates in low-wage jobs recovered rapidly to 20% below baseline levels by summer 2020, but then stalled from that point onward.
3. Heterogeneity across Areas
To identify the mechanisms driving these employment effects, we again exploit geographic variation, studying whether employment fell most in the high-rent areas that faced the largest demand shocks. Figure IV, Panel C plots changes in bottom-wage-quartile employment rates from January to July 2020 versus median rents, by county. Consistent with the larger shocks in high-rent areas to business revenue and labor demand for low-skilled workers, low-wage employment rates fell much more in more affluent counties. Low-wage employment fell by 21.7% in the highest-rent counties, compared with 16.8% in the lowest-rent counties. We find a similar pattern at the ZIP code level using employment data from Earnin (Online Appendix Figure XVI.A). Table I, Panel B presents a set of regression estimates quantifying these effects. Low-wage employment rates fell more in higher-rent areas (column (1)), even when controlling for the density of high-wage workers (column (2)) and comparing ZIP codes within the same county (column (3)).
The concentration of employment losses in more affluent areas is a consequence of the specific pattern of demand shocks induced by COVID rather than a general feature of recessions. Online Appendix Figure XVII shows that in the Great Recession (2007–2010), counties in the bottom quartile of the household median income distribution accounted for 29% of job losses, while those in the top quartile accounted for 21% of job losses. By contrast, in the COVID recession (January–April 2020), counties in the top quartile accounted for a larger share of job losses than counties in the bottom quartile.
In summary, the pandemic led to a short V-shaped recession for high-wage workers, but a prolonged reduction in employment for lower-wage workers that persisted until at least December 2021, the end of our analysis period. Geographic disaggregation reveals that the drop in low-wage employment at the start of the pandemic was driven primarily by a contraction in spending among high-income individuals—which then reduced labor demand for low-skilled workers—rather than voluntary reductions in labor supply (that might have been induced, for example, by health concerns or unemployment benefits). In the next section, we examine why employment rates remained low even as the economy began to recover.
III.D. Recovery
By the middle of 2021, aggregate consumer spending (Figure I, Panel A) and small-business revenues (Online Appendix Figure IX.A) had met pre-COVID baseline levels and continued to climb upward. Yet employment rates in jobs that paid wages in the bottom quartile of the prepandemic distribution remained 20% lower even as of December 2021 (Figure V, Panel A). What explains this “jobless recovery” at the bottom of the wage distribution?
1. Wage Growth
Part of the explanation is real wage growth: wage rates rose faster than the poverty line during the pandemic, leading some workers to move up out of the bottom wage bin (rather than into nonemployment). To quantify the effect of wage growth, we seek to measure how much wage rates changed in a given job. Lacking panel data at the job level, we measure changes in wage rates in detailed industry, occupation, and demographic cells using data from the CPS (see Online Appendix E.4 for details). Using the estimated wage growth distribution, we estimate that 7.7 percentage points of the reduction in the total number of workers in the lowest wage group as of December 2021 is due to wage growth, leaving 12 percentage points due to changes in employment patterns—either exits into nonemployment or switches to higher-paying jobs.
We assess the contribution of switching to higher-paying jobs using two methods: assessing whether the cross-sectional composition of employment has shifted toward higher-paying jobs and measuring employment rates by prepandemic wage quartile in panel data. To implement the first test, we measure the wage distribution based on pre-COVID (2019) wage rates in each industry × occupation × race × gender × region cell of the CPS. We find that shifts in the job distribution across these cells actually led to an increase in the share of workers in the lowest wage group as of December 2021. To implement the second test, we use the CPS Outgoing Rotation Group panel, consisting of individuals who responded to CPS survey interviews spaced 12 months apart. In this panel, nonemployment rates as of July 2020 to February 2021 are 7.7 percentage points higher for those who started out in the bottom wage quartile pre-COVID (from July 2019 to February 2020) than for those who started out in the top wage quartile.22 These findings indicate that exits to nonemployment explain most of the remaining reduction in bottom-quartile employment in the cross-sectional data after accounting for wage growth.
In the rest of this section, we analyze why low-wage workers remained out of work at higher rates as of December 2021, distinguishing between labor demand and supply channels.
2. Labor Demand
Although aggregate demand recovered, consumer demand may have shifted across sectors and technologies in ways that reduced labor demand for lower-skilled workers in the United States. For example, consumer demand shifted persistently over the course of the pandemic toward larger companies, online vendors, and certain sectors such as retail trade (Carman and Nataraj 2020; Dunn, Hood, and Driessen 2020; Alexander and Karger 2023). Such companies might have more capital-intensive production functions or outsource more of their production, leading to a persistent downward shift in the demand for low-skilled labor in the United States. Furthermore, firms may have sought efficiencies in their production processes and economic activity may have shifted to more efficient firms during the recession, potentially further reducing labor demand (Berger 2012; Lazear, Shaw, and Stanton 2016; Jaimovich and Siu 2020).
To evaluate this demand-side explanation, we first examine the evolution of aggregate job postings over time in Online Appendix Figure IX.B. Postings for jobs that required minimal or no skills had returned to pre-COVID levels by mid-2020 and were well above pre-COVID levels in most of 2021 as businesses sought to restaff after reducing their payrolls earlier in the pandemic, consistent with the findings of Forsythe et al. (2022).23
Furthermore, there is no evidence of mismatch in labor demand relative to the supply of low-wage workers across sectors or places. Figure V, Panel B plots employment for workers in the bottom wage quartile, reweighting the series to match baseline employment shares by county and industry (two-digit NAICS) in the top wage quartile. This reweighting closes very little of the gap between the two series, showing that differences in industry and location do not explain the differences in employment trajectories between low- and high-wage workers.24 Similarly, we find no evidence of a spatial mismatch between job posts and workers: reweighting job postings across counties by the number of bottom-wage-quartile workers in January 2020 has little effect on the time series of job postings (Online Appendix Figure IX.B).
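The reweighting logic can be illustrated with a short sketch. It assumes that the reweighted series is a weighted average of cell-level bottom-quartile employment changes, with weights given by each cell's share of top-quartile employment in January 2020. The function name, cell keys, and numbers are hypothetical.

```python
def reweighted_change(bottom_change_by_cell, top_baseline_by_cell):
    """Average bottom-quartile changes using top-quartile baseline employment shares.

    Both arguments are dicts keyed by (county, industry) cell.
    """
    cells = [
        c for c in bottom_change_by_cell if top_baseline_by_cell.get(c, 0) > 0
    ]
    total = sum(top_baseline_by_cell[c] for c in cells)
    if total == 0:
        raise ValueError("no overlapping cells with positive baseline employment")
    return sum(
        bottom_change_by_cell[c] * top_baseline_by_cell[c] / total for c in cells
    )


# Hypothetical cells: (county, two-digit NAICS). Values are for illustration only.
bottom_change = {("County A", "72"): -0.40, ("County A", "54"): -0.10}
top_baseline = {("County A", "72"): 100, ("County A", "54"): 900}

reweighted = reweighted_change(bottom_change, top_baseline)
```

If differences in industry and location explained the gap, the reweighted bottom-quartile series would move close to the top-quartile series; the finding above is that it does not.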
We conclude that low-wage workers appear to have had considerable demand for their skills in their own counties, but chose not to take jobs that were available.
3. Labor Supply
Given these findings, we examine mechanisms that may have led to a reduction in labor supply among low-wage workers. We begin by analyzing how the labor market recovery differed in high- versus low-rent counties, building on the analysis in the previous sections.25
Figure VI, Panel A shows that job postings were approximately 20% above pre-COVID baseline levels in both high-rent and low-rent counties in December 2021. The gradient in job postings with respect to rent that emerged when the pandemic hit (Figure IV, Panel A) disappeared entirely by December 2021. Yet low-wage employment rates remained substantially lower in high-rent areas than in low-rent areas (Figure VI, Panel B). In the lowest-rent counties—where the initial reduction in aggregate demand was smallest (as measured by small-business revenues and job postings)—the total number of workers with jobs in the bottom wage quartile as of December 2021 was 9% lower than it was pre-COVID. This 9% reduction is roughly consistent with what we would expect based on wage growth (as discussed already), indicating that employment had roughly fully recovered in places where the pandemic had minimal effects on aggregate demand initially. In contrast, in the highest-rent counties, bottom-wage-quartile employment was 23% lower in December 2021 than it was pre-COVID.

Figure VI. Evolution of the Association between Low-Education Job Postings and Low-Wage Employment with Rent
This figure presents a summary of the results of a set of regressions documenting the relationship between job postings and employment with rent over time. Panel A replicates Figure IV, Panel A, but using the average value of the low-education job postings series in December 2021 instead of April 2020. Panel B replicates Figure IV, Panel C, but using the average value of the Paychex-Intuit employment series in December 2021 instead of July 2020. The binned scatter plots are constructed as described in Figure II. Panel C plots the slope of the best-fit line from a population-weighted regression of low-education job postings on median county rent (as in Panel A) for each month from April 2020 through December 2021. The slopes estimated in Figures IV, Panel A and VI, Panel A are the first and last estimates in this series, respectively. Panel D replicates Panel C for the slope of the bottom wage quartile employment versus median rent (as in Panel B). In Panels C and D, the dashed lines above and below the solid series represent the upper and lower boundaries of the 95% confidence interval for the slope estimated in each month. Panels B and D omit counties from CA, MA, and NY, since these three states raised the minimum wage at some point after July 2020 above our upper threshold for the bottom wage quartile of employment. Data sources: Lightcast, Paychex, Intuit, ACS.
Figure VI, Panels C and D characterize the evolution of the job postings and employment gradients by county-level rents by month from April 2020 to December 2021. They plot slopes from regressions of job postings and low-wage employment rates on rent across counties (weighted by population) by month. The job postings gradient begins to flatten starting in January 2021 and disappears completely by the last quarter of 2021. In stark contrast, the employment gradient steepens over time and never recovers during the period we study.26
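As a sketch of how such monthly gradients can be computed, the function below returns the slope from a population-weighted regression of an outcome on rent, and is applied once per month. The data are hypothetical and constructed so that the gradient is steep in the first month and flat in the last; the sketch does not reproduce the confidence intervals shown in the figure.

```python
def weighted_slope(x, y, weights):
    """Slope from a weighted least-squares regression of y on x with an intercept."""
    if not (len(x) == len(y) == len(weights)):
        raise ValueError("x, y, and weights must have the same length")
    total = sum(weights)
    x_bar = sum(w * xi for xi, w in zip(x, weights)) / total
    y_bar = sum(w * yi for yi, w in zip(y, weights)) / total
    sxy = sum(w * (xi - x_bar) * (yi - y_bar) for xi, yi, w in zip(x, y, weights))
    sxx = sum(w * (xi - x_bar) ** 2 for xi, w in zip(x, weights))
    if sxx == 0:
        raise ValueError("x has no weighted variation")
    return sxy / sxx


# Hypothetical counties: rent in thousands of dollars, outcome as a fraction of baseline.
rent = [0.8, 1.2, 2.0]
population = [50_000, 200_000, 1_000_000]
outcome_by_month = {
    "2020-04": [-0.30 - 0.10 * r for r in rent],  # steeper losses in high-rent areas
    "2021-12": [0.20 for _ in rent],              # no gradient with respect to rent
}

gradient = {
    month: weighted_slope(rent, y, population)
    for month, y in outcome_by_month.items()
}
```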
The results in Figure VI suggest that the places that experienced larger demand shocks initially (namely, more affluent, high-rent areas) exhibit persistent declines in employment even as of December 2021, despite the fact that labor demand had recovered fully in those areas by that point. One potential explanation for this hysteresis in employment rates is a change in preferences or commitments that workers made when the pandemic hit that induced persistent changes in labor supply. For example, low-wage workers may have moved to smaller apartments or changed their living arrangements such that they could afford to work less when the pandemic hit, and may have decided that they preferred to retain these arrangements going forward even when labor demand recovered. Another possibility is that low-wage workers’ human capital decayed and made it more difficult for them to obtain available jobs.
In Table II, we contextualize the magnitude of the cross-sectional variation in low-wage employment rates by regressing changes in bottom-wage-quartile employment rates from January 2020 to December 2021 on median rents by county and other covariates that reflect contemporaneous economic conditions. Column (1) replicates the specification in Figure VI, Panel B, showing that employment remains sharply depressed in higher-rent areas (where the initial aggregate demand shock was more severe) relative to lower-rent areas in December 2021. In column (2), we include two variables that measure contemporaneous health and economic conditions—the average COVID case rate from October to December 2021 (a measure of the risk of COVID exposure) and the number of weeks of unemployment insurance (UI) benefits that individuals were eligible for in their state as of December 2021—as well as a set of demographic controls. Including these variables does not affect the relationship between median rents and employment rates significantly.
Table II. Mechanisms Underlying the Persistent Reduction in Low-Wage Employment: Hysteresis versus Current Conditions
| Dep. var.: change from January 2020 to December 2021 in | Low-wage (Q1) employment (1) | Low-wage (Q1) employment (2) | Low-wage (Q1) employment (3) | High-wage (Q4) employment (4) |
|---|---|---|---|---|
| Median two-bedroom rent (per thousand dollars) | −0.17 (0.04) | −0.19 (0.04) | | −0.04 (0.04) |
| Change in low-wage (Q1) employment from January 2020 to July 2020 | | | 0.67 (0.11) | |
| Average daily COVID cases (thousands) in October 2021 to December 2021 | | −0.10 (0.02) | −0.07 (0.02) | −0.07 (0.01) |
| Maximum weeks of state UI benefits in December 2021 | | 0.000 (0.001) | −0.002 (0.001) | −0.003 (0.001) |
| Demographic controls | No | Yes | Yes | Yes |
| Observations | 841 | 841 | 841 | 626 |
| Change in employment explained by COVID cases (p.p.) | | 3.0 | 2.1 | 2.1 |
| Change in employment explained by UI extensions (p.p.) | | 0.1 | 0.4 | 0.7 |
Notes. This table presents estimates for a set of population-weighted regressions examining the determinants of employment patterns in December 2021 at the county level. Robust standard errors are reported in parentheses. The sample omits California, Massachusetts, and New York due to mismeasurement of low-wage employment changes as a result of minimum wage increases; see Online Appendix E.2 for more information. Column (1) reports the results of regressing the change in low-wage (i.e., bottom-quartile) employment from January 2020 to December 2021 against the average median two-bedroom rent (as measured in the 2014–2018 ACS) at the county level. Column (2) adds the average COVID-19 case rate in October to December 2021 (a measure of the risk of COVID exposure), the maximum number of weeks of unemployment insurance eligibility in each state, and a set of demographic controls: foreign-born population share, nonwhite population share, share of the population who are working age (25–54), and female population share. Column (3) repeats the specification in column (2), replacing median two-bedroom rent with the size of the initial low-wage employment shock to each county, measured as the change in low-wage employment from January 2020 to July 2020. Column (4) repeats the specification in column (2) using the change in high-wage (i.e., top-quartile) employment from January 2020 to December 2021 as the dependent variable. The bottom two rows of the table report the change in the dependent variable explained by COVID risk exposure and UI extensions, calculated by multiplying the coefficient by the population-weighted mean of the respective variable. Data sources: Paychex, Intuit.
These estimates imply that the ongoing risk of COVID can explain approximately 3.0 percentage points of the 12 percentage point reduction in bottom-wage-quartile employment that is not due to wage growth. Similarly, multiplying the coefficient on the UI benefits variable by the mean number of the weeks of additional UI benefits for which individuals were eligible in December 2021 implies that UI benefit extensions account for less than 1 percentage point of the reduction in bottom-wage-quartile employment. This cross-sectional estimate based on changes in UI policies across states over time is consistent with the quasi-experimental elasticities of employment rates with respect to UI benefit length estimated by Coombs et al. (2022), who also find that UI benefits appear to have small effects on employment rates during the pandemic.
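The calculation behind these "explained" magnitudes is a product of a regression coefficient and the population-weighted mean of the corresponding variable. The sketch below illustrates it under two assumptions that are not stated in this section: that the dependent variable is measured as a fraction (so the product is scaled by 100 to obtain percentage points), and that the mean of the case-rate variable is 0.30, a purely illustrative value chosen so that the product matches the reported 3.0 percentage points.

```python
def explained_change_pp(coefficient, weighted_mean):
    """Change in the dependent variable (percentage points) attributed to one covariate.

    Assumes the dependent variable is a fraction, hence the factor of 100.
    """
    return 100 * coefficient * weighted_mean


# Coefficient from Table II, column (2); the mean is an illustrative assumption.
covid_coefficient = -0.10
illustrative_mean_cases = 0.30  # hypothetical, not reported in this section

covid_explained = abs(explained_change_pp(covid_coefficient, illustrative_mean_cases))

# Share of the 12 p.p. reduction (net of wage growth) attributed to COVID risk.
share_of_gap = 3.0 / 12.0
```

Under the reported magnitudes, COVID risk accounts for about one-quarter of the 12 percentage point reduction, leaving most of the shortfall unexplained by contemporaneous conditions.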
Table II, column (3) presents a variant of the specification in column (2), replacing median rent with the change in bottom-wage-quartile employment rates from January to July 2020—the immediate loss in low-wage employment after the shock—as the key independent variable. We find a positive relationship, showing that areas where employment fell more in the immediate aftermath of the pandemic exhibited persistent declines in employment nearly two years later.
Column (4) replicates column (2), replacing the dependent variable with the change in employment rate for jobs that paid wages in the top quartile of the prepandemic wage distribution. We find no relationship between top-quartile employment rates and rents, consistent with the rapid recovery of labor demand and employment for high-skilled workers.
Finally, we evaluate whether changes in the total number of available workers (i.e., the total population of lower-skilled workers in high-rent areas)—rather than changes in labor supply for a specific worker—can explain a significant portion of the shortfall in employment rates. Although the number of immigrants to the United States fell during the pandemic, CPS data show that trends in total low-wage employment rates for immigrants and U.S. citizens aged 16 or older were virtually identical (Online Appendix Figure XX.A). Internal migration from high-rent to lower-rent areas in the United States also does not explain a significant share of the larger reduction in employment rates in high-rent areas (Online Appendix Figure XX.B). Demographic trends in aging over this short period are also too small to explain the shortfall in employment: the working age population (aged 15–64) grew from 205.7 to 207.1 million between January 2020 and 2022 (Organisation for Economic Co-operation and Development 2023). Finally, the share of individuals who moved to self-employment (and hence were not available to be low-wage employees) is also too small to explain the aggregate shortfall in bottom-wage-quartile employment: the self-employed share of individuals aged 16 or older rose from 6.05% in January 2020 to 6.15% in December 2021 (U.S. Bureau of Labor Statistics 2023a, 2023b, 2023c).
In sum, the persistent reduction in low-wage employment is not readily explained by changes in labor demand, changes in contemporaneous incentives to work such as UI benefits or ongoing health risks, or changes in the total number of workers. Rather, the strongest predictors of the cross-sectional variation in employment rates in December 2021 are variables that predict the size of the initial shock to aggregate demand, echoing the findings of Yagan (2019), who documents hysteresis in the labor markets that were hit hardest in the Great Recession.
IV. Evaluation of Policy Responses to COVID-19
In this section, we examine the scope for stabilization policies to break the chain of events documented above: reductions in spending, especially by high-income households, were associated with losses in business revenue and low-wage employment. We begin by evaluating the stimulus payments made to households during the pandemic, illustrating how the public data we construct are useful for real-time policy evaluation. We briefly discuss other examples of policy evaluations and conclude by assessing whether the combination of policy responses was sufficient to stabilize economic activity.
IV.A. Stimulus Payments to Households
The federal government sent a total of $814.4 billion in stimulus checks to households at three points during the pandemic: April 2020, January 2021, and March 2021 (Internal Revenue Service 2022). Were these stimulus payments successful in boosting consumer spending?
We estimate the causal effect of each stimulus payment on spending in the first month after receipt, focusing in particular on heterogeneity across the income distribution. We focus on a one-month horizon because prior work shows that most of the impact of stimulus payments and tax refunds is concentrated within three months of receipt (e.g., Sahm, Shapiro, and Slemrod 2010; Broda and Parker 2014). Moreover, spending effects in the first month are highly predictive of spending effects in the first three months across subgroups (Parker and Souleles 2019, table 3).27
1. April 2020
The Coronavirus Aid, Relief, and Economic Security (CARES) Act made direct payments to nearly 160 million people starting in mid-April 2020. Individuals earning less than $75,000 received a stimulus payment of $1,200; married couples earning less than $150,000 received a payment of $2,400; and households received an additional $500 for each dependent they claimed.28 IRS statistics show that 69% of stimulus payments made in April were direct deposited on exactly April 15, 2020, though some households received payments on April 14 (Bureau of the Fiscal Service 2020).
We evaluate the effects of these stimulus payments on consumer spending using a high-frequency difference-in-differences research design applied to our card spending data, comparing daily spending before versus after April 15 in 2020 versus spending on the same calendar date in 2019. To reduce cyclical fluctuations, we residualize daily spending (indexed to average levels in January 2019) with respect to day-of-week fixed effects, which we estimate using data for 2019. We then adjust for a linear pretrend in spending (which we assume to be common across all income quartiles) to capture aggregate shocks in spending during the pandemic in 2020.29 To capture high-frequency changes in spending, we do not smooth the daily spending using a seven-day moving average, unlike in preceding sections.
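The day-of-week adjustment can be sketched as follows. This is a simplified illustration that assumes the fixed effects are estimated as each weekday's mean deviation from the overall 2019 mean; the function names and the two weeks of spending data are hypothetical, and the linear pretrend adjustment is not shown.

```python
from collections import defaultdict
from datetime import date, timedelta


def day_of_week_effects(daily_values):
    """Mean deviation from the overall mean for each weekday (0 = Monday)."""
    overall = sum(daily_values.values()) / len(daily_values)
    by_weekday = defaultdict(list)
    for day, value in daily_values.items():
        by_weekday[day.weekday()].append(value)
    return {wd: sum(vs) / len(vs) - overall for wd, vs in by_weekday.items()}


def residualize(daily_values, effects):
    """Subtract the estimated weekday effect from each daily observation."""
    return {
        day: value - effects.get(day.weekday(), 0.0)
        for day, value in daily_values.items()
    }


# Hypothetical 2019 index (January 2019 average = 100) with a weekend bump.
start = date(2019, 1, 7)
spending_2019 = {}
for i in range(14):
    day = start + timedelta(days=i)
    spending_2019[day] = 100.0 + (10.0 if day.weekday() >= 5 else 0.0)

effects_2019 = day_of_week_effects(spending_2019)
adjusted_2019 = residualize(spending_2019, effects_2019)
```

Estimating the weekday effects on 2019 data alone, as in the text, keeps the adjustment from absorbing any pandemic-era changes in the weekly spending pattern.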
Figure VII, Panel A plots the difference in daily spending in 2020 versus 2019 for households who live in ZIP codes with median household income in the bottom quartile of the national distribution (which we call “low-income” households for convenience). Spending increases markedly following the arrival of payments, with particularly high spending in the days when stimulus checks first arrived. To quantify the magnitude of the (short-run) effect of the stimulus on spending, in Online Appendix Table VI we estimate difference-in-differences models using OLS regressions of daily spending by income quartile (residualized against a common linear pretrend) in the 25 days before and after April 15 on an indicator for being pre- versus poststimulus interacted with calendar year. To capture the nonlinear dynamics evident in the nonparametric event study plots, we estimate separate treatment effects for the first five days starting on April 15 and from the sixth day onward; see Online Appendix K for a more detailed description of our methodology.
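The core comparison can be illustrated with a minimal difference-in-differences sketch based on window means. In a saturated two-by-two specification, the OLS coefficient on the interaction term equals this difference of mean changes. The sketch omits the pretrend adjustment and the separate treatment effects for the first five days and later days, and the spending values are hypothetical.

```python
def mean(values):
    """Arithmetic mean of a nonempty list."""
    if not values:
        raise ValueError("values must be nonempty")
    return sum(values) / len(values)


def did_estimate(pre_treated, post_treated, pre_control, post_control):
    """Change in the treated year minus the change in the comparison year."""
    treated_change = mean(post_treated) - mean(pre_treated)
    control_change = mean(post_control) - mean(pre_control)
    return treated_change - control_change


# Hypothetical daily spending, in percent relative to the January 2019 average.
pre_2020 = [-30.0, -30.0]
post_2020 = [-9.0, -9.0]
pre_2019 = [0.0, 0.0]
post_2019 = [1.0, 1.0]

effect_pp = did_estimate(pre_2020, post_2020, pre_2019, post_2019)
```

Subtracting the 2019 change removes any seasonal movement in spending around mid-April that is unrelated to the stimulus payments.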

Figure VII. Effects of Stimulus Payments on Spending: Event Studies
This figure shows event studies of the effect of stimulus payments on consumer spending. We measure consumer spending using data from Affinity Solutions. To construct each consumer spending time series, we express consumer spending on each day as a percentage change relative to mean daily consumer spending over January 2019, residualize these daily percentage changes with respect to day-of-week fixed effects (estimated out-of-sample using data in 2019), calculate the first difference with respect to values from the corresponding period starting in 2019, and adjust the estimates for a linear pretrend in first differences. Panel A depicts this spending time series for 25 days before and after April 15, 2020 (the modal date for deposits of the CARES Act economic impact payments) for cardholders with residential addresses in the bottom income quartile of ZIP codes. We exclude April 14, 2020, from the preperiod because some households received stimulus payments on this date. Panel B repeats this figure for the top income quartile of ZIP codes. Panel C repeats Panels A and B for the days around January 4, 2021 (the modal date for deposits of the COVID-Related Tax Relief Act economic impact payments), plotting outcomes for both the bottom and top income quartiles. The preperiod in Panel C runs from December 4 to 14, 2020, with the holiday period (December 15, 2020 to January 3, 2021) removed due to high daily volatility in spending levels (see Section IV.A and Online Appendix Figure XXIII for more details). The postperiod runs from January 4 to 19, 2021, reflecting the data available when this analysis was originally published on January 26, 2021. Due to the omission of the holiday period, we do not remove a linear pretrend as in Panels A and B. Panel D repeats Panel C for the days around March 17, 2021 (the modal date for deposits of the American Rescue Plan Act economic impact payments). 
We exclude March 13 to 16, 2021 from the preperiod as payments were made starting March 13. In Panels A, B, and D, we interpolate the value for Easter Sunday using the average of adjacent daily values. Data sources: Affinity Solutions, ACS.
Using this approach, we estimate that spending increased by 21 percentage points (std. err. = 3.13) for bottom-income-quartile households in the first month after the stimulus payments. Accounting for the fraction of households who actually received stimulus checks in this group, this estimate translates to an increase in spending of $442 during the first month after receiving a $1,200 stimulus check (see Online Appendix K for more details). The estimates remain stable when varying the window used to estimate the treatment effect, with point estimates ranging from $320 to $452, as shown in Online Appendix Figure XXII.
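To make the estimator concrete, the following is a minimal sketch of a two-by-two difference-in-differences comparison with separate effects for the first five post-payment days and for the later days. It is an illustration under simplifying assumptions, not the authors' code: the numbers are made up, the linear pretrend adjustment and standard errors are omitted, and the function names `did_effect` and `split_effects` are hypothetical. With only year, period, and interaction indicators, this comparison of means coincides with the OLS interaction coefficient.

```python
from statistics import mean

def did_effect(cur_pre, cur_post, ref_pre, ref_post):
    """2x2 difference-in-differences on daily spending indices.

    Each argument is a list of daily values (percent of a baseline level).
    Returns the post-minus-pre change in the stimulus year minus the
    post-minus-pre change in the reference year, in percentage points.
    """
    return (mean(cur_post) - mean(cur_pre)) - (mean(ref_post) - mean(ref_pre))

def split_effects(cur_pre, cur_post, ref_pre, ref_post, early_days=5):
    """Separate effects for the first `early_days` post days and for later days."""
    early = did_effect(cur_pre, cur_post[:early_days], ref_pre, ref_post[:early_days])
    late = did_effect(cur_pre, cur_post[early_days:], ref_pre, ref_post[early_days:])
    return early, late

# Made-up illustrative series (NOT the paper's data): 25 pre days, 25 post days.
ref_pre = [0.0] * 25
ref_post = [1.0] * 25                  # reference year drifts up by 1 point
cur_pre = [-30.0] * 25
cur_post = [-4.0] * 5 + [-9.0] * 20    # spike in first five days, smaller gain after

early, late = split_effects(cur_pre, cur_post, ref_pre, ref_post)
print(early, late)
```

Differencing against the reference year removes seasonal movements common to both years, so only the change specific to the stimulus year remains.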
Figure VII, Panel B repeats this analysis for high-income households, those who live in ZIP codes with median household income in the top quartile of the distribution. Once again, we see a clear increase in spending in the month after the stimulus payments were made relative to the month before, although there is no immediate spike in spending on the day that the checks were received, as one might expect given that higher-income households are less likely to be liquidity constrained at high frequencies. We estimate that spending for top-income-quartile households increased by 11 percentage points. This smaller percentage point effect is to be expected because higher-income households received smaller stimulus payments both in absolute terms and as a percentage of their total expenditure. Rescaling this effect, we estimate that high-income households spent $732 per $1,200 of stimulus payments received in the first month.
The first bar in each set of bars plotted in Figure VIII presents estimates of the impact of the April 2020 stimulus payments on spending over a one-month horizon for the four ZIP-income quartiles. Across the income distribution, households spent a large fraction of their April 2020 stimulus checks in the month immediately after receipt, consistent with evidence from confidential data from JPMorgan Chase account holders subsequently reported by Cox et al. (2020).30

Figure VIII. Impacts of Stimulus Payments on Spending, by Income Quartile
This figure plots estimates of the marginal propensity to spend out of stimulus payments in the first month after receipt for each of the three rounds of stimulus payments, separately by income quartile (based on median ZIP code income). The estimates are scaled per $1,200 of stimulus payment and correspond to the “Combined Dollar” estimates reported in Online Appendix Table VI, column (5). See Section IV.A and Online Appendix K.3 for details on how these estimates were calculated. We also report p-values testing the null hypothesis of equal effect sizes between each pair of stimulus rounds, for the highest and lowest quartiles of ZIP-level incomes. These p-values are based on permutation tests reported in Online Appendix Figures XXIV and XXV. Data source: Affinity Solutions.
2. January 2021
The COVID-Related Tax Relief Act made payments of $600 per person to most Americans available beginning on January 4, 2021. Eligibility criteria largely followed those for the earlier round of stimulus, with single households eligible for the full stimulus amount up to $75,000 in income ($150,000 for married households). The stimulus amount fell at higher income levels, with childless households with incomes up to $87,000 (or $174,000 if married filing jointly) receiving a payment.
To evaluate whether our data could shed light on this policy’s effect sufficiently rapidly to inform the design of future stimulus payments, we analyzed the effects of the stimulus payments on spending from January 4 to 19, 2021, and released results publicly on January 26, 2021 (Chetty, Friedman, and Stepner 2021). We use the same difference-in-differences design we used to study the first stimulus, except that we use December 4 to 14, 2020, as the preperiod rather than the days immediately preceding the stimulus payments because those days coincide with the Christmas holiday period, when daily spending exhibits 10 times higher variance across days (even when looking at changes across years) than during the first half of December (Online Appendix Figure XXIII).31 We also omit pretrends here because of the gap created by omission of the holiday period.
Figure VII, Panel C replicates the series in Panels A and B for the January 2021 stimulus, plotting indexed daily changes in spending in 2021 versus 2020 for bottom- and top-income-quartile households. Low-income households increase spending significantly after the arrival of the January stimulus payments. In contrast, high-income households do not change their spending levels significantly after January 4, 2021, relative to December 2020. Using difference-in-differences models analogous to those before, we estimate that low-income households increased spending over the first month after receiving their stimulus checks by 6 percentage points, while high-income households increased spending by 0.4 percentage points, an estimate that is not significantly different from zero. The middle bars shown in Figure VIII rescale these estimated effects into dollars per $1,200 to facilitate comparisons across stimulus rounds. While the marginal propensity to consume (MPC) out of these stimulus payments in the first month fell significantly for all income groups in January 2021 relative to April 2020, the drop in the MPC for high-income households was especially large. We estimate that low-income households spent $187 per $1,200 of stimulus received, 58% less than the $442 estimated in April 2020. High-income households spent much less of their second stimulus checks: their spending fell from $732 per $1,200 received in April 2020 to just $35 per $1,200 in January 2021, a reduction of 95%.32 These heterogeneous effects on spending across income groups are aligned with results subsequently reported in May 2021 by Greig, Deadman, and Noel (2021, 20, box 1), using confidential data from JPMorgan Chase.
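The percentage reductions quoted above follow directly from the dollar estimates. The short sketch below reproduces that arithmetic from the figures stated in the text. The helper name `pct_reduction` is hypothetical, and dividing by $1,200 to express each estimate as an MPC is an illustrative convention rather than a calculation taken from the article.

```python
def pct_reduction(before, after):
    """Percent decline from `before` to `after`."""
    return 100.0 * (before - after) / before

PAYMENT = 1200  # all estimates are scaled per $1,200 of stimulus

# Dollar estimates quoted in the text: (April 2020, January 2021)
low_income = (442, 187)
high_income = (732, 35)

low_drop = pct_reduction(*low_income)     # about 58%
high_drop = pct_reduction(*high_income)   # about 95%

# Implied one-month MPCs in January 2021 (dollars spent per dollar received)
low_mpc_jan = low_income[1] / PAYMENT
high_mpc_jan = high_income[1] / PAYMENT

print(f"{low_drop:.1f}% {high_drop:.1f}%")
print(f"{low_mpc_jan:.3f} {high_mpc_jan:.3f}")
```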
In short, this analysis demonstrates that one can gauge the (short-term) effects of stimulus payments with just two weeks of data after the payments are made using what are now publicly available data—enabling a rapid feedback loop for subsequent policy changes. Indeed, based on these estimates, we predicted that making further stimulus payments to high-income households would have modest effects on their spending, suggesting that targeting the next round of stimulus toward lower-income households would save substantial resources that could be used to support other programs, with minimal impact on economic activity.
3. March 2021
After extensive debate about whether higher-income households should continue to receive stimulus payments, including discussion of the evidence described above (Lambert and Sraders 2021), Congress passed the American Rescue Plan on March 11, 2021. The final plan continued to pay the full stimulus amount of $1,400 to households earning up to $150,000, but phased the payments out more rapidly beyond that threshold than initially proposed, so that households with incomes above $80,000 (for single filers without children) or $160,000 (for married couples without children) received no stimulus. These revisions reduced the total amount of stimulus payments made to high-income households by approximately $17 billion relative to the original proposal in January 2021 (Watson 2021).
Did the March 2021 stimulus payments in fact have lower effects on spending of higher-income households, as predicted based on the January 2021 evidence? Figure VII, Panel D replicates the preceding figures for the 25 days before and after the March 2021 checks were sent out; here, we use exactly the same estimator as in the first stimulus, as there are no holiday-induced fluctuations in the preperiod. Bottom-income-quartile households increased spending considerably in the days following the March payments, while spending for high-income households did not change significantly. The third set of bars in Figure VIII rescales these effects into dollar impacts per $1,200 of stimulus payment. The estimated effects are much more similar to those observed in January 2021 than in April 2020, with positive effects on spending for lower-income households but near-zero effects on spending for top-quartile households.
4. Discussion
Why did the marginal propensity to consume out of cash windfalls fall sharply over the course of the pandemic, especially for higher-income households? Studies of stimulus payments in prior recessions find little heterogeneity in MPCs by income, but show that households with higher liquid wealth balances exhibit lower MPCs (Johnson, Parker, and Souleles 2006; Broda and Parker 2014; Jappelli and Pistaferri 2014). In normal times, most households even in the top income quartile tend to have relatively little liquid wealth (Kaplan and Violante 2014), explaining why they exhibit high MPCs out of windfalls in previous recessions. But during the pandemic, households started to accumulate substantial liquid wealth because their incomes remained relatively stable while their spending fell sharply, as discussed in Section III. The national savings rate (measured in NIPA Table 2.6) rose from 7.6% in 2019 to 18.5% on average in Q2–Q4 of 2020. Using confidential data from JPMorgan Chase, Greig, Deadman, and Noel (2021) further show that cash balances in checking accounts rose substantially from January to December 2020, with the largest increases (in dollars) among high-income households. Given this rapid growth in liquid wealth, it is not surprising that high-income households started to spend less of their stimulus payments over time.33
This analysis illustrates the value of real-time estimation of policy impacts rather than predictions based on historical estimates. Despite being based on a consensus across a large set of studies, historical predictions about the lack of heterogeneity in MPCs by income proved to be inaccurate given the unusual impacts of the pandemic on spending behavior across the income distribution.34 The core challenge is that parameters such as MPCs are not invariant to the economic and policy environment. By directly estimating such parameters in real time using newly available data, one can make policy decisions that respond transparently—based on publicly available information—to current economic conditions.
IV.B. Effects of Other Policies
The data we make publicly available can also be used to study a range of other policies beyond stimulus payments. For illustration, we briefly discuss four examples of policies that were implemented during the COVID-19 crisis. The first two are based on analyses we conduct ourselves (detailed in Online Appendix L) and the latter two are analyses conducted by other researchers using our data in combination with other data sources.
1. State-Ordered Shutdowns and Reopenings
Many states enacted stay-at-home orders and shutdowns of businesses in an effort to limit the spread of COVID infections and later reopened their economies by removing these restrictions. Using the card spending and payroll data, we evaluate the effects of these policies using event study designs that compare trends in states that shut down and reopened at different dates. We find that state-ordered shutdowns and reopenings had modest effects on economic activity. Spending and employment remained well below baseline levels even after reopenings, and trended similarly in states that reopened earlier relative to comparable states that reopened later (Online Appendix Figures XXVI–XXVII). Spending and employment also fell well before state-level shutdowns were implemented. These findings are consistent with work by Goolsbee and Syverson (2021) and Sears et al. (2023) using cell phone location data as well as Bartik et al. (2020) using timesheet data on hours of work.
2. Paycheck Protection Program
The PPP sought to reduce employment losses by providing forgivable loans worth more than $800 billion in total to small businesses that maintained sufficiently high employment (relative to precrisis levels). Using our payroll data disaggregated by firm size, we evaluate the impacts of the PPP on employment by comparing employment trends at firms with fewer than 500 employees (which were eligible for PPP assistance) with firms in the same sector that had more than 500 employees (which were ineligible). We find that employment at eligible firms increased by only 2.48 percentage points after the PPP was enacted in April 2020 relative to larger firms that were ineligible for PPP (Online Appendix Figure XXVIII). Our point estimates imply that the cost per job saved by the PPP was $301,863 ($86,201 at the upper bound of the 95% confidence interval); netting out potential UI payments to these potentially unemployed workers reduces this number only slightly to $283,513 per job saved (see Online Appendix L.2 for details). Autor et al. (2022a, 2022b) reach similar conclusions using the same research design with microdata from ADP, another large payroll processor. Granja et al. (2022) use a different design, exploiting cross-sectional variation in PPP takeup driven by bank composition, and reach similar conclusions, partly drawing upon the data we make publicly available. Together, all of these studies suggest the PPP had modest marginal impacts on employment in the short run, likely because the vast majority of PPP loans went to inframarginal firms that were not planning to lay off many workers.35
3. Unemployment Benefit Increases
The Federal Pandemic Unemployment Compensation (FPUC) program paid supplemental unemployment benefits of up to $600 per week from March to September 2020. Casado et al. (2020) use county-level variation in wage replacement rates resulting from differences in industrial composition to estimate the effect of FPUC payments on aggregate spending. Using our publicly available spending data combined with UI claims data from Illinois, they estimate that a 1% increase in the replacement rate increased county-level spending by 0.167%, which implies that each $1 of UI benefits increased aggregate spending at the county level by $1.23. For comparison, our estimates above imply that $1 of spending in the form of stimulus checks increased household-level spending by $0.256 on average (an MPC of 0.256). In a standard Keynesian model, an MPC of 0.256 would imply an effect on aggregate spending of 0.256 / (1 − 0.256) = 0.344, an order of magnitude smaller than the estimated effect of UI benefits. In the pandemic, where some sectors were effectively shut down, theory suggests that the multipliers would be even smaller than the standard Keynesian benchmark (Guerrieri et al. 2022). This comparison suggests that UI benefits targeted to unemployed individuals were a more potent tool to stimulate aggregate spending than stimulus payments to all individuals, especially later in the pandemic as employed households built up a large stock of savings.
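The Keynesian benchmark quoted above is the sum of successive rounds of induced spending: each dollar received generates MPC dollars of spending, which becomes income that generates MPC squared dollars, and so on. The sketch below computes the closed form MPC / (1 − MPC) and checks it against a finite sum of rounds. The function names are hypothetical and the code is a textbook illustration, not the authors' calculation.

```python
def aggregate_effect(mpc):
    """Closed form for mpc + mpc**2 + mpc**3 + ... (requires 0 <= mpc < 1)."""
    if not 0 <= mpc < 1:
        raise ValueError("mpc must lie in [0, 1)")
    return mpc / (1 - mpc)

def aggregate_effect_by_rounds(mpc, rounds):
    """Partial sum of induced spending over a finite number of rounds."""
    return sum(mpc ** k for k in range(1, rounds + 1))

mpc = 0.256  # average MPC out of stimulus checks quoted in the text
closed_form = aggregate_effect(mpc)
fifty_rounds = aggregate_effect_by_rounds(mpc, 50)

print(round(closed_form, 3))   # 0.344
print(abs(closed_form - fifty_rounds) < 1e-9)
```

Because the MPC is well below one, the series converges quickly: the first three rounds already account for nearly all of the total.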
4. Eviction Moratoria
Many state and local governments enacted moratoria on tenant eviction during the pandemic to provide stable housing for those who might have lost their jobs. These moratoria were implemented at different times in different states and counties. An, Gabriel, and Tzur-Ilan (2022) exploit variation in the timing of such moratoria to estimate their effects on spending. Using our publicly available data on consumer spending by category coupled with other sources, they estimate that a one-week eviction moratorium is associated with a 1% increase in spending on necessities such as food and groceries. They conclude that eviction moratoria not only reduced housing instability but also boosted spending on other goods and potentially provided an aggregate stimulus as a result.
Methodologically, these examples illustrate that data from private sector sources can be used to evaluate a wide variety of policies rapidly because many policies have heterogeneous effects across geographic areas or other dimensions, such as firm size. Reassuringly, the findings obtained from our public statistics match those obtained from studies with access to the underlying microdata, demonstrating that public statistics constructed from private sector data sources can support many policy analyses. Taken together, these studies suggest that policies targeted directly at households that suffered the largest income losses—such as those who became unemployed or faced eviction—had the largest effects on spending and downstream economic activity in the pandemic.
IV.C. Secondary Effects on Spending
We conclude by stepping back from the effects of specific policies and analyzing whether the combination of government policies—those analyzed already as well as other macroeconomic responses and changes in the economy—was adequate to stem the downward spiral in economic activity set off by the initial reduction in consumer spending documented in Section III. Did the loss of jobs among low-wage workers trigger a secondary reduction in their own spending levels due to a lack of disposable income (rather than health concerns)—potentially setting off further business revenue losses and employment losses? Or was government intervention adequate to prevent such secondary responses? We investigate secondary spending responses among low-income individuals by returning to the geographic heterogeneity in the size of initial consumer demand shocks by local rent levels, as in Section III. In particular, we compare how spending evolved in low-income ZIP codes whose residents worked predominantly in high-rent areas versus those whose residents worked predominantly in low-rent areas.
Figure IX, Panel A presents a binned scatter plot of changes in low-wage employment from January to April 2020 by home (residential) ZIP code versus average workplace rent. We construct this figure by combining ZIP code–level data on employment rates of low-wage workers from Earnin that we make publicly available with public data from the Census LEHD Origin-Destination Employment Statistics (LODES) database, which provides information on the matrix of residential ZIP by work ZIP for low-income workers in the United States in 2017, to compute the average workplace median rent level for each residential ZIP. Figure IX, Panel A shows that low-income individuals who were working in high-rent areas pre-COVID were much less likely to be employed after the shock hit in April 2020—consistent with our findings above.36
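The average workplace rent for a residential ZIP code is a commuting-weighted mean of rents in the ZIP codes where its residents work. The following sketch shows that calculation on made-up inputs. The data layout (a nested dictionary of home-to-work worker counts), the ZIP labels, and the function name `avg_workplace_rent` are assumptions for illustration and do not reflect the actual LODES file format or the authors' code.

```python
def avg_workplace_rent(flows, rent_by_work_zip):
    """Commuting-weighted average workplace rent for each home ZIP.

    flows: {home_zip: {work_zip: number_of_low_wage_workers}}
    rent_by_work_zip: {work_zip: median_rent}
    Work ZIPs with no rent value are skipped; home ZIPs with no usable
    flows are omitted from the result.
    """
    result = {}
    for home, dests in flows.items():
        usable = {w: n for w, n in dests.items() if w in rent_by_work_zip}
        total = sum(usable.values())
        if total == 0:
            continue
        weighted = sum(n * rent_by_work_zip[w] for w, n in usable.items())
        result[home] = weighted / total
    return result

# Hypothetical example: home ZIP "A" sends most workers to high-rent ZIP "X".
flows = {"A": {"X": 300, "Y": 100}, "B": {"Y": 50}}
rent_by_work_zip = {"X": 2000.0, "Y": 1000.0}

rents = avg_workplace_rent(flows, rent_by_work_zip)
print(rents)
```

Home ZIPs can then be binned on this measure and compared on employment or spending changes, as in the binned scatter plots described in the text.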

Figure IX. Changes in Employment and Consumer Spending for Low-Income Households versus Workplace Rent
This figure examines the relationship between low-wage employment and consumer spending for individuals living in a home ZIP code z with the average rent in the ZIP codes of the workplaces for low-wage workers who live in home ZIP code z. Panel A presents a binned scatterplot showing the relationship between low-wage employment for workers living in a home ZIP code and the average median rent in the workplace ZIP codes for low-wage workers from that home ZIP code. We measure low-wage employment in each home ZIP code using the Earnin employment series in April 2020. We then match each home ZIP code to the distribution of workplace ZIP codes using the Census LODES data for low-wage workers. We calculate the x-axis variable as the average median rent for a two-bedroom apartment (measured in the 2014–2018 ACS), averaged across workplace ZIP codes using the distribution from the LODES data for each home ZIP code. See Section IV.C for a detailed discussion. Panel B replicates Panel A for a different outcome: average consumer spending between March 25 and April 14, 2020, restricting to ZIP codes in the bottom quartile of median income, as measured in the 2014–2018 ACS. Panel C replicates Panel B with consumer spending instead measured during October 2020. The binned scatter plots are constructed as described in Figure II. Panel D plots the average level of consumer spending for the top quartile of households appearing in Panels B and C ranked on average median workplace rent (i.e., the five right-most dots) in each month from February 2020 through December 2021. Data sources: Earnin, Affinity Solutions, Census LODES, ACS.
Next we analyze how these differential shocks to employment affected spending patterns, taking a step toward mapping the flow of shocks in the economy (Andersen et al. 2022). Figure IX, Panel B replicates Panel A using spending changes on the y-axis, restricting to households living in low-income ZIP codes.37 Low-income people living in areas where people tend to work in high-rent ZIP codes cut spending by 33% on average from January to April 2020, compared with 23% for those living in areas where people tend to work in low-rent ZIPs (Figure IX, Panel B). The relationship remains similar in magnitude but is less precisely estimated when we compare ZIP codes in the same county (Online Appendix Table VII).
Figure IX, Panel B implies that low-income households who lost their jobs at the start of the pandemic reduced their own spending more at the start of the pandemic—portending the start of the downward spiral. However, while employment losses for low-wage workers persisted over time, the reductions in spending did not. Figure IX, Panel C shows that by October 2020, spending in low-income ZIP codes was slightly higher than it was pre-COVID on average, and there was no longer any relationship between workplace rents and spending levels among low-income households despite the persistence of employment losses in higher-rent areas (Figure VI, Panel D). Figure IX, Panel D plots the evolution of spending in bottom-income-quartile ZIP codes that rank in the top quartile of median workplace rent. Despite the fact that these areas faced the largest and most persistent employment losses, consumer spending recovered very rapidly after falling sharply at the onset of the pandemic, exceeding pre-COVID levels starting in July 2020.
In sum, although a sharp gradient of spending reduction with respect to employment losses emerged early in the pandemic, it vanished within a few months. Total spending in areas where many workers had lost their jobs and remained out of work remained on par with areas where workers had lost less income—indicating that the secondary spending response that could have produced a further downward spiral was effectively shut down shortly after the crisis began. Losses in earned income likely did not translate to further spending reductions because the fiscal response to the crisis (e.g., via extended unemployment benefits and stimulus payments) actually increased the total disposable income of low-income households (Blanchet, Saez, and Zucman 2022; Ganong et al. 2022). These results suggest that as a whole, macroeconomic policy responses appear to have been effective in limiting secondary declines in consumer spending as workers lost their jobs—perhaps even going beyond what was necessary—even if they could not address the losses in employment that arose from the initial shock to consumer spending driven by health concerns.
V. Conclusion
Transactional data held by private companies have great potential for measuring economic activity, but to date have been accessible only through contracts to work with confidential microdata. In this article, we constructed a public database to measure economic activity at a high-frequency, granular level using data from private companies. By systematically cleaning, aggregating, and benchmarking the underlying microdata, we construct series that can be released publicly without disclosing sensitive information.
We use this new public database to analyze the economic effects of COVID-19, demonstrating two ways the data provide a new tool for empirical macroeconomics. First, the data can be used to rapidly diagnose the root factors driving an economic crisis by learning from cross-sectional heterogeneity, since different places and subgroups often face different shocks. In the case of COVID-19, we find that a sharp reduction in spending by high-income individuals due to health concerns led to losses of business revenues and persistent reductions in low-wage employment in affluent areas. Second, the data permit rapid, real-time policy evaluation, as demonstrated by our analyses showing the changing effects of fiscal stimulus payments over the course of the pandemic, opening a path to fine-tuning policy responses based on their observed impacts rather than relying solely on historical estimates.
The benefit of constructing a public database to conduct such analyses rather than working directly with private firms’ confidential data is that we centralize the fixed costs of cleaning the data for research purposes. This facilitates transparency and reproducibility and enables researchers to readily access these data to conduct a much broader set of analyses. For example, the data have been used by local policy makers to inform local policy responses and forecast tax revenue impacts (e.g., Maine, Missouri, Kansas, and Texas). They have also been used by congressional staff to design federal policies, for example, predicting the effects and costs of policies targeted based on business revenue losses (Bennet 2020). And they have been used by other researchers to analyze a broad range of issues, from constructing price indices that account for changes in consumption bundles (Cavallo 2020) to analyzing the effects of political views on economic outcomes (Makridis and Hartley 2020).
Although we have focused here on the short-run effects of COVID-19, private sector data can be useful in monitoring effects of economic shocks on long-term outcomes as well. As an illustration, Figure X plots weekly student engagement on Zearn, an online math platform used by nearly 1 million elementary school students as part of their regular school curriculum (see Online Appendix I). Children in high-income areas learned less when the COVID crisis hit and schools shifted to remote instruction, but soon recovered to baseline levels. By contrast, children in lower-income areas completed 41% fewer lessons than they did prepandemic through the end of the school year. These findings—first established in May 2020 and confirmed by subsequent work such as Goldhaber et al. (2022) and Jack et al. (2023)—raise the concern that the pandemic may have long-lasting effects on low-income families not just through persistent reductions in employment documented here but also through effects on the next generation.

Figure X. Effects of COVID-19 on Educational Progress by Income Group
This figure plots a time series of student engagement on the Zearn Math online platform, splitting schools into quartiles based on the share of students in the school eligible for Free or Reduced Price Lunch (FRPL). We measure student engagement as the average number of students using the Zearn Math application in each week, relative to the mean value of students using the platform in the same classroom during the reference period of January 6 to February 7, 2020. We restrict the sample to classrooms with at least 10 students using Zearn on average and at least 5 students doing so in each week during the reference period. We measure the share of students eligible for FRPL in each school using demographic data from the Common Core data set from MDR Education, a private education data firm. Data sources: Zearn, Common Core.
Over the twentieth century, the Bureau of Economic Analysis built on a prototype developed by Kuznets (1941) to institute surveys of businesses and households that form the basis for today’s National Income and Product Accounts. The database created here provides a prototype for a system of more granular, real-time national accounts built using transactional private sector data. The fact that even this first prototype yields insights that cannot be obtained from existing data suggests that aggregating data from private companies to construct public statistics has great potential for improving our understanding of economic activity and policy making.
Data Availability
The data underlying this article are available in the Harvard Dataverse, https://doi.org/10.7910/DVN/4CFSZW (Chetty, Friedman, and Stepner 2023).
Footnotes
We thank the corporate partners who provided the underlying data used to construct the public database built in this article: Affinity Solutions (especially Atul Chadha and Arun Rajagopal), Lightcast (Anton Libsch and Bledi Taska), CoinOut (Jeff Witten), Earnin (Arun Natesan and Ram Palaniappan), Homebase (Ray Sandza and Andrew Vogeley), Intuit (Christina Foo and Krithika Swaminathan), Kronos (David Gilbertson), Paychex (Mike Nichols and Shadi Sifain), Womply (Derek Doel and Ryan Thorpe), and Zearn (Billy McRae and Shalinee Sharma). We are very grateful to Nathaniel Hendren, who collaborated with us to launch the initial version of the database and helped conduct preliminary analyses for the first draft of this article in spring 2020. We are also grateful to Ryan Rippel of the Gates Foundation for his support in launching this project and to Gregory Bruich for early conversations that helped spark this work. We thank David Autor, Gabriel Chodorow-Reich, Haley O’Donnell, Emmanuel Farhi, Jason Furman, Steven Hamilton, Erik Hurst, Xavier Jaravel, Lawrence Katz, Fabian Lange, Emmanuel Saez, Ludwig Straub, Danny Yagan, and numerous seminar participants for helpful comments. The work was funded by the Chan-Zuckerberg Initiative, Bill & Melinda Gates Foundation, Overdeck Family Foundation, Andrew and Melora Balson, Harvard University, Brown University, JPB Foundation, Smith Richardson Foundation, and the University of Toronto. The project was approved under Harvard University IRB 20-0586. 
The Opportunity Insights Economic Tracker Team as of July 2023 has consisted of Hamidah Alatas, Camille Baker, Harvey Barnhard, Matt Bell, Gregory Bruich, Tina Chelidze, Lucas Chu, Westley Cineus, Sebi Devlin-Foltz, Michael Droste, Dhruv Gaur, Federico Gonzalez, Rayshauna Gray, Abigail Hiller, Matthew Jacob, Tyler Jacobson, Margaret Kallus, Fiona Kastel, Laura Kincaide, Caitlin Kupsc, Sarah LaBauve, Lucía Lamas, Maddie Marino, Kai Matheson, Jared Miller, Christian Mott, Kate Musen, Danny Onorato, Sarah Oppenheimer, Trina Ott, Lynn Overmann, Max Pienkny, Jeremiah Prince, Sebastian Puerta, Daniel Reuter, Peter Ruhm, Tom Rutter, Emanuel Schertz, Shannon Felton Spence, Krista Stapleford, Kamelia Stavreva, Ceci Steyn, James Stratton, Clare Suter, Elizabeth Thach, Nicolaj Thor, Amanda Wahlers, Kristen Watkins, Alanna Williams, David Williams, Chase Williamson, Shady Yassin, Ruby Zhang, and Austin Zheng.
For example, data on consumer spending disaggregated by geography are only available for selected large metro areas at a biannual level in the Consumer Expenditure Survey (CEX).
Survey-based statistics themselves do not necessarily provide “ground truth” because of sampling error, recall error, and growing nonresponse bias (Dutz et al. 2021; Meyer and Mittag 2021). Thus, even for longer-term inferences at the national level, combining information from transactional data with information from surveys can be valuable.
We provide a replication kit that generates all of the results in the article from publicly available data. We use nonpublic data for certain robustness checks and validation analyses reported in the Online Appendix (as documented in the replication kit).
We verify the quality of our publicly available ZIP code–level proxies for income by showing that our estimates of spending by income group during the pandemic are closely aligned with those of Cox et al. (2020), who observe household income directly for JPMorgan Chase clients in confidential microdata.
We always index to January 2020 after summing to a given cell (geographic unit, industry, etc.) rather than at the firm or individual level. This dollar-weighted approach overweights bigger firms and higher-income individuals, but leads to smoother series and is more relevant for certain macroeconomic policy questions (e.g., changes in aggregate spending).
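The difference between indexing after summing and indexing at the firm level can be seen in a minimal sketch. The firm names, revenue figures, and function names below are hypothetical illustrations, not values or code from the database.

```python
def index_after_summing(series_by_unit, base=0):
    """Sum all units in a cell, then index the cell total to the base period.

    This is the dollar-weighted approach: units with larger levels
    carry more weight in the indexed series.
    """
    n_periods = len(next(iter(series_by_unit.values())))
    totals = [sum(s[t] for s in series_by_unit.values()) for t in range(n_periods)]
    return [total / totals[base] for total in totals]


def index_then_average(series_by_unit, base=0):
    """Index each unit to its own base-period value, then average equally."""
    indexed = [[v / s[base] for v in s] for s in series_by_unit.values()]
    n_units = len(indexed)
    return [sum(unit[t] for unit in indexed) / n_units for t in range(len(indexed[0]))]


# Hypothetical cell with one large and one small firm (revenue in dollars).
cell = {"large_firm": [1000.0, 500.0], "small_firm": [100.0, 100.0]}

dollar_weighted = index_after_summing(cell)  # cell total falls from 1100 to 600
equal_weighted = index_then_average(cell)    # mean of firm-level indices 0.5 and 1.0
```

In this example the dollar-weighted index falls to about 0.55 while the equal-weighted index falls only to 0.75, because the large firm dominates the cell total.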
We use a higher level of geographic aggregation to detect breaks here than the county-level aggregation used for consumer spending because the number of small businesses is an order of magnitude smaller than the number of active credit and debit cards, and so tests for structural breaks have less power.
Industry is defined using select NAICS supersectors, aggregated from two-digit NAICS classification codes. Job qualifications are defined using ONET job zones, which classify jobs into five groups based on the amount of preparation they require. We also obtain analogous data broken down by educational requirements.
In January 2020, the thresholds were $13.10, $19.65, and $32.75, and the four bins in ascending order by wage contained 23.4%, 27.4%, 25.7%, and 23.5% of CPS respondents. The FPL is updated annually at the beginning of each year. We use the annual FPL to set the thresholds each January and smoothly adjust the thresholds in the year using CPI inflation, as described in Online Appendix E.
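As a rough sketch of how such thresholds sort workers into four wage bins, the code below uses the January 2020 thresholds reported above. The within-year adjustment shown here (scaling by the ratio of current CPI to January CPI) and the treatment of wages exactly at a threshold are assumptions made for illustration; the actual procedure is the one described in Online Appendix E. The CPI values are hypothetical.

```python
from bisect import bisect_right

# Hourly wage thresholds for January 2020, as reported in the text.
JAN_2020_THRESHOLDS = [13.10, 19.65, 32.75]


def adjust_thresholds(thresholds, cpi_current, cpi_january):
    """Scale thresholds by CPI growth since January (assumed adjustment rule)."""
    factor = cpi_current / cpi_january
    return [t * factor for t in thresholds]


def wage_bin(hourly_wage, thresholds):
    """Return the wage bin (0 = lowest, 3 = highest) for an hourly wage.

    Assumption: a wage exactly equal to a threshold falls in the higher bin.
    """
    return bisect_right(thresholds, hourly_wage)


bins = [wage_bin(w, JAN_2020_THRESHOLDS) for w in (12.00, 15.00, 20.00, 40.00)]

# Hypothetical CPI values implying 2% inflation since January.
mid_year = adjust_thresholds(JAN_2020_THRESHOLDS, cpi_current=102.0, cpi_january=100.0)
```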
As an example of the specific data-processing challenges that we address in constructing the employment series, bunching at integer values in the wage distribution generates discontinuities in the number of workers assigned to each wage group as the thresholds for the groups are updated due to inflation. For example, when the threshold for the lowest wage group crosses $14/hour, a discrete mass of workers who were previously part of the second quartile is now defined as being in the bottom quartile, causing a discontinuity in both series. To address this issue, we spread workers out from the whole-number wages by adding a random number between −0.5 and 0.5 to their hourly wage, transforming the point mass at the integer wage into a uniform distribution on [wage − 0.5, wage + 0.5] (see Online Appendix E.2 for details).
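The smoothing step can be sketched as follows. This is an illustration under stated assumptions rather than the procedure in Online Appendix E.2: the sample of wages is hypothetical, the function names are ours, and we assume that only wages sitting exactly on a whole number receive noise.

```python
import random


def jitter_integer_wages(wages, rng):
    """Add uniform noise in [-0.5, 0.5] to wages that sit exactly on a whole number.

    Assumption: wages that are not whole numbers are left unchanged.
    """
    return [w + rng.uniform(-0.5, 0.5) if float(w).is_integer() else w for w in wages]


def share_below(wages, threshold):
    """Share of workers whose hourly wage is strictly below the threshold."""
    return sum(w < threshold for w in wages) / len(wages)


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical sample: a point mass at $14/hour plus some non-integer wages.
raw = [14.0] * 1000 + [12.75] * 500 + [16.25] * 500
smoothed = jitter_integer_wages(raw, rng)

# Raw data: the share jumps from 0.25 to 0.75 as the threshold crosses 14.
jump = share_below(raw, 14.01) - share_below(raw, 13.99)

# Jittered data: the same small move in the threshold shifts only a few workers.
smooth_change = share_below(smoothed, 14.01) - share_below(smoothed, 13.99)
```

With the point mass spread over a one-dollar interval, a small change in the threshold reassigns only the workers whose jittered wage falls in the narrow band that the threshold moved across.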
Most of the reduction in private investment was driven by a reduction in inventories and equipment investment in the transportation and retail sectors, both of which are plausibly a response to reductions in current and anticipated consumer spending. In the first quarter of 2020, consumer spending accounted for an even larger share of the reduction in GDP, further supporting the view that the initial shock to the economy came from a reduction in consumer spending (U.S. Bureau of Economic Analysis 2020).
The rest of the reduction is largely accounted for by health care expenditures; housing and motor vehicle expenditures did not change significantly.
The series are not perfectly comparable because the category definitions differ slightly across the data sets. For example, we observe food and accommodation services combined in the card data but only food services in the MARTS. In addition, the MARTS includes corporate card transactions, whereas we exclude them to isolate consumer spending. Hence, we would not expect the series to track each other perfectly even if the card spending data provided a perfect representation of national spending patterns.
One specific source of potential bias in our spending series is that it does not include cash transactions and thus could be biased by potential substitution from cash to credit card purchases. We evaluate this concern using receipts data from CoinOut, which allows us to measure cash spending on groceries (see Online Appendix B.3). In practice, trends in card and cash spending track each other closely (Online Appendix Figure VI.B). These results—along with the fact that our card spending series closely track estimates from the MARTS—indicate that aggregate fluctuations in card spending do not appear to have been offset by opposite-signed changes in cash spending.
Cox et al. (2020) report an 8 percentage point larger decline in spending for the highest income quartile relative to the lowest income quartile in the second week of April. Our estimate of the gap at that time is also 8 percentage points, although the levels of the declines in our data are slightly smaller in magnitude for both groups.
For example, more than 50% of workers in food and accommodation services (a major nontradeable sector) work in establishments with fewer than 50 employees (U.S. Census Bureau 2017).
We focus on small businesses because their customers are typically located near the business itself; larger businesses’ customers (e.g., large retail chains) are more dispersed, making the geographic location of the business less relevant.
We use 2010 Census ZIP Code Tabulation Areas (ZCTAs) to perform all geographic analyses of ZIP-level data. Throughout the text, we refer to these areas simply as “ZIP codes.”
Rents are a simple measure of the affluence of an area that combine income and population density: the highest-rent ZIP codes tend to be high-income, dense areas such as Manhattan. Plotting small-business revenue against median incomes or population density produces analogous results (Online Appendix Figure XI).
Part of the reason that revenues fell so sharply in high-rent ZIP codes is that affluent families moved elsewhere during the pandemic. To quantify the relative contribution of such “extensive-margin” mechanisms versus intensive-margin reductions in spending by high-income households who did not leave, we use aggregated mobile phone data from SafeGraph (Allcott et al. 2020) to estimate changes in local population at high frequencies. Although population fell more in high-rent areas, changes in small business revenues as of April 2020 still exhibit a sharp gradient with respect to local rents even conditional on SafeGraph-based estimates of population counts (12.3% per $1,000 rent, std. err. 0.95).
Of course, households do not restrict their spending solely to businesses in their own ZIP code. We find similar patterns when zooming out to the county level. Counties with larger top 1% income shares experienced larger losses of small-business revenue (Online Appendix Figure XII.B). Poverty rates are not strongly associated with revenue losses at the county level (Online Appendix Figure XII.C), indicating that it is the presence of the rich in particular (as opposed to the middle class) that is most predictive of economic effects on local businesses.
Another benefit of our payroll-based employment series is the timeliness of its local-area estimates: it matches the county-level granularity of the QCEW (which is released with a lag of six months), but with the timeliness of the monthly employment statistics in the CES that are released at the national level and for 450 metropolitan statistical areas.
We cannot use this panel approach to examine employment beyond February 2021 conditional on pre-COVID wage rates because households responding to the CPS answer the Outgoing Rotation Group panel questions exactly twice, 12 months apart.
The high level of job postings in the second half of 2021 may also reflect a labor supply shortage, as companies had to post more jobs to fill a set of positions.
Online Appendix Figure XVIII presents a specific example of this result by plotting trends in employment and spending in the retail trade sector. Total retail spending was 25% higher as of December 2021 relative to the pre-COVID baseline. Employment of high-wage workers was 5% above baseline levels, but employment of low-wage workers was still down by 19% in this sector—as in the economy as a whole.
We omit California, Massachusetts, and New York in this cross-sectional analysis because they each raised their minimum wages during our sample, leading to a discrete mechanical reduction in the number of bottom-wage-quartile workers over the course of the pandemic (see Online Appendix E.2).
The differential changes in employment rates in low-wage jobs across low- versus high-rent areas are not driven by differential changes in wage growth rates or occupational switching. Using the approaches described above at a national level (see Online Appendix E.4 for details), we find that wage growth rates are, if anything, lower in high-rent states than low-rent states and that rates of switching to higher-paying jobs are uncorrelated with state-level rents. Furthermore, the CPS panel shows that employment for workers who started in the bottom wage quartile prepandemic remained lower in high-rent states in February 2021 (Online Appendix Figure XIX).
Prior studies benefited from substantial variation in the timing of payments, permitting identification over a longer period of time. In contrast, the stimulus payments we study each largely arrived on a single day, making it challenging to estimate effects over longer horizons without strong assumptions about counterfactual trends.
The payments were reduced at higher levels of income and phased out entirely for households with incomes above $99,000 (for single filers without children) or $198,000 (for married couples without children).
We permit pretrends because spending fell rapidly for all income groups in the days immediately preceding the April 15 stimulus payments, as shown in Figure I, Panel A. We assume common trends to maximize precision, as we find no significant differences in pretrends in spending across income quartiles in the 25 days preceding the stimulus payments. The differential changes in spending by income quartile discussed in Section III emerged before that period, immediately after the pandemic began. We also show that not adjusting for pretrends at all yields qualitatively similar conclusions in Online Appendix Figure XXI.
Disaggregating the spending data by sector, we find that most of the additional spending from the April 2020 stimulus went to durable goods rather than in-person services. The stimulus thus increased the overall level of spending but did not channel money back to the businesses that lost the most revenue due to the COVID shock. These findings provide evidence for the “broken Keynesian cross” mechanism in the model of Guerrieri et al. (2022), where funds are not recirculated back to the sectors shut down by the pandemic, potentially diminishing multiplier effects.
Using the same 25-day preperiod window as was used for the first stimulus yields point estimates that are statistically indistinguishable from those we present, but with much wider confidence intervals due to the greater noise in the preperiod.
Both estimates are significantly lower (with p < .005) than those from April 2020 based on a permutation test; see Online Appendix Figures XXIV and XXV for the full distribution of placebo estimates.
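The logic of a p-value computed from a distribution of placebo estimates can be sketched as follows. This is a generic illustration, not the procedure behind Online Appendix Figures XXIV and XXV: the one-sided comparison, the convention of counting the actual estimate among the draws, the function name, and all numbers below are assumptions.

```python
def placebo_p_value(actual, placebo_estimates):
    """One-sided p-value: share of estimates at or below the actual one.

    The actual estimate is counted as one of the draws, so the smallest
    attainable p-value is 1 / (number of placebos + 1).
    """
    at_or_below = sum(p <= actual for p in placebo_estimates)
    return (1 + at_or_below) / (1 + len(placebo_estimates))


# Hypothetical placebo estimates; none is as low as the actual estimate.
placebos = [0.10, -0.05, 0.00, 0.30]
p = placebo_p_value(-0.20, placebos)
```

Under this convention, a p-value below .005 requires at least 200 placebo estimates, none of which is as low as the actual estimate.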
Summarizing the literature on impacts of stimulus payments, Sahm (2019) observes that “households with low liquid assets relative to their income tend to spend more (and more quickly) out of additional income than those households with ample liquidity.” In normal times, Sahm observes that “targeting current low-income or low-wealth households may not identify the households most likely to spend the stimulus, which could include some wealthy households.” The link between income and liquid wealth changed during the pandemic, making such targeting more feasible.
With the benefit of hindsight, one may have been able to predict that MPCs would begin to fall for high-income households as their liquid savings rose, but it is difficult to gauge ex ante which of the many potential dimensions of heterogeneity and structural change warrant attention.
This analysis focuses solely on short-run employment effects; it remains possible that the PPP may have long-term benefits by reducing permanent business closures, as emphasized by Hubbard and Strain (2020).
These results are driven by work location rather than sectoral differences in employment across areas: in the Earnin microdata, we find similar results even when comparing workers employed at the same firm (e.g., a chain restaurant). People working in high-rent ZIP codes in January 2020 remained less likely to have a job (anywhere) in April 2020 than their coworkers working in a different establishment of the same firm in lower-rent ZIP codes.
We restrict this figure to households living in low-income ZIPs because we cannot disaggregate the Affinity data by individual-level income. Since the employment data already represent only low-income workers, we do not restrict to low-income ZIPs in the employment analysis; however, the patterns are very similar when restricting to low-income ZIPs in the Earnin data.