Abstract

We study the relationship between management practices, organizational performance, and task clarity, using observational data analysis on an original survey of the universe of Ghanaian civil servants across 45 organizations and novel administrative data on over 3,600 tasks they undertake. We first demonstrate that there is a large range of variation across government organizations, both in management quality and in task completion, and show that management quality is positively related to task completion. We then provide evidence that this association varies across dimensions of management practice. In particular, task completion exhibits a positive partial correlation with management practices related to giving staff autonomy and discretion, but a negative partial correlation with practices related to incentives and monitoring. Consistent with theories of task clarity and goal ambiguity, the partial relationship between incentives/monitoring and task completion is less negative when tasks are clearer ex ante and the partial relationship between autonomy/discretion and task completion is more positive when task completion is clearer ex post. Our findings suggest that organizations could benefit from providing their staff with greater autonomy and discretion, especially for types of tasks that are ill-suited to predefined monitoring and incentive regimes.

Introduction

The relationship between the management practices under which public servants operate and organizational performance is a central question for public administration (e.g., Honig 2018; Ingraham, Joyce, and Kneedler Donahue 2003; Lynn, Heinrich, and Hill 2000; Meier, Laurence, and O’Toole 2002). Miller (2000) and Miller and Whitford (2016) describe the debate between Carl Friedrich and Herman Finer to distinguish between two broad schools of thought. On the one hand, if bureaucrats are viewed as agents whose preferences diverge from those of their principals or who shirk their duties, then they should be managed through top-down tools of control such as monitoring and rewards/sanctions in order to elicit effort and minimize moral hazard (Finer 1941). On the other hand, if bureaucrats are viewed as professionals trying to do their best for the public good, then public bureaucracies ought to delegate significant autonomy and discretion to bureaucrats, relying on their professionalism and expertise to deliver public services (Friedrich 1940). The broad contours of these two approaches manifest themselves in many subsequent debates, with some authors (particularly, but not exclusively, in public administration) emphasizing the value of autonomy and discretion (Andersen and Moynihan 2016; Carpenter 2001; Miller and Whitford 2016; Rainey and Steinbauer 1999; Rose-Ackerman 1986; Simon 1983), whereas others (particularly, but not exclusively, in economics) focus on the importance of top-down monitoring and incentives (e.g., Duflo, Hanna, and Ryan 2012; various in Finan, Olken, and Pande 2017). Other authors have argued that the relationship between management and performance might depend on the nature of the agency’s tasks or goals (Chun and Rainey 2005; Wilson 1989).

We contribute to this debate by studying the relationships between a broad spectrum of management practices and the full range of bureaucratic tasks, across 45 ministries and departments in the central government of Ghana. Ghana is a medium-sized, lower-middle income democracy, and ranks just below the global median (46th percentile) in government effectiveness by the Worldwide Governance Indicators.1 The civil servants we study are largely in mid-level bureaucratic policymaking and oversight roles, rather than frontline implementation. To measure management practices, we conduct in-person surveys with the universe of professional-grade civil servants in Ghana’s central government—nearly 3,000 individuals—and construct measures of management quality, adapting the methodological innovations of Bloom and Van Reenen (2007) and Bloom, Sadun, and Van Reenen (2012) from organizational economics to the public sector. This enables us to construct an overall index of management quality as well as subindices related to the use of incentives and monitoring, autonomy and discretion, and other practices. These are based not on subjective self-reported perceptions, but on probing interviews benchmarked to an absolute scale that seek to capture the de facto management practices actually used day to day, rather than the de jure rules prescribed on paper.

To measure task completion and clarity, we collect, digitize, and hand-code quarterly and annual progress reports of each organization’s planned activities against their actual achievements. This yields a database on the characteristics and completion of 3,620 tasks covering the entire range of bureaucratic activity, from procurement and infrastructure to policy development, advocacy, human resource management, budgeting, and regulation. Task completion is a fundamental aspect of bureaucratic performance and public service delivery, and allows us to compare different organizations against a common performance metric which is widely available even in contexts with relatively little existing performance information. We validate our measure against a subsample of audited projects to ensure that this self-reported bureaucratic data are truthful.

We first document that there is substantial variation across organizations on both management quality and task completion. This is despite the fact that all organizations operate under the same civil service law, regulations, and pay structure; are overseen by the same authorities; draw from the same pool of potential hires; and are located close to one another in the capital—in some cases in the same building. The existence of significant and systematic variation in both process-based (management practices) and output-based (task/project completion) measures of performance across organizations within government has implications for the theory and measurement of organizational performance, state capacity, and governance.

We then estimate the relationships between management practices and task completion, exploiting the fact that multiple organizations conduct each task type. We find that overall management quality and each management subindex on its own is positively correlated with task completion, with varying degrees of significance, and that the subindices are positively correlated with each other. However, organizations do not implement management practices in isolation; they implement a portfolio of practices, so it is important to estimate their relationships with task completion jointly. When we do this, we show that autonomy/discretion and incentives/monitoring have opposing signs: a 1 standard deviation (SD) increase in autonomy/discretion-related practices is associated with a 12 percentage point increase in the likelihood a task is fully completed; in contrast, a 1 SD increase in incentives/monitoring-related management practices is associated with a decrease of 4 percentage points in the likelihood it is fully completed.

We further investigate the relationships between management practices and task completion by examining how these relationships vary with the ex ante clarity of task definition and the ex post clarity of actual achievement. Bureaucratic tasks are ex ante clear when the task can be defined in such a way as to create little uncertainty about what is required to complete the task, and are ex post clear when a report of the actual action undertaken leaves little uncertainty about whether the task was effectively completed. We hypothesize that the top-down control strategies of incentives and monitoring should be relatively more effective when tasks are easy to define ex ante because it is easier to specify what should be done and construct an appropriate monitoring regime. On the other hand, empowering staff with autonomy and discretion should be relatively more effective when tasks are unclear ex ante, as well as when the actual achievement of the task is clear ex post (because ex post clarity makes it easier to detect abuse of discretion). We find strong evidence consistent with this mechanism: for tasks with below-median ex ante clarity, 1 SD increases in autonomy/discretion- and incentives/monitoring-related practices are associated with a 21 percentage point increase and 14 percentage point decrease in task completion, respectively. However, there is no significant difference between management practices for tasks with above-median ex ante clarity. Similarly, for tasks with above-median ex post clarity, a 1 SD increase in autonomy/discretion (incentives/monitoring) is associated with a 34 percentage point increase (1 percentage point decrease) in task completion, consistent with our hypotheses.

Our findings are consistent with theories that emphasize that monitoring and incentive systems can backfire in contexts (such as core civil service policymaking tasks) intensive in multitasking, coordination, and instability, and where tasks are differentially observable (Dixit 2002; Honig 2018). Because these coefficients have to be interpreted relative to each other, the implication is that Ghana’s civil service organizations are underproviding autonomy and discretion to their staff relative to their use of monitoring and incentives, particularly on tasks that are ex ante unclear or ex post clear. That is, organizations appear to be overbalancing their management practice portfolios toward top-down control measures at the expense of entrusting and empowering the professionalism of their staff. These results are particularly striking in the context of a lower-middle income country such as Ghana, where concerns about abuse of discretion among public officials are especially salient among academics and citizens alike.

We also make a methodological contribution by showing that because the underlying management practice subindices are positively correlated, estimating the relationships between a single set of management practices and performance without accounting for other practices—as is common in the empirical literature—would lead to significant omitted variable bias. We regard this as empirical justification for our approach of seeking to understand these relationships simultaneously across a broad spectrum of management practices, rather than focusing more narrowly on the effects of a single type of management practice, with important implications for other studies of bureaucratic management and service delivery. Although our context does not allow for clean causal identification of these effects, the empirical support we find for the task clarity mechanism makes it unlikely that our findings are driven entirely by reverse causality. Indeed, our goal of investigating these relationships across such a broad spectrum of management practices under actual implementation conditions could not be pursued in the more controlled or narrower settings that would enable clean causal identification. Our findings are thus an important complement to (quasi-)experimental studies in expanding our understanding of the relationship between management and performance in state bureaucracies.
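The omitted variable bias argument can be illustrated with a short simulation. The numbers below are hypothetical (not the paper's data): two standardized practice indices are given an assumed correlation of 0.6 and opposite-signed "true" associations echoing the magnitudes reported above, and the sketch shows that regressing on one index alone flips the sign of its coefficient.

```python
# Illustrative simulation (hypothetical data, not the study's): when two
# management subindices are positively correlated but have opposite-signed
# true relationships with task completion, a regression on either index
# alone suffers omitted variable bias.
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Correlated standardized practice indices (assumed correlation ~0.6).
autonomy = rng.standard_normal(n)
incentives = 0.6 * autonomy + np.sqrt(1 - 0.6**2) * rng.standard_normal(n)

# Hypothetical true model echoing the signs reported in the text:
# +0.12 for autonomy/discretion, -0.04 for incentives/monitoring.
completion = 0.12 * autonomy - 0.04 * incentives + 0.1 * rng.standard_normal(n)

def ols(regressors, y):
    """Return OLS slope coefficients (intercept included, then dropped)."""
    X = np.column_stack([np.ones(len(y))] + list(regressors))
    return np.linalg.lstsq(X, y, rcond=None)[0][1:]

b_joint = ols([autonomy, incentives], completion)  # recovers (+0.12, -0.04)
b_alone = ols([incentives], completion)            # biased: -0.04 + 0.6*0.12 = +0.032

print("joint estimates:", b_joint)
print("incentives alone:", b_alone)
```

The single-regressor coefficient on incentives comes out positive, matching the paper's observation that each subindex on its own correlates positively with task completion even though the joint estimates have opposing signs.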

The remainder of this article proceeds as follows. The second section presents a theoretical framework for studying the relationship between management practices and output, with a focus on the longstanding debate between incentives/monitoring-led approaches and autonomy/discretion-led approaches. The third section discusses our empirical context and data, and the fourth section shows our descriptive results on variation in management and productivity within Ghana’s Civil Service. The fifth section presents our empirical method and main results, the sixth section investigates mechanisms, and the seventh section examines external validity. The eighth section concludes by discussing implications for research and policy.

Theory: Management, Task Clarity, and Performance

As Miller (2000) and Miller and Whitford (2016) discuss, one common view of bureaucrats (following Finer 1941) is that they need to be monitored and rewarded and sanctioned. Accordingly, the relationship between performance and the use of top-down strategies of management control like incentives and monitoring has long been a central question for public administration and economics (Boyne and Hood 2010; Dixit 2002; Wilson 1989). The relationship between these approaches and bureaucratic output is theoretically ambiguous. On one hand, management through incentives and monitoring may elicit agent effort and discourage behavior that is misaligned with a principal’s preferences. On the other hand, the nature of public sector work—involving multiple goals, difficult-to-measure outputs and outcomes, extensive coordination, and environmental uncertainty—may limit the scope for the effective use of incentives and monitoring or result in efforts at top-down control backfiring. A large literature focuses specifically on the effect of performance-related pay on performance, documenting both successes and failures (e.g., Dahlstrom and Lapuente 2010; Hasnain, Manning, and Pierskalla 2012; Miller and Whitford 2007; Perry, Engbers, and Jun 2009), and another set of studies documents how the actual ways in which performance information in bureaucracies is used often deviate from its idealized form (e.g., Heinrich 1999; Moynihan et al. 2011). Quantitative evaluations of the effects of incentive schemes or rigid agent monitoring schemes with attached sanctions (e.g., Duflo, Hanna, and Ryan 2012; various in Finan, Olken, and Pande 2017) on performance have tended to focus solely on frontline agents and on the implementation of specific schemes in controlled conditions.
Studies on the use of incentives and monitoring more broadly in the core civil service (e.g., Bevan and Hood 2006; Kelman and Friedman 2009; various in Boyne and Hood 2010) have tended to focus on specific agencies or performance measures, resulting in questions about the generality of their findings.

On the other hand, perspectives (following Friedrich 1940) that emphasize the professionalism of bureaucrats are bifurcated between studies of the effects of autonomy of government organizations from politicians and studies of the effects of discretion of frontline bureaucrats. Although there is broad consensus that organizational autonomy from political influence is important for performance (Andersen and Moynihan 2016; Carpenter 2001; Moe 2013; Miller and Whitford 2016; Rainey and Steinbauer 1999), quantitative studies of individual-level discretion in economics and political science have mainly focused on its potential downsides in terms of discrimination (Einstein and Glick 2017) or corruption (Olken and Pande 2012). On the other hand, a large literature on bottom-up policy implementation (e.g., Thomann, Van Engen, and Tummers 2018) emphasizes the potential positive effects of discretion on policy implementation among frontline bureaucrats, and Rasul and Rogger (2018) find a positive association between autonomy and project completion in Nigeria. However, we are aware of few studies that attempt to measure the relationship between performance and the extent to which mid-level, core civil servants are able to use local information and make flexible decisions within organizations, and to apply such a measure to tasks across the full spectrum of bureaucratic activity, which is the empirical focus of this study. We thus refer to autonomy/discretion throughout our study, to indicate that our operationalization falls in between the conventional use of autonomy as an organization-level characteristic and discretion as pertaining to individual frontline bureaucrats.
Importantly, however, our perspective on autonomy/discretion shares its central hypothesis with these more common approaches: the nature of many core public sector tasks may require bureaucrats to make flexible judgments using local information, and thus organizations must rely on bureaucrats’ professionalism to make good decisions and perform effectively.

We conceptualize management in public organizations as a portfolio of practices that correspond to different aspects of management, each of which may be implemented more or less well. Bureaucracies may differ in their intended management styles, that is, what bundle of management practices they are aiming to implement, and may also differ in how well they are executing these practices. An organization may execute a given set of management practices in a consistent and coherent fashion, or in a disorganized and ad hoc way. Although research on the effects of management practices often focuses on a single practice in isolation, in reality bureaucracies are executing a wide range of management practices simultaneously. Whether deliberately or through inaction, organizations adopt policies on the (non-)use of targets, (non-)monitoring of key performance indicators, (dis-)allowance of discretion in decision making, (non-)reward of staff performance, and so on. This implies that management can be viewed as a joint set of choices across a range of practices. Finally, our approach assumes that what matters for determining organizational performance is not the de jure practices by which an organization avows itself to be managed, but the de facto practices that are actually used on a day-to-day basis within the organization. Management thus differs across organizations both in style and in quality. Debates over the merits of a particular management practice or style therefore also require an understanding of how well the practice(s) are likely to be implemented under real-world conditions; questions about the choice of management practices are not separable from issues of policy implementation.

The conceptual distinction we make (following Miller 2000 and Miller and Whitford 2016) between top-down incentives and monitoring and professionalism-oriented autonomy and discretion differs from the distinction drawn in another big-picture debate on public sector management: that between hierarchy-oriented Weberian bureaucracy and market-oriented New Public Management (NPM) styles of governance. Although the Weberian-versus-NPM debate has gained prominence in the literature due to its correspondence to the historical evolution of management in public organizations, our distinction corresponds instead to two divergent views of how best to manage bureaucratic agents: to what extent should they be managed with the carrot and the stick, and to what extent should they be empowered with the discretion associated with other professions? We view this question as cross-cutting the Weberian/NPM distinction. Throughout the article, we focus on our functional distinction without trying to situate it within the Weberian-versus-NPM debate.

Task clarity and related concepts have long been an important element of public administration theory because the differing nature of bureaucratic tasks implies that different management approaches might be effective in different situations. Wilson (1989, Ch. 9) makes a widely referenced distinction between the visibility of agency outputs and outcomes, and constructs a two-by-two typology of agency types: production agencies, which have high output and outcome visibility; procedural agencies, which have high output visibility but low outcome visibility, and thus are managed using tight top-down controls on activities and processes; craft agencies, which have low output visibility but high outcome visibility, and thus can be delegated with significant autonomy; and coping agencies, which have low visibility of both outputs and outcomes and are thus difficult to manage effectively. Relatedly, Chun and Rainey (2005) define four types of goal ambiguity (mission comprehensiveness, directive, evaluative, and priority) and show that goal ambiguity is negatively associated with various performance measures, and Romzek (2000) discusses how different accountability mechanisms (e.g., hierarchical vs. professional) are appropriate for different bureaucratic scenarios.

In our analysis, we focus at the level of particular tasks (rather than assuming agencies have a single main task type) and distinguish between ex ante and ex post task clarity. We hypothesize that ex ante and ex post task clarity may affect the ability of organizations to use different types of management practices to deliver tasks effectively. In particular, some types of tasks might be ex ante easy to specify, and others might be ex post easy to measure. For tasks that are ex ante unclear, the design of effective incentives and monitoring schemes is likely to be harder, all else equal, since it is unclear what bureaucrats should be aiming for and how to measure it. On the other hand, when task achievement is clear ex post, granting bureaucrats discretion over how to implement tasks is likely to be relatively more effective since abuse of that discretion will be easier to detect ex post. Although focused on tasks rather than agencies, our distinction between ex ante and ex post clarity is conceptually similar to Chun and Rainey’s (2005) directive and evaluative goal ambiguity distinction and to Wilson’s (1989) distinction between procedural and craft agencies.

Our theoretical discussion can be summarized in two sets of hypotheses. The first set refers to whether top-down monitoring- and incentive-based management approaches are more strongly associated with task completion on average than bottom-up autonomy- and discretion-based approaches. Because there are theoretical arguments on either side of this debate, we capture this with two opposing hypotheses:

  • H1a: Intensive use of management practices related to monitoring and incentives is more strongly associated with task completion than intensive use of management practices related to autonomy and discretion, all else equal.

  • H1b: Intensive use of management practices related to autonomy and discretion is more strongly associated with task completion than intensive use of management practices related to monitoring and incentives, all else equal.

Our second hypothesis refers to how these relationships are moderated by ex ante and ex post task clarity:

  • H2: Monitoring- and incentive-based management approaches will be more strongly associated with completion for tasks with relatively high ex ante clarity, and autonomy- and discretion-based management approaches will be more strongly associated with completion for tasks with relatively high ex post clarity, all else equal.

A priori, either H1a or H1b (but not both) could be true, so we test these hypotheses against each other empirically. Given its underlying theoretical rationale, we expect H2 to hold regardless of which of H1a or H1b we find greater support for.
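H2 is a moderation hypothesis, which empirically amounts to interacting a management practice index with a task clarity indicator. The sketch below is purely illustrative (simulated data with hypothetical variable names, not the paper's estimator or estimates): it builds a data-generating process in which incentives/monitoring practices are harmful only for ex ante unclear tasks, and shows how the interaction term recovers that pattern.

```python
# Illustrative moderation (interaction) regression on simulated data.
# Variable names and magnitudes are hypothetical, chosen only to echo
# the direction of H2 for incentives/monitoring and ex ante clarity.
import numpy as np

rng = np.random.default_rng(1)
n = 4000

incentives = rng.standard_normal(n)        # standardized practice index
exante_clear = rng.integers(0, 2, n)       # 1 = above-median ex ante clarity

# Assumed true process consistent with H2: incentives/monitoring are
# negatively related to completion only when tasks are ex ante unclear.
completion = (-0.14 * incentives * (1 - exante_clear)
              + 0.1 * rng.standard_normal(n))

# OLS of: completion ~ incentives + clarity + incentives:clarity
X = np.column_stack([np.ones(n), incentives, exante_clear,
                     incentives * exante_clear])
beta = np.linalg.lstsq(X, completion, rcond=None)[0]

slope_unclear = beta[1]            # slope for ex ante unclear tasks (~ -0.14)
slope_clear = beta[1] + beta[3]    # slope for ex ante clear tasks (~ 0)
print(slope_unclear, slope_clear)
```

Under H2, one expects the analogous interaction for autonomy/discretion with ex post clarity to carry the opposite sign pattern: a more positive slope for ex post clear tasks.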

Our article is related to important literatures in public administration, economics, and political science on organizational performance and state capacity. Within public administration, most of this literature is focused on high-income countries. Due to the well-known challenges of measuring organizational performance (Boyne and Walker 2005; Talbot 2010), this literature tends to focus on one type of organization, such as school districts or police departments (e.g., Andersen and Mortensen 2010; Meier, Laurence, and O’Toole 2002; Nicholson-Crotty and O’Toole 2004). Investigations of performance across a fuller range of public sector activities have often relied on subjective measures of organizational performance, leading to concerns about common source bias (Meier, Laurence, and O’Toole 2012). Our measure of task completion aims to build on this existing literature by establishing an output-based organizational performance metric that is comparable across organizations and not based on subjective surveys, while also enabling us to cover the entire spectrum of sectors and activities undertaken by government—much as we aim to measure organizations’ full portfolios of management practices rather than just the use of one specific practice.

A large literature in public administration, economics, political science, and sociology examines variation in bureaucratic quality across and within states. Although many studies analyze cross-national variation in state capacity and overall public sector management (e.g., Kaufmann, Kraay, and Mastruzzi 2010), more relevant to our study is literature that examines within-country, cross-agency variation in performance (e.g., Ingraham, Joyce, and Kneedler Donahue 2003). Outside of OECD contexts, this has taken primarily the form of a rich case study-based literature that documents the existence of “islands of excellence” or “pockets of effectiveness” (Leonard 2010; McDonnell 2017; Tendler 1997). However, these small-N analyses lead to concerns about the generalizability of their findings. Among quantitative studies of within-country variation outside OECD contexts, there is a small but growing literature that documents variation across organizations in capacity using input-based measures (Bersch, Praça, and Taylor 2016; Gingerich 2013). A limitation of this existing literature is that these measures of capacity are typically divorced from consideration of the type of management practices each organization is implementing, so that there is little evidence on the potential links between the style and the quality of management postulated by this framework. Owusu (2006) comes closer to our focus by using expert perception surveys to measure variations in performance of government organizations in Ghana, but with the usual potential measurement issues associated with subjective measures. Using our novel measures of task completion and management practices, we contribute to this literature on the extent of variation in organizational performance within governments in low- and middle-income countries, and also provide new tools that can be used in high-income contexts.

Context and Data

Ghana is a lower-middle income country home to 28 million individuals, with a central government bureaucracy that is structured along lines reflecting both its British colonial origins and more presidentialist postindependence reforms. We study the universe of 45 Ministries and Departments in the Civil Service. The headquarters of these organizations are all located in Accra, but the organizations have responsibility for public projects and activities implemented nationwide.2 Ministries and Departments are overseen by the Office of the Head of Civil Service (OHCS), which is responsible for personnel management and performance within the civil service. OHCS coordinates and decides on all hiring, promotion, transfer, and (in rare circumstances) firing of bureaucrats across the service. OHCS develops and promulgates official management regulations and processes, but Ministries’ and Departments’ compliance with these is imperfect, with the result that actual management practices are highly variable across organizations—as in many countries worldwide (Bersch, Praça, and Taylor 2016; Gingerich 2013; Ingraham, Joyce, and Kneedler Donahue 2003; Tendler 1997). All these Ministries and Departments have the same statutory levels of autonomy and political oversight structures, which facilitates our analysis (without limiting the scope of the theoretical predictions). Although all countries’ civil services differ in certain aspects, the basic structure of Ghana’s civil service is common across many Anglophone countries, and the types of challenges civil service organizations in Ghana face (e.g., lack of resources, environmental uncertainty, political-bureaucratic tensions, employee abuse of discretion) are widely shared ones worldwide.

Our analysis focuses on the professional grades of technical and administrative officers within these Ministries and Departments. We therefore exclude grades that cover cleaners, drivers, most secretaries, and so forth. On average, each organization employs 64 bureaucrats of the type we study (those on professional grades). We designate bureaucrats as being at either a senior level or nonsenior level. Senior bureaucrats are those that classify themselves as a “Director (Head of Division) or Acting Director” or as a “Deputy Director or Unit Head (Acting or Substantive).” By this definition, the number of nonsenior bureaucrats overseen by each senior bureaucrat (i.e., the span of control) is 4.52 on average, but again there is considerable variation across Ministries.3

Around 45% of bureaucrats are women, 70% have a university education, and 31% have a postgraduate degree (senior officers are more likely to be men, and to have a postgraduate degree). As in other state organizations, civil service bureaucrats enjoy stable employment once in service: the average bureaucrat has 14 years in service, with their average tenure in the current organization being just under 9 years. Appointments are made centrally by OHCS, bureaucrats enjoy security of tenure, and transitions between bureaucracies are relatively infrequent. Our analysis is based on two data sources. First, we hand-coded quarterly and annual progress reports from Ministries and Departments, covering tasks ongoing between January and December 2015. As detailed below, these reports enable us to code the individual tasks under the remit of each organization, and the extent to which they are initiated or successfully completed. Second, we surveyed 2,971 bureaucrats from all 45 civil service organizations over the period August–October 2015. As detailed below, civil servants were questioned on topics including their background characteristics and work history in service, job characteristics and responsibilities, engagement with stakeholders outside the civil service, perceptions of corruption in the service, and their views on multiple dimensions of management practices.

Coding Task Completion

Worldwide, civil service bureaucracies differ greatly in whether and how they collect data on their performance, and few international standards exist to aid cross-country comparisons. To quantify public sector task completion in our context, we therefore exploit the fact that each Ghanaian civil service organization is required by OHCS to provide quarterly and annual progress reports. Although organizations differed in their reporting formats and coverage, most reports included a comprehensive table of all tasks, outputs, and projects that were to be undertaken by the organization during the reporting period. We are thus able to use the progress reports of 30 of the 45 Ministries and Departments (our civil servant survey covers 2,247 bureaucrats in these 30 organizations). Supplementary figure A1 provides a snapshot of a typical progress report and indicates the information coded from it, and the supplementary appendix discusses the coding process in detail.4 Of the 15 organizations for which we were unable to code progress reports, the barrier in most cases was that the organizations produced reports in a purely narrative form (i.e., without a table), which made it impractical to separate the work undertaken into discrete tasks that could be used for coding. We discuss below how we validated the accuracy of reports we were able to code.

Progress reports cover the entire range of bureaucratic activity. Although some of these tasks are public-facing outputs, others are purely internal functions or intermediate outputs. For brevity, we refer to the activities, outputs, projects, functions, and processes reported on collectively as “tasks.”5 We were able to use these progress reports to identify 3,620 tasks underway during 2015. The tasks undertaken by each organization in a given year are determined through an annual planning and budgeting process jointly determined between: the core executive, mainly the Ministry of Finance and the sector minister representing government priorities; the organization’s management, based in large part on consultatively developed medium-term plans; and ongoing donor programs. This schedule of tasks is formalized in the organization’s annual budget (approved by Parliament) and annual workplan. The quarterly and annual reports that we use to code task completion thus detail the tasks that the organization’s workplan committed it to working on during the time period of study.

The supplementary appendix describes how we hand-coded and harmonized the information to measure task completion across the Ghanaian civil service. Three key points are of note in relation to this process. First, each quarterly progress report was codified into task line items by a team of trained research assistants and a team of civil servant officers seconded from the Management Services Department (MSD), an organization under OHCS tasked with analyzing and improving management in the civil service. MSD officers are trained in management and productivity analysis and frequently review organizational reports of this nature, making them ideally suited to judging task characteristics and completion.

Second, coders were asked to record task completion on a 1–5 scoring grid, where a score of one corresponds to “No action was taken towards achieving the target,” three corresponds to “Some substantive progress was made towards achieving the target. The task is partially complete and/or important intermediate steps have been completed,” and five corresponds to “The target for the task has been reached or surpassed.” Tasks can be long-term or repeated (e.g., annual or quarterly). There were at least two coders per task.6 For example, a score of 1 was given for the task “Final accounts on 2nd tranche of loan component prepared and endorsed” with associated completion status “Not undertaken,” and a score of 5 was given for the task “Prepare 2015 1st quarter Progress Report” with associated completion status “2015 1st Quarter report prepared and submitted to CAGD Management and Ministry of Finance.”

Third, as progress reports are self-compiled by bureaucracies, an obvious concern is that low-performing bureaucracies might intentionally manipulate their reports to mask poor performance. To check the validity of progress reports, we matched a subsample of 14% of tasks from progress reports to task audits conducted by external auditors in a separate exercise undertaken by OHCS. Auditors are mostly retired civil servants, overseen by OHCS, and they obtain documentary proof of task completion. For matched tasks, 94% of the completion levels we code are corroborated by the qualitative descriptions of completion in the audits.7
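The validation exercise above amounts to matching self-reported tasks to audited tasks and computing the share whose coded completion agrees with the audit. A minimal sketch of that check, with invented task IDs and completion statuses (none of these values are the paper's data):

```python
import pandas as pd

# Hypothetical coded tasks from progress reports (IDs and statuses invented)
coded = pd.DataFrame({
    "task_id": [1, 2, 3, 4, 5, 6, 7],
    "coded_complete": [True, False, True, True, False, True, False],
})
# Hypothetical externally audited subsample
audits = pd.DataFrame({
    "task_id": [2, 3, 5],
    "audit_complete": [False, True, True],
})

# Match coded tasks to audited tasks on the task identifier
matched = coded.merge(audits, on="task_id", how="inner")
match_share = len(matched) / len(coded)
# Share of matched tasks whose coded completion agrees with the audit
corroboration_rate = (matched["coded_complete"] == matched["audit_complete"]).mean()
print(f"matched {match_share:.0%} of tasks; {corroboration_rate:.0%} corroborated")
```

In the paper's actual exercise, the analogous statistics are the 14% matched subsample and the 94% corroboration rate.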

The types of tasks included in the data reveal the full scope of bureaucratic activity. Figure 1 shows that the most common task type in Ghanaian central government bureaucracies relates to the construction of public infrastructure such as roads, boreholes, and schools (e.g., “Identify bungalows and initiate procurement process,” “Rehabilitation of Bosomkyekye-Ouagadugu [sic]” road, “Construction of Secondary Data Centre at Kumasi”), comprising 24% of tasks. Other common task types are advocacy (e.g., “Sensitize printers and suppliers on the procurement law and packaging,” “Kaizen Forum Organized,” “Talk shows in four rural district markets in the region on the GIPC act held”), comprising 16% of tasks, and monitoring, review, and audit (e.g., “Collate 2nd quarterly reports on the Ministry’s work plan for Management meetings,” “Preparation of 2014 Annual Report,” “Conduct second phase of Housing Audit”), comprising 14%.

Figure 1. Task types across organizations.

Note: The task type classification refers to the primary classification for each output. Each color in a column represents an organization implementing tasks of that type, but the same color across columns may represent multiple organizations. Figures represent all 30 organizations with coded task data.

Figure 1 demonstrates two salient facts that motivate our analysis. First, each task type is implemented by many different organizations and each organization implements multiple task types, allowing us to disentangle the performance of bureaucracies from the types of tasks they undertake. Second, prominent activities on which previous studies of management and organizational performance have been based, such as procurement or public infrastructure development, comprise only a small minority of the full range of tasks undertaken by the public sector. Although there are of course analytical advantages to focusing on a single type of task, we view this evidence as motivation for our focus on the full range of tasks undertaken by the public sector.

In addition to coding task completion, our coders also rated the clarity of the expected level of achievement for the task in the reporting period, that is, the ex ante task clarity. They also rated the clarity of the actual description of what was done, that is, the ex post task clarity. Their rating was based on the clarity of the information on the ex ante target and ex post actual achievement contained in the report itself, which likely corresponds to clarity for managers themselves since these reports are designed primarily as management tools. As with task completion, coders scored these variables on a 1–5 scale. A score of one corresponded to an ex ante target that was “undefined or so vague it is impossible to assess what completion would mean,” a score of three corresponded to a target that “is defined, but with some ambiguity,” and a score of five corresponded to “no ambiguity over the target—it is precisely quantified or described.” An example of a task coded as a 1 for ex ante clarity was “Improved compliance with environmental laws and regulations,” whereas “Organise review meeting on 2014 APR” was coded as a 5. These benchmarks were analogously defined and coded for ex post clarity.

Figure 2 plots the variation in ex ante and ex post task clarity. Although the mean task was coded approximately four out of five on both measures, only 22.7% of tasks were given the highest rating of five for ex ante clarity and 19.1% were rated five for ex post clarity, indicating that the vast majority of tasks were not perfectly clear. As one would expect, ex ante and ex post clarity are positively correlated (ρ = .36), but not perfectly, so there are significant numbers of tasks that were relatively clearly specified ex ante but not ex post (and vice versa).
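The clarity statistics above are simple summary computations on the two 1–5 coder ratings. A toy sketch with invented scores (not the paper's data) shows how the Pearson correlation and the shares at the top rating are obtained:

```python
import numpy as np

# Invented 1-5 clarity ratings for ten hypothetical tasks
ex_ante = np.array([5, 4, 3, 4, 5, 2, 4, 3, 5, 4])
ex_post = np.array([4, 4, 3, 5, 4, 3, 3, 2, 5, 4])

# Pearson correlation between the two clarity measures
rho = np.corrcoef(ex_ante, ex_post)[0, 1]
# Share of tasks at the highest rating on each measure
share_top_ante = (ex_ante == 5).mean()
share_top_post = (ex_post == 5).mean()
```

In the paper's data, the analogous quantities are ρ = .36, 22.7%, and 19.1%.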

Figure 2. Ex ante and ex post task clarity.

Note: Circle size is proportional to the number of tasks that fall within each bin of width 0.5. Red lines indicate mean values for each measure of task clarity.

Coders also rated a range of other task characteristics, such as the routineness of a task, whether it is a single activity or a bundle of interconnected activities, and the level of coordination with external stakeholders it requires, which we use as control variables in our analysis.

Measuring Management

To measure the quality of management practices, we draw on the methodological innovations of Bloom and Van Reenen (2007, 2010) and Bloom, Sadun, and Van Reenen (2012) (BSVR henceforth), which have transformed the empirical study of management practices in organizational economics. BSVR use structured telephone interviews with managers to score management quality on a 1–5 scale across 15 different management practices, incorporating a number of methodological innovations in order to mitigate the recognized biases associated with the common practice of using subjective, self-reported measures of organizational performance (Meier, Laurence, and O’Toole 2012). This methodology was initially developed for measuring management in private firms in the manufacturing sector, but has subsequently been adapted and extended to organizational settings as diverse as hospitals, schools, and NGOs, in both developed and developing countries (Bloom et al. 2014).

We adapted BSVR’s methodology to be administered as an in-person survey, and to cover fourteen practices across six dimensions of management practice that are relevant for the public sector: roles, flexibility, incentives, monitoring, staffing, and targets. Following Rasul and Rogger (2018), we aggregate these into the three subindices of autonomy/discretion, incentives/monitoring, and other (a residual subindex comprising staffing and targets). The autonomy/discretion subindex comprises the topics of roles and flexibility, which cover the extent of discretion bureaucrats have to decide how to go about implementing and achieving tasks, the extent to which they have flexibility to adapt their work to contextual specificities, and their flexibility in being able to generate and adopt new work practices.8 The incentives/monitoring subindex covers the extent to which poor or good performance is rewarded or punished (financially or nonfinancially), and the extent of the use of both individual- and team-level performance metrics. Our residual “other” subindex covers the remaining questions, which relate to staffing and the use of targets in the organization.9 We use this residual subindex as a control only and do not attach a substantive interpretation to it, since it falls outside our theoretical focus. Table 1 summarizes the construction of these management subindices, and supplementary table A1 gives full details of each management-related question, by topic, as well as the scoring grid used by our enumerators for each question.

Table 1.

Summary of Management Subindices

Autonomy/Discretion
Roles:
• The extent to which senior staff make substantive contributions to the policy formulation and implementation process.
• The extent to which senior staff are given discretion to carry out assignments in their daily work.
Flexibility:
• The extent to which the division makes efforts to adjust to the specific needs and peculiarities of communities, clients, or other stakeholders.
• The extent to which the division is flexible in terms of responding to new and improved work practices.

Incentives/Monitoring
Performance Incentives:
• The extent to which underperformance would be tolerated, given past experience.
• The extent to which staff are disciplined for breaking the rules of the civil service.
• The extent to which the performance of individual officers is tracked using performance targets or indicators and rewarded (financially or nonfinancially).
Monitoring:
• The extent to which the division tracks how well it is performing and delivering services using indicators.

Other
Staffing:
• The extent to which efforts are made to attract talented people to the division and retain them.
• The extent to which people are promoted faster based on their performance.
• The extent to which the burden of achieving targets is evenly distributed across different officers.
• The extent to which senior staff try to use the right staff for the right job.
Targeting:
• The extent to which the division has a clear set of targets derived from the organization’s goals and objectives that affect individuals’ work schedules.
• The extent to which staff know what their individual roles and responsibilities are in achieving the organization’s goals when they arrive at work each day.

Note: Interviews were conducted using open questions to initiate discussion on each practice, followed by probing follow-up and requests for examples, after which the interviewer would score each practice on a 1–5 scale. See text for further details of the interview method and supplementary appendix A for full details of practices and scoring grid.


Following BSVR, for each question enumerators would first ask what practices were used in an open-ended way, then probe responses and ask for examples in order to ascertain which practices are actually in use (much as a qualitative interview would), as opposed to simply asking for respondents’ perceptions of management quality. Interviewers would then use this information to score each practice on a continuous 1–5 scale, where 1 represents nonuse or inconsistent/incoherent use of that practice within the organization and 5 represents strong, consistent, and coherent use of that practice. To further anchor the scores and provide comparability across organizations, the scoring grid for each practice is benchmarked to actual descriptions of the practices in use. This improves on the more commonly used Likert-style measures of perceptions of management practices, which are vulnerable to differential anchoring across respondents and organizations.

To undertake these in-person, interview-style surveys, we collaborated closely with Ghana’s OHCS. We first recruited survey team leaders from the private sector, with an emphasis on previous experience of survey work in Ghana. We worked closely with the team leaders to give them an appreciation and understanding of the practices and protocols of the public service. OHCS then seconded a group of 34 junior public officials with preexisting experience of public sector work to act as enumerators. They worked on rotation across the survey period, so that on any given day approximately two thirds were conducting interviews. The Head of Service ensured their commitment to the survey process by stating that the research team would monitor interviewer performance and that these assessments would influence future posting opportunities. We trained the team leaders and public officials jointly, including intensive practice interview sessions, before undertaking the first few interviews together in order to harmonize and calibrate assessments.

Figure 3 summarizes the participant flow and study sample. Within the Civil Service as a whole, we first excluded officers working outside of each organization’s headquarters staff in regional or district offices (officers working in office annexes located in physically separate buildings from the main headquarters but administratively part of the main headquarters were included). We then excluded support staff such as cleaners, drivers, secretaries, and security guards who are classified as “subprofessional” grades by the Civil Service. To compile the list of eligible professional-grade civil servants based in headquarters offices, our survey team worked with each organization’s human resources directorate. This yielded a total of 3,039 staff eligible to be interviewed across 45 organizations. Over the period from August to November 2015,10 the enumerators interviewed 2,971 of these civil servants. This constitutes 98% of all eligible staff in these organizations, with the remainder mostly having been out of the office during the survey period. Interviews were conducted in person, but were double-blind in the sense that we ensured that interviewers had never worked in the organizations in which they were interviewing and did not know their interviewees, and likewise interviewees did not know their interviewers.

Figure 3. Study population and sample.

Note: Figures denoted with an asterisk (*) are approximations because existing administrative data (which was designed for other purposes) did not enable precise calculations. Figures for professional-grade civil servants (study population) and number interviewed (survey sample) are exact figures.

We convert each management practice score into a z-score and construct each subindex as the unweighted mean of the underlying z-scores, renormalized so that the subindices are continuous variables with mean zero and variance one by construction. We use the responses of the most senior bureaucrat in each division, since these officers have an awareness of management practices at senior management level as well as their day-to-day implementation.11 Greater autonomy/discretion for staff within an organization thus corresponds to higher scores on that subindex, and the provision of stronger incentives/monitoring corresponds to higher scores on the incentives/monitoring subindex. Of course, although these scores do correspond to real variation in the qualitative management practices being used, and our quantitative coding represents commonly accepted notions of the quality of each practice, we do not presume that higher scores always correspond to “good management” in the sense that they necessarily improve performance. Rather, these relationships are what we try to estimate empirically in our main results section below.
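The subindex construction described above can be sketched in a few lines, under the assumption that each practice score is standardized across organizations and the subindex is the renormalized unweighted mean of those z-scores. The organizations, practice names, and scores below are hypothetical:

```python
import pandas as pd

# Hypothetical 1-5 practice scores for five organizations
scores = pd.DataFrame({
    "org": ["A", "B", "C", "D", "E"],
    "roles": [3.0, 4.0, 2.0, 5.0, 3.5],
    "flexibility": [2.5, 4.5, 3.0, 4.0, 3.5],
})

practices = ["roles", "flexibility"]
# Standardize each practice across organizations
z = scores[practices].apply(lambda c: (c - c.mean()) / c.std(ddof=0))
# Unweighted mean of the practice z-scores, then renormalize so the
# subindex has mean zero and variance one by construction
raw = z.mean(axis=1)
scores["autonomy_discretion"] = (raw - raw.mean()) / raw.std(ddof=0)
```

The final renormalization is what delivers the mean-zero, variance-one property of the subindices mentioned in the text.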

Variation in Task Completion and Management

These data sets allow us to provide rich new descriptive evidence on within-country, cross-organization variation in the completion of tasks, as well as in the management practices with which organizations are run. Because these descriptives are both novel and noteworthy, we first present evidence of this variation in each variable before going on to examine the relationship between them.

Figure 4 shows that there is substantial variation in task completion across civil service organizations, whether measured in the proportion of tasks started or finished or in the average score on our 1–5 scale. To quantify this variation, we note that the 75th percentile organization has an average completion rate 22% higher than the 25th percentile organization. This task-based measure provides powerful evidence of the variation in actual bureaucratic performance across bureaucracies within a government, thus building on and extending the existing input-, survey-, and perception-based measures of this variation (Bersch, Praça, and Taylor 2016; Gingerich 2013; Ingraham, Joyce, and Kneedler Donahue 2003).
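The interquartile comparison quoted above is the ratio of the 75th- to the 25th-percentile organization's average completion rate, minus one. A sketch with invented organization-level completion rates (not the paper's data):

```python
import numpy as np

# Invented average completion rates for eight hypothetical organizations
org_completion = np.array([0.45, 0.52, 0.58, 0.61, 0.66, 0.70, 0.74, 0.80])

p25, p75 = np.percentile(org_completion, [25, 75])
# Fractional gap; a value of 0.22 would read as "22% higher"
gap = p75 / p25 - 1
print(f"P75 is {gap:.0%} higher than P25")
```

In the paper's data, the analogous statistic is the 22% gap between the 75th- and 25th-percentile organizations.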

Figure 4. Task completion by organization.

Note: Multiple coders assessed each task; we take the minimum assessment of initiation and the maximum assessment of completion, so it is possible for the proportion started to be lower than the proportion completed (as it is for one organization). Completion status is a continuous 1–5 score for each task, here rescaled to 0–1.

Figure 5 shows the average completion status of different task types. Average completion rates are clustered between 2.97 (Training) and 3.45 (Personnel Management), and although there are some statistically significant differences across task types, this variation is much less dramatic than the range of variation across organizations shown in figure 4. This perhaps reflects organizations aiming for a broadly similar level of ambition across task types when setting tasks. Nonetheless, we control for task type in our later analysis to ensure that our results are not driven by the variation in task incidence across organizations.

Figure 5. Task completion by task type.

Note: The task type classification refers to the primary classification for each output.

Figure 6 shows that, as with bureaucratic performance on task completion, there is substantial variation across organizations in the management practices bureaucrats are subject to, along both of our subindices of management practices. Because many of the underlying practices are subject to administrative rules or guidelines aimed at producing uniform practices across organizations, the existence of this variation demonstrates that there is substantial de facto deviation from these de jure procedures within the Civil Service. To reiterate, this variation occurs despite the fact that all organizations share the same colonial and postcolonial structures, are governed by the same civil service laws and regulations, are overseen by the same supervising authorities, are assigned new hires from the same pool of potential workers, and are located proximately to each other in Accra. Although a mainly qualitative literature has previously documented variation in state capacity and performance within states in developing countries, our findings represent large-scale evidence that such within-government variation in management and performance is systematic and does not consist merely of a handful of problem organizations or “pockets of effectiveness” (Leonard 2010; McDonnell 2017; Tendler 1997). Although such variation is often cast in a negative light in bureaucratic settings, it can also derive from organizations going above and beyond formal procedures by implementing supporting informal practices or by adapting formal rules to fit their circumstances.

Figure 6. Variation in management practices.

Note: Organization z-scores presented for organizations with task completion data available.

A second important feature of organizational management practices illustrated by figure 6 is that organizations’ scores on the two subindices are strongly positively correlated with each other (ρ = .59). Organizations that give their staff flexibility in day-to-day operations also use stronger monitoring and incentives, on average. Although autonomy/discretion and incentives/monitoring as abstract approaches derive from different perspectives on public sector management, in practice organizations that score highly on one also score highly on the other (at least in the Ghanaian context).12 This empirical finding provides much-needed nuance to the debate between top-down and bottom-up approaches to management discussed in the Theory: Management, Task Clarity, and Performance section. These are not alternative approaches in the sense that organizations face a binary either-or choice between them; rather, each organization combines the two approaches to one degree or another. The pertinent question is thus not which style to select, but in what proportions or ways the two should be combined.

Finally, figure 6 also illustrates a key feature of the data, which enables our research design to answer this question: although there is a positive overall correlation between the autonomy/discretion and incentives/monitoring subindices, there is nonetheless considerable variation in the relative balance between them. That is, some organizations lean relatively more heavily toward the use of autonomy/discretion, whereas others lean relatively more heavily toward incentives/monitoring. In the next section, we explore how this variation in approaches to management is related to the variation we observe in task completion.

Empirical Method and Results

To study the relationship between task completion and management, we take as our unit of observation task i of type j in organization n. We estimate the following OLS specification,

y_ijn = ρ1 Autonomy_n + ρ2 IncentivesMonitoring_n + ρ3 Other_n + δ′PC_ijn + γ′OC_n + L_j + S_n + ε_ijn, (1)

where y_ijn is a measure of task completion: either a binary indicator of whether the task is fully completed (Columns 1–5, our preferred measure) or a continuous measure of the task completion rate on the interval [0,1] (Column 6, shown for robustness). Management practices are measured using the autonomy/discretion, incentives/monitoring, and other subindices, and PC_ijn and OC_n are task and organizational controls.13 As figure 1 highlighted, many organizations implement the same task type j, so we can control for task-type fixed effects L_j in equation (1), as well as fixed effects S_n for the broad sector the implementing organization operates in.14

The partial correlations of interest are ρ1 and ρ2, the associations between task completion and a 1 SD change in management practices along the respective margins of autonomy/discretion and incentives/monitoring. If ρ2 > ρ1, this would be evidence in support of H1a, whereas if ρ1 > ρ2, it would be evidence in support of H1b. Although our specification controls account for a wide array of potential confounders related to the organization, task type, and data collection, it is of course possible that other variables are correlated with both management practices and task completion, so we emphasize that these estimates are partial correlations; we discuss interpretation and the potential for future causal research in the concluding section. To account for unobserved shocks, we cluster standard errors (SE) by organization n, the same level of variation as management practices.
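An estimation of this kind can be sketched with standard tools: OLS of a binary completion indicator on organization-level management subindices, task-type fixed effects, and organization-clustered standard errors. The sketch below uses simulated data and our own variable names (it is not the paper's code), with statsmodels supplying the cluster-robust covariance:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulate 30 organizations with 40 tasks each; management subindices
# vary only at the organization level, as in the paper's design
rng = np.random.default_rng(0)
n_orgs, tasks_per_org = 30, 40
org = np.repeat(np.arange(n_orgs), tasks_per_org)
autonomy = rng.normal(size=n_orgs)[org]      # constant within organization
incentives = rng.normal(size=n_orgs)[org]
other = rng.normal(size=n_orgs)[org]
task_type = rng.integers(0, 5, size=org.size)
# Binary completion with a small positive autonomy effect built in
y = (rng.uniform(size=org.size) <
     0.5 + 0.05 * autonomy - 0.03 * incentives).astype(float)

df = pd.DataFrame(dict(y=y, autonomy=autonomy, incentives=incentives,
                       other=other, task_type=task_type, org=org))
# OLS with task-type fixed effects; SEs clustered by organization
res = smf.ols("y ~ autonomy + incentives + other + C(task_type)",
              data=df).fit(cov_type="cluster", cov_kwds={"groups": df["org"]})

# Wald test of coefficient equality, as in the bottom row of Table 2
wald = res.t_test("autonomy - incentives = 0")
```

Clustering by organization matters here because the management regressors are constant within organizations, so task-level errors within an organization cannot be treated as independent.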

Table 2 presents our main results.15 Before moving to the main specification in equation (1), Column 1 shows that there is a positive relationship (p = .034) between the overall management z-score (which combines all three subindices) and task completion. This is evidence that higher overall management scores are indeed associated with higher task completion, even after controlling for an extensive range of organizational, task, and survey noise-related variables.

Table 2.

Management and Task Completion

Dependent variable: Task Completion [Binary] in Columns (1)–(5); Completion Rate [0–1 Continuous] in Column (6).

                                     (1)      (2)      (3)      (4)      (5)      (6)
Management–Overall                  0.059
                                   (0.027)
                                   [.034]
Management–Autonomy/                        0.080                      0.117    0.052
 Discretion                                (0.031)                    (0.066)  (0.044)
                                           [.013]                     [.086]   [.243]
Management–Incentives/                               0.040            −0.040   −0.076
 Monitoring                                         (0.034)           (0.062)  (0.040)
                                                    [.259]            [.521]   [.070]
Management–Other                                              0.047   −0.012    0.020
                                                             (0.026)  (0.059)  (0.038)
                                                             [.073]   [.834]   [.600]
Test: autonomy/discretion =
 incentives/monitoring (p-value)                                       0.152    0.084
Noise controls                       Yes      Yes      Yes      Yes     Yes      Yes
Organizational controls              Yes      Yes      Yes      Yes     Yes      Yes
Task controls                        Yes      Yes      Yes      Yes     Yes      Yes
Fixed effects                       Task Type and Sector in all columns
Observations (clusters)             3,620 (30) in all columns

Note: SE are in parentheses with exact p-values in brackets and are clustered by organization throughout. All columns report OLS estimates. p-Values reported for Wald test of coefficient equality. See text for details of dependent variables, controls, and fixed effects. Figures are rounded to three decimal places.

Table 2.

Management and Task Completion

Dependent Variable                   (1)          (2)          (3)          (4)          (5)          (6)
                                     Task         Task         Task         Task         Task         Completion Rate
                                     Completion   Completion   Completion   Completion   Completion   [0–1 Continuous]
                                     [Binary]     [Binary]     [Binary]     [Binary]     [Binary]
Management–Overall                   0.059
                                     (0.027)
                                     [.034]
Management–Autonomy/Discretion                    0.080                                  0.117        0.052
                                                  (0.031)                                (0.066)      (0.044)
                                                  [.013]                                 [.086]       [.243]
Management–Incentives/Monitoring                               0.040                     −0.040       −0.076
                                                               (0.034)                   (0.062)      (0.040)
                                                               [.259]                    [.521]       [.070]
Management–Other                                                            0.047        −0.012       0.020
                                                                            (0.026)      (0.059)      (0.038)
                                                                            [.073]       [.834]       [.600]
Test: autonomy/discretion =
  incentives/monitoring (p-value)                                                        0.152        0.084
Noise controls                       Yes          Yes          Yes          Yes          Yes          Yes
Organizational controls              Yes          Yes          Yes          Yes          Yes          Yes
Task controls                        Yes          Yes          Yes          Yes          Yes          Yes
Fixed effects                        Task Type and Sector in all columns
Observations (clusters)              3,620 (30) in all columns

Note: SE are in parentheses with exact p-values in brackets and are clustered by organization throughout. All columns report OLS estimates. p-Values reported for the Wald test of coefficient equality. See text for details of dependent variables, controls, and fixed effects. Figures are rounded to three decimal places.

To illustrate the omitted variable bias that arises when one set of management practices is analyzed in isolation, Column 2 estimates the relationship between task completion and autonomy/discretion without controlling for organizations’ other management practices, Column 3 does the same for incentives/monitoring, and Column 4 for our residual subindex of other practices. All subindices are positively associated with task completion, albeit with varying levels of statistical significance and most strongly for autonomy/discretion. Column 5 then presents our core specification from equation (1), using binary task completion as the dependent variable, and Column 6 presents the same specification with an alternative continuous measure of task completion.

A consistent set of findings emerges across our main specifications in Columns 5–6: (1) management practices providing bureaucrats more autonomy and discretion are positively correlated with the likelihood of task completion (r1 > 0), and (2) management practices related to the provision of incentives or monitoring are negatively correlated with the likelihood of task completion (r2 < 0). These estimates imply that a 1 SD increase in the autonomy/discretion subindex is associated with an 11.7 percentage point increase in the likelihood that a task is fully completed, whereas a 1 SD increase in incentives/monitoring is associated with a 4.0 percentage point decrease. This evidence is consistent with H1b and thus inconsistent with H1a.

These magnitudes are substantively important: recall that only 34% of tasks in our sample are fully completed. The different point estimates between Columns 2–4 and Columns 5–6 illustrate that examining autonomy/discretion (incentives/monitoring) in isolation biases the point estimates downward (upward), owing to the underlying positive correlation between these measures. For our theoretical question, the statistical significance of the difference between the two coefficients is more relevant than each coefficient’s individual difference from zero; a Wald test of coefficient equality varies in significance depending on the dependent variable. This is suggestive evidence that these two dimensions of management practice are differentially associated with task completion.
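The direction of this bias follows the standard omitted-variable-bias formula and can be illustrated with a small simulation. The sketch below is purely illustrative: the correlation, coefficients, and noise level are assumptions loosely mimicking Column 5, not our actual estimation code (which includes the full set of controls and fixed effects).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3620  # same order of magnitude as our task-level sample

# Two positively correlated, standardized management subindices (assumed corr = 0.5).
autonomy = rng.standard_normal(n)
incentives = 0.5 * autonomy + np.sqrt(1 - 0.5**2) * rng.standard_normal(n)

# Hypothetical "true" partial associations with the signs found in Column 5.
y = 0.117 * autonomy - 0.040 * incentives + 0.5 * rng.standard_normal(n)

def ols(y, X):
    """Return OLS coefficients with an intercept prepended."""
    X = np.column_stack([np.ones(len(y)), np.asarray(X)])
    return np.linalg.lstsq(X, y, rcond=None)[0]

b_short = ols(y, autonomy)[1]                                # autonomy alone
b_long = ols(y, np.column_stack([autonomy, incentives]))[1]  # both subindices

# The short regression absorbs beta_incentives * corr(autonomy, incentives) < 0,
# so examining autonomy in isolation biases its coefficient downward.
print(f"autonomy alone: {b_short:.3f}; controlling for incentives: {b_long:.3f}")
```

Under these assumed parameters the short regression recovers roughly 0.117 − 0.040 × 0.5 ≈ 0.097 rather than 0.117, reproducing the downward (upward) bias pattern between Columns 2–4 and Columns 5–6.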

To examine how these suggestive findings vary with the ex ante and ex post clarity of the task (H2), Table 3 re-estimates equation (1) on the subsamples of tasks that fall below and above the median on each measure of task clarity. Columns 1 and 2 show that task completion’s positive relationship with autonomy/discretion and negative relationship with incentives/monitoring are far stronger for tasks with below-median ex ante clarity than for those with above-median ex ante clarity. The difference between r1 and r2 is statistically significant at less than the 1% level for tasks below the median on ex ante clarity, but there is no substantive or significant difference in coefficients for above-median ex ante clarity tasks. Similarly, for tasks with above-median ex post clarity (Columns 3–4), a 1 SD increase in autonomy/discretion (incentives/monitoring) is associated with a 33.7 percentage point increase (1.2 percentage point decrease) in task completion. Because only 34% of tasks are fully completed, this implies that for tasks with high ex post clarity a 1 SD increase in autonomy/discretion is associated with a near-doubling of the likelihood of completion. For tasks below the median on ex post clarity, the coefficient on autonomy/discretion remains positive and statistically significantly different from both zero and the coefficient on incentives/monitoring, but the difference is far smaller. This pattern of results is consistent with H2.
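The mechanics of this split-sample exercise can be sketched as follows. Everything here is a hypothetical data-generating process chosen only to mimic H2’s pattern (a larger autonomy coefficient when clarity is high); the variable names and magnitudes are assumptions, not our data.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3620

# Hypothetical standardized autonomy subindex and a continuous clarity measure.
autonomy = rng.standard_normal(n)
clarity = rng.uniform(0, 1, n)

# Assumed DGP: autonomy "matters" more for tasks with above-median clarity.
high = clarity > np.median(clarity)
y = (0.1 + 0.3 * high) * autonomy + rng.standard_normal(n)

def slope(y, x):
    """Bivariate OLS slope (intercept included)."""
    X = np.column_stack([np.ones(len(y)), x])
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

# Re-estimate separately on each subsample; ties at the median fall in the
# "below or equal" group, as in the note to Table 3.
b_below = slope(y[~high], autonomy[~high])
b_above = slope(y[high], autonomy[high])
print(f"below/equal median: {b_below:.2f}; above median: {b_above:.2f}")
```

A Wald test of coefficient equality across subsamples (or equivalently an interaction term in the pooled sample) then formalizes the comparison reported in the “Test” row of Table 3.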

Table 3.

Management Practices and Task Clarity

                                     Ex Ante Task Clarity         Ex Post Task Clarity
                                     (1)           (2)            (3)           (4)
                                     Below Median  Above Median   Below Median  Above Median
Management–Autonomy/Discretion       0.207         0.019          0.126         0.337
                                     (0.059)       (0.107)        (0.052)       (0.077)
                                     [.002]        [.863]         [.021]        [<.001]
Management–Incentives/Monitoring     −0.138        −0.020         −0.023        −0.012
                                     (0.051)       (0.097)        (0.047)       (0.104)
                                     [.011]        [.834]         [.623]        [.907]
Management–Other                     −0.007        0.027          −0.008        −0.227
                                     (0.060)       (0.082)        (0.050)       (0.077)
                                     [.907]        [.742]         [.879]        [.006]
Test: autonomy/discretion =
  incentives/monitoring (p-value)    0.000         0.816          0.043         0.009
Noise and organizational controls    Yes           Yes            Yes           Yes
Task controls                        Yes           Yes            Yes           Yes
Fixed effects                        Task type and sector in all columns
Observations (clusters)              1,851 (29)    1,769 (29)     2,177 (29)    1,443 (27)

Note: SE are in parentheses with p-values in brackets and are clustered by organization throughout. All columns report OLS estimates. The dependent variable in all columns is a dummy variable that takes the value 1 if the task is completed and 0 otherwise. The sample for Columns 1–2 (3–4) is split according to whether target clarity (actual achievement clarity) is above the median versus below or equal to the median. p-Values reported for the Wald test of coefficient equality. See text for details of controls and fixed effects. Figures are rounded to three decimal places.

While our operationalization of task clarity does not correspond precisely to Wilson’s (1989) discussion of output and outcome observability, our findings are nonetheless consistent with the distinctions Wilson makes between procedural and craft agencies. Procedural agencies have tasks that are possible to specify ex ante and whose execution can be observed, but that are hard to measure ex post; they thus tend to be managed in rigid, rule-bound ways that give agents minimal discretion. Craft agencies have tasks that are difficult to specify or observe ex ante but whose results are relatively easy to observe ex post, so they tend to give their staff discretion in the knowledge that abuses will be relatively easy to detect. Although we focus on tasks rather than agencies as a whole, we provide empirical support for this broad idea: incentives/monitoring-heavy management approaches are relatively more effective when task clarity is high ex ante and low ex post, and autonomy/discretion-heavy management approaches are relatively more effective when task clarity is low ex ante and high ex post.

The supplementary appendix provides a battery of robustness checks on our estimates. Supplementary table A2 shows the results of Table 2 to be broadly robust to alternative samples, exclusion of outlier organizations, estimation methods, fixed effects specifications, and codings of completion rates. Supplementary table A3 does likewise for Table 3. Supplementary tables A4 and A5 further show the results to be robust to alternative clusterings of the SE, and supplementary table A6 examines alternative codings of completion rates.

One nuance in interpreting this analysis is that we estimate partial rather than absolute correlations between management practices and task completion. Recall that the coefficient on each subindex represents a partial correlation of that set of practices with task completion, conditional on the other practices used in the organization (as well as our other controls and fixed effects). Thus, although we find that management practices related to incentives and monitoring are negatively related to task completion conditional on the level of autonomy and discretion in the same organization, this does not imply that all incentives and monitoring are bad for task completion. Rather, it implies that organizations appear to be inefficiently tilting what we describe as their portfolio of management practices toward incentives and monitoring at the expense of autonomy and discretion, particularly for tasks with low ex ante clarity and high ex post clarity.

A second point relevant for interpreting these findings is that our estimates are based on measurement of de facto management practices in an organization. Although management in an organization can be described qualitatively both by the style of management (what type of practices the organization is trying to implement) and by the quality of implementation of those practices, our measure of management combines both into a single dimension for each practice. We thus estimate the relationship between task completion and the management practices organizations are actually using, rather than the practices they are trying to use or an idealized version of them. Although this limits our study’s ability to speak to the potential efficacy of these practices when implemented “correctly,” the relationship between management practices and task completion under real-world implementation conditions is (in our view) the more important question.

Conclusion

Our study investigates how two prominent approaches to public sector management, top-down control through monitoring and incentives versus reliance on bureaucratic professionalism through autonomy and discretion, are related to bureaucratic task completion. We find positive conditional associations between task completion and management practices related to autonomy and discretion, but negative conditional associations with management practices related to incentives and monitoring. This finding provides new evidence in support of the potential effectiveness of “bottom-up” approaches to management in the running debate about their relative merits vis-à-vis more “top-down,” carrot-and-stick approaches to public sector management (Miller 2000; Miller and Whitford 2016). It is consistent with recent empirical evidence from a range of other contexts that finds positive effects from management strategies that promote bureaucratic autonomy and discretion (e.g., Bandiera et al. 2019; Honig 2018; Rasul and Rogger 2018).

Consistent with our theoretical expectations, these relationships vary strongly with task clarity: the positive relationship between task completion and autonomy/discretion (relative to incentives/monitoring) is far stronger for tasks with low ex ante clarity and high ex post clarity. By refining and operationalizing Wilson’s (1989) classic output–outcome observability typology in terms of bureaucratic task clarity, we provide empirical support for his key theoretical insight and for the intuition of many practitioners: although top-down, control-oriented approaches to management may be appropriate for tasks for which it is clear in advance what needs to be done, for tasks that are difficult to fully specify in advance (but for which performance can be measured ex post) managers need to find ways to allow and support the exercise of bureaucratic discretion.

These findings are likely to be relevant to a range of other countries. Ghana ranks just below the global median in terms of government effectiveness, so empirical findings from it are potentially more relevant to the wide range of low- and middle-income countries that have been relatively understudied in public administration than findings from high-income countries (which represent the extreme end of the global distribution of wealth, human resource availability, and government effectiveness). For other low- and middle-income countries, our results push back against the widespread assumption that bureaucratic discretion should be minimized in such settings, and point to the ways in which bureaucratic professionalism can be an effective management strategy even in challenging contexts. If anything, we would expect this to be even more true in high-income contexts, where such professional norms are even more strongly entrenched. But although the average effectiveness of top-down versus bottom-up management approaches might vary across contexts depending on these underlying conditions, we see an even wider scope of relevance for our theoretical argument about how the relative effectiveness of these approaches varies with the ex ante and ex post clarity of the task. Although further empirical studies replicating this analysis in high-income countries would be welcome, the fact that the three closest theoretical predecessors of our study (Chun and Rainey 2005; Romzek 2000; Wilson 1989) all focus on high-income countries hints at the broad relevance of this mechanism.

Two key methodological features of our study are that we measure management across a broad spectrum of management practices, rather than just a specific practice or instance of a policy, and that we measure task completion across the full range of bureaucratic activity and across many organizations. Not only is this breadth valuable in itself and methodologically innovative, but we show that it matters for our results: because management practices are correlated with one another, estimating the impact of one management practice on task completion without controlling for the others leads to significant omitted variable bias. Similarly, our measurement of de facto management practices across the full civil service allows us to analyze these practices as they are actually implemented, as opposed to what managers say they are doing or how these practices might work under closely controlled experimental conditions. Although these features together make our data unique, the underlying measurement methods are widely applicable and do not rely on idiosyncrasies of the Ghanaian context (although varying institutional features may require different analytical approaches). In particular, our task completion data were contained in simple quarterly and annual reports that were designed and used for routine monitoring purposes, and similar reports are produced by public sector organizations worldwide. Our findings demonstrate that it is possible to clean and code such reports into usable data for analysis. Although care would need to be taken to validate such self-reported data in each case (as we did in Ghana), this type of organization-level administrative data seems relatively underexploited at present (with exceptions such as Ingraham, Joyce, and Kneedler Donahue 2003).

Although we have demonstrated the robustness of our findings against an extensive range of organizational, individual, and task characteristics to rule out many alternative explanations, and shown that the mechanisms driving these results are consistent with theoretical predictions, it is nonetheless possible that additional unobserved factors (partially) explain the observed associations. In this sense, we view this study as an important complement to more narrowly focused (pseudo-)experimental studies (e.g., Andersen and Moynihan 2016; Banerjee et al. 2014), as well as to more nuanced qualitative research, in advancing our knowledge of how management practices are related to task completion in the public sector. These findings could also serve as motivation and an evidence base for future experimental interventions conducted in conjunction with governments to improve and evaluate the use of autonomy- and discretion-related management practices.

Finally, our study also builds on the literature examining cross-country differences in bureaucratic effectiveness by pushing forward the frontier in understanding within-country variation in effectiveness (Andersen and Mortensen 2010; Bersch, Praça, and Taylor 2016; Gingerich 2013; Leonard 2010; McDonnell 2017; Meier, Laurence, and O’Toole 2002; Nicholson-Crotty and O’Toole 2004; Owusu 2006). In its theoretical conceptualization of these issues, its innovative measurement methodology, and its striking empirical findings, we hope that this study can contribute to advancing our understanding of the causes and consequences of within-country variation in public sector management and effectiveness.

Footnotes

1

Worldwide Governance Indicators 2015 update, available at https://info.worldbank.org/governance/wgi/Home/Reports. See Kaufmann et al. (2010) for methodological details.

2

Ghana distinguishes between the Civil Service and the broader Public Service, which includes dozens of autonomous agencies under the supervision, but not direct control of their sector ministries, as well as frontline implementers such as the Police Service, Education Service, and so forth. Our sample is restricted to the headquarters offices of Civil Service organizations.

3

In Ghana, grades of technical and administrative bureaucrats are officially referred to as “senior” officers, whereas grades covering cleaners, drivers, and so forth are referred to as “junior” officers, regardless of their tenure or seniority. Although we restrict our sample to “senior” officers in the formal terminology, throughout we use the terms senior and nonsenior in their more colloquial sense to refer to hierarchical relationships within the professional grades.

4

Where an organization produced multiple reports during this time (e.g., a midyear report and an annual report), we selected the latest report produced during the year to include in our sample.

5

Unfortunately, the reports do not distinguish between activities and outputs in the theoretical sense in which the terms are used by some logical frameworks or theories of change, instead just reporting them as lists of tasks the organization has to undertake.

6

Given the tendency of averaging scores across coders to reduce variation, for our core analysis we use the maximum and minimum scores to code whether tasks are fully complete or never initiated, respectively. Supplementary figure A6 shows that alternative approaches to aggregating scores yield similar results.
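As a minimal sketch, this aggregation rule can be written in a few lines; the 0–5 scale and the scores below are hypothetical examples, not our data.

```python
# Hypothetical scores from three independent coders (columns) for three tasks
# (rows), on an assumed 0-5 completion scale.
scores = [
    [5, 5, 4],  # task most coders judge fully complete
    [0, 0, 1],  # task most coders judge never initiated
    [3, 2, 4],  # partially completed task
]

# Use the MAXIMUM score to flag "fully complete" and the MINIMUM to flag
# "never initiated", rather than averaging across coders (which would
# compress the variation toward the middle of the scale).
fully_complete = [max(row) == 5 for row in scores]
never_initiated = [min(row) == 0 for row in scores]

print(fully_complete)   # [True, False, False]
print(never_initiated)  # [False, True, False]
```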

7

Among the handful of non-corroborated tasks, the lowest “true” completion rate was 3 out of 5, indicating that the rare instances of misreporting were relatively minor.

8

Note that autonomy in this sense refers to autonomy within an organization (of individuals and teams from senior management) rather than the autonomy of the organization (from political principals), and that all organizations in our sample have the same degree of legal and procedural autonomy in the organizational sense.

9

Although the use of targets is often associated with top-down management approaches that are also intensive in incentives and monitoring, target setting is potentially also an important element of management through autonomy and discretion. We therefore leave it in our residual category.

10

This survey period was a time when the Civil Service was operating normally, so we do not expect that there was anything peculiar about this time period that would have influenced our results. The country’s economy was stable, there were no major structural reforms of the Civil Service ongoing or planned, and elections were not scheduled until December 2016.

11

The median (mean) number of senior managers per organization is 13 (20).

12

Comparing raw management scores, we find that the 75th percentile organization has an 8%–9% higher management score than the 25th percentile organization across both subindices as well as in overall management scores. Although this variation is smaller in numerical terms than we observe for task completion, quantitative comparison of these differences is difficult because management does not have a natural scale.

13

Task controls comprise task-level indicators for whether the task is regularly implemented by the organization or a one-off, whether the task is a bundle of interconnected tasks, and whether the division has to coordinate with actors external to government to implement the task. Organizational controls comprise a count of the number of interviews undertaken (a close approximation of the total number of employees) and organization-level controls for the share of the workforce with degrees, the share of the workforce with postgraduate qualifications, and the span of control. Following BSVR, we condition on “noise” controls related to the management surveys. Noise controls are averages of: indicators of the seniority, gender, and tenure of all respondents; the time of day the interview was conducted; and the reliability of the information given, as coded by the interviewer.

14

For the purpose of estimation, we aggregate some similar task types based on their primary classification, so that the task fixed effects included are as follows: Advocacy and Policy Development, Financial and Budget Management, ICT Management and Research, Monitoring/Training/Personnel Management, Physical Infrastructure, Permits and Regulation, and Procurement. Sector fixed effects relate to whether the task is in the administration, environment, finance, infrastructure, security/diplomacy/justice, or social sector.

15

We do not report the R2 of our regressions because this statistic does not have its usual interpretation under the linear probability model with binary dependent variables.

Acknowledgments

We gratefully acknowledge financial support from the International Growth Centre (1-VCS-VGHA-VXXXX-33301) and the World Bank’s i2i trust fund financed by UKAID. We thank Dan Honig, Julien Labonne, Anandi Mani, Don Moynihan, Arup Nath, Simon Quinn, Raffaella Sadun, Itay Saporta-Eksten, Chris Woodruff, three anonymous referees, and seminar participants at Oxford, Ohio State, PMRC, and the EEA Meetings for valuable comments. Jane Adjabeng, Mohammed Abubakari, Julius Adu-Ntim, Temilola Akinrinade, Sandra Boatemaa, Eugene Ekyem, Paula Fiorini, Margherita Fornasari, Jacob Hagan-Mensah, Allan Kasapa, Kpadam Opuni, Owura Simprii-Duncan, and Liah Yecalo-Tecle provided excellent research assistance, and we are grateful to Ghana’s Head of Civil Service, Nana Agyekum-Dwamena, members of the project steering committee, and the dozens of Civil Servants who dedicated their time and energy to the project. This study was approved by UCL’s Research Ethics Committee. All errors remain our own.

Data Availability

The data underlying this article cannot be shared publicly due to the conditions for collecting it agreed with Ghana’s Civil Service.

References

Andersen
,
S. C.
, and
P. B.
Mortensen.
2010
.
Policy stability and organizational performance: Is there a relationship?
Journal of Public Administration Research and Theory
20
(
1
):
1
22
.

Andersen
,
S. C.
, and
D. P.
Moynihan
.
2016
.
Bureaucratic investments in expertise: Evidence from a randomized controlled field trial
.
Journal of Politics
78
(
4
):
1032
44
.

Bandiera
,
O.
,
M. C.
Best
,
A. Q.
Khan
, and
A.
Prat
.
2019
.
The allocation of authority in organizations: A field experiment with bureaucrats
.
Mimeo
, December 9.

Banerjee
,
A.
,
R.
Chattopadhyay
,
E.
Duflo
,
D.
Keniston
,
and
N.
Singh
.
2014
.
Improving police performance in Rajasthan, India: Experimental evidence on incentives, managerial autonomy and training
. NBER Working Paper 17912. https://www.nber.org/papers/w17912.

Bersch
,
K.
,
S.
Praça
, and
M.
Taylor
.
2016
.
State capacity, bureaucratic politicization, and corruption in the Brazilian state
.
Governance
30
(
1
):
105
24
.

Bevan
,
G.
, and
C.
Hood
.
2006
.
What’s measured is what matters: Targets and gaming in the English Health Care System
.
Public Administration
84
(
3
):
517
38
.

Bloom
,
N.
,
R.
Lemos
,
R.
Sadun
,
D.
Scur
, and
J.
Van Reenen
.
2014
.
The new empirical economics of management.
NBER Working Paper 20102, May.

Bloom
,
N.
,
R.
Sadun
, and
J.
Van Reenen.
2012
.
The organization of firms across countries
.
Quarterly Journal of Economics
127
:
1663
705
.

Bloom
,
N.
, and
J.
Van Reenen
.
2007
.
Measuring and explaining management practices across firms and countries
.
Quarterly Journal of Economics
122
:
1351
408
.

Bloom
,
N.
, and
J.
Van Reenen
.
2010
.
New approaches to surveying organizations
.
American Economic Review Papers and Proceedings
100
:
105
9
.

Boyne
,
G. A.
, and
M. W.
Walker.
2005
.
Introducing the ‘Determinants of Performance in Public Organizations’ symposium
.
Journal of Public Administration Research and Theory
15
:
483
8
.

Boyne
,
G.
, and
C.
Hood.
2010
.
Incentives: New research on an old problem
.
Journal of Public Administration Research and Theory
20
:
i177
80
.

Carpenter
,
D
.
2001
.
The forging of bureaucratic autonomy: Reputations, networks, and policy innovation in executive agencies, 1862–1928
.
Princeton, NJ
:
Princeton Univ. Press
.

Chun
,
Y. H.
, and
H. G.
Rainey
.
2005
.
Goal ambiguity and organizational performance in U.S. Federal Agencies
.
Journal of Public Administration Research and Theory
15
:
529
57
.

Dahlstrom
,
C.
, and
V.
Lapuente.
2010
.
Explaining cross-country differences in performance-related pay in the public sector
.
Journal of Public Administration Research and Theory
20
(
3
):
577
600
.

Dixit
,
A
.
2002
.
Incentives and organizations in the public sector: An interpretive review
.
Journal of Human Resources
37
:
696
727
.

Duflo
,
E.
,
R.
Hanna
, and
S. P.
Ryan
.
2012
.
Incentives work: Getting teachers to come to school
.
American Economic Review
102
(
4
):
1241
78
.

Einstein
,
K. L.
, and
D. M.
Glick
.
2017
.
Does race affect access to government services? An experiment exploring street-level bureaucrats and access to public housing
.
American Journal of Political Science
61
(
1
):
100
16
.

Finan
,
F.
,
B. A.
Olken
, and
R.
Pande.
2017
.
The personnel economics of the state
. In
Handbook of economic field experiments
, eds
A. V.
Banerjee
and
E.
Duflo
.
Elsevier
. pp.
467
514
.

Finer
,
H
.
1941
.
Administrative responsibility in democratic government
.
Public Administration Review
1
(
4
):
335
50
.

Friedrich
,
C
.
1940
.
Public policy and the nature of administrative responsibility
.
Public Policy
1
:
3
24
. Reprinted in Francis Rourke, ed. Bureaucratic Power in National Politics, 3rd ed. Boston: Little, Brown. 1978.

Gingerich
,
D
.
2013
.
Governance indicators and the level of analysis problem: Empirical findings from South America
.
British Journal of Political Science
43
(
3
):
505
40
.

Hasnain
,
Z.
,
N.
Manning
,
and
J. H.
Pierskalla.
2012
.
Performance-related pay in the public sector
. World Bank Policy Research Working Paper 6043. http://documents.worldbank.org/curated/en/666871468176639302/Performance-related-pay-in-the-public-sector-a-review-of-theory-and-evidence.

Heinrich
,
C
.
1999
.
Do government bureaucrats make effective use of performance management information?
Journal of Public Administration Research and Theory
9
(
3
):
363
93
, https://doi.org/10.1093/oxfordjournals.jpart.a024415.

Honig
,
D
.
2018
.
Navigation by judgment: Why and when top down management of foreign aid doesn’t work
.
Oxford, UK
:
Oxford Univ. Press
.

Ingraham
,
P. W.
,
P. G.
Joyce
, and
A.
Kneedler Donahue.
2003
.
Government performance: Why management matters
.
London, UK
:
Johns Hopkins Univ. Press
.

Kaufmann
,
D.
,
A.
Kraay
, and
M.
Mastruzzi
.
2010
.
The worldwide governance indicators methodology and analytical issues
. World Bank Policy Research Working Paper 5430. https://openknowledge.worldbank.org/handle/10986/3913.

Kelman, S., and J. Friedman. 2009. Performance improvement and performance dysfunction: An empirical examination of distortionary impacts of the emergency room wait-time target in the English National Health Service. Journal of Public Administration Research and Theory 19: 917–46.

Leonard, D. K. 2010. “Pockets” of effective agencies in weak governance states: Where are they likely and why does it matter? Public Administration and Development 30: 91–101.

Lynn, L., C. Heinrich, and C. Hill. 2000. Studying governance and public management: Challenges and prospects. Journal of Public Administration Research and Theory 10 (2): 233–61.

McDonnell, E. 2017. Patchwork Leviathan: How pockets of bureaucratic governance flourish within institutionally diverse developing states. American Sociological Review 82 (3): 476–510.

Meier, K. J., and L. J. O’Toole, Jr. 2002. Public management and organizational performance: The effect of managerial quality. Journal of Policy Analysis and Management 21 (4): 629–43.

———. 2012. Subjective organizational performance and measurement error: Common source bias and spurious relationships. Journal of Public Administration Research and Theory 23: 429–56.

Miller, G. 2000. Above politics: Credible commitment and efficiency in the design of public agencies. Journal of Public Administration Research and Theory 10 (2): 289–327.

Miller, G., and A. B. Whitford. 2007. The principal’s moral hazard: Constraints on the use of incentives in hierarchy. Journal of Public Administration Research and Theory 17 (2): 213–33.

———. 2016. Above politics: Bureaucratic discretion and credible commitment. Cambridge, UK: Cambridge Univ. Press.

Moe, T. 2013. Delegation, control, and the study of public bureaucracy. In The handbook of organizational economics, eds. R. Gibbons and J. Roberts, 1148–81. Princeton, NJ: Princeton Univ. Press.

Moynihan, D. P., S. Fernandez, S. Kim, K. M. LeRoux, S. J. Piotrowski, B. E. Wright, and K. Yang. 2011. Performance regimes amidst governance complexity. Journal of Public Administration Research and Theory 21: i141–55.

Nicholson-Crotty, S., and L. O’Toole. 2004. Public management and organizational performance: The case of law enforcement agencies. Journal of Public Administration Research and Theory 14 (1): 1–18.

Olken, B., and R. Pande. 2012. Corruption in developing countries. Annual Review of Economics 4: 479–509.

Owusu, F. 2006. Differences in the performance of public organisations in Ghana: Implications for public-sector reform policy. Development Policy Review 24 (6): 693–705.

Perry, J., T. Engbers, and S. Y. Jun. 2009. Back to the future? Performance-related pay, empirical research, and the perils of persistence. Public Administration Review 69 (1): 39–51.

Rainey, H. G., and P. Steinbauer. 1999. Galloping elephants: Developing elements of a theory of effective government organizations. Journal of Public Administration Research and Theory 9: 1–32.

Rasul, I., and D. Rogger. 2018. Management of bureaucrats and public service delivery: Evidence from the Nigerian Civil Service. Economic Journal 128: 413–46.

Romzek, B. 2000. Dynamics of public sector accountability in an era of reform. International Review of Administrative Sciences 66: 21–44.

Rose-Ackerman, S. 1986. Reforming public bureaucracy through economic incentives? Journal of Law, Economics and Organization 2: 131–61.

Simon, W. 1983. Legality, bureaucracy, and class in the welfare system. Yale Law Journal 92: 1198–269.

Talbot, C. 2010. Theories of performance: Organizational and service improvement in the public domain. Oxford, UK: Oxford Univ. Press.

Tendler, J. 1997. Good government in the tropics. Baltimore, MD: Johns Hopkins Univ. Press.

Thomann, E., N. Van Engen, and L. Tummers. 2018. The necessity of discretion: A behavioral evaluation of bottom-up implementation theory. Journal of Public Administration Research and Theory 28 (4): 583–601.

Wilson, J. 1989. Bureaucracy: What government agencies do and why they do it. New York, NY: Basic Books.
