To say that positive political theory (PPT) scholarship on the hierarchy of justice is theory rich and data poor is to make a rather uncontroversial claim. For over a decade now, scholars have offered intriguing theoretical accounts aimed at understanding why lower courts defy (comply with) higher courts. But only rarely do they subject the accounts to rigorous empirical interrogation. The chief obstacle, it seems, is the lack of a reliable and valid measurement strategy for placing judges of lower courts and justices of higher courts in the same policy space. Without such a strategy, we can systematically test few, if any, hypotheses flowing from PPT models of the judicial hierarchy. With such an approach not only can we investigate the implications of these models, we can assess many others flowing from the larger PPT program on judging, as well. It is to the challenge of scaling judges and justices (as well as legislatures and executives) that we turn in this article. We begin by explicating our measurement strategy, and then by explaining its advantages over previous efforts. Next we explore the results of our approach and provide a descriptive look at data it yields: a “Judicial Common Space” (JCS) score for all justices and judges appointed since 1953. The last section offers three applications designed to shore up the suitability and adaptability of the JCS for a range of positive projects on the courts.
The US Supreme Court's decision in Roper v. Simmons (2005), which invalidated the death penalty for defendants under the age of 18, already has elicited unusually searching commentary. Some analysts have taken Roper as yet another indication of the Court's growing wariness of the death penalty; others have focused on the escalating battle between the majority and Justice Scalia over the use of “international opinion” to adjudicate questions of American Constitutional Law; and still a third group has drawn attention to the Court's willingness to overturn its relatively recent decision in Stanford v. Kentucky (1989) to hold for Christopher Simmons (see, e.g., Babbin 2005; Banner 2005; Greenhouse 2005).
What has received virtually no attention, somewhat surprisingly, is that the justices were not the first to overturn Stanford: the Missouri Supreme Court set aside Simmons's death sentence on the ground that “a national consensus has developed against the execution of juvenile offenders … since Stanford.” But only Justices O'Connor and Scalia, writing in dissent, seemed to notice that a state court had directly defied precedent established by the top of the judicial hierarchy, the Supreme Court of the United States. “To add insult to injury,” Scalia wrote, “the Court affirms the Missouri Supreme Court without even admonishing that court for its flagrant disregard of our precedent in Stanford. Until today, we have always held that ‘it is this Court’s prerogative alone to overrule one of its precedents.’ … Today, however, the Court silently approves a state-court decision that blatantly rejected controlling precedent.”
Scalia, of course, could have said much the same of the Fourth Circuit's decision in United States v. Dickerson (1999),2 and the Fifth's in Hopwood v. Texas (1996)3—not to mention any number of subtler forms of lower court deviation from precedents set by a higher court (e.g., distinguishing, limiting, or avoiding precedents). Indeed, as one observer noted well over half a century ago, “[Many] precedents have been rejected through the stratagem of distinguishment; others have been the subject of conscious judicial oversight. As a consequence, judicial discretion among ‘inferior’ judges is not so confined and limited as legal theorists would have it” (Comment 1941, 1448–9).
This observation raises a question that, depending on one's perspective, may be posed two different ways: Why do lower courts defy higher courts, or, given the minute percentage of lower court cases that are heard and reversed (these days, under 1%), why do lower courts comply with higher courts?
No shortage of scholarly responses exists but strategic models following from agency theory are particularly prominent in the contemporary literature. In general, these accounts assume heterogeneous policy preferences among judges and examine the incentives and opportunities created by various institutional features of the modern judicial hierarchy. But the specifics of the models vary, as do (at least some of) their empirical implications. One class emphasizes litigant policing by affected parties and suggests that lower courts will be more likely to deviate from Supreme Court precedent when the trial court judge and the appellate court panel share similar ideological dispositions (thereby enabling them implicitly to collude against a litigant and thus to keep information from the Supreme Court) (see generally, McCubbins and Schwartz 1984; Lupia and McCubbins 1994). A second set stresses strategic auditing by the Supreme Court in settings of “adverse selection,” and anticipates a higher likelihood of disobedience when ideological diversity on the lower court panel is low (thereby decreasing the likelihood of dissent and thus of an audit) (see, e.g., Cross and Tiller 1998; Cameron et al. 2000). Yet a third class highlights implicit tournaments to avoid review among lower court judges in a setting of “moral hazard” (see, e.g., Cameron 1993; McNollgast 1994). Among the many implications of this approach is that the odds of deviation increase as the ideological distance grows between the population of judges on the appellate courts and the Supreme Court.
Other specific hypotheses from principal-agent accounts are easy enough to derive. What may be more challenging, if past efforts are any indication, is empirically assessing them. The chief obstacle, it seems, is the lack of a reliable and valid measurement strategy for placing judges of lower courts and justices of higher courts in the same policy space. Without such a strategy, we can systematically test few, if any, hypotheses flowing from agency accounts. With such an approach not only can we investigate the implications of these models, we can assess many others flowing from the general positive political theory (PPT) program on judging, as well. So, for example, assuming that legislatures and executives too can be placed in the same space, we can consider the extent to which the separation-of-powers (SoP) system constrains lower courts—an understudied subject but nonetheless one of considerable interest to scholars working in the field (e.g., Revesz 2001).
It is to the challenge of scaling judges and justices (as well as legislatures and executives) that we turn in this article. We begin by explicating our measurement strategy, and then by explaining its advantages over previous efforts. Next we explore the results of our approach, and provide a descriptive look at data it yields: a “Judicial Common Space” (JCS) score for all justices and judges serving between 1953 and 2000. The last section offers three contemporary applications—all of which, we hope, shore up the suitability and adaptability of the JCS for the PPT project.
The goal of our measurement strategy is to place Supreme Court justices and Court of Appeals judges into a policy space that we call the JCS.4 Any measurement strategy that meets this goal should have a number of properties. The measures should be reliable and valid, they should not be issue or time dependent (e.g., they should be amenable to backdating and updating with the availability of new data), and, ideally, they should be comparable to measures developed for members of Congress and the President. The measurement strategy we outline below meets these desiderata.
The starting point for our approach is the NOMINATE Common Space scores (Poole and Rosenthal 1997; Poole 1998), which result from a scaling algorithm applied to a set of issue scales (in this case, a set of measures for Representatives, Senators, and Presidents) fit term by term. Using legislators who have served in both chambers, Presidents who have served in the legislature, and stated presidential vote intentions, the algorithm provides an ideal point for all Representatives, Senators, and Presidents in a two-dimensional Downsian issue space (Downs 1957). We use only the predominant first dimension for our analysis (available for the 75th through the 108th Congress) (Poole 2005).5
Developed by Giles et al. (2001, 2002), the state-of-the-art measure for the preferences of US Court of Appeals judges (and, for that matter, federal district court judges) too relies on the Common Space scores but exploits the norm of senatorial courtesy. If a judge is appointed from a state where the President and at least one home-state Senator are of the same party, the nominee is assigned the NOMINATE Common Space score of the home-state Senator (or the average of the home-state Senators if both members of the delegation are from the President's party). If neither home-state Senator is of the President's party, the nominee receives the NOMINATE Common Space score of the appointing President.
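The assignment rule just described can be sketched in a few lines of Python. This is an illustration of the rule as stated in the text, not the code of Giles et al.; the `Senator` class, the function name, and the party labels are hypothetical, and the scores in the usage lines are placeholders rather than actual NOMINATE values.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Sequence


@dataclass(frozen=True)
class Senator:
    party: str           # hypothetical label, e.g. "D" or "R"
    common_space: float  # first-dimension Common Space score


def giles_score(president_party: str,
                president_score: float,
                home_state_senators: Sequence[Senator]) -> float:
    """Score a judge under the senatorial-courtesy rule described in the text."""
    # Keep only home-state Senators who share the appointing President's party.
    same_party = [s.common_space for s in home_state_senators
                  if s.party == president_party]
    if same_party:
        # One such Senator: that Senator's score. Two: their average.
        return mean(same_party)
    # No home-state Senator of the President's party: use the President's score.
    return president_score


# Placeholder values for illustration only.
senators = [Senator("D", -0.25), Senator("R", 0.5)]
print(giles_score("D", -0.5, senators))  # uses the lone Democratic Senator
print(giles_score("D", -0.5, []))        # falls back to the President
```

Because the rule uses no information from the judge's own votes, the resulting score can serve as an explanatory variable for those votes without circularity.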
For our purposes, the approach of Giles et al. has several key advantages. For one, its developers already have demonstrated that their measure exhibits face, convergent, and construct validity and outperforms other common measures, such as the party of the appointing President or the ideology of the state from which the judge is selected (Giles et al. 2002). Second, the strategy of Giles et al. locates all sitting circuit court judges in the Common Space.
Which leaves us with only one challenge: locating Supreme Court justices in the same space. To meet it, we rely on a vote-based measure of Supreme Court ideology developed by Martin and Quinn (2002).6 These “Martin-Quinn” scores, which are available for all justices in all terms from 1937 to 2003 (Martin and Quinn 2005a), are derived from voting patterns on the Supreme Court, and allow justices' ideal points to change over time. They are dynamic in that each justice has an ideal point in each term served.7
Because we can calculate the median on the Court for each term from the Martin-Quinn scores (see Martin et al. 2005), they may seem ideal for our purposes, but a problem of no small consequence emerges: the Martin-Quinn measures are not directly comparable to the NOMINATE Common Space scores. The Common Space scores are bounded below by −1 and above by 1, whereas the Martin-Quinn scores are theoretically unbounded (currently, they range from about −6 [Justice Douglas] to 4 [Justice Thomas]). What is more, and as we explain in the Appendix, an insufficient number of “bridging observations” exists to place circuit court judges and Supreme Court justices on the same scale. The solution thus must lie in choosing a reasonable transformation from the Martin-Quinn space to the Common Space.
To make this choice, we turn to work by Sala and Spriggs (2004) whose theoretical tack to transforming Martin-Quinn space into Common Space hinged on an important article by Moraski and Shipan (1999). In seeking to model the nomination game played between the President and the Senate when faced with a vacancy on the Supreme Court, Moraski and Shipan (1999) classified all nominations as “unconstrained” (the President can make a nomination at his ideal point), “semiconstrained” (the President can move the Court median closer to his ideal), or “fully constrained” (the President cannot affect the ideological composition of the Court). Sala and Spriggs (2004) creatively used data from the unconstrained nominations to validate their transformation between the Martin-Quinn and the NOMINATE scores.
Our approach also relies on the unconstrained confirmed nominees to the Supreme Court to estimate the transformation between the Martin-Quinn space and the Common Space but we invoke a different transformation (as well as a distinct validation strategy).8 What results from this procedure is a score for each term for each justice (and measures for the Court as a whole, such as its median member) who resides in the JCS.9
In the next section, we present the details of this estimation exercise, as well as several validations of the transformation; and in the Appendix, we compare our measurement strategy against others. For now, though, it is important to understand the assumptions underlying our approach. First, it requires that the vote-based NOMINATE Common Space scores and the Martin-Quinn scores are reasonable measures of actors' sincere preferences. Although scholars have provided evidence of sophisticated voting in these institutions (see, e.g., Spiller and Gely 1992; Calvert and Fenno 1994; Epstein and Knight 1998; Martin 2001), owing to the large data sets and the overtime nature of the scores, we suspect that the effect of insincere behavior is small. Second, in light of the (solid) case that Giles et al. (2002) make for their inferential measurement strategy, we accept that it is valid.10 Third, we must assume that the predominant issue dimension across all three institutions taps the same substantive issues. It might be the case, for example, that actors perceive their roles quite differently when placed in different institutional settings, thereby rendering their issue preferences incomparable. If this criticism is to be believed, it damns not only this empirical exercise but also any theoretical work that relies on spatial models. Since it is precisely these models that we wish to assess empirically, we think it sensible to proceed with an empirical approach consistent with those very models. Finally, we must believe that the transformation from the Martin-Quinn space to the NOMINATE Common Space is fundamentally correct. In some ways this is a matter of faith but, as we explain below, it is a matter that we have tried to validate in several different ways.
Estimating the JCS
Placing Supreme Court justices in the JCS requires us, for the reasons we suggest above, to transform the Martin-Quinn scores. The data we use are the unconstrained (Moraski and Shipan 1999) confirmed nominees to the Supreme Court (n = 15), in addition to the Common Space score for the President (CSi) and the Martin-Quinn score for the justice in her (his) first year of service (MQi). A linear transformation will not suffice, recall, because the Martin-Quinn scores are unbounded (though empirically run from −6 to 4), whereas the NOMINATE Common Space scores are bounded below at −1 and above at 1. It is thus necessary to estimate a nonlinear transformation between these two variables. We use a rescaled tangent transformation on CSi and fit the following model using ordinary least squares:
In order to map Martin-Quinn scores into the JCS, we must solve equation (1) for CSi. The resultant prediction equation, which we refer to as the “arctangent predictor,” is
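One form consistent with the description above is sketched here. It is offered under stated assumptions rather than as a transcription of equations (1) and (2): the intercept α, the slope β, and the error term ε_i are assumed notation, and the factor π/2 is the rescaling that stretches the interval (−1, 1) onto the full domain of the tangent function.

```latex
\begin{align}
\tan\!\left(\frac{\pi}{2}\, CS_i\right) &= \alpha + \beta\, MQ_i + \varepsilon_i \\
\widehat{CS}_i &= \frac{2}{\pi}\,\arctan\!\left(\hat{\alpha} + \hat{\beta}\, MQ_i\right)
\end{align}
```

Under this form the second line follows from the first by applying the arctangent to both sides at the fitted values, so every predicted score falls strictly between −1 and 1 regardless of how extreme the Martin-Quinn score is.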
In comparing the linear and arctangent predictors, as we do in Figure 1, it is clear that both behave quite similarly in the middle of the Martin-Quinn space. But, as it turns out, for extreme justices the linear predictor inappropriately places them outside the −1 to 1 interval. Substantively speaking, this suggests that the linear predictor, although suitable for moderate justices, is not so for those at the ends of the left-right scale.
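The contrast between the two predictors can be illustrated numerically. The sketch below assumes an arctangent predictor of the form (2/π)·arctan(α + β·MQ); all coefficients are hypothetical placeholders chosen for illustration and are not the estimates reported in the article.

```python
import math

# Hypothetical coefficients, for illustration only (not the article's estimates).
ALPHA, BETA = 0.0, 0.3           # assumed arctangent-predictor parameters
INTERCEPT, SLOPE = 0.0, 0.2      # assumed linear-predictor parameters


def arctangent_predictor(mq: float) -> float:
    """Map a Martin-Quinn score into the open interval (-1, 1)."""
    return (2.0 / math.pi) * math.atan(ALPHA + BETA * mq)


def linear_predictor(mq: float) -> float:
    """Unbounded linear map, shown for comparison."""
    return INTERCEPT + SLOPE * mq


for mq in (-6.0, -1.0, 0.0, 1.0, 4.0):
    print(f"MQ={mq:5.1f}  linear={linear_predictor(mq):6.3f}  "
          f"arctan={arctangent_predictor(mq):6.3f}")
```

With these placeholder values the two predictors stay close near the center of the scale, while at a Martin-Quinn score of −6 the linear predictor falls below −1 and the arctangent predictor does not.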
Figure 1, although informative, reveals little about the face validity of our transformation. To explore that matter, we first compare the distribution of ideal points for members of Congress, judges on the federal circuits, and justices on the Supreme Court. We plot these empirical densities in Figure 2. Of interest here is the distribution for Supreme Court justices, which, in the transformed Space, looks plausible enough: two modes emerge, one slightly to the left and the other slightly to the right. The very extreme members (Justices Douglas and Marshall on the left, Justices Scalia and Thomas on the right) receive scores near the endpoints of the Common Space. Of note too is the bimodal distribution of ideal points for the Courts of Appeals.
If the arctangent predictor is reasonable, it also should hold that the measures are related across institutions. Since 12 Supreme Court justices confirmed between 1953 and 2000 also served on a circuit court, we can compare their (Giles et al. 2001) Court of Appeals score with their first-term–transformed Martin-Quinn score. Should the measures be comparable, they ought to line up on a 45° line.
In Figure 3, we explore this criterion by plotting the scores. Notice, first, that they correlate quite highly and fall close to the 45° line. This implies a good model fit. In addition, because half of the points fall above the 45° line and half below, Figure 3 evinces no directional biases in the transformation (i.e., the transformation does not disproportionally predict overly liberal or conservative behavior). What this suggests, in turn, is that the final piece of the JCS—the arctangent predictor in equation (2)—is sensible.
The Judicial Branch
With the JCS in hand, we can now present comparable measures for the Courts of Appeals and the Supreme Court. In Figure 4, we plot the year-by-year median for 11 numbered circuits and the D.C. Court of Appeals. We also locate the median of the Supreme Court as a point of reference.
Momentarily, we provide several illustrations of how scholars might deploy these scores for a range of positive research projects. For now, let us simply point out that the JCS appears to square with our impressions of the circuits. So, for example, today's most liberal are the Second and the Ninth, whereas the Fourth and Fifth are among the most conservative. Notice too and just as we might expect, the rightward swing during the Reagan years, followed by a shift to the left in the 1990s when Clinton added 61 appellate court judges (though, as we explore in Section 4, the effect of the Clinton regime was stronger in some circuits than others).
The ideological composition of the entire circuit, as we depict it in Figure 4, is of importance in many contexts but so too is the composition of an individual panel. If scholars are interested in modeling venue shopping or the decision over whether to appeal a district court's decision, the distribution of the median judge in possible three-judge panels would be of interest—and we can deploy the Courts of Appeals measures to identify it. Figure 5 provides a simple example. There we use simulation to create the empirical density of the median judge on three-judge panels for the Fifth and Ninth Circuits in 1998.11 Note that for the Ninth, a 0.67 probability exists that a panel would be to the left of 0 in the JCS; in the much more conservative Fifth Circuit, that probability is only 0.37. Further, although it is possible to attain a moderately conservative panel on the Ninth, far less likely is one that is extremely conservative, though this remains a distinct possibility for the Fifth.
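A minimal sketch of this kind of panel exercise follows. It assumes that panels are drawn uniformly at random, without replacement, from the circuit's judges; that assumption, the function names, and the five scores are all hypothetical and are not taken from the article or its data. For a small circuit the share can also be computed exactly by enumerating every possible panel, which gives a check on the simulation.

```python
import random
from itertools import combinations
from statistics import median


def exact_share_left(judge_scores, cutoff=0.0):
    """Share of all possible three-judge panels whose median is left of cutoff."""
    panels = list(combinations(judge_scores, 3))
    return sum(median(p) < cutoff for p in panels) / len(panels)


def simulated_share_left(judge_scores, cutoff=0.0, n_draws=10_000, seed=0):
    """Monte Carlo estimate of the same share, drawing panels at random."""
    rng = random.Random(seed)
    hits = sum(median(rng.sample(judge_scores, 3)) < cutoff
               for _ in range(n_draws))
    return hits / n_draws


# Placeholder scores for a hypothetical five-judge circuit (not JCS data).
circuit = [-0.4, -0.3, -0.1, 0.2, 0.5]
print(exact_share_left(circuit))
print(simulated_share_left(circuit))
```

Enumeration is feasible for any real circuit (a court of 28 judges has only 3,276 possible panels), so simulation is mainly useful when the assignment process is not uniform.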
The Legislative and Executive Branches
A chief advantage of the JCS approach is that the scores are in NOMINATE Common Space (Poole 1998), which makes possible relatively precise empirical explorations of the American SoP system. Figure 6 provides one example: a comparison of the median justice on the Supreme Court with the median member of the House or Senate. But we need not exclude the executive from analyses of interbranch politics. Quite the opposite: Since Common Space scores also exist for Presidents (see Table 2), they too can be integrated into the JCS.
As we suggested at the onset, our chief motivation for developing the JCS was to facilitate assessment of positive theories of the judicial hierarchy. In what follows we provide a taste of how scholars could deploy our approach to do just that. Believing that the JCS may be useful for other types of PPT research, as we have hinted throughout, we supply quick examples of two additional applications: one centering on the SoP system and the other on judicial nominations.
Our intent here, as our emphasis on “taste” and “quick” implies, is not to write three additional articles but rather to sketch out how researchers can exploit the JCS for a range of projects. On the other hand, by offering these examples, we do not mean to suggest that the JCS is suitable for the assessment of each and every implication flowing from positive accounts of judging. So, for example, although the JCS scores are perfectly appropriate for explorations of the extent to which the preferences and likely actions of elected actors may constrain the Courts of Appeals, researchers will need to take several additional steps to deploy them to study the effect of the SoP system on the US Supreme Court. The distinction here hinges on the underlying data we used to generate the ideological assessments of the judges and justices: for the former, we invoke data independent of their votes (the President's or Senators' NOMINATE scores); for the justices, we make use of the (transformed) Martin-Quinn estimates, which are, in fact, developed from votes. To the extent that accounting for justices' votes (as is typically the goal in SoP research) via a preference-based measure that relies on their votes is circular, the pure JCS scores (i.e., the transformed Martin-Quinn scores) are problematic—but, we hasten to note, still serviceable. Should analysts wish to study the effect of the SoP system on the Court's decisions in the area of, say, civil rights, all they need to do is remove civil rights cases from data used to generate the Martin-Quinn estimates, recompute those estimates, and then transform them (as we have done) into the JCS. By purging the particular issue area of interest, in other words, the JCS scores become perfectly appropriate for use in SoP studies of the Supreme Court, along with any other research on the role of the justices' preferences in decision making. 
This “purging” step, it is worth reiterating, is not necessary in the applications we sketch below since all three focus on circuit judges, and not justices.
The Hierarchy of Justice
We began this article with a trio of strategic models flowing from a principal-agent approach to lower-higher court interactions: litigant policing, strategic auditing, and “tournaments” among lower courts. To the extent that these represent distinct conceptual accounts of the hierarchy of justice, they are capable of generating unique implications. Tournament models, to provide but one example, suggest that deviations from Supreme Court precedent are more likely to occur by judges on lower court panels who have been reversed by the Supreme Court (relatively) frequently in prior cases. These “repeat offenders” or “reversal-insensitive” judges, to put it succinctly, may be more likely to deviate in future cases.
And yet, since all three strategic models flow from a unifying account—agency theory—they share several features. Most relevant here is that they draw a distinction between the preferences of the enacting Court (embodied in the existing legal doctrine), the preferences of the contemporaneous Supreme Court (the preferences of the Court at the time the lower court is considering the case), and the preferences of the lower court hearing the case. Hence, regardless of the specific model, three configurations of the players in ideological space (depicted in Figure 7) are of interest. In Configuration 1, the lower court undertakes doctrinal deviation if it pursues its own preferences. By doing so, however, it engages in what we might call hierarchical conformity, as the lower court's action actually conforms to the preferences of the contemporaneous Supreme Court. In Configuration 2, the lower court—if it pursues its own preferences—engages in doctrinal conformity but hierarchical deviation to the extent that it will reach a decision distant from the preferences of the current Supreme Court. In Configuration 3, a lower court that pursues its own preferences engages in both doctrinal and hierarchical deviation.
Following from these configurations are a number of hypotheses common to all three principal-agent models. For example, lower courts will be more likely to engage in doctrinal deviation when:
The spatial location of preferences conforms to Configuration 1 rather than to Configurations 2 or 3. That is, a lower court L is most likely to deviate from E when it is allied with the contemporary Court against the enacting Court.
The distance between E and C increases, controlling for configuration. Thus, even when the lower court prefers the enacting Court's doctrine, increasing the distance from the enacting Court to the contemporary Court leads reversal-sensitive judges to shy away from the enacting Court's doctrine.
The distance between E and L increases, controlling for configuration.
The distance between C and L increases, controlling for configuration.
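The three distances named in these hypotheses are simple to compute once E, C, and L sit on a common scale. The sketch below is illustrative only: the class and method names are hypothetical, the scores are placeholders, and the "nearer Court" indicator is a rough proxy of our own devising, not the article's definition of Configurations 1 through 3 (which depends on Figure 7).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alignment:
    enacting: float         # E: Court that established the doctrine
    contemporaneous: float  # C: Court sitting when the lower court decides
    lower: float            # L: lower court hearing the case

    def distances(self) -> dict:
        """Absolute distances used as covariates in the hypotheses above."""
        return {
            "E-C": abs(self.enacting - self.contemporaneous),
            "E-L": abs(self.enacting - self.lower),
            "C-L": abs(self.contemporaneous - self.lower),
        }

    def nearer_court(self) -> str:
        """Rough proxy: which Court lies closer to the lower court."""
        d = self.distances()
        return "C" if d["C-L"] < d["E-L"] else "E"


# Placeholder scores for illustration (not actual JCS values).
case = Alignment(enacting=-0.5, contemporaneous=0.25, lower=0.375)
print(case.distances())
print(case.nearer_court())
```

In a regression setting these distances would enter alongside an indicator for the configuration, since each hypothesis is stated as holding with the configuration controlled.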
We do not attempt here to assess empirically these and other implications; again, that would require another article or two. What we wish to point out instead is that the JCS provides a necessary building block (a primitive, really) for undertaking this task. That is because the scores enable us to place the enacting Court (doctrine), the lower court, and the current (contemporaneous) Court in the same policy space.
Figure 8 provides a simple example, depicting the JCS for the two (federal) lower court deviations we referenced at the article's outset: Dickerson (a departure from Miranda; see note 1) and Hopwood (a departure from Bakke; see note 2). Beginning with Dickerson, note that the JCS resembles Configuration 1 in Figure 7: the enacting Miranda (1966) Court is well to the left of the Fourth Circuit and the Supreme Court (in 1999, when the Fourth decided Dickerson). It is thus not particularly surprising that the lower court engaged in doctrinal deviation but hierarchical conformity. The enacting (Bakke) Court also was to the left of the Court in 1996, the year the Fifth Circuit handed down its decision in Hopwood. But note that the contemporaneous Court was closer to Bakke than to the Fifth, suggesting that the lower court, to a certain extent, engaged in doctrinal and hierarchical deviation. Perhaps, as a tournament model might suggest, the judges on the lower court panel were repeat offenders and, thus, somewhat more willing to deviate than otherwise expected. Alternatively, a model specifying the conditions of strategic auditing by hierarchical superiors might best capture the Fifth's decision.
These possibilities, as well as many others, require far more consideration than space permits, and they require consideration over a large number of cases at that. But the important point here is that such an empirical evaluation, and a rather precise one, is now possible via the JCS. This is not to say that the JCS supplies all the information necessary to undertake such an analysis: additional data, such as the number of previous overrulings of panel members by the Supreme Court, must be amassed. It is only to say that the JCS takes an important first step.
The SoP/Checks-and-Balances System
In a seminal article published in the Yale Law Journal, Eskridge (1991) reports that between 1967 and 1990, Congress overturned 344 statutory decisions—nearly two-thirds of which (n = 220) were issued by the lower federal courts, not the US Supreme Court. Given these numbers, it is somewhat surprising that few scholars have explored the extent to which the threat of legislative override constrains circuit court decision making; in fact, Revesz (2001) is the only (published) systematic effort of which we are aware.12 That article examines the impact of changes in the party composition of Congress and the President on the D.C. Circuit's review of “health-and-safety decisions” rendered by federal agencies between 1970 and 1996.
The author found no effect. “The empirical analysis,” Revesz asserts, “does not support any of the hypotheses derived from the positive political theory models on the impact of changes in the composition of the political branches on judicial votes.” As a result, he contends that “serious questions [emerge] about the plausibility of the claim by positive political theorists that judicial review of administrative action serves the interests of the current Congress.”
Perhaps Revesz is right but in light of his method (which employs political party as a surrogate for the ideology of Congress, the President, and the Circuit) as well as plausible theoretical reasons to expect Congress to have some effect on (lower) federal court decisions (see, e.g., Spiller and Gely 1992; Epstein and Knight 1998; Segal and Spaeth 2002), his rather strong conclusion deserves reconsideration. And that is where the JCS may be of use. No longer do we need to deploy party as a (hardly unproblematic) indicator of political preferences;13 and no longer do we need to speculate about the relative distance (proximity) of Congress to the lower courts; rather, we can easily place Congress, both houses, specific committees, and even individual members of the legislature in the same policy space as the circuits (see, e.g., Figure 6). So doing will enable far more accurate assessments of claims embedded in the PPT literature.
The President too can be integrated more precisely into tests of SoP models: our JCS database contains an annual score for him, in addition to the legislature and judiciary (see also Table 2). But other examinations of the effect of presidential regimes on the courts also are possible. So, for example, we can now assess the impact of any given administration's appointments on the circuits, as we do for the Clinton presidency in Figure 9. Note that for some—most notably the First, Seventh, and Tenth—the President and his advisors barely made a dent in their ideological composition. For others (especially the Second and the Sixth), Clinton was able to move them considerably to the left (or at least to a position far more liberal than the Supreme Court).
The Judicial Appointments Game
Deploying the JCS to study the effect of presidential regimes on the circuits may be useful for any number of research projects but perhaps none more so than appointments to the federal bench. When it comes to the circuits, an emerging body of literature suggests that the duration of delays over appointments to these courts increases when Senators perceive the nomination as “critical,” meaning that it could swing the ideological balance of the circuit toward the left or right (see, e.g., Binder and Maltzman 2002). Extant studies tend to define a critical nomination vis-à-vis the party affiliation of the nominee or the appointing President (e.g., Binder and Maltzman deem a nomination “critical” when the percentage of Democratic judges on the particular circuit is between 40 and 60). But again, in light of the concerns many scholars have raised about the use of party-based measures to capture judicial ideology (see note 12), the JCS may provide a more precise and valid indicator (see, e.g., Figures 4 and 9), along with, crucially, a method for capturing the distance between the Circuit's political preferences and the Senate's.
This line of research deals with the confirmation of candidates to the courts of appeals; another area of interest is the extent to which appointments to these courts may be telling of the future composition (and perhaps ideological direction) of the US Supreme Court: With only three exceptions since 1969, every justice appointed to the Supreme Court served as a federal circuit judge; since 1986, there have been no exceptions. This “norm of prior judicial experience” (Epstein et al. 2003) seems so inculcated that in their widely publicized “tournament of judges,” Choi and Gulati (2004a, 2004b) allow only federal circuit court judges to compete for seats on the Supreme Court.
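The difference between the party-based definition and a JCS-based alternative can be made concrete. In the sketch below the 40 to 60 percent band comes from the text; treating the band as inclusive, the function names, and the use of an absolute distance between medians are assumptions made for illustration, and the input values are placeholders.

```python
def is_critical_by_party(n_democratic: int, n_judges: int,
                         low: float = 40.0, high: float = 60.0) -> bool:
    """Party-based rule: Democratic share of the circuit falls in [low, high]."""
    pct_democratic = 100.0 * n_democratic / n_judges
    return low <= pct_democratic <= high


def circuit_senate_distance(circuit_median: float, senate_median: float) -> float:
    """JCS-based alternative: distance between circuit and Senate medians."""
    return abs(circuit_median - senate_median)


# Placeholder inputs for illustration only.
print(is_critical_by_party(6, 12))            # 50 percent Democratic
print(is_critical_by_party(9, 12))            # 75 percent Democratic
print(circuit_senate_distance(0.25, -0.5))    # medians on the common scale
```

The party-based rule treats all judges of one party as interchangeable, whereas the distance measure changes continuously as the ideological makeup of either body shifts.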
The Choi and Gulati project focuses exclusively on “objective measures of merit”: tournament winners are those judges who publish many opinions, have high citation rates, express “independence” via dissents, and so on. Whether this is a desirable approach to filling spots on the high court is arguable. What is not a matter of debate is that the Choi and Gulati approach fails to capture the realities of judicial appointments: As research grounded in PPT tells us, Presidents and Senates—although not inattentive to candidates' qualifications—are quite concerned with moving the Court's ideological median as close as possible to their ideal points; the President, of course, also must attend to critical players within the Senate so as to ensure confirmation of his candidate (see, e.g., Segal et al. 1992; Moraski and Shipan 1999).
Here too the JCS can help in assessing predictions generated by an important line of PPT research. Certainly we can deploy the scores to identify the preferences of the key actors in the appointment process; we also might make use of them to consider how particular candidates from the circuits could affect the Court's median. Figure 10, which employs the JCS scores to map the positions of the current justices, and Table 3, which lists the scores of six leading candidates for a seat on the Court, provide the makings of such a consideration.14
|Candidate||Circuit (service since)||JCS score|
|Samuel Alito||Third since 1990||0.525|
|Emilio Garza||Fifth since 1991||0.532|
|Michael Luttig||Fourth since 1991||0.250|
|Michael McConnell||Tenth since 2003||0.347|
|John Roberts||D.C. since 2003||0.538|
|J. Harvie Wilkinson||Fourth since 1984||0.259|
Beginning with Figure 10, notice that the mapping conforms to common perceptions: Scalia and Thomas anchor the right, with the four “liberals” on the left. O'Connor, of course and in line with virtually every account of the current Court, is the median. Now consider the JCS scores of the oft-mentioned contenders (all sitting circuit court judges) for a position on the Court should O'Connor or Rehnquist depart at the end of the 2004 term. Each possible nominee, as Table 3 shows, is quite conservative—a result that also comports with conventional wisdom about these would-be justices.
Assuming that George W. Bush is free to nominate anyone on this list—a potentially problematic assumption in light of the possibility of a filibuster—and assuming that the President prefers a more conservative to a more liberal median—a far less onerous assumption—will the Court move much to the right? In the case of a Rehnquist retirement the answer is no: all the names in Table 3 are to the right of the median, O'Connor. That is not so if it is O'Connor who vacates her seat: Should Bush appoint any candidate listed in the table, Kennedy would move into the median position, with the resulting Court a good deal more conservative.
Such predictions are of contemporary interest but historical counterfactuals are also possible. Suppose, for example, that Ronald Reagan had appointed Richard Posner rather than Anthony Kennedy to the seat vacated by Lewis Powell in the 1986–1987 term (see Figure 11). This is a hypothetical over which many scholars have speculated but we need not engage in uninformed guesswork. Both Kennedy and Posner were circuit court judges in 1986–1987, with JCS scores of 0.409 and 0.006, respectively. In other words, at the time of Powell's retirement we would predict Kennedy to be among the more conservative members of the Court, whereas Posner would have replaced Powell as the median.
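The replacement exercises in the preceding paragraphs amount to swapping one member's score for another's and recomputing the median. A minimal Python sketch follows. The candidate scores are those reported in Table 3, but the scores assigned to the sitting justices are placeholders that we invented solely to reproduce the ordering described in the text (they are not the JCS values plotted in Figure 10), and the function names are hypothetical.

```python
def court_median(scores):
    """Return (name, score) for the median member of a court with an odd number of seats."""
    if len(scores) % 2 == 0:
        raise ValueError("expected an odd number of members")
    ranked = sorted(scores.items(), key=lambda item: item[1])
    return ranked[len(ranked) // 2]


def replace_member(scores, departing, nominee, nominee_score):
    """Return a new court in which `departing` is replaced by `nominee`."""
    if departing not in scores:
        raise KeyError(departing)
    updated = {name: s for name, s in scores.items() if name != departing}
    updated[nominee] = nominee_score
    return updated


# PLACEHOLDER values, NOT actual JCS scores: they only preserve the ordering
# described in the text (four liberals, O'Connor as median, Kennedy next).
PLACEHOLDER_COURT = {
    "Stevens": -0.50, "Ginsburg": -0.40, "Souter": -0.35, "Breyer": -0.30,
    "O'Connor": 0.05, "Kennedy": 0.20, "Rehnquist": 0.50,
    "Scalia": 0.60, "Thomas": 0.65,
}

# Candidate scores as reported in Table 3.
TABLE_3 = {
    "Alito": 0.525, "Garza": 0.532, "Luttig": 0.250,
    "McConnell": 0.347, "Roberts": 0.538, "Wilkinson": 0.259,
}

if __name__ == "__main__":
    for departing in ("Rehnquist", "O'Connor"):
        for nominee, score in TABLE_3.items():
            court = replace_member(PLACEHOLDER_COURT, departing, nominee, score)
            name, _ = court_median(court)
            print(f"{nominee} replaces {departing}: median is {name}")
```

With actual JCS scores substituted for the placeholders, the same two functions could be applied to historical counterfactuals such as the Kennedy and Posner comparison.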
Theorizing about the hierarchy of justice could take many forms. But requisite (though insufficient) to assessing the empirical implications of virtually any positive account of lower-higher court interactions is a method for capturing the preferences of both courts and, crucially, the distance between them. In this article, we offer such a measurement strategy—the JCS—a strategy we believe to be both valid and reliable.
At the same time, we have worked to show that the JCS may have something to offer to the PPT program beyond its concern with the judicial hierarchy. Many other projects flowing from that program also have made effective use of spatial bargaining models to generate insights about the formation of law and policy. Some have been subjected to rigorous empirical scrutiny; others are left untested or simply illustrated with one or two exemplary cases. We believe that the JCS can offer a corrective, in the form of a foundational tool, for testing rigorously the predictions of these models. Although it might be tempting to dismiss this measurement strategy (or, for that matter, any measurement strategy that necessitates comparisons between or among justices, judges, legislators, and executives), we ought to keep in mind that the theoretical models themselves rest on these assumptions. If we desire to assess adequately their empirical implications, as we should, then measures such as the JCS are fundamental to the enterprise.
Other Modeling Strategies
Over the last decade or so, measurement models have enjoyed something of a renaissance in the social sciences (e.g., Poole and Rosenthal 1997; Clinton et al. 2004). Can we identify a model-based strategy for operationalizing a JCS—a strategy that might be distinct from the NOMINATE Common Space but would nonetheless allow scholars to study the judicial hierarchy? We have explored a number of options, and our best guess is “no.” Let us elaborate.
Social scientists have developed essentially three model-based strategies to place actors from different institutions into a common space. One set uses individuals to “bridge” observations across institutions. Poole (1998), for example, exploits the fact that many Senators served in the House, and some Presidents served in the House, Senate, or both to derive the NOMINATE Common Space measures (he also counts Presidents as voting when they announce their vote intentions on particular bills).15 To deploy Poole's scaling algorithm, the researcher must have sufficient data to scale actors in both institutions. Scaling the Supreme Court is relatively straightforward (Martin and Quinn 2002); scaling the circuit courts, however, is essentially impossible due to the existence of three-judge panels. With only three actors casting votes in each individual case, not enough data exist to estimate ideal points and case-specific parameters.16 Using overlapping service is thus not viable in this context, although, as we show in Figure 3, it can be used to validate measures.
The second model-based approach employs common cases or bills to bridge institutions (Bailey and Chang 2001). If we are willing to believe that choices within an institution are strategically independent of one another, then we can compare, say, a congressional vote on a statute with a Supreme Court decision in the statutory context. Bailey and Chang (2001) exploit this design to scale members of Congress, the President, and Supreme Court justices. But, for two reasons, this approach is deficient when dealing with data from the Courts of Appeals. First, it is improbable that circuit court judges act sincerely on cases, especially on those they believe the Supreme Court might review. Second, the circuit court sets the reversion point for the Supreme Court. This implies that a “liberal” vote on the appellate tribunal may be distinct from a “liberal” vote on the Supreme Court since the status quo points are different.
Nonetheless, and despite these criticisms, we were interested in how such an approach would perform in practice. Accordingly, we collected data on every circuit court decision that led to a Supreme Court case (1946–2003), and coded the polarity of the decision (liberal or conservative). Typically this (extremely sparse) vote matrix produced 12 votes per case: the nine Supreme Court justices and the three judges on the panel. We then fit a one-dimensional item response theory model to this vote matrix using MCMCpack (Martin and Quinn 2005b), identifying the model by fixing Rehnquist at 3 and Marshall at −3.
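For readers unfamiliar with this class of models, a standard one-dimensional probit item response specification can be written as follows. We offer it as a sketch of the general form only; the sign convention and the symbols $\theta_i$, $\alpha_j$, and $\beta_j$ are assumptions of this illustration rather than a statement of the exact parameterization used in the software.

```latex
% y_{ij} = 1 if judge or justice i casts a conservative vote in case j (coding assumed)
% \theta_i : ideal point of actor i
% \alpha_j, \beta_j : case-specific difficulty and discrimination parameters
\Pr(y_{ij} = 1) = \Phi\left(\beta_j \theta_i - \alpha_j\right),
\qquad
\theta_{\text{Rehnquist}} = 3,
\quad
\theta_{\text{Marshall}} = -3 .
```

Fixing two ideal points in this way pins down both the location and the direction of the scale, which is otherwise not identified from the votes alone.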
In Figure 12, we plot the results of this analysis. In the left-hand cell, we compare the Supreme Court estimates with those from a model fit only to the justices. These points should line up on the 45° line, but it is clear that they do not; instead, a conservative bias emerges. Estimates from the full model are even further to the right.
In the right-hand cell, we compare the distributions of estimated ideal points for the Supreme Court justices with the circuit court judges. Here, we see a liberal bias for the distribution of Court of Appeals judges and a conservative bias for Supreme Court justices. This is not terribly surprising since the Supreme Court typically reverses decisions of the appellate court. But it does suggest that, facially, the measures derived from this scaling exercise are inappropriate. In other words, bridging via case stimuli will not work in this context.
A final strategy is to embed a strategic model within the statistical measurement model. Two examples of this approach are Clinton and Meirowitz (2004), which explicitly models the agenda over a handful of congressional votes, and Martin and Quinn (2001), which invokes a hierarchical model to recover status quo and alternative points for Supreme Court cases using information about the circuit of origin. These (very technical) approaches exhibit some promise, but the three-judge panel makes them infeasible without extraordinary assumptions about case stimuli. We are thus skeptical about the promise of these approaches in the context of the hierarchy of justice.
We also owe thanks to Charles M. Cameron and Scott Comparato for supplying valuable insights; to the National Science Foundation for supporting our research on the hierarchy of justice and ideal point estimation; and to Michael Giles for making available his data on appellate judges. On the project's Web site (http://epstein.law.northwestern.edu/research/JCS.html) is a full replication archive, including databases housing the JCS scores, as well as the documentation necessary to reproduce our results.
For each calendar year 1953–2000, House, Senate, Supreme Court, and Appeals Court (for each circuit) medians.
For all Appeals Court judges in years 1953–2000, updated Giles et al. (2001) scores; that is, JCS scores.
For all Supreme Court justices in all terms 1952–1999 (calendar year 1953–2000), JCS scores (transformed Martin-Quinn scores).