## Abstract

The foundation upon which accounts of policy-motivated behavior of Supreme Court justices are built consists of assumptions about the policy preferences of the justices. To date, most scholars have assumed that the policy positions of Supreme Court justices remain consistent throughout the course of their careers and most measures of judicial ideology—such as Segal and Cover scores—are time invariant. On its face, this assumption is reasonable; Supreme Court justices serve with life tenure and are typically appointed after serving in other political or judicial roles. However, it is also possible that the worldviews, and thus the policy positions, of justices evolve through the course of their careers. In this article we use a Bayesian dynamic ideal point model to investigate preference change on the US Supreme Court. The model allows for justices' ideal points to change over time in a smooth fashion. We focus our attention on the 16 justices who served for 10 or more terms and completed their service between the 1937 and 2003 terms. The results are striking—14 of these 16 justices exhibit significant preference change. This has profound implications for the use of time-invariant preference measures in applied work.

## Introduction

Do the revealed preferences of Supreme Court justices change over time?1 The answer to this question is of profound importance to both policymakers and academics. When nominating someone to the US Supreme Court, presidents typically want to appoint a like-minded individual who will hold his or her ideological course for the entirety of his or her life term. Similarly, before voting on a nominee, senators need to form an expectation of how that person will decide cases over as many as the next 20–30 years. If justices tend to exhibit temporally stable revealed preferences it is *relatively* easy to form such expectations of future behavior and we might expect the ideological makeup of the Court to be extremely reflective of the balance of power between the Democratic and Republican Parties at the times of nomination.

The temporal stability of revealed judicial preferences is also of great importance to scholars of judicial politics. If the assumption of preference stability does not hold, then the findings of all studies that rely on time-invariant measures of justice preferences may be called into question. We note that this includes a very large fraction of judicial politics work appearing in the top journals. But for the study of Epstein et al. (1998) no systematic empirical analysis has determined the extent to which judicial preferences change over time.

In this article we employ a Bayesian dynamic ideal point model developed by Martin and Quinn (2002) to estimate revealed preferences for all Supreme Court justices serving between 1937 and 2003. This model allows us to separate the effects of case content from the effects of justice-specific policy positions and estimate ideal points that are on a comparable scale over time. Furthermore, because these ideal point estimates are based on a statistical measurement model, we can gauge uncertainty of the estimates and other quantities of interest. We use the results from this model to demonstrate that the revealed preferences of Supreme Court justices are far from stable.

The current article differs from that of Martin and Quinn (2002) in that it is solely concerned with assessing whether or not the revealed preferences of Supreme Court justices change over time, whereas the article by Martin and Quinn (2002) focuses on the general modeling strategy that makes the present article possible. The current article takes the model developed in Martin and Quinn (2002) as a starting point and examines the issue of preference change in a more systematic and comprehensive fashion than was done in Martin and Quinn (2002). It also extends the model by including data back to the 1937 term.

We begin this article by reviewing the literature and arguing that the Epstein et al. (1998) study has methodological limitations that restrict its ability to uncover preference change. We then discuss the measurement model we employ, highlighting its ability to estimate preferences while controlling for changes in case stimuli. In Section 4, we discuss research design and present the evidence for preference change. The final section concludes.

## A Methodological Critique of the Literature

Personal policy preferences, or attitudes, are key explanatory variables in attitudinal accounts of judicial behavior (Segal and Spaeth 1993). If the attitudinal model is true, then the revealed preferences, or ideal points, of the justices will correspond to their personal attitudes. Strategic accounts of Supreme Court decision making also make use of preferences for explaining interdependent behavior. As Epstein et al. (1998) note, the prevailing wisdom in the study of judicial behavior is that, “[t]he occasional anomaly notwithstanding, most jurists evince consistent voting behavior over the course of their careers” (801).

### The Assumption of Constant Preferences

The attitudinal model (Schubert 1974; Rohde and Spaeth 1976; Segal and Spaeth 1993) asserts that justices have personal attitudes and that case material provides stimuli that trigger the justices' attitudes and consequently their decisions. The model does not *explicitly* assume that attitudes are fixed. However, nearly all empirical work related to the attitudinal model employs constant measures of attitudes. The most commonly used measures are Segal and Cover (1989) scores, which are based on newspaper editorials at the time of confirmation. Others use measures such as the party identification of the justice (George and Epstein 1992) or measures of social background (Tate 1981). In some areas, such as civil rights and civil liberties, these time-invariant measures are shown to be quite successful in accounting for votes, but in others, like economics and federalism cases, their performance is much less impressive (Epstein and Mershon 1996). An alternative explanation of behavior, the strategic model (Eskridge 1991; Epstein and Knight 1998), asserts that justices have policy preferences and pursue their interests in an interdependent choice situation. Although this model does not necessarily assert that preferences are constant, nearly all empirical work in this genre employs constant measures, most notably Eskridge (1991) and Segal (1997), who use Segal and Cover (1989) scores, and Spiller and Gely (1992), who use the party of the appointing president.

The key point to take from the literature is that the assumption of constant preferences is not a theoretical one per se, but rather is chosen for empirical convenience. It is somewhat surprising, then, that little systematic research has been conducted to determine whether or not the assumption is consistent with the data. The anecdotal evidence suggests that preference change sometimes occurs (see, e.g., Ulmer 1981; Atkins and Sloope 1986, as well as accounts in the law reviews and the popular press).

These anecdotal accounts are suggestive, but to draw definite conclusions it is necessary to systematically study the behavior of many justices over time. The first to do so was Baum (1988), who was primarily interested in *policy* change on the Court. Although he claims there may be some preference change, he concludes that case stimuli, not preference change, are what explain the observed dynamics. In the only study with the goal of assessing preference change, Epstein et al. (1998) look at all 16 justices who served 10 or more terms and served their entire career between 1937 and 1993.2 They contend that it is vital to look at justices who have served for a long period of time and to only look at justices who have completed their entire service in the time period (otherwise, one could underestimate the number of justices who demonstrated significant preference change). To measure preferences, they argue that votes are the best place to look (Epstein and Mershon 1996) and use the Baum-corrected (1988) percentage of liberal votes on civil liberties as their measure of policy preferences. Given this measure, they fit linear, quadratic, and cubic regressions of preferences on time. They find that seven justices exhibit no significant preference change (Brennan, Burger, Burton, Harlan, Jackson, Marshall, and Stewart), four exhibit linear trends (Blackmun, Clark, Reed, and White), and five exhibit nonlinear change (Black, Douglas, Frankfurter, Powell, and Warren). Their conclusion is that preference change is significant and that it should be accounted for in future studies. However, these results depend on an ideal point estimator (a justice's Baum-corrected percentage of votes in the liberal direction on civil liberties cases) that is not well suited to the task at hand. 
As Baum (1988) notes, one of the three assumptions on which his method is based is “each justice's ideal point on the civil liberties dimension remains *constant throughout the justice's career*” (907, italics added).

### The Baum Correction

The Baum (1988) correction is the tool Epstein et al. (1998) employ to tie together estimates of ideal points throughout time. To formalize the problem of dynamic ideal point estimation, let θ_{t,j} ∈ ℝ denote the ideal point or policy position of justice *j* in term *t*. Furthermore, without loss of generality, let *x*_{k}^{(l)}, *x*_{k}^{(r)} ∈ ℝ with *x*_{k}^{(l)} ≤ *x*_{k}^{(r)}. If *x*_{k}^{(l)} < *x*_{k}^{(r)} we can think of *x*_{k}^{(l)} and *x*_{k}^{(r)} as the locations of the liberal and conservative policy alternatives for case *k*, respectively. These two case parameters contain the information about the policy content of each case. The midpoint between these two policy locations, (*x*_{k}^{(l)} + *x*_{k}^{(r)})/2 (also called the indifference point), determines the manner in which the justice votes.3 If, for example, *x*_{k}^{(r)} is strictly greater than *x*_{k}^{(l)}, then those to the left of the midpoint will be more likely to vote for the liberal option and those to the right will be more likely to vote in the conservative direction.

One approach to estimating θ_{t,j} is to take the raw average of the number of liberal decisions made by justice *j* in term *t*. This approach is not without problems. Figure 1 contains an illustration of two hypothetical configurations of preferences in terms *t* – 1 and *t* for nine justices. In the top line of the figure in term *t* – 1 the midpoint falls between Justice 5 and 6. In the second line, the location of the midpoint has changed and now falls between Justice 3 and 4. Notice that in the figure the preferences remain the same, but the observed vote would change (it would be 5–4 in the first case, and 6–3 in the second). Thus, computing a raw average across a set of cases could be misleading unless the changes in case stimuli are controlled for.
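
The point made with Figure 1 can be checked with a small sketch. It assumes deterministic spatial voting (a justice left of the midpoint votes liberal, otherwise conservative) and nine hypothetical, evenly spaced ideal points; the function name `vote_split` and the numbers are illustrative and do not come from the data.

```python
def vote_split(ideal_points, midpoint):
    """Return (liberal, conservative) vote counts for one case.

    Assumes deterministic spatial voting: a justice whose ideal point lies
    left of the midpoint votes liberal, all others vote conservative.
    """
    liberal = sum(1 for theta in ideal_points if theta < midpoint)
    return liberal, len(ideal_points) - liberal


# Hypothetical ideal points for Justices 1 through 9 (held fixed across terms).
justices = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Term t-1: the midpoint falls between Justices 5 and 6.
split_before = vote_split(justices, 5.5)
# Term t: the midpoint falls between Justices 3 and 4; ideal points are unchanged.
split_after = vote_split(justices, 3.5)

print(split_before, split_after)
```

The liberal vote count falls from five to three even though no ideal point moved, which is why a raw percentage of liberal votes cannot by itself separate preference change from change in case stimuli.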

Baum recognizes this fact and offers a correction to account for it. In his case, between two natural Courts (or, for Epstein et al. [1998], between terms), one computes the median change in the percentage of liberal votes made by each justice and then takes the median of these differences across all justices serving in those natural Courts (terms). The Baum correction constructs an ideal point estimate by subtracting this median difference from each justice's percentage of liberal decisions. More formally,

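$$
\hat{\theta}_{t,j} = p_{t,j} - \underset{i \in J_{(t-1):t}}{\operatorname{median}} \left( p_{t,i} - p_{t-1,i} \right)
$$

where *p*_{t,j} is the percentage of liberal votes cast by justice *j* in time period *t* and *J*_{(t–1):t} is the set of justices who served in both time period *t* and time period *t* – 1. The Baum correction thus cleanses case content from justices' voting behavior by assuming that preferences are fixed and that any dynamics are solely in the case parameters.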
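
A minimal sketch of the one-step correction described above may help fix ideas. It assumes percentages are stored in dictionaries keyed by justice; the function name `baum_correct` and all numbers are hypothetical, and this is an illustration of the verbal description rather than the exact procedure of Baum (1988) or Epstein et al. (1998).

```python
from statistics import median


def baum_correct(prev_pct, curr_pct):
    """Subtract the median period-to-period change from each current percentage.

    prev_pct, curr_pct: dicts mapping justice -> percent liberal votes in
    periods t-1 and t. Only justices serving in both periods enter the median.
    """
    continuing = sorted(set(prev_pct) & set(curr_pct))
    shift = median(curr_pct[j] - prev_pct[j] for j in continuing)
    return {j: pct - shift for j, pct in curr_pct.items()}


# Hypothetical data: every continuing justice is 10 points less liberal in period t.
prev = {"A": 70.0, "B": 50.0, "C": 30.0}
curr = {"A": 60.0, "B": 40.0, "C": 20.0, "D": 45.0}

corrected = baum_correct(prev, curr)
print(corrected)
```

In this example the correction restores each continuing justice to the previous period's value. The same output would result whether the 10-point drop came from a change in the cases or from every justice moving to the right, which is the identification problem at issue.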

As Baum notes, this correction is only appropriate if preferences are temporally constant. Indeed, by inspecting Figure 1, it is clear that if preferences were also allowed to move freely, it would be impossible to determine whether the derived correction was explained by changes in the case stimuli or changes in preferences. Without additional modeling, the two are in fact conflated. Observed changes in Baum-corrected percent liberalism measures will tell us that *something* changed (either the ideal points or the case parameters), but it cannot tell us *which* changed.4 Thus, since one of the major assumptions underlying the Epstein et al. (1998) study is inconsistent with the primary research goal of that study, one should view the results of Epstein et al. (1998) with some caution.

There are other limitations that call the Epstein et al. (1998) findings into further question. First, the authors treat their estimates as if they are known with certainty. However, they are estimates and thus have some estimation uncertainty attached. It is well known that failing to account for this uncertainty in the dependent variable will bias standard errors (SEs) downward. To test for significant preference change, Epstein et al. (1998) estimate linear, quadratic, and cubic models of ideal points regressed on time. From these regressions they draw the conclusions highlighted above. Not only does this assume that the variance of the regression disturbances is constant (which is unlikely to be the case given the underlying probability model) but it also rests heavily on the parametric assumption that ideal points follow low-order polynomials in time. If a justice's preferences change for only a small subset of terms, the slope estimates will be attenuated toward zero and thus biased against finding preference change. In short, even if the Baum correction were accurate, the Epstein et al. (1998) method of assessing preference change may be overly liberal by treating ideal point estimates as known quantities and may be overly conservative by estimating global models of preference change.

## Bayesian Dynamic Ideal Point Estimation

From this review it is clear that the question of whether or not the preferences of Supreme Court justices change over time is still open. It is also clear that to answer the question one requires a statistical model that allows for ideal points to exhibit a wide range of dynamics, based on a parametric statistical model that *simultaneously* estimates case stimuli and ideal points. Measures of uncertainty should be reported, and accounted for in diagnosing preference change. Further, tools other than global regression models are needed to assess the amount and magnitude of preference change.

The model we employ begins with two assumptions. First, justices vote in accordance with the spatial model outlined in the previous section. That is, all votes can be explained solely by considering the ideal points of the justices and the case stimuli. Second, we assume that a single issue dimension structures all decision making from 1937 to the present.5 With these two assumptions and a further assumption that disturbances to the latent utility of voting are independent Gaussian random variables, one can show that a standard two-parameter item response model can be used to estimate both case parameters and the ideal points from voting data (Clinton et al. 2004). This model, which we call a constant ideal point model, assumes that ideal points are time invariant.

Martin and Quinn (2002) extend this model and propose a dynamic ideal point model that allows for preferences to change over time. Conceptually, this model estimates all the case parameters from a distribution common to all terms and an ideal point for each justice in each term on a comparable scale across terms. The model is formalized as follows. Let *K*_{t} ⊂ {1, 2, …, *K*} denote the set of cases heard by the Supreme Court in term *t*, and *J*_{k} ⊂ {1, 2, …, *J*} denote the set of justices who heard case *k*. The cardinality |*J*_{k}| denotes the number of justices sitting on case *k*, which is typically nine, but fewer in certain cases. We are interested in modeling the decisions made in terms *t* = 1, …, *T* on cases *k* ∈ *K*_{t} by justices *j* ∈ *J*_{k} in a unidimensional issue space. We code all votes in term *t* on case *k* by justice *j* as either being in favor of the conservative option (*v*_{t,k,j} = 1) or the liberal option (*v*_{t,k,j} = 0). The observed data matrix **V** is thus a (*K* × *J*) matrix of votes and missing values. We note that, for reasons that will become apparent below, it matters not whether we code votes as liberal/conservative, majority/minority, affirm/reverse, and so forth.

The spatial model suggests that the ideal points of the justices, θ_{t,j} for the *j*th justice in term *t*, and the case stimuli *x*_{k}^{(l)} and *x*_{k}^{(r)}, determine votes on the merits. These are the quantities we wish to make inferences about from the data. To do so, let *z*_{t,k,j} denote the difference between the latent random utility of voting for the conservative policy and the latent random utility of voting for the liberal policy. We expect that this latent random utility explains the votes on the merits in the following fashion:

$$
v_{t,k,j} =
\begin{cases}
1 & \text{if } z_{t,k,j} > 0 \\
0 & \text{if } z_{t,k,j} \le 0
\end{cases}
$$

$$
z_{t,k,j} = \alpha_{k} + \beta_{k}\,\theta_{t,j} + \varepsilon_{t,k,j}
$$

where α_{k} = [*x*_{k}^{(l)}*x*_{k}^{(l)} – *x*_{k}^{(r)}*x*_{k}^{(r)}], β_{k} = 2[*x*_{k}^{(r)} – *x*_{k}^{(l)}], and ε_{t,k,j} is a random error term which we assume is homoscedastic with known variance.6 The two case parameters α_{k} and β_{k} characterize the case characteristics. More specifically, the ratio −α_{k}/β_{k} is the midpoint between *x*_{k}^{(l)} and *x*_{k}^{(r)}. The policy position for justice *j* in term *t* is θ_{t,j}. This model differs from the standard two-parameter item response model in that these ideal points are allowed to change over time.
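
The relationship between the policy locations and the two case parameters can be verified numerically. The sketch below uses the definitions of α_{k} and β_{k} given above; the function names and the example locations (−1 and 2) are hypothetical.

```python
def case_parameters(x_lib, x_con):
    """Map the liberal and conservative policy locations to (alpha_k, beta_k)."""
    alpha = x_lib * x_lib - x_con * x_con
    beta = 2.0 * (x_con - x_lib)
    return alpha, beta


def latent_mean(theta, alpha, beta):
    """Systematic part of z: positive values favor the conservative option."""
    return alpha + beta * theta


# Hypothetical case: liberal alternative at -1, conservative alternative at 2.
alpha, beta = case_parameters(x_lib=-1.0, x_con=2.0)
midpoint = -alpha / beta

print(alpha, beta, midpoint)
print(latent_mean(1.0, alpha, beta), latent_mean(0.0, alpha, beta))
```

Here −α_{k}/β_{k} equals 0.5, the average of the two locations, and the systematic part of the latent utility difference changes sign exactly at that point.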

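To complete the model it is necessary to assign prior distributions to all parameters. We begin by assuming standard Gaussian prior distributions for the case parameters7:

$$
\begin{bmatrix} \alpha_{k} \\ \beta_{k} \end{bmatrix} \sim \mathcal{N}_{2}\left(\mathbf{b}_{0}, \mathbf{B}_{0}\right) \quad \text{for all } k
$$

where **b**_{0} and **B**_{0} denote the prior mean vector and prior covariance matrix. For the ideal points, we assume a separate random walk prior for each justice:

$$
\theta_{t,j} \sim \mathcal{N}\left(\theta_{t-1,j}, \Delta_{\theta_{t,j}}\right) \quad \text{for } t = \underline{T}_{j}, \ldots, \overline{T}_{j}
$$

where $\underline{T}_{j}$ is the first term justice *j* served and $\overline{T}_{j}$ is the last term *j* served. We do not estimate ideal points for terms in which a justice did not serve. Δ_{θt,j} is an evolution variance parameter which is fixed a priori by the researcher. Its magnitude determines how much borrowing of strength (or smoothing) takes place from one time period to the next. Note that as Δ_{θt,j} → 0, we approach a model with temporally constant ideal points. At the other extreme, as Δ_{θt,j} → ∞, we get a model in which the ideal points are temporally independent. To complete the prior, we must anchor each time series at the unobserved time period zero. Here, in a slight abuse of notation, we let 0 denote time period $\underline{T}_{j} - 1$ for all *j*. We assume that:

$$
\theta_{0,j} \sim \mathcal{N}\left(m_{0,j}, C_{j,0}\right)
$$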

This approach is similar to that of Berry et al. (1999).
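
To illustrate how the evolution variance governs smoothing, the following sketch draws ideal point paths from a Gaussian random walk prior of the kind described above. It is a prior simulation only, not the estimation algorithm; the function name, the seed, and the parameter values are assumptions made for illustration.

```python
import random


def simulate_random_walk(m0, c0, evolution_var, n_terms, seed=0):
    """Draw one ideal point path from a Gaussian random walk prior.

    theta_0 ~ N(m0, c0) is the unobserved anchor; each later term is drawn as
    theta_t ~ N(theta_{t-1}, evolution_var). Variances, not SDs, are passed in.
    """
    rng = random.Random(seed)
    theta = rng.gauss(m0, c0 ** 0.5)
    path = []
    for _ in range(n_terms):
        theta = rng.gauss(theta, evolution_var ** 0.5)
        path.append(theta)
    return path


# Evolution variance of zero: the path never leaves its starting value.
flat = simulate_random_walk(m0=0.0, c0=1.0, evolution_var=0.0, n_terms=10)
# A small positive evolution variance: the path drifts smoothly across terms.
smooth = simulate_random_walk(m0=0.0, c0=1.0, evolution_var=0.1, n_terms=10)

print(flat)
print(smooth)
```

With a zero evolution variance the simulated path is constant, matching the limiting case of temporally constant ideal points; larger values let adjacent terms differ more.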

To estimate this model, we adopt the strategy of Martin and Quinn (2002), which is based on standard item response theory (Bock and Liberman 1970; Hambleton and Swaminathan 1985; Albert 1992; Bradlow et al. 1999; Johnson and Albert 1999) and Bayesian dynamic linear models (West and Harrison 1997). The strategy uses Markov chain Monte Carlo (MCMC) methods (Jackman 2000; Gill 2002) to simulate from the posterior distribution *f*(θ, α, β|**V**)∝*f*(**V**|θ, α, β)π(θ)π(α, β). These methods allow one to simulate from a distribution that is otherwise computationally intractable.8 There are many advantages to using Bayesian methods in the context of ideal point estimation; see Jackman (2001) and Bafumi et al. (2005) for a review. Due to the large number of parameters, maximum likelihood estimation for our dynamic ideal point model would be intractable.

Before we turn to our specific application, it is important to recognize some additional properties of this model. First, this is a fully parametric statistical model, which not only solves the fundamental problem of dynamic ideal point estimation but also allows us to report measures of uncertainty for all quantities of interest. Second, this approach does not conflate possible changes in case stimuli and ideal points. Both are estimated separately in the model: the case parameters α_{k} and β_{k} are the estimates of the case stimuli, and the θ_{t,j} are the ideal point estimates in each term. Third, this model allows for ideal points to change over time. The use of the random walk prior allows for change to take an extremely wide range of smooth forms and is *much more flexible* than assuming linear or polynomial change, as the D-NOMINATE model of Poole and Rosenthal (1997) does.

## The Evidence for Preference Change, 1937–2003

To make our results comparable to those of Epstein et al. (1998) and due to data availability we focus our attention on the Supreme Court from the 1937 to the 2003 terms. During this time period, 41 justices served (*J* = 41). We obtain data from three sources: (1) data for the 1953–2003 terms comes from the *Original United States Supreme Court Database* (Spaeth 2004); (2) data for the 1946–1952 terms comes from the *Vinson–Warren Court Database* (Spaeth 2001); (3) data for the 1937–1945 terms comes from an unpublished data set collected and used by Epstein et al. (1998).9 This selection results in *K* = 4741 total cases, the most heard in the 1972 term (108) and the fewest heard in the 2003 term (41).

To ensure a common scale for the ideal points across time, it is necessary to assign informative priors for justices that span the entire length of the study. In our case, we set the prior mean for the ideal points *m*_{0,j} to zero for all justices except Black, Stewart, and Rehnquist, with prior means −2.0, 1.0, and 3.0, respectively. The prior variances *C*_{j,0} were set to 1 for all justices but for these three; the prior variances for these three are set to 0.1. Note that this prior is only on the first term in which the justice served. For all other terms, the ideal point in the previous term serves as the prior mean. To complete the prior we set the evolution variance Δ_{θt,j} = 0.1 for all justices in all terms after their first.10 After specifying the priors, we employ the Martin and Quinn (2002) MCMC algorithm to simulate from the posterior distribution. With the posterior sample in hand, we first performed standard convergence tests. All suggested that the chain has reached the stationary distribution.11

Our main quantity of interest is the ideal points of the justices. Due to space considerations, we only report the ideal point estimates of the 16 justices Epstein et al. (1998) considered. These justices are chosen because they served for 10 or more terms and because they completed their entire terms of service between the 1937 and 2003 terms inclusive.12 The ideal point estimates, for each justice in each term in which they served, are presented in Figure 2. The large point in the middle is the posterior mean of the ideal points, and the error bars represent plus or minus two posterior standard deviations (SDs). One can think of the posterior mean as a point estimate and the posterior SD as an SE. The amount of uncertainty of the estimates depends primarily on two factors: *ceteris paribus*, more extreme justices are estimated with less certainty than centrist justices, and terms with fewer cases are estimated with less certainty than those with more cases.

The scale we estimate is a conservatism scale: higher values represent greater conservatism. The results in Figure 2 are striking. Many justices seem to trend over the course of their careers. Black begins his career as a liberal, but gets more conservative over time. Frankfurter and Reed also trend toward conservatism; Reed ends his career as a moderate, whereas Frankfurter ends his career as a conservative. Other justices get more liberal over time. The classic example is Blackmun, a Nixon appointee who was actually quite conservative in his first few terms. Yet by the time of his retirement in the mid-1990s, Blackmun had become quite liberal and in fact was one of the most liberal justices on the Court. Clark, Powell, and Warren also seem to become slightly more liberal over the course of their careers.

Changes in ideal points are not limited to directional trends. Some justices remain somewhat constant throughout the course of their careers, such as Stewart and White. Others exhibit more exotic patterns. Douglas begins his career as liberal, becomes more moderate through the late 1940s and early 1950s, and then becomes increasingly liberal through the remainder of his career. Harlan too has a parabolic shape, although the amount of change is far less dramatic than that of Douglas.

How well does the model fit? In short, quite well. In particular, these results correlate highly with percent liberalism in civil rights, civil liberties, economics, and federalism cases (Martin and Quinn 2002). This is surprising, as most existing measures only fare well for civil rights and civil liberties cases (Epstein and Mershon 1996). Additionally, the model has solid explanatory power. Overall, the model correctly classifies 75.7% of the votes.13 In Figure 3 we plot the term-by-term percent correct classification for our model. The model performed worst in the 1945 term (68.5% correctly classified) and best in the 1939 term (84.9% correctly classified). There are clearly dynamics in the classification rate, but compared to a baseline of 50% classification, the model does well. Not surprisingly, the model appears to do increasingly better in terms with stable membership; the big dips correspond to the appointment of new justices.
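
One simple way to compute a classification rate of this kind is sketched below. It assumes that a vote is predicted conservative whenever the systematic part of the latent utility difference is positive at a set of point estimates; the exact rule used to produce the figures reported above is not stated here, so the function, data layout, and numbers are hypothetical.

```python
def percent_correct(votes, alpha, beta, theta):
    """Percentage of observed votes that match the model's predicted vote.

    votes: list of (term, case, justice, vote), with vote 1 = conservative
    and 0 = liberal. alpha, beta: dicts keyed by case. theta: dict keyed by
    (term, justice). Prediction rule (an assumption): conservative if
    alpha_k + beta_k * theta_{t,j} > 0, liberal otherwise.
    """
    hits = 0
    for term, case, justice, vote in votes:
        score = alpha[case] + beta[case] * theta[(term, justice)]
        predicted = 1 if score > 0 else 0
        hits += int(predicted == vote)
    return 100.0 * hits / len(votes)


# Hypothetical point estimates for one case and four justices in one term.
alpha = {"k1": 0.0}
beta = {"k1": 1.0}
theta = {(1, "A"): -1.0, (1, "B"): 0.5, (1, "C"): 1.5, (1, "D"): 2.0}
votes = [(1, "k1", "A", 0), (1, "k1", "B", 1), (1, "k1", "C", 0), (1, "k1", "D", 1)]

rate = percent_correct(votes, alpha, beta, theta)
print(rate)
```

In this toy example only the vote of justice C is misclassified. Grouping the same calculation by term would give a term-by-term series like the one plotted in Figure 3.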

From Figure 2 it appears as if the preferences of Supreme Court justices change a great deal over time. However, these are quantities that are measured with uncertainty. It is important to account for that uncertainty when making claims about whether or not there is significant change in ideal points for individual justices. To assess preference change, it is also important *not* to rely on global models of change, such as linear regression models. In fact, the probability of interest is the posterior probability that a particular justice is more conservative in subsequent terms than in a baseline term. As Hagle (1993) notes, justices learn a great deal during their first term of service. Maltzman and Wahlbeck (1996) also show that justices are amenable to persuasion early in their careers. This implies that the first term of service is not a terribly reliable baseline. For our first comparison, we take the mean ideal point of each justice's second, third, and fourth terms of service as the baseline for comparison θ*_{j}. Then, for all subsequent terms, we compute the posterior probability that the justice is more conservative than the baseline. Formally, we compute:

$$
\Pr\left(\theta_{t,j} > \theta^{*}_{j} \mid \mathbf{V}\right)
$$

for each term *t* after justice *j*'s fourth term of service.

For each of the 16 justices, we plot these posterior probabilities in Figure 4. Each cell of the figure contains dotted horizontal lines at 0.025 and 0.975. If the estimated probability is greater than 0.975, then we can conclude that the justice was significantly more conservative in that term than in the baseline. If the estimated probability is less than 0.025, we can conclude that the justice was significantly more liberal in that term than in the baseline.
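
Given posterior draws of a justice's ideal points, this probability can be approximated by the share of draws in which the term's ideal point exceeds the baseline. The sketch below assumes the draws are stored as lists of equal length, one per term, with the same position in each list belonging to the same MCMC iteration; the function names and the draws themselves are hypothetical.

```python
def baseline_draws(draws_by_term, baseline_terms=(2, 3, 4)):
    """Per-iteration mean ideal point over the baseline terms of service."""
    n = len(draws_by_term[baseline_terms[0]])
    return [
        sum(draws_by_term[t][i] for t in baseline_terms) / len(baseline_terms)
        for i in range(n)
    ]


def prob_more_conservative(term_draws, base_draws):
    """Share of iterations in which the term's ideal point exceeds the baseline."""
    pairs = list(zip(term_draws, base_draws))
    return sum(t > b for t, b in pairs) / len(pairs)


def classify(prob, lower=0.025, upper=0.975):
    """Apply the two thresholds drawn as dotted lines in Figure 4."""
    if prob > upper:
        return "significantly more conservative"
    if prob < lower:
        return "significantly more liberal"
    return "no significant change"


# Hypothetical draws (four iterations) for terms 2, 3, 4, and 10 of service.
draws = {
    2: [0.0, 0.1, -0.1, 0.0],
    3: [0.1, 0.0, 0.0, -0.1],
    4: [0.0, 0.0, 0.1, 0.1],
    10: [1.0, 1.2, 0.9, 1.1],
}

prob = prob_more_conservative(draws[10], baseline_draws(draws))
print(prob, classify(prob))
```

Because the comparison is made draw by draw, the uncertainty in both the baseline and the later term carries through to the reported probability. A real application would use thousands of draws rather than four.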

The results in Figure 4 are striking. Justices Black, Douglas, Frankfurter, Harlan, Jackson, Reed, and White are significantly more conservative in some subsequent terms than the baseline. Justices Blackmun, Brennan, Burger, Clark, Douglas, Harlan, Marshall, Powell, and Warren are significantly more liberal in some subsequent terms.

Justices Douglas and Harlan are significantly more conservative than the baseline in some subsequent terms and significantly more liberal than the baseline in other subsequent terms. This is consistent with the parabolic trajectories in Figure 2. But the patterns are quite different, as Harlan is significantly more liberal than the baseline only in his final term, whereas Douglas is significantly more liberal for well over his final decade. Another interesting pattern is White, who is significantly more conservative than the baseline for two periods. Only two justices, Burton and Stewart, demonstrate no significant change in their ideal points, *even after controlling for changes in case stimuli*.14 The implication is clear: Epstein et al. (1998) *underestimate* the amount of preference change on the Supreme Court, and their conclusion that Brennan, Burger, Harlan, Jackson, and Marshall exhibit no significant change is incorrect. This is likely due to their measurement strategy and their use of only a global test of preference change. Indeed, our results for Harlan, Douglas, and White show that assuming a particular functional form, either linear or parabolic, for ideal point trajectories is inappropriate. The findings for these justices would be masked (as with Harlan) or attenuated (as with Douglas and White) when using a global measure.

Instead of imposing a particular baseline for comparison, we present further evidence of preference change in Figure 5. To construct this figure, we compute the posterior probability that a given justice is more conservative in term *r* than in term *s* for all possible combinations where *r* > *s*. The specific algorithm used to compute this quantity is discussed in the Appendix. We summarize these posterior probability profiles for four justices of interest in Figure 5. The baseline term is on the *y* axis, and the comparison term is on the *x* axis. For example, if one is interested in determining whether or not Justice Black is more conservative in subsequent terms than in the 1950 term, one would read across from left to right at the 1950 tick on the *y* axis. The legend shows how the probabilities are encoded in the figure. The bright red color implies that the justice is significantly more liberal, and the bright blue color implies that the justice is significantly more conservative. For the sake of space, we only present these profiles for Justices Black, Harlan, Marshall, and Stewart. The profiles for the other 12 justices are available in the Web appendix.
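
A straightforward Monte Carlo approximation of such a profile is sketched below. It is not the algorithm from the Appendix, which is not reproduced here; it simply compares posterior draws for each pair of terms, assuming the draws for one justice are stored as equal-length lists keyed by term. The function name and the draws are hypothetical.

```python
def pairwise_profile(draws_by_term):
    """Return {(s, r): Pr(theta_r > theta_s | V)} for all term pairs with r > s.

    draws_by_term maps a term to a list of posterior draws for one justice;
    position m in every list is assumed to come from the same MCMC iteration.
    """
    terms = sorted(draws_by_term)
    profile = {}
    for i, s in enumerate(terms):
        for r in terms[i + 1:]:
            n = len(draws_by_term[s])
            later_is_higher = sum(
                draws_by_term[r][m] > draws_by_term[s][m] for m in range(n)
            )
            profile[(s, r)] = later_is_higher / n
    return profile


# Hypothetical draws for one justice over three terms (four iterations each).
draws = {
    1950: [0.0, 0.2, 0.1, 0.3],
    1951: [0.1, 0.1, 0.4, 0.2],
    1952: [1.0, 1.1, 0.9, 1.2],
}

profile = pairwise_profile(draws)
print(profile)
```

Each key pairs a baseline term *s* with a later comparison term *r*, which corresponds to reading one row of a cell in Figure 5 from left to right.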

The results in Figure 5 confirm the conclusions drawn from Figure 4. Compared to his first terms, Justice Black is significantly more conservative in nearly every subsequent term. If we choose terms in the mid-1950s as the baseline, we see that Justice Black was also significantly more conservative in the late 1960s. Similarly, we see that Justice Marshall became significantly more liberal over the course of his service; by the early 1980s, however, he remains a stable liberal. Justice Harlan, with the parabolic trajectory, is another interesting case. The estimated posterior probability profile shows that, depending on the baseline category, Harlan became significantly more conservative or significantly more liberal. Finally, the cell for Justice Stewart shows no significant preference change regardless of the baseline. These profiles show that global tests of preference change are inappropriate; rather, one should use local estimates of the probabilities of interest as we have done here. The findings from these results are striking: preference change is a common phenomenon on the Supreme Court.

To get a sense of how our results fit with more qualitative accounts of attitudinal change we examine the path of Justice Blackmun's ideal points together with the cut-points from two important death penalty cases. Justice Blackmun was the justice with perhaps the most dramatic change in preferences over the course of his career (Greenhouse 2005). When he was nominated by President Nixon, the model suggests that Justice Blackmun was the second most conservative member of the Court (second only to Chief Justice Burger); when he retired, he was the second most liberal member of the Court (Justice Stevens was more liberal).

Figure 6 plots the ideal point trajectory of Justice Blackmun along with the cut-points for two major death penalty cases. In 1972, the Supreme Court decided Furman v. Georgia, 408 U.S. 238 (1972), and held that the death penalty as currently employed in the states was a cruel and an unusual punishment. The Court ruled that the arbitrariness of the use of the death penalty violated the Constitution. Blackmun dissented from this decision, thus taking the conservative side. Figure 6 shows the cut-point in the Furman case. The model predicts that justices above the cutline should vote in the conservative direction. In the 1971 term Blackmun was clearly on the conservative side of the cut-point. A second important death penalty case was decided in 1976; Gregg v. Georgia, 428 U.S. 153 (1976). In Gregg the Court reversed course and upheld the constitutionality of death penalty legislation spawned in response to Furman. The Court held that the new death penalty statutes in the states were consistent with the protections of the Eighth and Fourteenth Amendments. Blackmun concurred with this conservative decision, which is also consistent with the model; in the 1975 term he falls above the Gregg cut-point (on the conservative side).

Looking at Blackmun's votes on these cases in isolation, one would have no reason to believe that his views were becoming more liberal. This would be hard to square with his later decisions and writings, which clearly show that he had moved to the left (Greenhouse 2005). However, because the model used here looks at all of Blackmun's votes over time together with the votes of the justices he served with, it points to a gradual leftward shift throughout most of Blackmun's tenure on the Court. The substantive content of Blackmun's decisions and other writings never enters the model, yet the model's results are remarkably consistent with what we know of Blackmun from more qualitative accounts.

The model also allows us to entertain counterfactuals about how justices would have decided cases at different points in time. Although, for obvious reasons, such an exercise should not be taken too seriously, it does allow us to get a sense of how *important* the changes we identify are. For instance, the results from our model suggest that had the Court heard Furman in 1976, there is less than a 50% chance that Blackmun would have upheld the existing Georgia death penalty statute. Similarly, had Gregg been decided by the Court in 1985 or later, our model suggests that Blackmun would have voted to strike down the new death penalty laws. The changes in Blackmun's revealed preferences are not only statistically significant but also substantively important.
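A counterfactual probability of this kind can be approximated by averaging over posterior draws. The sketch below is only an illustration, not the authors' code. It assumes a probit link in which the probability of a conservative vote is Φ(α + βθ), that votes are coded 1 when conservative, and that the draws for the justice's ideal point in the counterfactual term and for the case parameters are stored as equal-length lists. The function names `normal_cdf` and `prob_conservative_vote` are hypothetical.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def prob_conservative_vote(theta_draws, alpha_draws, beta_draws):
    """Average probability of a conservative vote over joint posterior draws.

    theta_draws: draws of the justice's ideal point in the counterfactual term
    alpha_draws, beta_draws: draws of the case parameters (assumed probit form)
    """
    if not (len(theta_draws) == len(alpha_draws) == len(beta_draws)):
        raise ValueError("all draw lists must have the same length")
    if not theta_draws:
        raise ValueError("at least one draw is required")
    total = 0.0
    for theta, alpha, beta in zip(theta_draws, alpha_draws, beta_draws):
        # Assumed link: P(conservative vote) = Phi(alpha + beta * theta)
        total += normal_cdf(alpha + beta * theta)
    return total / len(theta_draws)
```

Under these assumptions, a value below 0.5 for a given term would correspond to a statement such as "less than a 50% chance" of a conservative vote in that term.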

## Implications and Conclusion

The results presented above strongly suggest that the policy positions of Supreme Court justices do not remain constant throughout the course of their careers. This finding goes against much of the prevailing wisdom in judicial politics research and calls into question the results from a large body of research that explicitly assumes temporal stability of preferences. When scholars employ preference measures that are constant across time, such as Segal and Cover (1989) scores, the independent variable capturing preferences will be measured with *systematic error*. It is well known that this can bias the estimates of structural parameters of interest and lead to incorrect substantive conclusions. Oftentimes, when one is analyzing a single cross section of data, this measurement error will be of little consequence. But when studying Supreme Court behavior over time, using time-invariant preference measures can lead to incorrect conclusions about the effect of preferences or other variables on the outcome of interest. Since most statistical studies of judicial behavior look at behavior over time (e.g., Segal and Spaeth 1993; Maltzman and Wahlbeck 1996), this is a serious concern.
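A small simulation can make the measurement-error concern concrete. The sketch below is purely illustrative and uses made-up values, not our data: it assumes a hypothetical drift pattern in which every justice moves steadily toward the center, a linear outcome with a true effect of 1.0, and standard normal noise. It then compares the regression slope obtained with a time-invariant measure (the first-term position) to the slope obtained with the term-specific position. All names and parameter values are assumptions.

```python
import random


def ols_slope(x, y):
    """Slope from a bivariate least-squares regression of y on x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simulate(n_justices=200, n_terms=20, drift=0.05, effect=1.0, seed=1):
    """Return (slope using fixed measure, slope using time-varying measure)."""
    rng = random.Random(seed)
    fixed, varying, outcome = [], [], []
    for _ in range(n_justices):
        theta0 = rng.gauss(0.0, 1.0)  # position in the first term
        for t in range(n_terms):
            # Hypothetical systematic drift: positions shrink toward zero.
            theta_t = theta0 * (1.0 - drift * t)
            y = effect * theta_t + rng.gauss(0.0, 1.0)
            fixed.append(theta0)      # time-invariant measure
            varying.append(theta_t)   # time-varying measure
            outcome.append(y)
    return ols_slope(fixed, outcome), ols_slope(varying, outcome)


slope_fixed, slope_varying = simulate()
print("slope with time-invariant measure:", round(slope_fixed, 3))
print("slope with time-varying measure:  ", round(slope_varying, 3))
```

In this stylized setup the time-invariant measure understates the effect of preferences, while the time-varying measure recovers a slope close to the true value. Other drift patterns would produce different distortions.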

The most commonly used measure of judicial preferences, the Segal and Cover (1989) score, is time invariant. This measure has the advantage of being truly exogenous from behavior, but the results presented here demonstrate that the assumption of constancy is unwarranted. Epstein and Mershon (1996) demonstrate that the explanatory power of Segal and Cover (1989) scores is limited to civil rights and liberties issues and that the scores should only be used to study aggregated votes for those issue areas. Our results go a step further. Based on our results, Segal and Cover (1989) scores should generally not be used to study judicial behavior that occurs over time. Other measures, such as party identification of the justices or social background characteristics, suffer the same ills.

What is the solution to this problem? One important by-product of our research is that we estimate an ideal point for every justice in every term for all justices serving between 1937 and 2003. These measures are available in electronic form in the Web appendix. Our measures are time varying and thus do not suffer from the deficiency of other commonly used measures. When studying phenomena other than votes on the merits, these measures can be employed as independent variables to explain behavior. However, in a strict sense, the measures should not be used directly in probit or logit models to study votes on the merits because these same votes were used to construct the measures.

One important question that we leave for future research is: what *explains* preference change? Although the results clearly show that justices exhibit change in revealed preferences over time, we do not offer an exhaustive theory for why this is so. While it is tempting to speculate about what such a grand theory might look like,15 we suspect that personal, idiosyncratic reasons operating at the justice level are at least as important as, if not more important than, broad system-level mechanisms. Moreover, with only a small number of justices, attacking this problem with a *statistical* research design is implausible. A far better approach would be to perform a historical or doctrinal analysis of the behavior of individual justices. We look forward to future work that pursues this research agenda.

At the end of the day, this article contributes to the judicial politics literature in a number of ways. Our dynamic ideal point model provides an extremely powerful and flexible method to simultaneously learn about the effects of justice policy positions and case content on decision making. The data suggest that nearly every justice exhibits statistically significant preference change over the course of his or her career. The patterns uncovered by the model are substantively interesting and facially valid. Just as important as what we have learned from this article is what we still do not fully understand. Although there is a great deal of evidence that the revealed preferences of justices change over time, more research is needed to understand the mechanisms that cause such change.

### Appendix

#### Monte Carlo Estimates of Quantities of Interest

To estimate the quantities of interest regarding preference change, we use the following Monte Carlo algorithms. The following pseudocode illustrates how the quantity in equation (6) can be calculated for term *r* and justice *j* using the draws from the posterior distribution of **θ**:

*Algorithm A1*. PROBMORECONS1 (**θ**, *r*, *j*)

$G$ is the number of MCMC draws and $P_{r,j}$ is the probability that justice $j$ is more conservative than the baseline in term $r$.
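As an illustration of a Monte Carlo estimate of this type, the following Python sketch computes the share of posterior draws in which a justice's ideal point in term *r* exceeds his or her ideal point in a baseline term. It is a sketch under stated assumptions rather than a transcription of Algorithm A1: it assumes the draws are stored as a nested list `theta[g][t][j]` (draw, term, justice), that larger values are more conservative, and that the baseline is a single term index. The function name and the `baseline` argument are hypothetical.

```python
def prob_more_cons_baseline(theta, r, j, baseline=0):
    """Estimate P(justice j is more conservative in term r than at baseline).

    theta: nested list indexed as theta[g][t][j] with G posterior draws
           (assumed storage layout)
    r: index of the term of interest
    j: index of the justice
    baseline: index of the baseline term (assumed to be a single term)
    """
    G = len(theta)
    if G == 0:
        raise ValueError("at least one posterior draw is required")
    count = 0
    for g in range(G):
        # Larger ideal points are treated as more conservative.
        if theta[g][r][j] > theta[g][baseline][j]:
            count += 1
    return count / G
```

With four draws in which the term-*r* value exceeds the baseline value three times, the function returns 0.75.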

For the quantity of interest plotted in Figure 3:

*Algorithm A2*. PROBMORECONS2 (**θ**, *r*, *s*, *j*)

$P_{r,s,j}$ is the probability that justice $j$ is more conservative in term $r$ than in term $s$.
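A pairwise comparison of this kind can be sketched in Python as follows. This is an illustrative sketch, not a transcription of Algorithm A2: it assumes the posterior draws are stored as a nested list `theta[g][t][j]` (draw, term, justice) and that larger values are more conservative. The helper `prob_profile`, which fills in the full term-by-term matrix for one justice, is a hypothetical convenience.

```python
def prob_more_cons_pair(theta, r, s, j):
    """Estimate P(justice j is more conservative in term r than in term s).

    theta: nested list indexed as theta[g][t][j] (assumed storage layout)
    """
    G = len(theta)
    if G == 0:
        raise ValueError("at least one posterior draw is required")
    hits = sum(1 for g in range(G) if theta[g][r][j] > theta[g][s][j])
    return hits / G


def prob_profile(theta, j):
    """Matrix whose (r, s) entry is the estimate for terms r and s."""
    n_terms = len(theta[0])
    return [
        [prob_more_cons_pair(theta, r, s, j) for s in range(n_terms)]
        for r in range(n_terms)
    ]
```

Entries near 1 indicate that the justice is very likely more conservative in term *r* than in term *s*, entries near 0 indicate the reverse, and the diagonal is 0 by construction because a term is never strictly more conservative than itself.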

We gratefully acknowledge additional financial support from the Weidenbaum Center at Washington University and the Center for Statistics and the Social Sciences with funds from the University Initiatives Fund at the University of Washington. Supplementary results, a replication data set, and documented C++ code to estimate the model using the Scythe Statistical Library (Martin and Quinn 2003) are available in a Web appendix at the authors' Web sites. The data sets were built from the *Original United States Supreme Court Database* (Spaeth 2004), the *Vinson–Warren Court Database* (Spaeth 2001), and a data set generously provided by Lee Epstein, Valerie Hoekstra, Jeffrey Segal, and Harold Spaeth. All errors and interpretations remain our sole responsibility.

## References

*revealed preferences*. By this we mean the preferred policy positions defined in an issue space revealed through the votes of the justices (Epstein and Mershon 1996). It is important to note that this notion of revealed policy preferences is conceptually very different from personal policy preferences or attitudes (Segal and Spaeth 1993). The revealed preferences that we estimate on a policy scale are likely caused by any number of factors, including personal attitudes, the decision context, and so forth. The purpose of this article is to document the change in revealed preferences.

$_{k}$ and $\beta_{k}$ are specific to case $k$ and only depend on the differences of policy outcomes and their squares, votes can be coded in any way that is consistent across justices within a case and the likelihood will not change.

$\mathbf{I}_{2}$, $5 \cdot \mathbf{I}_{2}$, and $10 \cdot \mathbf{I}_{2}$; the substantive results remain the same. This suggests that the data contain a reasonable amount of information about these case parameters.

$\Delta_{\theta_{t,j}} = 0.01$, $\Delta_{\theta_{t,j}} = 0.25$, $\Delta_{\theta_{t,j}} = 0.5$, and $\Delta_{\theta_{t,j}} = 3.0$ and find nearly identical results. This implies that regardless of the amount of smoothing, many justices exhibit preference change.