## Abstract

Two contributions in this issue, Grant and Lebo and Keele, Linn, and Webb, recommend using an ARFIMA model to diagnose the presence of and estimate the degree of fractional integration, then either (i) fractionally differencing the data before analysis or, (ii) for cointegrated variables, estimating a fractional error correction model. But Keele, Linn, and Webb also present evidence that ARFIMA models yield misleading indicators of the presence and degree of fractional integration in a series with fewer than 1000 observations. In a simulation study, I find evidence that the simple autodistributed lag model (ADL) or equivalent error correction model (ECM) can, without first testing or correcting for fractional integration, provide a useful estimate of the immediate and long-run effects of weakly exogenous variables in fractionally integrated (but stationary) data.

## 1 Introduction

Political scientists strive to study the substantive meaning of temporal dynamics rather than to treat them as nuisances to be fixed in the error term ( Beck and Katz 1996 ). If the autodistributed lag (ADL) and mathematically equivalent error correction models (ECM) described by De Boef and Keele (2008) have been inappropriately used in some instances, I suspect that it is out of a laudable desire to pursue this agenda. This issue’s article by Grant and Lebo (2016) reminds us that how we handle the nuisances is still important, and that doing so improperly can lead to “discovering” non-existent relationships. In particular, I concur with Grant and Lebo’s advice to carefully assess the stationarity of variables prior to analysis: time-series analysis of non-stationary variables without appropriate differencing (or an appropriate ECM, if variables are cointegrated) can yield misleading results. This advice is also encapsulated in Keele, Linn, and Webb’s ( 2016 ) very helpful Table 5.

But the analysis guidelines produced by Grant and Lebo (2016) and Keele, Linn, and Webb (2016) raise a significant question about the handling of fractionally integrated data. When fractional integration is suspected in a time series, both articles recommend measuring the degree of fractional integration d with an ARFIMA model, then using d to either (i) fractionally difference the data before analysis, or (ii) when co-integration is present, estimate a fractional ECM model. But Keele, Linn, and Webb (2016) are also justifiably skeptical of the ARFIMA model’s ability to recover d in a data set with a small number of temporal observations T. First, they present simulation evidence (in Fig. 1 ) that ARFIMA estimates of d are both noisy and biased when T < 1000. They also show that the ARFIMA model frequently produces false positive results for d in non-fractionally integrated data under the same circumstances. They cite prior studies indicating that these limitations of the ARFIMA model have been observed by other researchers. Finally, they express concern that complex models in very short data sets (with many parameters per observation) are likely to be overfitted. If fractionally integrated data are both ubiquitous (as Grant and Lebo [2016] suggest they are) and difficult to study in short data sets, the evidence offered by Keele, Linn, and Webb (2016) suggests that a large number of important questions in political science may be very difficult to answer using the recommended methodology.

Fig. 1

ADL approximation of null ARFIMA models. These figures depict the result of analyzing 1000 simulated data sets from the ARFIMA process in equations (2)–(4) with $\gamma_x=0$, using the ADL model in equation (5). The left figure depicts the estimates of $\hat{\beta}_2$ from the model; these should be centered on 0 if the estimates are accurate. The line of numbers at the bottom of the left figure depicts the proportion of $\hat{\beta}_2$ estimates that are statistically significant ($\alpha=0.05$, two-tailed) for each value of d. The right figure depicts the estimates of $\widehat{LRM}$ from the Bewley transformation of the model; these should also be centered on 0 if the estimates are accurate. The line of numbers at the bottom of the right figure depicts the proportion of $\widehat{LRM}$ estimates that are statistically significant ($\alpha=0.05$, two-tailed) for each value of d.

I examine an alternative: if data might be fractionally integrated but are stationary (and the independent variables are weakly exogenous), estimate an ADL/ECM model on the data without first estimating d and fractionally differencing. Although this model is undoubtedly misspecified, it may nevertheless provide an accurate approximation of important dynamic relationships in the data. Moreover, it is considerably simpler than the ARFIMA model and perhaps less susceptible to over-fitting. In a simulation study, I find evidence that an ADL/ECM can accurately detect and recover immediate and long-run relationships in this setting while avoiding false positives. Consequently, the ADL/ECM appears to be a valid option for studying short T data sets with fractional integration. The results suggest that dealing with the non-stationarity and/or co-integration of time-series data is a methodologically higher priority compared to correcting for possible fractional integration, and that researchers may generally trust the results of ADL/ECM models in this environment.

## 2 Approximating ARFIMA Data with ADL/ECM Models

The general form of an ARFIMA process for a variable y, as defined by Shumway and Stoffer (2010, 272), is written as

(1)
$\phi(L)(1-L)^{d}(y_t-\mu_t)=\theta(L)\varepsilon_t,$
where $t=1 \ldots T$ indexes time, a non-integer d indicates a degree of fractional differencing, $\phi(L)$ is an autoregressive function of lag operators 1 L which acts on the fractionally differenced and de-meaned y, 2 and $\theta(L)$ is a moving average function of lag operators which acts on the white noise term $\varepsilon$. 3 When y is suspected to be fractionally integrated, Grant and Lebo (2016) and Keele, Linn, and Webb (2016) recommend estimating d using an ARFIMA model that matches this process, then fractionally differencing y before determining its relationship with other variables like x. The variable x may also need to be differenced prior to analysis; a fractional ECM may be possible if x and y share the same d.
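To make the fractional differencing operator in equation (1) concrete, the sketch below expands $(1-L)^{d}$ as a binomial series in the lag operator and applies it to a series. It is an illustration written in Python with hypothetical function names (`frac_diff_weights`, `frac_diff`), not the routine used in the studies discussed here, and it assumes the expansion is simply truncated at the start of the sample.

```python
def frac_diff_weights(d, n_terms):
    """Coefficients pi_k of (1 - L)^d = sum_k pi_k L^k, for k = 0..n_terms-1."""
    weights = [1.0]
    for k in range(1, n_terms):
        # Binomial-series recursion: pi_k = pi_{k-1} * (k - 1 - d) / k
        weights.append(weights[-1] * (k - 1 - d) / k)
    return weights


def frac_diff(series, d):
    """Apply (1 - L)^d to a series, truncating the expansion at the sample start."""
    w = frac_diff_weights(d, len(series))
    return [sum(w[k] * series[t - k] for k in range(t + 1))
            for t in range(len(series))]
```

With d = 1 the weights collapse to an ordinary first difference, and with d = 0 the series is returned unchanged; for non-integer d every past observation receives a nonzero weight, which is the source of the long memory discussed in the text.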

Given the objections to ARFIMA modeling in short T data sets raised by Keele, Linn, and Webb (2016), it may be possible to use the ADL/ECM to approximate a fractionally integrated data generating process in order to recover both immediate and long-run relationships $dy/dx$ when x is weakly exogenous with respect to y. 4 As long as $d\in(-\tfrac{1}{2},\tfrac{1}{2})$, a series like equation (1) is stationary (Shumway and Stoffer 2010, 269); De Boef and Keele (2008) demonstrated that the ADL/ECM can be very useful for studying dynamic relationships in stationary data.

The ADL/ECM is obviously misspecified for the data-generating process in equation (1) . The intent is not to precisely mirror the data-generating process, but to approximate immediate and long-run relationships between x and y within an acceptable degree of error. Given the problems that Keele, Linn, and Webb (2016) identify in estimating d in short panels (and the complexity of time-series analysis in general), some form of misspecification may be inevitable. Moreover, because the ADL/ECM is a relatively simple model, the risk of overfitting may be reduced compared to the more complex ARFIMA model.

Grant and Lebo (2016) are primarily motivated by imprudent use of the ADL/ECM model that often finds relationships among variables where none exist. Thus, to recommend the use of ADL/ECM models to study fractionally integrated data without first fractionally differencing, it is imperative to demonstrate that this use does not encounter the problems that they identify. The key issue is whether the long-run memory present in a series with fractional integration creates spurious or severely biased estimates of short- or long-run relationships between x and y. I answer this question using a simulation study.

## 3 Monte Carlo Evidence

Consonant with the concerns of Grant and Lebo (2016), my simulation study is designed to answer four questions:

1. Do ADL/ECM models find immediate or instantaneous relationships in fractionally integrated data where they do not exist?

2. Do ADL/ECM models find long-term or dynamic relationships in fractionally integrated data where they do not exist?

3. Can ADL/ECM models accurately recover the magnitude and direction of immediate/instantaneous relationships in fractionally integrated data where they do exist?

4. Can ADL/ECM models accurately recover the magnitude and direction of long-term relationships in fractionally integrated data where they do exist?

To answer these questions, I create three simulated data-generating processes with different characteristics. 5 Specifically, I vary the relationship between x and y: the possibilities are that (1) changes in x have no impact on y, (2) permanent changes in x at time $t^*$ have an immediate, permanent impact on y at time $t^*$ but no further impact, and (3) permanent changes in x have an immediate impact on y that continues to increase over time as a long-run adjustment in y. Because Keele, Linn, and Webb (2016) note that difficulties in estimating d are most acute in short data sets, I assess the ADL’s suitability for time series with T = 100; this is sufficiently short that we might be skeptical of estimates of d from an ARFIMA model.

### 3.1 Fractionally Integrated Data Generating Processes without Long-Term Adjustment

To simulate fractionally integrated data for y and x , I create data using an ARFIMA process with the form

(2)
$(1-0.5L)(1-L)^{d}(y_t-\mu_t)=\varepsilon_t$

(3)
$\mu_t=\gamma_x x_t$

(4)
$(1-0.5L)(1-L)^{0.3}x_t=\psi_t$
Here $\gamma_x=0$ when there is no relationship between x and y; I set $\gamma_x=0.5$ when there is such a relationship. In this ARFIMA process, the short- and long-run impacts of x are identical; the mean of y shifts immediately to reflect a change in x, and fluctuations around this mean (which are subject to long memory) are unrelated to the value of x. I vary $d\in\{0, 0.1, 0.2, 0.3, 0.4, 0.45\}$, similar to Keele, Linn, and Webb (2016). The noise terms $\varepsilon$ and $\psi$ are $\sim N(0,1)$. Series for y and x of length T = 100 out of this process are generated using the `arfima.sim` function in the `arfima` package for R. I draw 1000 data sets for each Monte Carlo study.
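As a rough illustration of the data-generating process in equations (2)–(4), the following Python sketch builds each series by passing Gaussian noise through a truncated expansion of $(1-L)^{-d}$ and then through the AR(1) filter $(1-0.5L)^{-1}$. This is an assumption-laden approximation and not the `arfima.sim` routine used in the paper: the function names, the burn-in length, and the truncation of the infinite moving-average expansion are all choices made for this example.

```python
import random


def frac_integrate(noise, d):
    """Apply (1 - L)^(-d) to a noise series via its truncated MA expansion."""
    psi = [1.0]
    for k in range(1, len(noise)):
        # psi_k = psi_{k-1} * (k - 1 + d) / k
        psi.append(psi[-1] * (k - 1 + d) / k)
    return [sum(psi[k] * noise[t - k] for k in range(t + 1))
            for t in range(len(noise))]


def ar1_filter(u, phi):
    """Solve (1 - phi L) z_t = u_t recursively, starting from z_{-1} = 0."""
    z, prev = [], 0.0
    for u_t in u:
        prev = phi * prev + u_t
        z.append(prev)
    return z


def simulate_pair(d, gamma_x, T=100, burn=200, seed=None):
    """Draw (x, y) of length T from equations (2)-(4), discarding a burn-in."""
    rng = random.Random(seed)
    n = T + burn
    eps = [rng.gauss(0.0, 1.0) for _ in range(n)]
    psi = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x = ar1_filter(frac_integrate(psi, 0.3), 0.5)   # equation (4)
    dev = ar1_filter(frac_integrate(eps, d), 0.5)   # equation (2): y_t - mu_t
    y = [gamma_x * x_t + dev_t for x_t, dev_t in zip(x, dev)]  # equation (3)
    return x[burn:], y[burn:]
```

Because x enters only through the mean $\mu_t$, two draws with the same seed but different values of `gamma_x` differ by exactly `gamma_x` times x, which mirrors the statement that the short- and long-run impacts coincide in this process.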

For each data set, I estimate an ADL model 6 of the form

(5)
$y_t=\beta_0+\beta_1 y_{t-1}+\beta_2\Delta x_t+\beta_3 x_{t-1}+\zeta_t,$
where $\Delta x_t=x_t-x_{t-1}$ and its coefficient $\beta_2$ shows the immediate impact of a change in x on y at the time of the change, $t^*$. The long-run impact of a permanent change in x on y is given by $LRM=\beta_3/(1-\beta_1)$; I estimate the Bewley transformation of the model (described in De Boef and Keele 2008) in order to measure this impact and its variance. 7
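The ADL in equation (5) can be estimated by ordinary least squares, and the long-run multiplier follows directly from the coefficients. The Python sketch below is a minimal illustration under stated assumptions: it solves the normal equations with a small hand-written solver, the function names are hypothetical, and it returns point estimates only (the Bewley transformation used in the paper to obtain the variance of the LRM is not implemented here). The demonstration data at the end use made-up coefficients.

```python
import random


def solve_linear(a, b):
    """Solve a @ beta = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_adl(y, x):
    """OLS estimates (b0, b1, b2, b3) of equation (5)."""
    # Regressors: constant, y_{t-1}, delta x_t, x_{t-1}
    rows = [[1.0, y[t - 1], x[t] - x[t - 1], x[t - 1]] for t in range(1, len(y))]
    target = y[1:]
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, target)) for i in range(k)]
    return solve_linear(xtx, xty)


def long_run_multiplier(beta):
    """LRM = b3 / (1 - b1)."""
    return beta[3] / (1.0 - beta[1])


# Noise-free check with hypothetical coefficients: data built exactly from
# equation (5) should return those coefficients up to rounding error.
rng = random.Random(0)
x_demo = [rng.gauss(0.0, 1.0) for _ in range(80)]
y_demo = [0.0]
for t in range(1, 80):
    y_demo.append(0.2 + 0.5 * y_demo[-1]
                  + 0.3 * (x_demo[t] - x_demo[t - 1]) + 0.4 * x_demo[t - 1])
beta_hat = fit_adl(y_demo, x_demo)
```

In the demonstration, the implied long-run multiplier is 0.4 / (1 − 0.5) = 0.8, larger than the immediate impact of 0.3.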

#### 3.1.1 False positive rates for immediate and long-run impacts

When $\gamma_x=0$ in equation (3), there is no immediate or long-term impact of x on y. In this environment, I designate a false positive immediate impact as a statistically significant value of $\hat{\beta}_2$ using a two-tailed t-test, $\alpha=0.05$. I similarly designate a false positive long-run impact as a statistically significant $\widehat{LRM}$ using the same test.

Consider Fig. 1, which shows the estimated values of $\hat{\beta}_2$ and $\widehat{LRM}$ for each of the 1000 simulated data sets and the percentage of estimates that are statistically significant. As the figures make clear, both the immediate impacts (estimates of $\hat{\beta}_2$ from equation 5) and long-run impacts (estimates of $\widehat{LRM}$ using the Bewley method) are consistently centered on zero with false positive rates near the nominal $\alpha=0.05$ value of the statistical significance test. In other words, in fractionally integrated data, the ADL is resistant to finding immediate and long-run relationships between y and x where they do not exist.

#### 3.1.2 True positive rates and accuracy estimates

When $\gamma_x=0.5$ in equation (3), the immediate impact of x on y is the same as the long-run impact: 0.5. If the ADL model can accurately approximate the relationships in this data set, it should show an accurate and identical immediate and long-term impact ($\hat{\beta}_2=\widehat{LRM}=0.5$). I designate a true positive immediate impact as a statistically significant value of $\hat{\beta}_2$ using a two-tailed t-test, $\alpha=0.05$; I use the same test for detecting true positive $\widehat{LRM}$ values.

Figure 2 shows the estimated values of $\hat{\beta}_2$ and $\widehat{LRM}$ for 1000 data sets simulated under these conditions. Both immediate and long-run impact estimates are properly centered on the correct value of 0.5. However, the degree of noise in the estimate of long-run impacts increases as d gets closer to 0.5 and y gets closer to being non-stationary. This additional noise hurts the ADL model’s ability to distinguish the estimated LRM from zero; when d = 0.45, only about a quarter of true positive long-run relationships are detected.

Fig. 2

ADL approximation of ARFIMA models with identical immediate and long-run impacts.

These figures depict the result of analyzing 1000 simulated data sets from the ARFIMA process in equations (2)–(4) with $\gamma_x=0.5$ using the ADL model in equation (5). The left figure depicts the estimates of $\hat{\beta}_2$ from the model; these should be centered on 0.5 if the estimates are accurate. The line of numbers at the bottom of the left figure depicts the proportion of $\hat{\beta}_2$ estimates that are statistically significant ($\alpha=0.05$, two-tailed) for each value of d. The right figure depicts the estimates of $\widehat{LRM}$ from the Bewley transformation of the model; these should also be centered on 0.5 if the estimates are accurate. The top line of numbers at the bottom of the right figure depicts the proportion of $\widehat{LRM}$ estimates that are statistically significant ($\alpha=0.05$, two-tailed) for each value of d. The bottom line of numbers at the bottom of the right figure depicts the proportion of $\widehat{LRM}$ estimates that are statistically distinguishable from $\hat{\beta}_2$ using the same test criterion.

However, I believe that in the presence of an immediate impact of x on y, it may sometimes be reasonable to assert that the null expectation for the long-run relationship should be equal to the immediate impact. If changes in x are not theoretically expected to “wear off” over time, then the effect on y caused by a change in x at time $t^*$ should persist by inertia beyond that time point. Consequently, I also test the hypothesis that $\beta_2\neq LRM$ against the null that $\beta_2=LRM$, showing the results in a second line of numbers in Fig. 2b. 8 For this test, the ADL (falsely) rejects this null only slightly more often than the expected $\alpha=0.05$ rate.
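The test of $\beta_2=LRM$ is described in footnote 8 as a parametric bootstrap over draws from the asymptotic distribution of the ADL coefficients. A hedged Python sketch of that idea follows. It assumes the draws are multivariate normal around the point estimates with the estimated covariance matrix, uses simple empirical quantiles, and relies on hypothetical function names and inputs; it is not the author's replication code.

```python
import math
import random


def cholesky(cov):
    """Lower-triangular factor of a symmetric positive-definite matrix."""
    n = len(cov)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = cov[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def lrm_minus_immediate_interval(beta, cov, n_draws=1000, seed=None):
    """2.5th and 97.5th percentiles of LRM - b2 over draws from N(beta, cov).

    beta = (b1, b2, b3) from equation (5); cov is their 3x3 covariance matrix.
    """
    rng = random.Random(seed)
    low = cholesky(cov)
    diffs = []
    for _ in range(n_draws):
        z = [rng.gauss(0.0, 1.0) for _ in beta]
        # Correlated draw: beta + low @ z
        b1, b2, b3 = (beta[i] + sum(low[i][k] * z[k] for k in range(i + 1))
                      for i in range(len(beta)))
        diffs.append(b3 / (1.0 - b1) - b2)
    diffs.sort()
    return diffs[int(0.025 * n_draws)], diffs[int(0.975 * n_draws) - 1]
```

If the returned interval excludes zero, the long-run multiplier would be judged statistically distinguishable from the immediate impact at the 0.05 level under these assumptions.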

### 3.2 Fractionally Integrated Data Generating Processes That Include Long-Term Adjustment

To simulate fractionally integrated data for y and x that includes a long-term adjustment, I slightly modify the ARFIMA process in equations (2)–(4):

(6)
$(1-0.5L)(1-L)^{d}\,y_t=\kappa_t$

(7)
$\kappa_t=\gamma_x x_t+\varepsilon_t$

(8)
$(1-0.5L)(1-L)^{0.3}x_t=\psi_t.$
This allows changes in x to propagate into the ARFIMA process by directly entering the noise term; a change in x at time $t^*$ will have continuing impacts on y after that time as the initial impacts echo through the lag structure of y.

I assess the accuracy of the immediate impact of x on y as before: $\hat{\beta}_2$ should be statistically significant and $=0.5$ if the ADL yields accurate results. Given the existence of a gradual, long-term adjustment in y initiated by a change in x, I also assess how well the ADL can match the trajectory of change in y following a single permanent change in x. Specifically, once an ADL model is fitted, I set x = 0, set the lagged value of $y_{t-1}$ in order to put the system into an equilibrium $y^*$, simulate a change in x of 1 at time $t^*$, then calculate $\hat{y}$ from this model for $t^*+c$ from $c=1 \ldots 15$. I then compare the difference between $y^*$ and $\hat{y}_{t^*+c}$ calculated from the ADL to the true difference as simulated using the true parameters and `arfima.sim`. This process gives a sense of how well the ADL is able to approximate the unfolding of the data-generating process over time after a change in x.
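The ADL side of this comparison can be sketched in a few lines. Starting from the equilibrium implied by x = 0, equation (5) gives a deviation from equilibrium of $\beta_2$ at $t^*$ (when $\Delta x_t=1$ and $x_{t-1}=0$) and of $\beta_1$ times the previous deviation plus $\beta_3$ afterwards. The Python function below, with a hypothetical name and arguments, traces that path; it does not reproduce the "true" trajectory, which the paper simulates from the ARFIMA process itself.

```python
def adl_response_path(b1, b2, b3, horizon=15, shock=1.0):
    """Deviation of y from its pre-shock equilibrium at t*, t*+1, ..., t*+horizon."""
    # At t*: delta x_t = shock while x_{t-1} = 0, so only b2 acts.
    path = [b2 * shock]
    for _ in range(horizon):
        # Afterwards delta x_t = 0 and x_{t-1} = shock.
        path.append(b1 * path[-1] + b3 * shock)
    return path
```

For a stationary fit (absolute value of `b1` below 1) the path converges to `b3 / (1 - b1)`, the long-run multiplier; when that value equals `b2`, the path is flat, which corresponds to the case of identical immediate and long-run impacts.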

The results are shown in Fig. 3. As in the case with identical immediate and long-run impacts, the ADL does an excellent job of accurately measuring the immediate $dy/dx$ and rejecting the null hypothesis. The estimated $\widehat{LRM}$ is statistically significant over 93% of the time (and over 98% when d < 0.45); additionally, it is statistically distinguishable from the immediate impact $\hat{\beta}_2$ over 92% of the time. 9

Fig. 3

ADL approximation of ARFIMA models with long-run adjustment.

These figures depict the result of analyzing 1000 simulated data sets from the ARFIMA process in equations (6)–(8) with $\gamma_x=0.5$ using the ADL model in equation (5). The left figure depicts the estimates of $\hat{\beta}_2$ from the model; these should be centered on 0.5 if the estimates are accurate. The line of numbers at the bottom of the left figure depicts the proportion of $\hat{\beta}_2$ estimates that are statistically significant ($\alpha=0.05$, two-tailed) for each value of d. The right figure depicts the accuracy of ADL estimates of the change in y at time $t^*+c$ (compared to its initial equilibrium) caused by a one-time permanent change in x at time $t^*$; these should be centered on 0 if the ADL estimates are accurate (as the figure shows the difference between the true trajectory and the ADL estimate).

As shown in Fig. 3b, the trajectory of long-run changes in y over time is estimated with progressively greater noise as the temporal distance from the intervention gets larger; this noise also gets larger as d grows. Additionally, there is a tendency of the ADL model to underestimate the magnitude of long-run changes, especially when d is close to 0.5. The underestimation of long-run impacts makes sense: fractionally integrated time series are equivalent to very long autodistributed lag models (Shumway and Stoffer 2010, 268–69) while the ADL includes just one lag. Consequently, a change in x has effects that propagate cumulatively over a long period of time; the ADL must approximate this process with a much smaller number of lags of y.

## 4 Conclusion

My simulation study examines the use of ADL/ECM models to study the immediate and long-run effects of a fractionally integrated (but stationary) and weakly exogenous variable x on a fractionally integrated y. Under the conditions of the simulation:

1. ADL/ECM models did not find immediate or instantaneous effects of x on y in fractionally integrated data where they do not exist; standard t-tests for the coefficient on $\Delta x_t$ in the ADL produced false positives at close to the expected rate.

2. ADL/ECM models did not find long-run effects of x on y in fractionally integrated data where they do not exist. Again, standard t-tests for the long-run multiplier (LRM) estimated by the Bewley transformation of the ADL produced false positives at close to the expected rate.

3. ADL/ECM models accurately identified and recovered the magnitude and direction of immediate impacts of x on y in fractionally integrated data where they existed.

4. ADL/ECM models detected the presence of long-run impacts greater than the immediate effect of x on y in fractionally integrated data, but the magnitudes were underestimated and tests for distinguishability from the immediate impact were more useful than tests against a long-run impact of zero.

Based on these conclusions, it appears that ADL/ECM models are very useful for recovering the immediate impact of x on y, despite fractional integration. The results for long-run impacts are not quite as robust: these impacts are likely to be incorrectly estimated by an ADL/ECM run on fractionally integrated data, possibly because the very long memory of such series allows for an especially extended impact on y of a one-time permanent change in x. Nevertheless, hypothesis tests on the LRM do allow it to be distinguished from immediate impacts where appropriate, and do not allow it to be distinguished when there is no long-term cumulative effect.

My overall recommendation is to slightly refine the advice of Grant and Lebo (2016) and Keele, Linn, and Webb (2016) . In non-fractionally cointegrated data sets with many temporal observations T , it seems appropriate to estimate d with an ARFIMA model and fractionally difference a variable prior to estimation as indicated in Keele, Linn, and Webb’s Table 5. But an ADL/ECM provides a serviceable approximation in a short T data set, where d is inaccurately estimated and overfitting is a concern. This recommendation does not absolve a researcher of the responsibility to establish that the studied data are stationary (and that the independent variables are weakly exogenous) before applying the ADL/ECM; based on prior research, I would still expect the ADL/ECM to be a very problematic choice for non-stationary (and non-cointegrated) or endogenously related variables.

1 The lag operator is: $L(x_t)=x_{t-1}$, with higher multiples of L leading to deeper lags (e.g., $L^2(x_t)=x_{t-2}$).
2 Specifically, $\phi(L)=1-\phi_1 L-\phi_2 L^2-\ldots-\phi_p L^p$ with p the order of the AR process.
3 Specifically, $\theta(L)=1+\theta_1 L+\theta_2 L^2+\ldots+\theta_q L^q$ with q the order of the MA process. For more details on ARFIMA modeling, see Shumway and Stoffer (2010, 12, 85, and 90).
4 Weak exogeneity implies that y does not have a direct or indirect (error-mediated) causal impact on x ; see Enders (2015 , 394–95) for details.
5 See Esarey (2016) for the R code necessary to replicate these simulations.
6 There is a mathematically equivalent ECM for this model, as laid out in De Boef and Keele (2008) and Keele, Linn, and Webb (2016) ; I focus on the ADL formulation for ease of interpretation.
7 The Bewley model estimates $y_t=\alpha_0+\alpha_1\Delta y_t+\alpha_2 x_t+\alpha_3\Delta x_t+\eta_t$, using $y_{t-1}$ as an instrument for $\Delta y_t$. The coefficient $\alpha_2$ is the estimate of LRM, with its estimated variance as the variance of this impact.
8 To test the hypothesis that $\beta_2\neq LRM$, I draw 1000 simulated values from the asymptotic distribution of the ADL model, compute $LRM=\hat{\beta}_3/(1-\hat{\beta}_1)$ for each draw, subtract the draws for $\hat{\beta}_2$ from those for the LRM, then use the 1000 differences as a parametric bootstrap approximation of the difference between the immediate and long-run impact. I then examine the 2.5th and 97.5th quantiles of these differences to test whether any difference is statistically significant.
9 I use the same parametric bootstrapping technique for this test as laid out in footnote 8.

*Replication files for this study are available on the Political Analysis Dataverse at http://dx.doi.org/10.7910/DVN/DH1IUI .

## References

Beck, Nathaniel, and Jonathan N. Katz. 1996. Nuisance vs. substance: Specifying and estimating time-series-cross-section models. Political Analysis 6(1):1–36.

De Boef, Suzanna, and Luke Keele. 2008. Taking time seriously. American Journal of Political Science 52:184–200.

Enders, Walter. 2015. Applied Econometric Time Series, 4th edition. Hoboken, NJ: Wiley.

Esarey, Justin. 2016. Replication data for: Fractionally integrated data and the autodistributed lag model: Results from a simulation study. Harvard Dataverse, V1.

Grant, Taylor, and Matthew Lebo. 2016. Error correction methods with political time series. Political Analysis 24:3–30.

Keele, Luke, Suzanna Linn, and Clayton McLaughlin Webb. 2016. Treating time with all due seriousness. Political Analysis 24:31–41.

Shumway, Robert H., and David S. Stoffer. 2010. Time Series Analysis and Its Applications, 3rd edition. New York, NY: Springer.

## Author notes

Edited by Janet Box-Steffensmeier