Recovery after stroke: not so proportional after all?

The proportional recovery rule asserts that most stroke survivors recover a fixed proportion of lost function. Reports that the rule accurately predicts empirical recovery are rapidly accumulating. However, Hope et al. show that there is a fallacy at the heart of the rule that confounds many of these results.


Correlations
There are two basic correlations we are interested in: (1) the correlation between initial performance and performance at second test, i.e. ρ(X, Y), and (2) the correlation between initial performance and recovery, i.e. ρ(X, Y − X) = ρ(X, Δ). The latter of these is the key relationship, and we would expect it to be a negative correlation; that is, the smaller the initial performance (i.e. the further it is from the maximum score), the larger the recovery. (One could also formulate the correlation in terms of initial impairment, i.e. ρ((max − X), Y − X), which would flip the correlation to positive, but the two approaches are equivalent.)
Our main correlations are defined as follows:

ρ(X, Y) = cov(X, Y) / (σ_X · σ_Y)

ρ(X, Y − X) = cov(X, Y − X) / (σ_X · σ_{Y−X})
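As a quick numerical illustration of these two correlations, the sketch below uses invented data (not from the paper) and a hypothetical maximum score of 100, and confirms that the impairment formulation simply flips the sign of ρ(X, Δ).

```python
# Illustrative numerical check of the two correlations of interest.
# The data are invented for demonstration; they are not from the paper.
from math import sqrt

def cov(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)

def corr(a, b):
    return cov(a, b) / sqrt(cov(a, a) * cov(b, b))

X = [10.0, 25.0, 40.0, 55.0, 70.0]      # initial scores
Y = [35.0, 48.0, 60.0, 68.0, 80.0]      # outcome scores
delta = [y - x for x, y in zip(X, Y)]   # recovery, delta = Y - X

r_xy = corr(X, Y)        # correlation (1): rho(X, Y)
r_xd = corr(X, delta)    # correlation (2): rho(X, Y - X), expected negative

# Equivalent impairment formulation (hypothetical maximum score of 100):
r_imp = corr([100.0 - x for x in X], delta)

print(r_xy, r_xd, r_imp)
```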

Standard Deviation of a Difference
We need a straightforward result on the standard deviation of a difference.

Proposition 2: σ_{Y−X} = √(σ_X² + σ_Y² − 2 · cov(X, Y))

Proof
The result is a direct consequence of the following standard result from probability theory (see, e.g., Ross, S. M. (2014). Introduction to Probability and Statistics for Engineers and Scientists. Academic Press):

var(Y − X) = var(X) + var(Y) − 2 · cov(X, Y)

Taking the square root of both sides gives the proposition.
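This identity is easy to check numerically; the sketch below uses invented data and population (divide-by-n) variances and covariances throughout.

```python
# Check that sd(Y - X) = sqrt(var(X) + var(Y) - 2*cov(X, Y)).
# Invented data; population (divide-by-n) statistics throughout.
from math import sqrt

def cov(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)

X = [10.0, 25.0, 40.0, 55.0, 70.0]
Y = [35.0, 48.0, 60.0, 68.0, 80.0]
D = [y - x for x, y in zip(X, Y)]       # the difference Y - X

sd_diff_direct = sqrt(cov(D, D))                          # sd of Y - X directly
sd_diff_formula = sqrt(cov(X, X) + cov(Y, Y) - 2 * cov(X, Y))

print(sd_diff_direct, sd_diff_formula)
```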

Key Results
The following proposition enables us to express the key correlation, ρ(X, Y − X), in terms of the covariances of its constituent variables.

Proposition 3: ρ(X, Y − X) = (cov(X, Y) − σ_X²) / (σ_X · √(σ_X² + σ_Y² − 2 · cov(X, Y)))

Proof
Using distributivity of multiplication through addition, associativity of addition, the definition of covariance and proposition 2, we can reason as follows:

ρ(X, Y − X) = cov(X, Y − X) / (σ_X · σ_{Y−X})
            = (cov(X, Y) − cov(X, X)) / (σ_X · σ_{Y−X})
            = (cov(X, Y) − σ_X²) / (σ_X · √(σ_X² + σ_Y² − 2 · cov(X, Y)))

QED
It is straightforward to adapt proposition 3 so that it is fully in terms of correlations.

Proposition 4: ρ(X, Y − X) = (ρ(X, Y) · σ_Y − σ_X) / √(σ_X² + σ_Y² − 2 · ρ(X, Y) · σ_X · σ_Y)
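Both forms of the key correlation can be verified numerically against a direct computation of ρ(X, Y − X); the sketch below uses invented data and population statistics.

```python
# Check the covariance form and the correlation form of the key
# correlation rho(X, Y - X) against a direct computation. Invented data.
from math import sqrt

def cov(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)

def corr(a, b):
    return cov(a, b) / sqrt(cov(a, a) * cov(b, b))

X = [10.0, 25.0, 40.0, 55.0, 70.0]
Y = [35.0, 48.0, 60.0, 68.0, 80.0]
D = [y - x for x, y in zip(X, Y)]

sx, sy = sqrt(cov(X, X)), sqrt(cov(Y, Y))   # standard deviations
c = cov(X, Y)
r = corr(X, Y)

direct = corr(X, D)                                            # rho(X, Y - X)
via_cov = (c - sx**2) / (sx * sqrt(sx**2 + sy**2 - 2 * c))     # covariance form
via_corr = (r * sy - sx) / sqrt(sx**2 + sy**2 - 2 * r * sx * sy)  # correlation form

print(direct, via_cov, via_corr)
```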

Scale Invariance
The next set of propositions justifies working with a standardised variable.

Lemma 1: for any c ∈ ℝ, σ_{c·X} = |c| · σ_X.

Proof
Using distributivity of a multiplicative constant through averaging, the identity √(c²) = |c|, and distributivity of the square root through multiplication, we can reason as follows:

σ_{c·X} = √(mean((c·X − mean(c·X))²)) = √(c² · mean((X − mean(X))²)) = |c| · σ_X

Proof
For any c ∈ ℝ, using distributivity of multiplication through the mean and through addition, and lemma 1, the following holds: cov(c·X, Y) = c · cov(X, Y). Then, one can multiply both sides by 1/(σ_{c·X} · σ_Y) = 1/(|c| · σ_X · σ_Y), which, for c > 0, yields ρ(c·X, Y) = ρ(X, Y).

Proof
We can use distributivity of multiplication through subtraction (i.e. c·Y − c·X = c·(Y − X)) and corollary 1 to give us the following: for any c > 0, ρ(c·X, c·Y − c·X) = ρ(c·X, c·(Y − X)) = ρ(X, Y − X).
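The scale invariance can be checked numerically. The sketch below (invented data) rescales both variables by a positive constant and confirms that the key correlation is unchanged, and also that sd(c·X) = |c|·sd(X) as in lemma 1.

```python
# Numerical check of scale invariance of rho(X, Y - X), and of
# sd(c*X) = |c| * sd(X). Invented data; population statistics.
from math import sqrt

def cov(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)

def sd(a):
    return sqrt(cov(a, a))

def corr(a, b):
    return cov(a, b) / (sd(a) * sd(b))

X = [10.0, 25.0, 40.0, 55.0, 70.0]
Y = [35.0, 48.0, 60.0, 68.0, 80.0]

c = 3.5                       # any positive rescaling constant
cX = [c * x for x in X]
cY = [c * y for y in Y]

base = corr(X, [y - x for x, y in zip(X, Y)])      # rho(X, Y - X)
scaled = corr(cX, [y - x for x, y in zip(cX, cY)]) # rho(cX, cY - cX)

# Lemma 1 with a negative constant: sd(-2*X) = 2 * sd(X).
sd_neg = sd([-2.0 * x for x in X])

print(base, scaled, sd_neg, 2.0 * sd(X))
```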

Proof
The proof has two parts.

Proof
Both results are easy consequences of proposition 5.

Main Findings
Theorem 1: Since X will be standardised (so σ_X = 1), we can adapt the finding in proposition 4 to give us the key relationship we need:

ρ(X, Y − X) = (ρ(X, Y) · σ_Y − 1) / √(1 + σ_Y² − 2 · ρ(X, Y) · σ_Y)    (Imprint)

This leads to the key observation that, as σ_Y gets smaller, ρ(X, Y − X) tends towards −ρ(X, X), which equals −1. In other words, as the variability of Y decreases, the imprint of X becomes increasingly prominent. This is shown in the next theorem.

Theorem 2: Holding all else constant, ρ(X, Y − X) tends to −1 as σ_Y tends to 0.

Proof
The right-hand side of equation Imprint has five constituent terms: two in the numerator and three in the denominator. Of these five, three are products with the standard deviation of Y, i.e. σ_Y: the term ρ(X, Y) · σ_Y in the numerator, and the terms σ_Y² and 2 · ρ(X, Y) · σ_Y in the denominator. Assuming all else is constant, as σ_Y reduces, the absolute value of each of these three terms reduces towards zero. The rate of reduction differs amongst the three, but they all decrease. Accordingly, as σ_Y decreases, ρ(X, Y − X) becomes increasingly determined by the two terms not involving σ_Y, and thus it tends towards −1 / √(0 + 1 − 0) = −ρ(X, X) = −1.
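The limiting behaviour can be seen numerically. In the sketch below (invented, demeaned data), X is fixed while the outcome variable is shrunk towards zero variance by a scale factor s; since rescaling leaves ρ(X, Y) unchanged, only σ_Y shrinks, and ρ(X, Y − X) moves towards −1.

```python
# Demonstration that rho(X, Y - X) tends to -1 as sd(Y) shrinks,
# with rho(X, Y) held fixed. Invented data; population statistics.
from math import sqrt

def cov(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)

def corr(a, b):
    return cov(a, b) / sqrt(cov(a, a) * cov(b, b))

X = [-2.0, -1.0, 0.0, 1.0, 2.0]      # demeaned initial scores
Y0 = [-1.5, -1.2, 0.1, 0.9, 1.7]     # demeaned outcome template

rs = []
for s in [1.0, 0.5, 0.1, 0.01]:
    Y = [s * y for y in Y0]          # shrinks sd(Y); rho(X, Y) is unchanged
    rs.append(corr(X, [y - x for x, y in zip(X, Y)]))

print(rs)   # moves towards -1 as s (and hence sd(Y)) shrinks
```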

Equality of Residuals
An important finding of section 5 of the main text is that the residuals from regressing Y onto X are the same as those from regressing Y − X onto X. We show in this section that this equality of residuals necessarily holds.
We focus on the following two equations:

Y = X̃ · β₁ + ε₁          (Eqn 1)
Y − X = X̃ · β₂ + ε₂      (Eqn 2)

where X̃ is the n × 2 matrix whose first column is X and whose second is the n × 1 vector of ones (which provides the intercept term); β₁ and β₂ are 2 × 1 vectors of parameters and X, Y, ε₁ and ε₂ are n × 1 vectors. As in the rest of this document, X and Y are our (demeaned) initial and outcome variables, while ε₁ and ε₂ are our residual error terms.

Proposition 9
If we assume that β₁ and β₂ are fit with ordinary least squares, with ε₁ and ε₂ the associated residuals, then ε₁ = ε₂.

Proof
Under ordinary least squares, the parameters are set as follows:

β̂₁ = (X̃ᵀ X̃)⁻¹ X̃ᵀ Y          (Eqn 3)
β̂₂ = (X̃ᵀ X̃)⁻¹ X̃ᵀ (Y − X)    (Eqn 4)

We start with the second of these, and using left distributivity of matrix multiplication, and then substituting Eqn 3, we obtain the following:

β̂₂ = (X̃ᵀ X̃)⁻¹ X̃ᵀ Y − (X̃ᵀ X̃)⁻¹ X̃ᵀ X = β̂₁ − (X̃ᵀ X̃)⁻¹ X̃ᵀ X

Using the fact that the variable X is demeaned (so that ΣX = 0), we can now evaluate the main term here as follows, where ‖X‖² is the dot product of X with itself, ΣX is the sum of the vector X, and d = n · ‖X‖² − ΣX · ΣX = n · ‖X‖² is the determinant of the matrix being inverted:

(X̃ᵀ X̃)⁻¹ X̃ᵀ X = (1/d) · [n, 0; 0, ‖X‖²] · (‖X‖², 0)ᵀ = (1/d) · (n · ‖X‖², 0)ᵀ = (1, 0)ᵀ

From here we can derive the following:

β̂₂ = β̂₁ − (1, 0)ᵀ

We can then substitute this equality for β̂₂ in Eqn 2 and re-arrange to obtain:

Y − X = X̃ · β̂₂ + ε₂ = X̃ · (β̂₁ − (1, 0)ᵀ) + ε₂ = X̃ · β̂₁ − X + ε₂

It follows straightforwardly from here that Y − X̃ · β̂₁ = ε₂, i.e. ε₁ = ε₂, as required. QED

Proposition 9 shows that the residuals resulting from fitting equations 1 and 2 will be the same. A consequence of this is that the error variability will be the same. As a result, the factor that determines whether more variance is explained when regressing Y onto X, or when regressing Y − X onto X, is the variance available to explain. That is, the relative variances of Y and Y − X drive the R² values of these two regressions. This in turn implicates the variances of X and Y and, in fact, their covariance (which impacts the variance of Y − X).
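Proposition 9 is easy to confirm numerically. The sketch below (invented, demeaned data) fits both regressions using the closed-form OLS solution for a single predictor plus intercept, and checks that the two residual vectors coincide.

```python
# Numerical check that regressing Y onto X and regressing Y - X onto X
# (both with an intercept) yield identical residuals. Invented data.
def mean(v):
    return sum(v) / len(v)

def ols_residuals(x, t):
    """Residuals from regressing target t onto x with an intercept."""
    mx, mt = mean(x), mean(t)
    slope = sum((xi - mx) * (ti - mt) for xi, ti in zip(x, t)) \
            / sum((xi - mx) ** 2 for xi in x)
    intercept = mt - slope * mx
    return [ti - (slope * xi + intercept) for xi, ti in zip(x, t)]

X = [-2.0, -1.0, 0.0, 1.0, 2.0]     # demeaned initial scores
Y = [-1.8, -1.1, 0.2, 0.8, 1.9]     # demeaned outcome scores
D = [y - x for x, y in zip(X, Y)]   # recovery

e1 = ols_residuals(X, Y)    # residuals from regressing Y onto X
e2 = ols_residuals(X, D)    # residuals from regressing Y - X onto X

print(e1)
print(e2)
```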
More precisely, we can state the following.
1) If σ_{Y−X}² is big relative to σ_Y², then regressing Y − X onto X will explain more variability than regressing Y onto X.
2) If σ_{Y−X}² is small relative to σ_Y², then regressing Y − X onto X will explain less variability than regressing Y onto X.
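The mechanics behind these two statements can be illustrated numerically. In the invented data below, Y tracks X closely, so the variance of Y − X is much smaller than that of Y (case 2): the residual sum of squares is identical for the two regressions, so the R² values are driven entirely by the target variances.

```python
# Same residual sum of squares, different R^2: the target variance decides.
# Invented data in which var(Y - X) << var(Y), i.e. case 2 above.
def mean(v):
    return sum(v) / len(v)

def ols_sse_r2(x, t):
    """SSE and R^2 from regressing target t onto x with an intercept."""
    mx, mt = mean(x), mean(t)
    slope = sum((xi - mx) * (ti - mt) for xi, ti in zip(x, t)) \
            / sum((xi - mx) ** 2 for xi in x)
    intercept = mt - slope * mx
    sse = sum((ti - (slope * xi + intercept)) ** 2 for xi, ti in zip(x, t))
    tss = sum((ti - mt) ** 2 for ti in t)
    return sse, 1.0 - sse / tss

X = [-2.0, -1.0, 0.0, 1.0, 2.0]
Y = [-1.8, -1.1, 0.2, 0.8, 1.9]   # Y tracks X closely, so var(Y - X) is small
D = [y - x for x, y in zip(X, Y)]

sse_y, r2_y = ols_sse_r2(X, Y)    # regressing Y onto X
sse_d, r2_d = ols_sse_r2(X, D)    # regressing Y - X onto X

print(sse_y, sse_d)   # identical error variability (proposition 9)
print(r2_y, r2_d)     # R^2 differs, driven only by the target variance
```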