Invited Commentary

The importance of this study is described in the introduction of the article: to perform comparative effectiveness research (CER), valid outcome measures are needed that discriminate patients by risk factors in a similar way across settings. Because various countries can have very different health care structures and payment systems, it can be expected that different treatments or interventions will be used and at different time points during an episode of care. These treatment differences provide excellent information to determine better treatments for specific types of patients as long as the outcome measures used have known-groups construct validity.
The article reports findings for patients undergoing knee-impairment rehabilitation who were selected on the basis of therapist volume and completion rates. To be sure that these patient selection criteria did not bias the study findings, it would be important to examine the same question about known-groups construct validity using all patients undergoing knee rehabilitation with admission and discharge FS scores in each country. Whether the findings related to known-groups construct validity continue to hold for this larger set of patients then could be determined. Many people raise concerns about the "huge" potential for patient selection bias in observational studies . . . as if there were none in randomized controlled trials (RCTs) or other experimental designs. My experience with practice-based evidence (PBE) prospective observational studies is that they exhibit less patient selection bias than RCTs.
The most interesting finding in the present study is the difference in change in FS from admission to discharge between the United States and Israel. Why does the United States have a significantly larger gain in FS from admission to discharge? As stated in the "Discussion" section, ". . . after controlling for all variables available . . . (intake FS, age, sex, symptom acuity, surgical and exercise history, and related medication use), which were all found to predict discharge FS (P<.001, R²=.3), a difference of 5.9 discharge FS measure points between countries was still present. This difference remained practically the same . . . after controlling for number of visits, which was 3 visits higher in the United States compared with Israel. . . ."
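The risk-adjustment logic in the quoted passage can be sketched in a few lines of Python. This is a minimal illustration on synthetic data, not the authors' analysis: the patient values, the variable names (`intake`, `age`, `country`), and the helper functions are all hypothetical, and the 5.9-point country effect is built into the synthetic data as an assumption so that the adjustment can be checked against a known answer. The sketch shows how a regression that controls for intake FS and age can yield a between-country difference that differs from the raw difference in mean discharge FS when the two samples differ in case mix.

```python
# Illustrative sketch only: synthetic patients, hypothetical variable names.
TRUE_GAP = 5.9  # country effect built into the synthetic data (an assumption)


def make_patients(n=40):
    """Create deterministic synthetic rows [1, intake FS, age, country]."""
    rows, y = [], []
    for i in range(n):
        country = i % 2  # 0 and 1 stand for two hypothetical countries
        # Country 0 is given a higher intake FS, so the case mix differs.
        intake = 35 + (i * 7) % 20 + (0 if country else 4)
        age = 25 + (i * 11) % 50
        rows.append([1.0, float(intake), float(age), float(country)])
        # Noise-free discharge FS, so the fit can recover TRUE_GAP exactly.
        y.append(10 + 0.8 * intake - 0.1 * age + TRUE_GAP * country)
    return rows, y


def fit_ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [x - f * p for x, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


def mean(values):
    return sum(values) / len(values)


rows, y = make_patients()

# Unadjusted comparison: difference in mean discharge FS between countries.
raw_gap = (mean([yi for r, yi in zip(rows, y) if r[3] == 1.0])
           - mean([yi for r, yi in zip(rows, y) if r[3] == 0.0]))

# Adjusted comparison: coefficient on the country indicator after
# controlling for intake FS and age.
adjusted_gap = fit_ols(rows, y)[3]

print(f"unadjusted gap: {raw_gap:.2f}")
print(f"adjusted gap:   {adjusted_gap:.2f}")
```

In this synthetic example the unadjusted gap understates the built-in effect because the lower-scoring country was given higher intake FS. Real data would contain noise and more covariates, and a statistical package would normally replace the hand-written solver.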
We could learn a great deal from additional research focused on determining better practices for specific types of patients if both countries collected detailed data on the treatments used during knee rehabilitation, in addition to the FS measure at admission and discharge and the variables included in the known-groups construct validity analysis. This collection of data would provide a database for a PBE study 2,3 to identify treatments that are associated with better discharge FS, controlling for admission FS and the validated patient predictor variables from the present study.
Because the United States and Israel have such different health insurance systems and use some treatments to a different extent (in particular, surgical treatments and medications), we can assume that the 2 countries also provide different physical therapy interventions during rehabilitation. These differences provide a wealth of variation from which to determine what works best for specific types of patients. A PBE study design approach has been used successfully in more than 30 multi-site studies in various clinical settings (inpatient surgical and medical care, ambulatory care, long-term care, hospice care, and rehabilitation care), including 4 studies in rehabilitation: stroke, joint replacement, spinal cord injury, and traumatic brain injury. 4-10 The stroke and traumatic brain injury PBE studies each involved a study site from outside the United States, along with US sites. Dramatic differences were observed in patient characteristics and treatments between countries in those 2 studies, which led to very interesting findings related to better practices for these 2 conditions. I suggest that the next step is to conduct a PBE study for patients undergoing rehabilitation for knee impairments in facilities in Israel and the United States.

We would like to thank Horn for her supportive and insightful commentary 1 on our article, 2 emphasizing the need to pursue our next goal toward conducting comparative effectiveness research 3 between the United States and Israel. It is our intention to do so using a practice-based evidence 4-6 study design that will include expanded data from both countries on patient characteristics, treatment coding, and functional outcomes.
We agree that differences in health care systems and physical therapy settings between countries, and differences found in risk-adjusted functional outcomes by country, offer a unique opportunity to identify ways to achieve the best possible outcomes. We take this opportunity to call for additional health care practitioners and researchers around the world, using a variety of health settings and care, to join us in this exciting endeavor. The more variance we have in the data analyzed, the greater the potential to discover the best treatment options for the benefit of our patients. 7
Horn raised a concern that, in trying to minimize patient selection bias by applying strict patient selection criteria (a minimum number of patients and a minimum completion rate per therapist), we might have actually increased patient selection bias because of the consequent and significant reduction in the sample size analyzed. We agree with Horn that a valid way to test the generalizability of our findings in relation to patient selection criteria would be to compare our results with those that use all patients with admission and discharge data.
Therefore, we conducted an additional identical analysis of known-groups construct validity using all patients receiving knee rehabilitation who had admission and discharge functional status scores. The sample size increased to 9,584 and 10,092 patients in Israel and the United States, respectively, roughly doubling the sample size for Israel and tripling it for the United States, compared with the original analyses. Using these less strict inclusion criteria, completion rates decreased from 60% to 46% for the Israeli sample and from 63% to 33% for the US sample. Interestingly, compared with the original known-groups construct validity analyses described in our article, 2 we found exactly the