Predicting OA progression to total hip replacement: can we do better than risk factors alone using active shape modelling as an imaging biomarker?

Objective. Previously, active shape modelling (ASM) of the proximal femur was shown to identify those individuals at highest risk of developing radiographic OA. Here we determine whether ASM predicts the need for total hip replacement (THR) independent of Kellgren–Lawrence grade (KLG) and other known risk factors. Methods. A retrospective cohort study of 141 subjects consulting primary care with new hip pain was conducted. Pelvic radiographs taken on recruitment were assessed for KLG, centre-edge angle, acetabular depth and femoral head migration. Clinical factors (duration of pain, use of a stick and physical function) were collected by self-completed questionnaires. Baseline ASM mode scores were compared between individuals who underwent THR during the 5-year follow-up (n = 27) and those whose OA did not progress radiographically (n = 75). Results. A 1 S.D. reduction in baseline ASM mode 2 score was associated with an 81% reduction in odds of THR (OR = 0.19, 95% CI 0.52, 0.70) after adjustment for KLG, radiographic and clinical factors. A similar reduction in odds of THR was associated with a 1 S.D. reduction in mode 3 (OR = 0.45, 95% CI 0.28, 0.71) and a 1 S.D. increase in mode 4 score (OR = 2.8, 95% CI 1.7, 4.7), although these associations were no longer significant after adjustment for KLG and clinical factors.


Introduction
Currently there are no disease-modifying osteoarthritis drugs (DMOADs) proven to halt or reverse osteoarthritic disease progression. Thus treatment for this debilitating disease is largely symptomatic until joint destruction and pain are sufficiently severe to justify a total joint replacement. One reason for the lack of DMOADs is the often slow but variable progression of OA, making it difficult to predict which patients will remain stable and which will progress rapidly.
At present, classification of radiographic OA is made using semi-quantitative scoring systems such as Kellgren–Lawrence grade (KLG) [1,2], Croft (a modification of KLG) [3] or the ACR criteria [4]. Although these are adequate for disease classification, their value as biomarkers for identification and stratification of patients at high risk of progression in clinical trials is limited due to their poor sensitivity to change.
Bone shape plays an important role in the development and progression of OA. While widths, lengths and angles can be used [5–7], the hip joint is complex and not well described by simple geometry alone [8–11]. Active shape modelling (ASM) is a statistical method that characterizes and quantifies variation in shape. This makes ASM an ideal tool for use with anatomic structures such as brain ventricles [12], spine [2,13–15], hips [16,17], hands or heart [18,19].
We have previously demonstrated that ASM can identify, at an earlier stage of disease, those subjects at highest risk of developing radiographic OA or requiring a total hip replacement (THR) [17]. This study aimed to corroborate our initial observations in a larger population using a more sophisticated model, which unlike the model used in our previous study, explicitly measures medial and lateral osteophytes and the acetabulum. We also sought to evaluate whether ASM offers any added value over KLG and known risk factors in predicting OA progression to THR.

Patient population and study design
This study performed a retrospective analysis of radiographic images taken from the Primary Care Rheumatology (PCR) study. Patient recruitment into the PCR study has been described in detail elsewhere [20–22]. In brief, this was a 5-year longitudinal study that took place in 35 general practices across the UK. Patients presenting to their primary care physician with a new episode of hip joint pain, which the physician believed to have arisen directly from the hip rather than having been referred from elsewhere, were invited to take part in the study. Ethical approval was obtained locally for each general practice before the start of the study and the subject's informed consent was obtained according to the Declaration of Helsinki.
At baseline, patients underwent an anteroposterior pelvic radiograph, a clinical examination (hip and knee flexion and rotation and Heberden node count) and completed questionnaires assessing health and function [WOMAC, short form 36 health questionnaire (SF-36) and a pain questionnaire]. The subjects were invited to return annually to repeat the same evaluations. In Years 2 and 5 the subjects underwent repeat radiographs. Femoral head migration (none, supero-medial, supero-lateral or concentric), acetabular depth, centre-edge angle (angle between the centre of the femoral head and lateral edge of the acetabulum) and minimum joint space width (mJSW) in millimetres were assessed in each radiograph [23].
In this study, all available radiographs from the PCR study were digitized using a Howtek MultiRAD 850 scanner (Howtek, Hudson, New Hampshire, USA) at a resolution of 146 dpi and a depth of 12 bits. If all features were visible, the radiograph was converted to 8-bit depth. If areas were unclear, brightness and contrast were adjusted to show all bony outlines before converting.
KLG was used to assign a severity grade to each hip in all radiographs that were of sufficient quality. Radiographs were compared with standard images from a reprint of the original Atlas of Standard Radiographs of Arthritis from the Epidemiology of Chronic Rheumatism [24] and from the revised Atlas of Individual Radiographic Features (since KLG 0 was not included in the original atlas) [25]. A single reader blinded to clinical diagnosis scored all radiographs. All THRs on study subjects during the study were recorded. To distinguish hips that showed rapid radiological OA progression from those that did not, hips were classified as progressing (an increase of two or more KLG) or non-progressing (a change of zero or one KLG). To examine differences between hips that did not show radiographic OA progression and those that resulted in THR only, the non-progressing hips and those that progressed to a THR during the study were included in the analysis. A single hip was chosen from each subject. Where the subject progressed to THR during the study, this was the hip chosen. In subjects where neither hip progressed radiographically, one hip was chosen at random such that the distribution of left to right hips matched that of the THR group. Subjects where one hip progressed radiographically and one did not were excluded from the non-progressing group.
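The grouping rule described above can be illustrated with a minimal sketch. The function names and the "not analysed" label are hypothetical, and treating a decrease in KLG as non-progressing is an assumption, since the text only defines changes of zero, one, or two or more grades.

```python
def classify_hip(baseline_klg: int, followup_klg: int) -> str:
    """Label one hip by its change in KLG between baseline and follow-up."""
    if not (0 <= baseline_klg <= 4 and 0 <= followup_klg <= 4):
        raise ValueError("KLG must lie between 0 and 4")
    change = followup_klg - baseline_klg
    # Rule from the text: an increase of two or more grades is progressing.
    # Assumption: any smaller change (including a decrease) is non-progressing.
    return "progressing" if change >= 2 else "non-progressing"


def subject_group(left: tuple, right: tuple, had_thr: bool) -> str:
    """Assign a subject to an analysis group from (baseline, follow-up) KLG pairs."""
    if had_thr:
        return "THR"
    labels = {classify_hip(*left), classify_hip(*right)}
    # Only subjects in whom neither hip progressed enter the comparison group.
    if labels == {"non-progressing"}:
        return "non-progressing"
    return "not analysed"
```

For example, `subject_group((0, 1), (1, 1), had_thr=False)` places the subject in the non-progressing group, whereas a subject with one progressing hip and no THR is left out of the comparison.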
Model development
ASM uses landmark points to describe an object's outline. Each landmark refers to the same location in every image, allowing variation in shape to be measured across different images. Landmarks were placed on each image using the active shape modelling toolkit (Visual Automation Limited, Manchester, UK), a software program that runs within MATLAB (MathWorks, Inc., Natick, MA, USA).
The data set of landmark coordinates provides the input variables for a point distribution model [26]. This is analysed using principal component analysis to derive orthogonal modes of variation (modes), independent output variables that characterize shape variation across the data set [17].
Each mode is normalized over the whole data set to have a mean of zero and expressed in S.D. units with each image assigned a score for each mode in terms of S.D.s from the mean. Modes are linearly independent.
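As an illustration of how mode scores of this kind can be derived, the following is a minimal sketch using NumPy rather than the toolkit used in this study. It assumes the landmark sets have already been aligned (for example by Procrustes analysis, not shown); the function name and array layout are hypothetical.

```python
import numpy as np


def shape_mode_scores(landmarks: np.ndarray, n_modes: int):
    """Compute mode scores in S.D. units from aligned landmark coordinates.

    landmarks: array of shape (n_images, n_points, 2), assumed pre-aligned.
    Returns (scores, modes, explained_fraction) for the first n_modes.
    """
    n_images = landmarks.shape[0]
    x = landmarks.reshape(n_images, -1)        # one row of x,y values per image
    centred = x - x.mean(axis=0)               # subtract the mean shape
    # Principal component analysis via SVD: rows of vt are orthogonal modes.
    _, s, vt = np.linalg.svd(centred, full_matrices=False)
    modes = vt[:n_modes]
    raw = centred @ modes.T                    # projection onto each mode
    scores = raw / raw.std(axis=0, ddof=1)     # express in S.D. units
    explained = (s ** 2) / np.sum(s ** 2)      # fraction of variance per mode
    return scores, modes, explained[:n_modes]
```

Each column of `scores` then has a mean of zero and unit S.D. across the data set, and the columns are mutually uncorrelated, which matches the properties of the modes described above.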
The ASM algorithm adds a search facility that can be used to automatically search a new image for an object. A robust automated method for segmenting objects in an image is often the primary reason for using ASM, but the objective of this study was to quantify changes in shape associated with disease so, although the automatic search was used to increase the speed at which points were placed, each point was checked and manually re-located if necessary. In order to match the images as closely as possible, all femurs in the ASM had the same orientation, chosen to correspond to a left hip, so images of right hips were flipped about a vertical axis.
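Flipping a right hip about a vertical axis amounts to reflecting the horizontal coordinate of each landmark. A minimal sketch follows, assuming zero-based pixel coordinates with x as the first column; the function name and the `side` argument are hypothetical.

```python
import numpy as np


def mirror_to_left(points, image_width: int, side: str) -> np.ndarray:
    """Return landmark coordinates oriented as a left hip.

    points: sequence of (x, y) pixel coordinates, x measured from the left edge.
    side: "left" or "right"; right hips are reflected about a vertical axis.
    """
    pts = np.asarray(points, dtype=float).copy()
    if side == "right":
        # Reflect x about the vertical midline of the image; y is unchanged.
        pts[:, 0] = (image_width - 1) - pts[:, 0]
    elif side != "left":
        raise ValueError("side must be 'left' or 'right'")
    return pts
```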
Two ASM models were assessed, an existing 16-point model built for our previous analysis using data from the Rotterdam study [17] and, in a development of the subsequently described 29-point model [27], a new 45-point model of the hip, in which osteophytes and acetabulum were explicitly marked (Fig. 1).
For the 45-point model, a new ASM was developed from the hip images in this study, while the 16-point model scores were calculated using the ASM defined in the previous study [17]. This ensured that we could replicate our previous results in a second larger population as well as test whether a more comprehensive model, including more radiographic features of OA, would further improve THR prediction.
In this study we investigated the first six modes of variation of the 16-point model, as these were the modes associated with disease severity and progression in the previous study. The mode numbers for the 16-point model are the same as those described in the previous study.
We selected the first eight modes of variation from the new 45-point model following examination of the scree plot that showed a change in gradient at this point, which is a widely accepted cut-off for determining modes that contain the majority of the information (variance) in the model [28]. These modes together explained >75% of the total variance, and no other mode described >3% of the variance in the model. The mode numbers for the 45-point model are unique to this model; thus mode 2 in the 45-point model does not describe the same shape variation as mode 2 in the 16-point model.
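The two variance checks quoted above (cumulative variance of the retained modes and the largest share among the discarded modes) can be sketched as follows. The elbow of the scree plot was judged by inspection in this study, so the sketch only summarizes a chosen cut-off; the function name is hypothetical and the eigenvalues in the usage example are invented for illustration.

```python
import numpy as np


def variance_summary(eigenvalues, n_keep: int):
    """Summarize how much variance the first n_keep modes capture.

    Returns (fraction explained by the retained modes,
             largest fraction explained by any discarded mode).
    """
    ev = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]  # largest first
    frac = ev / ev.sum()
    kept = float(frac[:n_keep].sum())
    largest_rest = float(frac[n_keep:].max()) if n_keep < len(frac) else 0.0
    return kept, largest_rest


# Illustrative values only, not the eigenvalues of the 45-point model.
kept, largest_rest = variance_summary([50, 20, 10, 10, 5, 5], n_keep=2)
```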

Statistical analysis
Analyses were performed using SPSS version 18.0 (SPSS, Chicago, IL, USA). Spearman's rank correlation was used to investigate associations between baseline shape modes, baseline clinical risk factors, and baseline KLG. Stepwise logistic regression was used to test the relationship between potentially predictive factors and THR. The relationships are described using odds ratios (ORs) with 95% CIs. The influence of baseline age, BMI and gender on THR was assessed. The following clinical factors were analysed based on data derived from the administered questionnaires: use of a stick, duration of pain (recalled time interval of current pain episode) and severity of pain (based on numerical rating). Average stiffness score, average pain score and average physical function score were extracted from the WOMAC, with responses coded and transformed according to standard algorithms (WOMAC users guide VII). These six clinical factors were selected because they had previously been shown to be clinically relevant in predicting radiographic OA in this population [6,20–22]. In addition to these questionnaire data, acetabular depth, femoral head migration, centre-edge angle and baseline mJSW (millimetres) were also included in the model to determine whether baseline ASM adds predictive value over the standard scoring system and clinical risk factors. Student's t-tests and one-way analysis of variance (ANOVA) were also used to test significance where appropriate.

Results
In total, 195 subjects (63 males and 132 females) were recruited into the study. The mean age of the subjects was 62.7 (10.7) years and mean BMI was 27.1 (4.8) kg/m². Using the ACR criteria for hip OA [4], 31.9% of subjects recruited into the study fulfilled the criteria of hip pain and femoral and/or acetabular osteophytes at baseline. Baseline pelvic radiographs were available for 191 of the 195 subjects. Of these, 176 were of sufficient quality to allow KLG scoring. During the study, 38 subjects had a THR and 103 subjects had hips that did not show radiographic progression between baseline and 5-year radiographs (i.e. KLG remained unchanged or changed by only one grade in both hips). From these two groups, 27 and 75 patients, respectively, had a complete set of baseline clinical data and were therefore used to test relationships between ASM and clinical parameters. There were no significant differences in height, weight and BMI between the THR and non-progression groups, respectively [163 vs 166 cm (P = 0.22), 71.2 vs 74.9 kg (P = 0.23) and 26.6 vs 26.5 kg/m² (P = 0.90)]. Although not significant, the percentage of females in the THR group was greater than in the non-progression group [39.5 vs 26.0% (P = 0.12)]. The THR group was, however, significantly older [66.8 vs 61.5 years (P = 0.026)]. There was also a significant difference in baseline KLG (P < 0.001), with those in the THR group more likely to have a higher KLG (KLG 0, 49.5 vs 0.0%; KLG 1, 38.8 vs 13.2%; KLG 2, 6.8 vs 28.9%; KLG 3, 4.9 vs 31.6%; and KLG 4, 0.0 vs 26.3%, non-progressor vs THR group, respectively).

Relationship between ASM mode scores, KLG and other radiographic parameters
Using the 16-point model, significant negative correlations were observed between modes 1 (head deformation), 3 (superior smoothing) and 6 (superior neck flattening) and KLG at baseline (Table 1, Fig. 2). Mode 6 was also significantly correlated with mJSW. Using the 16-point model, significant positive correlations between mode 2 and centre-edge angle and acetabular depth were observed (Table 1). Mode 6 scores were significantly lower in those with supero-lateral and concentric femoral head migration than those with medial femoral head migration (P = 0.001 and P = 0.02, respectively).
Significant correlations with KLG at baseline were also observed for modes 3, 4, 6 and 7 of the new 45-point ASM (Table 1). Significant correlations with baseline mJSW were observed for modes 3, 4 and 6 (Table 1). Mode 3, associated with widening and flattening of the femoral head and neck and osteophyte formation, decreased in parallel with an increase in KLG (Fig. 3C and D). In contrast, mode 4 scores, associated with osteophyte formation, joint space narrowing and a reduction in neck shaft angle, increased in parallel with an increase in KLG (Fig. 3E and F). Mode 6, associated with evidence of osteophyte formation and widening of the femoral head, decreased in parallel with an increase in KLG.
A high mode 2 score was associated with some evidence of lateral osteophytes, femoral head flattening and thickening of the femoral neck, whereas a low score was associated with poor acetabular coverage and a steeper neck shaft angle. Mode 2 was not associated with KLG or baseline mJSW (Fig. 3A and B); however, it was significantly positively correlated with centre-edge angle (Table 1). Mode 2 was significantly higher in those subjects with supero-medial femoral head migration than those with supero-lateral femoral head migration (P = 0.003). In addition, significant positive correlations between mode 4, centre-edge angle and acetabular depth were observed (Table 1).

Relationship between ASM and clinical factors
For the 16-point model, low mode 2 scores correlated with increased baseline pain duration and mode 4 with average baseline pain score (WOMAC) (Table 1). The 45-point model mode 4 scores were significantly higher in those individuals who reported using a stick for mobility (P = 0.018).

Prediction of progression to THR
The predictive value of ASM alone (before and after adjustment for KLG) and in combination with other clinical factors was assessed by logistic regression. Only three of the six clinical factors described above entered into the stepwise model as predicting future THR: use of a stick, poorer physical function (from WOMAC) and longer duration of pain. Age, BMI and gender were not predictive. For the 16-point model, baseline scores of modes 1 and 6 significantly predicted progression to THR in the unadjusted analysis. Following adjustment for KLG, only mode 6 remained predictive, losing statistical significance after adjustment for clinical measures (Table 2).
Likewise, for the 45-point model, baseline scores of modes 3, 4 and 6 significantly predicted progression to THR, but none of these modes remained significant after adjustment for KLG. In contrast, once adjusted for KLG, mode 2 became significant, with ORs showing a 46% reduction in odds of THR for a 1 S.D. increase in baseline score (OR = 0.54, 95% CI 0.29, 0.99). Including clinical factors in the model strengthened this association, giving a 71% reduction in odds of THR for each S.D. increase in baseline mode 2 score (OR = 0.29, 95% CI 0.11, 0.75) (Table 3). Measures of femoral head migration, acetabular depth, centre-edge angle and baseline mJSW were also added to investigate the effect of other measures of hip geometry. Interestingly, even after adjustment for these measures, baseline values of mode 2 remained strongly predictive of THR (OR = 0.17, 95% CI 0.04, 0.71) (Table 3).
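The percentage changes in odds quoted above follow directly from the ORs, since an OR per 1 S.D. increase corresponds to a change in odds of (OR − 1) × 100%. A minimal sketch of this arithmetic is given below; the function names are hypothetical, and the relation OR = exp(β) is the standard interpretation of a logistic regression coefficient rather than output from this study's SPSS models.

```python
import math


def odds_ratio_from_coefficient(beta: float) -> float:
    """Convert a logistic regression coefficient (per 1 S.D.) to an odds ratio."""
    return math.exp(beta)


def percent_change_in_odds(odds_ratio: float) -> float:
    """Percentage change in odds per unit increase; negative means a reduction."""
    return (odds_ratio - 1.0) * 100.0


def ci_excludes_one(lower: float, upper: float) -> bool:
    """A 95% CI that excludes 1 indicates significance at the 5% level."""
    return lower > 1.0 or upper < 1.0


# ORs for mode 2 quoted in the text: adjusted for KLG, then for clinical factors.
for oratio, lower, upper in [(0.54, 0.29, 0.99), (0.29, 0.11, 0.75)]:
    print(round(percent_change_in_odds(oratio)), ci_excludes_one(lower, upper))
```

An OR of 0.54 therefore corresponds to a 46% reduction in odds and an OR of 0.29 to a 71% reduction, in agreement with the figures reported.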
The positive predictive value (PPV) for KLG predicting THR was 57.9% and the negative predictive value (NPV) was 95.2%. When mode 2 was combined with KLG the PPV increased to 73.7% and the NPV was 91.4%.
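For reference, PPV and NPV are computed from the cross-tabulation of predicted against observed outcome. The sketch below assumes a simple 2 × 2 table; the function name is hypothetical and the counts in the usage example are invented for illustration, as the underlying counts are not reported here.

```python
def predictive_values(tp: int, fp: int, tn: int, fn: int):
    """Return (PPV, NPV) from the counts of a 2 x 2 classification table.

    tp: predicted THR and had THR      fp: predicted THR, no THR
    tn: predicted no THR and no THR    fn: predicted no THR, had THR
    """
    if tp + fp == 0 or tn + fn == 0:
        raise ValueError("PPV or NPV is undefined when a predicted class is empty")
    ppv = tp / (tp + fp)   # proportion of predicted THRs that occurred
    npv = tn / (tn + fn)   # proportion of predicted non-THRs that did not occur
    return ppv, npv


# Illustrative counts only, not data from this study.
ppv, npv = predictive_values(tp=8, fp=2, tn=45, fn=5)
```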

Discussion
In this study we identified from baseline radiographs those subjects at greatest risk of having a THR within 5 years of the subjects first presenting to their primary care physician with hip pain. This extends our previous findings that ASM is a powerful tool for assessing risk of progression of hip OA to THR [17]. For the first time we have shown that, even with prior knowledge of KLG and other risk factors, quantification of radiographic images using ASM provides additional predictive ability and is a useful biomarker of OA.
Current whole joint methods for grading disease severity, such as KLG and Croft, are subjective and semiquantitative. Quantitative measures, such as joint space narrowing, account for only one aspect of the disease, in this case cartilage loss. The ASM models presented here found high correlations with KLG but, unlike KLG, present these data on a continuous scale. Thus ASM may provide a more sensitive and quantitative scale to enable small changes in disease progression to be measured.
This study tested two different ASM designs for investigating radiographic hip OA. The first model, used in a previous study, comprised only 16 points to describe the femoral head and neck, the location of the major changes in bone shape caused by OA. The second model used 45 points to encompass the whole of the proximal femur, the acetabular eyebrow and osteophytes (Fig. 1).
Initially we applied the same 16-point model used in our previous study on the Rotterdam cohort to images from the PCR study. In accordance with our previous findings, we found that baseline values of modes 1 (femoral head deformation) and 6 (superior neck flattening) predicted THR [17]. In this study we went further and included KLG and clinical factors (pain duration, use of a stick and WOMAC physical function) in the logistic regression model. Following this adjustment, baseline values of mode 6 (superior neck flattening) still predicted THR, independent of KLG, though significance was lost when clinical factors were included in the regression model. This may be due to the reduction in power of the estimators when adjusting for more factors (Table 2).
Using the new 45-point model, which incorporated the acetabulum, baseline values of modes 3, 4 and 6 predicted THR, although after adjustment for KLG these were no longer significant. Interestingly, once KLG was included in the logistic regression model, mode 2 was found to predict THR. Again, this was taken further and clinical factors (pain duration, use of a stick and WOMAC physical function) as well as the geometrical measures acetabular depth, centre-edge angle, femoral head migration and baseline mJSW were included in the logistic regression model. Following these adjustments, mode 2 still predicted THR.
These data are not only consistent with our previous studies, but also with other groups who have investigated the potential of femoral morphometry to act as an imaging biomarker for OA [29–31]. Indeed, Lynch et al. [30] also used an ASM approach using a model of the proximal femur that was larger than our previous study [17], although it did not include the acetabulum or explicit modelling of osteophytes.
One inherent limitation associated with this study is that the PCR study was designed to investigate hip pain in general instead of OA in particular. This meant that asymptomatic subjects with radiographic OA were not identified and included in the study. Nevertheless, this study confirms the findings of previous studies investigating the link between hip morphology and the incidence, progression or severity of OA. Several factors have been examined, including joint space width, centre-edge angle, femoral head migration, neck shaft angle, dysplasia, impingement and estimates of pistol grip deformity based on femoral head and neck width [29,32–41]. The ASM presented here describes changes in shape across the whole hip joint, incorporating many of these geometric measures identified previously as risk factors for OA.
Potential bias may also have been introduced by only including those subjects in the analysis for which complete clinical data were available. However, no significant differences were observed in age, weight and height between those included in the analysis and those without a complete set of data (data not shown).
In addition to the analysis performed using THR as an outcome measure, we repeated these analyses using progression of ≥2 KLG and/or THR, and progression alone, as outcome measures. Similar results were observed in both analyses and can be found in the supplementary tables (available as supplementary data at Rheumatology Online).
The development of DMOADs is challenging not only in determining the therapeutic target, but also in defining those who should be treated. The slow and variable progression of OA makes it difficult to predict which patients will remain stable and who will progress rapidly. Furthermore, radiographic evidence of joint space narrowing through conventional radiographs (currently the only surrogate endpoint accepted by the FDA as an indicator of disease-modifying efficacy) can take years to demonstrate significant changes. Various biochemical biomarkers have been examined as potential surrogates of joint destruction in OA, but so far the results have been disappointing [42]. The availability of reliable biomarkers that allow identification and stratification of patients likely to progress more rapidly and to measure and track disease-related changes over a short period of time may therefore promote the successful development of DMOADs.
In conclusion, we have confirmed that ASM can effectively measure the severity of OA and identify those at greatest risk of progression to THR. Both the original and the new, extended ASM identified significant differences at baseline between the THR group and those that did not progress. ASM provides additional information to KLG while generating a quantitative scale corresponding to KLG. The new, larger (45-point) model provides a comprehensive model of the hip, including joint space narrowing and osteophytes, and is more powerful than the 16-point model for predicting THR. Mode 2 significantly predicted THR, even after adjustment for KLG, clinical factors and geometry. Further work should establish whether this method can be used to identify at an individual level those at risk of THR and rapid disease progression.
Rheumatology key messages
• Active shape modelling of the hip is a reliable biomarker of radiographic hip osteoarthritis severity.
• Active shape modelling improves identification of patients at higher risk of rapid osteoarthritis progression.
• Active shape modelling improves identification of patients at higher risk of total hip replacement.