Design and implementation of a hybrid genetic algorithm and artificial neural network system for predicting the sizes of unerupted canines and premolars

Medical Sciences, Mashhad, Iran. E-mail: talebim@mums.ac.ir

SUMMARY
The aim of this study was to develop a novel hybrid genetic algorithm and artificial neural network (GA-ANN) system for predicting the sizes of unerupted canines and premolars during the mixed dentition period. This study was performed on 106 untreated subjects (52 girls, 54 boys, aged 13-15 years). Data were obtained from dental cast measurements. A hybrid GA-ANN algorithm was developed to find the best reference teeth and the most accurate mapping function. Based on a regression analysis, the strongest correlation was observed between the sum of the mesiodistal widths of the mandibular canines and premolars and the mesiodistal widths of the mandibular first molars and incisors (r = 0.697). In the maxilla, the highest correlation was observed between the sum of the mesiodistal widths of the canines and premolars and the mesiodistal widths of the mandibular first molars and maxillary central incisors (r = 0.742). The hybrid GA-ANN algorithm selected the mandibular first molars and incisors and the maxillary central incisors as the reference teeth for predicting the sum of the mesiodistal widths of the canines and premolars. The prediction error rates and maximum rates of over/underestimation using the hybrid GA-ANN algorithm were smaller


Introduction
Predicting the size of unerupted teeth during the mixed dentition period is important in managing the developing occlusion of growing children. Accurate prediction can help determine whether the available space in the posterior segments is sufficient to allow the permanent teeth to erupt freely and in good alignment in their respective arches (Kirschen et al., 2000). Three prediction methods are typically used: direct measurement of the unerupted tooth size on radiographs (Staley et al., 1984; De Paula et al., 1995), calculations from prediction equations and tables (Tanaka and Johnston, 1974; Ferguson et al., 1978; Moyers, 1988), and a combination of radiographic measurements and prediction tables (Hixon and Oldfather, 1958; Staley et al., 1984; Bishara et al., 1989).
The Hixon and Oldfather approach is considered to be the most accurate (Gardner, 1979; Irwin et al., 1995), but it is complex and difficult to implement. The Tanaka and Johnston prediction equations and table are also widely used (Al-Khadra, 1993; Schirmer and Wiltshire, 1997). However, they were developed for white North American children and their use is questionable in other populations since tooth sizes vary significantly between and within different ethnicities (Bishara et al., 1989; Al-Khadra, 1993; Schirmer and Wiltshire, 1997). Moyers' regression scheme has achieved widespread clinical acceptance because of its simplicity and ease of application. Several simple linear regression equations have also been proposed for populations of different ethnic origins (Al-Khadra, 1993; Yuen et al., 1998; Lee-Chan et al., 1998; Jaroontham and Godfrey, 2000; Diagne et al., 2003), with mixed dentition analyses varying between different racial and population groups.
The increasing use of information systems in health care and the considerable growth of medical databases require traditional data analyses to be adjusted to new, more efficient computational models. Methods based on artificial intelligence algorithms have found numerous applications in different biomedical sciences. Genetic algorithms (GAs) and artificial neural networks (ANNs) have been widely used for predicting and interpreting biological activities, separately or as hybrid structures (Niwa, 2004; Zheng and Thomson, 2005; Fernandez and Caballero, 2006). In particular, ANNs are widely used to analyze medical data (Meersman et al., 1996; Lo et al., 1998; Ngan and Hu, 1999) and have also been used for dental caries prediction (Devito et al., 2008).
The aim of this study was to develop a hybrid GA-ANN algorithm to identify the best reference teeth for accurately predicting the size of unerupted canines and premolars.
S. Moghimi*, M. Talebi** and I. Parisay***

Data preparation
This cross-sectional study was carried out on 106 subjects (52 girls, 54 boys, aged 13-15 years) selected from a group of 1400 students in the seven regional directories of Mashhad, Iran. Informed consent was obtained prior to the investigation. Inclusion criteria were: complete permanent dentition with no caries, proximal restorations, attrition, or dental anomalies; all teeth fully erupted to the occlusal plane; no previous or ongoing orthodontic treatment; and no transverse discrepancies, such as a crossbite or scissor bite.
Polysilicone impressions (Coltene speedex, Polysiloxane condensation type silicone elastomer, Whaledent Inc., Mahwah, New Jersey, USA) were obtained from each subject and were poured in type IV plaster (Zhermack Elite Rock Gypsums, Elite Rock Class IV Die Stone, Zhermack SpA., Badia Polesine, Italy) on the same day by a technician. Measurements of the maximum mesiodistal widths of all mandibular and maxillary incisors, canines, premolars, and first molars (upper lateral incisors were not included as predictors because of their form variability) were carried out with an electronic digital vernier calliper (Zhenjiang Richoice Machinery Imp. & Exp. Co., Ltd, Jiangsu, China), with an accuracy of ±0.02 mm and repeatability of ±0.01 mm. For easier access to the interdental spaces, the measuring tips of the calliper were narrowed. Since the narrowed tips fit the interdental spaces, they provide reliable estimations of the teeth sizes. The measurements were obtained perpendicular to the occlusal plane. All the measurements were carried out by one author (IP) and repeated after 24 hours. Intra-observer reliability was predetermined as 0.2 mm. When the two sets of measurements varied by 0.2 mm or less, the measurements were averaged; otherwise, a new set of measurements was made and the nearest three measurements were averaged.

Development of the intelligent hybrid algorithm
The role of the GA is to find the best reference input for the prediction process, while the ANN searches for the best mapping function to predict the targets (sizes of premolars and canines) based on the inputs provided by the GA.
The GA is a search algorithm based on survival of the fittest among string structures. Since 10 measurements were made on each dental cast (six on the upper dental cast and four on the lower dental cast), each GA chromosome was a 10 × 1 binary vector of 0s and 1s. To obtain the best possible results, a fitness function was defined based on the error of the neural network, as follows: the maximum and minimum error and standard deviation were used in the fitness function, which forced the algorithm to focus on reducing the error range, maximum, and minimum. A population size of 50 and 200 generations were used to predict the mesiodistal widths of the unerupted maxillary and mandibular canines and premolars. The GA tends to converge on a near-optimal solution through iterative use of three operators (selection, crossover, and mutation; Baese, 2004). Selection determines the survival of each population member based on the value of the fitness function; mutation arbitrarily alters one or more components of a selected chromosome; and crossover allows the search to fan out in diverse directions by combining the chromosomes of the best population members to produce the next generation. In the GA, the crossover fraction was set to 0.8.
The proposed algorithm was trained as follows. During each iteration, the GA introduced the reference teeth (by setting the corresponding values in the chromosomal binary vector to 1) into the ANN. The ANN then tried to find the best mapping function for relating the reference inputs to the targets. Next, the algorithm checked the stopping criteria. If they were satisfied, the results were reported; if not, the GA moved to the next generation, searching for better possible candidates among the reference teeth. This process was repeated until the algorithm found a result that satisfied the stopping criteria or until the number of generations exceeded the predefined value. This process is illustrated in Figure 1.
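As a rough illustration of the loop described above, the following Python sketch evolves 10-bit chromosomes using selection, crossover (fraction 0.8), and mutation. It is not the authors' MATLAB implementation: a least-squares line fitted to the sum of the selected widths stands in for the ANN, and the fitness weighting, binary tournament selection, elitism, mutation rate, and all function names are assumptions made purely for illustration.

```python
import random
import statistics

N_GENES = 10  # one bit per cast measurement, as in the text


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = a*x + b."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        return 0.0, my
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return a, my - a * mx


def fitness(chrom, rows, targets):
    """Lower is better. ASSUMPTION: max |error| + SD of error; a line
    on the summed selected widths replaces the ANN used in the study."""
    if not any(chrom):
        return float("inf")  # no reference teeth selected
    xs = [sum(v for v, bit in zip(row, chrom) if bit) for row in rows]
    a, b = fit_line(xs, targets)
    errs = [a * x + b - t for x, t in zip(xs, targets)]
    return max(abs(e) for e in errs) + statistics.pstdev(errs)


def tournament(scored, rng):
    """Binary tournament: return the fitter of two random members."""
    (fa, a), (fb, b) = rng.sample(scored, 2)
    return a if fa <= fb else b


def run_ga(rows, targets, pop_size=50, generations=200,
           crossover_fraction=0.8, mutation_rate=0.05, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(N_GENES)]
           for _ in range(pop_size)]
    best = min(pop, key=lambda c: fitness(c, rows, targets))
    for _ in range(generations):
        scored = [(fitness(c, rows, targets), c) for c in pop]
        nxt = [best[:]]  # elitism: carry the best chromosome forward
        while len(nxt) < pop_size:
            p1, p2 = tournament(scored, rng), tournament(scored, rng)
            if rng.random() < crossover_fraction:
                cut = rng.randint(1, N_GENES - 1)  # single-point crossover
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # bit-flip mutation
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            nxt.append(child)
        pop = nxt
        best = min(pop, key=lambda c: fitness(c, rows, targets))
    return best, fitness(best, rows, targets)
```

Here `rows` would hold the 10 measured widths per subject and `targets` the summed canine and premolar widths; the returned chromosome marks the selected reference measurements with 1s.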
Due to the dimensions of the input vector, five-fold cross-validation was used to avoid any bias or error during testing of the prediction efficiency of the proposed algorithm. This provides an estimate of the generalization error through resampling the training and test samples, which were selected randomly.
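The five-fold resampling mentioned above can be sketched as follows. This is a minimal illustration, not the study's code; the function name `k_fold_indices` and the seeded shuffle are assumptions.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation.

    Subjects are shuffled once, then dealt into k folds; each fold
    serves as the test set exactly once.
    """
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test
```

With the 106 subjects of this study, each test fold holds 21 or 22 casts (about 20 per cent of the cases) and the remaining casts are used for training.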

To reduce the number of measurements used as input vectors, a linear inequality constraint was defined for the GA:

$$\sum_{i=1}^{10} V_i \le c$$

where V is the 10 × 1 input binary vector. By changing the value of c, the maximum number of references (the number of chromosome entries set to one) that the GA is allowed to select may be defined. For example, by setting c = 4, the GA was forced to use at most four reference values.
A multilayer perceptron neural network was used. The ANN had two hidden layers with eight and four neurons, respectively. The nodes of the first hidden layer had a tan-sigmoid transfer function and those of the second hidden and output layers had purely linear transfer functions. The weights and biases of the network were initialized with random values. Batch training was used, in which the weights between the neurons were updated only after all the training samples had been presented to the network. The Levenberg-Marquardt method (Wu, 2008) was used for training, which gave the best possible results.
Once the training was completed, the GA was no longer used in the testing process. The reference teeth introduced by the GA as the best input candidates were used by the ANN to predict the output with the mapping function developed during training.
All processing steps for the development of the hybrid GA-ANN algorithm were performed using MATLAB version 7.8.0.347 (MathWorks, Inc., Natick, Massachusetts, USA).
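The cap on the number of selected references can be illustrated with a short Python sketch. The source does not say how the constraint was enforced inside the GA, so the repair step below (randomly clearing surplus bits) is only one hypothetical way to keep a chromosome feasible; both function names are assumptions.

```python
import random


def is_feasible(chrom, c):
    """True when at most c entries of the binary chromosome are 1."""
    return sum(chrom) <= c


def repair(chrom, c, rng):
    """Return a copy of chrom with randomly chosen surplus 1s cleared.

    ASSUMPTION: random repair is illustrative, not the study's method.
    """
    chrom = list(chrom)
    ones = [i for i, bit in enumerate(chrom) if bit]
    for i in rng.sample(ones, max(0, len(ones) - c)):
        chrom[i] = 0
    return chrom
```

For example, with c = 4 a chromosome that selects all 10 measurements is reduced to four selected entries, while a chromosome that already selects two is left unchanged.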
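The network shape described above (eight tan-sigmoid units, four linear units, one linear output) can be sketched as a plain forward pass. The tan-sigmoid function is mathematically identical to tanh. Training with the Levenberg-Marquardt method is not shown, and the number of inputs, the initialization range, and the function names are assumptions for illustration only.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Random weights and biases (ASSUMED range of -0.5 to 0.5)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [rng.uniform(-0.5, 0.5) for _ in range(n_out)]
    return weights, biases


def dense(weights, biases, x, activation=None):
    """One fully connected layer; linear when activation is None."""
    out = [sum(w * xi for w, xi in zip(row, x)) + b
           for row, b in zip(weights, biases)]
    return [activation(v) for v in out] if activation else out


def forward(layers, x):
    """Forward pass: tanh hidden layer, linear hidden layer, linear output."""
    (w1, b1), (w2, b2), (w3, b3) = layers
    h1 = dense(w1, b1, x, math.tanh)  # 8 tan-sigmoid units
    h2 = dense(w2, b2, h1)            # 4 linear units
    return dense(w3, b3, h2)[0]       # single linear output (mm)
```

A network for four reference widths would be built as `[init_layer(4, 8, rng), init_layer(8, 4, rng), init_layer(4, 1, rng)]`, with the input size following whichever measurements the GA selects.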

Regression analysis
The Kolmogorov-Smirnov test was used to confirm data normality. Pearson product-moment coefficients were used to estimate the correlation between the mesiodistal widths of the canines and premolars in the maxilla and mandible and those of the reference teeth (maxillary central incisors and first molars, mandibular incisors and first molars). Statistical calculations and analyses, including error variance of the estimates (mean square of error), correlation coefficients (r), coefficients of determination (r²), and analysis of variance (ANOVA) of the regression equations, were performed using the Statistical Package for Social Sciences version 15.1 (SPSS Inc., Chicago, Illinois, USA). Statistical significance was established at the 0.05 level. Table 1 shows the linear regression equations derived for predicting the mesiodistal widths of the mandibular canines and premolars using different reference teeth. The linear equations were obtained using the sum of the mesiodistal widths of the reference teeth as inputs.
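The quantities named above (r, r², the regression line, and the mean square of error) can be computed for a single summed predictor as in the following Python sketch. The study used SPSS; these helper names are assumptions, and the residual mean square uses n − 2 degrees of freedom, as is standard for simple linear regression.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def linear_fit(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def mean_square_error(xs, ys, slope, intercept):
    """Residual sum of squares divided by n - 2 degrees of freedom."""
    sse = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return sse / (len(xs) - 2)
```

Here `xs` would be the summed mesiodistal widths of the chosen reference teeth and `ys` the summed widths of the canines and premolars; r² is simply `pearson_r(xs, ys) ** 2`.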

Regression analyses
The highest correlation in the mandible was between the sum of the mesiodistal widths of the canines and premolars and the mesiodistal widths of the mandibular first molars and central and lateral incisors (r = 0.697). The prediction errors (estimated as the sum of the mesiodistal widths of the canines and premolars on both sides of a dental arch) are indicated in Figure 2a.
The linear regression equations used to predict the mesiodistal widths of the maxillary canines and premolars are shown in Table 2. As in the previous case, the linear equations were obtained using the sum of the mesiodistal widths of the reference teeth as inputs. The highest correlation in the maxilla was between the sum of the mesiodistal widths of the canines and premolars and the mesiodistal widths of the mandibular first molars and maxillary central incisors (r = 0.742). The prediction errors are indicated in Figure 2b.

Hybrid GA-ANN
The prediction error of the hybrid GA-ANN algorithm is illustrated in Figure 3. The reference teeth selected by the GA were the mandibular first molars, central and lateral incisors, and maxillary central incisors. The same references were used to predict the mesiodistal widths of both the mandibular and maxillary canines and premolars. The under/overestimation rates of the real sizes of the maxillary canines and premolars were less than those of the mandible. Table 3 compares the maximum errors in predicting the sizes of the mandibular and maxillary canines and premolars using linear regression analysis and the hybrid GA-ANN. The maximum prediction errors obtained using the hybrid GA-ANN were noticeably smaller than those obtained using linear regression. The prediction accuracy was higher in the maxilla than in the mandible when using the hybrid GA-ANN algorithm.

The maximum over/underestimation errors were plotted as a function of c (Figure 4). These are the errors in predicting the sizes of the maxillary canines and premolars for the same test samples (20 per cent of the total cases, selected randomly). For c = 4, the algorithm selected the maxillary and mandibular central incisors and the mandibular first molars as reference teeth. When c = 3, the selected references were the maxillary central incisors and mandibular first molars. When c = 2, the selected references were the maxillary central incisors. The absolute value of the rates of under/overestimation increased when the algorithm was forced to select fewer references.

Figure 2
Prediction error in estimating the mesiodistal widths of the canines and premolars in the mandible (a) and maxilla (b) using linear regression and the reference teeth with the highest and second highest correlation in (a)

Figure 3
Prediction error in estimating the mesiodistal widths of the canines and premolars in the mandible and maxilla using the hybrid genetic algorithm and artificial neural network.

Discussion
Different racial and ethnic groups present variations in the mesiodistal widths of their permanent teeth (Bishara et al., 1989; Schirmer and Wiltshire, 1997; Niwa, 2004; Zheng and Thomson, 2005; Fernández and Caballero, 2006). Thus, equations defined by data collected from one ethnic group to predict the size of unerupted permanent teeth might not be applicable to another (Al-Khadra, 1993; Yuen et al., 1998). In contrast, the hybrid algorithm adjusts its structure based on the training samples presented to the system. Therefore, the hybrid GA-ANN algorithm can be used to predict the size of unerupted teeth in different ethnic groups, provided that an appropriate training data set is presented to the system.
An ideal prediction method should result in no difference between the predicted and actual widths of the permanent canines and premolars. However, prediction methods are not 100 per cent precise and can over- or underestimate the actual sizes of unerupted teeth (Fisk and Markin, 1979; Moyers, 1988). Overestimation by 1 mm of the actual widths of the permanent canines and premolars on each side of the arch is better than underestimation at preventing crowding and should not seriously affect an extraction decision (De Paula et al., 1995; Bernabé and Flores-Mir, 2005). Using the hybrid GA-ANN algorithm, the prediction error for the maxillary unerupted teeth was between −1 and 0 mm a greater percentage of the time than the prediction error for the mandible. Thus, when using the hybrid GA-ANN algorithm, the prediction accuracy was higher in the maxilla than in the mandible. For both the maxilla and mandible, the prediction error was between −1 and 0 mm a greater percentage of the time when using the GA-ANN algorithm than when regression equations were used. This indicates that the hybrid GA-ANN algorithm provided more accurate results than the most correlated regression equations. In both the mandible and maxilla, the percentage of prediction errors >1.5 mm or <−1.5 mm was noticeably smaller using the hybrid GA-ANN algorithm than when regression equations were used. This is because the fitness function in equation 1 was defined to minimize the maximum absolute error value. During regression analysis, the sizes of the mandibular first molars and maxillary central incisors were strongly correlated with those of the maxillary canines and premolars (r = 0.742). In the mandible, the highest correlation was observed between the sizes of the mandibular first molars and incisors and the sizes of the canines and premolars (r = 0.697).
The GA-ANN selected the mandibular first molars and incisors, and maxillary central incisors as the reference teeth for predicting the sizes of the unerupted mandibular and maxillary canines and premolars.
Reducing the value of c, which limited the number of references that the GA-ANN algorithm could choose, increased the maximum error values. However, the maximum error was not large compared with errors obtained using the linear regression equations. For c = 2, the algorithm identified the mandibular first molar as the most suitable reference tooth for predicting the sizes of the canines and premolars. This agreed well with the regression analyses, in which the correlation between the maxillary central incisors and the canines and premolars was 0.697. This was also the highest correlation observed when using a single reference for prediction.
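The error bands used in this comparison (errors between −1 and 0 mm, and errors beyond ±1.5 mm) can be tallied as in the following Python sketch. The function name is hypothetical, the example values are invented for illustration only, and the errors are assumed to be supplied in millimetres with whatever sign convention the analysis adopts.

```python
def band_shares(errors_mm):
    """Return (% of errors in [-1, 0] mm, % of errors beyond +/-1.5 mm)."""
    n = len(errors_mm)
    within = sum(1 for e in errors_mm if -1.0 <= e <= 0.0)
    extreme = sum(1 for e in errors_mm if e > 1.5 or e < -1.5)
    return 100.0 * within / n, 100.0 * extreme / n


# Made-up errors, not data from the study
example = [-0.5, -1.0, 0.0, 0.3, 1.6, -2.0, 0.9, -0.2]
within_pct, extreme_pct = band_shares(example)
```

Computing both percentages for each method on the same test samples gives the kind of side-by-side comparison discussed above.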
The data were prepared by manual measurements. For ease of use of the introduced algorithm, a graphical user interface was developed (Figure 5). The system parameters vary depending on whether the prediction is for unerupted teeth in the mandible or maxilla; thus, the user first selects the aim of the prediction (i.e. mandible or maxilla) and then uploads a digital photograph of the corresponding cast. The measurement system is calibrated by clicking on a 10 mm line (e.g. a ruler) three times, and the average value of the selected distance is recorded as the calibration basis. The user then clicks three times on the two ends of the predefined teeth in the captured cast image. The recorded measurements are fed into the algorithm software, and the results are provided in both graphical and numerical form. In the calibration phase, the horizontal or vertical distance between the selected points is recorded (based on the positioning of the calibration element), while in the measurement phase the Euclidean distance is calculated since the teeth may be oriented in many different directions. As the static occlusal view may not be efficient for finding the mesiodistal width of tilted teeth, digital photographs are not useful for predicting the width of the unerupted teeth when measurements need to be carried out on such teeth.
This study has several limitations. Because of its regional nature, the proposed prediction method was only tested in one ethnic group. More generalized studies in different ethnic groups are needed to validate the feasibility of the proposed method.
Multivariable regression could provide more accurate results compared with the single variable regression technique adopted in this study. However, these equations are complex and difficult to memorize, justifying the use of the simple equations adopted (Staley et al., 1984; Nourallah et al., 2002; Legović et al., 2003; Bernabé and Flores-Mir, 2005).
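The click-based calibration and measurement steps described for the graphical interface can be sketched in Python as follows. This is an assumption-laden illustration, not the interface's actual code: the function names, the `axis` argument, and the averaging of the three repeated tooth clicks are hypothetical.

```python
import math


def mm_per_pixel(clicks, known_mm=10.0, axis="x"):
    """Scale factor from repeated clicks on a line of known length.

    clicks: list of ((x1, y1), (x2, y2)) pixel pairs. Only the
    horizontal ("x") or vertical ("y") span is used, matching the
    orientation of the calibration element.
    """
    k = 0 if axis == "x" else 1
    spans = [abs(end[k] - start[k]) for start, end in clicks]
    return known_mm / (sum(spans) / len(spans))


def width_mm(clicks, scale):
    """Mean Euclidean distance of repeated tooth clicks, in mm.

    ASSUMPTION: the three repeated clicks per tooth are averaged.
    """
    dists = [math.hypot(end[0] - start[0], end[1] - start[1])
             for start, end in clicks]
    return scale * sum(dists) / len(dists)
```

The Euclidean distance is used for the teeth because, unlike the ruler, they are not aligned with the image axes.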

Conclusions
The proposed technique is a promising tool for predicting the sizes of unerupted canines and premolars with greater accuracy than linear regression analyses. Most notably, the structure of the technique can be adjusted based on the data collected from different ethnic groups.