Deep machine learning of the spectral power distribution of the LED system with multiple degradation mechanisms

The performance and reliability of the light-emitting diode (LED) system significantly depend on the thermal–mechanical loading-enhanced multiple degradation mechanisms and their interactions. The complexity of the LED system restricts the theoretical understanding of the root causes of the luminous fluctuation and the establishment of a direct correlation between the thermal aging loading and the luminous outputs. This paper applies deep machine learning techniques and develops a gated network with a two-step learning algorithm to build the empirical relationship between the design parameters, the thermal aging loading and the luminous output of LED products. The flexibility of the proposed method is demonstrated by integrating it with different neural network architectures. The proposed gated network concept has been validated on both a multiple LED chip packaging and an LED luminaire under thermal aging loading. The validation with the luminous data of the multiple LED chip packaging shows that the maximum differences of the correlated color temperature (CCT) and the color coordinate are 2.6% and 1.0%, respectively. Moreover, the machine learning results of the LED luminaire show that the differences of lumen depreciation, CCT and color coordinate are 1.6%, 1.9% and 1.1%, respectively, after 2160 h of thermal aging.


INTRODUCTION
The light-emitting diode (LED) has become one of the most promising lighting solutions owing to its energy efficiency, flexible controllability and long lifetime [1,2]. An LED lamp or luminaire is a complex system that mainly comprises an LED light source, a driver, control gears, secondary optical parts and heat dissipation components, as shown in Fig. 1a. At the LED light source level, the silicone encapsulants are combined with phosphors to form a composite that converts the blue light of the LED chip into "white" light [3], as shown in Fig. 1b. The spectral power distribution (SPD) is considered the fingerprint of an LED system (shown in Fig. 1c), from which both the luminous flux (in lumen) and the color shift (in terms of color coordinates) can be determined.
The LED chip often has a lifetime as long as 25 000-100 000 h, but the LED lamp or system has a shorter life. Recent studies have observed that many mechanical failures cause fast degradation of the LED system. Water trapped inside the silicone may cause bubble generation [4,5]. A higher temperature at the phosphor/silicone interface can cause discoloration, decohesion and cracking of the interface layer [6]. Another study revealed that the addition of phosphor combined with high-temperature aging stiffens the silicone matrix significantly: the Young's modulus of the silicone increases with aging time, which causes severe stress conditions and cracks in the material [7]. In addition, die-attach delamination [8,9], broken bond wires, solder joint cracking, etc. also contribute to the catastrophic or degradation failure of the LED system [1]. The thermal dissipation design plays an important role [10]. Moreover, the LED driver, which is a very complicated subsystem within an LED lamp, has a significant impact on the LED's performance, as the driver regulates the electrical current input of the LED system [11][12][13][14].
Many studies have focused on a single physics or a single mechanism of LED system degradation. For example, a physics-of-failure-based reliability prediction methodology for LED drivers has been developed to estimate the failure rate distribution of an electrolytic capacitor in given LED driver systems [12]. A coupled electronic-thermal simulation has been carried out to study the relationship between the driver's output current and the luminous flux throughout the operation life [13]. More recently, a comprehensive study investigated the effect of humidity and phosphor on the moisture absorption, hygroscopic swelling, mechanical behavior and thermal properties of the silicone/phosphor composite in comparison with pure silicone [5].
In recent years, machine learning methods have been widely used in various research domains and have proven able to handle complexity with high nonlinearity and multivariate relationships. In the field of electronic packaging, for known failure mechanisms, Chou et al. and Liu et al. did pioneering work using the AI-assisted design and simulation concept for solder joint reliability analysis [15][16][17]. Yuan et al. extended this concept to predict the risk of wafer-level chip-scale packaging under thermal cycling loading, where only the fatigue failure mechanism of the solder joints is considered. Sequential neural networks, including the recurrent neural network (RNN) and long short-term memory (LSTM), are applied because of the time-dependent nature of the degradation mechanism [18,19]. Fan et al. developed a decomposition approach of the SPD with artificial neural network (ANN) learning to correlate the thermal impact to a multiple LED chip packaging [20] under various input current and case temperature combinations.
Unlike the authors' previous approach, which used machine learning as an assisting method for finite element simulation, this paper develops a deep neural network with a new gated network structure to train on and analyze the LED system spectrum degradation directly. This research aims to establish a simple numerical neural network model that learns the luminous measurement by TM-35-19 [21], which might contain multiple degradation mechanisms, but not to replace that measurement. Compared with the approach in [20], the decomposition process is omitted, and a two-step learning approach has been developed to accelerate the machine learning speed of the gated network. Moreover, the flexibility of the gated network makes it possible to merge it with most neural network architectures, e.g. ANN, RNN and LSTM, which broadens the application. In this paper, the ANN-based gated network is first validated by the datasets from [20]. Afterward, because of the time-dependent nature of aging, the sequential network model-based gated network is trained to learn the SPDs of the LED luminaire under thermal aging loading. This paper is organized as follows: In Section 2, the concept of the gated neural network architecture, the two-step learning procedure and the definition of the average error norm are provided. Section 3 shows the SPDs obtained from the multiple LED chip packaging and the LED luminaire. Next, in Section 4, the gated network with the two-step algorithm is validated by the SPDs of the multiple LED chip packaging. Additionally, the LED lamp aging test results are applied to test the accuracy of the sequential gated neural networks. Herein, three neural network architectures are applied to compare the learning efficiency. Section 5 summarizes this paper.

THEORY
Gated neural network architecture
In this paper, the direct representation from the physical parameters to the full spectrum of the LED model is designed in the following format:

$$ s_p = f(wl, \mathbf{g}, \mathbf{p}) \qquad (1) $$

where f(·) is the direct correlation function and $wl$, $\mathbf{g}$, $\mathbf{p}$ and $s_p$ are the wavelength, gate vector, physical parameter vector and spectrum power distribution, respectively. Let f(·) be a deep neural network; then Eq. (1) can be visualized as Fig. 2. Each component of the gate vector g represents a turning point and/or a point where the second derivative (curvature) changes. In combination with wl (wavelength), the neurons of the second layer of the gated neural network are able to extract the characteristics of the SPD and the physical parameters (e.g. junction temperature, input current). Two more fully connected hidden layers, after the above-mentioned layer, are designed to capture the influence of the physical parameters on the SPD. Here, we define a neural network architecture that applies the gate vector (g) as the gated neural network architecture. Beyond a time-independent neural network framework, the gated neural network architecture of Fig. 2 can also be applied to time-dependent/sequential deep machine learning algorithms, such as the RNN and LSTM frameworks.
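The mapping from wavelength, gate vector and physical parameters to a spectrum power can be sketched as a small forward pass. This is only a minimal illustration under stated assumptions, not the network of Fig. 2: the form of the gate layer (one sigmoid neuron per gate, fed by the offset between the wavelength and the gate), the layer width, the scaling constant and all function names (`gated_forward`, `make_params`, `init_layer`) are hypothetical choices made for this sketch. The gate wavelengths are those quoted for the multiple chip packaging, and the physical parameters are assumed to be normalized beforehand.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def init_layer(n_in, n_out, rng):
    """Random initial guess for one fully connected layer."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-1, 1) for _ in range(n_out)]
    return weights, biases

def dense(inputs, weights, biases):
    """Fully connected sigmoid layer; weights[j][i] links input i to neuron j."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def make_params(n_gates, n_phys, width, rng):
    return {
        "gate": ([rng.uniform(-1, 1) for _ in range(n_gates)],
                 [rng.uniform(-1, 1) for _ in range(n_gates)]),
        "hidden": [init_layer(n_gates + n_phys, width, rng),
                   init_layer(width, width, rng)],
        "out": ([rng.uniform(-1, 1) for _ in range(width)], 0.0),
    }

def gated_forward(wl, gates, p, params, scale=50.0):
    """Return one spectrum power s_p = f(wl, g, p) for a single wavelength."""
    gate_w, gate_b = params["gate"]
    # Assumed gate layer: each neuron sees the wavelength relative to its gate.
    h = [sigmoid(w * (wl - g) / scale + b)
         for w, b, g in zip(gate_w, gate_b, gates)]
    h = h + list(p)                      # append the physical parameters
    for weights, biases in params["hidden"]:
        h = dense(h, weights, biases)    # two fully connected hidden layers
    out_w, out_b = params["out"]
    return sum(w * x for w, x in zip(out_w, h)) + out_b

rng = random.Random(0)
gates = [380, 455, 475, 495, 545, 595, 620, 640]  # gate wavelengths in nm
params = make_params(len(gates), n_phys=2, width=6, rng=rng)
# p holds a normalized case temperature and input current (illustrative values).
y = gated_forward(500.0, gates, [0.6, 0.7], params)
```

Evaluating `gated_forward` over a grid of wavelengths for a fixed `p` would give one predicted SPD; the weights here are untrained random values, so the output has no physical meaning until the network is fitted.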

Two-step learning procedure
According to Fig. 2, the gated network is complicated and is composed of many weights. At the beginning of the learning procedure, initial guesses of these weights are required, which often makes the learning procedure unstable. Therefore, a two-step learning procedure is proposed to reduce the influence of this instability. First, a typical SPD is selected from the total datasets, and the correlation between the wavelength (wl) and the spectrum power ($s_p$) is trained, as shown in Fig. 3a. Next, a new network structure that includes multiple physical parameters (p) is established, as shown in Fig. 3b. In this new network structure, the weights that belong to the previous network structure are initialized with the previously trained values. Moreover, during the first few hundred learning cycles, these weights are kept fixed to obtain a fast convergence of the training of the new weights; in Fig. 3b, they are the weights that belong to the shaded neurons. Only after a certain number of learning iterations are the constraints on these weights released to acquire a better convergence. In this paper, the sigmoid activation function is applied in order to secure the convergence stability and the high-order continuity of Eq. (1).
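The freeze-then-release schedule of the two-step procedure can be illustrated on a toy problem. The sketch below is not the gated network: it assumes a hypothetical two-weight linear model trained by plain gradient descent on a mean-squared error, and the data, the learning rate and the iteration counts are invented for illustration. It only shows the order of operations: train the baseline weight, carry it into the larger model, keep it fixed while the new weight is trained, and then release it.

```python
def train(data, weights, frozen, n_iter, lr=0.1):
    """Gradient descent on the mean-squared error of s = sum_k w[k] * x[k].

    Weights whose names are in `frozen` are read but never updated.
    """
    w = dict(weights)
    for _ in range(n_iter):
        grad = {k: 0.0 for k in w}
        for x, target in data:
            err = sum(w[k] * x[k] for k in w) - target
            for k in w:
                grad[k] += 2.0 * err * x[k] / len(data)
        for k in w:
            if k not in frozen:
                w[k] -= lr * grad[k]
    return w

# Toy ground truth (assumption): s = 2.0 * wl + 0.5 * p, with normalized inputs.
wl_grid = [0.0, 0.25, 0.5, 0.75, 1.0]
baseline = [({"wl": wl, "p": 0.0}, 2.0 * wl) for wl in wl_grid]
full = [({"wl": wl, "p": p}, 2.0 * wl + 0.5 * p)
        for wl in wl_grid for p in (-1.0, 0.0, 1.0)]

# Step 1: learn the wavelength dependence from one baseline SPD only.
w_step1 = train(baseline, {"wl": 0.0}, frozen=set(), n_iter=300)

# Step 2a: new structure with the physical parameter; the step-1 weight is
# copied over and kept fixed while the new weight is trained.
w_frozen = train(full, {"wl": w_step1["wl"], "p": 0.0},
                 frozen={"wl"}, n_iter=200)

# Step 2b: release the constraint and continue training all weights.
w_final = train(full, w_frozen, frozen=set(), n_iter=200)
```

In a real network the frozen set would be the weights of the shaded neurons in Fig. 3b, and the point at which they are released would be chosen from the convergence curve.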
For the time-dependent aging SPD learning, under the assumption that Eq. (1) is valid, a degradation parameter s(t) can be introduced, and the time-dependent aging SPD function can be written in terms of $f_0$ and s(t) as Eq. (2). Equation (2) implies a two-step learning: $f_0$ is the first-step learning of the SPD before aging, and the learning of the degradation parameter s(t) is the second step.

Average error norm
The accuracy of the learning results is evaluated by the average error norm. Regarding Eq. (1), assume that the true and predicted spectrum powers are denoted as $s_{p,\mathrm{true}}$ and $s_{p,\mathrm{pred}}$, respectively. The error of each spectrum power can be defined as $e_s = (s_{p,\mathrm{true}} - s_{p,\mathrm{pred}})/s_{p,\mathrm{true}}$, which is also the cost function of the gated neural network learning algorithm. Hence, the error norm of an SPD can be defined as in Eq. (3), where n is the number of spectrum errors. Therefore, the average error norm (AEN) of m SPDs is defined as in Eq. (4).

EXPERIMENTAL DATA
Multiple LED chip packaging under different case temperatures and input currents
A multiple chip packaging, which consists of cyan, blue and red LEDs covered by a yellow phosphor, has been selected as a test carrier, as shown in Fig. 4. Under various case temperatures and input currents, the SPD has been measured by an integrating sphere system with control of the case temperature, as reported in [20]. Figure 5a-e shows the averaged SPD data obtained at the five case temperature levels of 25, 40, 60, 70 and 80°C, respectively. In each panel of Fig. 5, the SPDs at different input current levels are shown. The color characteristics derived from the SPDs are listed in Table 1, including the correlated color temperature (CCT) and the color coordinate based on the CIE 1931 standard. Because three LED chips have been applied, there are three major peaks, which appear in all panels of Fig. 5. The flat plateau at ∼530-625 nm is the contribution of the emission of the phosphor, which is excited by the blue and cyan chips. Across all combinations of case temperature and current, the basic SPD shape does not change. The spectral power increases with increasing input current. Moreover, the spectral power drops significantly as the case temperature rises. Furthermore, the peak value of the LED chip changes slightly at different case temperatures.

Figure 6
The LED luminaire under the aging test.

Time-dependent LED luminaire aging
Consider five LED luminaires, each including multiple white-light LED packaging (blue chip with yellow phosphors) on a printed circuit board, reflector, plastic lens, electronic driver and luminaire, shown in Fig. 6, under an accelerated power-on lifetime test (Fig. 7) at an ambient temperature of 50°C [13]. The SPDs averaged over the five lamps during the thermal aging loading are shown in Fig. 8, where multiple degradation phenomena, including those of the LED chip/packaging, the yellowing of the reflectors/lens and the drivers, might occur simultaneously. The SPD data have been collected every 240 aging hours, giving a total of 10 SPD datasets for each luminaire. The SPDs averaged over these five luminaires have been selected as the total datasets. Table 2 lists the lumen maintenance and color characteristics derived from the SPDs shown in Fig. 8, where the CIE 1931 X-Y color coordinate data are derived from the integral of the SPD with the standard CIE 1931 color-matching functions.
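The derivation of the color coordinates from an SPD follows the standard CIE 1931 recipe: the SPD is integrated against the three color-matching functions to obtain the tristimulus values X, Y and Z, and the chromaticity is x = X/(X+Y+Z), y = Y/(X+Y+Z). The sketch below uses trapezoidal integration; the function name `chromaticity_xy` is hypothetical, and the color-matching values in the usage lines are placeholders rather than real CIE data, which would have to be taken from the published tables.

```python
def chromaticity_xy(wavelengths, spd, xbar, ybar, zbar):
    """CIE 1931 (x, y) from an SPD sampled on `wavelengths` (nm).

    xbar, ybar, zbar are the color-matching functions sampled on the same
    wavelength grid as the SPD.
    """
    def integrate(values):
        # Trapezoidal rule, valid for non-uniform wavelength spacing too.
        total = 0.0
        for i in range(len(wavelengths) - 1):
            step = wavelengths[i + 1] - wavelengths[i]
            total += 0.5 * (values[i] + values[i + 1]) * step
        return total

    X = integrate([s * c for s, c in zip(spd, xbar)])
    Y = integrate([s * c for s, c in zip(spd, ybar)])
    Z = integrate([s * c for s, c in zip(spd, zbar)])
    total = X + Y + Z
    return X / total, Y / total

# Placeholder data only: identical dummy "color-matching functions" so that
# the result is easy to check by hand; these are NOT the CIE 1931 tables.
wl = [400.0, 500.0, 600.0, 700.0]
dummy_cmf = [0.1, 0.5, 0.3, 0.1]
x, y = chromaticity_xy(wl, [1.0, 1.0, 1.0, 1.0], dummy_cmf, dummy_cmf, dummy_cmf)
x2, y2 = chromaticity_xy(wl, [2.0, 2.0, 2.0, 2.0], dummy_cmf, dummy_cmf, dummy_cmf)
```

Because the chromaticity is a ratio, scaling the whole SPD (as in a uniform lumen loss) leaves (x, y) unchanged; only a change in the SPD shape shifts the color coordinate.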
Owing to the complexity of the LED system, the decay of the lumen output is not linear with time, as shown in the third column of Table 2. Moreover, the decay of the SPDs, shown in Fig. 8, is nonuniform with respect to time. The peak spectral power at 530-680 nm, which is mainly contributed by the phosphors, decays fast in the first 960 h, after which the decay becomes slow and steady.

MACHINE LEARNING OF THE GATED NEURAL NETWORK
Learning of the SPD of multiple LED chip packaging
The 30 averaged SPDs shown in Fig. 5 are the total datasets for machine learning. Only nine of them have been used for the gated neural network training, and the rest are used for validation.
The gate vector ( g) has been defined carefully in order to represent the characteristics of the SPD. Referring to Fig. 9, nine gates are defined as 380, 455, 475, 495, 545, 595, 620 and 640 nm. The gate vector ( g) is fixed through all machine learnings in this section.
The two-step learning procedure is applied. First, the SPD at a current of 140 mA and a case temperature of 60°C is selected as the baseline training set. The spectrum powers between 380 and 430 nm are selected as the testing sets. Figure 10a shows the comparison of the experimental measurement and the trained gated neural network prediction, and Fig. 10b shows the convergence of the learning, where one can observe the stepwise convergence due to the gated network structure. According to Eq. (3), the error norm of the comparison in Fig. 10a is 0.2587, and the three significant peaks that represent the cyan, blue and red LEDs can be clearly identified.
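An error norm of the kind quoted above could be evaluated along the following lines. The relative error of each spectrum power follows the definition given in the theory section; since the exact form of Eq. (3) is not restated here, the root-mean-square form used in `error_norm` is an assumption of this sketch, and the function names and sample values are hypothetical. The average error norm is taken as the plain mean over the SPDs.

```python
import math

def relative_errors(true_spd, pred_spd):
    """e_s = (s_true - s_pred) / s_true for each spectrum power."""
    return [(t - p) / t for t, p in zip(true_spd, pred_spd)]

def error_norm(true_spd, pred_spd):
    """Assumed form of the SPD error norm: root mean square of e_s."""
    e = relative_errors(true_spd, pred_spd)
    return math.sqrt(sum(v * v for v in e) / len(e))

def average_error_norm(pairs):
    """Mean of the error norms over m (true, predicted) SPD pairs."""
    return sum(error_norm(t, p) for t, p in pairs) / len(pairs)

# Illustrative numbers only, not measured data.
true_a, pred_a = [1.0, 2.0, 4.0], [0.9, 2.2, 4.0]
true_b, pred_b = [1.0, 2.0, 4.0], [1.0, 2.0, 4.0]
norm_a = error_norm(true_a, pred_a)
aen = average_error_norm([(true_a, pred_a), (true_b, pred_b)])
```

Note that the relative error is undefined wherever the measured spectrum power is zero, so wavelengths with vanishing power would need to be excluded or regularized before such a norm is computed.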
Second, a new neural network that is based on the previous gated neural network with the physical parameter vector (p) is established, as visualized in Fig. 3b. Among the total 30 SPD datasets, only the 9 at case temperatures of 25, 60 and 80°C and input currents of 50, 140 and 200 mA are selected as the training sets. The spectrum powers between 380 and 430 nm of the SPD at a case temperature of 25°C and an input current of 50 mA are selected as the testing sets. Figure 11 shows the convergence of the second-step gated network structure, where the transition occurs at 100 000 iterations. According to Eq. (3), the error norm of each SPD is listed in Table 3, where the cells highlighted in bold represent the training set and the remaining ones are the datasets used for comparison. The average error norm is 0.7056, according to Eq. (4). Notably, the SPD error norm of the first step improves to 0.1473 because more learning iterations are involved. Via the analysis of variance (ANOVA) method, the main effects, including the input current (mA) and the case temperature (°C), are plotted in Fig. 12, which shows that the gated network structure predicts the SPD well under high current and high case temperature conditions.

Considering the predictability of the gated neural network with the physical parameter vector, Table 4 lists the comparison of the optical characteristics of the whole-dataset SPDs in terms of CCT and CIE 1931 coordinate. The CCT column of Table 4 is defined as $(\mathrm{CCT}_{\mathrm{exp}} - \mathrm{CCT}_{\mathrm{pred}})/\mathrm{CCT}_{\mathrm{exp}}$, and the XY column is the corresponding difference measure for the CIE 1931 color coordinate, where the subscripts "exp" and "pred" represent the experimental and gated network predicted values. The CCT column of Table 4 shows a similar trend to the SPD error norm shown in Table 3. A paired t-test on these two columns gives a P-value of 0.385, i.e. no statistically significant difference between them is detected. The predictability of the gated neural network can be expected to be <2.62% for CCT and <1% for the CIE 1931 coordinate.

Fan et al. [20] have applied the same datasets for the SPD decomposition with the ANN learning method, using the root-mean-square error (RMSE) and the chromaticity difference (Δxy) as error measures. In [20], the averaged RMSE and Δxy are reported as 6.33 × 10^−5 and 0.0021, respectively. The same error estimation equations have been applied to the learning results shown in Table 3, and the averaged RMSE and Δxy are 3.642 × 10^−5 and 0.0020, respectively. This comparison indicates that the same level of prediction capability can be achieved with our gated neural network approach. However, the proposed method does not require the SPD decomposition, and it has the potential to be integrated with time-dependent neural network algorithms.

Figure 11
The convergence of the second step of the learning results.
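For readers who want to reproduce this kind of comparison, the two error measures can be sketched as follows. The definitions used in [20] are not restated in this section, so the forms below are assumptions: the RMSE is taken in its usual sense over the sampled spectrum powers, and the chromaticity difference is taken as the Euclidean distance between two CIE 1931 (x, y) points. The function names and sample values are hypothetical.

```python
import math

def rmse(measured, predicted):
    """Root-mean-square error between two equally sampled SPDs."""
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)

def delta_xy(xy_measured, xy_predicted):
    """Assumed chromaticity difference: Euclidean distance in the xy plane."""
    dx = xy_measured[0] - xy_predicted[0]
    dy = xy_measured[1] - xy_predicted[1]
    return math.hypot(dx, dy)

# Illustrative numbers only, not measured data.
spd_error = rmse([0.0, 0.0], [3.0, 4.0])
color_error = delta_xy((0.300, 0.300), (0.303, 0.304))
```

Unlike the relative error norm, the RMSE keeps the units of the spectrum power, so its magnitude depends on how the SPDs are scaled; values from different datasets are comparable only under the same scaling.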

Learning of the SPD of LED luminaire
Ten averaged SPDs, shown in Fig. 8, have been selected as the total datasets. Only 3 of these 10 datasets are used for the gated network training, and the rest are used for the accuracy validation.
At the first step of the gated neural network learning, the SPD before the aging (the 0-h data) has been selected, and the gate vector ( g) has been selected as 380, 454, 464, 483, 540 and 604 nm, which are the significant peak/valley or the curvature change points, as shown in Fig. 13. The training result of the gated network at the first step is shown in Fig. 14a, and the SPD error norm [following Eq. (3)] is 0.1482 after 560 000 learning iterations, shown in Fig. 14b. Note that the spectrum powers of 380-410 nm wavelength are selected as the testing sets to monitor the learning procedure.
Following Eq. (2), the s(t) curves are derived in Fig. 15a, and four gate parameters are selected as 380, 429, 459 and 500 nm, as shown in Fig. 15b. Three different neural network methods, including the conventional ANN, RNN and gate network LSTM [19], have been chosen to compare the application feasibility. The network structures of these three are shown in the first row of Table 5. Moreover, in order to have a fair comparison, the initial guesses of the weights of these three networks are set to be similar to those of the ANN, including the top structure of the RNN and the a-gate of the LSTM. Furthermore, the complexity of the LSTM has been reduced; as shown in the illustration of Table 5, fixed values are given to the i- and o-gates, and a shallow network remains at the f-gate of the LSTM. The SPDs at the aging hours of 720, 1440 and 2160 have been selected as the training sets, and the spectrum powers between 380 and 410 nm at 720 h are selected as the testing set for the ANN, RNN and LSTM training. The learning rate is fixed to 0.2 for all three methods. The best average error norms of ANN, RNN and LSTM within 75 000 learning iterations are listed in the second row of Table 5. Following Eq. (4), the average error norms listed in Table 5 consider the full datasets instead of the training set only. The SPD change under thermal aging is mainly due to the bandgap of the chip and the accumulated defect growth of the phosphors, which are mostly time-dependent factors. Hence, theoretically, the sequential network architectures (like RNN and LSTM) should perform better than ANN, which coincides with the learning results listed in the second row of Table 5.

Figure 12
The main effects' (case temperature and input current) plot of the SPD error norm ANOVA.
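The reduced LSTM described above, with fixed i- and o-gates and a shallow f-gate, can be sketched as a single scalar recurrence. This is a minimal sketch assuming the common LSTM update, in which the state is the forget-gated previous state plus the input-gated candidate; the single-neuron form of the f-gate and a-gate, the parameter names and the numerical values are hypothetical and are not taken from Table 5.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def simplified_lstm_step(x, h_prev, s_prev, params, i_gate=1.0, o_gate=1.0):
    """One step of a scalar LSTM cell with constant i- and o-gates.

    x      : current input (e.g. a normalized aging time)
    h_prev : previous output
    s_prev : previous state variable
    """
    # f-gate: shallow (single-neuron) network, the only trainable gate here.
    f = sigmoid(params["wf_x"] * x + params["wf_h"] * h_prev + params["bf"])
    # a-gate: candidate contribution to the state.
    a = math.tanh(params["wa_x"] * x + params["wa_h"] * h_prev + params["ba"])
    s = f * s_prev + i_gate * a       # state update with fixed input gate
    h = o_gate * math.tanh(s)         # output with fixed output gate
    return h, s

# Hand-checkable case: all-zero parameters give f = 0.5 and a = 0.
zero_params = {"wf_x": 0.0, "wf_h": 0.0, "bf": 0.0,
               "wa_x": 0.0, "wa_h": 0.0, "ba": 0.0}
h1, s1 = simplified_lstm_step(0.0, 0.0, 1.0, zero_params)

# Roll the cell over ten aging times, 0 to 2160 h in 240 h steps.
demo_params = {"wf_x": -1.0, "wf_h": 0.5, "bf": 2.0,
               "wa_x": 0.3, "wa_h": 0.1, "ba": -0.2}
h, s, outputs = 0.0, 1.0, []
for hours in range(0, 2400, 240):
    h, s = simplified_lstm_step(hours / 2160.0, h, s, demo_params)
    outputs.append(h)
```

Fixing the i- and o-gates removes two of the three gate networks from training, so the state carried from one aging time to the next is shaped by the f-gate alone, which keeps the number of trainable weights close to that of the ANN and RNN variants.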
Moreover, the LSTM performs the best among all three methods, which is attributed to the state variable ([s_t]) in the LSTM, as also reported in [19]. Figure 16a shows the prediction results, and Fig. 16b shows the convergence curve. Table 6 correlates the SPD error norm and the luminous prediction of the gated network.
Based on the best LSTM result shown in Table 5, an additional 300 000 learning iterations have been executed, and Table 6 lists the learning results and the comparison with the luminous data of all 10 datasets. According to Table 6, although the prediction accuracy decreases for the 720-1440 h data, it becomes stable after 1680 h. From Table 2, the experimental results show a significant lumen maintenance change due to the thermal aging after 2160 h (90 days), with a slight color quality change. The machine learning results of the solid-state lighting products show that the maximum differences of lumen depreciation, CCT and color coordinate are 1.58%, 1.91% and 1.07%, respectively, after 2160 h of thermal aging.

CONCLUSIONS
The complexity of the LED system and the interactions among the thermal-mechanical-driven degradation mechanisms at different levels induce the difficulty of identifying every mechanism's impact. Hence, the challenge of building the correlation of the thermal and thermal aging loading to the LED product's luminous output remains. Therefore, machine learning methods have been applied to conquer this challenge.

Figure 16
The learning result and the convergence of the second step gated network learning.

Table 6
The prediction accuracy of the LSTM-based gated network.

In this paper, a gated neural network has been proposed to empirically correlate the thermal and thermal aging loading to the luminous performance for LED packaging and LED luminaire under multiple degradation mechanisms. The gates are derived from the SPD to prescribe its characteristics. The theoretical flexibility of the gated network enables it to be embedded with most neural network architectures, such as the ANN, RNN and LSTM. Moreover, a two-step algorithm is then proposed to accelerate the learning speed of the gated networks.
The gated network concept has been validated in both LED packaging and LED luminaire. First, the SPDs of the multiple LED chip white-light packaging under various thermal and current loadings have been used to verify the gated network's capability with the two-step algorithm. Among all 30 cases, only 9 of them are used for neural network training. The maximum differences of the CCT and color coordinate are 2.6% and 1.0%, respectively. Compared to the SPD decomposition with the ANN method in [20], our method achieved the same level of prediction accuracy. Moreover, the proposed method does not require the SPD decomposition, and it has the flexibility to be integrated with other time-dependent neural network algorithms.
Moreover, a total of 10 SPD datasets are available from the LED luminaire aging tests. Only 3 of those 10 datasets are used for the gated neural network training. The error of the time-dependent training includes the errors from both the gated neural network f(·) of Eq. (1) and s(t) of Eq. (2). With the LSTM architecture embedded in the gated network, the maximum differences of lumen depreciation, CCT and color coordinate are 1.6%, 1.9% and 1.1%, respectively, after 2160 h of thermal aging.