Fault diagnosis of air-conditioning refrigeration system based on sparse autoencoder

Most current fault diagnosis methods for air-conditioning refrigeration systems rely on supervised learning to extract fault features for classification, and the extracted features have low nonlinearity. To overcome these drawbacks, a sparse autoencoder (SAE) is used to extract fault features, which serve as the input to the classifier for fault diagnosis of the air-conditioning refrigeration system. The SAE structure is tuned by adjusting the number of hidden layers and nodes to build the optimal model, which is compared with a fault diagnosis model based on a support vector machine alone. Results indicate that the indexes of the model combined with SAE, such as accuracy, precision and recall, are all improved, especially for faults with high complexity. In addition, SAE shows high generalization ability with small-scale sample data and high efficiency with large-scale data. The use of SAE can therefore effectively improve the diagnosis performance of the classifier.


INTRODUCTION
The application of air-conditioning refrigeration systems has penetrated into various fields of society, and the scale of the systems has increased year by year. During operation, various faults inevitably occur, and the functions specified in the original design are lost. Fault diagnosis of the air-conditioning refrigeration system not only ensures normal operation of the system but also avoids unnecessary losses, which has practical significance [1,2]. At present, most fault diagnosis of air-conditioning refrigeration systems judges the operating status by monitoring and analyzing the state parameters. The specific process includes collecting parameters such as temperature, pressure, flow and superheat, followed by fault feature extraction and classification. Classifiers include back-propagation neural networks [3], probabilistic neural networks [4], support vector machines (SVMs) [5,6] and so on. The fault features are used as the input of the classifier, so feature quality directly affects the final classification result. In the field of air-conditioning refrigeration system fault diagnosis, most feature extraction relies on supervised learning [7][8][9], which requires a large amount of labeled data; building labeled datasets in turn requires a large number of system tests and relies heavily on professional experience.
To address this, principal component analysis (PCA) [10][11][12] has recently been applied in the air-conditioning field to achieve unsupervised feature extraction. PCA replaces the multiple variables of the original data with a smaller number of mutually uncorrelated variables that retain most of the information in the original data. This process extracts features without labeled data and reduces dimensionality effectively. However, PCA also has limitations: simple PCA processing can reduce the fault sensitivity of the new features. A sparse autoencoder (SAE) can learn, without supervision, a set of basis vectors that efficiently represent the sample data [13,14]. Because the extracted features are highly nonlinear and the multilayer SAE structure can further extract fault features from different dimensions, the SAE offers better fault sensitivity. In this paper, the SAE is selected for feature extraction, and the reconstructed feature data are input into the classifier to realize fault diagnosis of the air-conditioning refrigeration system.

Sparse autoencoder
(1) Neural network establishment: the SAE is a symmetric neural network structure consisting of an input layer, a hidden layer and an output layer. The input data are encoded to obtain the hidden layer data, and the hidden layer data are then decoded to reconstruct the input data; training minimizes the error between the input and the output. The resulting hidden layer data can then be regarded as another expression of the original data.
(2) SAE establishment: if the network simply copies the data between layers to reproduce the input, the hidden layer expression extracted from it is not meaningful. However, when the network structure is constrained, for example by limiting the expression of the hidden layer data so that the network is forced to form a low-dimensional expression of the input data, the hidden layer expression often has practical significance. The SAE is proposed for this purpose. Building on the autoencoder, the sparse autoencoder imposes a sparsity constraint on the hidden layer nodes by adding a sparse penalty term to the objective function, so as to obtain a concise expression of the data. When the output of a neuron node is close to 1, the node is active; when the output is close to 0, the node is inactive. The SAE requires most of the hidden layer nodes to be inactive, with only a few nodes producing outputs, which also allows the number of SAE hidden layer nodes to be greater than the number of input layer nodes. For a set of m input samples $x^{(i)}$, the average output value of hidden layer node j is

$$\hat{\rho}_j = \frac{1}{m} \sum_{i=1}^{m} a_j^{(2)}\left(x^{(i)}\right),$$

where $a_j^{(2)}(x^{(i)})$ represents the activation of hidden layer node j for the input $x^{(i)}$. The activation is a = f(WX + b), where W is the weight matrix, b is the offset vector and f is the activation function. The KL distance between two Bernoulli distributions with means $\rho$ and $\hat{\rho}_j$ is introduced as the sparse penalty term:

$$\mathrm{KL}\left(\rho \,\|\, \hat{\rho}_j\right) = \rho \log\frac{\rho}{\hat{\rho}_j} + (1-\rho)\log\frac{1-\rho}{1-\hat{\rho}_j}.$$

The KL distance is 0 when $\hat{\rho}_j$ equals $\rho$, and it increases as $\hat{\rho}_j$ deviates from $\rho$, growing without bound as $\hat{\rho}_j$ approaches 0 or 1. The smaller the KL distance, the closer $\hat{\rho}_j$ is to $\rho$. Therefore, when a value of $\rho$ close to 0 is selected, a smaller KL distance means that the average output value of hidden layer node j is closer to zero.
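The behaviour of the KL sparse penalty can be checked numerically. The following is a minimal sketch, not the authors' code; the function name `kl_bernoulli` and the sparsity target `rho = 0.05` are assumptions chosen for illustration, since the paper does not state the value it used.

```python
import math

def kl_bernoulli(rho, rho_hat):
    """KL distance between Bernoulli(rho) and Bernoulli(rho_hat); both must lie in (0, 1)."""
    return (rho * math.log(rho / rho_hat)
            + (1.0 - rho) * math.log((1.0 - rho) / (1.0 - rho_hat)))

rho = 0.05  # hypothetical sparsity target, not taken from the paper
for rho_hat in (0.05, 0.10, 0.30, 0.60, 0.90):
    # The penalty is zero at rho_hat == rho and grows as rho_hat moves away from rho.
    print(f"rho_hat={rho_hat:.2f}  KL={kl_bernoulli(rho, rho_hat):.4f}")
```

The penalty vanishes only when the average activation matches the target, which is what pushes most hidden nodes toward the inactive state.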
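The objective function of a traditional neural network is generally expressed as follows:

$$J(W,b) = \frac{1}{m}\sum_{i=1}^{m} \frac{1}{2}\left\| h_{W,b}\left(x^{(i)}\right) - y^{(i)} \right\|^2 + \frac{\lambda}{2}\sum_{l}\sum_{i}\sum_{j}\left(W_{ji}^{(l)}\right)^2,$$

where $h_{W,b}(x^{(i)})$ is the network output for sample $x^{(i)}$, $y^{(i)}$ is the target output (for an autoencoder, $y^{(i)} = x^{(i)}$) and $\lambda$ is the weight decay coefficient. The first term is the mean of the squared errors over all samples, and the second term limits the weights to a small level. The SAE objective function adds the sparse penalty term:

$$J_{\mathrm{sparse}}(W,b) = J(W,b) + \beta \sum_{j} \mathrm{KL}\left(\rho \,\|\, \hat{\rho}_j\right),$$

where the sum runs over the hidden layer nodes, $\mathrm{KL}(\rho \,\|\, \hat{\rho}_j)$ is the KL distance of hidden layer node j and β is the weight of the sparse penalty term.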
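(3) Neural network training: W and b are optimized by minimizing the objective function. The solution combines the back-propagation algorithm with the gradient descent method, training the neural network iteratively to update the weights:

$$W_{ji}^{(l)} := W_{ji}^{(l)} - \varepsilon \frac{\partial}{\partial W_{ji}^{(l)}} J_{\mathrm{sparse}}(W,b), \qquad b_{j}^{(l)} := b_{j}^{(l)} - \varepsilon \frac{\partial}{\partial b_{j}^{(l)}} J_{\mathrm{sparse}}(W,b),$$

where $J_{\mathrm{sparse}}(W,b)$ is the SAE objective function and ε is the learning rate. Through optimization, the SAE can realize unsupervised training, obtain a better hidden layer feature expression and realize the reconstruction and the feature extraction of input data [15].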
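The training procedure described above (a reconstruction error, a weight penalty, a KL sparse penalty weighted by β and gradient descent with learning rate ε) can be sketched for a single hidden layer as follows. This is an illustrative sketch under stated assumptions rather than the authors' implementation: the sigmoid activation, full-batch updates, the hyperparameter values (`rho`, `lam`, `beta`, `eps`, `n_iter`), the function names and the synthetic data are all hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sae_cost_grad(params, X, rho, lam, beta):
    """Objective and gradients of a one-hidden-layer sparse autoencoder.
    X has shape (m, n_in); the reconstruction target is X itself."""
    W1, b1, W2, b2 = params
    m = X.shape[0]
    A2 = sigmoid(X @ W1 + b1)      # hidden layer activations, shape (m, n_hid)
    A3 = sigmoid(A2 @ W2 + b2)     # reconstruction of the input, shape (m, n_in)
    rho_hat = A2.mean(axis=0)      # average output of each hidden node
    kl = np.sum(rho * np.log(rho / rho_hat)
                + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))
    cost = (0.5 * np.sum((A3 - X) ** 2) / m                       # mean squared error term
            + 0.5 * lam * (np.sum(W1 ** 2) + np.sum(W2 ** 2))     # weight penalty
            + beta * kl)                                          # sparse penalty
    # Back propagation of the error, including the sparsity contribution.
    D3 = (A3 - X) * A3 * (1 - A3)
    sparse_term = beta * (-rho / rho_hat + (1 - rho) / (1 - rho_hat))
    D2 = (D3 @ W2.T + sparse_term) * A2 * (1 - A2)
    grads = (X.T @ D2 / m + lam * W1, D2.mean(axis=0),
             A2.T @ D3 / m + lam * W2, D3.mean(axis=0))
    return cost, grads

def train_sae(X, n_hidden, rho=0.05, lam=1e-4, beta=3.0, eps=0.2, n_iter=300, seed=0):
    """Full-batch gradient descent; all default values are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    n_in = X.shape[1]
    params = [rng.normal(0, 0.1, (n_in, n_hidden)), np.zeros(n_hidden),
              rng.normal(0, 0.1, (n_hidden, n_in)), np.zeros(n_in)]
    history = []
    for _ in range(n_iter):
        cost, grads = sae_cost_grad(params, X, rho, lam, beta)
        history.append(cost)
        # Gradient descent update: parameter := parameter - eps * gradient
        params = [p - eps * g for p, g in zip(params, grads)]
    return params, history

rng = np.random.default_rng(1)
X = rng.random((200, 8))                      # synthetic stand-in data, not chiller data
params, history = train_sae(X, n_hidden=16)   # more hidden nodes than inputs
features = sigmoid(X @ params[0] + params[1]) # hidden layer expression used as features
print(history[0], history[-1], features.shape)
```

In this sketch the hidden layer is wider than the input, mirroring the setting in which the sparsity constraint, rather than a bottleneck, keeps the learned expression concise.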
Multilayer SAE [16] adopts multiple hidden layers to extract features layer by layer, which reduces the error dispersion effect during error back propagation [17], and each hidden layer can extract features from a different dimension. In this paper, fault diagnosis models of the air-conditioning refrigeration system with different network structures are established. Layer-by-layer training is carried out to extract fault features of different dimensions, and the features of each group are then integrated to form the final feature vector that is input to the classifier.
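How the per-layer features might be integrated can be sketched as follows. The sketch assumes that "integration" means concatenating the output of every hidden layer, which the paper does not state explicitly. The encoder weights here are untrained random placeholders, and the layer sizes are borrowed from the in-600-500-400-300-out structure reported later in the paper with 61 inputs; the function name `stacked_features` is hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def stacked_features(X, layers):
    """Pass X through each encoder in turn and concatenate every hidden layer output."""
    outputs, A = [], X
    for W, b in layers:
        A = sigmoid(A @ W + b)   # the output of one layer is the input to the next
        outputs.append(A)
    return np.concatenate(outputs, axis=1)

rng = np.random.default_rng(0)
sizes = [61, 600, 500, 400, 300]
# Placeholder encoders: in practice each (W, b) would come from layer-by-layer SAE training.
layers = [(rng.normal(0, 0.05, (n_in, n_out)), np.zeros(n_out))
          for n_in, n_out in zip(sizes[:-1], sizes[1:])]

X = rng.normal(size=(10, 61))   # ten synthetic standardized samples
F = stacked_features(X, layers)
print(F.shape)                  # 600 + 500 + 400 + 300 feature columns per sample
```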

Support vector machine
The SVM projects the samples into a high-dimensional space through a mapping function and searches for the optimal margin hyperplane in the new space so that the samples become linearly separable. The maximum-margin formulation is constructed and converted into a constrained convex optimization problem, and a kernel function is used to reduce the cost of the high-dimensional calculation. The SVM classifier can model complex nonlinear boundaries with high accuracy and is widely used in nonlinear classification [18]. The ASHRAE 1043-RP dataset has many monitoring parameters and target classes and therefore a certain nonlinear complexity. For this reason, this paper selects the SVM as the classifier of the diagnostic model.
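For reference, the maximum-margin problem described above is commonly written in the soft-margin form below. This is the standard textbook formulation, given here as an assumption; the paper does not state which SVM variant, kernel or multi-class scheme it uses. The symbols $\phi$ (mapping function), $C$ (penalty parameter), $\xi_i$ (slack variables) and $\alpha_i$ (dual coefficients) are introduced only for this illustration.

```latex
\[
\min_{w,\,b,\,\xi}\ \frac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{n}\xi_i
\quad \text{s.t.}\quad
y_i\left(w^{\top}\phi(x_i) + b\right) \ge 1 - \xi_i,\qquad \xi_i \ge 0,
\]
\[
K(x_i, x_j) = \phi(x_i)^{\top}\phi(x_j), \qquad
f(x) = \operatorname{sign}\left(\sum_{i=1}^{n}\alpha_i\, y_i\, K(x_i, x) + b\right).
\]
```

The kernel $K$ evaluates the inner product in the high-dimensional space without computing $\phi$ explicitly, which is the reduction in high-dimensional calculation mentioned above.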

Diagnostic model evaluation indexes
The accuracy can intuitively show the proportion of samples that are assigned to the correct fault category, but accuracy alone cannot fully compare the strengths and weaknesses of the models. Combining multiple evaluation indexes reflects the performance of fault detection and diagnosis models in different aspects [19]. This paper uses multiple evaluation indexes to quantify model performance, including accuracy, precision, recall and F-measure. The calculation formulas are as follows:

$$\mathrm{Accuracy} = \frac{|TP| + |TN|}{|TP| + |TN| + |FP| + |FN|}, \qquad \mathrm{Precision} = \frac{|TP|}{|TP| + |FP|},$$

$$\mathrm{Recall} = \frac{|TP|}{|TP| + |FN|}, \qquad F\text{-measure} = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}},$$

where |TP| is the number of true positive predictions, |TN| is the number of true negative predictions, |FP| is the number of false positive predictions and |FN| is the number of false negative predictions.
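The four indexes can be computed per fault category by treating that category as the positive class and all others as negative. The following is a minimal sketch under that one-vs-rest assumption, using the balanced (F1) form of the F-measure; the function name `binary_metrics` and the small label lists are made up for illustration and are not results from the paper.

```python
def binary_metrics(y_true, y_pred, positive):
    """Accuracy, precision, recall and F-measure with `positive` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0   # guard against empty denominators
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_measure": f_measure}

# Made-up labels for illustration only.
y_true = ["Normal", "RefLeak", "RefLeak", "ConFoul", "RefLeak", "Normal"]
y_pred = ["Normal", "RefLeak", "Normal", "ConFoul", "RefLeak", "RefLeak"]
print(binary_metrics(y_true, y_pred, positive="RefLeak"))
```

Here one RefLeak sample is missed (a false negative) and one Normal sample is flagged as RefLeak (a false positive), so precision and recall both fall below 1 even though most samples are classified correctly.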

FAULT DATA SOURCE AND PREPROCESSING
The ASHRAE 1043-RP refrigeration unit failure simulation experiment [20] provides monitoring data for the unit under various operating and fault conditions. This paper uses the project data to train and test the fault diagnosis model. The experimental device is a 316 kW centrifugal chiller, tested under 27 working conditions. The faults studied in this paper are seven single faults. As shown in Table 1, 64 characteristic parameters are monitored, including directly measured data such as evaporator inlet and outlet water temperature, condenser pressure and compressor discharge temperature, as well as data calculated from formulas, such as the coefficient of performance, cooling capacity and hot water flow. Each of these faults contains four fault levels, typically 10%, 20%, 30% and 40%. The refrigerant overcharge fault is shown in Table 2. 40 000 sets of data are randomly extracted from the experimental database and preprocessed, and the unsteady data samples are deleted. Three parameters, namely measurement time, unit status and hot water valve status, are also deleted, because the measurement time is independent of the unit fault classification and the unit status and hot water valve status are the same for all steady-state samples. This leaves 61 parameters as model inputs.

FAULT DIAGNOSIS RESULTS AND DATA ANALYSIS
A total of 160 000 sets of experimental data are extracted from the experimental database, 40 000 sets for each fault degree. All data are standardized. While keeping the samples balanced, 1700 sets of the Level 1 fault degree data are randomly selected as SAE training samples, 1700 sets as SVM classifier training samples and 1700 sets as SVM test samples. The number of hidden layer nodes is adjusted using the unlabeled training samples as input data. A three-layer neural network is trained for the SAE, and the original sample data are reconstructed according to the trained SAE network weights. As shown in Figure 1, the number of input layer nodes of the three-layer SAE network is 61. When the number of hidden layer nodes is close to the number of input layer nodes, the accuracy is low: because of the information compression and sparse constraints of the SAE hidden layer, the feature information extracted in this case is insufficient. As the number of nodes increases, the accuracy also increases, and once the number of nodes reaches a certain level, the accuracy stabilizes. For the experimental project data, the diagnostic model has the highest accuracy when the number of hidden layer nodes is 400. Formally, SAE feature extraction here increases the data dimension, because the number of hidden layer neuron nodes is larger than the number of inputs. Through the sparsity constraint, a complete set of basis vectors is obtained, which retains a certain data compression capability. At the same time, the extracted features are nonlinear and have high fault sensitivity. Further, the size of the SAE training sample is changed, and its influence on the accuracy of the SAE + SVM fault diagnosis model is analyzed, as shown in Figure 2.
As can be seen from Figure 2, as the number of samples for SAE training increases from 1700 to 20 000, the overall accuracy of the model shows an upward trend. However, the increase is small, and a high accuracy can already be achieved with 1700 sets of data. SAE automatic feature learning can extract good features from little data: it has lower requirements on training sample size and stronger generalization ability. It is therefore more applicable to diagnostic objects with limited historical data.
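The standardization and the three disjoint sample sets described above can be sketched as follows. This is a simplified illustration under assumptions: "standardized" is taken to mean a per-parameter z-score, the split is a plain random draw without the class balancing the paper mentions, and the function names and the tiny example rows are hypothetical.

```python
import random
import statistics

def standardize(rows):
    """Z-score each column: subtract the column mean, divide by the column standard deviation."""
    cols = list(zip(*rows))
    means = [statistics.mean(c) for c in cols]
    stds = [statistics.pstdev(c) or 1.0 for c in cols]   # constant columns are left at zero
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def three_way_split(n_total, n_each, seed=0):
    """Draw three disjoint index sets: SAE training, SVM training and SVM test."""
    rng = random.Random(seed)
    idx = rng.sample(range(n_total), 3 * n_each)          # sampling without replacement
    return idx[:n_each], idx[n_each:2 * n_each], idx[2 * n_each:]

# Tiny made-up example: three samples, two parameters.
z = standardize([[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]])
print(z)

# Sizes from the text: 1700 samples per set out of 40 000 Level 1 samples.
sae_idx, svm_train_idx, svm_test_idx = three_way_split(40_000, 1_700)
print(len(sae_idx), len(svm_train_idx), len(svm_test_idx))
```

Keeping the SAE training set separate from both SVM sets means the feature extractor never sees the samples later used to evaluate the classifier.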
Further, using data of different fault degrees, 10 000 sets of data are randomly selected for each degree as SAE training samples, 20 000 sets as SVM training samples and 4000 sets as SVM test samples. The SAE + SVM model and the SVM model without SAE are tested and compared. The accuracy of the two models is shown in Figure 3. As the fault degree increases, the accuracy of both models generally rises: the more severe the fault, the greater the deviation of the unit parameters from the normal parameters and the more accurate the model. At the same fault degree, the accuracy of the SAE + SVM diagnosis model is higher than that of the SVM diagnosis model, especially when the fault degree is low. For example, for the relatively mild Level 1 fault, the accuracy of the former is ∼3% higher than that of the latter. This shows that the features extracted by SAE have higher fault sensitivity, especially for low-degree faults. Further, the number of hidden layers is increased on the basis of the single-hidden-layer SAE, and the model accuracy is compared. For example, six layers (in-600-500-400-300-out) means that there are four hidden layers with 600, 500, 400 and 300 nodes, respectively. For SAEs with the same number of layers, the number of hidden layer nodes is adjusted and the model with the best accuracy is selected; the accuracy for different numbers of SAE layers is then compared, as shown in Table 3.
From Table 3, when the number of SAE layers is increased from three to six, the diagnostic performance of the model gradually improves. This shows that adding feature information of different dimensions is conducive to improving the fault sensitivity of the diagnosis model. When the number of SAE layers is further increased to seven, the diagnostic accuracy of the model declines and is even lower than that of the single-hidden-layer SAE structure. The reason is that too much feature information forms new correlations, resulting in feature redundancy. For the research object of this paper, the optimal SAE structure has six layers. For cases with more monitored and more complex parameters, the model can further expand the number of SAE hidden layers and even become a deep SAE.
The SVM, single-layer SAE (in-400-out structure) + SVM and multilayer SAE (in-600-500-400-300-out structure) + SVM models are selected and marked as Model 1, Model 2 and Model 3, respectively. The fault accuracy of each model is shown in Figure 4. For each fault category (including normal conditions), the diagnostic accuracy of Model 1 is lower than that of Model 2 and Model 3, especially for the refrigerant leak fault (RefLeak), where Model 2 is >1% higher than Model 1. This indicates that the features extracted by SAE have better fault sensitivity. Compared with Model 2, Model 3 has slightly lower accuracy for the condenser fouling (ConFoul) fault, where the accuracy of both models is very close to 1, and higher accuracy in all other conditions. Further, the multi-index evaluation parameters of the diagnosis results of each fault category are listed in Table 4.
According to Table 4 and Figure 4, the accuracy of each model is close to 1 for faults such as ConFoul, ReduCF, NonCon and ReduEF. For the other faults, the accuracy, precision, recall and F-measure indexes are all lower. There are many cases of misdiagnosis and false alarms, and model performance is at its lowest for the RefLeak fault. For the water flow change faults, the monitoring parameters include water flow and water valve status, which are highly correlated with water flow faults, so the diagnostic accuracy is higher. The global faults caused by refrigerant and lubricating oil involve the refrigeration cycle, so all monitoring parameters of the system change considerably, which strongly interferes with the model diagnosis. The partial faults ConFoul and NonCon do not involve this interference, and the diagnostic accuracy is high. Despite the low accuracy for the normal condition and the global faults, both Model 2 and Model 3 give better diagnostic results than Model 1, especially for the RefLeak fault, where the improvement is ∼10%. Model 3 further improves on the performance of Model 2, which indicates that the fault features extracted by SAE (especially multilayer SAE) are more sensitive to faults and improve the ability to recognize and diagnose them.

CONCLUSIONS
Based on SAE, a fault diagnosis model of the air-conditioning refrigeration system is proposed. Conclusions are summarized as follows: (1) When the number of nodes in the SAE hidden layer is greater than that in the input layer, for example 400 versus 61, the features extracted by the SAE neural network are highly nonlinear and retain a certain data compression capability. The SAE has strong generalization ability for small-scale data and also has the advantage of high training efficiency for large-scale data.