Application of seismic multiattribute machine learning to determine coal strata thickness

The coal mining industry is developing automated and intelligent coal mining processes. Accurate determination of the geological conditions of working faces is an important prerequisite for automated mining. The use of machine learning to extract comprehensive attributes from seismic data and the application of those attributes to determine the coal strata thickness has become an important area of research in recent years. Conventional coal strata thickness interpretation methods do not meet the application requirements of mines. Determining the coal strata thickness with machine learning solves this problem to a large extent, especially for issues of exploration accuracy. In this study, we use seismic exploration data from the Xingdong coal mine, with the 1225 working face as the research object, and we apply seismic multiattribute machine learning to determine the coal strata thickness. First, we perform seismic multiattribute extraction and optimal multiparameter selection by selecting the seismic attributes with good responses to the coal strata thickness and extracting training samples. Second, we optimise the model through a trial-and-error method and use machine learning for training. Finally, we illustrate the advantages of this method using actual data. We compare the results of the proposed model with results based on a single attribute. The results show that the application of seismic multiattribute machine learning to determine coal strata thickness meets the requirements of geological inspection and has good application performance and practical significance in complex areas.


Introduction
The coal mining industry is developing automated and intelligent coal mining processes. Accurate determination of the geological conditions of working faces is an important prerequisite for automated mining. In particular, the variation in the coal seam thickness is an important basis for automated control (Hao et al. 2020). The development of intelligent coal mines is a continuous process of improvement and of the integration, extraction and transformation of geological data (Si et al. 2020). Among all geological data, three-dimensional seismic data are of great importance, especially in coalfield exploration, because these data can effectively reflect the characteristics of target objects in coalfields. Intelligent mining in coal production has been facilitated by big data machine learning. In coalfield seismic exploration, the coal seam is the target of seismic exploration and is usually defined as a thin layer. The waves reflected from coal seams are superimposed composite waves under the combined effects of multiple waves and converted waves between the top and bottom interfaces of coal seams. The vertical resolution is therefore insufficient to determine the coal strata thickness. In general, the borehole interpolation method and the seismic inversion method are used to determine the coal strata thickness (Tian and Goulty 1997). Due to the limited density and number of boreholes in production, borehole interpolation results often have large errors and are not effective in guiding production. Seismic inversion is usually conducted in combination with logging curves. Due to the lack of logging curves in coalfields, the inversion results are not as practical as expected (Cooke and Schneider 1983; Hu et al. 2019). Although underground exploration is more accurate than ground exploration, the detection distance is short and the target range is small.
In terms of the time dimension and data volume, the influence of underground exploration is comparatively much weaker in the self-perception and self-learning of intelligent mining.
In the construction of the information entities of intelligent mining, the information of the coal seam is categorised as entity attribute information. Seismic data contain rich geological information. The seismic attributes reflect the multiple characteristics of the waveforms in seismic data, such as the geometry, kinematics, dynamics and statistics (Brown 1996; Russell et al. 1997). Therefore, the extraction, storage, visualisation, analysis, verification and evaluation of the seismic attributes include constructing the entity attributes and accurately describing the target geological object information to meet the diverse requirements of intelligent mining (Wang 2012; Peng 2020). By using the seismic attribute method, the characteristics of the thin layers in seismic information can be effectively reflected, and the best combination of the physical relationship between the thin layers and the seismic attributes can be obtained (Zhou et al. 2019). Based on this relationship, linear and nonlinear methods are used to determine the coal strata thickness. Many scholars have found that the coal strata thickness is closely related to seismic waves. By analysing the relationship between the coal seam thickness and seismic-related attributes (Suo et al. 2011), researchers have concluded that there is a monotonic nonlinear relationship between the seismic attributes and the coal strata thickness for thin seams (Widess 1973; Shan 2020). Some scholars have proposed that the ratio of the integral of the reflected wave amplitude spectrum and the first moment of the seismic wavelet amplitude spectrum is related to the coal strata thickness (Dong and Xu 2005; Zeng and Marfurt 2015). To illustrate the relationship between the amplitude attributes and the coal strata thickness, a wedge-shaped coal seam model was established (Zou et al. 2017).
However, the single seismic attribute method uses only one type of seismic attribute parameter and has large random errors. The accuracy of the calculation results is low, so this method is very limited (Mirkamali et al. 2013). The neural network algorithm performs the functions of learning, association, self-organisation, memorisation and fault tolerance determination (Brown et al. 2000). The intricate nonlinear relationship between a set of input features and the predicted output value can be simulated (Liu et al. 2021). The algorithm can further provide effective predictions based on new input features. To determine the coal strata thickness, the multiattribute characteristics of the coal strata thickness in incomplete seismic data have been used for sample training and processing (Wang et al. 2013). Through a unique generalisation ability, the interpretation ambiguity is reduced, and the efficiency and accuracy of interpretation are improved. In this way, desirable results are obtained for nonlinear problems related to reflected wave attribute parameters and coal strata thickness (Sun et al. 2013).
Combinations of various types of seismic attribute information in different regions and different layers are very different. Even in the same region, the reliability of predictions is often different because of differences in geological structures. Establishing a method based on nonlinear seismic attribute combinations to accurately predict the thickness of coal seams for intelligent mining has become a key focus of researchers. In this study, based on the seismic data from mining area 1200 of the Xingdong coal mine, we conduct research on two aspects after area selection: the extraction of the seismic attributes of the coal seam and the selection of the optimal parameters, and the construction of optimal models to determine the thickness of the coal seam by machine learning. Through correlation coefficient and cross-correlation calculations, the correctness of the extracted coal strata thickness seismic attributes of the target seams is ensured. The optimal combination of attributes and the number of attributes are optimised to minimise the prediction error of the effective attribute combination of the optimal multiparameter selection (Wu et al. 2009; Li et al. 2019). The extracted and optimally selected multiparameter coal seam seismic attributes are used as the input layer. Through the trial-and-error method, we construct the neural network structure and optimise the number of nodes in the middle layer. Each layer is optimised through feedback. The complex projection relationship between various seismic multiattribute factors and the coal strata thickness is characterised (Li et al. 2007; Sachindra et al. 2018). A machine learning neural network is used to address nonlinear problems. The results are verified by using the coal strata thickness data in the exposure location as the training samples and the boreholes in the adjacent working face as the prediction application.
Network function approximation and multiattribute combination from input to output are achieved. The complex relationship between the seismic multiattribute combination response and the coal strata thickness is obtained (Badel et al. 2011). The results show that the seismic multiattribute coal strata thickness determination method for working faces based on machine learning has desirable application effects in complex areas.

Engineering geological setting
The structure of mining area 1200, which is located in the Xingdong coal mine, is extremely complex, with well-developed faults and a large coal seam inclination. The reflection events of the coal seam in the west are relatively continuous, while in the east, the seismic wave field is affected by the large boundary fault F2 and exhibits deformation; thus, the uncertainty of the coal strata thickness increases. The reflection events of the #2 coal seam in the mining area 1200 are shown in figure 1.
With λ/4 as the widely accepted and applied resolution limit (λ is the seismic wavelet length), the thickness of the coal seam in the mining area is less than the maximum tuning thickness.
To better use the seismic attributes to determine the coal strata thickness, optimal area selection is performed. We selected the 1225 working face (figure 1) because it lies in the western part of mining area 1200, where the data are reliable. Few structures, a high signal-to-noise ratio of the seismic data, little variation in the coal seam burial depth and little topographic relief all make it easier to exclude interfering factors and to study the relationship between coal strata thickness and seismic attributes.
The effectiveness of the back propagation (BP) neural network method to determine the seismic multiattribute coal strata thickness in working faces is validated using examples. The specific research framework is shown in figure 2.

Seismic multiattribute extraction and optimal multiparameter selection
Seismic attributes are derived from amplitude-type attributes, frequency-type attributes, phase-type attributes and their three derived attributes (Pigott et al. 2013). The seismic attribute parameters that are sensitive to the coal strata thickness are extracted through optimal selection.
3.1.1. The seismic attributes are normalised, the correlation coefficients are calculated and the effectiveness of attribute analysis is ensured. To ensure the effectiveness of the attribute analysis, the seismic attributes of the 1225 working face and the mining area 1200 are both extracted. The correlation coefficients in the mining area 1200 are shown in Table 2. The maximum correlation coefficient R is less than 0.3. No valid relationship can be built at the scale of the whole mining area, so appropriate working faces must be selected for optimal seismic attribute selection.
In this study, a total of 34 types of seismic attribute are extracted along the #2 coal seam of the 1225 working face, including 15 amplitude attributes, 14 frequency attributes and five phase attributes. The extraction time window is set to be longer than half of the period of the coal seam reflection wave. The time window length is 20 ms. The different attributes are normalised, and the attribute data and the known borehole coal strata thickness are combined into learning samples for attribute correlation analysis (Zhang et al. 2017). The normalisation and the correlation coefficient are calculated with equations (1) and (2), respectively (Guo et al. 2004). In equation (1), x(i) is the parameter value of the ith point before processing, y(i) is the parameter value of the ith point after processing (Dianat and Kasaei 2010), and x_min and x_max represent the minimum and maximum values before processing, respectively.
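The displayed forms of equations (1) and (2) are not reproduced in this copy. Assuming they are the standard min-max normalisation and the Pearson correlation coefficient, which is what the symbol definitions in the text suggest, they would read as follows (the summation over exposure points is part of that assumption):

```latex
y(i) = \frac{x(i) - x_{\min}}{x_{\max} - x_{\min}} \tag{1}

R = \frac{\sum_{i} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i} (x_i - \bar{x})^{2} \, \sum_{i} (y_i - \bar{y})^{2}}} \tag{2}
```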
In equation (2) (Meng et al. 2006), R(i) represents the correlation coefficient between the ith attribute and the coal strata thickness, x_i represents the ith seismic attribute, x̄ represents the mean of the seismic attribute, y_i represents the thickness at the coal seam exposure point and ȳ is the mean coal strata thickness. A total of 14 seismic attributes with absolute correlation coefficients |R| greater than 0.4 are selected from Table 1. These seismic attributes include the energy half-time, instantaneous frequency, time dip, dip azimuth, dominant frequency, positive to negative phase ratio, upper loop area, decile frequency 2, decile frequency 4, decile frequency 6, decile frequency 8, frequency band ratio 6, spectral attribute central frequency and spectral attribute dominant frequency.
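The screening step can be illustrated with a minimal sketch. It assumes that the normalisation is min-max scaling and that R is the Pearson correlation coefficient; the function names, attribute names and sample values are hypothetical and are not taken from the study.

```python
import math

def min_max_normalise(values):
    """Scale values to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def pearson_r(x, y):
    """Pearson correlation coefficient; 0.0 if either series is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def select_sensitive(attributes, thickness, threshold=0.4):
    """Keep the attributes whose |R| against thickness exceeds the threshold."""
    selected = {}
    for name, values in attributes.items():
        r = pearson_r(min_max_normalise(values), thickness)
        if abs(r) > threshold:
            selected[name] = r
    return selected

# Hypothetical values at four exposure points (not data from the study).
attributes = {
    "dominant_frequency": [30.0, 34.0, 38.0, 42.0],
    "unrelated": [1.0, -1.0, -1.0, 1.0],
}
thickness = [2.0, 4.0, 6.0, 8.0]
selected = select_sensitive(attributes, thickness)
```

The Pearson coefficient is unchanged by min-max scaling, so in this sketch the normalisation matters mainly for the later use of the attributes as network inputs.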
3.1.2. Through the cross-correlation analysis, the relative independence of each attribute is ensured, thus maintaining the stability of the algorithm. Based on the correlation coefficients of the seismic attributes and with regard to their magnitude, the seismic attributes with higher correlation coefficients are optimally selected or combined. Extracting relatively independent attributes with large correlation coefficients is the basis for establishing a reasonable determination model. The attributes with cross-correlation coefficients greater than 0.95 are selected for attribute substitution and are shown in Table 3. The calculation formula is the same as equation (2). The energy half-time, instantaneous frequency, time dip, dip azimuth, dominant frequency, positive to negative phase ratio, upper loop area and frequency band ratio 6 are calculated.
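One way to picture the substitution step is the sketch below. It assumes that, of any two attributes whose mutual correlation exceeds 0.95 in absolute value, only the higher-ranked one is kept. This rule, the function names and the sample values are illustrative assumptions rather than the procedure documented in the study.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient; 0.0 if either series is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def prune_redundant(attributes, ranked_names, limit=0.95):
    """Walk the attributes from best to worst rank and keep one only if it is
    not nearly collinear (|r| > limit) with an attribute already kept."""
    kept = []
    for name in ranked_names:
        if all(abs(pearson_r(attributes[name], attributes[k])) <= limit
               for k in kept):
            kept.append(name)
    return kept

# Hypothetical attribute values; "b" almost duplicates "a".
attributes = {
    "a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "b": [2.0, 4.0, 6.0, 8.0, 10.5],
    "c": [5.0, 1.0, 4.0, 2.0, 3.0],
}
kept = prune_redundant(attributes, ["a", "b", "c"])
```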
3.1.3. The purpose of the optimal attribute combination is to find the optimal combination of M attributes from N attributes and to achieve the minimum prediction error. According to the cross-correlation calculation results, the optimal attribute combination is obtained by using an exhaustive search. The specific steps are as follows: (i) the best attribute determined from the cross-correlation analysis is selected, namely, attribute 1; (ii) all attributes are paired with attribute 1 to form attribute pairs, and the best attribute pair is selected by finding the minimum prediction error, which determines attribute 2; and (iii) all the attributes are combined with attributes 1 and 2, and the best three-attribute combination is selected by finding the minimum prediction error (Luo and Qin 2019), which determines attribute 3. More attributes are combined and selected following these steps. The optimal attribute combination includes the positive to negative phase ratio, time dip, frequency band ratio 6, dominant frequency, dip azimuth, instantaneous frequency and energy half-time.
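Steps (i) to (iii) amount to a greedy forward selection: attributes are added one at a time, and each pass tests only the combinations that extend the current set. The sketch below assumes that the prediction error is the root-mean-square residual of a multilinear least-squares fit with an intercept. That error measure, the helper names and the sample values are hypothetical.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_rms_error(columns, y):
    """RMS residual of a least-squares fit of y on the columns plus intercept."""
    rows = [[1.0] + [col[j] for col in columns] for j in range(len(y))]
    k = len(rows[0])
    ata = [[sum(r[p] * r[q] for r in rows) for q in range(k)] for p in range(k)]
    aty = [sum(r[p] * yj for r, yj in zip(rows, y)) for p in range(k)]
    coef = solve(ata, aty)
    resid = [yj - sum(c * v for c, v in zip(coef, r)) for r, yj in zip(rows, y)]
    return math.sqrt(sum(e * e for e in resid) / len(y))

def forward_select(attributes, y, max_count):
    """Add, one at a time, the attribute that minimises the fit error."""
    chosen, remaining = [], list(attributes)
    while remaining and len(chosen) < max_count:
        best = min(remaining, key=lambda name: fit_rms_error(
            [attributes[n] for n in chosen + [name]], y))
        chosen.append(best)
        remaining.remove(best)
    return chosen

# Hypothetical data: thickness depends on "a" and "b" but not on "c".
attributes = {
    "a": [0.0, 1.0, 2.0, 3.0, 4.0, 5.0],
    "b": [1.0, 0.0, 1.0, 0.0, 1.0, 0.0],
    "c": [2.0, 1.0, 3.0, 0.0, 1.0, 2.0],
}
y = [3 * a + b for a, b in zip(attributes["a"], attributes["b"])]
chosen = forward_select(attributes, y, max_count=2)
```

Because each pass only extends the current set, far fewer fits are needed than when every M-attribute subset is tested, at the cost of possibly missing the globally best subset.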

Effectiveness analysis-selecting the number of attributes.
The effectiveness analysis method is used to select the number of attributes by calculating the actual error and the theoretical prediction error for different numbers of attributes, to avoid 'overtraining'. The first four attributes of the optimal combination are (i) the positive to negative phase ratio, (ii) the time dip, (iii) the frequency band ratio 6 and (iv) the dominant frequency. Figure 3 is a cross-plot of the effectiveness analysis of the coal seam attributes in the 1225 working face. The horizontal coordinate represents the number of attributes and the vertical coordinate represents the percentage error. The red line represents the theoretical error and the black line represents the actual error. When the number of attributes is increased to five, the actual error increases, so the addition of any attribute beyond the fourth leads to overtraining. Therefore, the number of attributes is determined to be four.
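The stopping rule can be sketched as follows, assuming that the 'actual error' is an error measured on points not used for fitting and that the attribute count is fixed just before this error first rises. The function name and the error values are hypothetical and are not read from figure 3.

```python
def choose_attribute_count(actual_errors):
    """actual_errors[k - 1] is the actual (validation) error with k attributes.
    Return the largest count reached before the error first increases."""
    for count in range(1, len(actual_errors)):
        if actual_errors[count] > actual_errors[count - 1]:
            return count
    return len(actual_errors)

# Hypothetical percentage errors for one to six attributes.
actual_errors = [9.0, 6.5, 5.0, 4.2, 4.8, 5.5]
best_count = choose_attribute_count(actual_errors)
```

The theoretical (training) error normally keeps falling as attributes are added, so it is the actual error that carries the overtraining signal in this sketch.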

The coal seam thickness determination model established using a BP neural network
The machine learning neural network used here learns by back propagation (BP) of the error to adjust the weights and is therefore called a BP network (Sola and Sevilla 1997; Chatterjee et al. 2010). In a BP neural network, the seismic attribute is a subset of different geological characteristics reflected in seismic data, and it is a seismic characteristic quantity describing geological information such as coal seam structure, lithology and physical properties. A BP neural system incorporates an input layer, an output layer and one or more hidden layers. Through the connections between neurons, the relevant weights are assigned. The training algorithm iteratively adjusts the weights to minimise the root-mean-square error between the actual output value of the network and the expected output value. The algorithm also returns the prediction accuracy (Lin and Liu 2000).
During the calculation process, it is necessary to normalise the initial data to constrain each characteristic scale to the same range. A network model and its structure are established. Training samples are selected according to the demands. Input and output expectations are added to the training network. Through forward propagation and BP processes, the weights and thresholds are corrected and the output errors are calculated until the learning error requirements are satisfied and function approximation is achieved. After the training, the actual data are evaluated and the final results are quantified.
The learning process of the BP artificial neural network algorithm adjusts the weights to reduce the error. An error function is constructed that ensures that the errors do not increase (Tripathy et al. 2020). In the constructed error function, C_l is the target output layer node and out[l]_k is the output expectation of the kth sample. Within a certain range, the coal seam thickness and coal seam reflected wave attribute parameters exhibit linear relationships. However, nonlinear relationships between the coal seam reflected wave attribute parameters and the coal seam thickness are also exhibited. The BP network can correct the parameters through error feedback, ensuring the nonlinear projection ability of the network.

Selection of the structure and hidden nodes.
The adaptive ability of the BP neural system relies on the structure of the hidden layers of the network. It has been shown that a three-layer BP network can approximate any nonlinear continuous function to arbitrary precision (Gan et al. 2020). We use a three-layer BP network for prediction. One of the most important factors affecting network performance is the number of hidden nodes (Wu and Cao 2019). An excessively small number of hidden nodes will result in less information, a low accuracy and slow network convergence; an increase in the number of hidden nodes will increase the amount of information, the accuracy and the network convergence speed. However, too many hidden nodes will complicate the neural network topology structure, worsen the fault tolerance and increase the identification error. Therefore, a reasonable number of hidden nodes must be selected.
By using the coal strata thickness data of the working face as training samples, a coal seam thickness BP neural network prediction model is established. A three-layer network structure is constructed. The four types of optimally selected seismic attribute values are used as the input layer, and the determined coal strata thickness value is used as the output layer. By using the trial-and-error method, the number of hidden nodes is gradually increased, and the best choice for the number of hidden layer nodes is 6, as shown in figure 4. The x coordinate is the number of hidden nodes, and the y coordinate is the actual percentage error.

BP network model parameters.
(a) Transfer function selection: the logsig function has the best effect among the transfer functions from the input layer to the hidden layer. It is a sigmoid function (abbreviated to S-type function). The logsig function has good differential characteristics. When the input signal is weak, the neuron still has an output, and when the input signal is strong, there is no 'overflow' phenomenon. The transfer function from the hidden layer to the output layer is the linear transfer function purelin. Purelin can output values in an arbitrary range for comparison with the sample values, whereas the sigmoid output is confined to the range from 0 to 1. (b) Training function and parameter selection: the training function Learngdx, a gradient descent backpropagation algorithm, trains the network by adaptively adjusting the learning rate with an additional momentum factor. The initial learning rate is 0.01, the objective function error target is 0.0001 and the momentum term is set to 0.9 to accelerate the convergence of the algorithm, thereby improving the learning efficiency, reducing the number of iterations, reducing errors and helping to keep the network from falling into a local minimum.
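A minimal sketch of such a network is given below, with four inputs, six logsig hidden nodes, one purelin output, a learning rate of 0.01, a momentum of 0.9 and an error goal of 0.0001, as stated in the text. Everything else is an assumption made for illustration: the weight initialisation, the batch update, the synthetic samples and the function names. The adaptive learning-rate adjustment and the separate training of the two layers are not reproduced.

```python
import math
import random

def logsig(z):
    """Log-sigmoid transfer function, output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, W, b1, V, b2):
    """Input -> hidden (logsig) -> output (purelin, i.e. identity)."""
    hidden = [logsig(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W, b1)]
    return hidden, sum(v * h for v, h in zip(V, hidden)) + b2

def train_bp(samples, targets, n_hidden=6, lr=0.01, momentum=0.9,
             goal=1e-4, max_epochs=2000, seed=0):
    """Batch gradient descent with momentum; returns weights and MSE history."""
    rng = random.Random(seed)
    n, n_in = len(samples), len(samples[0])
    W = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    V = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b1, b2 = [0.0] * n_hidden, 0.0
    dW = [[0.0] * n_in for _ in range(n_hidden)]
    dV, db1, db2 = [0.0] * n_hidden, [0.0] * n_hidden, 0.0
    history = []
    for _ in range(max_epochs):
        gW = [[0.0] * n_in for _ in range(n_hidden)]
        gV, gb1, gb2, sse = [0.0] * n_hidden, [0.0] * n_hidden, 0.0, 0.0
        for x, t in zip(samples, targets):
            hidden, out = forward(x, W, b1, V, b2)
            err = out - t
            sse += err * err
            gb2 += err
            for j, h in enumerate(hidden):
                gV[j] += err * h
                delta = err * V[j] * h * (1.0 - h)  # error fed back through logsig
                gb1[j] += delta
                for i, xi in enumerate(x):
                    gW[j][i] += delta * xi
        history.append(sse / n)
        if history[-1] <= goal:
            break
        for j in range(n_hidden):  # momentum update of every weight
            for i in range(n_in):
                dW[j][i] = momentum * dW[j][i] - lr * gW[j][i] / n
                W[j][i] += dW[j][i]
            dV[j] = momentum * dV[j] - lr * gV[j] / n
            V[j] += dV[j]
            db1[j] = momentum * db1[j] - lr * gb1[j] / n
            b1[j] += db1[j]
        db2 = momentum * db2 - lr * gb2 / n
        b2 += db2
    return (W, b1, V, b2), history

# Synthetic samples (not from the study): four normalised attributes per point.
rng = random.Random(1)
samples = [[rng.random() for _ in range(4)] for _ in range(12)]
targets = [2.0 + s[0] - 0.5 * s[1] + 0.8 * s[2] * s[3] for s in samples]
weights, history = train_bp(samples, targets)
_, predicted = forward(samples[0], *weights)
```

The trial-and-error search for the number of hidden nodes could reuse this function by calling `train_bp` with different `n_hidden` values and comparing the errors on points held out from training.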

Weight update and calculation.
To improve the model network weight training speed, the hidden layer-output layer and the input layer-hidden layer are separately trained. Separate training not only accelerates the optimisation of the hidden layer-output layer connection weights but also prevents the input layer-hidden layer weights from falling into local minima.
The adaptive learning ability of the model is reflected in the weights of a network. Under the constraints of the optimal hidden layer nodes, by adjusting and updating the weights between the connected layers, the computational sum of the square error E is minimised (Seo et al. 2008).
A BP neural network model of the coal seam thickness is established with the positive to negative phase ratio, time dip, frequency band ratio 6 and dominant frequency as the input layer. W and V represent the connection weights between the two connection layers of the BP network, respectively (Liu et al. 2010), and are obtained from the training. The effects of different seismic attributes on the coal strata thickness are different. Through optimisation of the combination of seismic attributes, a BP neural network method is established to comprehensively determine the coal strata thickness information. This method can reduce the ambiguity of seismic interpretation and improve the accuracy of seismic interpretation.

Discussion
(i) The application of seismic multiattribute machine learning to determine the coal strata thickness is effective. In this study, the effectiveness of the method is validated by using geological exposure. In the mining area 1200 of the Xingdong coal mine, the structure is complex and the correlation between the attributes and the coal strata thickness is poor. Under such conditions, seismic attributes that are highly correlated to the coal strata thickness of the local working face of the mining area are used. Through analysis of the sensitivity of the seismic attributes to the coal seam thickness, the relationship between the seismic attributes and the coal strata thickness is established. The BP neural network model is optimised. The four selected attributes, namely the positive to negative phase ratio, time dip, frequency band ratio 6 and dominant frequency, are used for calculation. The coal seam thickness in the 1225 working face is determined. The results are in line with reality, and the percentage error is low, as shown in figures 5 and 6.
(ii) Multiattribute optimisation improves the success rate of determining the coal strata thickness. The key to determining the coal strata thickness is the optimal attribute selection, the premise of which is that the boundary conditions conform to a relationship with the coal strata thickness. The error and fluctuation range of the results of single-attribute methods in seismic attribute coal strata thickness determination are large. The error of test points at different positions is different, which is mainly affected by the stability. The instantaneous frequency error is 2.1%, the energy half-time error is 3.16%, the dip angle and azimuth errors are 2.1% and the combination error is 1.04%. The single-attribute errors relative to the testing samples are large. The optimal attribute combination can effectively improve the accuracy, which is one of the reasons that multiattribute combination is used. The error and fluctuation range of the seismic multiattribute coal strata thickness determination method for a working face based on a BP neural network are very small. Only the relative error of m55 is greater than 4%, which does not compromise the effectiveness of the method. A single attribute may be more sensitive in a certain range, but it is not general. D14* as an untrained factor is a good example.
(iii) The error reduction during the calculations improves the prediction result of the method. Sample selection is not the only source of errors. In this study, because the mining area 1200 is not suitable for calculation (maximum correlation coefficient less than 0.3), the 1225 working face sample (correlation coefficients greater than 0.4) is selected. This selection may also be one of the reasons for the successful prediction of the coal strata thickness. The final calculation results of the model are closely related to the attribute selection. The optimal attribute selection excludes attributes with relatively large errors, returning the optimal attribute combination, which improves the optimisation of the target layer and is one of the reasons that a desirable final interpretation result is obtained. Through the optimisation of the constructed model structure and adjustment of the number of internal nodes of the neural network model, the final error is effectively reduced. Apart from the testing samples, this reduction can be seen from the results of the D14* borehole outside the 1225 working face. We believe that conventional geostatistical methods can achieve good results when a large number of boreholes and data are available, which is a considerable expense in mining. We do not rule out that individual results contain actual errors, but the correctness of the research method is validated by using the unknown surrounding borehole data.
Figure 5. Application of seismic attributes in the prediction of the coal strata thickness in the #2 coal seam of the 1225 working face.
(iv) The successful application of the model provides a way of predicting complex mining areas. We find that even with a large correlation coefficient, single-attribute predictions in the 1225 working face still exhibit relatively large fluctuations, which does not prevent us from performing expanded predictions in the area using the model. The effectiveness of the model gradually decreases as the overall error increases. However, the gradual expansion of the prediction area or changing of the range of the sample area and backward adjustment of the attribute samples may be effective for expanding this prediction method to complex mining areas.

Conclusions
In this study, we found that the multiattribute combination coal strata thickness determination method effectively determines the coal strata thickness by narrowing the sample range, optimising the attribute combination and effectively analysing and optimising the number of nodes. Additionally, this method avoids the problems of single-attribute predictions such as large errors and poor stability. The combination of narrowing the area to test the attribute sensitivity and adjusting the window and attribute combination is beneficial for interpreting complex geological conditions. The results of the multiattribute combination of the entity attributes of three-dimensional exploration methods constitute a new approach for intelligent and automated mining and interpreting geological environments.