A method for well log data generation based on a spatio-temporal neural network

Well logging helps geologists find hidden oil, natural gas and other resources. However, well log data are systematically insufficient because they can only be obtained by drilling, which involves costly and time-consuming field trials. Additionally, missing or distorted well log data are common in old oilfields owing to shutdowns, poor borehole conditions, damaged instruments and so on. As a workaround, pseudo-data can be generated from actual field data. In this study, we propose a spatio-temporal neural network (STNN) algorithm, which is built by leveraging the combined strengths of a convolutional neural network (CNN) and a long short-term memory network (LSTM). The STNN exploits the ability of the CNN to effectively extract features related to pseudo-well log data and the ability of the LSTM to extract the key features from well log data along the depth direction. The STNN method allows full consideration of the well log data trend with depth, the correlation across different log series and the actual depth accumulation effect. The method proved successful in predicting acoustic sonic log data from gamma-ray, density, compensated neutron, formation resistivity and borehole diameter logs. Results show that the proposed method achieves higher prediction accuracy because it takes into account the spatio-temporal information of well logs.


Introduction
Reservoir modelling is crucial for characterising the spatial variability of rock properties. Well logging represents the main source of input data for modelling. Engineers and geologists use petrophysical properties determined from well logs to estimate spatial correlations and apply geostatistical methods to build reasonable three-dimensional reservoir models. These models are used to improve the design of exploration and development strategies that lower costs and improve oil and gas production. However, in actual applications, acquiring sufficient log data is prohibitively expensive. Additionally, log information is frequently missing or incomplete due to various inevitable causes, such as borehole enlargement, instrument failure or incomplete logging due to economic concerns. The absence of log records represents a major problem in reservoir studies. Therefore, the generation of synthetic well logs has academic and engineering value (Fajana et al. 2018; Saporetti et al. 2018; Chen & Zhang 2020a,b; Osarogiagbon et al. 2020; Tatsipie et al. 2020; Zeng et al. 2020; Blanes de Oliveira & de Carvalho Carneiro 2021). With the rapid development of machine learning, algorithms for the intelligent prediction of logging curves are advancing worldwide. Researchers have used geological parameters or relationships between logging curves to synthesise logging curves and fill data gaps, producing intelligent prediction methods that achieve good results under certain conditions (Rezaee et al. 2008; Yang et al. 2008; Rolon et al. 2009; Yan et al. 2009; Guo 2010; Zhang et al. 2011; Alizadeh et al. 2012; Mo et al. 2015; Long et al. 2016; Salehi et al. 2016). Nevertheless, traditional machine-learning-based well-logging prediction methods have important limitations. The need to overcome these limitations has prompted research on adaptive feature extraction and prediction methods, which exploit, among others, rapidly developing deep-learning (DL) algorithms.
The use of DL has recently allowed scientists to achieve remarkable results in many fields (Banchs et al. 2001; Zhou et al. 2005; Zhou & O'Brien 2016; Su et al. 2018; Bader et al. 2019; Chebeir et al. 2019; Junno et al. 2019; Li et al. 2019; Guo et al. 2020; Wang et al. 2020a). DL combines low-level features in individual layers to form more abstract high-level attribute categories or feature representations to extract essential information from the data (Shao et al. 2018). In particular, convolutional neural networks (CNNs) (LeCun et al. 2015) and recurrent neural networks (RNNs) (Goltsev & Rachkovskij 2001; Tian & Noore 2004) have been successfully applied in many fields. CNNs have strong feature extraction abilities and are gradually being used to interpret logging information by identifying tight gas reservoirs (Zhu et al. 2020), estimating reservoir porosity (Feng 2020), and predicting total organic carbon content (Zhu et al. 2018) and lithology (Zhang et al. 2018b). The logging signal is a type of time-series signal: the logging information accumulates as the formation depth increases, and the dynamic changes between data points are a rich source of information. More abundant historical data provide more accurate predictions. However, because each data segmentation sliding window is treated independently, the correlations between data points in the original logging series are broken and related information may be lost. Thus, a CNN alone cannot easily extract key features from well log data in the depth direction.
A long short-term memory network (LSTM) is a special RNN structure with a strong ability to learn dynamic modelling. The LSTM performs well with time-series data, which compensates for the deficiency of the CNN for processing time-series data and effectively performs logging information interpretation and prediction (Pham et al. 2020;Zhang et al. 2020;Chen & Zhang 2020a,b;Wang et al. 2020b,c). The recurrent structure of the LSTM network allows the data to flow forward and backward within the network. Therefore, the LSTM outputs are generated from a series of data inputs that consider the inner relationships and variation tendencies that are compatible with a geological analysis perspective (Zhang et al. 2018a;Zeng et al. 2020). However, the LSTM model does not consider the local shaping information of well logs (Pham et al. 2020). Logging data reflect the complexity and strong heterogeneity of underground reservoirs. Therefore, synthetic well log generation must consider the spatial relationships across different logging data and the variation in logging information along the depth direction.
The aim of our study was to construct a more effective intelligent system for generating log data. Therefore, we propose a spatio-temporal neural network (STNN) that can express the temporal and spatial characteristics of the data. In the following, 'spatial correlations' refer to the extremely complicated mapping that exists between the nonlinear input and output well log data owing to the heterogeneity and complexity of the underground conditions; 'temporal information' refers to curvilinear trends and contextual information in reservoirs; 'spatio-temporal excavation' refers to fully considering the well log data trend with depth, the correlation of different log series and the actual depth accumulation effect. The STNN was built by leveraging the combined strengths of the CNN and LSTM networks. It improves log generation by incorporating the local correlation of logs related to different geological units. To verify our approach, a dataset was chosen from five vertical wells in the exploration area of the Ordos Basin. Two typical intelligent approaches, LSTM and multiple linear regression (MLR), were also applied and compared with the proposed method. Owing to the early stage of our research, we also analysed shortcomings and provided suggestions for future work.

Methodology
DL is widely used in various scientific and engineering fields (Zeng et al. 2020; Zhu et al. 2020; Wang et al. 2020b). The most commonly used DL algorithms include CNNs and RNNs: the CNN has the advantage of feature extraction (LeCun et al. 1998), whereas the RNN is effective for mining time-series data (Cleeremans et al. 1989). To fully use these advantages, we combined the two algorithms to obtain a better performing algorithm. The three main aspects of this algorithm are presented in the following subsections.

Convolutional neural network
The CNN (Bengio 2009; LeCun et al. 2015) is inspired by the structure of the visual system, in which the existence of multiple filters is essential for extracting features from the input data. As the network depth increases, deeper features can be extracted and robust features with translation invariance can be obtained from the original data. Several CNN structures have been proposed, namely the one-dimensional (1D), two-dimensional (2D) and three-dimensional (3D) CNNs (Zhao et al. 2019). The 1D CNN is primarily used for sequence-data processing (Abdeljaber et al. 2017), the 2D CNN is often used for image and text recognition and the 3D CNN is primarily used for medical image and video data recognition. Because well log curve prediction is a multi-sequence input problem with coupling between different curves, the convolution kernels in the CNN used in this study have a 1D structure (1D CNN).
CNNs can be used to deal with local features because they account for local logging data correlation. Compared with the traditional multi-layer perceptron and other neural networks, CNNs reduce the complexity of the network model and provide better generalisation capabilities by using local connections and sharing the weight of the convolution layer. CNNs abstract the input information into multi-layer features and retain the spatial topology of the original data, which adapts to the topological correlations between the logging data.
In this study, we used the edge zero filling method to ensure that the size of the data remained unchanged after the CNN convolution output, and the structural order of the data remained unchanged to allow the subsequent LSTM to learn to analyse the temporal dynamics of the data.
The pooling layer scales the input features and extracts the principal features. In this study, the local maximum value was extracted from the input features by max-pooling to reduce the number of trainable parameters and improve model robustness (Schmidhuber 2015). Max-pooling reduces the dimension of the output data while retaining the most significant feature information of the input (Tollas et al. 2015); its primary purpose is to reduce the parameters needed by the subsequent layer by shrinking the feature map while maintaining translation invariance (Borovykh et al. 2017).
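To make the two operations concrete, the following minimal numpy sketch (illustrative only, not the authors' implementation; the filter weights and toy log values are invented for the example) shows how edge zero filling keeps the convolved sequence the same length as the input, while max-pooling of size 2 halves it and keeps the largest response in each window:

```python
import numpy as np

def conv1d_same(x, kernel):
    """1D convolution with edge zero filling ('same' padding, stride 1),
    so the output has the same length as the input sequence."""
    k = len(kernel)
    pad = k // 2
    xp = np.pad(x, (pad, k - 1 - pad))
    return np.array([np.dot(xp[i:i + k], kernel) for i in range(len(x))])

def max_pool1d(x, size):
    """Non-overlapping max-pooling: keeps the largest value in each window,
    halving (for size=2) the length of the feature sequence."""
    n = len(x) // size
    return np.array([x[i * size:(i + 1) * size].max() for i in range(n)])

log = np.array([1.0, 3.0, 2.0, 5.0, 4.0, 6.0])        # toy log segment
feat = conv1d_same(log, np.array([0.25, 0.5, 0.25]))  # toy smoothing filter
pooled = max_pool1d(feat, 2)
```

For the six-sample input, the convolution output also has six samples, and pooling reduces it to three, which is the shape-preservation behaviour relied on later when the CNN output is passed to the LSTM.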

LSTM network
The CNN entails unidirectional transmission between layers: connections exist only between neurons in adjacent layers, and neurons within the same layer are not connected. However, for time-series data, the value at the next time instant is related to the value at the current moment and correlated with previous values over a long period. The RNN solves this correlation problem along the data sequence (Goltsev & Rachkovskij 2001; Graves & Schmidhuber 2005) by connecting the neurons in the same layer and associating previous and current information. In other words, the RNN combines previous and current data to output the prediction results.
The linked structure of the RNN sequence effectively addresses the sequence-data problem. However, because the RNN has only one memory unit, the gradient is easily eliminated. As the number of network layers increases, gradient disappearance becomes more serious until the state of the previous moment can no longer be effectively transferred (Bengio et al. 1994). Therefore, the LSTM network (Hochreiter & Schmidhuber 1997) was proposed as an improved RNN that avoids gradient disappearance and excels at processing and predicting time series with long intervals and delayed events (Cornia et al. 2018). Figure 1 illustrates the LSTM structure and shows the four interaction layers with special connection modes in the LSTM. The concept of a 'gate' is introduced in the LSTM to control the transfer calculations within the hidden units. The cell state is the key element: it carries the information memorised from earlier steps through the network. The current input and the previous hidden state are processed through the three gate structures of the interaction layers and the tanh unit of the cell to filter the information at each time step. Additionally, new information is added to the current calculation through the gate structures and the state of the memory unit; thus, the LSTM effectively updates and transmits key information along the time series.
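As a concrete (and simplified) illustration of the gate mechanics described above, the following numpy sketch implements a single LSTM step; the random parameters and the dimensions (five input logs, three hidden units) are arbitrary choices for the example, not values from the study:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step. W, U, b hold the stacked parameters of the
    forget (f), input (i), candidate (g) and output (o) transforms."""
    z = W @ x_t + U @ h_prev + b           # stacked pre-activations, shape (4*H,)
    H = len(h_prev)
    f = sigmoid(z[0:H])                    # forget gate: what to drop from the cell
    i = sigmoid(z[H:2*H])                  # input gate: what new info to admit
    g = np.tanh(z[2*H:3*H])                # candidate cell update
    o = sigmoid(z[3*H:4*H])                # output gate: what to expose as h
    c = f * c_prev + i * g                 # updated cell (long-term) state
    h = o * np.tanh(c)                     # updated hidden (short-term) state
    return h, c

rng = np.random.default_rng(0)
D, H = 5, 3                                # e.g. 5 input logs, 3 hidden units
W, U, b = rng.normal(size=(4*H, D)), rng.normal(size=(4*H, H)), np.zeros(4*H)
h, c = np.zeros(H), np.zeros(H)
for x_t in rng.normal(size=(10, D)):       # walk down 10 depth samples
    h, c = lstm_step(x_t, h, c, W, U, b)
```

The additive cell-state update `c = f * c_prev + i * g` is what lets gradients flow over long depth intervals without vanishing, unlike the repeated multiplication in a plain RNN.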

Spatio-temporal neural network
A novel spatio-temporal neural network is proposed herein to generate well logs, fully exploiting the advantages of highly nonlinear, self-adaptive and self-learning neural networks in nonlinear dynamic modelling applications. The STNN was built by leveraging the combined strengths of the CNN and LSTM networks. The network model uses the local perception ability (spatial feature extraction) of the CNN and the long-term memory function (temporal feature extraction) of the LSTM network to mine the spatio-temporal features of the logging data. The STNN exploits these structural advantages to compensate for the disadvantages of the respective component networks, thereby reducing training time, accelerating network convergence and improving performance.

STNN prediction model
The LSTM network was designed to extract key features from the well log data along the depth direction. The powerful self-learning ability of the CNN was used to extract logging data features adaptively from the input logging data to improve the prediction accuracy. The STNN was built by leveraging the combined strengths of the CNN and LSTM networks to simultaneously learn information on spatial features and temporal dynamics in the input logging signal. First, the logging data obtained were considered as multivariate time series. That is, the data collected by N logging curves at each point constitute an N-dimensional vector. The input multivariate time series is divided into feature maps using a sliding window (Figure 2); the corresponding feature map dimension of each sliding window is M × N, where M is the time-step hyperparameter and N is the number of input logging curves. Each feature map was taken as an input, and the convolution operation was performed in the CNN model to extract the features between multiple variables in the spatial topology. Figure 3 illustrates the 1D CNN process. The left side of Figure 3 shows the input time-series data, which form a multi-dimensional matrix convolved from top to bottom (shown by the arrow in Figure 3). To avoid overfitting, dropout (Srivastava et al. 2014) and an early stopping strategy were adopted. That is, the weights of the input feature vector are discarded with a certain probability in each training pass, and training is stopped when the training error does not decrease within the set number of early stopping epochs. Figure 4 shows the overall flow of the algorithm for the spatio-temporal network model. The experimental process primarily includes data acquisition, data preprocessing, model training, model testing and model evaluation (in which data preprocessing includes data normalisation and division into training and test sets). After data preprocessing, the training data were used to train the model.
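The sliding-window segmentation can be sketched in a few lines of numpy (the window length and data below are invented for illustration; the actual time step M is a tuned hyperparameter):

```python
import numpy as np

def sliding_windows(logs, M):
    """Cut an (L, N) multivariate log matrix into overlapping M x N feature
    maps; window t gathers samples t .. t+M-1, so each map carries the local
    depth context around the sample to be predicted."""
    L, N = logs.shape
    return np.stack([logs[t:t + M] for t in range(L - M + 1)])

L, N, M = 100, 5, 8                       # 100 depth samples, 5 input logs
logs = np.random.default_rng(1).normal(size=(L, N))
maps = sliding_windows(logs, M)           # shape (L - M + 1, M, N)
```

Each of the resulting M × N maps is what the 1D CNN convolves from top to bottom before the LSTM consumes the sequence of extracted features.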
The trained model was tested using validation set data, and the evaluation indices measured the performance of the model. For model training, the mean square error was used as the loss function and the Adam algorithm as the optimiser. The Adam algorithm, proposed by Kingma & Ba (2014), is currently the most commonly used optimiser. Compared with other adaptive learning rate algorithms, the Adam algorithm has a faster convergence speed and a more effective learning performance (Sabour et al. 2017).
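The Adam update itself is compact enough to sketch. The following self-contained numpy example (a toy one-parameter quadratic loss, not the authors' network; the learning rate and target are invented for the demonstration) shows the moving-average and bias-correction steps that give Adam its adaptive step size:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.01, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam parameter update (Kingma & Ba 2014): exponential moving
    averages of the gradient (m) and squared gradient (v), bias-corrected
    by the step counter t (starting at 1)."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)              # bias-corrected first moment
    v_hat = v / (1 - b2 ** t)              # bias-corrected second moment
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# minimise the mean-square-style loss (theta - 3)^2 from theta = 0
theta, m, v = 0.0, 0.0, 0.0
for t in range(1, 1001):
    grad = 2 * (theta - 3.0)               # gradient of the toy loss
    theta, m, v = adam_step(theta, grad, m, v, t)
```

Because the effective step is roughly lr × sign(gradient) while the gradient is steady, Adam marches toward the minimum at a controlled rate regardless of the gradient's raw magnitude.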

Geological context of the study area
The Ordos Basin, located in the western North China platform, is a rectangular tectonic basin surrounded by structural faults adjacent to the surrounding structural units. The Daniudi gas field is located in the northeastern Ordos Basin (red box in Figure 5) on the northern Yishan slope. Faults and structures are poorly developed in the block, which is generally a gentle monocline, low in the southwest and high in the northeast. The principal reservoir in the Daniudi gas field is a fluvial sedimentary system with a burial depth of approximately 2500-2800 m. The lithology is primarily light grey conglomerate, gravel-bearing coarse sandstone, medium-coarse sandstone and brown mudstone.

Data acquisition and preprocessing
The data for this experiment were obtained from five vertical wells (A1-A5) in the Ordos Basin. The well logs for the observation wells included gamma-ray (GR), density (RHOB), compensated neutron (CNL), acoustic sonic (DT), formation resistivity (RT) and borehole diameter (CALI) data. The experiment used the leave-one-out method, and five groups of experiments were performed. In each group, one of the five wells was used as the test dataset, and the other four wells were combined into the training dataset. The test dataset was not subjected to sequential extraction: the five input log curves were entered into the STNN as five complete sequences to predict the output of missing acoustic sonic logs. As a preprocessing step, we used the following min-max scaling method to scale the logging data into the range [0, 1] to improve the model prediction accuracy (Ma et al. 2015):

x′ = (x − x_min)/(x_max − x_min),    (1)

where x and x′ are the actual and normalised values of a well log, and x_min and x_max are the minimum and maximum values, respectively.
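A minimal sketch of this normalisation and its inverse (the toy gamma-ray values are invented for illustration); keeping the per-curve minimum and maximum allows predictions to be mapped back to physical units:

```python
import numpy as np

def minmax_scale(x):
    """Scale one log curve into [0, 1]; return the (min, max) pair so the
    scaled values can later be restored to physical units."""
    x_min, x_max = x.min(), x.max()
    return (x - x_min) / (x_max - x_min), (x_min, x_max)

def minmax_restore(x_scaled, x_min, x_max):
    """Invert the min-max scaling."""
    return x_scaled * (x_max - x_min) + x_min

gr = np.array([45.0, 120.0, 80.0, 60.0])   # toy gamma-ray readings (API units)
gr_scaled, (lo, hi) = minmax_scale(gr)
```

In the leave-one-out setting, the scaling parameters would in practice be taken from the training wells and applied unchanged to the held-out well.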

Model parameter setting
To verify the effectiveness of the STNN model for acoustic sonic log data generation, the model was compared with LSTM and MLR models. The STNN-based model structure consists of an input layer, a convolution layer, a max-pooling layer, an LSTM layer, a fully connected layer and an output layer, as shown in Figure 4. After a series of experiments, the following structure of the STNN model was chosen: convolutional layer (32 filters + filter size 1 + ReLU activation + padding 'same' + stride 1) + max-pooling (pooling size 1) + convolutional layer (64 filters + filter size 2 + ReLU activation + padding 'same' + stride 1) + max-pooling (pooling size 2) + 1 LSTM layer (30 neurons + ReLU activation) + 1 LSTM layer (30 neurons + ReLU activation) + 1 dropout layer (0.01) + 1 dense layer (1 neuron, linear activation) + compile (mean_squared_error loss, Adam optimiser with a learning rate of 0.001). In addition, we adopted an early stopping strategy and set the number of early stopping epochs to ten. That is, training was stopped if the training error did not decrease within ten epochs; otherwise, the model ran for a maximum of 1000 epochs. Note that these hyperparameters can be tuned to obtain a better performance. However, when comparing methods of the same type in a specific task, a sufficiently fair choice of hyperparameter values is difficult. A fair comparison depends on many factors, including the type of algorithm, the assumptions of the model and the sensitivity of the model to the input data. Therefore, to compare the STNN and LSTM models used in this experiment as fairly as possible, we fixed all of the parameters except for the network type.
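As a sanity check on the layer stack above, the following sketch traces the feature-map shape through the network for a hypothetical window length M (the actual window length is a tuned hyperparameter not reported here). With padding 'same' and stride 1 the convolutions preserve the sequence length, the size-2 pooling halves it, and the dense layer emits one acoustic value per window:

```python
def stnn_shapes(M, n_logs=5):
    """Shape of the data after each layer of the described STNN stack
    (a bookkeeping sketch, not the network itself)."""
    shapes = [(M, n_logs)]       # input window: M depth samples x 5 input logs
    shapes.append((M, 32))       # Conv1D: 32 filters, padding 'same', stride 1
    shapes.append((M // 1, 32))  # max-pooling, size 1 (length unchanged)
    shapes.append((M // 1, 64))  # Conv1D: 64 filters, padding 'same', stride 1
    shapes.append((M // 2, 64))  # max-pooling, size 2 (length halved)
    shapes.append((M // 2, 30))  # LSTM, 30 neurons, returning the sequence
    shapes.append((M // 2, 30))  # LSTM, 30 neurons (dropout keeps this shape)
    shapes.append((1,))          # dense layer: one predicted DT value
    return shapes

trace = stnn_shapes(16)          # hypothetical window of 16 depth samples
```

Note that halving the sequence length before the LSTM layers also halves the number of recurrent steps they must unroll, which is relevant to the running-time comparison later.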

Evaluation indicators
We applied two widely used indices to measure the prediction accuracy of the forecast models. The mean absolute error (MAE) and root mean square error (RMSE) are defined as

MAE = (1/N) Σ |y_i − ŷ_i|,    (2)

RMSE = √[(1/N) Σ (y_i − ŷ_i)²],    (3)

where the sums run over i = 1, …, N, N is the number of points in the well-logging curves and y_i and ŷ_i are the measured and predicted data, respectively. In addition to these evaluation indicators, we used the improvement rate (IR) index, which evaluates the improvement of target model A over model B on a given evaluation index X and is calculated using equation (4):

IR_X = (X_B − X_A)/X_B × 100%,    (4)

where X_i is the evaluation index value generated using model i (i = A, B). If IR_X is positive, target model A performs better than B on evaluation criterion X; the gap between A and B is larger for larger IR_X values.
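The three indices can be sketched directly (the toy arrays below are invented for illustration); because MAE and RMSE are error measures, a positive IR means the target model A has the lower error:

```python
import numpy as np

def mae(y, y_hat):
    """Mean absolute error over the N curve points."""
    return np.mean(np.abs(y - y_hat))

def rmse(y, y_hat):
    """Root mean square error; more sensitive to outlying points than MAE."""
    return np.sqrt(np.mean((y - y_hat) ** 2))

def improvement_rate(x_a, x_b):
    """IR of target model A over model B on an error index X: positive when
    A outperforms B, and larger when the gap is wider."""
    return (x_b - x_a) / x_b * 100.0

y = np.array([1.0, 2.0, 3.0, 4.0])        # measured values (toy)
y_hat = np.array([1.5, 2.0, 2.5, 4.0])    # predicted values (toy)
```

For these toy arrays, an IR computed between a model with MAE 0.25 and one with MAE 0.5 would be 50%, i.e. the first model halves the error of the second.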

Analysis of experimental results
To evaluate the model performance, the STNN, LSTM and MLR were applied to the acoustic sonic log generation problem of the five wells in the Daniudi gas field, Ordos Basin. Tables 1 and 2 show that the prediction results of the STNN are more accurate than those of the LSTM and MLR. All three models achieved good estimations for the five wells. The prediction results of the LSTM model were significantly better than those of the MLR method (Tables 1 and 2). LSTM can be used to generate well logs from a series of input log data by considering variation trends and context information with depth. Furthermore, the spatio-temporal network model improved the prediction accuracy, indicating that the STNN model extracted the key spatio-temporal features of the sequence and provided a higher prediction accuracy. However, in well A4, the RMSE of the LSTM was slightly lower than that of the STNN, possibly because A4 has patterns that differ from those of the other wells and do not appear in the training dataset. Perhaps most importantly, the RMSE metric is a mean, implying that the results are averaged throughout the entire well; therefore, a generally poorly predicted log may contain exceptionally accurate parts. Here, the RMSE metric only provides information on the point-based magnitude of the error throughout the entire log. Furthermore, the RMSE is more sensitive to abnormal data points in the validation set; thus, the RMSE may be inflated if a few predicted points deviate strongly from the measurements. Moreover, non-key features extracted under the spatio-temporal model may reduce the prediction accuracy, revealing the importance of the CNN in the STNN model. To analyse the STNN performance, the synthetic logs of A1 and A4 (the best and worst STNN predictions, respectively) were plotted and compared with those of the LSTM (Figures 6 and 7). The left side represents the five input well logs: GR, density, compensated neutron, formation resistivity and borehole diameter.
The three log curves on the right side correspond to the acoustic logs generated by the STNN, LSTM and MLR, respectively. The red curves represent the results generated by the STNN, LSTM and MLR, and the black curves represent the actual measured acoustic sonic logs as a reference. The rightmost column in each figure shows the relative error between the predicted and actual values, defined as the predicted value of the model minus the actual value. Relative errors closer to the zero baseline indicate predicted results with smaller fluctuations and greater accuracy. Figures 6 and 7 reveal that the STNN and LSTM networks (especially the STNN) successfully predict the changing trend of the well log information. The MLR method tracks the logging signal changes poorly, especially with fluctuating logging data, and yields large errors in the prediction result. Therefore, the STNN makes good predictions that accurately reflect the changing trends of the well logs. Specifically, observations of the logging curve of well A1 at depths of 2515-2525 and 2615-2620 m reveal a step increase in the DT values, which the MLR failed to estimate. The STNN and LSTM predict from sequence data and accurately judge the trend of the predicted curve from the trends of the input curves, because the prediction is based on the correlation of sequence data. This experiment indicates that the STNN and LSTM can comprehensively analyse the influence of the data before the prediction point and the input at the prediction point. This characteristic enables the STNN and LSTM to accurately predict sequence-data trends and generate well logs.
Comparing the prediction results of the LSTM and MLR, it can be seen that the latter exhibits a large error. The MLR establishes a linear mapping relationship between the input and output logging data to reconstruct missing logging curves. However, owing to the influence of many geological factors, such as burial depth, tectonic position, sedimentary environment, lithology change and degree of diagenesis, the relationship between logging curves is, from the perspective of rock geophysics, typically nonlinear, and the MLR can hardly capture it. Compared with the linear regression analysis method, the LSTM has a strong data mining ability and can effectively determine nonlinear relationships between different logging curves, thereby significantly improving the overall prediction accuracy.

Figure 6. Comparison between actual and predicted DT data of different models in well A1.
By comparing the prediction results of the STNN and LSTM, we can see that they differ significantly, with the STNN better predicting the overall trend of the target logging curve. Additionally, the volatility of the STNN is lower, indicating that it is more accurate than the LSTM for data with unknown patterns. The spatio-temporal network model effectively uses the strong feature extraction ability of the 1D CNN and uses the LSTM to extract key features from well log data along the depth direction. Additionally, the cross-layer and interlayer neurons have an information transmission function that effectively retains an optimal amount of information and removes interference. The results in the figures and tables reveal that the STNN model indeed considers temporal and spatial factors, whereas the LSTM model extracts only temporal features from the well log data and therefore produces inferior predictions. The STNN model designed in this study achieved the best prediction accuracy by considering both the temporal and spatial features. Theoretically, the CNN model, which extracts local data features and abstracts them into high-level features, is more suitable for spatial expansion. The logging information reflects changes in the formation with depth, which is an effective feature for generating well logs. The LSTM, which functions like a long-term memory, is more suitable for temporal expansion and for processing sequence data. To extract logging data features, we need to consider the spatial relationship between different well logs and the changes along the depth direction. The STNN model expresses these spatio-temporal features for more accurate predictions. The experimental results verify that the STNN model retains the spatial feature extraction advantages of the CNN and the temporal feature extraction of the LSTM. Our method, which considers geologic trends and local correlations, predicts logs that closely match the actual logs.
Table 3 shows the running time (the time required for model training and prediction). The STNN model has a shorter running time than the LSTM model for the five wells. Theoretically, compared with the LSTM, the STNN introduces a CNN for spatial feature extraction, which increases the complexity of the network structure and should increase the running time. However, the actual results show that the running time of the STNN model does not increase; rather, it decreases significantly, possibly because the max-pooling layers shorten the sequence that the costly LSTM layers must process. Thus, compared with the LSTM, the STNN model not only has a higher prediction accuracy but also a shorter running time. Almost all models have their own advantages and disadvantages; thus, we designed an STNN model for generating acoustic sonic logs that uses the CNN to extract related features from existing well logs and the LSTM to make the predictions. The combined analysis of the prediction error and running time reveals that the proposed STNN model improves prediction accuracy, reduces the running time and offers a higher performance-to-cost ratio. Therefore, the STNN can be a powerful tool for predicting well-logging sequences.

Conclusion
As the exploration field changes, the reservoir medium transitions from homogeneous to heterogeneous. Pronounced nonlinear relationships and complex mappings among logging data are common. Existing well log generation methods (such as cross-plots and multiple regression analysis) cannot accurately generate synthetic well logs within heterogeneous formations; therefore, new ideas must be pursued and new methods developed. Well log generation must consider the spatial relationships between different logging data and the changes in well log data along the depth direction. Current studies on forecasting missing well logs using machine learning have not considered the spatial and temporal characteristics simultaneously. Therefore, in this study, we propose a spatio-temporal neural network to estimate missing acoustic sonic logs from GR, density, compensated neutron, formation resistivity and borehole diameter logs. The model considers geologic trends and local correlations to generate a high-accuracy acoustic sonic log. The STNN was built by leveraging the combined strengths of the CNN and LSTM networks. The STNN network is more sensitive to reservoir feature information and, therefore, correctly resolves the relationships between reservoir features within the depth sequence. The advantages of the CNN for feature extraction and of the LSTM for time-series data processing are used for more accurate generation of synthetic well logs. The results showed that the proposed approach produced better acoustic sonic log data predictions and reduced the running time. This approach can be extended to predict other types of logs from different types of input logs. Therefore, the proposed method provides fast and high-quality data for further geological research and reservoir geological modelling, which lowers costs and enhances productivity in oil and gas development.
Although the proposed DL model generates better prediction results, this study only verifies and analyses the feasibility of its application in acoustic sonic log prediction from the perspective of theory and method. In addition, this study does not specifically consider the impact of factors such as the geological environment, well spacing and geological events on the prediction results. Therefore, in-depth and detailed research remains necessary for specific applications. Future research will include other types of uncertainty, such as data uncertainty, and will make the local window size, chosen according to the geologic setting, an input to the network. A rock physics model can also be combined with the DL approach to create better predictions for well log generation. In addition, many classical machine learning methods have been widely applied to well log forecasting problems, such as support vector machines, random forests, decision trees, gradient boosting, multi-layer perceptrons and artificial neural networks, which we have not tested. Under certain conditions, these traditional machine learning methods are effective for forecasting missing well-logging curves. Therefore, in future work, we shall carefully compare and analyse the prediction results of the STNN and classical machine learning methods to further verify the scientific soundness and effectiveness of the proposed STNN.