A novel registration method for long-serial section images of EM with a serial split technique based on unsupervised optical flow network

Abstract Motivation The registration of serial section electron microscope images is a critical step in reconstructing biological tissue volumes, and it aims to eliminate complex nonlinear deformations from sectioning and replicate the correct neurite structure. However, due to the inherent properties of biological structures and the challenges posed by section preparation of biological tissues, achieving an accurate registration of serial sections remains a significant challenge. Conventional nonlinear registration techniques, which are effective in eliminating nonlinear deformation, can also eliminate the natural morphological variation of neurites across sections. Additionally, accumulation of registration errors alters the neurite structure. Results This article proposes a novel method for serial section registration that utilizes an unsupervised optical flow network to measure feature similarity rather than pixel similarity to eliminate nonlinear deformation and achieve pairwise registration between sections. The optical flow network is then employed to estimate and compensate for cumulative registration error, thereby allowing for the reconstruction of the structure of biological tissues. Based on the novel serial section registration method, a serial split technique is proposed for long-serial sections. Experimental results demonstrate that the state-of-the-art method proposed here effectively improves the spatial continuity of serial sections, leading to more accurate registration and improved reconstruction of the structure of biological tissues. Availability and implementation The source code and data are available at https://github.com/TongXin-CASIA/EFSR.


Introduction
Serial section electron microscopy (ssEM) is a mainstream image-acquisition method for connectomics research. In ssEM, a biological tissue is cut into 30-50 nm thick sections and then imaged with a high-resolution electron microscope (EM) (Briggman and Bock 2012). This technique offers several advantages over serial block face scanning electron microscopy (Denk and Horstmann, 2004) and focused ion beam scanning electron microscopy (Knott et al., 2008), such as a large imaging area, parallel imaging in multiple EMs, and nondestructive sample processing, making it well suited to large-volume reconstruction. In recent years, ssEM has been utilized in many large-scale connectomics reconstruction projects (Hildebrand et al. 2017, Zheng et al. 2018, Macrina et al. 2021, Shapson-Coe et al. 2021.
One major drawback of ssEM is the physical cutting of the tissue block into sections, which results in the loss of continuity between sections. In addition, the section cutting process also introduces nonlinear deformations (Gardella et al. 2003). The destruction of biological structures caused by the loss of continuity and nonlinear deformations hinders the accurate reconstruction of these structures. To address these issues, serial section images need to be registered to restore the axial continuity of biological structures, such as neurites (Anderson et al. 2009, Cardona et al. 2010, Bock et al. 2011. Section registration involves aligning similar structures between adjacent sections to restore structure continuity in biological tissue. However, this task is challenging due to the similarity in morphology of different neurites, natural morphological changes of the same neurite across different sections, and factors, such as dust, radiation damage, and other artifacts. Several methods have been proposed for section registration, such as Elastic registration (Saalfeld et al. 2012), ssEMNet (Yoo et al. 2017), voxelMorph (Balakrishnan et al. 2019), and SEAMLeSS (Mitchell et al. 2019). However, these methods that measure pixel-level similarity between sections to perform registration can lead to overregistration, causing destruction of the axial structure of the biological tissue by incorrectly eliminating natural morphological changes between sections. As shown in Fig. 1, the strong similarity constraint leads to a reconstruction where all registered neurites are perpendicular to the section plane.
This study proposes a solution to tackle the issue of overregistration in biological tissue section registration by utilizing a feature representation that considers the morphological variations among sections to measure inter-section similarity. Convolutional neural networks are well suited to this task due to their superior nonlinear modeling capabilities and ability to learn feature representations from training data (Mishchuk et al. 2017, Tian et al. 2017. In particular, the Unsupervised Learning of Thickness-Insensitive Representations (UTR) (Xin et al. 2021) feature descriptor is employed as it accounts for the morphological variations of neurite structures and reduces the impact of biological morphological changes between sections on registration. This study uses the unsupervised optical flow network to register sections. By utilizing UTR descriptor to measure the similarity between sections during network training, the proposed method allows the optical flow network to focus on positional distortions rather than morphological changes in the neurite structure, thereby minimizing the risk of overregistration.
In addition to pairwise registration design concerns, avoiding errors cumulative during serial registration is also a significant challenge. Most existing registration algorithms follow a specific order, such as successively registering serial sections to the most recently registered sections from the top of the block (Tasdizen et al. 2010). However, this process introduces errors due to morphological changes in neurite structure, and the error can accumulate across sections. Severe error accumulation can cause the warped section to fail to meet the similarity constraint of the serial registration algorithm, leading to a break in the neurite structure, as shown in Fig. 1. Elastic registration (Saalfeld et al. 2012) constructs a spring mesh across the serial sections to avoid error accumulation. ssEMNet (Yoo et al. 2017) implements serial registration by incorporating supervision of neighboring sections. Popovych et al. (2020) proposed enhancing local continuity through vector voting with several neighboring sections. Lv et al. (2020) reduce error accumulation by imposing constraints on serial registration, but this method is restricted to rigid registration. To address error accumulation, this study proposes a structural regression method. The cumulative error is first estimated using optical flow, and then compensation is performed section by section. Since biological structures are expected to transform smoothly in the axial direction, the similarity between sections determines the degree of error compensation.
Although structural regression has the benefit of compensating for cumulative registration errors, it has limitations in that it can only be used in short series because the cylindrical model (see Section 2.2 for details) of neurites used in this article is only applicable to short series and may not be suitable for long series. Therefore, to enable the use of structural regression in long series, this article proposes a strategy of dividing the long series into multiple short series and performing registration on each of them separately. With this strategy, the proposed registration framework can be used for the registration of long series.

Materials and methods
This study proposes a serial section registration framework ( Fig. 2A) that aims to achieve accurate reconstruction of neurite structures. The framework is designed to address two main aspects of serial registration: preserving the natural morphology of neural structures and compensating for cumulative errors introduced during registration. To achieve these goals, the first and last sections are registered by rigid registration, with the purpose of restoring the positional relationship between the first and last sections to provide a reference for structural regression. The optical flow is then used to register the remaining sections sequentially. Cumulative error fields are estimated and eliminated by warping for structural regression. In this study, the original section i is denoted as s i , and warped section i after registration is referred to as ws i . The reference section is denoted as s 1 , and the other sections are registered to the previous section sequentially. To register s i ; i ¼ 2; . . . ; n À 1, the optical flow network E estimates the optical flow F i!iÀ1 ¼ E s i ; ws iÀ1 ð Þ from s i to ws iÀ1 , and then F i!iÀ1 is used to warp s i to obtain ws i ¼ / F s i ð Þ. The cumulative error is estimated as the optical flow from ws nÀ1 to ws n and removed from ws i ; i ¼ 2; . . . ; n À 1 after serial registration to obtain the final result.  Xin et al.

Optical flow network
The serial registration process involves multiple pairwise registrations, so the choice of pairwise registration model can significantly impact the results. Common parametric models may not be sufficient to accurately model the complex nonlinear deformations introduced during section preparation. To address this challenge, this article uses an optical flow approach to register pairs of sections. For each pair of sections, this registration process involves estimating the optical flow field between the sections using an optical flow network and warping them into the reference section. This approach improves on the state-of-the-art unsupervised optical flow method ARFlow (Liu et al. 2020) to achieve better registration results for images of biological sections. Optical flow networks have a greater potential to learn the unique features of section images compared to traditional optical flow methods, and ARFlow uses unsupervised learning, avoiding the need for data labeling, which is difficult to obtain. The network architecture used in this article, identical to that of ARFlow, is a lightweight PWC-Net (Sun et al. 2018) (pwc-lite) that computes the optical flow from coarse to fine using five octaves; only three are shown in Fig. 3A due to space limitations. The photometric loss L ph in ARFlow measures the similarity between images and is a weighted sum of the L1 loss, structural similarity (ssim) loss, and ternary loss, which can only measure the pixel-level similarity between images. However, when this optical flow network is used to register images of biological sections, the differences in neurite structure morphology between adjacent sections are severely penalized, leading to the elimination of the morphological differences between sections of biological tissue during registration. To address this issue, the proposed method utilizes the pre-trained UTR (Xin et al. 2021) Network to extract descriptor features (feat s i ð Þ) of sections s i to calculate the similarity between sections, thereby allowing the network to focus more on the location changes of neurite structure and minimize the impact of neurite structure morphology changes between sections on registration. Specifically, in this article, we the section image obtained by warping the section s i with the optical flow F. Additionally, the smoothness loss L smooth constrains the consistency of the deformation field, while the augmentation loss L aug augments data and improves the accuracy of unsupervised learning. Figure 3B illustrates the training process of the optical flow network, which involves optimizing the photometric loss L ph , the smoothness loss L smooth , and the augmentation loss L aug .

Structural regression
Most registration methods are used to register sections with a reference section, but they may warp the spatial structure of neurites and introduce errors. The sequential nature of the registration process can cause errors to accumulate, resulting in distorted neurites. As shown in Fig. 4, the cumulative errors tend to distort angled neurites to be perpendicular to the section plane. These errors can eventually become so significant that the matching algorithm cannot match adjacent sections, leading to a break in the neural structure's continuity.
Although the proposed improved optical flow method attempts to preserve the natural morphological variation of biological structure across sections, it still cannot completely prevent the accumulation of errors. The neurite in a short series can be approximated as a cylinder, resulting in a consistent direction of the registration error across all sections. The cumulative error e i for a given section s i can be calculated as the sum of the registration errors up to that section, where De i represents the error on section s i , such To compensate for the cumulative registration error in each section, this study proposes using the UTR descriptors to estimate the proportion of the error on the current section. The difference in UTR descriptors between two adjacent sections is used to indicate the proportion of the cumulative error on the current section. To suppress the influence of cumulative errors and accurately reconstruct the morphological structure of neurites, the optical flow is used to warp section s i with a weight w i that is calculated based on the distances between the UTR descriptors of all sections.
Before the serial registration, the first and last sections have been already rigidly registered to recover their spatial position correlations, providing a reference for the subsequent structural regression. These sections are not deformed during the following registration process. The optical flow F nÀ1!n , which represents the displacement field of all pixels between the last two sections, can be used to approximate the total cumulative error e nÀ1 . The UTR descriptors prioritize the changes in position between sections, which makes them a more reasonable way to evaluate the degree of cumulative error.
To further improve the accuracy of the registration, this study uses the distances between UTR descriptors to calculate weights for all sections in the series. To calculate the weight w i for section s i , the distance d i between the UTR descriptors of section s i and section s iÀ1 is first calculated using the L2 , where featðs i Þ denotes the UTR descriptor of section s i . The weight w i is then defined as the ratio of the sum of distances between the UTR descriptors of all sections up to section s i and the sum of distances between the UTR descriptors of all sections in the series, given by w i ¼ P i j¼1 d j = P n j¼1 d j . The weight w i does not directly measure the proportion of cumulative error on section s i relative to the total cumulative error on the series. However, it reflects the degree to which compensation is applied to the cumulative error on section s i . The assumption that neurites are cylindrical in short series means that larger registration errors on a section correspond to greater differences between that section and the previous section. The difference in UTR descriptors reflects the size of the error on the current section after minimizing the impact of neurite structure changes, and therefore w i can reflect the degree of error compensation.
By using the weights to warp each section with the optical flow, this approach can effectively suppress the influence of the cumulative error and accurately reconstruct the morphological structure of neurites. A novel registration method for long-serial section images of EM 5

Division of long series
This article also proposes a method for registering long-serial sections by dividing them into multiple short series and registering them separately to compensate for the cumulative errors in serial registration of short series. Structural regression cannot accurately recover the structural morphology of neurites in longer series, as it is not possible to approximate the long neurites as cylinders. To address this issue, the long series is divided into n equal short series, where n is determined by the degree of error accumulation, with larger n used when the error accumulation is smaller and vice versa.
The last section of each short series, referred to as the benchmark section, also serves as the first section of the subsequent short series. The benchmark sections mentioned in this paragraph formed another series with lower axial resolution, and they are rigidly registered in advance by extracting SIFT points and matching them with RANSAC to estimate the rigid transformation matrix for registration. Those rigid transformations determine the spatial correspondence of the short series. This enables the recovery of position information for the first and last sections of each short series. Subsequently, the proposed short series registration method is applied to each short series for serial registration, and the resulting short series are stacked to obtain the registration results for the long series.

Implementation details and data description
This study utilizes three different datasets, namely the zebrafish brain ssEM data, CREMI dataset, and FIB-mito dataset. The zebrafish ssEM dataset, which is composed of 350 pairs of images for training and 100 pairs of images for validation, is utilized to train the optical flow network. The architecture of this network is identical to that of ARFlow. Unlike ARFlow, the photometric loss is based on UTR descriptors. To test the proposed method, the CREMI and FIB-mito datasets, which have different morphological appearance compared to the datasets used for training, are used as test sets.
The optical flow network is tested on these test sets, and the results are evaluated.
The zebrafish ssEM dataset used in this study is obtained from the brain of a zebrafish larva using automatic tapecollecting ultra-microtome scanning electron microscope imaging, which was described in Xin et al. (2022). The dataset consists of 450 pairs randomly sampled brain sections with a voxel resolution of 32 Â 32 nm 2 and a section thickness of 30 nm. The image size is 1024 Â 1024.
The CREMI dataset (https://cremi.org/) used in this study is acquired from the brain of an adult Drosophila using serial section transmission electron microscopy. The dataset consists of 125 successive EM images with a voxel resolution of 4 Â 4 Â 40 nm 3 and an image size of 1250 Â 1250. To simulate the real deformation during section preparation, the method of Yoo et al. (2017) is used to warp images. Thin plate spline (Bookstein 1989) is used to deform sections through multiple random vectors at random locations. The random vectors are sampled from a normal distribution with a mean of zero, and the random positions are distributed uniformly across the whole section image. In addition, these sections are centered-cropped to 1024 Â 1024 for ease of use. The dataset is strictly registered manually and can be used as the ground truth for EM images.
The FIB-mito (Lucchi et al. 2013) dataset used in this study represents a 5 Â 5 Â 5 lm 3 block taken from the CA1 hippocampus region of the brain. The dataset corresponds to a 1065 Â 2048 Â 1536 volume with a resolution of 5 Â 5 Â 5 nm 3 , and it is acquired by FIB-SEM. To simulate the typical section thickness of ssEM, which is between 30 and 50 nm, one of every eight sections in this dataset is kept during the experiment (at a 40 nm interval). The test dataset is generated using the same method used for the CREMI dataset.
For the CREMI dataset, this study uses the normalized cross-correlation (NCC) and the Dice coefficient of neurite labels to evaluate the accuracy of the registration results. The Dice coefficient is a measure of set similarity. In registration, the Dice coefficient of the label indicates the overlap degree of the same neuron after registration, and the Dice coefficient tends to 1, indicating that the higher the accuracy of the recovery of the same neuron structure after registration is. That is, the higher the accuracy of the registration result is. Furthermore, the Dice coefficient can measure the registration accuracy not affected by the image quality. To prevent the impact of labels representing intercellular spaces, this study uses the Dice coefficient of 50 neurites with the largest area on the first section to measure the accuracy of the registration results. This method was also used by ssEMNet to evaluate registration accuracy. For the FIB-mito dataset, this study only uses the NCC of sections to assess the registration results because unlike CREMI dataset, the dataset lacks dense labels of structures.

Results
This section evaluates the efficacy of the proposed pairwise registration and structural regression methods, as well as the entire framework including the long-serial splitting strategy. The optical flow method used in this study is an improved version of ARFlow. Thus, the original pre-trained ARFlow model (Liu et al. 2020) and the ARFlow_ft model fine-tuned on the zebrafish dataset (as described in Section 2.4) are used as the baselines for this experiment. All models are trained with Intel(R) Xeon(R) Gold 6142 CPU and one NVIDIA Tesla V100 GPU.

Pairwise registration
Pairwise registration is the basis of serial registration. It is desirable for pairwise registration methods to prioritize the location changes of biological tissues over morphological changes. In this experiment, for each randomly deformed section S moving , this article uses the previous undeformed section as the reference section S reference . More specifically, the pairwise registration estimates the deformation field of the two sections, and then the field is used to warp S moving . Figure 5 shows how our method can improve registration results. In the first row of the figure, mitochondria show a magnified morphological change between the reference section and the section to be registered. In contrast, mitochondria in the second row show a smaller morphological change. After registration, the baseline method tends to distort this natural deformation, while our method retains the morphological change.
This study uses the NCC and Dice coefficient to evaluate the registration results. The NCC measures the pixel similarity between images. While NCC (Reference) indicates the similarity between the warped result and the reference section, it cannot directly reflect the accuracy of the registration.
In comparison, NCC (GT) represents the similarity between the result and the ground truth. The Dice coefficient reflects the degree of overlap between the registered neurites and this study uses the NCC and Dice coefficient to evaluate the registration results. The NCC measures the pixel similarity between images. While NCC (Reference) indicates the similarity between the warped result and the reference section, it cannot directly reflect the accuracy of the registration.
In comparison, NCC (GT) represents the similarity between the result and the ground truth. The Dice coefficient reflects the degree of overlap between the registered neurites and the ground truth neurites. To demonstrate the effectiveness of the proposed method, the method is compared with baselines, Elastic (Saalfeld et al. 2012), SIFTFlow (Liu et al. 2011), FlowFormer (Huang et al. 2022, CRAFT (Sui et al. 2022), and SEAMLeSS (Mitchell et al. 2019). As shown in Table 1, the proposed method exhibits the highest NCC (GT) value and has the highest Dice coefficient.
In addition, an experiment is also conducted to compare the improved method with the baseline method. As shown in Table 1, when comparing the improved method with the baseline methods (ARFlow and ARFlow_ft), it can be seen that ARFlow_ft has the highest NCC (Reference), while the improved method has the highest NCC (GT). This suggests that the registration result of ARFlow_ft is most similar to the reference section, while the registration result of the improved method is closer to the ground truth. This indicates that the improved network is able to better preserve the natural morphology of the neurites.

Serial registration
This study evaluates the effectiveness of the proposed method for serial registration using the first 32 layers of each dataset. The efficacy of the proposed method is quantitatively analyzed and compared with baselines, Elastic (Saalfeld et al. 2012), SIFTFlow (Liu et al. 2011), FlowFormer (Huang et al. 2022, CRAFT (Sui et al. 2022), and SEAMLeSS (Mitchell et al. 2019). These comparison methods use the serial constraint method mentioned in the original paper, and do not use serial constraint if not mentioned. The baseline method (ARFlow and ARFlow_ft) in this article utilizes the structural regression proposed in this study as the serial constraint method, as it can be seen in Fig. 7 that the serial registration results of the baseline method can be improved by structural regression. The serial registration results are presented in Table 2, where the proposed method achieves the highest registration accuracy across different datasets.
The proposed method is specifically designed for short serial section registration, but it can still be used to register long-serial sections by dividing them into multiple short serials. To better demonstrate the performance of serial registration, this study displays a qualitative longitudinal view of long-serial registration results. In this experiment, the CREMI dataset is divided into five short series, each containing 25 sections, using the long-serial division and benchmark sections registration strategy mentioned in Section 2.3. Different serial registration methods are performed on each short series, and the results of the short serial registration are then stacked to construct a long-serial registration. The performance of the proposed method is compared with other registration methods, as shown in Fig. 6, and our method achieves the best longitudinal continuity in the long-serial sections and produces results closest to the ground truth.

Ablations
A set of ablation experiments are conducted in this study to illustrate the impact of each module on serial registration. These experiments are carried out on the first 32 layers of each dataset mentioned in Section 3.2. Table 3 shows the results of these experiments, where each section of the table corresponds to a particular module of the proposed method that was individually tested. The final version of the proposed method employs the versions with underlines. In the following paragraphs, each experiment will be described in detail.
Finetune: Although each optical flow method claims to have good generalization performance, finetuning existing pre-trained models for biological images is significant. The results show that the performance of the model is greatly improved after finetuning on the zebrafish dataset.
Similarity measure: This article replaces the pixel-level similarity measure used in the original ARFlow with the L2 distance of UTR descriptors. Training the network using the original similarity measure makes the network susceptible to differences in neural structure and morphology. The results indicate that using the L2 distance of UTR descriptors to measure slice similarity improves serial registration. Serial constraint: The proposed method uses structural regression for serial constraint. Table 3 demonstrates the impact of structural regression on serial registration, with significant improvement shown. Figure 7 illustrates the contribution of each module in the proposed method through qualitative results. This figure presents a longitudinal view of the serial registration results for the first 32 sections of the CREMI dataset. The morphological differences of the neurites in the red box demonstrate that fine-tuned optical flow, UTR descriptors, and structural regression all help to make the reconstructed neurite structure more similar to the ground truth. Moreover, our result shows improved continuity of the cell membrane when compared to the manual registration result.   Xin et al.
This study presents a new method for improving the accuracy of serial registration in connectomics research using ssEM. Serial section registration is a crucial step in reconstructing biological tissue volumes, as it tries to prevent artificial deformations and restore the correct neurite structure. However, traditional approaches of serial section registration, such as template matching and optical flow methods, have faced significant challenges, including the risk of overregistration due to optimization based on pixel-level similarity and the generation of serial registration cumulative errors. This study presents a novel approach to addressing the challenges previously mentioned. The proposed method leverages a stronger representation of features and an unsupervised  The highest performances are shown in bold. optical flow network to register pairs of sections. By relying on feature similarity rather than pixel similarity, the network is able to eliminate nonlinear deformations and accurately register the sections while preserving the natural morphological variations of neurites. It also has preferable resistance to scenes, such as contrast blur, lens distortion, and section contamination, it is because UTR (the descriptor used in the article) can suppress the influence of these factors during the training process to some extent. As demonstrated in Fig. 5, the method is able to accurately detect changes in mitochondrial size, which is a challenge for the baseline method. This suggests that the proposed method can correct positional deviations and maintain the normal structure of organelles. Moreover, an optical flow network is used to estimate and compensate for cumulative registration errors, allowing for more accurate recovery of the spatial structure of biological tissues. As shown in Fig. 7, the structural regression technique effectively compensates for cumulative error and corrects deviated neurites to their normal positions. However, it should be noted that the error accumulation assumption of the structural regression method is based on short serial sections, making it suitable only for short serial section registration. For long-serial section registration, a feasible strategy is to divide the long series into multiple short series for registration. This allows for the reconstruction of the spatial continuity of the entire block. While the short serial splitting strategy proposed in this article is simple, it may be worth considering other feasible schemes. Further study is needed to understand how to effectively connect the relationships between the short serials. In addition, this method cannot solve the registration accuracy problem caused by section fold and crack, which is also a problem that we need to further solve in the future.
The experimental results show that the enhancements proposed in this study effectively improve the spatial continuity of serial sections, resulting in improved registration and accurate recovery of the spatial structure of biological tissues. This method represents a significant advancement in the field of serial section registration, as it addresses the challenges faced by current approaches and has the potential to greatly improve the accuracy of volume reconstruction in the connectomics research.