ABSTRACT

Artificial intelligence (AI) and deep learning techniques are playing an increasing role in astronomy to deal with the data avalanche. Here we describe an application for finding resolved planetary nebulae (PNe) in crowded, wide-field, narrow-band Hα survey imagery in the Galactic plane, to test and facilitate more objective, reproducible, efficient and reliable trawls for them. PNe are important for studying the late-stage stellar evolution of low-mass to intermediate-mass stars. However, the confirmed ∼3800 Galactic PNe fall far short of the numbers expected. Traditional visual searching for resolved PNe is time-consuming because of the large data size and areal coverage of modern astronomical surveys. The training and validation data set of our algorithm was built with the INT Photometric Hα Survey (IPHAS) and true PNe from the Hong Kong/AAO/Strasbourg Hα (HASH) data base. Our algorithm correctly identified 444 PNe in the validation set of 454 PNe, with only 16 explicable ‘false’ positives, achieving a precision rate of 96.5 per cent and a recall rate of 97.8 per cent. After transfer learning, it was then applied to the VST Photometric Hα Survey of the Southern Galactic plane and bulge (VPHAS+), examining 979 out of 2284 survey fields, each covering 1° × 1°. It returned ∼20 000 detections, including 2637 known PNe and other kinds of catalogued non-PNe. A total of 815 new high-quality PNe candidates were found, 31 of which were selected as top-quality targets for optical spectroscopic follow-up. We found that 74 per cent of them are true, likely, or possible PNe. Representative preliminary confirmatory spectroscopy results are presented here to demonstrate the effectiveness of our techniques, with full details to be given in our forthcoming paper.

1 INTRODUCTION

Until the advent of digital imagery, celestial sources were ‘discovered’ by visual examination, first of the sky itself and then of photographic plates. Eventually, such plates were digitized by microdensitometers such as the Automated Plate Measuring machine (APM; Irwin, Maddox & McMahon 1994) and SuperCOSMOS (Hambly et al. 2001), which allowed automatic image detection algorithms to be developed – so-called image analysers (e.g. Stobie 1986). These powerful techniques allowed both image detection and the separation of stars and galaxies across vast areas of the sky when applied to digitized Schmidt telescope plates, which covered entire surveys of the northern and southern skies in various optical passbands such as B, R, and I. Such techniques included automatic tracking of the variable sky background across these survey plates and image deblending. This sophisticated software worked well for regions outside the Galactic plane, but image crowding and extensive areas of Galactic emission from ionized gas become serious issues inside it.

The advent of the first wide-field narrow-band surveys of the Galactic plane in Hα, such as the SuperCOSMOS Hα Survey (SHS; Parker et al. 2005) and the VST Photometric Hα Survey of the Southern Galactic plane and bulge (VPHAS+; Drew et al. 2014) in the south and the INT Photometric Hα Survey (IPHAS; Drew et al. 2005) in the north, enabled the detection, by contrast with the equivalent off-band broad-band red images, of various kinds of emission-line stars and of resolved, isolated, and extended nebulosities. The character and identification of these extended nebulosities in particular, which may comprise H ii regions, Wolf–Rayet shells, supernova remnants, planetary nebulae (PNe), Strömgren zones, the general and extensive ionized interstellar medium (ISM), etc., are varied and complex. This makes automatic detection and classification difficult, and beyond the capability of traditional image analysers of the connected-pixel type. As a result, visual scrutiny of this Hα survey imagery, from both digitized wide-field UK Schmidt Telescope (UKST) tech-pan films for the SHS (Parker et al. 2005) and the CCD imagery from IPHAS and VPHAS+, was resorted to once more in order to discover and identify discrete Galactic emission nebulae, a process that is inevitably subjective, inefficient, and imperfect. Nevertheless, it has resulted in large numbers of new Galactic PNe being discovered in both the southern sky from the SHS (Parker et al. 2006; Miszalski et al. 2008) and the northern sky from IPHAS (Sabin et al. 2014), where amateur scrutiny of the on-line survey data has also played an increasing role (e.g. Le Dû et al. 2022). Another PN candidate discovery method is to select compact Hα-emitting sources photometrically, using the IPHAS (r′ − Hα) versus (r′ − i′) and Two Micron All-Sky Survey (2MASS) (J − H) versus (H − Ks) colour–colour diagrams; this is suitable for compact PNe smaller than 5 arcsec. For example, Viironen et al. (2009a, b) found thousands of candidates with this method. However, in representative follow-up of some dozens of candidates, most (>85 per cent) have been shown to be emission-line stars of various kinds, so this process for finding PNe is extremely inefficient.

In this work, we have investigated the use and power of deep-learning techniques to replace this retrograde visual examination. We have developed bespoke algorithms to automatically process these complex Galactic plane narrow-band Hα images to see if they can be employed to systematically detect discrete emission nebulosities with a particular emphasis on PNe.

This paper is organized as follows. In Section 2, we introduce PNe as an object class and their associated identification problems. In Section 3, we introduce the IPHAS and VPHAS+ Hα survey projects as the data used in this work. In Section 4, we detail the methods used, including algorithm construction and data processing. In Section 5, we demonstrate the excellent detection performance of our new model using different indicators and discuss our new nebula target catalogue. In Section 6, we provide observations and analysis for several newly discovered nebulae as examples and conduct relevant discussions. Finally, in Section 7, we summarize the current work and discuss prospects for the future.

2 PLANETARY NEBULAE AS A CLASS

PNe are the gaseous envelopes ejected by 1–8 |$\rm M_\odot$| stars towards the ends of their lives, ionized by the radiation of their hot central stars (CSPN), which are well on the way to becoming white dwarfs. They are some of the most important and complex astrophysical objects to understand, especially as they probe mass-loss physics and stellar nucleosynthesis products (Iben 1995). Their progenitor stars constitute the majority of low- to intermediate-mass stars and are one of the primary contributors of ejected, enriched material into the ISM. With many strong diagnostic emission lines in their spectra, abundances, densities, temperatures, ionization, and, via photoionization modelling, central star properties can all be determined, as well as more general properties such as Galactic abundance gradients (Maciel, Costa & Uchida 2003; Henry et al. 2010). For a recent, comprehensive PNe review, see Kwitter & Henry (2022). All currently confirmed Galactic PNe are contained in the ‘gold standard’ Hong Kong/AAO/Strasbourg Hα (HASH) data base1 (e.g. Parker, Bojičić & Frew 2016; Parker 2022).

2.1 Problems of identification

After 260 yr of observation and decades of theoretical studies, only ∼3880 PNe have been found in the Galaxy (Parker, Bojičić & Frew 2016). This is much lower than the best theoretical estimates of between ∼6600 and ∼45 000, depending on stellar population synthesis model assumptions (e.g. Moe & De Marco 2006). Better estimates of the actual Galactic PNe population via more complete discovery processes can help rule out different PNe formation schemes and determine whether they form from single stars or have emerged from binary systems (De Marco 2009). Despite their short lifetimes of typically 21 000 ± 5000 yr (Jacob, Schönberner & Steffen 2013) – but see Fragkou et al. (2022), who present strong evidence for a PN that has survived in an open cluster for ∼78 000 ± 25 000 yr – PNe experience drastic changes as they evolve, resulting in diverse properties of morphology, size, surface brightness distribution, spectral characteristics, etc. (for a recent review of the identification issues at play, see Parker 2022). More complete samples of the underlying PN population are crucial to understanding their role but, simultaneously, these broad variations in observed properties also increase the difficulty of discovering them in a fair and unbiased way. Indeed, many PNe are faint and hidden in the dense star fields of the Galactic bulge and plane, which brings significant challenges to their discovery in such environments.

Traditional PNe detection methods are inefficient once size and surface brightness fall below certain limits and the surrounding environment becomes too complicated in terms of star density or extensive unrelated emission, giving rise to severe selection effects. Standard, automated, model-fitting methods also have limitations when applied to targets with complicated shapes and low surface brightness. This has motivated us to develop a novel automated deep learning (DL) algorithm to quickly and reliably find PNe in wide-field CCD imagery. In recent years, DL has demonstrated strong image recognition capabilities and made significant contributions to astronomical survey data processing. For PNe, the first attempt was made by Faundez-Abans, Ormeno & de Oliveira-Abans (1996), who classified PNe by chemical composition with cluster analysis and artificial neural network (ANN) algorithms. Akras, Guzman-Ramirez & Gonçalves (2019) then used a decision tree to separate compact PNe in infrared photometric data. Awang Iskandar et al. (2020) designed a deep transfer learning algorithm for PNe identification and morphological classification, using multiple sets of optical and infrared data; they tested various DL networks, most of which achieved an accuracy rate of only 80 per cent. In this work, we chose Hα survey data for searching because traditional visual investigation, based on these new narrow-band surveys, has accumulated a large number of reliable targets for training. Moreover, the latest Hα survey of the Southern Galactic plane (VPHAS+; Drew et al. 2014) has so far been under-exploited and has the greatest current potential for discovery of PNe across the densest and most crowded parts of the Galactic plane, including the bulge. VPHAS+, undertaken on the VLT Survey Telescope (VST) in Chile, has higher resolution than the SHS and similar resolution to the equivalent IPHAS in the north but, like IPHAS, only samples within 5° either side of the mid-plane in Galactic latitude. VPHAS+ ran for seven seasons until mid-August 2018 and obtained images of 91.6 per cent of its planned footprint; large-scale searching for new nebulosities within it has not really begun. This work focuses on searching for PNe in the VPHAS+ data of dense star fields towards the Galactic bulge with the new, powerful DL methods we have developed.

2.2 Application of deep learning techniques to PNe identification

A new DL technique, known as a transformer model, is gaining popularity for a wide range of applications (Vaswani et al. 2017). This model relies on an ‘attention mechanism’ and consists of multiple layers of self-attention modules and feedforward neural networks. These layers capture information at different scales and levels of abstraction in the image, enabling the model to recognize complex patterns, structures, and relationships within it, and allowing the transformer to process information in a unique way. Traditional deep convolutional neural networks process an image through a small, fixed-size window (the receptive field) and can only see what lies inside that window, which is limiting when they are used to process large images containing targets with complex structures. The transformer, in contrast, is able to process the entire image at once: having no fixed windows, it can relate different parts of the picture to one another, no matter how near or far apart they are. In simple terms, the transformer is a more versatile and powerful tool for processing large images.

Transformers exhibit better detection performance and generalizability. However, a challenge in using a transformer model for object detection is managing model complexity. In Jia et al. (2023b), this technology was first applied to the detection of strong gravitational lenses at the scale of galaxy clusters. In that architecture, however, convolutional models are still retained to limit model complexity by extracting the initial features of the targets through the ‘ResNet’ network (He et al. 2016). This inevitably leads to the loss of some target information and a decrease in small-target detection performance. Compared with the large-scale effects of strong gravitational lenses, Galactic nebulae typically exhibit more diverse shapes and a wider range of scales. The Swin-Transformer (Liu et al. 2021) is an improvement of the transformer that solves the problem of model complexity while retaining its powerful feature extraction capabilities. Therefore, unlike the method employed by Jia et al. (2023b), the model we use is a Swin-Transformer model based on the Mask R-CNN architecture (He et al. 2017). Specifically, we replace the feature extraction network in Mask R-CNN, the ResNet module, with a Swin-Transformer, constructing a new ‘bespoke’ DL object detection method for PNe.

The model we have constructed is a data-driven algorithm, so selecting the appropriate data set is very important. To begin with, we made use of the IPHAS Hα photometric survey of the Northern Galactic plane (Drew et al. 2005; also see below). Through the efforts of Sabin et al. (2014) in particular, a large number of high-quality PNe targets have already been found and confirmed in the IPHAS images. Therefore, we used this PNe catalogue and IPHAS images to construct the data sets for this work. We used 1137 individual images of PNe from IPHAS and the corresponding catalogue to train our model, and a further 454 images to validate it, determining the detection ability of the model by comparing its output with the catalogue data for these images. Our results show that our method could detect 97.8 per cent of all known PNe, while 96.5 per cent of the detected PNe are true PNe according to the HASH data base. This shows that our algorithm can not only independently find almost all the known PNe targets in the survey data but also has a very high accuracy rate, so it can greatly reduce manual candidate vetting and confirmation. Confidence in such an automated process is of great significance when trawling through wide-field large-scale survey imagery.

VPHAS+ is another high-resolution Hα photometric CCD survey of the Southern Galactic disc and bulge (see below). As it has not yet been systematically searched by human eyes or AI techniques, it offers large potential for new discoveries, especially in the dense, rich bulge region. Considering the overall similarity between the IPHAS and VPHAS+ data sets, we directly applied the model trained on the IPHAS data set to search in VPHAS+. We examined 979 of the 2284 1° × 1° fields of VPHAS+ data, amounting to about 4.5 million processed images of 512 × 512 pixel2 each, from which we obtained ∼20 000 detections. We compared these detections with existing catalogues, which yielded 2637 known PNe and other kinds of nebulae. We then inspected the remaining images visually, eliminating bright stars, CCD issues, and large-scale diffuse nebulae. After this, we found 815 new high-quality candidates for follow-up.

We selected 31 of the most promising candidates for confirmatory spectroscopic observations on the South African Astronomical Observatory (SAAO) 1.9-m telescope in Sutherland, South Africa, in 2023 June, using the SpUpNIC spectrograph. These results will be reported in detail in our forthcoming paper (Yushan Li et al., in preparation), but some preliminary, representative, confirmatory, reduced one-dimensional (1D) spectra are shown in Section 6. A more comprehensive search for nebula targets in the VPHAS+ data (including further visual inspection of fields with more offset pixels) is also underway. We believe that the attention-mechanism-based nebula target detection model we have constructed has not only achieved excellent results in nebula target detection, discovering many new targets, but also holds great promise for detecting other astronomical targets in other types of wide-field survey data, such as from the China Space Survey Telescope (CSST)2 and the Large Synoptic Survey Telescope (LSST).3

3 Hα SURVEYS

Hα surveys can be used to search for many types of Hα-emitting sources referred to earlier, including both compact and resolved object types. Such surveys generally include both an Hα exposure and a broad-band red ‘r’ exposure (which includes Hα in its bandpass). The Hα line is isolated with a narrow-band optical interference filter (typically 75–100 Å wide) centred around 6563 Å; it includes the Hα emission line and the adjacent [N ii] emission lines. These lines, in various ratios, are characteristic of PNe and many other types of emission object. Because the r-filter is broad, it can be effectively used as an off-band comparison. During observation, the strategy is to adjust the exposure times of both bands so that their limiting magnitudes are approximately equal. By comparing the two bands, PNe with strong emission lines near Hα (including [N ii]) can be detected by the algorithm. Subsequently, other targets such as emission-line stars and large-scale diffuse nebulae can be excluded by visual inspection, based on whether the target is discernible, its shape, colour, and other features.

In this work, we used IPHAS and VPHAS+ as our primary data sets. These are two important CCD-based Hα surveys covering the Northern and Southern Galactic planes, respectively. Their basic properties are listed in Table 1. Fig. 1 shows the known PNe Abell 47, PHR J1843−0232, and PHR J1843+0002 (HASH IDs 352, 2469, and 2478), taken from an area of overlap between these two CCD surveys, together with imagery from the earlier photographic SHS Hα survey, to demonstrate the differences in resolution and quality of the three surveys. All surveys have similar sensitivity to Hα emission, but morphological detail is more evident in IPHAS and VPHAS+.

Figure 1. The known PNe Abell 47, PHR J1843−0232, and PHR J1843+0002 (HASH IDs 352, 2469, and 2478) in the SHS, IPHAS, and VPHAS+, respectively. The sensitivity of all three surveys to Hα is similar, while the resolution, and thus morphological detail, is superior for IPHAS and VPHAS+. The scale of these images is 60 arcsec on each side.

Table 1.

Properties of Hα surveys VPHAS+, IPHAS, and SHS.

| Property | VPHAS+ | IPHAS | SHS |
| --- | --- | --- | --- |
| Hemisphere | Southern | Northern | Southern |
| Duration | 2011.12–2018.08 (91.6 per cent complete) | 2003.08–2008 | 1997.07–2003 |
| Coverage | b within ±5° (210° < l < 40°); b within ±10° (350° < l < 10°) | b within ±5° (29° < l < 215°) | b within ±10°–13°; Dec. < +2° |
| Area (deg2) | ∼2000 | 1800 | 4000 |
| Depth (mag) | ≳20 (all bands), 22 (g) | 21 (r), 20 (i), 20 (Hα) | ∼20.5 |
| Telescope | VST | INT | UKST |
| Aperture (m) | 2.6 | 2.5 | 1.2 |
| Filters | Hα, u, g, r, i | Hα, r, i | Hα, short-r |
| Median seeing (arcsec) | 1.0; ∼0.8–0.9 (dense fields) | 1.2 | 1.0–2.0 |
| Camera | OmegaCAM | Wide-field camera | SuperCOSMOS (digitized film) |
| CCDs | 32 | 4 | 1 |
| Pixels per CCD | 2048 × 4100 | 2048 × 4100 | 2048 × 2048 |
| Resolution (arcsec pixel−1) | 0.21 | 0.33 | 0.67 |
| Field of view (deg2) | 1 | 0.3 | 60 |

3.1 SuperCOSMOS Hα survey

The SHS (Parker et al. 2005) used the 1.2-m UKST at Siding Spring Observatory, Australia, to undertake the first modern Hα survey of the Southern Galactic plane between 1998 and 2003. The survey covered over 4000 deg2 to ±10° in Galactic latitude at 1–2 arcsec resolution and to 5 Rayleigh sensitivity. It was the last great photographic survey ever undertaken. It used fine-grained Kodak Technical Pan film as a detector (Parker & Malin 1999), which has a useful sensitivity peak at Hα and was push-processed and hypersensitized in a way that provided 10 per cent DQE (unprecedented for a photographic astronomical survey). It was undertaken with the world’s largest single-element optical interference filter (Parker & Bland-Hawthorn 1998). All the films were digitized using the SuperCOSMOS microdensitometer (Hambly et al. 2001) and have been carefully calibrated on to a Rayleigh scale (Frew et al. 2014). Even today, in terms of depth, resolution, and coverage, it remains competitive (see Fig. 1). In this work, we only use it as a base reference for comparison with VPHAS+ and, in the overlap region, IPHAS.

3.2 IPHAS

The IPHAS used the 2.5-m Isaac Newton Telescope (INT) on the island of La Palma, Canary Islands (Drew et al. 2005). IPHAS covered ∼1800 deg2, sampling Northern Galactic latitudes |b| < 5°. It used a wide-field camera with a field of view of 0.3 deg2 covered by four CCDs of 2048 × 4100 pixel2, with a pixel scale of 0.33 arcsec pixel−1. The median seeing for survey-standard data is approximately 1.1 arcsec. The survey limiting magnitude can exceed r′ ≈ 20 mag (10σ). The IPHAS data include observations with an Hα narrow-band filter and the r′ and i′ bands of the Sloan Digital Sky Survey (SDSS). The Hα band has a central wavelength of 6568 Å and a bandwidth of 95 Å. The strategy of multiple repeat observations with slightly offset pointing field centres ensures that there are no omissions at the edges of the field of view and allows for better control of observational quality. The survey aimed to potentially increase the number of known emission-line targets in the Northern hemisphere by an order of magnitude.

3.3 VPHAS+

The VPHAS+ (Drew et al. 2014) is very similar to IPHAS, but covers the Southern Galactic plane and bulge and has a larger field of view and slightly higher resolution. The VPHAS+ project uses the 2.6-m aperture VLT Survey Telescope (VST), located at Cerro Paranal, Chile. The OmegaCAM camera has 32 CCDs of 2048 × 4100 pixel2, providing a field of view of 1 deg2 and a pixel scale of 0.21 arcsec pixel−1. The survey planned to observe approximately 2000 deg2 of sky within ±5° of the Galactic plane, with an extension to ±10° near the bulge. The survey is divided into 2284 fields of 1° × 1°, and 91.6 per cent of the goal has been completed. In addition to the Hα narrow band, the survey also includes the u, g, r, and i broad-band filters of the SDSS. The survey aims to reach a limiting magnitude of at least 20 mag (5σ) in each band, with a typical seeing of approximately 1.0 arcsec, and approximately 0.8–0.9 arcsec in selected dense star fields, such as areas with less extinction in the Galactic bulge. For the observation strategy, each field is observed at three slightly different central positions to achieve full coverage and greater depth.

4 METHOD

In this section, we provide a detailed description of the methods and strategies we have adopted, based on a new DL technique, to search for PNe candidates in narrow-band wide-field Hα survey imagery. In Section 4.1, we briefly introduce the model, and in Sections 4.2 and 4.3, we describe the construction of the data sets and the different data processing applied to the IPHAS and VPHAS+ data.

4.1 Model

The overall structure of the model is shown in Fig. 2. When an image is input to the model, it first undergoes feature extraction through a feature extraction network, which uses so-called pyramid technology (Lin et al. 2017) to output multiscale feature maps at different levels. These multiscale features are first processed by a region proposal network (RPN) module (e.g. Ren et al. 2015), which outputs a series of bounding boxes that may contain a target of interest. The ROI align module (He et al. 2017) is then used to extract the corresponding parts of the feature maps for these bounding boxes. Finally, these extracted feature maps that may contain a target are sent to subsequent classification, position regression, and mask segmentation modules for accurate classification and precise positioning, producing the final prediction results.

Figure 2. The structure of the Swin-Transformer model used in this paper.

The model structure we used is similar to Mask R-CNN (He et al. 2017), which simultaneously predicts object classes, bounding boxes, and pixel-wise masks, making it a two-stage object detection architecture that typically has better detection accuracy than one-stage object detection networks such as YOLO (Redmon et al. 2015).
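To make the two-stage layout concrete, the sketch below assembles a Mask R-CNN from its parts with torchvision. It uses a small stand-in CNN backbone purely for illustration (the model in this work replaces this component with a Swin-Transformer feature pyramid), and the class count, anchor sizes, and pooling settings are illustrative assumptions rather than the paper's exact configuration.

```python
import torchvision
from torchvision.models.detection import MaskRCNN
from torchvision.models.detection.anchor_utils import AnchorGenerator
from torchvision.ops import MultiScaleRoIAlign

# Stand-in backbone for illustration only; this work swaps in a Swin-Transformer.
backbone = torchvision.models.mobilenet_v2(weights="DEFAULT").features
backbone.out_channels = 1280  # MaskRCNN must know the backbone's feature depth

# Stage 1: the RPN proposes boxes that may contain a target of interest
anchor_gen = AnchorGenerator(sizes=((32, 64, 128, 256),),
                             aspect_ratios=((0.5, 1.0, 2.0),))

# ROI align crops a fixed-size feature patch for every proposed box
box_pool = MultiScaleRoIAlign(featmap_names=["0"], output_size=7, sampling_ratio=2)
mask_pool = MultiScaleRoIAlign(featmap_names=["0"], output_size=14, sampling_ratio=2)

# Stage 2: classification, box regression, and per-pixel mask heads
model = MaskRCNN(backbone,
                 num_classes=2,  # background + PN (assumed)
                 rpn_anchor_generator=anchor_gen,
                 box_roi_pool=box_pool,
                 mask_roi_pool=mask_pool)
```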

In the feature extraction module, compared to convolutional neural networks (CNNs), transformer-based neural networks have no limitation on receptive fields and can model the relationships between any input features across any scale and distance, at the cost of larger GPU memory and longer training or inference time. They therefore have more powerful feature extraction capabilities and are better suited to extended astronomical objects with complex structures hidden behind dust and/or in dense star fields, such as PNe. We consequently chose the ‘transformer’ as the feature extraction module, whose core is the attention mechanism, on which feature extraction also depends.

The calculation process is shown in Fig. 3. When a feature vector is processed, it is first projected on to different directions and positions in the feature space to produce three vectors: Query, Key, and Value (similar in spirit to the hidden state in a recurrent neural network). Query, Key, and Value contain different distribution information of the feature vector in the feature space and are used to calculate the relationship between the current feature vector and the others. For each feature vector, its Query is dot-multiplied with the Keys of all feature vectors to obtain the correlation between the current feature vector and all the others, yielding an attention map in which the value of each element represents the correlation between the corresponding pair of features. The higher the correlation, the closer the relationship between the two features, and the higher the model’s attention. This calculation rests on something akin to an ‘intuitive feeling’; that is, if two feature vectors are more similar, they should be given a greater weight. The attention map can be regarded as an attention distribution and is normalized through a so-called softmax layer, which uses the softmax function to convert the raw correlation scores into a normalized weight distribution (Rumelhart, Hinton & Williams 1986). The weight map output by the softmax layer is then dot-multiplied with the Values of all the corresponding feature vectors to obtain the final output. Through this structure, the attention layer naturally captures the connections between features, regardless of their distance, breaking through the limitation of the receptive field of convolutional kernels in CNNs.

Figure 3. The procedure for the calculation of attention by the transformer.
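The following minimal sketch reproduces this calculation for a single attention head in PyTorch; the scaling by the square root of the feature dimension is the standard normalization applied before the softmax and is an addition not spelled out above.

```python
import torch
import torch.nn.functional as F

def self_attention(x, w_q, w_k, w_v):
    """Single-head self-attention over a sequence x of N feature vectors.

    x: (N, d) features; w_q, w_k, w_v: (d, d) learned projections.
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v              # Query, Key, Value
    d = q.shape[-1]
    # Attention map: dot product of every Query with every Key
    attn = (q @ k.transpose(-2, -1)) / d ** 0.5      # (N, N) correlations
    weights = F.softmax(attn, dim=-1)                # normalize to weights
    return weights @ v                               # weighted sum of Values

# Toy usage: 16 feature vectors of dimension 32
x = torch.randn(16, 32)
w_q, w_k, w_v = (torch.randn(32, 32) for _ in range(3))
out = self_attention(x, w_q, w_k, w_v)               # (16, 32)
```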

However, the calculation of attention in the transformer is a dense operation involving interactions between every pair of inputs, which means that the complexity of the transformer model scales with the square of the input image size. Because astronomical images can be large, this is computationally unacceptable. Therefore, we improved on the transformer structure by using the Swin-Transformer, which retains the powerful feature extraction capabilities of the transformer while effectively solving the problem of high model complexity (Jia et al. 2023a). Its core comprises two parts, shifted windows and multilevel features, as shown in Fig. 4.

Figure 4. This figure demonstrates the multilevel features and how the shifted local window captures various image components for target detection purposes.

First, we elaborate on the design of the shifted windows that we employ. In Fig. 4, we can see that, unlike the attention calculation of the transformer structure, which directly interacts between all pixels of the input image, the Swin-Transformer first divides the input image into numerous small windows and then performs the attention calculation between different patches individually within each small window. This greatly reduces the complexity of the model, from the transformer’s O(N²) down to linear complexity. At the same time, if only window partitioning were performed, there would be a lack of information exchange between different windows, making it difficult to model larger-scale features that span several windows. The shifted window method therefore refines this, as shown in Fig. 4. The regular windows partitioned at layer l are shifted on passing to the next layer l+1, generating new windows within which the attention calculation is performed. As a result, information exchange between the windows of layer l is achieved within the new windows of layer l+1, helping the model to better extract features; a minimal sketch of this partitioning and shifting is given below.
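The sketch implements the window partitioning and the cyclic shift (via torch.roll) described above; the map size, channel count, and window size are toy values chosen for illustration.

```python
import torch

def window_partition(x, w):
    """Split an (H, W, C) feature map into non-overlapping w x w windows."""
    H, W, C = x.shape
    x = x.reshape(H // w, w, W // w, w, C)
    # -> (num_windows, w*w, C): attention is computed inside each window only
    return x.permute(0, 2, 1, 3, 4).reshape(-1, w * w, C)

H = W = 8; C = 4; w = 4
x = torch.randn(H, W, C)

# Layer l: regular partition -> 4 windows of 4x4 tokens each
windows_l = window_partition(x, w)            # (4, 16, 4)

# Layer l+1: cyclically shift the map by w//2 before partitioning, so the
# new windows straddle the old window borders and exchange information
x_shifted = torch.roll(x, shifts=(-w // 2, -w // 2), dims=(0, 1))
windows_l1 = window_partition(x_shifted, w)   # (4, 16, 4)
```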

Secondly, we briefly describe the multilevel features of our model. From Fig. 4, we can see that the Swin-Transformer performs feature extraction on the input image at different levels. The features at different levels are stacked together to form a feature pyramid that is input to the subsequent modules. This helps the model obtain more comprehensive multiscale feature information and achieve more accurate detection results (Lin et al. 2017).

The overall structure of the Swin-Transformer feature extraction module is shown in Fig. 5. Here, when performing feature extraction on the input image, the image is first divided into patches and windows using so-called ‘Patch Partition’, and then transformed into a sequence form that the transformer can accept using Linear Embedding (see Chen & Liu 2011). The input sequence undergoes several consecutive Swin-Transformer blocks to extract features, and undergoes down-sampling via Patch Merging to increase the receptive field of the model and extract multiscale features at different levels. The internal structure of two consecutive Swin-Transformer blocks is shown in Fig. 5. In the first Swin-Transformer block, attention calculation is performed within the regularly partitioned windows, while in the second Swin-Transformer block, attention calculation is performed within the shifted windows.
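The sketch below shows how Patch Partition with Linear Embedding and Patch Merging can be written in PyTorch, following the published Swin-Transformer design; the patch size of 4 and embedding dimension of 96 are the defaults of that design, assumed here for illustration.

```python
import torch
import torch.nn as nn

class PatchEmbed(nn.Module):
    """Patch Partition + Linear Embedding: a strided convolution cuts the
    image into 4x4 patches and projects each one to a C-dimensional token."""
    def __init__(self, in_ch=3, embed_dim=96, patch=4):
        super().__init__()
        self.proj = nn.Conv2d(in_ch, embed_dim, kernel_size=patch, stride=patch)

    def forward(self, img):                       # (B, 3, H, W)
        x = self.proj(img)                        # (B, C, H/4, W/4)
        return x.flatten(2).transpose(1, 2)       # (B, L, C) token sequence

class PatchMerging(nn.Module):
    """Down-sampling between stages: concatenate each 2x2 group of
    neighbouring tokens (4C channels) and project down to 2C."""
    def __init__(self, dim):
        super().__init__()
        self.reduction = nn.Linear(4 * dim, 2 * dim, bias=False)

    def forward(self, x, H, W):                   # x: (B, H*W, C)
        B, L, C = x.shape
        x = x.reshape(B, H, W, C)
        x = torch.cat([x[:, 0::2, 0::2], x[:, 1::2, 0::2],
                       x[:, 0::2, 1::2], x[:, 1::2, 1::2]], dim=-1)
        return self.reduction(x.reshape(B, (H // 2) * (W // 2), 4 * C))

# Toy usage on a 512 x 512 cutout, as used for the survey images
tokens = PatchEmbed()(torch.randn(1, 3, 512, 512))  # (1, 128*128, 96)
merged = PatchMerging(96)(tokens, 128, 128)         # (1, 64*64, 192)
```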

Figure 5. The upper panel displays the feature extraction module, where the input features are processed through the ‘W-MSA’ and ‘SW-MSA’ components to generate the output features. The complete structure of the Swin-Transformer feature extraction module is presented in the lower panel.

In addition, it is worth noting that because attention calculation itself ignores the positional relationship between features, position encoding is needed before inputting features into the transformer. Here, we use a technique called relative position encoding (e.g. Wu et al. 2021). Moreover, the loss function used in our model is consistent with Mask R-CNN (He et al. 2017).

4.2 Data set

As the detection model we build is a data-driven algorithm, we need to construct appropriate data sets to train and validate its performance. Because of the stable and essentially uniform quality of the IPHAS data (e.g. Greimel et al. 2021) and the systematic visual search already undertaken to find PNe, a large number of high-quality PNe have already been found for this purpose (e.g. Sabin et al. 2014). Therefore, using IPHAS images and catalogue data, we can readily construct a data set and corresponding labels for both model training and model evaluation. We processed the IPHAS data to better adapt it to our model. First, as the original IPHAS images are 2048 × 4100 pixel2 with a pixel scale of 0.33 arcsec pixel−1, and considering the model complexity and hardware limitations, we cut these images into overlapping small images of 512 × 512 pixel2. Secondly, we used a series of different greyscale transformation techniques for image enhancement to improve model performance, as shown in Fig. 7. The image contrast is greatly improved after the greyscale transformation, making it easier to identify candidate PNe; this greatly helps accelerate model convergence and improve detection accuracy. After several tests, we selected the Zscale greyscale transformation (Smithsonian Astrophysical Observatory 2000) as the most useful.

Figure 6. This figure illustrates the complete pre-processing procedure for constructing the data set. As depicted, the pre-processing stage encompasses three key components: image cropping for subsequent data processing, greyscale transformation to enhance the signal-to-noise ratio, and the integration of multiband image data to create a merged, coloured image.

Most PNe are obvious in uncrowded, uncomplicated regions of the narrow-band Hα survey observations. In IPHAS, the exposure time of the Hα band was adjusted to achieve consistency of image depth with the accompanying broad-band ‘r’ image. Relative to the r-band, the signal of emission objects is therefore enhanced in the Hα band, while the signals of other, continuum astronomical sources do not change significantly. Therefore, to highlight the image features of the PNe, we combine the images of these two bands into a multichannel image. In this process, because of certain pixel offsets between the images of the two bands, we use Reproject4 to align the two bands to the pixel-level precision needed. Finally, we put the Hα-band image data into the blue channel and the r-band image data into the red and green channels, and merge them into a PNG image. Although some information may be lost, the greyscale distribution of the image is more uniform, which is beneficial to the convergence of the model.
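A sketch of this pre-processing chain, assuming astropy, reproject, and matplotlib are available, is shown below; the file names and the FITS extension holding the CCD data are hypothetical.

```python
import numpy as np
from astropy.io import fits
from astropy.visualization import ZScaleInterval
from matplotlib.image import imsave
from reproject import reproject_interp

# Hypothetical file names for one field; extension 1 assumed to hold the data
ha_hdu = fits.open("field_ha.fits")[1]
r_hdu = fits.open("field_r.fits")[1]

# Align the r-band image on to the Hα pixel grid to remove the pixel offsets
r_aligned, _ = reproject_interp(r_hdu, ha_hdu.header)

# Zscale greyscale transformation, then Hα -> blue channel, r -> red and green
zscale = ZScaleInterval()
ha = zscale(ha_hdu.data)
r = zscale(np.nan_to_num(r_aligned))
rgb = np.dstack([r, r, ha])              # (H, W, 3), values in [0, 1]
imsave("field.png", rgb, origin="lower")
```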

The whole process used to construct the data set is shown in Fig. 6. PNe exhibit a wide variety of angular sizes in the survey imagery, from a few arcsec up to arcmin scales. To allow the model to locate the target PNe more accurately, we use the maximum diameter of each nebula as the side length of the bounding box in its label. With this method, we can better constrain the model’s localization of the PNe, thereby enabling the model to identify and analyse PNe of different sizes more accurately. After the above steps, we built a training set and a validation set using IPHAS data. The IPHAS training set consists of 1137 cropped images of 512 × 512 pixel2 (107 × 107 arcsec2) and the validation set contains 454 cropped images of the same size, both containing known true PNe.
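The label construction can be sketched as below, converting a catalogued position and maximum diameter into a square pixel bounding box; the function name and catalogue inputs are hypothetical, and the IPHAS plate scale of 0.33 arcsec pixel−1 is taken from Section 3.2.

```python
from astropy.coordinates import SkyCoord
from astropy.wcs import WCS

def pn_bbox(header, ra_deg, dec_deg, diam_arcsec, scale=0.33):
    """Square bounding box (x0, y0, x1, y1), in pixels, whose side length
    equals the catalogued maximum diameter of the nebula."""
    wcs = WCS(header)
    x, y = wcs.world_to_pixel(SkyCoord(ra_deg, dec_deg, unit="deg"))
    half = 0.5 * diam_arcsec / scale   # half of the side length, in pixels
    return float(x - half), float(y - half), float(x + half), float(y + half)
```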

Figure 7. This figure displays images subjected to various greyscale transformations. The top-left panel features the original image, while the top-right panel showcases the image after sinh transformation. The bottom-left panel shows the image after log transformation, and the bottom-right panel shows the image after Zscale transformation. As demonstrated, the Zscale transformation effectively enhances the signal-to-noise ratio of target objects.

4.3 The VPHAS+ data set

Our key target Hα survey for PNe candidate identification using the above techniques is the newer, Southern VPHAS+ Galactic plane survey (Drew et al. 2014). For the available VPHAS+ data set, we performed the same cropping and greyscale transformation as for IPHAS and converted the results into PNG images in the same way. However, there is a difference in image alignment, because the image data of the two bands in VPHAS+ are not paired one-to-one as in IPHAS, so the corresponding images of the two bands cannot be found directly. Therefore, we first selected the FITS images of the Hα band and then used these to match the corresponding r-band images with pixel offsets within 10 pixels. These matched data were combined into multichannel images to obtain the final PNG images. We processed ∼90 000 VPHAS+ images covering 979 of the 2284 1-deg2 fields of the VPHAS+ survey, as shown in Fig. 8. As a result, we obtained ∼4.5 million PNG images of size 512 × 512 pixel2 (equivalent to 107 × 107 arcsec2) for the model’s inference prediction and the search for new PNe targets.

Figure 8. The distribution of the 979 out of 2284 VPHAS+ fields that were searched, projected in J2000 equatorial coordinates of RA and Dec.

5 RESULTS

The model was implemented in the PyTorch framework (Paszke et al. 2019) and optimized with AdamW, a stochastic gradient descent method based on adaptive estimation of first-order and second-order moments (Loshchilov & Hutter 2017). To ensure faster and better convergence, we initialized the model with pre-trained weights from Liu et al. (2021). We deployed the model on a workstation equipped with Nvidia RTX 3090 Ti GPU cards for training, validation, and candidate searching.
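A minimal sketch of this training set-up follows; the learning rate, weight decay, scheduler, checkpoint file name, and data loader are illustrative assumptions, and `model` is the detector assembled in Section 4.1.

```python
import torch

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=0.05)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=100)

# Initialize from published Swin-Transformer weights; strict=False skips
# heads whose shapes differ from the pre-trained classification model
state = torch.load("swin_pretrained.pth", map_location="cpu")
model.load_state_dict(state, strict=False)

for epoch in range(100):                    # 100 epochs, as in Section 5.2
    model.train()
    for images, targets in train_loader:    # torchvision-style (image, target) pairs
        loss_dict = model(images, targets)  # Mask R-CNN returns a dict of losses
        loss = sum(loss_dict.values())
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    scheduler.step()
```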

5.1 Performance evaluation criterion

We used the training and validation sets constructed from the IPHAS observation data in Section 4.2 to train and validate the model for use with the VPHAS+ data. To better evaluate the performance of the model, we adopted the following performance metrics.

The IOU (Intersection over Union – a metric used to evaluate DL algorithms by estimating how well a predicted mask or bounding box matches the ground-truth data) is an important performance evaluation criterion in object detection. It is defined as the ratio between the overlap area and the union area of the detected bounding box $B_{\rm det}$ and the ground-truth (label) bounding box $B_{\rm gt}$, as in equation (1):

$$\mathrm{IOU} = \frac{\mathrm{Area}(B_{\rm det} \cap B_{\rm gt})}{\mathrm{Area}(B_{\rm det} \cup B_{\rm gt})}. \tag{1}$$

The IOU quantifies the overlap between the detection bounding box and the real target bounding box, ranging from 0 to 1. In general, when the IOU is larger than a set threshold, the detection is considered to have correctly detected the target.
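For concreteness, a plain-Python version of equation (1) for axis-aligned boxes follows; the corner convention (x0, y0, x1, y1) is an assumption.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x0, y0, x1, y1)."""
    ix0, iy0 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix1, iy1 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)   # overlap area
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)            # union in denominator

# A detection shifted by a quarter of the box width scores IOU = 0.6
print(iou((0, 0, 100, 100), (25, 0, 125, 100)))  # 7500 / 12500 = 0.6
```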

According to the IOU evaluation criterion, we counted all the predicted results and further quantitatively measured the overall detection performance of the model using the precision rate and the recall rate. The precision rate is the percentage of true positive detection results among all detection results, and the recall rate is the percentage of true positive detection results among all real targets. They are defined by

$$\mathrm{Precision} = \frac{\rm TP}{\rm TP + FP}, \qquad \mathrm{Recall} = \frac{\rm TP}{\rm TP + FN}, \tag{2}$$

where TP is the number of targets correctly detected, FP is the number of negative samples incorrectly labelled as positive, and FN is the number of true targets that have not been detected. A higher precision rate indicates a higher accuracy of the model. A higher recall rate indicates that the model has a strong ability to detect all real targets.

The precision and recall obtained at different IOU thresholds generally differ. By calculating the precision and recall rates under different IOU thresholds, the precision–recall (PR) curve can be drawn, in which the horizontal coordinate of a point is the recall rate and the vertical coordinate is the precision rate. The area below the curve is the average precision (AP): the larger the area, the higher the AP value and the better the model performance.
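A short sketch of these metrics follows, checked against the validation counts reported in Section 5.2; the trapezoidal AP integration is a simple stand-in for the interpolated definitions used by standard detection benchmarks.

```python
import numpy as np

def precision_recall(tp, fp, fn):
    """Precision and recall from detection counts, as in equation (2)."""
    return tp / (tp + fp), tp / (tp + fn)

# Validation-set counts from Section 5.2: TP = 444, FP = 16, FN = 10
p, r = precision_recall(444, 16, 10)
print(f"precision = {p:.3f}, recall = {r:.3f}")    # 0.965, 0.978

def average_precision(precisions, recalls):
    """Area under a sampled PR curve via the trapezoidal rule."""
    order = np.argsort(recalls)
    p_arr, r_arr = np.asarray(precisions)[order], np.asarray(recalls)[order]
    return float(np.trapz(p_arr, r_arr))
```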

5.2 Results from IPHAS

The model was trained for 100 epochs, which took 10 h of computation on our server with one RTX 3090 Ti GPU. The training process is shown in Fig. 9, from which it can be seen that our model converges after around 40 epochs. The performance of the model on the validation set under varying IOU thresholds is shown in Fig. 10: our model achieves an AP of 0.944 when the IOU is set to 0.5, which we adopt as the IOU threshold. We further compiled statistics on the prediction results at this threshold, shown as the confusion matrix in Fig. 11. As can be seen, of the 454 known targets in the IPHAS validation set, we successfully identified 444, with 16 false positives. This achieved a recall rate of 97.8 per cent and a precision rate of 96.5 per cent.

Figure 9. The training loss decreases as training proceeds. As depicted, the neural network converges after approximately 40 epochs, signalling that training could be stopped after this point.

Figure 10. The model’s performance on the validation set assessed with varying IOU thresholds. When we set the IOU to 0.5, the AP reaches approximately 0.944. This suggests that, in our detection algorithm, we have the option to prioritize higher detection accuracy and recall rate at the expense of location accuracy. However, employing a lower IOU value comes with the trade-off of requiring more human vetting near the positions identified by the detection algorithm.

Figure 11. The confusion matrix for the model on the validation set. It is evident that our algorithm demonstrates high accuracy, correctly detecting the majority of targets with minimal occurrences of false positives and false negatives.

We visualized the detection results of the model, as shown in Fig. 12. The blue boxes mark the known positions of the PNe, the red boxes mark the detected positions, and the numbers in the labels indicate the confidence of the detections. The model not only performs extremely well in detecting PNe of different scales and morphologies but is also able to locate them in complex environments, such as dense star fields, regions of contrasting extended emission, or near bright stars. Furthermore, our techniques are effectively immune to instrumental and observational effects, such as CCD issues and variations in seeing.

Figure 12. Part of the detection results for the IPHAS validation set. As shown, our detection algorithm accurately identifies the targets and correctly determines their positions.

5.3 False negatives and positives from the IPHAS validation data

As mentioned, of the 454 known PNe in the IPHAS validation set, we successfully identified 444; hence, 10 (2.2 per cent) were not recovered by our DL technique. On examination, these were largely attributable to the variable quality of some of the IPHAS imagery, rendering the object fainter and harder to find. In some cases, larger diameter PNe were broken into smaller segments, leading to a false negative for the object itself but several false positives. We had 16 false positives in total. These can also arise from diffuse scattered-light effects around bright stars that can mimic emission, as the broad- and narrow-band filters do not produce the same scattered-light distribution. Others arise because the algorithm mistakenly identifies other types of diffuse radiation as PNe. Finally, some late-type stars have very strong molecular bands that fall in the Hα bandpass but are much weaker in the broad-band red filter to the blue side of the molecular band; both filters cut off further to the red, so the even stronger molecular bands in this region from these types of star are not sampled. This can give the effect of an apparently very strong Hα excess compared with the red broad-band data, and so a false positive.

5.4 Results from VPHAS+

We used the trained model to scan 90 000 VPHAS+ observational data samples covering an area of 979 deg2 (43 per cent of the total survey footprint), comprising approximately 4.5 million PNG images processed as described in Section 4.3. Scanning one image of 512 × 512 pixel2 takes 0.05 s. We visualized the output results, as shown in Fig. 13. It can be seen that the model achieves accurate detection for the majority of nebulae; even emission regions that are obstructed or affected by nearby stars are effectively revealed by our method. The model returned ∼20 000 detections. We then compared the coordinates of these detections with the existing object catalogues from the HASH data base, with the criterion that the differences in image centroids in RA and Dec. are both smaller than 20 arcsec, separately for true PNe, likely and possible PNe, and other kinds of known nebulae and emission-line objects. After that, we inspected the images visually for diffraction spikes from bright stars, CCD issues (such as bad pixels, defects, etc.), confusion with normal stars, very diffuse emission, moving objects (between the epochs of the different exposures), and very faint detections, as well as PNe candidates. After this filtering and screening, we eventually identified 3452 candidate objects (17.3 per cent), comprising 2637 known nebula targets and 815 newly discovered high-quality candidates. Note that because the algorithm’s detections can only be verified against the known catalogues down to a level that is difficult for the human eye to discern, it is highly possible that some genuine faint targets remain hidden in the ‘very faint detections’ category.
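The cross-match against HASH can be sketched with astropy as below; for simplicity this uses a single angular-separation cut of 20 arcsec rather than the per-axis RA/Dec. criterion described above, and the function name and inputs are hypothetical.

```python
import astropy.units as u
from astropy.coordinates import SkyCoord

def match_to_hash(det_ra, det_dec, hash_ra, hash_dec, radius_arcsec=20.0):
    """Flag detections lying within radius_arcsec of a HASH catalogue entry.

    Coordinate arrays are in degrees; returns a boolean 'known' mask and
    the index of the nearest catalogue source for each detection.
    """
    det = SkyCoord(det_ra, det_dec, unit="deg")
    cat = SkyCoord(hash_ra, hash_dec, unit="deg")
    idx, sep, _ = det.match_to_catalog_sky(cat)   # nearest-neighbour match
    return sep < radius_arcsec * u.arcsec, idx
```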

Figure 13. Example PNe candidate discovery images from application of our DL process to the VPHAS+ data, together with their 1D SAAO 1.9-m SpUpNIC confirmatory spectra, classified from top to bottom as ‘true’, ‘likely’, ‘true’, and ‘true’ PNe, respectively, following standard HASH processes. The PNe candidate is located within the red box in each image and has a blueish tinge due to the way the Hα and red bands are combined. The common scale of the VPHAS+ images is 107 arcsec on each side. These form part of the larger sample spectroscopically confirmed and detailed in Yushan Li et al. (in preparation). The bottom three of the four spectra have [N ii] emission lines stronger than Hα at 6563 Å (a good diagnostic that eliminates H ii region contaminants) but also weak or absent [O iii] in the blue, due to extinction. The newly discovered targets tend to be relatively small and are frequently obscured or partially obscured by bright sources or located in dense stellar regions.

6 SAMPLE ANALYSIS AND PRELIMINARY SPECTROSCOPIC RESULTS

We conducted further screening of the model’s newly discovered high-quality PNe candidates through visual inspection of their morphology, double-checking against SHS images. We selected the 31 most obvious candidates for follow-up spectroscopic confirmation on the SAAO 1.9-m telescope in 2023 June.

Detailed results and analysis from this first spectroscopic observing programme will be given in Yushan Li et al. (in preparation), but we present here the VPHAS+ images and reduced 1D spectra of four of the newly discovered, spectroscopically confirmed PNe in Fig. 13. In Table 2, we also present the summary statistics from our preliminary spectroscopy. We find that 74 per cent of our candidates are confirmed PNe – either true (16), likely (3), or possible (4), following standard HASH prescriptions – and that 90 per cent have emission lines. Only four of the 31 candidates (∼12.9 per cent) are late-type star contaminants. These are extremely encouraging results for the scientific potential of this work.

Table 2.

Summary statistics from our preliminary spectroscopy.

Total SAAO 1.9-m spectroscopically observed targets: 31

| Classification | True | Likely | Possible |
| --- | --- | --- | --- |
| Planetary nebula | 16 | 3 | 4 |
| Supernova remnant | 0 | 0 | 1 |
| Emission-line star | 2 | 0 | 0 |
| Late-type star | 4 | 0 | 0 |
| Emission in star cluster | 1 | 0 | 0 |

The discovery images and 1D spectra of four detected PN candidates are shown in Fig. 13 as an illustrative example of the success of our technique. The 107 × 107 arcsec2 VPHAS+ image of each candidate is shown on the left, with the candidate located within the red box and showing a blueish tinge due to the way the Hα and red bands are combined. The corresponding SAAO 1.9-m 1D spectrum is shown on the right, covering the wavelength range from ∼4000 to 9500 Å. The standard PN emission lines of Hα, [N ii], and [S ii] are very clear in the red, in the middle of the spectral plots. The bottom three of the four spectra have [N ii] emission lines stronger than Hα at 6563 Å (a good diagnostic that eliminates H ii region contaminants) but weak or absent [O iii] in the blue, due to extinction. These spectra confirm the PN nature of the candidates and verify the power of our technique to find new PNe.

Furthermore, we visualize the attention distributions of these targets in the model, revealing the important features extracted by the model, as shown in Fig. 14. It can be seen that around certain celestial objects, including PNe, the attention distribution differs significantly from the surrounding background. Because of the attention mechanism, our model already achieves preliminary object filtering at the feature extraction stage, and this filtering operates at the pixel level, which far exceeds the localization requirements of object detection and amounts to a rough object segmentation. This helps deepen our understanding of PNe as a diverse object class in the data.

Figure 14. The distribution of attention in the model for the four newly discovered targets corresponding to those in Fig. 13. It can be seen that, in the feature extraction stage, the model has already achieved preliminary object detection, filtering out most irrelevant targets while retaining nebula targets and a few interference targets with similar features, which are removed by the subsequent classification and regression modules.

7 CONCLUSIONS AND PROSPECTS

In this paper, we used the Swin-Transformer object detection model to search for PNe in Hα surveys of the inner Galactic plane. Compared with traditional CNN object detection models, this attention-based model is not restricted by a fixed receptive field and can model features over any spatial range, giving it better generality and detection capability for objects of different spatial scales. Meanwhile, compared with an ordinary transformer, the Swin-Transformer greatly reduces model complexity and computational cost, making it better suited to this application.
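The complexity argument can be made concrete with the cost formulae given by Liu et al. (2021): for an h × w feature map with C channels, global multi-head self-attention costs 4hwC² + 2(hw)²C operations, while window attention with window size M costs 4hwC² + 2M²hwC, i.e. linear rather than quadratic in image area. A back-of-envelope sketch (ours, for illustration):

def global_msa_cost(h, w, c):
    # Omega(MSA) = 4hwC^2 + 2(hw)^2 C  (Liu et al. 2021)
    return 4 * h * w * c**2 + 2 * (h * w)**2 * c

def window_msa_cost(h, w, c, m=7):
    # Omega(W-MSA) = 4hwC^2 + 2 M^2 hw C, with window size M
    return 4 * h * w * c**2 + 2 * m**2 * h * w * c

for side in (56, 112, 224):
    ratio = global_msa_cost(side, side, 96) / window_msa_cost(side, side, 96)
    print(f'{side} x {side} feature map: global cost is {ratio:.0f}x the window cost')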

We trained and validated the model using the known PNe catalogue and images from IPHAS. After training, the model reached a precision of 96.5 per cent and a recall of 97.8 per cent on the validation set, demonstrating excellent detection ability. After transfer learning, we applied the trained model to the VPHAS+ processed images and obtained ∼20 000 detections which, after comparison with existing catalogues and visual inspection, yielded 2637 known PNe and 815 new high-quality candidates.
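For reference, these figures follow directly from the validation counts reported earlier (444 of the 454 catalogued PNe recovered, with 16 false positives):

# Validation-set tallies: 444 true positives out of 454 catalogued PNe,
# plus 16 false positives.
tp, fn, fp = 444, 454 - 444, 16
precision = tp / (tp + fp)   # 444 / 460 = 0.965
recall = tp / (tp + fn)      # 444 / 454 = 0.978
print(f'precision = {precision:.1%}, recall = {recall:.1%}')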

Through further visual inspection, we selected the 31 most promising candidates for confirmatory spectroscopic observations. A summary of the classification results from these observations is presented in Table 2. We also provide four examples of new PNe uncovered by our DL techniques, together with their confirmatory spectroscopy. These preliminary findings demonstrate the strong performance of our model as a powerful tool for efficiently discovering nebular targets in wide-field Hα survey imagery. Full details will be presented in Yushan Li et al. (in preparation). A more comprehensive and detailed search for nebular targets and PNe candidates in the VPHAS+ data is also underway and will be presented in future work.

Finally, our attention-based object detection model not only performs well in detecting PN candidates, discovering many new objects, but also has the potential to be adapted to other types of astronomical target and to other large-scale sky surveys in the future.

ACKNOWLEDGEMENTS

This work is supported by the National Natural Science Foundation of China (NSFC; funding numbers 12303105, 12173027 and 12173062) and Civil Aerospace Technology Research Project (D050105). We acknowledge the science research grants from the China Manned Space Project (No. CMS-CSST-2021-A01) and science research grants from the Square Kilometre Array (SKA) Project (No. 2020SKA0110102). QP thanks the Hong Kong Research Grants Council for General Research Fund (GRF) support under grants 17326116 and 17300417. We thank the South African Astronomical Observatory for the award of telescope time. YL thanks HKU and Q. Parker for provision of a PhD scholarship from Research Matching Grant Scheme (RMGS) funds awarded to the LSR.

DATA AVAILABILITY

Data resources are supported by the China National Astronomical Data Center (NADC) and the Chinese Virtual Observatory (China-VO). This work is supported by the Astronomical Big Data Joint Research Center, co-founded by National Astronomical Observatories, Chinese Academy of Sciences, and Alibaba Cloud. The code and data utilized can be accessed from the PaperData Repository, managed by the China-VO team.

Footnotes

1 Online at http://www.hashpn.space. HASH federates available multiwavelength imaging, spectroscopic and other data for all known Galactic and Magellanic Cloud PNe.

References

Akras S., Guzman-Ramirez L., Gonçalves D. R., 2019, MNRAS, 488, 3238
Awang Iskandar D. N. F., Zijlstra A. A., McDonald I., Abdullah R., Fuller G. A., Fauzi A. H., Abdullah J., 2020, Galaxies, 8, 88
Chen J., Liu Y., 2011, Artif. Intell. Rev., 36, 29
De Marco O., 2009, PASP, 121, 316
Drew J. E. et al., 2005, MNRAS, 362, 753
Drew J. E. et al., 2014, MNRAS, 440, 2036
Faundez-Abans M., Ormeno M. I., de Oliveira-Abans M., 1996, A&AS, 116, 395
Fragkou V., Parker Q. A., Zijlstra A. A., Vázquez R., Sabin L., Rechy-Garcia J. S., 2022, ApJ, 935, L35
Frew D. J., Bojičić I. S., Parker Q. A., Pierce M. J., Gunawardhana M. L. P., Reid W. A., 2014, MNRAS, 440, 1080
Greimel R. et al., 2021, A&A, 655, A49
Hambly N. C. et al., 2001, MNRAS, 326, 1279
He K., Zhang X., Ren S., Sun J., 2016, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, Piscataway, NJ, p. 770
He K., Gkioxari G., Dollár P., Girshick R., 2017, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, Piscataway, NJ, p. 2961
Henry R. B. C., Kwitter K. B., Jaskot A. E., Balick B., Morrison M. A., Milingo J. B., 2010, ApJ, 724, 748
Iben I. Jr, 1995, Phys. Rep., 250, 2
Irwin M., Maddox S., McMahon R., 1994, Spectrum, 2, 14
Jacob R., Schönberner D., Steffen M., 2013, A&A, 558, A78
Jia P., Zheng Y., Wang M., Yang Z., 2023a, Astron. Comput., 42, 100687
Jia P., Sun R., Li N., Song Y., Ning R., Wei H., Luo R., 2023b, AJ, 165, 26
Kwitter K. B., Henry R. B. C., 2022, PASP, 134, 022001
Le Dû P. et al., 2022, A&A, 666, A152
Lin T.-Y., Dollár P., Girshick R., He K., Hariharan B., Belongie S., 2017, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, Piscataway, NJ, p. 2117
Liu Z., Lin Y., Cao Y., Hu H., Wei Y., Zhang Z., Lin S., Guo B., 2021, preprint
Loshchilov I., Hutter F., 2017, preprint
Maciel W. J., Costa R. D. D., Uchida M. M. M., 2003, A&A, 397, 667
Miszalski B., Parker Q. A., Acker A., Birkby J. L., Frew D. J., Kovacevic A., 2008, MNRAS, 384, 525
Moe M., De Marco O., 2006, ApJ, 650, 916
Parker Q. A., 2022, Front. Astron. Space Sci., 9, 895287
Parker Q. A., Bland-Hawthorn J., 1998, PASA, 15, 33
Parker Q. A., Malin D., 1999, PASA, 16, 288
Parker Q. A. et al., 2005, MNRAS, 362, 689
Parker Q. A. et al., 2006, MNRAS, 373, 79
Parker Q. A., Bojičić I. S., Frew D. J., 2016, J. Phys.: Conf. Ser., 728, 032008
Paszke A. et al., 2019, preprint
Redmon J., Divvala S., Girshick R., Farhadi A., 2015, preprint
Ren S., He K., Girshick R., Sun J., 2015, in Advances in Neural Information Processing Systems 28. Neural Information Processing Systems, La Jolla, CA, p. 91
Rumelhart D. E., Hinton G. E., Williams R. J., 1986, Nature, 323, 533
Sabin L. et al., 2014, MNRAS, 443, 3388
Smithsonian Astrophysical Observatory, 2000, preprint (ascl:0003.002)
Stobie R. S., 1986, Pattern Recogn. Lett., 4, 317
Vaswani A., Shazeer N., Parmar N., Uszkoreit J., Jones L., Gomez A. N., Kaiser Ł., Polosukhin I., 2017, in Advances in Neural Information Processing Systems. Neural Information Processing Systems, La Jolla, CA, p. 5998
Viironen K. et al., 2009a, A&A, 502, 113
Viironen K. et al., 2009b, A&A, 504, 291
Wu K., Peng H., Chen M., Fu J., Chao H., 2021, preprint
