Machine learning-based photometric classification of galaxies, quasars, emission-line galaxies, and stars

This paper explores the application of machine learning methods for classifying astronomical sources using photometric data, including normal and emission line galaxies (ELGs; starforming, starburst, AGN, broad line), quasars, and stars. We utilized samples from Sloan Digital Sky Survey (SDSS) Data Release 17 (DR17) and the ALLWISE catalog, which contain spectroscopically labeled sources from SDSS. Our methodology comprises two parts. First, we conducted experiments, including three-class, four-class, and seven-class classifications, employing the Random Forest (RF) algorithm. This phase aimed to achieve optimal performance with balanced datasets. In the second part, we trained various machine learning methods, such as $k$-nearest neighbors (KNN), RF, XGBoost (XGB), voting, and artificial neural network (ANN), using all available data based on promising results from the first phase. Our results highlight the effectiveness of combining optical and infrared features, yielding the best performance across all classifiers. Specifically, in the three-class experiment, RF and XGB algorithms achieved identical average F1 scores of 98.93 per~cent on both balanced and unbalanced datasets. In the seven-class experiment, our average F1 score was 73.57 per~cent. Using the XGB method in the four-class experiment, we achieved F1 scores of 87.9 per~cent for normal galaxies (NGs), 81.5 per~cent for ELGs, 99.1 per~cent for stars, and 98.5 per~cent for quasars (QSOs). Unlike classical methods based on time-consuming spectroscopy, our experiments demonstrate the feasibility of using automated algorithms on carefully classified photometric data. With more data and ample training samples, detailed photometric classification becomes possible, aiding in the selection of follow-up observation candidates.


INTRODUCTION
Astronomical data are collected from a diverse range of celestial objects, and classifying them is a critical component of the analysis. By categorizing sources into classes, we can gain insight into the underlying physical processes and properties of the objects under investigation. In recent years there has been an influx of optical and near-infrared sky surveys, e.g. the Sloan Digital Sky Survey (SDSS; York et al. 2000), the Panoramic Survey Telescope and Rapid Response System (Pan-STARRS; Chambers et al. 2016), the Dark Energy Survey (DES; Reed et al. 2015, 2017, 2019; Yang et al. 2019), the Hyper Suprime-Cam (HSC) Subaru Strategic Program (Aihara et al. 2018), the Two Micron All Sky Survey (2MASS; Skrutskie et al. 2006), the UKIRT Infrared Deep Sky Survey (UKIDSS; Lawrence et al. 2007), the VISTA Hemisphere Survey (VHS; McMahon 2012), the Visible and Infrared Survey Telescope for Astronomy (VISTA), and the Wide-field Infrared Survey Explorer (WISE; Wright et al. 2010). As a result, photometric data have gained a great deal of attention for mapping and classifying astronomical sources. Thanks to large-area multiwavelength surveys, our knowledge and comprehension of the Universe and its diverse constituents have been significantly enhanced.
The ability to differentiate between point sources (stars and quasars) and extended sources (galaxies) is essential for identifying astronomical objects. Additionally, distinguishing normal galaxies from active or emission-line galaxies is crucial for understanding the physical processes at play. One common approach to classifying sources in an image is morphology-based classification, which distinguishes between stars and galaxies (López-Sanjuan et al. 2019 and references therein), as well as between passive and active galaxies (Wilman et al. 2013; Man et al. 2021). However, this approach cannot distinguish between point sources such as stars and quasars in a single image (Fotopoulou & Paltani 2018). Moreover, classifying active versus passive galaxies purely on morphology is challenging, because some active galaxies have morphologies similar to those of passive galaxies, and vice versa (Tamburri et al. 2014). Accurate classification often requires additional techniques such as spectroscopic analysis (Baldwin et al. 1981; Veilleux & Osterbrock 1987; Kauffmann et al. 2003; Tanaka et al. 2012), which detects ionized gas emitting strong emission lines from objects that have intense star formation or accreting black holes at their centers (Conselice 2003; Wen et al. 2014). This method is effective but requires a significant investment of telescope time and is limited to small samples of objects. Additionally, spectroscopy alone cannot always distinguish between active and passive galaxies, as some passive galaxies have low-level activity that cannot be detected. Furthermore, some objects, including both passive and active galaxies, may be difficult to detect because they do not exhibit strong spectral features (Zaritsky et al. 1995).
Alternatively, astronomers often use photometric data, including colors and spectral energy distributions, to differentiate between sources. This approach offers a more efficient and cost-effective way of classifying galaxies, and the analysis of colors and brightness is informative about their general properties, age, metallicity, and star formation history (Conti et al. 2003; Assef et al. 2010; Monachesi et al. 2012; Wang et al. 2022). Object selection, whether using WISE colors alone or in combination with other catalogues, primarily relies on color information to distinguish active galactic nuclei (AGN)/quasars from stars or passive galaxies (Richards et al. 2002, 2005; Schneider et al. 2007, 2010; Jarrett et al. 2011, 2017; Stern et al. 2012; Mateos et al. 2012; Wu et al. 2012; Edelson 2012; Goto et al. 2012; Assef et al. 2013; Yan et al. 2013; Tu & Wang 2012; Chung et al. 2014; Nikutta et al. 2014; Ferraro et al. 2015; Secrest et al. 2015).
In addition, wide-field surveys such as SDSS and WISE have generated vast amounts of photometric information, enabling researchers to measure the properties of many kinds of astronomical objects. Machine learning-based algorithms have proven effective in handling and classifying this information. Most of the literature on photometric classification with machine learning for these two surveys, beginning with Suchkov et al. (2005), has focused on broad classifications or on specific subtypes of astronomical objects. For instance, some studies have examined separating stars from galaxies (Ball et al. 2006; Vasconcellos et al. 2011; Kovacs & Szapudi 2015), distinguishing star/galaxy/QSO (Krakowski et al. 2016; Kurcz et al. 2016; Nakoneczny et al. 2019, 2021; Clarke et al. 2020; Cunha & Humphrey 2022; Chaini et al. 2022), examining ELGs (AGN/non-AGN, Seyfert I/Seyfert II) (Cavuoti et al. 2014), and classifying AGNs (X-ray AGNs, IR AGNs, radio-selected AGNs) (Chang et al. 2021). Despite significant progress, a gap remains in the literature regarding the detailed classification of all astronomical objects using solely photometric data. To address this gap, we present a comprehensive and adaptable classification system built around four main objectives.
As our first objective, we aim to develop a reliable classification system that accurately classifies sources of all types, including subclasses of galaxies, and distinguishes stars from quasars using only photometric data. With this approach we can separate point sources from extended sources and distinguish active galaxies from passive galaxies. Furthermore, by automating source identification, the proposed classification system serves as a tool for analyzing large datasets and streamlining the identification process. This matters because future photometric surveys, such as the Large Synoptic Survey Telescope (LSST; LSST Science Collaboration 2009; LSST Dark Energy Science Collaboration 2012; Ivezić et al. 2019), will generate a large amount of data, and analysis methods will need to be automated to uncover new scientific insights.
Secondly, our approach provides a fine-grained classification of astronomical objects using only photometric data. In this study, we specifically focus on the classification of stars, quasars (QSOs), emission-line galaxies (ELGs), and normal galaxies (NGs). These categories were chosen for their relevance to the research community and their distinct characteristics. Stars, as fundamental objects in astrophysics, play a crucial role in understanding various astrophysical processes. Quasars, known for their high-energy emission, provide information about the most energetic phenomena in the Universe. Emission-line galaxies, with their prominent emission lines, offer clues about the underlying physical mechanisms at work, and they encompass a diverse range of object types that require comprehensive classification. We therefore extend the classification to additional subclasses, resulting in a total of seven groups: normal galaxies (NGs), starforming galaxies (SF), starburst galaxies (SB), active galactic nuclei (AGN), and broad-line galaxies (BL), in addition to the aforementioned stars and quasars. By addressing these seven groups, the proposed classification system fills a gap in the literature and offers a detailed and versatile approach to classifying astronomical objects from photometric data, which in turn supports the study of their properties and characteristics.
Thirdly, to classify the astronomical sources in our study, we implement supervised machine learning techniques that use photometric information from spectroscopically identified sources. In this way, we can classify all classes of astronomical sources with an acceptable degree of accuracy. This methodology can advance our understanding of the properties of these objects and contribute to more efficient methods for identifying and classifying astronomical sources based exclusively on photometric data. The final objective of this study is to investigate the efficient selection of astronomical objects with a variety of machine learning models, including k-nearest neighbors (KNN), random forest (RF), XGBoost (XGB), voting, and artificial neural network (ANN). To extract optical and IR photometric information, we cross-match the SDSS and ALLWISE catalogues. To retain the most reliable QSOs, we use Milliquas. Throughout the study, we follow a two-step methodology to ensure that the classifiers are evaluated uniformly.
The structure of this paper is as follows. Section 2 describes the sample used in this paper. The adopted methods are briefly introduced in Section 3. Section 4 discusses the results of the classifiers in detail. An overall comparison of the present work with other similar studies can be found in Section 5. The summary and conclusion of this study are provided in Section 6.

DATA
We use SDSS-IV Data Release 17 (Blanton et al. 2017; Abdurro'uf et al. 2022), which provides a labelled dataset of spectroscopically observed sources. SDSS mapped the sky in five optical bands: u (λ = 0.355 µm), g (λ = 0.477 µm), r (λ = 0.623 µm), i (λ = 0.762 µm), and z (λ = 0.913 µm). The Wide-field Infrared Survey Explorer (WISE; Wright et al. 2010) is an all-sky mid-infrared survey with photometry in four filters: W1 (λ = 3.4 µm), W2 (λ = 4.6 µm), W3 (λ = 12 µm), and W4 (λ = 22 µm). The instrument observed hundreds of millions of celestial objects, resulting in over a million images. AllWISE improves on WISE with deeper imaging and better source detection. The limiting magnitudes of W1_AB and W2_AB are brighter than 19.8 and 19.0 (Vega: 17.1, 15.7) for the AllWISE source catalogue. Considering the accuracy of W1, W2, W3, and W4, we adopt only W1 and W2, converting them from Vega to AB magnitudes by W1_AB = W1 + 2.699 and W2_AB = W2 + 3.339 (Schindler et al. 2017). We adopt AB magnitudes and apply extinction corrections to all the photometry. Spectroscopically identified sources are obtained through CasJobs from the SpecPhotoAll table of SDSS DR17. Keeping the data quality,   = 0,   = 1,  = 1 are set when downloading data. The records with fatal errors are rejected using flags such as SATURATED, BRIGHT, EDGE, and BLENDED. In the "where" clause of the SQL query, we adopt the limitation (flags & (dbo.fPhotoFlags('SATURATED'))) = 0 and (flags & (dbo.fPhotoFlags('BRIGHT'))) = 0 and (flags & (dbo.fPhotoFlags('EDGE'))) = 0 and (flags & (dbo.fPhotoFlags('BLENDED'))) = 0. Finally, the query yields a catalogue of 554 038 stars, 448 337 quasars, and 594 917 galaxies that are spectroscopically identified. We cross-match the full sample of quasars from SDSS with the Million Quasar Catalogue (Milliquas v7.7, 2022 update; Flesch 2021), after which 427 829 quasars remain. We match the SDSS-Milliquas known sources with AllWISE using a match radius of 3 arcsec (Su et al. 2013). The final catalogue of high-quality sources with WISE photometry consists of 305 723 stars, 345 608 quasars, and 594 077 galaxies. Two samples are drawn from the final data: Sample I from SDSS+Milliquas and Sample II from SDSS+Milliquas+WISE. Table 1 gives the full information on all subclasses in our samples. As for the definitions and abbreviations in Table 1, AGN is short for active galactic nucleus, AGN BL for broad-line AGN, SB for starburst galaxy, SB BL for broad-line SB, SF for starforming galaxy, and SF BL for broad-line SF. In the present study, the samples are selected with the following conditions. Star sample: CLASS = STAR, including all subclasses. Normal galaxy sample: CLASS = GALAXY with SUBCLASS = NULL. AGN sample: CLASS = GALAXY with SUBCLASS = AGN or SUBCLASS = AGN BL. Starforming sample: CLASS = GALAXY with SUBCLASS = SF or SUBCLASS = SF BL. Starburst sample: CLASS = GALAXY with SUBCLASS = SB or SUBCLASS = SB BL. Broad-line sample: CLASS = GALAXY with SUBCLASS = BL. QSO sample: CLASS = QSO, including all subclasses. Fig. 1 shows the magnitude distribution of our known samples from the SDSS and SDSS+ALLWISE catalogues.
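The Vega-to-AB conversion quoted above is a constant offset per band. A minimal sketch, using only the two offsets given in the text (the function and dictionary names are illustrative, not taken from the paper):

```python
# Vega-to-AB offsets for WISE as quoted in the text (Schindler et al. 2017).
VEGA_TO_AB = {"W1": 2.699, "W2": 3.339}


def vega_to_ab(mag_vega, band):
    """Convert a WISE Vega magnitude to AB for band 'W1' or 'W2'."""
    return mag_vega + VEGA_TO_AB[band]


# The Vega limits quoted in the text map onto the quoted AB limits.
print(round(vega_to_ab(17.1, "W1"), 1))
print(round(vega_to_ab(15.7, "W2"), 1))
```

Applied to the Vega limits of 17.1 and 15.7, this gives values that round to the AB limits of 19.8 and 19.0 stated in the text.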
In our machine learning models, we make use of the following parameters from WISE. In addition to the magnitudes w?mpro, where ? is either 1 or 2, measured with profile-fitting photometry (throughout the paper, W1 and W2 refer to the values of w?mpro), we also incorporate circular aperture magnitudes in the W1 and W2 channels, denoted W?_n, where n = 1, 3 correspond to apertures with radii of 5.5′′ and 11′′, respectively, centered on the source (Kurcz et al. 2016).

METHOD
A variety of supervised learning methods are used in this study, and their classification accuracy is evaluated. We apply four machine learning models: k-nearest neighbors (KNN), random forest (RF; Breiman 2001), XGBoost (XGB; Chen & Guestrin 2016), and artificial neural network (ANN; Haykin 1998), together with a voting ensemble built from them. We use the Python libraries scikit-learn (Pedregosa et al. 2011) and Keras (Chollet 2015). A brief introduction to each algorithm is given below.
To evaluate how accurate the classifiers are, their performance metrics must be reported. This section first describes the metrics used in this research.

Classification metrics
Classifier performance is typically measured by Precision (P), Recall (R), and the F1-score, which is the harmonic mean of Precision and Recall. The metrics are given by
$$ P = \frac{TP}{TP + FP}, \qquad R = \frac{TP}{TP + FN}, \qquad F1 = \frac{2\,P\,R}{P + R}, $$
where TP is the number of correctly classified positives (true positives), FP the number of negatives incorrectly classified as positive (false positives), and FN the number of positives incorrectly classified as negative (false negatives).
The area under the curve (AUC) of the Receiver Operating Characteristic curve (ROC; Bradley 1997) is also used to evaluate classifier performance. In a ROC curve, the true positive rate (TPR), i.e. Recall, is plotted against the false positive rate (FPR) at different threshold values.
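As a small worked illustration of these metrics, the sketch below computes Precision, Recall, and F1 for one class from its TP, FP, and FN counts. The function name and the example counts are illustrative only and are not results from the paper.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) for one class from its counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy counts: 90 true positives, 10 false positives, 30 false negatives.
p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
print(f"P={p:.3f} R={r:.3f} F1={f1:.3f}")
```

In a multi-class setting such as the experiments here, the counts are taken per class (one class versus the rest) and the per-class scores can then be averaged.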

KNN
k-nearest neighbors (KNN) is a classic supervised algorithm grounded on distance metrics. The distance can be any metric, and the standard Euclidean distance is the most common choice. Although KNN can be applied to both regression and classification, it is more commonly used as a classifier. KNN establishes a baseline of classification accuracy because it is the most straightforward classification method. Specifically, the algorithm assumes that two data points close to each other fall in the same class. Here k denotes the number of neighboring data points to be considered, and the class of a source is determined by a vote among the k neighbors nearest to the input in the multidimensional parameter space.
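The voting rule described above can be sketched in a few lines. This is a toy, pure-Python illustration with Euclidean distance and made-up two-colour points, not the scikit-learn implementation used in the paper; all names and values are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair every training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    # Count the labels of the k closest points and return the most common one.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Toy two-feature training set (values are invented for illustration).
points = [(0.1, 0.0), (0.2, 0.1), (1.0, 0.9), (1.1, 1.0), (0.9, 1.1)]
labels = ["star", "star", "QSO", "QSO", "QSO"]
print(knn_predict(points, labels, query=(0.15, 0.05), k=3))
```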

Random Forest
Random Forest (RF; Breiman 2001) is one of the machine learning (ML) methods most widely used for classification and regression tasks in astronomy. RF is an ensemble of randomly generated decision trees that uses bootstrap resampling: several individual decision trees are trained in parallel on different subsets of the training dataset, using randomly selected subsets of the available features. A single decision tree tends to overfit the training data. RF, which averages over the trees, therefore reduces overfitting and improves prediction accuracy. RF is fast to train, scalable, and competitive in performance with other ML algorithms.

XGBoost
Extreme gradient boosting (XGBoost, XGB for short) is an improvement of the gradient boosting algorithm (Friedman 2001): a boosted decision tree algorithm designed for efficiency, flexibility, and broad applicability. Compared with other methods, such as the random forest ensemble or a single-model method like the support vector machine, XGBoost can achieve higher performance and better generalization. It adds a regularization term to prevent overfitting and supports parallel computing (Nguyen et al. 2019; Wang et al. 2021), and it has been widely used in classification and regression problems.

ANN
An artificial neural network (ANN) mimics biological neural networks in some ways. The network consists of an input layer, an output layer, and several hidden layers, each of which contains neurons that pass information to the neurons of the next layer. The input data are transmitted from the input layer through the hidden layers to the output layer, where the target variable is predicted. Except in the input layer, the value of each neuron is a linear combination of the neurons in the previous layer, passed through an activation function that is usually non-linear. The weights of the network are model parameters that are optimized by backpropagation during the training phase. In our model, the ANN takes the photometric parameters as input.
In our study, we try two different ANN variants, one with dropout and one without, and the one with dropout overfits. Six ANN experiments are carried out, from a single-layer to a six-layer network. The experiments are conducted with two architectures in which all layers contain the same number of neurons, 30 and 50, respectively. The best results are obtained with a two-layer network containing 30 neurons per layer; a rectified linear unit (ReLU) activation is used in the hidden layers, while the Softmax function is used in the output layer. For all our hyperparameter sweeps, we train the model for 1000 epochs with the Adam optimizer and sparse categorical cross-entropy as the loss function. We use early stopping to prevent the model from overfitting the training data.
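To make the output layer and loss concrete, the sketch below writes out ReLU, Softmax, and the sparse categorical cross-entropy for a single source in plain Python. It is only an illustration of the functions named in the text, not the Keras model used in the paper; the logits are invented numbers.

```python
import math


def relu(values):
    """Rectified linear unit applied element-wise."""
    return [max(0.0, v) for v in values]


def softmax(logits):
    """Turn raw output-layer scores into class probabilities."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_categorical_crossentropy(probabilities, true_index):
    """Loss for one sample whose true class is given as an integer index."""
    return -math.log(probabilities[true_index])


# Hypothetical output-layer scores for four classes (e.g. NG, ELG, star, QSO).
probs = softmax([2.0, 1.0, 0.1, -1.0])
print([round(p, 3) for p in probs])
print(round(sparse_categorical_crossentropy(probs, true_index=0), 3))
```

The loss is small when the probability assigned to the true class is close to 1 and grows as that probability falls, which is what backpropagation minimizes during training.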

Voting
Voting is an ensemble ML algorithm that combines the output of several ML models and often gives better results than a single model; it can be used for regression or classification. For classification, the final result is determined by combining the output of multiple models using soft or hard voting. A hard-voting ensemble aggregates the class-label votes of the individual classifiers and predicts the class with the most votes. A soft-voting ensemble combines the predicted class probabilities of the individual classifiers. This study uses a voting classifier whose ensemble comprises three classifiers. As different base classifiers contribute differently to the final classification result, the classifiers with better performance are weighted more heavily. Here, the weights for RF and XGB are 2 and the weight for KNN is 1, as the former two classifiers perform better than the latter.
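The weighting scheme can be illustrated with a short sketch. It assumes soft voting (the text does not state whether soft or hard voting was adopted) and uses the quoted weights of 2, 2, and 1 for RF, XGB, and KNN; the probabilities are invented for the example.

```python
def weighted_soft_vote(model_probabilities, weights):
    """Combine per-model class probabilities with a weighted average.

    model_probabilities: list of dicts mapping class name -> probability.
    weights: one weight per model, in the same order.
    """
    total_weight = sum(weights)
    classes = model_probabilities[0].keys()
    averaged = {
        cls: sum(w * probs[cls] for probs, w in zip(model_probabilities, weights))
        / total_weight
        for cls in classes
    }
    # The predicted class is the one with the highest averaged probability.
    return max(averaged, key=averaged.get), averaged


# Hypothetical probabilities from RF, XGB, and KNN for one source.
rf = {"NG": 0.60, "ELG": 0.40}
xgb = {"NG": 0.45, "ELG": 0.55}
knn = {"NG": 0.30, "ELG": 0.70}
label, averaged = weighted_soft_vote([rf, xgb, knn], weights=[2, 2, 1])
print(label, {cls: round(p, 2) for cls, p in averaged.items()})
```

In this toy case RF alone would choose NG, but the weighted average of the three models favours ELG.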

Model Setup
The methodology used in this work has two main parts. The experiments test the ability to classify objects into three, four, and seven categories. The three-class experiment includes star, QSO, and galaxy; the four-class experiment includes star, QSO, NG, and ELG; and the seven-class experiment includes star, QSO, NG, AGN, SF, SB, and BL. Through the three experiments, we explore the possibility of classifying galaxies into normal and emission-line ones, and then a more detailed classification of emission-line galaxies into AGN, SF, SB, and BL galaxies. The two parts of this study are described in detail below.
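The relation between the three label schemes can be sketched as a mapping from the spectroscopic CLASS and SUBCLASS to a training label. The subclass strings below are the abbreviations used in the text (AGN BL, SF BL, and so on); the actual strings stored in the SDSS table may be spelled differently, and the function name is hypothetical.

```python
# Emission-line subclasses grouped as in the sample definitions of the text.
ELG_GROUP = {
    "AGN": "AGN", "AGN BL": "AGN",
    "SF": "SF", "SF BL": "SF",
    "SB": "SB", "SB BL": "SB",
    "BL": "BL",
}


def training_label(sdss_class, subclass, n_classes):
    """Map (CLASS, SUBCLASS) to a label for the 3-, 4-, or 7-class scheme."""
    if n_classes not in (3, 4, 7):
        raise ValueError("n_classes must be 3, 4, or 7")
    if sdss_class == "STAR":
        return "star"
    if sdss_class == "QSO":
        return "QSO"
    if sdss_class != "GALAXY":
        raise ValueError(f"unexpected CLASS: {sdss_class}")
    if n_classes == 3:
        return "galaxy"
    if subclass is None:  # SUBCLASS = NULL marks a normal galaxy
        return "NG"
    group = ELG_GROUP[subclass]
    return "ELG" if n_classes == 4 else group


print(training_label("GALAXY", "SF BL", 7))
print(training_label("GALAXY", "SF BL", 4))
print(training_label("GALAXY", None, 4))
```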

The first part
In the first part of our approach, shown in the flowchart of Fig. 2, we perform the experiments with the RF method in order to work out a reliable classification of astronomical objects. These are the three-, four-, and seven-class experiments, in which the classifier is trained on balanced datasets. All the experiments are conducted on both Samples I and II, which include optical and optical+IR data, respectively. The numbers of sources used in the three experiments are as follows.
The three-class experiment classifies the sources into stars, QSOs, and galaxies. The number of randomly selected sources of each class for training is 420 000 from Sample I and 305 000 from Sample II.
The four-class experiment classifies the sources into stars, QSOs, NGs, and ELGs. The number of randomly selected sources of each class for training is 245 000 from Sample I and 244 000 from Sample II.
The seven-class experiment classifies the sources into stars, QSOs, NGs, AGNs, SFs, SBs, and BLs. The number of randomly selected sources for each class from either sample is 10 000.
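Drawing the same number of sources at random from every class, as in the balanced experiments above, can be sketched as follows. This is a minimal illustration with a fixed seed and toy class sizes; the function name is hypothetical and the paper does not describe its sampling code.

```python
import random
from collections import Counter, defaultdict


def balanced_sample(labels, n_per_class, seed=0):
    """Return sorted indices with exactly n_per_class random picks per class."""
    rng = random.Random(seed)
    indices_by_class = defaultdict(list)
    for index, label in enumerate(labels):
        indices_by_class[label].append(index)
    chosen = []
    for label, indices in indices_by_class.items():
        if len(indices) < n_per_class:
            raise ValueError(f"class {label!r} has only {len(indices)} sources")
        # Sample without replacement inside each class.
        chosen.extend(rng.sample(indices, n_per_class))
    return sorted(chosen)


# Toy, deliberately unbalanced label list.
toy_labels = ["star"] * 50 + ["QSO"] * 30 + ["NG"] * 20
picked = balanced_sample(toy_labels, n_per_class=10)
print(Counter(toy_labels[i] for i in picked))
```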

The second part
In the second part, which builds on the best results of the first part, we classify the objects into three and four classes. We train several different ML methods (KNN, RF, XGB, voting, and ANN) on both Samples I and II, using all the data we have. In part one, we test our model on balanced datasets to ensure that it performs well in the different experiments; in part two, we therefore use all the available data. The numbers of sources in each experiment are as follows.
For all ML methods, the aim is to construct models and assess their proficiency on previously unseen data. To obtain an average of all evaluation metrics, we employ 10-fold cross-validation. This iterative method randomly splits the sample into ten portions; in each iteration, nine portions are used for training and one is reserved for testing and performance calculation. The process is repeated ten times, so that each portion serves as the test set exactly once. Finally, we average the ten performance metric estimates.
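The splitting scheme can be sketched without any ML library. The generator below shuffles the indices once, cuts them into ten folds, and yields each fold in turn as the test set; the name and the fixed seed are illustrative choices, not details from the paper.

```python
import random


def kfold_indices(n_samples, n_folds=10, seed=0):
    """Yield (train_indices, test_indices) for each of the n_folds iterations."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into n_folds nearly equal portions.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Every source ends up in the test set of exactly one iteration.
for train, test in kfold_indices(25, n_folds=10):
    print(len(train), len(test))
```

A metric such as F1 would be computed on each test fold after training on the corresponding training indices, and the ten values would then be averaged.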

RESULT
We present the results of all experiments that are carried out in the two parts of this study in this section.

Feature selection
To assess the significance of infrared and optical data for different object categories, we show color-color plots in Fig. 3. These plots demonstrate the influence of infrared data in distinguishing stars from other astronomical objects. Relying solely on optical information tends to result in overlap between stars and other objects, as is evident in the first row of Fig. 3. Adding infrared data improves the differentiation between active and normal galaxies and point sources such as stars and quasars, as seen in the second row of the figure. Notably, infrared information is particularly critical for accurately classifying stars and quasars.
Using a larger parameter space, as opposed to a simple color-color-based classification, offers significant advantages. Although simple color selections in the optical or infrared are effective for identifying certain object classes, a larger parameter space enables a more detailed and nuanced classification. It allows the algorithm to capture a broader range of object characteristics and to consider features beyond color information. This leads to a more refined classification scheme and improved discrimination between object categories, especially when the boundaries between classes are less distinct.
A classifier's performance depends strongly on which features influence it the most. In the classification process, we choose features from the literature on which the most popular ML models perform well (Li et al. 2021). As we have two samples with optical and infrared features, we select two feature sets: r, u − g, g − r, r − i, i − z for Sample I, and r, u − g, g − r, r − i, i − z, z − W1, W1 − W2 for Sample II. We also employ two additional sets of input features that include the difference of two circular aperture magnitudes, namely r, u − g, g − r, r − i, i − z, z − W1, W1 − W2, W1_1 − W1_3 and r, u − g, g − r, r − i, i − z, z − W1, W1 − W2, W1_1 − W1_3, W2_1 − W2_3. These are used solely to evaluate the influence (or lack thereof) of apertures on the classification of astronomical objects in the four-class experiment.
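A feature vector of one magnitude plus adjacent-band colors can be assembled as in the sketch below. It assumes extinction-corrected AB magnitudes in the five SDSS bands and in W1 and W2; the choice of r as the single magnitude and of z − W1 as the optical-infrared color is an assumption made for this illustration, as are the function name and the example values.

```python
def color_features(mags, use_infrared=True):
    """Build a feature dict from AB magnitudes keyed by band name.

    mags needs the keys u, g, r, i, z, plus W1 and W2 when use_infrared is True.
    The band pairing (r, z - W1) is an assumption of this sketch.
    """
    features = {
        "r": mags["r"],
        "u-g": mags["u"] - mags["g"],
        "g-r": mags["g"] - mags["r"],
        "r-i": mags["r"] - mags["i"],
        "i-z": mags["i"] - mags["z"],
    }
    if use_infrared:
        features["z-W1"] = mags["z"] - mags["W1"]
        features["W1-W2"] = mags["W1"] - mags["W2"]
    return features


# Invented magnitudes for a single source.
source = {"u": 19.5, "g": 18.4, "r": 17.9, "i": 17.6, "z": 17.4,
          "W1": 17.0, "W2": 17.3}
print({name: round(value, 2) for name, value in color_features(source).items()})
```

Calling the function with use_infrared=False gives the optical-only set, which corresponds to the kind of input available for a sample without WISE photometry.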
We use supervised Uniform Manifold Approximation and Projection (UMAP) visualization, shown in Fig. 4, to examine the distribution and distinguishability of celestial objects in the space of the extracted features. The input features used for classification are u − g, g − r, r − i, i − z, z − W1, and W1 − W2. This visualization helps reveal the underlying structure of our data and, notably, helps assess the efficacy of our feature selection. In the UMAP plot, which presents the four-class classification of NGs, ELGs, QSOs, and stars, we can identify distinct clusters of data points that correspond to different object types. The left panel displays the training data, while the right panel shows the test data. The clear separation between object classes supports our feature selection and machine learning approach. The UMAP visualization also helps reveal any disparities or overlaps between the training and test datasets, which is useful for fine-tuning the model and improving its generalizability.
Building on our feature selection and machine learning approach, and in particular on the superiority of XGB over the other algorithms, we assess feature importance in Fig. 5. The most important features in the figure are the infrared color W1 − W2, with an importance of 36 per cent, and the combined optical and infrared color z − W1, with an importance of 34 per cent. Fig. 6 is plotted using these two most influential features. In this figure, the pair of features and their color distributions show the differences among the four classes of NG, ELG, star, and QSO. As shown in Fig. 6, stars, quasars, and galaxies (including normal and emission-line galaxies) are easy to separate, especially stars, while normal galaxies and emission-line galaxies overlap heavily.

Classification results of the first part
The evaluation metrics (Precision, Recall, F1, and AUC) for the three-, four-, and seven-class experiments are presented in Table 2. At first glance, adding the infrared features improves the evaluation metrics in all experiments. For Sample II, the RF classifier performs well in the three-class experiment: the F1 scores of galaxy, QSO, and star are 0.986, 0.992, and 0.988, respectively.
In the four-class experiment, the RF classifier still works well. The F1 scores for star, QSO, NG, and ELG are 0.991, 0.983, 0.843, and 0.854, respectively. The AUC scores for star, QSO, NG, and ELG are 0.994, 0.995, 0.968, and 0.971, respectively. In the seven-class experiment, F1 is 0.987 for stars and 0.961 for quasars, while F1 for the other classes, AGN, SF, SB, BL, and NG, is 0.524, 0.635, 0.805, 0.646, and 0.592, respectively. The confusion matrix of the seven-class experiment for Sample II is shown in Fig. 7. The vertical axis gives the actual labels and the horizontal axis gives the RF classifier's predictions. In the matrix, more populated cells are shown with darker colors. As can be seen from the confusion matrix, 47 per cent of AGNs, 28 per cent of SFs, 10 per cent of SBs, and 41 per cent of BLs are misclassified as NG. Fig. 8, which shows the ROC curves and AUC of the RF classifier for the seven-class experiment, confirms the results shown in the confusion matrix. A perfect classifier has TPR = 1 and FPR = 0, resulting in an AUC score of 1. Fig. 8 shows that stars have the highest AUC score, making them the best-predicted class, while the AGN class has the lowest AUC score. Because of the high percentage of AGN, SF, and BL sources misclassified as NGs, we decide to run the four-class experiment with NG, ELG, star, and QSO using several different ML methods in the second part of our work. This means that we merge the four classes AGN, SF, SB, and BL into ELGs as an independent class. In the remainder of this section, we present the results of the three- and four-class experiments conducted in the second part of our approach and compare the performance of the classifiers.

Effect of aperture magnitudes on classification performance
We conduct experiments that incorporate differential aperture magnitudes, specifically W1_1 − W1_3, or W1_1 − W1_3 and W2_1 − W2_3, the results of which are presented in Table 4. These experiments are carried out only for the four-class classification, to evaluate the influence of aperture magnitudes on our classification process. We compare the results obtained when including the differential aperture magnitudes with those obtained when excluding them. Our experiments show that the F1-score for stars and QSOs remains unchanged, while there is a 4 per cent improvement in the F1-score for NGs, increasing from

Classification results of the second part: comparison of the different ML models
In this subsection, the results of the experiments performed in Sec. 3.7.2 for both Sample I and Sample II are presented. The performance metrics (Precision, Recall and F1) of the different ML models, KNN, RF, XGB, voting, and ANN, in the three- and four-class experiments are shown in Tables 3 and 5, respectively. In the three-class experiment in Table 3, all classifiers perform similarly for the galaxy class in Sample II (optical+IR): RF, XGB, and voting have F1 values of 0.99, while KNN and ANN have values of 0.989. XGB has the best performance for both star and QSO, with F1 values of 0.986 and 0.992, respectively. Based on the evaluation metrics for the four-class experiment in Table 5, the best F1 for NG is 0.88 for both RF and voting. For ELG, RF and XGB are the best classifiers, with the same F1 value of 0.815. The star class is well identified by all five classifiers, with an F1 value of 0.991 for RF, XGB and voting, and 0.99 for KNN and ANN. For QSO, all algorithms perform well: the F1 of KNN, RF, XGB, and voting is 0.985 in each case, and that of ANN is 0.981. Overall, stars are distinguished with higher performance than the other classes.
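As a reminder of how the tabulated metrics relate to one another, the following minimal Python sketch computes Precision, Recall, and F1 per class from true and predicted labels, plus an unweighted mean of the per-class F1 values. It assumes that the "average F1" quoted in this paper is such an unweighted (macro) mean, which the text does not state explicitly; the function names and toy labels are hypothetical.

```python
def per_class_scores(y_true, y_pred):
    """Return {class: (precision, recall, f1)} for every class that appears."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = {}
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[c] = (precision, recall, f1)
    return scores


def macro_f1(scores):
    """Unweighted mean of the per-class F1 values (assumed averaging scheme)."""
    return sum(f1 for _, _, f1 in scores.values()) / len(scores)


# Toy labels for illustration only (not results from the paper).
y_true = ["NG", "NG", "ELG", "ELG", "STAR", "QSO"]
y_pred = ["NG", "ELG", "ELG", "ELG", "STAR", "QSO"]
scores = per_class_scores(y_true, y_pred)
print(scores["NG"], scores["ELG"], macro_f1(scores))
```

In the toy labels one NG is predicted as ELG, which lowers the recall of NG and the precision of ELG while leaving the star and QSO scores at 1, the same qualitative pattern as in the four-class results.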

Misclassification results
In order to show a clearer picture of the results across different ranges of  magnitude, we divide the sample into four bins of 15 ⩽  < 17, 17 ⩽  < 19, 19 ⩽  < 21, and 21 ⩽  ⩽ 22.5. Fig. 9 illustrates the distribution of sources classified correctly and incorrectly in the four-class experiment described in Sec. 3.7.2. This   the faint range of 21 ⩽  ⩽ 22.5, because there are fewer sources, fewer than ten, and the classifier has a tendency to misclassify them as QSOs, which is consistent with the fact that quasars are generally brighter than stars and galaxies and occupy the majority in the faint magnitude.
We conduct a more detailed analysis of the misclassified sources, focusing specifically on their magnitudes in the  band and redshifts in Fig. 10. QSOs, NGs, ELGs, and stars misclassified as the other three sources are shown with gray circles, blue triangles, red squares, and open circles, respectively. The classification may be affected by sample imbalance, where the minority class is more likely to be classified as the majority class. For instance, faint/high-redshift sources may be mistakenly classified as bright/low-redshift sources. Fig. 1 highlights an obvious overrepresentation of ELGs over NGs and QSOs at brighter magnitudes, which could potentially influence the classification performance. Furthermore, at lower brightness levels and potentially higher redshifts, the ELGs that are incorrectly classified as NGs may be missed due to the low signal-to-noise ratios of their emission lines.
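The magnitude binning used in this subsection can be sketched in a few lines of Python. The bin edges are the ones quoted above, with the last bin closed on the right; the function names, the generic `mags` argument, and the toy values are hypothetical and stand in for whatever catalogue columns are actually used.

```python
# Bin edges from the text: [15, 17), [17, 19), [19, 21), [21, 22.5].
BIN_EDGES = [15.0, 17.0, 19.0, 21.0, 22.5]


def magnitude_bin(mag):
    """Return the index (0-3) of the bin containing mag, or None if outside."""
    if mag == BIN_EDGES[-1]:
        return len(BIN_EDGES) - 2  # the last bin includes its upper edge
    for i in range(len(BIN_EDGES) - 1):
        if BIN_EDGES[i] <= mag < BIN_EDGES[i + 1]:
            return i
    return None


def misclassified_fraction_per_bin(mags, y_true, y_pred):
    """Return the misclassified fraction in each bin (None for empty bins)."""
    n_bins = len(BIN_EDGES) - 1
    totals = [0] * n_bins
    wrong = [0] * n_bins
    for mag, t, p in zip(mags, y_true, y_pred):
        b = magnitude_bin(mag)
        if b is None:
            continue  # source lies outside 15 <= mag <= 22.5
        totals[b] += 1
        wrong[b] += int(t != p)
    return [w / n if n else None for w, n in zip(wrong, totals)]


# Toy values for illustration only (not results from the paper).
mags = [15.5, 16.9, 18.0, 20.0, 22.5, 14.0]
y_true = ["STAR", "NG", "ELG", "NG", "ELG", "STAR"]
y_pred = ["STAR", "ELG", "ELG", "NG", "QSO", "STAR"]
fractions = misclassified_fraction_per_bin(mags, y_true, y_pred)
print(fractions)
```

Returning `None` for empty bins keeps sparsely populated ranges, such as the faintest bin discussed above, from being reported as a misleading 0 per cent error rate.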

DISCUSSION
This study aims to investigate the potential of using photometric parameters to identify and classify different types of astronomical objects. To evaluate our results, we compare them with other studies that utilize photometric features. However, conducting a detailed comparison of results from surveys with varying data selection criteria is not feasible. Therefore, we limit our analysis to a general comparison of studies conducted on the SDSS and WISE data that are similar to our work. Given the differences in the training and validation strategies, we are unable to provide a direct and quantitative comparison of our experimental results with others' works. Further, there is a paucity of literature that deals with photometry-based four-class classification akin to our work, so we have resorted to comparing the outcomes of our three-class classification with those of similar studies in the literature. As an example, Nakoneczny et al. (2021) leveraged SDSS+WISE to train machine learning algorithms and generate a QSO catalogue from KiDS data. According to their findings, infrared information discriminates QSOs efficiently; using the XGB classifier, they obtained their best average F1-score of 96.85 per cent, whereas we obtain a higher average F1-score of 98.93 per cent using the same algorithm (see Table 3). As shown in Tables 2-5, our study also agrees with Nakoneczny et al. (2021) that infrared information is extremely important for classification based on photometric data.
Fig. 5 also shows that the most important feature is W1 − W2 and the second most important is z − W1. To maximize performance, it is recommended to incorporate both optical and infrared information. When both optical and infrared information are used together, the predictability of a source is higher compared to using optical information alone. In terms of added value, the inclusion of optical colors enhances our classification approach. While WISE data are crucial for successful classification, the incorporation of optical colors improves the overall accuracy and completeness of the results. By combining optical and IR data, we can leverage the complementary nature of these two wavelength regimes, leading to a more comprehensive understanding of the objects' nature and improving classification accuracy. Consequently, classifiers that utilize both optical and infrared information tend to be more reliable than those that rely entirely on optical information.
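To make the optical versus optical+IR feature sets concrete, the following minimal Python sketch builds the adjacent-band colors listed in the caption of Fig. 4 (u-g, g-r, r-i, i-z, z-W1, W1-W2) from a dictionary of magnitudes. The function name, the dictionary keys, and the magnitudes are hypothetical, and the sketch ignores any extinction correction or magnitude-system conversion that may have been applied to the real data.

```python
# SDSS optical bands followed by the two WISE bands used for the colors.
BANDS = ["u", "g", "r", "i", "z", "W1", "W2"]


def color_features(mags, bands=BANDS):
    """Return adjacent-band colors, e.g. 'u-g', ..., 'z-W1', 'W1-W2'."""
    return {
        f"{a}-{b}": mags[a] - mags[b]
        for a, b in zip(bands[:-1], bands[1:])
    }


# Made-up magnitudes for a single source, for illustration only.
source = {"u": 20.1, "g": 19.4, "r": 19.0, "i": 18.8,
          "z": 18.7, "W1": 15.9, "W2": 15.1}

optical_ir = color_features(source)                   # six colors
optical_only = color_features(source, bands=BANDS[:5])  # four optical colors
print(list(optical_ir))
print(list(optical_only))
```

Restricting `bands` to the first five entries drops z-W1 and W1-W2, the two colors identified above as the most important features, which is one way to reproduce an optical-only versus optical+IR comparison.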
We can also evaluate the effectiveness of our proposed method by comparing it with the approach presented by Clarke et al. (2020), who trained an RF classifier and obtained an average F1-score of 96.77 per cent. Despite this impressive performance, our approach still outperforms it with an average F1-score of 98.87 per cent in the three-class experiment in both parts one and two of our study (Tables 2-3).
Zhang et al. (2019) used measurements of optical spectra of galaxies from SDSS to classify ELGs located at intermediate redshifts into four categories using several different ML methods such as KNN, SVM, RF, and MLP. The AUC of the RF algorithm for star-forming, composite, AGN, and low-ionization nuclear emission regions (LINERs) was 98.5 per cent, 96.5 per cent, 86.0 per cent, and 88.4 per cent, respectively. Although we use only photometric data rather than spectroscopic data, the AUC of our RF classifier in the four-class experiment is 97.1 per cent for ELG, 96.8 per cent for NG, 99.4 per cent for star, and 99.5 per cent for QSO.
In comparison to similar studies, our classification approach demonstrates better performance. Cunha & Humphrey (2022) obtained an average F1 of 98.13 per cent using XGBoost, LightGBM, and CatBoost classifiers, and Chaini et al. (2022) used a combination of photometric information and images from SDSS to achieve a best average F1-score of 93.3 per cent through the use of artificial neural networks (ANN) and convolutional neural networks (CNN).
Our work also surpasses similar studies that solely focus on infrared features. For instance, Kurcz et al. (2016) conducted research on the automatic classification of WISE sources into stars, galaxies, and quasars, utilizing support vector machines (SVM). By employing the radial kernel, they achieved an average F1-score of 85.7 per cent (see Table 2 of their paper). Overall, our method demonstrates strong reliability and consistent classification accuracy, highlighting its potential for practical applications in relevant fields.

CONCLUSIONS
This study examines the application of multiclass classification to astronomical objects with photometric data, and the results are quite promising. The possibility of reliably identifying astronomical objects exclusively from photometric measurements presents an opportunity to select relevant, adequately characterized samples for subsequent spectroscopic observation and further studies. While this study is exploratory, it represents a crucial first step towards refining the classifications of different kinds of astronomical objects using multiband photometry. To achieve this, we train supervised ML classification techniques on spectroscopically classified objects and investigate how ML algorithms can provide accurate classifications based on photometric features. The methodology that we adopt has two main parts. Initially, we test our algorithm on a balanced dataset. Confidence in the classification accuracy on a balanced dataset is important before scaling the algorithms to an unbalanced dataset, which is done in the additional experiments. For the first part, we carry out three experiments using RF as the classification method in order to arrive at a reliable process. The purpose of these three experiments is to test the ability to classify objects into three, four, and seven different classes. Together, these three experiments allow us to investigate the possibility of categorizing galaxies into normal and emission-line galaxies, and then a finer classification of ELGs into AGN, SF, SB, and BL galaxies. The seven-class experiment shows that a considerable number of AGNs are misclassified as NGs. Therefore, we decide to merge the AGNs, SFs, SBs, and BLs into a unified class of ELGs. Then, as the second part of our study, we perform the four-class classification of stars, QSOs, NGs, and ELGs, in addition to the three-class experiment that separates stars, QSOs, and galaxies, using a variety of ML methods.
Using all the existing data, we expand our dataset in this part and train the ML models with optical and optical+IR data separately. According to the results, XGB and voting perform better than the other algorithms. It is clear from the results that the infrared features substantially improve the evaluation metrics in all the experiments. To examine how the classification of these sources depends on their brightness, we break up the samples into four bins of magnitude in the  band. All four classes are generally classified well when there are enough data in the bins. As far as NGs and ELGs are concerned, a majority of faint sources are misclassified as QSOs. In contrast, the bright NGs are mainly misclassified as ELGs and vice versa.
Our results indicate that the availability of information, the brightness of sources, the quantity of data, sample balance, sample completeness, and the uncertainty level associated with the spectroscopic classification of sources in the training set significantly contribute to the performance of our methods. Automated algorithms trained on photometric data with spectroscopic classification may offer an alternative solution to classical methods based on time-consuming spectroscopic observations. In summary, ML techniques have shown promise in classifying different types of astronomical objects, but their effectiveness can vary depending on various factors such as sample selection, feature selection, types, magnitude choices, and sample size. The performance of a classifier is influenced not only by the available information, but also by whether the training sample is large and representative enough. With the advent of new surveys, which will provide unprecedented amounts of faint data, robust big data processing will be required to efficiently analyze this information. To meet the demands of these future missions, carefully designed, interpretable, and well-tested ML models will be needed to provide reliable and trustworthy results. The framework presented in this article takes a significant step toward meeting these evolving demands.
Moreover, this study acknowledges the limitations inherent in the datasets used (SDSS and WISE), particularly the relatively poorer image quality of these surveys compared to newer ones like LSST. While our results provide insights into celestial object classification within the context of SDSS and WISE, it is crucial to exercise caution when extrapolating conclusions to surveys with improved image quality. The enhanced sensitivity of newer surveys may present unique opportunities and challenges not addressed here. Researchers using different datasets should carefully consider their specific characteristics and exercise prudence when generalizing findings to surveys with distinct observational properties. Additionally, it is worth noting the potential usefulness of morphological parameters for the LSST and similar surveys, as their quality will be significantly better.

Figure 1. The  magnitude distribution of data from SDSS and SDSS+ALLWISE catalogues. The distributions of the four classes of stars, QSOs, NGs, and ELGs are plotted in red, blue, orange, and dark blue, respectively.

Figure 2. This flowchart outlines the first part of our methodology, using color-coded shapes and arrows to represent the dataset, experiments, functions, and results. Red and orange shapes indicate the raw and processed data, respectively, while blue shapes represent functions performed. Green shapes signify the output. Arrows are color-coded to indicate input (black), data splitting (red), and output (green).

Figure 3. Color-color plots showcasing the classification of four classes of astronomical objects, stars, quasars (QSOs), normal galaxies (NGs), and emission line galaxies (ELGs), using optical and infrared data. The inclusion of WISE (or IR) information enhances accuracy in distinguishing between stars, quasars, and galaxies.

Figure 4. Visualization of the data points using supervised UMAP for four distinct classes, including normal galaxies (NG), emission line galaxies (ELG), quasars (QSO), and stars. The left panel represents the training set, while the right panel displays the test set. The input features utilized for classification are u-g, g-r, r-i, i-z, z-W1, and W1-W2.

Figure 5. Feature importance of the XGB model with optical and infrared information for Sample II in the four-class experiment.

Figure 6. Color-color distribution of NGs, ELGs, stars, and QSOs based on the most important features in the XGB classifier. The data are from Sample II with optical and infrared information.

Figure 7. Confusion matrix of the RF classifier in the seven-class experiment for Sample II.

Figure 8. The ROC curve of the RF classifier in the seven-class experiment. The solid and dashed lines, respectively, show the ROC curves of Samples I and II with optical and infrared information.
(a) QSO misclassified as other sources. (b) NG misclassified as other sources. (c) ELG misclassified as other sources. (d) Star misclassified as other sources.

Figure 9. Misclassified sources in four bins of  magnitude in the four-class experiment by the XGB classifier.

Figure 10. QSOs, NGs, ELGs, and stars misclassified as the other three sources are shown with gray circles, blue triangles, red squares, and open circles, respectively.

Table 1. Class and subclass distribution in the SDSS and SDSS+ALLWISE sources.

Table 2. Performance comparison of the RF classifier in the experiments for Samples I and II with optical and infrared information. Boldface results are compared with the literature in Section 5.

Table 3. Performance comparison of ML models in the three-class experiment for Samples I and II with optical and infrared information. Boldface results are compared with the literature in Section 5.

Table 5. The performance of all ML methods in the four-class experiment for both Samples I and II with optical and infrared information.