Application of Deep Learning in Clinical Settings for Detecting and Classifying Malaria Parasites in Thin Blood Smears

Abstract
Background. Scarcity of annotated image data sets of thin blood smears makes expert-level differentiation among Plasmodium species challenging. Here, we aimed to establish a deep learning algorithm for identifying and classifying malaria parasites in thin blood smears and evaluate its performance and clinical prospect.
Methods. You Only Look Once v7 was used as the backbone network for training the artificial intelligence algorithm model. The training, validation, and test sets for each malaria parasite category were randomly selected. A comprehensive analysis was performed on 12 708 thin blood smear images of various infective stages of 12 546 malaria parasites, including P falciparum, P vivax, P malariae, P ovale, P knowlesi, and P cynomolgi. Peripheral blood samples were obtained from 380 patients diagnosed with malaria. Additionally, blood samples from monkeys diagnosed with malaria were used to analyze P cynomolgi. The accuracy for detecting Plasmodium-infected blood cells was assessed through various evaluation metrics.
Results. The total time to identify 1116 malaria parasites was 13 seconds, with an average analysis time of 0.01 seconds for each parasite in the test set. The average precision was 0.902, with a recall and precision of infected erythrocytes of 96.0% and 94.9%, respectively. Sensitivity and specificity exceeded 96.8% and 99.3%, with an area under the receiver operating characteristic curve >0.999. The highest sensitivity (97.8%) and specificity (99.8%) were observed for trophozoites and merozoites.
Conclusions. The algorithm can help facilitate the clinical and morphologic examination of malaria parasites.

Malaria is a mosquito-borne infectious disease caused by Plasmodium species. According to the World Health Organization, >247 million malaria infection cases and 619 000 associated deaths occur worldwide [1]. The common human-infecting malaria parasites include Plasmodium falciparum, P vivax, P malariae, P ovale, P knowlesi, and P cynomolgi [2]. Plasmodium parasites proliferate in human red blood cells (RBCs) after infection, causing periodic systemic chills, fever, and sweating [3]. P falciparum infection can develop into fatal cerebral malaria or involve morbidity of multiple organs, such as acute renal failure and respiratory distress syndrome [4,5]. However, early diagnosis and timely treatment can effectively reduce mortality [6].
Plasmodium infection is diagnosed by microscopic examination of peripheral blood smears, polymerase chain reaction (PCR) [7], and antigen-detecting rapid diagnostic tests [8]. PCR is time-consuming and expensive and requires sensitive equipment. Meanwhile, rapid diagnostic tests are less sensitive and are susceptible to cross-reaction with other infectious disease parasites that share similar antigens with Plasmodium, which yields false-positive results [9].
The Centers for Disease Control and Prevention has established microscopic examination of Wright-Giemsa-stained thick and thin blood smears as the gold standard for malaria diagnosis [2]. Thick blood smears are used to assess the density of malaria parasites, whereas thin blood smears are used to diagnose malaria and identify all parasite stages and Plasmodium species [10]. The severity of malaria infection can be determined by the percentage of Plasmodium-infected RBCs, and the antimalarial treatment effect can be simultaneously monitored. However, the reliability of microscopic examination for diagnosing malaria depends largely on experienced morphologic experts and is thus subjective and time-consuming. Furthermore, it is difficult to standardize the microscopic examination of malaria parasites, especially in developing nations with few experienced parasitologists [11].
Machine learning and automatic image recognition technologies have recently seen increasing application in disease diagnosis [12][13][14][15]. Implementing an automatic image analysis system based on deep learning can potentially mitigate the reliance on professional morphologists in the field of microscopy, thus reducing the time demands for clinical laboratory personnel and enhancing the efficiency and accuracy of malaria diagnosis. You Only Look Once v7 (YOLOv7) is a cutting-edge object detection system that employs a unified neural network architecture to perform object detection and classification in real time. Its ability to rapidly and accurately identify objects in images renders it particularly suitable for malaria diagnosis. However, to date, few studies have reported the use of automated image-processing algorithms for Plasmodium detection based on deep learning, and most of these studies relied on public data sets [16][17][18][19][20][21][22][23][24][25][26][27]. To our knowledge, there are no reports on the classification and staging of the 6 Plasmodium species detected in thin blood films from multicenter clinical samples according to the YOLOv7 algorithm.
Therefore, in this study, we aimed to establish a cutting-edge deep learning algorithm for the identification of Plasmodium species and their development stages in thin blood smears, evaluate its efficacy in augmenting microscopic examinations, and explore its potential clinical applications.

Collection and Labeling of Microscopic Images
A total of 380 peripheral blood samples in dipotassium EDTA-containing tubes were obtained from patients diagnosed with malaria at the Peking Union Medical College Hospital and Yunnan Institute of Parasite Diseases from 2009 to 2021 (for patient distribution, see Supplementary Table 1). Blood samples of patients with >1 blood-borne infection were excluded. In addition, a single peripheral blood sample in a dipotassium EDTA-containing tube was obtained from a monkey diagnosed with P cynomolgi infection. Wright-Giemsa-stained thin blood smears were prepared according to the standardized procedure [2]. This study was approved by the Ethics Committee of the Peking Union Medical College Hospital (I-22PJ266). Written informed consent was obtained from all patients. A microscope (CX31; Olympus) and camera (acA1920; Basler) were used to capture 12 708 images (1920 × 1200 pixel resolution). The computer hardware configuration consisted of an Intel i9-9900K CPU with 32 GB of RAM and an Nvidia RTX 2080Ti GPU. The operating system was Ubuntu 18.04 (Canonical), the technical framework was PyTorch 1.3 (Meta AI), and the functional code was written in Python 3.5. Plasmodium species were identified with real-time fluorescent PCR and nested PCR methods (for details of analysis steps, see Supplementary Text 1). Malaria parasites were labeled by 2 experienced parasitologists and reviewed by a third expert. Discrepancies were resolved by a fourth expert.

Segmentation and Feature Extraction
The YOLOv7 algorithm was adopted to complete the detection and recognition of Plasmodium through a 1-stage strategy for segmentation and feature extraction (ie, both were completed in 1 step). The algorithm framework is shown in Figure 1.
Backbone Network. The YOLOv7 algorithm adopted the Cross Stage Partial Network as the backbone network (Supplementary Figure 1). The amount of calculation was reduced by 20% as compared with that of the original network while still ensuring accuracy. This network thus achieved a suitable balance of accuracy and speed, which makes it better suited to practical application.
The Cross Stage Partial Network removed the bottleneck layer (ie, the 1 × 1 convolution layer commonly used in ResNext and other networks) to avoid repeated gradient conduction and useless parameter calculation. The feature map was divided into feature maps A and B along the channel dimension. Feature map A was convolved and transformed into feature map A′′; feature maps A′′ and B were fused to form a cross stage partial (CSP) block; and each block was calculated similarly.
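The channel split and fusion described above can be sketched in a few lines. This is a minimal illustration, not the network used in the study: the function name `csp_block`, the even split, and the placeholder `transform` (standing in for the convolution stack applied to feature map A) are all assumptions made for the example.

```python
def csp_block(feature_map, transform):
    """Sketch of a cross stage partial (CSP) block.

    feature_map: list of channels (each channel is any object, e.g. a 2D list).
    transform:   hypothetical callable standing in for the convolutions
                 applied to part A; part B bypasses it unchanged.
    """
    half = len(feature_map) // 2          # assumed even split along channels
    part_a, part_b = feature_map[:half], feature_map[half:]
    part_a_out = transform(part_a)        # A -> A'' (processed path)
    return part_a_out + part_b            # fuse A'' and B by concatenation


# Toy usage: four 1-pixel channels; the "convolution" just scales by 10.
channels = [[1], [2], [3], [4]]
scaled = csp_block(channels, lambda chs: [[v * 10 for v in c] for c in chs])
```

Only half of the channels pass through the transform, which is where the reduction in computation comes from in this simplified picture.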
Feature Neck Layer. To improve small-target detection, YOLOv7 employed the Path Aggregation Network (PAN) algorithm for multiscale detection and feature fusion. The shallow feature map contained accurate location information conducive to network positioning, whereas the deep feature map contained rich high-level semantic information conducive to network classification; the combination of both maps can yield accurate positioning and classification features. The PAN algorithm deepened the feature path based on the feature pyramid network, thereby ensuring that the final classification and regression features contained high-precision location information and rich high-level semantic information to obtain more accurate positioning and classification results (Supplementary Figure 2).
Structural Reparameterization. The network structure based on "residual connections" had a high recognition accuracy; however, this was offset by its high memory and video memory consumption during inference, thereby reducing the inference speed and affecting the practical value of the model. Therefore, the structural reparameterization method exploited the "additivity" property of the convolution operation as follows: $\mathrm{conv}(x, w_1) + \mathrm{conv}(x, w_2) + \mathrm{conv}(x, w_3) = \mathrm{conv}(x, w_1 + w_2 + w_3)$, where $w_1$, $w_2$, and $w_3$ were of the same size and the 1 × 1 convolution was treated as a 3 × 3 convolution with all positions except the center point set to 0. This ensured high recognition accuracy, considerably reduced the time required for inference, and maximized the performance of the model in terms of speed and accuracy (Supplementary Figure 3).
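The additivity property can be checked numerically. The sketch below is a single-channel illustration in plain Python, assuming zero padding, no bias, and no batch normalization; the helper names (`conv2d_same`, `pad_1x1_to_3x3`, `add_kernels`, `add_maps`), the toy image, and the kernel values are invented for the example, and treating the shortcut branch as a 1 × 1 kernel with weight 1 is likewise an assumption rather than a detail given in the text.

```python
def conv2d_same(image, kernel):
    """3x3 sliding-window convolution (cross-correlation) with zero padding."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            total = 0.0
            for di in range(3):
                for dj in range(3):
                    y, x = i + di - 1, j + dj - 1
                    if 0 <= y < h and 0 <= x < w:
                        total += image[y][x] * kernel[di][dj]
            out[i][j] = total
    return out


def pad_1x1_to_3x3(weight):
    """Embed a 1x1 kernel at the center of a 3x3 kernel of zeros."""
    return [[0.0, 0.0, 0.0], [0.0, weight, 0.0], [0.0, 0.0, 0.0]]


def add_kernels(*kernels):
    return [[sum(k[i][j] for k in kernels) for j in range(3)] for i in range(3)]


def add_maps(*maps):
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*maps)]


image = [[1.0, 2.0, 0.0, 1.0], [0.0, 1.0, 3.0, 2.0], [2.0, 0.0, 1.0, 1.0]]
w1 = [[0.5, -1.0, 0.0], [1.0, 2.0, -0.5], [0.0, 1.0, 0.25]]  # 3x3 branch
w2 = pad_1x1_to_3x3(0.75)  # 1x1 branch, padded to 3x3
w3 = pad_1x1_to_3x3(1.0)   # shortcut branch written as a 1x1 kernel (assumption)

# Training-time view: three branches, three convolutions, outputs summed.
three_branches = add_maps(conv2d_same(image, w1),
                          conv2d_same(image, w2),
                          conv2d_same(image, w3))
# Inference-time view: kernels summed once, then a single convolution.
merged = conv2d_same(image, add_kernels(w1, w2, w3))

max_diff = max(abs(a - b)
               for row_a, row_b in zip(three_branches, merged)
               for a, b in zip(row_a, row_b))
```

Because convolution is linear in its kernel, `max_diff` is zero up to floating-point error, so the merged single-branch form gives the same output with one third of the convolutions.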
Prediction Layer. The PAN-generated feature map was used for multiscale prediction. The feature map contained 3 scales, with 3 anchor boxes at each feature point in each scale. Each anchor box produced a vector of size (confidence + detection frame coordinates) + number of categories. Each feature map outputted 64 × 64 × 3, 31 × 31 × 3, and 15 × 15 × 3 detection targets, resulting in 5282 detection results.
Nonmaximum Suppression. To suppress redundant and low-quality test results, the nonmaximum suppression algorithm was used to deduplicate the test results. The nonmaximum suppression algorithm sorted the detection results by category according to the confidence level and calculated the intersection-over-union ratio among all detection frames in each category. Among overlapping frames, the detection frame with the highest confidence was used as the final classification and segmentation result.
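A minimal sketch of per-category nonmaximum suppression follows. It is a generic textbook version, not the study's code: the box format `(x1, y1, x2, y2)`, the function names, the stage labels in the toy data, and the 0.5 intersection-over-union threshold are all assumptions, since the text does not state the threshold that was used.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(detections, iou_threshold=0.5):
    """detections: list of (box, confidence, category) tuples.

    Within each category, keep the most confident box and drop every
    remaining box that overlaps it by at least iou_threshold; repeat.
    """
    kept = []
    for category in sorted({d[2] for d in detections}):
        candidates = sorted((d for d in detections if d[2] == category),
                            key=lambda d: d[1], reverse=True)
        while candidates:
            best = candidates.pop(0)
            kept.append(best)
            candidates = [d for d in candidates
                          if iou(best[0], d[0]) < iou_threshold]
    return kept


# Toy detections (made-up values): two near-duplicate boxes on one cell,
# one separate cell, and one box of another category.
detections = [
    ((10, 10, 50, 50), 0.9, "ring"),
    ((12, 12, 52, 52), 0.8, "ring"),      # duplicate of the first box
    ((100, 100, 140, 140), 0.7, "ring"),
    ((11, 11, 51, 51), 0.6, "schizont"),  # other category, not compared
]
kept = nms(detections)
```

Suppression runs separately per category, so a box of one class never removes a box of another class even when the two overlap.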

Model Training
The training, validation, and test sets used for the model were selected by stratified sampling at a ratio of 8:1:1 for each parasite stage. The training process is shown in Supplementary Figure 4.
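One way to realize such a split is to shuffle each label group separately and cut it at the 8:1:1 boundaries. The sketch below assumes samples are `(image_id, label)` pairs and uses a fixed seed for reproducibility; the function name, the seed, and the toy labels are illustrative and not taken from the study.

```python
import random
from collections import defaultdict


def stratified_split(samples, ratios=(8, 1, 1), seed=0):
    """Split (image_id, label) pairs into train/val/test per label group."""
    by_label = defaultdict(list)
    for sample in samples:
        by_label[sample[1]].append(sample)

    rng = random.Random(seed)
    total = sum(ratios)
    train, val, test = [], [], []
    for label in sorted(by_label):
        group = by_label[label]
        rng.shuffle(group)
        n_train = len(group) * ratios[0] // total
        n_val = len(group) * ratios[1] // total
        train += group[:n_train]
        val += group[n_train:n_train + n_val]
        test += group[n_train + n_val:]   # remainder goes to the test set
    return train, val, test


# Toy usage with made-up counts: 100 "ring" and 50 "schizont" samples.
samples = [(f"img_{i}", "ring") for i in range(100)]
samples += [(f"img_{i}", "schizont") for i in range(100, 150)]
train, val, test = stratified_split(samples)
```

Splitting within each group keeps rare stages represented in all three sets, which a single global shuffle does not guarantee.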
Data Preprocessing. Before use, the images were preprocessed via the following steps (Supplementary Text 2):
1. Median filtering and noise reduction processing
2. Up-and-down and left-and-right flips
3. Random lighting, brightness, contrast, and saturation changes
4. Image filtering
5. Gaussian white noise addition to the image
6. Image mosaic
7. Image mixup
8. Image cropping and scaling to 640 × 640 pixels
9. Image normalization: pixel values were standardized to a normal distribution with a mean of 0 and variance of 1, and NumPy arrays were converted to tensors.
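Two of the listed steps, the flips (step 2) and the normalization (step 9), are simple enough to sketch directly. The example below works on a single-channel image stored as a list of rows and standardizes with per-image statistics; both choices are simplifying assumptions, because the text does not say whether the mean and variance were computed per image or over the data set.

```python
import math


def flip_left_right(image):
    """Mirror each row (horizontal flip)."""
    return [row[::-1] for row in image]


def flip_up_down(image):
    """Reverse the row order (vertical flip)."""
    return [row[:] for row in reversed(image)]


def standardize(image):
    """Rescale pixel values to mean 0 and variance 1 (per-image statistics)."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    variance = sum((p - mean) ** 2 for p in pixels) / len(pixels)
    std = math.sqrt(variance) or 1.0   # avoid dividing by 0 on flat images
    return [[(p - mean) / std for p in row] for row in image]


# Toy 2 x 2 image with made-up intensities.
toy = [[0, 2], [4, 6]]
normalized = standardize(toy)
flat = [p for row in normalized for p in row]
norm_mean = sum(flat) / len(flat)
norm_var = sum((p - norm_mean) ** 2 for p in flat) / len(flat)
```

For bounding-box training data, any flip applied to the image must also be applied to the box coordinates, which this sketch leaves out.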
Hyperparameter Settings. The initial learning rate was 1e-3, the weight decay was 1e-4, and α was 0.99. The learning rate decay method was cosine, and the batch size was set to 12.
Model Training Set. Stochastic gradient descent was used as the optimizer to iterate and optimize the model, and the maximum number of training epochs was 300. The model weight with the highest accuracy in the validation set was saved as the candidate weight. Changes in recall rate, model accuracy, mean average precision (mAP), box loss, and confidence loss during model training were recorded.
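For orientation, a cosine learning rate schedule with the stated initial rate (1e-3) and epoch budget (300) can be written as a one-line formula. The sketch assumes decay to a final rate of 0 with no warm-up phase; neither detail is given in the text, and the function name is illustrative.

```python
import math


def cosine_lr(epoch, total_epochs=300, initial_lr=1e-3, final_lr=0.0):
    """Cosine decay from initial_lr at epoch 0 to final_lr at total_epochs."""
    progress = min(max(epoch, 0), total_epochs) / total_epochs
    return final_lr + 0.5 * (initial_lr - final_lr) * (1 + math.cos(math.pi * progress))


# Learning rate at the start, midpoint, and end of training.
schedule = [cosine_lr(e) for e in (0, 150, 300)]
```

The rate falls slowly at first, fastest around the midpoint (where it equals half the initial value), and flattens again near the end of training.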

Efficiency Analysis
The total analysis time for 1116 parasites in 782 images was 13 seconds, and the average analysis time for a single image was 0.016 seconds. The average recognition time for a single parasite was 0.014 seconds, and mAP was 0.902. The precision-recall curve of the algorithm is shown in Figure 2A. Figure 2B and 2C show the heat map for determining the malaria parasite stages: Figure 2B shows the original picture, and Figure 2C shows the attention of the neural network when determining parasite characteristics. Red and blue depict high and low attention, respectively. The algorithm exhibited a high level of attention to the actual characteristics of malaria parasites (Figure 2C).
Identification of Infected RBCs. For the 1116 parasites in the test set, the recall and precision for identifying infected erythrocytes were 96.0% and 94.9%, respectively. Figure 3 shows an example of the output of the algorithm.
Classification and Staging of Plasmodium. The overall sensitivity of the algorithm to parasites was 96.0% (for confusion matrix, see Supplementary Table 2). The performance of the algorithm in classifying and staging Plasmodium is summarized in Table 2, and the receiver operating characteristic curve is shown in Figure 4. The sensitivity and specificity of the algorithm for species classification exceeded 96.8% and 99.3%, respectively, and the area under the curve (AUC) was >0.999. Among the 4 stages, the highest sensitivity (97.8%) and best specificity (99.8%) were observed for the ring stage and schizonts, respectively. The 4 stages had an AUC >0.983. An example of the output result of the algorithm is shown in Supplementary Figure 5.
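Per-class sensitivity and specificity of this kind are typically derived from a confusion matrix in a one-vs-rest fashion. The sketch below shows that calculation on a made-up 2 × 2 matrix; the counts, the class names, and the function name are illustrative only and are not the study data in Supplementary Table 2.

```python
def one_vs_rest_metrics(confusion, labels):
    """Per-class sensitivity and specificity from a confusion matrix.

    confusion[i][j] is the number of samples with true class i
    that were predicted as class j.
    """
    total = sum(sum(row) for row in confusion)
    results = {}
    for k, label in enumerate(labels):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                 # true k, predicted other
        fp = sum(row[k] for row in confusion) - tp  # true other, predicted k
        tn = total - tp - fn - fp
        results[label] = {
            "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
            "specificity": tn / (tn + fp) if tn + fp else 0.0,
        }
    return results


# Toy matrix (illustrative counts only).
labels = ["class_a", "class_b"]
confusion = [[9, 1],
             [2, 8]]
metrics = one_vs_rest_metrics(confusion, labels)
```

Sensitivity here is the same quantity as recall for that class, whereas specificity measures how rarely samples of the other classes are assigned to it.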

DISCUSSION
In this study, we developed an algorithm based on deep learning for Plasmodium species classification and staging in thin blood smears. The performance of the model was good; the scanning time of a single image was 0.016 seconds; and the mAP of the algorithm was 0.902. The recall and precision were 96.0% and 94.9%, respectively. The sensitivity and specificity for the 6 Plasmodium species were ≥96.8% and ≥99.3%. Thus, the algorithm can provide effective and accurate auxiliary diagnostic information for the clinical detection of Plasmodium in thin blood smears. Timely and accurate detection of malaria parasites is essential for diagnosing and treating malaria. There are several approaches to malaria diagnosis, with microscopic examination representing the gold standard. However, malaria may be undetected in regions with inadequate medical resources because of limited professional capacity. Therefore, morphologic examination of parasitemia is currently a major challenge. With the gradual increase in the application of artificial intelligence (AI) in the medical field, deep learning algorithms can be used to detect malaria parasites [28]. Yet, parasite images in blood smears currently constitute a bottleneck for machine learning training [29]: large numbers of images must be labeled and annotated, which is time-consuming and laborious, and the method requires the participation of experienced parasite morphologists.
Previous studies have often used public data sets containing a single Plasmodium species (eg, P vivax on https://data.broadinstitute.org/bbbc/BBBC041/) or lacking image segmentation (eg, National Institutes of Health, https://lhncbc.nlm.nih.gov/publication/pub9932) for model training [16,20,24,26,27]. The performance of the model depends largely on the quality and extent of the training sets and the similarity between the validation and training images. The thin blood films in the present study were obtained from 2 centers and were labeled by experienced morphologic experts, ensuring the quality of the data set. Kuo et al [21] reported that the AUC of their established algorithms for identifying P falciparum in 8145 images from 36 thin blood smears was 0.997, with a specific AUC of 0.995 for the ring stage. The present study yielded comparable results; however, the established algorithm incorporated the identification of 6 Plasmodium species and achieved an mAP of 0.902, which was superior to the mAP (0.885) reported by Kuo et al. Furthermore, Sriporn et al [24] reported the accuracy, precision, and recall rates for identifying a single Plasmodium species in 7000 images from the public data of the National Library of Medicine to be within the ranges of 76.07% to 99.28%, 76.07% to 99.29%, and 78.05% to 99.28%, respectively. Yet, the duration of their process was substantially longer, from 14 to 125 minutes, and the scope of their analysis was limited to a single Plasmodium species.
In the present study, the YOLOv7 algorithm was used as the primary technique for detecting Plasmodium. The average analysis time was 0.02 seconds per single field image with dimensions of 15 × 15 mm, which is considerably shorter than that reported by Sriporn et al (1.3-10 seconds per image). The analysis of 22 500 field images from thin blood smears resulted in an overall analysis time of approximately 7.5 minutes, demonstrating the efficacy of the YOLOv7 algorithm in reducing the analysis time. The key factor for the efficiency and accuracy of the YOLOv7 algorithm lies in its treatment of object detection as a regression problem, in which target region and target category predictions are integrated into a single neural network model. This approach enables the direct transformation of the input image into output object locations and classifications, enhancing the detection speed and precision. In contrast, previous studies have mainly employed traditional image-processing techniques or slower 2-stage detectors, such as Faster R-CNN [20,30], VGG-16 [31], and SPPnet [32], which often result in long analysis times and low accuracy.
In the present study, 12 708 thin blood smear images with different parasite species and development stages were obtained from multicenter clinical settings. These images were labeled by experienced morphologic experts according to the criteria of the Centers for Disease Control and Prevention. Studies on the accuracy of detecting and classifying the 6 Plasmodium species included in this study in thin blood smears are lacking. Our findings demonstrated that the proposed algorithm achieved a high level of recognition sensitivity and specificity for the 6 Plasmodium species tested, with values consistently exceeding 96.8% and 99.3%, respectively. Although 5 of these parasites (P falciparum, P vivax, P malariae, P ovale, and P knowlesi) are common human-infecting Plasmodium species, P cynomolgi can infect monkeys and infrequently humans; thus, we chose to include it in our analysis. Malaria parasites at different development stages have different biological and metabolic characteristics in the host and mosquito, with consequently different degrees of sensitivity and resistance to different types of drugs and treatments. Understanding the development stage of the malaria parasite can thus help clinicians select the most effective treatment plan and improve treatment outcomes [2]. Here, the YOLOv7 algorithm performed efficiently and accurately, providing clinical technologists with a valuable tool for the microscopic examination of malaria parasites and a means of standardizing the examination of thin blood smears.
This study has some limitations. First, in the training and validation sets, the numbers of P ovale and schizont-stage parasites were small. Thus, future studies should include larger training sets for these parasites to improve the accuracy of the algorithm. Moreover, the sample size of the training and validation sets for Plasmodium species other than P falciparum and for stages other than the ring stage was limited. Second, this study was performed in a reference laboratory, and additional studies are currently underway to test the approach in a clinical setting. In our future studies, we will focus on establishing an AI algorithm to analyze thick blood smear images, which will encompass parasite density calculation and contribute to a more advanced application of deep learning to support malaria diagnosis.

Figure 2. A, Precision-recall curve of the algorithm. Heat map for classifying malaria parasites: B, the original picture; C, the heat map. Red and blue depict high and low attention, respectively. mAP, mean average precision.

Figure 3. Example output of the algorithm. Boxes indicate Plasmodium.

Figure 4. Receiver operating characteristic curves of the model for the different stages and classification of Plasmodium. AUC, area under the curve.

Table 2. Performance of the Algorithm for Plasmodium Species Classification With PCR as the Reference Method
Abbreviation: PCR, polymerase chain reaction.