Shivani Chiranjeevi, Mojdeh Saadati, Zi K Deng, Jayanth Koushik, Talukder Z Jubery, Daren S Mueller, Matthew O’Neal, Nirav Merchant, Aarti Singh, Asheesh K Singh, Soumik Sarkar, Arti Singh, Baskar Ganapathysubramanian, InsectNet: Real-time identification of insects using an end-to-end machine learning pipeline, PNAS Nexus, Volume 4, Issue 1, January 2025, pgae575, https://doi.org/10.1093/pnasnexus/pgae575
Abstract
Insect pests significantly impact global agricultural productivity and crop quality. Effective integrated pest management strategies require identification of both beneficial and harmful insects. Automated identification of insects under real-world conditions presents several challenges, including the need to handle intraspecies dissimilarity and interspecies similarity, life-cycle stages, camouflage, diverse imaging conditions, and variability in insect orientation. An end-to-end approach for training deep-learning models, InsectNet, is proposed to address these challenges. Our approach has the following key features: (i) uses a large dataset of insect images collected through citizen science along with label-free self-supervised learning to train a global model, (ii) fine-tuning this global model using smaller, expert-verified regional datasets to create a local insect identification model, (iii) which provides high prediction accuracy even for species with small sample sizes, (iv) is designed to enhance model trustworthiness, and (v) democratizes access through streamlined machine learning operations. This global-to-local model strategy offers a more scalable and economically viable solution for implementing advanced insect identification systems across diverse agricultural ecosystems. We report accurate identification (>96% accuracy) of numerous agriculturally and ecologically relevant insect species, including pollinators, parasitoids, predators, and harmful insects. InsectNet provides fine-grained insect species identification, works effectively in challenging backgrounds, and avoids making predictions when uncertain, increasing its utility and trustworthiness. The model and associated workflows are available through a web-based portal accessible through a computer or mobile device. We envision InsectNet to complement existing approaches, and be part of a growing suite of AI technologies for addressing agricultural challenges.
Insects constitute the most varied group of species among eukaryotes on Earth, necessitating automated identification capable of recognizing a vast array of insect species. Their roles as pollinators, predators, and food sources are vital for ecosystems. Automated identification aids in ecosystem understanding, pest control, and education. However, diverse visual features and lifecycle variations pose challenges. InsectNet employs pretraining and self-supervised learning, accurately identifying over 2,500 species. Integrating out-of-distribution detection and conformal prediction bolsters accuracy and user confidence. This is the first work in which a deep learning model has been trained on such a large dataset to classify over 2,500 insect species, addressing a multitude of challenges while fostering user trust in its predictions.
Introduction
In the United States, agriculture, food, and other related industries contributed $1.26 trillion to the gross domestic product (GDP) in 2021 (1). Insects, observed at all stages of plant growth, negatively affect the quality and quantity of crop yields in agriculture, and the risk of invasion by new insects and transmission of insect-induced diseases is anticipated to increase with rising temperatures (2). Several insect species have high fecundity and over-wintering ability—for example, Lycorma delicatula (spotted lanternfly [SLF])—and consequently exhibit rapid spread across large areas in a limited amount of time, devastating crops, orchards, and logging industries. Increased trade and travel make it easier for invasive insect species to access previously uncolonized geographic locations. For example, SLF has reached several states in the northeastern and mid-Atlantic regions of the United States, threatening crop species ranging from ornamental crops to fruit and tree species (3). SLF is projected to reach and establish in California by 2033 if preventative measures are not taken immediately to limit its spread (4).
Given this threat, accurate detection of insects is imperative for prompt, timely, and optimal decision-making (5). By identifying the specific insect species that cause damage early in its life cycle, farmers can apply targeted pest control methods instead of blanket pesticide spray in the whole field. This reduces the risk of harm to beneficial insect species and other nontarget organisms. Furthermore, accurate spatio-temporal identification of insects can result in effective pest-control measures, which reduce crop losses, increase farm operation profitability and sustainability, and reduce chemical runoff into water bodies (6).
Growers have long relied on manual identification and quantification of insect infestation on crops; however, this method is challenging at all farming scales due to the limited availability of experts (especially in remote and rural locations) and expertise levels for accurate identification. This work develops an end-to-end machine-learning (ML) pipeline to address this issue, resulting in a web application for automatic insect identification on computers or handheld devices (smartphones). Users upload a photograph of an insect to a database. The app identifies the insect and returns a prediction of its taxonomic classification and its role in the ecosystem as a pest, predator, pollinator, parasitoid, decomposer, herbivore, indicator, or invasive species.
The past few years have seen various attempts to automate insect identification, with the earliest attempts using classical ML methods (7, 8) and more recent efforts using deep learning-based approaches (9–14). Most efforts have focused on relatively small labeled datasets spanning a modest number of clearly distinguishable insect species. Recent studies have leveraged deep learning to enhance insect detection across various crops (15). For citrus, CNN-based methods have effectively identified insects like citrus leafminer and sooty mold, though they call for more sophisticated data collection and model enhancements (16, 17). In rice, approaches such as leaf reflectance spectra and machine vision systems have been used to detect insects early, despite challenges in sensor deployment (18, 19). Cotton insect detection has benefited from GIS frameworks and LSTMs, highlighting the importance of developing robust datasets and advanced modeling techniques (20, 21). In the context of insect detection, particularly for small and challenging insects like aphids, Li et al. (22) developed an enhanced YOLOv5 algorithm to accurately recognize and count green peach aphids in a climate-controlled chamber. Their approach demonstrated significant improvements in recognition and counting accuracy. Additionally, Hansen et al. (9) utilized images of ground beetle specimens scanned at museums to develop a classifier, demonstrating the versatility of AI models in different settings beyond traditional agricultural contexts.
Recent advancements in AI-based insect identification have expanded the potential for automated field monitoring, addressing challenges like taxonomic ambiguity, data scarcity, and biodiversity tracking across diverse environments. Geissmann et al. (23) introduced Sticky Pi, an autonomous insect trap that combines image capture with deep learning to monitor insect activity and diversity in real time, providing insights into ecological interactions and circadian biology. Similarly, a hierarchical classification approach presented by Bjerge et al. (24) leverages multitask learning and anomaly detection to improve taxonomic classification across multiple ranks, enhancing reliability for in situ biodiversity monitoring. Roy et al.’s (25) Automated Monitoring of Insects (AMI) system uses domain adaptation to integrate large datasets like GBIF for nocturnal insect detection and classification, reducing the need for extensive labeled data in new field applications. Hoye et al. (5) demonstrated how deep learning and sensor-based monitoring could track insect abundance and diversity at scale, providing ecological insights into seasonal and diurnal activity patterns of flower-visiting insects. Additionally, Bjerge et al. (26) developed the Insect Classification and Tracking (ICT) system, which combines camera trapping with real-time detection, classification, and tracking of insect activity, offering valuable noninvasive monitoring of pollinator behavior and phenology.
A recent comprehensive review uncovered numerous obstacles and shortcomings in the realm of image-based insect identification and classification (27). These issues include the narrow scope of current datasets (which only cover a few insect species, natural habitats, and regions); unbalanced datasets that complicate machine learning; unidentified insect species within geographic areas; and challenging scenarios such as overlapping insects, morphologically similar species, and intraspecies variations. Additionally, such methods must handle the recognition of insects throughout their life stages and be able to interpret low-quality images due to fast-moving insects or poor lighting conditions. Species-level classification may overlook valuable information such as higher-level taxonomy, sex, and life stage. All of these issues complicate image-based identification and offer opportunities for AI-enabled identification.
InsectNet addresses these limitations by leveraging a comprehensive and detailed image dataset collected through iNaturalist (28). This citizen scientist platform allows users to upload labeled photographs of specific organisms. Citizen science efforts can produce large, diverse, high-quality, community-usable datasets (29, 30) that serve as the foundation for building automated insect classifiers by leveraging the collective strength of its users (the crowd) for data curation. The user community is responsible for identifying the observations, with consensus from multiple users validating each identification. Every successful identification enhances the communal knowledge pool, contributing to a broader understanding of global biodiversity. The iNaturalist dataset holds taxonomically relevant data for insect identification and classification problems, including hierarchical categories that include kingdom, phylum, class, order, family, genus, and species.
More details on the iNaturalist dataset are provided in the Supplementary Information (SI). From this dataset, we extract images rated as "research grade" of the 2,526 insect species reported as the most agriculturally and ecologically important (see SI: Section 1.A for a list of insect species, common names, scientific names, number of images per species, taxonomic information, and their roles in the ecosystem).
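For readers who want to assemble a similar dataset, the sketch below pulls photo URLs of "research grade" observations from the iNaturalist API using the open-source pyinaturalist client. This is an illustrative stand-in, not the paper's own pipeline (the authors' dedicated tool for this, iNatSD, is described later); the species name and paging values are placeholders.

```python
# Hypothetical sketch: fetching "research grade" insect photos from the
# iNaturalist API with the open-source pyinaturalist client. Parameters
# here are placeholders, not the authors' iNatSD workflow.
from pyinaturalist import get_observations

def research_grade_photo_urls(species_name: str, max_pages: int = 2):
    """Yield photo URLs for research-grade observations of one species."""
    for page in range(1, max_pages + 1):
        resp = get_observations(
            taxon_name=species_name,
            quality_grade="research",  # community-validated identifications
            photos=True,               # only observations with photos
            per_page=200,
            page=page,
        )
        for obs in resp["results"]:
            for photo in obs.get("photos", []):
                yield photo["url"]

if __name__ == "__main__":
    for url in research_grade_photo_urls("Danaus plexippus", max_pages=1):
        print(url)
```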
While our primary focus is on agricultural and ecological applications, we show that InsectNet can also aid in classifying insects in museum collections and other smaller datasets. We do this by fine-tuning InsectNet on expert-curated datasets, including museum collections (see results and SI).
We leveraged machine learning operations (MLOps) to ensure a seamless and efficient deployment pipeline (Fig. 1). Our model development uses recent advances in self-supervised training (31), inter- and intra-domain transfer learning, out-of-distribution detection (32–35), and conformal predictions (36) to train InsectNet. During self-supervised learning (SSL) pretraining, the model learns to build a robust low-dimensional feature space of the images based on the (dis)similarities within the data itself, independent of potentially inaccurate or noisy labels. This characteristic makes SSL particularly suitable for dealing with datasets with prevalent label noise, as with many large-scale biological datasets. Subsequent fine-tuning requires a much smaller dataset (see Table 1), and we use smaller datasets to train InsectNet on various sets of insect species, with our flagship model able to recognize 2,526 insect species. Users can upload images to the InsectNet classifier web app, and we provide real-time predictions while storing these user-uploaded images for future use. These stored data become a valuable resource for updating the model pretraining, thus enhancing the classifier's robustness. Furthermore, our MLOps strategy includes periodic updates to the model's backbone, accommodating additional insect species and taxonomic changes. We integrate continuous monitoring and maintenance practices to ensure the app's reliability and performance, creating a dynamic system that adapts to emerging challenges and evolving insect classifications. This ML engineering approach enables us to deliver a continuously improving insect classification experience for our users. The classifier exhibits greater than 96% classification accuracy on a large set (2,526 insect species categories) of relevant insect species. In contrast, the previous best classifier trained on the Insecta class (using the 2017 iNaturalist test dataset (28)) exhibited a top-1 accuracy of 77.1%.
Fig. 1. The end-to-end pipeline of the InsectNet classifier consists of three components: training, inference, and democratization. The training of the InsectNet classifier is a three-fold process that involves two levels of pretraining using 3.6 billion images (Instagram hashtags [SWAG]) and 12 million images (unlabeled), respectively (refer SI: Section 2.C), and end-to-end fine-tuning using 6 million labeled images (or other expert-curated data, see Results section). During inference, a user uploads a captured image, and the classifier model proceeds through two wrapper modules: the out-of-distribution module and the conformal prediction module (refer SI: Section 4). As part of our effort at democratization of these tools, we showcase how our fully trained classifier with the wrapper modules is deployed on a publicly available web app and how it could also be used for custom downstream tasks.
Table 1. The table showcases the effectiveness of the SSL model and demonstrates how utilizing even a small number of samples (10, 20, and 50) to fine-tune the SSL-pretrained InsectNet can yield good accuracies.

| Dataset | Classes | Baseline accuracy (%) | InsectNet, all data (%) | k = 50 (%) | k = 20 (%) | k = 10 (%) |
|---|---|---|---|---|---|---|
| Museum Dataset (9) | 291 species | 51.9 | 96.9 | 93.9 | 85.2 | 74.2 |
| BioScan-1M Dataset (39) | 40 families | 97.6 | 98.1 | 75.8 | 57.5 | 47.5 |
| Butterfly Dataset (40) | 35 species | 87.0 | 87.5 | 77.6 | 66.2 | 59.6 |

The last four columns report InsectNet k-shot accuracy. This is particularly valuable in downstream tasks where labeled data are scarce. We report top-1 classification accuracies for all experiments.
Technical workflow of InsectNet
Citizen science collected dataset
We selected a subset of the class Insecta from the full iNaturalist dataset (roughly 12 million images; see Fig. 1). This subset consisted of insect images of around 100,000 distinct insect species. We further filtered these data to identify a subset of 2,526 species categories of relevant insects: beneficial and harmful insects. This dataset, composed of roughly 6 million images, has been curated and quality-checked by domain experts to ensure reasonably accurate species labels. The labeled images span 17 insect orders, with Lepidoptera containing the highest number of species (1,430 species) and Zygentoma the lowest (three species). The dataset comprises insects of varying sizes, from the smallest species, such as Aphis nerii (oleander aphid or sweet pepper aphid) at 2–3 mm, to larger ones like Hyalophora cecropia (cecropia moth), the largest moth in North America, with a wingspan reaching 15–20 cm. Within the orders, some charismatic species, such as Danaus plexippus (monarch butterfly) from the order Lepidoptera, have tens of thousands of images. In contrast, other insect species, like Nisitrus vittatus (common bush cricket) from the order Orthoptera, have as few as 38 images. This is nearly three orders of magnitude of variation in data availability and highlights a significant data imbalance challenge (37, 38) that can make the training of deep learning models nontrivial.
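To make the imbalance concrete, a common training-time mitigation is inverse-frequency sampling, which rebalances mini-batches so rare species appear roughly as often as common ones. The PyTorch sketch below is illustrative only; it is not necessarily part of InsectNet's training recipe, whose post hoc fix for rare classes (AlphaNet) is described later.

```python
# A minimal sketch (PyTorch) of one standard mitigation for the ~1,000x
# class imbalance described above: inverse-frequency weighted sampling.
# Illustrative only; not claimed to be InsectNet's training recipe.
from collections import Counter

import torch
from torch.utils.data import DataLoader, WeightedRandomSampler

def balanced_loader(dataset, labels, batch_size=256):
    """labels: list[int] with one class index per sample in `dataset`."""
    counts = Counter(labels)                        # images per species
    weights = torch.tensor([1.0 / counts[y] for y in labels])
    sampler = WeightedRandomSampler(weights, num_samples=len(labels),
                                    replacement=True)
    return DataLoader(dataset, batch_size=batch_size, sampler=sampler)
```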
Our classifier identifies insect species and highlights their roles (see SI: Section 7), offering valuable insights for decision-making. Insects play various roles in the environment; for instance, pollinator, parasitoid, and predator species are beneficial insects. We used a large language model to generate structured text describing all 2,526 species; this text was then post-processed and verified by human experts, allowing us to assign roles to each species. In our dataset, 20.3% of insect species act as predators, contributing to natural pest control. Pests make up 33.5% of the dataset. Another critical role is played by pollinators, which are crucial for the reproduction of many plant species; these make up 35.6% of the insect species in our dataset. Additionally, 10.7% of the insect species play various other roles, adding diversity to the ecosystem. Understanding these roles enhances our knowledge of insect interactions, supporting more informed ecological studies and integrated pest management strategies.
We utilize 10 images per species from the iNaturalist 2021 dataset for testing and validation. Additionally, to ensure the statistical significance of reported per-insect species accuracy, we collected and evaluated the performance of InsectNet on 50 public domain web images for each of the insect species depicted in Figs. 4(ii) and 5. We also evaluate InsectNet on four expert-validated datasets containing a subset of the 2,526 insect species. These datasets include a museum collection of beetles (9), the insect subset from the BioScan-1M dataset (39), an Iowa State University curated dataset of agricultural pests, and a dataset of butterflies (40).
Label-free SSL
Training accurate machine learning models requires the availability of annotated datasets—for instance, datasets where each insect image is tagged with a species name or label. Providing accurate labels for a large dataset is currently the most significant bottleneck in training accurate ML models, especially when label creation (or checking) requires expert knowledge. We utilize SSL approaches (29, 31), which enable a model to initially learn useful features of a dataset without the need for any labels. Subsequent fine-tuning is then performed using a smaller labeled dataset and has been shown to produce high-performing models (41).
SSL pretraining offers additional advantages. SSL pretraining ensures robustness to noisy labels (42–44). During SSL pretraining, the model learns to build a robust low-dimensional feature space of the images based on the structural similarities within the data itself, independent of potentially inaccurate or noisy labels. This characteristic makes SSL particularly suitable for dealing with datasets with prevalent label noise, as with many large-scale biological datasets, including iNaturalist, where the "research grade" identification is not always expert-vetted. Practically, this implies the existence of a fraction of images with incorrect species labels. This is not unique to iNaturalist as most large-scale open-source datasets (45, 46) have a nontrivial amount of noisy or incorrect labels (47–49). Yet, AI models—especially those with SSL pretraining—perform surprisingly well on noisy labels (29, 50, 51). We observe similar behavior for InsectNet (as described in the Results section; also see SI) (Note a).
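As a concrete illustration of label-free pretraining, the sketch below implements a generic SimCLR-style contrastive (NT-Xent) loss, which pulls embeddings of two augmented views of the same unlabeled insect image together while pushing other images apart. The paper's exact SSL recipe is described in SI: Section 2, so treat this as representative rather than the authors' implementation.

```python
# A generic SimCLR-style NT-Xent loss, illustrating how an encoder can learn
# from unlabeled images: two augmented views of the same image are positives,
# all other images in the batch are negatives. Not the paper's exact recipe.
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, temperature=0.1):
    """z1, z2: (N, D) embeddings of two augmented views of the same N images."""
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # (2N, D), unit norm
    sim = z @ z.t() / temperature                        # cosine-similarity logits
    n = z1.shape[0]
    sim.fill_diagonal_(float("-inf"))                    # exclude self-pairs
    # Positive for row i is row i+N (and vice versa): the other view.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)
```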
We perform an extensive series of training runs on several model architectures (RegNet, ResNet; see SI: Section 2.B) and report the impact of SSL pretraining across two performance axes. (i) The amount of unlabeled data used for pretraining, which has a substantial effect on final classification performance. Table S1 describes the impact of systematically increasing the amount of unlabeled data by 200×. These results quantitatively illustrate the value of citizen science collected data, with SSL approaches leveraging these images even when such datasets are available without labels or when the labels are incomplete or noisy. (ii) The number of pretraining campaigns. "Daisy-chaining" a model's pretraining on a sequence of different datasets or pretext tasks helps improve the final model performance. In Table S2, we empirically show that classification models learn better latent representations when their model weights are sequentially trained across multiple datasets. Our best model consisted of one campaign of pretraining on a very large noninsect dataset, followed by a second campaign of SSL pretraining on the insect dataset, followed by final fine-tuning on labeled data (see Fig. S1). This is a corollary to the first point (the importance of the amount of unlabeled data), extending the approach to utilizing out-of-domain large datasets for pretraining (or rather, pre-pretraining). We evaluated model performance along these axes to identify our best-trained insect classifier. This classifier exhibited high overall classification accuracy, with a mean per-species accuracy above 96% (see Results). The classification accuracy histogram for all 2,526 species categories displays only a small tail, indicating that a small fraction of the species categories have low prediction accuracy (see Fig. S2 for the histogram plot). We also note no correlation between insect size and prediction accuracy.
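The first "daisy-chained" campaign corresponds to the publicly released SWAG weights (pretrained on roughly 3.6 billion Instagram images; see Fig. 1). A minimal fine-tuning sketch using torchvision's SWAG-pretrained RegNet is shown below; in the actual pipeline, the second SSL campaign on insect images would update these weights before the final labeled fine-tuning, so this is a simplified stand-in.

```python
# A sketch of the final fine-tuning stage, assuming torchvision's publicly
# released SWAG weights (RegNet pretrained on ~3.6B Instagram images) as the
# first pretraining campaign. The paper's pipeline adds a second SSL campaign
# on insect images before this step.
import torch.nn as nn
from torchvision.models import regnet_y_16gf, RegNet_Y_16GF_Weights

NUM_SPECIES = 2526

model = regnet_y_16gf(weights=RegNet_Y_16GF_Weights.IMAGENET1K_SWAG_E2E_V1)
model.fc = nn.Linear(model.fc.in_features, NUM_SPECIES)  # new species head
# ...then train end-to-end on the labeled insect images as usual.
```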
Improving the prediction accuracy of species with a low number of images in the database (i.e. low sample size)
We use an approach that transfers knowledge from high-accuracy categories with numerous examples to enhance the learning of low-accuracy categories with fewer examples. AlphaNet is a wrapper model that operates post hoc on top of the InsectNet classifier without requiring any retraining (52). We demonstrate that AlphaNet significantly improves the prediction accuracy of low-accuracy species while retaining the overall prediction accuracy of the classifier. AlphaNet shifts the tail of the per-species accuracy histogram toward higher accuracy levels: the average accuracy of the low-accuracy species improved substantially, with only a marginal drop in the overall classification accuracy. As a result, almost all species in our insect species classifier exhibit high per-species prediction accuracy. This strategy addresses the challenges posed by imbalance in the dataset.
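A minimal sketch of the AlphaNet idea (52): the classifier weight vector of a low-accuracy species is recomposed as its own vector plus a learned combination of the weight vectors of its nearest high-accuracy species, with no retraining of the base model. Names and shapes below are illustrative, not the authors' code.

```python
# Illustrative core of the AlphaNet post hoc correction: recompose a rare
# species' classifier vector using its nearest frequent-species vectors.
# The alpha coefficients are learned by a small auxiliary network (not shown).
import torch

def alphanet_update(w_rare, w_neighbors, alphas):
    """
    w_rare:      (D,)    classifier vector of one low-accuracy species
    w_neighbors: (K, D)  vectors of its K nearest high-accuracy species
    alphas:      (K,)    learned mixing coefficients
    """
    return w_rare + alphas @ w_neighbors   # post hoc weight correction
```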
Improving trustworthiness of the model
To ensure the robust performance of InsectNet in the wild, we wrap two additional features around our classifier. First, we ensure that InsectNet avoids making predictions when confronted with low-resolution, blurred, or confusing images. This provides guardrails against potentially catastrophic consequences, for instance, the misclassification of an unseen insect species (say, belonging to an invasive species) as a benign insect species, or the misclassification of images belonging to a noninsect category (say, a tiny red berry) as insects (say, lady beetles). We do this by wrapping an out-of-distribution (OOD) detection algorithm around the classifier. The algorithm uses an energy-based metric (see SI: Section 4.A) to flag images that deviate significantly from the data distribution used to train the classifier (see Figs. S3 and S4, which depict how the energy value can be used to distinguish in-distribution and OOD images). Our empirical analysis indicates that the dataset exhibits a diverse set of imaging conditions, making OOD detection a useful strategy, and yet another indicator of the power of citizen science data (see Fig. S5, which illustrates the results of InsectNet on OOD samples). Second, we use a conformal prediction approach to produce prediction sets, rather than a single species category, with rigorously guaranteed confidence (81.0% in the deployed app; see Fig. 2). The prediction sets become larger when the classifier is increasingly uncertain of its prediction. Both features provide a graceful path for human intervention and subsequent decision-making, addressing the need for high trust in identification. They also provide quantitative feedback to direct citizen science data collection efforts toward insect species where InsectNet underperforms.
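Both wrappers can be summarized in a few lines. The sketch below assumes the standard energy score E(x) = -T · logsumexp(logits / T) from the OOD literature, with the energy threshold and the conformal quantile qhat calibrated on held-out in-distribution data; all constants are placeholders, not the values used in the deployed app.

```python
# Sketch of the two trust wrappers: (i) energy-based OOD flagging, and
# (ii) a split-conformal prediction set that grows with model uncertainty.
# ENERGY_THRESH and qhat are placeholders to be calibrated on held-out data.
import torch

T = 1.0               # energy temperature
ENERGY_THRESH = -8.0  # placeholder; set from in-distribution calibration

def is_ood(logits):
    """logits: (C,) for one image. High energy => likely out-of-distribution."""
    energy = -T * torch.logsumexp(logits / T, dim=-1)
    return bool(energy > ENERGY_THRESH)

def conformal_set(logits, qhat):
    """Smallest set of classes whose sorted softmax mass covers qhat."""
    probs = torch.softmax(logits, dim=-1)
    order = torch.argsort(probs, descending=True)
    csum = torch.cumsum(probs[order], dim=0)
    k = int(torch.searchsorted(csum, torch.tensor([qhat]))[0]) + 1
    return order[:min(k, len(order))].tolist()
```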
Democratized access and streamlined machine learning operations (MLOps)
The classifier is publicly available and hosted on a server: https://insectapp.las.iastate.edu. We also provide access to the trained model weights and to the MLOps workflows to enable the agriculture community to adopt and leverage these approaches. In particular, to streamline the data wrangling process, we created a workflow tool, iNaturalist Scalable Download (iNatSD), that allows users to intuitively download customizable datasets of high-quality images of organisms in an ML-analysis-ready format.
Global-to-local fine-tuning for region-specific practical tools
We employ a global-to-local approach to ensure accurate and context-specific insect recognition. This method begins with our robust global model trained on unlabeled large-scale datasets, which is then fine-tuned with a small, labeled, expert-validated, region-specific dataset. Doing so ensures that the model is optimized for precise identification in specific agricultural regions, specific crops, or a targeted set of insects. We show that this approach requires only a small number of expert-verified images to produce highly accurate predictions (see Table 1). This approach enhances the model's adaptability and accuracy, making it highly effective for practical pest management across diverse environments (Fig. 2).
Fig. 2. InsectNet in action. After an image is uploaded, InsectNet first performs out-of-distribution (OOD) detection. (Left) If the input is flagged as OOD, InsectNet provides a warning along with its prediction. (Middle) If not OOD, InsectNet produces a prediction with no warning. (Right) Additionally, InsectNet provides conformal sets with a predefined (here, 81.0%) confidence. In this instance, the images belong to the insect species Trichoplusia ni (cabbage looper). The image on the right is sufficiently confusing for InsectNet to predict a conformal set of two closely related species.
Results
We split the results into two parts. We quantitatively evaluate our approach to training and fine-tuning InsectNet models across multiple datasets in the first part. We then evaluate the InsectNet model on several challenging agricultural scenarios in the second part.
Global-to-local fine-tuning on several expert-vetted datasets
We demonstrate the utility of the SSL-pretrained InsectNet model by fine-tuning it on three expert-vetted datasets. These datasets were carefully curated and annotated by specialists, ensuring the accuracy of the labels. They include a museum collection of beetles (9), the insect subset from the BioScan-1M dataset (39), and a dataset of butterflies (40). Our results in Table 1 demonstrate that the InsectNet model consistently outperformed the previous best-performing model, often by a large margin. This suggests that our approach of SSL pretraining on a vast dataset, followed by fine-tuning on smaller, expert-verified datasets, produces demonstrably high-performing models.
Region-specific insect identification through global-to-local fine-tuning for practical applications
We next use the global-to-local approach to create a regional insect identification model for the midwestern United States. We consider the most prevalent insect species affecting midwestern agricultural practices, as defined in integrated pest management manuals for corn and soybean crops. These 54 species, listed in the Iowa State University (ISU) extension resource, were used as the basis for our local fine-tuning. The first page of this manual (see Fig. S6) shows some of the key insects impacting corn and soybean crops in the region. For these 54 insect species, we utilized an expert-verified dataset consisting of 540K images. The global-to-local model trained on this expert-verified data exhibited an impressive 96% mean per-class accuracy (MPCA), as shown in Table 2. In comparison, a model trained from scratch on the same expert-verified dataset achieved only 90% MPCA. This demonstrates the efficiency of leveraging global models, pretrained on large, diverse, unlabeled datasets, to address specialized regional tasks. This approach offers a scalable solution for pest management, allowing models to be quickly adapted to local environments with minimal labeled data. This makes it especially valuable in scenarios where labeled datasets are scarce or expensive to generate. Furthermore, this fine-tuning framework can be extended to other regions and crops, making it a versatile tool for agricultural stakeholders worldwide.
Table 2. Comparison of accuracy between global-to-local fine-tuning and training a local model from random weights.

| Model | Total accuracy (%) | MPCA (%) |
|---|---|---|
| Global-to-local fine-tuning | 98.7 | 96.1 |
| Fine-tuning from scratch | 94.3 | 90.2 |
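Table 2 reports mean per-class accuracy (MPCA), which averages each species' own accuracy so that rare species count as much as common ones. A minimal NumPy sketch of the metric follows; the function name and example values are illustrative.

```python
# Mean per-class accuracy (MPCA): average of per-species accuracies, so a
# species with 38 test images weighs as much as one with thousands.
import numpy as np

def mpca(y_true, y_pred, num_classes):
    accs = []
    for c in range(num_classes):
        mask = (y_true == c)
        if mask.any():                              # skip absent classes
            accs.append((y_pred[mask] == c).mean())
    return float(np.mean(accs))

# e.g. mpca(np.array([0, 0, 1]), np.array([0, 1, 1]), 2) -> (0.5 + 1.0) / 2
```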
Good performance for low image-per-class species
We next performed k-shot learning experiments, varying the number of images, k, per species available to fine-tune InsectNet. We varied k from 50 to 20 down to 10, representing the minimal effort required by an interested end-user to fine-tune InsectNet on new species. This is a good indicator of how data availability per class impacts prediction accuracy. Importantly, it reveals the minimum number of images InsectNet needs to (empirically) guarantee good performance. As seen in Table 1, fine-tuning InsectNet achieves high accuracies even with significantly limited data, demonstrating its robustness in handling low image-per-class species.
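A sketch of the k-shot protocol just described, under the assumption that the fine-tuning data is a list of (image path, class index) pairs; the helper name and dataset plumbing are illustrative, not the paper's code.

```python
# Build a k-shot fine-tuning subset: keep at most k images per species,
# sampled reproducibly, before running the usual fine-tuning loop.
import random
from collections import defaultdict

def k_shot_subset(samples, k, seed=0):
    """samples: list of (image_path, class_idx); returns <= k per class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for path, cls in samples:
        by_class[cls].append((path, cls))
    subset = []
    for cls, items in by_class.items():
        rng.shuffle(items)
        subset.extend(items[:k])
    return subset
```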
Next, we evaluate our flagship InsectNet model on various challenging scenarios that arise during insect image classification in the wild. This model can classify 2,526 insect species. The classifier has two specialized wrapper modules to safeguard reliability: the out-of-distribution module and the conformal prediction module.
Challenge #1: Large number of insect species and variability in insect sizes
The fine-tuned InsectNet model performs exceptionally well, achieving a mean per-class accuracy of over 96%. More than 2,287 species had an accuracy greater than 90%, with only 239 species exhibiting an accuracy under 75%. Insects show significant variability in size across species, and we evaluated the performance of InsectNet across this range. We binned insect species according to size, ranging from around 2–4 mm (Glycaspis brimblecombei, red gum lerp psyllid) to 150 mm (Caligo telamonius, Pale Owl-butterfly) (see SI for details) (Note b). InsectNet exhibited consistently high mean accuracy in each bin across the entire size spectrum, as seen in Fig. 3. In terms of performance variability as a function of the number of images per class, this InsectNet model exhibits trends similar to our models trained on smaller datasets. In the SI (see Fig. S4), we report results showing that species prediction accuracy is fairly independent of the number of images available per species.
Fig. 3. Violin plot of classifier accuracy ordered by insect size. The x-axis shows size ranges (in mm), and the y-axis represents total classifier accuracy. Each violin displays the distribution of accuracies for species within the corresponding size range, with individual data points overlaid to show specific performances.
Challenge #2: Identifying invasive insect species
Invasive insect species are of significant concern for agriculture, as these nonnative species can cause significant harm to horticultural and agricultural crop species, forest tree species, and urban green landscapes. The USDA National Invasive Species Information Center (53) lists invasive insect species that seriously threaten various food grain crops and vegetable, fruit, tree, and shrub species. Our model accurately identifies a large set of these (Fig. 4(ii)), including Lycorma delicatula (spotted lanternfly, 99%); Helicoverpa armigera (Old World bollworm, 92%); Popillia japonica (Japanese beetle, 100%); Megacopta cribraria (kudzu bug, 98%); Halyomorpha halys (brown marmorated stink bug, 100%); Homalodisca vitripennis (glassy-winged sharpshooter, 100%); Agrilus planipennis (emerald ash borer, 98%); Adelges tsugae (hemlock woolly adelgid, 96%); Lymantria dispar (spongy moth, 100%); Lymantria monacha (nun moth, 100%); and Cydalima perspectalis (box tree moth, 100%). Accurate identification of these species at ports of entry and geographic borders can prevent their escape and spread into new geographic regions. To ensure robust detection and prevent the spread of invasive species, we emphasize that these image-based identification approaches (particularly in biosecurity applications) should be supported by expert validation and additional lines of evidence before routine deployment.
Fig. 4. InsectNet can accurately (i) (left) classify under various challenging conditions: a) camouflaged insects (brown insect on brown background), b) camouflaged insects (green insect on green background), c) sexual dimorphism, d) different poses and orientations; and (ii) (right) identify several invasive insect species.
Challenge #3: Intraspecies dissimilarity
Intraspecies dissimilarity in insect classification refers to the degree of dissimilarity among members of an insect species, such as color and pattern variations. We showcase an example in this category belonging to the family Coccinellidae: Harmonia axyridis (Asian lady beetle), a nonnative species (Fig. 5a). The variations in color and pattern exhibited by its members make classification by nonexperts nearly impossible; however, our classifier successfully recognizes six variations of the Asian lady beetle with high accuracy.
Fig. 5. InsectNet can identify a) intraspecies dissimilarity of the nonnative predator species Harmonia axyridis (Asian lady beetle), b) the difference between the nonnative predator Asian lady beetle and the native beetle species Adalia bipunctata (two-spotted lady beetle), c) the difference between the nonnative predator Asian lady beetle and the harmful insect species Epilachna mexicana (Mexican bean beetle), which exhibits similar features (not all pattern variations are shown in the figure), d) examples of interspecies similarity in the case of the look-alike beetles Popillia japonica (Japanese beetle) and Phyllopertha horticola (garden chafer) and different kinds of stink bug species.
The Asian lady beetle was intentionally introduced into US regions lacking natural predators to regulate populations of soft-bodied insects, like aphids, mealybugs, and scale insects (54). While both native and nonnative lady beetle species are important as predators, the nonnative Asian lady beetle has become a nuisance, as it out-competes native species such as Adalia bipunctata (two-spotted lady beetle), resulting in biodiversity loss (55). Other detrimental consequences of the Asian lady beetle in North America include harming fruit crops and acting as a home intruder (56).
Challenge #4: Interspecies similarity
Different insect species can look similar in color and pattern. For instance, more than 500 species of lady beetle are reported in the United States, making identification challenging. Our classifier performs well on this challenge (see Fig. 5b, c): InsectNet can differentiate between the nonnative predator Asian lady beetle and the native beetle species Adalia bipunctata (two-spotted lady beetle), and can distinguish the predator (H. axyridis) from the harmful lady beetle species Epilachna mexicana (Mexican bean beetle). Accurately differentiating between visually similar species is essential for timely mitigation, especially when harmful insect species look like beneficial predatory species. InsectNet can also accurately distinguish between two look-alike beetles (see Fig. 5d): Popillia japonica (Japanese beetle) and Phyllopertha horticola (garden chafer) have very similar overall appearances, and experts differentiate them using subtle differences in physical features. Additional examples illustrated in Fig. 5d include distinguishing between Euschistus servus (brown stink bug; native to the United States), Halyomorpha halys (brown marmorated stink bug; invasive in the United States), Euschistus tristigmus (dusky stink bug; native to the United States), and Erthesina fullo (yellow-spotted stink bug; invasive in the United States). Polyphagous invasive insect species like the brown marmorated stink bug harm over 170 plant species globally, including vegetable, fruit, food grain, and flower crop species (57). However, the look-alike predator species of stink bug, Podisus maculiventris (spined soldier bug), preys on insects like caterpillars, aphids, and beetles, thereby controlling harmful insect populations in gardens and agriculture. Differentiating between a harmful insect and a beneficial insect is critical for appropriate mitigation without unnecessarily harming local biodiversity.
Challenge #5: Insect camouflage and diverse background
Numerous insect species have patterns or colors that blend into the background, like a green insect on a green leaf or a brown insect on a piece of wood. Insects have evolved various adaptation mechanisms that help them blend in with their surroundings, a camouflaging effect that helps them avoid predators and increases their chance of survival (58). However, this camouflage makes it challenging to identify insects in their habitat (59). Our classifier performs well even for insect images with camouflaging backgrounds and small-foreground/large-background compositions, producing reasonable predictions in such challenging cases (see Fig. 4(i)a, b), with prediction accuracy ranging from 90 to 100%. Examples illustrated include Thesprotia graminis (American grass mantis), a brown insect on a brown background; Megarhyssa macrurus (long-tailed giant ichneumonid wasp), which camouflages with tree bark; and, for green insects on green backgrounds, Chrysopa oculata (green lacewing) and Cicadella viridis (green leafhopper), a very tiny green insect against a green leaf.
Challenge #6: Sexual dimorphism
In numerous insect species, males and females have distinct features. For example, Oryctes nasicornis (European rhinoceros beetle) is a beetle species native to Europe, western Asia, and northern Africa that reaches up to 4 cm in length (60). The differences between male and female European rhinoceros beetles are not very pronounced, but there are some noticeable physical differences: the male has a characteristic horn on its head, similar to that of a rhinoceros (Fig. 4(i)c). While it is not considered a major harmful insect and primarily feeds on decaying matter, it still causes losses, as the adults feed on the sap of a variety of trees while the larvae feed on the roots and can cause significant damage to young trees. Our classifier correctly identifies images belonging to this species, regardless of sex.
Challenge #7: Variability in insect orientation and stance
The example of Papilio troilus (spicebush swallowtail butterfly; see Fig. 4(i)d) demonstrates the complexity of classification across instar larva and adult stages, where images are often taken from varying stances and poses (front, top, side). Our classifier correctly identifies the insect species corresponding to these images. It also correctly identifies a butterfly with broken wings and an image of two butterflies with wings closed.
Challenge #8: Multiple insects in the image frame
In the wild, particularly for smaller-sized insects, multiple insects are often present in the same image. Our classifier can make successful predictions across a variety of species, including Lycorma delicatula (spotted lanternfly) and Solenopsis invicta (red imported fire ant), with an accuracy of 100 and 90%, respectively (see Fig. S5). A fascinating example of this ability is in the right image of Fig. S5, which shows the Cotesia congregata (parasitoid braconid wasp) cocoons on late-stage Manduca sexta (tobacco hornworm) larva. The female braconid wasp lays her eggs inside the body of hornworm larva using a long, needle-like ovipositor. The eggs hatch into tiny larvae, which feed on hornworm body tissue, eventually killing it. Once the larvae have completed their development, they emerge from the host body and spin cocoons on the surface of the hornworm’s skin. InsectNet successfully identifies multiple braconid wasp cocoons on the hornworm body surface.
Discussion
InsectNet robustly identifies beneficial and harmful insect species, opening up unique opportunities, including monitoring and surveillance of insects at international crossings and border inspections and tracking insect species' domestic spread and movement. To ensure robust usage, we incorporate additional mechanisms, such as out-of-distribution detection, conformal prediction, and uncertainty quantification, to alert users when confidence is low or when the model encounters unfamiliar data. However, it is important to note that InsectNet represents one line of evidence in biosecurity applications, and its outputs should be interpreted in conjunction with expert identification and other validation techniques. This allows practitioners to exercise caution and defer to human experts when necessary.
InsectNet can be a tool for maintaining and enhancing biodiversity for pollinators and other beneficial insects. While other insect identification apps are available, several have limitations, such as covering a small range of species and lacking extensive documentation of their accuracy, and some require payment for access to certain features (61–63). A few applications operate on a large-scale dataset but still leave key training details undisclosed (64). InsectNet's strength lies in its open approach to continued development and enhancement, leveraging SSL pretraining to improve adaptability and performance. This model opens up possibilities for automated identification in images and videos and can be fine-tuned for various use cases, whether localized insect identification or other applications.
We also demonstrate the practical usage of our model by transitioning from a global pretrained model to a locally fine-tuned model for region-specific use cases. By fine-tuning the global model on an expert-verified dataset of 54 insect species prevalent in the US Midwest region, we achieved a notable accuracy of approximately 99% and a mean per-class accuracy (MPCA) of 96%. In comparison, training a local model from scratch without pretraining resulted in a lower overall accuracy of 94% and an MPCA of 90%. This showcases the effectiveness of the global-to-local fine-tuning approach, particularly for improving model performance in specialized regional contexts worldwide.

Researchers in any country can use InsectNet's global-to-local model architecture to create crop-specific insect models efficiently. They can start with manually collected images of targeted insect species or use existing datasets after verifying and refining species labels. If necessary, additional images for underrepresented or misclassified species can be collected to improve dataset quality. These approaches minimize redundancy by leveraging existing or minimal resources while improving insect species classes with low mean per-class accuracy. InsectNet is therefore an effective tool for building specialized local models with reduced effort. It also opens up several follow-up possibilities, including automated identification of insects in images and videos and better integration of these technologies into integrated pest management (IPM) and climate-smart pest management (CSPM).
While the InsectNet classifier performs well in many cases, notable challenges remain in distinguishing certain species, particularly those with subtle morphological differences and taxonomic ambiguities. The classifier had difficulty distinguishing visually similar species, such as Enallagma ebrium, Enallagma hageni, and Lestes disjunctus (Fig. 6), which share common features. These challenges emphasize the model's struggle with closely related species that exhibit overlapping traits. Similarly, the taxonomic ambiguity between the Acmon Blue (Icaricia acmon) and Lupine Blue (Icaricia lupini) poses a significant challenge, as they are nearly indistinguishable from photographs alone, often requiring detailed morphological examination by experts to differentiate. Pieris virginiensis and Pieris oleracea, two butterfly species from the family Pieridae, also presented classification challenges, likely due to their closely related morphology. Additionally, the classifier struggled to identify Bombus huntii, Bombus vancouverensis, and Bombus vagans, three bumblebee species from the family Apidae that share similar physical characteristics. These instances highlight the inherent difficulty of achieving high classification accuracy for species with overlapping morphological traits. Collecting more images of these species in the future, especially images capturing the finer details that distinguish them, could improve the model's ability to differentiate them accurately. Inconsistent name assignments in public databases further complicate classification, making high-confidence predictions difficult. These examples underscore how dataset limitations can impact model performance and highlight the importance of localized fine-tuning with expert-validated data to improve accuracy in specific contexts.

In addition to its practical applications, it is important to consider the ecological and ethical implications of relying on automated systems for species identification, particularly in biosecurity contexts. Misidentifications, whether false positives or false negatives, could have significant ecological and economic consequences. This is especially relevant when identifying invasive species, where a false negative could allow the spread of harmful insects, or rare beneficial insects, where a false positive might lead to unnecessary actions. To mitigate these risks, we emphasize the importance of incorporating uncertainty quantification (UQ) and out-of-distribution (OOD) detection within our model to flag potentially unreliable classifications, allowing for further expert validation when necessary (35, 36). As part of future work, we aim to implement an ensemble approach to enhance confidence scoring, along with a qualitative confidence scale that leverages the existing OOD and UQ modules to improve transparency and user trust.
Fig. 6. Performance of InsectNet on species with low classification accuracy. The figure highlights three groups of insect species where the classifier struggled to achieve high accuracy. (Top) (left) Enallagma ebrium, (center) Enallagma hageni, and (right) Lestes disjunctus, all damselflies (suborder Zygoptera), are difficult for the classifier to distinguish due to their highly similar features. (Middle) Pieris virginiensis (left) and Pieris oleracea (right), both of the family Pieridae, posed challenges for the classifier, likely due to their close morphological resemblance. (Bottom) (left) Bombus huntii, (center) Bombus vancouverensis, and (right) Bombus vagans, all of the family Apidae, were also difficult to identify, as these bumblebee species share similar features that complicate accurate identification. These results underscore the challenge of identifying closely related species that share subtle distinguishing features.
The current study underscores the critical importance of large-scale data collection through citizen science and presents a valuable opportunity to enrich this data pool. We have identified specific insect species that are underrepresented in the data, leading to suboptimal model performance due to a scarcity of images and a lack of diversity in the available datasets. This situation highlights the benefit of researchers adding to existing citizen science databases, thereby eliminating the need for duplicate efforts in gathering data on species that citizen science projects have already covered. Furthermore, our research points out the utility of citizen science data in pinpointing deficiencies and facilitating focused data collection efforts for those insect species that are either underrepresented or lacking sufficient data. Within the broader context of insect species research, this approach fosters collaborative and systematic data collection, encompassing both commonly and less frequently observed insect species. Such concerted actions are essential for advancing our understanding of various species and contributing to sustainable farming practices and the preservation of ecosystems. Finally, we anticipate that this work (and the model weights) opens up efforts to extend beyond classification to insect counting, as shown in existing applications like those of Geissmann et al. (23), with applications in crop production (65) and plant breeding (66).
Notes
a. It is thought that the stochastic nature of the training endows such models, especially when trained on massive datasets, with resilience against noisy labels.
b. We note that image-based identification will only work when insects are reasonably resolved using standard (phone) cameras and visible to the naked eye, thus putting a strong lower bound on insect size.
Supplementary Material
Supplementary material is available at PNAS Nexus online.
Funding
This work was supported by the AI Institute for Resilient Agriculture (NIFA #2021-67021-35329), COALESCE: COntext Aware LEarning for Sustainable CybEr-Agricultural Systems (NSF CPS Frontier #1954556), and NSF S&CC #1952045 and USDA CRIS Project IOW04714. We acknowledge support from Iowa State University's Plant Science Institute.
Preprint
This manuscript was previously posted as a preprint.
Author Contributions
Conceptualization: A.K.S., S.S., A.Si., B.G.; Data curation: S.C., Z.K.D., N.M.; Formal Analysis: S.C., M.S., J.K.; Funding acquisition: A.K.S., S.S., A.Si., B.G.; Investigation: S.C.; Methodology: S.C., A.K.S., S.S., A.Si., B.G.; Project administration: A.Si., B.G.; Resources: D.S.M., M.O., N.M., A.K.S., S.S., A.Si., B.G.; Software: S.C., M.S., J.K.; Supervision: D.S.M., M.N., N.M., A.S., A.K.S., S.S., A.Si., B.G.; Validation: S.C., M.S.; Visualization: S.C., M.S., T.Z.J.; Writing—original draft: S.C., M.S., J.K., T.Z.J., A.Si., B.G.; Writing—review & editing: S.C., D.S.M., M.N., N.M., A.S., A.K.S., S.S., A.Si., B.G.
Data Availability
We acknowledge the data source as iNaturalist and recognize its copyright ownership. We want to highlight that our contribution involves the development of a user-friendly pipeline designed to facilitate more accessible data extraction and enhance its usability for research purposes. The code to replicate InsectNet is available here.
References
Author notes
Competing Interest: The authors declare no competing interests.