TY - JOUR
T1 - Zero-shot insect detection via weak language supervision
AU - Feuer, Benjamin
AU - Joshi, Ameya
AU - Cho, Minsu
AU - Chiranjeevi, Shivani
AU - Deng, Zi Kang
AU - Balu, Aditya
AU - Singh, Asheesh K.
AU - Sarkar, Soumik
AU - Merchant, Nirav
AU - Singh, Arti
AU - Ganapathysubramanian, Baskar
AU - Hegde, Chinmay
N1 - Publisher Copyright:
© 2024 The Authors. The Plant Phenome Journal published by Wiley Periodicals LLC on behalf of American Society of Agronomy and Crop Science Society of America.
PY - 2024/12
Y1 - 2024/12
AB - Cheap and ubiquitous sensing has made collecting large agricultural datasets relatively straightforward. These large datasets (for instance, citizen science data curation platforms like iNaturalist) can pave the way for developing powerful artificial intelligence (AI) models for detection and counting. However, traditional supervised learning methods require labeled data, and manual annotation of these raw datasets with useful labels (such as bounding boxes or segmentation masks) can be extremely laborious, expensive, and error-prone. In this paper, we demonstrate the power of zero-shot computer vision methods—a new family of approaches that require (almost) no manual supervision—for plant phenomics applications. Focusing on insect detection as the primary use case, we show that our models enable highly accurate detection of insects in a variety of challenging imaging environments. Our technical contributions are two-fold: (a) We curate the Insecta rank class of iNaturalist to form a new benchmark dataset of approximately 6 million images consisting of 2526 agriculturally and ecologically important species, including pests and beneficial insects. (b) Using a vision-language object detection method coupled with weak language supervision, we are able to automatically annotate images in this dataset with bounding box information localizing the insect within each image. Our method succeeds in detecting diverse insect species present in a wide variety of backgrounds, producing high-quality bounding boxes in a zero-shot manner with no additional training cost. This open dataset can serve as a use-inspired benchmark for the AI community. We demonstrate that our method can also be used for other applications in plant phenomics, such as fruit detection in images of strawberry and apple trees. Overall, our framework highlights the promise of zero-shot approaches to make high-throughput plant phenotyping more affordable.
UR - http://www.scopus.com/inward/record.url?scp=85193704853&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85193704853&partnerID=8YFLogxK
DO - 10.1002/ppj2.20107
M3 - Article
AN - SCOPUS:85193704853
SN - 2578-2703
VL - 7
JO - Plant Phenome Journal
JF - Plant Phenome Journal
IS - 1
M1 - e20107
ER -
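
The abstract above describes annotating images with bounding boxes by prompting a vision-language object detector with text, with no task-specific training. Below is a minimal illustrative sketch of that general idea in Python, assuming the publicly available OWL-ViT checkpoint from the HuggingFace transformers library and a hypothetical local image path; the paper's actual detector, prompts, and thresholds may differ from what is shown here.

# Zero-shot, text-prompted detection sketch: query a vision-language detector
# with a single text prompt and keep the highest-scoring box.
# Assumption: the public OWL-ViT checkpoint stands in for whatever detector the
# paper uses; file name, prompt, and threshold are illustrative only.
import torch
from PIL import Image
from transformers import OwlViTProcessor, OwlViTForObjectDetection

processor = OwlViTProcessor.from_pretrained("google/owlvit-base-patch32")
model = OwlViTForObjectDetection.from_pretrained("google/owlvit-base-patch32")

image = Image.open("insect_photo.jpg")        # hypothetical local image
prompts = [["a photo of an insect"]]          # weak language supervision: one text query

inputs = processor(text=prompts, images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Convert raw model outputs to scored boxes in original image coordinates.
target_sizes = torch.tensor([image.size[::-1]])   # (height, width)
detections = processor.post_process_object_detection(
    outputs=outputs, target_sizes=target_sizes, threshold=0.1
)[0]

if len(detections["scores"]) > 0:
    best = detections["scores"].argmax()
    print("box (xmin, ymin, xmax, ymax):", detections["boxes"][best].tolist())
    print("score:", detections["scores"][best].item())

Running a loop of this kind over a large image collection is one way to produce pseudo-label bounding boxes at scale; score thresholds and prompt wording would need tuning for a given dataset.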