TY - JOUR
T1 - Mid-fusion of road scene polarization images on pretrained RGB neural networks
AU - Omer, Khalid
AU - Kupinski, Meredith
N1 - Funding Information:
Acknowledgment. The authors thank Madellyn Brown for data collection and our anonymous reviewers for their expertise. We gratefully thank the Marshall Foundation for their financial support.
Publisher Copyright:
© 2021 Optical Society of America
PY - 2021
Y1 - 2021
AB - This work presents a mid-fusion pipeline that can increase the detection performance of a convolutional neural network (RetinaNet) by including polarimetric images even though the network is trained on a large-scale database containing RGB and monochromatic images (Microsoft COCO). Here, the average precision (AP) for each object class quantifies performance. The goal of this work is to evaluate the usefulness of polarimetry for object detection and recognition of road scenes and determine the conditions that will increase AP. Shadows, reflections, albedo, and other object features that reduce RGB image contrast also decrease the AP. This work demonstrates specific cases for which the AP increases using linear Stokes and polarimetric flux images. Images are fused during the neural network evaluation pipeline, which is referred to as mid-fusion. Here, the AP of polarimetric mid-fusion is greater than the RGB AP in 54 out of 80 detection instances. The recall values for cars and buses are similar for RGB and polarimetry, but values increase from 36% to 38% when using polarimetry for detecting people. Videos of linear Stokes images for four different scenes are collected at three different times of the day for two driving directions. Despite this limited dataset and the use of a pretrained network, this work demonstrates selective enhancement of object detection through mid-fusion of polarimetry to neural networks trained on RGB images.
UR - http://www.scopus.com/inward/record.url?scp=85103862534&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85103862534&partnerID=8YFLogxK
U2 - 10.1364/JOSAA.413604
DO - 10.1364/JOSAA.413604
M3 - Article
C2 - 33798180
AN - SCOPUS:85103862534
SN - 1084-7529
VL - 38
SP - 515
EP - 525
JO - Journal of the Optical Society of America A: Optics, Image Science, and Vision
JF - Journal of the Optical Society of America A: Optics, Image Science, and Vision
IS - 4
ER -