Mid-fusion of road scene polarization images on pretrained RGB neural networks

Khalid Omer, Meredith Kupinski

Research output: Contribution to journal › Article › peer-review

3 Scopus citations

Abstract

This work presents a mid-fusion pipeline that can increase the detection performance of a convolutional neural network (RetinaNet) by including polarimetric images even though the network is trained on a large-scale database containing RGB and monochromatic images (Microsoft COCO). Here, the average precision (AP) for each object class quantifies performance. The goal of this work is to evaluate the usefulness of polarimetry for object detection and recognition of road scenes and determine the conditions that will increase AP. Shadows, reflections, albedo, and other object features that reduce RGB image contrast also decrease the AP. This work demonstrates specific cases for which the AP increases using linear Stokes and polarimetric flux images. Images are fused during the neural network evaluation pipeline, which is referred to as mid-fusion. Here, the AP of polarimetric mid-fusion is greater than the RGB AP in 54 out of 80 detection instances. The recall values for cars and buses are similar for RGB and polarimetry, but values increase from 36% to 38% when using polarimetry for detecting people. Videos of linear Stokes images for four different scenes are collected at three different times of the day for two driving directions. Despite this limited dataset and the use of a pretrained network, this work demonstrates selective enhancement of object detection through mid-fusion of polarimetry to neural networks trained on RGB images.
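The per-class average precision used above as the performance metric can be illustrated with a minimal sketch. The abstract does not state which AP variant was computed, so this example assumes all-point interpolation of the precision-recall curve for a single class at a fixed matching criterion. The function name `average_precision` and its arguments are hypothetical and are not taken from the paper.

```python
def average_precision(scores, is_true_positive, num_ground_truth):
    """All-point interpolated AP for one object class (illustrative sketch).

    scores: confidence of each detection
    is_true_positive: True where a detection matched a ground-truth object
    num_ground_truth: number of ground-truth objects of this class
    """
    if num_ground_truth <= 0:
        raise ValueError("num_ground_truth must be positive")

    # Rank detections from most to least confident.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    precisions, recalls = [], []
    tp = fp = 0
    for i in order:
        if is_true_positive[i]:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_ground_truth)

    # Interpolation envelope: each precision becomes the maximum
    # precision found at that rank or any lower-confidence rank.
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])

    # Sum the area under the interpolated precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


# Toy example: four detections, four ground-truth objects, one miss.
example_ap = average_precision(
    scores=[0.9, 0.8, 0.7, 0.6],
    is_true_positive=[True, False, True, True],
    num_ground_truth=4,
)
```

Under this assumption, comparing RGB and polarimetric mid-fusion for one object class amounts to calling the function once per modality on the same ground truth and comparing the two values.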

Original language: English (US)
Pages (from-to): 515-525
Number of pages: 11
Journal: Journal of the Optical Society of America A: Optics and Image Science, and Vision
Volume: 38
Issue number: 4
State: Published - 2021

ASJC Scopus subject areas

  • Electronic, Optical and Magnetic Materials
  • Atomic and Molecular Physics, and Optics
  • Computer Vision and Pattern Recognition
