TY - GEN
T1 - A Light-Weight Monocular Depth Estimation with Edge-Guided Occlusion Fading Reduction
AU - Peng, Kuo Shiuan
AU - Ditzler, Gregory
AU - Rozenblit, Jerzy
N1 - Publisher Copyright: © 2020, Springer Nature Switzerland AG.
PY - 2020
Y1 - 2020
N2 - Self-supervised monocular depth estimation methods suffer from occlusion fading, which results from a lack of supervision by ground-truth pixels. A recent work introduced a post-processing method to reduce occlusion fading; however, its results exhibit a severe halo effect. This work proposes a novel edge-guided post-processing method that reduces occlusion fading for self-supervised monocular depth estimation. We also introduce Atrous Spatial Pyramid Pooling with Forward-Path (ASPPF) into the network to reduce computational costs and improve inference performance. The proposed ASPPF-based network is lighter, faster, and better than current depth estimation networks. Our light-weight network needs only 7.6 million parameters and can achieve up to 67 frames per second for 256×512 inputs using a single NVIDIA GTX 1080 GPU. The proposed network also outperforms the current state-of-the-art methods on the KITTI benchmark. The ASPPF-based network and edge-guided post-processing produce better results, both quantitatively and qualitatively, than competing methods.
AB - Self-supervised monocular depth estimation methods suffer from occlusion fading, which results from a lack of supervision by ground-truth pixels. A recent work introduced a post-processing method to reduce occlusion fading; however, its results exhibit a severe halo effect. This work proposes a novel edge-guided post-processing method that reduces occlusion fading for self-supervised monocular depth estimation. We also introduce Atrous Spatial Pyramid Pooling with Forward-Path (ASPPF) into the network to reduce computational costs and improve inference performance. The proposed ASPPF-based network is lighter, faster, and better than current depth estimation networks. Our light-weight network needs only 7.6 million parameters and can achieve up to 67 frames per second for 256×512 inputs using a single NVIDIA GTX 1080 GPU. The proposed network also outperforms the current state-of-the-art methods on the KITTI benchmark. The ASPPF-based network and edge-guided post-processing produce better results, both quantitatively and qualitatively, than competing methods.
KW - Atrous Spatial Pyramid Pooling
KW - Edge-Guided post-processing
KW - Monocular depth estimation
UR - http://www.scopus.com/inward/record.url?scp=85098168848&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85098168848&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-64559-5_6
DO - 10.1007/978-3-030-64559-5_6
M3 - Conference contribution
AN - SCOPUS:85098168848
SN - 9783030645588
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 69
EP - 81
BT - Advances in Visual Computing - 15th International Symposium, ISVC 2020, Proceedings
A2 - Bebis, George
A2 - Yin, Zhaozheng
A2 - Kim, Edward
A2 - Bender, Jan
A2 - Subr, Kartic
A2 - Kwon, Bum Chul
A2 - Zhao, Jian
A2 - Kalkofen, Denis
A2 - Baciu, George
PB - Springer Science and Business Media Deutschland GmbH
T2 - 15th International Symposium on Visual Computing, ISVC 2020
Y2 - 5 October 2020 through 7 October 2020
ER -