MONOCULAR 3D DETECTION OF MOVING OBJECTS FROM UAV BASED ON SPATIO-TEMPORAL FEATURE ANALYSIS

I. D. Goncharov; V. A. Surin

Abstract

This article presents an approach to monocular 3D object detection for unmanned aerial vehicles (UAVs) operating without external telemetry. We propose an architecture that uses temporal context to implicitly extract independent object motion. We describe a methodology for ego-motion compensation and a hybrid depth estimation model. We also present a synthetic data generation pipeline built in the CARLA environment and report preliminary localization accuracy results. The proposed method achieves real-time performance on the NVIDIA Jetson Orin platform.

Keywords

monocular 3D detection; UAV; temporal fusion; ego-motion compensation; CARLA simulator; Jetson Orin
