Depth Estimation and Dense Reconstruction with the Monocular Camera (Tutor: Tianjun Zhang, wechat: z619850002)
3D reconstruction from monocular vision is a classic task in computer vision. In this project we make one simplification: the camera pose of each frame is known, so only depth estimation and dense map construction need to be addressed.
Fig. 1: Depth Estimation
Fig. 2: Dense Scene Reconstruction
Fig. 3: Scene structure reconstruction incrementally using a single agent
1) A dataset will be provided: a video sequence captured by a monocular camera together with the corresponding camera poses. Depth maps of key frames should first be recovered by any suitable algorithm; the dense map of the scene can then be constructed incrementally.
2) When the video stream is fed in, your system should construct the dense map in real time or quasi real time, rather than offline. GPU acceleration is allowed.
3) We will provide a depth-estimation system as a baseline. You can either modify it or implement your own system.
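The core of step 1) is fusing per-key-frame depth into a common map: since the pose of each frame is known, every valid depth pixel can be back-projected into the world frame and appended to the scene point cloud. Below is a minimal sketch, assuming pinhole intrinsics `K` and a camera-to-world pose `(R, t)`; all function and variable names here are illustrative and not part of the provided baseline.

```python
import numpy as np

def backproject_depth(depth, K, R, t):
    """Back-project a key-frame depth map into a world-frame point cloud.

    depth : (H, W) metric depth, 0 marks invalid pixels
    K     : (3, 3) pinhole intrinsics
    R, t  : camera-to-world rotation (3, 3) and translation (3,)
    Returns an (N, 3) array of world-frame points.
    """
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    valid = depth > 0
    z = depth[valid]
    # Pixel -> normalised camera ray -> metric point in the camera frame.
    x = (u[valid] - K[0, 2]) / K[0, 0] * z
    y = (v[valid] - K[1, 2]) / K[1, 1] * z
    pts_cam = np.stack([x, y, z], axis=1)
    # Transform into the world frame using the known pose.
    return pts_cam @ R.T + t
```

Calling this for each key frame and concatenating (or voxel-downsampling) the results yields the incremental dense map.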
Traditional depth estimation method, SGM: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=4359315
Deep learning based method, DPT: https://www.sciencedirect.com/science/article/abs/pii/S0950705122007821
Demo pipeline of SGM: https://github.com/z619850002/DepthEstimation-SGM
Official implementation of DPT: https://github.com/isl-org/DPT
Testing dataset link: http://www.doc.ic.ac.uk/~ahanda/HighFrameRateTracking/downloads.html
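An SGM-style pipeline such as the linked demo treats two key frames with known relative pose as a rectified stereo pair and produces a disparity map; depth then follows as z = f·b/d, where f is the focal length in pixels and b the baseline (the norm of the relative translation). A hedged sketch of that last conversion step, with illustrative names not taken from the demo repository:

```python
import numpy as np

def disparity_to_depth(disparity, focal_px, baseline_m, min_disp=1e-6):
    """Convert a disparity map (pixels) from a rectified key-frame pair
    into metric depth, given focal length (pixels) and baseline (metres).
    Disparities at or below min_disp are marked invalid (depth 0)."""
    depth = np.zeros_like(disparity, dtype=np.float64)
    valid = disparity > min_disp
    depth[valid] = focal_px * baseline_m / disparity[valid]
    return depth
```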
Created on: Nov. 09, 2023