Computer Vision (Fall 2023)
Instructor: Lin ZHANG
TA: Linfei LI (wechat: lifambitious, email: email@example.com)
Evaluation: attendance 10%, homework (3 assignments) 30%, final project 20%, final written exam 40%, extra bonus 5%.
Lin Zhang et al., Computer Vision: Principles, Algorithms, and Practice (updated Aug. 28, 2023)
1. Assignment 2 is available! (Due: Dec. 17, 2023)
2. Assignment 1 is available! (Due: Nov. 05, 2023)
3. Course website is online! (Aug. 28, 2023)
|Introduction to Computer Vision||
1. Computer Vision, Wiki Page, https://en.wikipedia.org/wiki/Computer_vision
2. Erlangen Program, https://en.wikipedia.org/wiki/Erlangen_program
3. Xinyu Huang et al., The ApolloScape Open Dataset for Autonomous Driving and Its Application, IEEE Trans. PAMI, 2020.
4. Lin Zhang et al., Towards contactless palmprint recognition: A novel device, a new benchmark, and a collaborative representation based identification approach, Pattern Recognition, 2017.
5. Yingyi Zhang, Lin Zhang et al., Pay by showing your palm: A study of palmprint verification on mobile platforms, in: Proc. ICME, 2019.
6. Tianjun Zhang, Nlong Zhao, Ying Shen, Xuan Shao, Lin Zhang*, and Yicong Zhou, ROECS: A robust semi-direct pipeline towards online extrinsics correction of the surround-view system, in: Proc. ACM Int'l Conf. Multimedia, 2021.
7. Lin Zhang et al., Simulation of atmospheric visibility impairment, IEEE Trans. Image Processing, 2021.
8. Chaozheng Guo, Lin Zhang et al., ChunkFusion: A learning-based RGB-D 3D reconstruction framework via chunk-wise integration, in: Proc. IEEE Int'l Conf. Acoustics, Speech and Signal Processing, 2022.
9. Xuan Shao, Lin Zhang et al., MOFISSLAM: A multi-object semantic SLAM system with front-view, inertial and surround-view sensors for indoor parking, IEEE Trans. Circuits and Systems for Video Technology, vol. 32, no. 7, 2022.
10. Tianjun Zhang, Lin Zhang*, Yang Chen, and Yicong Zhou, CVIDS: A collaborative localization and dense mapping framework for multi-agent based visual-inertial SLAM, IEEE Trans. Image Processing, vol. 31, pp. 6562-6576, 2022.
|Local Interest Point Detectors||
1. 01-harrisCornerDetector. This program implements the Harris corner detector and reproduces the corner-detection example shown in our lecture.
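For quick reference, the core of the Harris detector can be sketched in a few lines of NumPy. This is a simplified illustration, not the demo code itself; the window size, k = 0.05, and the synthetic test image are illustrative choices:

```python
import numpy as np

def harris_response(img, k=0.05, win=3):
    """Harris corner response R = det(M) - k * trace(M)^2, where M is
    the structure tensor summed over a (2*win+1)^2 neighbourhood."""
    Iy, Ix = np.gradient(img.astype(float))      # image gradients
    Ixx, Iyy, Ixy = Ix * Ix, Iy * Iy, Ix * Iy

    def box(a):
        # Sum each pixel's window (borders are left at zero)
        out = np.zeros_like(a)
        h, w = a.shape
        for dy in range(-win, win + 1):
            for dx in range(-win, win + 1):
                out[win:h - win, win:w - win] += a[win + dy:h - win + dy,
                                                   win + dx:w - win + dx]
        return out

    Sxx, Syy, Sxy = box(Ixx), box(Iyy), box(Ixy)
    return Sxx * Syy - Sxy ** 2 - k * (Sxx + Syy) ** 2

# A white square on a black background: the response peaks near its
# corners, while plain edges give a negative (edge-like) response
img = np.zeros((40, 40))
img[10:30, 10:30] = 1.0
R = harris_response(img)
peak = np.unravel_index(np.argmax(R), R.shape)
```

In practice a real detector would follow this with non-maximum suppression and thresholding, as the demo program does.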
|Local Feature Descriptors and Matching||
1. 02-harrisCornerDescriptorMatching. This program implements Harris corner detection and matching.
2. 03-openSIFTVS. This program implements SIFT interest point detection, descriptor construction, and matching in C++. It is a Visual Studio 2017 project.
3. 04-PanoramaStichingUsingSIFTRANSAC. This MATLAB program implements SIFT-based panorama stitching within the RANSAC framework.
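For reference, the nearest-neighbour matching step with Lowe's ratio test (the strategy used in SIFT matching) can be sketched in NumPy. The descriptor dimension and the toy data below are made up purely for illustration:

```python
import numpy as np

def match_descriptors(d1, d2, ratio=0.8):
    """Nearest-neighbour matching with Lowe's ratio test: accept a
    match only if the best distance is clearly below the second best."""
    matches = []
    for i, d in enumerate(d1):
        dist = np.linalg.norm(d2 - d, axis=1)
        j1, j2 = np.argsort(dist)[:2]
        if dist[j1] < ratio * dist[j2]:     # unambiguous nearest neighbour
            matches.append((i, int(j1)))
    return matches

# Toy data: d2 holds noisy copies of d1 followed by random distractors
rng = np.random.default_rng(0)
d1 = rng.standard_normal((5, 8))
d1 /= np.linalg.norm(d1, axis=1, keepdims=True)
distractors = rng.standard_normal((10, 8))
distractors /= np.linalg.norm(distractors, axis=1, keepdims=True)
d2 = np.vstack([d1 + 0.01 * rng.standard_normal((5, 8)), distractors])
matches = match_descriptors(d1, d2)
```

The ratio test discards ambiguous matches, which is what makes the subsequent RANSAC stage in the panorama demo tractable.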
|Math Prerequisite I: Projective Geometry||
|Math Prerequisite II: Nonlinear Least-squares||
K. Madsen et al., Methods for nonlinear least-squares problems, Technical Univ. Denmark, 2004
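The damped Gauss-Newton (Levenberg-Marquardt) iteration analysed in these notes can be sketched as follows. The toy exponential-fitting problem and the damping update constants (0.3 and 2.0) are illustrative choices, not values from the reference:

```python
import numpy as np

def levenberg_marquardt(r, J, x0, iters=100, mu=1e-2):
    """Minimize 0.5 * ||r(x)||^2 by damped Gauss-Newton: solve
    (J^T J + mu*I) dx = -J^T r, shrinking mu after a successful step
    and growing it after a failed one."""
    x = np.asarray(x0, float)
    cost = 0.5 * np.sum(r(x) ** 2)
    for _ in range(iters):
        Jx, rx = J(x), r(x)
        dx = np.linalg.solve(Jx.T @ Jx + mu * np.eye(len(x)), -Jx.T @ rx)
        x_new = x + dx
        cost_new = 0.5 * np.sum(r(x_new) ** 2)
        if cost_new < cost:
            x, cost, mu = x_new, cost_new, mu * 0.3   # accept: trust the model more
        else:
            mu *= 2.0                                 # reject: increase damping
    return x

# Toy problem: recover (a, b) of the model y = a * exp(b * t)
t = np.linspace(0.0, 2.0, 20)
y = 2.0 * np.exp(0.5 * t)                 # noiseless data, a=2, b=0.5
r = lambda x: x[0] * np.exp(x[1] * t) - y
J = lambda x: np.stack([np.exp(x[1] * t),
                        x[0] * t * np.exp(x[1] * t)], axis=1)
est = levenberg_marquardt(r, J, [1.0, 0.0])
```

With small mu the step approaches plain Gauss-Newton; with large mu it approaches a short gradient-descent step, which is what makes the method robust far from the optimum.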
|Measurement Using a Single Camera||
1. Z. Zhang, A Flexible New Technique for Camera Calibration, IEEE T-PAMI, 2000
4. 01-cameraCalibratorImgs. A set of calibration-board images captured by a camera is provided, together with the checkerboard pattern (in PDF form) used in our lecture.
5. 02-imageUndistortUsingIntrinsicsMatlab. This demo shows how to perform image undistortion using the camera intrinsics.
6. 03-monoCalib. This demo is based on the OpenCV source code and fully follows the theoretical discussion in our lectures. The code is compiled with VS2017 + OpenCV 4.5.5 on Win11. Since it is a pure C++ project, it can be straightforwardly ported to other platforms (macOS or Ubuntu) if you like.
7. 04-fisheyeCameraCalib. This demo shows how to use OpenCV routines to perform fisheye camera calibration, and how to use the camera intrinsics to perform online video undistortion. The code is compiled with VS2017 + OpenCV 4.5.5 on Win11.
8. 05-surround-view. This demo shows how to synthesize a surround-view from four fisheye videos. The camera intrinsics, the homography transforms between the four views and the road plane, and raw videos captured by the four cameras are provided. The code is compiled with VS2017 + OpenCV 4.5.5 on Win11.
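As a quick companion to the lecture material, the pinhole projection with radial distortion used in Zhang-style calibration can be sketched in NumPy. The intrinsic matrix K and the test points below are made-up illustrative values:

```python
import numpy as np

def project(X, K, dist):
    """Pinhole projection with two radial distortion coefficients
    (k1, k2). X holds 3D points already expressed in the camera frame."""
    x = X[:, :2] / X[:, 2:3]                              # perspective division
    r2 = np.sum(x ** 2, axis=1, keepdims=True)
    x_d = x * (1 + dist[0] * r2 + dist[1] * r2 ** 2)      # radial distortion
    uv1 = np.hstack([x_d, np.ones((len(x_d), 1))]) @ K.T  # apply intrinsics
    return uv1[:, :2]

# Illustrative intrinsics: focal length 800 px, principal point (320, 240)
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])
pts = np.array([[0.1, -0.05, 1.0],
                [0.0,  0.0,  2.0]])
uv = project(pts, K, dist=np.zeros(2))   # zero distortion: plain pinhole model
```

Calibration estimates K and the distortion coefficients by minimizing the reprojection error of this model over many checkerboard corners; undistortion inverts the distortion step.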
|Basics for Machine Learning and A Special Emphasis on CNN||
1. K. He et al., Deep Residual Learning for Image Recognition, CVPR 2016.
6. Ultralytics YOLOv8, https://ultralytics.com/yolov8
7. J.R. Terven et al., A comprehensive review of YOLO: From YOLOv1 to YOLOv8 and beyond
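The identity-shortcut idea of ResNet (reference 1 above) can be illustrated with a minimal fully-connected residual block in NumPy; the layer sizes and the all-zero weights below are purely illustrative, not part of any real network:

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def residual_block(x, W1, W2):
    """Identity-shortcut residual block in the spirit of He et al.
    (CVPR 2016): the weight layers learn a residual F(x), and the
    block outputs relu(x + F(x))."""
    return relu(x + W2 @ relu(W1 @ x))

# With all-zero weights the residual F(x) vanishes and the block reduces
# to relu(x): representing the identity mapping is trivial, which is the
# key argument for why very deep residual networks remain trainable.
x = np.array([-1.0, 0.5, 2.0, -0.25])
y = residual_block(x, np.zeros((4, 4)), np.zeros((4, 4)))
```

Real residual blocks use convolutions with batch normalization rather than plain matrix products, but the shortcut structure is the same.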
|Visual Perception Practices in Autonomous Driving||
1. Lin Zhang et al., Vision-based parking-slot detection: A DCNN-based approach and a large-scale benchmark dataset, IEEE Trans. Image Processing, 2018. (project website)
3. Tianjun Zhang, Nlong Zhao, Ying Shen, Xuan Shao, Lin Zhang*, and Yicong Zhou, ROECS: A robust semi-direct pipeline towards online extrinsics correction of the surround-view system, in: Proc. ACM Int'l Conf. Multimedia, 2021. (project website)
|Introduction to Numerical Geometry||
1. Example to demonstrate fast marching (FastMarching.rar)
5. Lin Zhang et al., 3D palmprint identification using block-wise features and collaborative representation, IEEE Trans. Pattern Analysis and Machine Intelligence, 2015.
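To give a feel for the fast-marching example above, here is a simplified sketch that computes travel-time distances on a grid using Dijkstra's algorithm over 4-connected neighbours. This is only a coarse stand-in: true fast marching solves the same eikonal equation with a more accurate upwind update. The grid size and speed map are illustrative:

```python
import heapq
import numpy as np

def grid_geodesic(speed, src):
    """Travel-time distances from src on a grid, by Dijkstra over
    4-connected neighbours; crossing a cell costs 1/speed there."""
    h, w = speed.shape
    dist = np.full((h, w), np.inf)
    dist[src] = 0.0
    heap = [(0.0, src)]
    while heap:
        d, (i, j) = heapq.heappop(heap)
        if d > dist[i, j]:
            continue                              # stale heap entry
        for ni, nj in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
            if 0 <= ni < h and 0 <= nj < w:
                nd = d + 1.0 / speed[ni, nj]      # edge cost = travel time
                if nd < dist[ni, nj]:
                    dist[ni, nj] = nd
                    heapq.heappush(heap, (nd, (ni, nj)))
    return dist

# Uniform unit speed: the travel time equals the Manhattan path length
D = grid_geodesic(np.ones((10, 10)), (0, 0))
```

Because it is restricted to grid edges, Dijkstra overestimates diagonal distances (metrication error); fast marching removes this bias, which matters for geodesic features such as those used in 3D palmprint matching.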
1. Compress all files into a single .rar archive named "CV_studentID_yourName_Assignment2.rar"; the title of the email should follow the same format, "CV_studentID_yourName_Assignment2". If you resubmit, append "_R1" or "_R2" to both the .rar file name and the email title, e.g., "CV_studentID_yourName_Assignment2_R1".
2. For the programming assignments, please make sure your program runs successfully on the TA's machine.
3. All the documents you hand in, including comments in the source code, should be in English.
4. Please email your solutions to the TA and confirm with him that they have been received.
1. Assignment 1 (Due: Nov. 05, 2023) scores, ref solution from Yibo WANG, ref solution from Jinglu MENG
2. Assignment 2 (Due: Dec. 17, 2023)
1. 2 or 3 persons form a group to deal with a selected topic.
2. At the end of the semester, you need to hand in the source code of your project and an accompanying report (in PPT form), and then give a presentation on your results. All documents should be in English, including the comments in the program. The source code should be neat and clear, with informative comments on the key components, functions, and statements. The report should contain at least the following parts: background introduction, system structure design, key algorithms used, experimental results, and references.
3. Try your best to polish the system. Creative ideas are highly encouraged. If the innovation is significant, we may even prepare a conference paper together!
1. Panorama Stitching
2. Camera Calibration Tool
3. Palmprint Verification using Mobile Phone
4. Dense Reconstruction with the Monocular Camera
5. Interaction Between a Mobile Device and a ROS host
6. NeRF-based 3D Reconstruction
7. Detection and Distance Measurement of Speedbumps
D.A. Forsyth and J. Ponce
Computer Vision -- A Modern Approach (2nd Edition),
Prentice Hall, 2013
An online version is available.
Richard Hartley and Andrew Zisserman
Multiple View Geometry in Computer Vision (2nd Edition)
Cambridge University Press, 2004
Milan Sonka, Vaclav Hlavac, and Roger Boyle (Chinese translation by Junliang Xing and Haizhou Ai), Image Processing, Analysis, and Machine Vision (4th Edition), Tsinghua University Press, 2016
Created on: Aug. 28, 2023
Last updated on: Nov. 24, 2023