My question is, given we know the projection matrix for any given image, what would the processing pipeline look like? I envision something along the lines of:
* Compute SIFT keypoints and descriptors for each image.
* Pair "adjacent" images (in this case, images with sequential IDs from the dataset).
* Find point matches for each pair using KNN or a similar matching algorithm.
* Use cv2.triangulatePoints to find the corresponding 3D point for each of the 2D point matches, using the known projection matrices.
I understand that the most difficult part of SfM is generally pose estimation, which is precisely the step we are skipping. For now, however, we are only aiming for a basic implementation; we may implement a "complete" algorithm that includes this step later.
Still, I am unsure whether any additional processing steps are necessary in order to obtain "good" results.
* Would something like undistortPoints be necessary for this estimation?
* What would be the most efficient way of building the point cloud (for example, building feature tracks and only including points that result from triangulation across some number n of images)?
* Finally, where and how should we discard outlier matches? Most of the functions offered by OpenCV use RANSAC as part of the pose estimation process, and I don't know whether any function would let us apply the algorithm in our case (or whether it would even be necessary).
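To illustrate the kind of thing I have in mind for the last two questions, here is a rough NumPy-only sketch. It rests on two standard results: a fundamental matrix can be derived directly from two known projection matrices, and a point seen in n views can be triangulated linearly (DLT). With those, matches could in principle be checked against the epipolar constraint without estimating anything by RANSAC, and tracks could be filtered by reprojection error. All function names here are hypothetical, and any thresholds applied to the returned errors would have to be chosen by us.

```python
import numpy as np


def fundamental_from_projections(P1, P2):
    """F with x2^T F x1 = 0, derived from two 3x4 projection matrices."""
    # The camera centre of view 1 is the right null vector of P1.
    _, _, Vt = np.linalg.svd(P1)
    C1 = Vt[-1]
    e2 = P2 @ C1  # epipole in image 2 (homogeneous)
    e2_cross = np.array([[0.0, -e2[2], e2[1]],
                         [e2[2], 0.0, -e2[0]],
                         [-e2[1], e2[0], 0.0]])
    return e2_cross @ P2 @ np.linalg.pinv(P1)


def sampson_distance(F, pts1, pts2):
    """First-order epipolar error (squared pixels) for Nx2 matches."""
    pts1 = np.asarray(pts1, dtype=np.float64)
    pts2 = np.asarray(pts2, dtype=np.float64)
    x1 = np.hstack([pts1, np.ones((len(pts1), 1))])
    x2 = np.hstack([pts2, np.ones((len(pts2), 1))])
    Fx1 = x1 @ F.T   # each row is F @ x1
    Ftx2 = x2 @ F    # each row is F.T @ x2
    num = np.sum(x2 * Fx1, axis=1) ** 2
    den = Fx1[:, 0] ** 2 + Fx1[:, 1] ** 2 + Ftx2[:, 0] ** 2 + Ftx2[:, 1] ** 2
    return num / den


def triangulate_track(Ps, pts):
    """Linear (DLT) triangulation of one track observed in n >= 2 views."""
    rows = []
    for P, (u, v) in zip(Ps, pts):
        # Each observation contributes two linear constraints on X.
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    _, _, Vt = np.linalg.svd(np.asarray(rows))
    X = Vt[-1]
    return X[:3] / X[3]


def reprojection_errors(Ps, pts, X):
    """Pixel distance between each observation and the reprojected point."""
    X_h = np.append(X, 1.0)
    errors = []
    for P, uv in zip(Ps, pts):
        x = P @ X_h
        errors.append(np.linalg.norm(x[:2] / x[2] - np.asarray(uv)))
    return np.array(errors)
```

One possible use would be to drop matches whose Sampson distance exceeds a few squared pixels before triangulating, and then to drop tracks whose largest reprojection error is too high or that are seen in fewer than n views.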
So far, we have implemented a basic SIFT keypoint extractor.
We are unsure how to proceed from here.
Thanks