Linear-PoseNet: A Real-Time Camera Pose Estimation System Using Linear Regression and Principal Component Analysis

2020 
Neural network-based camera pose estimation systems rely on fine-tuning very large networks to regress the camera position and orientation, using very complex training procedures. In this paper, we explore the following question: do we need to fine-tune and train such complex networks to reach the desired accuracy? We show that, for single-image indoor localization, we can reach comparable or better accuracy using only one layer of ridge regression on pretrained ResNet-50 features, with a training time of less than a second on a CPU instead of the hours of GPU training needed by the state of the art. For outdoor scenes, we show that using only 3 fully connected layers on top of pretrained ResNet-50 features, without fine-tuning, performs well compared to the state of the art with only minutes of training. To reduce complexity further, we show that downsampling the pretrained ResNet-50 features by more than 10 times using principal component analysis (PCA) has little effect on performance but saves both training time and storage space.
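The indoor pipeline described above (PCA-compressed pretrained features fed to a single ridge regression layer) can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the random matrix stands in for real ResNet-50 features, the 7-D pose layout (3-D position plus unit quaternion), the 2048-D feature size, the 128 retained components, and the regularization strength `alpha` are all assumptions, and the helper names `fit_pca`, `fit_ridge`, and `predict_pose` are hypothetical.

```python
import numpy as np

def fit_pca(X, k):
    """Return the feature mean and the top-k principal directions (d x k)."""
    mean = X.mean(axis=0)
    # Rows of Vt are the principal directions of the centered data.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k].T

def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression with an unpenalized bias term."""
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - x_mean, Y - y_mean
    # Solve (Xc^T Xc + alpha * I) W = Xc^T Yc.
    W = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ Yc)
    b = y_mean - x_mean @ W
    return W, b

def predict_pose(X, W, b):
    """Predict position and a re-normalized orientation quaternion."""
    P = X @ W + b
    pos, quat = P[:, :3], P[:, 3:]
    quat = quat / np.linalg.norm(quat, axis=1, keepdims=True)
    return pos, quat

# Synthetic stand-ins (assumption): 200 images, 2048-D pooled features.
rng = np.random.default_rng(0)
features = rng.standard_normal((200, 2048))
positions = rng.standard_normal((200, 3))
quats = rng.standard_normal((200, 4))
quats /= np.linalg.norm(quats, axis=1, keepdims=True)
poses = np.hstack([positions, quats])

# Compress 2048 -> 128 dimensions (16x, i.e. "more than 10 times").
mean, components = fit_pca(features, k=128)
reduced = (features - mean) @ components

# One ridge regression layer maps compressed features to 7-D poses.
W, b = fit_ridge(reduced, poses, alpha=1.0)
pred_pos, pred_quat = predict_pose(reduced, W, b)
```

Because the ridge solution is a single linear solve on a 128 x 128 system, fitting takes a fraction of a second on a CPU at this scale, and only the PCA basis and the small weight matrix need to be stored alongside the frozen feature extractor.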