Deep Learning for Image Retrieval: What Works and What Doesn't

2015 
Building an industrial content-based image retrieval (CBIR) system requires careful attention to feature extraction, feature processing, and feature indexing. Although research in recent years suggests that the convolutional neural network (CNN) leads in feature extraction and representation for CBIR, there is little guidance on deeper feature-related questions: which of the feature representations provided by a CNN performs best, how well the extracted features generalize, how dimensionality reduction relates to accuracy loss, which distance measure works best, and how much coding techniques improve efficiency. This paper conducts several practical studies and a thorough analysis to answer these questions. Results on both ImageNet-2012 and an industrial dataset provided by Sogou show that fc4096a and fc4096b perform best on datasets from unseen categories. Several practical conclusions follow; for instance, fc4096a and fc4096b generalize better than other CNN features and can be considered the first choice for industrial CBIR systems. In addition, the paper presents a novel feature binarization approach that improves efficiency by reducing the space usage of the original data by 31/32. Taken together, the conclusions appear to offer practical guidance for real industrial CBIR systems.
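The 31/32 figure matches what one would expect from storing a single bit per dimension in place of a 32-bit float. The abstract does not describe the binarization procedure itself, so the following is only a minimal sketch under stated assumptions: features are 4096-dimensional float32 vectors, each dimension is thresholded at a fixed value (0 here, a hypothetical choice), and retrieval uses Hamming distance on the packed codes. The names `binarize` and `hamming_distances` are illustrative and do not come from the paper.

```python
import numpy as np

def binarize(features, threshold=0.0):
    """Map each dimension to one bit (1 if value > threshold) and pack 8 bits per byte.

    The fixed threshold is an assumption for illustration, not the paper's method.
    """
    bits = (np.asarray(features) > threshold).astype(np.uint8)
    return np.packbits(bits, axis=-1)

def hamming_distances(query_code, database_codes):
    """Count differing bits between one packed code and every packed database code."""
    xor = np.bitwise_xor(database_codes, query_code)
    return np.unpackbits(xor, axis=-1).sum(axis=-1)

# Random stand-in data in place of real CNN activations (assumption).
rng = np.random.default_rng(0)
features = rng.standard_normal((1000, 4096)).astype(np.float32)

codes = binarize(features)                       # shape (1000, 512), dtype uint8
saving = 1 - codes.nbytes / features.nbytes      # fraction of storage saved

distances = hamming_distances(codes[0], codes)   # query with the first item
nearest = int(np.argmin(distances))
```

With float32 inputs, each 4096-dimensional vector shrinks from 16,384 bytes to 512 bytes, which is the 31/32 saving stated above. Hamming distance on packed codes is one common choice for comparing binary features; the distance measure the paper actually recommends is not given in the abstract.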