Exploring multi-modality structure for cross-domain adaptation in video concept annotation

2012 
Domain-adaptive video concept detection and annotation has recently received significant attention, but existing video adaptation methods treat all features as a single modality, while multi-modality, a unique and important property of video data, is typically ignored. To fill this gap, we propose a novel approach in this paper, multi-modality transfer based on multi-graph optimization (MMT-MGO), which leverages multi-modality knowledge generalized by auxiliary classifiers in the source domains to assist multi-graph optimization (a graph-based semi-supervised learning method) in the target domain for video concept annotation. To the best of our knowledge, this is the first work to introduce multi-modality transfer into the field of domain-adaptive video concept detection and annotation. Moreover, we propose an efficient incremental extension scheme that sequentially estimates labels for small batches of newly arriving data without modifying the structure of the multi-graph scheme. The proposed scheme achieves accuracy comparable to that of a full re-optimization, which merges the new data into the corpus and reruns the most recent round of optimization, while greatly reducing estimation time. Extensive experiments on the TRECVID 2005-2007 datasets demonstrate the effectiveness of both the multi-modality transfer scheme and the incremental extension scheme.
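The core building block referenced above, graph-based semi-supervised learning over multiple per-modality graphs, can be illustrated with a minimal sketch. The code below is not the authors' MMT-MGO implementation; it is a generic, assumption-laden example in which each modality contributes a normalized graph Laplacian, the Laplacians are combined with modality weights, and labels are propagated by solving a regularized linear system. All function names, the RBF affinity choice, and the fixed modality weights are illustrative assumptions.

```python
import numpy as np

def rbf_graph(X, sigma=1.0):
    # Dense RBF affinity graph over the feature vectors of one modality
    # (illustrative choice; any similarity graph could be used).
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    return W

def normalized_laplacian(W):
    # Symmetric normalized Laplacian L = I - D^{-1/2} W D^{-1/2}.
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    return np.eye(len(W)) - (d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :])

def multi_graph_propagate(modalities, y, alphas, mu=1.0):
    # Combine per-modality Laplacians with weights `alphas` (assumed fixed
    # here; a real multi-graph optimizer would also learn these weights),
    # then solve the propagation system (I + mu * sum_m alpha_m L_m) f = y.
    # In y, labeled samples carry +1/-1 and unlabeled samples carry 0.
    n = len(y)
    L = sum(a * normalized_laplacian(rbf_graph(X))
            for a, X in zip(alphas, modalities))
    return np.linalg.solve(np.eye(n) + mu * L, y)

# Usage: two well-separated clusters, one labeled sample per cluster;
# the same features are reused as two "modalities" purely for illustration.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
y = np.array([1.0, 0.0, 0.0, -1.0, 0.0, 0.0])
f = multi_graph_propagate([X, X], y, alphas=[0.5, 0.5])
# Unlabeled points inherit the sign of their cluster's labeled seed.
```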