Recovery of audio-to-video synchronization through analysis of cross-modality correlation

2010 
Audio-to-video synchronization (AV-sync) may drift and is difficult to recover without time-consuming manual effort. Based on an analysis of audiovisual correlation, we developed a method for recovering drifted AV-sync in a video clip with only minor human interaction: the user just needs to specify a time window containing a stationary speaker. Within this time window, we search for the optimum drift that maximizes the average audiovisual correlation inside the speaker region, shifting the audio and computing the correlation for each drift hypothesis, and then recover AV-sync based on the refined optimum drift. The audiovisual correlation is analyzed by Quadratic Mutual Information with Kernel Density Estimation, which is not only robust against audiovisual changes in scale but also independent of the language. The experimental results demonstrated that our method can effectively recover audio-to-video synchronization. A preliminary version of this work was reported at the 2008 IAPR Conference on Pattern Recognition (Liu and Sato, 2008) and won the Best Industry Related Paper Award (BIRPA).
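The drift search described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact estimator, features, or refinement step, so the code below assumes a Euclidean-distance form of Quadratic Mutual Information with Gaussian kernels, a per-frame scalar audio feature, and per-pixel intensity series inside the speaker region. The names `qmi_ed`, `find_drift`, `sigma`, and `max_shift` are hypothetical, and the refinement of the optimum drift is omitted.

```python
import numpy as np


def _gram(x, sigma):
    # Pairwise Gaussian kernel matrix. The kernel variance is 2*sigma^2 because
    # integrating a product of two Gaussian KDE kernels convolves them.
    # Normalization constants are dropped; they only rescale the score.
    d = x[:, None] - x[None, :]
    return np.exp(-(d ** 2) / (4.0 * sigma ** 2))


def qmi_ed(x, y, sigma=0.5):
    """Euclidean-distance QMI between two 1-D series (assumed estimator)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Standardize so the score does not depend on the scale of either signal.
    x = (x - x.mean()) / (x.std() + 1e-12)
    y = (y - y.mean()) / (y.std() + 1e-12)
    kx, ky = _gram(x, sigma), _gram(y, sigma)
    v_joint = np.mean(kx * ky)                              # joint density term
    v_marg = kx.mean() * ky.mean()                          # product of marginals
    v_cross = np.mean(kx.mean(axis=1) * ky.mean(axis=1))    # cross term
    return v_joint - 2.0 * v_cross + v_marg


def find_drift(audio, visual, max_shift):
    """Return (best_shift, scores) over shifts in [-max_shift, max_shift].

    audio:  shape (T,), one audio feature per video frame (assumption).
    visual: shape (T,) or (T, P), pixel series inside the speaker region.
    A shift s pairs audio[t + s] with visual[t].
    """
    audio = np.asarray(audio, dtype=float)
    visual = np.asarray(visual, dtype=float)
    if visual.ndim == 1:
        visual = visual[:, None]
    if len(audio) != len(visual) or len(audio) <= 2 * max_shift:
        raise ValueError("audio and visual must share a length > 2 * max_shift")
    # Fixed central window so every hypothesis uses the same number of samples.
    t = np.arange(max_shift, len(audio) - max_shift)
    scores = {}
    for shift in range(-max_shift, max_shift + 1):
        a = audio[t + shift]
        per_pixel = [qmi_ed(a, visual[t, p]) for p in range(visual.shape[1])]
        scores[shift] = float(np.mean(per_pixel))  # average over the region
    best = max(scores, key=scores.get)
    return best, scores
```

Calling `find_drift(audio, visual, max_shift=10)` on features extracted from the user-specified time window would return the shift, in frames, with the highest average score, together with the score of every hypothesis. Standardizing each series before scoring is one simple way to obtain the scale robustness mentioned above; the kernel width `sigma` is a free parameter here and would need tuning on real data.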