Recovering audio-to-video synchronization by audiovisual correlation analysis

2008 
Audio-to-video synchronization (AV-sync) may drift and is difficult to recover without dedicated human effort. In this work, we develop an interactive method that recovers drifted AV-sync by audiovisual correlation analysis. Given a video segment, a user specifies a rough time span during which a person is speaking. Our system first locates the speaker region using face detection. It then performs a two-stage search for the optimum AV-drift, that is, the drift that maximizes the average audiovisual correlation inside the speaker region. The correlation is evaluated using quadratic mutual information with kernel density estimation. Finally, AV-sync is recovered by compensating for the detected optimum AV-drift. Experimental results demonstrate the effectiveness of our method.
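The search described above can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not confirm: the Euclidean-distance form of quadratic mutual information with Gaussian Parzen windows, a coarse-to-fine reading of the "two-stage search", and a single per-frame visual feature standing in for the average over pixels in the speaker region. All function names and parameters (`quadratic_mi`, `score_drift`, `two_stage_search`, `sigma`, `max_drift`, `coarse_step`) are hypothetical.

```python
import numpy as np


def gaussian_gram(v, sigma):
    """Pairwise Gaussian kernel matrix for a 1-D sample.

    Integrating the product of two Parzen kernels of width sigma yields a
    Gaussian of variance 2 * sigma**2, hence the 4 * sigma**2 below.
    """
    d = v[:, None] - v[None, :]
    return np.exp(-d ** 2 / (4 * sigma ** 2)) / np.sqrt(4 * np.pi * sigma ** 2)


def quadratic_mi(x, y, sigma=0.5):
    """Euclidean-distance quadratic mutual information (assumed variant)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Standardize so that one kernel width suits both signals.
    x = (x - x.mean()) / (x.std() + 1e-12)
    y = (y - y.mean()) / (y.std() + 1e-12)
    kx = gaussian_gram(x, sigma)
    ky = gaussian_gram(y, sigma)
    v_joint = np.mean(kx * ky)                             # joint term
    v_marg = np.mean(kx) * np.mean(ky)                     # product of marginals
    v_cross = np.mean(kx.mean(axis=1) * ky.mean(axis=1))   # cross term
    return v_joint - 2.0 * v_cross + v_marg


def score_drift(audio, video, drift):
    """Correlation score when audio is assumed to lag video by `drift` frames."""
    n = min(len(audio), len(video))
    audio, video = audio[:n], video[:n]
    if drift >= 0:
        a, v = audio[drift:], video[:n - drift]
    else:
        a, v = audio[:n + drift], video[-drift:]
    return quadratic_mi(a, v)


def two_stage_search(audio, video, max_drift=30, coarse_step=4):
    """Coarse scan over the full range, then a fine scan around the best hit."""
    coarse = range(-max_drift, max_drift + 1, coarse_step)
    best = max(coarse, key=lambda d: score_drift(audio, video, d))
    lo = max(-max_drift, best - coarse_step + 1)
    hi = min(max_drift, best + coarse_step - 1)
    return max(range(lo, hi + 1), key=lambda d: score_drift(audio, video, d))
```

Here `audio` and `video` would be per-frame feature sequences at a common frame rate, for example audio energy and a motion measure inside the detected face region. The coarse stage only works if the features vary smoothly enough that a near miss still scores well, so `coarse_step` should stay below the correlation length of the features.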