Filtering in Chinese document images based on templates and confidence measure

2004 
A fast approach to keyword spotting in Chinese document images, based on multiple-template matching and a confidence measure, is presented. Prior to keyword searching, the system generates a keyword lexicon in diverse fonts together with two-stage feature vectors. A two-stage retrieval scheme combined with the Boyer-Moore algorithm is proposed to accelerate the retrieval process. A distance measure between the candidate character and the templates is used to identify and rank similar templates. The new system performs significantly better than traditional OCR-based and image-based approaches. Experimental results confirm the robustness of the proposed approach over a wide range of degradations.
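The coarse-to-fine ranking of templates described above can be illustrated with a minimal sketch. The abstract does not specify the features, the distance function, or any thresholds, so everything here is an assumption made for illustration: the Euclidean distance, the split into a short "coarse" vector and a longer "fine" vector, the `coarse_threshold` parameter, and the function name `rank_templates` are all hypothetical and are not taken from the paper.

```python
import math
from typing import Dict, List, Sequence, Tuple

# Hypothetical template layout: label -> (coarse feature vector, fine feature vector).
# One keyword character may have several templates, one per font.
Template = Tuple[Sequence[float], Sequence[float]]


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance, used here as an assumed stand-in for the paper's distance measure."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_templates(
    candidate_coarse: Sequence[float],
    candidate_fine: Sequence[float],
    templates: Dict[str, Template],
    coarse_threshold: float,
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Return up to top_k (label, fine_distance) pairs, closest template first."""
    # Stage 1: cheap filter on the coarse vectors discards clearly dissimilar templates.
    survivors = [
        label
        for label, (coarse, _fine) in templates.items()
        if euclidean(candidate_coarse, coarse) <= coarse_threshold
    ]
    # Stage 2: the costlier fine-vector distance is computed only for the survivors.
    scored = [(euclidean(candidate_fine, templates[label][1]), label) for label in survivors]
    scored.sort()
    return [(label, dist) for dist, label in scored[:top_k]]
```

In this sketch a smaller fine-stage distance means a closer match, so the distance can double as a simple confidence score for accepting or rejecting a candidate; how the paper actually derives its confidence measure is not stated in the abstract.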