COIL: Revisit Exact Lexical Match in Information Retrieval with Contextualized Inverted List

2021 
Classical information retrieval systems such as BM25 rely on exact lexical match and can carry out search efficiently with inverted list index. Recent neural IR models shifts towards soft matching all query document terms, but they lose the computation efficiency of exact match systems. This paper presents COIL, a contextualized exact match retrieval architecture, where scoring is based on overlapping query document tokens’ contextualized representations. The new architecture stores contextualized token representations in inverted lists, bringing together the efficiency of exact match and the representation power of deep language models. Our experimental results show COIL outperforms classical lexical retrievers and state-of-the-art deep LM retrievers with similar or smaller latency.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    34
    References
    13
    Citations
    NaN
    KQI
    []