A Multimodal Text Matching Model for Obfuscated Language Identification in Adversarial Communication

2019 
Obfuscated language is created to evade censorship in adversarial communication, such as conveying sensitive information, expressing strong sentiment, planning secret actions, and conducting illegal trade. Obfuscated sentences are usually generated by replacing one word with another to conceal the textual content. Intelligence and security agencies identify such adversarial messages by scanning them against a watch-list of red-flagged terms. Although semantic expansion techniques have been adopted, the precision and recall of identification remain limited because of ambiguity and the unbounded ways in which new obfuscations can be created. To this end, this paper frames obfuscated language identification as a text matching task, in which each message is checked for whether it matches a red-flagged term. We propose a multimodal text matching model that combines textual and visual features. The proposed model extends a Bi-directional Long Short-Term Memory network with a visual-level representation component. Comparative experiments on a real-world dataset demonstrate that the proposed method can achieve better performance than previous methods.
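The text matching framing above can be illustrated with a small sketch: each (message, red-flagged term) pair receives a score that fuses a textual signal with a visual one. This is a toy stand-in under stated assumptions, not the model proposed in the paper: the textual signal here is a character-sequence ratio rather than a Bi-directional LSTM encoding, the visual signal compares hypothetical 3x3 glyph bitmaps, and the names `GLYPHS`, `match_score`, and the weight `alpha` are invented for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical 3x3 glyph bitmaps (flattened row by row). They stand in for
# rendered character images; a real system would derive visual features
# from an actual font or learned image encoder.
GLYPHS = {
    "o": "111101111",
    "0": "111101111",
    "l": "010010010",
    "1": "010010010",
}


def glyph_similarity(a: str, b: str) -> float:
    """Fraction of pixels on which the bitmaps of two characters agree."""
    if a == b:
        return 1.0
    if a not in GLYPHS or b not in GLYPHS:
        return 0.0  # no visual evidence for characters without a bitmap
    ga, gb = GLYPHS[a], GLYPHS[b]
    return sum(x == y for x, y in zip(ga, gb)) / len(ga)


def visual_score(word: str, term: str) -> float:
    """Mean per-position glyph similarity; 0.0 when the lengths differ."""
    if not word or len(word) != len(term):
        return 0.0
    return sum(glyph_similarity(a, b) for a, b in zip(word, term)) / len(term)


def textual_score(word: str, term: str) -> float:
    """Character-sequence similarity (a stand-in for a learned text encoder)."""
    return SequenceMatcher(None, word, term).ratio()


def match_score(message: str, term: str, alpha: float = 0.5) -> float:
    """Best fused score of any whitespace-separated word against the term.

    alpha weights the textual signal; (1 - alpha) weights the visual one.
    """
    term = term.lower()
    best = 0.0
    for word in message.lower().split():
        fused = alpha * textual_score(word, term) + (1 - alpha) * visual_score(word, term)
        best = max(best, fused)
    return best


if __name__ == "__main__":
    # "c0la" swaps the letter o for a visually identical digit.
    print(match_score("buy c0la now", "cola"))             # fused score
    print(match_score("buy c0la now", "cola", alpha=1.0))  # textual signal only
```

In this toy setting the fused score for the look-alike substitution is higher than the text-only score, which is the intuition behind adding a visual component; how the paper's model actually encodes and fuses the two modalities is not specified in the abstract.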