Polonium: Tera-Scale Graph Mining and Inference for Malware Detection

2011 
We present Polonium, a novel Symantec technology that detects malware through large-scale graph inference. Based on the scalable Belief Propagation algorithm, Polonium infers every file’s reputation, flagging files with low reputation as malware. We evaluated Polonium with a billion-node graph constructed from the largest file submissions dataset ever published (60 terabytes). Polonium attained a high true positive rate of 87% in detecting malware; in the field, Polonium lifted the detection rate of existing methods by 10 absolute percentage points. We detail Polonium’s design and implementation features instrumental to its success. Polonium has served 120 million people and helped answer more than one trillion queries for file reputation.
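The reputation inference described above can be illustrated with a minimal sketch of loopy belief propagation over two states (good, bad). This is not Polonium's implementation. The abstract does not describe the graph structure, so the sketch assumes a graph linking machines to the files seen on them. The function name `propagate_beliefs`, the `homophily` edge potential of 0.6, the priors, and the node names are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def propagate_beliefs(edges, priors, homophily=0.6, iterations=10):
    """Loopy belief propagation with two states per node (0 = good, 1 = bad).

    edges:  iterable of (u, v) pairs of an undirected graph
    priors: dict mapping node -> prior P(good); unlisted nodes get 0.5
    Returns a dict mapping node -> belief P(good).
    """
    # Edge potential (assumed): neighbours tend to share the same state.
    psi = [[homophily, 1 - homophily], [1 - homophily, homophily]]

    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)

    # Node potentials as (P(good), P(bad)).
    phi = {n: (priors.get(n, 0.5), 1 - priors.get(n, 0.5)) for n in neighbours}

    # One message per directed edge, initialised to uniform.
    messages = {(u, v): (1.0, 1.0) for u in neighbours for v in neighbours[u]}

    for _ in range(iterations):
        updated = {}
        for src, dst in messages:
            out = [0.0, 0.0]
            for dst_state in (0, 1):
                for src_state in (0, 1):
                    # Product of messages into src from everyone except dst.
                    incoming = 1.0
                    for k in neighbours[src]:
                        if k != dst:
                            incoming *= messages[(k, src)][src_state]
                    out[dst_state] += (
                        phi[src][src_state] * psi[src_state][dst_state] * incoming
                    )
            total = out[0] + out[1]
            updated[(src, dst)] = (out[0] / total, out[1] / total)
        messages = updated

    beliefs = {}
    for n in neighbours:
        good, bad = phi[n]
        for k in neighbours[n]:
            good *= messages[(k, n)][0]
            bad *= messages[(k, n)][1]
        beliefs[n] = good / (good + bad)
    return beliefs


# Toy graph: each machine holds one labelled file and one unknown file.
edges = [
    ("machine_1", "known_bad_file"),
    ("machine_1", "unknown_file_a"),
    ("machine_2", "known_good_file"),
    ("machine_2", "unknown_file_b"),
]
priors = {"known_bad_file": 0.01, "known_good_file": 0.99}
beliefs = propagate_beliefs(edges, priors)

# Flag files whose inferred reputation falls below an assumed 0.5 threshold.
flagged = sorted(n for n, b in beliefs.items() if "file" in n and b < 0.5)
```

In this toy graph the unknown file that shares a machine with the known-bad file ends with a belief below 0.5 and is flagged, while the one that shares a machine with the known-good file is not. A deployed system would choose priors, edge potentials, and the flagging threshold from data rather than use these placeholder values.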