Modeling Heterogeneous Graph Network on Fraud Detection: A Community-based Framework with Attention Mechanism

2021 
Fraud activities in e-commerce, such as spam reviews and fake shopping behaviors, mislead customers' decision making, damage platforms' reputation, and reduce enterprises' revenue. In recent years, GNN-based models have been widely adopted for fraud detection, showing better performance than conventional rule-based methods and feature-based models. Most GNN-based models focus on homogeneous graphs, which typically contain only user-to-user or item-to-item connections; such graphs discard other types of connections, such as user-item connections. In addition, GNN-based models aggregate neighborhood information under the assumption that neighbors share similar structure and content. In fraud detection tasks, however, two major inconsistency issues arise: structure inconsistency, caused by extremely unbalanced positive and negative samples, and content inconsistency, caused by differences among item categories. To address these issues, we propose a Community-based Framework with ATtention mechanism for large-scale Heterogeneous graphs (C-FATH). To utilize the entire heterogeneous graph, we model it directly and combine it with homogeneous graphs. Structure-inconsistent nodes are filtered out by incorporating community information when constructing neighborhoods, and content-inconsistent nodes are selected with lower probability through a similarity-based sampling strategy. Furthermore, the model is trained in a multi-task manner in which each node type (e.g., user, item, device, order, and review) is associated with a specific loss function. Comprehensive experiments on two public review datasets and two large-scale datasets from JD.com demonstrate the effectiveness and scalability of C-FATH compared with state-of-the-art approaches.
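
The abstract names two concrete mechanisms: similarity-based neighbor sampling that keeps content-inconsistent neighbors with lower probability, and a multi-task objective with one loss per node type. The following is a minimal sketch of how these two pieces might look, assuming cosine similarity over node features and a per-type cross-entropy loss; the function names and the use of PyTorch are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn.functional as F

def sample_neighbors(center_feat, neighbor_feats, k):
    """Similarity-based neighbor sampling (sketch).

    Neighbors whose features are less similar to the center node
    (content-inconsistent) receive lower sampling probability.
    Cosine similarity is an assumption; the paper may use a
    different or learned similarity measure.
    """
    sims = F.cosine_similarity(center_feat.unsqueeze(0), neighbor_feats, dim=1)
    probs = torch.softmax(sims, dim=0)  # higher similarity -> higher probability
    k = min(k, neighbor_feats.size(0))
    return torch.multinomial(probs, k, replacement=False)  # sampled neighbor indices

def multi_task_loss(logits_by_type, labels_by_type, weights=None):
    """Multi-task objective (sketch): one cross-entropy term per node type
    (e.g., user, item, device, order, review), summed with optional weights."""
    total = 0.0
    for ntype, logits in logits_by_type.items():
        w = 1.0 if weights is None else weights.get(ntype, 1.0)
        total = total + w * F.cross_entropy(logits, labels_by_type[ntype])
    return total
```

As a usage illustration, `sample_neighbors` would be called once per center node during neighborhood construction (after community-based filtering), and `multi_task_loss` would combine the per-type classifier outputs in each training step; both signatures are hypothetical.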