Instance-wise or Class-wise? A Tale of Neighbor Shapley for Concept-based Explanation

2021 
Interpreting model knowledge is essential for improving human understanding of deep black-box models. Traditional methods provide intuitive instance-wise explanations by allocating importance scores to low-level features (e.g., pixels for images). To better match the human way of thinking, one strand of recent research has shifted its focus to mining important concepts. However, these concept-based interpretation methods compute the contribution of each discovered concept at the class level and cannot precisely give instance-wise explanations. Moreover, they treat each concept as an independent unit and ignore the interactions among concepts. To this end, in this paper we propose a novel COncept-based NEighbor Shapley approach (dubbed CONE-SHAP) that evaluates the importance of each concept by considering its physical and semantic neighbors, and interprets model knowledge with both instance-wise and class-wise explanations. Thanks to this design, the interactions among concepts in the same image are fully considered, while the computational complexity of the Shapley value is reduced from exponential to polynomial. Furthermore, for a more comprehensive evaluation, we propose three criteria to quantify the rationality of the contributions allocated to concepts: coherency, complexity, and faithfulness. Extensive experiments and ablations demonstrate that our CONE-SHAP algorithm outperforms existing concept-based methods while providing precise explanations for each instance and class.
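To make the complexity claim concrete, below is a minimal sketch of the neighbor-restricted Shapley idea the abstract describes: instead of averaging a concept's marginal contribution over all coalitions of concepts (exponential in their number), each concept's coalitions are limited to its neighbor set, so the total cost stays polynomial when neighborhoods are small. The function names (`neighbor_shapley`, `value_fn`) and the toy value function are illustrative assumptions, not the authors' implementation.

```python
import math
from itertools import combinations

def neighbor_shapley(concepts, neighbors, value_fn):
    """Approximate each concept's Shapley value by restricting
    coalitions to that concept's (physical/semantic) neighbors.

    concepts  : list of concept ids found in one image
    neighbors : dict mapping a concept id to its set of neighbor ids
    value_fn  : v(S) -> float, model score when only concepts in S are kept
    """
    scores = {}
    for i in concepts:
        nbrs = sorted(neighbors[i])  # only neighbor coalitions are enumerated
        k = len(nbrs)
        phi = 0.0
        for size in range(k + 1):
            for subset in combinations(nbrs, size):
                S = set(subset)
                # Standard Shapley weight, but over the small game {i} ∪ N(i):
                # |S|! * (k - |S|)! / (k + 1)!
                weight = math.factorial(size) * math.factorial(k - size) \
                         / math.factorial(k + 1)
                phi += weight * (value_fn(S | {i}) - value_fn(S))
        scores[i] = phi
    return scores

# Toy demo: three concepts, an additive score with one pairwise interaction.
concepts = ["wheel", "window", "sky"]
neighbors = {"wheel": {"window"}, "window": {"wheel"}, "sky": set()}

def value_fn(S):
    score = 0.3 * ("wheel" in S) + 0.2 * ("window" in S) + 0.1 * ("sky" in S)
    if {"wheel", "window"} <= S:
        score += 0.15  # interaction between neighboring concepts
    return score

print(neighbor_shapley(concepts, neighbors, value_fn))
```

With a bounded neighborhood size, each concept's loop enumerates a constant number of coalitions, so scoring all n concepts in an image costs O(n) value-function calls rather than the O(2^n) of exact Shapley values, while interactions between neighboring concepts (like the wheel-window term above) are still credited.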