Neural Interaction Transparency (NIT): Disentangling Learned Interactions for Improved Interpretability

2018 
Neural networks are known to model statistical interactions, but they entangle the interactions at intermediate hidden layers for shared representation learning. We propose a framework, Neural Interaction Transparency (NIT), that disentangles interactions by counteracting the shared learning across different interactions to obtain their intrinsic lower-order and interpretable structure. This is done through a novel regularizer that directly penalizes interaction order. We show that disentangling interactions reduces a feedforward neural network to a generalized additive model with interactions, which can lead to transparent models that perform comparably to state-of-the-art models. NIT is also flexible and efficient; it can learn generalized additive models with maximum K-order interactions by training only O(1) models.
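The abstract does not spell out the form of the interaction-order regularizer, so the sketch below shows one plausible way such a penalty on a feedforward network's first layer might look: each hidden unit's interaction order is approximated by a soft count of the input features it connects to with non-negligible weight, and orders above a budget K are penalized. The function name `order_penalty`, the tanh-based soft count, and the hyperparameters are illustrative assumptions, not the paper's exact formulation.

```python
# Minimal illustrative sketch (PyTorch) of an interaction-order penalty on the
# first layer of a feedforward network. Assumption: a hidden unit's interaction
# order is approximated by softly counting the inputs with non-negligible
# incoming weight; only the excess over `max_order` (K) is penalized.

import torch
import torch.nn as nn


def order_penalty(first_layer: nn.Linear, max_order: int, sharpness: float = 10.0) -> torch.Tensor:
    """Penalize hidden units whose soft interaction order exceeds `max_order`."""
    # Each row of `weight` holds one hidden unit's incoming weights over all inputs.
    # tanh(sharpness * |w|) ~ 1 for non-negligible weights and ~ 0 otherwise,
    # so summing over inputs approximates how many features the unit interacts with.
    soft_active = torch.tanh(sharpness * first_layer.weight.abs())  # (hidden, inputs)
    soft_order = soft_active.sum(dim=1)                             # per-unit order estimate
    excess = torch.clamp(soft_order - max_order, min=0.0)           # penalize only orders > K
    return excess.sum()


# Usage sketch: add the penalty to the task loss so training drives each hidden
# unit toward depending on at most ~K input features.
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
x, y = torch.randn(64, 10), torch.randn(64, 1)
loss = nn.functional.mse_loss(model(x), y) + 0.1 * order_penalty(model[0], max_order=2)
loss.backward()
```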