DisenQNet: Disentangled Representation Learning for Educational Questions
Learning informative representations for educational questions is a fundamental problem in online learning systems, which can promote many applications, e.g., difficulty estimation. Most solutions integrate all information of one question together following a supervised manner, where the representation results are unsatisfactory sometimes due to the following issues. First, they cannot ensure the presentation ability due to the scarcity of labeled data. Then, the label-dependent representation results have poor feasibility to be transferred. Moreover, aggregating all information into the unified may introduce some noises in applications since it cannot distinguish the diverse characteristics of questions. In this paper, we aim to learn the disentangled representations of questions. We propose a novel unsupervised model, namely DisenQNet, to divide one question into two parts, i.e., a concept representation that captures its explicit concept meaning and an individual representation that preserves its personal characteristics. We achieve this goal via mutual information estimation by proposing three self-supervised estimators in a large unlabeled question corpus. Then, we propose another enhanced model, DisenQNet+, that transfers the representation knowledge from unlabeled questions to labeled questions in specific applications by maximizing the mutual information between both. Extensive experiments on real-world datasets demonstrate that DisenQNet can generate effective and meaningful disentangled representations for questions, and furthermore, DisenQNet+ can improve the performance of different applications.