|Miao Xu||RIKEN Center for AIP|
|Masashi Sugiyama||RIKEN Center for AIP|
|Gang Niu||The University of Tokyo|
This paper copes with the problem of feature missing. In this paper, the authors try to train an effective classification model with least acquisition cost by jointly performing active feature querying and supervised matrix completion.
Feature missing is a serious problem in many applications, which may lead to low quality of training data and further significantly degrade the learning performance. While feature acquisition usually involves special devices or complex process, it is expensive to acquire all feature values for the whole dataset. On the other hand, features may be correlated with each other, and some values may be recovered from the others. It is thus important to decide which features are most informative for recovering the other features as well as improving the learning performance. In this paper, we try to train an effective classification model with least acquisition cost by jointly performing active feature querying and supervised matrix completion. When completing the feature matrix, a novel objective function is proposed to simultaneously minimize the reconstruction error on observed entries and the supervised loss on training data. When querying the feature value, the most uncertain entry is actively selected based on the variance of previous iterations. In addition, a bi-objective optimization method is presented for cost-aware active selection when features bear different acquisition costs. The effectiveness of the proposed approach is well validated by both theoretical analysis and experimental study.